Uploaded by amit shinde

Lecture 1: Sets and Mappings
What is a Set?
If we simply define a set to be a collection of objects, then we run into some logical
difficulties. For example, suppose we define
R = set of all sets that do not contain themselves
and ask if R contains itself. We get the following absurd situation:
if R contains itself then by definition of R, it does not contain itself.
and
if R does not contain itself then by definition of R, it contains itself
This was observed by Bertrand Russell in 1901 and is called Russell’s paradox.
Thus, to define a set rigorously we need to fix some ground rules, or mathematical axioms.
Several axiomatic set theories were developed in order to give a rigorous definition of a
set. To name a few:
1. Zermelo-Fraenkel set theory
2. Morse-Kelley set theory
3. Tarski-Grothendieck set theory
We will not go into the details of these theories. Instead we use the naive set theory. In
naive set theory a set is defined to be a collection of objects that are contained in a huge
hypothetical set called the ‘universal set’.
The universal set is often mistaken to be the ‘set of all sets’, but such a set cannot exist.
This follows from a result proved by Cantor, called Cantor’s theorem. We will give a proof
of this theorem, but for this we first need some definitions.
Sets are usually denoted by capital letters like A, B, C, . . . and the objects (elements) inside
the sets are denoted by small letters like a, b, c, . . ..
When we write a ∈ A, we mean that a is an element of the set A, or simply that a is
contained in A.
Given sets A and B, the union of A and B, denoted by A ∪ B, is the set of all elements
that are either in A or in B.
A ∪ B = {x | x ∈ A or x ∈ B}.
The intersection of A and B, denoted by A ∩ B is the set of elements that are both in A
and in B.
A ∩ B = {x | x ∈ A and x ∈ B}.
The set with no elements is called the empty set and is denoted by ∅.
The measure of the size of a set is called the ‘cardinality’ of the set. Given a set A, the
cardinality of A is denoted by |A|.
Given two sets A and B, B is said to be a proper subset of A if every element of B is also
an element of A and there is an element a ∈ A which is not in B. By B ⊂ A we mean B
is a proper subset of A. If the latter property is not required (so that B may equal A), we
write B ⊆ A, and we say B is a subset of A. Two sets A and B are equal if A ⊆ B and B ⊆ A.
Exercise: Show that
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
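The definitions above can be tried out on small finite sets. The following is a minimal sketch using Python's built-in sets; the particular sets A, B and C are arbitrary choices, and checking the two laws on one example is of course not a proof of the exercise.

```python
# Illustrative finite sets; the names A, B, C follow the notation above.
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6}

union = A | B            # {x | x in A or x in B}
intersection = A & B     # {x | x in A and x in B}

# Distributive laws from the exercise, checked on this one example only.
law1 = (A & (B | C)) == ((A & B) | (A & C))
law2 = (A | (B & C)) == ((A | B) & (A | C))

# Proper subset (strict) versus subset (possibly equal).
is_proper = {3, 4} < A   # every element of {3, 4} is in A, and A has more
is_subset = A <= A       # A is a subset of itself, but not a proper one
```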
Mappings:
A mapping assigns to each element of one set an element of another set. Mappings can be
used to compare the cardinalities of two sets, as we will see later.
Given two sets A and B, a map f from A to B assigns to each element of A exactly one
element of B. For example, consider a map f that assigns 1 to both a and b, 2 to c,
and 3 to d.
We write f : A → B to say that f is a map from A to B. The set f(A) = {f(a) | a ∈ A}
is said to be the image of A under f. We write f(a) = x if x is assigned to a under f,
and x is said to be the image of a under f.
A mapping f : A → B is said to be surjective (or onto) if f (A) = B.
A mapping f : A → B is said to be injective (or one-to-one) if any two different elements
of A have different images in B. That is, if a ≠ b then f(a) ≠ f(b). Or, equivalently, if
f(a) = f(b), then a = b.
A mapping that is both injective and surjective is called a bijective (one-to-one and onto)
mapping. If f : A → B is bijective, then we can define an inverse of f, denoted by f^{-1},
given by f^{-1}(b) = a if f(a) = b.
We can then define the following rules for comparing the cardinalities of two sets A and
B.
1. |A| = |B| if there exists a bijection between A and B.
2. |A| ≤ |B| if there exists an injective map from A to B.
3. |A| < |B| if there exists an injective map from A to B but no surjective map from A to B.
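For finite sets, these notions are easy to test mechanically. The sketch below represents the example map (a and b go to 1, c to 2, d to 3) as a Python dictionary. The codomain B = {1, 2, 3} is an assumption, since the text does not list it, and the helper names are hypothetical.

```python
# The example map: a and b go to 1, c goes to 2, d goes to 3.
f = {"a": 1, "b": 1, "c": 2, "d": 3}
A = set(f)          # domain
B = {1, 2, 3}       # codomain (an assumption; the text does not list B)

def image(f):
    """f(A) = {f(a) | a in A}."""
    return set(f.values())

def is_surjective(f, codomain):
    """True when f(A) equals the codomain."""
    return image(f) == set(codomain)

def is_injective(f):
    """True when no two elements of the domain share an image."""
    return len(set(f.values())) == len(f)

def inverse(f):
    """Inverse of an injective map, defined on its image."""
    if not is_injective(f):
        raise ValueError("only injective maps can be inverted on their image")
    return {b: a for a, b in f.items()}
```

Here f is surjective onto B but not injective, so by itself it gives no information about rule 2 or rule 3.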
Lecture 2: Cantor’s Theorem, Products of Sets, Equivalence
Relations
Russell’s Paradox (revisited)
Several axiomatic theories were developed in order to give a rigorous definition of a
set; we named a few in the last lecture. If we simply define a set to be a collection of
objects, then we run into logical difficulties. For example, nothing then prevents a set
from containing itself as an element, but this leads to a problem.
Suppose we call the sets that contain themselves ‘abnormal’ sets and the sets that do
not contain themselves ‘normal’ sets. Now, consider the set X of all normal sets. If we ask
whether X is itself normal or abnormal, we have the following situation:
If X is normal, then this implies X does not contain itself. But if X does not contain itself,
then X is in the set of normal sets, which is X itself. This implies that X is abnormal.
If X is abnormal, then this implies X contains itself. But if X contains itself, then it is
not in the set of normal sets, which is X itself. This implies that X is normal.
This is absurd: the set X cannot be normal and abnormal at the same time. This is called
Russell’s Paradox, named after the British mathematician Bertrand Russell.
Cantor’s Theorem:
Recall that a mapping assigns to each element of one set an element of another, and that
mappings can be used to compare the cardinalities of two sets.
Given two sets A and B, a map f from A to B assigns to each element of A exactly one
element of B. For example, consider a map f that assigns 1 to both a and b, 2 to c,
and 3 to d.
We write f : A → B to say that f is a map from A to B. The set f(A) = {f(a) | a ∈ A}
is said to be the image of A under f. We write f(a) = x if x is assigned to a under f,
and x is said to be the image of a under f.
A mapping f : A → B is said to be surjective (or onto) if f (A) = B.
A mapping f : A → B is said to be injective (or one-to-one) if any two different elements
of A have different images in B. That is, if a ≠ b then f(a) ≠ f(b). Or, equivalently, if
f(a) = f(b), then a = b.
A mapping that is both injective and surjective is called a bijective (one-to-one and onto)
mapping. If f : A → B is bijective, then we can define an inverse of f, denoted by f^{-1},
given by f^{-1}(b) = a if f(a) = b.
We can then define the following rules for comparing the cardinalities of two sets A and
B.
1. |A| = |B| if there exists a bijection between A and B.
2. |A| ≤ |B| if there exists an injective map from A to B.
3. |A| < |B| if there exists an injective map from A to B but no surjective map from A to B.
Given a set A, denote by P(A) the power set of A, that is, the set of all subsets of A,
including the empty set and A itself.
We can now prove Cantor’s theorem. The statement is as follows:
Theorem: There is no surjective map from a set A onto its power set P(A).
Proof: Suppose that there is a surjective map f : A → P(A). Since f is a map from
A to P(A), the images f(a) are subsets of A.
Define B = {x ∈ A | x ∉ f(x)}. Then B is a subset of A, and therefore, since f is surjective,
B has a pre-image. Let y ∈ A be an element such that f(y) = B. Then, observe that:
If y ∈ B then by definition of B, y ∉ B
and
If y ∉ B then by definition of B, y ∈ B
which is absurd. Therefore, our assumption that f is a surjective mapping is false.
□
However, we do have an injective map g from A to P(A), given by g(x) = {x}. Together
with the theorem, this means that the cardinality of P(A) is strictly greater than the
cardinality of A for any given set A.
This also means that there cannot be a set of all sets, because for any such set, its powerset
will be a bigger set in terms of cardinality.
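For a small finite set, the diagonal argument in the proof can be checked by brute force. The following sketch enumerates every map f : A → P(A) for a three-element set and confirms that the set B = {x ∈ A | x ∉ f(x)} is never in the image of f. The function names are hypothetical, and the check illustrates the proof rather than replacing it.

```python
from itertools import chain, combinations, product

def power_set(A):
    """All subsets of A, as frozensets."""
    items = list(A)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

def diagonal_set(f, A):
    """B = {x in A | x not in f(x)} from the proof."""
    return frozenset(x for x in A if x not in f[x])

A = [0, 1, 2]
P = power_set(A)

# Enumerate every map f : A -> P(A) (8**3 = 512 of them) and confirm that
# the diagonal set B is never in the image of f.
missed_every_time = all(
    diagonal_set(dict(zip(A, images)), A) not in images
    for images in product(P, repeat=len(A))
)
```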
Product of Sets:
Given two sets A and B, the product of A and B is the set of all ordered pairs (a, b) where
a ∈ A and b ∈ B.
A × B = {(a, b) | a ∈ A, b ∈ B}.
For example, if A = {1, 2, 3} and B = {x, y}, then
A × B = {(1, x), (2, x), (3, x), (1, y), (2, y), (3, y)}.
This can be generalized to any number of sets:
A1 × A2 × . . . × An = {(a1 , a2 , . . . , an ) | a1 ∈ A1 , . . . , an ∈ An }.
Exercise: Let A1 , A2 , B1 , B2 be sets, then show that
(A1 × B1 ) ∩ (A2 × B2 ) = (A1 ∩ A2 ) × (B1 ∩ B2 ).
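The product of finite sets can be computed directly. Below is a small sketch using `itertools.product`; the sets A1, A2, B1, B2 are arbitrary choices, and the comparison only checks the exercise on this one instance.

```python
from itertools import product

A = {1, 2, 3}
B = {"x", "y"}
AxB = set(product(A, B))   # {(a, b) | a in A, b in B}

# One instance of the exercise; the sets below are arbitrary choices.
A1, A2 = {1, 2, 3}, {2, 3, 4}
B1, B2 = {"x", "y"}, {"y", "z"}
lhs = set(product(A1, B1)) & set(product(A2, B2))
rhs = set(product(A1 & A2, B1 & B2))
```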
Partitions and Equivalence Relations:
Two sets A and B are said to be disjoint if A ∩ B = ∅.
Given a set A, a partition of A is a collection of subsets {Xi} of A such that Xi ∩ Xj = ∅
if i ≠ j and ∪i Xi = A.
Given a set A, any subset R ⊆ A × A is called a ‘relation’ in A. For example, if A = {a, b, c}
then R1 = {(a, a), (b, b), (c, c)} is a relation in A. R2 = {(a, b), (b, c), (c, a)} is another
relation in A.
If R is a relation, then (a, b) ∈ R is also denoted by a ∼ b. In fact, R itself is often denoted
by ∼. We will use this notation in the rest of the lecture.
A relation ∼ is said to be an equivalence relation if:
1. a ∼ a for all a ∈ A. (Reflexivity)
2. if a ∼ b, then b ∼ a. (Symmetry)
3. if a ∼ b and b ∼ c, then a ∼ c. (Transitivity)
Observe that R1 defined above is an equivalence relation but R2 is not.
Suppose ∼ is an equivalence relation in A, then it induces a partition of A as follows:
Given x ∈ A, put [x] = {y ∈ A | x ∼ y} (the set [x] is called the equivalence class of x
under ∼ in A).
Collect all equivalence classes {[x] | x ∈ A} of all elements of A. We will show that the
collection of these equivalence classes gives a partition of A. Observe that since ∼ is reflexive,
∪x∈A [x] = A.
We need to show that the equivalence classes are disjoint.
(For this, we will show that any two classes are either disjoint or identical.)
Suppose [x1] and [x2] are two equivalence classes with a common element z, so that
x1 ∼ z and x2 ∼ z. If y ∈ [x1], then by the definition of an equivalence class, x1 ∼ y.
By symmetry and transitivity, z ∼ x1 and x1 ∼ y give z ∼ y. Since x2 ∼ z and z ∼ y,
transitivity gives x2 ∼ y, and so y ∈ [x2]. This implies that [x1] ⊆ [x2], and similarly
[x2] ⊆ [x1]. Therefore [x1] = [x2]. We have just shown that if two equivalence classes
have a common element then they are identical.
This provides a one-to-one correspondence between equivalence relations and partitions of
a given set.
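On a finite set, the passage from an equivalence relation to a partition can be carried out explicitly. The sketch below is one way to do it; the function name and the example relation (congruence modulo 3 on {0, ..., 8}) are illustrative assumptions, not taken from the lecture.

```python
def equivalence_classes(A, related):
    """Group the elements of A into classes [x] = {y in A | x ~ y}.

    `related(x, y)` is assumed to be an equivalence relation on A.
    """
    classes = []
    for x in A:
        for cls in classes:
            # Two classes sharing an element are identical, so comparing
            # x with one representative of each class is enough.
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            classes.append({x})
    return classes

# Example: 0..8, with x ~ y when x - y is divisible by 3.
A = range(9)
classes = equivalence_classes(A, lambda x, y: (x - y) % 3 == 0)
```

The resulting classes are pairwise disjoint and their union is A, as the argument above predicts.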
Example: From now on we will denote by N, Z, Q and R the set of natural numbers
{1, 2, 3, . . .}, integers {. . . , −2, −1, 0, 1, 2, . . .}, rational numbers and real numbers respectively.
Let F be the set of all fractions a/b, where a and b ≠ 0 are integers. F can be identified
with Z × (Z − {0}).
Define a relation ∼ in F by
a/b ∼ c/d if ad − bc = 0.
Verify that this is an equivalence relation.
Question: What are the equivalence classes?
Answer: Each equivalence class is a rational number; the set of equivalence classes is the
set of rational numbers.
Exercise: Define a relation in R by x ∼ y if |x| = |y|. Show that this is an equivalence
relation. What are the equivalence classes?
Lecture 3: Equivalence Relations and Examples
Equivalence Relations and Equivalence Classes:
Recall that a relation in a set A is defined as a subset of the product A × A. A relation ∼
in A is said to be an equivalence relation if:
1. a ∼ a for all a ∈ A. (Reflexivity)
2. if a ∼ b, then b ∼ a. (Symmetry)
3. if a ∼ b and b ∼ c, then a ∼ c. (Transitivity)
Given an equivalence relation ∼ in A and an element x ∈ A, we denote by [x] the set of
all elements of A that are related to x under ∼. That is,
[x] = {a ∈ A|a ∼ x}.
The subset [x] is called the equivalence class of x in A under ∼.
We will now see that equivalence relations and partitions are very closely related. First of
all, suppose {P1 , P2 , . . .} is a partition of a set A. That is,
1. The union of Pi ’s is equal to A.
2. Pi ∩ Pj = ∅ if i ≠ j.
(The subsets Pi are called the classes of the partition.)
Define a relation ∼ in A by saying:
a ∼ b if both a and b belong to the same class of the partition.
Verify that this is an equivalence relation.
Suppose ∼ is an equivalence relation in A, then it induces a partition of A as follows:
Collect all equivalence classes {[x] | x ∈ A} of all elements of A. We will show that the
collection of these equivalence classes gives a partition of A. Observe that since ∼ is reflexive,
∪x∈A [x] = A.
We need to show that the equivalence classes are disjoint.
(For this, we will show that, if there is a common element in two equivalence classes, then
both equivalence classes are in fact the same)
Suppose [x1] and [x2] are two equivalence classes with a common element z, so that
x1 ∼ z and x2 ∼ z. If y ∈ [x1], then by the definition of an equivalence class and
symmetry, x1 ∼ y. By symmetry and transitivity, z ∼ x1 and x1 ∼ y give z ∼ y. Since
x2 ∼ z and z ∼ y, transitivity gives x2 ∼ y, and so y ∈ [x2]. This implies that
[x1] ⊆ [x2], and similarly [x2] ⊆ [x1]. Therefore [x1] = [x2]. We have just shown that if
two equivalence classes have a common element then they are identical.
Examples: From now on we will denote by N, Z, Q and R the set of natural numbers
{1, 2, 3, . . .}, integers {. . . , −2, −1, 0, 1, 2, . . .}, rational numbers and real numbers respectively.
1. Let F be the set of all fractions a/b, where a and b ≠ 0 are integers. F can be identified
with Z × (Z − {0}).
Define a relation ∼ in F by
a/b ∼ c/d if ad − bc = 0.
Verify that this is an equivalence relation.
Question: What are the equivalence classes?
Answer: Each equivalence class is a rational number; the set of equivalence classes is the
set of rational numbers. (Verify!)
2. Let L be the set of all lines in the plane R^2. Define a relation ∼ in L by
given two lines l and m, l ∼ m if l and m are parallel
Verify that this is an equivalence relation.
What are the equivalence classes?
To describe the equivalence classes, we first fix an origin in R^2. Now, observe that every
line in R^2 is parallel to exactly one line passing through the origin. So, every line passing
through the origin gives exactly one equivalence class. So, the set of equivalence classes
can be represented by the set of lines passing through the origin. This set is called the
projective line in projective geometry.
3. Let A = N × N × N. Define a relation ∼ in A by:
(x1 , x2 , x3 ) ∼ (y1 , y2 , y3 ) if x1 + x2 + x3 = y1 + y2 + y3 .
Verify that ∼ is an equivalence relation.
Observe that for every number n ≥ 3, there is exactly one equivalence class, namely the
set of all triples (x1, x2, x3) such that x1 + x2 + x3 = n.
Lecture 4: Partially Ordered Sets
Partially ordered sets:
Apart from equivalence relations, another important kind of relation that arises often in mathematics is an order relation.
Let A be a non-empty set. A partial order relation in A is a relation (often denoted by ≤)
that satisfies the following properties:
1. x ≤ x for every x ∈ A (reflexivity);
2. x ≤ y and y ≤ x implies x = y (antisymmetry);
3. x ≤ y and y ≤ z implies x ≤ z (transitivity).
A set with a partial order relation is said to be a partially ordered set.
Example 1: On N, the usual relation ‘less than or equal to’ is a partial order.
Example 2: Let N be the set of natural numbers. Define m ≤ n if m divides n. Verify
that ≤ is a partial order in N.
Example 3: Let P be the set of all subsets of some universal set U and let A ≤ B if A is
a subset of B. Verify that this is a partial order.
Given a set A with a partial order ≤, two elements x and y are said to be comparable if
either x ≤ y or y ≤ x.
Observe that any two elements in Example 1 are comparable. In Examples 2 and 3, there
exist elements that are not comparable.
For example, the numbers 2 and 3 in Example 2 are not comparable, because neither does 2
divide 3, nor does 3 divide 2. Similarly, for Example 3, take sets A = {a, b, c} and
B = {b, c, d}; then A and B are not comparable, because neither A ⊆ B nor B ⊆ A.
A partial order ≤ in a set A is called a total order if any two elements x, y in A are
comparable. Example 1 is a total order, while Examples 2 and 3 are not. A subset of a
partially ordered set that is totally ordered in itself is called a chain. For example, the
subset {2, 4, 8, . . . , 2^n, . . .} in Example 2 is a chain.
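The axioms and the notion of a chain can be checked on a finite piece of Example 2. The sketch below restricts the divisibility order to {1, ..., 12}; the helper names are hypothetical, and a finite check like this only illustrates the definitions.

```python
from itertools import product

N = range(1, 13)          # a finite piece of the natural numbers

def leq(m, n):
    """m <= n in Example 2 means that m divides n."""
    return n % m == 0

reflexive = all(leq(x, x) for x in N)
antisymmetric = all(
    x == y for x, y in product(N, N) if leq(x, y) and leq(y, x)
)
transitive = all(
    leq(x, z)
    for x, y, z in product(N, N, N)
    if leq(x, y) and leq(y, z)
)

def comparable(x, y):
    return leq(x, y) or leq(y, x)

def is_chain(S):
    """True when every two elements of S are comparable."""
    return all(comparable(x, y) for x, y in product(S, S))
```

For instance, `is_chain([2, 4, 8])` holds, while `is_chain([2, 4, 6])` fails because 4 and 6 are not comparable.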
An element x in a partially ordered set P is called a maximal element if for any y ≥ x,
y = x. (That is, there is no element strictly greater than x.)
In Examples 1 and 2 above there are no maximal elements. In Example 3, there is
exactly one maximal element, namely the set U itself.
Let A be a non-empty subset of a partially ordered set P . An element x ∈ P is said to
be a lower bound (upper bound) of A if x ≤ a (x ≥ a) for each a ∈ A. And, x is said to
be the greatest lower bound (least upper bound) if it is greater (less) than or equal to any
other lower bound (upper bound).
Take for example the partially ordered set of Example 3, and let A be a collection of
subsets of U. A lower bound of A is any subset of U which is contained in every set in A.
The greatest lower bound is the intersection of the sets in A. The least upper bound is
the union of the sets in A.
Exercise: Take a two element subset {a, b} of N. For the partial order defined in Example
2, what is the least upper bound and greatest lower bound of {a, b}?
An important result about partially ordered sets is Zorn’s lemma:
Theorem: If P is a partially ordered set such that every chain in P has an upper bound
in P, then P itself has a maximal element.
The proof of Tychonoff’s theorem in topology is an application of Zorn’s lemma. Tychonoff’s theorem says that the product of compact spaces is compact.
Lecture 5: Schroeder-Bernstein Theorem
Composition of Maps:
Given maps f : X → Y and g : Y → Z, we can construct a map h : X → Z, given by
h(x) = g(f(x)).
The map h is called the composition of f and g and is denoted by g ◦ f.
If f : X → Y is a bijective map, then there exists a map g : Y → X such that g ◦ f : X → X
and f ◦ g : Y → Y are the identity maps on X and Y respectively. The map g is called the
inverse of f and is denoted by f^{-1}.
Given an injective map f : X → Y, we can consider f to be a map from X to f(X). The
map f : X → f(X) thus obtained is in fact a bijective map. We can then also define the
inverse of f, f^{-1} : f(X) → X.
Schroeder-Bernstein Theorem:
Recall that we defined the following three rules for comparing cardinalities of two sets:
1. card(A) = card(B) if there exists a bijection between A and B.
2. card(A) ≤ card(B) if there exists an injective map from A to B.
3. card(A) < card(B) if there exists an injective map from A to B but no surjective map from A to B.
It seems natural to conclude that two sets have the same cardinality if the second rule
is satisfied both ways. That is, if
card(X) ≤ card(Y) and card(Y) ≤ card(X),
then card(X) = card(Y). Of course, if the sets X and Y are finite, the proof becomes easy.
But, in general, although the statement sounds easy and natural, the proof requires some
work.
Theorem: Given two sets X and Y, if
card(X) ≤ card(Y) and card(Y) ≤ card(X),
then card(X) = card(Y).
(In other words, if there exists an injective map f : X → Y and an injective map g : Y →
X, then there exists a bijective map between X and Y . This also implies that the relation
≤ on the set of cardinalities is anti-symmetric.)
Proof: If either of the two maps f and g is surjective, then it will be bijective by definition.
Thus, we can assume that neither f nor g is a surjective map.
Then, f : X → f(X) is a bijective map onto its image and g : Y → g(Y) is a bijective
map onto its image. Therefore, we can talk about f^{-1} : f(X) → X and g^{-1} : g(Y) → Y.
We will now construct a bijective map F : X → Y. For this, we need some notation:
Let x be an element of X. Apply g^{-1} to x (if possible) and call the element g^{-1}(x) ∈ Y
the first ancestor of x.
Apply f^{-1} to g^{-1}(x) (if possible) and call the element f^{-1}(g^{-1}(x)) ∈ X the second
ancestor of x.
Also, we call x the zeroth ancestor of x; it is not counted below.
We can continue this process of finding ancestors and observe that for any given x ∈ X,
there are three possibilities:
1. x has infinitely many ancestors.
2. x has an even number of ancestors (possibly none).
3. x has an odd number of ancestors.
Denote, by Xi , Xo and Xe the set of elements of X that have infinite, odd and even number
of ancestors respectively.
Note that, Xi , Xe and Xo partition X into three subsets.
Similarly, we can do the same with Y to obtain a partition of Y into Yi , Ye and Yo .
Verify now that
f maps Xi onto Yi and Xe onto Yo, while g^{-1} maps Xo onto Ye.
Now, define F : X → Y by
F(x) = f(x)        if x ∈ Xi ∪ Xe,
F(x) = g^{-1}(x)   if x ∈ Xo.
It is easy to show that F is in fact a bijection between X and Y .
□
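The construction of F can be traced on a toy example. In the sketch below, X = Y = {1, 2, 3, . . .}, f(x) = x + 1 and g(y) = y + 1; this example is an assumption chosen for illustration and is not from the lecture. Ancestors are counted without the zeroth ancestor x itself, and a cap on the count stands in for "infinitely many", which does not occur in this example.

```python
def count_ancestors(x, g_inv, f_inv, limit=1000):
    """Number of ancestors of x in X (x itself is not counted).

    g_inv and f_inv return None where the inverse is undefined.
    Returns None if more than `limit` ancestors are found, which this
    sketch treats as "infinitely many".
    """
    inverses = (g_inv, f_inv)   # ancestors alternate: g^{-1}, f^{-1}, ...
    count, current = 0, x
    while count <= limit:
        previous = inverses[count % 2](current)
        if previous is None:
            return count
        count, current = count + 1, previous
    return None

# Toy example (an assumption): f(x) = x + 1 and g(y) = y + 1 on {1, 2, ...}.
# Both maps are injective and neither is surjective.
def f(x):
    return x + 1

def g_inv(x):
    return x - 1 if x >= 2 else None

def f_inv(y):
    return y - 1 if y >= 2 else None

def F(x):
    n = count_ancestors(x, g_inv, f_inv)
    if n is None or n % 2 == 0:   # x in X_i or X_e
        return f(x)
    return g_inv(x)               # x in X_o
```

In this example x has exactly x − 1 ancestors, so F swaps 1 with 2, 3 with 4, and so on, which is indeed a bijection.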
As an application of the Schroeder-Bernstein Theorem, we can show that Q and Z have the
same cardinality. Observe that ϕ : Z → Q given by ϕ(n) = n/1 is an injective map.
Exercise: Show that ψ : Q → Z given by
ψ(q) = 1                if q = 1,
ψ(q) = −1               if q = −1,
ψ(q) = 0                if q = 0,
ψ(q) = 2^m (2n + 1)     if q = m/n,
ψ(q) = −2^m (2n + 1)    if q = −m/n
(here m/n is understood to be a positive fraction in lowest terms, with m, n ∈ N)
is an injective map.
Thus, by Schroeder-Bernstein Theorem, Z and Q have the same cardinality.
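Before attempting the exercise, one can test the map ψ numerically on a sample of rationals. The sketch below uses Python's `Fraction`, which stores every rational in lowest terms with a positive denominator; reading m/n this way is an assumption about what the exercise intends. A check on a finite sample does not prove injectivity.

```python
from fractions import Fraction

def psi(q):
    """The map from the exercise, for q given as a Fraction.

    Fraction keeps q in lowest terms with a positive denominator, which is
    the reading of m/n assumed here.
    """
    if q in (1, -1, 0):
        return int(q)
    m, n = abs(q.numerator), q.denominator
    value = 2**m * (2 * n + 1)
    return value if q > 0 else -value

# Sample of rationals with numerator and denominator bounded by 12
# (equal fractions collapse to one element of the set).
sample = {Fraction(m, n) for m in range(-12, 13) for n in range(1, 13)}
images = [psi(q) for q in sample]
```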
Lecture 6: Countable sets
Countable sets:
Recall that we denote the set of natural numbers {1, 2, 3, . . .} by N.
We say that a set A is finite if |A| = |{1, 2, . . . , n}| for some n ∈ N. That is, the cardinality
of A is equal to the cardinality of {1, 2, . . . , n} for some natural number n ∈ N.
Can we count the elements of N? Yes, but the counting will never end. This motivates
the definition of a countably infinite set. We say that a set A is countably infinite if the
cardinality of A is equal to the cardinality of N. Or, equivalently if there is a bijection
between A and N.
Some examples of countably infinite sets other than N itself are:
1. The set of even numbers
2. The set of square integers
3. The set of prime numbers
4. In fact, any infinite subset of N is countably infinite.
We will show that Q+ , the set of positive rational numbers is also countably infinite.
Actually, we have seen a proof of this using Schroeder-Bernstein Theorem. Here we give
another proof of this using Cantor’s diagonal method.
First list all positive rational numbers a/b such that a + b = 2. This list trivially has only
one rational number, 1/1.
Now, list all positive rational numbers a/b such that a + b = 3. This will contain {1/2, 2/1}.
Continuing this way, list all positive rational numbers a/b such that a + b = 4, 5, 6, . . .. It is
possible that during the listing of rational numbers in this way, we repeat some of them (1/1
and 2/2 for example). Omit the rational numbers that have already appeared in some list.
Observe that this indeed gives a bijection between N and Q+. This method is called
Cantor’s diagonal method.
[Figure: Cantor’s diagonal method; the circled numbers are those omitted while listing the rationals.]
It is clear that all rational numbers will appear in the list above.
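The listing procedure can be written as a generator. The sketch below follows the description above (fractions a/b grouped by a + b, repeats omitted); the order within each group, by increasing numerator, is a choice made here and is not fixed by the text.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield every positive rational once, diagonal by diagonal."""
    seen = set()
    for total in count(2):                 # a + b = 2, 3, 4, ...
        for a in range(1, total):
            q = Fraction(a, total - a)
            if q not in seen:              # omit repeats such as 2/2 = 1/1
                seen.add(q)
                yield q

first_ten = list(islice(positive_rationals(), 10))
```

The position of a rational in this list is the natural number assigned to it by the bijection.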
We have:
|N| = |Z| = |Q+ | = |Q|.
The cardinality of N (and of any countably infinite set) is denoted by ℵ0 (aleph-nought).
Cardinality of Real Numbers:
Recall that the set of real numbers is denoted by R. A real number is a number of the
form
A + a1/10 + a2/10^2 + · · ·
where A is an integer and the ai’s are integers between 0 and 9. This representation of a real number
is called its decimal form or decimal expansion. There is however one problem with this
representation. A number can be written in two different forms, for example 1 and .999...
are the same numbers. We will not consider the representation with infinitely many 9’s.
In this way any real number has a unique decimal expansion. Moreover, the decimal
expansion of any rational number is one of the following two types:
1. The decimal expansion terminates;
2. The decimal expansion has repeating numbers,
for example 1/2 = .5, 1/3 = .33333.... and 1/21 = .047619047619....
The real numbers have the following three types:
1. terminating decimal expansion;
2. repeating decimal expansion;
3. non-repeating and not terminating decimal expansion (irrational numbers).
Here is a natural question:
Question: How does the cardinality of R compare with the cardinality of N?
There is a natural injection map f : N → R, given by f (x) = x. So, the cardinality of
N is at most the cardinality of R. We will show that there is no bijection between N and R.
Claim: There is no bijection between N and R.
Proof: We will prove this by contradiction. We will in fact show that there is no bijection
between N and the interval (0, 1) in R. Suppose there is a bijection between N and (0, 1)
given by the following enumeration of real numbers in (0, 1).
.a11 a12 a13 a14 . . .
.a21 a22 a23 a24 . . .
.a31 a32 a33 a34 . . .
.a41 a42 a43 a44 . . .
. . .
Let x = .b1 b2 b3 b4 . . . be the real number formed by digits bi having the property
that bi ≠ aii. This number can be formed by replacing each aii by some other digit bi
(chosen different from 0 and 9, so that x lies in (0, 1) and has only one decimal expansion).
Then x differs from the i-th number of the enumeration in its i-th digit, so x does not
appear in the above enumeration of real numbers in (0, 1). This means that this
enumeration is not a bijection between N and (0, 1).
□
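The construction of x can be imitated on a finite list of digit strings. In the sketch below, the rule "use 5, or 4 if the diagonal digit is already 5" is just one possible choice of replacement digit, and the four rows are made-up sample data standing in for the start of an enumeration.

```python
def diagonal_escape(rows):
    """Given digit strings a_i1 a_i2 ..., build b_1 b_2 ... with b_i != a_ii.

    Digits 0 and 9 are avoided so that the result has no second decimal
    expansion (this choice of replacement digit is one of many).
    """
    digits = []
    for i, row in enumerate(rows):
        digits.append("5" if row[i] != "5" else "4")
    return "".join(digits)

# A finite stand-in for the start of an enumeration of (0, 1).
rows = ["5000", "3333", "1415", "7182"]
x = diagonal_escape(rows)
```

By construction x differs from the i-th row in the i-th digit, so it cannot equal any row.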
Thus, the cardinality of N is strictly less than the cardinality of (0, 1). There is however a
bijection between (0, 1) and R itself, given by f(x) = tan(π(x − 1/2)).
So, |N| < |R|. Moreover, there is a bijection between (0, 1) and any interval (a, b), given by
g(t) = a + t(b − a). From this, we conclude that R is infinite but not countably infinite.
We therefore call R an uncountable set.
Lecture 7: Uncountable sets
We showed in the last lecture that the cardinality of N is strictly less than the cardinality of
(0, 1). There is however a bijection between (0, 1) and R itself, given by f(x) = tan(π(x − 1/2)).
So, |N| < |R|. Moreover, there is a bijection between (0, 1) and any interval (a, b), given by
g(t) = a + t(b − a). From this, we conclude that R is infinite but not countably infinite.
We therefore call R an uncountable set.
We can define a set to be an ‘uncountable’ set if its cardinality is strictly greater than
the cardinality of N.
Using the Schroeder-Bernstein Theorem we can show that:
Claim: Any subset X of R containing (0, 1) has the same cardinality as R.
Proof: Since X is a subset of R, the map f : X → R given by f (x) = x is an injective
map. We have already seen a bijection g : R → (0, 1). Composing g with f we get an
injective map (prove this!) g ◦ f between X and (0, 1).
At the same time the map h : (0, 1) → X given by h(y) = y is also an injective map
between (0, 1) and X. Thus, by the Schroeder-Bernstein Theorem the cardinality of X is
the same as the cardinality of (0, 1) and also the cardinality of R.
□
And:
Claim: There exists a bijection between (0, 1) and (0, 1) × (0, 1).
Proof: There is clearly an injective map between (0, 1) and (0, 1)×(0, 1), given for example
by f (x) = (x, 1/2).
We will produce an injective map g : (0, 1) × (0, 1) → (0, 1) as follows:
Given (x, y) = (.b1 b2 b3 . . . , .c1 c2 c3 . . .), define g(x, y) = .b1 c1 b2 c2 b3 c3 . . .. The image of (x, y)
under g is the number formed by taking the digits in the decimal expansions of x and
y alternately. Verify that g is an injective map. By the Schroeder-Bernstein Theorem it
follows that (0, 1) and (0, 1) × (0, 1) have the same cardinality.
□
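The interleaving map g can be illustrated on finite digit strings. The sketch below works with truncated expansions of equal length, which is an assumption made for the sake of a runnable example; the function names are hypothetical.

```python
def interleave(x_digits, y_digits):
    """g(x, y) = .b1 c1 b2 c2 ... for equal-length digit strings."""
    return "".join(b + c for b, c in zip(x_digits, y_digits))

def split(z_digits):
    """Recover the digits of x and y from the interleaved digits."""
    return z_digits[0::2], z_digits[1::2]

z = interleave("1234", "5678")
```

Since the digits of x and y can be read back from g(x, y), two different pairs cannot have the same image, which is the idea behind the injectivity of g.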
Finally, we will show that:
Claim: The cardinality of the power set of N (P (N)) is equal to the cardinality of (0, 1).
Proof: We will prove this by producing injective maps from P(N) to (0, 1) and from
(0, 1) to P(N), and then apply the Schroeder-Bernstein Theorem to conclude the result.
Given a subset A ⊆ N, put f(A) = .b1 b2 b3 . . . where
bi = 1 if i ∈ A,
bi = 3 if i ∉ A.
This defines a map f : P (N) → (0, 1). Verify that this map is injective.
Given a real number x = .a1 a2 . . . ∈ (0, 1), written in its binary form, put g(x) = B, where
B = {n ∈ N | an = 1}.
This gives a map g : (0, 1) → P (N). Verify that this map is injective.
The claim follows from the Schroeder-Bernstein Theorem.
□
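Now that we have shown that the cardinality of the power set of N is equal to the cardinality
of R, it is logical to write the cardinality of R as 2^ℵ0. (Recall that we denote by ℵ0
the cardinality of N.) Whether 2^ℵ0 equals ℵ1, the smallest cardinal greater than ℵ0, is
exactly the question raised by the Continuum Hypothesis.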
We conclude the discussion on uncountable sets by stating the Continuum Hypothesis:
There is no set with cardinality strictly greater than the cardinality of N and strictly less than
the cardinality of R.
It follows from the work of Gödel (1940) that the existence of such a set cannot be proved
within Zermelo-Fraenkel set theory with the axiom of choice, and from the work of Cohen
(1963) that its non-existence cannot be proved there either. Read more about the continuum hypothesis on
Wikipedia: https://en.wikipedia.org/wiki/Continuum_hypothesis
Lecture 8: Division with Remainder, GCD and Euclid’s
Algorithm
We will discuss some topics in Number Theory in today’s and the next few lectures. The
theory of numbers studies integers and in particular the prime numbers.
Addition, subtraction, multiplication and division are some basic operations that we can
perform with numbers. The set of integers Z is closed under addition, subtraction and
multiplication, in the sense that applying any of these three operations to two integers
gives another integer. We will work only with natural numbers including 0. However,
most of what we are going to do can be extended to integers as well.
The division of two numbers has the following property:
Division with remainder: Given two numbers a, b (b ≠ 0) we can find two numbers m and
r with 0 ≤ r < b such that a = mb + r. (m is called the quotient and r is called the
remainder.)
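In Python, the quotient and remainder are returned together by the built-in `divmod`. The small wrapper below is a sketch restricted to natural numbers, matching the convention of this lecture; the function name is hypothetical.

```python
def divide_with_remainder(a, b):
    """Return (m, r) with a == m * b + r and 0 <= r < b, for a >= 0, b > 0."""
    if a < 0 or b <= 0:
        raise ValueError("this sketch covers natural numbers with b > 0 only")
    m, r = divmod(a, b)
    return m, r

m, r = divide_with_remainder(78, 14)
```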
(This basic fact about division dates back to the time of Euclid (300 BC) when the base
10 numerals had still not been discovered. Numbers were geometrical objects represented
by line segments. Euclid found that if we are given a line segment of length a and
another line segment of length b smaller than the first one, then we can mark off along
the first segment as many pieces of length b as fit, and at the end a piece shorter than b
is left over, which is called the remainder.)
We say that a number b divides a (denoted b|a) if a = mb for some number m.
Verify the following properties:
1. If a|1, then a = ±1;
2. If a|b and b|a, then a = ±b (so a = b for natural numbers);
3. Any number divides 0;
4. If b|g and b|h, then b|g + h and in fact b|mg + nh for any numbers m and n.
Greatest Common Divisor (GCD)
Given two numbers a and b, the greatest common divisor gcd(a, b) of a and b is the number
c such that:
1. c|a and c|b;
2. If d|a and d|b, then d|c. (That is, any divisor of a and b is also a divisor of c, making c
the biggest divisor)
Example: gcd(6, 10) = 2
More generally, we can define the greatest common divisor of any subset of natural numbers.
Given a finite set of numbers A = {a1, . . . , ar} or even a countably infinite set of numbers
A = {ai}_{i=1}^∞, we can define gcd(A) as follows.
A number c is the gcd(A) if:
1. c|ai for all i = 1, 2, . . .,
2. if d|ai for all i = 1, 2, . . ., then d|c.
We will show the following:
Theorem: Let {ai}_{i=1}^∞ be a countably infinite set of numbers. Then, there exists a finite
set A ⊂ {ai}_{i=1}^∞ such that gcd(A) = gcd({ai}_{i=1}^∞).
Proof: Let Ai = {a1, . . . , ai} and ci = gcd(Ai). This will give us a sequence of numbers
c1, c2, c3, . . ..
Now, observe that
c1 ≥ c2 ≥ c3 ≥ · · · .
(This is because, for each i, Ai ⊂ Ai+1 and if X ⊆ Y, then gcd(X) ≥ gcd(Y) (verify this
statement!))
Now, since ci is a decreasing sequence of positive numbers, it stabilises eventually, that is,
there exists N ∈ N such that cn = cN for all n > N.
This implies gcd(AN) = gcd({ai}_{i=1}^∞), and AN is a finite set.
□
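The sequence c1, c2, c3, . . . from the proof can be computed for a finite prefix of a sequence. The sketch below uses `itertools.accumulate` with `math.gcd`, relying on gcd(A_{i+1}) = gcd(c_i, a_{i+1}); the numbers in `a` are made-up sample data.

```python
from itertools import accumulate
from math import gcd

# Illustrative sequence a_1, a_2, ... (only a finite prefix can be listed).
a = [360, 84, 126, 30, 18, 42, 66]

# c_i = gcd(a_1, ..., a_i), built with gcd(A_{i+1}) = gcd(c_i, a_{i+1}).
c = list(accumulate(a, gcd))
```

In this sample the values decrease and then stay constant from the third term on, which is the stabilisation used in the proof.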
Algorithms for finding the GCD
There are several ways of finding the GCD of two given numbers a and b.
1. Perhaps the simplest of them all would be to check, for every number between 1 and b,
whether it divides both a and b, and pick the largest one; that will be gcd(a, b). But this
method is very slow. In terms of computational complexity, this method takes exponential
time in the number of digits of b.
2. A second method would be to use the ‘Fundamental Theorem of Arithmetic’, which says
that any number greater than 1 is a product of prime powers, unique up to ordering. For
example, suppose we want to find the GCD of 180 and 120. Then,
180 = 2^2 × 3^2 × 5
and
120 = 2^3 × 3 × 5
The GCD of the powers of 2 in the prime power representations of 180 and 120 is 2^2, while
the GCDs of the powers of 3 and 5 are 3^1 and 5^1 respectively. Therefore, gcd(180, 120) =
2^2 × 3 × 5 = 60.
It is not known how fast one can factorize a given number into prime powers, and it is still
an active area of research in computational complexity.
3. Euclid’s algorithm: This algorithm named after Euclid is a fast algorithm for computing the GCD of two numbers. This is quite remarkable given that Euclid considered
numbers as geometric objects.
Let us explain the algorithm using an example:
Suppose we want to find the GCD of 78 and 14. Using the division with remainder
successively we get:
78 = 5 × 14 + 8
14 = 1 × 8 + 6
8 = 1 × 6 + 2
6 = 3 × 2 + 0
We stop when we get 0 as the remainder. The GCD of the given numbers is the
remainder in the last but one step, 2 in our case above.
In general, if we want to find the GCD of r0 and r1 we do the following:
Write
r_0 = q_2 × r_1 + r_2
r_1 = q_3 × r_2 + r_3
. . .
r_{n−1} = q_{n+1} × r_n + 0
The number r_n is then the GCD of r_0 and r_1.
We must of course check and prove the correctness of this algorithm. Observe that, for the
numbers 180 and 120 the algorithm stops in two steps and gives the correct answer:
180 = 1 × 120 + 60
120 = 2 × 60 + 0
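The three methods described above can be compared on small inputs. The following Python sketch is only an illustration: the function names are ours, and the factorisation step uses plain trial division, which is an assumption made for simplicity rather than anything prescribed in these notes.

```python
def gcd_brute_force(a, b):
    """Method 1: try every d from 1 to min(a, b) and keep the largest common divisor."""
    best = 1
    for d in range(1, min(a, b) + 1):
        if a % d == 0 and b % d == 0:
            best = d
    return best


def prime_factorization(n):
    """Return {prime: exponent} for n >= 1, found by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def gcd_by_factoring(a, b):
    """Method 2: multiply the common primes, each raised to the smaller exponent."""
    fa, fb = prime_factorization(a), prime_factorization(b)
    result = 1
    for p in fa:
        if p in fb:
            result *= p ** min(fa[p], fb[p])
    return result


def euclid_gcd(r0, r1):
    """Method 3: repeated division with remainder.

    Returns (gcd, steps), where each step is a tuple (r_i, q, r_{i+1}, r_{i+2})
    recording the division r_i = q * r_{i+1} + r_{i+2}.
    """
    steps = []
    while r1 != 0:
        q, r2 = divmod(r0, r1)
        steps.append((r0, q, r1, r2))
        r0, r1 = r1, r2
    return r0, steps


g, steps = euclid_gcd(78, 14)
for a, q, b, r in steps:
    print(f"{a} = {q} x {b} + {r}")
print("gcd =", g)                      # gcd = 2
print(gcd_brute_force(180, 120))       # 60
print(gcd_by_factoring(180, 120))      # 60
```

The printed divisions for (78, 14) reproduce the four lines of the worked example, and the last non-zero remainder is returned as the GCD.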
To prove the correctness of Euclid’s algorithm in general, observe that:
1. The algorithm always terminates, since at every step the remainder ri gets smaller and
smaller, so eventually we will hit 0 as the remainder.
2. It gives the right answer. The reason is, at every step of the algorithm, we claim
gcd(ri , ri+1 ) = gcd(ri+1 , ri+2 ).
So, eventually gcd(r0 , r1 ) = gcd(rn−1 , rn ) = rn .
The above claim can be proved as follows:
Claim: If a, b are two numbers and a = qb + r, then gcd(a, b) = gcd(b, r).
Proof: We will show that the pairs (a, b) and (b, r) have the same divisors. Then, in
particular their GCDs will also be the same.
Suppose c|a and c|b, then since a = qb + r, we have r = a − qb, so c|r.
Similarly if d|b and d|r, then since a = qb + r, d|a. This implies that the divisors of (a, b)
and (b, r) coincide and so in particular their GCDs are equal.
□
Another consequence of Euclid’s algorithm is the following theorem.
Theorem: Given two numbers r0 , r1 , there exist integers m0 and n0 such that m0 r0 +
n0 r1 = gcd(r0 , r1 ).
Proof: This can be proved by going backwards in Euclid’s algorithm. Suppose, for
example, that the gcd(r_0, r_1) is r_4, found using the algorithm as follows:
r_0 = q_2 r_1 + r_2    (1)
r_1 = q_3 r_2 + r_3    (2)
r_2 = q_4 r_3 + r_4    (3)
r_3 = q_5 r_4 + 0      (4)
Then, by (3)
gcd(r_0, r_1) = r_4 = r_2 − q_4 r_3.    (5)
Use (2) to rewrite (5) as
gcd(r_0, r_1) = r_4 = r_2 − q_4 (r_1 − q_3 r_2).    (6)
Use (1) to rewrite (6) as
gcd(r_0, r_1) = r_4 = (r_0 − q_2 r_1) − q_4 (r_1 − q_3 (r_0 − q_2 r_1)).    (7)
From (7), after rearranging the terms, we will get:
gcd(r_0, r_1) = m_0 r_0 + n_0 r_1.
This can be done irrespective of the number of steps taken by Euclid’s algorithm.
□
Exercise: Use the above idea to find m and n such that 78m + 14n = 2.
We will end this lecture with the following definition:
Two numbers a and b are said to be coprime (or relatively prime) if gcd(a, b) = 1.
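The coefficients m_0 and n_0 of the theorem above can also be computed mechanically. The Python sketch below does not literally substitute backwards; instead it carries the coefficients forward through the divisions, a common variant usually called the extended Euclidean algorithm. The function name `extended_gcd` is ours and the code is only meant as an illustration for non-negative inputs.

```python
def extended_gcd(a, b):
    """Return (g, m, n) with m*a + n*b == g == gcd(a, b), for a, b >= 0."""
    # Invariants kept throughout the loop:
    #   old_r == old_m * a + old_n * b   and   r == m * a + n * b
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n


g, m, n = extended_gcd(180, 120)
print(g, m, n)                 # 60 1 -1
print(m * 180 + n * 120 == g)  # True
```

Because both invariants hold at every step, the combination is available as soon as the remainder reaches 0, whatever the number of steps.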
Lecture 9: More on GCD and Euclid’s Algorithm
In the last lecture we saw that using Euclid’s algorithm for finding GCD of two numbers
a, b, we can show that there exist numbers m0 , n0 , such that
m0 a + n0 b = gcd(a, b).
Also, that the gcd(a, b) is the remainder in the last but one step of the algorithm and is
the smallest remainder obtained in the steps of the algorithm. This suggests that:
Theorem: Given two numbers (a, b) the gcd(a, b) is the smallest positive number that can
be written as m0 a + n0 b.
Proof: Suppose S = {ma + nb > 0 | m, n ∈ Z}, that is S is the set of all positive numbers
of the form ma + nb with integers m and n. Then, since S is a non-empty set of positive
integers (for example, a = 1·a + 0·b lies in S), there is a smallest
number in S. Let us call it c.
We can assume c = m0 a + n0 b.
Claim: c is the GCD of a and b.
Proof: Since c = m0 a + n0 b, if d|a and d|b, then d|c.
It remains to show that c|a and c|b. Suppose on the contrary that c ∤ a. Then, by division
with remainder, there exist q and r such that a = qc + r and 0 < r < c.
But, since c = m0 a + n0 b, we have,
a = q(m0 a + n0 b) + r
=⇒ r = (1 − qm0 )a − qn0 b
=⇒ r = m1 a + n1 b
This means that r, which is strictly less than c, can be written as m1 a + n1 b. This is a
contradiction to the assumption that c is the smallest number of the form ma + nb. This
implies that c | a and similarly c | b. Therefore, c = m0 a + n0 b = gcd(a, b).
□
Efficiency of Euclid’s Algorithm
Given a number x ∈ R, we denote by ⌊x⌋ (floor of x) the biggest integer less than or equal
to x and by ⌈x⌉ (ceiling of x) the smallest integer bigger than or equal to x.
Claim: Let a ∈ N. Then, the number of digits in a is ⌊log10 a⌋ + 1.
Proof: Suppose the number of digits in a is d. Then,
10^{d−1} ≤ a < 10^d
(because 10^{d−1} is the smallest number with d digits). Taking log on both sides, we have
d − 1 ≤ log10 a < d.
This implies d − 1 = ⌊log10 a⌋. This proves the claim.
□
Claim: Let x be a number satisfying 2^{n/2} ≤ x < 2^n. Then, there are roughly n digits in
x.
(by ‘roughly’ we mean that the number of digits in x is comparable to n, or more mathematically the number of digits in x is less than or equal to some constant times n.)
Proof: Since 2^{n/2} ≤ x < 2^n, the result follows from taking log on both sides.
□
We will now discuss how efficient Euclid’s algorithm is. The aim is to show (vaguely!)
that if (a, b) are numbers with n digits, then Euclid’s algorithm roughly takes a constant
times n steps to compute the GCD of a and b.
Recall that at every step of the algorithm we use division with remainder, r0 = q2 r1 + r2 .
Now, first observe that for the algorithm to work fast, the remainders ri must go to 0
quickly. And, it will work the slowest when all the q_i are 1. For example, for (13, 8) the algorithm
will work as follows:
13 = 1 × 8 + 5
8 = 1 × 5 + 3
5 = 1 × 3 + 2
3 = 1 × 2 + 1
2 = 2 × 1 + 0
It turns out that this indeed is the case when the algorithm performs the worst. Observe
that the numbers appearing in the above case, read from the bottom up (1, 2, 3, 5, 8, 13),
follow the recursive formula
r_i = r_{i−1} + r_{i−2}.
This is in fact the formula for the ‘Fibonacci sequence’. The algorithm performs the worst on
pairs of consecutive Fibonacci numbers.
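The behaviour on Fibonacci pairs can be observed experimentally. The Python sketch below counts the divisions performed by Euclid's algorithm; the helper names `euclid_steps` and `fibonacci` are ours, and the convention F_1 = F_2 = 1 is an assumption about indexing.

```python
def euclid_steps(a, b):
    """Number of divisions with remainder performed by Euclid's algorithm on (a, b)."""
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps


def fibonacci(n):
    """n-th Fibonacci number, with F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print(euclid_steps(13, 8))  # 5, one per line of the example above

# Brute-force check: no pair 0 < b < a <= 13 needs more steps than (13, 8).
print(max(euclid_steps(a, b) for a in range(1, 14) for b in range(1, a)))  # 5

# The step count grows linearly with the index n of the Fibonacci pair.
for n in range(5, 30, 5):
    print(n, euclid_steps(fibonacci(n + 1), fibonacci(n)))
```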
Remark: 1. One can rigorously prove that the smallest pair of numbers for which Euclid’s
algorithm takes a given number of steps is a pair (F_n, F_{n+1}) of consecutive Fibonacci
numbers. See ‘Art of Computer Programming, vol 2’ (page 356) by Donald Knuth.
2. The Fibonacci sequence, though named after the Italian mathematician Fibonacci who lived
around the 12th and 13th centuries, had already been described in the works of Pingala, an Indian mathematician who lived around 200 years before Christ.
We will observe that the Fibonacci numbers, denoted by F_n, roughly grow exponentially.
Observe that, for large enough n,
2^{n/2} < F_n < 2^n,
and Euclid’s algorithm applied to a pair (F_{n+1}, F_n) will terminate after roughly n steps.
Also, a number between 2^{n/2} and 2^n will roughly have a constant times n digits. We,
therefore, conclude that if (a, b) are numbers with n digits, then Euclid’s algorithm roughly
takes a constant times n steps to compute the GCD of a and b. (All this can be made
quite rigorous but I would like you to give it some time and understand the idea behind
it.)
Lecture 10 and 11: Congruence Relations, Linear Congruence
Equations and Prime Numbers
Suppose we are given a number n > 0. If we divide any number a by n, the possible
remainders we get are 0, 1, . . . , n − 1. Let us denote by \bar{r} the set of numbers that give r as
the remainder after dividing by n. These sets partition the set of integers Z into n classes.
That is
Z = \bar{0} ∪ \bar{1} ∪ . . . ∪ \overline{n − 1}.
Given a number n, denote by nZ the set of multiples of n, that is {. . . , −2n, −n, 0, n, 2n, . . .}.
And, define a relation in Z as follows:
a ∼ b if (a − b) ∈ nZ
(or equivalently a is related to b if (a−b) is divisible by n). Verify that this is an equivalence
relation.
Another way of representing the same relation is as follows:
If two numbers (a, b) are related, then we say that a is congruent to b modulo n, and write
a ≡ b mod n. So,
a ≡ b mod n,
if n divides (a − b). This equivalence relation is called the congruence relation on Z modulo
n.
What are the equivalence classes under the congruence relation? Suppose a ∈ Z and
\bar{a} is its equivalence class under the congruence relation modulo n. By division with
remainder, a = kn + r for some r ∈ {0, . . . , n − 1}, so a − r = kn. This means that n divides
(a − r). Thus, a ≡ r mod n. Therefore, the equivalence classes of a and r are the same. Since the
possible remainders after division by n are 0, . . . , n − 1, the equivalence classes under the
congruence relation are \bar{0}, \bar{1}, . . . , \overline{n − 1}.
Properties of congruence relation
Let us denote the set of equivalence classes under the congruence relation modulo n by
Z/nZ. That is,
Z/nZ = {\bar{0}, \bar{1}, . . . , \overline{n − 1}}.
We can define an addition and multiplication in Z/nZ. Given two equivalence classes, say
\bar{b} and \bar{c} in Z/nZ, define
\bar{b} + \bar{c} = \overline{b + c}
\bar{b} . \bar{c} = \overline{b.c}.
(Equivalently, (b + c) mod n = ((b mod n) + (c mod n)) mod n and bc mod n = ((b mod n)(c
mod n)) mod n.)
Of course, we need to check if this addition and multiplication is well defined. That is we
need to show that if we take a number d in \bar{b} and a number e in \bar{c}, then d + e lies in
\overline{b + c}, and d.e lies in \overline{b.c}.
More formally, we need to show:
1. If d ≡ b mod n and e ≡ c mod n, then (d + e) ≡ (b + c) mod n.
Since d ≡ b mod n and e ≡ c mod n, n divides (d − b) and n divides (e − c). So, n divides
(d − b) + (e − c) = (d + e) − (b + c). This implies that (d + e) ≡ (b + c) mod n.
2. If d ≡ b mod n and e ≡ c mod n, then (d.e) ≡ (b.c) mod n.
Since n divides (d − b) and n divides (e − c), n also divides e.(d − b) + b.(e − c) = ed − bc.
This implies d.e ≡ b.c mod n.
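The two properties just proved can be spot-checked numerically. In the Python sketch below, the function name `check_well_defined` and the choice of test values are ours; it replaces b and c by other members of their classes and confirms that the class of the sum and of the product does not change.

```python
def check_well_defined(n, b, c, span=3):
    """Replace b and c by other representatives d = b + k*n, e = c + l*n and
    check that d + e and d * e stay in the classes of b + c and b * c."""
    for k in range(-span, span + 1):
        for l in range(-span, span + 1):
            d, e = b + k * n, c + l * n      # d ≡ b and e ≡ c (mod n)
            if (d + e) % n != (b + c) % n:
                return False
            if (d * e) % n != (b * c) % n:
                return False
    return True


print(check_well_defined(7, 3, 5))    # True
print(check_well_defined(12, 10, 9))  # True
```

Python's `%` operator returns a remainder in 0, . . . , n − 1 for positive n even when the left operand is negative, so it picks out the class representative used in the text.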
Let us use these properties to do the following exercise.
Exercise 1. Show that a number is divisible by 3 if and only if the sum of its digits is
divisible by 3.
Sol. Observe that 10 ≡ 1 mod 3, so 10^n ≡ 1 mod 3 and c·10^n ≡ c mod 3 for any numbers n
and c. Suppose the given number a is a_n a_{n−1} . . . a_0, that is, the a_i are the digits of a. Then,
a = a_n 10^n + a_{n−1} 10^{n−1} + · · · + a_0
a mod 3 ≡ (a_n 10^n + a_{n−1} 10^{n−1} + · · · + a_0) mod 3
a mod 3 ≡ (a_n 10^n mod 3) + (a_{n−1} 10^{n−1} mod 3) + · · · + (a_0 mod 3)
a mod 3 ≡ (a_n + . . . + a_0) mod 3
Thus the sum of the digits of a is divisible by 3, if and only if a is divisible by 3.
□
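The solution in fact shows slightly more: a number and the sum of its digits leave the same remainder modulo 3. A short Python sketch (the helper name `digit_sum` is ours) makes this visible on a few sample numbers.

```python
def digit_sum(a):
    """Sum of the decimal digits of a non-negative integer a."""
    return sum(int(ch) for ch in str(a))


for a in (9, 123, 1807, 30031, 82589933):
    # The last two columns always agree.
    print(a, digit_sum(a), a % 3, digit_sum(a) % 3)
```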
Linear Congruence Equations
An equation of the form
ax = b mod n
is called a Linear congruence Equation. We will show that:
Theorem. The equation ax = b mod n has a solution if and only if gcd(a, n)|b.
Proof. Suppose ax = b mod n has a solution r. This implies that n divides ar − b. So,
there exists a number s such that ar − b = sn, or equivalently ar − sn = b. Since the
gcd(a, n) divides a and n, it divides b = ar − sn.
On the other hand, suppose gcd(a, n) divides b. Then, there exists a number t such that
t gcd(a, n) = b. By Euclid’s algorithm, we can also find r and s such that
ra + sn = gcd(a, n).
Multiplying the above equation by t, we get tra + tsn = t gcd(a, n) = b. This implies, tr
is a solution to ax = b mod n.
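The second half of the proof is constructive, so it can be turned into a small solver. The Python sketch below follows it step by step: find r and s with ra + sn = gcd(a, n), then return tr reduced modulo n. The function names are ours, and the code assumes a ≥ 0 and n > 0.

```python
def extended_gcd(a, b):
    """Return (g, m, n) with m*a + n*b == g == gcd(a, b), for a, b >= 0."""
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n


def solve_linear_congruence(a, b, n):
    """Return one x in 0..n-1 with a*x ≡ b (mod n), or None if no solution exists."""
    g, r, s = extended_gcd(a, n)   # r*a + s*n == g
    if b % g != 0:
        return None                # gcd(a, n) does not divide b
    t = b // g                     # t * g == b
    return (t * r) % n             # t*r*a + t*s*n == b, so a*(t*r) ≡ b (mod n)


print(solve_linear_congruence(6, 4, 10))  # 4, since 6 * 4 = 24 ≡ 4 (mod 10)
print(solve_linear_congruence(6, 3, 10))  # None, since gcd(6, 10) = 2 does not divide 3
```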
Exercise: Show that if gcd(a, n)|b and r is a solution to ax = b mod n, then
r + n/gcd(a, n) is also a solution.
Prime Numbers
A number p > 1 is said to be a prime number if its only divisors are 1 and p itself.
The number 1 is not considered a prime number by most mathematicians. One very
compelling reason for not considering 1 as a prime number is the Fundamental Theorem
of Arithmetic (FTA).
Theorem (FTA): Any positive number a > 1 can be factored in a unique way as
a = p_1^{α_1} p_2^{α_2} . . . p_t^{α_t}
where p_1 < p_2 < · · · < p_t are prime numbers and α_i > 0 for all i = 1, 2, . . . , t.
If 1 were considered a prime, then the FTA would not hold. We could simply put any
power of 1 in the product of prime powers and have multiple representations for the same
number.
We will prove FTA a little later. Let us first prove that there are infinitely many prime
numbers. We will see two different proofs of this result.
First Proof
Theorem: There are infinitely many prime numbers.
Proof: Suppose on the contrary that there are only finitely many primes, p_1, p_2, . . . , p_n.
Consider the number
p_{n+1} = p_1 p_2 . . . p_n + 1.
Question. What are the prime factors of p_{n+1}?
Observe that p_1 ∤ p_{n+1}, because if p_1 was a divisor of p_{n+1}, then it would divide 1.
(Show in general that if p|a + b and p|a, then p|b)
In fact, none of the p_i will divide p_{n+1} for the same reason. Therefore, every prime factor of
p_{n+1} (and it has at least one, since p_{n+1} > 1) would be a new
prime number. This contradicts the hypothesis that p_1, . . . , p_n are the only primes.
□
This proof is due to Euclid and is often mistaken for an algorithm to produce prime
numbers, in the sense that if we take n prime numbers, take their product and add 1, we
get a new prime. This is not true, for
a. 2.3.7.43 + 1 = 1807 = 13 × 139.
It is also not true that if we take the product of first few primes and add 1 to it, we get a
new prime, for
b. 2.3.5.7.11.13 + 1 = 30031 = 59 × 509
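Both counterexamples can be reproduced by factoring the numbers directly. The Python sketch below uses trial division; the helper name `smallest_prime_factor` is ours.

```python
def smallest_prime_factor(n):
    """Smallest divisor of n that is greater than 1 (necessarily a prime), for n >= 2."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n


for primes in ([2, 3, 7, 43], [2, 3, 5, 7, 11, 13]):
    product = 1
    for p in primes:
        product *= p
    number = product + 1
    q = smallest_prime_factor(number)
    # Prints "1807 = 13 x 139" and then "30031 = 59 x 509".
    print(number, "=", q, "x", number // q)
```

In both cases the prime factors found are not in the starting list, which is exactly what the proof guarantees, even though the number itself is not prime.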
At the time of writing, the largest known prime is 2^{82,589,933} − 1 and it has 24,862,048 digits (source:
wikipedia). Observe that this number is of the form 2^n − 1. Prime numbers of this form
are called Mersenne primes, named after the French mathematician Marin Mersenne (1588–1648). Observe also that for 2^n − 1 to be a prime, n must itself be a prime, for
2^{ab} − 1 = (2^a − 1)(1 + 2^a + 2^{2a} + 2^{3a} + . . . + 2^{(b−1)a}).
Notice on the other hand that the converse implication is not true. Observe that 2^{11} − 1 = 2047 = 23 × 89 is
not a prime number.
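Both observations are easy to check for small exponents. The Python sketch below uses trial division, so it is only suitable for small n; the names `is_prime` and `mersenne_identity_holds` are ours.

```python
def is_prime(n):
    """Primality test by trial division; fine for small n only."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def mersenne_identity_holds(a, b):
    """Check 2^(ab) - 1 == (2^a - 1)(1 + 2^a + 2^(2a) + ... + 2^((b-1)a))."""
    return 2 ** (a * b) - 1 == (2 ** a - 1) * sum(2 ** (k * a) for k in range(b))


print(mersenne_identity_holds(3, 4))  # True
for n in range(2, 14):
    # Columns: n, is n prime?, 2^n - 1, is 2^n - 1 prime?
    print(n, is_prime(n), 2 ** n - 1, is_prime(2 ** n - 1))
```

In the table, every composite n gives a composite 2^n − 1, while n = 11 is a prime exponent for which 2^n − 1 = 2047 is not prime.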
Second Proof
The second proof uses the Riemann zeta function, defined as:
ζ(s) = Σ_{n=1}^{∞} 1/n^s = 1/1^s + 1/2^s + 1/3^s + . . .    (A)
Typically s is taken to be a complex number, but for our purpose we assume it is only real.
The series ζ(s) converges for s > 1.
We claim that ζ(s) takes the following product form:
ζ(s) = ∏_{p prime} 1/(1 − p^{−s})    (B)
(the product is taken over all prime numbers)
To see that the summation given in (A) and the product given in (B) are equal, we observe:
1/(1 − 2^{−s}) = 1 + 2^{−s} + (2^{−s})^2 + (2^{−s})^3 + . . .
1/(1 − 3^{−s}) = 1 + 3^{−s} + (3^{−s})^2 + (3^{−s})^3 + . . .
1/(1 − 5^{−s}) = 1 + 5^{−s} + (5^{−s})^2 + (5^{−s})^3 + . . .
. . .
Now, a typical term in the product of the terms above would look, for example, like
(2^{−s})^{i_1} (3^{−s})^{i_2} (5^{−s})^{i_3} . . . .
Observe that every term in the summation (A) will appear exactly once in the expansion
of the product (B), thanks to the Fundamental Theorem of Arithmetic. This proves the
identity:
Σ_{n=1}^{∞} 1/n^s = ∏_{p prime} 1/(1 − p^{−s}).
Put s = 1 in the above identity (more precisely, let s decrease to 1) and note that its left hand side becomes
1 + 1/2 + 1/3 + . . . ,
which is a divergent series while the right hand side would be finite if there were only finitely many
prime numbers. This implies, there must be infinitely many prime numbers.
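The identity between the sum (A) and the product (B) can be checked numerically for a value such as s = 2, and the growth of the sum at s = 1 can be observed as well. The Python sketch below truncates both sides; the cut-offs (100000 terms, primes up to 10000) and the function names are arbitrary choices of ours.

```python
def primes_up_to(limit):
    """All primes <= limit (limit >= 2), by the sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [n for n, flag in enumerate(sieve) if flag]


def zeta_partial_sum(s, terms):
    """First `terms` terms of the series (A)."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))


def euler_partial_product(s, prime_limit):
    """Product (B) restricted to the primes <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


print(zeta_partial_sum(2, 100000))       # both values are close to 1.6449
print(euler_partial_product(2, 10000))

# At s = 1 the partial sums keep growing as more terms are added.
for terms in (10, 1000, 100000):
    print(terms, zeta_partial_sum(1, terms))
```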
Lecture 10 and 11: Fermat’s Theorem and Euler’s Theorem
Fermat’s Theorem
Theorem. Let a ∈ N ∪ {0} be a number and p a prime number, then a^p ≡ a mod p.
(That is, (a^p − a) is divisible by p).
We will present two proofs of this result. The first one uses only a counting technique.
Proof 1. Suppose X is a set with a elements and S is the set of strings of
length p made up of elements in X.
(For example, for a = 2 and p = 3, we can take X = {A, B} and S = {AAA, AAB, ABB,
BBB, ABA, BBA, BAB, BAA})
Then, the cardinality of S will be a^p. (Verify!)
Let S′ be the set of strings made up of at least two distinct elements of X (that is, we remove from
S the a strings made up of just one element). Then, |S′| = a^p − a.
We will show that we can partition S′ into classes with exactly p elements. Once this is
done, the result follows, since |S′| will be p times the number of classes in the partition,
which is divisible by p.
We will define this partition using an equivalence relation in S′. Let ∼ be a relation in S′
defined as follows:
Two strings s1 ∼ s2 if s1 can be formed from s2 by cyclically shifting the elements while preserving
the order of the elements in s2 .
(For example, AAB ∼ BAA ∼ ABA and ABB ∼ BAB ∼ BBA.)
It is easy to see that ∼ is an equivalence relation. We claim that the equivalence classes
under ∼ have cardinality exactly p. This can be seen as follows. Observe that:
1. Every string in S′ has at least 2 distinct elements.
2. Suppose a string ‘in general’ has length r × m, formed of a string of length m repeated
r times (with m as small as possible), then the equivalence class of such a string has exactly m elements.
(For example, suppose s = ABBABBABBABB, made of 4 copies of the string ABB of
length 3. Then, the equivalence class of s will consist of ABBABBABBABB, BABBABBABBAB,
BBABBABBABBA.)
3. In our case, we have strings of length p. Since p is prime, p = r × m forces m = 1 or m = p;
the case m = 1 is excluded by observation 1, so m = p (the string is ‘repeated’ only once).
By the above observations, it is then clear that the equivalence class of each string has p
elements.
The result now follows from the fact that
|S′| = a^p − a = p × no. of equivalence classes under ∼.
□
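The counting argument can be reproduced by brute force for small a and p. The Python sketch below lists the non-constant strings, groups them by cyclic shift, and shows that every class has p elements when p is prime; the function name `rotation_classes` is ours.

```python
from itertools import product


def rotation_classes(alphabet, p):
    """Group the non-constant strings of length p over `alphabet` by cyclic shift."""
    classes = {}
    for letters in product(alphabet, repeat=p):
        s = "".join(letters)
        if len(set(s)) < 2:
            continue  # skip the strings made up of a single repeated letter
        shifts = {s[i:] + s[:i] for i in range(p)}
        classes[min(shifts)] = shifts  # min(shifts) labels the class
    return list(classes.values())


for cls in rotation_classes("AB", 3):
    print(sorted(cls))
# ['AAB', 'ABA', 'BAA']
# ['ABB', 'BAB', 'BBA']

# For a composite length the classes need not all have the same size.
print(sorted(len(cls) for cls in rotation_classes("AB", 4)))  # [2, 4, 4, 4]
```

For a = 2 and p = 3 there are two classes of three strings each, so 2^3 − 2 = 6 = 3 × 2. The length 4 example shows where primality is used: ABAB is a shorter block repeated, and its class has only 2 elements.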
Observe that if p | a^p − a, then p | a(a^{p−1} − 1). And so, if gcd(a, p) = 1, then p | a^{p−1} − 1. This
is the reason Fermat’s theorem is also written in the following form sometimes.
If gcd(a, p) = 1, then a^{p−1} ≡ 1 mod p.
Euler’s Theorem
Given a number m ∈ N, let ϕ(m) be the number of positive integers ≤ m that are coprime
with m. This gives us a map ϕ : N → N, and it is called Euler’s totient function. The
values of ϕ for the first few numbers are:
ϕ(1) = 1, ϕ(2) = 1, ϕ(3) = 2, ϕ(4) = 2, ϕ(5) = 4, ϕ(6) = 2, ϕ(9) = 6, . . .
Euler’s theorem states that:
If gcd(a, m) = 1, then a^{ϕ(m)} ≡ 1 mod m.
In particular, if p is a prime, then ϕ(p) = p − 1, and Fermat’s theorem follows from Euler’s
theorem.
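The totient function and Euler's theorem can be checked directly from the definitions for small m. The Python sketch below counts coprime residues one by one, which is slow but mirrors the definition; the names `phi` and `euler_holds` are ours.

```python
from math import gcd


def phi(m):
    """Number of integers k with 1 <= k <= m and gcd(k, m) == 1."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def euler_holds(m):
    """Check a^phi(m) ≡ 1 (mod m) for every a in 1..m coprime to m."""
    exponent = phi(m)
    return all(pow(a, exponent, m) == 1 % m
               for a in range(1, m + 1) if gcd(a, m) == 1)


print([phi(m) for m in range(1, 10)])               # [1, 1, 2, 2, 4, 2, 6, 4, 6]
print(all(euler_holds(m) for m in range(1, 200)))   # True
```

The comparison uses `1 % m` rather than `1` so that the trivial case m = 1, where every residue is 0, is handled as well.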
Proof 2 (of Fermat’s theorem). Recall that the binomial coefficient nC_r is given by the formula
n!/(r!(n − r)!). This is
the number of r-element sets in an n-element set. Also, the binomial coefficients can be
defined using the expansion of (x + y)^p, given by
(x + y)^p = Σ_{i+j=p} pC_i x^i y^j.
In particular for x = a and y = 1, we get
(a + 1)^p = a^p + pC_1 a^{p−1} + pC_2 a^{p−2} + · · · + pC_{p−1} a^1 + pC_p a^0.
We will prove Fermat’s theorem by induction on a.
For a = 0 and a = 1 the statement is clear, since 0^p = 0 and 1^p = 1. Assume that the theorem
holds for a, that is a^p ≡ a mod p. We will show that
(a + 1)^p ≡ (a + 1) mod p.
Since pC_r is divisible by p if r ≠ 0 and r ≠ p (verify!), every term of the expansion above
except the first and the last vanishes modulo p, and we get
(a + 1)^p ≡ a^p + 1 mod p.
By the induction hypothesis a^p ≡ a mod p, so
(a + 1)^p ≡ (a + 1) mod p.
□
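The divisibility claim left to verify in the proof, and the theorem itself, can be spot-checked numerically. The Python sketch below uses `math.comb` (available from Python 3.8); the helper names `binomials_mod` and `fermat_holds` are ours.

```python
from math import comb


def binomials_mod(p):
    """Residues of the binomial coefficients C(p, r) modulo p for r = 1, ..., p - 1."""
    return [comb(p, r) % p for r in range(1, p)]


def fermat_holds(p, bound=50):
    """Check a^p ≡ a (mod p) for all 0 <= a < bound."""
    return all(pow(a, p, p) == a % p for a in range(bound))


print(binomials_mod(7))   # [0, 0, 0, 0, 0, 0]
print(binomials_mod(6))   # [0, 3, 2, 3, 0], so the claim fails for a composite exponent
print(fermat_holds(7))    # True
print(fermat_holds(6))    # False
```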
Let us use Fermat’s theorem to solve the following exercise.
Exercise: Show that P(n) = n^5/5 + n^3/3 + 7n/15 is an integer for all n.
Sol. Multiply by 15 to get
15P(n) = 3n^5 + 5n^3 + 7n.
If we can prove that the expression on the right hand side is divisible by 15, then P (n)
must be an integer for every n. We will check if the expression is divisible by 3 and 5
separately.
So,
(3n^5 + 5n^3 + 7n) mod 3 ≡ (3n^5 + 5n + 7n) mod 3    (By Fermat’s Theorem n^3 ≡ n mod 3)
≡ (3n^5 + 12n) mod 3
≡ 0 mod 3
We can similarly show that the expression is also divisible by 5. This implies that P (n) is
an integer for every n.
□
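The conclusion can be confirmed with exact rational arithmetic. The Python sketch below assumes that P(n) = n^5/5 + n^3/3 + 7n/15, which is the reading consistent with 15P(n) = 3n^5 + 5n^3 + 7n in the solution; the function name `P` simply mirrors the exercise.

```python
from fractions import Fraction


def P(n):
    """P(n) = n^5/5 + n^3/3 + 7n/15, computed exactly (assumed form of the exercise)."""
    return Fraction(n ** 5, 5) + Fraction(n ** 3, 3) + Fraction(7 * n, 15)


for n in range(0, 6):
    value = P(n)
    print(n, value, value.denominator == 1)

# The numerator 3n^5 + 5n^3 + 7n is divisible by both 3 and 5, hence by 15.
print(all((3 * n ** 5 + 5 * n ** 3 + 7 * n) % 15 == 0 for n in range(-100, 101)))  # True
```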