Uploaded by Anesu Munhuweyi

descrete notes gsm

UNIVERSITY OF ZIMBABWE
DEPARTMENT OF MATHEMATICS
DISCRETE MATHS NOTES
Mr G.S Maridza
Chapter 1
Number systems
1.1
Number Systems
Mathematics has its own language with numbers as the alphabet. The language is given
structure with the aid of connective symbols, rules of operation, and a rigorous mode of
thought (logic). The number systems that we use in calculus are the natural numbers, the
integers, the rational numbers, and the real numbers. Let us describe each of these :
1. The natural numbers are the system of positive counting numbers 1, 2, 3 . . . . We
denote the set of all natural numbers by N.
N = {1, 2, 3, 4, 5, 6, 7, 8, . . . }.
2. The integers are the positive and negative whole numbers and zero, . . . , −3, −2, −1, 0, 1, 2, 3, . . . .
We denote the set of all integers by Z.
Z = {. . . , −4, −3, −2, −1, 0, 1, 2, 3, 4, . . . }.
3. The rational numbers are quotients of integers or fractions, such as 3/2, −5/4. Any
number of the form p/q, with p, q ∈ Z and q ≠ 0, is a rational number. We denote the
set of all rational numbers by Q.
Q = {p/q | p, q ∈ Z, q ≠ 0}.
4. The real numbers are the set of all decimals, both terminating and non-terminating.
We denote the set of all real numbers by R. A decimal number of the form x = 3.16792
is actually a rational number, for it represents
x = 3.16792 = 316792/100000.
A decimal number of the form
m = 4.27519191919 . . . ,
with a group of digits that repeats itself interminably, is also a rational number. To see
this, notice that
100 · m = 427.519191919 . . . .
Writing this above the expansion of m itself,
100m = 427.519191919 . . .
m = 4.27519191919 . . .
and subtracting, we see that
99m = 423.244
or
m = 423244/99000.
So, as we asserted, m is a rational number or quotient of integers. To indicate recurring
decimals we sometimes place dots over the repeating cycle of digits, e.g., m = 4.2751̇9̇,
19/6 = 3.16̇.
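The subtraction argument above works for any repeating decimal. As a minimal sketch (the function name repeating_to_fraction and its string arguments are choices made for this illustration, and only non-negative numbers are handled), it can be written in Python as:

```python
from fractions import Fraction


def repeating_to_fraction(integer_part, non_repeating, repeating):
    """Convert the decimal integer_part.non_repeating(repeating) to a Fraction.

    All three arguments are strings of digits; `repeating` is the cycle that
    recurs forever and must be non-empty.
    """
    k = len(non_repeating)   # digits after the point before the cycle starts
    r = len(repeating)       # length of the repeating cycle
    # 10**(k + r) * x and 10**k * x have the same digits after the point,
    # so their difference is an integer.
    shifted_far = int(integer_part + non_repeating + repeating)
    shifted_near = int(integer_part + non_repeating)
    return Fraction(shifted_far - shifted_near, 10 ** (k + r) - 10 ** k)


# m = 4.27519191919... has non-repeating digits 275 and cycle 19.
m = repeating_to_fraction("4", "275", "19")
```

Fraction reduces the result to lowest terms automatically, so m compares equal to 423244/99000 even though it is stored with a smaller denominator.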
Another kind of decimal number is one which has a non-terminating decimal expansion
that does not keep repeating. An example is π = 3.14159265 . . . . Such a number is
irrational, that is, it cannot be expressed as the quotient of two integers.
In summary : There are three types of real numbers : (i) terminating decimals, (ii)
non-terminating decimals that repeat, (iii) non-terminating decimals that do not
repeat. Types (i) and (ii) are rational numbers. Type (iii) are irrational numbers.
The geometric representation of real numbers as points on a line is called the real
axis. Between any two rational numbers on the line there are infinitely many rational
numbers. This leads us to call the set of rational numbers an everywhere dense set.
Real numbers are characterised by three fundamental properties :
(a) algebraic properties formalise the rules of calculation (addition, subtraction,
multiplication, division). Example : 2(3 + 5) = 2 · 3 + 2 · 5 = 6 + 10 = 16.
(b) order properties describe inequalities. Example : −3/4 < 1/3.
(c) completeness implies that there are “no gaps” on the real line.
Algebraic properties of the reals for addition (a, b, c ∈ R) are :
(A1) a + (b + c) = (a + b) + c. associativity
(A2) a + b = b + a. commutativity
(A3) There is a 0 such that a + 0 = a. identity
(A4) There is an x such that a + x = 0. inverse
Why these rules? They define an algebraic structure (commutative group). Now define
analogous algebraic properties for multiplication :
(M1) a(bc) = (ab)c.
(M2) ab = ba.
(M3) There is a 1 such that a · 1 = a.
(M4) There is an x such that ax = 1 for a ≠ 0.
Finally, connect multiplication and addition :
(D) a(b + c) = ab + ac. distributivity
These 9 rules define an algebraic structure called a field.
Order properties of the reals are :
(O1) for any a, b ∈ R, a ≤ b or b ≤ a. totality of ordering
(O2) if a ≤ b and b ≤ a, then a = b. antisymmetry
(O3) if a ≤ b and b ≤ c, then a ≤ c. transitivity
(O4) if a ≤ b, then a + c ≤ b + c. order under addition
(O5) if a ≤ b and c ≥ 0, then ac ≤ bc. order under multiplication
Some useful rules for calculations with inequalities are : If a, b, c are real numbers, then :
(a) if a < b and c < 0 ⇒ bc < ac.
(b) if a < b ⇒ −b < −a.
(c) if a > 0 ⇒ 1/a > 0.
(d) if a and b are both positive or both negative, then a < b ⇒ 1/b < 1/a.
The completeness property can be understood by the following construction of the
real numbers : Start with the counting numbers 1, 2, 3, . . . .
• N = {1, 2, 3, 4, . . . } natural numbers ⇒ Can we solve a + x = b for x?
• Z = {. . . , −2, −1, 0, 1, 2, . . . } integers ⇒ Can we solve ax = b for x?
• Q = {p/q | p, q ∈ Z, q ≠ 0} rational numbers ⇒ Can we solve x² = 2 for x?
• R real numbers, for example : The positive solution to the equation x² = 2 is
√2. This is an irrational number whose decimal representation is not eventually
repeating.
⇒ N ⊂ Z ⊂ Q ⊂ R
In summary, the real numbers R are complete in the sense that they correspond to all
points on the real line, i.e., there are no “holes” or “gaps”, whereas the rationals have
“holes” (namely the irrationals).
You Try It : What type of real number is 3.41287548754875 . . . ? Can you express
this number in more compact form?
Chapter 2
Basic Concepts of Set Theory
The purpose of this chapter is twofold: to provide an introduction to, or review of, the
terminology, notation, and basic properties of sets, and, perhaps more important, to serve
as a starting point for our primary goal — the development of the ability to discover and
prove mathematical theorems. The emphasis in this chapter is on discovery, with particular
attention paid to the kinds of evidence (e.g., specific examples, pictures) that mathematicians
use to formulate conjectures about general properties.
2.1
A brief history of sets
A set is an unordered collection of objects, and as such a set is determined by the objects
it contains. Before the 19th century it was uncommon to think of sets as completed objects
in their own right. Mathematicians were familiar with properties such as being a natural
number, or being irrational, but it was rare to think of, say, the collection of rational numbers
as itself an object. (There were exceptions. From Euclid mathematicians were used to
thinking of geometric objects such as lines and planes and spheres which we might today
identify with their sets of points.) In the mid 19th century there was a renaissance in Logic.
For thousands of years, since the time of Aristotle and before, learned individuals had been
familiar with syllogisms as patterns of legitimate reasoning, for example:
All men are mortal. Socrates is a man. Therefore Socrates is mortal.
But syllogisms involved descriptions of properties. The idea of pioneers such as Boole was
to assign a meaning as a set to these descriptions. For example, the two descriptions “is a
man” and “is a male homo sapiens” both describe the same set, the set of all men. It was
this objectification of meaning, understanding properties as sets, that led to a rebirth of
Logic and Mathematics in the 19th century. Cantor took the idea of set to a revolutionary
level, unveiling its true power. By inventing a notion of size of set he was able to compare different forms of infinity and, almost incidentally, to shortcut several traditional mathematical
arguments.
But the power of sets came at a price; it came with dangerous paradoxes. The work of Boole
and others suggested a programme exposited by Frege, and Russell and Whitehead, to build
a foundation for all of Mathematics on Logic. Though to be more accurate, they were really
reinventing Logic in the process, and regarding it as intimately bound up with a theory of
sets. The paradoxes of set theory were a real threat to the security of the foundations. But
with a lot of worry and care the paradoxes were sidestepped, first by Russell and Whitehead’s
theory of stratified types and then more elegantly in, for example, the influential work of
Zermelo and Fraenkel. The notion of set is now a cornerstone of Mathematics.
The formal development of set theory began in 1874 with the work of Georg Cantor (1845-1918). Since then, motivated particularly by the discovery of certain paradoxes (e.g., Russell’s paradox), logicians have made formal set theory and the foundations of mathematics
a vital area of mathematical research, and mathematicians at large have incorporated the
language and methods of set theory into their work, so that it permeates all of modern
mathematics.
2.2
Sets and Elements
The notion of set is a primitive, or undefined, term in mathematics, analogous to point and
line in plane geometry. Therefore, our starting point, rather than a formal definition, is an informal description of how the term “set” is generally viewed in applications to undergraduate
mathematics.
Similar (but informal) words : collection, group, aggregate, bundle, ensemble, family,
class.
Description : A set is a collection of objects which are called the members or elements
of that set. For example
1. The set of students in this room.
2. The English alphabet may be viewed as the set of letters of the English language.
3. The set of natural numbers, etc.
Sets can consist of elements of various nature : people, physical objects, numbers,
signs, other sets, etc.
A set is an ABSTRACT object; its members do not have to be physically collected together
for them to constitute a set. The membership criteria for a set must in principle be well-defined, and not vague. Sets can be finite or infinite.
2.3
Some Interesting Sets of Numbers
Let’s look at different types of numbers that we can have in our sets.
1. Natural Numbers
The set of natural numbers is {1, 2, 3, 4, . . . } and is denoted by N.
2. Integers
The set of integers is {. . . , −4, −3, −2, −1, 0, 1, 2, 3, 4, . . . } and is denoted by Z. The
Z symbol comes from the German word Zahlen, which means numbers. The
non-negative integers {0, 1, 2, 3, 4, . . . } are often denoted by Z+. All natural numbers
are integers.
3. Rational Numbers
The set of rational numbers is denoted by Q and consists of all fractional numbers, i.e.,
x ∈ Q if x can be written in the form p/q, where p, q ∈ Z with q ≠ 0.
4. Real Numbers
The real numbers are denoted by R.
5. Complex Numbers
The complex numbers are denoted by C.
2.4
Notation
1. A, B, C, . . . for sets.
2. a, b, c, . . . or x, y, z, . . . for members.
3. b ∈ A, if b belongs to A.
4. c ∉ A, if c does not belong to A.
5. ∅ is used for the empty set. There is exactly one set, the empty set or null set, which
has no members at all.
6. A set with only one member is called a singleton or singleton set, for example, {x}.
2.5
Specification of Sets
One advantage of having an informal definition of the term set is that, through it, we can
introduce some other terminology related to sets. The term element is one example, and the
notion well-defined is another. The latter term relates to the primary requirement for any
such description: Given an object, we must be able to determine whether or not the object
lies in the described set.
2.5.1
Three Ways to Specify a Set
1. Listing all its members (List Notation).
2. By stating a property of its elements (Predicate Notation).
3. By defining a set of rules which generates (defines) its members (Recursive Rules).
List Notation
This is suitable for finite sets. We list the names of the elements of the set, separated by commas, and
enclose them in braces. For example
A = {2, 3, 5},
B = {a, b, d, m},
C = {George Washington, Bill Clinton}.
Note that 2 ∈ A and George Washington ∈ C, but 8 ∉ A and k ∉ B.
Two important facts are: (i) the order in which elements are listed is irrelevant and (ii)
an object should be listed only once in the list, since listing it more than once does not
change the set. As an example, the set {1, 1, 2} is the same as the set {1, 2} (so that the
representation {1, 1, 2} is never used) which, in turn, is the same as the set {2, 1}.
Three Dot Abbreviation
For example, {1, 2, . . . , 100}.
Predicate Notation
We describe a set in terms of one or more properties to be satisfied by objects in the
set, and by those objects only. Such a description is formulated in so-called set-builder
notation, that is, in the form A = {x|x satisfies some property or properties}, which
we read “A is the set of all objects x such that x satisfies . . . .”. For example,
{x|x is a natural number and x < 8}.
Reading : the set of all x such that x is a natural number and is less than 8. For example,
(i) {x|x is a positive number} (ii) {x|x is a letter of the Russian alphabet}.
The general form is {x|P(x)} where P is some predicate (condition, property).
In all these examples the vertical line is read “such that” and the set is understood to consist
of all objects satisfying the preceding description, and only those objects.
Recursive Rules
For example, the set E of even numbers greater than 3,
(a) 4 ∈ E (b) if x ∈ E, then x + 2 ∈ E (c) nothing else belongs to E. The first rule is
the basis of the recursion, the second one generates new elements from the elements defined
before and the third rule restricts the defined set to the elements generated by (a) and (b).
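The three ways of specifying a set map naturally onto Python sets. The sketch below is only an illustration: range(1, 100) is an assumed finite stand-in for N, and the helper even_numbers_from_four (a name invented here) stops at a chosen limit because a program cannot hold the whole infinite set E.

```python
# List notation: write the members out explicitly.
A = {2, 3, 5}

# Predicate notation: {x | x is a natural number and x < 8}.
# A finite pool is needed to draw x from, so range(1, 100) stands in for N.
B = {x for x in range(1, 100) if x < 8}


def even_numbers_from_four(limit):
    """Return the members of E that do not exceed limit, using rules (a)-(c)."""
    members = set()
    x = 4                  # rule (a): 4 belongs to E
    while x <= limit:
        members.add(x)
        x = x + 2          # rule (b): if x is in E, then x + 2 is in E
    return members         # rule (c): nothing else was ever added


E_up_to_12 = even_numbers_from_four(12)

# Order and repetition do not matter: these are the same set.
same_set = {1, 1, 2} == {2, 1}
```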
The collection, out of which all sets under consideration may be formed, is called the universe of discourse or universal set, denoted by U. For our purposes a universal set is
the set of all objects under discussion in a particular setting. A universal set will often be
specified at the start of a problem involving sets, whereas in other situations a universal set
is more or less clearly, but implicitly, understood as background to a problem. The role then
of a universal set is to put some bounds on the nature of the objects that can be considered
for membership in the sets involved in a given situation.
It is in connection with the description method that “well definedness” comes into play.
The rule or rules used in describing a set must be (i) meaningful, that is, use words and/or
symbols with an understood meaning and (ii) specific and definitive, as opposed to vague
and indefinite. Thus descriptions like G = {x|x is a goople} or E = {x|x % & 3} or
Z = {x|x is a large state in the United States} do not define sets. The descriptions
of G and E involve nonsense symbols or words, while the description of Z gives a purely
subjective criterion for membership.
2.6
The Empty set (Null Set)
We have that the fundamental property of a set is that we can assert of each object whether
or not it is a member of the set.
Consider a set constructed by asserting of each object that it is not a member of the set.
This set has no members and is therefore called the empty set.
Definition 2.6.1. The null or empty set is the set that does not contain any elements,
denoted by the Scandinavian letter ∅ = {} = {x|x ≠ x}.
Example 2.6.1. (i) {x ∈ R|x² = −1}
(ii) {x ∈ Z|x² = 2}.
Theorem 2.6.1. There is exactly one empty set.
2.7
Identity and Cardinality
Two sets are identical if and only if (iff) they have the same elements or both are empty.
So A = B iff, for every x, x ∈ A ⇔ x ∈ B.
Example 2.7.1. {0, 2, 4} = {x|x is an even non-negative integer less than 5}.
As the above example shows, equality of sets does not mean they have identical definitions;
there are often many different ways of describing the same set. The definition of equality
reflects rather the fact that a set is just a collection of objects.
If we have to prove that the sets A and B are equal, it is often quite difficult to prove in one
go that they have the same elements. What is usually done is to split the proof into two
parts:
(a) Show that every member of A is a member of B.
(b) Show that every member of B is a member of A.
The number of elements in a set A is called the cardinality of A, denoted by |A|. The
cardinality of a finite set is a natural number. Infinite sets also have cardinalities but they
are not natural numbers. The set A is said to be countable or enumerable if there is a
way to list the elements of A.
2.8
Russell’s Paradox (antinomy)
A paradox (antinomy) is an apparently true statement that seems to lead to a logical
self-contradiction.
It is important to note that a given property P(x) does not necessarily determine a set, i.e.,
we cannot say that, given an arbitrary property P, there corresponds a set whose elements
satisfy the property P.
Consider the following. There was once a barber. Wherever he lived, all of the men in this
town either shaved themselves or were shaved by the barber. And this barber only shaved
the men who did not shave themselves. Did the barber shave himself?
Let’s say that he did shave himself. But he shaved only the men in town who did
not shave themselves, therefore he did not shave himself.
Now say that he did not shave himself. Every man in town either shaved himself or was
shaved by the barber, so he was shaved by the barber, that is, he did shave himself.
Either way we have a contradiction.
Russell observed that if S is a set, then either S ∈ S or S ∉ S, since a given object is either
a member of a given set or is not a member of that set. Consider the set of all sets that are
not members of themselves, R = {x|x is a set and x ∉ x}. Since R is an object, either R ∈ R
or R ∉ R.
(i) Assume R ∈ R, then R is a set, and R ∉ R by definition of R.
(ii) Assume R ∉ R, then R ∈ R by definition of R, since we are assuming R is a set. But
we cannot have both R ∈ R and R ∉ R, so we reach a contradiction.
In both cases we have inferred the paradox that R ∈ R iff R ∉ R. In other words, the
assumption that R is a set has led to a contradiction, and therefore there is no such thing
as the set of all sets that are not members of themselves. To avoid unnecessary paradoxes, we assume the existence of the
universal set, U. All this leads to the following problems
1. There are things that are true in mathematics (based on assumptions).
2. There are things that are false.
3. There are things that are true that can never be proved.
4. There are things that are false that can never be disproved.
After this paradox was described, set theory had to be reformulated axiomatically as
axiomatic set theory.
2.9
Inclusion
Definition 2.9.1. Fix a universal set U, and let A and B be sets (with all members in U).
We write A ⊆ B or B ⊇ A iff, for all x ∈ U, x ∈ A ⟹ x ∈ B. (⊆ is the set inclusion
symbol.)
A set A is a subset of a set B iff every element of A is also an element of B. If A ⊆ B and
A ≠ B, we call A a proper subset of B and write A ⊂ B.
Theorem 2.9.1. If A ⊆ B and B ⊆ C then A ⊆ C.
Proof. Let x ∈ A, then since A ⊆ B, we have x ∈ B and given that B ⊆ C, we conclude
that x ∈ C, thus A ⊆ C.
Example 2.9.1. (i) {a, b} ⊆ {d, a, b, e} (ii) {a, b} ⊆ {a, b} (iii) {a, b} ⊂ {d, a, b, e}
(iv) {a, b} ⊄ {a, b}.
Note that the empty set is a subset of every set, ∅ ⊆ A, for every set A and that for any
set A, we have A ⊆ A.
2.10
Axiom of Extensionality
Theorem 2.10.1. For any two sets A and B, A = B ⇐⇒ A ⊆ B and B ⊆ A.
2.10.1
Power Sets
The set of all subsets of A is called the power set of A and is denoted by P(A). If A is
finite, then |P(A)| = 2^|A|.
Example 2.10.1. If A = {a, b}, then P(A) = {∅, {a}, {b}, {a, b}}.
From the above example, a ∈ A, {a} ⊆ A, {a} ∈ P(A), ∅ ⊆ A, ∅ ∉ A, ∅ ⊆ P(A), ∅ ∈ P(A).
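For a finite set the power set can be generated mechanically by collecting the subsets of each size. The following is a minimal sketch using itertools.combinations; the function name power_set is chosen for this illustration, and frozenset is used because Python sets can only contain hashable (immutable) members.

```python
from itertools import combinations


def power_set(a):
    """Return the set of all subsets of the finite set a, each as a frozenset."""
    elements = list(a)
    return {
        frozenset(subset)
        for size in range(len(elements) + 1)      # subset sizes 0, 1, ..., |a|
        for subset in combinations(elements, size)
    }


A = {"a", "b"}
P = power_set(A)   # contains the empty set, {a}, {b} and {a, b}
```

Counting the members of P for a few small sets gives 1, 2, 4, 8, . . . , in line with the formula for the cardinality of the power set of a finite set.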
2.11
Operations on Sets
Just as there is an “algebra of numbers” based on operations such as addition and multiplication, there is also an algebra of sets based on several fundamental operations of set theory.
We develop properties of set algebra later in this chapter; for now our goal is to introduce
the operations by which we are able to combine sets to get another set, just as in arithmetic
we add or multiply numbers to get a number.
2.11.1
Union and Intersection
Let A and B be arbitrary sets. The union of A and B, written A ∪ B, is the set whose
elements are just the elements of A or B or both.
A ∪ B := {x|x ∈ A or x ∈ B}.
Example 2.11.1. Let K = {a, b}, L = {c, d}, M = {b, d}, then K ∪ L = {a, b, c, d},
K ∪ M = {a, b, d}, L ∪ M = {b, c, d}, (K ∪ L) ∪ M = K ∪ (L ∪ M ) = {a, b, c, d}, K ∪ K = K,
K ∪ ∅ = ∅ ∪ K = K = {a, b}.
The intersection of A and B, written A ∩ B, is the set whose elements are just the elements
of both A and B.
A ∩ B := {x|x ∈ A and x ∈ B}.
Example 2.11.2. K ∩ L = ∅, K ∩ M = {b}, L ∩ M = {d}, (K ∩ L) ∩ M = K ∩ (L ∩ M) = ∅,
K ∩ K = K, K ∩ ∅ = ∅ ∩ K = ∅.
Observe also that the sets that result from the operation of union tend to be relatively large,
whereas those obtained through intersection are relatively small.
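Examples 2.11.1 and 2.11.2 can be reproduced with Python's built-in set type, where | is union and & is intersection. This is only an illustrative sketch; the variable names on the left are invented here.

```python
K = {"a", "b"}
L = {"c", "d"}
M = {"b", "d"}

# Unions (Example 2.11.1)
K_union_L = K | L
K_union_M = K | M
L_union_M = L | M

# Intersections (Example 2.11.2)
K_meet_L = K & L          # K and L are disjoint
K_meet_M = K & M
L_meet_M = L & M

# The empty set is the identity for union and absorbs under intersection.
empty = set()
union_with_empty = K | empty
meet_with_empty = K & empty
```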
2.12
Properties of ∪ and ∩
1. Every element x in A ∩ B belongs to both A and B, hence x belongs to A and x belongs
to B. Thus A ∩ B is a subset of A and of B i.e.,
A ∩ B ⊆ A and A ∩ B ⊆ B.
2. An element x belongs to the union A ∪ B if x belongs to A or x belongs to B, hence
every element in A belongs to A ∪ B and every element in B belongs to A ∪ B, i.e.,
A ⊆ A ∪ B and B ⊆ A ∪ B.
Theorem 2.12.1. For any sets A and B, we have (i) A ∩ B ⊆ A ⊆ A ∪ B and
(ii) A ∩ B ⊆ B ⊆ A ∪ B.
Theorem 2.12.2. The following are equivalent, A ⊆ B, A ∩ B = A, A ∪ B = B.
Proof. Suppose A ⊆ B and let x ∈ A. Then x ∈ B, hence x ∈ A ∩ B and so A ⊆ A ∩ B. Since
A ∩ B ⊆ A always holds, A ∩ B = A. Conversely, suppose A ∩ B = A and let x ∈ A. Then x ∈ (A ∩ B),
hence x ∈ A and x ∈ B. Therefore A ⊆ B.
Suppose again that A ⊆ B and let x ∈ (A ∪ B). Then x ∈ A or x ∈ B. If x ∈ A, then
x ∈ B because A ⊆ B. In either case x ∈ B. Therefore A ∪ B ⊆ B. But B ⊆ A ∪ B always holds.
Therefore A ∪ B = B. Conversely, suppose A ∪ B = B and let x ∈ A. Then x ∈ (A ∪ B), hence
x ∈ B because A ∪ B = B. Therefore A ⊆ B.
Definition 2.12.1. Two sets A and B are called disjoint sets if the intersection of A and
B is the null set i.e., A ∩ B = ∅.
2.13
Difference and Complement
Definition 2.13.1. The difference A minus B, written A \ B or A − B, is the set obtained by removing from A all elements
which are in B (also called the relative complement, or the complement of B relative to A).
A − B := {x|x ∈ A and x ∉ B}.
Example 2.13.1. K − L = {a, b}, K − K = ∅, K − M = {a}, K − ∅ = K, L − M = {c},
∅ − K = ∅.
2.13.1
Symmetric Difference
Definition 2.13.2. A △ B = A ⊕ B := {x|x ∈ A or x ∈ B but not in both} or
A △ B = A ⊕ B := (A ∪ B) \ (A ∩ B) = (A \ B) ∪ (B \ A).
The operation, complement, is unary rather than binary; we obtain a resultant set from a
single given set rather than from two such sets. The role of the universal set is so important
in calculating complements that we mention it explicitly in the following definition. The
complement of a set A, is the set of elements which do not belong to A, i.e., the difference
of the universal set U and A. Denote the complement of A by A′ or A^c.
A′ = {x|x ∈ U and x ∉ A} or A′ = U − A.
The complement of a set consists of all objects in the universe at hand that are not in the
given set. Clearly the complement of A is very much dependent on the universal set, as well
as on A itself. If A = {1}, then A′ is one thing if U = N, something quite different if U = R.
Example 2.13.2. Let U = N and E = {2, 4, 6, . . . }, the set of all even numbers. Then E^c = {1, 3, 5, . . . },
the set of odd numbers.
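Difference, symmetric difference and complement also have direct counterparts in Python (the operators - and ^). In the sketch below the universal set is an assumption made for illustration: U = {1, . . . , 10} stands in for N, and the helper name complement is invented here. Python has no built-in complement because, as noted above, the result depends on the universal set.

```python
K = {"a", "b"}
L = {"c", "d"}
M = {"b", "d"}

K_minus_M = K - M        # elements of K that are not in M
L_minus_M = L - M
K_sym_M = K ^ M          # symmetric difference: in K or M but not in both


def complement(a, universe):
    """Complement of a relative to an explicitly given universal set."""
    return universe - a


U = set(range(1, 11))                 # finite stand-in for N
E = {x for x in U if x % 2 == 0}      # even numbers in U
E_complement = complement(E, U)       # odd numbers in U
```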
2.14
Venn Diagrams
A simple and instructive way of illustrating the relationship between sets is the use of the
so called Venn-Euler diagrams or simply Venn diagrams.
[Figure: Venn diagrams of two sets A and B, with panels A ∩ B, A ∪ B, A − B and B − A.]
2.15
Set Theoretic Equalities
1. Idempotent Laws (i) X ∪ X = X (ii) X ∩ X = X.
2. Commutative Laws (i) X ∪ Y = Y ∪ X (ii) X ∩ Y = Y ∩ X.
3. Associative Laws (i) (X ∪ Y) ∪ Z = X ∪ (Y ∪ Z) (ii) (X ∩ Y) ∩ Z = X ∩ (Y ∩ Z).
4. Distributive Laws (i) X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z) (ii) X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z).
5. Identity Laws (i) X ∪ ∅ = X (ii) X ∪ U = U (iii) X ∩ ∅ = ∅ (iv) X ∩ U = X.
6. Complement Laws (i) X ∪ X^c = U (ii) (X^c)^c = X (iii) X ∩ X^c = ∅ (iv) X − Y = X ∩ Y^c.
7. De Morgan’s Laws (i) (X ∪ Y)^c = X^c ∩ Y^c (ii) (X ∩ Y)^c = X^c ∪ Y^c.
8. Consistency Principle (i) X ⊆ Y iff X ∪ Y = Y (ii) X ⊆ Y iff X ∩ Y = X.
Example 2.15.1.
1. Show that (A^c)^c = A.
Proof. We need to show that A ⊆ (A^c)^c and (A^c)^c ⊆ A. Let x ∈ A, then x ∉ A^c. If
x ∉ A^c, then x ∈ (A^c)^c. By definition of subsets A ⊆ (A^c)^c.
We want to show that (A^c)^c ⊆ A. Let y ∈ (A^c)^c, then y ∉ A^c. If y ∉ A^c, then y ∈ A.
We have shown that y ∈ (A^c)^c ⟹ y ∈ A. Thus (A^c)^c ⊆ A. By equality of sets
(A^c)^c = A.
2. Show that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
Proof. Let D = A ∩ (B ∪ C) and E = (A ∩ B) ∪ (A ∩ C). We have to prove first that
D ⊆ E. Let x ∈ D, then x ∈ A and x ∈ (B ∪ C). Since x ∈ (B ∪ C), either x ∈ B
or x ∈ C or both. In case x ∈ B we have x ∈ A and x ∈ B, so x ∈ (A ∩ B). On the
other hand, if x ∉ B, then we must have x ∈ C, so x ∈ (A ∩ C). Taking these two cases
together, x ∈ (A ∩ B) or x ∈ (A ∩ C), so x ∈ E.
Now, we prove that E ⊆ D. Let x ∈ E. Suppose first that x ∈ (A ∩ B), then x ∈ A
and x ∈ B, so x ∈ A and x ∈ (B ∪ C), so x ∈ D. On the other hand, if x ∉ (A ∩ B),
then x ∈ (A ∩ C) so again we obtain x ∈ A and x ∈ (B ∪ C), giving x ∈ D. Hence
E ⊆ D. Hence both D ⊆ E and E ⊆ D and we conclude that D = E and consequently
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
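Before attempting a proof of a set identity it can help to test it on examples, in the spirit of the emphasis on discovery in this chapter. The sketch below checks the double complement, De Morgan and distributive laws for every choice of subsets of a small universal set. The function names subsets and laws_hold are invented for this illustration, and an exhaustive check on one small universe is evidence for the laws, not a proof of them.

```python
from itertools import combinations


def subsets(universe):
    """Yield every subset of the finite set universe as a Python set."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


def laws_hold(universe):
    """Return True if the listed laws hold for all subsets X, Y, Z of universe."""
    all_subsets = list(subsets(universe))
    for x in all_subsets:
        xc = universe - x                       # complement of X
        if universe - xc != x:                  # (X^c)^c = X
            return False
        for y in all_subsets:
            yc = universe - y
            if universe - (x | y) != xc & yc:   # De Morgan (i)
                return False
            if universe - (x & y) != xc | yc:   # De Morgan (ii)
                return False
            for z in all_subsets:
                if x & (y | z) != (x & y) | (x & z):   # distributive (ii)
                    return False
                if x | (y & z) != (x | y) & (x | z):   # distributive (i)
                    return False
    return True


checked = laws_hold({1, 2, 3})   # 8 subsets, so 512 triples (X, Y, Z)
```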
2.16
Counting Elements in Sets
If A and B are disjoint sets, then
|A ∪ B| = |A| + |B|,
otherwise
|A ∪ B| = |A| + |B| − |A ∩ B|.
Example 2.16.1. Let A = {a, b, c, d, e} and B = {d, e, f, g, h, i}, so that A∪B = {a, b, c, d, e, f, g, h, i}
and A ∩ B = {d, e}. Since |A| = 5, |B| = 6, |A ∪ B| = 9, |A ∩ B| = 2, we have
|A ∪ B| = |A| + |B| − |A ∩ B| = 5 + 6 − 2 = 9.
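The counting formula in Example 2.16.1 can be confirmed directly, since len gives the cardinality of a finite Python set. This is a small illustrative check rather than part of the original notes.

```python
A = {"a", "b", "c", "d", "e"}
B = {"d", "e", "f", "g", "h", "i"}

size_of_union = len(A | B)
by_formula = len(A) + len(B) - len(A & B)   # |A| + |B| - |A ∩ B|

# For disjoint sets the correction term vanishes.
C = {"x", "y"}
disjoint_case = len(A | C) == len(A) + len(C)
```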
2.17
The Algebra of Sets
We have considered the problem of showing that two sets are the same, however this technique
becomes tedious should the expressions involved be at all complicated. We shall develop an
algebra of sets, to assist us in simplifying a given expression. The following basic laws are
easily established.
Law 1 : (A^c)^c = A Law 2 : A ∪ B = B ∪ A Law 3 : A ∩ B = B ∩ A
Law 4 : A ∪ (B ∪ C) = (A ∪ B) ∪ C Law 5 : A ∩ (B ∩ C) = (A ∩ B) ∩ C
Law 6 : A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) Law 7 : A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
Law 8 : (A ∪ B)^c = A^c ∩ B^c Law 9 : (A ∩ B)^c = A^c ∪ B^c Law 10 : U^c = ∅
Law 11 : ∅^c = U Law 12 : A ∪ ∅ = A Law 13 : A ∪ U = U Law 14 : A ∩ U = A
Law 15 : A ∩ ∅ = ∅ Law 16 : A ∪ A^c = U Law 17 : A ∩ A^c = ∅.
Example 2.17.1. By using the algebra of sets, show that A ∪ (B ∩ A^c) = A ∪ B.
Proof.
A ∪ (B ∩ A^c) = (A ∪ B) ∩ (A ∪ A^c) by Law 6
= (A ∪ B) ∩ U by Law 16
= A ∪ B by Law 14.
2.18
Set Products
2.18.1
Ordered Pairs
Definition 2.18.1. Let n be any natural number and let a1, a2, . . . , an be any objects. Then
(a1, a2, . . . , an) denotes the ordered n-tuple with first term a1, second term a2, . . . and nth
term an.
Example 2.18.1. (5, 7) denotes the ordered pair whose first term is 5 and second term 7.
Note that (5, 7, 2) is called an ordered triple, (5, 7, 2, 4) is called an ordered 4-tuple.
The idea of a product of sets can be extended to any finite number of sets. For any
sets A1, A2, . . . , An, the set of all ordered n-tuples (a1, a2, . . . , an) where a1 ∈ A1, a2 ∈
A2, . . . , an ∈ An is called the product of sets A1, A2, . . . , An and is denoted
A1 × A2 × · · · × An or ∏_{i=1}^{n} A_i.
The fundamental statement we can make about an ordered n-tuple is that a given object is
the ith term of an ordered n-tuple.
Definition 2.18.2. Let A and B be any non-empty sets, then
A × B := {(a, b)|a ∈ A and b ∈ B}.
If A and B are both finite sets, then |A × B| = |A| · |B|. If A = B, we sometimes write A²
for A × A.
Example 2.18.2. 1. If A = {1, 2} and B = {2, 3, 4}, then A × B = {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4)}
and B × A = {(2, 1), (2, 2), (3, 1), (3, 2), (4, 1), (4, 2)}.
Notice that A × B ≠ B × A, in general.
2. The Cartesian product R × R = R² is the set of all ordered pairs of real numbers and
this represents the 2-dimensional Cartesian plane.
3. (s1, t1) = (s2, t2) if and only if s1 = s2 and t1 = t2.
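The product of finite sets can be computed with itertools.product, which yields ordered tuples. The sketch below reproduces part 1 of Example 2.18.2; the variable names AxB and BxA are chosen for this illustration.

```python
from itertools import product

A = {1, 2}
B = {2, 3, 4}

AxB = set(product(A, B))   # all pairs (a, b) with a in A and b in B
BxA = set(product(B, A))   # all pairs (b, a) with b in B and a in A

# Ordered pairs are tuples, so order matters: (1, 2) and (2, 1) differ.
pairs_differ = (1, 2) != (2, 1)
```

Both products have |A| · |B| = 6 members, yet they are different sets because their members are ordered pairs.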
2.19
Theorems on Set Products
Let A, B, C, D be sets, then
1. A × (B ∪ C) = (A × B) ∪ (A × C).
2. A × (B ∩ C) = (A × B) ∩ (A × C).
3. (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).
4. (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).
5. (A − B) × C = (A × C) − (B × C).
6. If A and B are non-empty sets, then A × B = B × A if and only if A = B.
7. If A1 ∈ P(A) and B1 ∈ P(B), then A1 × B1 ∈ P(A × B).
Example 2.19.1. Prove that (A ∪ B) × C = (A × C) ∪ (B × C).
Proof. Consider any element (u, v) ∈ (A ∪ B) × C. By definition u ∈ (A ∪ B) and v ∈ C.
Thus u ∈ A or u ∈ B. If u ∈ A, then (u, v) ∈ (A × C) and if u ∈ B, then (u, v) ∈ (B × C).
Thus (u, v) is in A × C or in B × C and therefore (u, v) ∈ (A × C) ∪ (B × C). This proves
that (A ∪ B) × C ⊆ (A × C) ∪ (B × C).
Now consider any element (u, v) ∈ (A × C) ∪ (B × C). This implies that (u, v) ∈ (A × C)
or (u, v) ∈ (B × C). In the first case u ∈ A and v ∈ C and in the second case u ∈ B
and v ∈ C. Thus u ∈ (A ∪ B) and v ∈ C which implies (u, v) ∈ (A ∪ B) × C. Therefore
(A × C) ∪ (B × C) ⊆ (A ∪ B) × C. Hence (A ∪ B) × C = (A × C) ∪ (B × C).
Chapter 3
Relations and Functions
3.1
Relations
In natural language, relations are a kind of link existing between objects. For example,
mother of, neighbour of, part of, is older than, is an ancestor of, etc. In mathematics
there are endless ways that two entities can be related to each other. Consider the following
mathematical statements.
5 < 10, 5 ≤ 5, 6 = 30/5, 5 | 80, x ≠ y, 6 ∈ Z, X ⊆ Y, π ≈ 3.14.
In each case two entities appear on either side of a symbol, and we interpret the symbol as
expressing some relationship between the two entities. Symbols such as <, ≤, =, |, ≥, >, ∈,
etc. are called relations because they convey a relationship among things. Given a set A, a
relation on A is some property that is either true or false, for any ordered pair (x, y) ∈ A×A.
Example 3.1.1. Let A = {eggs, milk, corn} and B = {cows, goats, hens}. We can define
a relation R from A to B by (a, b) ∈ R if a is produced by b. In other words
R = {(eggs, hens), (milk, cows), (milk, goats)}.
With respect to this relation eggs R hens, milk R cows and so on.
Example 3.1.2. “greater than” is a relation on Z, denoted by >. It is true for the pair
(3, 2) but false for the pairs (2, 2) and (2, 3).
Definition 3.1.1. Given sets A and B, a relation R between A and B is a subset of A × B
i.e., R ⊆ A × B.
A binary relation is a set of ordered pairs. Any subset of A × A is called a relation on
A. Since a relation R on A is a subset of A × A, it is an element of the power set of A × A
i.e., R ∈ P(A × A). All the following expressions mean the same thing :
1. x bears relationship R to y.
2. x and y are in the R relationship.
3. (x, y) ∈ R, usually written xRy or x ∼ y.
Example 3.1.3. 1. Let A be the set of people and B the set of dogs. Define a relation R
from A to B by aRb if and only if a owns b.
2. Let X = Y. Equality is a relation; we say xRy if x = y.
3. Let X = Y = R. Then ≤, <, ≥, > are all relations between R and R.
4. Let X = Y = Z. Then divisibility is a relation between Z and Z, we say xRy if x|y.
3.1.1
Domain and Range
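If R is a relation from A to B, the domain of R consists of those elements of A that are related to some element of B, and the range of R consists of those elements of B to which some element of A is related, i.e.,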
domR = {a ∈ A|there exists some b ∈ B such that (a, b) ∈ R},
and
ranR = {b ∈ B|there exists some a ∈ A such that (a, b) ∈ R}.
fldR = domR ∪ ranR is called the field of R. Observe that domR ⊆ A and ranR ⊆ B, so when R is a
relation on A (that is, B = A), domR, ranR and fldR are all subsets of A.
Example 3.1.4. Let A = {1, 2, 3, 4, 5, 6} and define R by xRy if and only if x < y and
x divides y. So R = {(1, 2), (1, 3), . . . , (1, 6), (2, 4), (2, 6), (3, 6)}. So domR = {1, 2, 3},
ranR = {2, 3, 4, 5, 6} and fldR = A.
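Example 3.1.4 can be reproduced by representing the relation as a Python set of ordered pairs. The sketch below is illustrative only; the names dom, ran and fld mirror the notation above.

```python
A = {1, 2, 3, 4, 5, 6}

# xRy if and only if x < y and x divides y.
R = {(x, y) for x in A for y in A if x < y and y % x == 0}

dom = {x for x, _ in R}    # first terms of the pairs in R
ran = {y for _, y in R}    # second terms of the pairs in R
fld = dom | ran            # field of R
```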
3.1.2
Inverse Relations
Every relation R from A to B has an inverse relation R⁻¹ from B to A, which is defined by
R⁻¹ = {(b, a)|(a, b) ∈ R}.
Thus bR⁻¹a if and only if aRb.
Example 3.1.5. Let A = {1, 2, 3} and B = {a, b}. Then R = {(1, a), (1, b), (3, a)} is a
relation from A to B. The inverse relation is R⁻¹ = {(a, 1), (b, 1), (a, 3)}.
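With a relation stored as a set of pairs, the inverse relation is obtained by swapping the two terms of every pair. A minimal sketch follows, with the function name inverse chosen for this illustration.

```python
def inverse(relation):
    """Return the inverse relation: every pair (a, b) becomes (b, a)."""
    return {(b, a) for a, b in relation}


R = {(1, "a"), (1, "b"), (3, "a")}
R_inv = inverse(R)
```

Applying inverse twice returns the original relation.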
Relations can be represented using arrow diagrams or mappings. Venn diagrams and
arrows can be used for representing relations between given sets.
Example 3.1.6. If A = {a, b, c, d} and B = {1, 2, 3, 4} and R = {(a, 1), (b, 1), (c, 2), (c, 3)}
is a relation from A to B. Draw a Venn diagram to demonstrate the relation.
3.1.3
Matrix of a Relation
The matrix of a relation R from A to B has its rows labelled with the elements of A and its
columns labelled with the elements of B. If a ∈ A and b ∈ B, we write 1 in row a and column b
if aRb; otherwise we write 0. From the example above, R = {(a, 1), (b, 1), (c, 2), (c, 3)} has
the following matrix:
     1  2  3  4
a    1  0  0  0
b    1  0  0  0
c    0  1  1  0
d    0  0  0  0
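The matrix of a relation can be built directly from the rule "1 if aRb, otherwise 0". The sketch below assumes the rows are ordered a, b, c, d and the columns 1, 2, 3, 4; since d is related to no element of B, its row consists of zeros.

```python
# Relation R from A = {a, b, c, d} to B = {1, 2, 3, 4} (Example 3.1.6).
A = ["a", "b", "c", "d"]      # row labels
B = [1, 2, 3, 4]              # column labels
R = {("a", 1), ("b", 1), ("c", 2), ("c", 3)}

# Entry in row a, column b is 1 if aRb and 0 otherwise.
matrix = [[1 if (a, b) in R else 0 for b in B] for a in A]

for label, row in zip(A, matrix):
    print(label, row)
```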
3.2
Kinds of Relations
3.2.1
Reflexive Relations
Definition 3.2.1. A relation R on a set A is called reflexive if, for all a ∈ A, aRa. More
concisely, for all a ∈ A, (a, a) ∈ R.
In a reflexive relation every element is related to itself. For example, the relation of
equality = is reflexive, since a = a for all numbers a ∈ R. The relation ≤ is also reflexive
(a ≤ a for any a ∈ R).
Example 3.2.1. Consider the following five relations on the set A = {1, 2, 3, 4}:
R1 = {(1, 1), (1, 2), (2, 3), (1, 3), (4, 4)}
R2 = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (4, 4)}
R3 = {(1, 3), (2, 1)}
R4 = ∅, the empty relation
R5 = A × A, the universal relation
Determine which of these relations are reflexive.
Since A contains the four elements 1, 2, 3 and 4, a relation R on A is reflexive if it contains
the four pairs (1, 1), (2, 2), (3, 3) and (4, 4). Only R2 and R5 are reflexive. Note that R1 , R3
and R4 are not reflexive, since, for example, (2, 2) does not belong to any of them.
Example 3.2.2. Consider the following five relations
(1) Relation ≤ on the set Z of integers.
(2) Set inclusion ⊆ on a collection C of sets.
(3) Relation ⊥ (perpendicular) on the set L of lines in the plane.
(4) Relation ∥ (parallel) on the set L of lines in the plane.
(5) Relation | of divisibility on the set N of positive integers.
Determine which of the relations are reflexive.
The relation (3) is not reflexive since no line is perpendicular to itself. Also (4) is not
reflexive under the convention, used here, that parallel lines are distinct, since then no line
is parallel to itself. The other relations are reflexive, that is, x ≤ x
for every x ∈ Z, A ⊆ A for any set A ∈ C and n | n for every positive integer n in N.
Example 3.2.3. Let V = {1, 2, 3, 4} and R = {(1, 1), (2, 4), (4, 4)}. Then R is not a reflexive
relation, since (2, 2) does not belong to R.
One should note that all ordered pairs (a, a) must belong to R in order for R to be
reflexive.
3.2.2
Symmetric Relations
Definition 3.2.2. A relation R on a set A is called symmetric if, for all a, b ∈ A, aRb
implies bRa.
For example, = is symmetric, since if x = y then y = x also. Neither ≤ nor < is symmetric
(2 ≤ 3 and 2 < 3, but neither 3 ≤ 2 nor 3 < 2 is true).
Example 3.2.4. (a) Determine which of the relations in Example 3.2.1 are symmetric.
R1 is not symmetric since (1, 2) ∈ R1 but (2, 1) ∉ R1. R3 is not symmetric since (1, 3) ∈ R3
but (3, 1) ∉ R3. The other relations are symmetric.
(b) Determine which of the relations in Example 3.2.2 are symmetric.
The relation ⊥ is symmetric since if line a is perpendicular to line b then b is perpendicular
to a. Also ∥ is symmetric since if line a is parallel to line b then b is parallel to a. The others
are not symmetric. For example, 3 ≤ 4 but 4 ≰ 3, {1, 2} ⊆ {1, 2, 3} but {1, 2, 3} ⊈ {1, 2},
and 2 | 6 but 6 ∤ 2.
Example 3.2.5. Let P = {1, 2, 3, 4} and R = {(1, 3), (4, 2), (2, 4), (2, 3), (3, 1)}. Then R is
not a symmetric relation, since (2, 3) ∈ R but (3, 2) ∉ R.
3.2.3
Anti-Symmetric Relations
Definition 3.2.3. A relation R on a set A is called anti-symmetric if, for all a, b ∈ A,
aRb and bRa imply a = b.
Anti-symmetric is not the same as not symmetric.
Example 3.2.6. Determine which of the relations in Example 3.2.2 are anti-symmetric.
The relation ≤ is anti-symmetric since whenever a ≤ b and b ≤ a then a = b. Set inclusion
⊆ is anti-symmetric since whenever A ⊆ B and B ⊆ A then A = B. Also divisibility on
N is anti-symmetric since whenever m | n and n | m then m = n. (Note that divisibility on
Z is not anti-symmetric since 3 | −3 and −3 | 3 but 3 ≠ −3.) The relation ⊥ is not
anti-symmetric since we can have distinct lines a and b such that a ⊥ b and b ⊥ a. Similarly
∥ is not anti-symmetric.
3.2.4
Transitive Relations
Definition 3.2.4. A relation R on a set A is called transitive if, for all a, b, c ∈ A, aRb
and bRc imply aRc.
Example 3.2.7. Determine which of the relations in Example 3.2.2 are transitive.
The relations ≤, ⊆ and | are transitive, that is, (i) If a ≤ b and b ≤ c then a ≤ c (ii) If
A ⊆ B and B ⊆ C then A ⊆ C (iii) If a | b and b | c then a | c. On the other hand the
relation ⊥ is not transitive. If a ⊥ b and b ⊥ c, then it is not true that a ⊥ c.
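For relations on a finite set, the four definitions of this section can be tested by brute force. The sketch below applies them to the relations R1 to R5 of Example 3.2.1; the function names are illustrative choices, not notation from the notes.

```python
from itertools import product

def is_reflexive(R, A):
    # Every element of A must be related to itself.
    return all((a, a) in R for a in A)

def is_symmetric(R):
    # Whenever (a, b) is in R, (b, a) must be in R as well.
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(R):
    # (a, b) and (b, a) both in R is only allowed when a == b.
    return all(a == b for (a, b) in R if (b, a) in R)

def is_transitive(R):
    # (a, b) and (b, d) in R must force (a, d) in R.
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

A = {1, 2, 3, 4}
relations = {
    "R1": {(1, 1), (1, 2), (2, 3), (1, 3), (4, 4)},
    "R2": {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (4, 4)},
    "R3": {(1, 3), (2, 1)},
    "R4": set(),                       # empty relation
    "R5": set(product(A, repeat=2)),   # universal relation A × A
}

summary = {
    name: (is_reflexive(R, A), is_symmetric(R), is_antisymmetric(R), is_transitive(R))
    for name, R in relations.items()
}
for name, flags in summary.items():
    print(name, flags)
```

Each tuple lists, in order, whether the relation is reflexive, symmetric, anti-symmetric and transitive. Note that the empty relation R4 is symmetric, anti-symmetric and transitive because there are no pairs to violate the conditions.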
3.3
Equivalence Relations
An equivalence relation generalises the notion of equality.
Definition 3.3.1. A relation that is reflexive, symmetric and transitive is called an
equivalence relation.
For example, = on R is an equivalence relation. The classification of animals by species, that
is, the relation “is of the same species as”, is an equivalence relation on the set of animals.
The relation ⊆ of set inclusion is not an equivalence relation: it is reflexive and transitive,
but it is not symmetric since A ⊆ B does not imply B ⊆ A. Thus not all relations are
equivalence relations.
Example 3.3.1. Let U = Z and define R = {(x, y)|x and y have the same parity}, i.e., x
and y are either both even or both odd. The parity is an equivalence relation.
1. For any x ∈ Z, x has the same parity as itself, so (x, x) ∈ R.
2. If (x, y) ∈ R, x and y have the same parity, so (y, x) ∈ R.
3. If (x, y) ∈ R and (y, z) ∈ R, then x and z have the same parity as y, so they have the
same parity as each other (if y is odd, both x and z are odd, if y is even both x and z
are even), thus (x, z) ∈ R.
Example 3.3.2. For any set S, the identity relation on S, IS = {(x, x) | x ∈ S}, is an
equivalence relation.
1. For any x ∈ S, (x, x) ∈ IS by definition.
2. If (x, y) ∈ IS, then y = x so (y, x) = (x, x) ∈ IS.
3. If (x, y) ∈ IS and (y, z) ∈ IS, then x = y = z so (x, z) = (x, x) ∈ IS.
Example 3.3.3. Let U = R and define the square relation R = {(x, y) | x² = y²}. The square
relation is an equivalence relation.
1. For all x ∈ R, x² = x², so (x, x) ∈ R.
2. If (x, y) ∈ R, then x² = y², so y² = x² and (y, x) ∈ R.
3. If (x, y) ∈ R and (y, z) ∈ R, then x² = y² = z², so (x, z) ∈ R.
Example 3.3.4. Show that the relation D defined on Z by
xDy ⇐⇒ 3 | (x² − y²)
is an equivalence relation.
(i) Reflexive : For any x ∈ Z we have x² − x² = 0 and since 3 | 0 it follows that xDx for all
x ∈ Z.
(ii) Symmetric : Suppose xDy. Then 3 | (x² − y²), so x² − y² = 3n for some n ∈ Z. It
follows that y² − x² = 3(−n) and hence 3 | (y² − x²). Consequently yDx, so D is symmetric.
(iii) Transitive : Suppose xDy and yDz. Then there exist n, m ∈ Z such that x² − y² = 3n
and y² − z² = 3m. It follows that x² − (3m + z²) = 3n, or x² − z² = 3m + 3n = 3(m + n),
and so xDz, that is, D is transitive.
Example 3.3.5. Modular Arithmetic
We say an integer a is congruent to another integer b modulo a positive integer n, denoted
a ≡ b mod n, if b − a is an integer multiple of n. Let n = 3 and let A be the set of
integers from 0 to 11. Then x ≡ y mod 3 if x and y both belong to A0 = {0, 3, 6, 9}, or both
belong to A1 = {1, 4, 7, 10}, or both belong to A2 = {2, 5, 8, 11}. Congruence modulo 3 is in
fact an equivalence relation on A.
Reflexive
Since x − x = 0 · 3 we know that x ≡ x mod 3.
Symmetric
If x ≡ y mod 3, then y − x = 3k for some integer k. Hence x − y = −3k and since −k is
an integer we have y ≡ x mod 3.
Transitive
Let x ≡ y mod 3 and y ≡ z mod 3. Then there are integers k and l such that y − x = 3k
and z − y = 3l. It follows that z − x = 3k + 3l = 3(k + l) and since k + l is an integer we
have x ≡ z mod 3.
More generally, congruence modulo n is an equivalence relation on the integers.
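The classes A0, A1, A2 and the three properties can be verified by exhaustive checking on A = {0, . . . , 11}. This is a minimal sketch; the helper name congruent and the dictionary classes are assumptions of the example.

```python
n = 3
A = range(12)   # the integers 0, 1, ..., 11

def congruent(x, y, n):
    """x ≡ y mod n: y − x is an integer multiple of n."""
    return (y - x) % n == 0

# Group the elements of A by remainder; each group is one equivalence class.
classes = {}
for x in A:
    classes.setdefault(x % n, []).append(x)

reflexive = all(congruent(x, x, n) for x in A)
symmetric = all(congruent(y, x, n) for x in A for y in A if congruent(x, y, n))
transitive = all(
    congruent(x, z, n)
    for x in A for y in A for z in A
    if congruent(x, y, n) and congruent(y, z, n)
)

print(classes)
print(reflexive, symmetric, transitive)
```

A finite check of this kind illustrates the statement but does not replace the proof, which covers all integers.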
3.4
Functions
Let X and Y be sets. A function f : X → Y is a special kind of relation between X and
Y. It is a relation R ⊂ X × Y satisfying the following condition: for all x ∈ X, there exists
exactly one y ∈ Y such that (x, y) ∈ R.
Definition 3.4.1. Let X and Y be sets. A function f from X to Y is a relation from X to
Y such that
(i) for any x ∈ X, there is a y ∈ Y , such that (x, y) ∈ f .
(ii) for any x ∈ X, if (x, y) ∈ f and (x, z) ∈ f then y = z.
Definition 3.4.2. Let f be a function from X to Y .
(i) the set X is called the domain (source) of f and the set Y is called the co-domain
(target) of f .
(ii) range of f is the set
Range (f )= {y ∈ Y |there is an x ∈ X such that y = f (x)} = {f (x)|x ∈ X} = f (X).
The set f (X) is the image of X under f .
Definition 3.4.3. Let f : X → Y be a function. Then
(i) f is said to be surjective (onto) if f(X) = Y, i.e., f is surjective if and only if, for
any y ∈ Y, there is an x ∈ X such that f(x) = y.
(ii) f is said to be injective (one-to-one, 1−1) if f maps different points of X to different
points of Y, i.e., f is injective if and only if, for any x, x′ ∈ X, f(x) = f(x′) =⇒ x = x′,
or equivalently, if x ≠ x′ then f(x) ≠ f(x′).
(iii) f is called a bijection (or bijective) if f is both injective and
surjective. Two sets X and Y are said to be in 1−1 correspondence (i.e., to every
element of X there corresponds exactly one element of Y and vice versa) if there is a bijection
between them.
3.5
Composition of functions
Suppose we have two functions f : X → Y and g : Y → Z. It is natural to think of a function
from X to Z as being the combined action of f and g.
Definition 3.5.1. Let f : X → Y and g : Y → Z be two functions. Then the composition
of f and g is the function g ◦ f. This is a function from X to Z defined by: for any
x ∈ X, (g ◦ f)(x) = g(f(x)).
Composition of functions is generally not commutative.
Example 3.5.1. If f(x) = x² and g(x) = x + 1, then g(f(x)) = x² + 1 whereas f(g(x)) =
(x + 1)² = x² + 2x + 1.
Composition is always associative : if f : X → Y, g : Y → Z and h : Z → W are
functions, then we have (h ◦ g) ◦ f = h ◦ (g ◦ f ).
Theorem 3.5.1. Let f : X → Y and g : Y → Z be two functions.
(a) If f and g are injective, then so is g ◦ f .
(b) If f and g are surjective, then so is g ◦ f .
(c) If f and g are bijective, then so is g ◦ f .
Proof. (a) We must show that for all x1, x2 ∈ X, if g(f(x1)) = g(f(x2)), then x1 = x2.
Put y1 = f(x1) and y2 = f(x2). Then g(y1) = g(y2). Since g is assumed to be injective, this
implies that f(x1) = y1 = y2 = f(x2). Since f is also assumed to be injective, this implies
that x1 = x2.
(b) We must show that for all z ∈ Z, there exists at least one x in X such that g(f(x)) = z.
Since g : Y → Z is surjective, there exists y ∈ Y such that g(y) = z. Since f : X → Y is
surjective, there exists x ∈ X such that f(x) = y. Then g(f(x)) = g(y) = z.
(c) This follows by combining (a) and (b).
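On finite sets, a function can be stored as a dictionary, which makes composition and the properties in Theorem 3.5.1 easy to experiment with. The sketch below is illustrative only: the sets X, Y, Z and the functions f and g are made up for this example.

```python
def compose(g, f):
    """Return g ◦ f as a dictionary: x is sent to g[f[x]]."""
    return {x: g[f[x]] for x in f}

def is_injective(f):
    # Different inputs must have different images.
    return len(set(f.values())) == len(f)

def is_surjective(f, codomain):
    # Every element of the codomain must be an image.
    return set(f.values()) == set(codomain)

X, Y, Z = {1, 2, 3}, {"a", "b", "c"}, {10, 20, 30}
f = {1: "a", 2: "b", 3: "c"}        # a bijection X → Y
g = {"a": 10, "b": 20, "c": 30}     # a bijection Y → Z

gf = compose(g, f)                  # the composition g ◦ f : X → Z
print(gf)
print(is_injective(gf), is_surjective(gf, Z))
```

Since f and g are both bijective here, the composition is injective and surjective, as part (c) of the theorem predicts.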
3.5.1
The Identity Function
Definition 3.5.2. Let X be any set. Let a function f : X → X be defined by f(x) = x, i.e.,
let f map each element of X to itself. Then f is called the identity function on X,
denoted by IX or 1X. Note that for any function g : X → Y we have g ◦ IX = g = IY ◦ g.
3.5.2
Inverse of a Function
Let f : X → Y be a function and let y ∈ Y. The inverse image of y, denoted f⁻¹(y), consists
of those elements of X which are mapped onto y, i.e., the elements of X which have y as their
image:
f⁻¹(y) = {x | x ∈ X and f(x) = y} = {x ∈ X | y = f(x)}.
f⁻¹(y) is a subset of X. If f : X → Y is both one-to-one and onto, then each f⁻¹(y) contains
exactly one element, so f⁻¹ defines a function f⁻¹ : Y → X, which we call the inverse
function of f. We say that a function g : Y → X is
the inverse function to f : X → Y if both of the following hold:
(i) g ◦ f = IX, i.e., for all x ∈ X, g(f(x)) = x.
(ii) f ◦ g = IY, i.e., for all y ∈ Y, f(g(y)) = y.
Theorem 3.5.2. Let f : X → Y.
(a) The following are equivalent
(i) f is bijective.
(ii) the inverse relation f⁻¹ : Y → X is a function.
(iii) f has an inverse function g.
(b) When the equivalent conditions of part (a) hold then the inverse function g is uniquely
determined and it is the function f⁻¹.
Proof. (a) (i) =⇒ (ii)
If f is bijective, then for every y ∈ Y there is exactly one x ∈ X with f(x) = y, so the
inverse relation f⁻¹ is a function from Y to X.
(ii) =⇒ (iii)
Assume (ii), i.e., the inverse relation f⁻¹ is a function. We claim that it is then the inverse
function to f in the sense that f⁻¹ ◦ f = IX and f ◦ f⁻¹ = IY. For any x ∈ X, f⁻¹(f(x))
is the unique element of X which gets mapped under f to f(x). Since x is such an element
and the uniqueness is assumed, we must have f⁻¹(f(x)) = x. Similarly, for y ∈ Y, f⁻¹(y) is
the unique element x of X such that f(x) = y, so f(f⁻¹(y)) = f(x) = y.
(iii) =⇒ (i)
We have g ◦ f = IX. If f(x1) = f(x2), then x1 = g(f(x1)) = g(f(x2)) = x2, so f is injective.
Similarly we have f ◦ g = IY, so every y ∈ Y satisfies y = f(g(y)), and f is surjective.
Therefore f is bijective.
(b) Suppose that we have any function g : Y → X such that g ◦ f = IX and f ◦ g = IY. We
know that f is bijective and thus the inverse relation f⁻¹ is a function such that f⁻¹ ◦ f =
IX, f ◦ f⁻¹ = IY. Thus
g = g ◦ IY = g ◦ (f ◦ f⁻¹) = (g ◦ f) ◦ f⁻¹ = IX ◦ f⁻¹ = f⁻¹.
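The difference between the inverse image of a point and the inverse function can be seen on a small finite example. The sketch below stores functions as dictionaries; the names preimage and inverse, and the sample functions f and h, are hypothetical choices made for illustration.

```python
def preimage(f, y):
    """Inverse image of y: all x with f(x) = y (always a set, possibly empty)."""
    return {x for x, fx in f.items() if fx == y}

def inverse(f, codomain):
    """Return the inverse function as a dictionary, or None if f is not a bijection."""
    inv = {}
    for x, y in f.items():
        if y in inv:                   # two inputs share an image: not injective
            return None
        inv[y] = x
    if set(inv) != set(codomain):      # some element is never reached: not surjective
        return None
    return inv

Y = {"a", "b", "c"}
f = {1: "a", 2: "b", 3: "c"}   # bijective
h = {1: "a", 2: "a", 3: "c"}   # neither injective nor surjective onto Y

f_inv = inverse(f, Y)
print(f_inv)
print(all(f_inv[f[x]] == x for x in f), all(f[f_inv[y]] == y for y in Y))
print(preimage(h, "a"), preimage(h, "b"), inverse(h, Y))
```

The two checks printed for f correspond to conditions (i) and (ii) in the definition of the inverse function.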
Chapter 4
Symbolic Logic
Symbolic logic has its roots in the study of language. It deals with words and phrases that
have a bearing on the truth or falsity of the sentence in which they occur. Such words or
phrases are aptly called logical connectives, for example, not, or, and, if . . . then,
if and only if. For example, consider the sentence: It is cold and the sun is shining. This
sentence is obtained by joining the two sentences It is cold and The sun is shining with the
connective and. The resulting sentence is called a
compound sentence and is true provided that each of the two component sentences is
true.
Definition 4.0.1. A proposition/statement is a declarative sentence which is true or
false (but not both).
Notation is useful in the study of compound statements. If we let p denote the statement
“All cows eat grass” and let q denote the statement “Columbus discovered America”, then we
can write the compound statement p and q.
4.1
Abbreviations
∧ denotes and, ∨ denotes or, =⇒ denotes if . . . then, ⇐⇒ denotes if and only if.
¬p denotes not p. For example, if p denotes the proposition “It is raining”, then ¬p denotes
“It is not raining”. The truth or falsity of a statement is called its truth value.
Denoting “true” by “T” and “false” by “F”, the logical connectives are conveniently defined
by means of a truth-table, which spells out the truth value of a compound statement in
each possible case of truth values of its components.
4.1.1
Conjunction, p ∧ q
Two statements can be combined by the word “and” to form a composite statement which
is called the conjunction of the original statements, denoted by p ∧ q.
Example 4.1.1. 1. Let p be it is raining and q be it is overcast. Then p ∧ q denotes it is
raining and it is overcast.
2. The symbol ∧ can be used to define the intersection of two sets,
C ∩ D = {x | x ∈ C ∧ x ∈ D}.
The truth value of the composite statement p ∧ q satisfies the following property: if p is
true and q is true, then p ∧ q is true, otherwise, p ∧ q is false. The conjunction of two
statements is true if and only if each component is true.
Truth-table
p    q    p ∧ q
T    T      T
T    F      F
F    T      F
F    F      F
4.2
Disjunction, p ∨ q
Two statements can be combined by the word “or” to form a composite statement which is
called the disjunction of the original statements, denoted by p ∨ q.
Example 4.2.1. 1. Let p denote the statement He studied Mathematics at University and
q be He lives in Harare, then p ∨ q denotes He studied Mathematics at University or he
lives in Harare.
2. ∨ can be used to define the union of two sets,
P ∪ Q = {x|x ∈ P ∨ x ∈ Q}.
Truth value of the composite statement p ∨ q satisfies the property : If p is true or q is
true or both p and q are true, then p ∨ q is true, otherwise, p ∨ q is false.
Truth-table
p    q    p ∨ q
T    T      T
T    F      T
F    T      T
F    F      F
4.3
Negation, ¬p
Given any statement p, another statement “not p”, called the negation of p, can be formed.
It is denoted by ¬p.
Example 4.3.1. Let p be “Chinhoyi is in Zimbabwe”. Its negation can be written as
(i) It is false that Chinhoyi is in Zimbabwe or (ii) Chinhoyi is not in Zimbabwe.
Truth-table
p    ¬p
T    F
F    T
4.4
The Conditional, p =⇒ q
If p then q, also read as (a) p implies q (b) p only if q (c) p is sufficient for q (d) q is
necessary for p. The truth value of the conditional statement p =⇒ q satisfies the following
property: p =⇒ q is true unless p is true and q is false. In other words, a true statement
cannot imply a false statement.
Truth-table
p    q    p =⇒ q
T    T      T
T    F      F
F    T      T
F    F      T
4.4.1
The Bi-conditional, p ⇐⇒ q
p ⇐⇒ q is read p if and only if q. Its truth value satisfies the following property: if p and
q have the same truth value, then p ⇐⇒ q is true, otherwise, it is false.
Truth-table
p    q    p ⇐⇒ q
T    T      T
T    F      F
F    T      F
F    F      T
Example 4.4.1.
1. Find the truth values for (¬p ∨ q) =⇒ p.
p    q    ¬p    ¬p ∨ q    (¬p ∨ q) =⇒ p
T    T    F       T             T
T    F    F       F             T
F    T    T       T             F
F    F    T       T             F
2. Construct a truth table for (p =⇒ q) ⇐⇒ (¬p ∨ q).
p    q    p =⇒ q    ¬p    ¬p ∨ q    (p =⇒ q) ⇐⇒ (¬p ∨ q)
T    T      T       F       T                T
T    F      F       F       F                T
F    T      T       T       T                T
F    F      T       T       T                T
In this example, the truth values of (p =⇒ q) ⇐⇒ (¬p ∨ q) are all true.
Definition 4.4.1. A compound proposition that is true regardless of the truth values of its
initial components is called a tautology.
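Whether a compound proposition is a tautology can be decided by evaluating it on every row of its truth table. The following is a minimal sketch; the helper names implies, iff and is_tautology are assumptions of this example, with True and False standing for T and F.

```python
from itertools import product

def implies(p, q):
    # p =⇒ q is false only when p is true and q is false.
    return (not p) or q

def iff(p, q):
    # p ⇐⇒ q is true exactly when p and q have the same truth value.
    return p == q

def is_tautology(formula, num_vars):
    """True if the formula is true for every assignment of truth values."""
    return all(formula(*values) for values in product([True, False], repeat=num_vars))

# The two compound propositions of Example 4.4.1.
first = is_tautology(lambda p, q: implies((not p) or q, p), 2)
second = is_tautology(lambda p, q: iff(implies(p, q), (not p) or q), 2)
print(first, second)
```

The call product([True, False], repeat=2) generates the four rows TT, TF, FT, FF in the same order as the truth tables above.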
Exercise 4.4.1. Show that [(p =⇒ q) ∧ (q =⇒ p)] ⇐⇒ (p ⇐⇒ q) is a tautology.
Some sentences are not statements because they contain unspecified variables. For example,
we cannot assign a truth value to the sentence He was a president of the United States
until a proper name is substituted for the pronoun he. We call a sentence that contains
unspecified variables a predicate. For example, S is green and X discovered America are
predicates. S and X are unspecified variables that may be replaced by various nouns. A
predicate is neither true nor false; its truth value depends upon the name that replaces the
variable. In these examples, he, S and X are called free variables in their respective
predicates. A statement that is always false is called an absurdity. A statement that may be
true or false, depending upon the values of its constituent statements, is called a contingency.
4.5
Logical Equivalence
Definition 4.5.1. The propositions p and q are said to be logically equivalent if their
truth tables are identical. Denoted by p ≡ q.
Example 4.5.1. ¬(p ∧ q) ≡ ¬p ∨ ¬q.
p    q    p ∧ q    ¬(p ∧ q)
T    T      T         F
T    F      F         T
F    T      F         T
F    F      F         T

p    q    ¬p    ¬q    ¬p ∨ ¬q
T    T    F     F        F
T    F    F     T        T
F    T    T     F        T
F    F    T     T        T
Consider the statement, “It is not the case that roses are red and violets are blue”. This
statement can be written in the form ¬(p ∧ q) where p is “roses are red” and q is “violets
are blue”. However, as noted above, ¬(p ∧ q) ≡ ¬p ∨ ¬q. Thus the statement, Roses are not
red, or violets are not blue, has the same meaning as the given statement.
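Logical equivalence can be tested in the same spirit as Definition 4.5.1, by comparing the two truth tables row by row. The sketch below does this for De Morgan's law of Example 4.5.1; the helper name equivalent is an assumption of the example.

```python
from itertools import product

def equivalent(f, g, num_vars):
    """True if f and g have identical truth tables."""
    return all(
        f(*values) == g(*values)
        for values in product([True, False], repeat=num_vars)
    )

lhs = lambda p, q: not (p and q)        # ¬(p ∧ q)
rhs = lambda p, q: (not p) or (not q)   # ¬p ∨ ¬q

# Print both final columns side by side, one row per truth assignment.
for p, q in product([True, False], repeat=2):
    print(p, q, lhs(p, q), rhs(p, q))

de_morgan_holds = equivalent(lhs, rhs, 2)
print(de_morgan_holds)
```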
4.5.1
Algebra of Propositions
Propositions satisfy various laws, which are listed below.
4.5.2
Laws of the Algebra of Propositions
Idempotent Laws
(1a) p ∨ p ≡ p (1b) p ∧ p ≡ p.
Associative Laws
(2a) (p ∨ q) ∨ r ≡ p ∨ (q ∨ r) (2b) (p ∧ q) ∧ r ≡ p ∧ (q ∧ r).
Commutative Laws
(3a) p ∨ q ≡ q ∨ p (3b) p ∧ q ≡ q ∧ p.
Distributive Laws
(4a) p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r) (4b) p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r).
Identity Laws
(5a) p ∧ T ≡ p (5b) p ∨ F ≡ p.
(6a) p ∨ T ≡ T (6b) p ∧ F ≡ F .
Complement Laws
(7a) p ∨ ¬p ≡ T (8a) ¬T ≡ F .
(7b) p ∧ ¬p ≡ F (8b) ¬F ≡ T .
Involution Law
(9) ¬¬p ≡ p.
De Morgan’s Laws
(10a) ¬(p ∨ q) ≡ ¬p ∧ ¬q (10b) ¬(p ∧ q) ≡ (¬p ∨ ¬q).
4.6
The Converse
Let p =⇒ q be a conditional proposition. The converse of p =⇒ q is q =⇒ p.
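Example 4.6.1. “If John Gumbo is a student of University of Zimbabwe, then 2 + 2 = 4” is
a true proposition. Its converse, “If 2 + 2 = 4, then John Gumbo is a student of University of
Zimbabwe”, is a false proposition whenever John Gumbo is not a student of University of
Zimbabwe. A conditional and its converse need not have the same truth value.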
4.7
The Contrapositive
The contrapositive of a proposition p =⇒ q is the proposition ¬q =⇒ ¬p.
Theorem 4.7.1. The conditional proposition p =⇒ q is logically equivalent to its
contrapositive ¬q =⇒ ¬p.
p    q    p =⇒ q    ¬p    ¬q    ¬q =⇒ ¬p
T    T      T       F     F        T
T    F      F       F     T        F
F    T      T       T     F        T
F    F      T       T     T        T
4.8
List of Tautologies
1. p ∨ ¬p    Law of the excluded middle.
2. ¬(p ∧ ¬p)    Law of contradiction.
3. [(p =⇒ q) ∧ ¬q] =⇒ ¬p    Modus tollens.
4. ¬¬p ⇐⇒ p    Double negation.
5. [(p =⇒ q) ∧ (q =⇒ r)] =⇒ (p =⇒ r)    Law of syllogism.
6. (p ∧ q) =⇒ p    Decomposing a conjunction.
7. (p ∧ q) =⇒ q    Decomposing a conjunction.
8. p =⇒ (p ∨ q)    Constructing a disjunction.
9. q =⇒ (p ∨ q)    Constructing a disjunction.
10. (p ⇐⇒ q) ⇐⇒ [(p =⇒ q) ∧ (q =⇒ p)]    Definition of the bi-conditional.
11. (p ∧ q) ⇐⇒ (q ∧ p)    Commutative law for ∧.
12. (p ∨ q) ⇐⇒ (q ∨ p)    Commutative law for ∨.
13. (p =⇒ q) ⇐⇒ (¬p ∨ q)    Conditional disjunction.
14. [(p ∨ q) ∧ ¬p] =⇒ q    Disjunctive syllogism.
15. (p ∨ p) ⇐⇒ p    Simplification.
4.9
Propositional Functions, Quantifiers
Let A be a given set. A propositional function (or open sentence or condition)
defined on A is an expression p(x) which has the property that p(a) is true or false for each
a ∈ A, i.e., p(x) becomes a statement (with a truth value) whenever any element a ∈ A
is substituted for the variable x. Set A is called the domain of p(x) and the set Tp of all
elements of A for which p(a) is true is called the truth set of p(x), i.e.,
Tp = {x|x ∈ A, p(x) is true} or {x|p(x)}.
When A is some set of numbers, the condition p(x) has the form of an equation or inequality
involving the variable x.
Example 4.9.1. Find the truth set Tp of each propositional function p(x) defined on the set
P = {1, 2, 3 . . . }.
(a) Let p(x) be x + 2 > 7. Then Tp = {x | x ∈ P, x + 2 > 7} = {6, 7, 8, . . . }, consisting of all
integers greater than 5.
(b) Let p(x) be x + 5 < 3. Then Tp = {x | x ∈ P, x + 5 < 3} = ∅, the empty set.
(c) Let p(x) be x + 5 > 1. Then Tp = {x | x ∈ P, x + 5 > 1} = P.
The above example shows that if p(x) is a propositional function defined on a set
A, then p(x) could be true for all x ∈ A, for some x ∈ A or for no x ∈ A.
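Truth sets can be computed directly when the domain is finite. Since P = {1, 2, 3, . . . } is infinite, the sketch below works on the finite sample {1, . . . , 20}, which is an assumption made only so that the sets can be listed; the helper name truth_set is likewise illustrative.

```python
P_sample = range(1, 21)   # finite stand-in for P = {1, 2, 3, ...}

def truth_set(p, domain):
    """All elements of the domain for which the propositional function p is true."""
    return {x for x in domain if p(x)}

T_a = truth_set(lambda x: x + 2 > 7, P_sample)   # true for some x
T_b = truth_set(lambda x: x + 5 < 3, P_sample)   # true for no x
T_c = truth_set(lambda x: x + 5 > 1, P_sample)   # true for all x

print(sorted(T_a))
print(T_b)
print(T_c == set(P_sample))
```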
4.10
Universal Quantifier
Let p(x) be a propositional function defined on a set A. Consider the expression, (∀x ∈
A)p(x) or ∀x, p(x) which reads “For every x in A, p(x) is a true statement”, or simply
“For all x, p(x)”. The symbol ∀, (for all, for every) is called the universal quantifier.
(∀x ∈ A)p(x) is equivalent to the statement Tp = {x|x ∈ A, p(x)} = A, i.e., the truth set of
p(x) is the entire set A. If {x | x ∈ A, p(x)} = A, then ∀x, p(x) is true, otherwise, ∀x, p(x)
is false.
Example 4.10.1. 1. The proposition (∀n ∈ P)(n + 4 > 3) is true since {n | n + 4 > 3} =
{1, 2, 3, . . . } = P.
2. The proposition (∀n ∈ P)(n + 2 > 8) is false since {n | n + 2 > 8} = {7, 8, . . . } ≠ P.
4.11
Existential Quantifier
Let p(x) be a propositional function defined on a set A. Consider the expression (∃x ∈
A)p(x) or ∃x, p(x), which reads “There exists an x in A such that p(x) is a true statement”
or simply, “For some x, p(x)”. The symbol ∃ (there exists, for some, for at least one) is
called the existential quantifier. (∃x ∈ A)p(x) is equivalent to the statement Tp = {x | x ∈
A, p(x)} ≠ ∅, i.e., the truth set of p(x) is not empty. If {x | p(x)} ≠ ∅ then ∃x, p(x) is true,
otherwise, ∃x, p(x) is false.
Example 4.11.1. 1. The proposition (∃n ∈ P)(n + 4 < 7) is true since {n | n + 4 < 7} =
{1, 2} ≠ ∅.
2. The proposition (∃n ∈ P)(n + 6 < 4) is false since {n | n + 6 < 4} = ∅.
4.11.1
Notation
Let A = {2, 3, 5} and let p(x) be the sentence “x is a prime number” or simply “x is prime”.
Then the proposition “Two is prime and three is prime and five is prime” can be denoted
by p(2) ∧ p(3) ∧ p(5) or ∧(a ∈ A, p(a)), which is equivalent to the statement “Every number
in A is prime”, or ∀a ∈ A, p(a). Similarly, the proposition “Two is prime or three is prime
or five is prime” can be denoted by p(2) ∨ p(3) ∨ p(5) or ∨(a ∈ A, p(a)), which is equivalent
to the statement “At least one number in A is prime”, or ∃a ∈ A, p(a). In symbols,
∧(a ∈ A, p(a)) ≡ ∀a ∈ A, p(a) and ∨(a ∈ A, p(a)) ≡ ∃a ∈ A, p(a).
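Over a finite set, the universal quantifier behaves like a repeated “and” and the existential quantifier like a repeated “or”. In Python these correspond to the built-in functions all and any. The sketch below is illustrative; the primality test is_prime and the second set B are assumptions of the example.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

A = {2, 3, 5}
B = {4, 5, 6}   # a second set, for contrast

forall_A = all(is_prime(a) for a in A)   # p(2) ∧ p(3) ∧ p(5)
exists_A = any(is_prime(a) for a in A)   # p(2) ∨ p(3) ∨ p(5)
forall_B = all(is_prime(b) for b in B)
exists_B = any(is_prime(b) for b in B)

print(forall_A, exists_A)
print(forall_B, exists_B)

# Negating a universal statement gives an existential one.
print((not forall_B) == any(not is_prime(b) for b in B))
```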
4.12
Negation of Quantified Statements
Consider the statement “All Mathematics majors are male”. Its negation is either of the
following equivalent statements:
1. It is not the case that all Mathematics majors are male.
2. There exists at least one Mathematics major who is not male.
Symbolically, using M to denote the set of Mathematics majors, the above can be written as
¬(∀x ∈ M)(x is male) ≡ (∃x ∈ M)(x is not male),
or when p(x) denotes “x is male”, we have ¬(∀x ∈ M)p(x) ≡ (∃x ∈ M)¬p(x) or
¬∀x, p(x) ≡ ∃x, ¬p(x).
Theorem 4.12.1 (De Morgan).
¬(∀x ∈ A)p(x) ≡ (∃x ∈ A)¬p(x),
i.e., the following two statements are equivalent: (a) It is not true that, for all x ∈ A, p(x)
is true; (b) There exists an x ∈ A such that p(x) is false.
Theorem 4.12.2 (De Morgan).
¬(∃x ∈ A)p(x) ≡ (∀x ∈ A)¬p(x),
i.e., the following two statements are equivalent: (a) It is not true that, for some x ∈ A,
p(x) is true; (b) For all x ∈ A, p(x) is false.
The following statements are also negations of each other: “There exists a college student
who is 60 years old” and “Every college student is not 60 years old”.
The negation of “For all x, p(x) is true” is “There exists x for which p(x) is not true”.
The negation of “There exists x for which p(x) is true” is “For all x, p(x) is not true”.
For example, the negation of “All rational numbers equal one” is “There exists a
rational number that does not equal one”. The statement “All eleven-legged crocodiles are
orange with blue spots” is true: if it were false, then there would exist an eleven-legged
crocodile that is not orange with blue spots, and no eleven-legged crocodiles exist.
4.13
Proofs
In Italy it’s said that it requires two men to make a good salad dressing; a generous man
to add the oil and a mean man the vinegar. Constructing proofs in mathematics is similar.
Often a tolerant openness and awareness is important in discovering or understanding a proof,
while a strictness and discipline is needed in writing it down. There are many different
styles of thinking, even amongst professional mathematicians, yet they can communicate
well through the common medium of written proof. It’s important not to confuse the rigour
of a well-written-down proof with the human and very individual activity of going about
discovering it or understanding it. Too much of a straitjacket on your thinking is likely to
stymie anything but the simplest proofs. On the other hand too little discipline, and writing
down too little on the way to a proof, can leave you uncertain and lost.
When you cannot see a proof immediately (this may happen most of the time initially), it can
help to write down the assumptions and the goal. Often starting to write down a proof helps
you discover it. You may have already experienced this in carrying out proofs by induction.
It can happen that the induction hypothesis one starts out with isn’t strong enough to get
the induction step. But starting to do the proof even with the ‘wrong’ induction hypothesis
can help you spot how to strengthen it. Of course, there’s no better way to learn the art
of proof than to do proofs, no better way to read and understand a proof than to pause
occasionally and try to continue the proof yourself. For this reason you are very strongly
encouraged to do the exercises; most of them are placed strategically at the appropriate place
in the text.
Mathematicians solve problems, and proof is the guarantee that our solutions are correct.
A proof is an explanation of why a statement is true. A conjecture is a statement which
we believe to be true for which we have no proof. An axiom is a basic assumption about a
mathematical situation.
4.14
Techniques of Proof
4.14.1
Direct Method
The direct method proves statements of the form “If A then B”: assume A and deduce B.
Theorem 4.14.1. Let m be an integer. If m is odd, then m² is odd.
Proof. If m is odd, then m = 2r + 1 for some integer r. Then m² = (2r + 1)² = 4r² + 4r + 1 =
2(2r² + 2r) + 1, i.e., m² is odd.
Theorem 4.14.2. Suppose that p ∈ Q and p² ∈ Z, then p ∈ Z.
Proof. By assumption p = a/b for some integers a and b, where the fraction is in its lowest
form. Thus p² = (a/b)² = a²/b². Since p² ∈ Z and the fraction is in its lowest form, we have
that b² = 1. Thus b = ±1 ⇒ p = a/(±1) = ±a ∈ Z.
Example 4.14.1. Prove that the square of every odd number is of the form 8a + 1 for some
integer a ≥ 0.
Proof. Any odd number n is of the form n = 2l + 1 for some l ∈ Z. Therefore n² = (2l + 1)² =
4l² + 4l + 1 = 4(l² + l) + 1. Thus it is enough to show that l² + l is even. If l is even, then
l = 2m for some m ∈ Z, so l² + l = 4m² + 2m = 2(2m² + m), which is divisible by 2. If l is
odd, then l = 2m + 1 for some m ∈ Z, so l² + l = 4m² + 4m + 1 + 2m + 1 = 2(2m² + 3m + 1),
which is also divisible by 2. Thus l² + l is always even, say l² + l = 2a, and so n² = 8a + 1 for
some a ∈ Z. Since n² ≥ 1 we have a ≥ 0, with a = 0 only when n = ±1.
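The construction in the proof can be followed numerically: write n = 2l + 1, halve l² + l, and confirm that n² = 8a + 1. The function below is a sketch with a hypothetical name; it checks individual cases and is not a substitute for the proof.

```python
def odd_square_quotient(n):
    """For an odd integer n, return the integer a with n*n == 8*a + 1."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    l = (n - 1) // 2           # n = 2l + 1
    a = (l * l + l) // 2       # l^2 + l is even, as shown in the proof
    assert n * n == 8 * a + 1
    return a

for n in (1, 3, 5, -7, 99):
    print(n, n * n, odd_square_quotient(n))
```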
4.15
Some Common Mistakes
The most common mistakes are assuming what has to be proved and using equivalence incorrectly.
4.15.1
Don’t Assume What Has to be Proved
Suppose that we have to prove the statement P. If we assume it is true, then it is not
surprising that we can deduce it is true: P ⇒ P is obviously true. The typical fallacy runs
as follows: P is assumed to be true, this is used to deduce something that is true, and so it
is concluded that P is true.
Example 4.15.1. Consider the following statement: If a and b are real numbers, then
a² + b² ≥ 2ab.
A fallacious proof is: We have a² + b² ≥ 2ab ⇒ a² − 2ab + b² ≥ 0 ⇒ (a − b)² ≥ 0. The
last inequality is true as the square of a number is always non-negative, so a² + b² ≥ 2ab. The
error is that the conclusion (i.e., a² + b² ≥ 2ab) has been assumed and has led to something we
know is true. However, we cannot conclude that a statement is true just because it implies a
known truth. The real proof is the reverse of the argument: begin with (a − b)² ≥ 0, something
we know is true, and deduce a² − 2ab + b² ≥ 0 and hence a² + b² ≥ 2ab.
4.16
Proof By Cases
In a proof by cases we break the problem into several parts and prove each one separately.
For example, x = y can be proved by showing that x ≤ y and y ≤ x. We have broken the problem
into two parts.
Example 4.16.1. The number n² + 3n + 7 is odd for all n ∈ Z.
Proof. Divide into two cases (i) n is even and (ii) n is odd.
If n is even, then n = 2k for some integer k. Then n² + 3n + 7 = (2k)² + 3(2k) + 7 =
4k² + 6k + 7 = 2(2k² + 3k + 3) + 1. Hence n² + 3n + 7 is odd when n is even.
If n is odd, then n = 2k + 1 for some integer k. We have n² + 3n + 7 = (2k + 1)² + 3(2k + 1) + 7 =
4k² + 10k + 11 = 2(2k² + 5k + 5) + 1. This is also odd. Hence n² + 3n + 7 is odd for all
integers n.
As you see, this method of cases involves exhausting all the possibilities and so this method
is also known as exhaustion.
4.17
Contradiction
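To prove a statement by contradiction, we assume that the statement is false and deduce
something that contradicts a known fact or one of our assumptions. The law of the excluded
middle asserts that a statement is true or false, it cannot be anything in between, so the
statement must then be true. The name comes from the fact that the assumption that the
statement is false is later contradicted by some other fact. The method is also called
reductio ad absurdum (reduction to the absurd).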
Example 4.17.1. Suppose that n is an odd integer. Then n² is an odd integer.
Proof. Assume to the contrary, i.e., we suppose that n is an odd integer but that the
conclusion is false, i.e., n² is an even integer. As n is odd, n = 2k + 1 for some k ∈ Z. Thus
n² = (2k + 1)² = 4k² + 4k + 1 = 2(2k² + 2k) + 1, which is odd, contradicting the assumption
that n² is even. Thus our assumption that n² is
even must be wrong, i.e., n² must be odd.
Example 4.17.2. Prove that √3 is irrational.
Proof. Suppose, for a contradiction, that √3 is rational. Then we can write √3 = a/b for some
integers a and b with b ≠ 0. Assume that a and b have no common divisors other than ±1.
Now squaring both sides, we get 3 = a²/b² and so 3b² = a². This implies that a² is divisible
by 3 and so a is also divisible by 3. Thus we can write a = 3c for some integer c. Replacing
this in the above equation we get 3b² = 9c² and so b² = 3c². Hence b² is divisible by 3 and so
is b. But this contradicts the fact that a and b have no common divisors. Therefore √3 has to
be irrational.
4.18
Induction
Induction is applied when we have an infinite number of statements indexed by the natural
numbers, for example, n⁵ − n is even for all n ∈ N.
4.18.1
Principle of Mathematical Induction
Let A(n) be an infinite collection of statements with n ∈ N. Suppose (i) A(1) is true and
(ii) A(k) ⇒ A(k + 1) for all k ∈ N. Then A(n) is true for all n ∈ N.
Checking condition (i) is called the initial step and checking condition (ii) is called the
inductive step. Assuming that A(k) is true for some k in (ii) is called the inductive
hypothesis.
Example 4.18.1. 6^n − 1 is divisible by 5 for all n ∈ N.
Proof. Initial Step: 6^1 − 1 = 5, so the statement is true for n = 1 because 5 is divisible by 5.
Inductive Step: Assume the statement is true for some k ∈ N, which means that 6^k − 1 = 5m
for some m ∈ N. Then 6^(k+1) − 1 = 6(6^k) − 1 = 6(5m + 1) − 1 (by the inductive hypothesis) =
30m + 6 − 1 = 5(6m + 1). This is divisible by 5 and so the statement is true for k + 1.
Hence the statement is true for all n ∈ N.
Example 4.18.2. Show that $2^{n-1} \le n!$ for all n ∈ N.
Proof. For n = 1, we have $2^{n-1} = 2^0 = 1$ and n! = 1! = 1. Hence $2^{n-1} \le n!$ for n = 1.
Assume the statement is true for some k ∈ N, i.e., $2^{k-1} \le k!$. Then for n = k + 1,
$2^{(k+1)-1} = 2^k = 2(2^{k-1}) \le 2(k!)$ (by the inductive hypothesis) $\le (k+1)(k!)$ (as $2 \le k+1$)
$= (k+1)!$. Hence the statement is true for all n ∈ N.
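The two statements proved above can also be spot-checked numerically. The following is a minimal sketch, not part of the original notes; the function names and the range 1 to 50 are illustrative choices. A finite check like this is never a substitute for the induction proof, but it is a quick way to catch a mis-stated claim.

```python
from math import factorial

def divisible_by_5(n):
    """Example 4.18.1: is 6^n - 1 a multiple of 5?"""
    return (6**n - 1) % 5 == 0

def power_below_factorial(n):
    """Example 4.18.2: is 2^(n-1) <= n! ?"""
    return 2**(n - 1) <= factorial(n)

# Spot-check the first 50 natural numbers (a finite check, not a proof).
checked = range(1, 51)
all_divisible = all(divisible_by_5(n) for n in checked)
all_bounded = all(power_below_factorial(n) for n in checked)
print(all_divisible, all_bounded)
```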
4.19
The Contrapositive Method
A ⇒ B is equivalent to not B ⇒ not A. For example, “If $x^2 - 9 = 0$, then x = 2” has the
contrapositive “If x ≠ 2, then $x^2 - 9 \neq 0$”, and “If I am Jane, then I am a woman” has the
contrapositive “If I am not a woman, then I am not Jane”.
Example 4.19.1. Suppose that A, B, C, D are sets such that C \ D ⊂ A ∩ B and that x ∈ C.
Prove that if x ∉ A, then x ∈ D.
Proof. We prove the contrapositive: if x ∉ D, then x ∈ A. Suppose that x ∉ D. Since x ∈ C
is assumed, x ∈ C \ D. Because C \ D ⊂ A ∩ B, it follows that x ∈ A ∩ B, and in particular x ∈ A.
Example 4.19.2. Let a be any integer. Prove that if $a^2$ is divisible by 3, then a is divisible
by 3.
Proof. By contrapositive. Assume that a is not divisible by 3. Then a = 3t + 1 or a = 3t + 2 for
some t ∈ Z. If a = 3t + 1 then $a^2 = (3t+1)^2 = 9t^2 + 6t + 1 = 3(3t^2 + 2t) + 1$, which is not
divisible by 3. If a = 3t + 2 then $a^2 = (3t+2)^2 = 9t^2 + 12t + 4 = 3(3t^2 + 4t + 1) + 1$, which
is not divisible by 3. In both cases $a^2$ is not divisible by 3, which proves the contrapositive.
4.20
Counterexamples
A counterexample is a single case that shows a statement of the form “all ...” to be false.
For example, is the statement “all multiples of 3 are multiples of 6” true or false? Prove your answer. It is
false because 9 is a multiple of 3, but is not a multiple of 6. (9 is called a counterexample
to the “all” statement “All multiples of 3 are multiples of 6”.)
4.21
Divisors
This section works in the set of integers, Z. An integer a divides the integer b if there exists an integer k such
that b = ka. In this case we say b is divisible by a and write a | b. We also say that a is a
divisor of b. If a does not divide b, then we write a ∤ b. For example, 3 | 6 since 6 = 2 × 3.
Theorem 4.21.1. If a | b and a | c then a | (mb + nc) for all integers m and n.
As simple special cases, if m = n = 1, we have: if a | b and a | c, then a | (b + c). If we take
m = 1 and n = −1, we get: if a | b and a | c, then a | (b − c).
Proof. By assumption, there exist integers $k_1$ and $k_2$ such that $b = k_1 a$ and $c = k_2 a$. For
any integers m and n, we have $mb + nc = m(k_1 a) + n(k_2 a) = (mk_1 + nk_2)a$.
Thus mb + nc is divisible by a.
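The definition of divisibility and Theorem 4.21.1 translate directly into a few lines of code. This is an illustrative sketch only: the helper name `divides` and the values a = 3, b = 12, c = −21 are assumptions chosen for the example and do not come from the notes.

```python
def divides(a, b):
    """Return True if a | b, i.e. b = k*a for some integer k."""
    if a == 0:
        return b == 0          # b = k*0 forces b = 0
    return b % a == 0

# Illustrative values with a | b and a | c.
a, b, c = 3, 12, -21

# Theorem 4.21.1: a | (m*b + n*c) for all integers m and n (sampled here).
combos_ok = all(
    divides(a, m * b + n * c)
    for m in range(-5, 6)
    for n in range(-5, 6)
)
print(divides(3, 6), divides(3, 7), combos_ok)  # True False True
```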
Theorem 4.21.2. Let a, b, c ∈ Z. Then
(i) If a | b and b | c, then a | c.
(ii) If a | b and b | a, then a = b or a = −b.
Proof. (i) By assumption, there exist integers $k_1$, $k_2$ such that $b = k_1 a$ and $c = k_2 b$. Hence $c = k_2 k_1 a$ and
we deduce that a divides c.
Example 4.21.1. For n even, $n^2 + 2n + 8$ is divisible by 4.
Proof. n is even implies that n = 2m for some m ∈ Z. Then $n^2 + 2n + 8 = (2m)^2 + 2(2m) + 8 =
4m^2 + 4m + 8 = 4(m^2 + m + 2)$. Since $m^2 + m + 2$ is an integer we can conclude that $n^2 + 2n + 8$
is divisible by 4.
Exercise 4.21.1. Show that $x^2 + 9x + 20$ is divisible by 2 for all x ∈ Z.
Exercise 4.21.2. Show that $x^3 - 6x^2 + 11x - 6$ is divisible by 3 for all x ∈ Z.
Exercise 4.21.3. For each positive integer x, show that $x^3 - x$ is divisible by 3 and $x^5 - x$ is
divisible by 5. Can you generalise this? Is $x^n - x$ divisible by n?
4.22
The Principle of Mathematical Induction
It is an important property of the positive integers (natural numbers) and is used in proving
statements involving all positive integers when it is known, for example, that the statements are valid for n = 1, 2, 3 but it is suspected or conjectured that they hold for all
positive integers.
4.22.1
Steps
1. Prove the statement for n = 1 or some other positive integer. (Initial Step)
2. Assume the statement true for n = k, where k ∈ Z+ . (Inductive Hypothesis)
3. From the assumption in 2 prove the statement must be true for n = k + 1.
4. Since the statement is true for n = 1 (from 1) it must (from 3) be true for n = 1 + 1 = 2
and from this for n = 2 + 1 = 3, and so on, so must be true for all positive integers.
(Conclusion)
Example: For any positive integer n,
$$1 + 2 + \cdots + n = \frac{n(n+1)}{2}.$$
Solution:
1. Prove for n = 1: $1 = \frac{1(1+1)}{2} = \frac{2}{2} = 1$, which is clearly true.
2. Assume that the statement holds for n = k, that is,
$$1 + 2 + \cdots + k = \frac{k(k+1)}{2}.$$
3. Prove for n = k + 1. We have
$$\begin{aligned}
1 + 2 + \cdots + k + (k+1) &= \frac{k(k+1)}{2} + (k+1) \quad \text{(by inductive hypothesis)} \\
&= \frac{k(k+1) + 2(k+1)}{2} \\
&= \frac{k^2 + 3k + 2}{2} \\
&= \frac{(k+1)(k+2)}{2},
\end{aligned}$$
so the statement holds for n = k + 1.
4. Hence by induction, $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$ is true for any positive integer n.
Example: Prove that for any natural number n,
$$1 + 3 + 5 + \cdots + (2n-1) = n^2.$$
Solution:
1. Prove for n = 1: $1 = 1^2$, so it is true.
2. Assume that the statement holds for n = k, that is,
$$1 + 3 + 5 + \cdots + (2k-1) = k^2.$$
3. Prove for n = k + 1. We have
$$1 + 3 + 5 + \cdots + (2k-1) + (2(k+1)-1) = k^2 + 2k + 1 \quad \text{(by inductive hypothesis)} \quad = (k+1)^2.$$
So it is true for n = k + 1.
4. Hence by induction $1 + 3 + 5 + \cdots + (2n-1) = n^2$ is true for all natural numbers n.
Example: Prove that $3^n > 2^n$ for all natural numbers n.
Solution:
1. Prove for n = 1: $3^1 = 3 > 2^1 = 2$, which is true.
2. Assume the statement holds for n = k, that is, $3^k > 2^k$.
3. Prove for n = k + 1.
$$\begin{aligned}
3^{k+1} &= 3^k \cdot 3 \\
&> 2^k \cdot 3 \quad \text{by inductive hypothesis} \\
&> 2^k \cdot 2 \quad \text{since } 3 > 2 \\
&= 2^{k+1},
\end{aligned}$$
so the statement is true for n = k + 1.
4. Hence, by induction $3^n > 2^n$ for all natural numbers n.
Example: Prove that for any integer n ≥ 1, $2^{2n} - 1$ is divisible by 3.
Solution:
1. Prove for n = 1: $2^2 - 1 = 3$, which is divisible by 3, hence it is true.
2. Assume that the statement holds for n = k, that is, for k ≥ 1, $2^{2k} - 1$ is divisible by 3,
i.e., $2^{2k} - 1 = 3l$ for some l ∈ Z.
3. Prove for n = k + 1.
$$\begin{aligned}
2^{2(k+1)} - 1 &= 4 \cdot 2^{2k} - 1 \quad \text{but } 2^{2k} = 3l + 1 \text{ by the inductive hypothesis} \\
&= 4(3l + 1) - 1 \\
&= 12l + 4 - 1 \\
&= 12l + 3 \\
&= 3(4l + 1),
\end{aligned}$$
which is divisible by 3, so the statement is true for n = k + 1.
4. Hence, by induction $2^{2n} - 1$ is divisible by 3 for all n ≥ 1.
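The four worked examples of this section can be checked for small n with a short script. This is only a sanity-check sketch: the helper names and the upper limit of 200 are arbitrary choices made here, and a finite check does not replace the induction argument.

```python
def sum_first_n(n):
    """1 + 2 + ... + n, computed term by term."""
    return sum(range(1, n + 1))

def sum_first_n_odds(n):
    """1 + 3 + 5 + ... + (2n - 1), computed term by term."""
    return sum(2 * j - 1 for j in range(1, n + 1))

ns = range(1, 201)
results = {
    "1 + 2 + ... + n = n(n+1)/2": all(sum_first_n(n) == n * (n + 1) // 2 for n in ns),
    "1 + 3 + ... + (2n-1) = n^2": all(sum_first_n_odds(n) == n * n for n in ns),
    "3^n > 2^n": all(3**n > 2**n for n in ns),
    "3 divides 2^(2n) - 1": all((2**(2 * n) - 1) % 3 == 0 for n in ns),
}
for claim, holds in results.items():
    print(claim, holds)
```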
Chapter 5
Operations and Structures
We use the following sets: C, the set of complex numbers; Z, the set of integers; N, the set of
natural numbers; Q, the set of rational numbers; and R, the set of real numbers.
5.1
Operations
Operations (such as addition) that involve two input values, for example, 2 + 3, are called
binary operations. Those that involve only one input value, such as finding the square
root of a number (for example, $\sqrt{8}$), are called unary operations. Others that involve three
input values are called ternary operations.
Definition 5.1.1. An operation on a non-empty set A is a mapping from A × A to A.
Definition 5.1.2. A mapping ∗ : A × A −→ A is called a binary operation on the set A.
5.2
Idea
An operation on A is the combining of arbitrary elements a and b of A in some prescribed way
to obtain a unique element c of A i.e., a ∗ b = c. Binary operations are usually represented by
symbols like ∗, ·, +, ◦. For example, addition (+) and multiplication (·) are binary operations
on the set Z of integers.
Example 5.2.1. (i) Addition (+) is an operation on Z since + : Z × Z −→ Z, i.e., the sum of
two integers is always an integer; for example, 1 + 2 = 3.
(ii) Let P(S) be the power set of S, i.e., P(S) = {T |T ⊆ S}. ∪ and ∩ are both operations
on P(S).
5.2.1
Properties
Definition 5.2.1. Let ∗ be an operation on A, ∗ is commutative on A if for all a, b ∈
A, a ∗ b = b ∗ a.
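For example, consider the operation of multiplication applied to the set of 2 × 2 matrices:
$$\begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} \begin{pmatrix} 3 & 7 \\ 1 & 5 \end{pmatrix} = \begin{pmatrix} 5 & 17 \\ 4 & 20 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 3 & 7 \\ 1 & 5 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} = \begin{pmatrix} 3 & 34 \\ 1 & 22 \end{pmatrix},$$
so a ∗ b ≠ b ∗ a. If, on the other hand, we take ∗ as (+), then a ∗ b = b ∗ a for all pairs of matrices a and b, i.e., for example
$$\begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} + \begin{pmatrix} 3 & 7 \\ 1 & 5 \end{pmatrix} = \begin{pmatrix} 3 & 7 \\ 1 & 5 \end{pmatrix} + \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} = \begin{pmatrix} 4 & 9 \\ 1 & 9 \end{pmatrix}.$$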
Definition 5.2.2. ∗ is associative on A, if for all a, b, c ∈ A we have (a ∗ b) ∗ c = a ∗ (b ∗ c).
For example, (i) Addition is both commutative and associative on Z. (ii) ∪ and ∩ are
commutative and associative on P(S).
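The 2 × 2 matrix example used to illustrate commutativity can be reproduced in a few lines. This is a sketch under simple assumptions: matrices are stored as nested lists, and the helper names `mat_mul` and `mat_add` are introduced here for illustration only.

```python
def mat_mul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

def mat_add(x, y):
    """Entrywise sum of two 2x2 matrices."""
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]

a = [[1, 2], [0, 4]]
b = [[3, 7], [1, 5]]

print(mat_mul(a, b))                    # [[5, 17], [4, 20]]
print(mat_mul(b, a))                    # [[3, 34], [1, 22]]
print(mat_add(a, b) == mat_add(b, a))   # True
```

Multiplication gives different results in the two orders, while addition does not, which matches the hand computation.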
5.2.2
Closure
Suppose we have a binary operation ∗ and apply it to the elements a, b ∈ A to produce a ∗ b.
It is important to know whether or not a ∗ b belongs to the original set A. If a ∗ b ∈ A for
every pair a, b ∈ A, then we say A is closed under the binary operation ∗. For example,
Z+ is not closed under subtraction, since, for example, 2 − 5 = −3 ∉ Z+, but Z+ is closed
under addition and multiplication. A is closed under the binary operation ∗ if and only if
a, b ∈ A =⇒ a ∗ b ∈ A.
5.2.3
Identity Elements
An identity element is an element that when involved in an operation with another element
does not change the value of that element.
Example 5.2.2. Find the identity element for the operation defined as a ◦ b = a + b + 2.
Let a + b + 2 = a; then b = −2. Since also (−2) ◦ a = −2 + a + 2 = a, the identity element is −2.
Consider the product a ∗ b = c of a, b ∈ A, where A is closed under the binary operation ∗.
a ∗ b is called the right hand product of b with a and b ∗ a is called the left hand product of
b with a. For example, in the set R under the operation +, 3 + 0 = 3. We have formed
the right hand sum of 0 and 3; addition of 0 does not affect the number 3. If in any set
A, there is an element i for which, for all elements a ∈ A, a ∗ i = i ∗ a = a, then we call
i an identity element or neutral element for set A under the operation ∗. An identity
element, when it exists, is unique. For example, consider P(S), the power set of S. The empty set ∅ acts as an identity
for the operation of union on P(S).
5.2.4
Inverse Elements
An inverse element is an element that, when involved in an operation with another element, results
in the identity element for that operation.
Consider the set R: given a real number, say $2\frac{1}{2}$, we can obtain another real number
$-2\frac{1}{2}$ with the property that $2\frac{1}{2} + (-2\frac{1}{2}) = 0$. For all x ∈ R, x + (−x) = 0 and (−x) + x = 0.
The element (−x) is called an inverse element of x. We only seek inverse elements in sets
which have an identity element. Take a set A with binary operation ∗, written (A, ∗). Suppose that
there is an identity element i ∈ A and let a ∈ A. If there is an element b in A such that
a ∗ b = b ∗ a = i, then b is called the inverse of a and we write $b = a^{-1}$.
Example 5.2.3. (i) Relative to + on Z, the inverse of a is −a, for all a ∈ Z.
(ii) Relative to · on Z only 1 and −1 have inverses.
(iii) Relative to multiplication on Q, if a ∈ Q and a ≠ 0, then $a^{-1} = \frac{1}{a}$.
5.3
Operation Tables (Cayley Table)
Suppose that A is a small finite set and ∗ is an operation on
A. We can write out a complete description of ∗ as follows. Let A = {a, b, c} and ∗ be such
that
a ∗ a = a, a ∗ b = b, a ∗ c = c,
b ∗ a = b, b ∗ b = c, b ∗ c = a,
c ∗ a = c, c ∗ b = a, c ∗ c = b.
The situation is easily improved by recording the information in the form of a table.
∗ a b c
a a b c
b b c a
c c a b
It is easy to determine the existence of special elements from the table and also whether or
not the operation is commutative.
Commutativity of the operation corresponds to symmetry in the table with respect to the
diagonal i.e., if we fold the table along the diagonal, then the operation is commutative
only when the corresponding entries are the same. The existence of a left identity can be
determined simply by locating a row that is a repetition of the column headings. Element a
is a left identity for ∗. A right identity is found (if one exists) by locating a column that is
a repeat of the row headings. Again a is a right identity for ∗. Putting these two together
yields a method for determining the existence (non-existence) of an identity relative to a
given operation.
Example 5.3.1. The Cayley table for the set S = {1, i, −1, −i} with operation multiplication is
×   1   i   −1  −i
1   1   i   −1  −i
i   i   −1  −i  1
−1  −1  −i  1   i
−i  −i  1   i   −1
We test four properties: closure, associativity, identity and inverses. All the results
are members of the original set {1, i, −1, −i}, therefore the set is closed under multiplication. The operation is associative,
for example, (1 × i) × (−i) = i × (−i) = 1 and 1 × (i × (−i)) = 1 × 1 = 1. The identity is 1.
There is exactly one 1 in every row of the table, so each element has a unique inverse. Note that the table
is symmetric about the leading diagonal, hence the operation is also commutative.
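The tests applied to this Cayley table can be carried out by brute force for any small finite set. The sketch below assumes Python's built-in complex numbers (with `1j` standing for i); the function name `table_properties` and the dictionary keys are illustrative, not notation from the notes.

```python
from itertools import product

def table_properties(elements, op):
    """Check closure, associativity, identity, inverses and commutativity."""
    pairs = list(product(elements, repeat=2))
    closed = all(op(a, b) in elements for a, b in pairs)
    associative = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    # Two-sided identities: e * a == a * e == a for every a.
    identities = [
        e for e in elements
        if all(op(e, a) == a and op(a, e) == a for a in elements)
    ]
    identity = identities[0] if identities else None
    has_inverses = identity is not None and all(
        any(op(a, b) == identity and op(b, a) == identity for b in elements)
        for a in elements
    )
    commutative = all(op(a, b) == op(b, a) for a, b in pairs)
    return {
        "closed": closed,
        "associative": associative,
        "identity": identity,
        "has_inverses": has_inverses,
        "commutative": commutative,
    }

S = [1, 1j, -1, -1j]                      # the set {1, i, -1, -i}
props = table_properties(S, lambda a, b: a * b)
print(props)
```

Because every product here only involves the values 0 and ±1 in the real and imaginary parts, the floating-point comparisons are exact for this particular set.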
5.4
Groups
Definition 5.4.1. A group (G, ∗) is an algebraic structure, where G is a set and ∗ is a
composition on that set, satisfying the following laws
1. Closure : ∀a, b ∈ G, a ∗ b ∈ G.
2. Associativity : ∀a, b, c ∈ G, a ∗ (b ∗ c) = (a ∗ b) ∗ c.
3. Neutral Element : there is a unique e in G such that ∀a ∈ G, e ∗ a = a ∗ e = a.
4. Inverse Elements : ∀a ∈ G, ∃ a unique $a^{-1}$ ∈ G such that $a \ast a^{-1} = a^{-1} \ast a = e$.
A group can be finite or infinite. The number |G| denotes the order of G, i.e., the number of
elements in the group G.
Example 5.4.1.
1. (Q − {0}, ·) (multiplication on the non-zero rationals)
2. (Z, +) (addition of integers)
3. (R, +) (addition of the reals)
4. (R − {0}, ·) (multiplication on the non-zero reals)
Definition 5.4.2. An abelian group or commutative group, is a group for which the
commutative axiom holds. i.e., a ∗ b = b ∗ a for every a, b ∈ G.
Theorem 5.4.1.
1. The identity of a group is unique.
2. Every element of a group has a unique inverse.
3. For all a, b in a group (G, ∗), we have $(a \ast b)^{-1} = b^{-1} \ast a^{-1}$.
4. Let e be the identity in (G, ∗). An element a ∈ G is idempotent if a ∗ a = a. In G the
only idempotent element is e.
5. If a, b, c ∈ G, then a ∗ c = b ∗ c =⇒ a = b. (Cancellation Law)
Proof of 5. Suppose that a ∗ c = b ∗ c. Then $(a \ast c) \ast c^{-1} = (b \ast c) \ast c^{-1}$. Hence $a \ast (c \ast c^{-1}) =
b \ast (c \ast c^{-1})$ (by the associativity law). Hence a ∗ e = b ∗ e (by the inverse property), and so
a = b (by the identity property).
5.5
Permutations
Let A be the set {1, 2, · · · , n}. The set of permutations of A is denoted by $S_n$. A permutation
σ ∈ $S_n$ is usually represented as
$$\begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}$$
where σ maps 1 to σ(1), σ maps 2 to σ(2) and so on. The order of the columns in this representation of σ is immaterial, for
example
$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 1 & 3 \end{pmatrix}$$
and
Chapter 6
Introduction to Probability Theory
6.1
Probability
Probability theory provides a mathematical foundation to concepts such as “probability”,
“information”, “belief”, “uncertainty”, “confidence”, “randomness”, “variability”, “chance”
and “risk”. Probability theory is important to empirical scientists because it gives them
a rational framework to make inferences and test hypotheses based on uncertain empirical
data. Probability theory is also useful to engineers building systems that have to operate
intelligently in an uncertain world. For example, some of the most successful approaches
in machine perception (e.g., automatic speech recognition, computer vision) and artificial
intelligence are based on probabilistic models. Moreover probability theory is also proving
very valuable as a theoretical framework for scientists trying to understand how the brain
works. Many computational neuroscientists think of the brain as a probabilistic computer
built with unreliable components, i.e., neurons, and use probability theory as a guiding
framework to understand the principles of computation used by the brain. Consider the
following examples:
• You need to decide whether a coin is loaded (i.e., whether it tends to favor one side
over the other when tossed). You toss the coin 6 times and in all cases you get “Tails”.
Would you say that the coin is loaded?
• You are trying to figure out whether newborn babies can distinguish green from red. To
do so you present two colored cards (one green, one red) to 6 newborn babies. You make
sure that the 2 cards have equal overall luminance so that they are indistinguishable
if recorded by a black and white camera. The 6 babies are randomly divided into two
groups. The first group gets the red card on the left visual field, and the second group
gets it on the right visual field. You find that all 6 babies look longer at the red card than the
green card. Would you say that babies can distinguish red from green?
• A pregnancy test has 99% sensitivity (i.e., 99 out of 100 pregnant women test positive)
and 95% specificity (i.e., 95 out of 100 non pregnant women test negative). A woman
believes she has a 10% chance of being pregnant. She takes the test and tests positive.
How should she combine her prior beliefs with the results of the test?
• You need to design a system that detects a sinusoidal tone of 1000Hz in the presence
of white noise. How should you design the system to solve this task optimally?
• How should the photo receptors in the human retina be interconnected to maximize
information transmission to the brain?
While these tasks appear different from each other, they all share a common problem: The
need to combine different sources of uncertain information to make rational decisions. Probability theory provides a very powerful mathematical framework to do so. We now go into
the mathematical aspects of probability theory.
6.2
Sample Spaces
A set S that consists of all possible outcomes of a random experiment is called a sample
space, and each outcome is called a sample point. Often there will be more than one sample
space that can describe outcomes of an experiment, but there is usually only one that will
provide the most information.
Example 6.2.1. If we toss a die, then one sample space is given by {1, 2, 3, 4, 5, 6} while another is {even, odd}. It is clear, however, that the latter would not be adequate to determine,
for example, whether an outcome is divisible by 3.
The sample space is also called the outcome space, reference set, and universal set.
It is often useful to portray a sample space graphically. In such cases, it is desirable to
use numbers in place of letters whenever possible. If a sample space has a finite number of
points, it is called a finite sample space. If it has as many points as there are natural numbers
1, 2, 3, . . . , it is called a countably infinite sample space. If it has as many points as there are
in some interval on the x axis, such as 0 ≤ x ≤ 1, it is called a noncountably infinite sample
space. A sample space that is finite or countably infinite is often called a discrete sample space,
while one that is noncountably infinite is called a nondiscrete sample space.
Example 6.2.2. The sample space resulting from tossing a die yields a discrete sample
space. However, picking any number, not just integers, from 1 to 10, yields a non-discrete
sample space.
6.3
Events
We have defined outcomes as the elements of a sample space S. In practice, we are interested
in assigning probability values not only to outcomes but also to sets of outcomes. For
example, we may want to know the probability of getting an even number when rolling a
die. In other words, we want the probability of the set {2, 4, 6}. An event is a subset A of
the sample space S, i.e., it is a set of possible outcomes. If the outcome of an experiment is
an element of A, we say that the event A has occurred. An event consisting of a single point
of S is called a simple or elementary event.
As particular events, we have S itself, which is the sure or certain event since an element of
S must occur, and the empty set ∅, which is called the impossible event because an element
of ∅ cannot occur.
By using set operations on events in S, we can obtain other events in S. For example, if A
and B are events, then
1. A ∪ B is the event “either A or B or both.” A ∪ B is called the union of A and B.
2. A ∩ B is the event “both A and B.” A ∩ B is called the intersection of A and B.
3. A′ is the event “not A.” A′ is called the complement of A.
4. A − B = A ∩ B′ is the event “A but not B.” In particular, A′ = S − A.
If the sets corresponding to events A and B are disjoint, i.e., A ∩ B = ∅, we often say that
the events are mutually exclusive. This means that they cannot both occur. We say that
a collection of events A1 , A2 , . . . , An is mutually exclusive if every pair in the collection is
mutually exclusive.
Example 6.3.1. Consider an experiment of tossing a coin twice, let A be the event “at least
one head occurs” and B the event “the second toss results in a tail.” Find the events A ∪ B,
A ∩ B, A′ and A − B.
Solution: We observe that A = {HT, TH, HH}, B = {HT, TT} and so we have
A ∪ B = {HT, TH, HH, TT} = S,
A ∩ B = {HT},
A′ = {TT},
A − B = {TH, HH}.
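Since events are sets, the computations in this example map directly onto set operations in code. The sketch below assumes outcomes are written as two-letter strings such as "HT" (first toss, then second toss); the variable names are illustrative.

```python
# Sample space for two coin tosses; first letter = first toss.
S = {"HH", "HT", "TH", "TT"}

A = {s for s in S if "H" in s}        # at least one head occurs
B = {s for s in S if s[1] == "T"}     # the second toss results in a tail

union = A | B                         # A or B or both
intersection = A & B                  # both A and B
complement_A = S - A                  # not A
difference = A - B                    # A but not B

print(sorted(union), sorted(intersection), sorted(complement_A), sorted(difference))
```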
6.4
The Concept of Probability
In any random experiment there is always uncertainty as to whether a particular event will
or will not occur. As a measure of the chance, or probability, with which we can expect the
event to occur, it is convenient to assign a number between 0 and 1. If we are sure or certain
that an event will occur, we say that its probability is 100% or 1. If we are sure that the
event will not occur, we say that its probability is zero. If, for example, the probability is
1/4, we would say that there is a 25% chance it will occur and a 75% chance that it will not
occur. Equivalently, we can say that the odds against occurrence are 75% to 25%, or 3 to 1.
There are two important procedures by means of which we can estimate the probability
of an event.
1. CLASSICAL APPROACH: If an event can occur in h different ways out of a total
of n possible ways, all of which are equally likely, then the probability of the event is
h/n.
Example 6.4.1. Suppose we want to know the probability that a head will turn up in a
single toss of a coin. Since there are two equally likely ways in which the coin can come
up, namely heads and tails (assuming it does not roll away or stand on its edge), and of
these two ways a head can arise in only one way, we reason that the required probability
is 1/2. In arriving at this, we assume that the coin is fair, i.e., not loaded in any way.
2. FREQUENCY APPROACH: If after n repetitions of an experiment, where n is
very large, an event is observed to occur in h of these, then the probability of the event
is h/n. This is also called the empirical probability of the event.
Example 6.4.2. If we toss a coin 1000 times and find that it comes up heads 532 times,
we estimate the probability of a head coming up to be 532/1000 = 0.532.
Both the classical and frequency approaches have serious drawbacks, the first because the
words “equally likely” are vague and the second because the “large number” involved is
vague. Because of these difficulties, mathematicians have been led to an axiomatic approach
to probability.
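The frequency approach can be illustrated by simulation. The following is a minimal sketch that assumes a fair coin modelled with Python's pseudo-random generator; the function name, the seed and the trial counts are arbitrary choices made here. It also shows the vagueness mentioned above: the estimate h/n depends on how large n is taken to be.

```python
import random

def empirical_probability(trials, seed=0):
    """Estimate P(head) as h/n from simulated tosses of a fair coin."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials

for n in (10, 1000, 100000):
    print(n, empirical_probability(n))
```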
6.5
The Axioms of Probability
Suppose we have a sample space S. If S is discrete, all subsets correspond to events and
conversely; if S is nondiscrete, only special subsets (called measurable) correspond to events.
To each event A in the class C of events, we associate a real number P (A). The P is called
a probability function, and P (A) the probability of the event, if the following axioms are
satisfied.
Axiom 1. For every event A in class C, P (A) ≥ 0
Axiom 2. For the sure or certain event S in the class C, P (S) = 1
Axiom 3. For any number of mutually exclusive events A1 , A2 , . . . , in the class C,
P (A1 ∪ A2 ∪ . . .) = P (A1 ) + P (A2 ) + . . . In particular, for two mutually exclusive events
A1 and A2 , P (A1 ∪ A2 ) = P (A1 ) + P (A2 ).
6.6
Some Important Theorems on Probability
From the above axioms we can now prove various theorems on probability that are important
in further work.
Theorem 6.6.1. If A1 ⊂ A2 , then P (A1 ) ≤ P (A2 ) and P (A2 − A1 ) = P (A2 ) − P (A1 ).
Theorem 6.6.2. For every event A, 0 ≤ P (A) ≤ 1, i.e., a probability between 0 and 1.
Theorem 6.6.3. For ∅, the empty set, P (∅) = 0, i.e., the impossible event has probability
zero.
Theorem 6.6.4. If A′ is the complement of A, then P (A′) = 1 − P (A).
Theorem 6.6.5. If A = A1 ∪ A2 ∪ A3 ∪ . . . ∪ An , where A1 , A2 , . . . , An are mutually exclusive
events, then
P (A) = P (A1 ) + P (A2 ) + P (A3 ) + . . . + P (An ).
In particular, if A = S, the sample space, then
P (A1 ) + P (A2 ) + P (A3 ) + . . . + P (An ) = 1.
Theorem 6.6.6. If A and B are any two events, then P (A∪B) = P (A)+P (B)−P (A∩B).
More generally, if A1 , A2 , A3 are any three events, then
P (A1 ∪A2 ∪A3 ) = P (A1 )+P (A2 )+P (A3 )−P (A1 ∩A2 )−P (A2 ∩A3 )−P (A3 ∩A1 )+P (A1 ∩A2 ∩A3 ).
Generalizations to n events can also be made.
Theorem 6.6.7. For any events A and B, P (A) = P (A ∩ B) + P (A ∩ B′).
Theorem 6.6.8. If an event A must result in the occurrence of one of the mutually exclusive
events A1 , A2 , . . . , An , then
P (A) = P (A ∩ A1 ) + P (A ∩ A2 ) + · · · + P (A ∩ An ).
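Several of these theorems can be checked on a concrete finite sample space. The sketch below assumes a fair die with equally likely outcomes and uses exact fractions; the events A (even) and B (less than 4) and the function name `P` are illustrative choices, not taken from the notes.

```python
from fractions import Fraction

S = frozenset(range(1, 7))            # sample space of a fair die

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6}                         # even number
B = {1, 2, 3}                         # number less than 4

complement_ok = P(S - A) == 1 - P(A)                    # Theorem 6.6.4
union_ok = P(A | B) == P(A) + P(B) - P(A & B)           # Theorem 6.6.6
split_ok = P(A) == P(A & B) + P(A & (S - B))            # Theorem 6.6.7
print(complement_ok, union_ok, split_ok)                # True True True
```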
6.7
Assignment of Probabilities
If a sample space S consists of a finite number of outcomes $a_1, a_2, \ldots, a_n$, then by Theorem
6.6.5,
$$P(A_1) + P(A_2) + \cdots + P(A_n) = 1$$
where $A_1, A_2, \ldots, A_n$ are elementary events given by $A_i = \{a_i\}$.
It follows that we can arbitrarily choose any nonnegative numbers for the probabilities of
these simple events as long as the previous equation is satisfied. In particular, if we assume
equal probabilities for all simple events, then
$$P(A_k) = \frac{1}{n}, \quad k = 1, 2, \ldots, n.$$
And if A is any event made up of h such simple events, we have
$$P(A) = \frac{h}{n}.$$
This is equivalent to the classical approach to probability. We could of course use other
procedures for assigning probabilities, such as the frequency approach.
Assigning probabilities provides a mathematical model, the success of which must be tested
by experiment in much the same manner that the theories in physics or other sciences must
be tested by experiment.
Example 6.7.1. A single die is tossed once. Find the probability of a 2 or 5 turning up.
Solution: The sample space is S = {1, 2, 3, 4, 5, 6}. If we assign equal probabilities to
the sample points, i.e., if we assume that the die is fair, then
$$P(1) = P(2) = \cdots = P(6) = \frac{1}{6}.$$
The event that either 2 or 5 turns up is indicated by 2 ∪ 5. Therefore,
$$P(2 \cup 5) = P(2) + P(5) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}.$$
6.8
Conditional Probability
Let A and B be two events such that P (A) > 0. Denote by P (B|A) the probability of B given
that A has occurred. Since A is known to have occurred, it becomes the new sample space
replacing the original S. From this we are led to the definition
$$P(B|A) \equiv \frac{P(A \cap B)}{P(A)} \qquad (6.1)$$
or
$$P(A \cap B) \equiv P(A) P(B|A). \qquad (6.2)$$
In words, this is saying that the probability that both A and B occur is equal to the probability that A occurs times the probability that B occurs given that A has occurred. We call
P (B|A) the conditional probability of B given A, i.e., the probability that B will occur given
that A has occurred. It is easy to show that conditional probability satisfies the axioms of
probability previously discussed.
Example 6.8.1. Find the probability that a single toss of a die will result in a number less
than 4 if
(a) no other information is given and
(b) it is given that the toss resulted in an odd number.
Solution:
(a) Let B denote the event {less than 4}. Since B is the union of the events 1, 2, or 3
turning up, we see by Theorem 6.6.5 that
$$P(B) = P(1) + P(2) + P(3) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$$
assuming equal probabilities for the sample points.
(b) Letting A be the event {odd number}, we see that $P(A) = \frac{3}{6} = \frac{1}{2}$. Also, $P(A \cap B) = \frac{2}{6} = \frac{1}{3}$. Then
$$P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{1/3}{1/2} = \frac{2}{3}.$$
Hence, the added knowledge that the toss results in an odd number raises the probability
from 1/2 to 2/3.
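The same computation can be written out using definition (6.1) directly. This is a sketch assuming a fair die with equally likely outcomes; the function names `P` and `conditional` are introduced here for illustration.

```python
from fractions import Fraction

S = set(range(1, 7))                      # fair die

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))

def conditional(B, A):
    """P(B | A) = P(A and B) / P(A), defined only when P(A) > 0."""
    return P(A & B) / P(A)

B = {x for x in S if x < 4}               # less than 4
A = {x for x in S if x % 2 == 1}          # odd number

print(P(B), conditional(B, A))            # 1/2 2/3
```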
6.9
Theorems on Conditional Probability
Theorem 6.9.1. For any three events $A_1, A_2, A_3$, we have
$$P(A_1 \cap A_2 \cap A_3) = P(A_1) P(A_2|A_1) P(A_3|A_1 \cap A_2). \qquad (6.3)$$
In words, the probability that $A_1$ and $A_2$ and $A_3$ all occur is equal to the probability
that $A_1$ occurs times the probability that $A_2$ occurs given that $A_1$ has occurred times the
probability that $A_3$ occurs given that both $A_1$ and $A_2$ have occurred. The result is easily
generalized to n events.
Theorem 6.9.2. If an event A must result in one of the mutually exclusive events $A_1, A_2, \ldots, A_n$,
then
$$P(A) = P(A_1) P(A|A_1) + P(A_2) P(A|A_2) + \cdots + P(A_n) P(A|A_n). \qquad (6.4)$$
6.10
Independent Events
If P (B|A) = P (B), i.e., the probability of B occurring is not affected by the occurrence or
nonoccurrence of A, then we say that A and B are independent events. This is equivalent
to
$$P(A \cap B) = P(A) P(B). \qquad (6.5)$$
Conversely, if this equation holds, then A and B are independent.
We say that three events $A_1, A_2, A_3$ are independent if they are pairwise independent,
$$P(A_j \cap A_k) = P(A_j) P(A_k), \quad j \neq k, \quad \text{where } j, k = 1, 2, 3, \qquad (6.6)$$
and
$$P(A_1 \cap A_2 \cap A_3) = P(A_1) P(A_2) P(A_3). \qquad (6.7)$$
Both of these properties must hold in order for the events to be independent. Independence
of more than three events is defined similarly.
Note: In order to use this multiplication rule, all of your events must be independent.
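To see why both conditions are needed for three events, consider a standard illustration that is not taken from the notes: toss a fair coin twice and look at the events "first toss is a head", "second toss is a head" and "both tosses agree". The sketch below, with illustrative names, shows that these events are pairwise independent and yet the product rule for all three fails.

```python
from fractions import Fraction
from itertools import combinations

S = {"HH", "HT", "TH", "TT"}              # two tosses of a fair coin

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))

A1 = {"HH", "HT"}                         # first toss is a head
A2 = {"HH", "TH"}                         # second toss is a head
A3 = {"HH", "TT"}                         # both tosses agree

# Pairwise condition: P(Aj and Ak) = P(Aj) P(Ak) for every pair.
pairwise = all(P(X & Y) == P(X) * P(Y) for X, Y in combinations([A1, A2, A3], 2))
# Triple condition: P(A1 and A2 and A3) = P(A1) P(A2) P(A3).
triple = P(A1 & A2 & A3) == P(A1) * P(A2) * P(A3)
print(pairwise, triple)                   # True False
```

Here each pairwise intersection is {HH} with probability 1/4 = (1/2)(1/2), but the triple intersection is also {HH}, with probability 1/4 rather than 1/8.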
6.11
Bayes’ Theorem or Rule
Suppose that A1 , A2 , . . . , An are mutually exclusive events whose union is the sample space
S, i.e., one of the events must occur. Then if A is any event, we have the following important
theorem:
Theorem 6.11.1. (Bayes’ Rule):
$$P(A_k|A) = \frac{P(A_k) P(A|A_k)}{\sum_{j=1}^{n} P(A_j) P(A|A_j)}. \qquad (6.8)$$
This enables us to find the probabilities of the various events A1 , A2 , . . . , An that can
occur. For this reason Bayes’ theorem is often referred to as a theorem on the probability of
causes.
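As an illustration, Bayes' rule can be applied to the pregnancy test question posed at the start of the chapter: a prior belief of 10%, 99 out of 100 pregnant women testing positive, and 95 out of 100 non-pregnant women testing negative. The sketch below is one way to set up the computation; the function name `bayes_posterior` and the list layout are assumptions made here, with the two mutually exclusive events taken to be "pregnant" and "not pregnant" and A the event "test is positive".

```python
def bayes_posterior(priors, likelihoods, k):
    """P(A_k | A) from priors P(A_j) and likelihoods P(A | A_j)."""
    # The sum over j of P(A_j) P(A | A_j) is the denominator of Bayes' rule.
    total = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[k] * likelihoods[k] / total

priors = [0.10, 0.90]               # P(pregnant), P(not pregnant)
likelihoods = [0.99, 1 - 0.95]      # P(positive | pregnant), P(positive | not pregnant)

posterior = bayes_posterior(priors, likelihoods, 0)
print(round(posterior, 4))          # 0.6875
```

Under these assumptions a positive result raises the probability of pregnancy from 10% to about 69%.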