UNIVERSITY OF ZIMBABWE
DEPARTMENT OF MATHEMATICS
DISCRETE MATHS NOTES
Mr G.S Maridza

Chapter 1 Number systems

1.1 Number Systems

Mathematics has its own language with numbers as the alphabet. The language is given structure with the aid of connective symbols, rules of operation, and a rigorous mode of thought (logic). The number systems that we use in calculus are the natural numbers, the integers, the rational numbers, and the real numbers. Let us describe each of these:

1. The natural numbers are the system of positive counting numbers 1, 2, 3, . . . . We denote the set of all natural numbers by N.
N = {1, 2, 3, 4, 5, 6, 7, 8, . . . }.

2. The integers are the positive and negative whole numbers and zero, . . . , −3, −2, −1, 0, 1, 2, 3, . . . . We denote the set of all integers by Z.
Z = {. . . , −4, −3, −2, −1, 0, 1, 2, 3, 4, . . . }.

3. The rational numbers are quotients of integers or fractions, such as 3/2, −5/4. Any number of the form p/q, with p, q ∈ Z and q ≠ 0, is a rational number. We denote the set of all rational numbers by Q.
Q = {p/q | p, q ∈ Z, q ≠ 0}.

4. The real numbers are the set of all decimals, both terminating and non-terminating. We denote the set of all real numbers by R.

A decimal number of the form x = 3.16792 is actually a rational number, for it represents
x = 3.16792 = 316792/100000.

A decimal number of the form m = 4.27519191919 . . . , with a group of digits that repeats itself interminably, is also a rational number. To see this, notice that 100 · m = 427.519191919 . . . and therefore we may subtract
100m = 427.519191919 . . .
m = 4.27519191919 . . .
Subtracting, we see that 99m = 423.244, or
m = 423244/99000.
So, as we asserted, m is a rational number or quotient of integers. To indicate recurring decimals we sometimes place dots over the repeating cycle of digits, e.g., m = 4.2751̇9̇, 19/6 = 3.16̇.

Another kind of decimal number is one which has a non-terminating decimal expansion that does not keep repeating.
An example is π = 3.14159265 . . . . Such a number is irrational, that is, it cannot be expressed as the quotient of two integers.

In summary: There are three types of real numbers: (i) terminating decimals, (ii) non-terminating decimals that repeat, (iii) non-terminating decimals that do not repeat. Types (i) and (ii) are rational numbers. Type (iii) are irrational numbers.

The geometric representation of real numbers as points on a line is called the real axis. Between any two rational numbers on the line there are infinitely many rational numbers. This leads us to call the set of rational numbers an everywhere dense set.

Real numbers are characterised by three fundamental properties:
(a) Algebraic properties formalise the rules of calculation (addition, subtraction, multiplication, division). Example: 2(3 + 5) = 2 · 3 + 2 · 5 = 6 + 10 = 16.
(b) Order properties govern inequalities. Example: −3/4 < 1/3.
(c) Completeness means that there are “no gaps” on the real line.

Algebraic properties of the reals for addition (a, b, c ∈ R) are:
(A1) a + (b + c) = (a + b) + c. (associativity)
(A2) a + b = b + a. (commutativity)
(A3) There is a 0 such that a + 0 = a. (identity)
(A4) There is an x such that a + x = 0. (inverse)
Why these rules? They define an algebraic structure (commutative group).

Now define analogous algebraic properties for multiplication:
(M1) a(bc) = (ab)c.
(M2) ab = ba.
(M3) There is a 1 such that a · 1 = a.
(M4) There is an x such that ax = 1 for a ≠ 0.

Finally, connect multiplication and addition:
(D) a(b + c) = ab + ac. (distributivity)
These 9 rules define an algebraic structure called a field.

Order properties of the reals are:
(O1) for any a, b ∈ R, a ≤ b or b ≤ a. (totality of ordering)
(O2) if a ≤ b and b ≤ a, then a = b. (antisymmetry)
(O3) if a ≤ b and b ≤ c, then a ≤ c. (transitivity)
(O4) if a ≤ b, then a + c ≤ b + c. (order under addition)
(O5) if a ≤ b and c ≥ 0, then ac ≤ bc. (order under multiplication)

Some useful rules for calculations with inequalities are: If a, b, c are real numbers, then:
(a) if a < b and c < 0 ⇒ bc < ac.
(b) if a < b ⇒ −b < −a.
(c) if a > 0 ⇒ 1/a > 0.
(d) if a and b are both positive or both negative, then a < b ⇒ 1/b < 1/a.

The completeness property can be understood by the following construction of the real numbers. Start with the counting numbers 1, 2, 3, . . . .
N = {1, 2, 3, 4, . . . } natural numbers ⇒ Can we solve a + x = b for x?
Z = {. . . , −2, −1, 0, 1, 2, . . . } integers ⇒ Can we solve ax = b for x?
Q = {p/q | p, q ∈ Z, q ≠ 0} rational numbers ⇒ Can we solve x² = 2 for x?
R real numbers; for example, the positive solution to the equation x² = 2 is √2. This is an irrational number whose decimal representation is not eventually repeating.
⇒ N ⊂ Z ⊂ Q ⊂ R

In summary, the real numbers R are complete in the sense that they correspond to all points on the real line, i.e., there are no “holes” or “gaps”, whereas the rationals have “holes” (namely the irrationals).

You Try It: What type of real number is 3.41287548754875 . . . ? Can you express this number in more compact form?

Chapter 2 Basic Concepts of Set Theory

The purpose of this chapter is twofold: to provide an introduction to, or review of, the terminology, notation, and basic properties of sets, and, perhaps more important, to serve as a starting point for our primary goal: the development of the ability to discover and prove mathematical theorems. The emphasis in this chapter is on discovery, with particular attention paid to the kinds of evidence (e.g., specific examples, pictures) that mathematicians use to formulate conjectures about general properties.

2.1 A brief history of sets

A set is an unordered collection of objects, and as such a set is determined by the objects it contains. Before the 19th century it was uncommon to think of sets as completed objects in their own right.
Mathematicians were familiar with properties such as being a natural number, or being irrational, but it was rare to think of, say, the collection of rational numbers as itself an object. (There were exceptions. From Euclid mathematicians were used to thinking of geometric objects such as lines and planes and spheres which we might today identify with their sets of points.)

In the mid 19th century there was a renaissance in Logic. For thousands of years, since the time of Aristotle and before, learned individuals had been familiar with syllogisms as patterns of legitimate reasoning, for example: All men are mortal. Socrates is a man. Therefore Socrates is mortal. But syllogisms involved descriptions of properties. The idea of pioneers such as Boole was to assign a meaning as a set to these descriptions. For example, the two descriptions “is a man” and “is a male homo sapiens” both describe the same set, the set of all men. It was this objectification of meaning, understanding properties as sets, that led to a rebirth of Logic and Mathematics in the 19th century.

Cantor took the idea of set to a revolutionary level, unveiling its true power. By inventing a notion of size of set he was able to compare different forms of infinity and, almost incidentally, to shortcut several traditional mathematical arguments. But the power of sets came at a price; it came with dangerous paradoxes. The work of Boole and others suggested a programme exposited by Frege, and Russell and Whitehead, to build a foundation for all of Mathematics on Logic. Though to be more accurate, they were really reinventing Logic in the process, and regarding it as intimately bound up with a theory of sets. The paradoxes of set theory were a real threat to the security of the foundations. But with a lot of worry and care the paradoxes were sidestepped, first by Russell and Whitehead’s theory of stratified types and then more elegantly, in for example the influential work of Zermelo and Fraenkel.
The notion of set is now a cornerstone of Mathematics.

The formal development of set theory began in 1874 with the work of Georg Cantor (1845-1918). Since then, motivated particularly by the discovery of certain paradoxes (e.g., Russell’s paradox), logicians have made formal set theory and the foundations of mathematics a vital area of mathematical research, and mathematicians at large have incorporated the language and methods of set theory into their work, so that it permeates all of modern mathematics.

2.2 Sets and Elements

The notion of set is a primitive, or undefined, term in mathematics, analogous to point and line in plane geometry. Therefore, our starting point, rather than a formal definition, is an informal description of how the term “set” is generally viewed in applications to undergraduate mathematics.

Similar (but informal) words: collection, group, aggregate, bundle, ensemble, family, class.

Description: A set is a collection of objects which are called the members or elements of that set. For example:
1. The set of students in this room.
2. The English alphabet may be viewed as the set of letters of the English language.
3. The set of natural numbers, etc.

Sets can consist of elements of various nature: people, physical objects, numbers, signs, other sets, etc. A set is an ABSTRACT object; its members do not have to be physically collected together for them to constitute a set. The membership criteria for a set must in principle be well defined, and not vague. Sets can be finite or infinite.

2.3 Some Interesting Sets of Numbers

Let’s look at different types of numbers that we can have in our sets.

1. Natural Numbers
The set of natural numbers is {1, 2, 3, 4, . . . } and is denoted by N.

2. Integers
The set of integers is {. . . , −4, −3, −2, −1, 0, 1, 2, 3, 4, . . . } and is denoted by Z. The Z symbol comes from the German word Zahlen, which means numbers. The non-negative integers {0, 1, 2, 3, 4, . . . } are often denoted by Z+.
All natural numbers are integers.

3. Rational Numbers
The set of rational numbers is denoted by Q and consists of all fractional numbers, i.e., x ∈ Q if x can be written in the form p/q, where p, q ∈ Z with q ≠ 0.

4. Real Numbers
The real numbers are denoted by R.

5. Complex Numbers
The complex numbers are denoted by C.

2.4 Notation

1. A, B, C, . . . for sets.
2. a, b, c, . . . or x, y, z, . . . for members.
3. b ∈ A, if b belongs to A.
4. c ∉ A, if c does not belong to A.
5. ∅ is used for the empty set. There is exactly one set, the empty set or null set, which has no members at all.
6. A set with only one member is called a singleton or singleton set, for example, {x}.

2.5 Specification of Sets

One advantage of having an informal definition of the term set is that, through it, we can introduce some other terminology related to sets. The term element is one example, and the notion well-defined is another. The latter term relates to the primary requirement for any such description: Given an object, we must be able to determine whether or not the object lies in the described set.

2.5.1 Three Ways to Specify a Set

1. Listing all its members (List Notation).
2. By stating a property of its elements (Predicate Notation).
3. By defining a set of rules which generates (defines) its members (Recursive Rules).

List Notation

This is suitable for finite sets. We list the names of the elements of a set, separated by commas, and enclose them in braces. For example A = {2, 3, 5}, B = {a, b, d, m}, C = {George Washington, Bill Clinton}. Note that 2 ∈ A and George Washington ∈ C, but 8 ∉ A and k ∉ B.

Two important facts are: (i) the order in which elements are listed is irrelevant and (ii) an object should be listed only once in the list, since listing it more than once does not change the set. As an example, the set {1, 1, 2} is the same as the set {1, 2} (so that the representation {1, 1, 2} is never used) which, in turn, is the same as the set {2, 1}.
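The two facts about list notation can be checked with a short sketch. It assumes that Python's built-in set type is an acceptable stand-in for a finite mathematical set; the variable names are illustrative only and are not part of these notes.

```python
# A minimal sketch: Python's built-in set ignores order and repetition,
# just as list notation does.
A = {2, 3, 5}

# Membership: 2 ∈ A holds, 8 ∈ A does not.
two_in_A = 2 in A
eight_in_A = 8 in A

# {1, 1, 2}, {1, 2} and {2, 1} all describe the same set.
same_set = ({1, 1, 2} == {1, 2}) and ({1, 2} == {2, 1})

print(two_in_A, eight_in_A, same_set)  # True False True
```

Writing an element twice does not create a second copy, which is why the representation {1, 1, 2} is never needed.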
Three Dot Abbreviation

For example, {1, 2, . . . , 100}.

Predicate Notation

We describe a set in terms of one or more properties to be satisfied by objects in the set, and by those objects only. Such a description is formulated in so-called set-builder notation, that is, in the form A = {x | x satisfies some property or properties}, which we read “A is the set of all objects x such that x satisfies . . . .”. For example, {x | x is a natural number and x < 8}. Reading: the set of all x such that x is a natural number and is less than 8. Further examples are
(i) {x | x is a positive number}
(ii) {x | x is a letter of the Russian alphabet}.
The general form is {x | P(x)} where P is some predicate (condition, property). In all these examples the vertical line is read “such that” and the set is understood to consist of all objects satisfying the preceding description, and only those objects.

Recursive Rules

For example, the set E of even numbers greater than 3 is given by
(a) 4 ∈ E
(b) if x ∈ E, then x + 2 ∈ E
(c) nothing else belongs to E.
The first rule is the basis of the recursion, the second one generates new elements from the elements defined before and the third rule restricts the defined set to the elements generated by (a) and (b).

The collection, out of which all sets under consideration may be formed, is called the universe of discourse or universal set, denoted by U. For our purposes a universal set is the set of all objects under discussion in a particular setting. A universal set will often be specified at the start of a problem involving sets, whereas in other situations a universal set is more or less clearly, but implicitly, understood as background to a problem. The role then of a universal set is to put some bounds on the nature of the objects that can be considered for membership in the sets involved in a given situation.

It is in connection with the description method that “well definedness” comes into play.
The rule or rules used in describing a set must be (i) meaningful, that is, use words and/or symbols with an understood meaning and (ii) specific and definitive, as opposed to vague and indefinite. Thus descriptions like G = {x | x is a goople} or E = {x | x % & 3} or Z = {x | x is a large state in the United States} do not define sets. The descriptions of G and E involve nonsense symbols or words, while the description of Z gives a purely subjective criterion for membership.

2.6 The Empty set (Null Set)

The fundamental property of a set is that we can assert of each object whether or not it is a member of the set. Consider a set constructed by asserting of each object that it is not a member of the set. This set has no members and is therefore called the empty set.

Definition 2.6.1. The null or empty set is the set that does not contain any elements, denoted by the Scandinavian letter ∅. Thus ∅ = {} = {x | x ≠ x}.

Example 2.6.1. Each of the following sets is empty: (i) {x ∈ R | x² = −1} (ii) {x ∈ Z | x² = 2}.

Theorem 2.6.1. There is exactly one empty set.

2.7 Identity and Cardinality

Two sets are identical if and only if (iff) they have the same elements or both are empty. So A = B iff, for every x, x ∈ A ⇔ x ∈ B.

Example 2.7.1. {0, 2, 4} = {x | x is an even non-negative integer less than 5}.

As the above example shows, equality of sets does not mean they have identical definitions; there are often many different ways of describing the same set. The definition of equality reflects rather the fact that a set is just a collection of objects. If we have to prove that the sets A and B are equal, it is often quite difficult to prove in one go that they have the same elements. What is usually done is to split the proof into two parts:
(a) Show that every member of A is a member of B.
(b) Show that every member of B is a member of A.

The number of elements in a set A is called the cardinality of A, denoted by |A|. The cardinality of a finite set is a natural number.
Infinite sets also have cardinalities but they are not natural numbers. The set A is said to be countable or enumerable if there is a way to list the elements of A.

2.8 Russell’s Paradox (antinomy)

A paradox (antinomy) is an apparently true statement that seems to lead to a logical self-contradiction. It is important to note that a given property P(x) does not necessarily determine a set, i.e., we cannot say that, given an arbitrary property P, there corresponds a set whose elements satisfy the property P.

Consider the following. There was once a barber. Wherever he lived, all of the men in this town either shaved themselves or were shaved by the barber. And this barber only shaved the men who did not shave themselves. Did the barber shave himself? Let’s say that he did shave himself. But he shaved only the men in town who did not shave themselves; therefore, he did not shave himself. But we see that every man in town either shaved himself or was shaved by the barber. So he did shave himself. We have a contradiction.

Russell observed that if S is a set, then either S ∈ S or S ∉ S, since a given object is either a member of a given set or is not a member of that set. Consider the set of all sets that are not members of themselves, R = {x | x is a set and x ∉ x}. Since R is an object, either R ∈ R or R ∉ R.
(i) Assume R ∈ R, then R is a set, and R ∉ R by definition.
(ii) Assume R ∉ R, then R ∈ R by definition of R, since we are assuming R is a set.
But we cannot have both R ∈ R and R ∉ R, so we reach a contradiction. In both cases we have inferred the paradox that R ∈ R iff R ∉ R. In other words, the assumption that R is a set has led to a contradiction and therefore there is no such thing, then, as the set of all sets. To avoid unnecessary paradoxes, we assume the existence of the universal set, U.

All this leads to the following problems:
1. There are things that are true in mathematics (based on assumptions).
2. There are things that are false.
3. There are things that are true that can never be proved.
4. There are things that are false that can never be disproved.
After this paradox was described, set theory had to be reformulated axiomatically as axiomatic set theory.

2.9 Inclusion

Definition 2.9.1. Having fixed our universal set U, if A and B are sets (with all members in U), we write A ⊆ B or B ⊇ A iff, for all x ∈ U, x ∈ A =⇒ x ∈ B. (⊆ is the set inclusion symbol.)

A set A is a subset of a set B iff every element of A is also an element of B. If A ⊆ B and A ≠ B, we call A a proper subset of B and write A ⊂ B.

Theorem 2.9.1. If A ⊆ B and B ⊆ C then A ⊆ C.
Proof. Let x ∈ A. Since A ⊆ B, we have x ∈ B, and given that B ⊆ C, we conclude that x ∈ C. Thus A ⊆ C.

Example 2.9.1.
(i) {a, b} ⊆ {d, a, b, e}
(ii) {a, b} ⊆ {a, b}
(iii) {a, b} ⊂ {d, a, b, e}
(iv) {a, b} ⊄ {a, b}.

Note that the empty set is a subset of every set, ∅ ⊆ A for every set A, and that for any set A, we have A ⊆ A.

2.10 Axiom of Extensionality

Theorem 2.10.1. For any two sets A and B, A = B ⇐⇒ A ⊆ B and B ⊆ A.

2.10.1 Power Sets

The set of all subsets of A is called the power set of A and is denoted by P(A). If A is finite, then |P(A)| = 2^|A|.

Example 2.10.1. If A = {a, b}, then P(A) = {∅, {a}, {b}, {a, b}}. From the above example, a ∈ A, {a} ⊆ A, {a} ∈ P(A), ∅ ⊆ A, ∅ ∉ A, ∅ ⊆ P(A), ∅ ∈ P(A).

2.11 Operations on Sets

Just as there is an “algebra of numbers” based on operations such as addition and multiplication, there is also an algebra of sets based on several fundamental operations of set theory. We develop properties of set algebra later in this chapter; for now our goal is to introduce the operations by which we are able to combine sets to get another set, just as in arithmetic we add or multiply numbers to get a number.

2.11.1 Union and Intersection

Let A and B be arbitrary sets. The union of A and B, written A ∪ B, is the set whose elements are just the elements of A or B or both.
A ∪ B := {x | x ∈ A or x ∈ B}.

Example 2.11.1. Let K = {a, b}, L = {c, d}, M = {b, d}, then K ∪ L = {a, b, c, d}, K ∪ M = {a, b, d}, L ∪ M = {b, c, d}, (K ∪ L) ∪ M = K ∪ (L ∪ M) = {a, b, c, d}, K ∪ K = K, K ∪ ∅ = ∅ ∪ K = K = {a, b}.

The intersection of A and B, written A ∩ B, is the set whose elements are just the elements of both A and B.
A ∩ B := {x | x ∈ A and x ∈ B}.

Example 2.11.2. K ∩ L = ∅, K ∩ M = {b}, L ∩ M = {d}, (K ∩ L) ∩ M = K ∩ (L ∩ M) = ∅, K ∩ K = K, K ∩ ∅ = ∅ ∩ K = ∅.

Observe also that the sets that result from the operation of union tend to be relatively large, whereas those obtained through intersection are relatively small.

2.12 Properties of ∪ and ∩

1. Every element x in A ∩ B belongs to both A and B, hence x belongs to A and x belongs to B. Thus A ∩ B is a subset of A and of B, i.e., A ∩ B ⊆ A and A ∩ B ⊆ B.
2. An element x belongs to the union A ∪ B if x belongs to A or x belongs to B, hence every element in A belongs to A ∪ B and every element in B belongs to A ∪ B, i.e., A ⊆ A ∪ B and B ⊆ A ∪ B.

Theorem 2.12.1. For any sets A and B, we have (i) A ∩ B ⊆ A ⊆ A ∪ B and (ii) A ∩ B ⊆ B ⊆ A ∪ B.

Theorem 2.12.2. The following are equivalent: A ⊆ B, A ∩ B = A, A ∪ B = B.
Proof. Suppose A ⊆ B and let x ∈ A. Then x ∈ B, hence x ∈ A ∩ B and A ⊆ A ∩ B. Since A ∩ B ⊆ A always holds, A ∩ B = A. Suppose A ∩ B = A and let x ∈ A. Then x ∈ A ∩ B, hence x ∈ A and x ∈ B. Therefore A ⊆ B. Suppose again that A ⊆ B and let x ∈ A ∪ B. Then x ∈ A or x ∈ B. If x ∈ A, then x ∈ B because A ⊆ B. In either case x ∈ B. Therefore A ∪ B ⊆ B. But B ⊆ A ∪ B. Therefore A ∪ B = B. Now suppose A ∪ B = B and let x ∈ A. Then x ∈ A ∪ B = B, hence x ∈ B. Therefore A ⊆ B.

Definition 2.12.1. Two sets A and B are called disjoint sets if the intersection of A and B is the null set, i.e., A ∩ B = ∅.

2.13 Difference and Complement

Definition 2.13.1. The difference A minus B, written A \ B or A − B, is the set obtained by removing from A all elements which are in B
(also called relative complement, or the complement of B relative to A).
A − B := {x | x ∈ A and x ∉ B}.

Example 2.13.1. K − L = {a, b}, K − K = ∅, K − M = {a}, K − ∅ = K, L − M = {c}, ∅ − K = ∅.

2.13.1 Symmetric Difference

Definition 2.13.2. A △ B = A ⊕ B := {x | x ∈ A or x ∈ B but not in both} or A △ B = A ⊕ B := (A ∪ B) \ (A ∩ B) = (A \ B) ∪ (B \ A).

The operation, complement, is unary rather than binary; we obtain a resultant set from a single given set rather than from two such sets. The role of the universal set is so important in calculating complements that we mention it explicitly in the following definition.

The complement of a set A is the set of elements which do not belong to A, i.e., the difference of the universal set U and A. Denote the complement of A by A′ or A^c.
A′ = {x | x ∈ U and x ∉ A} or A′ = U − A.
The complement of a set consists of all objects in the universe at hand that are not in the given set. Clearly the complement of A is very much dependent on the universal set, as well as on A itself. If A = {1}, then A′ is one thing if U = N, something quite different if U = R.

Example 2.13.2. Let U = N and let E = {2, 4, 6, . . . }, the set of all even numbers. Then E^c = {1, 3, 5, . . . }, the set of odd numbers.

2.14 Venn Diagrams

A simple and instructive way of illustrating the relationship between sets is the use of the so-called Venn-Euler diagrams or simply Venn diagrams.

[Venn diagrams of two overlapping sets A and B, with the regions A ∩ B, A ∪ B, A − B and B − A shaded in turn.]

2.15 Set Theoretic Equalities

1. Idempotent Laws
(i) X ∪ X = X (ii) X ∩ X = X.
2. Commutative Laws
(i) X ∪ Y = Y ∪ X (ii) X ∩ Y = Y ∩ X.
3. Associative Laws
(i) (X ∪ Y) ∪ Z = X ∪ (Y ∪ Z) (ii) (X ∩ Y) ∩ Z = X ∩ (Y ∩ Z).
4. Distributive Laws
(i) X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z) (ii) X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z).
5. Identity Laws
(i) X ∪ ∅ = X (ii) X ∪ U = U (iii) X ∩ ∅ = ∅ (iv) X ∩ U = X.
6. Complement Laws
(i) X ∪ X^c = U (ii) (X^c)^c = X (iii) X ∩ X^c = ∅ (iv) X − Y = X ∩ Y^c.
7. De Morgan’s Laws
(i) (X ∪ Y)^c = X^c ∩ Y^c (ii) (X ∩ Y)^c = X^c ∪ Y^c.
8. Consistency Principle
(i) X ⊆ Y iff X ∪ Y = Y (ii) X ⊆ Y iff X ∩ Y = X.

Example 2.15.1.

1. Show that (A^c)^c = A.
Proof. We need to show that A ⊆ (A^c)^c and (A^c)^c ⊆ A. Let x ∈ A, then x ∉ A^c. If x ∉ A^c, then x ∈ (A^c)^c. By definition of subsets, A ⊆ (A^c)^c. We want to show that (A^c)^c ⊆ A. Let y ∈ (A^c)^c, then y ∉ A^c. If y ∉ A^c, then y ∈ A. We have shown that y ∈ (A^c)^c =⇒ y ∈ A. Thus (A^c)^c ⊆ A. By equality of sets, (A^c)^c = A.

2. Show that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
Proof. Let D = A ∩ (B ∪ C) and E = (A ∩ B) ∪ (A ∩ C). We have to prove first that D ⊆ E. Let x ∈ D, then x ∈ A and x ∈ (B ∪ C). Since x ∈ (B ∪ C), either x ∈ B or x ∈ C or both. In case x ∈ B we have x ∈ A and x ∈ B, so x ∈ (A ∩ B). On the other hand, if x ∉ B, then we must have x ∈ C, so x ∈ (A ∩ C). Taking these two cases together, x ∈ (A ∩ B) or x ∈ (A ∩ C), so x ∈ E. Now, we prove that E ⊆ D. Let x ∈ E. Suppose first that x ∈ (A ∩ B), then x ∈ A and x ∈ B, so x ∈ A and x ∈ (B ∪ C), so x ∈ D. On the other hand, if x ∉ (A ∩ B), then x ∈ (A ∩ C), so again we obtain x ∈ A and x ∈ (B ∪ C), giving x ∈ D. Hence E ⊆ D. Hence both D ⊆ E and E ⊆ D and we conclude that D = E and consequently A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

2.16 Counting Elements in Sets

If A and B are disjoint sets, then |A ∪ B| = |A| + |B|, otherwise |A ∪ B| = |A| + |B| − |A ∩ B|.

Example 2.16.1. Let A = {a, b, c, d, e} and B = {d, e, f, g, h, i}, so that A ∪ B = {a, b, c, d, e, f, g, h, i} and A ∩ B = {d, e}. Since |A| = 5, |B| = 6, |A ∪ B| = 9, |A ∩ B| = 2, we have |A ∪ B| = |A| + |B| − |A ∩ B| = 5 + 6 − 2 = 9.

2.17 The Algebra of Sets

We have considered the problem of showing that two sets are the same; however, this technique becomes tedious should the expressions involved be at all complicated. We shall develop an algebra of sets to assist us in simplifying a given expression. The following basic laws are easily established.
Law 1 : (A^c)^c = A
Law 2 : A ∪ B = B ∪ A
Law 3 : A ∩ B = B ∩ A
Law 4 : A ∪ (B ∪ C) = (A ∪ B) ∪ C
Law 5 : A ∩ (B ∩ C) = (A ∩ B) ∩ C
Law 6 : A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
Law 7 : A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
Law 8 : (A ∪ B)^c = A^c ∩ B^c
Law 9 : (A ∩ B)^c = A^c ∪ B^c
Law 10 : U^c = ∅
Law 11 : ∅^c = U
Law 12 : A ∪ ∅ = A
Law 13 : A ∪ U = U
Law 14 : A ∩ U = A
Law 15 : A ∩ ∅ = ∅
Law 16 : A ∪ A^c = U
Law 17 : A ∩ A^c = ∅.

Example 2.17.1. By using the algebra of sets, show that A ∪ (B ∩ A^c) = A ∪ B.
Proof.
A ∪ (B ∩ A^c) = (A ∪ B) ∩ (A ∪ A^c) by Law 6
= (A ∪ B) ∩ U by Law 16
= A ∪ B by Law 14.

2.18 Set Products

2.18.1 Ordered Pairs

Definition 2.18.1. Let n be any natural number and let a_1, a_2, . . . , a_n be any objects. Then (a_1, a_2, . . . , a_n) denotes the ordered n-tuple with first term a_1, second term a_2, . . . and nth term a_n.

Example 2.18.1. (5, 7) denotes the ordered pair whose first term is 5 and second term 7. Note that (5, 7, 2) is called an ordered triple, (5, 7, 2, 4) is called an ordered 4-tuple.

The idea of a product of sets can be extended to any finite number of sets. For any sets A_1, A_2, . . . , A_n, the set of all ordered n-tuples (a_1, a_2, . . . , a_n) where a_1 ∈ A_1, a_2 ∈ A_2, . . . , a_n ∈ A_n is called the product of the sets A_1, A_2, . . . , A_n and is denoted A_1 × A_2 × · · · × A_n or ∏_{i=1}^{n} A_i. The fundamental statement we can make about an ordered n-tuple is that a given object is the ith term of an ordered n-tuple.

Definition 2.18.2. Let A and B be any non-empty sets, then A × B := {(a, b) | a ∈ A and b ∈ B}. If A and B are both finite sets, then |A × B| = |A| · |B|. If A = B, we sometimes write A² for A × A.

Example 2.18.2.
1. If A = {1, 2} and B = {2, 3, 4}, then A × B = {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4)} and B × A = {(2, 1), (2, 2), (3, 1), (3, 2), (4, 1), (4, 2)}. Notice that A × B ≠ B × A, in general.
2. The Cartesian product R × R = R² is the set of all ordered pairs of real numbers and this represents the 2-dimensional Cartesian plane.
3. (s_1, t_1) = (s_2, t_2) if and only if s_1 = s_2 and t_1 = t_2.

2.19 Theorems on Set Products

Let A, B, C, D be sets, then
1. A × (B ∪ C) = (A × B) ∪ (A × C).
2. A × (B ∩ C) = (A × B) ∩ (A × C).
3. (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).
4. (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).
5. (A − B) × C = (A × C) − (B × C).
6. If A and B are non-empty sets, then A × B = B × A if and only if A = B.
7. If A_1 ∈ P(A) and B_1 ∈ P(B), then A_1 × B_1 ∈ P(A × B).

Example 2.19.1. Prove that (A ∪ B) × C = (A × C) ∪ (B × C).
Proof. Consider any element (u, v) ∈ (A ∪ B) × C. By definition u ∈ (A ∪ B) and v ∈ C. Thus u ∈ A or u ∈ B. If u ∈ A, then (u, v) ∈ (A × C) and if u ∈ B, then (u, v) ∈ (B × C). Thus (u, v) is in A × C or in B × C and therefore (u, v) ∈ (A × C) ∪ (B × C). This proves that (A ∪ B) × C ⊆ (A × C) ∪ (B × C). Now consider any element (u, v) ∈ (A × C) ∪ (B × C). This implies that (u, v) ∈ (A × C) or (u, v) ∈ (B × C). In the first case u ∈ A and v ∈ C and in the second case u ∈ B and v ∈ C. Thus u ∈ (A ∪ B) and v ∈ C, which implies (u, v) ∈ (A ∪ B) × C. Therefore (A × C) ∪ (B × C) ⊆ (A ∪ B) × C. Hence (A ∪ B) × C = (A × C) ∪ (B × C).

Chapter 3 Relations and Functions

3.1 Relations

In natural language, relations are a kind of link existing between objects. For example, mother of, neighbour of, part of, is older than, is an ancestor of, etc. In mathematics there are endless ways that two entities can be related to each other. Consider the following mathematical statements.
5 < 10, 5 ≤ 5, 6 = 30/5, 5 | 80, x ≠ y, 6 ∈ Z, X ⊆ Y, π ≈ 3.14.
In each case two entities appear on either side of a symbol, and we interpret the symbol as expressing some relationship between the two entities. Symbols such as <, ≤, =, |, ≥, >, ∈ etc. are called relations because they convey relationships among things.
Given a set A, a relation on A is some property that is either true or false for any ordered pair (x, y) ∈ A × A.

Example 3.1.1. Let A = {eggs, milk, corn} and B = {cows, goats, hens}. We can define a relation R from A to B by (a, b) ∈ R if a is produced by b. In other words R = {(eggs, hens), (milk, cows), (milk, goats)}. With respect to this relation eggs R hens, milk R cows and so on.

Example 3.1.2. “Greater than” is a relation on Z, denoted by >. It is true for the pair (3, 2) but false for the pairs (2, 2) and (2, 3).

Definition 3.1.1. Given sets A and B, a relation R between A and B is a subset of A × B, i.e., R ⊆ A × B. A binary relation is a set of ordered pairs. Any subset of A × A is called a relation on A. Since a relation R on A is a subset of A × A, it is an element of the power set of A × A, i.e., R ∈ P(A × A).

All the following expressions mean the same thing:
1. x bears relationship R to y.
2. x and y are in the R relationship.
3. (x, y) ∈ R, usually written xRy or x ∼ y.

Example 3.1.3.
1. Let A be the set of people and B the set of dogs. Define a relation R from A to B by aRb if and only if a owns b.
2. Let X = Y. Equality is a relation: we say xRy if x = y.
3. Let X = Y = R. Then ≤, <, ≥, > are all relations between R and R.
4. Let X = Y = Z. Then divisibility is a relation between Z and Z: we say xRy if x | y.

3.1.1 Domain and Range

If R is a relation from A to B, the domain of R and the range of R are the sets
domR = {a ∈ A | there exists some b ∈ B such that (a, b) ∈ R}, and
ranR = {b ∈ B | there exists some a ∈ A such that (a, b) ∈ R}.
fldR = domR ∪ ranR is called the field of R. Observe that when R is a relation on A (that is, B = A), domR, ranR and fldR are all subsets of A.

Example 3.1.4. Let A = {1, 2, 3, 4, 5, 6} and define R by xRy if and only if x < y and x divides y. So R = {(1, 2), (1, 3), . . . , (1, 6), (2, 4), (2, 6), (3, 6)}. So domR = {1, 2, 3}, ranR = {2, 3, 4, 5, 6} and fldR = A.
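Example 3.1.4 can be checked mechanically. The following is a minimal sketch, assuming a relation is represented as a Python set of ordered pairs (tuples); the names dom_R, ran_R and fld_R are illustrative choices, not notation from these notes.

```python
# Build the relation of Example 3.1.4 on A = {1, ..., 6}:
# x R y if and only if x < y and x divides y.
A = {1, 2, 3, 4, 5, 6}
R = {(x, y) for x in A for y in A if x < y and y % x == 0}

# Domain: first terms of pairs in R. Range: second terms of pairs in R.
dom_R = {x for (x, y) in R}
ran_R = {y for (x, y) in R}

# Field: union of the domain and the range.
fld_R = dom_R | ran_R

print(sorted(R))
print(sorted(dom_R), sorted(ran_R), fld_R == A)
```

The comprehension mirrors the predicate notation {(x, y) | x < y and x divides y}, and the eight pairs it produces are exactly those listed in the example.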
3.1.2 Inverse Relations Every relation R from A to B has inverse relation R−1 from B to A, which is defined by R−1 = {(b, a)|(a, b) ∈ R}. bR−1 a if and only if aRb. Example 3.1.5. Let A = {1, 2, 3} and B = {a, b}. Then R = {(1, a), (1, b), (3, a)} is a relation from A to B. The inverse relation is R−1 = {(a, 1), (b, 1), (a, 3)}. Relations can be represented using arrow diagrams or mappings. Venn diagrams and arrows can be used for representing relations between given sets. Example 3.1.6. If A = {a, b, c, d} and B = {1, 2, 3, 4} and R = {(a, 1), (b, 1), (c, 2), (c, 3)} is a relation from A to B Draw a Venn diagram to demonstrate the relation. 22 3.1.3 Matrix of a Relation Its rows are labelled with elements of A and its column are labelled with the elements of B. If a ∈ A and b ∈ B we write 1 ia a row a and column b if aRb, otherwise we write 0. From the example above, R = {(a, 1), (b, 1), (c, 2), (c, 3)} has the following matrix 1 0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 3.2 Kinds of Relations 3.2.1 Reflexive Relations Definition 3.2.1. A relation R on a set A is called reflexive, if for all, a ∈ A, aRa. More concisely, for all a ∈ A, (a, a) ∈ R. All the values are related to themselves. For example, the relation of equality =, is reflexive, for all numbers a ∈ R, a = a. So = is reflexive. ≤ is also reflexive (a ≤ a for any a ∈ R). Example 3.2.1. Consider the following five relations on the set A = {1, 2, 3, 4} : R1 R2 R3 R4 R5 = = = = = {(1, 1), (1, 2), (2, 3), (1, 3), (4, 4)} {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (4, 4)} {(1, 3), (2, 1)} ∅, empty relation A × A, universal relation Determine which of the following are reflexive. Since A contains the four elements 1, 2, 3 and 4, a relation R on A is reflexive if it contains the four pairs (1, 1), (2, 2), (3, 3) and (4, 4). Only R2 and R5 are reflexive. Note that R1 , R3 and R4 are not reflexive, since, for example, (2, 2) does not belong to any of them. Example 3.2.2. 
Consider the following five relations:
(1) Relation ≤ on the set Z of integers.
(2) Set inclusion ⊆ on a collection C of sets.
(3) Relation ⊥ (perpendicular) on the set L of lines in the plane.
(4) Relation ∥ (parallel) on the set L of lines in the plane.
(5) Relation | of divisibility on the set N of positive integers.
Determine which of the relations are reflexive.
The relation (3) is not reflexive since no line is perpendicular to itself. Also (4) is not reflexive since, under the convention used here, no line is parallel to itself. The other relations are reflexive, that is, x ≤ x for every x ∈ Z, A ⊆ A for any set A ∈ C, and n | n for every positive integer n in N.

Example 3.2.3. Let V = {1, 2, 3, 4} and R = {(1, 1), (2, 4), (4, 4)}. Then R is not a reflexive relation, since (2, 2) does not belong to R. One should note that all ordered pairs (a, a) must belong to R in order for R to be reflexive.

3.2.2 Symmetric Relations
Definition 3.2.2. A relation R on a set A is called symmetric if, for all a, b ∈ A, aRb implies bRa.

For example, = is symmetric, since if x = y then y = x also. Neither ≤ nor < is symmetric (2 ≤ 3 and 2 < 3, but neither 3 ≤ 2 nor 3 < 2 is true).

Example 3.2.4.
(a) Determine which of the relations in Example 3.2.1 are symmetric.
R1 is not symmetric since (1, 2) ∈ R1 but (2, 1) ∉ R1. R3 is not symmetric since (1, 3) ∈ R3 but (3, 1) ∉ R3. The other relations are symmetric.
(b) Determine which of the relations in Example 3.2.2 are symmetric.
The relation ⊥ is symmetric since if line a is perpendicular to line b then b is perpendicular to a. Also ∥ is symmetric since if line a is parallel to line b then b is parallel to a. The others are not symmetric. For example, 3 ≤ 4 but 4 ≰ 3, {1, 2} ⊆ {1, 2, 3} but {1, 2, 3} ⊈ {1, 2}, and 2 | 6 but 6 ∤ 2.

Example 3.2.5. Let P = {1, 2, 3, 4} and R = {(1, 3), (4, 2), (2, 4), (2, 3), (3, 1)}. Then R is not a symmetric relation, since (2, 3) ∈ R but (3, 2) ∉ R.

3.2.3 Anti-Symmetric Relations
Definition 3.2.3.
A relation R on a set A is called anti-symmetric if, for all a, b ∈ A, aRb and bRa implies a = b. Anti-symmetric is not the same as not symmetric: for example, equality is both symmetric and anti-symmetric.

Example 3.2.6. Determine which of the relations in Example 3.2.2 are anti-symmetric.
The relation ≤ is anti-symmetric since whenever a ≤ b and b ≤ a then a = b. Set inclusion ⊆ is anti-symmetric since whenever A ⊆ B and B ⊆ A then A = B. Also divisibility on N is anti-symmetric since whenever m | n and n | m then m = n. (Note that divisibility on Z is not anti-symmetric since 3 | −3 and −3 | 3 but 3 ≠ −3.) The relation ⊥ is not anti-symmetric since we can have distinct lines a and b such that a ⊥ b and b ⊥ a. Similarly ∥ is not anti-symmetric.

3.2.4 Transitive Relations
Definition 3.2.4. A relation R on a set A is called transitive if, for all a, b, c ∈ A, aRb and bRc implies aRc.

Example 3.2.7. Determine which of the relations in Example 3.2.2 are transitive.
The relations ≤, ⊆ and | are transitive, that is,
(i) If a ≤ b and b ≤ c then a ≤ c.
(ii) If A ⊆ B and B ⊆ C then A ⊆ C.
(iii) If a | b and b | c then a | c.
On the other hand the relation ⊥ is not transitive: if a ⊥ b and b ⊥ c, then a and c have the same direction, so it is not true that a ⊥ c.

3.3 Equivalence Relations
An equivalence relation captures some kind of equality notion.

Definition 3.3.1. A relation that is reflexive, symmetric and transitive is called an equivalence relation.

For example, = on R is an equivalence relation. The classification of animals by species, that is, the relation “is of the same species as”, is an equivalence relation on the set of animals. The relation ⊆ of set inclusion is not an equivalence relation: it is reflexive and transitive, but it is not symmetric since A ⊆ B does not imply B ⊆ A. Not all relations are equivalence relations.

Example 3.3.1. Let U = Z and define R = {(x, y) | x and y have the same parity}, i.e., x and y are either both even or both odd. This parity relation is an equivalence relation.
1. For any x ∈ Z, x has the same parity as itself, so (x, x) ∈ R.
2.
If (x, y) ∈ R, then x and y have the same parity, so (y, x) ∈ R.
3. If (x, y) ∈ R and (y, z) ∈ R, then x and z have the same parity as y, so they have the same parity as each other (if y is odd, both x and z are odd; if y is even, both x and z are even). Thus (x, z) ∈ R.

Example 3.3.2. For any set S, the identity relation on S, I_S = {(x, x) | x ∈ S}, is an equivalence relation.
1. Obvious, since (x, x) ∈ I_S for every x ∈ S by definition.
2. If (x, y) ∈ I_S, then y = x so (y, x) = (x, x) ∈ I_S.
3. If (x, y) ∈ I_S and (y, z) ∈ I_S, then x = y = z so (x, z) = (x, x) ∈ I_S.

Example 3.3.3. Let U = R and define the square relation R = {(x, y) | x^2 = y^2}. The square relation is an equivalence relation.
1. For all x ∈ R, x^2 = x^2, so (x, x) ∈ R.
2. If (x, y) ∈ R, then x^2 = y^2, so y^2 = x^2 and (y, x) ∈ R.
3. If (x, y) ∈ R and (y, z) ∈ R, then x^2 = y^2 = z^2, so (x, z) ∈ R.

Example 3.3.4. Show that the relation D on Z defined by xDy ⇔ 3 | (x^2 − y^2) is an equivalence relation.
(i) Reflexive: For any x ∈ Z we have x^2 − x^2 = 0, and since 3 | 0 it follows that xDx for all x ∈ Z.
(ii) Symmetric: Suppose xDy. Then 3 | (x^2 − y^2), so x^2 − y^2 = 3n for some n ∈ Z. It follows that y^2 − x^2 = 3(−n) and hence 3 | (y^2 − x^2). Consequently yDx, so D is symmetric.
(iii) Transitive: Suppose xDy and yDz. Then there exist n, m ∈ Z such that x^2 − y^2 = 3n and y^2 − z^2 = 3m. Substituting y^2 = 3m + z^2 gives x^2 − (3m + z^2) = 3n, or x^2 − z^2 = 3m + 3n = 3(m + n), and so xDz, that is, D is transitive.

Example 3.3.5 (Modular Arithmetic). We say an integer a is congruent to another integer b modulo a positive integer n, denoted a ≡ b mod n, if b − a is an integer multiple of n. Let n = 3 and let A be the set of integers from 0 to 11. Then x ≡ y mod 3 if x and y both belong to A0 = {0, 3, 6, 9}, or both belong to A1 = {1, 4, 7, 10}, or both belong to A2 = {2, 5, 8, 11}. Congruence modulo 3 is in fact an equivalence relation on A.
Reflexive: Since x − x = 0 · 3, we know that x ≡ x mod 3.
Symmetric: If x ≡ y mod 3, then y − x = 3k for some integer k.
Hence x − y = −3k = 3(−k), and since −k is an integer we have y ≡ x mod 3.
Transitive: Let x ≡ y mod 3 and y ≡ z mod 3. Then there are integers k and l such that y − x = 3k and z − y = 3l. It follows that z − x = 3k + 3l = 3(k + l), and since k + l is an integer we have x ≡ z mod 3.
More generally, congruence modulo n is an equivalence relation on the integers.

3.4 Functions
Let X and Y be sets. A function f : X → Y is a special kind of relation between X and Y. It is a relation R ⊆ X × Y satisfying the following condition: for all x ∈ X, there exists exactly one y ∈ Y such that (x, y) ∈ R.

Definition 3.4.1. Let X and Y be sets. A function f from X to Y is a relation from X to Y such that
(i) for any x ∈ X, there is a y ∈ Y such that (x, y) ∈ f.
(ii) for any x ∈ X, if (x, y) ∈ f and (x, z) ∈ f then y = z.

Definition 3.4.2. Let f be a function from X to Y.
(i) The set X is called the domain (source) of f and the set Y is called the co-domain (target) of f.
(ii) The range of f is the set Range(f) = {y ∈ Y | there is an x ∈ X such that y = f(x)} = {f(x) | x ∈ X} = f(X). The set f(X) is the image of X under f.

Definition 3.4.3. Let f : X → Y be a function. Then
(i) f is said to be surjective (onto) if f(X) = Y, i.e., f is surjective if and only if, for any y ∈ Y, there is an x ∈ X such that f(x) = y.
(ii) f is said to be injective (one-to-one, 1−1) if f maps different points of X to different points of Y, i.e., f is injective if and only if, for any x, x′ ∈ X, f(x) = f(x′) ⇒ x = x′; equivalently, if x ≠ x′ then f(x) ≠ f(x′).
(iii) f is called a bijection (or bijective) if f is both injective and surjective.
Two sets X and Y are said to be in 1−1 correspondence (i.e., to every element of X there corresponds exactly one element of Y and vice versa) if there is a bijection between them.

3.5 Composition of functions
Suppose we have two functions f : X → Y and g : Y → Z. It is natural to think of a function from X to Z as being the combined action of f and g.

Definition 3.5.1.
Let f : X → Y and g : Y → Z be two functions. Then the composition of f and g is the function g ◦ f. This is a function from X to Z defined by: for any x ∈ X, (g ◦ f)(x) = g(f(x)).

Composition of functions is generally not commutative.

Example 3.5.1. If f(x) = x^2 and g(x) = x + 1, then g(f(x)) = x^2 + 1 whereas f(g(x)) = (x + 1)^2 = x^2 + 2x + 1.

Composition is always associative: if f : X → Y, g : Y → Z and h : Z → W are functions, then we have (h ◦ g) ◦ f = h ◦ (g ◦ f).

Theorem 3.5.1. Let f : X → Y and g : Y → Z be two functions.
(a) If f and g are injective, then so is g ◦ f.
(b) If f and g are surjective, then so is g ◦ f.
(c) If f and g are bijective, then so is g ◦ f.
Proof.
(a) We must show that for all x1, x2 ∈ X, if g(f(x1)) = g(f(x2)), then x1 = x2. Put y1 = f(x1) and y2 = f(x2). Then g(y1) = g(y2). Since g is assumed to be injective, this implies that f(x1) = y1 = y2 = f(x2). Since f is also assumed to be injective, this implies that x1 = x2.
(b) We must show that for all z ∈ Z, there exists at least one x in X such that g(f(x)) = z. Since g : Y → Z is surjective, there exists y ∈ Y such that g(y) = z. Since f : X → Y is surjective, there exists x ∈ X such that f(x) = y. Then g(f(x)) = g(y) = z.
(c) This follows immediately from (a) and (b).

3.5.1 The Identity Function
Definition 3.5.2. Let X be any set. Let a function f : X → X be defined by f(x) = x, i.e., let f map each element of X to itself. Then f is called the identity function on X, denoted by I_X or 1_X. Note that for any function f : X → X, f ◦ I_X = f = I_X ◦ f.

3.5.2 Inverse of a Function
Let f : X → Y be a function and let y ∈ Y. Then the inverse image of y, denoted f⁻¹(y), consists of those elements of X which are mapped onto y, i.e., the elements in X which have y as their image:
f⁻¹(y) = {x | x ∈ X and f(x) = y} = {x ∈ X | y = f(x)}.
f⁻¹(y) is a subset of X. If f : X → Y is both a one-to-one function and an onto function, then f⁻¹(y) consists of exactly one element for each y ∈ Y, so f⁻¹ : Y → X is a function, and we call f⁻¹ the inverse function of f.
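The ideas of this section (composition, injectivity, surjectivity and inverse images) can be tried out on small finite sets. The sketch below is an illustration only: the helper name compose and the sets X and Y are assumptions made here, not part of the notes, while f and g are the functions of Example 3.5.1.

```python
def compose(g, f):
    """Return the composition g o f, i.e. the function x -> g(f(x))."""
    return lambda x: g(f(x))

def f(x):
    return x * x      # f(x) = x^2, as in Example 3.5.1

def g(x):
    return x + 1      # g(x) = x + 1, as in Example 3.5.1

g_after_f = compose(g, f)   # x -> x^2 + 1
f_after_g = compose(f, g)   # x -> (x + 1)^2

# Finite illustration (hypothetical sets chosen for this sketch).
X = [-2, -1, 0, 1, 2]
Y = [0, 1, 4]
images = {x: f(x) for x in X}   # f restricted to X, stored as a table

# Injective: distinct inputs give distinct outputs.
is_injective = len(set(images.values())) == len(X)
# Surjective onto Y: every element of Y is hit.
is_surjective = set(images.values()) == set(Y)
# Inverse image of 4: all x in X with f(x) = 4.
inverse_image_of_4 = {x for x in X if images[x] == 4}

print(g_after_f(2), f_after_g(2))
print(is_injective, is_surjective, inverse_image_of_4)
```

On these sets f is surjective onto Y but not injective, so the inverse image of a point can contain more than one element and f has no inverse function.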
We say that a function g : Y → X is the inverse function to f : X → Y if both of the following hold:
(i) g ◦ f = I_X, i.e., for all x ∈ X, g(f(x)) = x.
(ii) f ◦ g = I_Y, i.e., for all y ∈ Y, f(g(y)) = y.

Theorem 3.5.2. Let f : X → Y.
(a) The following are equivalent:
(i) f is bijective.
(ii) the inverse relation f⁻¹ : Y → X is a function.
(iii) f has an inverse function g.
(b) When the equivalent conditions of part (a) hold, the inverse function g is uniquely determined and it is the function f⁻¹.
Proof.
(a) (i) ⇒ (ii): If f is bijective, then for each y ∈ Y there is exactly one x ∈ X with f(x) = y (at least one by surjectivity, at most one by injectivity), so the inverse relation f⁻¹ is a function from Y to X.
(ii) ⇒ (iii): Assume (ii), i.e., the inverse relation f⁻¹ is a function. We claim that it is then the inverse function to f in the sense that f⁻¹ ◦ f = I_X and f ◦ f⁻¹ = I_Y. For any x ∈ X, f⁻¹(f(x)) is the unique element of X which gets mapped under f to f(x). Since x is such an element and uniqueness is assumed, we must have f⁻¹(f(x)) = x. Similarly, for y ∈ Y, f⁻¹(y) is the unique element x of X such that f(x) = y, so f(f⁻¹(y)) = f(x) = y.
(iii) ⇒ (i): We have g ◦ f = I_X and the identity function is injective, so f is injective. Similarly f ◦ g = I_Y is surjective, so f is surjective. Therefore f is bijective.
(b) Suppose that we have any function g : Y → X such that g ◦ f = I_X and f ◦ g = I_Y. We know that f is bijective and thus the inverse relation f⁻¹ is a function such that f⁻¹ ◦ f = I_X and f ◦ f⁻¹ = I_Y. Thus g = g ◦ I_Y = g ◦ (f ◦ f⁻¹) = (g ◦ f) ◦ f⁻¹ = I_X ◦ f⁻¹ = f⁻¹.

Chapter 4 Symbolic Logic
Symbolic logic has its roots in the study of language. It uses words and phrases that have a bearing on the truth or falsity of the sentence in which they occur. Such words or phrases are aptly called logical connectives, for example: not, or, and, if ... then, if and only if. For example, consider the sentence: It is cold and the sun is shining. This sentence is obtained by joining the two sentences: It is cold and The sun is shining.
The resulting sentence is called a compound sentence and is true provided that each of the two component sentences is true.

Definition 4.0.1. A proposition/statement is a declarative sentence which is true or false (but not both).

Notation is useful in the study of compound statements. If we let p denote the statement “All cows eat grass” and let q denote the statement “Columbus discovered America”, then we can write the compound statement p and q.

4.1 Abbreviations
∧ denotes and, ∨ denotes or, ⇒ denotes if ... then, ⇔ denotes if and only if, and ¬p denotes not p. For example, if p denotes the proposition “It is raining”, then ¬p denotes “It is not raining”.
The truth or falsity of a statement is called its truth value. Denoting “true” by “T” and “false” by “F”, the logical connectives are conveniently defined by means of a truth-table, which spells out the truth value of a compound statement in each possible case.

4.1.1 Conjunction, p ∧ q
Two statements can be combined by the word “and” to form a composite statement which is called the conjunction of the original statements, denoted by p ∧ q.

Example 4.1.1.
1. Let p be “it is raining” and q be “it is overcast”. Then p ∧ q denotes “it is raining and it is overcast”.
2. The symbol ∧ can be used to define the intersection of two sets, C ∩ D = {x | x ∈ C ∧ x ∈ D}.

The truth value of the composite statement satisfies the following property: if p is true and q is true, then p ∧ q is true; otherwise, p ∧ q is false. The conjunction of two statements is true if and only if each component is true.

Truth-table
  p   q   p ∧ q
  T   T     T
  T   F     F
  F   T     F
  F   F     F

4.2 Disjunction, p ∨ q
Two statements can be combined by the word “or” to form a composite statement called the disjunction of the original statements, denoted by p ∨ q.

Example 4.2.1.
1. Let p denote the statement “He studied Mathematics at University” and q be “He lives in Harare”. Then p ∨ q denotes “He studied Mathematics at University or he lives in Harare”.
2. The symbol ∨ can be used to define the union of two sets, P ∪ Q = {x | x ∈ P ∨ x ∈ Q}.
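The set-builder definitions of intersection and union given in Examples 4.1.1 and 4.2.1 translate almost word for word into code. The following Python sketch is only an illustration: the sets C and D and the finite universe are hypothetical, and Python's `and` and `or` stand in for ∧ and ∨.

```python
# Hypothetical finite sets, chosen only to illustrate the definitions.
universe = range(10)
C = {1, 2, 3, 4}
D = {3, 4, 5}

# C ∩ D = {x | x ∈ C ∧ x ∈ D}
intersection = {x for x in universe if (x in C) and (x in D)}

# C ∪ D = {x | x ∈ C ∨ x ∈ D}
union = {x for x in universe if (x in C) or (x in D)}

print(intersection, union)
```

The results agree with Python's built-in set operators C & D and C | D, which is one way to see that the connectives and the set operations correspond.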
The truth value of the composite statement p ∨ q satisfies the property: if p is true or q is true or both p and q are true, then p ∨ q is true; otherwise, p ∨ q is false.

Truth-table
  p   q   p ∨ q
  T   T     T
  T   F     T
  F   T     T
  F   F     F

4.3 Negation, ¬p
Given any statement p, another statement “not p” can be formed. It is called the negation of p and is denoted by ¬p.

Example 4.3.1. Consider the statement: Chinhoyi is in Zimbabwe. Its negation is (i) It is false that Chinhoyi is in Zimbabwe, or (ii) Chinhoyi is not in Zimbabwe.

Truth-table
  p   ¬p
  T   F
  F   T

4.4 The Conditional, p ⇒ q
If p then q, also read as
(a) p implies q
(b) p only if q
(c) p is sufficient for q
(d) q is necessary for p.
The truth value of the conditional statement p ⇒ q satisfies the following property: p ⇒ q is true unless p is true and q is false. That is, a true statement cannot imply a false statement.

Truth-table
  p   q   p ⇒ q
  T   T     T
  T   F     F
  F   T     T
  F   F     T

4.4.1 The Bi-conditional, p ⇔ q
p if and only if q. Its truth value satisfies: if p and q have the same truth value, then p ⇔ q is true; otherwise, it is false.

Truth-table
  p   q   p ⇔ q
  T   T     T
  T   F     F
  F   T     F
  F   F     T

Example 4.4.1.
1. Find the truth values for (¬p ∨ q) ⇒ p.

  p   q   ¬p   ¬p ∨ q   (¬p ∨ q) ⇒ p
  T   T   F      T           T
  T   F   F      F           T
  F   T   T      T           F
  F   F   T      T           F

2. Construct a truth table for (p ⇒ q) ⇔ (¬p ∨ q).

  p   q   p ⇒ q   ¬p   ¬p ∨ q   (p ⇒ q) ⇔ (¬p ∨ q)
  T   T     T     F      T              T
  T   F     F     F      F              T
  F   T     T     T      T              T
  F   F     T     T      T              T

In this example, the truth values of (p ⇒ q) ⇔ (¬p ∨ q) are all true.

Definition 4.4.1. A compound proposition that is true regardless of the truth values of its initial components is called a tautology.

Exercise 4.4.1. Show that [(p ⇒ q) ∧ (q ⇒ p)] ⇔ (p ⇔ q) is a tautology.

Some sentences are not statements because they contain unspecified variables. For example, we cannot assign a truth value to the sentence, He was a president of the United States, until a proper name is substituted for the pronoun he. We call a sentence that contains unspecified variables a predicate. For example, S is green and X discovered America are predicates.
S and X are unspecified variables that may be replaced by various nouns. A predicate is neither true nor false; its truth value depends upon the name that replaces the variable. In these examples, he, S and X are called free variables in their respective predicates.

A statement that is always false is called an absurdity. A statement that may be true or false, depending upon the values of its constituent statements, is called a contingency.

4.5 Logical Equivalence
Definition 4.5.1. The propositions p and q are said to be logically equivalent if their truth tables are identical. This is denoted by p ≡ q.

Example 4.5.1. ¬(p ∧ q) ≡ ¬p ∨ ¬q.

  p   q   p ∧ q   ¬(p ∧ q)   ¬p   ¬q   ¬p ∨ ¬q
  T   T     T        F       F    F       F
  T   F     F        T       F    T       T
  F   T     F        T       T    F       T
  F   F     F        T       T    T       T

Consider the statement, “It is not the case that roses are red and violets are blue”. This statement can be written in the form ¬(p ∧ q) where p is “roses are red” and q is “violets are blue”. However, as noted above, ¬(p ∧ q) ≡ ¬p ∨ ¬q. Thus the statement, “Roses are not red, or violets are not blue”, has the same meaning as the given statement.

4.5.1 Algebra of Propositions
Propositions satisfy various laws.

4.5.2 Laws of the Algebra of Propositions
Idempotent Laws
(1a) p ∨ p ≡ p          (1b) p ∧ p ≡ p.
Associative Laws
(2a) (p ∨ q) ∨ r ≡ p ∨ (q ∨ r)          (2b) (p ∧ q) ∧ r ≡ p ∧ (q ∧ r).
Commutative Laws
(3a) p ∨ q ≡ q ∨ p          (3b) p ∧ q ≡ q ∧ p.
Distributive Laws
(4a) p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r)          (4b) p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r).
Identity Laws
(5a) p ∧ T ≡ p          (5b) p ∨ F ≡ p.
(6a) p ∨ T ≡ T          (6b) p ∧ F ≡ F.
Complement Laws
(7a) p ∨ ¬p ≡ T          (7b) p ∧ ¬p ≡ F.
(8a) ¬T ≡ F          (8b) ¬F ≡ T.
Involution Law
(9) ¬¬p ≡ p.
De Morgan’s Laws
(10a) ¬(p ∨ q) ≡ ¬p ∧ ¬q          (10b) ¬(p ∧ q) ≡ ¬p ∨ ¬q.

4.6 The Converse
Let p ⇒ q be a conditional proposition. The converse of p ⇒ q is q ⇒ p.

Example 4.6.1. “If John Gumbo is a student of University of Zimbabwe, then 2 + 2 = 4” is a true proposition.
Its converse, “If 2 + 2 = 4, then John Gumbo is a student of University of Zimbabwe”, is a false proposition whenever John Gumbo is not in fact a student there.

4.7 The Contrapositive
The contrapositive of a proposition p ⇒ q is the proposition ¬q ⇒ ¬p.

Theorem 4.7.1. The conditional proposition p ⇒ q is logically equivalent to its contrapositive ¬q ⇒ ¬p.

  p   q   p ⇒ q   ¬p   ¬q   ¬q ⇒ ¬p
  T   T     T     F    F       T
  T   F     F     F    T       F
  F   T     T     T    F       T
  F   F     T     T    T       T

4.8 List of Tautologies
1. p ∨ ¬p   Law of the excluded middle.
2. ¬(p ∧ ¬p)   Contradiction.
3. [(p ⇒ q) ∧ ¬q] ⇒ ¬p   Modus tollens.
4. ¬¬p ⇔ p   Double negation.
5. [(p ⇒ q) ∧ (q ⇒ r)] ⇒ (p ⇒ r)   Law of syllogism.
6. (p ∧ q) ⇒ p   Decomposing a conjunction.
7. (p ∧ q) ⇒ q   Decomposing a conjunction.
8. p ⇒ (p ∨ q)   Constructing a disjunction.
9. q ⇒ (p ∨ q)   Constructing a disjunction.
10. (p ⇔ q) ⇔ [(p ⇒ q) ∧ (q ⇒ p)]   Definition of the bi-conditional.
11. (p ∧ q) ⇔ (q ∧ p)   Commutative law for ∧.
12. (p ∨ q) ⇔ (q ∨ p)   Commutative law for ∨.
13. (p ⇒ q) ⇔ (¬p ∨ q)   Conditional disjunction.
14. [(p ∨ q) ∧ ¬p] ⇒ q   Disjunctive syllogism.
15. (p ∨ p) ⇔ p   Simplification.

4.9 Propositional Functions, Quantifiers
Let A be a given set. A propositional function (or open sentence or condition) defined on A is an expression p(x) which has the property that p(a) is true or false for each a ∈ A, i.e., p(x) becomes a statement (with a truth value) whenever any element a ∈ A is substituted for the variable x. The set A is called the domain of p(x), and the set T_p of all elements of A for which p(a) is true is called the truth set of p(x), i.e., T_p = {x | x ∈ A, p(x) is true} or {x | p(x)}. When A is some set of numbers, the condition p(x) often has the form of an equation or inequality involving the variable x.

Example 4.9.1. Find the truth set T_p of each propositional function p(x) defined on the set P = {1, 2, 3, . . . }.
(a) Let p(x) be x + 2 > 7. Then T_p = {x | x ∈ P, x + 2 > 7} = {6, 7, 8, . . . }, consisting of all integers greater than 5.
(b) Let p(x) be x + 5 < 3. Then T_p = {x | x ∈ P, x + 5 < 3} = ∅, the empty set.
(c) Let p(x) be x + 5 > 1. Then T_p = {x | x ∈ P, x + 5 > 1} = P.
The above example shows that if p(x) is a propositional function defined on a set A, then p(x) could be true for all x ∈ A, for some x ∈ A, or for no x ∈ A.

4.10 Universal Quantifier
Let p(x) be a propositional function defined on a set A. Consider the expression (∀x ∈ A)p(x) or ∀x, p(x), which reads “For every x in A, p(x) is a true statement”, or simply “For all x, p(x)”. The symbol ∀ (for all, for every) is called the universal quantifier. (∀x ∈ A)p(x) is equivalent to the statement T_p = {x | x ∈ A, p(x)} = A, i.e., the truth set of p(x) is the entire set A. If {x | x ∈ A, p(x)} = A, then ∀x, p(x) is true; otherwise, ∀x, p(x) is false.

Example 4.10.1.
1. The proposition (∀n ∈ P)(n + 4 > 3) is true since {n | n + 4 > 3} = {1, 2, 3, . . . } = P.
2. The proposition (∀n ∈ P)(n + 2 > 8) is false since {n | n + 2 > 8} = {7, 8, . . . } ≠ P.

4.11 Existential Quantifier
Let p(x) be a propositional function defined on a set A. Consider the expression (∃x ∈ A)p(x) or ∃x, p(x), which reads “There exists an x in A such that p(x) is a true statement”, or simply “For some x, p(x)”. The symbol ∃ (there exists, for some, for at least one) is called the existential quantifier. (∃x ∈ A)p(x) is equivalent to the statement T_p = {x | x ∈ A, p(x)} ≠ ∅, i.e., the truth set of p(x) is not empty. If {x | p(x)} ≠ ∅ then ∃x, p(x) is true; otherwise, ∃x, p(x) is false.

Example 4.11.1.
1. The proposition (∃n ∈ P)(n + 4 < 7) is true since {n | n + 4 < 7} = {1, 2} ≠ ∅.
2. The proposition (∃n ∈ P)(n + 6 < 4) is false since {n | n + 6 < 4} = ∅.

4.11.1 Notation
Let A = {2, 3, 5} and let p(x) be the sentence “x is a prime number”, or simply “x is prime”. Then the proposition “Two is prime and three is prime and five is prime” can be denoted by p(2) ∧ p(3) ∧ p(5) or ∧(a ∈ A, p(a)), which is equivalent to the statement “Every number in A is prime”, or ∀a ∈ A, p(a).
Similarly, the proposition “Two is prime or three is prime or five is prime” can be denoted by p(2) ∨ p(3) ∨ p(5) or ∨(a ∈ A, p(a)), which is equivalent to the statement “At least one number in A is prime”, or ∃a ∈ A, p(a). In other words, ∧(a ∈ A, p(a)) ≡ ∀a ∈ A, p(a) and ∨(a ∈ A, p(a)) ≡ ∃a ∈ A, p(a).

4.12 Negation of Quantified Statements
Consider the statement “All Mathematics majors are male”. Its negation is either of the following equivalent statements:
1. It is not the case that all Mathematics majors are male.
2. There exists at least one Mathematics major who is not male.
Symbolically, using M to denote the set of Mathematics majors, the above can be written as ¬(∀x ∈ M)(x is male) ≡ (∃x ∈ M)(x is not male), or, when p(x) denotes “x is male”, we have ¬(∀x ∈ M)p(x) ≡ (∃x ∈ M)¬p(x) or ¬∀x, p(x) ≡ ∃x, ¬p(x).

Theorem 4.12.1 (De Morgan). ¬(∀x ∈ A)p(x) ≡ (∃x ∈ A)¬p(x), i.e., the following two statements are equivalent:
(a) It is not true that, for all x ∈ A, p(x) is true.
(b) There exists an x ∈ A such that p(x) is false.

Theorem 4.12.2 (De Morgan). ¬(∃x ∈ A)p(x) ≡ (∀x ∈ A)¬p(x), i.e., the following two statements are equivalent:
(a) It is not true that, for some x ∈ A, p(x) is true.
(b) For all x ∈ A, p(x) is false.

The following statements are also negations of each other: “There exists a college student who is 60 years old” and “Every college student is not 60 years old”.
The opposite of “For all x, p(x) is true” is “There exists x for which p(x) is not true”. The opposite of “There exists x for which p(x) is true” is “For all x, p(x) is not true”. For example, the opposite (negation) of “All rational numbers equal one” is “There exists a rational number that does not equal one”.
The statement “All eleven-legged crocodiles are orange with blue spots” is true: if it were false, then there would exist an eleven-legged crocodile that is not orange with blue spots, and there are no eleven-legged crocodiles at all.

4.13 Proofs
In Italy it’s said that it requires two men to make a good salad dressing: a generous man to add the oil and a mean man the vinegar. Constructing proofs in mathematics is similar.
Often a tolerant openness and awareness is important in discovering or understanding a proof, while a strictness and discipline is needed in writing it down. There are many different styles of thinking, even amongst professional mathematicians, yet they can communicate well through the common medium of written proof. It’s important not to confuse the rigour of a well-written-down proof with the human and very individual activity of going about discovering it or understanding it. Too much of a straitjacket on your thinking is likely to stymie anything but the simplest proofs. On the other hand, too little discipline, and writing down too little on the way to a proof, can leave you uncertain and lost. When you cannot see a proof immediately (this may happen most of the time initially), it can help to write down the assumptions and the goal. Often starting to write down a proof helps you discover it. You may have already experienced this in carrying out proofs by induction. It can happen that the induction hypothesis one starts out with isn’t strong enough to get the induction step. But starting to do the proof even with the ‘wrong’ induction hypothesis can help you spot how to strengthen it.
Of course, there’s no better way to learn the art of proof than to do proofs, and no better way to read and understand a proof than to pause occasionally and try to continue the proof yourself. For this reason you are very strongly encouraged to do the exercises; most of them are placed strategically in the appropriate place in the text.
Mathematicians solve problems, and proof is the guarantee that our solutions are correct. A proof is an explanation of why a statement is true. A conjecture is a statement which we believe to be true but for which we have no proof. An axiom is a basic assumption about a mathematical situation.

4.14 Techniques of Proof
4.14.1 Direct Method
The direct method proves statements of the form “If A then B”: assume A and deduce B.

Theorem 4.14.1. Let m be an integer. If m is odd, then m^2 is odd.
Proof. If m is odd, then m = 2r + 1 for some integer r. Then m^2 = (2r + 1)^2 = 4r^2 + 4r + 1 = 2(2r^2 + 2r) + 1, i.e., m^2 is odd.

Theorem 4.14.2. Suppose that p ∈ Q and p^2 ∈ Z. Then p ∈ Z.
Proof. By assumption p = a/b for some integers a and b, where the fraction is in its lowest form. Thus p^2 = (a/b)^2 = a^2/b^2. Since p^2 ∈ Z and the fraction a^2/b^2 is also in its lowest form, we have that b^2 = 1. Thus b = ±1 ⇒ p = a/(±1) = ±a ∈ Z.

Example 4.14.1. Prove that the square of every odd number is of the form 8a + 1 for some integer a ≥ 0.
Proof. Any odd number n is of the form n = 2l + 1 for some l ∈ Z. Therefore n^2 = (2l + 1)^2 = 4l^2 + 4l + 1 = 4(l^2 + l) + 1. Thus it is enough to show that l^2 + l is even. If l is even, then l = 2m for some m ∈ Z, so l^2 + l = 4m^2 + 2m = 2(2m^2 + m), which is divisible by 2. If l is odd, then l = 2m + 1 for some m ∈ Z, so l^2 + l = 4m^2 + 4m + 1 + 2m + 1 = 2(2m^2 + 3m + 1), which is also divisible by 2. Thus l^2 + l is always even, say l^2 + l = 2a, and so n^2 = 8a + 1 for some a ∈ Z. But n^2 ≥ 1, so a ≥ 0.

4.15 Some Common Mistakes
The biggest mistakes are assuming what has to be proved and incorrect use of equivalence.

4.15.1 Don’t Assume What Has to be Proved
Suppose that we had to prove the statement P. If we assume it is true, then it is not surprising that we can deduce it is true; P ⇒ P is obviously true. The mistake usually takes the following form: P is assumed to be true, this is used to deduce something that is true, and so it is concluded that P is true.

Example 4.15.1. Consider the following statement: if a and b are real numbers, then a^2 + b^2 ≥ 2ab. A fallacious proof is: We have a^2 + b^2 ≥ 2ab ⇒ a^2 − 2ab + b^2 ≥ 0 ⇒ (a − b)^2 ≥ 0. The last inequality is true as the square of a number is always non-negative, so a^2 + b^2 ≥ 2ab.
The error is that the conclusion has been assumed (i.e., a^2 + b^2 ≥ 2ab) and has led to something we know is true. However, we cannot conclude that a statement is true just because it implies a known truth.
The real proof is the reverse of the argument: begin with (a − b)^2 ≥ 0, something we know is true. Expanding gives a^2 − 2ab + b^2 ≥ 0, and hence a^2 + b^2 ≥ 2ab.

4.16 Proof By Cases
For example, x = y can be proved by showing that x ≤ y and y ≤ x. We have broken the problem into two parts.

Example 4.16.1. The number n^2 + 3n + 7 is odd for all n ∈ Z.
Proof. Divide into two cases: (i) n is even and (ii) n is odd.
If n is even, then n = 2k for some integer k. Then n^2 + 3n + 7 = (2k)^2 + 3(2k) + 7 = 4k^2 + 6k + 7 = 2(2k^2 + 3k + 3) + 1. Hence n^2 + 3n + 7 is odd when n is even.
If n is odd, then n = 2k + 1 for some integer k. We have n^2 + 3n + 7 = (2k + 1)^2 + 3(2k + 1) + 7 = 4k^2 + 10k + 11 = 2(2k^2 + 5k + 5) + 1. This is also odd.
Hence n^2 + 3n + 7 is odd for all integers n.
As you see, this method of cases involves exhausting all the possibilities, and so this method is also known as exhaustion.

4.17 Contradiction
The law of the excluded middle asserts that a statement is true or false; it cannot be anything in between. To prove a statement by contradiction, we assume that it is false and deduce something impossible, so the statement must be true. The name comes from the fact that the assumption that the statement is false is later contradicted by some other fact. The method is also called reductio ad absurdum (reduction to the absurd).

Example 4.17.1. Suppose that n is an odd integer. Then n^2 is an odd integer.
Proof. Assume to the contrary, i.e., we suppose that n is an odd integer but that the conclusion is false, i.e., n^2 is an even integer. As n is odd, n = 2k + 1 for some k ∈ Z. Thus n^2 = (2k + 1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1, which is odd. This contradicts the assumption that n^2 is even. Thus our assumption that n^2 is even must be wrong, i.e., n^2 must be odd.

Example 4.17.2. Prove that √3 is irrational.
Proof. Suppose for a contradiction that √3 is rational. Then we can write √3 = a/b for some integers a and b with b ≠ 0. Assume that a and b have no common divisors. Now squaring both sides, we get 3 = a^2/b^2 and so 3b^2 = a^2. This implies that a^2 is divisible by 3 and so a is also divisible by 3. Thus we can write a = 3c for some integer c. Replacing this in the above equation we get 3b^2 = 9c^2 and so b^2 = 3c^2.
Hence b^2 is divisible by 3 and so is b. But this contradicts the fact that a and b have no common divisors. Therefore √3 has to be irrational.

4.18 Induction
Induction is applied when we have an infinite number of statements indexed by the natural numbers, for example, n^5 − n is even for all n ∈ N.

4.18.1 Principle of Mathematical Induction
Let A(n) be an infinite collection of statements with n ∈ N. Suppose (i) A(1) is true and (ii) A(k) ⇒ A(k + 1) for all k ∈ N. Then A(n) is true for all n ∈ N.
Checking condition (i) is called the initial step and checking condition (ii) is called the inductive step. Assuming that A(k) is true for some k in (ii) is called the inductive hypothesis.

Example 4.18.1. 6^n − 1 is divisible by 5 for all n ∈ N.
Proof. Initial Step: 6^1 − 1 = 5, and the statement is true because 5 is divisible by 5.
Inductive Step: Assume the statement is true for some k ∈ N, which means that 6^k − 1 = 5m for some m ∈ N. Then 6^(k+1) − 1 = 6(6^k) − 1 = 6(5m + 1) − 1 (by inductive hypothesis) = 30m + 6 − 1 = 5(6m + 1). This is divisible by 5 and so the statement is true for k + 1. Hence the statement is true for all n ∈ N.

Example 4.18.2. Show that 2^(n−1) ≤ n! for all n ∈ N.
Proof. For n = 1, we have 2^(n−1) = 2^0 = 1 and n! = 1! = 1. Hence 2^(n−1) ≤ n! for n = 1. Assume the statement is true for some k ∈ N, i.e., 2^(k−1) ≤ k!. Then for n = k + 1, 2^((k+1)−1) = 2^k = 2(2^(k−1)) ≤ 2(k!) (by inductive hypothesis) ≤ (k + 1)(k!) (as 2 ≤ k + 1) = (k + 1)!. Hence the statement is true for all n ∈ N.

4.19 The Contrapositive Method
A ⇒ B is equivalent to not B ⇒ not A. For example, “If x^2 − 9 = 0, then x = 2” has the contrapositive “If x ≠ 2, then x^2 − 9 ≠ 0”, and “If I am Jane, then I am a woman” has the contrapositive “If I am not a woman, then I am not Jane”.

Example 4.19.1. Suppose that A, B, C, D are sets such that C \ D ⊂ A ∩ B and that x ∈ C. Prove that if x ∉ A, then x ∈ D.
Proof. We prove the contrapositive: if x ∉ D, then x ∈ A. Let us suppose that x ∉ D. Since x ∈ C is assumed, x ∈ C \ D. Because C \ D ⊂ A ∩ B, it follows that x ∈ A ∩ B, i.e., x ∈ A.
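The equivalence of A ⇒ B and not B ⇒ not A, on which the contrapositive method rests, can be checked by listing all four truth-value cases. The following Python sketch is an illustration only; the helper implies is a name introduced here, defined from the truth table of the conditional (false only when the hypothesis is true and the conclusion is false).

```python
from itertools import product

def implies(p, q):
    """Truth value of p => q: false only when p is true and q is false."""
    return (not p) or q

cases = list(product([True, False], repeat=2))

# A => B agrees with (not B) => (not A) in every case ...
contrapositive_equivalent = all(
    implies(a, b) == implies(not b, not a) for a, b in cases
)

# ... whereas A => B does not always agree with its converse B => A.
converse_equivalent = all(
    implies(a, b) == implies(b, a) for a, b in cases
)

print(contrapositive_equivalent, converse_equivalent)
```

This is the same comparison as a truth table, done by exhaustion; it also shows why proving the converse is not a substitute for proving the original statement.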
Example 4.19.2. Let a be any integer. Prove that if a^2 is divisible by 3, then a is divisible by 3.
Proof. By contrapositive. Assume that a is not divisible by 3. Then a = 3t + 1 or a = 3t + 2 for some t ∈ Z. If a = 3t + 1 then a^2 = (3t + 1)^2 = 9t^2 + 6t + 1 = 3(3t^2 + 2t) + 1, which is not divisible by 3. If a = 3t + 2 then a^2 = (3t + 2)^2 = 9t^2 + 12t + 4 = 3(3t^2 + 4t + 1) + 1, which is not divisible by 3. In both cases a^2 is not divisible by 3, which proves the contrapositive.

4.20 Counterexamples
For example: Is the statement “All multiples of 3 are multiples of 6” true or false? Prove your answer.
It is false because 9 is a multiple of 3, but is not a multiple of 6. (9 is called a counterexample to the “all” statement, All multiples of 3 are multiples of 6.)

4.21 Divisors
In this section we work in the set of integers, Z. An integer a divides the integer b if there exists an integer k such that b = ka. In this case we say b is divisible by a and write a | b. We also say that a is a divisor of b. If a does not divide b, then we write a ∤ b. For example, 3 | 6 since 6 = 2 × 3.

Theorem 4.21.1. If a | b and a | c then a | (mb + nc) for all integers m and n.
As simple examples, if m = n = 1, we have: if a | b and a | c, then a | (b + c). If we take m = 1 and n = −1, we get: if a | b and a | c, then a | (b − c).
Proof. By assumption, there exist integers k1 and k2 such that b = k1 a and c = k2 a. For any integers m and n, we have mb + nc = m(k1 a) + n(k2 a) (by assumption) = (mk1 + nk2)a. Thus mb + nc is divisible by a.

Theorem 4.21.2. Let a, b, c ∈ Z. Then
(i) If a | b and b | c, then a | c.
(ii) If a | b and b | a, then a = b or a = −b.
Proof. (i) By assumption, there exist k1, k2 ∈ Z such that b = k1 a and c = k2 b. Hence c = k2 k1 a and we deduce that a divides c.
(ii) By assumption, there exist k1, k2 ∈ Z such that b = k1 a and a = k2 b, so a = k2 k1 a. If a = 0 then b = k1 a = 0 = a. Otherwise k2 k1 = 1, so k1 = ±1 and b = ±a.

Example 4.21.1. For n even, n^2 + 2n + 8 is divisible by 4.
Proof. n is even implies that n = 2m for some m ∈ Z. Then n^2 + 2n + 8 = (2m)^2 + 2(2m) + 8 = 4m^2 + 4m + 8 = 4(m^2 + m + 2). Since m^2 + m + 2 is an integer we can conclude that n^2 + 2n + 8 is divisible by 4.

Exercise 4.21.1. Show that x^2 + 9x + 20 is divisible by 2 for all x ∈ Z.
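Divisibility statements such as Theorem 4.21.1 and Example 4.21.1 can be spot-checked numerically before attempting a proof. The sketch below is a sanity check over a small, arbitrarily chosen range of integers and is in no way a replacement for the proofs; the function name divides and the ranges are assumptions made for this illustration.

```python
def divides(a, b):
    """Return True if a | b, i.e. b = k*a for some integer k."""
    if a == 0:
        return b == 0          # b = k*0 forces b = 0
    return b % a == 0

# Theorem 4.21.1 on a small range: a | b and a | c imply a | (m*b + n*c).
theorem_holds = all(
    divides(a, m * b + n * c)
    for a in range(1, 7)
    for b in range(-12, 13) if divides(a, b)
    for c in range(-12, 13) if divides(a, c)
    for m in range(-3, 4)
    for n in range(-3, 4)
)

# Example 4.21.1 on a small range: for even n, 4 | n^2 + 2n + 8.
example_holds = all(
    divides(4, n * n + 2 * n + 8) for n in range(-20, 21, 2)
)

print(theorem_holds, example_holds)
```

A check like this can only fail to find a counterexample; if it did find one, the conjecture would be false, which is exactly the role counterexamples play in Section 4.20.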
Exercise 4.21.2. Show that $x^3 - 6x^2 + 11x - 6$ is divisible by 3 for all x ∈ Z.

Exercise 4.21.3. For each positive integer x, show that $x^3 - x$ is divisible by 3 and $x^5 - x$ is divisible by 5. Can you generalise this? Is $x^n - x$ divisible by n?

4.22 The Principle of Mathematical Induction

Mathematical induction is an important property of the positive integers (natural numbers). It is used to prove statements involving all positive integers when it is known, for example, that the statements are valid for the first few values n = 1, 2, 3, . . . but it is only suspected or conjectured that they hold for all positive integers.

4.22.1 Steps

1. Prove the statement for n = 1 or some other positive integer. (Initial Step)
2. Assume the statement is true for n = k, where k ∈ Z+. (Inductive Hypothesis)
3. From the assumption in 2, prove that the statement must be true for n = k + 1.
4. Since the statement is true for n = 1 (from 1), it must (from 3) be true for n = 1 + 1 = 2, and from this for n = 2 + 1 = 3, and so on, so it must be true for all positive integers. (Conclusion)

Example: For any positive integer n, $1 + 2 + \cdots + n = \frac{n(n + 1)}{2}$.

Solution:
1. Prove for n = 1: $1 = \frac{1(1 + 1)}{2} = \frac{2}{2} = 1$, which is clearly true.
2. Assume that the statement holds for n = k, that is, $1 + 2 + \cdots + k = \frac{k(k + 1)}{2}$.
3. Prove for n = k + 1:
$1 + 2 + \cdots + k + (k + 1) = \frac{k(k + 1)}{2} + (k + 1)$ (by the inductive hypothesis)
$= \frac{k(k + 1) + 2(k + 1)}{2} = \frac{k^2 + 3k + 2}{2} = \frac{(k + 1)(k + 2)}{2}$,
so the statement holds for n = k + 1.
4. Hence by induction, $1 + 2 + \cdots + n = \frac{n(n + 1)}{2}$ is true for any positive integer n.

Example: Prove that for any natural number n, $1 + 3 + 5 + \cdots + (2n - 1) = n^2$.

Solution:
1. Prove for n = 1: $1 = 1^2 = 1$, so it is true.
2. Assume that the statement holds for n = k, that is, $1 + 3 + 5 + \cdots + (2k - 1) = k^2$.
3. Prove for n = k + 1. We have $1 + 3 + 5 + \cdots + (2k - 1) + (2(k + 1) - 1) = k^2 + 2k + 1$ (by the inductive hypothesis) $= (k + 1)^2$. So it is true for n = k + 1.
4. Hence by induction $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ is true for all natural numbers n.

Example: Prove that $3^n > 2^n$ for all natural numbers n.

Solution:
1. Prove for n = 1: $3^1 = 3 > 2^1 = 2$, which is true.
2. Assume the statement holds for n = k, that is, $3^k > 2^k$.
3. Prove for n = k + 1:
$3^{k+1} = 3^k \cdot 3 > 2^k \cdot 3$ (by the inductive hypothesis) $> 2^k \cdot 2$ (since 3 > 2) $= 2^{k+1}$,
so the statement is true for n = k + 1.
4. Hence, by induction, $3^n > 2^n$ for all natural numbers n.

Example: Prove that for any integer n ≥ 1, $2^{2n} - 1$ is divisible by 3.

Solution:
1. Prove for n = 1: $2^2 - 1 = 3$, which is divisible by 3, hence it is true.
2. Assume that the statement holds for n = k, that is, for k ≥ 1, $2^{2k} - 1$ is divisible by 3, i.e., $2^{2k} - 1 = 3l$ for some l ∈ Z.
3. Prove for n = k + 1. Since $2^{2k} = 3l + 1$ by the inductive hypothesis,
$2^{2(k+1)} - 1 = 4 \cdot 2^{2k} - 1 = 4(3l + 1) - 1 = 12l + 4 - 1 = 12l + 3 = 3(4l + 1)$,
which is divisible by 3.
4. Hence, by induction, $2^{2n} - 1$ is divisible by 3 for all n ≥ 1.

Chapter 5 Operations and Structures

We use the following sets: C, the set of complex numbers; Z, the set of integers; N, the set of natural numbers; Q, the set of rational numbers; and R, the set of real numbers.

5.1 Operations

Operations (such as addition) that involve two input values, for example, 2 + 3, are called binary operations. Those that involve only one input value, such as finding the square root of a number (for example, $\sqrt{8}$), are called unary operations. Others that involve three input values are called ternary operations.

Definition 5.1.1. An operation on a non-empty set A is a mapping from A × A to A.

Definition 5.1.2. A mapping ∗ : A × A → A is called a binary operation on the set A.

5.2 Idea

An operation on A is the combining of arbitrary elements a and b of A in some prescribed way to obtain a unique element c of A, i.e., a ∗ b = c. Binary operations are usually represented by symbols like ∗, ·, +, ◦. Examples are the binary operations of addition (+) and multiplication (·) on the set Z of integers.

Example 5.2.1.
(i) Addition (+) is an operation on Z since + : Z × Z → Z; for example, 1 + 2 = 3.
(ii) Let P(S) be the power set of S, i.e., P(S) = {T | T ⊆ S}. ∪ and ∩ are both operations on P(S).

5.2.1 Properties

Definition 5.2.1. Let ∗ be an operation on A. ∗ is commutative on A if for all a, b ∈ A, a ∗ b = b ∗ a.

For example, consider the operation of multiplication applied to the set of 2 × 2 matrices:

$\begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} \begin{pmatrix} 3 & 7 \\ 1 & 5 \end{pmatrix} = \begin{pmatrix} 5 & 17 \\ 4 & 20 \end{pmatrix}$ and $\begin{pmatrix} 3 & 7 \\ 1 & 5 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} = \begin{pmatrix} 3 & 34 \\ 1 & 22 \end{pmatrix}$,

so a ∗ b ≠ b ∗ a. If, on the other hand, we take ∗ as (+), then a ∗ b = b ∗ a for all pairs of matrices a and b, for example

$\begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} + \begin{pmatrix} 3 & 7 \\ 1 & 5 \end{pmatrix} = \begin{pmatrix} 3 & 7 \\ 1 & 5 \end{pmatrix} + \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} = \begin{pmatrix} 4 & 9 \\ 1 & 9 \end{pmatrix}$.

Definition 5.2.2. ∗ is associative on A if for all a, b, c ∈ A we have (a ∗ b) ∗ c = a ∗ (b ∗ c).

For example,
(i) Addition is both commutative and associative on Z.
(ii) ∪ and ∩ are commutative and associative on P(S).

5.2.2 Closure

Suppose we have a binary operation ∗ and apply it to the elements a, b ∈ A to produce a ∗ b. It is important to know whether or not a ∗ b belongs to the original set A. If a ∗ b ∈ A for every pair a, b ∈ A, then we say A is closed under the binary operation ∗. For example, Z+ is not closed under subtraction, since 2 − 5 = −3 ∉ Z+, but Z+ is closed under addition and multiplication. A is closed under the binary operation ∗ if and only if a, b ∈ A ⇒ a ∗ b ∈ A.

5.2.3 Identity Elements

An identity element is an element that, when involved in an operation with another element, does not change the value of that element.

Example 5.2.2. Find the identity element for the operation defined by a ◦ b = a + b + 2. Let a + b + 2 = a; then b = −2, so the identity element is −2.

Consider the product a ∗ b = c of a, b ∈ A, where A is closed under the binary operation ∗. a ∗ b is called the right hand product of b with a and b ∗ a is called the left hand product of b with a. For example, in the set R under the operation +, 3 + 0 = 3. We have formed the right hand sum of 0 and 3, and addition of 0 does not affect the number 3.
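The identity found in Example 5.2.2 can be spot-checked numerically. This is a small sketch only: the function name `circ` and the sample range are assumptions chosen for illustration, and a finite check does not replace the algebraic argument.

```python
def circ(a: int, b: int) -> int:
    """The operation from Example 5.2.2: a o b = a + b + 2."""
    return a + b + 2


IDENTITY = -2  # candidate identity element found in the example

# Arbitrary illustrative sample of integers.
sample = range(-10, 11)

# An identity must work on both sides: a o e = a and e o a = a.
is_identity = all(
    circ(a, IDENTITY) == a and circ(IDENTITY, a) == a for a in sample
)
print("identity property holds on the sample:", is_identity)
```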
If in a set A there is an element i such that a ∗ i = i ∗ a = a for all elements a ∈ A, then we call i an identity element or neutral element for the set A under the operation ∗. An identity element is unique. For example, consider P(S), the power set of S. The empty set ∅ acts as an identity for the operation of union on P(S).

5.2.4 Inverse Elements

An inverse element is an element that, when involved in an operation with another element, results in the identity element for that operation. Consider the set R: given a real number, say $2\tfrac{1}{2}$, we can obtain another real number $-2\tfrac{1}{2}$ with the property that $2\tfrac{1}{2} + (-2\tfrac{1}{2}) = 0$. For all x ∈ R, x + (−x) = 0 and (−x) + x = 0. The element −x is called an inverse element of x. We only seek inverse elements in sets which have an identity element.

Take a set A with binary operation ∗, written (A, ∗). Suppose that there is an identity element i ∈ A and let a ∈ A. If there is an element b in A such that a ∗ b = b ∗ a = i, then b is called the inverse of a and we write $b = a^{-1}$.

Example 5.2.3.
(i) Relative to + on Z, the inverse of a is −a, for all a ∈ Z.
(ii) Relative to · on Z, only 1 and −1 have inverses.
(iii) Relative to multiplication on Q, if a ∈ Q and a ≠ 0, then $a^{-1} = \frac{1}{a}$.

5.3 Operation Tables (Cayley Table)

Suppose that A is a small finite set and ∗ is a binary operation on A. We can write out a complete description of ∗ as follows. Let A = {a, b, c} and ∗ be such that

a ∗ a = a    b ∗ a = b    c ∗ a = c
a ∗ b = b    b ∗ b = c    c ∗ b = a
a ∗ c = c    b ∗ c = a    c ∗ c = b.

The situation is easily improved by recording the information in the form of a table, where the entry in row x and column y is x ∗ y.

    ∗ | a  b  c
    --+---------
    a | a  b  c
    b | b  c  a
    c | c  a  b

It is easy to determine the existence of special elements from the table and also whether or not the operation is commutative.
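The Cayley table for A = {a, b, c} above can be encoded as a nested dictionary and tested mechanically. The sketch below is illustrative only; the names `table`, `op` and `identities` are assumptions and not notation from the notes.

```python
from itertools import product

elements = ["a", "b", "c"]

# table[x][y] holds x * y, copied row by row from the Cayley table.
table = {
    "a": {"a": "a", "b": "b", "c": "c"},
    "b": {"a": "b", "b": "c", "c": "a"},
    "c": {"a": "c", "b": "a", "c": "b"},
}


def op(x: str, y: str) -> str:
    """Look up x * y in the Cayley table."""
    return table[x][y]


# Closure: every entry of the table lies in the original set.
closed = all(op(x, y) in elements for x, y in product(elements, repeat=2))

# Commutativity: the table is symmetric about the main diagonal.
commutative = all(op(x, y) == op(y, x) for x, y in product(elements, repeat=2))

# Associativity: brute force over all triples.
associative = all(
    op(op(x, y), z) == op(x, op(y, z)) for x, y, z in product(elements, repeat=3)
)

# Two-sided identities: e * x = x and x * e = x for every x.
identities = [
    e for e in elements if all(op(e, x) == x and op(x, e) == x for x in elements)
]

print(closed, commutative, associative, identities)
```

Brute force is practical here because the set has only three elements; for a set of size n the associativity check examines n³ triples.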
Commutativity of the operation corresponds to symmetry in the table with respect to the diagonal, i.e., if we fold the table along the diagonal, then the operation is commutative only when the corresponding entries are the same. The existence of a left identity can be determined simply by locating a row that is a repetition of the column headings. Element a is a left identity for ∗. A right identity is found (if one exists) by locating a column that is a repeat of the row headings. Again a is a right identity for ∗. Putting these two together yields a method for determining the existence (or non-existence) of an identity relative to a given operation.

Example 5.3.1. The Cayley table for the set S = {1, i, −1, −i} with the operation of multiplication is

     ×  |  1    i   −1   −i
    ----+-------------------
     1  |  1    i   −1   −i
     i  |  i   −1   −i    1
    −1  | −1   −i    1    i
    −i  | −i    1    i   −1

Test four properties: closure, associativity, identity and multiplicative inverses. All the results are members of the original set {1, i, −1, −i}, therefore the set is closed. The operation is associative; for example, (1 × i) × (−i) = i × (−i) = 1 and 1 × (i × (−i)) = 1 × 1 = 1. The identity is 1. There is exactly one 1 in every row of the table, so each element has a unique inverse. Note that the table is symmetric about the leading diagonal, hence the operation is also commutative.

5.4 Groups

Definition 5.4.1. A group (G, ∗) is an algebraic structure, where G is a set and ∗ is a composition (binary operation) on that set, satisfying the following laws:
1. Closure: ∀a, b ∈ G, a ∗ b ∈ G.
2. Associativity: ∀a, b, c ∈ G, a ∗ (b ∗ c) = (a ∗ b) ∗ c.
3. Neutral Element: there is a unique e in G such that ∀a ∈ G, e ∗ a = a ∗ e = a.
4. Inverse Elements: ∀a ∈ G, there exists a unique $a^{-1}$ ∈ G such that $a * a^{-1} = a^{-1} * a = e$.

A group may be finite or infinite. The number |G| denotes the order of G, i.e., the number of elements in the group G.

Example 5.4.1.
1. (Q − {0}, ·) (multiplication on the non-zero rationals)
2. (Z, +) (addition of integers)
3. (R, +) (addition of the reals)
4. (R − {0}, ·) (multiplication on the non-zero reals)

Definition 5.4.2. An abelian group, or commutative group, is a group for which the commutative axiom holds, i.e., a ∗ b = b ∗ a for every a, b ∈ G.

Theorem 5.4.1.
1. The identity of a group is unique.
2. Every element of a group has a unique inverse.
3. For all a, b in a group (G, ∗), we have $(a * b)^{-1} = b^{-1} * a^{-1}$.
4. Let e be the identity in (G, ∗). An element a ∈ G is idempotent if a ∗ a = a. In G the only idempotent element is e.
5. If a, b, c ∈ G, then a ∗ c = b ∗ c ⇒ a = b. (Cancellation Law)

Proof of 5. Suppose that a ∗ c = b ∗ c. Then $(a * c) * c^{-1} = (b * c) * c^{-1}$. Hence $a * (c * c^{-1}) = b * (c * c^{-1})$ (by the associativity law). Hence a ∗ e = b ∗ e (by the inverse property), and so a = b (by the identity property).

5.5 Permutations

Let A be the set {1, 2, . . . , n}. The set of permutations of A is denoted by $S_n$. A permutation σ ∈ $S_n$ is usually represented as

$\begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}$

where σ maps 1 to σ(1), σ maps 2 to σ(2), and so on. The order of the columns in this representation of σ is immaterial, for example $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 1 & 3 \end{pmatrix}$ and . . .

Chapter 6 Introduction to Probability Theory

6.1 Probability

Probability theory provides a mathematical foundation to concepts such as "probability", "information", "belief", "uncertainty", "confidence", "randomness", "variability", "chance" and "risk". Probability theory is important to empirical scientists because it gives them a rational framework to make inferences and test hypotheses based on uncertain empirical data. Probability theory is also useful to engineers building systems that have to operate intelligently in an uncertain world. For example, some of the most successful approaches in machine perception (e.g., automatic speech recognition, computer vision) and artificial intelligence are based on probabilistic models.
Moreover, probability theory is also proving very valuable as a theoretical framework for scientists trying to understand how the brain works. Many computational neuroscientists think of the brain as a probabilistic computer built with unreliable components, i.e., neurons, and use probability theory as a guiding framework to understand the principles of computation used by the brain. Consider the following examples:

• You need to decide whether a coin is loaded (i.e., whether it tends to favor one side over the other when tossed). You toss the coin 6 times and in all cases you get "Tails". Would you say that the coin is loaded?

• You are trying to figure out whether newborn babies can distinguish green from red. To do so you present two colored cards (one green, one red) to 6 newborn babies. You make sure that the 2 cards have equal overall luminance so that they are indistinguishable if recorded by a black and white camera. The 6 babies are randomly divided into two groups. The first group gets the red card on the left visual field, and the second group on the right visual field. You find that all 6 babies look longer at the red card than at the green card. Would you say that babies can distinguish red from green?

• A pregnancy test has 99% sensitivity (i.e., 99 out of 100 pregnant women test positive) and 95% specificity (i.e., 95 out of 100 non-pregnant women test negative). A woman believes she has a 10% chance of being pregnant. She takes the test and tests positive. How should she combine her prior beliefs with the results of the test?

• You need to design a system that detects a sinusoidal tone of 1000 Hz in the presence of white noise. How should you design the system to solve this task optimally?

• How should the photoreceptors in the human retina be interconnected to maximize information transmission to the brain?
While these tasks appear different from each other, they all share a common problem: the need to combine different sources of uncertain information to make rational decisions. Probability theory provides a very powerful mathematical framework to do so. We now go into the mathematical aspects of probability theory.

6.2 Sample Spaces

A set S that consists of all possible outcomes of a random experiment is called a sample space, and each outcome is called a sample point. Often there will be more than one sample space that can describe outcomes of an experiment, but there is usually only one that will provide the most information.

Example 6.2.1. If we toss a die, then one sample space is given by {1, 2, 3, 4, 5, 6} while another is {even, odd}. It is clear, however, that the latter would not be adequate to determine, for example, whether an outcome is divisible by 3.

The sample space is also called the outcome space, reference set, and universal set. It is often useful to portray a sample space graphically. In such cases, it is desirable to use numbers in place of letters whenever possible.

If a sample space has a finite number of points, it is called a finite sample space. If it has as many points as there are natural numbers 1, 2, 3, . . . , it is called a countably infinite sample space. If it has as many points as there are in some interval on the x axis, such as 0 ≤ x ≤ 1, it is called a noncountably infinite sample space. A sample space that is finite or countably infinite is often called a discrete sample space, while one that is noncountably infinite is called a nondiscrete sample space.

Example 6.2.2. Tossing a die yields a discrete sample space. However, picking any number, not just an integer, from 1 to 10 yields a nondiscrete sample space.

6.3 Events

We have defined outcomes as the elements of a sample space S. In practice, we are interested in assigning probability values not only to outcomes but also to sets of outcomes.
For example, we may want to know the probability of getting an even number when rolling a die. In other words, we want the probability of the set {2, 4, 6}.

An event is a subset A of the sample space S, i.e., it is a set of possible outcomes. If the outcome of an experiment is an element of A, we say that the event A has occurred. An event consisting of a single point of S is called a simple or elementary event. As particular events, we have S itself, which is the sure or certain event since an element of S must occur, and the empty set ∅, which is called the impossible event because an element of ∅ cannot occur.

By using set operations on events in S, we can obtain other events in S. For example, if A and B are events, then
1. A ∪ B is the event "either A or B or both." A ∪ B is called the union of A and B.
2. A ∩ B is the event "both A and B." A ∩ B is called the intersection of A and B.
3. A′ is the event "not A." A′ is called the complement of A.
4. A − B = A ∩ B′ is the event "A but not B." In particular, A′ = S − A.

If the sets corresponding to events A and B are disjoint, i.e., A ∩ B = ∅, we often say that the events are mutually exclusive. This means that they cannot both occur. We say that a collection of events $A_1, A_2, \ldots, A_n$ is mutually exclusive if every pair in the collection is mutually exclusive.

Example 6.3.1. Consider an experiment of tossing a coin twice. Let A be the event "at least one head occurs" and B the event "the second toss results in a tail." Find the events A ∪ B, A ∩ B, A′ and A − B.

Solution: We observe that A = {HT, TH, HH} and B = {HT, TT}, and so we have
A ∪ B = {HT, TH, HH, TT} = S,
A ∩ B = {HT},
A′ = {TT},
A − B = {TH, HH}.

6.4 The Concept of Probability

In any random experiment there is always uncertainty as to whether a particular event will or will not occur. As a measure of the chance, or probability, with which we can expect the event to occur, it is convenient to assign a number between 0 and 1.
If we are sure or certain that an event will occur, we say that its probability is 100% or 1. If we are sure that the event will not occur, we say that its probability is zero. If, for example, the probability is 1/4, we would say that there is a 25% chance it will occur and a 75% chance that it will not occur. Equivalently, we can say that the odds against occurrence are 75% to 25%, or 3 to 1.

There are two important procedures by means of which we can estimate the probability of an event.

1. CLASSICAL APPROACH: If an event can occur in h different ways out of a total of n possible ways, all of which are equally likely, then the probability of the event is h/n.

Example 6.4.1. Suppose we want to know the probability that a head will turn up in a single toss of a coin. Since there are two equally likely ways in which the coin can come up, namely heads and tails (assuming it does not roll away or stand on its edge), and of these two ways a head can arise in only one way, we reason that the required probability is 1/2. In arriving at this, we assume that the coin is fair, i.e., not loaded in any way.

2. FREQUENCY APPROACH: If after n repetitions of an experiment, where n is very large, an event is observed to occur in h of these, then the probability of the event is h/n. This is also called the empirical probability of the event.

Example 6.4.2. If we toss a coin 1000 times and find that it comes up heads 532 times, we estimate the probability of a head coming up to be 532/1000 = 0.532.

Both the classical and frequency approaches have serious drawbacks, the first because the words "equally likely" are vague and the second because the "large number" involved is vague. Because of these difficulties, mathematicians have been led to an axiomatic approach to probability.

6.5 The Axioms of Probability

Suppose we have a sample space S.
If S is discrete, all subsets correspond to events and conversely; if S is nondiscrete, only special subsets (called measurable) correspond to events. To each event A in the class C of events, we associate a real number P(A). Then P is called a probability function, and P(A) the probability of the event A, if the following axioms are satisfied.

Axiom 1. For every event A in the class C, P(A) ≥ 0.

Axiom 2. For the sure or certain event S in the class C, P(S) = 1.

Axiom 3. For any number of mutually exclusive events $A_1, A_2, \ldots$ in the class C,
$P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$
In particular, for two mutually exclusive events $A_1$ and $A_2$, $P(A_1 \cup A_2) = P(A_1) + P(A_2)$.

6.6 Some Important Theorems on Probability

From the above axioms we can now prove various theorems on probability that are important in further work.

Theorem 6.6.1. If $A_1 \subset A_2$, then $P(A_1) \le P(A_2)$ and $P(A_2 - A_1) = P(A_2) - P(A_1)$.

Theorem 6.6.2. For every event A, 0 ≤ P(A) ≤ 1, i.e., a probability is between 0 and 1.

Theorem 6.6.3. For ∅, the empty set, P(∅) = 0, i.e., the impossible event has probability zero.

Theorem 6.6.4. If A′ is the complement of A, then P(A′) = 1 − P(A).

Theorem 6.6.5. If $A = A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_n$, where $A_1, A_2, \ldots, A_n$ are mutually exclusive events, then
$P(A) = P(A_1) + P(A_2) + P(A_3) + \cdots + P(A_n)$.
In particular, if A = S, the sample space, then
$P(A_1) + P(A_2) + P(A_3) + \cdots + P(A_n) = 1$.

Theorem 6.6.6. If A and B are any two events, then P(A ∪ B) = P(A) + P(B) − P(A ∩ B). More generally, if $A_1, A_2, A_3$ are any three events, then
$P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_2 \cap A_3) - P(A_3 \cap A_1) + P(A_1 \cap A_2 \cap A_3)$.
Generalizations to n events can also be made.

Theorem 6.6.7. For any events A and B, P(A) = P(A ∩ B) + P(A ∩ B′).

Theorem 6.6.8. If an event A must result in the occurrence of one of the mutually exclusive events $A_1, A_2, \ldots, A_n$, then
$P(A) = P(A \cap A_1) + P(A \cap A_2) + \cdots + P(A \cap A_n)$.
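Several of the theorems above can be confirmed on a small finite sample space using exact rational arithmetic. The sketch below assumes a fair six-sided die with equally likely outcomes; the events A and B and the function name `prob` are chosen purely for illustration.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # sample space of a single die toss


def prob(event: frozenset) -> Fraction:
    """Probability under equally likely outcomes: |event| / |S|."""
    return Fraction(len(event & S), len(S))


A = frozenset({2, 4, 6})  # illustrative event: an even number
B = frozenset({1, 2, 3})  # illustrative event: a number less than 4

# Theorem 6.6.4: P(A') = 1 - P(A), with A' = S - A.
complement_ok = prob(S - A) == 1 - prob(A)

# Theorem 6.6.6: P(A u B) = P(A) + P(B) - P(A n B).
# In Python, | is set union and & is set intersection.
union_ok = prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Theorem 6.6.7: P(A) = P(A n B) + P(A n B').
split_ok = prob(A) == prob(A & B) + prob(A & (S - B))

print(prob(A | B), complement_ok, union_ok, split_ok)
```

Using `Fraction` rather than floating-point numbers keeps the equalities exact, so the comparisons are not affected by rounding.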
6.7 Assignment of Probabilities

If a sample space S consists of a finite number of outcomes $a_1, a_2, \ldots, a_n$, then by Theorem 6.6.5,

$P(A_1) + P(A_2) + \cdots + P(A_n) = 1$

where $A_1, A_2, \ldots, A_n$ are elementary events given by $A_i = \{a_i\}$. It follows that we can arbitrarily choose any nonnegative numbers for the probabilities of these simple events as long as the previous equation is satisfied. In particular, if we assume equal probabilities for all simple events, then

$P(A_k) = \frac{1}{n}$, k = 1, 2, . . . , n.

And if A is any event made up of h such simple events, we have

$P(A) = \frac{h}{n}$.

This is equivalent to the classical approach to probability. We could of course use other procedures for assigning probabilities, such as the frequency approach. Assigning probabilities provides a mathematical model, the success of which must be tested by experiment in much the same manner that the theories in physics or other sciences must be tested by experiment.

Example 6.7.1. A single die is tossed once. Find the probability of a 2 or 5 turning up.

Solution: The sample space is S = {1, 2, 3, 4, 5, 6}. If we assign equal probabilities to the sample points, i.e., if we assume that the die is fair, then

$P(1) = P(2) = \cdots = P(6) = \frac{1}{6}$.

The event that either 2 or 5 turns up is indicated by 2 ∪ 5. Therefore,

$P(2 \cup 5) = P(2) + P(5) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$.

6.8 Conditional Probability

Let A and B be two events such that P(A) > 0. Denote by P(B|A) the probability of B given that A has occurred. Since A is known to have occurred, it becomes the new sample space replacing the original S. From this we are led to the definition

$P(B|A) \equiv \frac{P(A \cap B)}{P(A)}$    (6.1)

or

$P(A \cap B) \equiv P(A)P(B|A)$.    (6.2)

In words, this is saying that the probability that both A and B occur is equal to the probability that A occurs times the probability that B occurs given that A has occurred.
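Definition (6.1) can be computed directly on a finite sample space with equally likely outcomes. The following is a sketch under that assumption, using a fair die; the events (an even number, a number greater than 3) and the function names `prob` and `cond_prob` are illustrative choices, not part of the notes.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # sample space of a single die toss


def prob(event: frozenset) -> Fraction:
    """Probability under equally likely outcomes: |event| / |S|."""
    return Fraction(len(event & S), len(S))


def cond_prob(b: frozenset, a: frozenset) -> Fraction:
    """P(B|A) = P(A n B) / P(A), as in (6.1); requires P(A) > 0."""
    if prob(a) == 0:
        raise ValueError("P(A) must be positive")
    return prob(a & b) / prob(a)


A = frozenset({2, 4, 6})  # illustrative event: an even number
B = frozenset({4, 5, 6})  # illustrative event: a number greater than 3

p_b_given_a = cond_prob(B, A)

# Equation (6.2): P(A n B) = P(A) P(B|A).
product_rule_ok = prob(A & B) == prob(A) * p_b_given_a

print(p_b_given_a, product_rule_ok)
```

Here A ∩ B = {4, 6}, so P(B|A) = (2/6)/(3/6) = 2/3, which is larger than the unconditional P(B) = 1/2.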
We call P(B|A) the conditional probability of B given A, i.e., the probability that B will occur given that A has occurred. It is easy to show that conditional probability satisfies the axioms of probability previously discussed.

Example 6.8.1. Find the probability that a single toss of a die will result in a number less than 4 if (a) no other information is given and (b) it is given that the toss resulted in an odd number.

Solution: (a) Let B denote the event {less than 4}. Since B is the union of the events 1, 2, or 3 turning up, we see by Theorem 6.6.5 that

$P(B) = P(1) + P(2) + P(3) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$,

assuming equal probabilities for the sample points.

(b) Letting A be the event {odd number}, we see that $P(A) = \frac{3}{6} = \frac{1}{2}$. Also, $P(A \cap B) = \frac{2}{6} = \frac{1}{3}$. Then

$P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{1/3}{1/2} = \frac{2}{3}$.

Hence, the added knowledge that the toss results in an odd number raises the probability from 1/2 to 2/3.

6.9 Theorems on Conditional Probability

Theorem 6.9.1. For any three events $A_1, A_2, A_3$, we have

$P(A_1 \cap A_2 \cap A_3) = P(A_1)P(A_2|A_1)P(A_3|A_1 \cap A_2)$.    (6.3)

In words, the probability that $A_1$ and $A_2$ and $A_3$ all occur is equal to the probability that $A_1$ occurs times the probability that $A_2$ occurs given that $A_1$ has occurred times the probability that $A_3$ occurs given that both $A_1$ and $A_2$ have occurred. The result is easily generalized to n events.

Theorem 6.9.2. If an event A must result in one of the mutually exclusive events $A_1, A_2, \ldots, A_n$, then

$P(A) = P(A_1)P(A|A_1) + P(A_2)P(A|A_2) + \cdots + P(A_n)P(A|A_n)$.    (6.4)

6.10 Independent Events

If P(B|A) = P(B), i.e., the probability of B occurring is not affected by the occurrence or nonoccurrence of A, then we say that A and B are independent events. This is equivalent to

$P(A \cap B) = P(A)P(B)$.    (6.5)

Conversely, if this equation holds, then A and B are independent.

We say that three events $A_1, A_2, A_3$ are independent if they are pairwise independent,

$P(A_j \cap A_k) = P(A_j)P(A_k)$, j ≠ k, where j, k = 1, 2, 3,    (6.6)

and

$P(A_1 \cap A_2 \cap A_3) = P(A_1)P(A_2)P(A_3)$.    (6.7)

Both of these properties must hold in order for the events to be independent. Independence of more than three events is easily defined.

Note: In order to use this multiplication rule, all of your events must be independent.

6.11 Bayes' Theorem or Rule

Suppose that $A_1, A_2, \ldots, A_n$ are mutually exclusive events whose union is the sample space S, i.e., one of the events must occur. Then if A is any event, we have the following important theorem:

Theorem 6.11.1. (Bayes' Rule):

$P(A_k|A) = \frac{P(A_k)P(A|A_k)}{\sum_{j=1}^{n} P(A_j)P(A|A_j)}$.    (6.8)

This enables us to find the probabilities of the various events $A_1, A_2, \ldots, A_n$ that can occur. For this reason Bayes' theorem is often referred to as a theorem on the probability of causes.
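Bayes' rule (6.8) can be applied to the pregnancy test example posed in Section 6.1. The sketch below treats the woman's 10% belief as the prior probability, takes the probability of a positive result as 99/100 for pregnant women and 1 − 95/100 = 5/100 for non-pregnant women, and uses a helper named `bayes` that is an assumption introduced for illustration.

```python
from fractions import Fraction


def bayes(priors, likelihoods, k):
    """Return P(A_k | A) from priors P(A_j) and likelihoods P(A | A_j), as in (6.8)."""
    # Denominator: sum over j of P(A_j) P(A | A_j), the total probability of A.
    total = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[k] * likelihoods[k] / total


# A_1 = pregnant, A_2 = not pregnant; A = the test is positive.
priors = [Fraction(10, 100), Fraction(90, 100)]
likelihoods = [Fraction(99, 100), Fraction(5, 100)]

posterior = bayes(priors, likelihoods, 0)
print(posterior, float(posterior))  # 11/16 0.6875
```

Under these assumptions a positive test raises the probability of pregnancy from 10% to about 69%, well short of the 99% that the test's sensitivity alone might suggest.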