University of Zimbabwe

HMTHCS102 LINEAR MATHEMATICS 1

Lecture hours: 60 hours
Mathematics building, Office No.: 216, Old wing
Author: Mr T.V. Mupedza
Contact: 0779791216
Email: tvmupedza@science.uz.ac.zw
Department: Mathematics And Computational Sciences
August 19, 2022

Course Outline

0.1 Purpose

This course is designed to cover Calculus of single variables and part of Calculus of several variables for undergraduate engineering students with some background of A-Level calculus. The concentration is on motivating results and concepts geometrically rather than on providing rigorous proofs. Concepts are defined carefully and results stated precisely, but illustrated by way of vivid, concrete examples. We seek to introduce and develop concepts (including complex numbers and the algebra of polynomials) necessary for a first course in algebra. Our goal is the elementary theory of matrices and determinants, and their applications to solving systems of linear equations. The course is a pre-requisite for MT201 (Engineering mathematics 2). It has 20 one-hour lectures per week, held Monday to Friday, 0800-1200 hrs.

0.2 Aim

The aim of this course is to introduce gently the rigour of mathematical analysis and provide a good background for applied mathematics.

0.3 Course Content

0.3.1 Number systems
1. Natural, Rational and Irrational Numbers, [1]
2. Principle of Mathematical Induction, [2]
3. The Real Number System, Inequalities, Solution Sets and Geometrical Representation, [1]
4. Absolute Value, Neighborhoods and Intervals. [1]

0.3.2 Sequences and series
1. Definitions and Notation, Limits of Sequences and their properties, [2]
2. Monotone Sequences, [1]
3. Convergence or Divergence of infinite Series, [1]
4. Tests for Convergence or Divergence of infinite Series (Direct Comparison Test, Limit Comparison Test, Alternating Series Test, Absolute Convergence, nth Root Test and Ratio Test). [2]

0.3.3 Functions, limits of functions of a single variable
1. Definitions and Notation, Types of Functions and their Inverses, [2]
2. Definition of a limit of a function and its application, [1]
3. Left and Right hand Limits, [1]
4. L'Hospital's Rule, [1]
5. Continuous Functions, [2]
6. Curve sketching. [1]

0.3.4 Differentiation
1. The concept of a derivative, [1]
2. Theorems on Differentiation, [2]
3. Applications of the derivative. The Mean Value Theorem and Rolle's Theorem, [2]
4. Leibniz Theorem with Applications. [2]

0.3.5 Integration
1. Indefinite integral and definite integral, [2]
2. Techniques of integration, [2]
3. Reduction Formulas, [2]
4. Improper integrals. [2]

0.3.6 Functions of two or more variables
1. Limits and continuity, [2]
2. Partial differentiation, [2]
3. The chain rule, [2]
4. The extended chain rule, [1]
5. Maxima, Minima and Saddle points. [2]

0.3.7 Integration of functions of several variables
1. Double integration, [1]
2. Changing the order of integration and changing variables of integration, [3]
3. Triple integrals. [1]

0.3.8 Basic Concepts of Set Theory
1. Importance of set theory, Notation, Some interesting sets of numbers, Well-defined sets, Specification of Sets, [1]
2. The empty set (null set), Identity and cardinality, Russell's Paradox, Inclusion, Axiom of Extensionality, [2]
3. Power set, Operations on Sets, Difference and Complement, Venn diagrams, [2]
4. Set Theoretic Equalities, The Algebra of sets, Set Products. [2]

0.3.9 Introduction to Probability
1. Introduction to probability, random experiments, sample spaces, events, mutually exclusive events, axiomatic definition of probability, relative frequency, [2]
2. Computation of probabilities of finite sample spaces, cardinality of a set, probabilities based on symmetry, methods of enumerating sample points, conditional probability, total probability, independent events, Bayes' Law. [3]

0.3.10 Complex Numbers and Polynomials
1. Introduction, Operations, Rules of Complex arithmetic, [2]
2. Modulus, Complex conjugate, Division, Polar representation of complex numbers, De Moivre's theorem and its application, [4]
3. Applications of complex numbers, The fundamental theorem of algebra. [4]

0.3.11 Matrices and Determinants
1. Matrix addition and multiplication, properties, Transpose of a matrix, square matrices, diagonal and trace, Powers of matrices, [2]
2. Some special types of square matrices, Determinants, Laplace expansion of the determinant, Inverse of matrices, [5]
3. Application of matrices, Elementary row operations, Inverses using row operations, Solving systems of linear equations. [5]

0.4 Methods/Strategies to be used
1. lecture method,
2. group discussion,
3. seminars,
4. tutorials.

0.5 Student Assessment

Students will write three one-hour tests, one after every week. The average of the tests constitutes the coursework mark, and the coursework mark contributes 50% of the final mark. A 3-hour final examination will be written in the 4th week of the first semester. The examination contributes the other 50% of the final mark. The examination paper has two sections, Section A and Section B. Candidates may attempt ALL questions in Section A and at most TWO questions in Section B. Section A carries 40 marks and each question in Section B carries 30 marks.

0.6 Selected Resources (references)

Recommended reading
S Lang, Calculus of Several Variables (Springer Science+Business Media New York).
P D Lax, M S Terrell, Calculus with Applications (Springer Science+Business Media New York).
M R Spiegel, Advanced Calculus (Schaum's Outline Series).
J R Kirkwood, An Introduction To Analysis (PWS Publishing Company).
A Jeffrey, Linear Algebra and Differential Equations.
H Anton, Elementary Linear Algebra, 7th edition (Wiley, 1994).
M R Spiegel, Complex Variables (Schaum's Outline Series).

Additional reading
Any first year university Calculus text book.

Abstract

Calculus is one of the milestones of Western thought. Building on ideas of Archimedes, Fermat, Newton, Leibniz, Cauchy, and many others, the calculus is arguably the cornerstone of modern science. Any well-educated person should at least be acquainted with the ideas of calculus, and a scientifically literate person must know calculus solidly.

Calculus has two main aspects: differential calculus and integral calculus. Differential calculus concerns itself with rates of change. Various types of change, both mathematical and physical, are described by a mathematical quantity called the derivative. Integral calculus is concerned with a generalized type of addition, or amalgamation, of quantities. Many kinds of summation, both mathematical and physical, are described by a mathematical quantity called the integral.

Calculus is one of the most important parts of mathematics. It is fundamental to all of modern science. How could one part of mathematics be of such central importance? It is because calculus gives us the tools to study rates of change and motion. All analytical subjects, from biology to physics to chemistry to engineering to mathematics, involve studying quantities that are growing or shrinking or moving; in other words, they are changing. Astronomers study the motions of the planets, chemists study the interaction of substances, physicists study the interactions of physical objects. All of these involve change and motion.
To Archimedes, Pierre de Fermat, Isaac Newton, and Gottfried Wilhelm von Leibniz, the fathers of calculus

The true sign of intelligence is not knowledge but imagination. (Albert Einstein)

Chapter 1 The Basics

1.1 Number Systems

Mathematics has its own language with numbers as the alphabet. The language is given structure with the aid of connective symbols, rules of operation, and a rigorous mode of thought (logic). The number systems that we use in calculus are the natural numbers, the integers, the rational numbers, and the real numbers. Let us describe each of these:

1. The natural numbers are the system of positive counting numbers $1, 2, 3, \dots$. We denote the set of all natural numbers by $\mathbb{N}$:
\[ \mathbb{N} = \{1, 2, 3, 4, 5, 6, 7, 8, \dots\}. \]

2. The integers are the positive and negative whole numbers and zero, $\dots, -3, -2, -1, 0, 1, 2, 3, \dots$. We denote the set of all integers by $\mathbb{Z}$:
\[ \mathbb{Z} = \{\dots, -4, -3, -2, -1, 0, 1, 2, 3, 4, \dots\}. \]

3. The rational numbers are quotients of integers or fractions, such as $\frac{3}{2}$, $-\frac{5}{4}$. Any number of the form $\frac{p}{q}$, with $p, q \in \mathbb{Z}$ and $q \neq 0$, is a rational number. We denote the set of all rational numbers by $\mathbb{Q}$:
\[ \mathbb{Q} = \left\{ \frac{p}{q} \;\middle|\; p, q \in \mathbb{Z},\ q \neq 0 \right\}. \]

4. The real numbers are the set of all decimals, both terminating and non-terminating. We denote the set of all real numbers by $\mathbb{R}$.

A decimal number of the form $x = 3.16792$ is actually a rational number, for it represents
\[ x = 3.16792 = \frac{316792}{100000}. \]

A decimal number of the form $m = 4.27519191919\dots$, with a group of digits that repeats itself interminably, is also a rational number. To see this, notice that $100 \cdot m = 427.519191919\dots$ and therefore we may subtract
\[ 100m = 427.519191919\dots \]
\[ \phantom{100}m = 4.27519191919\dots \]
Subtracting, we see that $99m = 423.244$, or
\[ m = \frac{423244}{99000}. \]
So, as we asserted, $m$ is a rational number or quotient of integers. To indicate recurring decimals we sometimes place dots over the repeating cycle of digits, e.g., $m = 4.275\dot{1}\dot{9}$, $\frac{19}{6} = 3.1\dot{6}$.
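The subtraction argument above can be mechanised. The following is a minimal sketch, not part of the original notes: the function name `recurring_to_fraction` and its three string arguments are hypothetical choices made for illustration. It uses the integer variant of the same idea: multiplying by two powers of ten whose results share the same decimal tail, so that the difference is a whole number.

```python
from fractions import Fraction

def recurring_to_fraction(integer_part: str, non_repeating: str, repeating: str) -> Fraction:
    """Convert a recurring decimal, given as digit strings, into an exact fraction.

    Example: 4.275191919... is passed as ("4", "275", "19").
    (Hypothetical helper; the argument layout is an assumption of this sketch.)
    """
    if not repeating:
        raise ValueError("the repeating block must contain at least one digit")
    k = len(non_repeating)  # digits after the point before the cycle starts
    r = len(repeating)      # length of the repeating cycle
    # 10**(k + r) * m and 10**k * m have the same decimal tail,
    # so their difference is an integer.
    shifted_far = int(integer_part + non_repeating + repeating)
    shifted_near = int(integer_part + non_repeating)
    return Fraction(shifted_far - shifted_near, 10**k * (10**r - 1))

m = recurring_to_fraction("4", "275", "19")
print(m == Fraction(423244, 99000))          # True: agrees with the value derived in the text
print(recurring_to_fraction("3", "1", "6"))  # 19/6
```

`Fraction` reduces to lowest terms automatically, so `m` prints as 105811/24750, which equals 423244/99000.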
Another kind of decimal number is one which has a non-terminating decimal expansion that does not keep repeating. An example is $\pi = 3.14159265\dots$. Such a number is irrational, that is, it cannot be expressed as the quotient of two integers.

In summary: there are three types of real numbers: (i) terminating decimals, (ii) non-terminating decimals that repeat, (iii) non-terminating decimals that do not repeat. Types (i) and (ii) are rational numbers. Type (iii) are irrational numbers.

The geometric representation of real numbers as points on a line is called the real axis. Between any two rational numbers on the line there are infinitely many rational numbers. This leads us to call the set of rational numbers an everywhere dense set.

Real numbers are characterised by three fundamental properties:
(a) algebraic properties formalise the rules of calculation (addition, subtraction, multiplication, division). Example: $2(3 + 5) = 2 \cdot 3 + 2 \cdot 5 = 6 + 10 = 16$.
(b) order properties describe inequalities. Example: $-\frac{3}{4} < \frac{1}{3}$.
(c) completeness implies that there are "no gaps" on the real line.

Algebraic properties of the reals for addition ($a, b, c \in \mathbb{R}$) are:
(A1) $a + (b + c) = (a + b) + c$. (associativity)
(A2) $a + b = b + a$. (commutativity)
(A3) There is a $0$ such that $a + 0 = a$. (identity)
(A4) There is an $x$ such that $a + x = 0$. (inverse)

Why these rules? They define an algebraic structure (commutative group). Now define analogous algebraic properties for multiplication:
(M1) $a(bc) = (ab)c$.
(M2) $ab = ba$.
(M3) There is a $1$ such that $a \cdot 1 = a$.
(M4) There is an $x$ such that $ax = 1$ for $a \neq 0$.

Finally, connect multiplication and addition:
(D) $a(b + c) = ab + ac$. (distributivity)

These 9 rules define an algebraic structure called a field.

Order properties of the reals are:
(O1) for any $a, b \in \mathbb{R}$, $a \leq b$ or $b \leq a$. (totality of ordering I)
(O2) if $a \leq b$ and $b \leq a$, then $a = b$. (totality of ordering II, also known as antisymmetry)
(O3) if $a \leq b$ and $b \leq c$, then $a \leq c$. (transitivity)
(O4) if $a \leq b$, then $a + c \leq b + c$. (order under addition)
(O5) if $a \leq b$ and $c \geq 0$, then $ac \leq bc$. (order under multiplication)

Some useful rules for calculations with inequalities are: if $a, b, c$ are real numbers, then
(a) if $a < b$ and $c < 0$, then $bc < ac$.
(b) if $a < b$, then $-b < -a$.
(c) if $a > 0$, then $\frac{1}{a} > 0$.
(d) if $a$ and $b$ are both positive or both negative, then $a < b \Rightarrow \frac{1}{b} < \frac{1}{a}$.

The completeness property can be understood by the following construction of the real numbers. Start with the counting numbers $1, 2, 3, \dots$.

$\mathbb{N} = \{1, 2, 3, 4, \dots\}$, the natural numbers. $\Rightarrow$ Can we solve $a + x = b$ for $x$?
$\mathbb{Z} = \{\dots, -2, -1, 0, 1, 2, \dots\}$, the integers. $\Rightarrow$ Can we solve $ax = b$ for $x$?
$\mathbb{Q} = \{\frac{p}{q} \mid p, q \in \mathbb{Z},\ q \neq 0\}$, the rational numbers. $\Rightarrow$ Can we solve $x^2 = 2$ for $x$?
$\mathbb{R}$, the real numbers. For example, the positive solution to the equation $x^2 = 2$ is $\sqrt{2}$. This is an irrational number whose decimal representation is not eventually repeating.

\[ \Rightarrow \quad \mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R}. \]

In summary, the real numbers $\mathbb{R}$ are complete in the sense that they correspond to all points on the real line, i.e., there are no "holes" or "gaps", whereas the rationals have "holes" (namely the irrationals).

You Try It: What type of real number is $3.41287548754875\dots$? Can you express this number in more compact form?

1.2 Intervals

Definition 1.2.1. A subset of the real line is called an interval if it contains at least two numbers and all the real numbers between any of its elements.

Examples:
1. $x > -2$ defines an infinite interval. Geometrically, it corresponds to a ray on the real line.
2. $3 \leq x \leq 6$ defines a finite interval. Geometrically, it corresponds to a line segment on the real line.

Finite Intervals. Let $a$ and $b$ be two points such that $a < b$. By the open interval $(a, b)$ we mean the set of all points between $a$ and $b$, that is, the set of all $x$ such that $a < x < b$. By the closed interval $[a, b]$ we mean the set of all points between $a$ and $b$ or equal to $a$ or $b$, that is, the set of all $x$ such that $a \leq x \leq b$. The points $a$ and $b$ are called the endpoints of the intervals $(a, b)$ and $[a, b]$.
By a half-open interval we mean an open interval $(a, b)$ together with one of its endpoints. There are two such intervals: $[a, b)$ is the set of all $x$ such that $a \leq x < b$ and $(a, b]$ is the set of all $x$ such that $a < x \leq b$.

Infinite Intervals. Let $a$ be any number. The set of all points $x$ such that $a < x$ is denoted by $(a, \infty)$, and the set of all points $x$ such that $a \leq x$ is denoted by $[a, \infty)$. Similarly, $(-\infty, b)$ denotes the set of all points $x$ such that $x < b$ and $(-\infty, b]$ denotes the set of all $x$ such that $x \leq b$.

1.3 Solving Inequalities

We solve an inequality to find the intervals of $x \in \mathbb{R}$ that satisfy it. The set of all solutions is called the solution set of the inequality.

Examples:
1. $2x - 1 < x + 3 \implies 2x < x + 4 \implies x < 4$.

2. For what values of $x$ is $x + 3(2 - x) \geq 4 - x$?
\[ x + 3(2 - x) \geq 4 - x \iff x + 6 - 3x \geq 4 - x \iff 6 - 2x \geq 4 - x \iff 2 \geq x, \]
that is, $x \leq 2$.

3. For what values of $x$ is $(x - 4)(x + 3) < 0$? A product is negative exactly when its factors have opposite signs.
Case 1: $(x - 4) > 0$ and $(x + 3) < 0 \implies x > 4$ and $x < -3$. This is impossible, since $x$ cannot be both greater than $4$ and less than $-3$.
Case 2: $(x - 4) < 0$ and $(x + 3) > 0 \implies x < 4$ and $x > -3 \implies -3 < x < 4$.

You Try It: Solve the inequality $\dfrac{2}{x - 1} < \dfrac{3}{2x + 1}$.

1.4 The Absolute Value

The absolute value is a quantity that gives the magnitude or size of a real number. The absolute value or modulus of a real number $x$, denoted by $|x|$, is given by
\[ |x| = \begin{cases} x, & \text{if } x \geq 0, \\ -x, & \text{if } x < 0. \end{cases} \]
Geometrically, $|x|$ is the distance between $x$ and $0$. For example, $|-6| = 6$, $|5| = 5$, $|0| = 0$.

1.4.1 Properties of the Absolute Value

1. The absolute value of a real number $x$ is non-negative, that is, $|x| \geq 0$.
2. The absolute value of a real number $x$ is zero if and only if $x = 0$, that is, $|x| = 0 \iff x = 0$.
3. In general, if $x$ and $y$ are any two numbers, then
   (a) $-|x| \leq x \leq |x|$.
   (b) $|-x| = |x|$ and $|x - y| = |y - x|$.
   (c) $|x| = |y|$ implies $x = \pm y$.
   (d) $|xy| = |x| \cdot |y|$ and $\left|\dfrac{x}{y}\right| = \dfrac{|x|}{|y|}$ if $y \neq 0$.
   (e) $|x + y| \leq |x| + |y|$. (Triangle inequality)
4. If $a$ is any positive number, then
   (a) $|x| = a$ if and only if $x = \pm a$.
   (b) $|x| < a$ if and only if $-a < x < a$.
   (c) $|x| > a$ if and only if $x > a$ or $x < -a$.
   (d) $|x| \leq a$ if and only if $-a \leq x \leq a$.
   (e) $|x| \geq a$ if and only if $x \geq a$ or $x \leq -a$.

Example: Show that for all real numbers $x$, $|-x| = |x|$.

Solution: If $x \in \mathbb{R}$, then either $x > 0$, $x = 0$ or $x < 0$.
If $x > 0$, then $-x < 0$. Thus, $|-x| = -(-x) = x = |x|$, that is, $|-x| = |x|$.
If $x = 0$, then $|-x| = |-0| = |0| = 0$, that is, $|-x| = |x|$.
If $x < 0$, then $-x > 0$. Now $|x| = -x = |-x|$ since $-x > 0$.
Therefore in all cases $|-x| = |x|$.

Solving an Equation with Absolute Values: Solve the equation $|2x - 3| = 7$.

Solution: We have $2x - 3 = \pm 7$, so there are two possibilities:
\[ 2x - 3 = 7 \implies 2x = 10 \implies x = 5, \]
\[ 2x - 3 = -7 \implies 2x = -4 \implies x = -2. \]
The solutions of $|2x - 3| = 7$ are $x = 5$ and $x = -2$.

Solving Inequalities Involving Absolute Values: Solve the inequality $\left|5 - \dfrac{2}{x}\right| < 1$.

Solution: We have
\[ \left|5 - \frac{2}{x}\right| < 1 \iff -1 < 5 - \frac{2}{x} < 1 \iff -6 < -\frac{2}{x} < -4 \iff 3 > \frac{1}{x} > 2 \iff \frac{1}{3} < x < \frac{1}{2}. \]

Solve the inequalities and show the solution set on the real line.
(a) $|2x - 3| \leq 1$  (b) $|2x - 3| \geq 1$.

Solution:
(a) $|2x - 3| \leq 1 \iff -1 \leq 2x - 3 \leq 1 \iff 2 \leq 2x \leq 4 \iff 1 \leq x \leq 2$. The solution set is the closed interval $[1, 2]$.
(b) $|2x - 3| \geq 1 \iff 2x - 3 \geq 1$ or $2x - 3 \leq -1 \iff x \geq 2$ or $x \leq 1$. The solution set is $(-\infty, 1] \cup [2, \infty)$.

You Try It: Solve the inequality $4|x| < 7x - 6$.

1.5 The Principle of Mathematical Induction

Mathematical induction is an important property of the positive integers (natural numbers). It is used in proving statements involving all positive integers when it is known, for example, that the statements are valid for $n = 1, 2, 3$, but it is suspected or conjectured that they hold for all positive integers.

1.5.1 Steps

1. Prove the statement for $n = 1$ or some other positive integer. (Initial Step)
2. Assume the statement true for $n = k$, where $k \in \mathbb{Z}^+$. (Inductive Hypothesis)
3. From the assumption in 2, prove the statement must be true for $n = k + 1$.
4. Since the statement is true for $n = 1$ (from 1), it must (from 3) be true for $n = 1 + 1 = 2$, and from this for $n = 2 + 1 = 3$, and so on, so it must be true for all positive integers. (Conclusion)

Example: For any positive integer $n$,
\[ 1 + 2 + \cdots + n = \frac{n(n + 1)}{2}. \]

Solution:
1. Prove for $n = 1$: $1 = \dfrac{1(1 + 1)}{2} = \dfrac{2}{2} = 1$, which is clearly true.
2. Assume that the statement holds for $n = k$, that is, $1 + 2 + \cdots + k = \dfrac{k(k + 1)}{2}$.
3. Prove for $n = k + 1$:
\[ 1 + 2 + \cdots + k + (k + 1) = \frac{k(k + 1)}{2} + (k + 1) \quad \text{(by inductive hypothesis)} \]
\[ = \frac{k(k + 1) + 2(k + 1)}{2} = \frac{k^2 + 3k + 2}{2} = \frac{(k + 1)(k + 2)}{2}, \]
so the statement holds for $n = k + 1$.
4. Hence by induction, $1 + 2 + \cdots + n = \dfrac{n(n + 1)}{2}$ is true for any positive integer $n$.

Example: Prove that for any natural number $n$, $1 + 3 + 5 + \cdots + (2n - 1) = n^2$.

Solution:
1. Prove for $n = 1$: $1 = 1^2 = 1$, so it is true.
2. Assume that the statement holds for $n = k$, that is, $1 + 3 + 5 + \cdots + (2k - 1) = k^2$.
3. Prove for $n = k + 1$. We have
\[ 1 + 3 + 5 + \cdots + (2k - 1) + \bigl(2(k + 1) - 1\bigr) = k^2 + 2k + 1 \quad \text{(by inductive hypothesis)} \]
\[ = (k + 1)^2. \]
So it is true for $n = k + 1$.
4. Hence by induction $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ is true for all natural numbers $n$.

Example: Prove that $3^n > 2^n$ for all natural numbers $n$.

Solution:
1. Prove for $n = 1$: $3^1 = 3 > 2^1 = 2$, which is true.
2. Assume the statement holds for $n = k$, that is, $3^k > 2^k$.
3. Prove for $n = k + 1$:
\[ 3^{k+1} = 3^k \cdot 3 > 2^k \cdot 3 \quad \text{(by inductive hypothesis)} \]
\[ > 2^k \cdot 2 \quad \text{(since } 3 > 2\text{)} \]
\[ = 2^{k+1}, \]
so the statement holds for $n = k + 1$.
4. Hence, by induction $3^n > 2^n$ for all natural numbers $n$.

Example: Prove that for any integer $n \geq 1$, $2^{2n} - 1$ is divisible by 3.

Solution:
1. Prove for $n = 1$: $2^2 - 1 = 3$, which is divisible by 3, hence it is true.
2. Assume that the statement holds for $n = k$, that is, for $k \geq 1$, $2^{2k} - 1$ is divisible by 3, i.e., $2^{2k} - 1 = 3l$ for some $l \in \mathbb{Z}$.
3. Prove for $n = k + 1$. Since $2^{2k} = 3l + 1$ by the inductive hypothesis,
\[ 2^{2(k+1)} - 1 = 4 \cdot 2^{2k} - 1 = 4(3l + 1) - 1 = 12l + 4 - 1 = 12l + 3 = 3(4l + 1), \]
which is divisible by 3.
4. Hence, by induction $2^{2n} - 1$ is divisible by 3 for all $n \geq 1$.

1.6 Tutorial 1

1. Express the following recurring decimals in the form $p/q$ where $p$ and $q$ are integers.
   (i) $2.173\dot{7}\dot{3}$  (ii) $0.\dot{3}\dot{2}\dot{4}$.
2. Express $0.mnmnmnmn\dots = 0.\dot{m}\dot{n}$, where $m$ and $n$ are distinct digits, in the form $p/q$ where $p$ and $q$ are integers.
3. State, giving a reason, whether each of the following numbers is rational or irrational.
   (i) $0.20200200020\dots$  (ii) $537.137137137\dots$.
4. Does the decimal $0.1234567891011121314151617181920\dots$, whose digits are the natural numbers strung end-to-end, represent a rational or an irrational number? Give a reason for your answer.
5. Show that if $0 < a < b$ then $a^2 < b^2$. If $a^2 < b^2$, is it necessarily true that $a < b$? Give an example to illustrate your answer.
6. If $a \geq 0$ and $b \geq 0$, prove that $\frac{1}{2}(a + b) \geq \sqrt{ab}$.
7. Solve the following inequalities.
   (i) $x^2 + x - 2 > 0$  (ii) $\dfrac{2x + 3}{x - 5} > 3$  (iii) $2|x| > 3x - 10$  (iv) $|x + 1| \geq 3$.
8. Prove that $|ab| = |a||b|$ for all $a, b \in \mathbb{R}$.
9. Show that $|a + b| \leq |a| + |b|$ for all $a, b \in \mathbb{R}$.
10. Prove that $|x|^2 = x^2$ for any real number $x$.
11. If $x$ and $y$ are real numbers, prove that $|x| - |y| \leq |x - y|$.
12. Prove the following by induction.
   (a) $n! > 2^n$ for all $n \geq 4$.
   (c) $n^2 \leq 2^n$ for all $n \geq 5$.
   (d) The sum of terms in a geometric series is $\displaystyle\sum_{i=0}^{n} r^i = \frac{r^{n+1} - 1}{r - 1}$, if $r \neq 0$, $r \neq 1$, $n \in \mathbb{N}$.
   (g) $3^n > n^2$ for $n > 2$.
   (h) $7^{2n} - 48n - 1$ is divisible by 2304.
   (k) $|\sin nx| \leq n|\sin x|$ for all $x \in \mathbb{R}$, $n \in \mathbb{Z}^+$.
   (l) $1^3 + 2^3 + 3^3 + \cdots + n^3 = \dfrac{n^2(n + 1)^2}{4}$.

Chapter 2 Sequences

Definition 2.0.1. A sequence is a set of numbers $u_1, u_2, u_3, \dots$ in a definite order of arrangement and formed according to a definite rule. Each number in the sequence is called a term and $u_n$ is called the $n$th term.

The sequence $u_1, u_2, u_3, \dots$ is written briefly as $\{u_n\}$, e.g., $u_n = 2n$, where $u_1 = 2$, $u_2 = 4$, $u_3 = 6$ and so on. The sequence is called finite or infinite according as there are or are not a finite number of terms.
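As a small illustration of the definition, a sequence can be modelled as a rule that maps each index $n$ to the term $u_n$. The sketch below is an assumption-laden illustration rather than part of the notes: the helper names `sequence` and `first_terms` are hypothetical. It represents an infinite sequence lazily and lists its first few terms.

```python
from itertools import count, islice

def sequence(rule):
    """Lazily yield u_1, u_2, u_3, ... for the rule n -> u_n (an infinite sequence)."""
    return (rule(n) for n in count(1))

def first_terms(rule, how_many):
    """Return the finite list u_1, ..., u_how_many."""
    return list(islice(sequence(rule), how_many))

# The sequence u_n = 2n from the text.
print(first_terms(lambda n: 2 * n, 5))  # [2, 4, 6, 8, 10]
```

The generator never ends on its own, which mirrors an infinite sequence; `islice` cuts it down to a finite one.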
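Recursion Formula or Recurrence Relations

So far we have seen that a sequence $\{u_n\}$ may be defined by giving a formula for $u_n$ in terms of $n$. For example,
\[ u_n = \frac{2n^2 - 5n + 4}{\sqrt{n^2 + 1}}. \]
We can also define sequences by giving a relation or formula that connects successive terms of a sequence and specifying the value or values of the first term, or the first and second terms, etc. The formula or relation linking the terms is called a recursion formula or recurrence relation.

Example: Find the values of the first four terms of the sequence defined by
\[ u_{n+1} = \frac{2}{u_n}, \quad u_0 = 1, \quad n \in \mathbb{N}. \]

Solution: Starting from $u_0 = 1$,
\[ u_1 = u_{0+1} = \frac{2}{u_0} = \frac{2}{1} = 2, \]
\[ u_2 = u_{1+1} = \frac{2}{u_1} = \frac{2}{2} = 1, \]
\[ u_3 = u_{2+1} = \frac{2}{u_2} = \frac{2}{1} = 2. \]

You Try It: Define recursively $a_0 = a_1 = 1$, and $a_n = a_{n-1} + 2a_{n-2}$, $n \geq 2$. Find $a_6$ recursively.

2.1 Limits of Sequences

Let us consider the sequence $u_n = \frac{1}{n}$. The sequence has the terms $1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \dots$. We see that the terms of the sequence tend to or approach $0$.

Definition 2.1.1. A number $L$ is called the limit of an infinite sequence $a_1, a_2, a_3, \dots$ or $\{a_n\}$, if for any positive number $\varepsilon$, we can find a positive number $N$ depending on $\varepsilon$ such that $|a_n - L| < \varepsilon$ for all integers $n > N$. We write
\[ \lim_{n \to \infty} a_n = L. \]

If $\{a_n\}$ is a convergent sequence, it means that the terms $a_n$ can be made arbitrarily close to $L$ for $n$ sufficiently large.

Example: If $u_n = 3 + \dfrac{1}{n} = \dfrac{3n + 1}{n}$, the sequence is $4, \frac{7}{2}, \frac{10}{3}, \dots$ and we can show that $\displaystyle\lim_{n \to \infty} u_n = 3$.

If the limit of a sequence exists, the sequence is called convergent, otherwise, it is called divergent.

Example: Prove that $\displaystyle\lim_{n \to \infty} \frac{1}{n} = 0$.

Proof: Let $\varepsilon > 0$. We must find $N(\varepsilon)$ such that
\[ \left|\frac{1}{n} - 0\right| = \frac{1}{n} < \varepsilon \quad \text{for all } n > N. \]
The inequality $\frac{1}{n} < \varepsilon$ holds when $n > \frac{1}{\varepsilon}$. So taking $N$ to be the smallest integer greater than $\frac{1}{\varepsilon}$, we have $\displaystyle\lim_{n \to \infty} \frac{1}{n} = 0$.

You Try It: Prove that $\displaystyle\lim_{n \to \infty} \frac{1}{n^p} = 0$ if $p \in \mathbb{N}$.

Example: Use the definition of a limit to prove that $\displaystyle\lim_{n \to \infty} \frac{2n - 1}{3n + 2} = \frac{2}{3}$.

Proof: Let $\varepsilon > 0$. We must find $N(\varepsilon)$ such that, for all $n > N$,
\[ \left|\frac{2n - 1}{3n + 2} - \frac{2}{3}\right| = \left|\frac{3(2n - 1) - 2(3n + 2)}{3(3n + 2)}\right| = \left|\frac{6n - 3 - 6n - 4}{3(3n + 2)}\right| = \left|\frac{-7}{3(3n + 2)}\right| = \frac{7}{3(3n + 2)} < \varepsilon. \]
Now
\[ \frac{7}{3(3n + 2)} < \varepsilon \iff 9n + 6 > \frac{7}{\varepsilon} \iff n > \frac{7 - 6\varepsilon}{9\varepsilon}. \]
So taking $N$ to be the smallest integer greater than $\dfrac{7 - 6\varepsilon}{9\varepsilon}$, we have
\[ \left|\frac{2n - 1}{3n + 2} - \frac{2}{3}\right| < \varepsilon \quad \text{for all } n > N, \quad \text{i.e.,} \quad \lim_{n \to \infty} \frac{2n - 1}{3n + 2} = \frac{2}{3}. \]

2.2 Theorems on Limits

If $\displaystyle\lim_{n \to \infty} a_n = A$ and $\displaystyle\lim_{n \to \infty} b_n = B$, then
1. $\displaystyle\lim_{n \to \infty} (a_n + b_n) = \lim_{n \to \infty} a_n + \lim_{n \to \infty} b_n = A + B$.
2. $\displaystyle\lim_{n \to \infty} (a_n - b_n) = \lim_{n \to \infty} a_n - \lim_{n \to \infty} b_n = A - B$.
3. $\displaystyle\lim_{n \to \infty} (a_n \cdot b_n) = \Bigl(\lim_{n \to \infty} a_n\Bigr)\Bigl(\lim_{n \to \infty} b_n\Bigr) = AB$.
4. $\displaystyle\lim_{n \to \infty} \frac{a_n}{b_n} = \frac{\lim_{n \to \infty} a_n}{\lim_{n \to \infty} b_n} = \frac{A}{B}$ if $\displaystyle\lim_{n \to \infty} b_n = B \neq 0$.
5. The limit of a convergent sequence $\{u_n\}$ of real numbers is unique.

Proof (of 5): We must show that if $\displaystyle\lim_{n \to \infty} u_n = l_1$ and $\displaystyle\lim_{n \to \infty} u_n = l_2$, then $l_1 = l_2$. By hypothesis, given any $\varepsilon > 0$, we can find $N$ such that $|u_n - l_1| < \frac{\varepsilon}{2}$ when $n > N$ and $|u_n - l_2| < \frac{\varepsilon}{2}$ when $n > N$. Then
\[ |l_1 - l_2| = |l_1 - u_n + u_n - l_2| \leq |l_1 - u_n| + |u_n - l_2| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
i.e., $|l_1 - l_2|$ is less than any positive $\varepsilon$ (however small) and so must be zero, i.e., $l_1 - l_2 = 0 \implies l_1 = l_2$.

Example: If $\displaystyle\lim_{n \to \infty} a_n = A$ and $\displaystyle\lim_{n \to \infty} b_n = B$, prove that $\displaystyle\lim_{n \to \infty} (a_n + b_n) = A + B$.

Proof: We must show that for any $\varepsilon > 0$, we can find $N > 0$ such that $|(a_n + b_n) - (A + B)| < \varepsilon$ for all $n > N$. We have
\[ |(a_n + b_n) - (A + B)| = |(a_n - A) + (b_n - B)| \leq |a_n - A| + |b_n - B|. \]
By hypothesis, given $\varepsilon > 0$ we can find $N_1$ and $N_2$ such that $|a_n - A| < \frac{\varepsilon}{2}$ for all $n > N_1$ and $|b_n - B| < \frac{\varepsilon}{2}$ for all $n > N_2$. Then
\[ |(a_n + b_n) - (A + B)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]
for all $n > N$, where $N = \max(N_1, N_2)$. Hence $\displaystyle\lim_{n \to \infty} (a_n + b_n) = A + B$.

2.3 Sequences Tending to Infinity

We say that $n$ tends to infinity, written $n \to \infty$, when $n$ grows or increases beyond any limit. Infinity is not a number, and the sequences that tend to infinity are not convergent.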
We write $\displaystyle\lim_{n \to \infty} a_n = \infty$ if for each positive number $M$, we can find a positive number $N$ (depending on $M$) such that $a_n > M$ for all $n > N$.

Similarly, we write $\displaystyle\lim_{n \to \infty} a_n = -\infty$ if for each positive number $M$, we can find a positive number $N$ such that $a_n < -M$ for all $n > N$.

Example: Prove that
(a) $\displaystyle\lim_{n \to \infty} 3^{2n-1} = \infty$  (b) $\displaystyle\lim_{n \to \infty} (1 - 2n) = -\infty$.

Proof:
(a) Let $M$ be any positive number. We need a positive number $N$ such that $a_n > M$ for all $n > N$. Now $3^{2n-1} > M$ when $(2n - 1)\ln 3 > \ln M$, i.e., $n > \dfrac{1}{2}\left(\dfrac{\ln M}{\ln 3} + 1\right)$. Taking $N$ to be the smallest integer greater than $\dfrac{1}{2}\left(\dfrac{\ln M}{\ln 3} + 1\right)$, we have $\displaystyle\lim_{n \to \infty} 3^{2n-1} = \infty$.
(b) Let $M$ be any positive number. We need a positive number $N$ such that $a_n < -M$ for all $n > N$. Now $1 - 2n < -M$ when $2n - 1 > M$, or $n > \frac{1}{2}(M + 1)$. Taking $N$ to be the smallest integer greater than $\frac{1}{2}(M + 1)$, we have $\displaystyle\lim_{n \to \infty} (1 - 2n) = -\infty$.

2.4 Bounded and Monotonic Sequences

A sequence that tends to a limit $l$ is said to be convergent, and the sequence converges to $l$. A sequence may tend to $+\infty$ or $-\infty$; it is then said to be divergent, and it diverges to $+\infty$ or $-\infty$.

If $u_n \leq M$ for $n = 1, 2, 3, \dots$, where $M$ is a constant, we say that the sequence $\{u_n\}$ is bounded above and $M$ is called an upper bound. The smallest upper bound is called the least upper bound (l.u.b). If $u_n \geq m$, the sequence is bounded below and $m$ is called a lower bound. The largest lower bound is called the greatest lower bound (g.l.b).

If $m \leq u_n \leq M$, the sequence is called bounded, indicated by $|u_n| \leq P$. (Every convergent sequence is bounded, but the converse is not necessarily true.)

If $u_{n+1} \geq u_n$, the sequence is called monotonic increasing, and if $u_{n+1} > u_n$ it is called strictly increasing. If $u_{n+1} \leq u_n$, the sequence is called monotonic decreasing, while if $u_{n+1} < u_n$ it is strictly decreasing.

Examples:
1. The sequence $1, 1.1, 1.11, 1.111, \dots$ is bounded and monotonic increasing.
2. The sequence $1, -1, 1, -1, 1, \dots$ is bounded but not monotonic increasing or decreasing.

Definition 2.4.1. A null sequence is a sequence that converges to $0$, e.g., $u_n = \dfrac{1}{n - 10}$, $n \geq 11$.

If $\{u_n\}$ does not tend to a limit or to $+\infty$ or $-\infty$, we say that $\{u_n\}$ oscillates (or is an oscillating sequence). It can oscillate finitely (bounded) or infinitely (unbounded).

Examples: $u_n = (-1)^n$, $u_n = (-1)^n n$.

2.5 Limits of Combination of Sequences

We want to be able to evaluate limits, for example, of the form
\[ \lim_{n \to \infty} \left(2 - \frac{1}{n} + \frac{3}{n^2}\right) \quad \text{or} \quad \lim_{n \to \infty} \frac{5 - 2n^2}{4 + 3n + 2n^2}. \]

Example:
\[ \lim_{n \to \infty} \left(2 - \frac{1}{n} + \frac{3}{n^2}\right) = \lim_{n \to \infty} 2 - \lim_{n \to \infty} \frac{1}{n} + 3 \lim_{n \to \infty} \frac{1}{n^2} = 2 - 0 + 0 = 2. \]

Example:
\[ \lim_{n \to \infty} \frac{3n^2 - 5n}{5n^2 + 2n - 6} = \lim_{n \to \infty} \frac{3 - \frac{5}{n}}{5 + \frac{2}{n} - \frac{6}{n^2}} = \frac{3 - 0}{5 + 0 - 0} = \frac{3}{5}. \]

Example:
\[ \lim_{n \to \infty} \bigl(\sqrt{n + 1} - \sqrt{n}\bigr) = \lim_{n \to \infty} \bigl(\sqrt{n + 1} - \sqrt{n}\bigr) \cdot \frac{\sqrt{n + 1} + \sqrt{n}}{\sqrt{n + 1} + \sqrt{n}} = \lim_{n \to \infty} \frac{1}{\sqrt{n + 1} + \sqrt{n}} = 0. \]

2.6 Squeeze Theorem

If $\displaystyle\lim_{n \to \infty} a_n = l = \lim_{n \to \infty} b_n$ and there exists an $N$ such that $a_n \leq c_n \leq b_n$ for all $n > N$, then $\displaystyle\lim_{n \to \infty} c_n = l$.

Example: Find $\displaystyle\lim_{n \to \infty} \frac{\cos n}{n}$.

Solution: We know that
\[ -1 \leq \cos n \leq 1 \implies -\frac{1}{n} \leq \frac{\cos n}{n} \leq \frac{1}{n} \quad \text{for all } n \geq 1. \]
Since $\displaystyle\lim_{n \to \infty} \left(-\frac{1}{n}\right) = 0 = \lim_{n \to \infty} \frac{1}{n}$, the Squeeze Theorem gives
\[ \lim_{n \to \infty} \frac{\cos n}{n} = 0. \]

Chapter 3 Infinite Series

One important application of infinite sequences is in representing infinite summations. If $\{a_n\}$ is an infinite sequence, then
\[ \sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + \cdots \]
is called an infinite series (or simply a series). The numbers $a_1, a_2, a_3, \dots$ are called the terms of the series. To find the sum of an infinite series, consider the following sequence of partial sums:
\[ S_1 = a_1 \]
\[ S_2 = a_1 + a_2 \]
\[ S_3 = a_1 + a_2 + a_3 \]
\[ \vdots \]
\[ S_n = a_1 + a_2 + a_3 + \cdots + a_n. \]
If this sequence of partial sums converges, then the series is said to converge and has the sum indicated in the following definition.

3.1 Definition of Convergent and Divergent Series

For the infinite series $\sum a_n$, the $n$th partial sum is given by
\[ S_n = a_1 + a_2 + a_3 + \cdots + a_n. \]
If the sequence of partial sums $\{S_n\}$ converges to $S$, then the series $\sum a_n$ converges. The limit $S$ is called the sum of the series.
If $\{S_n\}$ diverges, then the series diverges.

Example 3.1.1. The series
\[ \sum_{n=1}^{\infty} \frac{1}{2^n} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots \]
has the following partial sums:
\[ S_1 = \frac{1}{2} \]
\[ S_2 = \frac{1}{2} + \frac{1}{4} = \frac{3}{4} \]
\[ S_3 = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} = \frac{7}{8} \]
\[ \vdots \]
\[ S_n = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{2^n} = \frac{2^n - 1}{2^n}. \]
Because $\displaystyle\lim_{n \to \infty} \frac{2^n - 1}{2^n} = 1$, it follows that the series converges and its sum is $1$.

Example 3.1.2. The $n$th partial sum of the series
\[ \sum_{n=1}^{\infty} \left(\frac{1}{n} - \frac{1}{n + 1}\right) = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \cdots \]
is given by $S_n = 1 - \dfrac{1}{n + 1}$. Because the limit of $S_n$ is $1$, the series converges and its sum is $1$.

Example 3.1.3. The series $\displaystyle\sum_{n=1}^{\infty} 1 = 1 + 1 + 1 + \cdots$ diverges, because $S_n = n$ and the sequence of partial sums diverges.

The series in Example 3.1.2 is a telescoping series. That is, it is of the form
\[ (b_1 - b_2) + (b_2 - b_3) + (b_3 - b_4) + (b_4 - b_5) + \cdots \]
Note that $b_2$ is canceled by the second term, $b_3$ is canceled by the third term, and so on. Because the $n$th partial sum of the series is $S_n = b_1 - b_{n+1}$, it follows that a telescoping series converges if and only if $b_n$ approaches a finite number as $n \to \infty$. Moreover, if the series converges, then its sum is
\[ S = b_1 - \lim_{n \to \infty} b_{n+1}. \]

Example 3.1.4. Find the sum of the series $\displaystyle\sum_{n=1}^{\infty} \frac{2}{4n^2 - 1}$.

Solution: Using partial fractions, we can write
\[ a_n = \frac{2}{4n^2 - 1} = \frac{2}{(2n - 1)(2n + 1)} = \frac{1}{2n - 1} - \frac{1}{2n + 1}. \]
From the telescoping form, we can see that the $n$th partial sum is
\[ S_n = \left(\frac{1}{1} - \frac{1}{3}\right) + \left(\frac{1}{3} - \frac{1}{5}\right) + \cdots + \left(\frac{1}{2n - 1} - \frac{1}{2n + 1}\right) = 1 - \frac{1}{2n + 1}. \]
Thus, the series converges and its sum is $1$. That is,
\[ \sum_{n=1}^{\infty} \frac{2}{4n^2 - 1} = \lim_{n \to \infty} S_n = \lim_{n \to \infty} \left(1 - \frac{1}{2n + 1}\right) = 1. \]

3.2 Geometric Series

A geometric series with ratio $r$ is given by
\[ \sum_{n=0}^{\infty} a r^n = a + ar + ar^2 + \cdots + ar^n + \cdots, \quad a \neq 0. \]

Theorem 3.2.1. A geometric series with ratio $r$ diverges if $|r| \geq 1$. If $0 < |r| < 1$, then the series converges to the sum
\[ \sum_{n=0}^{\infty} a r^n = \frac{a}{1 - r}, \quad 0 < |r| < 1. \]

Example 3.2.1. The geometric series
\[ \sum_{n=0}^{\infty} \frac{3}{2^n} = \sum_{n=0}^{\infty} 3\left(\frac{1}{2}\right)^n = 3(1) + 3\left(\frac{1}{2}\right) + 3\left(\frac{1}{2}\right)^2 + \cdots \]
has a ratio of $r = \frac{1}{2}$ with $a = 3$. Because $0 < |r| < 1$, the series converges and its sum is
\[ S = \frac{a}{1 - r} = \frac{3}{1 - \frac{1}{2}} = 6. \]

3.2.1 Properties of Infinite Series

If $\sum a_n = A$ and $\sum b_n = B$ and $c$ is a real number, then the following series converge to the indicated sums.
(i) $\sum c a_n = cA$  (ii) $\sum (a_n \pm b_n) = \sum a_n \pm \sum b_n = A \pm B$.

n-th Term Test for Divergence

Limit of the $n$th Term of a Convergent Series: If the series $\sum a_n$ converges, then the sequence $\{a_n\}$ converges to $0$. Equivalently, if the sequence $\{a_n\}$ does not converge to $0$, then the series $\sum a_n$ diverges.

3.3 Test for Convergence or Divergence of Series

In this and the following section, we will study several convergence tests that apply to series with positive terms.

3.3.1 The Integral Test

If $f$ is positive, continuous, and decreasing for $x \geq 1$ and $a_n = f(n)$, then
\[ \sum_{n=1}^{\infty} a_n \quad \text{and} \quad \int_1^{\infty} f(x)\,dx \]
either both converge or both diverge.

Example 3.3.1. Apply the integral test to the series $\displaystyle\sum_{n=1}^{\infty} \frac{n}{n^2 + 1}$.

Solution: Because $f(x) = \dfrac{x}{x^2 + 1}$ satisfies the conditions for the integral test (check this), we can integrate to obtain
\[ \int_1^{\infty} \frac{x}{x^2 + 1}\,dx = \frac{1}{2} \int_1^{\infty} \frac{2x}{x^2 + 1}\,dx = \frac{1}{2} \lim_{b \to \infty} \int_1^{b} \frac{2x}{x^2 + 1}\,dx \]
\[ = \frac{1}{2} \lim_{b \to \infty} \Bigl[\ln(x^2 + 1)\Bigr]_1^b = \frac{1}{2} \lim_{b \to \infty} \bigl[\ln(b^2 + 1) - \ln 2\bigr] = \infty. \]
Thus, the series diverges.

Example 3.3.2. Apply the integral test to the series $\displaystyle\sum_{n=1}^{\infty} \frac{1}{n^2 + 1}$.

Solution: Because $f(x) = \dfrac{1}{x^2 + 1}$ satisfies the conditions for the integral test, we can integrate to obtain
\[ \int_1^{\infty} \frac{dx}{x^2 + 1} = \lim_{b \to \infty} \int_1^{b} \frac{dx}{x^2 + 1} = \lim_{b \to \infty} \Bigl[\tan^{-1} x\Bigr]_1^b = \lim_{b \to \infty} \bigl(\tan^{-1} b - \tan^{-1} 1\bigr) = \frac{\pi}{2} - \frac{\pi}{4} = \frac{\pi}{4}. \]
Thus, the series converges.

3.3.2 p-Series and Harmonic Series

A series of the form
\[ \sum_{n=1}^{\infty} \frac{1}{n^p} = \frac{1}{1^p} + \frac{1}{2^p} + \frac{1}{3^p} + \cdots \]
is a $p$-series, where $p$ is a positive constant. For $p = 1$, the series
\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots \]
is the harmonic series.

Theorem 3.3.1. The $p$-series
\[ \sum_{n=1}^{\infty} \frac{1}{n^p} = \frac{1}{1^p} + \frac{1}{2^p} + \frac{1}{3^p} + \cdots \]
(i) converges if $p > 1$ and (ii) diverges if $0 < p \leq 1$.
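The contrast in Theorem 3.3.1 can be observed numerically by computing partial sums $S_n$ for a convergent case ($p = 2$) and for the harmonic series ($p = 1$). This is only a sketch for building intuition, with a hypothetical helper name `p_series_partial_sum`; floating-point partial sums suggest the behaviour but do not prove convergence or divergence.

```python
def p_series_partial_sum(p: float, n_terms: int) -> float:
    """Return S_n = 1/1^p + 1/2^p + ... + 1/n^p for n = n_terms."""
    return sum(1.0 / k**p for k in range(1, n_terms + 1))

# Compare how the partial sums behave as more terms are added.
for n in (10, 1_000, 100_000):
    print(n, p_series_partial_sum(2, n), p_series_partial_sum(1, n))
```

With $p = 2$ the partial sums level off (near 1.645), while with $p = 1$ they keep growing slowly without settling, roughly like $\ln n$.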
Example 3.3.3. From the Theorem it follows that the harmonic series
\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots \]
diverges.

3.4 Comparisons of Series

3.4.1 Direct Comparison Test

This is a test for positive-term series. It allows you to compare a series having complicated terms with a simpler series whose convergence or divergence is known.

Direct Comparison Test

Theorem 3.4.1. Let $0 \leq a_n \leq b_n$ for all $n$.
1. If $\sum_{n=1}^{\infty} b_n$ converges, then $\sum_{n=1}^{\infty} a_n$ converges.
2. If $\sum_{n=1}^{\infty} a_n$ diverges, then $\sum_{n=1}^{\infty} b_n$ diverges.

Example 3.4.1. Determine the convergence or divergence of $\sum_{n=1}^{\infty} \frac{1}{2 + 3^n}$.

Solution: This series resembles $\sum_{n=1}^{\infty} \frac{1}{3^n}$ (convergent geometric series). Term-by-term comparison yields
\[ a_n = \frac{1}{2 + 3^n} < \frac{1}{3^n} = b_n, \quad n \geq 1. \]
Thus, by the Direct Comparison Test, the series converges.

Example 3.4.2. Determine the convergence or divergence of $\sum_{n=1}^{\infty} \frac{1}{2 + \sqrt{n}}$.

Solution: The series resembles $\sum_{n=1}^{\infty} \frac{1}{n^{1/2}}$ (divergent p-series). Term-by-term comparison yields
\[ \frac{1}{2 + \sqrt{n}} \leq \frac{1}{\sqrt{n}}, \quad n \geq 1, \]
which does not meet the requirements for divergence. Still expecting the series to diverge, we can compare the given series with $\sum_{n=1}^{\infty} \frac{1}{n}$ (divergent harmonic series). In this case, term-by-term comparison yields
\[ a_n = \frac{1}{n} \leq \frac{1}{2 + \sqrt{n}} = b_n, \quad n \geq 4, \]
and, by the Direct Comparison Test, the given series diverges.

3.4.2 Limit Comparison Test or Quotient Test

Often a given series closely resembles a p-series or a geometric series, yet we cannot establish the term-by-term comparison necessary to apply the Direct Comparison Test. We can apply a second comparison test, called the Limit Comparison Test.

Limit Comparison Test

Suppose that $a_n > 0$ and $b_n > 0$ and $\lim_{n\to\infty} \frac{a_n}{b_n} = L$, where $L$ is finite and positive. Then the two series $\sum a_n$ and $\sum b_n$ either both converge or both diverge. If $L = 0$ and $\sum b_n$ converges, then $\sum a_n$ converges. If $L = \infty$ and $\sum b_n$ diverges, then $\sum a_n$ diverges.

Example 3.4.3. Show that the following general harmonic series diverges.
\[ \sum_{n=1}^{\infty} \frac{1}{an + b}, \quad a > 0,\ b > 0. \]

Solution: By comparison with $\sum_{n=1}^{\infty} \frac{1}{n}$ we have
\[ \lim_{n\to\infty} \frac{1/(an + b)}{1/n} = \lim_{n\to\infty} \frac{n}{an + b} = \frac{1}{a}. \]
Because this limit is finite and greater than 0, we can conclude from the Limit Comparison Test that the given series diverges.

The Limit Comparison Test works well for comparing a messy algebraic series with a p-series. In choosing an appropriate p-series, we must choose one with an $n$th term of the same magnitude as the $n$th term of the given series.
\[
\begin{array}{lll}
\text{Given series} & \text{Comparison series} & \text{Conclusion} \\[4pt]
\displaystyle\sum_{n=1}^{\infty} \frac{1}{3n^2 - 4n + 5} & \displaystyle\sum_{n=1}^{\infty} \frac{1}{n^2} & \text{Both series converge.} \\[8pt]
\displaystyle\sum_{n=1}^{\infty} \frac{1}{\sqrt{3n - 2}} & \displaystyle\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}} & \text{Both series diverge.} \\[8pt]
\displaystyle\sum_{n=1}^{\infty} \frac{n^2 - 10}{4n^5 + n^3} & \displaystyle\sum_{n=1}^{\infty} \frac{n^2}{n^5} = \sum_{n=1}^{\infty} \frac{1}{n^3} & \text{Both series converge.}
\end{array}
\]

3.5 Alternating Series

So far, most series we have dealt with have had positive terms. In this section, we will study series that contain both positive and negative terms. The simplest such series is an alternating series, whose terms alternate in sign. For example, the geometric series
\[ \sum_{n=0}^{\infty} \left(-\frac{1}{2}\right)^n = \sum_{n=0}^{\infty} (-1)^n \frac{1}{2^n} = 1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} + \frac{1}{16} - \cdots \]
is an alternating geometric series with $r = -\frac{1}{2}$. Alternating series occur in two ways: either the odd terms are negative or the even terms are negative.

Alternating Series Test

Let $a_n > 0$. The alternating series
\[ \sum_{n=1}^{\infty} (-1)^n a_n \quad \text{and} \quad \sum_{n=1}^{\infty} (-1)^{n+1} a_n \]
converge if the following two conditions are met.
1. $a_{n+1} \leq a_n$ for all $n$.
2. $\lim_{n\to\infty} a_n = 0$.

Example 3.5.1. Determine the convergence or divergence of $\sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n}$.

Solution: Because $\frac{1}{n+1} \leq \frac{1}{n}$ for all $n$ and the limit (as $n \to \infty$) of $\frac{1}{n}$ is 0, we can apply the Alternating Series Test to conclude that the series converges. (This series is called the alternating harmonic series.)

Example 3.5.2. Determine the convergence or divergence of $\sum_{n=1}^{\infty} \frac{n}{(-2)^{n-1}}$.

Solution: To apply the Alternating Series Test, note that, for $n \geq 1$,
\begin{align*}
\frac{1}{2} &\leq \frac{n}{n+1} \\
\frac{2^{n-1}}{2^n} &\leq \frac{n}{n+1} \\
(n+1)\,2^{n-1} &\leq n\,2^n \\
\frac{n+1}{2^n} &\leq \frac{n}{2^{n-1}}.
\end{align*}
Hence, $a_{n+1} = \frac{n+1}{2^n} \leq \frac{n}{2^{n-1}} = a_n$ for all $n$.
Furthermore, by L'Hôpital's rule,
\[ \lim_{x\to\infty} \frac{x}{2^{x-1}} = \lim_{x\to\infty} \frac{1}{2^{x-1} (\ln 2)} = 0 \implies \lim_{n\to\infty} \frac{n}{2^{n-1}} = 0. \]
Therefore, by the Alternating Series Test, the given series converges.

Cases for which the Alternating Series Test Fails

Example 3.5.3. The alternating series
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n+1} (n+1)}{n} = \frac{2}{1} - \frac{3}{2} + \frac{4}{3} - \frac{5}{4} + \frac{6}{5} - \cdots \]
passes the first condition in the Alternating Series Test because $a_{n+1} \leq a_n$ for all $n$. We cannot apply the Alternating Series Test, because the series does not pass the second condition.

The alternating series
\[ \frac{2}{1} - \frac{1}{1} + \frac{2}{2} - \frac{1}{2} + \frac{2}{3} - \frac{1}{3} + \frac{2}{4} - \frac{1}{4} + \cdots \]
passes the second condition because $a_n$ approaches 0 as $n \to \infty$. We cannot apply the Alternating Series Test, however, because the series does not pass the first condition.

3.6 Absolute and Conditional Convergence

Occasionally, a series may have both positive and negative terms and not be an alternating series. For example, the series
\[ \sum_{n=1}^{\infty} \frac{\sin n}{n^2} = \frac{\sin 1}{1} + \frac{\sin 2}{4} + \frac{\sin 3}{9} + \cdots \]
has both positive and negative terms, yet it is not an alternating series. One way to obtain some information about the convergence of this series is to investigate the convergence of the series $\sum_{n=1}^{\infty} \left| \frac{\sin n}{n^2} \right|$. By direct comparison, we have $|\sin n| \leq 1$ for all $n$, so
\[ \left| \frac{\sin n}{n^2} \right| \leq \frac{1}{n^2}, \quad n \geq 1. \]
Thus, by the Direct Comparison Test, the series $\sum_{n=1}^{\infty} \left| \frac{\sin n}{n^2} \right|$ converges. But the question still is "Does the original series converge?"

Theorem 3.6.1 (Absolute Convergence). If the series $\sum |a_n|$ converges, then the series $\sum a_n$ also converges.

The converse of the Theorem is not true. For example, the alternating harmonic series
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
converges by the Alternating Series Test. Yet the harmonic series diverges. This type of convergence is called conditional.

Definition of Absolute and Conditional Convergence
1. $\sum a_n$ is absolutely convergent if $\sum |a_n|$ converges.
2. $\sum a_n$ is conditionally convergent if $\sum a_n$ converges but $\sum |a_n|$ diverges.
3.
An absolutely convergent series converges.

Example 3.6.1. Determine whether the following series are convergent or divergent. Classify any convergent series as absolutely or conditionally convergent.

(a) $\displaystyle \sum_{n=1}^{\infty} \frac{(-1)^{n(n+1)/2}}{3^n} = -\frac{1}{3} - \frac{1}{9} + \frac{1}{27} + \frac{1}{81} - \cdots$.

Solution: This is not an alternating series. However, because
\[ \sum_{n=1}^{\infty} \left| \frac{(-1)^{n(n+1)/2}}{3^n} \right| = \sum_{n=1}^{\infty} \frac{1}{3^n} \]
is a convergent geometric series, the given series is absolutely convergent, hence convergent.

(b) $\displaystyle \sum_{n=1}^{\infty} \frac{(-1)^n}{\ln(n+1)} = -\frac{1}{\ln 2} + \frac{1}{\ln 3} - \frac{1}{\ln 4} + \frac{1}{\ln 5} - \cdots$.

Solution: In this case, the Alternating Series Test indicates that the given series converges. However, the series
\[ \sum_{n=1}^{\infty} \left| \frac{(-1)^n}{\ln(n+1)} \right| = \frac{1}{\ln 2} + \frac{1}{\ln 3} + \frac{1}{\ln 4} + \cdots \]
diverges by direct comparison with terms of the harmonic series. Therefore, the given series is conditionally convergent.

(c) $\displaystyle \sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}} = -\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} - \frac{1}{\sqrt{3}} + \frac{1}{\sqrt{4}} - \cdots$.

Solution: The given series converges by the Alternating Series Test. Moreover, because the p-series
\[ \sum_{n=1}^{\infty} \left| \frac{(-1)^n}{\sqrt{n}} \right| = \frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \frac{1}{\sqrt{4}} + \cdots \]
diverges, the given series is conditionally convergent.

3.7 The Ratio and Root Tests

3.7.1 The Ratio Test

This is a test for absolute convergence.

Ratio Test
1. $\sum a_n$ converges absolutely if $\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| < 1$.
2. $\sum a_n$ diverges if $\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| > 1$ or $\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = \infty$.
3. The Ratio Test is inconclusive if $\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = 1$.

Although the Ratio Test is not a cure for all ills related to tests for convergence, it is particularly useful for series that converge rapidly. Series involving factorials or exponentials are frequently of this type.

Example 3.7.1. Determine the convergence or divergence of $\sum_{n=0}^{\infty} \frac{2^n}{n!}$.

Solution: Because $a_n = \frac{2^n}{n!}$, we can write the following.
\begin{align*}
\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| &= \lim_{n\to\infty} \left[ \frac{2^{n+1}}{(n+1)!} \div \frac{2^n}{n!} \right] \\
&= \lim_{n\to\infty} \left[ \frac{2^{n+1}}{(n+1)!} \cdot \frac{n!}{2^n} \right] \\
&= \lim_{n\to\infty} \frac{2}{n+1} \\
&= 0.
\end{align*}
Therefore, the series converges.

Example 3.7.2. Determine whether the following series converge or diverge.
(a) $\displaystyle \sum_{n=0}^{\infty} \frac{n^2\, 2^{n+1}}{3^n}$ (b) $\displaystyle \sum_{n=1}^{\infty} \frac{n^n}{n!}$.
Solution:
(a) This series converges because the limit of $\left| \frac{a_{n+1}}{a_n} \right|$ is less than 1.
\begin{align*}
\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| &= \lim_{n\to\infty} \left[ (n+1)^2 \left( \frac{2^{n+2}}{3^{n+1}} \right) \left( \frac{3^n}{n^2\, 2^{n+1}} \right) \right] \\
&= \lim_{n\to\infty} \frac{2(n+1)^2}{3n^2} \\
&= \frac{2}{3} < 1.
\end{align*}
(b) This series diverges because the limit of $\left| \frac{a_{n+1}}{a_n} \right|$ is greater than 1.
\begin{align*}
\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| &= \lim_{n\to\infty} \left[ \frac{(n+1)^{n+1}}{(n+1)!} \cdot \frac{n!}{n^n} \right] \\
&= \lim_{n\to\infty} \left[ \frac{(n+1)^{n+1}}{n+1} \cdot \frac{1}{n^n} \right] \\
&= \lim_{n\to\infty} \frac{(n+1)^n}{n^n} = \lim_{n\to\infty} \left( 1 + \frac{1}{n} \right)^n \\
&= e > 1.
\end{align*}

3.7.2 The Root Test

This test of convergence or divergence of series works especially well for series involving $n$th powers.

Root Test

Let $\sum a_n$ be a series with non-zero terms.
1. $\sum a_n$ converges absolutely if $\lim_{n\to\infty} \sqrt[n]{|a_n|} < 1$.
2. $\sum a_n$ diverges if $\lim_{n\to\infty} \sqrt[n]{|a_n|} > 1$ or $\lim_{n\to\infty} \sqrt[n]{|a_n|} = \infty$.
3. The Root Test is inconclusive if $\lim_{n\to\infty} \sqrt[n]{|a_n|} = 1$.

Example 3.7.3. Determine the convergence or divergence of $\sum_{n=1}^{\infty} \frac{e^{2n}}{n^n}$.

Solution: We can apply the Root Test as follows.
\begin{align*}
\lim_{n\to\infty} \sqrt[n]{|a_n|} &= \lim_{n\to\infty} \sqrt[n]{\frac{e^{2n}}{n^n}} \\
&= \lim_{n\to\infty} \frac{e^{2n/n}}{n^{n/n}} \\
&= \lim_{n\to\infty} \frac{e^2}{n} \\
&= 0 < 1.
\end{align*}
Because this limit is less than 1, we can conclude that the series converges absolutely.

3.8 Power Series

3.8.1 Definition of Power Series

If $x$ is a variable, then an infinite series of the form
\[ \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n + \cdots \]
is called a power series. More generally, a series of the form
\[ \sum_{n=0}^{\infty} a_n (x - c)^n = a_0 + a_1 (x - c) + a_2 (x - c)^2 + \cdots + a_n (x - c)^n + \cdots \]
is called a power series centered at $c$, where $c$ is a constant.

Example 3.8.1.
(a) The following power series is centered at 0.
\[ \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots \]
(b) The following power series is centered at $-1$.
\[ \sum_{n=0}^{\infty} (-1)^n (x + 1)^n = 1 - (x + 1) + (x + 1)^2 - (x + 1)^3 + \cdots \]
(c) The following power series is centered at 1.
\[ \sum_{n=1}^{\infty} \frac{1}{n} (x - 1)^n = (x - 1) + \frac{1}{2} (x - 1)^2 + \frac{1}{3} (x - 1)^3 + \cdots \]

3.8.2 Radius and Interval of Convergence

A power series in $x$ can be viewed as a function of $x$,
\[ f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n, \]
where the domain of $f$ is the set of all $x$ for which the power series converges.
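To make the idea of a power series as a function more concrete, the sketch below evaluates a truncated power series $\sum_{n=0}^{N-1} a_n (x - c)^n$ in Python. The function names and the default of 20 terms are assumptions made for illustration; the coefficients $a_n = 1/n!$ are those of Example 3.8.1(a), which is the standard series for $e^x$.

```python
import math

def power_series_partial_sum(coeff, x, c=0.0, n_terms=20):
    """Evaluate sum_{n=0}^{n_terms-1} coeff(n) * (x - c)**n.

    `coeff` is any function n -> a_n. This helper is hypothetical and
    only approximates f(x) where the series actually converges.
    """
    total = 0.0
    power = 1.0  # holds (x - c)**n, starting with n = 0
    for n in range(n_terms):
        total += coeff(n) * power
        power *= (x - c)
    return total

def inv_factorial(n):
    """Coefficient a_n = 1/n! from Example 3.8.1(a)."""
    return 1.0 / math.factorial(n)

approx_e = power_series_partial_sum(inv_factorial, 1.0)
```

Here `approx_e` should agree closely with `math.e`. For a value of $x$ outside the interval of convergence, the partial sums do not settle down no matter how many terms are taken, so a truncated sum of this kind is meaningful only inside the domain of $f$.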
Convergence of a Power Series

For a power series centered at $c$, precisely one of the following is true.
1. The series converges only at $c$.
2. There exists a real number $R > 0$ such that the series converges absolutely for $|x - c| < R$, and diverges for $|x - c| > R$.
3. The series converges absolutely for all $x$.

The number $R$ is the radius of convergence of the power series. If the series converges only at $c$, then the radius of convergence is $R = 0$, and if the series converges for all $x$, then the radius of convergence is $R = \infty$. The set of all values of $x$ for which the power series converges is the interval of convergence of the power series.

Example 3.8.2. Find the radius of convergence of $\sum_{n=0}^{\infty} n!\,x^n$.

Solution: For $x = 0$, we obtain
\[ f(0) = \sum_{n=0}^{\infty} n!\,0^n = 1 + 0 + 0 + \cdots = 1. \]
For any fixed value of $x$ such that $|x| > 0$, let $u_n = n!\,x^n$. Then
\begin{align*}
\lim_{n\to\infty} \left| \frac{u_{n+1}}{u_n} \right| &= \lim_{n\to\infty} \left| \frac{(n+1)!\,x^{n+1}}{n!\,x^n} \right| \\
&= |x| \lim_{n\to\infty} (n+1) \\
&= \infty.
\end{align*}
Therefore, by the Ratio Test, the series diverges for $|x| > 0$, and converges only at its center, 0. Hence, the radius of convergence is $R = 0$.

Example 3.8.3. Find the radius of convergence of $\sum_{n=0}^{\infty} 3(x - 2)^n$.

Solution: For $x \neq 2$, let $u_n = 3(x - 2)^n$. Then
\begin{align*}
\lim_{n\to\infty} \left| \frac{u_{n+1}}{u_n} \right| &= \lim_{n\to\infty} \left| \frac{3(x - 2)^{n+1}}{3(x - 2)^n} \right| \\
&= \lim_{n\to\infty} |x - 2| \\
&= |x - 2|.
\end{align*}
By the Ratio Test, the series converges if $|x - 2| < 1$ and diverges if $|x - 2| > 1$. Therefore, the radius of convergence of the series is $R = 1$.

Finding the Interval of Convergence

Example 3.8.4. Find the interval of convergence of $\sum_{n=1}^{\infty} \frac{x^n}{n}$.

Solution: Letting $u_n = \frac{x^n}{n}$ produces
\begin{align*}
\lim_{n\to\infty} \left| \frac{u_{n+1}}{u_n} \right| &= \lim_{n\to\infty} \left| \frac{x^{n+1}/(n+1)}{x^n/n} \right| \\
&= \lim_{n\to\infty} \left| \frac{nx}{n+1} \right| \\
&= |x|.
\end{align*}
Therefore, by the Ratio Test, the radius of convergence is $R = 1$. Moreover, because the series is centered at 0, it converges in the interval $(-1, 1)$. This interval, however, is not necessarily the interval of convergence. To determine this, we must test for convergence at each endpoint.
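When $x = 1$, we obtain the divergent harmonic series
\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots \]
When $x = -1$, we obtain the convergent alternating harmonic series
\[ \sum_{n=1}^{\infty} \frac{(-1)^n}{n} = -1 + \frac{1}{2} - \frac{1}{3} + \frac{1}{4} - \cdots \]
Therefore, the interval of convergence for the series is $[-1, 1)$.

Example 3.8.5. Find the interval of convergence of $\sum_{n=0}^{\infty} \frac{(-1)^n (x + 1)^n}{2^n}$.

Solution: Letting $u_n = \frac{(-1)^n (x + 1)^n}{2^n}$ produces
\begin{align*}
\lim_{n\to\infty} \left| \frac{u_{n+1}}{u_n} \right| &= \lim_{n\to\infty} \left| \frac{(-1)^{n+1} (x+1)^{n+1} / 2^{n+1}}{(-1)^n (x+1)^n / 2^n} \right| \\
&= \lim_{n\to\infty} \left| \frac{2^n (x + 1)}{2^{n+1}} \right| \\
&= \left| \frac{x + 1}{2} \right|.
\end{align*}
By the Ratio Test, the series converges if $\left| \frac{x+1}{2} \right| < 1$ or $|x + 1| < 2$. Hence, the radius of convergence is $R = 2$. Because the series is centered at $x = -1$, it will converge in the interval $(-3, 1)$. Furthermore, at the endpoints we have
\[ \sum_{n=0}^{\infty} \frac{(-1)^n (-2)^n}{2^n} = \sum_{n=0}^{\infty} \frac{2^n}{2^n} = \sum_{n=0}^{\infty} 1 \quad \text{(diverges when } x = -3\text{)} \]
and
\[ \sum_{n=0}^{\infty} \frac{(-1)^n (2)^n}{2^n} = \sum_{n=0}^{\infty} (-1)^n \quad \text{(diverges when } x = 1\text{)}, \]
both of which diverge. Thus, the interval of convergence is $(-3, 1)$.

Example 3.8.6. Find the interval of convergence of $\sum_{n=1}^{\infty} \frac{x^n}{n^2}$.

Solution: Letting $u_n = \frac{x^n}{n^2}$ produces
\[ \lim_{n\to\infty} \left| \frac{u_{n+1}}{u_n} \right| = \lim_{n\to\infty} \left| \frac{x^{n+1}/(n+1)^2}{x^n/n^2} \right| = \lim_{n\to\infty} \left| \frac{n^2 x}{(n+1)^2} \right| = |x|. \]
Thus, the radius of convergence is $R = 1$. Because the series is centered at $x = 0$, it converges in the interval $(-1, 1)$. When $x = 1$, we obtain the convergent p-series
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots \quad \text{(converges when } x = 1\text{)} \]
When $x = -1$, we obtain the convergent alternating series
\[ \sum_{n=1}^{\infty} \frac{(-1)^n}{n^2} = -\frac{1}{1^2} + \frac{1}{2^2} - \frac{1}{3^2} + \frac{1}{4^2} - \cdots \quad \text{(converges when } x = -1\text{)} \]
Therefore, the interval of convergence for the given series is $[-1, 1]$.

Example 3.8.7. Find the interval of convergence of $\sum_{n=1}^{\infty} n x^n$.

Solution: The series is a power series with $a_n = n$ and $c = 0$. Let $u_n = n x^n$, so $u_{n+1} = (n+1) x^{n+1}$. Then
\[ \left| \frac{u_{n+1}}{u_n} \right| = \frac{(n+1)\,|x|^{n+1}}{n\,|x|^n} = \frac{n+1}{n}\,|x| \to |x| \quad \text{as } n \to \infty. \]
The limit is less than one whenever $|x| < 1$. The Ratio Test then shows that $\sum_{n=1}^{\infty} u_n$ is convergent for $|x| < 1$, and the series diverges for $|x| > 1$. This means the radius of convergence is $R = 1$.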
We know that the series is convergent for $-1 < x < 1$. We need to check convergence at the endpoints of this interval. When $x = 1$, we have $\sum_{n=1}^{\infty} n$. The terms of this series do not approach zero as $n \to \infty$, so the series must diverge. Similarly, the series is divergent when $x = -1$. The interval of convergence is $(-1, 1)$.

Example 3.8.8. Find the interval of convergence of $\sum_{n=1}^{\infty} \frac{(-1)^n (x - 2)^n}{n\,4^n}$.

Solution: Let $u_n = \frac{|x - 2|^n}{n\,4^n}$, so $u_{n+1} = \frac{|x - 2|^{n+1}}{(n+1)\,4^{n+1}}$. Then
\[ \frac{u_{n+1}}{u_n} = \frac{|x - 2|^{n+1}}{(n+1)\,4^{n+1}} \cdot \frac{n\,4^n}{|x - 2|^n} = \frac{n\,|x - 2|}{(n+1)\,4} \to \frac{|x - 2|}{4} \quad \text{as } n \to \infty. \]
The Ratio Test gives convergence for $\frac{|x - 2|}{4} < 1$ and divergence for $\frac{|x - 2|}{4} > 1$. Solving the first inequality, we have
\[ |x - 2| < 4 \implies -4 < x - 2 < 4 \implies -2 < x < 6. \]
When $x = -2$, the series is
\[ \sum_{n=1}^{\infty} \frac{(-1)^n (-2 - 2)^n}{n\,4^n} = \sum_{n=1}^{\infty} \frac{1}{n}, \]
which is a divergent p-series. When $x = 6$, we have
\[ \sum_{n=1}^{\infty} \frac{(-1)^n (6 - 2)^n}{n\,4^n} = \sum_{n=1}^{\infty} \frac{(-1)^n}{n}. \]
The Alternating Series Test shows that the series is convergent. The interval of convergence is $-2 < x \leq 6$.

Example 3.8.9. Find the interval of convergence of the series $\sum_{n=1}^{\infty} \frac{x^n}{n\,3^n}$.

Solution: With $u_n = \frac{x^n}{n\,3^n}$, we find that
\[ \lim_{n\to\infty} \left| \frac{u_{n+1}}{u_n} \right| = \lim_{n\to\infty} \left| \frac{x^{n+1} / \big((n+1)\,3^{n+1}\big)}{x^n / (n\,3^n)} \right| = \lim_{n\to\infty} \frac{n\,|x|}{3(n+1)} = \frac{|x|}{3}. \]
Now $\frac{|x|}{3} < 1$ provided $|x| < 3$, so the Ratio Test implies that the given series converges absolutely when $|x| < 3$ and diverges if $|x| > 3$. When $x = 3$, we have the divergent harmonic series $\sum \frac{1}{n}$, and when $x = -3$, we have the convergent alternating series $\sum \frac{(-1)^n}{n}$. Thus the interval of convergence of the given power series is $[-3, 3)$.

Example 3.8.10. Find the interval of convergence of $\sum_{n=0}^{\infty} \frac{2^n x^n}{n!}$.

Solution: With $u_n = \frac{2^n x^n}{n!}$, we find that
\[ \lim_{n\to\infty} \left| \frac{2^{n+1} x^{n+1} / (n+1)!}{2^n x^n / n!} \right| = \lim_{n\to\infty} \frac{2\,|x|}{n+1} = 0 \]
for all $x$. Hence the Ratio Test implies that the power series converges for all $x$, and its interval of convergence is $(-\infty, \infty)$.

3.9 Tutorial 2

1. Write sequences.
) five terms ( of each of ) the following ( ) ( the first n 1 − (−1) (−1)n−1 2n − 1 (ii) (iii) (i) 3n + 2 n3 2 · 4 · 6 · · · 2n ( (iv) (−1)n−1 x2n−1 (2n − 1)! ) 2. Determine the general term of each sequence. 1 2 3 4 5 , , , , ,.... 2 3 4 5 6 1 3 5 9 (iii) 3 , 5 , 7 , 11 , . . . . 5 5 5 5 (i) 3. (i) Recursively define a0 = 0, a1 = 1, a2 = 2 and an = an−1 − an−2 + an−3 for n ≥ 3. List the first five terms. (ii) Recursively define s0 = 1, s1 = −3 and sn = 6sn−1 − 9sn−2 for n ≥ 2, find s5 . 4. Using the definition of a limit, show that each of the following sequences cannot have the limit shown: n+1 1 n 2n − 1 1 , (ii) un = , (iii) un = 2 , 1. (i) un = 3n + 4 2 7n − 4 6 n +1 5. Use the definition of a limit to verify each of the following limits. 2n − 1 2 4 − 2n 2 sin n (i) lim = (ii) lim =− (iii) lim =0 n→∞ 3n + 2 n→∞ 3n + 2 n→∞ n 3 3 an − bn , where a > 0 and b > 0 for the three cases: n→∞ an + bn (i) a > b (ii) a < b (iii) a = b. 6. Find lim 7. Use the properties of limits to evaluate of thefollowing limits. √ each √ 4 − 2n − 3n2 3n2 − 5n + 4 (ii) lim (i) lim (iii) lim ( n2 + n − n) 2 n→∞ n→∞ n→∞ 2n + n 2n − 7 4 3 √ n(n + 2) 2n − 3 n − 2 (iv) lim (v) lim ( n + 1 − n) (vi) lim n→∞ n→∞ n→∞ n+1 n +1 3n + 7 s √ √ √ n)( n + 2) 3 (3 − (vii) lim ( 4n2 + n + 5 − 2n) (viii) lim . n→∞ n→∞ 8n − 4 √ 2 2 8. Show that if an → l as n → ∞, then an+1 = an + 2 converges to 3 2. 3 3an (Series) 1. Find the sum of the convergent series. n n ∞ n ∞ ∞ X X X 1 2 −1 (i) (ii) (iii) 2 2 3 2 n=0 n=0 n=0 34 2. Determine whether or not the series converges and find its sum if it converges ∞ ∞ ∞ X X X 1 1 20 (i) (ii) (iii) . (3r − 1)(3r + 2) (5r − 2)(5r + 3) (7r − 3)(7r + 4) r=1 r=1 r=1 3. Use the integral test to determine the convergence or divergence of the series. ∞ ∞ ∞ ∞ ∞ X X X X X 1 1 1 1 (ii) ne−n (iii) (iv) (v) (i) 1 . 3 n + 1 4n + 1 n 3 n n=1 n=1 n=1 n=1 n=1 4. Use the Direct Comparison Test to determine the convergence or divergence of the series. 
∞ ∞ ∞ X X X 1 1 1 (ii) (iii) (i) 2 n +1 n−1 n! n=2 n=0 n=1 5. Use the Limit Comparison to determine the convergence or divergence of the series. ∞ ∞ ∞ ∞ ∞ X X X X X n+3 n 2n2 − 1 1 n √ (iv) (i) (ii) (iii) (v) . 2 5 n +1 3n + 2n + 1 n(n + 2) (n + 1)2n−1 n n2 + 1 n=1 n=1 n=1 n=1 n=1 6. Use the Alternating Series Test to determine the convergence or divergence of the series. ∞ ∞ ∞ X X X (−1)n+1 n (−1)n (−1)n+1 (ii) (iii) (i) n 2n − 1 ln n n=1 n=2 n=1 7. Determine whether the series converges conditionally or absolutely, or diverges. ∞ ∞ ∞ ∞ X X X X (−1)n+1 (−1)n+1 (2n + 3) (−1)n+1 (−1)n+1 √ (ii) (iii) (iv) (i) (n + 1)2 n + 10 n1.5 n n n=1 n=1 n=1 n=1 8. Use the Ratio Test to test for convergence or divergence of the series. n 3 n−1 ∞ ∞ ∞ (−1) X X X n! n2 2 (i) (ii) (iii) 3n 2n n2 n=0 n=1 n=1 9. Use the Root Test to test for convergence or divergence of the series. n ∞ ∞ ∞ X X X (−1)n n (ii) (iii) e−n (i) n 2n + 1 (ln n) n=2 n=0 n=1 10. Find the radius of convergence of the power series. ∞ ∞ ∞ n X X X (2x)n (−1)n xn n x (i) (−1) (ii) (iii) n+1 n! 2n n=0 n=0 n=0 11. Find the interval of convergence of the power series. ∞ ∞ ∞ X X X (−1)n xn (−1)n+1 (x − 5)n (−1)n+1 x2n−1 (i) (ii) (iii) n n5n 2n − 1 n=1 n=1 n=1 35 Chapter 4 Functions 4.1 What is a Function? Definition 4.1.1. A function f from a set X to a set Y is a rule that assigns to each element x in X a unique element y in Y . The set X is called the domain of the function f and the range is the set of all elements of Y assigned to an element of X. The element of Y assigned to an element x of X is called the image of x under f and is denoted by f (x). We write f : X → Y for saying f is a function from X to Y . In this course, both X and Y are sets of real numbers. Thus, the functions are called real functions. We usually specify a function f by giving the expression for f (x). Below are a few examples of functions: f (x) = 5x4 + 9, g(t) = 1 − t3 , h(s) = 9s2 + 2. 
Note that in the above examples, the letters $f, g, h$ are used to denote functions whereas the letters $x, t, s$ are used to denote the variables. A variable is an arbitrary element of a set. In the above examples, the letters $x, t, s$ denote the independent variables and $f(x), g(t), h(s)$ denote the dependent variables since their values depend on the values of $x, t, s$ respectively. The domain of a function $f$ is the largest set of real numbers for which the rule makes sense.

Example: Let $f(x) = \frac{1}{x}$. We cannot compute $f(0)$, since $\frac{1}{0}$ is not defined. Then the domain of $f(x) = \frac{1}{x}$ is the set of all real numbers except 0.

\[
\begin{array}{lll}
\text{Function} & \text{Domain } x \in X & \text{Range } y \in Y \\[4pt]
y = x^2 & (-\infty, \infty) & [0, \infty) \\
y = \dfrac{1}{x} & (-\infty, 0) \cup (0, \infty) & (-\infty, 0) \cup (0, \infty) \\
y = \sqrt{x} & [0, \infty) & [0, \infty) \\
y = \sqrt{1 - x^2} & [-1, 1] & [0, 1]
\end{array}
\]
Table 4.1: Examples of functions

You Try It: Let $g(x) = \frac{x}{x^2 + 4x + 3}$. What is the domain and range of this function?

4.2 Graphs of Functions

It is useful to draw pictures which represent functions. These pictures, or graphs, are a device for helping us to think about functions. We graph functions in the $x$-$y$ plane. The elements of the domain of the function are thought of as points of the $x$-axis. The values of a function are measured on the $y$-axis. The graph of $f$ associates to $x$ the unique $y$ value that the function $f$ assigns to $x$. The graph of a function $f$ is the set of points $\{(x, y) \mid y = f(x),\ x \text{ in the domain of } f\}$ in the Cartesian plane. As a consequence, a function is characterized geometrically by the fact that any vertical line intersecting its graph does so in exactly one point.

4.3 Monotone and Bounded Functions

A real function $f$ is increasing (strictly increasing) on an interval $I$ if for all points $x_1$ and $x_2$ in $I$ with $x_1 < x_2$, $f(x_1) \leq f(x_2)$ ($f(x_1) < f(x_2)$).

A real function $f$ is decreasing (strictly decreasing) on an interval $I$ if for all points $x_1$ and $x_2$ in $I$ with $x_1 < x_2$, $f(x_1) \geq f(x_2)$ ($f(x_1) > f(x_2)$).
A real function $f$ is monotone on an interval $I$ if $f$ is either increasing or decreasing on $I$.

Example: Consider the function $f(x) = (2x - 1)(x + 5)$. We observe that $f$ is increasing on the interval $(-9/4, \infty)$ and is decreasing on the interval $(-\infty, -9/4)$.

Bounded Functions

A function $f$ is bounded above if there is a real number $M$ such that $f(x) \leq M$ for all points $x$ in its domain. The number $M$ is then called an upper bound of $f$.

A function $f$ is bounded below if there is a real number $m$ such that $f(x) \geq m$ for all points $x$ in its domain. The number $m$ is then called a lower bound of $f$.

A function $f$ is bounded if $f$ is bounded above and below, that is, there exist real numbers $M$ and $m$ such that $m \leq f(x) \leq M$ for all points $x$ in its domain.

Examples: $f(x) = x + 3$ is bounded in $-1 \leq x \leq 1$. An upper bound is 4 (or any number greater than 4). A lower bound is 2 (or any number less than 2).

4.4 Types of Functions

4.4.1 Elementary Functions

Polynomial Function

Polynomial functions have the form
\[ f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n, \]
where $a_0, a_1, \ldots, a_n$ are constants and $n$ is a positive integer called the degree of the polynomial, provided $a_0 \neq 0$.

Examples: $x^5 + 10x^3 - 2x + 1$ is a polynomial of degree 5.

Rational Functions

A rational function is a function $f(x) = \frac{P(x)}{Q(x)}$, where $P(x)$ and $Q(x)$ are polynomial functions.

Example: $f(x) = \frac{x^3 + x + 5}{x^2 - 3x - 4}$ is a rational function. Since $x^2 - 3x - 4 = (x + 1)(x - 4) = 0$ for $x = -1$ and $x = 4$, the domain of $f$ is the set of all real numbers except $-1$ and 4.

Power Function

$f(x) = k x^n$, $n$ a real number and $k$ a constant.

Examples: $y = \frac{1}{x}$, $y = x^{1/2}$, $y = x^{2/3}$.

Piecewise Defined Functions

A function need not be defined by a single formula. A piecewise defined function is a function described by using different formulas on different parts of its domain.

Examples:
\[ \text{(a)}\ f(x) = \begin{cases} -1, & x < 0 \\ 0, & x = 0 \\ x + 2, & x > 0 \end{cases} \qquad \text{(b)}\ f(x) = \begin{cases} -x, & x < 0 \\ x^2, & 0 \leq x \leq 1 \\ 1, & x > 1. \end{cases} \]

Transcendental Functions

The following are sometimes called elementary transcendental functions.
1.
Exponential function, $f(x) = a^x$, $a > 0$, $a \neq 1$.
2. Logarithmic function, $f(x) = \log_a x$, $a > 0$, $a \neq 1$.
3. Trigonometric functions (also called circular functions because of their geometric interpretation with respect to the unit circle), e.g., $\sin x$, $\cos x$, $\tan x = \frac{\sin x}{\cos x}$, $\csc x$, $\cot x$, $\sec x$.
4. Inverse trigonometric functions, e.g., $y = \sin^{-1} x$, $y = \cos^{-1} x$.
5. Hyperbolic functions, e.g., $\sinh x$, $\cosh x$, $\tanh x$, $\coth x$.

Even and Odd Functions

Let $f(x)$ be a real-valued function of a real variable. Then $f$ is even if $f(x) = f(-x)$ for all $x$ in its domain. (Symmetric with respect to the $y$-axis.)

Examples: $|x|$, $x^2$, $x^4$, $\cos x$, $\cosh x$.

Let $f(x)$ be a real-valued function of a real variable. Then $f$ is odd if $-f(x) = f(-x)$, or equivalently $f(x) + f(-x) = 0$, for all $x$ in its domain. (Symmetric with respect to the origin.)

Examples: $x$, $x^3$, $\sin x$, $\sinh x$.

Example: Determine whether the function $f(x) = \frac{3x}{x^2 + 1}$ is odd or even.

Solution:
\[ f(-x) = \frac{3(-x)}{(-x)^2 + 1} = -\frac{3x}{x^2 + 1} = -f(x). \]
The function is odd.

4.5 Combining Functions

A function $f$ can be combined with another function $g$ by means of arithmetic operations to form other functions. The sum $f + g$, difference $f - g$, product $fg$ and quotient $\frac{f}{g}$ are defined as follows. Let $f$ and $g$ denote functions, then
1. Sum: $(f + g)(x) = f(x) + g(x)$.
2. Difference: $(f - g)(x) = f(x) - g(x)$.
3. Product: $(fg)(x) = f(x)\,g(x)$.
4. Quotient: $\left(\frac{f}{g}\right)(x) = \frac{f(x)}{g(x)}$, provided $g(x) \neq 0$.

Example: If $f(x) = 2x^2 - 5$ and $g(x) = 3x + 4$, find $f + g$, $f - g$, $fg$, $\frac{f}{g}$.

Solution:
\begin{align*}
(f + g)(x) &= (2x^2 - 5) + (3x + 4) = 2x^2 + 3x - 1. \\
(f - g)(x) &= (2x^2 - 5) - (3x + 4) = 2x^2 - 3x - 9. \\
(fg)(x) &= (2x^2 - 5)(3x + 4) = 6x^3 + 8x^2 - 15x - 20. \\
\left(\frac{f}{g}\right)(x) &= \frac{2x^2 - 5}{3x + 4}.
\end{align*}

4.6 Composition of Functions

Let $f$ and $g$ denote functions. The composition of $f$ and $g$, written $f \circ g$, is the function $(f \circ g)(x) = f(g(x))$, and the composition of $g$ and $f$, written $g \circ f$, is the function $(g \circ f)(x) = g(f(x))$.

Example: If $f(x) = x^2$ and $g(x) = x^2 + 1$, find $f \circ g$ and $g \circ f$.
Solution:
\[ (f \circ g)(x) = f(g(x)) = f(x^2 + 1) = (x^2 + 1)^2 = x^4 + 2x^2 + 1 \]
and
\[ (g \circ f)(x) = g(f(x)) = g(x^2) = (x^2)^2 + 1 = x^4 + 1. \]
In general, $f \circ g \neq g \circ f$.

4.7 Bijection, Injection and Surjection

Classes of functions may be distinguished by the manner in which arguments and images are related or mapped to each other.

A function $f : X \to Y$ is injective (one-to-one, 1-1) if every element of the range corresponds to exactly one element in its domain $X$. That is, for all $x, y \in X$,
\[ f(x) = f(y) \implies x = y, \]
or equivalently, for all $x, y \in X$,
\[ x \neq y \implies f(x) \neq f(y). \]
An injective function is an injection.

Example: Show that the functions $f(x) = 2x + 3$ and $g(x) = x^3 - 2$ are injective.

Solution: We need to show that $f(x) = f(y) \implies x = y$.
\[ 2x + 3 = 2y + 3 \implies 2x = 2y \implies x = y. \]
Hence $f(x) = 2x + 3$ is injective. We need to show that $g(x) = g(y) \implies x = y$.
\[ x^3 - 2 = y^3 - 2 \implies x^3 = y^3 \implies x = y \quad \text{(taking cube roots)}. \]
Hence $g(x) = x^3 - 2$ is injective.

A function $f : X \to Y$ is called onto if for all $y$ in $Y$ there is an $x$ in $X$ such that $f(x) = y$. All elements in $Y$ are used. Such functions are referred to as surjective.

Example: Show that $f(x) = 3x - 5$ is onto.

Solution: For onto we need $f(x) = y$, i.e., $3x - 5 = y$. Solving for $x$ gives $x = \frac{y + 5}{3}$. So
\[ f\left(\frac{y + 5}{3}\right) = 3\left(\frac{y + 5}{3}\right) - 5 = y. \]
Therefore $f$ is onto.

Let $X$ and $Y$ be sets. A function $f : X \to Y$ that is one-to-one and onto is called a bijection or bijective function from $X$ to $Y$. If $f$ is both one-to-one and onto, then we call $f$ a 1-1 correspondence.

Inverse of a Function.

Suppose $f$ is a 1-1 function that has domain $X$ and range $Y$. Since every element $y \in Y$ corresponds with precisely one element $x$ of $X$, the function $f$ must actually determine a reverse function $g$ whose domain is $Y$ and range is $X$, where $f$ and $g$ must satisfy $f(x) = y$ and $g(y) = x$. The function $g$ is given the formal name inverse of $f$, usually written $f^{-1}$ and read "$f$ inverse". Not all functions have inverses; those that do are called invertible functions.
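The defining relations $f(x) = y$ and $g(y) = x$ can be checked numerically on a few sample points. The following Python sketch uses the injective function $f(x) = 2x + 3$ from the example above; the inverse formula $g(y) = (y - 3)/2$ and the sample values are worked out here for illustration and are assumptions of the sketch, not part of the notes.

```python
def f(x):
    """The injective function f(x) = 2x + 3 from the example above."""
    return 2 * x + 3

def g(y):
    """Candidate inverse, found by solving y = 2x + 3 for x (assumed for this sketch)."""
    return (y - 3) / 2

# Check g(f(x)) = x and f(g(y)) = y on a handful of sample values.
samples = [-2.0, 0.0, 1.5, 10.0]
round_trip_ok = all(g(f(x)) == x and f(g(x)) == x for x in samples)
```

A check like this can only expose a wrong candidate inverse; it does not prove that $g = f^{-1}$, which requires the algebraic argument.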
Example: Find the inverse of the function $f(x) = (2x + 8)^3$.

Solution: We must solve the equation $y = (2x + 8)^3$ for $x$.
\begin{align*}
y &= (2x + 8)^3 \\
\sqrt[3]{y} &= 2x + 8 \\
\sqrt[3]{y} - 8 &= 2x \\
x &= \frac{\sqrt[3]{y} - 8}{2}.
\end{align*}
Hence the inverse function $f^{-1}$ is given by $f^{-1}(x) = \frac{\sqrt[3]{x} - 8}{2}$.

A 1-1 function $f$ can have only one inverse, i.e., $f^{-1}$ is unique. A function $f : X \to Y$ is invertible if and only if $f$ is one-to-one and maps $X$ onto $Y$.

4.8 Operations on Functions

Equality of Functions

Equality of functions does not mean the same as equality of two numbers (numbers have a fixed value but values of functions vary). Each function is a relationship between $x$ and $y$; the two relationships are the same if for every value of $x$ we get the same value of $y$.

Example: The functions $(x - 1)(x + 2)$ and $x^2 + x - 2$ are equal.

Example: Equal functions for positive values of $x$: $|x| = \sqrt{x^2}$.

Identity Function

Generally, an identity function is one which does not change the domain values at all. It is the function $f(x) = x$, denoted by $I_X$.

Chapter 5 Limits and Continuity

The single most important idea in calculus is the idea of limit. More than 2000 years ago, the ancient Greeks wrestled with the limit concept, and they did not succeed. It is only in the past 200 years that we have finally come up with a firm understanding of limits. The study of calculus went through several periods of increased mathematical rigour, beginning with the French mathematician Augustin-Louis Cauchy (1789-1857) and later continued by the German mathematician, and former high school teacher, Karl Wilhelm Weierstrass (1815-1897).

5.1 Limit of a Function

If $f$ is a function, then we say $\lim_{x\to a} f(x) = A$ if the value of $f(x)$ gets arbitrarily close to $A$ as $x$ gets closer and closer to $a$. For example, $\lim_{x\to 3} x^2 = 9$, since $x^2$ gets arbitrarily close to 9 as $x$ approaches as close as one wishes to 3.
The definition can be stated more precisely as follows: $\lim_{x\to a} f(x) = A$ if and only if, for any chosen positive number $\varepsilon$, however small, there exists a positive number $\delta$ such that, whenever $0 < |x - a| < \delta$, then $|f(x) - A| < \varepsilon$.

$\lim_{x\to a} f(x) = A$ means that $f(x)$ can be made as close as desired to $A$ by making $x$ close enough, but not equal, to $a$. How close is "close enough to $a$" depends on how close one wants to make $f(x)$ to $A$. It also of course depends on which function $f$ is and on which number $a$ is. The positive number $\varepsilon$ is how close one wants to make $f(x)$ to $A$; one wants the distance to be less than $\varepsilon$. The positive number $\delta$ is how close one will make $x$ to $a$; if the distance from $x$ to $a$ is less than $\delta$ (but not zero), then the distance from $f(x)$ to $A$ will be less than $\varepsilon$. Thus $\delta$ depends on $\varepsilon$. The limit statement means that no matter how small $\varepsilon$ is made, $\delta$ can be made small enough.

The letters $\varepsilon$ and $\delta$ can be understood as "error" and "distance". In these terms the error ($\varepsilon$) can be made as small as desired by reducing the distance ($\delta$).

The $\varepsilon$-$\delta$ definition of $\lim_{x\to a} f(x) = A$

For any chosen positive number $\varepsilon$, however small, there exists a positive number $\delta$ such that, whenever $0 < |x - a| < \delta$, then $|f(x) - A| < \varepsilon$.

Example: Show that $\lim_{x\to 1} (x^2 + 1) = 2$.

Solution: We need to find $\delta$ so that, for a given $\varepsilon$, $|x^2 + 1 - 2| < \varepsilon$ for $0 < |x - 1| < \delta$. Now
\[ x^2 + 1 - 2 = x^2 - 1 = (x + 1)(x - 1). \]
Choose $|x - 1| < 1$ so that
\[ -1 < x - 1 < 1 \implies 0 < x < 2 \implies 1 < x + 1 < 3. \]
You have $|x^2 + 1 - 2| = |x + 1|\,|x - 1| < 3|x - 1|$, so $|x^2 + 1 - 2| < \varepsilon$ if $3|x - 1| < \varepsilon$, or $|x - 1| < \frac{\varepsilon}{3}$. You now have two conditions on $x$:
\[ |x - 1| < 1 \quad \text{and} \quad |x - 1| < \frac{\varepsilon}{3}. \]
For a given $\varepsilon > 0$, choose $\delta = \min\{1, \frac{\varepsilon}{3}\}$. Then whenever $0 < |x - 1| < \delta$, it is true that $|x^2 + 1 - 2| < \varepsilon$.

Example: Show that $\lim_{x\to 2} (x^2 + 3x) = 10$.

Solution: Let $\varepsilon > 0$. We must produce a $\delta > 0$ such that, whenever $0 < |x - 2| < \delta$, then $|(x^2 + 3x) - 10| < \varepsilon$. First we note that
\[ |(x^2 + 3x) - 10| = |(x - 2)^2 + 7(x - 2)| \leq |x - 2|^2 + 7|x - 2|. \]
Also, if 0 < δ ≤ 1, then δ 2 ≤ δ. Hence, if we take δ to be the minimum of 1 and 0 < |x − 2| < δ, ε , then, whenever 8 |(x2 + 3x) − 10| < δ 2 + 7δ ≤ δ + 7δ = 8δ ≤ ε. 1 You Try It: Prove that lim x sin = 0. x→0 x Right and Left Limits Considering x and a as points on the real axis where a is fixed and x is moving, then x can approach a from the right or from the left. We indicate these respective approaches by writing x → a+ and x → a− . 47 If lim+ f (x) = A1 and lim− f (x) = A2 , we call A1 and A2 respectively the right and left hand limits x→a x→a of f (x) at a. We have lim f (x) = A if and only if lim+ f (x) = lim− f (x) = A. The existence of the limit from the x→a x→a x→a left does not imply the existence of the limit from the right and conversely. When a function f is defined on only one side of a point a, then lim f (x) is identical to the one-sided limit, if it exists. For x→a √ √ √ example, if f (x) = x, then f is only defined to the right of zero. Hence, lim x = lim+ x = 0. x→0 x→0 √ √ Of course, lim− x does not exist, since x is not defined when x < 0. On the other hand, consider x→0 r 1 the function g(x) = , which is defined only for x > 0. In this case, lim+ g(x) does not exist and, x→0 x therefore lim g(x) does not exist. x→0 5.2 Theorems on Limits 1. If f (x) = c, a constant, then lim f (x) = c. x→a 2. If lim f (x) = A and lim g(x) = B, then x→a x→a (a) lim kf (x) = kA, k being any constant. x→a (b) lim [f (x) ± g(x)] = lim f (x) ± lim g(x) = A ± B. x→a x→a x→a (c) lim f (x)g(x) = lim f (x) lim g(x) = AB. x→a x→a x→a lim f (x) f (x) A = x→a = , provided B 6= 0. x→a g(x) lim g(x) B (d) lim x→a Example: If lim f (x) exists, prove that it must be unique. x→a Solution: Must show that if lim f (x) = A1 and lim f (x) = A2 , then A1 = A2 . x→a x→a By hypothesis, given any ε > 0 we can find δ > 0 such that ε when 0 < |x − a| < δ 2 ε |f (x) − A2 | < when 0 < |x − a| < δ. 2 |f (x) − A1 | < Then ε ε + = ε. 
\[ |A_1 - A_2| = |A_1 - f(x) + f(x) - A_2| \le |A_1 - f(x)| + |f(x) - A_2| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
i.e., $|A_1 - A_2|$ is less than any positive number $\varepsilon$ (however small) and so must be zero. Thus $A_1 = A_2$.

Example: Given $\lim_{x\to a} f(x) = A$ and $\lim_{x\to a} g(x) = B$, prove that
\[ \lim_{x\to a} [f(x) + g(x)] = \lim_{x\to a} f(x) + \lim_{x\to a} g(x) = A + B. \]

Solution: We must show that for any $\varepsilon > 0$, we can find $\delta > 0$ such that $|(f(x) + g(x)) - (A + B)| < \varepsilon$ when $0 < |x - a| < \delta$. By hypothesis, given $\varepsilon > 0$, we can find $\delta_1 > 0$ and $\delta_2 > 0$ such that
\[ |f(x) - A| < \frac{\varepsilon}{2} \quad \text{when } 0 < |x - a| < \delta_1, \]
\[ |g(x) - B| < \frac{\varepsilon}{2} \quad \text{when } 0 < |x - a| < \delta_2. \]
Then
\[ |(f(x) + g(x)) - (A + B)| \le |f(x) - A| + |g(x) - B| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]
when $0 < |x - a| < \delta$, where $\delta$ is chosen as the smaller of $\delta_1$ and $\delta_2$.

You Try It: Given $\lim_{x\to a} f(x) = A$ and $\lim_{x\to a} g(x) = B$, prove that
\[ \lim_{x\to a} f(x) g(x) = \lim_{x\to a} f(x) \cdot \lim_{x\to a} g(x) = AB. \]

5.3 Special Limits

1. $\lim_{x\to 0} \dfrac{\sin x}{x} = 1$, $\quad \lim_{x\to 0} \dfrac{1 - \cos x}{x} = 0$.
2. $\lim_{x\to \infty} \left(1 + \dfrac{1}{x}\right)^x = e$, $\quad \lim_{x\to 0^+} (1 + x)^{1/x} = e$.
3. $\lim_{x\to 0} \dfrac{e^x - 1}{x} = 1$, $\quad \lim_{x\to 1} \dfrac{x - 1}{\ln x} = 1$.

5.4 Methods of Calculating $\lim_{x\to a} f(x)$

If $f(a)$ is defined

If $x = a$ is in the domain of $f(x)$ and $a$ is not an endpoint of the domain, and $f(x)$ is defined by a single expression, then $\lim_{x\to a} f(x) = f(a)$.

Example: Find $\lim_{x\to 1} (x + 3)$.
Solution: $\lim_{x\to 1} (x + 3) = 1 + 3 = 4$.

Example: Find $\lim_{x\to 1} \dfrac{1}{x + 2}$.
Solution: $\lim_{x\to 1} \dfrac{1}{x + 2} = \dfrac{1}{1 + 2} = \dfrac{1}{3}$.

Example: Find $\lim_{x\to 8} (x^2 - 7x + 5)$.
Solution: $\lim_{x\to 8} (x^2 - 7x + 5) = 8^2 - 7(8) + 5 = 13$.

Example: Find $\lim_{x\to 2} \dfrac{x^2 - 4}{x - 2}$.
Solution: $\lim_{x\to 2} \dfrac{x^2 - 4}{x - 2} = \lim_{x\to 2} \dfrac{(x + 2)(x - 2)}{x - 2} = \lim_{x\to 2} (x + 2) = 4$.

Functions Defined By More Than One Expression

Suppose that $f(x)$ is defined by one expression for $x < a$ and by a different expression for $x > a$.

Example: Show that $\lim_{x\to 0} \dfrac{|x|}{x}$ does not exist.

Solution: Notice that
\[ \frac{|x|}{x} = \begin{cases} \dfrac{x}{x} = 1, & \text{if } x > 0, \\[4pt] \dfrac{-x}{x} = -1, & \text{if } x < 0, \end{cases} \]
i.e., you seek a limit at $x = 0$ of a function that is defined differently on either side of $x = 0$.
\[ \lim_{x\to 0^-} \frac{|x|}{x} = \lim_{x\to 0^-} (-1) = -1, \qquad \lim_{x\to 0^+} \frac{|x|}{x} = \lim_{x\to 0^+} (1) = 1. \]
Since $\lim_{x\to 0^+} \dfrac{|x|}{x} \ne \lim_{x\to 0^-} \dfrac{|x|}{x}$, $\lim_{x\to 0} \dfrac{|x|}{x}$ does not exist.

Example: Find $\lim_{x\to 0} \dfrac{\sin 3x}{x}$.

Solution: Since $\dfrac{\sin 3x}{x} = 3\,\dfrac{\sin 3x}{3x}$,
\[ \lim_{x\to 0} \frac{\sin 3x}{x} = \lim_{x\to 0} 3\,\frac{\sin 3x}{3x} = 3 \lim_{x\to 0} \frac{\sin 3x}{3x} = 3(1) = 3. \]

Example: Find $\lim_{x\to 0} \dfrac{1 - \cos 2x}{\sin 3x}$.

Solution: Since $\dfrac{1 - \cos 2x}{\sin 3x} = \dfrac{1 - \cos 2x}{2x} \cdot \dfrac{3x}{\sin 3x} \cdot \dfrac{2x}{3x} = \dfrac{2}{3} \cdot \dfrac{1 - \cos 2x}{2x} \cdot \dfrac{3x}{\sin 3x}$,
\[ \lim_{x\to 0} \frac{1 - \cos 2x}{\sin 3x} = \frac{2}{3} \lim_{x\to 0} \frac{1 - \cos 2x}{2x} \cdot \lim_{x\to 0} \frac{3x}{\sin 3x} = \frac{2}{3}(0)(1) = 0. \]

Example: Find $\lim_{x\to \pi/4} \dfrac{\sin x}{\cos x}$.

Solution: $\lim_{x\to \pi/4} \dfrac{\sin x}{\cos x} = \dfrac{\sin\frac{\pi}{4}}{\cos\frac{\pi}{4}} = 1$.

You Try It: Show that $\lim_{\theta\to 0} \dfrac{1 - \cos\theta}{\theta} = 0$.

Infinite Limits and Limits at Infinity

It sometimes happens that as $x \to a$, $f(x)$ increases or decreases without bound. We write $\lim_{x\to a} f(x) = +\infty$ or $\lim_{x\to a} f(x) = -\infty$. We say that $\lim_{x\to a} f(x) = +\infty$ if for each positive number $M$ we can find a positive number $\delta$ (depending on $M$ in general) such that $f(x) > M$ whenever $0 < |x - a| < \delta$.

Similarly, we say that $\lim_{x\to a} f(x) = -\infty$ if for each positive number $M$ we can find a positive number $\delta$ (depending on $M$ in general) such that $f(x) < -M$ whenever $0 < |x - a| < \delta$.

Note that $\lim_{x\to \infty} \dfrac{1}{x} = 0$ and $\lim_{x\to -\infty} \dfrac{1}{x} = 0$.

Limits at Infinity of a Rational Function

A rational function is a quotient of two polynomials, $f(x) = \dfrac{p_m(x)}{q_n(x)}$, where $m$ and $n$ are the degrees of the two polynomials.

1. If $m < n$, then $\lim_{x\to \infty} f(x) = 0$.

Example: Find $\lim_{x\to \infty} \dfrac{x + 1}{x^2 + 4}$.

Solution: The degree of the numerator is one, the degree of the denominator is two. Therefore $\lim_{x\to \infty} \dfrac{x + 1}{x^2 + 4} = 0$, since
\[ \lim_{x\to \infty} \frac{\frac{1}{x} + \frac{1}{x^2}}{1 + \frac{4}{x^2}} = \frac{0 + 0}{1 + 0} = 0. \]

2. If $m > n$, then $\lim_{x\to \infty} f(x) = \pm\infty$. (The sign depends on the polynomials $p_m(x)$ and $q_n(x)$: if they are of the same sign as $x$ gets larger, the quotient is positive; if they are of opposite signs, the quotient is negative.)

Example: Find $\lim_{x\to \infty} \dfrac{x^3 - 2x^2 + 3x + 4}{3x + 5}$.
Solution: Dividing numerator and denominator by $x^3$,
\[ \lim_{x\to \infty} \frac{x^3 - 2x^2 + 3x + 4}{3x + 5} = \lim_{x\to \infty} \frac{1 - \frac{2}{x} + \frac{3}{x^2} + \frac{4}{x^3}}{\frac{3}{x^2} + \frac{5}{x^3}} = +\infty, \]
since the numerator tends to $1$ while the denominator tends to $0$ through positive values.

3. If $m = n$, then $\lim_{x\to \infty} f(x) = \dfrac{a}{b}$, where $a$ is the coefficient of $x^m$ in the numerator and $b$ is the coefficient of $x^n$ in the denominator.

Example: Find $\lim_{x\to \infty} \dfrac{x^3 - 4x + 1}{3x^3 + 2x + 7}$.

Solution:
\[ \lim_{x\to \infty} \frac{x^3 - 4x + 1}{3x^3 + 2x + 7} = \lim_{x\to \infty} \frac{1 - \frac{4}{x^2} + \frac{1}{x^3}}{3 + \frac{2}{x^2} + \frac{7}{x^3}} = \frac{1 - 0 + 0}{3 + 0 + 0} = \frac{1}{3}. \]

You Try It: What is $\lim_{x\to \infty} \dfrac{a_0 x^m + a_1 x^{m-1} + \cdots + a_m}{b_0 x^n + b_1 x^{n-1} + \cdots + b_n}$, where $a_0, b_0 \ne 0$ and $m$ and $n$ are positive integers, when (a) $m > n$ (b) $m = n$ (c) $m < n$?

You Try It: Find $\lim_{x\to 4} \dfrac{x - 4}{\sqrt{x} - 2}$.

5.5 Continuity

A function $f(x)$ is continuous at a point $x = a$ if

1. $f(a)$ is defined.
2. $\lim_{x\to a} f(x)$ exists.
3. $\lim_{x\to a} f(x) = f(a)$.

Notice that, for $f(x)$ to be continuous at $x = a$, all three conditions must be satisfied. If at least one condition fails, $f$ is said to have a discontinuity at $x = a$. For example, $f(x) = x^2 + 1$ is continuous at $x = 2$ since $\lim_{x\to 2} f(x) = 5 = f(2)$. The first condition above implies that a function can be continuous only at points of its domain. Thus, $f(x) = \sqrt{4 - x^2}$ is not continuous at $x = 3$ because $f(3) = \sqrt{-5}$ is not a real number, i.e., $f(3)$ is not defined.

A function $f$ is right-continuous (continuous from the right) at a point $x = a$ in its domain if $\lim_{x\to a^+} f(x) = f(a)$. It is left-continuous (continuous from the left) at $x = a$ if $\lim_{x\to a^-} f(x) = f(a)$. A function is continuous at an interior point $x = a$ of its domain if and only if it is both right-continuous and left-continuous at $x = a$.

Example: Determine whether $f(x) = x^2 + 1$ is continuous at $x = 1$.

Solution: $\lim_{x\to 1} (x^2 + 1) = f(1) = 1^2 + 1 = 2$. Therefore $f(x) = x^2 + 1$ is continuous at $x = 1$.

Example: Determine whether $f(x) = \dfrac{|x|}{x}$ is continuous at $x = 0$.

Solution: Since $f(0)$ is not defined, $f(x)$ is not continuous at $x = 0$.

Example: Determine whether
\[ f(x) = \begin{cases} \dfrac{|x|}{x}, & \text{if } x \ne 0, \\[4pt] 0, & \text{if } x = 0, \end{cases} \]
is continuous at $x = 0$.

Solution: Here $f(0)$ is defined.
Then $\lim_{x\to 0} f(x)$ must be considered in two steps:
\[ \lim_{x\to 0^-} f(x) = \lim_{x\to 0^-} (-1) = -1, \qquad \lim_{x\to 0^+} f(x) = \lim_{x\to 0^+} (1) = 1. \]
Since the limits are not the same, $\lim_{x\to 0} f(x)$ does not exist and $f(x)$ is not continuous at $x = 0$.

You Try It: Determine whether the function defined by
\[ f(x) = \begin{cases} x^2, & \text{if } x < 2, \\ 5, & \text{if } x = 2, \\ -x + 6, & \text{if } x > 2, \end{cases} \]
is continuous at the point $x = 2$.

A function $f(x)$ is discontinuous at $x = a$ if one or more of the conditions for continuity fails there.

Example: (a) $f(x) = \dfrac{1}{x - 2}$ is discontinuous at $x = 2$, because $f(2)$ is not defined (it has a zero denominator) and because $\lim_{x\to 2} f(x)$ does not exist ($|f(x)|$ grows without bound). The function is, however, continuous everywhere except at $x = 2$, where it is said to have an infinite discontinuity.

(b) $f(x) = \dfrac{x^2 - 4}{x - 2}$ is discontinuous at $x = 2$ because $f(2)$ is not defined (both numerator and denominator are zero), although $\lim_{x\to 2} f(x) = 4$ exists. The discontinuity here is called removable since it may be removed by redefining the function as $f(x) = \dfrac{x^2 - 4}{x - 2}$ for $x \ne 2$ and $f(2) = 4$. (Note that the discontinuity in (a) cannot be removed because the limit also does not exist.)

5.6 The $\varepsilon$-$\delta$ Definition of Continuity

$f(x)$ is continuous at $x = a$ if, for any $\varepsilon > 0$, we can find $\delta > 0$ such that $|f(x) - f(a)| < \varepsilon$ whenever $|x - a| < \delta$.

Example: Prove that $f(x) = x^2$ is continuous at $x = 2$.

Solution: We must show that, given any $\varepsilon > 0$, we can find $\delta > 0$ such that $|f(x) - f(2)| = |x^2 - 4| < \varepsilon$ when $|x - 2| < \delta$. Choose $\delta \le 1$, so that $|x - 2| < 1$, i.e., $1 < x < 3$. Then
\[ |x^2 - 4| = |(x - 2)(x + 2)| = |x - 2|\,|x + 2| < \delta |x + 2| < 5\delta. \]
Taking $\delta = \min\{1, \frac{\varepsilon}{5}\}$, we have $|x^2 - 4| < \varepsilon$ whenever $|x - 2| < \delta$.

You Try It: (a) Prove that $f(x) = x$ is continuous at any point $x = x_0$. (b) Prove that $f(x) = 2x^3 + x$ is continuous at any point $x = x_0$.

Theorems on Continuity

Theorem 1 If $f(x)$ and $g(x)$ are continuous at $x = a$, so are the functions $f(x) \pm g(x)$, $f(x)g(x)$ and $\dfrac{f(x)}{g(x)}$ if $g(a) \ne 0$.
Theorem 2 The following functions are continuous in every finite interval: (a) all polynomials, (b) $\sin x$ and $\cos x$, (c) $a^x$, $a > 0$.

Theorem 3 If $y = f(x)$ is continuous at $x = a$ and $z = g(y)$ is continuous at $y = b$ and if $b = f(a)$, then the function $z = g[f(x)]$, called a function of a function or composite function, is continuous at $x = a$. Briefly: A continuous function of a continuous function is continuous.

Theorem 4 If $f(x)$ is continuous in a closed interval, it is bounded in the interval.

5.7 Tutorial 3

1. Find the domain of each of the following functions. x3 − 8 x 1 (ii) (iii) 2 (i) √ 2−x x −4 1−x

2. Determine whether or not each of the following correspondences is a function. (i) $x^2 + y = 1$ (ii) $x^2 y^2 = 5$ (iii) $x^2 y = 4$ (iv) $\{(1, 5), (2, 5), (5, 1)\}$.

3. For each of the following pairs of functions, calculate $f \circ g$ and $g \circ f$. (i) $f(x) = e^{x^2}$ and $g(x) = e^{-x^2}$.

4. Let the function $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = 2x - 1$. Show that $f$ is bijective and hence find $f^{-1}$.

5. Show that if $f : A \to B$ is onto and $g : B \to C$ is onto, then the composite function $(g \circ f) : A \to C$ is onto.

6. Show that the function $f(x) = \ln(x + \sqrt{x^2 + 1})$ is an odd function and find the inverse function of $f(x)$.

7. Let
\[ f(x) = \begin{cases} 3x - 1, & x < 0, \\ 0, & x = 0, \\ 2x + 5, & x > 0. \end{cases} \]
Evaluate (i) $\lim_{x\to 2} f(x)$ (ii) $\lim_{x\to -3} f(x)$ (iii) $\lim_{x\to 0^+} f(x)$ (iv) $\lim_{x\to 0^-} f(x)$ (v) $\lim_{x\to 0} f(x)$.

8. If $f(x) = \dfrac{3x + |x|}{7x - 5|x|}$, evaluate (i) $\lim_{x\to \infty} f(x)$ (ii) $\lim_{x\to -\infty} f(x)$.

9. Use the theorems on limits to evaluate each of the following limits.
(i) $\lim_{x\to 0} \dfrac{\sqrt{3 + x} - \sqrt{3}}{x}$ (ii) $\lim_{x\to \infty} \dfrac{8x^2 + 4}{2x^2 - x + 2}$ (iii) $\lim_{x\to 2} \dfrac{x^3 - 8}{x - 2}$ (iv) $\lim_{x\to 0} (4x^2 - x + 5)$
(v) $\lim_{x\to 0} \dfrac{\sqrt{x + 4} - 2}{x}$ (vi) $\lim_{x\to \infty} \dfrac{a_0 + a_1 x + \cdots + a_m x^m}{b_0 + b_1 x + \cdots + b_n x^n}$ (vii) $\lim_{x\to \infty} \dfrac{(3x - 1)(2x + 3)}{(5x - 3)(4x + 5)}$

10. Use the definition of a limit to prove that $\lim_{x\to 3} (3x^2 - 7x + 1) = 7$.

11. Verify (i) $\lim_{x\to 2} \dfrac{x^2 - 4}{x - 2} = 4$ (ii) $\lim_{x\to 0} x^2 \cos\dfrac{1}{x} = 0$.

12. Give the points of discontinuity of each of the following functions.
(i) $f(x) = \dfrac{x}{(x - 2)(x - 4)}$ (ii) $f(x) = x^2 \sin\dfrac{1}{x}$, $f(0) = 0$ (iii) $f(x) = \sqrt{(x - 3)(6 - x)}$, $3 \le x \le 6$ (iv) $f(x) = \dfrac{1}{1 + 2\sin x}$.

13. Given that the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) = \begin{cases} x^2 - 4, & x \ge 2, \\ 2ax + b, & 0 < x < 2, \\ e^x, & x \le 0, \end{cases} \]
is continuous at all points in $\mathbb{R}$, find the values of $a$ and $b$.

14. Let $f : \mathbb{R} \to \mathbb{R}$ be the function defined by
\[ f(x) = \begin{cases} x^2, & x \ge 2, \\ ax + b, & 1 < x < 2, \\ 2 - x, & x \le 1, \end{cases} \]
where $a$ and $b$ are constants. If $f$ is continuous on $\mathbb{R}$, find the values of $a$ and $b$.

15. Sketch the graphs of the following functions. (i) $f(x) = 2x^2 - 5x - 3$ (ii) $f(x) = \dfrac{x + 1}{(x - 2)(3x + 5)}$ (iii) $f(x) = x^4$ (iv) $f(x) = x^9$

Chapter 6 Differentiation

Increments. The increment $\Delta x$ of a variable $x$ is the change in $x$ as it increases or decreases from one value $x = x_0$ to another value $x = x_1$ in its domain. Here, $\Delta x = x_1 - x_0$ and we may write $x_1 = x_0 + \Delta x$.

If the variable $x$ is given an increment $\Delta x$ from $x = x_0$ (i.e., if $x$ changes from $x = x_0$ to $x_1 = x_0 + \Delta x$) and a function $y = f(x)$ is thereby given an increment $\Delta y = f(x_0 + \Delta x) - f(x_0)$ from $y = f(x_0)$, then the quotient
\[ \frac{\Delta y}{\Delta x} = \frac{\text{change in } y}{\text{change in } x} \]
is called the average rate of change of the function on the interval between $x = x_0$ and $x_1 = x_0 + \Delta x$.

Let $f(x)$ be defined at any point $x_0$ in $(a, b)$. The derivative of $f(x)$ at $x = x_0$ is defined as
\[ f'(x_0) = \lim_{h\to 0} \frac{f(x_0 + h) - f(x_0)}{h} \]
if this limit exists. A function is called differentiable at a point $x = x_0$ if it has a derivative at that point, i.e., if $f'(x_0)$ exists. If we write $x = x_0 + h$, then $h = x - x_0$ and $h$ approaches $0$ if and only if $x$ approaches $x_0$. Therefore, an equivalent way of stating the definition of the derivative is
\[ f'(x_0) = \lim_{x\to x_0} \frac{f(x) - f(x_0)}{x - x_0}. \]

Example: If $f(x) = x^3 - x$, find a formula for $f'(x)$.

Solution:
\[ \begin{aligned} f'(x) &= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \frac{[(x + h)^3 - (x + h)] - [x^3 - x]}{h} \\ &= \lim_{h\to 0} \frac{x^3 + 3x^2 h + 3x h^2 + h^3 - x - h - x^3 + x}{h} \\ &= \lim_{h\to 0} \frac{3x^2 h + 3x h^2 + h^3 - h}{h} \\ &= \lim_{h\to 0} (3x^2 + 3xh + h^2 - 1) = 3x^2 - 1. \end{aligned} \]
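As an informal numerical check of the example above (a sketch only, not a substitute for the limit argument), the difference quotient of $f(x) = x^3 - x$ can be evaluated for shrinking $h$ and compared with $3x^2 - 1$. The function names and the sample point $x_0 = 2$ are illustrative choices, not part of the notes.

```python
def f(x):
    """The function from the example: f(x) = x^3 - x."""
    return x**3 - x


def difference_quotient(func, x, h):
    """Average rate of change of func between x and x + h."""
    return (func(x + h) - func(x)) / h


def derivative_formula(x):
    """The formula obtained from the limit definition: f'(x) = 3x^2 - 1."""
    return 3 * x**2 - 1


if __name__ == "__main__":
    x0 = 2.0  # illustrative sample point (an assumption, any x works)
    for h in (1e-1, 1e-2, 1e-3, 1e-4):
        q = difference_quotient(f, x0, h)
        err = abs(q - derivative_formula(x0))
        print(f"h = {h:g}: quotient = {q:.6f}, error = {err:.2e}")
```

Algebraically the quotient equals $3x^2 + 3xh + h^2 - 1$, so the printed error should shrink roughly in proportion to $h$.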
Example: If $f(x) = \sqrt{x}$, find the derivative of $f$.

Solution:
\[ \begin{aligned} f'(x) &= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \\ &= \lim_{h\to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \cdot \frac{\sqrt{x + h} + \sqrt{x}}{\sqrt{x + h} + \sqrt{x}} \\ &= \lim_{h\to 0} \frac{(x + h) - x}{h(\sqrt{x + h} + \sqrt{x})} \\ &= \lim_{h\to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} = \frac{1}{\sqrt{x} + \sqrt{x}} = \frac{1}{2\sqrt{x}}. \end{aligned} \]

The derivative at $x$ may be denoted by $f'(x)$, $y'$, $\dfrac{dy}{dx}$ or $\dfrac{d}{dx}(f(x))$. The symbol $\dfrac{d}{dx}$ is called the differentiation operator because it indicates the operation of differentiation. The process of finding derivatives of functions is called differentiation. A function $f$ is differentiable at $x_0$ if $f'(x_0)$ exists. It is differentiable on an open interval $(a, b)$ [or $(a, \infty)$ or $(-\infty, a)$ or $(-\infty, \infty)$] if it is differentiable at every number in the interval.

Example: Where is the function $f(x) = |x|$ differentiable?

Solution: If $x > 0$, then $|x| = x$ and we can choose $h$ small enough that $x + h > 0$ and hence $|x + h| = x + h$. Therefore, for $x > 0$,
\[ f'(x) = \lim_{h\to 0} \frac{|x + h| - |x|}{h} = \lim_{h\to 0} \frac{(x + h) - x}{h} = \lim_{h\to 0} \frac{h}{h} = 1, \]
and so $f$ is differentiable for any $x > 0$. Similarly, for $x < 0$, we have $|x| = -x$ and $h$ can be chosen small enough that $x + h < 0$ and so $|x + h| = -(x + h)$. Therefore, for $x < 0$,
\[ f'(x) = \lim_{h\to 0} \frac{|x + h| - |x|}{h} = \lim_{h\to 0} \frac{-(x + h) - (-x)}{h} = \lim_{h\to 0} \frac{-h}{h} = -1, \]
and so $f$ is differentiable for any $x < 0$.

For $x = 0$ we have to investigate
\[ f'(0) = \lim_{h\to 0} \frac{f(0 + h) - f(0)}{h} = \lim_{h\to 0} \frac{|0 + h| - |0|}{h} \quad \text{(if it exists)}. \]
Let us compute the left and right limits separately:
\[ \lim_{h\to 0^+} \frac{|0 + h| - |0|}{h} = \lim_{h\to 0^+} \frac{|h|}{h} = \lim_{h\to 0^+} 1 = 1, \]
\[ \lim_{h\to 0^-} \frac{|0 + h| - |0|}{h} = \lim_{h\to 0^-} \frac{|h|}{h} = \lim_{h\to 0^-} (-1) = -1. \]
Since these limits are different, $f'(0)$ does not exist. Thus, $f$ is differentiable at all $x$ except $0$.

6.1 Differentiation Techniques (Finding Derivatives)

1. $f'(x_0) = \lim_{h\to 0} \dfrac{f(x_0 + h) - f(x_0)}{h}$ if this limit exists.

2. The derivative of any constant function is zero, i.e., $c' = 0$.

3. For any real number $n$, $(x^n)' = n x^{n-1}$. When differentiating, results can be expressed in a number of ways.
For example, (i) if $y = 3x^2$ then $\dfrac{dy}{dx} = 6x$, (ii) if $f(x) = 3x^2$ then $f'(x) = 6x$, (iii) the differential coefficient of $3x^2$ is $6x$.

For example, if $f(x) = \sqrt{x} = x^{1/2}$, then $f'(x) = \frac{1}{2} x^{\frac{1}{2} - 1} = \frac{1}{2} x^{-1/2} = \dfrac{1}{2\sqrt{x}}$.

You Try It: Using the general rule, differentiate the following with respect to $x$: (a) $f(x) = 5x^7$ (b) $f(x) = \dfrac{4}{x^2}$.

4. For any constant $c$, $(c f(x))' = c f'(x)$. For example, $(5x^3)' = 5(x^3)' = 5(3x^2) = 15x^2$.

5. The derivative of a sum (difference) is the sum (difference) of the derivatives, $(f(x) \pm g(x))' = f'(x) \pm g'(x)$. For example, $(3x^5 - 2x^2 + 1)' = (3x^5)' - (2x^2)' + 1' = 3(x^5)' - 2(x^2)' + 0 = 3(5x^4) - 2(2x) = 15x^4 - 4x$.

6. Product Rule: $(f(x)g(x))' = f(x)g'(x) + g(x)f'(x)$.

7. Quotient Rule: $\left(\dfrac{f(x)}{g(x)}\right)' = \dfrac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}$.

8. Parametric Equations. If the coordinates $(x, y)$ of a point $P$ on a curve are given as functions $x = f(u)$ and $y = g(u)$ of a third variable or parameter $u$, the equations $x = f(u)$ and $y = g(u)$ are called parametric equations of the curve. For example, $x = \frac{1}{2}t$, $y = 4 - t^2$ or $x = \cos\theta$, $y = 4\sin^2\theta$.

The First Derivative $\dfrac{dy}{dx}$ is given by
\[ \frac{dy}{dx} = \frac{dy/du}{dx/du}. \]
The Second Derivative $\dfrac{d^2y}{dx^2}$ is given by
\[ \frac{d^2y}{dx^2} = \frac{d}{du}\left(\frac{dy}{dx}\right) \cdot \frac{du}{dx}. \]

Example: Find $\dfrac{dy}{dx}$ and $\dfrac{d^2y}{dx^2}$ given $x = \theta - \sin\theta$ and $y = 1 - \cos\theta$.

Solution: Note that $\dfrac{dx}{d\theta} = 1 - \cos\theta$ and $\dfrac{dy}{d\theta} = \sin\theta$, so
\[ \frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta} = \frac{\sin\theta}{1 - \cos\theta}. \]
Also
\[ \frac{d^2y}{dx^2} = \frac{d}{d\theta}\left(\frac{\sin\theta}{1 - \cos\theta}\right) \cdot \frac{d\theta}{dx} = \frac{\cos\theta - 1}{(1 - \cos\theta)^2} \cdot \frac{1}{1 - \cos\theta} = -\frac{1}{(1 - \cos\theta)^2}. \]

Example: Find $\dfrac{dy}{dx}$ and $\dfrac{d^2y}{dx^2}$ given $x = e^t \cos t$ and $y = e^t \sin t$.

Solution: Note that $\dfrac{dx}{dt} = e^t(\cos t - \sin t)$ and $\dfrac{dy}{dt} = e^t(\sin t + \cos t)$, so
\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{\sin t + \cos t}{\cos t - \sin t}. \]
Also
\[ \frac{d^2y}{dx^2} = \frac{d}{dt}\left(\frac{\sin t + \cos t}{\cos t - \sin t}\right) \cdot \frac{dt}{dx} = \frac{2}{(\cos t - \sin t)^2} \cdot \frac{1}{e^t(\cos t - \sin t)} = \frac{2}{e^t(\cos t - \sin t)^3}. \]

Theorem 6.1.1. If $f(x) = c$ is a constant function, then $f'(x) = 0$ for all real numbers $x$.

Proof. Observe that $f'(x) = \lim_{h\to 0} \dfrac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \dfrac{c - c}{h} = 0$.
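Theorem 6.1.2 (Product Rule). If $f$ and $g$ are both differentiable at $x$, then the product function $fg$ is also differentiable at $x$ and $(fg)'(x) = f(x)g'(x) + g(x)f'(x)$.

Proof. By definition,
\[ (fg)'(x) = \lim_{h\to 0} \frac{f(x + h)g(x + h) - f(x)g(x)}{h}. \]
Using the trick of adding and subtracting $f(x + h)g(x)$ in the numerator,
\[ \begin{aligned} (fg)'(x) &= \lim_{h\to 0} \frac{f(x + h)g(x + h) - f(x + h)g(x) + f(x + h)g(x) - f(x)g(x)}{h} \\ &= \lim_{h\to 0} f(x + h) \cdot \lim_{h\to 0} \frac{g(x + h) - g(x)}{h} + g(x) \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} \\ &= f(x)g'(x) + g(x)f'(x), \end{aligned} \]
where $\lim_{h\to 0} f(x + h) = f(x)$ because a function that is differentiable at $x$ is continuous at $x$.

Theorem 6.1.3. If $g$ is differentiable at $x$ and $g(x) \ne 0$, then
\[ \left(\frac{1}{g}\right)'(x) = -\frac{g'(x)}{(g(x))^2}. \]

Proof.
\[ \begin{aligned} \lim_{h\to 0} \frac{\frac{1}{g(x + h)} - \frac{1}{g(x)}}{h} &= \lim_{h\to 0} \frac{g(x) - g(x + h)}{h\, g(x)\, g(x + h)} \\ &= \lim_{h\to 0} \frac{-(g(x + h) - g(x))}{h} \cdot \frac{1}{g(x)g(x + h)} \\ &= \lim_{h\to 0} \frac{-(g(x + h) - g(x))}{h} \cdot \lim_{h\to 0} \frac{1}{g(x)g(x + h)} \\ &= -g'(x) \cdot \frac{1}{(g(x))^2}. \end{aligned} \]

Theorem 6.1.4. If $f$ and $g$ are differentiable at $x$ and $g(x) \ne 0$, then $\dfrac{f}{g}$ is differentiable at $x$ and
\[ \left(\frac{f}{g}\right)'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}. \]

Proof. Since $\dfrac{f}{g} = f \cdot \dfrac{1}{g}$, we have
\[ \begin{aligned} \left(\frac{f}{g}\right)'(x) &= \left(f \cdot \frac{1}{g}\right)'(x) = f'(x)\,\frac{1}{g(x)} + f(x)\left(\frac{1}{g}\right)'(x) \\ &= \frac{f'(x)}{g(x)} + f(x)\left(-\frac{g'(x)}{(g(x))^2}\right) = \frac{f'(x)g(x) - f(x)g'(x)}{(g(x))^2}. \end{aligned} \]

Example: If $f(x) = \dfrac{x^2 - 1}{x^2 + 1}$, then $f'(x) = \dfrac{(x^2 + 1)(2x) - (x^2 - 1)(2x)}{(x^2 + 1)^2} = \dfrac{4x}{(x^2 + 1)^2}$.

Example: If $f(x) = \dfrac{x}{x^2 + 1}$, then $f'(x) = \dfrac{(x^2 + 1) - x(2x)}{(x^2 + 1)^2} = \dfrac{1 - x^2}{(x^2 + 1)^2}$.

Example: If $f(x) = \dfrac{1}{x}$, then $f'(x) = -\dfrac{1}{x^2}$.

Example: If $x = \cos t$ and $y = t \sin t$, find $\dfrac{dy}{dx}$.

Solution: $\dfrac{dy}{dx} = \dfrac{\frac{d}{dt}(t \sin t)}{\frac{d}{dt}(\cos t)} = \dfrac{\sin t + t\cos t}{-\sin t}$.

6.2 Derivatives of Trigonometric Functions

Recall: $\lim_{h\to 0} \dfrac{\sin h}{h} = 1$ and $\lim_{h\to 0} \dfrac{1 - \cos h}{h} = 0$. Then
\[ \begin{aligned} (\sin x)' &= \lim_{h\to 0} \frac{\sin(x + h) - \sin x}{h} = \lim_{h\to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \\ &= \lim_{h\to 0} \frac{\sin x(\cos h - 1) + \cos x \sin h}{h} \\ &= \lim_{h\to 0} \left[-\sin x\,\frac{1 - \cos h}{h} + \cos x\,\frac{\sin h}{h}\right] \\ &= -\sin x\,(0) + \cos x\,(1). \end{aligned} \]
Hence $(\sin x)' = \cos x$.

Example: Find $\dfrac{dy}{dx}$ if $y = x^3 \sin x$.

Solution: $\dfrac{dy}{dx} = (x^3 \sin x)' = x^3(\sin x)' + \sin x\,(x^3)' = x^3\cos x + 3x^2\sin x$.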
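The product-rule result for $y = x^3 \sin x$ can be spot-checked numerically. The sketch below compares the formula $x^3\cos x + 3x^2\sin x$ with a central difference approximation at a few sample points; the helper names, the step size `h` and the sample points are assumptions made for illustration.

```python
import math


def y(x):
    """The function from the example: y = x^3 sin x."""
    return x**3 * math.sin(x)


def dy_product_rule(x):
    """Derivative obtained with the product rule: x^3 cos x + 3x^2 sin x."""
    return x**3 * math.cos(x) + 3 * x**2 * math.sin(x)


def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient, a numerical estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):  # illustrative sample points
        approx = central_difference(y, x)
        exact = dy_product_rule(x)
        print(f"x = {x}: numerical = {approx:.8f}, product rule = {exact:.8f}")
```

Agreement to several decimal places is only evidence, not a proof; the proof is Theorem 6.1.2.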
Example: Determine whether the function
\[ f(x) = \begin{cases} x^3 \sin\dfrac{1}{x}, & \text{if } x \ne 0, \\[4pt] 0, & \text{if } x = 0, \end{cases} \]
is differentiable at $x = 0$.

Solution: Observe that
\[ \lim_{h\to 0} \frac{h^3 \sin\frac{1}{h} - 0}{h} = \lim_{h\to 0} h^2 \sin\frac{1}{h} = 0, \]
so $f'(0) = 0$ exists and $f$ is differentiable at $x = 0$.

Logarithmic Functions. Assume $a > 0$ and $a \ne 1$. If $a^y = x$, then define $y = \log_a x$. Let $\ln x \equiv \log_e x$ ($\ln x$ is called the natural logarithm of $x$).

Basic Properties of Logarithms

1. $\log_a 1 = 0$ (in particular, $\ln 1 = 0$).
2. $\log_a a = 1$ (in particular, $\ln e = 1$).
3. $\log_a uv = \log_a u + \log_a v$.
4. $\log_a \dfrac{u}{v} = \log_a u - \log_a v$.
5. $\log_a u^r = r \log_a u$.

The derivatives of $\ln x$ and $e^x$ are $\dfrac{d}{dx} e^x = e^x$ and $\dfrac{d}{dx} \ln x = \dfrac{1}{x}$. Also $\dfrac{d}{dx}(a^x) = a^x \ln a$, $a > 0$.

Example: Calculate the derivative $\dfrac{d}{dx}[(\sin x + x) \cdot (x^3 - \ln x)]$.

Solution: We know that $\dfrac{d}{dx}\sin x = \cos x$, $\dfrac{d}{dx} x = 1$, $\dfrac{d}{dx} x^3 = 3x^2$ and $\dfrac{d}{dx}\ln x = \dfrac{1}{x}$. Therefore, by the addition rule,
\[ \frac{d}{dx}(\sin x + x) = \frac{d}{dx}\sin x + \frac{d}{dx} x = \cos x + 1 \]
and
\[ \frac{d}{dx}(x^3 - \ln x) = \frac{d}{dx} x^3 - \frac{d}{dx}\ln x = 3x^2 - \frac{1}{x}. \]
Now we may conclude the calculation by applying the product rule:
\[ \begin{aligned} \frac{d}{dx}[(\sin x + x) \cdot (x^3 - \ln x)] &= \frac{d}{dx}(\sin x + x) \cdot (x^3 - \ln x) + (\sin x + x) \cdot \frac{d}{dx}(x^3 - \ln x) \\ &= (\cos x + 1) \cdot (x^3 - \ln x) + (\sin x + x) \cdot \left(3x^2 - \frac{1}{x}\right) \\ &= 4x^3 - 1 + x^3\cos x + 3x^2\sin x - \frac{1}{x}\sin x - \ln x \cos x - \ln x. \end{aligned} \]

You Try It: Calculate the derivative $\dfrac{d}{dx}\left[\sin x \cdot \left(\cos x - \dfrac{x}{e^x + \ln x}\right)\right]$.

6.3 Derivative of a Composition [The Chain Rule]

We calculate the derivative of a composition by
\[ [f \circ g(x)]' = f'(g(x)) \cdot g'(x). \]
If $y = f(u)$ where $u = g(x)$, then
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = f'(u)\,\frac{du}{dx} = f'(g(x))\,g'(x). \]
Similarly, if $y = f(u)$ where $u = g(v)$ and $v = h(x)$, then
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dv} \cdot \frac{dv}{dx}. \]

Example: Calculate the derivative $\dfrac{d}{dx}(\sin(x^3 - x^2))$.

Solution: This is a composition of functions, so we must apply the Chain Rule. It is essential to recognize what function will play the role of $f$ and what function will play the role of $g$. Notice that, if $x$ is the variable, then $x^3 - x^2$ is applied first and $\sin$ is applied next. So it must be that
$g(x) = x^3 - x^2$ and $f(s) = \sin s$. Notice that $\dfrac{d}{ds} f(s) = \cos s$ and $\dfrac{d}{dx} g(x) = 3x^2 - 2x$. Then $\sin(x^3 - x^2) = f \circ g(x)$ and
\[ \begin{aligned} \frac{d}{dx}(\sin(x^3 - x^2)) &= \frac{d}{dx}(f \circ g(x)) = \frac{df}{ds}(g(x)) \cdot \frac{d}{dx} g(x) \\ &= \cos(g(x)) \cdot (3x^2 - 2x) = [\cos(x^3 - x^2)] \cdot (3x^2 - 2x). \end{aligned} \]

Example: Calculate the derivative $\dfrac{d}{dx} \ln\left(\dfrac{x^2}{x - 2}\right)$.

Solution: Let $h(x) = \ln\left(\dfrac{x^2}{x - 2}\right)$. Then $h = f \circ g$, where $f(s) = \ln s$ and $g(x) = \dfrac{x^2}{x - 2}$. So $\dfrac{d}{ds} f(s) = \dfrac{1}{s}$ and $\dfrac{d}{dx} g(x) = \dfrac{(x - 2) \cdot 2x - x^2 \cdot 1}{(x - 2)^2} = \dfrac{x^2 - 4x}{(x - 2)^2}$. As a result,
\[ \begin{aligned} \frac{d}{dx} h(x) &= \frac{d}{dx}(f \circ g) = \frac{df}{ds}(g(x)) \cdot \frac{d}{dx} g(x) = \frac{1}{g(x)} \cdot \frac{x^2 - 4x}{(x - 2)^2} \\ &= \frac{1}{\frac{x^2}{x - 2}} \cdot \frac{x^2 - 4x}{(x - 2)^2} = \frac{x - 4}{x(x - 2)}. \end{aligned} \]

You Try It: Calculate the derivative of $\tan(e^x - x)$.

6.4 Continuity and Differentiation

What is the relationship between continuity and differentiation? It appears that functions that have derivatives must be continuous.

Theorem 6.4.1. If a function $f$ is differentiable at a point $x$, then it is continuous at $x$.

Proof. We want to show that $f$ is continuous at $x$, i.e., $\lim_{t\to x} f(t) = f(x)$ or $\lim_{h\to 0} f(x + h) = f(x)$, where $h = t - x$. It will be sufficient to show that $\lim_{h\to 0} [f(x + h) - f(x)] = 0$. Now,
\[ \begin{aligned} \lim_{h\to 0} [f(x + h) - f(x)] &= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} \cdot h \\ &= \lim_{h\to 0} \frac{f(x + h) - f(x)}{h} \cdot \lim_{h\to 0} h \\ &= f'(x) \cdot 0 = 0, \end{aligned} \]
because $f'(x)$ is finite. Thus $f$ is continuous at $x$.

The converse is false: for example, the function $f(x) = |x|$ is continuous at $x = 0$, but it is not differentiable there.

6.5 Higher Order Derivatives

If $f(x)$ is differentiable in an interval, its derivative is given by $f'(x)$, $y'$ or $\dfrac{dy}{dx}$ where $y = f(x)$. If $f'(x)$ is also differentiable in the interval, its derivative is denoted by $f''(x)$, $y''$ or $\dfrac{d}{dx}\left(\dfrac{dy}{dx}\right) = \dfrac{d^2y}{dx^2}$. Similarly, the $n$th derivative of $f(x)$, if it exists, is denoted by $f^{(n)}$, $y^{(n)}$ or $\dfrac{d^n y}{dx^n}$, where $n$ is called the order of the derivative.

Example: Let $y = f(x) = \frac{1}{2}x^4 - 3x^2 + 1$. Find the first four derivatives.

Solution: Derivative $y' = f'(x) = \dfrac{d}{dx}\left(\frac{1}{2}x^4 - 3x^2 + 1\right) = 2x^3 - 6x$.

Second derivative $y'' = f''(x) = \dfrac{d^2y}{dx^2} = \dfrac{d}{dx}(2x^3 - 6x) = 6x^2 - 6$.
Third derivative $y''' = f'''(x) = \dfrac{d^3y}{dx^3} = \dfrac{d}{dx}(6x^2 - 6) = 12x$.

Fourth derivative $y^{(4)} = f^{(4)}(x) = \dfrac{d^4y}{dx^4} = \dfrac{d}{dx}(12x) = 12$.

6.6 Implicit Differentiation

Compare

1. $x^2 - y^3 = 3 \iff y = \sqrt[3]{x^2 - 3}$.
2. $x^2 + y^2 = 1 \iff y = \pm\sqrt{1 - x^2}$.
3. $x^3 + y^2 = 3xy \iff$ ?

Implicit Functions. A function in which the dependent variable is expressed solely in terms of the independent variable $x$, namely $y = f(x)$, is said to be an explicit function, for example, $y = \frac{1}{2}x^3 - 1$. An equation $f(x, y) = 0$, on perhaps certain restricted ranges of the variables, is said to define $y$ implicitly as a function of $x$.

Example: (a) The equation $xy + x - 2y - 1 = 0$, with $x \ne 2$, defines the function $y = \dfrac{1 - x}{x - 2}$.

(b) The equation $4x^2 + 9y^2 - 36 = 0$ defines the function $y = \frac{2}{3}\sqrt{9 - x^2}$ when $|x| \le 3$ and $y \ge 0$, and the function $y = -\frac{2}{3}\sqrt{9 - x^2}$ when $|x| \le 3$ and $y \le 0$.

The derivative $y'$ may be obtained by one of the following procedures:

1. Solve, when possible, for $y$ and differentiate with respect to $x$.
2. Thinking of $y$ as a function of $x$, differentiate both sides of the given equation with respect to $x$ and solve the resulting relation for $y'$. This differentiation process is known as implicit differentiation.

Example: Find $\dfrac{dy}{dx}$ if $x^2 + y^2 = 4$.

Solution: We differentiate both sides of the equation:
\[ \frac{d}{dx} x^2 + \frac{d}{dx} y^2 = \frac{d}{dx} 4, \qquad 2x + 2y\,\frac{dy}{dx} = 0. \]
Solving for the derivative yields
\[ \frac{dy}{dx} = -\frac{x}{y}. \]

Example: Find $\dfrac{d^2y}{dx^2}$ if $x^2 + y^2 = 4$.

Solution: From the above example, we already know that the first derivative is $\dfrac{dy}{dx} = -\dfrac{x}{y}$. Hence, by the Quotient Rule,
\[ \begin{aligned} \frac{d^2y}{dx^2} &= -\frac{d}{dx}\left(\frac{x}{y}\right) = -\frac{y \cdot 1 - x \cdot \frac{dy}{dx}}{y^2} \\ &= -\frac{y - x\left(-\frac{x}{y}\right)}{y^2} \qquad \text{(substituting for } \tfrac{dy}{dx}\text{)} \\ &= -\frac{y^2 + x^2}{y^3}. \end{aligned} \]
Noting that $x^2 + y^2 = 4$ permits us to write the second derivative as
\[ \frac{d^2y}{dx^2} = -\frac{4}{y^3}. \]

Example: Find $\dfrac{dy}{dx}$ if $\sin y = y\cos 2x$.

Solution:
\[ \begin{aligned} \frac{d}{dx}\sin y &= \frac{d}{dx}(y\cos 2x) \\ \cos y\,\frac{dy}{dx} &= y(-\sin 2x \cdot 2) + \cos 2x\,\frac{dy}{dx} \\ (\cos y - \cos 2x)\,\frac{dy}{dx} &= -2y\sin 2x \\ \frac{dy}{dx} &= -\frac{2y\sin 2x}{\cos y - \cos 2x}. \end{aligned} \]
Example: Find $y'$, given $xy + x - 2y - 1 = 0$.

Solution: We have
\[ x\,\frac{d}{dx}(y) + y\,\frac{d}{dx}(x) + \frac{d}{dx}(x) - 2\,\frac{d}{dx}(y) - \frac{d}{dx}(1) = \frac{d}{dx}(0) \]
or $xy' + y + 1 - 2y' = 0$; then $y' = \dfrac{1 + y}{2 - x}$.

Example: Find $y'$, given $x^2 y - xy^2 + x^2 + y^2 = 0$.

Solution:
\[ \frac{d}{dx}(x^2 y) - \frac{d}{dx}(xy^2) + \frac{d}{dx}(x^2) + \frac{d}{dx}(y^2) = 0, \]
\[ x^2\,\frac{d}{dx}(y) + y\,\frac{d}{dx}(x^2) - x\,\frac{d}{dx}(y^2) - y^2\,\frac{d}{dx}(x) + \frac{d}{dx}(x^2) + \frac{d}{dx}(y^2) = 0. \]
Hence, $x^2 y' + 2xy - 2xyy' - y^2 + 2x + 2yy' = 0$ and $y' = \dfrac{y^2 - 2x - 2xy}{x^2 + 2y - 2xy}$.

Example: Find $y'$ and $y''$, given $x^2 - xy + y^2 = 3$.

Solution: $\dfrac{d}{dx}(x^2) - \dfrac{d}{dx}(xy) + \dfrac{d}{dx}(y^2) = 2x - xy' - y + 2yy' = 0$. So $y' = \dfrac{2x - y}{x - 2y}$. Then
\[ \begin{aligned} y'' &= \frac{(x - 2y)\frac{d}{dx}(2x - y) - (2x - y)\frac{d}{dx}(x - 2y)}{(x - 2y)^2} = \frac{(x - 2y)(2 - y') - (2x - y)(1 - 2y')}{(x - 2y)^2} \\ &= \frac{3xy' - 3y}{(x - 2y)^2} = \frac{3x\left(\frac{2x - y}{x - 2y}\right) - 3y}{(x - 2y)^2} = \frac{6(x^2 - xy + y^2)}{(x - 2y)^3} = \frac{18}{(x - 2y)^3}. \end{aligned} \]

You Try It: Find $y''$, given $x^3 - 3xy + y^3 = 1$.

6.7 Logarithmic Differentiation

Take the natural logarithm ($\ln$) of both sides, differentiate implicitly and solve for $y'$.

Example: Compute $y'$ if $y = \dfrac{x^2\sqrt[3]{7x - 14}}{(1 + x^2)^4}$.

Solution: $y = \dfrac{x^2\sqrt[3]{7x - 14}}{(1 + x^2)^4} \Rightarrow \ln y = \ln\left(\dfrac{x^2\sqrt[3]{7x - 14}}{(1 + x^2)^4}\right)$, so
\[ \begin{aligned} \ln y &= 2\ln x + \frac{1}{3}\ln(7x - 14) - 4\ln(1 + x^2) \\ \frac{1}{y}\,y' &= 2\,\frac{1}{x} + \frac{1}{3}\,\frac{(7x - 14)'}{7x - 14} - 4\,\frac{(1 + x^2)'}{1 + x^2} = \frac{2}{x} + \frac{7}{3(7x - 14)} - \frac{8x}{1 + x^2} \\ y' &= y\left(\frac{2}{x} + \frac{7}{3(7x - 14)} - \frac{8x}{1 + x^2}\right) = \frac{x^2\sqrt[3]{7x - 14}}{(1 + x^2)^4}\left(\frac{2}{x} + \frac{7}{3(7x - 14)} - \frac{8x}{1 + x^2}\right). \end{aligned} \]

6.8 Derivatives of Inverse Trigonometric Functions

If $x = \sin y$, the inverse function is written $y = \sin^{-1} x$ or $y = \arcsin x$. The inverse trigonometric functions are multivalued unless $y$ is restricted to a principal range; for $\sin^{-1} x$ we take $-\frac{\pi}{2} \le y \le \frac{\pi}{2}$, so that $\cos y \ge 0$.

Example: Find the derivative of $y = \sin^{-1} x$.

Solution: We have $\sin y = x$. Differentiating implicitly with respect to $x$,
\[ (\sin y)' = x' \;\Rightarrow\; \cos y \cdot y' = 1 \;\Rightarrow\; y' = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}. \]

Some Derivatives.
\[ \frac{d}{dx}(\cos^{-1} x) = -\frac{1}{\sqrt{1 - x^2}}, \qquad \frac{d}{dx}(\cot^{-1} x) = -\frac{1}{1 + x^2}, \qquad \frac{d}{dx}(\sec^{-1} x) = \frac{1}{x\sqrt{x^2 - 1}}. \]

You Try It: Show that the derivative of $\tan^{-1} x$ is $\dfrac{1}{1 + x^2}$.
Applications of the Derivative

6.9 The Mean Value Theorem

Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists a $c$ in $(a, b)$ at which the tangent line is parallel to the secant line joining the points $(a, f(a))$ and $(b, f(b))$, i.e., at which $f'(c) = \dfrac{f(b) - f(a)}{b - a}$,

OR

If $f(x)$ is continuous in $[a, b]$ and differentiable in $(a, b)$, then there exists a point $c$ in $(a, b)$ such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}, \qquad a < c < b. \]
The word mean in The Mean Value Theorem refers to the mean (or average) rate of change of $f$ in the interval $[a, b]$.

If $f(a) = f(b)$ (in particular, if $f(a) = f(b) = 0$), then the theorem says that there exists a $c$ in $(a, b)$ at which $f'(c) = 0$. The graphs suggest that there must be at least one point on the graph, corresponding to a number $c$ in $(a, b)$, at which the tangent is horizontal. This special case of the Mean Value Theorem is called Rolle's Theorem (after Michel Rolle, a French mathematician, 1652-1719).

Example: Consider $f(x) = \sqrt{x - 1}$ on $[2, 5]$. $f(x)$ is continuous when $x - 1 \ge 0$, i.e., $x \ge 1$. In particular, $f(x)$ is continuous on $[2, 5]$, and $f'(x) = \dfrac{1}{2\sqrt{x - 1}}$, so $f$ is differentiable when $x > 1$. In particular, $f(x)$ is differentiable on $(2, 5)$. Now
\[ \frac{f(b) - f(a)}{b - a} = \frac{f(5) - f(2)}{5 - 2} = \frac{\sqrt{5 - 1} - \sqrt{2 - 1}}{3} = \frac{1}{3}. \]
The Mean Value Theorem asserts that, for some $c$ in $(2, 5)$, $f'(c) = \dfrac{1}{3}$. Let us find it:
\[ f'(x) = \frac{1}{3} \;\Rightarrow\; \frac{1}{2\sqrt{x - 1}} = \frac{1}{3} \;\Rightarrow\; 2\sqrt{x - 1} = 3 \;\Rightarrow\; 4(x - 1) = 9 \;\Rightarrow\; x - 1 = \frac{9}{4} \;\Rightarrow\; x = \frac{13}{4}. \]
Notice that $\dfrac{13}{4}$ is in $(2, 5)$, so we may take $c = \dfrac{13}{4}$.

Example: Show that if $f(x) = \tan x$ on the interval $0 \le x \le k$, where $0 < k < \dfrac{\pi}{2}$, then $\tan k \ge k$.

Solution: By the Mean Value Theorem
\[ \frac{\tan k - \tan 0}{k - 0} = \sec^2 c \]
for some $c \in (0, k)$. But $\sec^2 c \ge 1$ and $\tan 0 = 0$. So
\[ \frac{\tan k}{k} \ge 1 \;\Longrightarrow\; \tan k \ge k. \]

Example: Use The Mean Value Theorem to show that $|\cos a - \cos b| \le |a - b|$.

Solution: The function $\cos x$ is continuous and differentiable for all $x$.
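If $a = b$ the inequality is trivial, so assume $a \ne b$. By the Mean Value Theorem there is a point $c$ between $a$ and $b$ at which
\[ (\cos x)'\big|_{x = c} = \frac{\cos a - \cos b}{a - b}, \qquad \text{so} \qquad |(\cos x)'|_{x = c} = \frac{|\cos a - \cos b|}{|a - b|}, \]
but $(\cos x)' = -\sin x$ and $|(\cos x)'| \le 1$, therefore
\[ \frac{|\cos a - \cos b|}{|a - b|} \le 1 \;\Longrightarrow\; |\cos a - \cos b| \le |a - b|. \]

Example: Prove that $\dfrac{b - a}{1 + b^2} < \tan^{-1} b - \tan^{-1} a < \dfrac{b - a}{1 + a^2}$ for $0 \le a < b$.

Solution: Let $f(x) = \tan^{-1} x$. Since $f'(x) = \dfrac{1}{1 + x^2}$, $f'(c) = \dfrac{1}{1 + c^2}$. By the Mean Value Theorem
\[ \frac{f(b) - f(a)}{b - a} = \frac{\tan^{-1} b - \tan^{-1} a}{b - a} = \frac{1}{1 + c^2}, \qquad a < c < b. \]
Then, from $0 \le a < c < b$, we have
\[ \begin{aligned} a^2 < c^2 < b^2 &\;\Longrightarrow\; 1 + a^2 < 1 + c^2 < 1 + b^2 \\ \frac{1}{1 + a^2} > \frac{1}{1 + c^2} > \frac{1}{1 + b^2} &\;\Longrightarrow\; \frac{1}{1 + b^2} < \frac{1}{1 + c^2} < \frac{1}{1 + a^2} \\ \frac{1}{1 + b^2} < \frac{\tan^{-1} b - \tan^{-1} a}{b - a} < \frac{1}{1 + a^2} &\;\Longrightarrow\; \frac{b - a}{1 + b^2} < \tan^{-1} b - \tan^{-1} a < \frac{b - a}{1 + a^2}. \end{aligned} \]

Example: Use the Mean Value Theorem to prove that if $0 < a < b$, then $1 - \dfrac{a}{b} < \ln\dfrac{b}{a} < \dfrac{b}{a} - 1$. Hence show that $\dfrac{1}{6} < \ln 1.2 < \dfrac{1}{5}$.

Solution: Let $f(x) = \ln x$, so $f'(x) = \dfrac{1}{x}$. By the Mean Value Theorem, there exists $c \in (a, b)$ such that
\[ f'(c) = \frac{1}{c} = \frac{\ln b - \ln a}{b - a}. \]
Then, from $a < c < b$, we have
\[ \begin{aligned} a < c < b &\;\Longrightarrow\; \frac{1}{b} < \frac{1}{c} < \frac{1}{a} \\ \frac{1}{b} < \frac{\ln b - \ln a}{b - a} < \frac{1}{a} &\;\Longrightarrow\; \frac{b - a}{b} < \ln b - \ln a < \frac{b - a}{a} \\ &\;\Longrightarrow\; 1 - \frac{a}{b} < \ln\frac{b}{a} < \frac{b}{a} - 1. \end{aligned} \]
Now, $\ln(1.2) = \ln\dfrac{12}{10} = \ln\dfrac{6}{5}$. Therefore take $a = 5$ and $b = 6$. Substituting in $1 - \dfrac{a}{b} < \ln\dfrac{b}{a} < \dfrac{b}{a} - 1$, we have
\[ 1 - \frac{5}{6} < \ln\frac{6}{5} < \frac{6}{5} - 1 \;\Longrightarrow\; \frac{1}{6} < \ln 1.2 < \frac{1}{5}. \]

6.10 Some Corollaries of The Mean Value Theorem

Corollary 6.10.1. If $f'(x) = 0$ at all points of the interval $(a, b)$, then $f(x)$ must be a constant in the interval.

Proof. Let $x_1 < x_2$ be any two different points in $(a, b)$. By the Mean Value Theorem, for some $c$ with $x_1 < c < x_2$,
\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c) = 0. \]
Thus $f(x_1) = f(x_2)$. Since $x_1$ and $x_2$ are arbitrarily chosen, the function $f(x)$ has the same value at all points in the interval. Thus, $f(x)$ is constant.

Corollary 6.10.2. If $f'(x) > 0$ at all points of the interval $(a, b)$, then $f(x)$ is strictly increasing.

Proof. Let $x_1 < x_2$ be any two different points in $(a, b)$. By the Mean Value Theorem, for some $c$ with $x_1 < c < x_2$,
\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c) > 0. \]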
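Thus $f(x_2) > f(x_1)$ for $x_2 > x_1$ and so $f(x)$ is strictly increasing.

6.11 Indeterminate Forms

An indeterminate form arises when $\lim_{x\to a} \dfrac{f(x)}{g(x)}$ tends to $\dfrac{0}{0}$ or $\dfrac{\infty}{\infty}$ as $x \to a$. Think of the situation $\lim_{x\to a} \dfrac{f(x)}{g(x)} \to \dfrac{0}{0}$ where $f(x)$ and $g(x)$ are differentiable (and therefore continuous, so $f(a) = \lim_{x\to a} f(x) = 0$ and $g(a) = \lim_{x\to a} g(x) = 0$). Then
\[ \begin{aligned} \lim_{x\to a} \frac{f(x)}{g(x)} &= \lim_{x\to a} \frac{f(x) - f(a)}{g(x) - g(a)} \\ &= \lim_{x\to a} \frac{\frac{f(x) - f(a)}{x - a}}{\frac{g(x) - g(a)}{x - a}} \qquad \text{(provided the denominator is not zero)} \\ &= \frac{\lim_{x\to a} \frac{f(x) - f(a)}{x - a}}{\lim_{x\to a} \frac{g(x) - g(a)}{x - a}} = \frac{f'(a)}{g'(a)} = \frac{\lim_{x\to a} f'(x)}{\lim_{x\to a} g'(x)} \qquad \text{(provided } f'(x) \text{ and } g'(x) \text{ are also continuous and } g'(a) \ne 0\text{)}. \end{aligned} \]

Example: $\lim_{x\to 2} \dfrac{x^2 - 4}{x - 2} = \lim_{x\to 2} \dfrac{(x^2 - 4)'}{(x - 2)'} = \lim_{x\to 2} \dfrac{2x}{1} = 2 \cdot 2 = 4$.

Theorem 6.11.1 (L'Hôpital's Rule). If $\lim \dfrac{f(x)}{g(x)}$ is of the type $\dfrac{0}{0}$ or $\dfrac{\infty}{\infty}$, then
\[ \lim \frac{f(x)}{g(x)} = \lim \frac{f'(x)}{g'(x)}, \]
provided the limit on the right exists or is $\pm\infty$.

Examples: (a) $\lim_{x\to 0} \dfrac{\sin 2x}{\sin 5x} = \lim_{x\to 0} \dfrac{2\cos 2x}{5\cos 5x} = \dfrac{2 \cdot 1}{5 \cdot 1} = \dfrac{2}{5}$.

(b) $\lim_{x\to \infty} \dfrac{e^{3x}}{x} = \lim_{x\to \infty} \dfrac{3e^{3x}}{1} = \infty$.

(c) $\lim_{x\to 0} \dfrac{e^x - 1}{x^3} = \lim_{x\to 0} \dfrac{e^x}{3x^2} = \infty$.

The form $\infty - \infty$. A given limit that is not immediately $\dfrac{0}{0}$ or $\dfrac{\infty}{\infty}$ can sometimes be converted to one of these forms by a combination of algebra and a little cleverness.

Example: Evaluate $\lim_{x\to 0} \left(\dfrac{1 + 3x}{\sin x} - \dfrac{1}{x}\right)$.

Solution: We note that $\dfrac{1 + 3x}{\sin x} \to \infty$ and $\dfrac{1}{x} \to \infty$ as $x \to 0^+$. However, after writing the difference as a single fraction, we recognize the form $\dfrac{0}{0}$:
\[ \begin{aligned} \lim_{x\to 0} \left(\frac{1 + 3x}{\sin x} - \frac{1}{x}\right) &= \lim_{x\to 0} \frac{3x^2 + x - \sin x}{x\sin x} \\ &= \lim_{x\to 0} \frac{6x + 1 - \cos x}{x\cos x + \sin x} \\ &= \lim_{x\to 0} \frac{6 + \sin x}{-x\sin x + 2\cos x} = \frac{6 + 0}{0 + 2} = 3. \end{aligned} \]

The form $0 \cdot \infty$. By suitable manipulation, L'Hôpital's Rule can sometimes be applied to the limit form $0 \cdot \infty$.

Example: Evaluate $\lim_{x\to \infty} x\sin\dfrac{1}{x}$.

Solution: Write the given expression as $\lim_{x\to \infty} \dfrac{\sin\frac{1}{x}}{\frac{1}{x}}$ and recognize that we have the form $\dfrac{0}{0}$. Hence,
\[ \lim_{x\to \infty} x\sin\frac{1}{x} = \lim_{x\to \infty} \frac{-x^{-2}\cos\frac{1}{x}}{-x^{-2}} = \lim_{x\to \infty} \cos\frac{1}{x} = 1. \]

The forms $0^0$, $\infty^0$, $1^\infty$.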
Suppose $y = f(x)^{g(x)}$ tends towards $0^0$, $\infty^0$ or $1^\infty$ as $x \to a$ or $x \to \infty$. Taking the natural logarithm of $y$,
\[ \ln y = \ln f(x)^{g(x)} = g(x)\ln f(x), \]
and we see that $\lim_{x\to a} \ln y = \lim_{x\to a} g(x)\ln f(x)$ is of the form $0 \cdot \infty$. If it is assumed that $\lim_{x\to a} \ln y = \ln(\lim_{x\to a} y) = L$, then $\lim_{x\to a} y = e^L$, or
\[ \lim_{x\to a} f(x)^{g(x)} = e^L. \]

Example: Evaluate $\lim_{x\to 0^+} x^{\frac{1}{\ln x}}$.

Solution: The form is $0^0$. Now, if we set $y = x^{\frac{1}{\ln x}}$, then $\ln y = \dfrac{1}{\ln x}\ln x = 1$. Notice we do not need L'Hôpital's Rule in this case since $\lim_{x\to 0^+} \ln y = 1$. Hence, $\lim_{x\to 0^+} y = e^1$ or equivalently $\lim_{x\to 0^+} x^{\frac{1}{\ln x}} = e$.

Example: Evaluate $\lim_{x\to 0} (1 + x)^{\frac{1}{x}}$.

Solution: The limit is of the form $1^\infty$. If $y = (1 + x)^{\frac{1}{x}}$, then $\ln y = \dfrac{1}{x}\ln(1 + x)$. Now, $\lim_{x\to 0} \dfrac{\ln(1 + x)}{x}$ has the form $\dfrac{0}{0}$ and so
\[ \lim_{x\to 0} \frac{\ln(1 + x)}{x} = \lim_{x\to 0} \frac{\frac{1}{1 + x}}{1} = \lim_{x\to 0} \frac{1}{1 + x} = 1. \]
Thus, $\lim_{x\to 0} (1 + x)^{\frac{1}{x}} = e$.

Example: Evaluate $\lim_{x\to \infty} \left(1 - \dfrac{3}{x}\right)^{2x}$.

Solution: The limit form is $1^\infty$. If $y = \left(1 - \dfrac{3}{x}\right)^{2x}$ then $\ln y = 2x\ln\left(1 - \dfrac{3}{x}\right)$. Observe that the form of $\lim_{x\to \infty} 2x\ln\left(1 - \dfrac{3}{x}\right)$ is $\infty \cdot 0$, whereas the form of $\lim_{x\to \infty} \dfrac{2\ln(1 - \frac{3}{x})}{\frac{1}{x}}$ is $\dfrac{0}{0}$. Therefore,
\[ \lim_{x\to \infty} \frac{2\ln(1 - \frac{3}{x})}{\frac{1}{x}} = \lim_{x\to \infty} \frac{2 \cdot \frac{3/x^2}{1 - \frac{3}{x}}}{-\frac{1}{x^2}} = \lim_{x\to \infty} \frac{-6}{1 - \frac{3}{x}} = -6. \]
Finally, we conclude that
\[ \lim_{x\to \infty} \left(1 - \frac{3}{x}\right)^{2x} = e^{-6}. \]

6.12 Tutorial 4

1. Let $f(x) = \dfrac{3 + x}{3 - x}$, $x \ne 3$. Evaluate $f'(2)$ from the definition.

2. Show from the definition that $\dfrac{d}{dx}(\sqrt{x - 1}) = \dfrac{1}{2\sqrt{x - 1}}$.

3. Determine whether each of the following functions is differentiable at $x = 0$.
(i) $f(x) = \begin{cases} x^3\sin\dfrac{1}{x}, & x \ne 0, \\ 0, & x = 0 \end{cases}$ (ii) $f(x) = x|x|$ (iii) $f(x) = \begin{cases} \dfrac{x - |x|}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases}$

4. Show that $f(x) = \cos x$ is differentiable at any $x \in \mathbb{R}$ and that $f'(x) = -\sin x$.

5. Find $\dfrac{dy}{dx}$ for each of the following functions. (i) $y = \sin^3(6x)$ (ii) $y = (ax + b)^m(cx + d)^n$ (iii) $y = \cos(x^2 - 3x + 1)$

6.
Use logarithmic differentiation to find derivatives ofseach of the following functions (x + 1)(x + 2) (i) y = xx , x > 0 (ii) y = xln x , x > 0 (iii) y = 3 2 (x + 1)(x2 + 2) √ √ √ √ (iii) y = √x + 1 x + 2 4 x + 3 (iv) y(x) = y1 (x)y2 (x) · · · yn (x) (v) y = x x , x > 0 3 x4 + 6x2 (8x + 3)5 x (vi) y = (vii) y = (x2 + 1)(x+1) , x > −1 (viii) y = xx 2x 2 2 (2x + 7) 3 (x2 + 1)x x ln x (ix) y = xx (x) y = xcos x (xi) y = x(x+1) (xii) y = (ln x)x (xiii) y = . x2 dy . dx (ii) (x2 + y 2 )6 = x3 − y 3 7. Use implicit differentiation to find (i) (y − 1)2 = 4(x + 2) 8. Find (iii) y 4 − y 2 = 10x − 3 dy if xy 2 + 4y 3 + 3x = 0 at (1, −1). dx d2 y . dx2 (i) 4y 3 = 6x2 + 1 9. Find (ii) xy 4 = 5 (iii) x = 2 + t, y = 1 + t2 10. If y = tan x, prove that y 000 = 2(1 + y 2 )(1 + 3y 2 ). 11. Given y 2 sin x + y = tan−1 x. find y 0 . 12. Find the first derivatives of the following functions (i) y = tan−1 (3x2 ). √ 1 + 1 − x2 . (ii) y = x csc−1 x 78 (iv) y = sin xy 13. Evaluate each of the following limits using L’hopital’s rule. sin x sin x ln x 6x2 − 5x + 7 (i) lim (ii) lim x (iii) lim (iv) lim −x x x→0 x x→0 x→∞ 4x2 − 2x e −e x→∞ ex 1 3x 1 + 3x 1 − (xiii) lim (xiv) lim+ xx (xv) lim x(e x − 1) (ix) lim x→∞ x→∞ x→0 x→0 sin x x 3x + 1 1 1 lim − . x→0 x sin x −1 x 6= 0 e x2 , Prove that f 0 (0) = 0. 14. Let f (x) = 0, x=0 (xvi) 15. Explain why Rolle’s theorem is not applicable for the function f (x) = |x| on the interval [−1, 1]. 16. Verify Rolle’s Theorem for f (x) = x2 (1 − x)2 , 0 ≤ x ≤ 1. 17. Prove that if an an−1 a1 + + ··· + + a0 , n+1 n 2 then the equation an xn + an−1 xn−1 + · · · + a1 x + a0 = 0 has at least one real root between 0 and 1. 18. Find the value of c in Rolle’s Theorem when f (x) = (x − a)m (x − b)n where m and n are positive integers. 19. If f 0 (x) ≤ 0 at all points of (a, b), prove that f (x) is monotonic decreasing in (a, b). Under what conditions is f (x) strictly decreasing in (a, b)? 20. Use the mean value theorem to show that sin x ≤ x and tan x ≥ x. 
Hence show that $\dfrac{\sin x}{x}$ is strictly decreasing on $\left(0, \dfrac{\pi}{2}\right)$.

21. Use the mean value theorem to prove that, if $0 < a < b$, then
\[
1 - \frac{a}{b} < \ln\frac{b}{a} < \frac{b}{a} - 1.
\]
Hence show that $\dfrac{1}{6} < \ln(1.2) < \dfrac{1}{5}$.

22. Use the Mean Value Theorem to prove the following inequalities.
(i) $\ln(1+x) < x$ if $x > 0$.

Chapter 7 Integration

7.1 Anti-derivatives

In this chapter we shall see that an equally important problem is: given a function $f$, find a function whose derivative is the same as $f$. That is, for a given function $f$, we wish to find another function $F$ for which $F'(x) = f(x)$ for all $x$ on some interval.

Definition 7.1.1. A function $F$ is said to be an anti-derivative of a function $f$ if $F'(x) = f(x)$ on some interval.

Example: An anti-derivative of $f(x) = 2x$ is $F(x) = x^{2}$, since $F'(x) = 2x$.

There is always more than one anti-derivative of a function. For instance, in the foregoing example, $F_1(x) = x^{2} - 1$ and $F_2(x) = x^{2} + 10$ are also anti-derivatives of $f(x) = 2x$, since $F_1'(x) = F_2'(x) = f(x)$. Indeed, if $F$ is an anti-derivative of a function $f$, then so is $G(x) = F(x) + C$ for any constant $C$. This is a consequence of the fact that
\[
G'(x) = \frac{d}{dx}\bigl(F(x) + C\bigr) = F'(x) + 0 = F'(x) = f(x).
\]
Thus, $F(x) + C$ stands for a set of functions of which each member has a derivative equal to $f(x)$.

Theorem 7.1.1. If $G'(x) = F'(x)$ for all $x$ in some interval $[a, b]$, then $G(x) = F(x) + C$ for all $x$ in the interval.

Examples:
(a) The most general anti-derivative of $f(x) = 2x$ is $G(x) = x^{2} + C$.
(b) The most general anti-derivative of $f(x) = 2x + 5$ is $G(x) = x^{2} + 5x + C$, since $G'(x) = 2x + 5$.

7.2 Indefinite Integral

For convenience, let us introduce a notation for an anti-derivative of a function. If $F'(x) = f(x)$, we shall represent the most general anti-derivative of $f$ by
\[
\int f(x)\,dx = F(x) + C.
\]
The symbol $\int$ is called an integral sign, and the notation $\int f(x)\,dx$ is called the indefinite integral of $f(x)$ with respect to $x$. The function $f(x)$ is called the integrand.
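The claim in Section 7.1 that every member of the family $F(x) = x^{2} + C$ is an anti-derivative of $f(x) = 2x$ can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `central_difference`, `make_antiderivative` and `max_error` are hypothetical, and the derivative is only approximated by a symmetric difference quotient.

```python
# Sketch: check numerically that F(x) = x**2 + C has derivative f(x) = 2x
# for several constants C. All helper names here are illustrative assumptions.

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)


def f(x):
    """The integrand f(x) = 2x from the example in Section 7.1."""
    return 2 * x


def make_antiderivative(c):
    """Return F(x) = x**2 + c, one member of the family x**2 + C."""
    def F(x):
        return x ** 2 + c
    return F


def max_error(c, points):
    """Largest gap between the approximate F'(x) and f(x) over the points."""
    F = make_antiderivative(c)
    return max(abs(central_difference(F, x) - f(x)) for x in points)


if __name__ == "__main__":
    sample_points = [-2.0, -0.5, 0.0, 1.0, 3.0]
    for c in (-1, 0, 10):
        # The error should be tiny for every choice of the constant C.
        print(f"C = {c:>3}: max |F'(x) - f(x)| = {max_error(c, sample_points):.2e}")
```

The constant $C$ cancels in the difference quotient, which is the numerical counterpart of the term $+\,0$ in the derivative of $F(x) + C$.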
The process of finding an anti-derivative is called anti-differentiation or integration. The number $C$ is called a constant of integration. Just as $\dfrac{d}{dx}(\ )$ denotes differentiation with respect to $x$, the symbol $\int (\ )\,dx$ denotes integration with respect to $x$.

7.3 The Indefinite Integral of a Power

When differentiating the power $x^{n}$, we multiply by the exponent $n$ and decrease the exponent by 1. To find an anti-derivative of $x^{n}$, the reverse of the differentiation rule is: increase the exponent by 1 and divide by the new exponent $n + 1$.

If $n$ is a rational number, then for $n \neq -1$
\[
\int x^{n}\,dx = \frac{x^{n+1}}{n+1} + C.
\]

Proof. Notice that
\[
\frac{d}{dx}\left(\frac{x^{n+1}}{n+1} + C\right) = (n+1)\frac{x^{(n+1)-1}}{n+1} + 0 = x^{n}.
\]

Example: Evaluate (a) $\displaystyle \int x^{6}\,dx$ (b) $\displaystyle \int \frac{1}{x^{5}}\,dx$.

Solution:
(a) $\displaystyle \int x^{6}\,dx = \frac{x^{7}}{7} + C.$
(b) By writing $\dfrac{1}{x^{5}}$ as $x^{-5}$, we have $\displaystyle \int x^{-5}\,dx = \frac{x^{-4}}{-4} + C = -\frac{1}{4x^{4}} + C.$

Example: Evaluate $\displaystyle \int \sqrt{x}\,dx$.

Solution: We first write $\displaystyle \int \sqrt{x}\,dx = \int x^{\frac{1}{2}}\,dx$, and therefore
\[
\int x^{\frac{1}{2}}\,dx = \frac{x^{\frac{3}{2}}}{\frac{3}{2}} + C = \frac{2}{3}x^{\frac{3}{2}} + C.
\]

The following property of indefinite integrals is an immediate consequence of the fact that the derivative of a sum is the sum of the derivatives.

Theorem 7.3.1. If $F'(x) = f(x)$ and $G'(x) = g(x)$, then
\[
\int [f(x) \pm g(x)]\,dx = \int f(x)\,dx \pm \int g(x)\,dx = F(x) \pm G(x) + C.
\]

Example: Evaluate $\displaystyle \int \left(x^{-\frac{1}{2}} + x^{4}\right)dx$.

Solution: We can write
\[
\int \left(x^{-\frac{1}{2}} + x^{4}\right)dx = \int x^{-\frac{1}{2}}\,dx + \int x^{4}\,dx = \frac{x^{\frac{1}{2}}}{\frac{1}{2}} + \frac{x^{5}}{5} + C = 2x^{\frac{1}{2}} + \frac{x^{5}}{5} + C.
\]

Theorem 7.3.2. If $F'(x) = f(x)$, then
\[
\int k f(x)\,dx = k\int f(x)\,dx
\]
for any constant $k$.

The anti-derivative, or indefinite integral, of any finite sum can be obtained by integrating each term.

Example: Evaluate $\displaystyle \int \left(4x - 2x^{-\frac{1}{3}} + \frac{5}{x^{2}}\right)dx$.

Solution: It follows that
\[
\begin{aligned}
\int \left(4x - 2x^{-\frac{1}{3}} + \frac{5}{x^{2}}\right)dx
&= 4\int x\,dx - 2\int x^{-\frac{1}{3}}\,dx + 5\int x^{-2}\,dx \\
&= 4\cdot\frac{x^{2}}{2} - 2\cdot\frac{x^{\frac{2}{3}}}{\frac{2}{3}} + 5\cdot\frac{x^{-1}}{-1} + C \\
&= 2x^{2} - 3x^{\frac{2}{3}} - 5x^{-1} + C.
\end{aligned}
\]

7.4 Some Anti-Differentiation Formulas

1. $\displaystyle \int c f(x)\,dx = c\int f(x)\,dx.$

2. $\displaystyle \int [f(x) \pm g(x)]\,dx = \int f(x)\,dx \pm \int g(x)\,dx.$

3. $\displaystyle \int x^{n}\,dx = \frac{1}{n+1}x^{n+1} + C$ for $n \neq -1$.

4. $\displaystyle \int a\,dx = ax + C.$

5. $\displaystyle \int \cos x\,dx = \sin x + C.$
6. $\displaystyle \int \sin x\,dx = -\cos x + C.$

7. $\displaystyle \int \sec^{2} x\,dx = \tan x + C.$

8. $\displaystyle \int e^{x}\,dx = e^{x} + C.$

9. $\displaystyle \int \frac{1}{x}\,dx = \ln|x| + C.$

10. $\displaystyle \int \frac{1}{\sqrt{1 - x^{2}}}\,dx = \sin^{-1} x + C.$

11. $\displaystyle \int \frac{1}{1 + x^{2}}\,dx = \tan^{-1} x + C.$

Example: Evaluate $\displaystyle \int \left(\frac{5}{x} - 2\sqrt[3]{x^{2}}\right)dx$.

Solution: We may write
\[
\begin{aligned}
\int \left(\frac{5}{x} - 2\sqrt[3]{x^{2}}\right)dx
&= 5\int \frac{1}{x}\,dx - 2\int x^{\frac{2}{3}}\,dx \\
&= 5\ln|x| - 2\cdot\frac{1}{\frac{5}{3}}\,x^{\frac{5}{3}} + C \\
&= 5\ln|x| - \frac{6}{5}x^{\frac{5}{3}} + C.
\end{aligned}
\]

7.5 u-Substitution

The Indefinite Integral of a Power of a Function

Theorem 7.5.1. If $F$ is an anti-derivative of $f$, then
\[
\int f(g(x))\,g'(x)\,dx = F(g(x)) + C.
\]

Example: Evaluate $\displaystyle \int \frac{x}{(4x^{2} + 3)^{6}}\,dx$.

Solution: Let us rewrite the integral as
\[
\int (4x^{2} + 3)^{-6}\,x\,dx
\]
and make the identifications $u = 4x^{2} + 3$ and $du = 8x\,dx$. Then
\[
\begin{aligned}
\int (4x^{2} + 3)^{-6}\,x\,dx
&= \frac{1}{8}\int \underbrace{(4x^{2} + 3)^{-6}}_{u^{-6}}\,\underbrace{8x\,dx}_{du} \\
&= \frac{1}{8}\int u^{-6}\,du \\
&= \frac{1}{8}\cdot\frac{u^{-5}}{-5} + C \\
&= -\frac{1}{40}(4x^{2} + 3)^{-5} + C.
\end{aligned}
\]

Example: Evaluate $\displaystyle \int x(x^{2} + 2)^{3}\,dx$.

Solution: If $u = x^{2} + 2$, then $du = 2x\,dx$. Thus,
\[
\begin{aligned}
\int x(x^{2} + 2)^{3}\,dx
&= \frac{1}{2}\int \underbrace{(x^{2} + 2)^{3}}_{u^{3}}\,\underbrace{2x\,dx}_{du} \\
&= \frac{1}{2}\int u^{3}\,du \\
&= \frac{1}{2}\cdot\frac{u^{4}}{4} + C \\
&= \frac{1}{8}(x^{2} + 2)^{4} + C.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int \sqrt[3]{(7 - 2x^{3})^{4}}\;x^{2}\,dx$.

Example: Evaluate $\displaystyle \int \sin 10x\,dx$.

Solution: If $u = 10x$, we then need $du = 10\,dx$. Accordingly, we write
\[
\begin{aligned}
\int \sin 10x\,dx
&= \frac{1}{10}\int \sin 10x\,(10\,dx) \\
&= \frac{1}{10}\int \sin u\,du \\
&= \frac{1}{10}(-\cos u) + C \\
&= -\frac{1}{10}\cos 10x + C.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int \sec^{2}(1 - 4x)\,dx$.

7.6 Area Under a Graph

As the derivative is motivated by the geometric problem of constructing a tangent to a curve, the historical problem leading to the definition of a definite integral is the problem of finding area. Specifically, we are interested in finding the area $A$ of a region bounded between the $x$-axis, the graph of a non-negative function $y = f(x)$ defined on some interval $[a, b]$, and either
(i) the vertical lines $x = a$ and $x = b$, or
(ii) the $x$-intercepts of the graph.

7.6.1 The Definite Integral

The geometric problems that motivated the development of the integral calculus (determination of lengths, areas, and volumes) arose in the ancient civilizations of Northern Africa.
Where solutions were found, they related to concrete problems such as the measurement of a quantity of grain. Greek philosophers took a more abstract approach. In fact, Eudoxus (around 400 B.C.) and Archimedes (250 B.C.) formulated ideas of integration as we know it today. Integral calculus developed independently, and without an obvious connection to differential calculus. The calculus became a "whole" in the last part of the seventeenth century when Isaac Barrow, Isaac Newton, and Gottfried Wilhelm Leibniz (with help from others) discovered that the integral of a function could be found by asking what was differentiated to obtain that function.

Definition 7.6.1. Let $f$ be a function defined on a closed interval $[a, b]$. Then the definite integral of $f$ from $a$ to $b$, denoted by $\displaystyle \int_a^b f(x)\,dx$, is defined to be
\[
\int_a^b f(x)\,dx = \lim_{n\to\infty}\sum_{i=1}^{n} f(x_i)\,\Delta x,
\]
where $[a, b]$ is divided into $n$ subintervals of width $\Delta x = \dfrac{b-a}{n}$ and $x_i$ is a point in the $i$-th subinterval. The numbers $a$ and $b$ are called the lower and upper limits of integration, respectively. If the limit exists, the function $f$ is said to be integrable on the interval.

7.6.2 Properties of the Definite Integral

The following two results prove to be useful when working with definite integrals.

Theorem 7.6.1. If $f(a)$ exists, then
\[
\int_a^a f(x)\,dx = 0.
\]

Theorem 7.6.2. If $f$ is integrable on $[a, b]$, then
\[
\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx.
\]

Example: By Theorem 7.6.1,
\[
\int_1^1 (x^{3} + 3x)\,dx = 0.
\]

Theorem 7.6.3. Let $f$ and $g$ be integrable functions on $[a, b]$. Then,
(i) $\displaystyle \int_a^b k f(x)\,dx = k\int_a^b f(x)\,dx$, where $k$ is any constant.
(ii) $\displaystyle \int_a^b [f(x) \pm g(x)]\,dx = \int_a^b f(x)\,dx \pm \int_a^b g(x)\,dx.$
(iii) $\displaystyle \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$, where $c$ is any number in $[a, b]$.

The independent variable $x$ in a definite integral is called a dummy variable of integration. The value of the integral does not depend on the symbol used. In other words,
\[
\int_a^b f(x)\,dx = \int_a^b f(r)\,dr = \int_a^b f(s)\,ds = \int_a^b f(t)\,dt
\]
and so on.

Theorem 7.6.4. For any constant $k$,
\[
\int_a^b k\,dx = k\int_a^b dx = k(b - a).
\]

Theorem 7.6.5.
Let $f$ be integrable on $[a, b]$ with $f(x) \geq 0$ for all $x$ in $[a, b]$. Then
\[
\int_a^b f(x)\,dx \geq 0.
\]

7.7 The Fundamental Theorem of Calculus

In this theorem we shall see that the concept of an anti-derivative of a continuous function provides the bridge between the differential calculus and the integral calculus.

Theorem 7.7.1. Let $f$ be continuous on $[a, b]$ and let $F$ be any function for which $F'(x) = f(x)$. Then
\[
\int_a^b f(x)\,dx = F(b) - F(a).
\]
The difference is usually written $F(x)\Big|_a^b$, that is,
\[
\underbrace{\int_a^b f(x)\,dx}_{\text{definite integral}} = \underbrace{\int f(x)\,dx}_{\text{indefinite integral}}\,\Bigg|_a^b = F(x)\Big|_a^b.
\]

Example: Evaluate $\displaystyle \int_1^3 x\,dx$.

Solution: An anti-derivative of $f(x) = x$ is $F(x) = \dfrac{x^{2}}{2}$. Consequently,
\[
\int_1^3 x\,dx = \frac{x^{2}}{2}\Bigg|_1^3 = \frac{9}{2} - \frac{1}{2} = 4.
\]

7.8 Techniques of Integration

7.8.1 Integration by Parts

Integration by parts is useful for integrands involving products of algebraic and exponential or logarithmic functions, such as $\displaystyle \int x^{2}e^{x}\,dx$ and $\displaystyle \int x\ln x\,dx$. It is the inverse operation of differentiating a product. If $u$ and $v$ are functions of $x$, then
\[
\frac{d}{dx}(uv) = u\frac{dv}{dx} + v\frac{du}{dx}.
\]
Integrating both sides, if $\dfrac{du}{dx}$ and $\dfrac{dv}{dx}$ are continuous, then
\[
\begin{aligned}
uv &= \int u\frac{dv}{dx}\,dx + \int v\frac{du}{dx}\,dx, \\
\int u\frac{dv}{dx}\,dx &= uv - \int v\frac{du}{dx}\,dx, \\
\int u\,dv &= uv - \int v\,du.
\end{aligned}
\]

Example: Evaluate $\displaystyle \int xe^{x}\,dx$.

Solution: Let $u = x$, so $du = dx$, and $dv = e^{x}\,dx \Rightarrow v = \displaystyle \int dv = \int e^{x}\,dx = e^{x}$. Then,
\[
\int xe^{x}\,dx = xe^{x} - \int e^{x}\,dx = xe^{x} - e^{x} + C.
\]

Example: Evaluate $\displaystyle \int x^{2}\ln x\,dx$.

Solution: $x^{2}$ is more easily integrated than $\ln x$, so choose $dv = x^{2}\,dx$. Then,
\[
dv = x^{2}\,dx \Rightarrow v = \int dv = \int x^{2}\,dx = \frac{x^{3}}{3}
\]
and $u = \ln x \Rightarrow du = \dfrac{1}{x}\,dx$. Therefore,
\[
\begin{aligned}
\int x^{2}\ln x\,dx
&= \frac{x^{3}}{3}\ln x - \int \frac{x^{3}}{3}\cdot\frac{1}{x}\,dx \\
&= \frac{x^{3}}{3}\ln x - \frac{1}{3}\int x^{2}\,dx \\
&= \frac{x^{3}}{3}\ln x - \frac{x^{3}}{9} + C.
\end{aligned}
\]

Example: Evaluate $\displaystyle \int \ln x\,dx$.

Solution: Choose $dv = dx \Rightarrow v = \displaystyle \int dv = \int dx = x$ and $u = \ln x \Rightarrow du = \dfrac{1}{x}\,dx$. Therefore,
\[
\begin{aligned}
\int \ln x\,dx
&= x\ln x - \int x\cdot\frac{1}{x}\,dx \\
&= x\ln x - \int dx \\
&= x\ln x - x + C.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int x\tan^{-1} x\,dx$.

7.9 Using Integration by Parts Repeatedly

Example: Evaluate $\displaystyle \int x^{2}e^{x}\,dx$.
Solution: Notice that the derivative of $x^{2}$ becomes simpler, whereas the derivative of $e^{x}$ does not. So let $u = x^{2}$ and $dv = e^{x}\,dx$:
\[
dv = e^{x}\,dx \Rightarrow v = \int dv = \int e^{x}\,dx = e^{x}, \qquad u = x^{2} \Rightarrow du = 2x\,dx.
\]
Integrating by parts one time we get
\[
\int x^{2}e^{x}\,dx = x^{2}e^{x} - \int 2xe^{x}\,dx.
\]
Apply integration by parts a second time, where
\[
dv = e^{x}\,dx \Rightarrow v = \int dv = \int e^{x}\,dx = e^{x}, \qquad u = 2x \Rightarrow du = 2\,dx.
\]
Then,
\[
\begin{aligned}
\int x^{2}e^{x}\,dx
&= x^{2}e^{x} - \int 2xe^{x}\,dx \\
&= x^{2}e^{x} - \left(2xe^{x} - \int 2e^{x}\,dx\right) \\
&= x^{2}e^{x} - 2xe^{x} + 2e^{x} + C \\
&= e^{x}(x^{2} - 2x + 2) + C.
\end{aligned}
\]

7.10 Evaluating a Definite Integral

Example: Evaluate $\displaystyle \int_1^e \ln x\,dx$.

Solution:
\[
\begin{aligned}
\int_1^e \ln x\,dx &= (x\ln x - x)\Big|_1^e \\
&= (e\ln e - e) - (1\cdot\ln 1 - 1) \\
&= (e - e) - (0 - 1) = 1.
\end{aligned}
\]

7.11 Reduction Formulas

These are formulas in which a given integral is expressed in terms of similar integrals of simpler form.

Example: Let $n$ be a positive integer. Use integration by parts to derive the reduction formula
\[
\int x^{n}e^{x}\,dx = x^{n}e^{x} - n\int x^{n-1}e^{x}\,dx + C.
\]

Solution: Let $u = x^{n}$ and $dv = e^{x}\,dx$. Then $du = nx^{n-1}\,dx$ and $v = \displaystyle \int dv = \int e^{x}\,dx = e^{x}$. So
\[
\begin{aligned}
\int x^{n}e^{x}\,dx &= x^{n}e^{x} - \int e^{x}(nx^{n-1})\,dx \\
&= x^{n}e^{x} - n\int x^{n-1}e^{x}\,dx + C.
\end{aligned}
\]
To illustrate the use of the reduction formula we calculate $\displaystyle \int x^{n}e^{x}\,dx$ for $n = 1, 2$.
\[
\begin{aligned}
n = 1:&\quad \int xe^{x}\,dx = xe^{x} - \int e^{x}\,dx = xe^{x} - e^{x} + C. \\
n = 2:&\quad \int x^{2}e^{x}\,dx = x^{2}e^{x} - 2\int xe^{x}\,dx = x^{2}e^{x} - 2xe^{x} + 2e^{x} + C.
\end{aligned}
\]

Example: Derive a reduction formula for $\displaystyle \int \sin^{n} x\,dx$.

Solution: Rewrite the integral as
\[
\int \sin^{n} x\,dx = \int \sin^{n-1} x\,\sin x\,dx.
\]
Take $u = \sin^{n-1} x \Rightarrow du = (n-1)\sin^{n-2} x\cos x\,dx$ and $dv = \sin x\,dx \Rightarrow v = \displaystyle \int \sin x\,dx = -\cos x$. Then,
\[
\begin{aligned}
\int \sin^{n} x\,dx
&= -\sin^{n-1} x\cos x + (n-1)\int \sin^{n-2} x\cos x\cos x\,dx \\
&= -\sin^{n-1} x\cos x + (n-1)\int \sin^{n-2} x\cos^{2} x\,dx \\
&= -\sin^{n-1} x\cos x + (n-1)\int \sin^{n-2} x\,(1 - \sin^{2} x)\,dx \\
&= -\sin^{n-1} x\cos x + (n-1)\int \sin^{n-2} x\,dx - (n-1)\int \sin^{n} x\,dx.
\end{aligned}
\]
Collecting the integrals of $\sin^{n} x$ on the left,
\[
\begin{aligned}
\int \sin^{n} x\,dx + (n-1)\int \sin^{n} x\,dx &= -\sin^{n-1} x\cos x + (n-1)\int \sin^{n-2} x\,dx, \\
n\int \sin^{n} x\,dx &= -\sin^{n-1} x\cos x + (n-1)\int \sin^{n-2} x\,dx, \\
\int \sin^{n} x\,dx &= -\frac{\sin^{n-1} x\cos x}{n} + \frac{n-1}{n}\int \sin^{n-2} x\,dx.
\end{aligned}
\]

7.12 Partial Fractions

This technique involves the decomposition of a rational function into the sum of two or more simpler rational functions. We will consider rational functions (quotients of polynomials) in which the numerator has a lower degree than the denominator. If this condition is not met, we first carry out long division, dividing the denominator into the numerator, until we reduce the problem to an equivalent one involving a fraction in which the numerator has a lower degree than the denominator.

Example: Since
\[
\frac{x+7}{x^{2} - x - 6} = \frac{2}{x-3} - \frac{1}{x+2},
\]
we have
\[
\begin{aligned}
\int \frac{x+7}{x^{2} - x - 6}\,dx
&= \int \left(\frac{2}{x-3} - \frac{1}{x+2}\right)dx \\
&= 2\int \frac{1}{x-3}\,dx - \int \frac{1}{x+2}\,dx \\
&= 2\ln|x-3| - \ln|x+2| + C.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int \frac{2x+1}{(x-1)(x+3)}\,dx$.

7.12.1 Integrating with Repeated Factors

Example: Find $\displaystyle \int \frac{5x^{2} + 20x + 6}{x^{3} + 2x^{2} + x}\,dx$.

Solution: Factorise the denominator as $x(x+1)^{2}$. Then write the partial fraction decomposition as
\[
\frac{5x^{2} + 20x + 6}{x(x+1)^{2}} = \frac{A}{x} + \frac{B}{x+1} + \frac{C}{(x+1)^{2}}.
\]
Substituting, we get $A = 6$, $B = -1$, $C = 9$. Then
\[
\begin{aligned}
\int \frac{5x^{2} + 20x + 6}{x^{3} + 2x^{2} + x}\,dx
&= \int \frac{6}{x}\,dx - \int \frac{1}{x+1}\,dx + \int \frac{9}{(x+1)^{2}}\,dx \\
&= 6\ln|x| - \ln|x+1| + 9\,\frac{(x+1)^{-1}}{-1} + C \\
&= \ln\left|\frac{x^{6}}{x+1}\right| - \frac{9}{x+1} + C.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int \frac{6x-1}{x^{3}(2x-1)}\,dx$.

7.12.2 Integrating an Improper Rational Function

Example: Find $\displaystyle \int \frac{x^{5} + x - 1}{x^{4} - x^{3}}\,dx$.

Solution: This rational function is improper: its numerator has a degree greater than that of its denominator. Carrying out long division, we have
\[
\frac{x^{5} + x - 1}{x^{4} - x^{3}} = x + 1 + \frac{x^{3} + x - 1}{x^{4} - x^{3}}.
\]
Now, applying partial fraction decomposition produces
\[
\frac{x^{3} + x - 1}{x^{3}(x-1)} = \frac{A}{x} + \frac{B}{x^{2}} + \frac{C}{x^{3}} + \frac{D}{x-1}.
\]
We see that $A = 0$, $B = 0$, $C = 1$ and $D = 1$. So now we can integrate:
\[
\begin{aligned}
\int \frac{x^{5} + x - 1}{x^{4} - x^{3}}\,dx
&= \int \left(x + 1 + \frac{x^{3} + x - 1}{x^{4} - x^{3}}\right)dx \\
&= \int \left(x + 1 + \frac{1}{x^{3}} + \frac{1}{x-1}\right)dx \\
&= \frac{x^{2}}{2} + x - \frac{1}{2x^{2}} + \ln|x-1| + C.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int \frac{x^{3} - 2x}{x^{2} + 3x + 2}\,dx$.

7.12.3 Quadratic Factors

Example: Find $\displaystyle \int \frac{dx}{x^{2} - 4x + 5}$.

Solution: Here we cannot find real factors, but we can complete the square, $x^{2} - 4x + 5 = (x-2)^{2} + 1$. Therefore
\[
\int \frac{dx}{x^{2} - 4x + 5} = \int \frac{dx}{(x-2)^{2} + 1} = \tan^{-1}(x-2) + C.
\]

Example: Find $\displaystyle \int \frac{(x+3)\,dx}{x^{2} - 4x + 5}$.

Solution: Completing the square, $\dfrac{x+3}{x^{2} - 4x + 5} = \dfrac{x+3}{(x-2)^{2} + 1}$. Now since $x + 3 = (x - 2) + 5$, we have
\[
\begin{aligned}
\int \frac{(x+3)\,dx}{x^{2} - 4x + 5}
&= \int \frac{(x - 2 + 5)\,dx}{(x-2)^{2} + 1} \\
&= \int \frac{(x-2)\,dx}{(x-2)^{2} + 1} + \int \frac{5\,dx}{(x-2)^{2} + 1} \\
&= \frac{1}{2}\ln\bigl((x-2)^{2} + 1\bigr) + 5\tan^{-1}(x-2) + C.
\end{aligned}
\]

Example: Find $\displaystyle \int \frac{(x+1)\,dx}{x(x^{2} + 1)}$.

Solution: Decompose into partial fractions,
\[
\frac{x+1}{x(x^{2} + 1)} = \frac{A}{x} + \frac{B + Cx}{x^{2} + 1}.
\]
Then $A = 1$, $B = 1$, $C = -1$, and our integral is
\[
\begin{aligned}
\int \frac{(x+1)\,dx}{x(x^{2} + 1)}
&= \int \frac{1}{x}\,dx + \int \frac{dx}{x^{2} + 1} - \int \frac{x\,dx}{x^{2} + 1} \\
&= \ln|x| + \tan^{-1} x - \frac{1}{2}\ln(x^{2} + 1) + C.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int \frac{4x}{(x^{2} + 1)(x^{2} + 2x + 3)}\,dx$.

7.13 Integration of Rational Functions of Sine and Cosine

Integrals of rational expressions that involve $\sin x$ and $\cos x$ can be reduced to integrals of quotients of polynomials by means of the substitution
\[
t = \tan\frac{x}{2}.
\]
It then follows that
\[
\cos x = \frac{1 - t^{2}}{1 + t^{2}}, \qquad \sin x = \frac{2t}{1 + t^{2}} \qquad \text{and} \qquad \frac{dx}{dt} = \frac{2}{1 + t^{2}}.
\]

Example: Evaluate $\displaystyle \int \frac{dx}{2 + 2\sin x + \cos x}$.

Solution: We see that
\[
\int \frac{dx}{2 + 2\sin x + \cos x} = \int \frac{2\,dt}{t^{2} + 4t + 3}.
\]
Since $t^{2} + 4t + 3 = (t+1)(t+3)$, we use partial fractions:
\[
\begin{aligned}
\int \frac{dx}{2 + 2\sin x + \cos x}
&= \int \left(\frac{1}{t+1} - \frac{1}{t+3}\right)dt \\
&= \ln|t+1| - \ln|t+3| + C \\
&= \ln\left|\frac{t+1}{t+3}\right| + C \\
&= \ln\left|\frac{1 + \tan\frac{x}{2}}{3 + \tan\frac{x}{2}}\right| + C.
\end{aligned}
\]

Example: Evaluate $\displaystyle \int \frac{dx}{5 + 3\cos x}$.

Solution: The integral becomes
\[
\begin{aligned}
\int \frac{dx}{5 + 3\cos x}
&= \int \frac{\dfrac{2\,dt}{1 + t^{2}}}{5 + 3\,\dfrac{1 - t^{2}}{1 + t^{2}}}
 = \int \frac{2\,dt}{5(1 + t^{2}) + 3(1 - t^{2})}
 = \int \frac{2\,dt}{8 + 2t^{2}} \\
&= \int \frac{dt}{t^{2} + 4} = \frac{1}{2}\tan^{-1}\frac{t}{2} + C \\
&= \frac{1}{2}\tan^{-1}\left(\frac{1}{2}\tan\frac{x}{2}\right) + C.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int \frac{dx}{5 + 4\cos x}$.

7.14 Integration of Powers of Trigonometric Functions

With the aid of trigonometric identities, it is possible to evaluate integrals of the type
\[
\int \sin^{m} x\cos^{n} x\,dx.
\]
We distinguish two cases.

Case I: $m$ or $n$ is an odd positive integer. Let us first assume that $m$ is an odd positive integer.
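By writing $\sin^{m} x = \sin^{m-1} x\,\sin x$, where $m - 1$ is even, and using $\sin^{2} x = 1 - \cos^{2} x$, the integrand can be expressed as a sum of powers of $\cos x$ times $\sin x$.

Example: Evaluate $\displaystyle \int \sin^{3} x\,dx$.

Solution:
\[
\begin{aligned}
\int \sin^{3} x\,dx &= \int \sin^{2} x\,\sin x\,dx \\
&= \int (1 - \cos^{2} x)\sin x\,dx \\
&= \int \sin x\,dx + \int \cos^{2} x\,(-\sin x)\,dx \\
&= -\cos x + \frac{1}{3}\cos^{3} x + C.
\end{aligned}
\]

Example: Evaluate $\displaystyle \int \sin^{5} x\cos^{2} x\,dx$.

Solution:
\[
\begin{aligned}
\int \sin^{5} x\cos^{2} x\,dx
&= \int \cos^{2} x\,\sin^{4} x\,\sin x\,dx \\
&= \int \cos^{2} x\,(\sin^{2} x)^{2}\sin x\,dx \\
&= \int \cos^{2} x\,(1 - \cos^{2} x)^{2}\sin x\,dx \\
&= \int \cos^{2} x\,(1 - 2\cos^{2} x + \cos^{4} x)\sin x\,dx \\
&= -\int \cos^{2} x\,(-\sin x)\,dx + 2\int \cos^{4} x\,(-\sin x)\,dx - \int \cos^{6} x\,(-\sin x)\,dx \\
&= -\frac{1}{3}\cos^{3} x + \frac{2}{5}\cos^{5} x - \frac{1}{7}\cos^{7} x + C.
\end{aligned}
\]

If $n$ is an odd positive integer, the procedure for evaluation is the same, except that we seek an integrand that is a sum of powers of $\sin x$ times $\cos x$.

Example: Evaluate $\displaystyle \int \sin^{4} x\cos^{3} x\,dx$.

Solution:
\[
\begin{aligned}
\int \sin^{4} x\cos^{3} x\,dx
&= \int \sin^{4} x\cos^{2} x\cos x\,dx \\
&= \int \sin^{4} x\,(1 - \sin^{2} x)\cos x\,dx \\
&= \int \sin^{4} x\,(\cos x)\,dx - \int \sin^{6} x\,(\cos x)\,dx \\
&= \frac{1}{5}\sin^{5} x - \frac{1}{7}\sin^{7} x + C.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int \sin^{2} x\cos^{3} x\,dx$.

Case II: $m$ and $n$ are both even non-negative integers. When both $m$ and $n$ are even non-negative integers, the evaluation of the integral relies heavily on the identities
\[
\sin x\cos x = \frac{1}{2}\sin 2x, \qquad \sin^{2} x = \frac{1 - \cos 2x}{2}, \qquad \cos^{2} x = \frac{1 + \cos 2x}{2}.
\]

Example: Evaluate $\displaystyle \int \cos^{4} x\,dx$.

Solution:
\[
\begin{aligned}
\int \cos^{4} x\,dx &= \int (\cos^{2} x)^{2}\,dx \\
&= \int \left(\frac{1 + \cos 2x}{2}\right)^{2}dx \\
&= \frac{1}{4}\int (1 + 2\cos 2x + \cos^{2} 2x)\,dx \\
&= \frac{1}{4}\int \left(1 + 2\cos 2x + \frac{1 + \cos 4x}{2}\right)dx \\
&= \frac{1}{4}\int \left(\frac{3}{2} + 2\cos 2x + \frac{1}{2}\cos 4x\right)dx \\
&= \frac{3}{8}x + \frac{1}{4}\sin 2x + \frac{1}{32}\sin 4x + C.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int \sin^{2} x\cos^{2} x\,dx$.

Instead of $\sin^{8} x\cos^{6} x$, suppose you have $\sin 8x\cos 6x$.

Example: Find $\displaystyle \int_0^{2\pi} \sin 8x\cos 6x\,dx$. More generally, find $\displaystyle \int_0^{2\pi} \sin px\cos qx\,dx$.

Solution: Use the identity
\[
\sin px\cos qx = \frac{1}{2}\sin(p+q)x + \frac{1}{2}\sin(p-q)x.
\]
Thus $\sin 8x\cos 6x = \dfrac{1}{2}\sin 14x + \dfrac{1}{2}\sin 2x$. Separated like this, the sines are easy to integrate:
\[
\int_0^{2\pi} \sin 8x\cos 6x\,dx = \left[-\frac{\cos 14x}{28} - \frac{\cos 2x}{4}\right]_0^{2\pi} = 0.
\]
With two sines or two cosines, the addition formulas give
\[
\begin{aligned}
\sin px\sin qx &= -\frac{1}{2}\cos(p+q)x + \frac{1}{2}\cos(p-q)x, \\
\cos px\cos qx &= \frac{1}{2}\cos(p+q)x + \frac{1}{2}\cos(p-q)x.
\end{aligned}
\]

You Try It: Evaluate $\displaystyle \int \sin 2x\sin 4x\,dx$.

7.15 Tutorial 5

1. Evaluate the following.
(i) $\displaystyle \int (x+2)\sin(x^{2} + 4x - 6)\,dx$ (ii) $\displaystyle \int x^{2}e^{\sin x^{3}}\cos x^{3}\,dx$ (iii) $\displaystyle \int \frac{\tan^{-1} x}{1 + x^{2}}\,dx$ (iv) $\displaystyle \int \frac{dx}{x(\ln x)^{4}}$ (v) $\displaystyle \int x^{2}\sqrt{2x^{3} - 4}\,dx$

2. Integrate
(i) $\dfrac{1}{(x^{2} + 1)(x + 1)}$ (ii) $\dfrac{x + 5}{(2x - 1)(x + 3)}$ (iii) $\dfrac{2x - 5}{(x^{2} + 4)(x + 6)}$.

3. Evaluate each of the following integrals by using an appropriate substitution.
(i) $\displaystyle \int \frac{x^{5}\,dx}{\sqrt{x^{3} + 1}}$ (ii) $\displaystyle \int x^{7}\sqrt{x^{4} + 1}\,dx$ (iii) $\displaystyle \int \sqrt{\frac{1 + x}{1 - x}}\,dx$

4. Evaluate each of the following integrals.
(i) $\displaystyle \int \sin^{3} x\,dx$ (ii) $\displaystyle \int \sin x\cos x\,dx$ (iii) $\displaystyle \int \sin^{3} x\cos^{3} x\,dx$ (iv) $\displaystyle \int \sin 6x\cos 3x\,dx$

5. Use a suitable substitution to evaluate the following integrals.
(i) $\displaystyle \int \sec\theta\,d\theta$ (ii) $\displaystyle \int \frac{d\theta}{1 + \sin\theta}$ (iii) $\displaystyle \int \frac{\sin\theta}{2 + \sin\theta}\,d\theta$ (iv) $\displaystyle \int \frac{d\theta}{a + b\cos\theta}$ if $a > b > 0$.

6. Show that $\displaystyle \int \sec x\,dx = \ln|\sec x + \tan x| + C = \ln\left|\tan\left(\frac{1}{4}\pi + \frac{1}{2}x\right)\right| + C$.

7. Evaluate each of the following integrals.
(i) $\displaystyle \int xe^{3x}\,dx$ (ii) $\displaystyle \int e^{x}\sin x\,dx$ (iii) $\displaystyle \int \sin^{-1} x\,dx$ (iv) $\displaystyle \int e^{ax}\cos px\,dx$ (v) $\displaystyle \int x^{3}\cos x\,dx$.

8. (a) Show that $\displaystyle \int x^{m}\ln x\,dx = x^{m+1}\left[\frac{\ln x}{m+1} - \frac{1}{(m+1)^{2}}\right] + C$, $m \neq -1$.
(b) Show that $\displaystyle \int \sin^{n} x\,dx = \frac{-\sin^{n-1} x\cos x}{n} + \frac{n-1}{n}\int \sin^{n-2} x\,dx$, $n \neq 0$.
(d) Show that $\displaystyle \int \sin^{m} x\cos^{n} x\,dx = -\frac{\sin^{m-1} x\cos^{n+1} x}{m+n} + \frac{m-1}{m+n}\int \sin^{m-2} x\cos^{n} x\,dx$ and so find $\displaystyle \int \sin^{6} x\cos^{5} x\,dx$.
(e) Let $I_n = \displaystyle \int \sec^{n} x\,dx$, $n = 2, 3, \ldots$. Show that $I_n = \dfrac{\sec^{n-2} x\tan x}{n-1} + \dfrac{n-2}{n-1}I_{n-2}$.
(f) Let $I_n = \displaystyle \int (\ln x)^{n}\,dx$, $n = 1, 2, \ldots$. Show that $I_n = x(\ln x)^{n} - nI_{n-1}$.
(h) Let $I_n = \displaystyle \int (1 + ax^{2})^{n}\,dx$. Show that $(2n+1)I_n = 2nI_{n-1} + x(1 + ax^{2})^{n}$.

Chapter 8 Functions of Several Variables

So far we have dealt only with functions of a single (independent) variable. Many familiar quantities, however, are functions of two or more variables.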
For instance, the work done by a force ($W = FD$) and the volume of a right circular cylinder ($V = \pi r^{2}h$) are both functions of two variables. The volume of a rectangular solid ($V = lwh$) is a function of three variables. The notation for a function of two or three variables is as follows:
\[
z = \underbrace{f(x, y)}_{\text{2 variables}} = x^{2} + xy \qquad \text{and} \qquad w = \underbrace{f(x, y, z)}_{\text{3 variables}} = x + 2y - 3z.
\]

8.1 Definition of a Function of Two Variables

Let $D$ be a set of ordered pairs of real numbers. If to each ordered pair $(x, y)$ in $D$ there corresponds a unique real number $f(x, y)$, then $f$ is called a function of $x$ and $y$. The set $D$ is the domain of $f$ and the corresponding set of values for $f(x, y)$ is the range of $f$.

For the function given by $z = f(x, y)$, we call $x$ and $y$ the independent variables and $z$ the dependent variable. As with functions of one variable, the most common way to describe a function of several variables is with an equation, and unless otherwise restricted, we can assume that the domain is the set of all points for which the equation is defined. For example, the domain of the function given by $f(x, y) = x^{2} + y^{2}$ is assumed to be the entire $xy$-plane.

Example 8.1.1. Find the domains of the following functions.
(i) $f(x, y) = \dfrac{\sqrt{x^{2} + y^{2} - 9}}{x}$ (ii) $g(x, y, z) = \dfrac{x}{\sqrt{9 - x^{2} - y^{2} - z^{2}}}$.

Solution: (i) The function $f$ is defined for all points $(x, y)$ such that $x \neq 0$ and $x^{2} + y^{2} \geq 9$. Thus, the domain is the set of all points lying on or outside the circle $x^{2} + y^{2} = 9$, except those on the $y$-axis.
(ii) The function $g$ is defined for all points $(x, y, z)$ such that $x^{2} + y^{2} + z^{2} < 9$. Consequently, the domain is the set of all points $(x, y, z)$ lying inside a sphere of radius 3 that is centred at the origin.

Functions of several variables can be combined in the same ways as functions of single variables. For instance, we can form the sum, difference, product and quotient of two functions of two variables as follows.

1. $(f \pm g)(x, y) = f(x, y) \pm g(x, y)$ Sum or Difference.

2.
$(fg)(x, y) = f(x, y)\,g(x, y)$ Product.

3. $\dfrac{f}{g}(x, y) = \dfrac{f(x, y)}{g(x, y)}$, $g(x, y) \neq 0$ Quotient.

8.2 The Graph of a Function of Two Variables

As with functions of single variables, we can learn a lot about the behaviour of a function of two variables by sketching its graph. The graph of a function $f$ of two variables is the set of all points $(x, y, z)$ for which $z = f(x, y)$ and $(x, y)$ is in the domain of $f$. The graph can be interpreted as a surface in space.

8.2.1 Level Curves

A second way to visualize a function of two variables is to use a scalar field in which the scalar $z = f(x, y)$ is assigned to the point $(x, y)$. A scalar field can be characterized by level curves (or contour lines) along which the value of $f(x, y)$ is constant. For example, a weather map shows level curves of equal pressure called isobars. In weather maps for which the level curves represent points of equal temperature, the level curves are called isotherms. Another common use of level curves is in representing electrical potential fields; in this type of map, the level curves are called equipotential lines.

8.2.2 Level Surfaces

The concept of a level curve can be extended by one dimension to define a level surface. If $f$ is a function of three variables and $c$ is a constant, then the graph of the equation $f(x, y, z) = c$ is a level surface of the function $f$.

8.3 Limits and Continuity

8.3.1 Neighborhoods in the Plane

In this section, we will study limits and continuity involving functions of two or three variables. We begin our discussion of the limit of a function of two variables by defining a two-dimensional analog to an interval on the real line. Using the formula for the distance between two points $(x, y)$ and $(x_0, y_0)$ in the plane, we can define the $\delta$-neighborhood about $(x_0, y_0)$ to be the disc centered at $(x_0, y_0)$ with radius $\delta > 0$:
\[
\left\{(x, y) : \sqrt{(x - x_0)^{2} + (y - y_0)^{2}} < \delta\right\}.
\]
When this formula contains the less than inequality, $<$, the disc is called open, and when it contains the less than or equal to inequality, $\leq$, the disc is called closed.

[Figure 8.1: A closed disc of radius $\delta$ centered at $(x_0, y_0)$]

A point $(x_0, y_0)$ in a plane region $R$ is an interior point of $R$ if there exists a $\delta$-neighborhood about $(x_0, y_0)$ that lies entirely in $R$. If every point in $R$ is an interior point, then $R$ is an open region. A point $(x_0, y_0)$ is a boundary point of $R$ if every open disc centered at $(x_0, y_0)$ contains points inside $R$ and points outside $R$. If a region contains all its boundary points, then the region is closed.

8.3.2 Limit of a Function of Two Variables

Let $f$ be a function of two variables defined on an open disc centered at $(x_0, y_0)$, except possibly at $(x_0, y_0)$, and let $L$ be a real number. Then
\[
\lim_{(x,y)\to(x_0,y_0)} f(x, y) = L
\]
if for each $\varepsilon > 0$ there corresponds a $\delta > 0$ such that
\[
|f(x, y) - L| < \varepsilon \quad \text{whenever} \quad 0 < \sqrt{(x - x_0)^{2} + (y - y_0)^{2}} < \delta.
\]

Definition. A function $f(x, y)$ has a limit $L$ as $(x, y)$ approaches $(x_0, y_0)$ if, given any $\varepsilon > 0$, there exists $\delta > 0$ (depending on $\varepsilon$ and $(x_0, y_0)$) such that whenever $0 < (x - x_0)^{2} + (y - y_0)^{2} < \delta^{2}$, then $|f(x, y) - L| < \varepsilon$.

The definition of the limit of a function of two variables is similar to the definition of the limit of a function of a single variable, yet there is a critical difference. For a function of two variables, the statement $(x, y) \to (x_0, y_0)$ means that the point $(x, y)$ is allowed to approach $(x_0, y_0)$ from any direction. If the value of
\[
\lim_{(x,y)\to(x_0,y_0)} f(x, y)
\]
is not the same for all possible approaches, or paths, to $(x_0, y_0)$, then the limit does not exist. We usually choose convenient paths. Some of these are
1. along the $x$ or $y$ axis,
2. along straight lines,
3. along well-defined curves, for example the parabola $y = x^{2}$ or the cubic $y = x^{3}$.

Example 8.3.1. Show that $\displaystyle \lim_{(x,y)\to(a,b)} x = a$.

Solution: Let $f(x, y) = x$ and $L = a$.
We need to show that for each $\varepsilon > 0$, there exists a $\delta$-neighborhood about $(a, b)$ such that
\[
|f(x, y) - L| = |x - a| < \varepsilon
\]
whenever $(x, y) \neq (a, b)$ lies in the neighborhood. We can observe that from
\[
0 < \sqrt{(x - a)^{2} + (y - b)^{2}} < \delta
\]
it follows that
\[
|f(x, y) - a| = |x - a| = \sqrt{(x - a)^{2}} \leq \sqrt{(x - a)^{2} + (y - b)^{2}} < \delta.
\]
Thus, we can choose $\delta = \varepsilon$, and the limit is verified.

Example 8.3.2. Prove that $\displaystyle \lim_{(x,y)\to(1,2)} (x^{2} + 2y) = 5$.

Solution: Using the definition of limits, we must show that, given $\varepsilon > 0$, we can find a $\delta > 0$ such that $|x^{2} + 2y - 5| < \varepsilon$ when $0 < |x - 1| < \delta$ and $0 < |y - 2| < \delta$. If $0 < |x - 1| < \delta$ and $0 < |y - 2| < \delta$, then
\[
1 - \delta < x < 1 + \delta \quad \text{and} \quad 2 - \delta < y < 2 + \delta,
\]
excluding $x = 1$ and $y = 2$. Thus, for $\delta \leq 1$,
\[
1 - 2\delta + \delta^{2} < x^{2} < 1 + 2\delta + \delta^{2} \quad \text{and} \quad 4 - 2\delta < 2y < 4 + 2\delta.
\]
Adding,
\[
5 - 4\delta + \delta^{2} < x^{2} + 2y < 5 + 4\delta + \delta^{2} \quad \text{or} \quad -4\delta + \delta^{2} < x^{2} + 2y - 5 < 4\delta + \delta^{2}.
\]
If $\delta \leq 1$, then $\delta^{2} \leq \delta$ and it follows that $-5\delta < x^{2} + 2y - 5 < 5\delta$, i.e., $|x^{2} + 2y - 5| < 5\delta$ whenever $0 < |x - 1| < \delta$ and $0 < |y - 2| < \delta$. Choosing $5\delta = \varepsilon$, i.e., $\delta = \dfrac{\varepsilon}{5}$ or $\delta = 1$, whichever is smaller, it follows that $|x^{2} + 2y - 5| < \varepsilon$ when $0 < |x - 1| < \delta$ and $0 < |y - 2| < \delta$, i.e.,
\[
\lim_{(x,y)\to(1,2)} (x^{2} + 2y) = 5.
\]

Example 8.3.3. Show that the following limit does not exist:
\[
\lim_{(x,y)\to(0,0)} \left(\frac{x^{2} - y^{2}}{x^{2} + y^{2}}\right)^{2}.
\]

Solution: The domain of the function given by $f(x, y) = \left(\dfrac{x^{2} - y^{2}}{x^{2} + y^{2}}\right)^{2}$ consists of all points in the $xy$-plane except for the point $(0, 0)$. To show that the limit as $(x, y)$ approaches $(0, 0)$ does not exist, consider approaching $(0, 0)$ along two different paths. Along the $x$-axis, every point is of the form $(x, 0)$ and the limit along this approach is
\[
\lim_{(x,0)\to(0,0)} \left(\frac{x^{2} - 0^{2}}{x^{2} + 0^{2}}\right)^{2} = \lim_{(x,0)\to(0,0)} (1)^{2} = 1.
\]
However, if $(x, y)$ approaches $(0, 0)$ along the line $y = x$, we obtain
\[
\lim_{(x,x)\to(0,0)} \left(\frac{x^{2} - x^{2}}{x^{2} + x^{2}}\right)^{2} = \lim_{(x,x)\to(0,0)} \left(\frac{0}{2x^{2}}\right)^{2} = 0.
\]
Hence, $f$ does not have a limit as $(x, y) \to (0, 0)$.

Example 8.3.4. Show that $\displaystyle \lim_{(x,y)\to(0,0)} \frac{xy}{x^{2} + y^{2}}$ does not exist.

Solution: The fact that the limits taken along the $x$-axis and the $y$-axis exist and equal zero may lead us to suspect that $\displaystyle \lim_{(x,y)\to(0,0)} f(x, y)$ exists. However, we have not examined every path to $(0, 0)$. We now try any line through the origin given by $y = mx$:
\[
\lim_{(x,y)\to(0,0)} f(x, y) = \lim_{(x,y)\to(0,0)} \frac{mx^{2}}{x^{2} + m^{2}x^{2}} = \frac{m}{1 + m^{2}}.
\]
This limit changes as the gradient $m$ changes. For example,
(i) on $y = 2x$, $\displaystyle \lim_{(x,y)\to(0,0)} f(x, y) = \frac{2}{5}$, and
(ii) on $y = 5x$, $\displaystyle \lim_{(x,y)\to(0,0)} f(x, y) = \frac{5}{26}$.
There is no single number $L$ that we can call the limit of $f$ as $(x, y) \to (0, 0)$. So the limit does not exist.

8.3.3 Limit Combinations

In general, if $\displaystyle \lim_{(x,y)\to(x_0,y_0)} f(x, y) = L_1$ and $\displaystyle \lim_{(x,y)\to(x_0,y_0)} g(x, y) = L_2$, then

1. $\lim [f(x, y) \pm g(x, y)] = \lim f(x, y) \pm \lim g(x, y) = L_1 \pm L_2$.

2. $\lim [f(x, y)\cdot g(x, y)] = (\lim f(x, y))(\lim g(x, y)) = L_1 L_2$.

3. $\lim c f(x, y) = c\lim f(x, y) = cL_1$, where $c$ is any number.

4. $\lim \dfrac{f(x, y)}{g(x, y)} = \dfrac{\lim f(x, y)}{\lim g(x, y)} = \dfrac{L_1}{L_2}$, if $L_2 \neq 0$.

Example 8.3.5. Evaluate $\displaystyle \lim_{(x,y)\to(2,1)} (2x + 5xy - 3y^{2})$.

Solution:
\[
\begin{aligned}
\lim_{(x,y)\to(2,1)} (2x + 5xy - 3y^{2})
&= \lim_{(x,y)\to(2,1)} 2x + \lim_{(x,y)\to(2,1)} 5xy + \lim_{(x,y)\to(2,1)} (-3y^{2}) \\
&= \lim_{x\to 2} 2x + \left(\lim_{x\to 2} 5x\right)\left(\lim_{y\to 1} y\right) - 3\lim_{y\to 1} y^{2} \\
&= 2\lim_{x\to 2} x + 5\left(\lim_{x\to 2} x\right)\left(\lim_{y\to 1} y\right) - 3\left(\lim_{y\to 1} y\right)\left(\lim_{y\to 1} y\right) \\
&= 2\cdot 2 + 5\cdot 2\cdot 1 - 3\cdot 1\cdot 1 = 4 + 10 - 3 = 11.
\end{aligned}
\]

8.4 Continuity of a Function of Two Variables

Notice that the limit of
\[
f(x, y) = \frac{5x^{2}y}{x^{2} + y^{2}}
\]
as $(x, y) \to (1, 2)$ can be evaluated by direct substitution. That is, the limit is $f(1, 2) = 2$. In such cases the function $f$ is said to be continuous at the point $(1, 2)$.

Definition of Continuity of a Function of Two Variables

A function $f$ of two variables is continuous at a point $(x_0, y_0)$ in an open region $R$ if
1. $f$ is defined at $(x_0, y_0)$,
2. $\displaystyle \lim_{(x,y)\to(x_0,y_0)} f(x, y)$ exists, and
3. $\displaystyle \lim_{(x,y)\to(x_0,y_0)} f(x, y) = f(x_0, y_0)$.
The function $f$ is continuous in the open region $R$ if it is continuous at every point in $R$.
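The path test of Example 8.3.4 and the direct substitution used in Section 8.4 can be explored numerically. The sketch below is an illustration only, not part of the original notes: the function names `f`, `g` and `value_along_line` are hypothetical, and sampling a few points suggests but does not prove what a limit is.

```python
# Sketch: compare values of two functions near a point along different paths.
# f is the function of Example 8.3.4; g is the function discussed in Section 8.4.

def f(x, y):
    """f(x, y) = xy / (x^2 + y^2), undefined at the origin."""
    return x * y / (x ** 2 + y ** 2)


def g(x, y):
    """g(x, y) = 5 x^2 y / (x^2 + y^2)."""
    return 5 * x ** 2 * y / (x ** 2 + y ** 2)


def value_along_line(func, m, x):
    """Evaluate func at the point (x, m*x) on the line y = m*x."""
    return func(x, m * x)


if __name__ == "__main__":
    # Along y = m*x the value of f stays at m / (1 + m^2), so it depends on m.
    for m in (0, 1, 2, 5):
        samples = [value_along_line(f, m, x) for x in (1e-1, 1e-3, 1e-6)]
        print(f"m = {m}: samples = {samples}, m/(1+m^2) = {m / (1 + m ** 2)}")

    # Near (1, 2) the values of g settle towards g(1, 2), as continuity requires.
    for h in (1e-1, 1e-3, 1e-6):
        print(f"h = {h}: g(1+h, 2-h) = {g(1 + h, 2 - h)}, g(1, 2) = {g(1, 2)}")
```

For $f$ the sampled value is different on each line through the origin, which is why no single limit exists there, while for $g$ the nearby values approach $g(1, 2)$.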
Properties of Continuous Functions of Two Variables

If $k$ is a real number and $f$ and $g$ are continuous at $(x_0,y_0)$, then the following functions are continuous at $(x_0,y_0)$.

1. Scalar multiple: $kf$.

2. Sum and difference: $f \pm g$.

3. Product: $fg$.

4. Quotient: $f/g$, if $g(x_0,y_0) \neq 0$.

Polynomials and rational functions in two variables are continuous at any point at which they are defined.

8.5 Partial Derivatives

8.5.1 Partial Derivatives of a Function of Two Variables

In applications of functions of several variables, the question often arises, "How will a function be affected by a change in one of its independent variables?" You can answer by considering the independent variables one at a time. The process is called partial differentiation, and the result is referred to as the partial derivative of $f$ with respect to the chosen independent variable.¹

¹ The introduction of partial derivatives followed Newton's and Leibniz's work in calculus by several years. In the decades before 1760, Leonhard Euler and Jean Le Rond d'Alembert (1717-1783) separately published several papers on dynamics, in which they established much of the theory of partial derivatives.

Definition of Partial Derivatives of a Function of Two Variables

If $z = f(x,y)$, then the first partial derivatives of $f$ with respect to $x$ and $y$ are the functions $f_x$ and $f_y$ defined by
\[ f_x(x,y) = \lim_{\Delta x\to 0} \frac{f(x+\Delta x, y) - f(x,y)}{\Delta x}, \qquad f_y(x,y) = \lim_{\Delta y\to 0} \frac{f(x, y+\Delta y) - f(x,y)}{\Delta y}, \]
provided the limits exist.

This definition indicates that if $z = f(x,y)$, then to find $f_x$ we consider $y$ constant and differentiate with respect to $x$. Similarly, to find $f_y$, we consider $x$ constant and differentiate with respect to $y$.

Example 8.5.1. Find $f_x$ and $f_y$ for $f(x,y) = 3x - x^2y^2 + 2x^3y$.

Solution: Considering $y$ to be constant and differentiating with respect to $x$ produces
\[ f_x(x,y) = 3 - 2xy^2 + 6x^2y. \]
Considering $x$ to be constant and differentiating with respect to $y$ produces
\[ f_y(x,y) = -2x^2y + 2x^3. \]

Notation for Partial Derivatives

For $z = f(x,y)$, the partial derivatives $f_x$ and $f_y$ are denoted by
\[ \frac{\partial}{\partial x} f(x,y) = f_x(x,y) = z_x = \frac{\partial z}{\partial x} \quad\text{and}\quad \frac{\partial}{\partial y} f(x,y) = f_y(x,y) = z_y = \frac{\partial z}{\partial y}. \]
The first partials evaluated at the point $(a,b)$ are denoted by
\[ \left.\frac{\partial z}{\partial x}\right|_{(a,b)} = f_x(a,b) \quad\text{and}\quad \left.\frac{\partial z}{\partial y}\right|_{(a,b)} = f_y(a,b). \]

Example 8.5.2. For $f(x,y) = xe^{x^2y}$, find $f_x$ and $f_y$ and evaluate each at the point $(1, \ln 2)$.

Solution: Because
\[ f_x(x,y) = xe^{x^2y}(2xy) + e^{x^2y}, \]
the partial derivative of $f$ with respect to $x$ at $(1,\ln 2)$ is
\[ f_x(1,\ln 2) = e^{\ln 2}(2\ln 2) + e^{\ln 2} = 4\ln 2 + 2. \]
Because
\[ f_y(x,y) = xe^{x^2y}(x^2) = x^3e^{x^2y}, \]
the partial derivative of $f$ with respect to $y$ at $(1,\ln 2)$ is
\[ f_y(1,\ln 2) = e^{\ln 2} = 2. \]

8.5.2 Partial Derivatives of a Function of Three or More Variables

The concept of a partial derivative can be extended naturally to functions of three or more variables. For instance, if $w = f(x,y,z)$, then there are three partial derivatives, each of which is formed by holding two of the three variables constant:
\begin{align*}
\frac{\partial w}{\partial x} &= f_x(x,y,z) = \lim_{\Delta x\to 0} \frac{f(x+\Delta x, y, z) - f(x,y,z)}{\Delta x}, \\
\frac{\partial w}{\partial y} &= f_y(x,y,z) = \lim_{\Delta y\to 0} \frac{f(x, y+\Delta y, z) - f(x,y,z)}{\Delta y}, \\
\frac{\partial w}{\partial z} &= f_z(x,y,z) = \lim_{\Delta z\to 0} \frac{f(x, y, z+\Delta z) - f(x,y,z)}{\Delta z}.
\end{align*}

Example 8.5.3.

(i) To find the partial derivative of $f(x,y,z) = xy + yz^2 + xz$ with respect to $z$, consider $x$ and $y$ to be constant and obtain
\[ \frac{\partial}{\partial z}\left[xy + yz^2 + xz\right] = 2yz + x. \]

(ii) To find the partial derivative of $f(x,y,z) = z\sin(xy^2 + 2z)$ with respect to $z$, consider $x$ and $y$ to be constant. Then, using the Product Rule, we obtain
\begin{align*}
\frac{\partial}{\partial z}\left[z\sin(xy^2+2z)\right] &= (z)\frac{\partial}{\partial z}\left[\sin(xy^2+2z)\right] + \sin(xy^2+2z)\frac{\partial}{\partial z}[z] \\
&= (z)[\cos(xy^2+2z)](2) + \sin(xy^2+2z) \\
&= 2z\cos(xy^2+2z) + \sin(xy^2+2z).
\end{align*}

(iii) To find the partial derivative of $f(x,y,z,w) = \dfrac{x+y+z}{w}$ with respect to $w$, consider $x$, $y$ and $z$ to be constant and obtain
\[ \frac{\partial}{\partial w}\left[\frac{x+y+z}{w}\right] = -\frac{x+y+z}{w^2}. \]

8.5.3 Higher-Order Partial Derivatives

It is possible to take second, third and higher partial derivatives of a function of several variables, provided such derivatives exist. Higher-order derivatives are denoted by the order in which the differentiation occurs. For instance, the function $z = f(x,y)$ has the following second partial derivatives.

1. Differentiate twice with respect to $x$:
\[ \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = f_{xx}. \]

2. Differentiate twice with respect to $y$:
\[ \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = f_{yy}. \]

3. Differentiate first with respect to $x$ and then with respect to $y$:
\[ \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = f_{xy}. \]

4. Differentiate first with respect to $y$ and then with respect to $x$:
\[ \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} = f_{yx}. \]

The third and fourth cases are called mixed partial derivatives.

Example 8.5.4. Find the second partial derivatives of $f(x,y) = 3xy^2 - 2y + 5x^2y^2$ and determine the value of $f_{xy}(-1,2)$.

Solution: Begin by finding the first partial derivatives with respect to $x$ and $y$:
\[ f_x(x,y) = 3y^2 + 10xy^2 \quad\text{and}\quad f_y(x,y) = 6xy - 2 + 10x^2y. \]
Then differentiate each of these with respect to $x$ and $y$:
\[ f_{xx}(x,y) = 10y^2, \quad f_{yy}(x,y) = 6x + 10x^2, \quad f_{xy}(x,y) = 6y + 20xy \quad\text{and}\quad f_{yx}(x,y) = 6y + 20xy. \]
At $(-1,2)$, the value of $f_{xy}$ is $f_{xy}(-1,2) = 12 - 40 = -28$.

Notice that the two mixed partials are equal. Sufficient conditions for this occurrence are given in the next theorem.

Theorem 8.5.1 (Equality of Mixed Partial Derivatives). If $f$ is a function of $x$ and $y$ such that $f_{xy}$ and $f_{yx}$ are continuous on an open disc $R$, then for every $(x,y)$ in $R$,
\[ f_{xy}(x,y) = f_{yx}(x,y). \]
Under these conditions, the order of differentiation of the mixed partial derivatives is irrelevant.

Example 8.5.5. Show that $f_{xz} = f_{zx}$ and $f_{xzz} = f_{zxz} = f_{zzx}$ for the function given by $f(x,y,z) = ye^x + x\ln z$.

Solution: First partials:
\[ f_x(x,y,z) = ye^x + \ln z, \qquad f_z(x,y,z) = \frac{x}{z}. \]
Second partials (note that the first two are equal):
\[ f_{xz}(x,y,z) = \frac{1}{z}, \qquad f_{zx}(x,y,z) = \frac{1}{z}, \qquad f_{zz}(x,y,z) = -\frac{x}{z^2}. \]
Third partials (note that all three are equal):
\[ f_{xzz}(x,y,z) = -\frac{1}{z^2}, \qquad f_{zxz}(x,y,z) = -\frac{1}{z^2}, \qquad f_{zzx}(x,y,z) = -\frac{1}{z^2}. \]

8.6 Total Differential

If $z = f(x,y)$ and $\Delta x$ and $\Delta y$ are increments of $x$ and $y$, then the differentials of the independent variables $x$ and $y$ are $dx = \Delta x$ and $dy = \Delta y$, and the total differential of the dependent variable $z$ is
\[ dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy = f_x(x,y)\,dx + f_y(x,y)\,dy. \]
This definition can be extended to a function of three or more variables. For example, if $w = f(x,y,z,u)$, then $dx = \Delta x$, $dy = \Delta y$, $dz = \Delta z$, $du = \Delta u$, and the total differential of $w$ is
\[ dw = \frac{\partial w}{\partial x}\,dx + \frac{\partial w}{\partial y}\,dy + \frac{\partial w}{\partial z}\,dz + \frac{\partial w}{\partial u}\,du. \]

Example 8.6.1.

(i) The total differential $dz$ for $z = 2x\sin y - 3x^2y^2$ is
\[ dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy = (2\sin y - 6xy^2)\,dx + (2x\cos y - 6x^2y)\,dy. \]

(ii) The total differential $dw$ for $w = x^2 + y^2 + z^2$ is
\[ dw = \frac{\partial w}{\partial x}\,dx + \frac{\partial w}{\partial y}\,dy + \frac{\partial w}{\partial z}\,dz = 2x\,dx + 2y\,dy + 2z\,dz. \]

Theorem 8.6.1. If a function of $x$ and $y$ is differentiable at $(x_0,y_0)$, then it is continuous at $(x_0,y_0)$.

Note that the existence of $f_x$ and $f_y$ is not sufficient to guarantee differentiability.

Example 8.6.2. Show that $f_x(0,0)$ and $f_y(0,0)$ both exist, but that $f$ is not differentiable at $(0,0)$, where $f$ is defined as
\[ f(x,y) = \begin{cases} \dfrac{-3xy}{x^2+y^2}, & (x,y) \neq (0,0), \\[2mm] 0, & (x,y) = (0,0). \end{cases} \]

Solution: You can show that $f$ is not differentiable at $(0,0)$ by showing that it is not continuous at this point. To see that $f$ is not continuous at $(0,0)$, look at the values of $f(x,y)$ along two different approaches to $(0,0)$. Along the line $y = x$, the limit is
\[ \lim_{(x,x)\to(0,0)} f(x,y) = \lim_{(x,x)\to(0,0)} \frac{-3x^2}{2x^2} = -\frac{3}{2}, \]
whereas along $y = -x$ we have
\[ \lim_{(x,-x)\to(0,0)} f(x,y) = \lim_{(x,-x)\to(0,0)} \frac{3x^2}{2x^2} = \frac{3}{2}. \]
Thus, the limit of $f(x,y)$ as $(x,y)\to(0,0)$ does not exist, and we can conclude that $f$ is not continuous at $(0,0)$. Hence, by Theorem 8.6.1, $f$ is not differentiable at $(0,0)$.
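On the other hand, by the definition of the partial derivatives $f_x$ and $f_y$, we have
\[ f_x(0,0) = \lim_{\Delta x\to 0} \frac{f(\Delta x, 0) - f(0,0)}{\Delta x} = \lim_{\Delta x\to 0} \frac{0 - 0}{\Delta x} = 0 \]
and
\[ f_y(0,0) = \lim_{\Delta y\to 0} \frac{f(0, \Delta y) - f(0,0)}{\Delta y} = \lim_{\Delta y\to 0} \frac{0 - 0}{\Delta y} = 0. \]
Thus, the partial derivatives at $(0,0)$ exist.

8.7 Chain Rules for Functions of Several Variables

Theorem 8.7.1. Let $w = f(x,y)$, where $f$ is a differentiable function of $x$ and $y$. If $x = g(t)$ and $y = h(t)$, where $g$ and $h$ are differentiable functions of $t$, then $w$ is a differentiable function of $t$, and
\[ \frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt}. \]

[Figure 8.2: Chain rule: one independent variable. Tree diagram: $w$ branches to $x$ and $y$ (labelled $\partial w/\partial x$, $\partial w/\partial y$), and each branches to $t$ (labelled $dx/dt$, $dy/dt$).]

Example 8.7.1. Let $w = x^2y - y^2$, where $x = \sin t$ and $y = e^t$. Find $\dfrac{dw}{dt}$ at $t = 0$.

Solution: By the Chain Rule for one independent variable, we have
\[ \frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} = 2xy(\cos t) + (x^2 - 2y)e^t. \]
When $t = 0$, we have $x = 0$ and $y = 1$, so it follows that
\[ \frac{dw}{dt} = 0 - 2 = -2. \]

The Chain Rule can be extended to any number of variables. For example, if each $x_i$ is a differentiable function of a single variable $t$, then for $w = f(x_1, x_2, \ldots, x_n)$ we have
\[ \frac{dw}{dt} = \frac{\partial w}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial w}{\partial x_2}\frac{dx_2}{dt} + \cdots + \frac{\partial w}{\partial x_n}\frac{dx_n}{dt}. \]

Another type of composite function is one in which the intermediate variables are themselves functions of more than one variable. For example, if $w = f(x,y)$, where $x = g(s,t)$ and $y = h(s,t)$, then $w$ is a function of $s$ and $t$, and we consider the partial derivatives of $w$ with respect to $s$ and $t$. One way to find these partial derivatives is to write $w$ as a function of $s$ and $t$ explicitly by substituting the equations $x = g(s,t)$ and $y = h(s,t)$ into the equation $w = f(x,y)$, and then to find the partial derivatives in the usual way.

Example 8.7.2. Find $\dfrac{\partial w}{\partial s}$ and $\dfrac{\partial w}{\partial t}$ for $w = 2xy$, where $x = s^2 + t^2$ and $y = \dfrac{s}{t}$.

Solution: Begin by substituting $x = s^2 + t^2$ and $y = \dfrac{s}{t}$ into the equation $w = 2xy$ to obtain
\[ w = 2xy = 2(s^2 + t^2)\left(\frac{s}{t}\right) = 2\left(\frac{s^3}{t} + st\right). \]
Then, to find $\dfrac{\partial w}{\partial s}$, hold $t$ constant and differentiate with respect to $s$:
\[ \frac{\partial w}{\partial s} = 2\left(\frac{3s^2}{t} + t\right) = \frac{6s^2 + 2t^2}{t}. \]
Similarly, to find $\dfrac{\partial w}{\partial t}$, hold $s$ constant and differentiate with respect to $t$ to obtain
\[ \frac{\partial w}{\partial t} = 2\left(-\frac{s^3}{t^2} + s\right) = 2\left(\frac{-s^3 + st^2}{t^2}\right) = \frac{2st^2 - 2s^3}{t^2}. \]

The following theorem gives an alternative method for finding the partial derivatives without explicitly writing $w$ as a function of $s$ and $t$.

Theorem 8.7.2 (Chain Rule: Two Independent Variables). Let $w = f(x,y)$, where $f$ is a differentiable function of $x$ and $y$. If $x = g(s,t)$ and $y = h(s,t)$ such that the first partials $\dfrac{\partial x}{\partial s}$, $\dfrac{\partial x}{\partial t}$, $\dfrac{\partial y}{\partial s}$ and $\dfrac{\partial y}{\partial t}$ all exist, then $\dfrac{\partial w}{\partial s}$ and $\dfrac{\partial w}{\partial t}$ exist and are given by
\[ \frac{\partial w}{\partial s} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial s} \quad\text{and}\quad \frac{\partial w}{\partial t} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t}. \]

Example 8.7.3. Use the Chain Rule to find $\dfrac{\partial w}{\partial s}$ and $\dfrac{\partial w}{\partial t}$ for $w = 2xy$, where $x = s^2 + t^2$ and $y = \dfrac{s}{t}$.

Solution: We can hold $t$ constant and differentiate with respect to $s$ to obtain
\begin{align*}
\frac{\partial w}{\partial s} &= \frac{\partial w}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial s} \\
&= 2y(2s) + 2x\left(\frac{1}{t}\right) \\
&= \frac{4s^2}{t} + \frac{2s^2 + 2t^2}{t} \\
&= \frac{6s^2 + 2t^2}{t}.
\end{align*}

[Figure 8.3: Chain Rule: Two independent variables. Tree diagram: $w$ branches to $x$ and $y$ (labelled $\partial w/\partial x$, $\partial w/\partial y$), and each branches to $s$ and $t$ (labelled $\partial x/\partial s$, $\partial x/\partial t$, $\partial y/\partial s$, $\partial y/\partial t$).]

Similarly, holding $s$ fixed gives
\begin{align*}
\frac{\partial w}{\partial t} &= \frac{\partial w}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t} \\
&= 2y(2t) + 2x\left(\frac{-s}{t^2}\right) \\
&= 2\left(\frac{s}{t}\right)(2t) + 2(s^2 + t^2)\left(\frac{-s}{t^2}\right) \\
&= 4s - \frac{2s^3 + 2st^2}{t^2} \\
&= \frac{4st^2 - 2s^3 - 2st^2}{t^2} \\
&= \frac{2st^2 - 2s^3}{t^2}.
\end{align*}

Example 8.7.4. Find $\dfrac{\partial w}{\partial s}$ and $\dfrac{\partial w}{\partial t}$ when $s = 1$ and $t = 2\pi$ for the function given by $w = xy + yz + xz$, where $x = s\cos t$, $y = s\sin t$ and $z = t$.

Solution: By extending the theorem, we have
\begin{align*}
\frac{\partial w}{\partial s} &= \frac{\partial w}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial s} \\
&= (y+z)(\cos t) + (x+z)(\sin t) + (y+x)(0) \\
&= (y+z)(\cos t) + (x+z)(\sin t).
\end{align*}
When $s = 1$ and $t = 2\pi$, we have $x = 1$, $y = 0$ and $z = 2\pi$. Therefore
\[ \frac{\partial w}{\partial s} = 2\pi(1) + (1 + 2\pi)(0) + 0 = 2\pi. \]
Furthermore,
\begin{align*}
\frac{\partial w}{\partial t} &= \frac{\partial w}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t} \\
&= (y+z)(-s\sin t) + (x+z)(s\cos t) + (y+x)(1),
\end{align*}
and for $s = 1$ and $t = 2\pi$ it follows that
\[ \frac{\partial w}{\partial t} = (0 + 2\pi)(0) + (1 + 2\pi)(1) + (0 + 1)(1) = 2 + 2\pi. \]

Example 8.7.5. If $u = x + y$ and $v = xy$, find $\dfrac{\partial x}{\partial u}$, $\dfrac{\partial x}{\partial v}$, $\dfrac{\partial y}{\partial u}$, $\dfrac{\partial y}{\partial v}$.

Solution: Clearly $du = dx + dy$ and $dv = y\,dx + x\,dy$, and hence $-y\,dx = x\,dy - dv$ and $x\,dx = x\,du - x\,dy$. Adding these two equations yields
\[ (x - y)\,dx = x\,du - dv. \]
Also, $x\,dy = dv - y\,dx$ and $-y\,dy = y\,dx - y\,du$, hence
\[ (x - y)\,dy = dv - y\,du. \]
Thus
\[ dx = \frac{x}{x-y}\,du - \frac{1}{x-y}\,dv \quad\text{and}\quad dy = \frac{1}{x-y}\,dv - \frac{y}{x-y}\,du. \]
Consequently, we have
\[ \frac{\partial x}{\partial u} = \frac{x}{x-y}, \qquad \frac{\partial x}{\partial v} = \frac{-1}{x-y}, \qquad \frac{\partial y}{\partial u} = \frac{-y}{x-y}, \qquad \frac{\partial y}{\partial v} = \frac{1}{x-y}. \]

Example 8.7.6. Parabolic co-ordinates $(u,v)$ are defined implicitly in terms of the Cartesian co-ordinates $(x,y)$ by the pair of equations
\[ x = \frac{u^2 - v^2}{2}, \qquad y = uv. \]
Obtain expressions for $\dfrac{\partial u}{\partial x}$, $\dfrac{\partial u}{\partial y}$, $\dfrac{\partial v}{\partial x}$ and $\dfrac{\partial v}{\partial y}$ in terms of $u$ and $v$, and verify that
\[ \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} = 0. \]
Given that $f(x,y) = \varphi(u,v)$, obtain expressions for $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ in terms of $\dfrac{\partial\varphi}{\partial u}$, $\dfrac{\partial\varphi}{\partial v}$, $u$ and $v$, and deduce that
\[ \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2 = \frac{1}{u^2+v^2}\left[\left(\frac{\partial\varphi}{\partial u}\right)^2 + \left(\frac{\partial\varphi}{\partial v}\right)^2\right]. \]

Solution: Since $x = \dfrac{u^2 - v^2}{2}$ and $y = uv$, we have $dx = u\,du - v\,dv$ and $dy = v\,du + u\,dv$. Multiplying the first by $u$ and the second by $v$, we have
\[ u\,dx = u^2\,du - uv\,dv, \qquad v\,dy = v^2\,du + uv\,dv. \]
On adding them, we obtain
\[ (u^2 + v^2)\,du = u\,dx + v\,dy. \tag{8.1} \]
Also, multiplying the first by $v$ and the second by $u$, we get
\[ v\,dx = uv\,du - v^2\,dv, \qquad u\,dy = uv\,du + u^2\,dv. \]
Subtracting the first of these from the second, we obtain
\[ (u^2 + v^2)\,dv = u\,dy - v\,dx. \tag{8.2} \]
From (8.1) and (8.2), we have
\[ du = \frac{u}{u^2+v^2}\,dx + \frac{v}{u^2+v^2}\,dy \quad\text{and}\quad dv = \frac{u}{u^2+v^2}\,dy - \frac{v}{u^2+v^2}\,dx. \]
Comparing these equations with $du = \dfrac{\partial u}{\partial x}\,dx + \dfrac{\partial u}{\partial y}\,dy$ and $dv = \dfrac{\partial v}{\partial x}\,dx + \dfrac{\partial v}{\partial y}\,dy$, we get
\[ \frac{\partial u}{\partial x} = \frac{u}{u^2+v^2}, \qquad \frac{\partial u}{\partial y} = \frac{v}{u^2+v^2}, \qquad \frac{\partial v}{\partial y} = \frac{u}{u^2+v^2}, \qquad \frac{\partial v}{\partial x} = -\frac{v}{u^2+v^2}. \]
Now
\[ \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} = \frac{-uv}{(u^2+v^2)^2} + \frac{uv}{(u^2+v^2)^2} = 0. \]
Given that $f(x,y) = \varphi(u,v)$, it follows that $u$ and $v$ are implicit functions of $x$ and $y$, and so
\[ \frac{\partial f}{\partial x} = \frac{\partial\varphi}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial\varphi}{\partial v}\frac{\partial v}{\partial x} = \frac{u}{u^2+v^2}\frac{\partial\varphi}{\partial u} - \frac{v}{u^2+v^2}\frac{\partial\varphi}{\partial v} \]
and
\[ \frac{\partial f}{\partial y} = \frac{\partial\varphi}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial\varphi}{\partial v}\frac{\partial v}{\partial y} = \frac{v}{u^2+v^2}\frac{\partial\varphi}{\partial u} + \frac{u}{u^2+v^2}\frac{\partial\varphi}{\partial v}. \]
Now
\begin{align*}
\left(\frac{\partial f}{\partial x}\right)^2 &= \frac{u^2}{(u^2+v^2)^2}\left(\frac{\partial\varphi}{\partial u}\right)^2 - \frac{2uv}{(u^2+v^2)^2}\frac{\partial\varphi}{\partial u}\frac{\partial\varphi}{\partial v} + \frac{v^2}{(u^2+v^2)^2}\left(\frac{\partial\varphi}{\partial v}\right)^2, \\
\left(\frac{\partial f}{\partial y}\right)^2 &= \frac{v^2}{(u^2+v^2)^2}\left(\frac{\partial\varphi}{\partial u}\right)^2 + \frac{2uv}{(u^2+v^2)^2}\frac{\partial\varphi}{\partial u}\frac{\partial\varphi}{\partial v} + \frac{u^2}{(u^2+v^2)^2}\left(\frac{\partial\varphi}{\partial v}\right)^2.
\end{align*}
Hence
\[ \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2 = \frac{1}{u^2+v^2}\left[\left(\frac{\partial\varphi}{\partial u}\right)^2 + \left(\frac{\partial\varphi}{\partial v}\right)^2\right]. \]

Example 8.7.7. Let $w = f(x,y)$, where $x$ and $y$ are given in polar coordinates by the equations $x = r\cos\theta$ and $y = r\sin\theta$. Calculate $\dfrac{\partial w}{\partial r}$, $\dfrac{\partial w}{\partial\theta}$ and $\dfrac{\partial^2 w}{\partial r^2}$ in terms of $r$ and $\theta$ and the partial derivatives of $w$ with respect to $x$ and $y$.

Solution: Here $x$ and $y$ are intermediate variables, while the independent variables are $r$ and $\theta$. First note that
\[ \frac{\partial x}{\partial r} = \cos\theta, \qquad \frac{\partial y}{\partial r} = \sin\theta, \qquad \frac{\partial x}{\partial\theta} = -r\sin\theta \quad\text{and}\quad \frac{\partial y}{\partial\theta} = r\cos\theta. \]
Then
\[ \frac{\partial w}{\partial r} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial r} = \frac{\partial w}{\partial x}\cos\theta + \frac{\partial w}{\partial y}\sin\theta \]
and
\[ \frac{\partial w}{\partial\theta} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial\theta} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial\theta} = -r\frac{\partial w}{\partial x}\sin\theta + r\frac{\partial w}{\partial y}\cos\theta. \]
Next,
\[ \frac{\partial^2 w}{\partial r^2} = \frac{\partial}{\partial r}\left(\frac{\partial w}{\partial r}\right) = \frac{\partial}{\partial r}\left(\frac{\partial w}{\partial x}\cos\theta + \frac{\partial w}{\partial y}\sin\theta\right) = \frac{\partial w_x}{\partial r}\cos\theta + \frac{\partial w_y}{\partial r}\sin\theta, \]
where $w_x = \dfrac{\partial w}{\partial x}$ and $w_y = \dfrac{\partial w}{\partial y}$. Therefore
\begin{align*}
\frac{\partial^2 w}{\partial r^2} &= \left(\frac{\partial w_x}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial w_x}{\partial y}\frac{\partial y}{\partial r}\right)\cos\theta + \left(\frac{\partial w_y}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial w_y}{\partial y}\frac{\partial y}{\partial r}\right)\sin\theta \\
&= \left(\frac{\partial^2 w}{\partial x^2}\cos\theta + \frac{\partial^2 w}{\partial y\,\partial x}\sin\theta\right)\cos\theta + \left(\frac{\partial^2 w}{\partial x\,\partial y}\cos\theta + \frac{\partial^2 w}{\partial y^2}\sin\theta\right)\sin\theta.
\end{align*}
Finally, because $w_{yx} = w_{xy}$, we get
\[ \frac{\partial^2 w}{\partial r^2} = \frac{\partial^2 w}{\partial x^2}\cos^2\theta + 2\frac{\partial^2 w}{\partial x\,\partial y}\cos\theta\sin\theta + \frac{\partial^2 w}{\partial y^2}\sin^2\theta. \]

8.8 Extrema of Functions of Two Variables

8.8.1 Absolute Extrema and Relative Extrema

Theorem 8.8.1 (Extreme Value Theorem). Let $f$ be a continuous function of two variables $x$ and $y$ defined on a closed bounded region $R$ in the $xy$-plane.

1. There is at least one point in $R$ where $f$ takes on a minimum value.

2. There is at least one point in $R$ where $f$ takes on a maximum value.

A minimum value is also called an absolute minimum and a maximum value is also called an absolute maximum.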
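The Extreme Value Theorem guarantees that the absolute extrema exist but does not say where they are. As a rough illustration only, the sketch below samples a continuous function on a grid covering a closed rectangle, boundary included, and reports the smallest and largest sampled values. The function `f`, the rectangle $[0,1]\times[0,1]$ and the helper `grid_extrema` are all hypothetical choices made for this example; grid sampling in general only approximates the true extrema.

```python
def f(x, y):
    # Hypothetical continuous function chosen for illustration.
    return (x - 0.25) ** 2 + (y - 0.5) ** 2


def grid_extrema(func, x_lo, x_hi, y_lo, y_hi, n=4):
    """Sample func on an (n+1) x (n+1) grid over the closed rectangle.

    Returns ((min_value, min_point), (max_value, max_point)) over the
    sampled points. Boundary points are included because the theorem
    concerns a closed region.
    """
    best_min = None
    best_max = None
    for i in range(n + 1):
        x = x_lo + (x_hi - x_lo) * i / n
        for j in range(n + 1):
            y = y_lo + (y_hi - y_lo) * j / n
            value = func(x, y)
            if best_min is None or value < best_min[0]:
                best_min = (value, (x, y))
            if best_max is None or value > best_max[0]:
                best_max = (value, (x, y))
    return best_min, best_max


if __name__ == "__main__":
    lowest, highest = grid_extrema(f, 0.0, 1.0, 0.0, 1.0)
    print("sampled minimum:", lowest)
    print("sampled maximum:", highest)
```

For this particular function the minimum sits at an interior point and the maximum on the boundary (at a corner), which is why the closedness of the region matters.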
Definition of Relative Extrema

Let $f$ be a function defined on a region $R$ containing $(x_0,y_0)$.

1. The function $f$ has a relative minimum at $(x_0,y_0)$ if $f(x,y) \geq f(x_0,y_0)$ for all $(x,y)$ in an open disc containing $(x_0,y_0)$.

2. The function $f$ has a relative maximum at $(x_0,y_0)$ if $f(x,y) \leq f(x_0,y_0)$ for all $(x,y)$ in an open disc containing $(x_0,y_0)$.

To locate relative extrema of $f$, we can investigate the points at which the gradient of $f$ is 0. Such points are called critical points of $f$.

Definition of Critical Point

Let $f$ be defined on an open region $R$ containing $(x_0,y_0)$. The point $(x_0,y_0)$ is a critical point of $f$ if one of the following is true.

1. $f_x(x_0,y_0) = 0$ and $f_y(x_0,y_0) = 0$.

2. $f_x(x_0,y_0)$ or $f_y(x_0,y_0)$ does not exist.

Theorem 8.8.2. If $f$ has a relative extremum at $(x_0,y_0)$ on an open region $R$, then $(x_0,y_0)$ is a critical point of $f$.

Example 8.8.1. Determine the relative extrema of $f(x,y) = 2x^2 + y^2 + 8x - 6y + 20$.

Solution: Begin by finding the critical points of $f$. Because
\[ f_x(x,y) = 4x + 8 \quad\text{and}\quad f_y(x,y) = 2y - 6 \]
are defined for all $x$ and $y$, the only critical points are those for which both first partial derivatives are 0. To locate these points, set $f_x(x,y)$ and $f_y(x,y)$ equal to 0 and solve the system of equations $4x + 8 = 0$ and $2y - 6 = 0$ to obtain the critical point $(-2,3)$. By completing the square, we can conclude that for all $(x,y) \neq (-2,3)$,
\[ f(x,y) = 2(x+2)^2 + (y-3)^2 + 3 > 3. \]
Therefore, a relative minimum of $f$ occurs at $(-2,3)$. The value of the relative minimum is $f(-2,3) = 3$.

The above example shows a relative minimum occurring at one type of critical point, the type for which both $f_x(x,y)$ and $f_y(x,y)$ are 0. The next example concerns a relative maximum that occurs at the other type of critical point, the type for which either $f_x(x,y)$ or $f_y(x,y)$ is undefined.

Example 8.8.2. Determine the relative extrema of $f(x,y) = 1 - (x^2 + y^2)^{1/3}$.
Solution: Because
\[ f_x(x,y) = -\frac{2x}{3(x^2+y^2)^{2/3}} \quad\text{and}\quad f_y(x,y) = -\frac{2y}{3(x^2+y^2)^{2/3}}, \]
it follows that both partial derivatives are defined for all points in the $xy$-plane except for $(0,0)$. Moreover, because the partial derivatives cannot both be 0 unless both $x$ and $y$ are 0, we can conclude that $(0,0)$ is the only critical point. Note that $f(0,0) = 1$. For all other $(x,y)$ it is clear that
\[ f(x,y) = 1 - (x^2+y^2)^{1/3} < 1. \]
Therefore, $f$ has a relative maximum at $(0,0)$.

The Second Partials Test

Some critical points yield saddle points, which are neither relative maxima nor relative minima.

Theorem 8.8.3. Let $f$ have continuous second partial derivatives on an open region containing a point $(a,b)$ for which $f_x(a,b) = 0$ and $f_y(a,b) = 0$. To test for relative extrema of $f$, we define the quantity
\[ d = f_{xx}(a,b)f_{yy}(a,b) - [f_{xy}(a,b)]^2. \]

1. If $d > 0$ and $f_{xx}(a,b) > 0$, then $f$ has a relative minimum at $(a,b)$.

2. If $d > 0$ and $f_{xx}(a,b) < 0$, then $f$ has a relative maximum at $(a,b)$.

3. If $d < 0$, then $(a,b,f(a,b))$ is a saddle point.

4. The test is inconclusive if $d = 0$.

A convenient device for remembering the formula for $d$ in the Second Partials Test is given by the $2\times 2$ determinant
\[ d = \begin{vmatrix} f_{xx}(a,b) & f_{xy}(a,b) \\ f_{yx}(a,b) & f_{yy}(a,b) \end{vmatrix}, \]
where $f_{xy}(a,b) = f_{yx}(a,b)$.

Example 8.8.3. Find the relative extrema of $f(x,y) = -x^3 + 4xy - 2y^2 + 1$.

Solution: Begin by finding the critical points of $f$. Because
\[ f_x(x,y) = -3x^2 + 4y \quad\text{and}\quad f_y(x,y) = 4x - 4y \]
are defined for all $x$ and $y$, the only critical points are those for which both first partial derivatives are 0. Solving the equations $-3x^2 + 4y = 0$ and $4x - 4y = 0$, we see from the second equation that $x = y$, and by substitution into the first equation we obtain two solutions, $y = x = 0$ and $y = x = \frac{4}{3}$.
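Because
\[ f_{xx}(x,y) = -6x, \qquad f_{yy}(x,y) = -4, \qquad f_{xy}(x,y) = 4, \]
it follows that, for the critical point $(0,0)$,
\[ d = f_{xx}(0,0)f_{yy}(0,0) - [f_{xy}(0,0)]^2 = 0 - 16 < 0 \]
and, by the Second Partials Test, we can conclude that $f$ has a saddle point at $(0,0)$. Furthermore, for the critical point $\left(\frac{4}{3},\frac{4}{3}\right)$,
\[ d = f_{xx}\left(\tfrac{4}{3},\tfrac{4}{3}\right)f_{yy}\left(\tfrac{4}{3},\tfrac{4}{3}\right) - \left[f_{xy}\left(\tfrac{4}{3},\tfrac{4}{3}\right)\right]^2 = -8(-4) - 16 = 16 > 0, \]
and because $f_{xx}\left(\frac{4}{3},\frac{4}{3}\right) = -8 < 0$ we can conclude that $f$ has a relative maximum at $\left(\frac{4}{3},\frac{4}{3}\right)$.

8.9 Tutorial 6 ²

² "Everybody is a genius. But if you judge a fish by its ability to climb a tree, it will live its whole life believing that it is stupid." - Albert Einstein

1. If $f(x,y) = x^3 - 2xy + 3y^2$, find (i) $f(-2,3)$ (ii) $f\left(\dfrac{1}{x}, \dfrac{2}{y}\right)$.

2. Use the definition of a limit to show that:
(i) $\lim_{(x,y)\to(4,-1)} (3x - 2y) = 14$
(ii) $\lim_{(x,y)\to(2,1)} (xy - 3x + 4) = 0$
(iii) $\lim_{(x,y)\to(2,1)} (2x + 5xy - 3y^2) = 11$.

3. Let
\[ f(x,y) = \begin{cases} \dfrac{xy^2}{x^2+y^2}, & (x,y) \neq (0,0), \\[2mm] 0, & (x,y) = (0,0). \end{cases} \]
Prove that $\lim_{(x,y)\to(0,0)} f(x,y) = 0$.

4. Use rules of limits to evaluate each of the following limits.
(i) $\lim_{(x,y)\to(2,-1)} (3x^3 - 2xy + 4y^2)$
(ii) $\lim_{(x,y)\to(1,2)} \dfrac{3 - x + y}{4 + x - 2y}$
(iii) $\lim_{(x,y)\to(4,\pi)} x^2\sin\dfrac{y}{x}$
(iv) $\lim_{(x,y)\to(0^+,1^-)} \dfrac{x + y - 1}{\sqrt{x} - \sqrt{1-y}}$.

5. Show that the following limits do not exist:
(i) $\lim_{(x,y)\to(0,0)} \dfrac{x^2 - 3y^2}{x^2 + 2y^2}$
(ii) $\lim_{(x,y)\to(0,0)} \dfrac{xy}{x^2 + y^2}$
(iii) $\lim_{(x,y)\to(0,0)} \dfrac{x^2y}{x^4 + y^2}$
(iv) $\lim_{(x,y)\to(0,0)} \dfrac{2x - y^2}{2x^2 + y}$.

6. Use the definition of partial derivative to find $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$ for $z = 3x^2 - xy + 2y^2 + 3$.

7. If $f(x,y) = \dfrac{x-y}{x+y}$, find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ from the definition.

8. Let $f(x,y) = xe^{x^2+2y^2}$. Find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$, and verify that $\dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial^2 f}{\partial y\,\partial x}$.

9. If $z = x^2\tan^{-1}\dfrac{y}{x}$, find $\dfrac{\partial^2 z}{\partial x\,\partial y}$ at $(1,1)$.

10. Let $f(x,y,z) = \dfrac{1}{\sqrt{x^2+y^2+z^2}}$. Show that $f_{xx} + f_{yy} + f_{zz} = 0$.

11. Find $\dfrac{dw}{dt}$ given that
(i) $w = x^2 + y^2$, $x = e^t$, $y = e^{-t}$
(ii) $w = \ln\dfrac{y}{x}$, $x = \cos t$, $y = \sin t$
(iii) $w = \sqrt{x^2+y^2}$, $x = \sin t$, $y = e^t$.

12. Find $\dfrac{\partial r}{\partial x}$, $\dfrac{\partial r}{\partial y}$ and $\dfrac{\partial r}{\partial z}$.
(i) $r = e^{u+v+w}$ and $u = yz$, $v = xz$, $w = xy$
(ii) $r = uvw - u^2 - v^2 - w^2$ and $u = y + z$, $v = x + z$, $w = x + y$.

13. Let $f(x,y)$ be a function of $x$ and $y$, where $x = e^{u-v}\cos(u+v)$ and $y = e^{u-v}\sin(u+v)$. Show that
\[ \frac{\partial f}{\partial u} + \frac{\partial f}{\partial v} = 2x\frac{\partial f}{\partial y} - 2y\frac{\partial f}{\partial x}. \]

14. Assume that $w = f(u,v)$, where $u = x + y$ and $v = x - y$. Show that
\[ \frac{\partial w}{\partial x}\frac{\partial w}{\partial y} = \left(\frac{\partial w}{\partial u}\right)^2 - \left(\frac{\partial w}{\partial v}\right)^2. \]

15. Suppose that $w = f(x,y)$ and that $x = u + v$ and $y = u - v$. Show that
\[ \frac{\partial^2 w}{\partial x^2} - \frac{\partial^2 w}{\partial y^2} = \frac{1}{2}\left(\frac{\partial^2 w}{\partial u\,\partial v} + \frac{\partial^2 w}{\partial v\,\partial u}\right). \]

16. Let $V = f(x,y)$, where $x$ and $y$ are given in polar co-ordinates by the equations $x = r\cos\theta$ and $y = r\sin\theta$. Show that
\[ \left(\frac{\partial V}{\partial x}\right)^2 + \left(\frac{\partial V}{\partial y}\right)^2 = \left(\frac{\partial V}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial V}{\partial\theta}\right)^2. \]

17. Given $V = f(x,y)$, show that if $x = r\cos\theta$ and $y = r\sin\theta$, then
\[ \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} = \frac{\partial^2 V}{\partial r^2} + \frac{1}{r}\frac{\partial V}{\partial r} + \frac{1}{r^2}\frac{\partial^2 V}{\partial\theta^2}. \]

18. Given that $w = f(x,y)$, $x = e^u\cos v$ and $y = e^u\sin v$, show that
\[ \left(\frac{\partial w}{\partial x}\right)^2 + \left(\frac{\partial w}{\partial y}\right)^2 = e^{-2u}\left[\left(\frac{\partial w}{\partial u}\right)^2 + \left(\frac{\partial w}{\partial v}\right)^2\right]. \]

8.10 Tutorial 7

1. Find the total differential of the following functions.
(i) $z = 3x^2y^3$
(ii) $z = \dfrac{x^2}{y}$
(iii) $z = x\cos y - y\cos x$
(iv) $w = e^x\cos y + z$
(v) $w = x^2yz^2\sin yz$
(vi) $z = \dfrac{1}{2}\left(e^{x^2+y^2} - e^{-x^2-y^2}\right)$
(vii) $z = e^x\sin y$
(viii) $w = \dfrac{x+y}{z-2y}$.

2. Find and classify the stationary points of
(a) $f(x,y) = x^2 - 2xy + 2y^2 - 2x + 2y + 4$
(b) $f(x,y) = 4xy - y^4 - x^4$.

3. (a) Let $f(x,y) = (x^2 - y^2)e^{-x^2-y^2}$. Calculate the partial derivatives $f_x$, $f_y$, $f_{xx}$, $f_{yx}$ and $f_{xy}$.
(b) Hence, find and classify all the critical points of the function $f(x,y)$.

4. Suppose that $w = f(u)$, where $u = \dfrac{x^2 - y^2}{x^2 + y^2}$. Show that $xw_x + yw_y = 0$.

5. Suppose that $w = f(u) + g(v)$, where $u = x - at$ and $v = x + at$. Show that
\[ \frac{\partial^2 w}{\partial t^2} = a^2\frac{\partial^2 w}{\partial x^2}. \]

Chapter 9 Multiple Integration

9.1 Iterated Integrals

In the previous chapter, we saw that it is meaningful to differentiate functions of several variables with respect to one variable while holding the other variables constant. We can integrate functions of several variables by a similar procedure.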
For example, if we have the partial derivative $f_x(x,y) = 2xy$, then by considering $y$ constant, we can integrate with respect to $x$ to obtain
\begin{align*}
f(x,y) &= \int f_x(x,y)\,dx && \text{Integrate with respect to } x \\
&= \int 2xy\,dx && y \text{ is held constant} \\
&= y\int 2x\,dx && \text{Factor out the constant } y \\
&= y(x^2) + C(y) && \text{An anti-derivative of } 2x \text{ is } x^2 \\
&= x^2y + C(y). && C(y) \text{ is a function of } y
\end{align*}
Note that the constant of integration, $C(y)$, is a function of $y$. In other words, by integrating with respect to $x$, we are able to recover $f(x,y)$ only partially.

For a definite integral, by considering $y$ constant, we can apply the Fundamental Theorem of Calculus to evaluate
\[ \int_1^{2y} 2xy\,dx = \Big[x^2y\Big]_1^{2y} = (2y)^2y - (1)^2y = 4y^3 - y. \]
Note that the variable of integration cannot appear in either limit of integration. For example, it makes no sense to write $\int_0^x y\,dx$.

Example 9.1.1. Evaluate $\displaystyle\int_1^x (2x^2y^{-2} + 2y)\,dy$.

Solution: Considering $x$ to be constant and integrating with respect to $y$ produces
\begin{align*}
\int_1^x (2x^2y^{-2} + 2y)\,dy &= \left[\frac{-2x^2}{y} + y^2\right]_1^x \\
&= \left(\frac{-2x^2}{x} + x^2\right) - \left(\frac{-2x^2}{1} + 1\right) \\
&= 3x^2 - 2x - 1.
\end{align*}
Notice that in the above example the integral defines a function of $x$ and can itself be integrated.

Example 9.1.2. Evaluate $\displaystyle\int_1^2\left[\int_1^x (2x^2y^{-2} + 2y)\,dy\right]dx$.

Solution:

[Figure 9.1: The region of integration, $1 \leq x \leq 2$, $1 \leq y \leq x$, bounded above by the line $y = x$.]

Using the result from the previous example, we have
\begin{align*}
\int_1^2\left[\int_1^x (2x^2y^{-2} + 2y)\,dy\right]dx &= \int_1^2 (3x^2 - 2x - 1)\,dx \\
&= \Big[x^3 - x^2 - x\Big]_1^2 \\
&= 2 - (-1) = 3.
\end{align*}

The integral in the above example is an iterated integral. Iterated integrals are usually written simply as
\[ \int_a^b\int_{g_1(x)}^{g_2(x)} f(x,y)\,dy\,dx \quad\text{and}\quad \int_c^d\int_{h_1(y)}^{h_2(y)} f(x,y)\,dx\,dy. \]
The inside limits of integration can be variable with respect to the outer variable of integration. However, the outside limits of integration must be constant with respect to both variables of integration. After performing the inside integration, we obtain a definite integral, and the second integration produces a real number.
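The two-step procedure (integrate the inside integral with the outer variable held constant, then integrate the result) can be mirrored numerically. The following is a minimal sketch, assuming composite Simpson's rule as the quadrature method; `simpson`, `inner` and `iterated` are hypothetical helper names, and the integrand is the one from Example 9.1.2.

```python
def simpson(func, a, b, n=200):
    """Approximate the integral of func over [a, b] by composite Simpson's rule.

    n is the number of subintervals and is rounded up to an even number.
    """
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        weight = 4 if k % 2 else 2
        total += weight * func(a + k * h)
    return total * h / 3


def inner(x):
    # Inside integral: integrate over y from 1 to x with x held constant.
    return simpson(lambda y: 2 * x * x / (y * y) + 2 * y, 1.0, x)


def iterated():
    # Outside integral: integrate the resulting function of x from 1 to 2.
    return simpson(inner, 1.0, 2.0)


if __name__ == "__main__":
    x = 2.0
    # Closed form from Example 9.1.1: 3x^2 - 2x - 1.
    print(inner(x), 3 * x * x - 2 * x - 1)
    # Example 9.1.2 gives the exact value 3.
    print(iterated())
```

Note that the inner limits depend on $x$ while the outer limits are constants, exactly as the text requires.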
One order of integration will often produce a simpler integration problem than the other order. The order of integration affects the ease of integration, but not the value of the integral.

9.1.1 Comparing Different Orders of Integration

Example 9.1.3. Sketch the region whose area is represented by the integral
\[ \int_0^2\int_{y^2}^4 dx\,dy. \]
Then find another iterated integral using the order $dy\,dx$ to represent the area, and show that both integrals yield the same value.

Solution: From the given limits of integration, we know that
\[ y^2 \leq x \leq 4 \qquad \text{(inner limits of integration)}, \]
which means that the region $R$ is bounded on the left by the parabola $x = y^2$ and on the right by the line $x = 4$. Furthermore, because
\[ 0 \leq y \leq 2 \qquad \text{(outer limits of integration)}, \]
we know that $R$ is bounded below by the $x$-axis.

[Figure 9.2: The region $R$ between $x = y^2$ and $x = 4$, with a horizontal strip of width $\Delta y$.]

The value of this integral is
\begin{align*}
\int_0^2\int_{y^2}^4 dx\,dy &= \int_0^2 \Big[x\Big]_{y^2}^4\,dy \\
&= \int_0^2 (4 - y^2)\,dy \\
&= \left[4y - \frac{y^3}{3}\right]_0^2 = \frac{16}{3}.
\end{align*}
To change the order of integration to $dy\,dx$, place a vertical rectangle in the region. From this we can see that the constant bounds $0 \leq x \leq 4$ serve as the outer limits of integration. By solving for $y$ in the equation $x = y^2$, we can conclude that the inner bounds are $0 \leq y \leq \sqrt{x}$. Therefore, the area of the region can be represented by
\[ \int_0^4\int_0^{\sqrt{x}} dy\,dx. \]
By evaluating this integral, we can see that it has the same value as the original integral.

[Figure 9.3: The same region $R$ under the curve $y = \sqrt{x}$, $0 \leq x \leq 4$, with a vertical strip of width $\Delta x$.]

\begin{align*}
\int_0^4\int_0^{\sqrt{x}} dy\,dx &= \int_0^4 \Big[y\Big]_0^{\sqrt{x}}\,dx \\
&= \int_0^4 \sqrt{x}\,dx \\
&= \left[\frac{2}{3}x^{3/2}\right]_0^4 = \frac{16}{3}.
\end{align*}

Example 9.1.4. Express $\displaystyle\int_0^4\int_{x/2}^2 e^{y^2}\,dy\,dx$ as an iterated integral with the order of integration reversed and evaluate.

Solution: From the given limits of integration we see that, for a fixed $x$, $y$ varies from $y = \frac{x}{2}$ to $y = 2$, and $x$ varies from $x = 0$ to $x = 4$. We can also describe the region as follows: for $y$ fixed, $x$ varies from $x = 0$ to $x = 2y$, and $y$ varies from $y = 0$ to $y = 2$. The corresponding iterated integral is $\displaystyle\int_0^2\int_0^{2y} e^{y^2}\,dx\,dy$. Solving, we have
\begin{align*}
\int_0^2\int_0^{2y} e^{y^2}\,dx\,dy &= \int_0^2 \Big[xe^{y^2}\Big]_{x=0}^{x=2y}\,dy \\
&= \int_0^2 2ye^{y^2}\,dy \\
&= \Big[e^{y^2}\Big]_0^2 \\
&= e^4 - 1.
\end{align*}

Example 9.1.5. Evaluate $\displaystyle\int_0^2\int_{y/2}^1 ye^{x^3}\,dx\,dy$.

Solution: We cannot integrate first with respect to $x$, as indicated, because $e^{x^3}$ has no elementary anti-derivative. So we try to evaluate the integral by first reversing the order of integration:
\begin{align*}
\int_0^2\int_{y/2}^1 ye^{x^3}\,dx\,dy &= \int_0^1\int_0^{2x} ye^{x^3}\,dy\,dx \\
&= \int_0^1 \left[\frac{1}{2}y^2e^{x^3}\right]_0^{2x}\,dx \\
&= \int_0^1 2x^2e^{x^3}\,dx \\
&= \left[\frac{2}{3}e^{x^3}\right]_0^1 \\
&= \frac{2}{3}(e - 1).
\end{align*}

9.2 Double Integrals and Volume

9.2.1 Double Integrals and Volume of a Solid Region

Definition of Double Integral

If $f$ is defined on a closed, bounded region $R$ in the $xy$-plane, then the double integral of $f$ over $R$ is given by
\[ \iint_R f(x,y)\,dA = \lim_{|\Delta|\to 0} \sum_{i=1}^n f(x_i,y_i)\,\Delta x_i\,\Delta y_i, \]
provided the limit exists. If the limit exists, then $f$ is integrable over $R$.

A double integral can be used to find the volume of a solid region that lies between the $xy$-plane and the surface given by $z = f(x,y)$.

Volume of a Solid Region

If $f$ is integrable over a plane region $R$ and $f(x,y) \geq 0$ for all $(x,y)$ in $R$, then the volume of the solid region that lies above $R$ and below the graph of $f$ is defined as
\[ V = \iint_R f(x,y)\,dA. \]

Example 9.2.1. Find the volume of the solid region bounded by the surface $f(x,y) = e^{-x^2}$ and the planes $y = 0$, $y = x$ and $x = 1$.

Solution: The base $R$ of the solid in the $xy$-plane is bounded by the lines $y = 0$, $x = 1$ and $y = x$. The two possible orders of integration are
\[ \int_0^1\int_0^x e^{-x^2}\,dy\,dx \quad\text{and}\quad \int_0^1\int_y^1 e^{-x^2}\,dx\,dy. \]
By setting up the corresponding iterated integrals, we can see that the order $dx\,dy$ requires the anti-derivative $\int e^{-x^2}\,dx$, which is not an elementary function. On the other hand, the order $dy\,dx$ produces the integral
\begin{align*}
\int_0^1\int_0^x e^{-x^2}\,dy\,dx &= \int_0^1 \Big[e^{-x^2}y\Big]_0^x\,dx \\
&= \int_0^1 xe^{-x^2}\,dx \\
&= \left[-\frac{1}{2}e^{-x^2}\right]_0^1 \\
&= -\frac{1}{2}\left(\frac{1}{e} - 1\right) \\
&= \frac{e-1}{2e} \approx 0.316.
\end{align*}

9.3 Triple Integrals

Definition of Triple Integral

If $f$ is continuous over a bounded solid region $Q$, then the triple integral of $f$ over $Q$ is defined as
\[ \iiint_Q f(x,y,z)\,dV = \lim_{|\Delta|\to 0} \sum_{i=1}^n f(x_i,y_i,z_i)\,\Delta V_i, \]
provided the limit exists. The volume of the solid region $Q$ is given by
\[ \text{Volume of } Q = \iiint_Q dV. \]

Evaluation by Iterated Integrals

Let $f$ be continuous on a solid region $Q$ defined by
\[ a \leq x \leq b, \qquad h_1(x) \leq y \leq h_2(x), \qquad g_1(x,y) \leq z \leq g_2(x,y), \]
where $h_1$, $h_2$, $g_1$ and $g_2$ are continuous functions. Then
\[ \iiint_Q f(x,y,z)\,dV = \int_a^b\int_{h_1(x)}^{h_2(x)}\int_{g_1(x,y)}^{g_2(x,y)} f(x,y,z)\,dz\,dy\,dx. \]

Example 9.3.1. Evaluate the triple iterated integral
\[ \int_0^2\int_0^x\int_0^{x+y} e^x(y + 2z)\,dz\,dy\,dx. \]

Solution: For the first integration, hold $x$ and $y$ constant and integrate with respect to $z$:
\begin{align*}
\int_0^2\int_0^x\int_0^{x+y} e^x(y + 2z)\,dz\,dy\,dx &= \int_0^2\int_0^x e^x\Big[yz + z^2\Big]_0^{x+y}\,dy\,dx \\
&= \int_0^2\int_0^x e^x(x^2 + 3xy + 2y^2)\,dy\,dx.
\end{align*}
For the second integration, hold $x$ constant and integrate with respect to $y$:
\begin{align*}
\int_0^2\int_0^x e^x(x^2 + 3xy + 2y^2)\,dy\,dx &= \int_0^2 e^x\left[x^2y + \frac{3xy^2}{2} + \frac{2y^3}{3}\right]_0^x\,dx \\
&= \frac{19}{6}\int_0^2 x^3e^x\,dx \\
&= \frac{19}{6}\Big[e^x(x^3 - 3x^2 + 6x - 6)\Big]_0^2 \\
&= 19\left(\frac{e^2}{3} + 1\right).
\end{align*}

Example 9.3.2. If $f(x,y,z) = xy + yz$ and $T$ consists of those points $(x,y,z)$ in space satisfying the inequalities $-1 \leq x \leq 1$, $2 \leq y \leq 3$ and $0 \leq z \leq 1$, then
\begin{align*}
\iiint_T f(x,y,z)\,dV &= \int_{-1}^1\int_2^3\int_0^1 (xy + yz)\,dz\,dy\,dx \\
&= \int_{-1}^1\int_2^3 \left[xyz + \frac{1}{2}yz^2\right]_{z=0}^1\,dy\,dx \\
&= \int_{-1}^1\int_2^3 \left(xy + \frac{1}{2}y\right)dy\,dx \\
&= \int_{-1}^1 \left[\frac{1}{2}xy^2 + \frac{1}{4}y^2\right]_{y=2}^3\,dx \\
&= \int_{-1}^1 \left(\frac{5}{2}x + \frac{5}{4}\right)dx \\
&= \left[\frac{5}{4}x^2 + \frac{5}{4}x\right]_{-1}^1 = \frac{5}{2}.
\end{align*}

9.4 Change of Variables: Jacobians

9.4.1 Jacobians

The Jacobian is named after the German mathematician Carl Gustav Jacobi (1804-1851).

For the single integral
\[ \int_a^b f(x)\,dx \]
you can change variables by letting $x = g(u)$, so that $dx = g'(u)\,du$, and obtain
\[ \int_a^b f(x)\,dx = \int_c^d f(g(u))\,g'(u)\,du, \]
where $a = g(c)$ and $b = g(d)$. Note that the change of variable introduces an additional factor $g'(u)$ into the integrand.
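The role of the extra factor $g'(u)$ in the single-variable formula can be checked numerically. The sketch below is an illustration under assumed choices: the integrand $f(x) = e^{-x}$, the substitution $x = g(u) = u^2$ with $c = 1$, $d = 2$ (so $a = 1$, $b = 4$), and the helper `simpson` are all hypothetical and not taken from the notes.

```python
import math


def simpson(func, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (made even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3


def f(x):
    # Hypothetical integrand in the original variable x.
    return math.exp(-x)


def g(u):
    # Hypothetical substitution x = g(u) = u^2.
    return u * u


def g_prime(u):
    # Derivative g'(u) = 2u, the extra factor in the integrand.
    return 2 * u


c, d = 1.0, 2.0
a, b = g(c), g(d)  # a = 1, b = 4

left = simpson(f, a, b)                                  # integral of f(x) dx over [a, b]
right = simpson(lambda u: f(g(u)) * g_prime(u), c, d)    # with the factor g'(u)
without_factor = simpson(lambda u: f(g(u)), c, d)        # factor omitted, for contrast

if __name__ == "__main__":
    print(left, right, without_factor)
```

The first two values agree with each other (and with $e^{-1} - e^{-4}$), while the third does not, which shows that dropping $g'(u)$ changes the value of the integral.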
This also occurs in the case of double integrals:
\[ \iint_R f(x,y)\,dA = \iint_S f(g(u,v),h(u,v))\,\underbrace{\left|\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial y}{\partial u}\frac{\partial x}{\partial v}\right|}_{\text{Jacobian}}\,du\,dv, \]
where the change of variables $x = g(u,v)$ and $y = h(u,v)$ introduces a factor called the Jacobian of $x$ and $y$ with respect to $u$ and $v$.

Definition of the Jacobian

If $x = g(u,v)$ and $y = h(u,v)$, then the Jacobian of $x$ and $y$ with respect to $u$ and $v$, denoted by $\dfrac{\partial(x,y)}{\partial(u,v)}$, is
\[ \frac{\partial(x,y)}{\partial(u,v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial y}{\partial u}\frac{\partial x}{\partial v}. \]
In cases where it is more convenient to express $u$ and $v$ in terms of $x$ and $y$, we can first compute $\dfrac{\partial(u,v)}{\partial(x,y)}$ explicitly and then find the needed Jacobian $\dfrac{\partial(x,y)}{\partial(u,v)}$ from the formula
\[ \frac{\partial(x,y)}{\partial(u,v)}\cdot\frac{\partial(u,v)}{\partial(x,y)} = 1. \]

Example 9.4.1. Find the Jacobian for the change of variables defined by $x = r\cos\theta$ and $y = r\sin\theta$.

Solution: From the definition of a Jacobian, we obtain
\[ \frac{\partial(x,y)}{\partial(r,\theta)} = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial\theta} \\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial\theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r\cos^2\theta + r\sin^2\theta = r. \]

The above example points out that the change of variables from rectangular to polar coordinates for a double integral can be written as
\begin{align*}
\iint_R f(x,y)\,dA &= \iint_S f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta, \quad r > 0 \\
&= \iint_S f(r\cos\theta, r\sin\theta)\left|\frac{\partial(x,y)}{\partial(r,\theta)}\right|dr\,d\theta,
\end{align*}
where $S$ is the region in the $r\theta$-plane that corresponds to the region $R$ in the $xy$-plane.

In general, a change of variables is given by a one-to-one transformation $T$ from a region $S$ in the $uv$-plane to a region $R$ in the $xy$-plane, given by
\[ T(u,v) = (x,y) = (g(u,v), h(u,v)), \]
where $g$ and $h$ have continuous first partial derivatives in the region $S$. Note that the point $(u,v)$ lies in $S$ and the point $(x,y)$ lies in $R$. In most cases, we are hunting for a transformation for which the region $S$ is simpler than the region $R$.

Change of Variables for Double Integrals

Theorem 9.4.1. Let $R$ and $S$ be regions in the $xy$- and $uv$-planes that are related by the equations $x = g(u,v)$ and $y = h(u,v)$ such that each point in $R$ is the image of a unique point in $S$.
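If $f$ is continuous on $R$, $g$ and $h$ have continuous partial derivatives on $S$, and $\dfrac{\partial(x,y)}{\partial(u,v)}$ is non-zero on $S$, then
\[ \iint_R f(x,y)\,dA = \iint_S f(g(u,v),h(u,v))\left|\frac{\partial(x,y)}{\partial(u,v)}\right|du\,dv. \]

Example 9.4.2. Let $R$ be the region bounded by the lines $x - 2y = 0$, $x - 2y = -4$, $x + y = 4$ and $x + y = 1$. Evaluate the double integral
\[ \iint_R 3xy\,dA. \]

Solution: To begin, let $u = x + y$ and $v = x - 2y$. Solving this system of equations for $x$ and $y$ produces $x = \frac{1}{3}(2u + v)$ and $y = \frac{1}{3}(u - v)$. The partial derivatives of $x$ and $y$ are
\[ \frac{\partial x}{\partial u} = \frac{2}{3}, \qquad \frac{\partial x}{\partial v} = \frac{1}{3}, \qquad \frac{\partial y}{\partial u} = \frac{1}{3} \quad\text{and}\quad \frac{\partial y}{\partial v} = -\frac{1}{3}, \]
which implies that the Jacobian is
\[ \frac{\partial(x,y)}{\partial(u,v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} \dfrac{2}{3} & \dfrac{1}{3} \\[2mm] \dfrac{1}{3} & -\dfrac{1}{3} \end{vmatrix} = -\frac{2}{9} - \frac{1}{9} = -\frac{1}{3}. \]
The region $S$ in the $uv$-plane is the rectangle $1 \leq u \leq 4$, $-4 \leq v \leq 0$. Therefore, we obtain
\begin{align*}
\iint_R 3xy\,dA &= \iint_S 3\left[\frac{1}{3}(2u+v)\right]\left[\frac{1}{3}(u-v)\right]\left|\frac{\partial(x,y)}{\partial(u,v)}\right|dv\,du \\
&= \int_1^4\int_{-4}^0 \frac{1}{9}(2u^2 - uv - v^2)\,dv\,du \\
&= \frac{1}{9}\int_1^4 \left[2u^2v - \frac{uv^2}{2} - \frac{v^3}{3}\right]_{-4}^0\,du \\
&= \frac{1}{9}\int_1^4 \left(8u^2 + 8u - \frac{64}{3}\right)du \\
&= \frac{1}{9}\left[\frac{8u^3}{3} + 4u^2 - \frac{64}{3}u\right]_1^4 \\
&= \frac{164}{9}.
\end{align*}

Example 9.4.3. Suppose $R$ is the plane region bounded by the hyperbolas $xy = 1$, $xy = 3$ and $x^2 - y^2 = 1$, $x^2 - y^2 = 4$. Find $\displaystyle\iint_R (x^2 + y^2)\,dx\,dy$.

Solution: The hyperbolas bounding $R$ are $u$-curves and $v$-curves if $u = xy$ and $v = x^2 - y^2$. We can most easily write $x^2 + y^2$ in terms of $u$ and $v$ by first noting that
\[ 4u^2 + v^2 = 4x^2y^2 + (x^2 - y^2)^2 = (x^2 + y^2)^2, \]
so that $x^2 + y^2 = \sqrt{4u^2 + v^2}$. Now
\[ \frac{\partial(u,v)}{\partial(x,y)} = \begin{vmatrix} y & x \\ 2x & -2y \end{vmatrix} = -2(x^2 + y^2). \]
Hence we have
\[ \frac{\partial(x,y)}{\partial(u,v)} = -\frac{1}{2(x^2+y^2)} = -\frac{1}{2\sqrt{4u^2+v^2}}. \]
Therefore
\[ \iint_R (x^2 + y^2)\,dx\,dy = \int_1^4\int_1^3 \sqrt{4u^2+v^2}\,\frac{1}{2\sqrt{4u^2+v^2}}\,du\,dv = \int_1^4\int_1^3 \frac{1}{2}\,du\,dv = 3. \]

Example 9.4.4. Find the area of the region $R$ bounded by the curves $xy = 1$, $xy = 3$ and $xy^{1.4} = 1$, $xy^{1.4} = 2$.

Solution: Define the change of variables by $u = xy$ and $v = xy^{1.4}$. Then
\[ \frac{\partial(u,v)}{\partial(x,y)} = \begin{vmatrix} y & x \\ y^{1.4} & (1.4)xy^{0.4} \end{vmatrix} = (0.4)xy^{1.4} = (0.4)v. \]
So
\[ \frac{\partial(x,y)}{\partial(u,v)} = \frac{1}{\dfrac{\partial(u,v)}{\partial(x,y)}} = \frac{2.5}{v}. \]
Consequently,
\[ \iint_R dx\,dy = \int_1^2\int_1^3 \frac{2.5}{v}\,du\,dv = 5\ln 2. \]

9.5 Tutorial 8

1. Evaluate the integral.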
Z x Z (i) (2x − y) dy (ii) 0 x2 x y dy x 2. Evaluate the iterated integral. Z 1Z Z 1Z 2 (x+y) dydx (ii) (i) 0 0 0 (iii) ey x √ Z y ln x dx x (iv) 1− dydx ye 2 Z dy (v) 2y−y 2 y dy. Z 3y dxdy 0 cos x Z 0 (iii) 0 y −x 0 Z x2 x3 π 2 Z (iv) 3y 2 −6y sin θ θr drdθ. 0 0 3. Sketch the region R of integration switch the order ofZintegration. Z 4Z Z 4 Zand y 2 1 Z 1 f (x, y) dxdy (ii) f (x, y) dydx. (i) f (x, y) dxdy (iii) √ 0 0 0 −1 y x2 4. Sketch the region R whose area is given by the iterated integral. Then switch the order of integration and show that both orders yield the same area. Z 1Z 2 Z 2Z 1 Z 4Z 2 Z 2 Z 4−y2 (i) dydx (ii) dydx (iii) dydx (iv) dxdy. √ 0 0 0 x 2 0 x −2 0 5. Evaluate order of integration) Z 2 Z 2thepiterated integral. (Note Z 2 Z it2 is necessary to switch Z 1 Z the Z 2Z 4 1 √ −y 2 2 3 (i) x 1 + y dydx (ii) e dydx (iii) sin x dxdy (iv) x sin x dxdy 0 x 0 x 0 y 0 y2 Z 1Z 1 Z πZ π Z 1Z 1 dxdy sin y 2 (v) (vi) dydx (vii) e−x dxdy. 4 y 0 y 1+x 0 x 0 y 133 6. Use polar√coordinates to evaluate each of the following integrals. Z 1 Z 1−x2 Z 2 Z √4−x2 Z 1 Z √1−y2 3 dydx (x2 +y 2 ) 2 dydx (iii) sin(x2 +y 2 ) dxdy. (i) (ii) 2 − y2 4 − x 0 0 0 0 0 0 7. Evaluate Z 1triple integral. Z 3 Z 2the (x+y+z) dxdydz (i) 0 0 Z 4 0 e2 Z Z (iv) 1 1 1 Z Z 1 x y z dxdydz −1 π 2 Z −1 y 2 Z Z −1 1 Z sin y dzdxdy 0 0 4 1 Z Z x 2 2ze−x dydxdz (iii) 1 y (v) 0 Z 2 2 2 (ii) 1 xz ln zdydzdx 1 Z 9 Z y 3 0 0 Z √y2 −9x2 (vi) z dzdxdy. 0 0 0 0 ∂(x, y) for the indicated change of variables. ∂(u, v) 1 1 (i) x = − (u − v), y = (u + v) (ii) x = au + bv, y = cu + dv (iii) x = u − uv, y = uv 2 2 (iv) x = u cos θ − v sin θ, y = u sin θ + v cos θ (v) x = eu sin v, y = eu cos v. ZZ p 9. Evaluate x2 + y 2 dxdy, where D is the region bounded by x2 + y 2 = 4 and x2 + y 2 = 9. 8. Find the Jacobian D ZZ 10. Evaluate (x + y)2 dxdy where D is the parallelogram bounded by the lines D x + y = 0, x + y = 1, 2x − y = 0 and 2x − y = 3. 11. 
Let D be the region in the firstZ Zquadrant r bounded by the hyperbolas xy = 1, xy = 9 and the y √ lines y = x, y = 4x. Evaluate + xy dxdy. x D ZZ 12. Using a suitable transformation, evaluate 1 x(x2 −y 2 ) 2 dxdy, where R is the region bounded R by x2 − y 2 = 1, x2 − y 2 = 2, 2y = x and 5y = x. Z ∞Z ∞ Z −(x2 +y 2 ) 13. Using polar co-ordinates, evaluate e dxdy. Deduce the value of 0 0 ∞ 2 e−x dx. 0 √ a− a2 −x2 aZ Z 14. By changing the order of integration or otherwise, evaluate 0 0 1Z 1 Z 15. Change the order of integration and evaluate the integral 0 √ x xy log(y + a) dydx. (y − a)2 xy 2 p dydx. x2 + y 2 ZZ p 16. Evaluate x2 + y 2 dydx, where R is the region in which x2 + y 2 ≤ 4, x ≥ 0, y ≥ 0. R Z a Z a 1 17. By means of the substitutions u = x+y and v = y−x, evaluate the integral 0 18. By means of substitutions x = 1 (u+v) 2 and y = 134 1 (u−v), 2 p 0 Z a Z evaluate the integral 0 0 a2 − (y − x)2 a dxdy. dxdy p a2 + (y + x)2 . ZZ 19. Calculate (x + y)3 cos2 (x − y) dxdy, where R is the region bounded by the lines R x + y = π, x + y = 5π, x − y = π and x − y = −π. 20. Calculate the area of the region bounded by xy = 4, xy = 8, xy 3 = 5 and xy 3 = 15. 21. Calculate the area of the parallelogram bounded by the lines x + y = 1, x + y = 2 and 2x − 3y = 2, 2x − 3y = 5. ZZ (x2 + y 2 ) dxdy, where R is the region in the first quadrant bounded by the 22. Evaluate R hyperbolas x2 − y 2 = 6, x2 − y 2 = 1, 2xy = 4 and 2xy = 1. (Hint: Use u = x2 − y 2 , v = 2xy) Z 1Z 1 Z √π Z √π − x2 23. Evaluate (i) sin(x2 ) dxdy. e y dydx (ii) √ 0 x 0 y 135 Chapter 10 Basic Concepts of Set Theory 136 10.1 The Importance of Set Theory One striking feature of humans is their inherent need-and ability-to group objects according to specific criteria. Our prehistoric ancestors grouped tools based on their hunting needs. They eventually evolved into strict hierarchical societies where a person belonged to one class and not another. 
Many of us today like to sort our clothes at home, or group the songs on our computers into playlists. The idea of sorting objects into similar groupings, or sets, is the most fundamental concept in modern mathematics. The theory of sets has, in fact, been the unifying framework for all mathematics since the German mathematician Georg Cantor formulated it in the 1870s. No field of mathematics could be described nowadays without referring to some kind of abstract set. A geometer, for example, may study the set of parabolic curves in three dimensions or the set of spheres in a variety of different spaces. An algebraist may work with a set of equations or a set of matrices. A statistician typically works with large sets of raw data. And the list goes on. You may also have read or heard that the most important unresolved problem in mathematics at the moment deals with the set of prime numbers (this problem in number theory is known as the Riemann Hypothesis; the Clay Mathematics Institute will award a million dollars to whoever solves it). As it turns out, even numbers themselves are described by mathematicians in terms of sets!

More broadly, the concept of set membership, which lies at the heart of set theory, explains how statements with nouns and predicates are formulated in our language, or in any abstract language like mathematics. Because of this, set theory is intimately connected to logic and serves as the foundation for all of mathematics.

10.2 What is a Set?

A set is a collection of objects called the elements or members of the set. These objects could be anything conceivable, including numbers, letters, colors, even sets themselves! However, none of the objects of the set can be the set itself. We discard this possibility to avoid running into Russell's Paradox, a famous problem in mathematical logic unearthed by the great British logician Bertrand Russell in 1901.

10.3 Some Interesting Sets of Numbers

Let's look at different types of numbers that we can have in our sets.
1. Natural Numbers
The set of natural numbers is {1, 2, 3, 4, . . . } and is denoted by ℕ.

2. Integers
The set of integers is {. . . , −4, −3, −2, −1, 0, 1, 2, 3, 4, . . . } and is denoted by ℤ. The symbol ℤ comes from the German word Zahlen, which means numbers. The non-negative integers {0, 1, 2, 3, 4, . . . } are often denoted by $\mathbb{Z}^+$. All natural numbers are integers.

3. Rational Numbers
The set of rational numbers is denoted by ℚ and consists of all fractional numbers, i.e., x ∈ ℚ if x can be written in the form $\frac{p}{q}$, where p, q ∈ ℤ with q ≠ 0.

4. Real Numbers
The real numbers are denoted by ℝ.

5. Complex Numbers
The complex numbers are denoted by ℂ.

10.4 Notation

1. A, B, C, . . . for sets.
2. a, b, c, . . . or x, y, z, . . . for members.
3. b ∈ A, if b belongs to A.
4. c ∉ A, if c does not belong to A.
5. ∅ is used for the empty set. There is exactly one set, the empty set or null set, which has no members at all.
6. A set with only one member is called a singleton or singleton set, for example, {x}.

10.5 Well-Defined Sets

A set is said to be well-defined if it is unambiguous which elements belong to the set. In other words, if A is well-defined, then the question "Is x ∈ A?" can always be answered for any object x.

For example, if we define C as the set of large numbers, then it is unclear which numbers should be considered "large". C is therefore not a well-defined set. Similarly, the set of all great Zimbabwean footballers, or the set of all expensive restaurants in Harare, are also not well-defined.

10.6 Specification of Sets

10.6.1 Three Ways to Specify a Set

1. Listing all its members (List Notation).
2. By stating a property of its elements (Predicate Notation).
3. By defining a set of rules which generates (defines) its members (Recursive Rules).

List Notation
This is suitable for finite sets. We list the names of the elements of the set, separate them by commas and enclose them in braces. For example {2, 3, 5}, {a, b, d, m}, {Zimbabwe, South Africa}.
Three Dot Abbreviation
For example, {1, 2, . . . , 100}.

Predicate Notation
For example, {x | x is a natural number and x < 8}.
Reading: the set of all x such that x is a natural number and x is less than 8. The general form is {x | P(x)}, where P is some predicate (condition, property).

Recursive Rules
For example, the set E of even numbers greater than 3:
(a) 4 ∈ E,
(b) if x ∈ E, then x + 2 ∈ E,
(c) nothing else belongs to E.
The first rule is the basis of the recursion, the second one generates new elements from the elements defined before, and the third rule restricts the defined set to the elements generated by (a) and (b).

The notion of a set does not allow for multiple instances (repetitions) of the same element in the set. For example, {1, 2, 2, 3} is not a proper listing of a set; the repeated 2 counts only once, so the set described is {1, 2, 3}.

The collection out of which all sets under consideration may be formed is called the universe of discourse or universal set, denoted by U.

10.7 The Empty Set (Null Set)

The fundamental property of a set is that we can assert of each object whether or not it is a member of the set. Consider a set constructed by asserting of each object that it is not a member of the set. This set has no members and is therefore called the empty set.

Definition 10.7.1. The null or empty set is the set that does not contain any elements, denoted ∅ = {} = {x | x ≠ x}.

Example 10.7.1. Each of the following sets is empty: (i) $\{x \in \mathbb{R} \mid x^2 = -1\}$ (ii) $\{x \in \mathbb{Z} \mid x^2 = 2\}$.

Theorem 10.7.1. There is exactly one empty set.

10.8 Identity and Cardinality

Two sets are identical if and only if (iff) they have the same elements or both are empty. So A = B iff, for every x, x ∈ A ⇔ x ∈ B.

Example 10.8.1. {0, 2, 4} = {x | x is an even non-negative integer less than 5}.

The number of elements in a set A is called the cardinality of A, denoted by |A|. The cardinality of a finite set is a non-negative integer. Infinite sets also have cardinalities but they are not natural numbers. The set A is said to be countable or enumerable if there is a way to list the elements of A.
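The three ways of specifying a set, together with identity and cardinality, can be tried out on a computer. The following is only a minimal sketch in Python, whose built-in `set` type behaves like a finite mathematical set. The function names `predicate_set` and `evens_greater_than_3` and the cut-off `limit` are illustrative assumptions, not part of the notes; a cut-off is needed because E itself is infinite.

```python
def predicate_set(upper_bound):
    """Predicate notation: {x | x is a natural number and x < upper_bound},
    taking the natural numbers to start at 1, as in Section 10.3."""
    return {x for x in range(1, upper_bound)}


def evens_greater_than_3(limit):
    """Recursive rules for E, generated only up to the illustrative cut-off `limit`."""
    members = set()
    x = 4                  # rule (a): 4 belongs to E
    while x <= limit:
        members.add(x)
        x += 2             # rule (b): if x is in E, so is x + 2
    return members         # rule (c): nothing else is added


small = predicate_set(8)               # list notation would be {1, 2, ..., 7}
E_part = evens_greater_than_3(12)      # the members of E that are at most 12

# A repeated element is stored only once.
collapsed = {1, 2, 2, 3}

# Identity: two sets are equal when they have the same elements,
# whatever the order in which they are listed.
same_set = {0, 2, 4} == {4, 0, 2}

# Example 10.7.1 (ii), searched over a finite range of integers only.
no_integer_root = {x for x in range(-10, 11) if x * x == 2}

print(sorted(small), len(small))
print(sorted(E_part))
print(collapsed == {1, 2, 3}, len(collapsed))
print(same_set, len(no_integer_root))
```

Here `len` plays the role of the cardinality |A| for finite sets.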
10.9 Russell's Paradox (Antinomy)

A paradox (antinomy) is an apparently true statement that seems to lead to a logical self-contradiction. It is important to note that a given property P(x) does not necessarily determine a set, i.e., we cannot say that to an arbitrary property P there corresponds a set whose elements are exactly the objects satisfying P. Consider the following.

There was once a barber in a town where every man either shaved himself or was shaved by the barber, and the barber shaved only those men who did not shave themselves. Did the barber shave himself? Suppose that he did shave himself. Since he shaved only the men in town who did not shave themselves, he did not shave himself. But every man in town either shaved himself or was shaved by the barber, so if he did not shave himself, then the barber shaved him, that is, he did shave himself. We have a contradiction.

Russell observed that if S is a set, then either S ∈ S or S ∉ S, since a given object is either a member of a given set or is not a member of that set. Consider the set of all sets that are not members of themselves, R = {x | x is a set and x ∉ x}. Since R is an object, either R ∈ R or R ∉ R.
(i) Assume R ∈ R. Then R is a set, and R ∉ R by the definition of R.
(ii) Assume R ∉ R. Then R ∈ R by the definition of R, since we are assuming R is a set.
But we cannot have both R ∈ R and R ∉ R, so we reach a contradiction. In both cases we have inferred the paradox that R ∈ R iff R ∉ R. In other words, the assumption that R is a set has led to a contradiction, and therefore there is no such set, and hence no such thing as the set of all sets. To avoid unnecessary paradoxes, we assume the existence of the universal set, U.

All this leads to the following problems:
1. There are things that are true in mathematics (based on assumptions).
2. There are things that are false.
3. There are things that are true that can never be proved.
4. There are things that are false that can never be disproved.
After this paradox was described, set theory had to be reformulated as axiomatic set theory.

10.10 Inclusion

Definition 10.10.1. Fix a universal set U. If A and B are sets (with all members in U), we write A ⊆ B or B ⊇ A iff, for all x ∈ U, x ∈ A ⟹ x ∈ B. (⊆ is the set inclusion symbol.)

A set A is a subset of a set B iff every element of A is also an element of B. If A ⊆ B and A ≠ B, we call A a proper subset of B and write A ⊂ B.

Theorem 10.10.1. If A ⊆ B and B ⊆ C then A ⊆ C.

Proof. Let x ∈ A. Since A ⊆ B, we have x ∈ B, and given that B ⊆ C, we conclude that x ∈ C. Thus A ⊆ C.

Example 10.10.1.
(i) {a, b} ⊆ {d, a, b, e}
(ii) {a, b} ⊆ {a, b}
(iii) {a, b} ⊂ {d, a, b, e}
(iv) {a, b} ⊄ {a, b}.

Note that the empty set is a subset of every set, ∅ ⊆ A for every set A, and that for any set A, we have A ⊆ A.

10.11 Axiom of Extensionality

Theorem 10.11.1. For any two sets A and B, A = B ⟺ A ⊆ B and B ⊆ A.

10.11.1 Power Sets

The set of all subsets of A is called the power set of A and is denoted by $\mathcal{P}(A)$. If |A| is finite, then $|\mathcal{P}(A)| = 2^{|A|}$.

Example 10.11.1. If A = {a, b}, then $\mathcal{P}(A)$ = {∅, {a}, {b}, {a, b}}.

From the above example, a ∈ A, {a} ⊆ A, {a} ∈ $\mathcal{P}(A)$, ∅ ⊆ A, ∅ ∉ A, ∅ ⊆ $\mathcal{P}(A)$, ∅ ∈ $\mathcal{P}(A)$.

10.12 Operations on Sets

10.12.1 Union and Intersection

Let A and B be arbitrary sets. The union of A and B, written A ∪ B, is the set whose elements are just the elements of A or B or both:
A ∪ B := {x | x ∈ A or x ∈ B}.

Example 10.12.1. Let K = {a, b}, L = {c, d}, M = {b, d}. Then
K ∪ L = {a, b, c, d},
K ∪ M = {a, b, d},
L ∪ M = {b, c, d},
(K ∪ L) ∪ M = K ∪ (L ∪ M) = {a, b, c, d},
K ∪ K = K,
K ∪ ∅ = ∅ ∪ K = K = {a, b}.

The intersection of A and B, written A ∩ B, is the set whose elements are just the elements of both A and B:
A ∩ B := {x | x ∈ A and x ∈ B}.

Example 10.12.2.
K ∩ L = ∅,
K ∩ M = {b},
L ∩ M = {d},
(K ∩ L) ∩ M = K ∩ (L ∩ M) = ∅,
K ∩ K = K,
K ∩ ∅ = ∅ ∩ K = ∅.

10.13 Properties of ∪ and ∩

1.
Every element x in A ∩ B belongs to both A and B, hence x belongs to A and x belongs to B. Thus A ∩ B is a subset of A and of B, i.e., A ∩ B ⊆ A and A ∩ B ⊆ B.

2. An element x belongs to the union A ∪ B if x belongs to A or x belongs to B, hence every element in A belongs to A ∪ B and every element in B belongs to A ∪ B, i.e., A ⊆ A ∪ B and B ⊆ A ∪ B.

Theorem 10.13.1. For any sets A and B, we have (i) A ∩ B ⊆ A ⊆ A ∪ B and (ii) A ∩ B ⊆ B ⊆ A ∪ B.

Theorem 10.13.2. The following are equivalent: A ⊆ B, A ∩ B = A, A ∪ B = B.

Proof. Suppose A ⊆ B and let x ∈ A. Then x ∈ B, hence x ∈ A ∩ B, so A ⊆ A ∩ B. Since also A ∩ B ⊆ A by Theorem 10.13.1, we get A ∩ B = A. Conversely, suppose A ∩ B = A and let x ∈ A. Then x ∈ A ∩ B, hence x ∈ A and x ∈ B. Therefore A ⊆ B.
Suppose again that A ⊆ B and let x ∈ A ∪ B, so that x ∈ A or x ∈ B. If x ∈ A, then x ∈ B because A ⊆ B. In either case x ∈ B. Therefore A ∪ B ⊆ B. But B ⊆ A ∪ B. Therefore A ∪ B = B. Now suppose A ∪ B = B and let x ∈ A. Then x ∈ A ∪ B = B, therefore A ⊆ B.

Definition 10.13.1. Two sets A and B are called disjoint sets if the intersection of A and B is the null set, i.e., A ∩ B = ∅.

10.14 Difference and Complement

Definition 10.14.1. A minus B, written A \ B or A − B, removes from A all elements which are in B (it is also called the relative complement, or the complement of B relative to A), and is defined as
A − B := {x | x ∈ A and x ∉ B}.

Example 10.14.1. With K, L and M as in Example 10.12.1,
K − L = {a, b}, K − K = ∅, K − M = {a}, K − ∅ = K, L − M = {c}, ∅ − K = ∅.

10.14.1 Symmetric Difference

Definition 10.14.2.
A △ B = A ⊕ B := {x | x ∈ A or x ∈ B but not in both}
or
A △ B = A ⊕ B := (A ∪ B) \ (A ∩ B) = (A \ B) ∪ (B \ A).

The complement of a set A is the set of elements which do not belong to A, i.e., the difference of the universal set U and A. Denote the complement of A by A′ or $A^c$:
A′ = {x | x ∈ U and x ∉ A} or A′ = U − A.

Example 10.14.2. Take U = ℕ and let E = {2, 4, 6, . . . }, the set of all even numbers. Then $E^c$ = {1, 3, 5, . . . }, the set of odd numbers.
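The operations of Sections 10.12 to 10.14 correspond directly to operators on Python's built-in `set` type, so the examples above can be checked mechanically. The sketch below reuses K, L and M from Example 10.12.1. The finite universe `U` and the helper name `subset_equivalences` are assumptions made only for illustration (the complement in Example 10.14.2 is taken in an infinite universe, which cannot be enumerated).

```python
K, L, M = {"a", "b"}, {"c", "d"}, {"b", "d"}

union_KL = K | L                 # K ∪ L
intersection_KM = K & M          # K ∩ M
difference_KM = K - M            # K − M
symmetric_KM = K ^ M             # K △ M

# The two descriptions of the symmetric difference give the same set.
both_forms_agree = symmetric_KM == (K | M) - (K & M) == (K - M) | (M - K)

# Disjoint sets have empty intersection.
K_and_L_disjoint = K.isdisjoint(L)

# Complement relative to a small, finite universe chosen for illustration.
U = set(range(1, 11))
E = {x for x in U if x % 2 == 0}
E_complement = U - E


def subset_equivalences(A, B):
    """Return the truth values of A ⊆ B, A ∩ B = A and A ∪ B = B.
    By Theorem 10.13.2 the three values always coincide."""
    return (A <= B, A & B == A, A | B == B)


print(sorted(union_KL), sorted(intersection_KM), sorted(difference_KM))
print(sorted(symmetric_KM), both_forms_agree, K_and_L_disjoint)
print(sorted(E_complement))
print(subset_equivalences({1, 2}, {1, 2, 3}), subset_equivalences({1, 4}, {1, 2, 3}))
```

Agreement on particular sets illustrates the theorems; it does not replace their proofs.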
10.15 Venn Diagrams

A simple and instructive way of illustrating the relationship between sets is the use of the so-called Venn-Euler diagrams, or simply Venn diagrams.

10.16 Set Theoretic Equalities

1. Idempotent Laws
(i) X ∪ X = X (ii) X ∩ X = X.

2. Commutative Laws
(i) X ∪ Y = Y ∪ X (ii) X ∩ Y = Y ∩ X.

3. Associative Laws
(i) (X ∪ Y) ∪ Z = X ∪ (Y ∪ Z) (ii) (X ∩ Y) ∩ Z = X ∩ (Y ∩ Z).

4. Distributive Laws
(i) X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z) (ii) X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z).

5. Identity Laws
(i) X ∪ ∅ = X (ii) X ∪ U = U (iii) X ∩ ∅ = ∅ (iv) X ∩ U = X.

6. Complement Laws
(i) $X \cup X^c = U$ (ii) $(X^c)^c = X$ (iii) $X \cap X^c = \emptyset$ (iv) $X - Y = X \cap Y^c$.

7. De Morgan's Laws
(i) $(X \cup Y)^c = X^c \cap Y^c$ (ii) $(X \cap Y)^c = X^c \cup Y^c$.

8. Consistency Principle
(i) X ⊆ Y iff X ∪ Y = Y (ii) X ⊆ Y iff X ∩ Y = X.

Example 10.16.1. Show that $(A^c)^c = A$.

Proof. We need to show that $A \subseteq (A^c)^c$ and $(A^c)^c \subseteq A$. Let x ∈ A, then $x \notin A^c$. If $x \notin A^c$, then $x \in (A^c)^c$. By the definition of subsets, $A \subseteq (A^c)^c$. Next we want to show that $(A^c)^c \subseteq A$. Let $y \in (A^c)^c$, then $y \notin A^c$. If $y \notin A^c$, then y ∈ A. We have shown that $y \in (A^c)^c \implies y \in A$. Thus $(A^c)^c \subseteq A$. By equality of sets, $(A^c)^c = A$.

Example 10.16.2. Show that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

Proof. Let D = A ∩ (B ∪ C) and E = (A ∩ B) ∪ (A ∩ C). We have to prove first that D ⊆ E. Let x ∈ D, then x ∈ A and x ∈ (B ∪ C). Since x ∈ (B ∪ C), either x ∈ B or x ∈ C or both. In case x ∈ B we have x ∈ A and x ∈ B, so x ∈ (A ∩ B). On the other hand, if x ∉ B, then we must have x ∈ C, so x ∈ (A ∩ C). Taking these two cases together, x ∈ (A ∩ B) or x ∈ (A ∩ C), so x ∈ E.
Now, we prove that E ⊆ D. Let x ∈ E. Suppose first that x ∈ (A ∩ B), then x ∈ A and x ∈ B, so x ∈ A and x ∈ (B ∪ C), so x ∈ D. On the other hand, if x ∉ (A ∩ B), then x ∈ (A ∩ C), so again we obtain x ∈ A and x ∈ (B ∪ C), giving x ∈ D. Hence E ⊆ D.
Since both D ⊆ E and E ⊆ D, we conclude that D = E and consequently A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
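Before proving an identity such as those in Section 10.16, it can be reassuring to test it exhaustively on a small universe. The sketch below, in Python, enumerates every subset of a three-element universe and checks the distributive laws, De Morgan's laws and the double complement law for all choices of X, Y and Z. The function names `all_subsets` and `laws_hold` are hypothetical, and a check on one finite universe is only evidence, not a proof.

```python
from itertools import combinations, product


def all_subsets(universe):
    """Return every subset of a finite universe as a list of sets."""
    items = sorted(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def laws_hold(universe):
    """Check selected set-theoretic equalities for all subsets X, Y, Z of `universe`.
    The complement of X is computed as U - X."""
    U = set(universe)
    subsets = all_subsets(U)
    for X, Y, Z in product(subsets, repeat=3):
        checks = [
            X | (Y & Z) == (X | Y) & (X | Z),      # distributive law (i)
            X & (Y | Z) == (X & Y) | (X & Z),      # distributive law (ii)
            U - (X | Y) == (U - X) & (U - Y),      # De Morgan (i)
            U - (X & Y) == (U - X) | (U - Y),      # De Morgan (ii)
            U - (U - X) == X,                      # double complement
            X - Y == X & (U - Y),                  # difference via complement
        ]
        if not all(checks):
            return False
    return True


universe = {1, 2, 3}
print(len(all_subsets(universe)))   # number of subsets of a 3-element set
print(laws_hold(universe))
```

The number of subsets produced by `all_subsets` for a universe of n elements is 2 to the power n, so the triple loop grows quickly; three or four elements is a sensible size for this kind of check.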
10.17 Counting Elements in Sets

If A and B are disjoint sets, then |A ∪ B| = |A| + |B|, otherwise |A ∪ B| = |A| + |B| − |A ∩ B|.

Example 10.17.1. Let A = {a, b, c, d, e} and B = {d, e, f, g, h, i}, so that A ∪ B = {a, b, c, d, e, f, g, h, i} and A ∩ B = {d, e}. Since |A| = 5, |B| = 6, |A ∪ B| = 9 and |A ∩ B| = 2, we have |A ∪ B| = |A| + |B| − |A ∩ B| = 5 + 6 − 2 = 9.

10.18 The Algebra of Sets

We have considered the problem of showing that two sets are the same by proving inclusion in both directions; however, this technique becomes tedious when the expressions involved are at all complicated. We shall develop an algebra of sets to assist us in simplifying a given expression. The following basic laws are easily established.

Law 1: $(A^c)^c = A$
Law 2: A ∪ B = B ∪ A
Law 3: A ∩ B = B ∩ A
Law 4: A ∪ (B ∪ C) = (A ∪ B) ∪ C
Law 5: A ∩ (B ∩ C) = (A ∩ B) ∩ C
Law 6: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
Law 7: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
Law 8: $(A \cup B)^c = A^c \cap B^c$
Law 9: $(A \cap B)^c = A^c \cup B^c$
Law 10: $U^c = \emptyset$
Law 11: $\emptyset^c = U$
Law 12: A ∪ ∅ = A
Law 13: A ∪ U = U
Law 14: A ∩ U = A
Law 15: A ∩ ∅ = ∅
Law 16: $A \cup A^c = U$
Law 17: $A \cap A^c = \emptyset$.

Example 10.18.1. By using the algebra of sets, show that $A \cup (B \cap A^c) = A \cup B$.

Proof.
$A \cup (B \cap A^c) = (A \cup B) \cap (A \cup A^c)$ by Law 6
$= (A \cup B) \cap U$ by Law 16
$= A \cup B$ by Law 14.

10.19 Set Products

10.19.1 Ordered Pairs

Definition 10.19.1. Let n be any natural number and let $a_1, a_2, \ldots, a_n$ be any objects. Then $(a_1, a_2, \ldots, a_n)$ denotes the ordered n-tuple with first term $a_1$, second term $a_2$, . . . and nth term $a_n$.

Example 10.19.1. (5, 7) denotes the ordered pair whose first term is 5 and second term 7. Note that (5, 7, 2) is called an ordered triple and (5, 7, 2, 4) is called an ordered 4-tuple.

The fundamental statement we can make about an ordered n-tuple is that a given object is the kth term of the ordered n-tuple.

Definition 10.19.2. Let A and B be any non-empty sets, then A × B := {(a, b) | a ∈ A and b ∈ B}.
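The counting formula of Section 10.17, the identity of Example 10.18.1 and the definition of A × B can all be checked on small sets. The following Python sketch uses the sets of Example 10.17.1; the helper name `cartesian`, the extra element `"z"` in the universe and the particular sets {1, 2} and {2, 3, 4} are assumptions chosen only for illustration.

```python
from itertools import product

A = {"a", "b", "c", "d", "e"}
B = {"d", "e", "f", "g", "h", "i"}

# Counting elements: |A ∪ B| = |A| + |B| − |A ∩ B|.
union_size = len(A | B)
by_formula = len(A) + len(B) - len(A & B)

# Example 10.18.1 on these particular sets, with complements taken in a
# universe that contains A, B and one extra element.
U = A | B | {"z"}
identity_holds = A | (B & (U - A)) == A | B


def cartesian(first, second):
    """Return first × second as a set of ordered pairs (Python tuples)."""
    return set(product(first, second))


AxB = cartesian({1, 2}, {2, 3, 4})
BxA = cartesian({2, 3, 4}, {1, 2})

print(union_size, by_formula, identity_holds)
print(sorted(AxB))
print(len(AxB), AxB == BxA)
```

Tuples are ordered, so (1, 2) and (2, 1) are different pairs, which is why the two products differ even though they have the same number of elements.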
If A and B are both finite sets, then |A × B| = |A| · |B|. If A = B, we sometimes write $A^2$ for A × A.

Example 10.19.2.
1. If A = {1, 2} and B = {2, 3, 4}, then A × B = {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4)} and B × A = {(2, 1), (2, 2), (3, 1), (3, 2), (4, 1), (4, 2)}. Notice that A × B ≠ B × A, in general.
2. The Cartesian product $\mathbb{R} \times \mathbb{R} = \mathbb{R}^2$ is the set of all ordered pairs of real numbers and this represents the 2-dimensional Cartesian plane.
3. $(s_1, t_1) = (s_2, t_2)$ if and only if $s_1 = s_2$ and $t_1 = t_2$.

10.20 Theorems on Set Products

Let A, B, C, D be sets, then
1. A × (B ∪ C) = (A × B) ∪ (A × C).
2. A × (B ∩ C) = (A × B) ∩ (A × C).
3. (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).
4. (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).
5. (A − B) × C = (A × C) − (B × C).
6. If A and B are non-empty sets, then A × B = B × A if and only if A = B.
7. If $A_1 \in \mathcal{P}(A)$ and $B_1 \in \mathcal{P}(B)$, then $A_1 \times B_1 \in \mathcal{P}(A \times B)$.

Example 10.20.1. Prove that (A ∪ B) × C = (A × C) ∪ (B × C).

Proof. Consider any element (u, v) ∈ (A ∪ B) × C. By definition u ∈ (A ∪ B) and v ∈ C. Thus u ∈ A or u ∈ B. If u ∈ A, then (u, v) ∈ (A × C) and if u ∈ B, then (u, v) ∈ (B × C). Thus (u, v) is in A × C or in B × C and therefore (u, v) ∈ (A × C) ∪ (B × C). This proves that (A ∪ B) × C ⊆ (A × C) ∪ (B × C).
Now consider any element (u, v) ∈ (A × C) ∪ (B × C). This implies that (u, v) ∈ (A × C) or (u, v) ∈ (B × C). In the first case u ∈ A and v ∈ C and in the second case u ∈ B and v ∈ C. Thus u ∈ (A ∪ B) and v ∈ C, which implies (u, v) ∈ (A ∪ B) × C. Therefore (A × C) ∪ (B × C) ⊆ (A ∪ B) × C. Hence (A ∪ B) × C = (A × C) ∪ (B × C).

10.21 Tutorial 9

1. Let A = {2, 4, 6} and B = {3, 4, 5, 6, 7, 8, 9, 10}. What is the cardinality of the sets A ∩ B and A ∪ B?

2. List the elements of the following sets, where P = {1, 2, 3, . . .}.
(a) A = {x : x ∈ P, 3 < x < 12}
(b) B = {x : x ∈ P, x is even, x < 15}
(c) C = {x : x ∈ P, 4 + x = 3}
(d) D = {x : x ∈ P, x is a multiple of 5}.

3. Consider the universal set U = {1, 2, 3, . . .
, 9} and the sets: A = {1, 2, 3, 4, 5}, B = {4, 5, 6, 7}, C = {5, 6, 7, 8, 9}, D = {1, 3, 5, 7, 9}, E = {2, 4, 6, 8}, F = {1, 5, 9}. Find:
(a) A ∪ B and D ∩ F
(b) $B^c$, $D^c$, $U^c$, $\emptyset^c$
(c) A − B, B − A, D − E, F − D
(d) A ⊕ B, C ⊕ D, E ⊕ F
(e) A ∩ (B ∪ E), (A ∩ D) − B, (B ∩ F) ∪ (C ∩ E).

4. Let A = {1, 2, 3, 4, 5}. Find the power set $\mathcal{P}(A)$ of A.

5. If A, B and C are any three sets, prove that
(a) A − B = A ∩ B′
(b) A ⊂ B implies B′ ⊂ A′
(e) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) and A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

6. Use the laws of algebra of sets to prove that
(a) (A ∩ B) ∪ (A ∩ B′) = A,
(b) (A ∩ U) ∩ (∅ ∪ A′) = ∅, where U is the universal set.

7. Define A − B = A ∩ B′ and A △ B = (A − B) ∪ (B − A). Simplify each of the following:
(i) A △ ∅, (ii) A △ U, (iii) A △ A, (iv) A △ A′.

8. Let A, B, C, D be sets. Prove the following:
(b) A × (B ∩ C) = (A × B) ∩ (A × C),
(c) (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).

Introduction to Probability Theory

10.22 Probability

Probability theory provides a mathematical foundation for concepts such as "probability", "information", "belief", "uncertainty", "confidence", "randomness", "variability", "chance" and "risk". Probability theory is important to empirical scientists because it gives them a rational framework to make inferences and test hypotheses based on uncertain empirical data. Probability theory is also useful to engineers building systems that have to operate intelligently in an uncertain world. For example, some of the most successful approaches in machine perception (e.g., automatic speech recognition, computer vision) and artificial intelligence are based on probabilistic models. Moreover, probability theory is also proving very valuable as a theoretical framework for scientists trying to understand how the brain works.
Many computational neuroscientists think of the brain as a probabilistic computer built with unreliable components, i.e., neurons, and use probability theory as a guiding framework to understand the principles of computation used by the brain. Consider the following examples:

• You need to decide whether a coin is loaded (i.e., whether it tends to favor one side over the other when tossed). You toss the coin 6 times and in all cases you get "Tails". Would you say that the coin is loaded?

• You are trying to figure out whether newborn babies can distinguish green from red. To do so you present two colored cards (one green, one red) to 6 newborn babies. You make sure that the 2 cards have equal overall luminance so that they are indistinguishable if recorded by a black and white camera. The 6 babies are randomly divided into two groups. The first group gets the red card in the left visual field, and the second group in the right visual field. You find that all 6 babies look longer at the red card than at the green card. Would you say that babies can distinguish red from green?

• A pregnancy test has 99% sensitivity (i.e., 99 out of 100 pregnant women test positive) and 95% specificity (i.e., 95 out of 100 non-pregnant women test negative). A woman believes she has a 10% chance of being pregnant. She takes the test and tests positive. How should she combine her prior beliefs with the results of the test?

• You need to design a system that detects a sinusoidal tone of 1000 Hz in the presence of white noise. How should you design the system to solve this task optimally?

• How should the photoreceptors in the human retina be interconnected to maximize information transmission to the brain?

While these tasks appear different from each other, they all share a common problem: the need to combine different sources of uncertain information to make rational decisions. Probability theory provides a very powerful mathematical framework to do so.
We now go into the mathematical aspects of probability theory.

10.23 Sample Spaces

A set S that consists of all possible outcomes of a random experiment is called a sample space, and each outcome is called a sample point. Often there will be more than one sample space that can describe the outcomes of an experiment, but there is usually only one that will provide the most information.

Example 10.23.1. If we toss a die, then one sample space is given by {1, 2, 3, 4, 5, 6} while another is {even, odd}. It is clear, however, that the latter would not be adequate to determine, for example, whether an outcome is divisible by 3.

The sample space is also called the outcome space, reference set, or universal set. It is often useful to portray a sample space graphically. In such cases, it is desirable to use numbers in place of letters whenever possible.

If a sample space has a finite number of points, it is called a finite sample space. If it has as many points as there are natural numbers 1, 2, 3, . . . , it is called a countably infinite sample space. If it has as many points as there are in some interval on the x axis, such as 0 ≤ x ≤ 1, it is called a noncountably infinite sample space. A sample space that is finite or countably infinite is often called a discrete sample space, while one that is noncountably infinite is called a nondiscrete sample space.

Example 10.23.2. Tossing a die yields a discrete sample space. However, picking any number, not just an integer, from 1 to 10 yields a nondiscrete sample space.

10.24 Events

We have defined outcomes as the elements of a sample space S. In practice, we are interested in assigning probability values not only to outcomes but also to sets of outcomes. For example, we may want to know the probability of getting an even number when rolling a die. In other words, we want the probability of the set {2, 4, 6}.

An event is a subset A of the sample space S, i.e., it is a set of possible outcomes.
If the outcome of an experiment is an element of A, we say that the event A has occurred. An event consisting of a single point of S is called a simple or elementary event. As particular events, we have S itself, which is the sure or certain event since an element of S must occur, and the empty set ∅, which is called the impossible event because an element of ∅ cannot occur.

By using set operations on events in S, we can obtain other events in S. For example, if A and B are events, then
1. A ∪ B is the event "either A or B or both." A ∪ B is called the union of A and B.
2. A ∩ B is the event "both A and B." A ∩ B is called the intersection of A and B.
3. A′ is the event "not A." A′ is called the complement of A.
4. A − B = A ∩ B′ is the event "A but not B." In particular, A′ = S − A.

If the sets corresponding to events A and B are disjoint, i.e., A ∩ B = ∅, we often say that the events are mutually exclusive. This means that they cannot both occur. We say that a collection of events $A_1, A_2, \ldots, A_n$ is mutually exclusive if every pair in the collection is mutually exclusive.

Example 10.24.1. Consider an experiment of tossing a coin twice. Let A be the event "at least one head occurs" and B the event "the second toss results in a tail." Find the events A ∪ B, A ∩ B, A′ and A − B.

Solution: We observe that A = {HT, TH, HH} and B = {HT, TT}, and so we have
A ∪ B = {HT, TH, HH, TT} = S,
A ∩ B = {HT},
A′ = {TT},
A − B = {TH, HH}.

10.25 The Concept of Probability

In any random experiment there is always uncertainty as to whether a particular event will or will not occur. As a measure of the chance, or probability, with which we can expect the event to occur, it is convenient to assign a number between 0 and 1. If we are sure or certain that an event will occur, we say that its probability is 100% or 1. If we are sure that the event will not occur, we say that its probability is zero.
If, for example, the probability is 1/4, we would say that there is a 25% chance it will occur and a 75% chance that it will not occur. Equivalently, we can say that the odds against occurrence are 75% to 25%, or 3 to 1.

There are two important procedures by means of which we can estimate the probability of an event.

1. CLASSICAL APPROACH: If an event can occur in h different ways out of a total of n possible ways, all of which are equally likely, then the probability of the event is h/n.

Example 10.25.1. Suppose we want to know the probability that a head will turn up in a single toss of a coin. Since there are two equally likely ways in which the coin can come up, namely heads and tails (assuming it does not roll away or stand on its edge), and of these two ways a head can arise in only one way, we reason that the required probability is 1/2. In arriving at this, we assume that the coin is fair, i.e., not loaded in any way.

2. FREQUENCY APPROACH: If after n repetitions of an experiment, where n is very large, an event is observed to occur in h of these, then the probability of the event is h/n. This is also called the empirical probability of the event.

Example 10.25.2. If we toss a coin 1000 times and find that it comes up heads 532 times, we estimate the probability of a head coming up to be 532/1000 = 0.532.

Both the classical and frequency approaches have serious drawbacks, the first because the words "equally likely" are vague and the second because the "large number" involved is vague. Because of these difficulties, mathematicians have been led to an axiomatic approach to probability.

10.26 The Axioms of Probability

Suppose we have a sample space S. If S is discrete, all subsets correspond to events and conversely; if S is nondiscrete, only special subsets (called measurable) correspond to events. To each event A in the class C of events, we associate a real number P(A).
Then P is called a probability function, and P(A) the probability of the event A, if the following axioms are satisfied.

Axiom 1. For every event A in the class C, P(A) ≥ 0.

Axiom 2. For the sure or certain event S in the class C, P(S) = 1.

Axiom 3. For any number of mutually exclusive events $A_1, A_2, \ldots$ in the class C,
\[ P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots \]
In particular, for two mutually exclusive events $A_1$ and $A_2$, $P(A_1 \cup A_2) = P(A_1) + P(A_2)$.

10.27 Some Important Theorems on Probability

From the above axioms we can now prove various theorems on probability that are important in further work.

Theorem 10.27.1. If $A_1 \subset A_2$, then $P(A_1) \le P(A_2)$ and $P(A_2 - A_1) = P(A_2) - P(A_1)$.

Theorem 10.27.2. For every event A, 0 ≤ P(A) ≤ 1, i.e., a probability is between 0 and 1.

Theorem 10.27.3. For ∅, the empty set, P(∅) = 0, i.e., the impossible event has probability zero.

Theorem 10.27.4. If A′ is the complement of A, then P(A′) = 1 − P(A).

Theorem 10.27.5. If $A = A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_n$, where $A_1, A_2, \ldots, A_n$ are mutually exclusive events, then
\[ P(A) = P(A_1) + P(A_2) + P(A_3) + \cdots + P(A_n). \]
In particular, if A = S, the sample space, then $P(A_1) + P(A_2) + P(A_3) + \cdots + P(A_n) = 1$.

Theorem 10.27.6. If A and B are any two events, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
More generally, if $A_1, A_2, A_3$ are any three events, then
\[
P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_2 \cap A_3) - P(A_3 \cap A_1) + P(A_1 \cap A_2 \cap A_3).
\]
Generalizations to n events can also be made.

Theorem 10.27.7. For any events A and B, P(A) = P(A ∩ B) + P(A ∩ B′).

Theorem 10.27.8. If an event A must result in the occurrence of one of the mutually exclusive events $A_1, A_2, \ldots, A_n$, then
\[ P(A) = P(A \cap A_1) + P(A \cap A_2) + \cdots + P(A \cap A_n). \]

10.28 Assignment of Probabilities

If a sample space S consists of a finite number of outcomes $a_1, a_2, \ldots, a_n$, then by Theorem 10.27.5, $P(A_1) + P(A_2) + \cdots + P(A_n) = 1$ where $A_1, A_2, \ldots$
, $A_n$ are elementary events given by $A_i = \{a_i\}$. It follows that we can arbitrarily choose any nonnegative numbers for the probabilities of these simple events as long as the previous equation is satisfied. In particular, if we assume equal probabilities for all simple events, then
\[ P(A_k) = \frac{1}{n}, \qquad k = 1, 2, \ldots, n. \]
And if $A$ is any event made up of $h$ such simple events, we have
\[ P(A) = \frac{h}{n}. \]
This is equivalent to the classical approach to probability. We could of course use other procedures for assigning probabilities, such as the frequency approach. Assigning probabilities provides a mathematical model, the success of which must be tested by experiment in much the same manner that the theories in physics or other sciences must be tested by experiment.

Example 10.28.1. A single die is tossed once. Find the probability of a 2 or 5 turning up.

Solution: The sample space is $S = \{1, 2, 3, 4, 5, 6\}$. If we assign equal probabilities to the sample points, i.e., if we assume that the die is fair, then
\[ P(1) = P(2) = \cdots = P(6) = \frac{1}{6}. \]
The event that either 2 or 5 turns up is indicated by $2 \cup 5$. Therefore,
\[ P(2 \cup 5) = P(2) + P(5) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}. \]

10.29 Conditional Probability

Let $A$ and $B$ be two events such that $P(A) > 0$. Denote by $P(B \mid A)$ the probability of $B$ given that $A$ has occurred. Since $A$ is known to have occurred, it becomes the new sample space replacing the original $S$. From this we are led to the definition
\[ P(B \mid A) \equiv \frac{P(A \cap B)}{P(A)} \tag{10.1} \]
or
\[ P(A \cap B) \equiv P(A)\,P(B \mid A). \tag{10.2} \]
In words, this says that the probability that both $A$ and $B$ occur is equal to the probability that $A$ occurs times the probability that $B$ occurs given that $A$ has occurred. We call $P(B \mid A)$ the conditional probability of $B$ given $A$, i.e., the probability that $B$ will occur given that $A$ has occurred. It is easy to show that conditional probability satisfies the axioms of probability previously discussed.

Example 10.29.1.
Find the probability that a single toss of a die will result in a number less than 4 if (a) no other information is given and (b) it is given that the toss resulted in an odd number.

Solution: (a) Let $B$ denote the event {less than 4}. Since $B$ is the union of the events 1, 2, or 3 turning up, we see by Theorem 10.27.5 that
\[ P(B) = P(1) + P(2) + P(3) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}, \]
assuming equal probabilities for the sample points.

(b) Letting $A$ be the event {odd number}, we see that $P(A) = \frac{3}{6} = \frac{1}{2}$. Also, $P(A \cap B) = \frac{2}{6} = \frac{1}{3}$. Then
\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{1/3}{1/2} = \frac{2}{3}. \]
Hence, the added knowledge that the toss results in an odd number raises the probability from 1/2 to 2/3.

10.30 Theorems on Conditional Probability

Theorem 10.30.1. For any three events $A_1, A_2, A_3$, we have
\[ P(A_1 \cap A_2 \cap A_3) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 \cap A_2). \tag{10.3} \]
In words, the probability that $A_1$ and $A_2$ and $A_3$ all occur is equal to the probability that $A_1$ occurs times the probability that $A_2$ occurs given that $A_1$ has occurred times the probability that $A_3$ occurs given that both $A_1$ and $A_2$ have occurred. The result is easily generalized to $n$ events.

Theorem 10.30.2. If an event $A$ must result in one of the mutually exclusive events $A_1, A_2, \ldots, A_n$, then
\[ P(A) = P(A_1)P(A \mid A_1) + P(A_2)P(A \mid A_2) + \cdots + P(A_n)P(A \mid A_n). \tag{10.4} \]

10.31 Independent Events

If $P(B \mid A) = P(B)$, i.e., the probability of $B$ occurring is not affected by the occurrence or nonoccurrence of $A$, then we say that $A$ and $B$ are independent events. This is equivalent to
\[ P(A \cap B) = P(A)P(B). \tag{10.5} \]
Conversely, if this equation holds, then $A$ and $B$ are independent.

We say that three events $A_1, A_2, A_3$ are independent if they are pairwise independent,
\[ P(A_j \cap A_k) = P(A_j)P(A_k), \qquad j \ne k, \text{ where } j, k = 1, 2, 3, \tag{10.6} \]
and
\[ P(A_1 \cap A_2 \cap A_3) = P(A_1)P(A_2)P(A_3). \tag{10.7} \]
Both of these properties must hold in order for the events to be independent. Independence of more than three events is easily defined.
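The die calculation of Example 10.29.1 and the independence test (10.5) can be checked by direct enumeration. The following is a minimal sketch in Python, assuming a fair die; the helper names `prob` and `cond_prob` are illustrative and not part of the notes.

```python
from fractions import Fraction

# Sample space for one toss of a fair die (equal probabilities assumed).
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability h/n of an event, given as a subset of S."""
    return Fraction(len(event & S), len(S))

def cond_prob(b, a):
    """P(B|A) = P(A and B) / P(A); only meaningful when P(A) > 0."""
    return prob(a & b) / prob(a)

A = {1, 3, 5}   # the toss is odd
B = {1, 2, 3}   # the toss is less than 4

p_b = prob(B)                    # part (a)
p_b_given_a = cond_prob(B, A)    # part (b)

# A and B are independent exactly when P(A and B) = P(A) P(B).
independent = prob(A & B) == prob(A) * prob(B)

print(p_b, p_b_given_a, independent)  # 1/2 2/3 False
```

Exact fractions are used instead of floating point so that the equality test for independence is not affected by rounding. Here $P(A \cap B) = 1/3$ while $P(A)P(B) = 1/4$, so the two events are not independent.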
Note: In order to use this multiplication rule, all of your events must be independent.

10.32 Bayes' Theorem or Rule

Suppose that $A_1, A_2, \ldots, A_n$ are mutually exclusive events whose union is the sample space $S$, i.e., one of the events must occur. Then if $A$ is any event, we have the following important theorem:

Theorem 10.32.1. (Bayes' Rule):
\[ P(A_k \mid A) = \frac{P(A_k)\,P(A \mid A_k)}{\sum_{j=1}^{n} P(A_j)\,P(A \mid A_j)}. \tag{10.8} \]
This enables us to find the probabilities of the various events $A_1, A_2, \ldots, A_n$ that can cause $A$ to occur. For this reason Bayes' theorem is often referred to as a theorem on the probability of causes.

10.33 Tutorial 10

1. There are 3 arrangements of the word DAD, namely DAD, ADD, and DDA. How many arrangements are there of the word PROBABILITY?

2. A ball is drawn at random from a box containing 8 red balls, 17 white balls, and 9 blue balls. Determine the probability that it is (i) white, (ii) not blue, (iii) red or blue, (iv) neither white nor red.

3. A card is picked from a deck of 52 playing cards, without replacement, and then another one is picked. What is the probability of picking (i) two red cards, (ii) one of each colour?

4. A die is loaded in such a way that each odd number is twice as likely to occur as each even number. Find $P(G)$, where $G$ is the event that a number greater than 3 occurs on a single roll of the die.

5. Events $X$ and $Y$ are such that $P(X \mid Y) = 0.4$, $P(Y \mid X) = 0.25$, $P(X \cap Y) = 0.12$.
(i) Calculate the value of $P(Y)$.
(ii) Give a reason why $X$ and $Y$ are not independent.
(iii) Calculate the value of $P(X \cap Y')$.

6. Two dice are rolled. A = {'sum of two dice equals 3'}, B = {'sum of two dice equals 7'}.
(a) What is $P(A \mid C)$?
(b) What is $P(B \mid C)$?
(c) Are $A$ and $C$ independent? What about $B$ and $C$?

7. From a batch of 100 items of which 20 are defective, exactly two items are chosen, one at a time without replacement.
Calculate the probabilities that:
(a) the first item chosen is defective
(b) both items chosen are defective
(c) the second item chosen is defective.

8. The punctuality of buses has been investigated by considering a number of bus journeys. In the sample, 60% of buses had a destination of Masvingo, 20% Bulawayo and 20% Mutare. The probabilities of a bus arriving late in Masvingo, Bulawayo or Mutare are 30%, 25% and 20% respectively. If a late bus is picked at random from the group under consideration, what is the probability that it terminated in Masvingo?

9. Machines M and N produce 10% and 90% respectively of the production of a component intended for the motor industry. From experience, it is known that the probability that machine M produces a defective component is 0.01 while the probability that machine N produces a defective component is 0.05. If a component is selected at random from a day's production and is found to be defective, find the probability that it was made by (a) machine M, (b) machine N.

Complex Numbers and Polynomials

There are no secrets about the world of nature. There are secrets about the thoughts and intentions of men.
- Robert Oppenheimer

10.34 Introduction

No one person invented complex numbers, but controversies surrounding the use of these numbers existed in the sixteenth century. In their quest to solve polynomial equations by formulas involving radicals, early dabblers in mathematics were forced to admit that there were other kinds of numbers besides positive integers. Equations such as $x^2 + 2x + 2 = 0$ and $x^3 = 6x + 4$ that yielded solutions $-1 + \sqrt{-1}$ and $\sqrt[3]{2 + \sqrt{-4}} + \sqrt[3]{2 - \sqrt{-4}}$ caused particular consternation within the community of fledgling mathematical scholars because everyone knew that there are no numbers such as $\sqrt{-1}$ and $\sqrt{-4}$, numbers whose square is negative.
Such numbers exist only in one's imagination, or as one philosopher opined, "the imaginary, (the) bosom child of complex mysticism." Over time these imaginary numbers did not go away, mainly because mathematicians as a group are tenacious and some are even practical. A famous mathematician held that even though they exist in our imagination, nothing prevents us from employing them in calculations. Mathematicians also hate to throw anything away. After all, a memory still lingered that negative numbers at first were branded fictitious. The concept of number evolved over centuries; gradually the set of numbers grew from just positive integers to include rational numbers, negative numbers, and irrational numbers. But in the nineteenth century the number concept took a gigantic evolutionary step forward when the German mathematician Carl Friedrich Gauss put the so-called imaginary numbers, or complex numbers as they were now beginning to be called, on a logical and consistent footing by treating them as an extension of the real number system.

10.35 Complex Numbers

The set of all complex numbers is usually denoted by $\mathbb{C}$. Since $x^2 \ge 0$ for every real number $x$, the equation $x^2 + 1 = 0$ has no real solutions. Introduce the imaginary number
\[ i = \sqrt{-1}, \]
which is assumed to have the property
\[ i^2 = (\sqrt{-1})^2 = -1. \]
(Footnote 1: The symbol $i$ was first used by the Swiss mathematician Leonhard Euler in 1777.)

Complex numbers are usually written in the form $a + bi$, where $a$ and $b$ are real numbers, or can be regarded as the ordered pair $(a, b)$.

Ordered Pair    Equivalent Notation
(3, 4)          3 + 4i
(0, 1)          0 + i
(2, 0)          2 + 0i
(4, -2)         4 + (-2)i

Geometrically, a complex number can be viewed either as a point or as a vector in the $xy$-plane. Let us denote $z = a + bi$. The real number $a$ is called the real part of $z$ and the real number $b$ is called the imaginary part of $z$. These numbers are denoted $\mathrm{Re}(z)$ and $\mathrm{Im}(z)$ respectively.

Example 10.35.1. $\mathrm{Re}(4 - 3i) = 4$ and $\mathrm{Im}(4 - 3i) = -3$.
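The ordered-pair view of a complex number can be tried out directly. This is a small sketch using Python's built-in `complex` type, which writes the imaginary unit as `j` rather than $i$; the variable names are illustrative.

```python
# The ordered pair (4, -3) corresponds to the complex number 4 - 3i.
z = complex(4, -3)

real_part = z.real   # Re(z)
imag_part = z.imag   # Im(z)
print(real_part, imag_part)  # 4.0 -3.0

# The imaginary unit is the ordered pair (0, 1); its square is -1.
i = complex(0, 1)
i_squared = i * i
print(i_squared)  # (-1+0j)
```

Python stores both parts as floating-point numbers, which is why the real and imaginary parts print as `4.0` and `-3.0`.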
When complex numbers are represented geometrically in the $xy$-coordinate system, the $x$-axis is called the real axis, the $y$-axis the imaginary axis, and the plane is called the complex plane.

Definition 10.35.1. Two complex numbers $a + bi$ and $c + di$ are defined to be equal, written $a + bi = c + di$, if $a = c$ and $b = d$.

When $a = 0$, the number $a + bi$ reduces to $0 + bi = bi$. These complex numbers, which correspond to points on the imaginary axis, are called purely imaginary numbers. For example, $z = 8i$ is a purely imaginary number.

10.35.1 Operations

Complex numbers can be added, subtracted, multiplied and divided.
\[ (a + bi) + (c + di) = (a + c) + (b + d)i. \]
\[ (a + bi) - (c + di) = (a - c) + (b - d)i. \]
\[ k(a + bi) = (ka) + (kb)i, \quad k \in \mathbb{R} \quad \text{(multiplication by a real number)}. \]
Because $(-1)z + z = 0$, we denote $(-1)z$ as $-z$ and call it the negative of $z$.

Example 10.35.2. If $z_1 = 4 - 5i$ and $z_2 = -1 + 6i$, find $z_1 + z_2$, $z_1 - z_2$, $3z_1$ and $-z_2$.

Solution:
\[ z_1 + z_2 = (4 - 5i) + (-1 + 6i) = (4 - 1) + (-5 + 6)i = 3 + i. \]
\[ z_1 - z_2 = (4 - 5i) - (-1 + 6i) = (4 + 1) + (-5 - 6)i = 5 - 11i. \]
\[ 3z_1 = 3(4 - 5i) = 12 - 15i. \]
\[ -z_2 = (-1)z_2 = (-1)(-1 + 6i) = 1 - 6i. \]

Multiplying two complex numbers $(a + bi)(c + di)$ in the usual way and using $i^2 = -1$ yields
\[ (a + bi)(c + di) = ac + bd\,i^2 + adi + bci = (ac - bd) + (ad + bc)i. \]

Example 10.35.3.
1. $(3 + 2i)(4 + 5i) = (3 \cdot 4 - 2 \cdot 5) + (3 \cdot 5 + 2 \cdot 4)i = 2 + 23i$.
2. $i^2 = (0 + i)(0 + i) = (0 \cdot 0 - 1 \cdot 1) + (0 \cdot 1 + 1 \cdot 0)i = -1$.

10.35.2 Rules of Complex Arithmetic

Given that $z, z_1, z_2, z_3 \in \mathbb{C}$, then
1. $z_1 + z_2 = z_2 + z_1$.
2. $z_1 z_2 = z_2 z_1$.
3. $z_1 + (z_2 + z_3) = (z_1 + z_2) + z_3$.
4. $z_1 (z_2 z_3) = (z_1 z_2) z_3$.
5. $z_1 (z_2 + z_3) = z_1 z_2 + z_1 z_3$.
6. $0 + z = z$.
7. $z + (-z) = 0$.
8. $1 \cdot z = z$.

10.36 Modulus, Complex Conjugate and Division

10.36.1 Complex Conjugate

If $z = a + bi$ is any complex number, then the conjugate of $z$, denoted by $\bar{z}$, is defined as $\bar{z} = a - bi$. Geometrically, $\bar{z}$ is the reflection of $z$ about the real axis.

Example 10.36.1.
1. If $z = 3 + 2i$, then $\bar{z} = 3 - 2i$.
2. If $z = -4 - 2i$, then $\bar{z} = -4 + 2i$.
3. If $z = 4$, then $\bar{z} = 4$.
So $\bar{z} = z$ if and only if $z$ is a real number.

10.36.2 Modulus of a Complex Number

Definition 10.36.1. The modulus of a complex number $z = a + bi$, denoted $|z|$, is defined by
\[ |z| = \sqrt{a^2 + b^2}. \]
If $b = 0$, then $z = a$ is a real number, and
\[ |z| = \sqrt{a^2 + 0^2} = \sqrt{a^2} = |a|. \]
So the modulus of a real number is simply its absolute value.

Example 10.36.2. Find $|z|$ if $z = 3 - 4i$.

Solution: $|z| = \sqrt{3^2 + (-4)^2} = \sqrt{25} = 5$.

Theorem 10.36.1. For any complex number $z$,
\[ z\bar{z} = |z|^2 \quad \text{or} \quad |z| = \sqrt{z\bar{z}}. \]

Proof. If $z = a + bi$, then
\[ z\bar{z} = (a + bi)(a - bi) = a^2 - abi + bai - b^2 i^2 = a^2 + b^2 = |z|^2. \]

The modulus of a complex number has the additional properties
\[ |z_1 z_2| = |z_1||z_2| \quad \text{and} \quad \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}. \]

10.36.3 Division of Complex Numbers

For division, with $z_2 \ne 0$,
\[ \frac{z_1}{z_2} = \frac{z_1 \bar{z}_2}{|z_2|^2}. \]

Example 10.36.3. Express $\dfrac{3 + 4i}{1 - 2i}$ in the form $a + bi$.

Solution:
\[ \frac{3 + 4i}{1 - 2i} = \frac{(3 + 4i)(1 + 2i)}{(1 - 2i)(1 + 2i)} = \frac{3 + 6i + 4i + 8i^2}{1 + 2i - 2i - 4i^2} = \frac{-5 + 10i}{5} = -1 + 2i. \]

10.36.4 Properties of the Conjugate

Theorem 10.36.2. For any complex numbers $z$, $z_1$ and $z_2$:
(a) $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$.
(b) $\overline{z_1 - z_2} = \bar{z}_1 - \bar{z}_2$.
(c) $\overline{z_1 \cdot z_2} = \bar{z}_1 \cdot \bar{z}_2$.
(d) $\overline{\left(\dfrac{z_1}{z_2}\right)} = \dfrac{\bar{z}_1}{\bar{z}_2}$.
(e) $\bar{\bar{z}} = z$.

Proof. (a) Let $z_1 = a_1 + b_1 i$ and $z_2 = a_2 + b_2 i$. Then
\[ \overline{z_1 + z_2} = \overline{(a_1 + a_2) + (b_1 + b_2)i} = (a_1 + a_2) - (b_1 + b_2)i = (a_1 - b_1 i) + (a_2 - b_2 i) = \bar{z}_1 + \bar{z}_2. \]

Since $|z| = (z\bar{z})^{1/2} = \sqrt{a^2 + b^2} = \left((\mathrm{Re}(z))^2 + (\mathrm{Im}(z))^2\right)^{1/2}$, we have
\[ \mathrm{Re}(z) \le |\mathrm{Re}(z)| = \sqrt{(\mathrm{Re}(z))^2} \le \sqrt{(\mathrm{Re}(z))^2 + (\mathrm{Im}(z))^2} = |z|. \]
Similarly, $\mathrm{Im}(z) \le |\mathrm{Im}(z)| \le |z|$.

For any two complex numbers $z_1$ and $z_2$, we have that
\[ |z_1 + z_2| \le |z_1| + |z_2|. \]
This is called the triangle inequality.

Proof.
\[ |z_1 + z_2|^2 = (z_1 + z_2)\overline{(z_1 + z_2)} = (z_1 + z_2)(\bar{z}_1 + \bar{z}_2) = z_1\bar{z}_1 + z_1\bar{z}_2 + \bar{z}_1 z_2 + z_2\bar{z}_2 = z_1\bar{z}_1 + 2\mathrm{Re}(z_1\bar{z}_2) + z_2\bar{z}_2, \]
where the middle terms combine because $\bar{z}_1 z_2$ is the conjugate of $z_1\bar{z}_2$. Using the fact that $2\mathrm{Re}(z_1\bar{z}_2) \le 2|z_1\bar{z}_2| = 2|z_1||z_2|$, we get
\[ |z_1 + z_2|^2 \le |z_1|^2 + 2|z_1||z_2| + |z_2|^2 = (|z_1| + |z_2|)^2. \]
Taking square roots, the result follows, that is, $|z_1 + z_2| \le |z_1| + |z_2|$.
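The division rule of Section 10.36.3 and the triangle inequality can be checked numerically. The sketch below uses Python's built-in `complex` type and reuses the numbers of Examples 10.36.2 and 10.36.3; the variable names are illustrative.

```python
z1 = complex(3, 4)
z2 = complex(1, -2)

# z2 times its conjugate is the real number |z2|^2 (here 1 + 4 = 5).
modulus_squared = (z2 * z2.conjugate()).real

# Division via the conjugate: z1 / z2 = z1 * conj(z2) / |z2|^2.
quotient = z1 * z2.conjugate() / modulus_squared
print(quotient)  # (-1+2j)

# Modulus of 3 - 4i, as in Example 10.36.2.
modulus = abs(complex(3, -4))
print(modulus)  # 5.0

# Triangle inequality |z1 + z2| <= |z1| + |z2| for this particular pair.
triangle_holds = abs(z1 + z2) <= abs(z1) + abs(z2)
print(triangle_holds)  # True
```

A numerical check of one pair illustrates the inequality but does not prove it; the proof is the argument given above.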
10.37 Polar Representation of Complex Numbers

If $z = x + iy$ is a non-zero complex number, $r = |z|$, and $\theta$ measures the angle from the positive real axis to the vector $z$,

[Figure 10.1: Polar form. The point $P = (r, \theta)$, where $r$ is the directed distance and $\theta$ the directed angle.]

then $x = r\cos\theta$ and $y = r\sin\theta$, so that $z = x + iy$ can be written as
\[ z = r\cos\theta + ir\sin\theta = r(\cos\theta + i\sin\theta). \]
This is called a polar form of $z$. The angle $\theta$ is called an argument of $z$ and is denoted by $\theta = \arg z$. The argument of $z$ is not uniquely determined because we can add or subtract any multiple of $2\pi$ from $\theta$ to produce another value of the argument. The one value of the argument in radians that satisfies $-\pi < \theta \le \pi$ is called the principal argument of $z$ and is denoted by $\theta = \mathrm{Arg}\, z$.

Example 10.37.1. Express $z = 1 + \sqrt{3}i$ in polar form using the principal argument.

Solution: The value of $r$ is $r = |z| = \sqrt{(1)^2 + (\sqrt{3})^2} = \sqrt{4} = 2$. Since $x = 1$ and $y = \sqrt{3}$, it follows that $1 = 2\cos\theta$ and $\sqrt{3} = 2\sin\theta$. So $\cos\theta = \frac{1}{2}$ and $\sin\theta = \frac{\sqrt{3}}{2}$. The only value of $\theta$ that satisfies these relations and meets the requirement $-\pi < \theta \le \pi$ is $\theta = \frac{\pi}{3}$. The polar form of $z$ is
\[ z = 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right). \]

We now show how polar forms can be used to give geometric interpretations of multiplication and division of complex numbers. Let $z_1 = r_1(\cos\theta_1 + i\sin\theta_1)$ and $z_2 = r_2(\cos\theta_2 + i\sin\theta_2)$. Multiplying, we obtain
\[ z_1 z_2 = r_1 r_2\left[(\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)\right]. \]
Recall:
\[ \cos(\theta_1 + \theta_2) = \cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2, \]
\[ \sin(\theta_1 + \theta_2) = \sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2. \]
We obtain
\[ z_1 z_2 = r_1 r_2\left[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)\right], \]
which is a polar form of the complex number with modulus $r_1 r_2$ and argument $\theta_1 + \theta_2$. Thus, we have shown that
\[ |z_1 z_2| = |z_1||z_2| \quad \text{and} \quad \arg(z_1 z_2) = \arg z_1 + \arg z_2. \]
Also, if $z_2 \ne 0$,
\[ \frac{z_1}{z_2} = \frac{r_1}{r_2}\left[\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2)\right], \]
from which it follows that
\[ \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|} \quad \text{and} \quad \arg\left(\frac{z_1}{z_2}\right) = \arg z_1 - \arg z_2. \]
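The polar form and the product rule can be explored with Python's standard `cmath` module. This is a sketch only: `cmath.polar` returns the modulus together with the principal argument in $(-\pi, \pi]$, and the second pair of numbers is chosen purely for illustration.

```python
import cmath
import math

# Example 10.37.1: z = 1 + sqrt(3) i has modulus 2 and principal argument pi/3.
z = complex(1, math.sqrt(3))
r, theta = cmath.polar(z)
print(math.isclose(r, 2.0), math.isclose(theta, math.pi / 3))

# Product rule: moduli multiply and arguments add.
z1 = cmath.rect(2, math.pi / 3)   # 2(cos pi/3 + i sin pi/3)
z2 = cmath.rect(3, math.pi / 6)   # 3(cos pi/6 + i sin pi/6)
r_prod, theta_prod = cmath.polar(z1 * z2)
print(math.isclose(r_prod, 6.0), math.isclose(theta_prod, math.pi / 2))
```

Comparisons use `math.isclose` because the trigonometric values are floating-point approximations. When the sum $\theta_1 + \theta_2$ falls outside $(-\pi, \pi]$, the principal argument returned by `cmath.polar` differs from the sum by a multiple of $2\pi$, in line with the remark that the argument is not unique.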
10.38 De Moivre's Formula

If $n$ is a positive integer and $z = r(\cos\theta + i\sin\theta)$, then
\[ z^n = z \cdot z \cdots z = r^n\Big[\cos(\underbrace{\theta + \theta + \cdots + \theta}_{n \text{ terms}}) + i\sin(\underbrace{\theta + \theta + \cdots + \theta}_{n \text{ terms}})\Big] \]
or
\[ z^n = r^n(\cos n\theta + i\sin n\theta). \tag{10.9} \]
In the special case $r = 1$, we have $z = \cos\theta + i\sin\theta$, so that (10.9) becomes
\[ (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta, \]
which is called De Moivre's formula.

10.38.1 Application of De Moivre's Formula

De Moivre's formula is used to obtain roots of complex numbers. Recall from algebra that $-2$ and $2$ are said to be square roots of the number 4 because $(-2)^2 = 4$ and $(2)^2 = 4$. In other words, the two square roots of 4 are distinct solutions of the equation $w^2 = 4$. If $n$ is a positive integer and $z$ is any complex number, then we define an $n$th root of $z$ to be any complex number $w$ that satisfies the equation
\[ w^n = z \tag{10.10} \]
and denote an $n$th root of $z$ by $z^{1/n}$. Let $w = \rho(\cos\alpha + i\sin\alpha)$ and $z = r(\cos\theta + i\sin\theta)$. Then
\[ \rho^n(\cos n\alpha + i\sin n\alpha) = r(\cos\theta + i\sin\theta). \]
Comparing the moduli of the two sides, we see that
\[ \rho^n = r \quad \text{or} \quad \rho = \sqrt[n]{r}, \]
where $\sqrt[n]{r}$ denotes the real positive $n$th root of $r$. In order to have $\cos n\alpha = \cos\theta$ and $\sin n\alpha = \sin\theta$, the angles $n\alpha$ and $\theta$ must be either equal or differ by a multiple of $2\pi$, that is
\[ n\alpha = \theta + 2k\pi, \qquad k = 0, \pm 1, \pm 2, \ldots \]
\[ \alpha = \frac{\theta}{n} + \frac{2k\pi}{n}, \qquad k = 0, \pm 1, \pm 2, \ldots \]
Thus, the values of $w = \rho(\cos\alpha + i\sin\alpha)$ that satisfy (10.10) are given by
\[ w = \sqrt[n]{r}\left[\cos\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right) + i\sin\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)\right], \qquad k = 0, \pm 1, \pm 2, \ldots \]
Although there are infinitely many values of $k$, it can be shown that $k = 0, 1, 2, \ldots, n - 1$ produce distinct values of $w$ satisfying (10.10), and all other choices of $k$ yield duplicates of these.

Example 10.38.1. Find all the cube roots of $-8$.

Solution: Since $-8$ lies on the negative real axis, we can use $\pi$ as an argument. Here $r = |z| = |-8| = 8$, so a polar form of $-8$ is
\[ -8 = 8(\cos\pi + i\sin\pi). \]
Here $n = 3$, hence
\[ (-8)^{1/3} = \sqrt[3]{8}\left[\cos\left(\frac{\pi}{3} + \frac{2k\pi}{3}\right) + i\sin\left(\frac{\pi}{3} + \frac{2k\pi}{3}\right)\right], \qquad k = 0, 1, 2. \]
Thus, the cube roots of $-8$ are
\[ k = 0: \quad 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right) = 2\left(\frac{1}{2} + \frac{\sqrt{3}}{2}i\right) = 1 + \sqrt{3}i, \]
\[ k = 1: \quad 2(\cos\pi + i\sin\pi) = 2(-1) = -2, \]
\[ k = 2: \quad 2\left(\cos\frac{5\pi}{3} + i\sin\frac{5\pi}{3}\right) = 2\left(\frac{1}{2} - \frac{\sqrt{3}}{2}i\right) = 1 - \sqrt{3}i. \]

10.39 Applications of Complex Numbers

10.39.1 The Quadratic Formula

Example 10.39.1. Solve the quadratic equation $z^2 + (1 - i)z - 3i = 0$.

Solution: From the quadratic formula, we have
\[ z = \frac{-(1 - i) + \left[(1 - i)^2 - 4(-3i)\right]^{1/2}}{2} = \frac{1}{2}\left[-1 + i + (10i)^{1/2}\right]. \]
We compute $(10i)^{1/2}$ with $r = 10$, $\theta = \frac{\pi}{2}$ and $n = 2$, for $k = 0$ and $k = 1$. The two square roots of $10i$ are
\[ w_0 = \sqrt{10}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right) = \sqrt{10}\left(\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}i\right) = \sqrt{5} + \sqrt{5}i, \]
\[ w_1 = \sqrt{10}\left(\cos\frac{5\pi}{4} + i\sin\frac{5\pi}{4}\right) = \sqrt{10}\left(-\frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}}i\right) = -\sqrt{5} - \sqrt{5}i. \]
Therefore the two values are $z_1 = \frac{1}{2}\left[-1 + i + (\sqrt{5} + \sqrt{5}i)\right]$ and $z_2 = \frac{1}{2}\left[-1 + i + (-\sqrt{5} - \sqrt{5}i)\right]$. These solutions, written in the form $z = a + bi$, are
\[ z_1 = \frac{1}{2}(\sqrt{5} - 1) + \frac{1}{2}(\sqrt{5} + 1)i \quad \text{and} \quad z_2 = -\frac{1}{2}(\sqrt{5} + 1) - \frac{1}{2}(\sqrt{5} - 1)i. \]

10.39.2 Roots of Polynomials

A polynomial in $x$ is a function of the form
\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \]

Example 10.39.2. $x^3 - 2x + 4$.

A number $a$ (real or complex) is said to be a root of the polynomial $p(x)$ if $p(a) = 0$.

Example 10.39.3. $x = 1$ is a root of $x^2 - 2x + 1$, since $1^2 - 2 + 1 = 0$.

A number $a$ (real or complex) is a root of the polynomial $p(x)$ if and only if $(x - a)$ is a factor of $p(x)$. It may be possible to pull more than one factor $(x - a)$ out of the polynomial. In such cases $a$ is said to be a multiple root of $p(x)$. A root is called a simple root if it produces exactly one factor.

10.39.3 The Fundamental Theorem of Algebra

Theorem 10.39.1 (The Fundamental Theorem of Algebra). Let $p(x)$ be any polynomial of degree $n$. Then $p(x)$ can be factorized into a product of a constant and $n$ factors of the form $(x - a)$, where $a$ may be real or complex.

Suppose the polynomial has real coefficients. If the complex number $z$ is a root of the polynomial, then the complex conjugate $\bar{z}$ is also a root.

Example 10.39.4. Let $p(z) = z^4 - 4z^3 + 9z^2 - 16z + 20$.
Given that $2 + i$ is a root, express $p(z)$ as a product of real quadratic factors.

Solution: Given that $2 + i$ is a root, it follows that $2 - i$ must also be a root and so the quadratic
\[ (z - (2 + i))(z - (2 - i)) = z^2 - 4z + 5 \]
must be a factor. Dividing the given polynomial by this factor gives
\[ p(z) = z^4 - 4z^3 + 9z^2 - 16z + 20 = (z^2 - 4z + 5)(z^2 + 4). \]

Example 10.39.5. Solve $z^3 + 3z^2 + 2z - 6 = 0$ and express the left hand side as a product of irreducible factors.

Solution: Since the equation is a polynomial equation of odd degree with real coefficients, there is at least one real solution. To find that solution by trial and error, the factors of the constant term are substituted into the polynomial. The factors of 6 are $\pm 1, \pm 2, \pm 3, \pm 6$. Substituting $z = 1$ gives
\[ 1 + 3 + 2 - 6 = 0, \]
so $z = 1$ is a solution and $(z - 1)$ is a factor. So
\[ \frac{z^3 + 3z^2 + 2z - 6}{z - 1} = z^2 + 4z + 6 \]
and the other solutions are $z = -2 \pm \sqrt{2}i$, and so
\[ z^3 + 3z^2 + 2z - 6 = (z - 1)(z^2 + 4z + 6) \]
as a product of irreducible real factors.

Exercise 10.39.1. Express $z^5 - 1$ as a product of real linear and quadratic factors.

10.40 Tutorial 11

1. Given that $z_1 = 3 - 8i$ and $z_2 = -7 + i$, find (i) $iz_1 + 2z_2$, (ii) $z_1 + z_2$.

2. Express each of the following complex numbers in polar form and represent each number on an Argand diagram:
(i) $-1 - i$, (ii) $3 - 3\sqrt{3}i$, (iii) $-2 - \frac{2}{\sqrt{2}}i$.

3. If $z = \dfrac{2 + i}{1 - i}$, find the real and imaginary parts of $z + \dfrac{1}{z}$.
(a) Let $z \in \mathbb{C}$, and let $w = \dfrac{z - i}{z + i}$.
   i. Evaluate $w$ when $z = 0$, and when $z = 1$.
   ii. Let $z = \beta$ where $\beta \in \mathbb{R}$. Show that for any such $z$ the corresponding $w$ always has unit modulus.
(b) i. Express the complex number $z = 24 + 7i$ in polar form.
   ii. Find the four values of $z^{1/4}$ in exponential form, and plot them on an Argand diagram.

4. Show that $\cos 6\varphi = 32\cos^6\varphi - 48\cos^4\varphi + 18\cos^2\varphi - 1$.

5. Consider the polynomial $p(z) = z^4 - 3z^3 + rz^2 + sz + t$, where $r$, $s$, and $t$ are real constants. Given that two of the roots of $p(z)$ are $2$ and $1 + 2i$, determine the values of $r$, $s$ and $t$.

6. Find all values of $z$ for which $z^4 + 2\sqrt{3}i + 2 = 0$.

Chapter 11 Theory of Matrices

Any man who can drive safely while kissing a pretty girl is simply not giving the kiss the attention it deserves.
- Albert Einstein

11.1 Matrices

Definition 11.1.1. A matrix over a field $K$ (elements of $K$ are called numbers or scalars) is a rectangular array of scalars presented in the following form
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \cdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}. \]
The rows of such a matrix are the $m$ horizontal lists of scalars, that is
\[ (a_{11}, a_{12}, \ldots, a_{1n}),\ (a_{21}, a_{22}, \ldots, a_{2n}),\ \ldots,\ (a_{m1}, a_{m2}, \ldots, a_{mn}), \]
and the columns of $A$ are the $n$ vertical lists of scalars,
\[ \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix},\ \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix},\ \ldots,\ \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}. \]
The element $a_{ij}$, called the $ij$-entry or $ij$-element, appears in row $i$ and column $j$. We denote a matrix simply by $A = [a_{ij}]$. A matrix with $m$ rows and $n$ columns is called an $m$ by $n$ matrix, written $m \times n$. The pair of numbers $m$ and $n$ is called the size of the matrix. Two matrices are equal, written $A = B$, if they have the same size and if corresponding elements are equal.

Example 11.1.1. Find $x, y, z, t$ such that
\[ \begin{bmatrix} x + y & 2z + t \\ x - y & z - t \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 1 & 5 \end{bmatrix}. \]

Solution: By definition of equality of matrices, the four corresponding entries must be equal. Thus
\[ x + y = 3, \quad x - y = 1, \quad 2z + t = 7, \quad z - t = 5. \]
Solving the above system of equations yields $x = 2$, $y = 1$, $z = 4$, $t = -1$.

A matrix whose entries are all zero is called a zero matrix.

Example 11.1.2.
\[ A = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \qquad P = \begin{bmatrix} 0 & 0 \end{bmatrix}. \]

Matrices whose entries are all real numbers are called real matrices and are said to be matrices over $\mathbb{R}$. Matrices whose entries are all complex numbers are called complex matrices and are said to be matrices over $\mathbb{C}$.

11.2 Matrix Addition and Scalar Multiplication

Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two matrices with the same size, say $m \times n$ matrices.
The sum of $A$ and $B$, written $A + B$, is the matrix obtained by adding corresponding elements from $A$ and $B$, that is
\[ A + B = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \vdots & \vdots & \cdots & \vdots \\ a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn} \end{bmatrix}. \]
The product of a matrix $A$ by a scalar $k$, written $kA$, is the matrix obtained by multiplying each element of $A$ by $k$, that is
\[ kA = \begin{bmatrix} ka_{11} & ka_{12} & \cdots & ka_{1n} \\ ka_{21} & ka_{22} & \cdots & ka_{2n} \\ \vdots & \vdots & \cdots & \vdots \\ ka_{m1} & ka_{m2} & \cdots & ka_{mn} \end{bmatrix}. \]
Observe that $A + B$ and $kA$ are also $m \times n$ matrices. We also define
\[ -A = (-1)A \quad \text{and} \quad A - B = A + (-B). \]
The matrix $-A$ is called the negative of the matrix $A$, and the matrix $A - B$ is called the difference of the matrices $A$ and $B$.

Example 11.2.1. Let
\[ A = \begin{bmatrix} 1 & -2 & 3 \\ 0 & 4 & 5 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 4 & 6 & 8 \\ 1 & -3 & -7 \end{bmatrix}, \]
then
\[ A + B = \begin{bmatrix} 1 + 4 & -2 + 6 & 3 + 8 \\ 0 + 1 & 4 + (-3) & 5 + (-7) \end{bmatrix} = \begin{bmatrix} 5 & 4 & 11 \\ 1 & 1 & -2 \end{bmatrix} \]
and
\[ 3A = \begin{bmatrix} 3(1) & 3(-2) & 3(3) \\ 3(0) & 3(4) & 3(5) \end{bmatrix} = \begin{bmatrix} 3 & -6 & 9 \\ 0 & 12 & 15 \end{bmatrix}. \]

11.2.1 Properties

Theorem 11.2.1. Consider any matrices $A$, $B$ and $C$ (with the same size) and scalars $k$ and $l$. Then
(i) $(A + B) + C = A + (B + C)$.
(ii) $A + 0 = 0 + A = A$.
(iii) $A + (-A) = (-A) + A = 0$.
(iv) $A + B = B + A$.
(v) $k(A + B) = kA + kB$.
(vi) $(k + l)A = kA + lA$.
(vii) $(kl)A = k(lA)$.
(viii) $1 \cdot A = A$.

Proof. (i) Suppose $A = [a_{ij}]$, $B = [b_{ij}]$ and $C = [c_{ij}]$. We need to show that corresponding $ij$-entries on each side of the matrix equation are equal. The $ij$-entry of $A + B$ is $a_{ij} + b_{ij}$, hence the $ij$-entry of $(A + B) + C$ is $(a_{ij} + b_{ij}) + c_{ij}$. On the other hand, the $ij$-entry of $B + C$ is $b_{ij} + c_{ij}$ and hence the $ij$-entry of $A + (B + C)$ is $a_{ij} + (b_{ij} + c_{ij})$. However, for scalars in $K$,
\[ (a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij}). \]
Thus $(A + B) + C$ and $A + (B + C)$ have identical $ij$-entries. Therefore $(A + B) + C = A + (B + C)$.

Proof. (v) The $ij$-entry of $A + B$ is $a_{ij} + b_{ij}$, hence $k(a_{ij} + b_{ij})$ is the $ij$-entry of $k(A + B)$. On the other hand, the $ij$-entries of $kA$ and $kB$ are $ka_{ij}$ and $kb_{ij}$ respectively. Thus, $ka_{ij} + kb_{ij}$ is the $ij$-entry of $kA + kB$. However, for scalars in $K$, $k(a_{ij} + b_{ij}) = ka_{ij} + kb_{ij}$.
Thus, $k(A + B)$ and $kA + kB$ have identical $ij$-entries. Therefore, $k(A + B) = kA + kB$.

11.3 Matrix Multiplication

The product of matrices $A$ and $B$ is written as $AB$. The product $AB$ of a row matrix $A = [a_i]$ and a column matrix $B = [b_i]$ with the same number of elements is defined to be the scalar obtained by multiplying corresponding entries and adding, that is
\[ AB = [a_1, a_2, \ldots, a_n] \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = \sum_{k=1}^{n} a_k b_k. \]
The product $AB$ is not defined when $A$ and $B$ have different numbers of elements.

Example 11.3.1.
\[ [7, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix} = 7(3) + (-4)(2) + 5(-1) = 21 - 8 - 5 = 8. \]

We now define matrix multiplication in general.

Definition 11.3.1. Suppose $A = [a_{ik}]$ and $B = [b_{kj}]$ are matrices such that the number of columns of $A$ is equal to the number of rows of $B$, say, $A$ is an $m \times p$ matrix and $B$ is a $p \times n$ matrix. Then the product $AB$ is the $m \times n$ matrix whose $ij$-entry is obtained by multiplying the $i$th row of $A$ by the $j$th column of $B$, that is
\[ \begin{bmatrix} a_{11} & \cdots & a_{1p} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{ip} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mp} \end{bmatrix} \begin{bmatrix} b_{11} & \cdots & b_{1j} & \cdots & b_{1n} \\ \vdots & & \vdots & & \vdots \\ b_{p1} & \cdots & b_{pj} & \cdots & b_{pn} \end{bmatrix} = \begin{bmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & c_{ij} & \vdots \\ c_{m1} & \cdots & c_{mn} \end{bmatrix} \]
where
\[ c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{ip} b_{pj} = \sum_{k=1}^{p} a_{ik} b_{kj}. \]
The product $AB$ is not defined if $A$ is an $m \times p$ matrix and $B$ is a $q \times n$ matrix, where $p \ne q$.

Example 11.3.2. Find $AB$ where
\[ A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 2 & 0 & -4 \\ 5 & -2 & 6 \end{bmatrix}. \]

Solution: Since $A$ is $2 \times 2$ and $B$ is $2 \times 3$, the product $AB$ is defined and $AB$ is a $2 \times 3$ matrix. Hence
\[ AB = \begin{bmatrix} 2 + 15 & 0 - 6 & -4 + 18 \\ 4 - 5 & 0 + 2 & -8 - 6 \end{bmatrix} = \begin{bmatrix} 17 & -6 & 14 \\ -1 & 2 & -14 \end{bmatrix}. \]

Example 11.3.3. Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 5 & 6 \\ 0 & -2 \end{bmatrix}$. Then
\[ AB = \begin{bmatrix} 5 + 0 & 6 - 4 \\ 15 + 0 & 18 - 8 \end{bmatrix} = \begin{bmatrix} 5 & 2 \\ 15 & 10 \end{bmatrix} \]
and
\[ BA = \begin{bmatrix} 5 + 18 & 10 + 24 \\ 0 - 6 & 0 - 8 \end{bmatrix} = \begin{bmatrix} 23 & 34 \\ -6 & -8 \end{bmatrix}. \]
The above example shows that matrix multiplication is not commutative, that is, the products $AB$ and $BA$ of matrices need not be equal. Matrix multiplication satisfies the following properties.

Theorem 11.3.1. Let $A$, $B$ and $C$ be matrices. Then, whenever the products and sums are defined,
(i) $(AB)C = A(BC)$.
(ii) $A(B + C) = AB + AC$.
(iii) $(B + C)A = BA + CA$.
(iv) $k(AB) = (kA)B = A(kB)$, where $k$ is a scalar.

Proof. (i) $(AB)C = A(BC)$. Let $A = [a_{ij}]$, $B = [b_{jk}]$, $C = [c_{kl}]$ and let $AB = S = [s_{ik}]$ and $BC = T = [t_{jl}]$. Then
\[ s_{ik} = \sum_{j=1}^{m} a_{ij} b_{jk} \quad \text{and} \quad t_{jl} = \sum_{k=1}^{n} b_{jk} c_{kl}. \]
Multiplying $S = AB$ by $C$, the $il$-entry of $(AB)C$ is
\[ s_{i1} c_{1l} + s_{i2} c_{2l} + \cdots + s_{in} c_{nl} = \sum_{k=1}^{n} s_{ik} c_{kl} = \sum_{k=1}^{n} \sum_{j=1}^{m} (a_{ij} b_{jk}) c_{kl}. \]
On the other hand, multiplying $A$ by $T = BC$, the $il$-entry of $A(BC)$ is
\[ a_{i1} t_{1l} + a_{i2} t_{2l} + \cdots + a_{im} t_{ml} = \sum_{j=1}^{m} a_{ij} t_{jl} = \sum_{j=1}^{m} \sum_{k=1}^{n} a_{ij} (b_{jk} c_{kl}). \]
The above sums are equal, that is, corresponding elements in $(AB)C$ and $A(BC)$ are equal. Thus $(AB)C = A(BC)$.

Exercise 11.3.1. Prove that $A(B + C) = AB + AC$.

11.4 Transpose of a Matrix

Definition 11.4.1. The transpose of a matrix $A$, written $A^t$, is the matrix obtained by writing the columns of $A$, in order, as rows.

Example 11.4.1.
\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^t = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \quad \text{and} \quad [1, -3, -5]^t = \begin{bmatrix} 1 \\ -3 \\ -5 \end{bmatrix}. \]
In other words, if $A = [a_{ij}]$ is an $m \times n$ matrix, then $A^t = [b_{ij}]$ is the $n \times m$ matrix where $b_{ij} = a_{ji}$. Observe that the transpose of a row vector is a column vector. Similarly, the transpose of a column vector is a row vector. The basic properties of the transpose operation are as follows.

Theorem 11.4.1. Let $A$ and $B$ be matrices and let $k$ be a scalar. Then, whenever the sum and product are defined, we have
(i) $(A + B)^t = A^t + B^t$.
(ii) $(A^t)^t = A$.
(iii) $(kA)^t = kA^t$.
(iv) $(AB)^t = B^t A^t$.

Proof. (iv) $(AB)^t = B^t A^t$. Let $A = [a_{ik}]$ and $B = [b_{kj}]$. Then the $ij$-entry of $AB$ is
\[ a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{im} b_{mj}. \]
This is the $ji$-entry (reverse order) of $(AB)^t$.
Now column $j$ of $B$ becomes row $j$ of $B^t$ and row $i$ of $A$ becomes column $i$ of $A^t$. Thus, the $ji$-entry of $B^t A^t$ is
\[ [b_{1j}, b_{2j}, \ldots, b_{mj}]\,[a_{i1}, a_{i2}, \ldots, a_{im}]^t = b_{1j} a_{i1} + b_{2j} a_{i2} + \cdots + b_{mj} a_{im}. \]
Thus, $(AB)^t = B^t A^t$, since the corresponding entries are equal.

11.5 Square Matrices

Definition 11.5.1. A square matrix is a matrix with the same number of rows as columns. An $n \times n$ square matrix is said to be of order $n$ and is sometimes called an $n$-square matrix.

Example 11.5.1. The following are square matrices of order 3.
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ -4 & -4 & -4 \\ 5 & 6 & 7 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 2 & -5 & 1 \\ 0 & 3 & -2 \\ 1 & 2 & -4 \end{bmatrix}. \]

11.6 Diagonal and Trace

Definition 11.6.1. Let $A = [a_{ij}]$ be an $n$-square matrix. The diagonal or main diagonal of $A$ consists of the elements with the same subscripts, that is
\[ a_{11}, a_{22}, \ldots, a_{nn}. \]

Definition 11.6.2. The trace of $A$, written $\mathrm{tr}(A)$, is the sum of the diagonal elements, namely
\[ \mathrm{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}. \]

11.6.1 Identity Matrix

The $n$-square identity or unit matrix, denoted by $I$, is the $n$-square matrix with 1's on the diagonal and 0's elsewhere. For any $n$-square matrix $A$,
\[ AI = IA = A. \]

Example 11.6.1. The following are identity matrices of order 3 and 4.
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]

11.7 Powers of Matrices

Let $A$ be an $n$-square matrix over a field $K$. Powers of $A$ are defined as follows:
\[ A^2 = AA, \quad A^3 = A^2 A, \quad \ldots, \quad A^{n+1} = A^n A \quad \text{and} \quad A^0 = I. \]

Example 11.7.1. Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}$. Then
\[ A^2 = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} \]
and
\[ A^3 = A^2 A = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} -11 & 38 \\ 57 & -106 \end{bmatrix}. \]

11.8 Special Types of Square Matrices

11.8.1 Diagonal Matrix

A square matrix $D = [d_{ij}]$ is diagonal if its non-diagonal entries are all zero.

Example 11.8.1.
\[ A = \begin{bmatrix} 3 & 0 & 0 \\ 0 & -7 & 0 \\ 0 & 0 & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 4 & 0 \\ 0 & -5 \end{bmatrix}. \]

11.8.2 Triangular Matrices

A square matrix $A = [a_{ij}]$ is upper triangular if all entries below the main diagonal are equal to zero.

Example 11.8.2.
\[ A = \begin{bmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ 0 & b_{22} & b_{23} \\ 0 & 0 & b_{33} \end{bmatrix}. \]
A lower triangular matrix is a square matrix whose entries above the main diagonal are all zero.

Suppose $A$ is a square matrix with real entries.

11.8.3 Symmetric Matrices

Definition 11.8.1. A matrix $A$ is symmetric if $A^t = A$.

Example 11.8.3. Let $A = \begin{bmatrix} 2 & -3 & 5 \\ -3 & 6 & 7 \\ 5 & 7 & -8 \end{bmatrix}$, then $A^t = \begin{bmatrix} 2 & -3 & 5 \\ -3 & 6 & 7 \\ 5 & 7 & -8 \end{bmatrix}$. Hence $A^t = A$, thus $A$ is symmetric.

11.8.4 Skew-Symmetric Matrices

Definition 11.8.2. A matrix $A$ is skew-symmetric if $A^t = -A$. The diagonal elements of such a matrix must be zero.

Example 11.8.4.
\[ B = \begin{bmatrix} 0 & 3 & -4 \\ -3 & 0 & 5 \\ 4 & -5 & 0 \end{bmatrix}. \]

11.8.5 Orthogonal Matrices

Definition 11.8.3. A real matrix $A$ is orthogonal if $A^t = A^{-1}$, that is $AA^t = A^t A = I$. Such an $A$ must necessarily be square and invertible.

11.9 Complex Matrices

Let $A$ be a complex matrix. The conjugate of $A$, written $\bar{A}$, is the matrix obtained from $A$ by taking the conjugate of each entry of $A$. The notation $A^*$ is used for the conjugate transpose of $A$, that is
\[ A^* = (\bar{A})^t = \overline{(A^t)}. \]
If $A$ is real then $A^* = A^t$.

Example 11.9.1. Let $A = \begin{bmatrix} 2 + 8i & 5 - 3i & 4 - 7i \\ 6i & 1 - 4i & 3 + 2i \end{bmatrix}$, then $A^* = \begin{bmatrix} 2 - 8i & -6i \\ 5 + 3i & 1 + 4i \\ 4 + 7i & 3 - 2i \end{bmatrix}$.

11.9.1 Hermitian Matrices

Definition 11.9.1. A complex matrix $A$ is said to be Hermitian if $A^* = A$.

Skew-Hermitian Matrices

Definition 11.9.2. A complex matrix $A$ is said to be skew-Hermitian if $A^* = -A$.

11.9.2 Unitary Matrices

Definition 11.9.3. A complex matrix $A$ is unitary if
\[ A^* A = A A^* = I, \quad \text{i.e.,} \quad A^* = A^{-1}. \]

11.10 Inversion of Matrices

Here we are dealing with square matrices.

Proposition 11.10.1. For every $n \times n$ matrix $A$, $AI = IA = A$.

This raises the following question: given an $n \times n$ matrix $A$, is it possible to find another $n \times n$ matrix $B$ such that $AB = BA = I$?

Definition 11.10.1. An $n \times n$ matrix $A$ is said to be invertible if there exists an $n \times n$ matrix $B$ such that $AB = BA = I$. In this case, we say that $B$ is the inverse of $A$ and write $B = A^{-1}$.

Proposition 11.10.2. Suppose that $A$ is an invertible $n \times n$ matrix. Then its inverse $A^{-1}$ is unique.

Proof.
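Suppose that $B$ satisfies the requirements for being the inverse of $A$. Then $AB = I = BA$. It follows that
\[ A^{-1} = A^{-1} I = A^{-1}(AB) = (A^{-1}A)B = IB = B. \]
Hence the inverse $A^{-1}$ is unique.

Exercise 11.10.1. Suppose that $A$ and $B$ are invertible $n \times n$ matrices. Prove that $(AB)^{-1} = B^{-1}A^{-1}$.

Exercise 11.10.2. Suppose that $A$ is an invertible $n \times n$ matrix. Prove that $(A^{-1})^{-1} = A$.

11.11 Determinants

Each $n$-square matrix $A = [a_{ij}]$ is assigned a special scalar called the determinant of $A$, denoted by $\det A$ or $|A|$ or
\[ \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \cdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}. \]

11.11.1 Determinants of Order 1 and 2

Determinants of order 1 and 2 are defined as
\[ |a_{11}| = a_{11} \quad \text{and} \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}. \]

Example 11.11.1. (a) $\det(27) = 27$ and (b) $\begin{vmatrix} 5 & 3 \\ 4 & 6 \end{vmatrix} = 5(6) - 3(4) = 30 - 12 = 18$.

11.12 Determinants of Order 3

Consider an arbitrary $3 \times 3$ matrix $A = [a_{ij}]$. The determinant of $A$ is defined as follows:
\[ \det A = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}. \]

A procedure for evaluating determinants of order 3 is called Sarrus' Rule. Write the first two columns again to the right of the determinant:
\[ \begin{array}{ccccc} a_{11} & a_{12} & a_{13} & a_{11} & a_{12} \\ a_{21} & a_{22} & a_{23} & a_{21} & a_{22} \\ a_{31} & a_{32} & a_{33} & a_{31} & a_{32} \end{array} \]
The products along the three diagonals running from the top row down to the right carry a plus sign, and the products along the three diagonals running from the bottom row up to the right carry a minus sign.

Example 11.12.1. Let $A = \begin{pmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 3 & 2 & 1 \\ -4 & 5 & -1 \\ 2 & -3 & 4 \end{pmatrix}$. Find $\det A$ and $\det B$.

\[ \det A = \begin{vmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{vmatrix} \]
\[ = 2(5)(4) + 1(-2)(1) + 1(0)(-3) - (1)(0)(4) - (2)(-2)(-3) - (1)(5)(1) \]
\[ = 40 - 2 + 0 - 0 - 12 - 5 = 21. \]

\[ \det B = \begin{vmatrix} 3 & 2 & 1 \\ -4 & 5 & -1 \\ 2 & -3 & 4 \end{vmatrix} \]
\[ = 3(5)(4) + 2(-1)(2) + 1(-4)(-3) - (2)(-4)(4) - (3)(-1)(-3) - (1)(5)(2) \]
\[ = 60 - 4 + 12 + 32 - 9 - 10 = 81. \]

Sarrus' rule applies to the evaluation of determinants of $3 \times 3$ matrices only.

11.13 Evaluation of Determinants of Any Order

11.13.1 Minors and Co-factors

Definition 11.13.1.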
If $A = [a_{ij}]$ is an $n \times n$ matrix, then the minor of the element $a_{ij}$ is denoted by $M_{ij}$ and is defined as the determinant of the $(n-1) \times (n-1)$ sub-matrix which is obtained by deleting all the entries in the $i$th row and the $j$th column.

Example 11.13.1. For the matrix
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \]
the minor of $a_{11}$ is $M_{11} = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$, the minor of $a_{12}$ is $M_{12} = \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}$ and the minor of $a_{13}$ is $M_{13} = \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$.

Definition 11.13.2. The co-factor of an element $a_{ij}$, denoted by $A_{ij}$, is defined as the product of $(-1)^{i+j}$ and the minor of $a_{ij}$, that is
\[ A_{ij} = (-1)^{i+j} M_{ij}. \]
The co-factor of an element is merely the signed minor of the element. We emphasize that, with the definitions above, both $M_{ij}$ and $A_{ij}$ are scalars: $M_{ij}$ is the determinant of a sub-matrix, not the sub-matrix itself.

Example 11.13.2. If $A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$, then the co-factor of $a_{11}$ is
\[ A_{11} = (-1)^{1+1} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} = + \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}, \]
and the co-factor of $a_{12}$ is
\[ A_{12} = (-1)^{1+2} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} = - \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}. \]

11.14 Laplace Expansion of the Determinant

To compute the determinant of an $n \times n$ matrix we make use of co-factors and minors to reduce the problem to determinants of lower order, which we already know how to calculate. The determinant of a square matrix $A = [a_{ij}]$ is equal to the sum of the products obtained by multiplying the elements of any row (column) by their respective co-factors:
\[ |A| = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} = \sum_{j=1}^{n} a_{ij}A_{ij}. \]
This expansion can be carried out along any row (or column) of the matrix in question and the value of the determinant is the same.

Example 11.14.1. Given that $A = \begin{pmatrix} 3 & -1 & 5 \\ 0 & 4 & -3 \\ 2 & 1 & 2 \end{pmatrix}$. Find $|A|$.

Solution: Expanding along the first row gives
\[ \det A = 3(-1)^{1+1} \begin{vmatrix} 4 & -3 \\ 1 & 2 \end{vmatrix} + (-1)(-1)^{1+2} \begin{vmatrix} 0 & -3 \\ 2 & 2 \end{vmatrix} + 5(-1)^{1+3} \begin{vmatrix} 0 & 4 \\ 2 & 1 \end{vmatrix} \]
\[ = 3 \begin{vmatrix} 4 & -3 \\ 1 & 2 \end{vmatrix} + \begin{vmatrix} 0 & -3 \\ 2 & 2 \end{vmatrix} + 5 \begin{vmatrix} 0 & 4 \\ 2 & 1 \end{vmatrix} \]
\[ = 3(8 + 3) + (0 + 6) + 5(-8) = 33 + 6 - 40 = -1. \]
Expanding along the second row gives
\[ \det A = 0(-1)^{2+1} \begin{vmatrix} -1 & 5 \\ 1 & 2 \end{vmatrix} + 4(-1)^{2+2} \begin{vmatrix} 3 & 5 \\ 2 & 2 \end{vmatrix} + (-3)(-1)^{2+3} \begin{vmatrix} 3 & -1 \\ 2 & 1 \end{vmatrix} = 0 - 16 + 15 = -1. \]

Note that expanding along a row or column that contains zeros significantly reduces the number of cumbersome calculations that need to be done. It is sensible to evaluate the determinant by co-factor expansion along a row or column with the greatest number of zeros.

Example 11.14.2. Given that $A = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 3 & 5 & 0 & -1 \\ 0 & 3 & -2 & 5 \\ 1 & 0 & 0 & 2 \end{pmatrix}$. Find $\det A$.

Solution: Expanding along the first row, and then expanding the resulting determinant of order 3 along its third row, gives
\[ \det A = (1)(-1)^{1+4} \begin{vmatrix} 3 & 5 & 0 \\ 0 & 3 & -2 \\ 1 & 0 & 0 \end{vmatrix} = -(1)(-1)^{3+1} \begin{vmatrix} 5 & 0 \\ 3 & -2 \end{vmatrix} = -(-10 - 0) = 10. \]

The determinant of the identity matrix is 1. The determinant of a diagonal matrix $D$ of order $n \times n$ is given by the product of the elements on its main diagonal. The determinant of a triangular matrix of order $n \times n$ is given by the product of the elements on its main diagonal.

11.15 Properties

1. For any $n \times n$ matrices $A$ and $B$, $|AB| = |A||B|$.
2. In general, for an $n \times n$ matrix $A$, $\det A = \det A^t$.
3. If $A$ and $B$ are $n \times n$ matrices, then $|AB| = |BA|$.
4. In general, if two rows (columns) of an $n \times n$ matrix $A$ are interchanged to give a matrix $B$, then $\det B = -\det A$.
5. If the elements of any one row (column) of an $n \times n$ matrix $A$ are multiplied by the same scalar $k$, then the value of the determinant of the new matrix is $k$ times the determinant of $A$.
6. If the elements of any row (column) of $A$ are all zeros, then the determinant of $A$ is zero.
7. If an $n \times n$ matrix $A$ is multiplied by a scalar $k$, then the determinant of $kA$ is $k^n \det A$, that is $\det(kA) = k^n \det A$.
8. If $A$ is an $n \times n$ matrix with any two of its rows (columns) equal, then the determinant of $A$ is zero.
9. If $A$ is an $n \times n$ matrix in which one row (column) is proportional to another, then the determinant of the matrix is zero.

11.16 Adjoint

Definition 11.16.1. Let $A = [a_{ij}]$ be an $n \times n$ matrix and let $A_{ij}$ denote the co-factor of $a_{ij}$. The adjoint of $A$, denoted by $\operatorname{adj} A$, is the transpose of the matrix of co-factors of $A$, that is $\operatorname{adj} A = [A_{ij}]^t$.

Example 11.16.1. Let $A = \begin{pmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{pmatrix}$. The co-factors of the nine elements of $A$ are as follows:
\[ A_{11} = + \begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} = -18, \quad A_{12} = - \begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} = 2, \quad A_{13} = + \begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 4, \]
\[ A_{21} = - \begin{vmatrix} 3 & -4 \\ -1 & 5 \end{vmatrix} = -11, \quad A_{22} = + \begin{vmatrix} 2 & -4 \\ 1 & 5 \end{vmatrix} = 14, \quad A_{23} = - \begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = 5, \]
\[ A_{31} = + \begin{vmatrix} 3 & -4 \\ -4 & 2 \end{vmatrix} = -10, \quad A_{32} = - \begin{vmatrix} 2 & -4 \\ 0 & 2 \end{vmatrix} = -4, \quad A_{33} = + \begin{vmatrix} 2 & 3 \\ 0 & -4 \end{vmatrix} = -8. \]
\[ [A_{ij}] = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} = \begin{pmatrix} -18 & 2 & 4 \\ -11 & 14 & 5 \\ -10 & -4 & -8 \end{pmatrix}. \]
The transpose of the above matrix of co-factors yields the adjoint of $A$, that is
\[ \operatorname{adj} A = \begin{pmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{pmatrix}. \]

Theorem 11.16.1. Let $A$ be any square matrix. Then
\[ A(\operatorname{adj} A) = (\operatorname{adj} A)A = |A| I, \]
where $I$ is the identity matrix. Thus, if $|A| \neq 0$,
\[ A^{-1} = \frac{1}{|A|} \operatorname{adj} A. \]

Example 11.16.2. Let $A$ be the matrix above. We have
\[ \det A = -40 + 6 + 0 - 16 + 4 + 0 = -46. \]
Since $\det A \neq 0$, $A$ does have an inverse, and by Theorem 11.16.1
\[ A^{-1} = \frac{1}{|A|} \operatorname{adj} A = -\frac{1}{46} \begin{pmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{pmatrix}. \]

11.17 Properties of Inverses

1. If an $n \times n$ matrix $A$ is invertible, then $\det A \neq 0$.

Definition 11.17.1. A matrix which has an inverse is said to be invertible. A matrix whose determinant is non-zero is said to be non-singular, and a matrix whose determinant is equal to zero is called a singular matrix.

2. If an $n \times n$ matrix $A$ is invertible, then $(A^{-1})^{-1} = A$.
3. If an $n \times n$ matrix $A$ is invertible, then $A^t$ is also invertible, and $(A^t)^{-1} = (A^{-1})^t$.
4. If $A$ is an $n \times n$ invertible matrix, then $\det A^{-1} = \dfrac{1}{\det A}$.

Chapter 12 Application of Matrices

Gravitation cannot be held responsible for people falling in love. How on earth can you explain in terms of chemistry and physics so important a biological phenomenon as first love? Put your hand on a stove for a minute and it seems like an hour. Sit with that special girl for an hour and it seems like a minute. That's relativity.
Albert Einstein

12.1 Elementary Row Operations

Let $r_i$ denote row $i$ of matrix $A$. There are 3 elementary row operations, namely

1. $r_i \leftrightarrow r_j$, meaning interchange row $i$ with row $j$.
2. $r_i \to k r_i$, $k \neq 0$, meaning multiply $r_i$ by a scalar $k$.
3. $r_i \to k r_i + r_j$, meaning multiply row $i$ by $k$ and add row $j$.
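The three elementary row operations can be sketched in Python as follows. This is only an illustration, assuming a matrix is stored as a list of rows and rows are numbered from 1 as in the notes; the function names `swap_rows`, `scale_row` and `scale_and_add` are hypothetical, and the requirement $k \neq 0$ in the third function is an added assumption that keeps the operation reversible.

```python
def swap_rows(matrix, i, j):
    """r_i <-> r_j: return a copy of matrix with rows i and j interchanged."""
    result = [row[:] for row in matrix]
    result[i - 1], result[j - 1] = result[j - 1], result[i - 1]
    return result


def scale_row(matrix, i, k):
    """r_i -> k r_i with k != 0: return a copy with row i multiplied by k."""
    if k == 0:
        raise ValueError("k must be non-zero")
    result = [row[:] for row in matrix]
    result[i - 1] = [k * entry for entry in result[i - 1]]
    return result


def scale_and_add(matrix, i, k, j):
    """r_i -> k r_i + r_j: multiply row i by k and add row j to it."""
    if k == 0:
        # Assumption: k = 0 would simply overwrite row i with row j.
        raise ValueError("k must be non-zero")
    result = [row[:] for row in matrix]
    result[i - 1] = [k * a + b for a, b in zip(matrix[i - 1], matrix[j - 1])]
    return result


# Illustrative 3 x 3 matrix.
A = [[1, 0, 2],
     [4, 1, 3],
     [3, 2, 6]]

print(swap_rows(A, 1, 2))         # [[4, 1, 3], [1, 0, 2], [3, 2, 6]]
print(scale_row(A, 2, 2))         # [[1, 0, 2], [8, 2, 6], [3, 2, 6]]
print(scale_and_add(A, 1, 2, 3))  # [[5, 2, 10], [4, 1, 3], [3, 2, 6]]
```

Each function returns a new matrix and leaves its argument unchanged, so the three operations can be tried independently on the same starting matrix.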
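Example 12.1.1. Consider the matrix $A = \begin{pmatrix} 1 & 0 & 2 \\ 4 & 1 & 3 \\ 3 & 2 & 6 \end{pmatrix}$. Then

$r_1 \leftrightarrow r_2$ gives $\begin{pmatrix} 4 & 1 & 3 \\ 1 & 0 & 2 \\ 3 & 2 & 6 \end{pmatrix}$,

$r_2 \to 2r_2$ gives $\begin{pmatrix} 1 & 0 & 2 \\ 8 & 2 & 6 \\ 3 & 2 & 6 \end{pmatrix}$,

$r_1 \to 2r_1 + r_3$ gives $\begin{pmatrix} 5 & 2 & 10 \\ 4 & 1 & 3 \\ 3 & 2 & 6 \end{pmatrix}$.

12.2 Inverses Using Row Operations

We can use row operations to find the inverse of $A$ by writing the matrix $(A \mid I_n)$ and then using row operations to get $(I_n \mid A^{-1})$.

Example 12.2.1. Consider $A = \begin{pmatrix} 2 & 3 \\ 2 & 2 \end{pmatrix}$.

Solution: We write
\[ \left(\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 2 & 2 & 0 & 1 \end{array}\right). \]
Then performing row operations we have

$r_2 \to r_2 - r_1$:
\[ \left(\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 0 & -1 & -1 & 1 \end{array}\right) \]
$r_1 \to r_1 + 3r_2$:
\[ \left(\begin{array}{cc|cc} 2 & 0 & -2 & 3 \\ 0 & -1 & -1 & 1 \end{array}\right) \]
$r_1 \to \frac{1}{2} r_1$:
\[ \left(\begin{array}{cc|cc} 1 & 0 & -1 & \frac{3}{2} \\ 0 & -1 & -1 & 1 \end{array}\right) \]
$r_2 \to -r_2$:
\[ \left(\begin{array}{cc|cc} 1 & 0 & -1 & \frac{3}{2} \\ 0 & 1 & 1 & -1 \end{array}\right). \]
Therefore $A^{-1} = \begin{pmatrix} -1 & \frac{3}{2} \\ 1 & -1 \end{pmatrix}$. Checking can be done by verifying that $A^{-1}A = I$:
\[ \begin{pmatrix} -1 & \frac{3}{2} \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 2 & 3 \\ 2 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I. \]

12.3 Linear Equations

An equation of the kind
\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b \]
is called a linear equation in the $n$ variables $x_1, x_2, \ldots, x_n$, where $a_1, a_2, \ldots, a_n$ and $b$ are real constants. A solution of a linear equation $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$ is a sequence of $n$ numbers $s_1, s_2, \ldots, s_n$ such that the equation is satisfied when we substitute $x_1 = s_1, x_2 = s_2, \ldots, x_n = s_n$. The set of all solutions of the equation is called its solution set.

A finite set of linear equations in the variables $x_1, x_2, \ldots, x_n$ is called a system of linear equations or a linear system. A sequence of numbers $s_1, s_2, \ldots, s_n$ is called a solution of the system if $x_1 = s_1, x_2 = s_2, \ldots, x_n = s_n$ is a solution of every equation in the system.

Not all systems of linear equations have solutions. A system of equations that has no solutions is said to be inconsistent. If there is at least one solution, the system is called consistent. Every system of linear equations has either no solutions, exactly one solution or infinitely many solutions.

An arbitrary system of $m$ linear equations in $n$ unknowns will be written as
\[ \begin{array}{ccc} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n & = & b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n & = & b_2 \\ \vdots & & \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n & = & b_m \end{array} \]
where $x_1, x_2, \ldots, x_n$ are the unknowns. We can write the coefficients and constants as a rectangular array of numbers,
\[ \left(\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & \cdots & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{array}\right). \]
This is called the augmented matrix for the system.

Example 12.3.1. The augmented matrix for the system of equations
\[ \begin{array}{rcl} x_1 + x_2 + 2x_3 & = & 9 \\ 2x_1 + 4x_2 - 3x_3 & = & 1 \\ 3x_1 + 6x_2 - 5x_3 & = & 0 \end{array} \]
is
\[ \left(\begin{array}{ccc|c} 1 & 1 & 2 & 9 \\ 2 & 4 & -3 & 1 \\ 3 & 6 & -5 & 0 \end{array}\right). \]

Example 12.3.2. Find the solution set of
\[ \begin{array}{rcl} 2x_1 - x_2 + x_3 & = & 4 \\ -3x_1 + 2x_2 - 4x_3 & = & 1 \\ x_1 - 5x_3 & = & 0. \end{array} \]

Solution: The augmented matrix for the linear system is
\[ \left(\begin{array}{ccc|c} 2 & -1 & 1 & 4 \\ -3 & 2 & -4 & 1 \\ 1 & 0 & -5 & 0 \end{array}\right). \]
Doing row operations, we have

$r_3 \leftrightarrow r_1$:
\[ \left(\begin{array}{ccc|c} 1 & 0 & -5 & 0 \\ -3 & 2 & -4 & 1 \\ 2 & -1 & 1 & 4 \end{array}\right) \]
$r_2 \to r_2 + 3r_1$, $r_3 \to r_3 - 2r_1$:
\[ \left(\begin{array}{ccc|c} 1 & 0 & -5 & 0 \\ 0 & 2 & -19 & 1 \\ 0 & -1 & 11 & 4 \end{array}\right) \]
$r_2 \leftrightarrow r_3$:
\[ \left(\begin{array}{ccc|c} 1 & 0 & -5 & 0 \\ 0 & -1 & 11 & 4 \\ 0 & 2 & -19 & 1 \end{array}\right) \]
$r_2 \to (-1)r_2$:
\[ \left(\begin{array}{ccc|c} 1 & 0 & -5 & 0 \\ 0 & 1 & -11 & -4 \\ 0 & 2 & -19 & 1 \end{array}\right) \]
$r_3 \to r_3 - 2r_2$:
\[ \left(\begin{array}{ccc|c} 1 & 0 & -5 & 0 \\ 0 & 1 & -11 & -4 \\ 0 & 0 & 3 & 9 \end{array}\right). \]
The corresponding system of linear equations derived from the augmented matrix is
\[ \begin{array}{rcl} x_1 - 5x_3 & = & 0 \\ x_2 - 11x_3 & = & -4 \\ 3x_3 & = & 9. \end{array} \]
Now using the method of back substitution, we find the values of the unknowns as follows:
\[ x_3 = 3, \quad x_2 = -4 + 33 = 29, \quad x_1 = 0 + 15 = 15. \]
The solution set is $(x_1, x_2, x_3) = (15, 29, 3)$.

12.3.1 Applications of Linear Equations

Linear equations arise in many applications, for example quadratic interpolation, temperature distribution, the global positioning system (GPS), etc.

12.4 Row Echelon Form

To be in reduced row-echelon form, a matrix must have the following properties:

1. If a row does not consist entirely of zeros, then the first non-zero number in the row is a 1 (leading 1).
2. If there are any rows that consist entirely of zeros, then they are grouped together at the bottom of the matrix.
3. In any two successive rows that do not consist entirely of zeros, the leading 1 in the lower row occurs further to the right than the leading 1 in the higher row.
4. Each column that contains a leading 1 has zeros elsewhere.

A matrix having properties 1, 2 and 3 (but not necessarily 4) is said to be in row-echelon form.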
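The four properties above can be checked mechanically. The following Python sketch is one way to do so, assuming a matrix is stored as a list of rows; the function names `leading_index`, `is_row_echelon` and `is_reduced_row_echelon` and the sample matrices are illustrative and not part of the notes.

```python
def leading_index(row):
    """Return the column index of the first non-zero entry, or None for a zero row."""
    for j, value in enumerate(row):
        if value != 0:
            return j
    return None


def is_row_echelon(matrix):
    """Check properties 1, 2 and 3."""
    previous_lead = -1
    seen_zero_row = False
    for row in matrix:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:          # property 2: zero rows must be at the bottom
            return False
        if row[lead] != 1:         # property 1: first non-zero entry is a 1
            return False
        if lead <= previous_lead:  # property 3: leading 1s move to the right
            return False
        previous_lead = lead
    return True


def is_reduced_row_echelon(matrix):
    """Check properties 1 to 4."""
    if not is_row_echelon(matrix):
        return False
    for i, row in enumerate(matrix):
        lead = leading_index(row)
        if lead is None:
            continue
        # Property 4: every other entry in the column of a leading 1 is zero.
        for k, other in enumerate(matrix):
            if k != i and other[lead] != 0:
                return False
    return True


# Illustrative matrices.
echelon_only = [[1, 5, 2, 9], [0, 1, 2, 4], [0, 0, 1, 1]]
reduced = [[1, 0, 0, -3], [0, 1, 0, 2], [0, 0, 1, 1]]

print(is_row_echelon(echelon_only), is_reduced_row_echelon(echelon_only))  # True False
print(is_row_echelon(reduced), is_reduced_row_echelon(reduced))            # True True
```

The first sample matrix satisfies properties 1 to 3 but has non-zero entries above its leading 1s, so it fails property 4.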
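A matrix in reduced row-echelon form must have zeros above and below each leading 1.

Example 12.4.1. These are in row echelon form
\[ \begin{pmatrix} 1 & -3 & 1 & 0 & -8 \\ 0 & 1 & -9 & 6 & 0 \\ 0 & 0 & 0 & 1 & 7 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 1 & -3 & 4 \\ 0 & 0 & 1 & 9 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]

Example 12.4.2. These are in reduced row-echelon form 0 1 0 1 0 −4 0 0 1 2 0 , 0 0 0 , 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 . 0 0

The procedure for reducing a matrix to reduced row-echelon form is called Gauss-Jordan Elimination and the procedure which produces a row-echelon form is called Gauss Elimination. The Gauss Elimination method requires fewer elementary row operations than the Gauss-Jordan method.

Example 12.4.3. Solve the following system of linear equations
\[ \begin{array}{rcl} -3x_2 + 4x_3 & = & -2 \\ x_1 + 5x_2 + 2x_3 & = & 9 \\ x_1 + x_2 - 6x_3 & = & -7. \end{array} \]

Solution: The augmented matrix is
\[ \left(\begin{array}{ccc|c} 0 & -3 & 4 & -2 \\ 1 & 5 & 2 & 9 \\ 1 & 1 & -6 & -7 \end{array}\right). \]
Doing row operations yields

$r_1 \leftrightarrow r_2$, $r_2 \leftrightarrow r_3$, $r_2 \to r_2 - r_1$:
\[ \left(\begin{array}{ccc|c} 1 & 5 & 2 & 9 \\ 0 & -4 & -8 & -16 \\ 0 & -3 & 4 & -2 \end{array}\right) \]
$r_2 \to -\frac{1}{4} r_2$, $r_3 \to r_3 + 3r_2$:
\[ \left(\begin{array}{ccc|c} 1 & 5 & 2 & 9 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & 10 & 10 \end{array}\right) \]
$r_3 \to \frac{1}{10} r_3$:
\[ \left(\begin{array}{ccc|c} 1 & 5 & 2 & 9 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & 1 & 1 \end{array}\right), \]
which is now in echelon form and is equivalent to
\[ \begin{array}{rcl} x_1 + 5x_2 + 2x_3 & = & 9 \\ x_2 + 2x_3 & = & 4 \\ x_3 & = & 1. \end{array} \]
This means $x_3 = 1$, $x_2 = 4 - 2x_3 = 2$ and finally $x_1 = 9 - 5x_2 - 2x_3 = -3$. Hence $(x_1, x_2, x_3) = (-3, 2, 1)$.

Alternatively we could continue with the elementary row operations as follows:

$r_1 \to r_1 - 5r_2$, $r_2 \to r_2 - 2r_3$:
\[ \left(\begin{array}{ccc|c} 1 & 0 & -8 & -11 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{array}\right) \]
$r_1 \to r_1 + 8r_3$:
\[ \left(\begin{array}{ccc|c} 1 & 0 & 0 & -3 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{array}\right), \]
which is now in reduced row-echelon form and is equivalent to
\[ x_1 = -3, \quad x_2 = 2, \quad x_3 = 1. \]
Therefore $(x_1, x_2, x_3) = (-3, 2, 1)$.

To illustrate the types of solutions one can get when solving a system of linear equations, we shall look at several augmented matrices that have already been reduced to echelon form.

Case 1
\[ \left(\begin{array}{ccc|c} 1 & 2 & 0 & 3 \\ 0 & 1 & 1 & 4 \\ 0 & 0 & 1 & -1 \end{array}\right) \]
is equivalent to
\[ \begin{array}{rcl} x_1 + 2x_2 & = & 3 \\ x_2 + x_3 & = & 4 \\ x_3 & = & -1, \end{array} \]
which implies that $x_1 = -7$, $x_2 = 5$ and $x_3 = -1$.

Case 2
\[ \left(\begin{array}{ccc|c} 1 & 0 & -2 & 1 \\ 0 & 1 & -1 & 3 \\ 0 & 0 & 0 & 0 \end{array}\right) \]
which is equivalent to
\[ \begin{array}{rcl} x_1 - 2x_3 & = & 1 \\ x_2 - x_3 & = & 3 \end{array} \]
which implies that
\[ x_2 = x_3 + 3, \quad x_1 = 2x_3 + 1, \quad x_3 = x_3. \]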
In this case all the solutions are expressed in terms of $x_3$. Any arbitrary value can be assigned to $x_3$ and the resulting values of $x_1$, $x_2$ and $x_3$ will satisfy all the equations in the system. The solution set is therefore infinite and is written as follows:
\[ (x_1, x_2, x_3) = (2t + 1, t + 3, t). \]

Case 3
\[ \left(\begin{array}{ccc|c} 1 & -2 & 4 & 0 \\ 0 & 1 & 3 & -2 \\ 0 & 0 & 0 & -4 \end{array}\right). \]
The equation represented by the last row is
\[ 0x_1 + 0x_2 + 0x_3 = -4. \]
Clearly, we can never find suitable values for $x_1$, $x_2$ and $x_3$ which satisfy this equation. Therefore the solution does not exist.

The reduced row-echelon form of a matrix is unique, but a row-echelon form is not: by changing the sequence of elementary row operations it is possible to arrive at different row-echelon forms.

Example 12.4.4. Find the value of $\alpha$ for which the following system of equations is (a) consistent (b) inconsistent.
\[ \begin{array}{rcl} -3x_1 + x_2 & = & -2 \\ x_1 + 2x_2 & = & 3 \\ 2x_1 + 3x_2 & = & \alpha. \end{array} \]

Solution: The augmented matrix is
\[ \left(\begin{array}{cc|c} -3 & 1 & -2 \\ 1 & 2 & 3 \\ 2 & 3 & \alpha \end{array}\right). \]
Doing elementary row operations, we have

$r_1 \leftrightarrow r_2$:
\[ \left(\begin{array}{cc|c} 1 & 2 & 3 \\ -3 & 1 & -2 \\ 2 & 3 & \alpha \end{array}\right) \]
$r_2 \to r_2 + 3r_1$, $r_3 \to r_3 - 2r_1$:
\[ \left(\begin{array}{cc|c} 1 & 2 & 3 \\ 0 & 7 & 7 \\ 0 & -1 & \alpha - 6 \end{array}\right) \]
$r_2 \to \frac{1}{7} r_2$:
\[ \left(\begin{array}{cc|c} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 0 & -1 & \alpha - 6 \end{array}\right) \]
$r_3 \to r_3 + r_2$:
\[ \left(\begin{array}{cc|c} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & \alpha - 5 \end{array}\right), \]
which is now in echelon form. The last row is equivalent to $0x_1 + 0x_2 = \alpha - 5$.

(a) The system can be consistent only if $\alpha - 5 = 0$, that is, when $\alpha = 5$.
(b) The system is inconsistent if $\alpha - 5 \neq 0$, that is, when $\alpha \neq 5$.

12.5 Homogeneous System of Linear Equations

Definition 12.5.1. A homogeneous system of linear equations is a system in which all the constant terms are zero:
\[ \begin{array}{ccc} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n & = & 0 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n & = & 0 \\ \vdots & & \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n & = & 0. \end{array} \]
Any homogeneous system of equations will always have a solution no matter what the coefficient matrix is like, and so can never be inconsistent. The obvious solution is $x_1 = x_2 = \cdots = x_n = 0$. This solution is known as the trivial solution.

12.5.1 Solution of Homogeneous Systems

Example 12.5.1.
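Find the solution set of the following homogeneous system of linear equations
\[ \begin{array}{rcl} x_1 - 2x_2 + x_3 & = & 0 \\ 2x_1 + x_2 - 3x_3 & = & 0 \\ -3x_2 + x_3 & = & 0. \end{array} \]

Solution: The augmented matrix is
\[ \left(\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 2 & 1 & -3 & 0 \\ 0 & -3 & 1 & 0 \end{array}\right). \]
Doing the elementary row operations, we have

$r_2 \to r_2 - 2r_1$:
\[ \left(\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 0 & 5 & -5 & 0 \\ 0 & -3 & 1 & 0 \end{array}\right) \]
$r_2 \to \frac{1}{5} r_2$:
\[ \left(\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & -3 & 1 & 0 \end{array}\right) \]
$r_3 \to r_3 + 3r_2$, then $r_3 \to -\frac{1}{2} r_3$:
\[ \left(\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right), \]
which is equivalent to
\[ \begin{array}{rcl} x_1 - 2x_2 + x_3 & = & 0 \\ x_2 - x_3 & = & 0 \\ x_3 & = & 0. \end{array} \]
Therefore the system has only one solution, the trivial solution $(x_1, x_2, x_3) = (0, 0, 0)$.

In general, for a homogeneous system of $m$ equations in $n$ unknowns,
(i) if $m = n$ and the coefficient matrix is non-singular, the system has only the zero solution;
(ii) if $m < n$, the system has a non-zero solution.

Theorem 12.5.1. A homogeneous system of linear equations with more unknowns than equations has a non-zero solution.

12.6 Tutorial 11

1. Let A = 2 1 3 1 . Solve for B the equation AB = 1 0 0 1 1 1 0 2 1 .

2. Let $A$, $B$, $C$ be square matrices of order $n$. Prove that (i) $A + B = B + A$, (ii) $A + (B + C) = (A + B) + C$, (iii) $(AB)C = A(BC)$.

3. Find the determinant of C = 1 0 1 0 1 2 0 1 0 2 1 1 0 1 2 2 1 0 1 1

4. Given that $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & -1 & 3 \\ 2 & 0 & -4 \end{pmatrix}$, find $(AB)^t$ and $B^t A^t$ and verify that $(AB)^t = B^t A^t$.

5. Let $E(x) = \begin{pmatrix} x & 1 \\ -1 & 0 \end{pmatrix}$. Show that
(a) $E(x)E(0)E(y) = -E(x + y)$.
(b) the inverse of $E(x)$ is $E(0)E(-x)E(0)$.

6. Show that $\begin{vmatrix} 1 & a & a^2 \\ 1 & b & b^2 \\ 1 & c & c^2 \end{vmatrix} = (b - a)(c - a)(c - b)$.

7. Find the inverse of $A = \begin{pmatrix} 1 & 0 & 1 \\ 2 & 2 & 0 \\ 0 & 1 & 3 \end{pmatrix}$ and $C = \begin{pmatrix} 1 & 3 & 2 \\ 3 & 1 & 4 \\ 0 & 2 & 3 \end{pmatrix}$.

8. Solve for $x$:
\[ \begin{vmatrix} 2 - x & 4 & -2 \\ 4 & 2 - x & -2 \\ -2 & -2 & 4 - x \end{vmatrix} = 0. \]

9. Prove that if $A$ is an invertible matrix, then $A^t$ is also invertible and $(A^t)^{-1} = (A^{-1})^t$.

10. Let $A = \begin{pmatrix} 1 & 17 & 17 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$. Calculate $A^2$, $A^3$, $A^4$ and find an expression for $A^n$.

1. Solve the following system using the Gaussian elimination method
(a)
\[ \begin{array}{rcl} x + y + z & = & 6 \\ 2x - y + z & = & 3 \\ x + z & = & 4 \\ 2x + y + z & = & 7. \end{array} \]

2. Solve the following system using the Gauss-Jordan elimination method
(a)
\[ \begin{array}{rcl} x + y + z & = & 5 \\ 2x + 3y + 5z & = & 8 \\ 4x + 5z & = & 2. \end{array} \]

3. Using Gauss-Jordan elimination, find the inverse of $D = \begin{pmatrix} 3 & 1 & 1 \\ 1 & 5 & 1 \\ 1 & 1 & 3 \end{pmatrix}$.

4.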
Consider the system of equations
\[ \begin{array}{rcl} x + y + 2z & = & a \\ x + z & = & b \\ 2x + y + 3z & = & c. \end{array} \]
Show that in order for this system to be consistent, $a$, $b$ and $c$ must satisfy $c = a + b$.

5. In the following linear system, determine all the values of $a$ for which the resulting linear system has (a) no solution (b) a unique solution and (c) infinitely many solutions.
\[ \begin{array}{rcl} x + 3y - 2z & = & 1 \\ x + 7y - 4z & = & 7 \\ 2x + 4y + (a^2 - 28)z & = & a + 4. \end{array} \]

6. For what values of $c$ does the following system of linear equations have (i) no solution, (ii) a unique solution, (iii) infinitely many solutions?
\[ \begin{array}{rcl} x + 2y - 3z & = & 4 \\ 3x - y + 5z & = & 2 \\ 4x + y + (c^2 - 14)z & = & c + 2. \end{array} \]
Show that in the case of infinitely many solutions, the solution may be written as $\left(\frac{8}{7} - \alpha, \frac{10}{7} + 2\alpha, \alpha\right)$ for any real $\alpha$.

7. Find the rank of the matrix $A = \begin{pmatrix} 2 & 0 & 4 & 2 \\ 0 & 4 & 8 & 4 \\ 0 & 2 & 2 & 1 \end{pmatrix}$.