Further Engineering Mathematics (Algebra and Multivariable Calculus)

J. N. Ridley

MATH2011/2/4

© J. N. Ridley. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

First printed 1999. Revised 2016.

The cover picture represents a Fourier series approximation to the sawtooth function arctan(tan x) for −2π ≤ x ≤ 2π.

TABLE OF CONTENTS

Calculus Chapter 1 — Differential equations
  Linear equations and operators .... 1
  D-operator methods .... 5
  Complex exponentials .... 10
  Stability .... 11
  Answers .... 14

Calculus Chapter 2 — Vector functions of a scalar
  Vector differentiation .... 17
  Curvature .... 20
  Torsion .... 22
  Trajectories and orthogonal trajectories .... 24
  Answers .... 25

Calculus Chapter 3 — Scalar and vector fields
  Scalar fields and quadric surfaces .... 27
  Directional derivatives .... 29
  The operator del and vector fields .... 32
  Potential functions .... 36
  Stationary points and optimization .... 37
  Answers .... 41

Calculus Chapter 4 — Vector integration
  Path integrals in scalar fields .... 44
  Path integrals in vector fields .... 45
  Double and repeated integrals .... 48
  Change of variables in double integrals .... 51
  Green's theorem .... 54
  Answers .... 58

Algebra Chapter 1 — Complex numbers
  Real-imaginary form .... 59
  The complex plane .... 62
  Modulus-argument form .... 64
  Euler's formula .... 67
  Roots and polynomials .... 69
  Complex exponentials, logarithms, and powers .... 70
  Answers .... 72

Algebra Chapter 2 — Convergence of series
  Indeterminate forms .... 75
  Convergence of series I .... 76
  Convergence of series II .... 78
  Convergence of series III .... 82
  Answers .... 84

Algebra Chapter 3 — Linear algebra
  Linear spaces .... 86
  Bases and dimension .... 88
  Independence and rank .... 90
  Eigenvalues and eigenvectors .... 92
  Diagonalization .... 94
  The characteristic polynomial .... 96
  Answers .... 99

Algebra Chapter 4 — Orthonormality
  Dot products and orthonormal bases .... 101
  Unitary and hermitian matrices .... 103
  Applications .... 107
  Fourier series .... 109
  Answers .... 114
Calculus Chapter 1
Differential equations

Linear equations and operators

Differential equations were introduced in first year, together with techniques for solving first order equations of the types variables separable, homogeneous, linear, and exact. The differential equation y′ + p(t)y = q(t) is called linear (we shall use t rather than x because in practical examples it is usually time), and simultaneous algebraic equations, which can be written as a single matrix equation AX = B, are also called linear. What do these different situations have in common, to justify using the same word?

One clue can be found by considering the form of the solutions. As shown previously, the general solution of the linear differential equation above is

y = µ^{-1} ∫ µq dt + cµ^{-1},

where µ = e^{∫ p(t) dt} is the integrating factor and c is an arbitrary constant. The general solution can be written as a sum y = y1 + y0, where y1 = µ^{-1} ∫ µq dt and y0 = cµ^{-1}. Now y1 is a particular solution of the given equation, since it is obtained by putting c = 0. Thus we have y1′ + p(t)y1 = q(t) (see Question 1 below). However, what happens if we substitute y0 in the left hand side of the equation? Note that ln µ = ∫ p(t) dt, so µ′/µ = p(t), and

y0′ + p(t)y0 = −cµ^{-2}µ′ + cµ^{-1}p(t) = cµ^{-1}(p(t) − µ′/µ) = 0.

Thus y0 is a solution of the equation with zero on the right hand side, and it is a general solution of this equation because it involves the arbitrary constant c. This means that for any first order linear differential equation we can split the general solution into a sum

(Particular solution of given equation) + (General solution of equation with 0 on RHS).

(Sometimes the equation with 0 on the right hand side is called "homogeneous" because it is of degree one (linear) in y and its derivative. This must not be confused with homogeneous differential equations in which every term is of the same total degree in x and y.)
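This split can be spot-checked numerically for a concrete equation. The sketch below is illustrative only: it takes the equation of Question 2(a) below, y′ + 2ty = e^{-t²}, whose solution y1 = te^{-t²}, y0 = ce^{-t²} is given in the Answers, and approximates derivatives by central differences (the helper `deriv` is an invented name, not part of the notes).

```python
import math

# Concrete instance of y' + p(t)y = q(t): Question 2(a), y' + 2ty = exp(-t^2).
# From the Answers, y1 = t*exp(-t^2) and y0 = c*exp(-t^2) (c arbitrary).
p = lambda t: 2 * t
q = lambda t: math.exp(-t * t)
y1 = lambda t: t * math.exp(-t * t)
y0 = lambda t: 3.0 * math.exp(-t * t)   # c = 3.0 chosen arbitrarily

def deriv(f, t, h=1e-6):
    """Central-difference approximation to f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

for t in [-1.0, 0.3, 2.0]:
    # y1 satisfies the given equation: y1' + p*y1 = q ...
    assert abs(deriv(y1, t) + p(t) * y1(t) - q(t)) < 1e-6
    # ... while y0 satisfies the equation with zero on the right hand side.
    assert abs(deriv(y0, t) + p(t) * y0(t)) < 1e-6
```

Any other value of c passes the same check, which is exactly the point: the homogeneous part carries the arbitrary constant.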
Similarly, the solution of the matrix equation AX = B can be split into X = X1 + X0, where X1 is a particular solution of the given equation, and X0, which contains all arbitrary constants, is the general solution of the equation with 0 on the right hand side.

Now we have seen a similarity in the general form of the solutions, but what is the underlying reason for it? The essence lies in what is called a linear operator, or linear process or linear system. We have met the idea of a function as something, like a button on a calculator, that takes an input number and converts it into an output number:

x −→ [f] −→ f(x).

An operator is more general than a function, in that its input and output need not simply be numbers, but can be more general mathematical objects, e.g. vectors or even functions themselves. For example, the differential equation above is associated with an operator that takes an input function y(t) and converts it into the output function y′(t) + p(t)y(t):

y(t) −→ [ ] −→ y′(t) + p(t)y(t).

(There is no need at the moment to give the operator a name.) Such operators arise frequently in chemical processes, in vibrations, and in electrical analogues of these. Similarly, the matrix equation is associated with the operator that takes an input n × 1 vector X and multiplies it by the constant m × n matrix A to give the m × 1 output vector AX:

X −→ [ ] −→ AX.

With operators, as with functions, there is a unique and well defined output for every appropriate input. Solving an operator equation amounts to solving the inverse problem, associated with the inverse operator, which is obtained by reversing the arrows, or swopping input and output, thus:

q(t) −→ [ ]^{-1} −→ y = ?  or  B −→ [ ]^{-1} −→ X = ?.

For example, solving the differential equation y′ + p(t)y = q(t) is the same as asking, "Given that y′ + p(t)y = q(t), what is y?", or "If q(t) is put into the inverse operator, what function y will come out?".
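In code, an operator of this kind is naturally modelled as a function whose input and output are themselves functions. The sketch below is a hypothetical illustration (the helper names `differentiate` and `make_operator` are invented, and exact differentiation is replaced by a finite-difference approximation):

```python
import math

def differentiate(y, h=1e-6):
    """Numerical stand-in for exact differentiation (central difference)."""
    return lambda t: (y(t + h) - y(t - h)) / (2 * h)

def make_operator(p):
    """Build the operator y -> y' + p(t)y as a function acting on functions."""
    def operator(y):
        dy = differentiate(y)
        return lambda t: dy(t) + p(t) * y(t)
    return operator

P = make_operator(lambda t: 2 * t)   # the operator y -> y' + 2t y
out = P(math.sin)                    # feed in the input function sin t ...
# ... the output is again a function, namely t -> cos t + 2t sin t:
assert abs(out(1.0) - (math.cos(1.0) + 2 * 1.0 * math.sin(1.0))) < 1e-4
```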
Similarly, solving the algebraic equation AX = B is the same as asking, "Given that AX = B, what is X?", or "If B is put into the inverse operator, what vector X will come out?". However, with inverse operators, the output (or solution of the equation) is not usually unique, and sometimes does not even exist, as has been seen before for linear algebraic equations.

We now restrict ourselves to linear operators. An operator is said to be linear if the output is proportional to the input, and if two inputs are combined, then the resulting output is obtained by combining the separate outputs. To express this more mathematically, suppose we have an operator P that takes an input y to the output denoted Py:

y −→ [P] −→ Py.

Then the operator P is defined to be linear if
(1) P(cy) = c(Py) for any constant c,
(2) P(y1 + y2) = Py1 + Py2 for any two inputs y1 and y2.

The rules for differentiation and for matrix algebra show immediately that the two operators considered above are linear.

The null space of a linear operator P consists of all y0 such that Py0 = 0. Thus saying that y0 is in the null space of P means the same as saying that y0 is a solution of the homogeneous equation. If y0 is in the null space of P, then, roughly speaking, y0 = P^{-1} 0, so y0 is sometimes called a zero-input solution, since it can arise when zero is input to the inverse operator. Another name is to say that y0 is a natural response of the system, since it can occur even if there are no external influences.

Theorem (Superposition Principle). If x0 and y0 are in the null space of a linear operator P, then the sum x0 + y0 and any constant multiple cy0 are also in the null space of P.

Proof. Since x0 and y0 are in the null space of P, we have Px0 = 0 = Py0; then by linearity P(x0 + y0) = Px0 + Py0 = 0 + 0 = 0 and P(cy0) = c(Py0) = c0 = 0, so x0 + y0 and cy0 are also in the null space.

Theorem.
If P is a linear operator, then the general solution of the equation Py = b can be written in the form y = y1 + y0, where y1 is any particular solution of the given equation, and y0 is a general element of the null space.

Proof. There are two things to prove: firstly that anything of the form y1 + y0 is a solution, and secondly that any solution is of the form y1 + y0.

Firstly, suppose y = y1 + y0, where y1 is a particular solution of the given equation, and y0 is in the null space. This means that Py1 = b and Py0 = 0. Then

Py = P(y1 + y0) = Py1 + Py0 (by linearity) = b + 0 = b,

so y is a solution. Secondly, suppose we have any solution y, as well as our particular solution y1. This means that Py = b and also Py1 = b. Let y0 = y − y1; then

Py0 = P(y − y1) = Py − Py1 (by linearity) = b − b = 0,

so y0 is in the null space, and clearly y = y0 + y1.

For a linear differential equation Py = b, the particular solution y1 satisfying y1 = y1′ = y1″ = ··· = 0 at time t = 0 is called the zero-state solution with input b, since it arises when the external force b is applied to a system at rest. Any particular solution, whatever its initial state, is also called a forced response of the system. It follows from this theorem that the general solution is the sum of the zero-state solution and the general zero-input solution.

Tutorial questions — Linear equations and operators

1. Show that if y1 = µ^{-1} ∫ µq(t) dt, where µ = e^{∫ p(t) dt}, then y1 is a solution of the first order linear differential equation y′ + p(t)y = q(t). (Hint: write µy1 = ∫ µq(t) dt, then differentiate both sides, and use the fact that p(t) = µ′/µ.)

2. Solve the first order linear differential equations below, and show that the general solution is the sum of a particular solution of the given equation and a general solution of the homogeneous equation.
(a) y′ + 2ty = e^{-t²}  (b) dy/dt − y/(t² + 1) = 1/(t² + 1).

3.
Use Gaussian elimination (and complex algebra) to solve the matrix equations below, where Z is an unknown vector in C² or C³. Write the general solution in the form Z = Z1 + Z0, where Z1 is a particular solution of the given equation (with no arbitrary constants) and Z0 is a general solution of the homogeneous equation (including arbitrary constants). (Matrices are written row by row, with rows separated by semicolons.)

(a) [1, i; i, −1] Z = [1 + i; −1 + i]   (b) [1, i, 1+i; i, 0, 1] Z = [1; 2 − i].

4. The following list gives the outputs P(y) obtained when different operators P act on the general function y = y(t). Simplify P(y1 + y2) and Py1 + Py2, and see if they are equal. Similarly, compare P(cy) (where c is constant) and c(Py). Hence determine which operators are linear.
(a) P(y) = 2y  (b) P(y) = y²  (c) P(y) = yy′  (d) P(y) = ty′ + y.

5. The following list gives the outputs PX obtained when different operators P act on the general 2 × 1 column vector X. Compare P(X1 + X2) with PX1 + PX2 and P(cX) with c(PX). Hence determine which operators are linear.
(a) PX = 2X  (b) PX = |X|  (c) PX = X^T  (d) PX = X^T X.

6. (a) Use the rules for differentiation and integration to show that the differentiation operator (y −→ y′) and the definite integral operator (y −→ ∫ₐᵇ y dt) are linear operators.
(b) Use the rules of matrix algebra to show that multiplication by a fixed matrix A (i.e. X −→ AX) is a linear operator.

7. Find the null spaces of the linear operators in Question 4. (Hint: put the given output equal to 0, and solve for y.)

D-operator methods

In this section we shall use linearity to solve nth order linear differential equations with constant coefficients. The general form is

aₙ dⁿy/dtⁿ + aₙ₋₁ dⁿ⁻¹y/dtⁿ⁻¹ + ··· + a₂ d²y/dt² + a₁ dy/dt + a₀y = f(t),

where y = y(t) and the coefficients aₙ, ..., a₀ are real constants. Since the leading coefficient aₙ must be non-zero, we can take it as equal to 1, by dividing through by it if necessary.
We use D to denote the differentiation operator d/dt, so we can write the equation as

Dⁿy + aₙ₋₁Dⁿ⁻¹y + ··· + a₂D²y + a₁Dy + a₀y = f(t), or P(D)y = f(t),

where the operator P(D) = Dⁿ + aₙ₋₁Dⁿ⁻¹ + ··· + a₂D² + a₁D + a₀. P(D) is called a D-operator and has the form of a polynomial in D. It is easy to show that every such D-operator is linear, since differentiation and multiplication by constants are linear operators.

A special property of D-operators, which is true only because the coefficients are all constant, is that if we operate first by P(D) and then by Q(D), the result is the same as multiplying the two polynomials and then operating by their product P(D)Q(D). It is also the same as first operating by Q(D) and then by P(D). This can be written mathematically as

Q(D)(P(D)y) = (P(D)Q(D))y = P(D)(Q(D)y),

or diagrammatically, that the three processes below are identical:

y −→ [P(D)] −→ [Q(D)] −→ P(D)Q(D)y,
y −→ [Q(D)] −→ [P(D)] −→ P(D)Q(D)y,
y −→ [P(D)Q(D)] −→ P(D)Q(D)y.

This is analogous to the result that the mixed second partial derivatives of a function of two variables are equal, i.e. that if we operate on z first by ∂/∂y and then by ∂/∂x, the output is ∂²z/∂x∂y, which is the same as ∂²z/∂y∂x, the output the other way round.

The importance of this property is that a D-operator can be factorized, just like any polynomial, and the effect is the same as if we operate by the factors one at a time. By the Fundamental Theorem of Algebra we can assume that P(D) can be factorized into factors of degree one, though complex coefficients may be required. It follows that the solution of D-operator equations can be found by repeated first order methods.

Firstly, if P(D) itself is of degree one, say P(D) = D − α, then we have (D − α)y = f(t), which is a first order linear differential equation with integrating factor e^{-αt}. The solution is

y = e^{αt} ∫ e^{-αt} f(t) dt + Ae^{αt}.

It is easy to verify that e^{αt} ∫ e^{-αt} f(t) dt is a particular solution and that Ae^{αt} is in the null space of D − α, as the theory predicts.

Secondly, if P(D) is of degree two, say P(D) = (D − α)(D − β), then we substitute u = (D − β)y. This converts P(D)y = f(t) into (D − α)u = f(t), which can be solved for u. We then find y by solving (D − β)y = u, since u is now known.

Repeated first order methods are very tedious in practice for equations of order higher than two, so it is worthwhile finding a general solution technique. From the theory of linear operators we can break it into two parts:
• find a general function in the null space, and
• find a particular solution.
The general solution will then be the sum of these two. The particular solution y1 is usually called a particular integral, since it is the output of an inverse differentiation operator. The general function y0 in the null space is called the complementary function, since it completes the solution by being added to the particular integral. Thus y = y1 + y0, as before, or

General Solution = Particular Integral + Complementary Function
                 = Zero-state Solution + Zero-input Solution
                 = Forced Response + Natural Response.

Firstly, to find the complementary function, we have to solve the homogeneous equation P(D)y0 = 0. If P(D) is of degree one, say P(D) = D − α, we showed above that the solution of (D − α)y0 = 0 is y0 = Ae^{αt}, where A is an arbitrary constant. Next suppose P(D) is of degree two, say P(D) = (D − α)(D − β). By putting u0 = (D − β)y0 we obtain (D − α)u0 = 0, so u0 = Ce^{αt} say, where C is arbitrary. We must now solve the first order equation (D − β)y0 = Ce^{αt}, which gives

y0 = e^{βt} ∫ e^{-βt} Ce^{αt} dt + Be^{βt}.

The integral is equal to (C/(α − β)) e^{(α−β)t} if α ≠ β, and to Ct if α = β. Thus the solution of (D − α)(D − β)y0 = 0 is

y0 = Ae^{αt} + Be^{βt}  if α ≠ β,
y0 = (At + B)e^{αt}   if α = β,

where A and B are new arbitrary constants.
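Both cases can be spot-checked numerically. The sketch below is illustrative (derivatives are approximated by finite differences, and the roots α = −1, β = −3 and the constants are arbitrary choices): it verifies that Ae^{αt} + Be^{βt} is annihilated by (D − α)(D − β) = D² − (α + β)D + αβ, and that (At + B)e^{αt} is annihilated by (D − α)² = D² − 2αD + α².

```python
import math

def d1(f, t, h=1e-4):
    """First derivative by central differences."""
    return (f(t + h) - f(t - h)) / (2 * h)

def d2(f, t, h=1e-4):
    """Second derivative by central differences."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

# Distinct roots: (D - a)(D - b) = D^2 - (a+b)D + ab; try y0 = 2e^{at} + 5e^{bt}.
a, b = -1.0, -3.0
y = lambda t: 2 * math.exp(a * t) + 5 * math.exp(b * t)
# Repeated root: (D - a)^2 = D^2 - 2aD + a^2; try y0 = (4t + 1)e^{at}.
z = lambda t: (4 * t + 1) * math.exp(a * t)

for t in [0.0, 0.7, 1.5]:
    assert abs(d2(y, t) - (a + b) * d1(y, t) + a * b * y(t)) < 1e-4
    assert abs(d2(z, t) - 2 * a * d1(z, t) + a * a * z(t)) < 1e-4
```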
This result can easily be generalized for higher order operators, and the following can be proved by induction if necessary.

Theorem. If P(D) = (D − α)^k (D − β)^l ···, where α, β, ... are all distinct, then a general function y0 in the null space of P(D) has the form

y0 = (A1 + A2t + ··· + Aₖt^{k−1})e^{αt} + (B1 + B2t + ··· + Bₗt^{l−1})e^{βt} + ···,

where A1, ..., Aₖ, B1, ..., Bₗ, ... are arbitrary constants, equal in number to the degree of P(D).

Notice how each k-fold factor (D − α)^k of P(D) contributes the exponential e^{αt} multiplied by a polynomial with k arbitrary coefficients. The simplest case is when no factors are repeated, i.e. k = l = ··· = 1, and P(D) = (D − α)(D − β)···. The complementary function is then simply y0 = Ae^{αt} + Be^{βt} + ···, with one arbitrary constant for each exponential and no powers of t appearing.

Now that we have found the complementary function y0, we must find a particular integral y1, that is, a solution of P(D)y1 = f(t). Since D-operators behave like polynomials, it is convenient to write inverse D-operators like reciprocals, so we shall write

y1 = (1/P(D)) f(t),

though it should be remembered that this is not uniquely defined, because there are infinitely many solutions. In particular, (1/P(D)) (P(D)f(t)) is not unique, although

P(D) ((1/P(D)) f(t)) = f(t)

always. (This is similar to the fact that sin(arcsin x) = x for all x for which it is defined, but arcsin(sin x) ≠ x in general.)

In order to find particular integrals, we need to discuss the action of D-operators on products in which one factor is exponential. By the product rule for differentiation, we have

D(e^{βt}v1) = e^{βt}Dv1 + βe^{βt}v1 = e^{βt}(D + β)v1.

By induction on r it follows that D^r(e^{βt}v1) = e^{βt}(D + β)^r v1, and by adding constant multiples of such terms we get the Shift Rule:

P(D)(e^{βt}v1) = e^{βt}P(D + β)v1  for any polynomial operator P(D).
It says that e^{βt} can be brought to the left past any D-operator provided D is replaced by D + β. If we now put v1(t) = (1/P(D + β)) a(t), then P(D)(e^{βt}v1) = e^{βt}P(D + β)v1 = e^{βt}a(t), so

(1/P(D)) (e^{βt}a(t)) = e^{βt} (1/P(D + β)) a(t),

which shows that the Shift Rule is valid for inverse D-operators also.

We assume that f(t) = e^{βt}a(t), the product of an exponential and a polynomial, so all that is left is to determine (1/P(D + β)) a(t), an inverse D-operator acting on a polynomial. By taking out the lowest non-zero term in P(D + β), we can write

P(D + β) = q0 D^k (1 + {q1 D + ···}),

where q0 ≠ 0, but q1 and k may be zero. This gives

y1 = (1/P(D)) (e^{βt}a(t)) = e^{βt} (1/P(D + β)) a(t) = (e^{βt}/q0) D^{-k} (1 + {q1 D + ···})^{-1} a(t).

We now expand the reciprocal (1 + {q1 D + ···})^{-1} using the binomial series (1 + u)^{-1} = 1 − u + u² − ··· to as many terms as are required, and let the resulting polynomial D-operator act on a(t). (Since a(t) is a polynomial in t, operating by sufficiently high powers of D always gives zero, so an infinite series in D will never be required.) Finally, if k ≠ 0, we must operate by D^{-k}, i.e. integrate k times, and multiply by e^{βt}/q0.

The procedure for finding a particular integral y1 = (1/P(D)) (e^{βt}a(t)) can be summarized as follows. The simplest case is when a(t) is a constant, say a(t) = a, and P(β) ≠ 0. Then k = 0 and q0 = P(β) and only the constant term in the binomial expansion is required, so

(1/P(D)) (ae^{βt}) = ae^{βt}/P(β)  provided a is constant and P(β) ≠ 0.

If the polynomial a(t) is not constant, or if P(β) = 0, then we need to perform the following steps.
(1) If β ≠ 0, use the Shift Rule to write (1/P(D)) (e^{βt}a(t)) = e^{βt} (1/P(D + β)) a(t).
(2) Write 1/P(D + β) = (1/q0) D^{-k} (1 + {q1 D + ···})^{-1}.
(3) Expand (1 + {q1 D + ···})^{-1} to a polynomial of the same degree as a(t), and let it operate on a(t).
(4) If k ≠ 0, operate by D^{-k}, i.e. integrate k times.
(5) Divide by q0 and multiply by e^{βt}.
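Steps (2) and (3) can be mechanized for the case β = 0, k = 0 (so P(D) itself has non-zero constant term q0). The sketch below is a hypothetical helper, not part of the notes: it stores polynomials and operators as coefficient lists, expands 1/P(D) as the terminating binomial series, and reproduces the particular integral y1 = 2t − 3 of Question 12(c), (D² + 3D + 2)y = 4t.

```python
def d(poly):
    """Differentiate a polynomial stored as coefficients [c0, c1, c2, ...]."""
    return [k * poly[k] for k in range(1, len(poly))] or [0.0]

def apply_op(P, poly):
    """Apply the operator P[0] + P[1]*D + P[2]*D^2 + ... to a polynomial."""
    out, cur = [0.0] * len(poly), poly
    for coeff in P:
        for k, c in enumerate(cur):
            out[k] += coeff * c
        cur = d(cur)
    return out

def particular_integral(P, a):
    """Particular integral of P(D)y = a(t) via the binomial series for 1/P(D),
    assuming q0 = P[0] != 0 and k = 0 (steps (2)-(3) of the procedure).
    The series terminates because D eventually annihilates any polynomial."""
    q0 = P[0]
    N = [0.0] + list(P[1:])          # non-constant part of P(D), i.e. q0*{q1 D + ...}
    term = [c / q0 for c in a]       # leading term a(t)/q0
    total = term[:]
    for _ in range(len(a)):          # at most deg(a) further terms survive
        term = [-c / q0 for c in apply_op(N, term)]
        total = [u + v for u, v in zip(total, term)]
    return total

# Question 12(c): (D^2 + 3D + 2)y = 4t. Expect the particular integral 2t - 3.
y1 = particular_integral([2.0, 3.0, 1.0], [0.0, 4.0])
assert y1 == [-3.0, 2.0]                             # coefficients of 2t - 3
assert apply_op([2.0, 3.0, 1.0], y1) == [0.0, 4.0]   # check: P(D)y1 = 4t
```

The final assertion is exactly the identity P(D)((1/P(D))f) = f noted above.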
Note that P(D) should be left in the simplest form, which is often not the factorized form. Furthermore, steps (3) and (4) can often be simplified by algebraic manipulation of the operators.

Tutorial questions — D-operator methods

8. Find (D² + 3D + 2)(cy) and (D² + 3D + 2)(y1 + y2), where c is constant. Hence show that D² + 3D + 2 is a linear operator. *Show that a general D-operator Dⁿ + aₙ₋₁Dⁿ⁻¹ + ··· + a₀ is linear.

9. Verify that (d/dt + 2)((d/dt + 1)y) and (d/dt + 1)((d/dt + 2)y) are both equal to (D² + 3D + 2)y.

10. Show that D((D + t)y) ≠ (D + t)(Dy). (Hint: simplify each side separately, and don't forget the product rule for differentiation, when it is required.)

11. Use Question 1 to verify that y1 = e^{αt} ∫ e^{-αt} f(t) dt is a particular solution of the equation (D − α)y = f(t), and that y0 = Ae^{αt} is a general function in the null space. (Hint: put p(t) = −α, and replace q(t) by f(t) and by zero.)

12. Use repeated first order methods to solve the D-operator equations:
(a) (D² + 3D + 2)y = 0  (b) d²y/dt² − 4 dy/dt + 4y = 0  (c) ÿ + 3ẏ + 2y = 4t.

13. Write down complementary functions for the D-operators:
(a) (D + 2)(D − 1)²  (b) D²(D − 2)³  (c) d³/dt³ + d²/dt² − d/dt − 1  (d) (D² − 1)².

14. Find complementary functions and particular integrals for the equations from Question 12, and compare the solutions with those obtained previously.

15. Find the complete general solutions of the D-operator equations below. (Hint: complementary functions were found in Question 13.)
(a) (D + 2)(D − 1)²y = e^{-t}  (b) D²(D − 2)³y = e^t
(c) (D³ + D² − D − 1)y = t  (d) (d²/dt² − 1)²y = t³.

*16. Use integration by parts several times (preferably in one line) to show that

e^{αt} ∫ e^{-αt} f(t) dt = −(1/α)( f(t) + f′(t)/α + f″(t)/α² + ··· ).

We know from Question 11 that the left hand side is a particular solution of (D − α)y = f(t). Show that the right hand side can be obtained by expanding 1/(D − α) as a binomial series and applying it to f(t).
(This justifies the series expansion of an inverse D-operator, provided that the right hand side converges.)

Complex exponentials

The procedure for finding complementary functions depended on factorizing P(D) into factors of degree one. By the Fundamental Theorem of Algebra this factorization is always possible, but the factors might involve complex numbers. However, this does not cause any difficulties, because by Euler's formula we know that if α = a + ib, then

Ae^{αt} = Ae^{at+ibt} = Ae^{at}(cos bt + i sin bt),

where the arbitrary constant A may also be complex. However, since P(D) has real coefficients, it follows that each non-real factor D − α has a corresponding conjugate factor D − ᾱ, and the solution has a corresponding term

Be^{ᾱt} = Be^{at}(cos bt − i sin bt).

Thus the complex portion of the complementary function can be written as a sum of pairs of the form

Ae^{αt} + Be^{ᾱt} = e^{at}((A + B) cos bt + i(A − B) sin bt).

If the solution is to be real for all t, then A + B and i(A − B) must both be pure real, say P = A + B and Q = i(A − B). This gives A = ½(P − iQ) and B = ½(P + iQ) = Ā. If we write P + iQ = Re^{iγ} (modulus-argument form), then A = ½Re^{-iγ}, so this part of the solution becomes

Ae^{αt} + Āe^{ᾱt} = 2 Re(Ae^{αt}) = Re(Re^{-iγ}e^{at+ibt}) = Re^{at} cos(bt − γ).

Thus the real-valued term in the complementary function corresponding to the pair of conjugate factors (D − α)(D − ᾱ) = (D − a)² + b² can be written as

e^{at}(P cos bt + Q sin bt)  or  Re^{at} cos(bt − γ),

where P, Q and R, γ are pairs of arbitrary real constants. For repeated factors, we add similar expressions, each multiplied by an appropriate power of t. Particular integrals for cosine or sine functions can be found by writing them as the real or imaginary part of a complex exponential, and using the results from the previous section.

Tutorial questions — Complex exponentials

17. Find real-valued solutions for the following homogeneous D-operator equations.
(Look in Algebra Chapter 1 for the factors of (b), or subtract and add 4D² to complete the square.)
(a) (D² + 4)y = 0  (b) d⁴y/dt⁴ + 4y = 0  (c) (D² + 4D + 5)y = 0
(d) (D² + 1)²y = 0  (e) (D³ + D² + D + 1)y = 0  (f) ÿ + 2ẏ + 2y = 0.

18. Find particular integrals for the following D-operator equations. Hence write down the complete general solutions. (Complementary functions were found in Question 17.)
(a) ÿ + 4y = t² + e^{-t}  (b) (D⁴ + 4)y = sin t
(c) d²y/dt² + 4 dy/dt + 5y = 8t sin t  (d) (D² + 1)²y = t³
(e) (D³ + D² + D + 1)y = te^{-t}  (f) (D² + 2D + 2)y = 5 cos t.

19. Find real general solutions of the simultaneous differential equations below by the following method: differentiate the first equation to get ẍ in terms of ẋ and ẏ, and substitute for ẏ so that ẍ is expressed in terms of ẋ, x, and y. Then eliminate y from the expressions for ẍ and ẋ, and solve the resulting D-operator equation for x. Finally, find y by using the first equation to express y in terms of x and ẋ. (Dots indicate derivatives with respect to t.)
(a) ẋ = y, ẏ = −x  (b) ẋ = −x + y, ẏ = −x − y.

Stability

A real exponential function e^{at} will tend to zero as t → ∞ if a < 0 and will tend to infinity if a > 0. This remains true (in magnitude) if e^{at} is multiplied by a polynomial in t, since exponentials beat powers, or if e^{at} is multiplied by a sine or cosine, which oscillates. Thus the complementary function of P(D) will tend to zero as t → ∞ provided Re(α) < 0 for every linear factor s − α of P(s). (We replace D by s since we are thinking of it as a complex variable, not an operator.) These values of α are called the zeros of P(s), since s = α is a solution of P(s) = 0. They are also called the poles of 1/P(s), i.e., the poles of a rational function of s are the values of s for which the denominator is zero, so the function is undefined.
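This criterion (Re(α) < 0 for every zero α of P(s)) is easy to apply mechanically for a quadratic P(s). A minimal sketch, assuming Python's cmath module and the quadratic formula; the helper names and the example operators are arbitrary choices:

```python
import cmath

def quadratic_zeros(a, b, c):
    """Zeros of P(s) = a s^2 + b s + c by the quadratic formula (complex allowed)."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + disc) / (2 * a), (-b - disc) / (2 * a))

def is_stable(zeros):
    """Decaying transients: every zero lies strictly left of the imaginary axis."""
    return all(z.real < 0 for z in zeros)

# P(s) = s^2 + 2s + 2 has zeros -1 ± i: both in the left half plane.
assert is_stable(quadratic_zeros(1, 2, 2))
# P(s) = s^2 - 1 has zeros ±1: the zero at s = +1 gives a growing e^t term.
assert not is_stable(quadratic_zeros(1, 0, -1))
```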
If the complementary function (or zero-input solution) does tend to zero, then it is said to be transient (which means temporary), and the system is said to be stable, since the zero-input solutions die away as t → ∞. For a stable system, all solutions of P(D)y = f(t) (with the same initial conditions) will tend to the same limiting solution as t → ∞ and the transients die away. Thus we have the following important result.

Theorem. A D-operator P(D) is stable if and only if all the poles of 1/P(s) have negative real part, i.e., lie to the left of the imaginary axis.

Figure 1.1. Underdamped and overdamped stable systems

The most important example of a stable system arises from damped simple harmonic motion: the operator is P(D) = mD² + 2kD + c², where m, k, and c are all positive, and the equation P(D)y = f(t) then becomes

mÿ = −c²y − 2kẏ + f(t).

By Newton's second law, if y denotes displacement, then the equation corresponds to a restoring force −c²y proportional to the displacement, a resisting force −2kẏ proportional to the velocity, and an external applied force f(t). Stability means that if there is no external force, then the displacement dies away. The electrical analogue is the RLC-circuit, with equation

(LD² + RD + 1/C) Q(t) = E(t).

Here L, R, and C are constants (inductance, resistance, and capacitance respectively), Q(t) is the charge on the capacitor, and E(t) is the applied voltage.

Such a system is said to be underdamped if the zeros are non-real, because the transients oscillate (because of the complex exponentials) as they die away. The system is overdamped if the zeros are real; the transients die away without oscillation. These situations are illustrated in Figure 1.1.

Another important feature of a stable system P(D)y = f(t) is the limiting response to a periodic input f(t) = e^{iωt}.

Figure 1.2. Values of 1/|P(s)| for s in the left half plane
By the stability assumption, all the zeros of $P(s)$ lie to the left of the imaginary axis, so $P(i\omega)\neq 0$. Therefore the particular integral is
$$\frac{1}{P(D)}e^{i\omega t}=\frac{1}{P(i\omega)}e^{i\omega t},$$
using the simplest case for evaluating particular integrals. The limiting output, as the transients die away, is therefore $\frac{1}{P(i\omega)}e^{i\omega t}$. Its amplitude or modulus is $\frac{1}{|P(i\omega)|}$, which is called the dynamic gain of the system, since it is the ratio of the magnitudes of the output and the input. Figure 1.2 plots the surface of values of $\frac{1}{|P(s)|}$ for values of $s$ on and to the left of the imaginary axis. The vertical section on the near side (along the imaginary axis, where $s=i\omega$) gives the values of $\frac{1}{|P(i\omega)|}$. The two poles of $\frac{1}{P(s)}$ (where the surface goes up to infinity) are also clearly visible.

Tutorial questions — Stability

20. Determine which of the operators in Question 17 are stable.

21. (a) If $P(s)=ms^2+2ks+c^2$, where $m$, $k$, and $c$ are all positive, find the zeros of $P(s)$. (Hint: complete the square, or use the quadratic formula.)
(b) If $0<k<c\sqrt m$, define $\omega=\sqrt{mc^2-k^2}$, and show that the zeros are at $(-k\pm i\omega)/m$. Hence find real expressions for the complementary functions, and show that they oscillate and die away (underdamping).
(c) If $k>c\sqrt m$, define $\omega=\sqrt{k^2-mc^2}$, and show that the zeros are at $(-k\pm\omega)/m$. Hence find the complementary functions, and show that they die away with at most one turning point (overdamping).
(d) If $k=c\sqrt m$, find the complementary functions. Do they behave more like underdamped or overdamped solutions?

22. If $P(D)=D^2+2\zeta\omega D+\omega^2$, where $\omega>0$, show that the system is stable and overdamped if $\zeta>1$, stable and underdamped if $0<\zeta<1$, and unstable if $\zeta<0$.

General tutorial questions

Use these for extra practice or revision.

23. Find complete general solutions of the equations below.
(a) $(D+2)(D-1)^2y=6e^t$   (b) $D^2(D-2)^3y=4te^{2t}$
(c) $\dfrac{d^3y}{dt^3}+\dfrac{d^2y}{dt^2}-\dfrac{dy}{dt}-y=8\cosh t$   (d) $(D^2-1)^2y=4te^t$.

24.
Solve the following homogeneous differential equations under the given initial conditions:
(a) $\dfrac{d^2y}{dt^2}+5\dfrac{dy}{dt}+6y=0$, $y(0)=1$, $\left.\dfrac{dy}{dt}\right|_{t=0}=0$.
(b) $\dfrac{d^2y}{dt^2}+2\dfrac{dy}{dt}+y=0$, $y(0)=1$, $\left.\dfrac{dy}{dt}\right|_{t=0}=0$.
(c) $\dfrac{d^2y}{dt^2}+y=0$, $y(0)=1$, $\left.\dfrac{dy}{dt}\right|_{t=0}=1$.
(d) $\ddot x+4\dot x+13x=0$ such that $x=2$ and $\dot x=1$ when $t=0$.
(e) $\dfrac{d^4y}{dt^4}-y=0$ such that $y=1$ and $\dfrac{dy}{dt}=\dfrac{d^2y}{dt^2}=\dfrac{d^3y}{dt^3}=0$ when $t=0$.

25. Solve the following differential equations, using the given initial conditions to find the constants. Indicate which part of the solution is the forced response and which is the natural or free response.
(a) $\dfrac{d^2y}{dt^2}+3\dfrac{dy}{dt}+2y=e^{-t}$, $y(0)=\left.\dfrac{dy}{dt}\right|_{t=0}=0$.
(b) $\dfrac{d^2y}{dt^2}+3\dfrac{dy}{dt}+2y=e^{-t}$, $y(0)=\left.\dfrac{dy}{dt}\right|_{t=0}=1$.
(c) $\dfrac{d^2y}{dt^2}+2\dfrac{dy}{dt}+y=te^{-t}$, $y(0)=1$, $\left.\dfrac{dy}{dt}\right|_{t=0}=0$.
(d) $\ddot x(t)-x(t)=e^{-t}+\sin t$, $x(0)=\dot x(0)=0$.

26. A signal $x(t)=te^{-2t}$ is applied to a system satisfying the differential equation
$$\frac{d^2y(t)}{dt^2}+4\frac{dy(t)}{dt}+8y(t)=x(t).$$
The initial conditions of this system are $y(0)=\frac12$ and $y'(0)=\frac14$. Find the forced and natural responses.

27. If $P(D)=(D+1)^2$ and $Q(D)=D-1$, find the full general solution of $P(D)y=Q(D)te^{-t}$. (Hint: first simplify the right-hand side.)

Answers

1. $\mu y_1=\int\mu q(t)\,dt$, so $\mu y_1'+\mu'y_1=\mu q(t)$. Now substitute for $\mu'$ and cancel $\mu$.

2. (a) $y=y_1+y_0$, where $y_1=te^{-t^2}$ and $y_0=ce^{-t^2}$. Check that $y_0'+2ty_0=0$.
(b) $y=y_1+y_0$, where $y_1=-1$ and $y_0=ce^{\arctan t}$. Check that $y_0'-\dfrac{y_0}{t^2+1}=0$.

3. (a) $Z_1+Z_0=\begin{pmatrix}-1-2i\\0\end{pmatrix}+c\begin{pmatrix}i\\1\end{pmatrix}$,   (b) $Z_1+Z_0=\begin{pmatrix}1+i\\2-2i\\0\end{pmatrix}+c\begin{pmatrix}-i\\-2+i\\1\end{pmatrix}$.

4.
(a) $P(cy)=2cy=c(Py)$, $P(y_1+y_2)=2(y_1+y_2)=Py_1+Py_2$, linear;
(b) $P(cy)=c^2y^2\neq cy^2=c(Py)$, $P(y_1+y_2)=(y_1+y_2)^2\neq y_1^2+y_2^2=Py_1+Py_2$, non-linear;
(c) $P(cy)=c^2yy'\neq cyy'=c(Py)$, $P(y_1+y_2)=(y_1+y_2)(y_1'+y_2')\neq y_1y_1'+y_2y_2'=Py_1+Py_2$, non-linear;
(d) $P(cy)=tcy'+cy=c(Py)$, $P(y_1+y_2)=t(y_1'+y_2')+(y_1+y_2)=(ty_1'+y_1)+(ty_2'+y_2)=Py_1+Py_2$, linear.

5. (a) $P(cX)=2cX=c(PX)$, $P(X_1+X_2)=2(X_1+X_2)=PX_1+PX_2$, linear;
(b) $P(cX)=|c||X|\neq c|X|=c(PX)$, $P(X_1+X_2)=|X_1+X_2|\neq|X_1|+|X_2|=PX_1+PX_2$, non-linear;
(c) $P(cX)=(cX)^T=c(X^T)=c(PX)$, $P(X_1+X_2)=(X_1+X_2)^T=PX_1+PX_2$, linear;
(d) $P(cX)=c^2X^TX\neq c(X^TX)=c(PX)$, $P(X_1+X_2)=(X_1+X_2)^T(X_1+X_2)\neq X_1^TX_1+X_2^TX_2=PX_1+PX_2$, non-linear.

7. (a) $y=0$, (d) $y=ct$.

9. $\bigl(\frac{d}{dt}+2\bigr)\bigl(\frac{d}{dt}+1\bigr)y=\bigl(\frac{d}{dt}+2\bigr)\bigl(\frac{dy}{dt}+y\bigr)=\frac{d}{dt}\bigl(\frac{dy}{dt}+y\bigr)+2\bigl(\frac{dy}{dt}+y\bigr)=\bigl(\frac{d^2y}{dt^2}+\frac{dy}{dt}\bigr)+\bigl(2\frac{dy}{dt}+2y\bigr)=\frac{d^2y}{dt^2}+3\frac{dy}{dt}+2y=(D^2+3D+2)y$.

10. $D(D+t)y=D(y'+ty)=y''+(ty'+y)$ (product rule), while $(D+t)Dy=(D+t)y'=y''+ty'$. This shows that if the coefficients are not constants, then the operators cannot simply be multiplied like polynomials.

12. (a) $y=Ae^{-2t}+Be^{-t}$, (b) $y=(At+B)e^{2t}$, (c) $y=2t-3+Ae^{-2t}+Be^{-t}$.

13. (a) $Ae^{-2t}+(B_1+B_2t)e^t$, (b) $A_1+A_2t+(B_1+B_2t+B_3t^2)e^{2t}$, (c) $Ae^t+(B_1+B_2t)e^{-t}$, (d) $(A_1+A_2t)e^t+(B_1+B_2t)e^{-t}$.

15. (a) $\frac14e^{-t}+Ae^{-2t}+(B_1+B_2t)e^t$, (b) $-e^t+A_1+A_2t+(B_1+B_2t+B_3t^2)e^{2t}$, (c) $-t+1+Ae^t+(B_1+B_2t)e^{-t}$, (d) $t^3+12t+(A_1+A_2t)e^t+(B_1+B_2t)e^{-t}$.

16. $\displaystyle\int e^{-\alpha t}f(t)\,dt=\frac{e^{-\alpha t}}{-\alpha}f(t)-\frac{e^{-\alpha t}}{(-\alpha)^2}f'(t)+\frac{e^{-\alpha t}}{(-\alpha)^3}f''(t)-\cdots.$

17.
(a) $y_0=P\cos 2t+Q\sin 2t$ or $y_0=R\cos(2t-\gamma)$,
(b) $y_0=e^t(P\cos t+Q\sin t)+e^{-t}(R\cos t+S\sin t)$,
(c) $y_0=e^{-2t}(P\cos t+Q\sin t)$,
(d) $y_0=P\cos t+Q\sin t+t(R\cos t+S\sin t)$,
(e) $y_0=Ae^{-t}+P\cos t+Q\sin t$,
(f) $y_0=e^{-t}(P\cos t+Q\sin t)$.

18. (a) $y=\frac14(t^2-\frac12)+\frac15e^{-t}+P\cos 2t+Q\sin 2t$,
(b) $y=\frac15\sin t+e^t(P\cos t+Q\sin t)+e^{-t}(R\cos t+S\sin t)$,
(c) $y=\cos t\,(-t+1)+\sin t\,(t-\frac12)+e^{-2t}(P\cos t+Q\sin t)$,
(d) $y=t^3-12t+P\cos t+Q\sin t+t(R\cos t+S\sin t)$,
(e) $y=(\frac14t^2+\frac12t)e^{-t}+Ae^{-t}+P\cos t+Q\sin t$,
(f) $y=\cos t+2\sin t+e^{-t}(P\cos t+Q\sin t)$.

19. (a) $x=P\cos t+Q\sin t$, $y=Q\cos t-P\sin t$, (b) $x=e^{-t}(P\cos t+Q\sin t)$, $y=e^{-t}(Q\cos t-P\sin t)$.

20. (c) and (f) are stable.

21. (a) $\bigl(-k\pm\sqrt{k^2-mc^2}\bigr)/m$.
(b) $y_0=e^{-kt/m}\bigl(P\cos(\omega t/m)+Q\sin(\omega t/m)\bigr)$, which dies away because of the negative exponent, but oscillates because of the sine and cosine terms.
(c) $y_0=Ae^{-(k+\omega)t/m}+Be^{-(k-\omega)t/m}$. This dies away because both exponents are negative, since $0<\omega<k$. Solve $\dot y_0=0$ to find one turning point (at most).
(d) $y_0=(A_1+A_2t)e^{-kt/m}$, no oscillation, like overdamped solutions.

22. Poles are at $s=\omega\bigl(-\zeta\pm\sqrt{\zeta^2-1}\bigr)$. These are real and negative if $\zeta>1$, non-real but in the left half plane if $0<\zeta<1$, and in the right half plane if $\zeta<0$.

23. (a) $t^2e^t+Ae^{-2t}+(B_1+B_2t)e^t$, (b) $\bigl(\frac{1}{24}t^4-\frac16t^3\bigr)e^{2t}+A_1+A_2t+(B_1+B_2t+B_3t^2)e^{2t}$, (c) $te^t-t^2e^{-t}+Ae^t+(B_1+B_2t)e^{-t}$, (d) $\bigl(\frac16t^3-\frac12t^2\bigr)e^t+(A_1+A_2t)e^t+(B_1+B_2t)e^{-t}$.

24. (a) $y(t)=3e^{-2t}-2e^{-3t}$, (b) $y(t)=e^{-t}+te^{-t}$, (c) $y(t)=\sin t+\cos t$, (d) $x=e^{-2t}\bigl(2\cos 3t+\frac53\sin 3t\bigr)$, (e) $y(t)=\frac12(\cosh t+\cos t)$.

25. (a) $y(t)=e^{-2t}+(t-1)e^{-t}$; forced response is $te^{-t}$, natural response is $e^{-2t}-e^{-t}$.
(b) $y(t)=-e^{-2t}+(t+2)e^{-t}$; forced response is $te^{-t}$, natural response is $2e^{-t}-e^{-2t}$.
(c) $y(t)=e^{-t}+te^{-t}+\frac16t^3e^{-t}$; forced response is $\frac16t^3e^{-t}$, natural response is $e^{-t}+te^{-t}$.
(d) $x(t)=\frac12\bigl(e^t-e^{-t}-te^{-t}-\sin t\bigr)$; forced response is $-\frac12(\sin t+te^{-t})$, natural response is $\sinh t$.

26. General solution is $y=e^{-2t}(A\cos 2t+B\sin 2t)+\frac14te^{-2t}$; forced response is $\frac14te^{-2t}$, natural response is $\frac12e^{-2t}(\cos 2t+\sin 2t)$.

27. $Q(D)te^{-t}=(1-2t)e^{-t}$. Solution is $y=e^{-t}\bigl(A+Bt+\frac12t^2-\frac13t^3\bigr)$.

Calculus Chapter 2

Vector functions of a scalar

Vector differentiation

A real vector function of a single real variable, say $\mathbf r=\mathbf r(t)$, can be thought of as the parametric representation of a curve, since we have $\mathbf r(t)=\bigl(x(t),y(t)\bigr)$ or $\mathbf r(t)=\bigl(x(t),y(t),z(t)\bigr)$, depending on whether $\mathbf r$ is a two-dimensional or three-dimensional vector. It is often helpful to think of the curve as representing the path of a particle that has position or displacement vector $\mathbf r(t)$ at time $t$. In the plane, complex numbers form a more powerful alternative to vector notation, and a plane curve can be represented as $z=z(t)=x(t)+iy(t)$. Polar equations provide yet another representation for plane curves, since the polar equation $r=r(\theta)$ is the same as the parametric equation $\mathbf r=\bigl(r(\theta)\cos\theta,\,r(\theta)\sin\theta\bigr)=r(\theta)(\cos\theta,\sin\theta)$.

Vector differentiation is straightforward, since each component is differentiated separately. Its interpretation is also self-evident: if $\mathbf r(t)$ is thought of as a displacement vector at time $t$, then $\frac{d}{dt}\mathbf r(t)$ is the velocity vector (which is parallel or tangential to the curve at that point), and $\frac{d^2}{dt^2}\mathbf r(t)$ is the acceleration vector. The rules for vector differentiation are also precisely as expected, so the rules need not be specifically learnt, in spite of the fact that there are three possible kinds of product. The reason is that products of vectors are built up from sums and products of their components, for which the ordinary rules of differentiation apply.
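Because differentiation acts componentwise, the product rules can be confirmed mechanically. A quick symbolic check with `sympy`, using the vectors of Question 2 (a sketch for illustration, not part of the original text):

```python
import sympy as sp

t = sp.symbols('t')
u = sp.Matrix([t**2, t, 1])
v = sp.Matrix([sp.log(t), sp.exp(t), t])

# d/dt (u . v) = u . dv/dt + du/dt . v
lhs = sp.diff(u.dot(v), t)
rhs = u.dot(v.diff(t)) + u.diff(t).dot(v)
print(sp.simplify(lhs - rhs))  # 0

# d/dt (u x v) = u x dv/dt + du/dt x v
lhs = u.cross(v).diff(t)
rhs = u.cross(v.diff(t)) + u.diff(t).cross(v)
print(sp.simplify(lhs - rhs))  # zero vector
```

Both differences simplify to zero, exactly because each component obeys the ordinary scalar product rule.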
(In the plane, differentiation of complex numbers also corresponds to vector differentiation, as is shown below.)

Rules for vector differentiation. If $\mathbf u(t)$ and $\mathbf v(t)$ are vector functions of $t$, if $\phi(t)$ is a scalar function of $t$, and if $c$ is a constant, then
$$\frac{d}{dt}(c\mathbf u)=c\frac{d\mathbf u}{dt},\qquad \frac{d}{dt}(\mathbf u+\mathbf v)=\frac{d\mathbf u}{dt}+\frac{d\mathbf v}{dt},\qquad \frac{d}{dt}(\phi\mathbf u)=\phi\frac{d\mathbf u}{dt}+\frac{d\phi}{dt}\mathbf u,$$
$$\frac{d}{dt}(\mathbf u\cdot\mathbf v)=\mathbf u\cdot\frac{d\mathbf v}{dt}+\frac{d\mathbf u}{dt}\cdot\mathbf v,\qquad \frac{d}{dt}(\mathbf u\times\mathbf v)=\mathbf u\times\frac{d\mathbf v}{dt}+\frac{d\mathbf u}{dt}\times\mathbf v.$$

Since $\frac{d}{dt}\mathbf r(t)$ is a tangent vector to the curve $\mathbf r=\mathbf r(t)$, it follows that a unit tangent vector can be found by dividing it by its magnitude. Such a unit tangent vector is denoted by $\mathbf u$ or sometimes $\hat{\mathbf u}$, so we have
$$\mathbf u=\frac{1}{|\dot{\mathbf r}|}\,\dot{\mathbf r}.$$
If $\frac{d}{dt}\mathbf r(t)$ is thought of as the velocity vector of a particle on a curve, then its magnitude is the (scalar) speed of the particle, i.e. the rate of change of distance along the curve, i.e. the rate of change of arc length. We denote arc length by $s$, so we have $\frac{ds}{dt}=\bigl|\frac{d}{dt}\mathbf r(t)\bigr|$, and the arc length from the point where $t=\alpha$ to the point where $t=\beta$ is given by
$$\text{Arc length}=\int_\alpha^\beta\Bigl|\frac{d\mathbf r}{dt}\Bigr|\,dt.$$
Sometimes a curve is oriented, i.e. given a direction, so that arc length is taken as positive in one direction and negative in the opposite direction. For an oriented curve, we must write
$$\frac{ds}{dt}=\pm\Bigl|\frac{d}{dt}\mathbf r(t)\Bigr|,$$
since $\frac{ds}{dt}$ is obviously negative if arc length $s$ decreases as the parameter $t$ increases.

Tutorial questions — Vector differentiation

1. A particle moves along the curve with displacement vector $\mathbf r=(2t^2,\,t^2-4t,\,3t-5)$ at time $t$.
(a) Find the velocity and acceleration vectors of the particle at any time $t$.
(b) Find the components, in the direction of the vector $(1,-3,2)$, of the velocity and acceleration vectors at time $t=2$. (Hint: the component of a vector $\mathbf a$ in the direction of a vector $\mathbf b$ is the number $|\mathbf a|\cos\phi$, where $\phi$ is the angle between $\mathbf a$ and $\mathbf b$.)
(c) Find the components, in the direction tangential to the path of the particle, of the velocity and acceleration vectors at time $t=1$.

2.
If $\mathbf u=(t^2,t,1)$ and $\mathbf v=(\ln t,\,e^t,\,t)$, verify the rules of differentiation for $\frac{d}{dt}(\mathbf u\cdot\mathbf v)$ and $\frac{d}{dt}(\mathbf u\times\mathbf v)$ by evaluating each side separately.

3. For a plane curve with polar equation $r=r(\theta)$, note that $\mathbf r=r(\theta)(\cos\theta,\sin\theta)$. Use the rules for vector differentiation with respect to $\theta$ to show that
(i) $\dfrac{d\mathbf r}{d\theta}=\dfrac{dr}{d\theta}(\cos\theta,\sin\theta)+r(-\sin\theta,\cos\theta)$
(ii) $\dfrac{d^2\mathbf r}{d\theta^2}=\Bigl(\dfrac{d^2r}{d\theta^2}-r\Bigr)(\cos\theta,\sin\theta)+2\dfrac{dr}{d\theta}(-\sin\theta,\cos\theta)$.
Note that the right-hand sides are sums of perpendicular vectors. Deduce that
$$\Bigl|\frac{d\mathbf r}{d\theta}\Bigr|=\sqrt{r^2+\Bigl(\frac{dr}{d\theta}\Bigr)^2}.$$

4. If $|\mathbf u|$ is constant, show that $\mathbf u$ is perpendicular to $\dot{\mathbf u}$. (Hint: $|\mathbf u|^2=\mathbf u\cdot\mathbf u$; now differentiate both sides.)

5. Suppose a curve $z=z(t)$ in the complex plane is written in the form $z=r(t)e^{i\theta(t)}$.
(i) Find $\frac{dz}{dt}$ and write it in real-imaginary form.
(ii) Write the curve in vector form $\mathbf r=\mathbf r(t)=(x(t),y(t))$. Then find $\frac{d\mathbf r}{dt}$ and show that it is the vector form of $\frac{dz}{dt}$.
This shows that differentiation of curves in the complex plane coincides with vector differentiation. (Complex differentiation is often easier to perform.)

6. For each of the following curves, find the unit tangent vector at a general point on the curve, and find the arc length between the given points. (The parameter values for the initial and final points must be evaluated to find the endpoints of integration.)
(a) $\mathbf r=e^t(\cos t,\sin t,1)$ between the points $\mathbf r=(1,0,1)$ and $\mathbf r=e^\pi(-1,0,1)$.
(b) $\mathbf r=(t,\cosh t,\sinh t)$ between the points $\mathbf r=(\ln 2,\frac54,\frac34)$ and $\mathbf r=(\ln 3,\frac53,\frac43)$.
(c) $z=e^{(1+i\pi)t}$ between the points $z=1$ and $z=e^2$.
(d) $r=1+\cos\theta$ (polars) between the points where $\theta=-\pi$ and $\theta=\pi$. (Hint: use the formula for $\frac{d\mathbf r}{d\theta}$ given in Question 3.)
Sketch the curve in (a). (Hint: show that the curve lies on the surface $z=\sqrt{x^2+y^2}$. What is this surface?)

7. In electromagnetic signalling a wave is often represented by a complex curve $z=z(t)$, where $|z|$ is the amplitude and $\arg(z)$ is the angular displacement.
Thus $\frac{d}{dt}\arg(z)=$ angular velocity $=2\pi\times$ frequency. A carrier wave $z_c(t)$ has constant amplitude $r$ and constant angular velocity $\omega$, so $z_c=re^{i\omega t}$. Suppose this wave carries a real signal $f(t)=\cos\alpha t$.
(a) In AM signalling the amplitude is modulated, and the modulated wave $z(t)$ is obtained by simply multiplying the carrier wave by the signal, i.e. $z(t)=f(t)z_c(t)$. Show that $z(t)=\frac12r\bigl(e^{i(\omega+\alpha)t}+e^{i(\omega-\alpha)t}\bigr)$. Note how the modulated wave is made up of two waves with angular velocities $\omega\pm\alpha$ (one on each side of the carrier). These are called sidebands.
(b) In FM signalling the frequency alone is modulated, and the modulated wave $z(t)$ satisfies $\frac{d}{dt}\arg(z)=\frac{d}{dt}\arg(z_c)+f(t)$. Show that $\frac{d}{dt}\arg(z)=\omega+\cos\alpha t$, and solve for $\arg(z)$ (assuming $\arg(z)=0$ at $t=0$). Hence show that the modulated wave is $z(t)=r\exp\bigl(i(\omega t+\frac1\alpha\sin\alpha t)\bigr)$. It can be shown that there are infinitely many sidebands, with angular velocities $\omega\pm\alpha$, $\omega\pm 2\alpha$, and so on, but with decreasing amplitudes.

Curvature

In elementary curve sketching we obtained a qualitative description of the curvature of a plane curve: if the second derivative is positive, then the curve bends or curves upward (irrespective of whether it is increasing or decreasing), while if the second derivative is negative, then the curve bends downward. We now establish a quantitative description, i.e. we shall measure the sharpness of the bend, not only the direction of bending. We are also not restricted to plane curves. We showed in the previous section that if $s$ denotes arc length along a curve $\mathbf r=\mathbf r(t)$, then $\frac{ds}{dt}=\bigl|\frac{d\mathbf r}{dt}\bigr|$ and a unit tangent vector is given by $\mathbf u=\frac{d\mathbf r}{dt}\div\bigl|\frac{d\mathbf r}{dt}\bigr|$. Thus $\mathbf u=\frac{d\mathbf r}{dt}\div\frac{ds}{dt}$, which by the Chain Rule gives
$$\mathbf u=\frac{d\mathbf r}{ds}.$$
It is therefore convenient to use arc length $s$ as the parameter for the curve.

Figure 2.1. Changes in curvature (low curvature vs high curvature)

Since $\mathbf u$ is a unit vector, it is perpendicular to its derivative $\frac{d\mathbf u}{ds}$, as shown in Question 4. Let $\bigl|\frac{d\mathbf u}{ds}\bigr|=\kappa$ (kappa) and $\mathbf n=\frac1\kappa\frac{d\mathbf u}{ds}$ (assuming $\kappa\neq 0$), so
$$\frac{d\mathbf u}{ds}=\kappa\mathbf n.$$
It follows that $\mathbf n$ is a unit vector, and $\mathbf n\perp\mathbf u$, so $\mathbf n$ is called the unit normal vector to the curve. It is easy to show that for a straight line $\kappa=0$, and for a circle $\kappa$ is the reciprocal of the radius. Therefore as $\kappa$ increases, the radius decreases, and the bend becomes sharper, so we call $\kappa$ the curvature of the curve, and $\frac1\kappa$ is called the radius of curvature at each point. For a general curve the curvature $\kappa$ varies from point to point, as illustrated in Figure 2.1. The normal vector $\mathbf n$ points towards the inside of the bend, and the point $\mathbf r+\frac1\kappa\mathbf n$ is called the centre of curvature.

For a curve in the $(x,y)$ plane the situation is simpler. We have $\mathbf u=\bigl(\frac{dx}{ds},\frac{dy}{ds}\bigr)$, and the only unit vectors perpendicular to this are $\pm\bigl(-\frac{dy}{ds},\frac{dx}{ds}\bigr)$, so $\mathbf n$ must be one of these. Since $\mathbf u$ is a unit vector, we can write $\mathbf u=(\cos\phi,\sin\phi)$, say. The angle $\phi$ is called the inclination of the curve, since $\tan\phi=\frac{dy}{ds}\div\frac{dx}{ds}$, which is the slope $\frac{dy}{dx}$. Then by the Chain Rule we have
$$\frac{d\mathbf u}{ds}=\frac{d\phi}{ds}(-\sin\phi,\cos\phi),$$
where $(-\sin\phi,\cos\phi)$ is a unit vector perpendicular to $\mathbf u$. By comparing this with the previous expression for $\frac{d\mathbf u}{ds}$, we see that
$$\kappa=\frac{d\phi}{ds} \quad\text{and}\quad \mathbf n=\pm(-\sin\phi,\cos\phi)=\pm\Bigl(-\frac{dy}{ds},\frac{dx}{ds}\Bigr),$$
as mentioned previously.

For an explicit plane curve $y=y(x)$, a slightly different convention is used, to avoid the plus-or-minus signs. With this convention, we use the formulae
$$\kappa=\frac{d\phi}{ds} \quad\text{and}\quad \mathbf n=(-\sin\phi,\cos\phi)=\Bigl(-\frac{dy}{ds},\frac{dx}{ds}\Bigr).$$
This choice of $\mathbf n$ always points upward, rather than towards the centre of curvature, and it follows that $\kappa$ is not always positive. Since $\frac{dy}{dx}=\tan\phi$ and $\frac{ds}{dx}=\sqrt{1+\tan^2\phi}=\sec\phi$, it follows that
$$\frac{d^2y}{dx^2}=\frac{d}{dx}\tan\phi=\sec^2\phi\,\frac{d\phi}{dx}=\sec^2\phi\,\frac{d\phi}{ds}\,\frac{ds}{dx}=\kappa\sec^3\phi.$$
Thus $\kappa$ and $y''$ have the same sign, and
$$\kappa=\frac{y''}{\sec^3\phi}=\frac{y''}{\bigl(1+(y')^2\bigr)^{3/2}},$$
so the sign of $\kappa$, like the sign of $y''$, indicates the direction of curving.

An important application of plane curvature is in the bending of a horizontal beam. The curvature at each point is proportional to the bending moment, $\kappa=BM/EI$, where the constant $EI$ depends on the cross-section of the beam and the material of which it is made. If the deflection is slight, then the slope $y'$ is small, so $\frac{ds}{dx}\approx 1$ and $\kappa\approx y''$. Thus
$$EIy''\approx BM \quad\text{and}\quad EIy^{(4)}\approx\frac{d^2}{dx^2}BM,$$
which is the load per unit length.

Tutorial questions — Curvature

8. Find the curvature at a general point on each of the following curves. (Hint: first find $\mathbf u$, then $\frac{d\mathbf u}{dt}$. Then $\kappa=\bigl|\frac{d\mathbf u}{dt}\bigr|\div\bigl|\frac{d\mathbf r}{dt}\bigr|$, since this expression is equal to $\bigl|\frac{d\mathbf u}{dt}\bigr|\div\frac{ds}{dt}=\bigl|\frac{d\mathbf u}{ds}\bigr|$.)
(a) $\mathbf r=\mathbf a+\mathbf bt$ ($\mathbf a$, $\mathbf b$ constant)   (b) $\mathbf r=a(\cos t,\sin t)$ ($a$ constant)   (c) $\mathbf r=(t,t^2)$   (d) $\mathbf r=(t,\cosh t)$   (e) $\mathbf r=(2t-\sin 2t,\,1-\cos 2t)$ (cycloid)   (f) $z=(1-it)e^{it}$ for $t>0$ (see Question 5)   (g) $\mathbf r=\bigl(\cos t,\,\ln(\sec t+\tan t)-\sin t\bigr)$   (h) $r=ae^{b\theta}$ (use Question 3).

9. The path traced out by the centres of curvature of a curve is called its evolute, and the original curve is called the involute of its evolute. Find the evolutes of the curves below, i.e. determine $\mathbf r+\frac1\kappa\mathbf n$.
(a) The cycloid in Question 8(e). Show that the evolute is also a cycloid.
(b) The logarithmic spiral in Question 8(h). Show that the evolute is also a logarithmic spiral.
(c) The curve in Question 8(f). Show that the evolute is the unit circle, so the curve is an involute of a circle, which is the profile used for gear teeth.

10. Use the formula for the curvature of an explicit plane curve $y=y(x)$ to find the curvatures of the following curves:
(a) $y=x^2$   (b) $y=\sin x$   (c) $y=\cosh x$   (d) $y=\ln\sec x$.

11. Find the maximum curvature (in absolute value), and the $x$ value(s) at which it occurs, on the curves (a) $y=\frac13x^3$   (b) $y=e^x$.

12. In designing a road for high-speed travel, it is advisable that the curvature be continuous (i.e. not change abruptly). In particular, when a straight road starts to bend, the curvature at the beginning of the bend should be zero. Suppose we have two portions of straight road. One portion has equation $y=m(x-a)$ for $x\ge a$, and the other has equation $y=-m(x+a)$ for $x\le-a$. A road with equation $y=y(x)$ for $-a\le x\le a$ is being designed to join them. What can you say about $y$, $y'$, and $y''$ at $x=\pm a$? Show that $y=-\frac{2am}{\pi}\cos\frac{\pi x}{2a}$ gives a satisfactory road. What is the maximum curvature (at $x=0$)? Find values of $A$, $B$, and $C$ so that the curve $y=Ax^4+Bx^2+C$ is also satisfactory, and find the maximum curvature. Why is this road slightly better than the previous one?

Torsion

For a curve $\mathbf r=\mathbf r(t)$, the plane passing through the point $\mathbf r(t)$ and containing the vectors $\mathbf u$ and $\mathbf n$ is called the osculating plane of the curve at that point. If the curve does not lie in a fixed plane, then in addition to its curvature (which takes place in the osculating plane) it has torsion or twisting out of the osculating plane. The osculating plane is perpendicular to the cross product $\mathbf b=\mathbf u\times\mathbf n$, which is called the binormal to the curve. Note that $|\mathbf b|=|\mathbf u||\mathbf n|\sin\frac\pi2$, so $\mathbf b$ is also a unit vector. If the curve is a plane curve, then $\mathbf b$ is constant, so $\frac{d\mathbf b}{ds}=0$. More generally, $\frac{d\mathbf b}{ds}$ measures how fast the osculating plane is changing, i.e. how much the curve is twisting. Since $\mathbf u$, $\mathbf n$ and $\mathbf b$ are mutually perpendicular unit vectors, we can use dot products to express any vector in terms of its components in those three directions. In particular, we can write
$$\frac{d\mathbf b}{ds}=\Bigl(\frac{d\mathbf b}{ds}\cdot\mathbf u\Bigr)\mathbf u+\Bigl(\frac{d\mathbf b}{ds}\cdot\mathbf n\Bigr)\mathbf n+\Bigl(\frac{d\mathbf b}{ds}\cdot\mathbf b\Bigr)\mathbf b.$$
Thus we have τ = − db ds · n, and To find the remaining derivative dn ds = − du ds × b − u × dn ds , db ds db ds · n, which we write as −τ , and call τ = −τ n. we note that n = −u × b, so = −κn × b + u × (τ n) = −κu + τ b. The three expressions for the derivatives are called the Serret-Frenet formulae and can be written as follows (using dashes to denote derivatives with respect to s): u′ n′ b′ = = = κn −κu +τ b −τ n or 0 κ u′ n′ = −κ 0 0 −τ b′ u 0 n. τ b 0 Note that the matrix of kappas and taus is the negative of its transpose; such matrices are called skew-symmetric. The simplest formulae for evaluating curvature and torsion are κ = |u′ | and τ = κ−2 (u · (u′ × u′′ )) (see tutorial question below for τ ). The scalar triple product in the formula for τ is just a determinant. Remember that the dashes denote derivatives with respect to arc length s; if another parameter is given, then the chain rule must be used. Tutorial questions — Torsion 13. Find the curvature and torsion of the following curves: (b) r = (t, t2 , 23 t3 ). (a) r = (a cos t, a sin t, bt) Find the osculating plane at a general point on the helix in (a). 14. Use the Serret-Frenet formulae to prove: (i) u × u′ = κb (ii) u′′ = −κ2 u + κ′ n + κτ b (iii) (u × u′ ) · u′′ = κ2 τ . 15. Find the curvature and torsion of curves with the following unit tangent vectors, given that s denotes arc length. (Hint: sech2 s + tanh2 s = 1.) (a) u = √1 (tanh s, sech s, 1) 2 d ds sech s = − sech s tanh s, d ds tanh s = sech2 s, and (b) u = (− sin s tanh s, cos s tanh s, sech s). 24 MATH2011/2/4 Trajectories and orthogonal trajectories Suppose at each point r in a region in the plane or in space we have a vector v = v(r) defined. We say that v forms a vector field in the region. Vector fields arise in practice chiefly from forces or velocities, and they are then called force fields or velocity fields, respectively. 
If we think of the vector field as attaching an arrow to each point in the region, then we can imagine joining these arrows and thereby filling the region with curves such that $\mathbf v(\mathbf r)$ is always a tangent to the curve through the point $\mathbf r$. These curves are called the trajectories of the field, or, in special cases, the lines of force or streamlines. Note that the trajectories give us the direction of $\mathbf v$ at each point, but they say nothing about the magnitude of $\mathbf v$. To find a general trajectory $\mathbf r=\mathbf r(t)$ for a vector field $\mathbf v=\mathbf v(\mathbf r)$, we note that, since $\mathbf v$ is tangential to the curve, we have (if we choose the parameter $t$ suitably)
$$\frac{d\mathbf r}{dt}=\mathbf v(\mathbf r).$$
We must therefore solve this differential equation to find the trajectories, which can all be found by taking different values of the arbitrary constant(s). If the trajectories are sketched, then the orientation of the field should be indicated by inserting arrows on the trajectories.

A curve $\mathbf r=\mathbf r(t)$ that is perpendicular to a two-dimensional vector field $\mathbf v=\mathbf v(\mathbf r)$ at each point $\mathbf r$ is called an orthogonal trajectory of the field. Orthogonal trajectories can be found by solving the differential equation
$$\frac{d\mathbf r}{dt}\cdot\mathbf v(\mathbf r)=0.$$
They do not have arrows, since there is no direction involved, but the distance between adjacent orthogonal trajectories can indicate the magnitude of the vector field, just as the distance between adjacent contours on a surface indicates the steepness of the slope. Three-dimensional vector fields have orthogonal surfaces instead, since there are infinitely many orthogonal curves at any point.

Tutorial questions — Trajectories and orthogonal trajectories

16. Find equations of the trajectories of the following vector fields:
(a) $\mathbf v=(2y,-x)$   (b) $\mathbf v=(3y,x)$   (c) $\mathbf v=(y,2)$   (d) $\mathbf v=(2x,y)$   (e) $\mathbf v=(x^2-1,\,xy)$   (f) $\mathbf v=(\cosh x\cos y,\,\sinh x\sin y)$   (g) $\mathbf v=(x,\,y\ln|y|)$   (h) $\mathbf v=(y,z,-y)$   (i) $\mathbf v=(x-y,\,x+y,\,1)$.

17. (i) Sketch the trajectories of the vector fields in Question 16(a) to (e) roughly, indicating their directions.
(ii) Find the equations of the orthogonal trajectories of the vector fields in Question 16(a) to (e), and sketch them roughly.

18. (a) Sketch the curves $y=\frac12(x+1)+\frac{c}{x-1}$ inside the square $0\le x\le 1$, $0\le y\le 1$, for values of $c$ between $-\frac12$ and $\frac12$.
(b) Suppose $x$ and $y$ denote the relative concentrations of two chemicals in a reactor, so $0\le x\le 1$ and $0\le y\le 1$. The speed of the reaction is given by the vector field $\mathbf v=x(x-1,\,x-y)$. Show that the curves in (a) are the trajectories of this vector field, and indicate their directions on your sketch.
(c) Find the maximum value of $y$ on the trajectory starting at the point $(\frac{12}{13},0)$. If the reaction proceeds until $x=0$, what is the final value of $y$?

Answers

1. (a) $\mathbf v=(4t,\,2t-4,\,3)$, $\mathbf a=(4,2,0)$, (b) component of $\mathbf v$ is $\sqrt{14}$ and of $\mathbf a$ is $-\sqrt{\frac27}$, (c) component of $\mathbf v$ is $\sqrt{29}$ and of $\mathbf a$ is $\frac{12}{\sqrt{29}}$.

4. Differentiation gives $0=\frac{d}{dt}(\mathbf u\cdot\mathbf u)=2\mathbf u\cdot\dot{\mathbf u}$.

5. (i) $\dot z=\dot re^{i\theta}+ire^{i\theta}\dot\theta=\dot r(\cos\theta+i\sin\theta)+r(i\cos\theta-\sin\theta)\dot\theta=(\dot r\cos\theta-r\dot\theta\sin\theta)+i(\dot r\sin\theta+r\dot\theta\cos\theta)$.
(ii) $\mathbf r=(r\cos\theta,\,r\sin\theta)$, so $\dot{\mathbf r}=(\dot r\cos\theta-r\dot\theta\sin\theta,\,\dot r\sin\theta+r\dot\theta\cos\theta)$, which corresponds to $\dot z$.

6. (a) $\mathbf u=\frac{1}{\sqrt3}(\cos t-\sin t,\,\sin t+\cos t,\,1)$, arc length $=\sqrt3(e^\pi-1)$.
(b) $\mathbf u=\frac{1}{\sqrt2}(\operatorname{sech}t,\,\tanh t,\,1)$, arc length $=\frac{7\sqrt2}{12}$.
(c) $\mathbf u=\frac{1}{\sqrt{1+\pi^2}}(1+i\pi)e^{i\pi t}$, arc length $=\sqrt{1+\pi^2}\,(e^2-1)$.
(d) $\mathbf u=\bigl(-\sin\frac{3\theta}{2},\,\cos\frac{3\theta}{2}\bigr)$, arc length $=8$.
The curve in (a) lies on the surface $z=r$, which is a circular cone.

8. (a) $0$ (straight line), (b) $\frac1a$ (circle of radius $a$), (c) $2(1+4t^2)^{-3/2}$, (d) $\operatorname{sech}^2t$, (e) $\frac14\operatorname{cosec}t$, (f) $\dot z=te^{it}$, so $|\dot z|=t$ and $u=e^{it}$; then $\dot u=ie^{it}$, so $|\dot u|=1$ and $\kappa=|\dot u|/|\dot z|=\frac1t$, (g) $\cot t$, (h) $\frac{1}{a\sqrt{b^2+1}}e^{-b\theta}$.

9. (a) $\mathbf r+\kappa^{-1}\mathbf n=(2t+\sin 2t,\,-1+\cos 2t)=\mathbf r(t+\frac\pi2)-(\pi,2)$ (the original curve shifted left by $\pi$ and down by 2), (b) $\mathbf r+\kappa^{-1}\mathbf n=abe^{b\theta}(-\sin\theta,\cos\theta)=be^{-b\pi/2}\,\mathbf r(\theta+\frac\pi2)$, (c) $z+\kappa^{-1}n=(1-it)e^{it}+t(ie^{it})=e^{it}$.

10. (a) $2(1+4x^2)^{-3/2}$, (b) $-\sin x\,(1+\cos^2x)^{-3/2}$, (c) $\operatorname{sech}^2x$, (d) $\cos x$.

11. (a) $|\kappa|=2\cdot5^{5/4}\cdot6^{-3/2}$ at $x=\pm5^{-1/4}$, (b) $|\kappa|=2\cdot3^{-3/2}$ at $x=-\frac12\ln 2$.

12. $y=0$, $y'=\pm m$, $y''=0$ at $x=\pm a$ (i.e. the same values as the lines through those points). The curve $y=-\frac{2am}{\pi}\cos\frac{\pi x}{2a}$ has maximum curvature $\frac{\pi m}{2a}$. The curve $y=-\frac{m}{8a^3}(x^4-6a^2x^2+5a^4)$ has maximum curvature $\frac{3m}{2a}$, which is slightly less than $\frac{\pi m}{2a}$. Lower curvature means easier cornering.

13. (a) $\kappa=\frac{a}{a^2+b^2}$, $\tau=\frac{b}{a^2+b^2}$; osculating plane $\mathbf r\cdot(b\sin t,\,-b\cos t,\,a)=abt$. (b) $\kappa=\tau=2(1+2t^2)^{-2}$.

15. (a) $\kappa=\frac{1}{\sqrt2}\operatorname{sech}s$, $\tau=-\frac{1}{\sqrt2}\operatorname{sech}s$. (b) $\kappa=1$, $\tau=2\operatorname{sech}s$. Note that this curve has constant curvature, but variable torsion.

16. (a) $x^2+2y^2=c$, (b) $x^2-3y^2=c$, (c) $y^2=4x+c$, (d) $y^2=cx$, (e) $x^2\pm\frac{y^2}{c^2}=1$, (f) $\cosh x=c\sin y$, (g) $y=\pm e^{cx}$, (h) $\mathbf r=(P\sin t-Q\cos t+R,\;P\cos t+Q\sin t,\;-P\sin t+Q\cos t)$, (i) $\mathbf r=\bigl(Re^t\cos(t-\alpha),\;Re^t\sin(t-\alpha),\;t+\beta\bigr)$.

17. (a) $y=cx^2$, (b) $x^3y=c$, (c) $y^2=Ae^{-x}$, (d) $2x^2+y^2=c$, (e) $Ax^2=e^{x^2+y^2}$.

Figure 2.2. Trajectories and orthogonal trajectories

18. (c) Maximum value $\frac{8}{13}$. Final value $\frac{72}{169}$.

Calculus Chapter 3

Scalar and vector fields

Scalar fields and quadric surfaces

The previous chapter dealt with vector functions of a scalar variable, which represent parametric equations of a curve. In this chapter we first consider real scalar functions of a real vector variable, which are called scalar fields.
A scalar field in two dimensions, say $z=\phi(x,y)$, gives an explicit definition of a surface, and the curves defined implicitly by the equations $z=$ constant (or $\phi(x,y)=$ constant) are the level curves or horizontal sections or contours of that surface. In three dimensions a scalar field, say $w=\phi(x,y,z)$, cannot be visualized directly (since it needs four dimensions), but the implicit equations $\phi(x,y,z)=$ constant define surfaces, which are called the level surfaces of the field.

The simplest surface in three dimensions is a plane, which can be defined explicitly by $z=px+qy+r$ if it is not vertical, but is best defined completely generally by an implicit equation of the form $ax+by+cz=d$, or $\mathbf r\cdot\mathbf n=d$, where $\mathbf r=(x,y,z)$, as usual, and $\mathbf n=(a,b,c)$ is a constant vector normal to the plane.

The next simplest surfaces are the paraboloids, in which $z$ is an explicit quadratic function of $x$ and $y$. After rotating or shifting the axes, if necessary, we obtain a canonical equation of the form $z=\alpha x^2+\beta y^2$. The surface is an elliptic paraboloid (cup or cap) if $\alpha$ and $\beta$ have the same sign, because then its level curves are ellipses. If $\alpha$ and $\beta$ have opposite signs, then the level curves are hyperbolas, and the surface is a hyperbolic paraboloid or saddle. If one of $\alpha$ and $\beta$ is zero, then the surface is a parabolic cylinder. Paraboloids and parabolic cylinders are explicit examples of what are called quadric surfaces.

Figure 3.1. Implicit quadric surfaces: ellipsoid, elliptic cylinder, hyperboloid of one sheet, cone, hyperboloid of two sheets, hyperbolic cylinder

Other quadric surfaces cannot be defined explicitly, because they involve quadratic terms in all three variables. After a possible rotation and/or shift of all three axes (which will be described in Algebra Chapter 4), they can be represented by an implicit equation of the form
$$\alpha x^2+\beta y^2+\gamma z^2=c.$$
Implicit quadric surfaces are illustrated in Figure 3.1; they can be told apart by considering the families of horizontal or vertical sections obtained by keeping one variable constant. Such sections will be ellipses if the coefficients of the other two variables have the same sign, and hyperbolas if the coefficients have opposite signs. Elliptical sections may not exist for some values of the variable being kept constant, because a sum of two positive quantities must be positive. This makes it possible to distinguish different quadric surfaces from their equations.

If all sections are ellipses, then the surface is an ellipsoid, of which a sphere is the best-known example. If two families of sections are hyperbolas and the other family consists of ellipses, then the surface is in general an elliptical hyperboloid, of which there are two kinds. If there is an elliptical section through the origin (i.e. obtained by putting one variable equal to zero), then the surface is like a cooling tower, and is called a hyperboloid of one sheet. If the section through the origin is a single point, then the surface is an elliptical cone. If there is no elliptical section through the origin, then the surface is in two separate parts, and is called a hyperboloid of two sheets. Finally, if one of the coefficients is zero (i.e. if one variable does not appear in the equation), then the surface is an elliptic or hyperbolic cylinder, depending on the signs of the other two coefficients.

Tutorial questions — Scalar fields and quadric surfaces

1. Describe the vertical and horizontal sections of the following quadric surfaces in three dimensions, and hence identify the surfaces. For elliptical sections, state for which values of the variable they exist.
(a) $x^2+2y^2+3z^2=4$   (b) $x^2+2y^2-3z^2=4$   (c) $x^2+2y^2-3z^2=-4$
(d) $x^2+2y^2-3z^2=0$   (e) $x^2+2y^2+3z=4$   (f) $x^2-2y^2+3z=4$
(g) $x^2-2y^2=4$   (h) $x^2+2y^2=4$   (i) $x^2+2y^2+3z^2=-4$.
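The sign-based rules above can be turned into a small decision procedure for surfaces $\alpha x^2+\beta y^2+\gamma z^2=c$ with $c\neq 0$. A sketch only (the function name is a choice made here; $c=0$, which gives cones, and equations with two missing variables are left unhandled):

```python
def classify_quadric(alpha, beta, gamma, c):
    """Identify alpha*x^2 + beta*y^2 + gamma*z^2 = c for c != 0,
    assuming at most one coefficient is zero."""
    coeffs = [alpha, beta, gamma]
    if c < 0:                      # normalize so the right-hand side is positive
        coeffs = [-a for a in coeffs]
    pos = sum(a > 0 for a in coeffs)
    neg = sum(a < 0 for a in coeffs)
    if pos + neg == 2:             # one variable missing: a cylinder
        if pos == 2:
            return "elliptic cylinder"
        if pos == 1:
            return "hyperbolic cylinder"
        return "no real points"
    if pos == 3:
        return "ellipsoid"
    if pos == 2 and neg == 1:
        return "hyperboloid of one sheet"
    if pos == 1 and neg == 2:
        return "hyperboloid of two sheets"
    return "no real points"

# Some of the implicit surfaces of Question 1:
print(classify_quadric(1, 2, 3, 4))    # ellipsoid                  (a)
print(classify_quadric(1, 2, -3, 4))   # hyperboloid of one sheet   (b)
print(classify_quadric(1, 2, -3, -4))  # hyperboloid of two sheets  (c)
print(classify_quadric(1, -2, 0, 4))   # hyperbolic cylinder        (g)
print(classify_quadric(1, 2, 3, -4))   # no real points             (i)
```

Each branch corresponds directly to one of the section arguments in the text: three positive coefficients give ellipses in every direction, one negative sign gives an elliptical section through the origin, and two negative signs remove it.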
Directional derivatives

A partial derivative represents the rate of change of a scalar field when all except one of the independent variables are kept constant. For an explicitly defined surface z = z(x, y), the partial derivatives ∂z/∂x and ∂z/∂y can also be thought of as the slopes of vertical sections parallel to the x axis and y axis respectively. We now consider the problem of determining the rate of change of a scalar field in an arbitrary direction at a given point, not only parallel to an axis. Suppose φ = φ(r), and suppose u is a unit vector. The directional derivative of φ at the point r in the direction of u is denoted dφ/ds, and defined by

dφ/ds = lim(∆s→0) [φ(r + u∆s) − φ(r)]/∆s.

Since u is a unit vector, it follows that the directional derivative measures the rate of change of φ per unit change in displacement in the direction parallel to u. Thus the independent variable s denotes arc length, and one reason for using a total derivative sign is that (in the given direction) φ can be thought of as depending on the single variable s only. However, the notation ∂φ/∂s is sometimes used, as is ∂φ/∂u, which has the advantage of indicating the direction, but is misleading because the differentiation is not with respect to u. Directional derivatives in the directions of the standard unit vectors i, j, and k are the familiar partial derivatives ∂φ/∂x, ∂φ/∂y, and ∂φ/∂z.

Theorem. At any point the directional derivative of φ in the direction of a unit vector u is given by

dφ/ds = (∂φ/∂x, ∂φ/∂y) · u  or  dφ/ds = (∂φ/∂x, ∂φ/∂y, ∂φ/∂z) · u,

depending on whether the scalar field φ is defined in two or three dimensions.

Proof. Suppose s denotes arc length along any curve r = r(s) passing through the required point in the direction of u. Then u = dr/ds, since both sides are unit vectors in the same direction.
By the chain rule for partial differentiation, we have

dφ/ds = (∂φ/∂x)(dx/ds) + (∂φ/∂y)(dy/ds) + (∂φ/∂z)(dz/ds)
      = (∂φ/∂x, ∂φ/∂y, ∂φ/∂z) · (dx/ds, dy/ds, dz/ds)
      = (∂φ/∂x, ∂φ/∂y, ∂φ/∂z) · dr/ds
      = (∂φ/∂x, ∂φ/∂y, ∂φ/∂z) · u,

which is the required result for a scalar field in three dimensions. The proof in two dimensions is obtained by omitting all terms involving z.

Figure 3.2. Surface with path and its horizontal projection

In two dimensions, the geometrical interpretation is as follows. A scalar field φ(x, y), with two independent variables, can be visualized as an explicit surface z = φ(x, y). Imagine a smooth path x = x(s), y = y(s), z = 0, in the horizontal plane, as shown in Figure 3.2, where s denotes arc length in the plane. Vertically above this horizontal path lies a path on the surface, which is given by x = x(s), y = y(s), z = φ(x(s), y(s)), since z = φ(x, y). The directional derivative dφ/ds, or dz/ds, then measures the slope or steepness of the path on the surface. For the simplest visualization, imagine a straight line in the plane: at each point on the line, dφ/ds is the slope of the vertical section of the surface parallel to that line.

We cannot visualize a three-dimensional scalar field w = φ(x, y, z), but it can be thought of as a variable, say temperature, existing at all points within a certain region in space. At any point, and in any direction, dw/ds (or dφ/ds) represents the rate of change of temperature if you move from the given point in the given direction.

The vector (∂φ/∂x, ∂φ/∂y) or (∂φ/∂x, ∂φ/∂y, ∂φ/∂z) is called the gradient of φ and is denoted grad φ. Thus

grad φ = (∂φ/∂x, ∂φ/∂y)  or  (∂φ/∂x, ∂φ/∂y, ∂φ/∂z).

The formula for the directional derivative can now be written

dφ/ds = grad φ · u  or  dφ/ds = |grad φ| cos α,

where u is the unit vector in the required direction and α is the angle between u and grad φ. It follows that dφ/ds is the component of grad φ in the direction of u.
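The theorem reduces a directional derivative to a dot product, which is easy to check with a computer algebra system. A minimal sketch (the field φ = x²y, the point, and the direction are our own choices, not examples from the text):

```python
import sympy as sp

x, y = sp.symbols('x y')
phi = x**2 * y                                   # example scalar field
grad_phi = (sp.diff(phi, x), sp.diff(phi, y))    # (2xy, x^2)

# Directional derivative at (1, 2) in the direction of (3, 4):
u = (sp.Rational(3, 5), sp.Rational(4, 5))       # unit vector
dphi_ds = sum(g * ui for g, ui in zip(grad_phi, u))
dd_at_point = dphi_ds.subs({x: 1, y: 2})
# grad phi = (4, 1) there, so dphi/ds = (4, 1) . (3/5, 4/5) = 16/5
```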
Corollary 1. At any given point, the directional derivative dφ/ds has a maximum value equal to |grad φ|, which is obtained by taking u in the direction of grad φ.

Proof. At a given point, the only variable is the direction, which is determined by the angle α between u and grad φ. The maximum value of cos α is 1, when α = 0. Since dφ/ds = |grad φ| cos α, it follows that the maximum value is |grad φ|, which occurs when α = 0, i.e., when u is in the direction of grad φ.

In particular, the steepest slope at the point (x, y, z) on an explicit surface z = z(x, y) is

|grad z| = |(∂z/∂x, ∂z/∂y)| = √((∂z/∂x)² + (∂z/∂y)²),

and it occurs in the direction of the vector grad z = (∂z/∂x, ∂z/∂y).

Corollary 2. (i) The vector grad φ is normal to an implicitly defined curve or surface φ = constant. (ii) (∂z/∂x, ∂z/∂y, −1) is a normal vector to the explicit surface z = z(x, y) at the point (x, y, z).

Proof. (i) Suppose u is a unit tangent vector (in any direction) at a point on the curve or surface φ = constant; then at that point dφ/ds = 0 in the direction of u, because φ is constant on the curve or surface. Thus 0 = dφ/ds = grad φ · u, i.e. u is perpendicular to grad φ. This means that grad φ is perpendicular to every tangent at that point, so grad φ is normal to the curve or surface.

(ii) If we define φ(x, y, z) = z(x, y) − z, then the surface z = z(x, y) can be written as φ = 0, and by part (i) a normal is grad φ, which is equal to (∂z/∂x, ∂z/∂y, −1). (The letter z is being used in two senses in part (ii): as a function of x and y, and as the third variable. This abuse of notation is convenient, but can be confusing if you don't think clearly. An alternative proof for part (ii) is given as a tutorial question below.)

Knowing the normal to a surface at a point enables us to determine the tangent plane to the surface at that point, i.e., the plane that passes through the point and has the same normal as the surface there.
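Corollary 2(ii) gives an immediate recipe for tangent planes: evaluate the normal (∂z/∂x, ∂z/∂y, −1) at the point and use it in n · (r − r₀) = 0. A hedged sketch with a surface of our own choosing (z = x² + y², not an example from the text):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
zsurf = x**2 + y**2                      # example explicit surface

# Normal vector (dz/dx, dz/dy, -1) from Corollary 2(ii):
n = (sp.diff(zsurf, x), sp.diff(zsurf, y), sp.Integer(-1))

x0, y0 = 1, 1
z0 = zsurf.subs({x: x0, y: y0})          # point (1, 1, 2) on the surface
n0 = tuple(c.subs({x: x0, y: y0}) for c in n)   # (2, 2, -1)

# Tangent plane n0 . (r - r0) = 0, expanded:
plane = sp.expand(n0[0]*(x - x0) + n0[1]*(y - y0) + n0[2]*(z - z0))
# plane == 2x + 2y - z - 2, i.e. the tangent plane is 2x + 2y - z = 2
```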
At points where grad φ = 0 the surface may not be smooth, and the tangent plane may not exist.

Tutorial questions — Directional derivatives

2. If z = cos(π(2x² − y²)), find the directional derivatives of z at the point (½, ½) in the directions of the vectors (a) (1, 0) (b) (0, 1) (c) (1, 1) (d) (−1, 1) (e) (3, −4). How could the answers to (a) and (b) be found without using directional derivatives?

3. Let z = x⁴ − 3x³y + x²y². (i) Find the directional derivative of z at the point where x = 2 and y = 1 in the direction of the tangent to the curve r = (t² + 1, t³). (ii) In what direction is z increasing at the maximum rate at the point where x = 1 and y = −2, and what is the maximum rate of increase?

4. Consider the point r = (x, y, z) on a general explicit surface z = z(x, y).
(i) Show that the vectors (0, 1, ∂z/∂y) and (1, 0, ∂z/∂x) are tangent to the surface at this point. (Hint: by holding either x or y constant we obtain vertical sections of the surface, and tangent vectors to these curves are found by partially differentiating r with respect to the other variable.) Use cross products to prove that (∂z/∂x, ∂z/∂y, −1) is a normal to the surface.
(ii) If the vector (u, v, w) is tangential to the surface at this point, show that w = u ∂z/∂x + v ∂z/∂y. (Hint: tangent is perpendicular to normal.)
(iii)* Show that a steepest tangent vector to the surface is parallel to (∂z/∂x, ∂z/∂y, |grad z|²). (Hint: the steepest tangent vector (u, v, w) occurs when (u, v) ∥ grad z.)

5. Prove that grad(r · n) = n for any constant vector n. (Hint: put n = (a, b, c) and first simplify r · n.) Deduce that n is normal to the surface with implicit equation r · n = d (constant). Interpret this result geometrically. (Hint: what surface does the equation represent, and what is a normal to this surface?)

6. Find the directional derivative of φ(x, y, z) = 4x² + y² − z² at a general point r in the direction of a given unit vector u = (u₁, u₂, u₃).
Sketch the surface φ(r) = 4. Evaluate grad φ at the points (1, 0, 0), (1, 2, 2), and (0, √5, −1). Mark these points on your sketch, and confirm by inspection in each case that grad φ is perpendicular to the surface. Write down an equation of the tangent plane at the point (1, 2, 2). Sketch the surface φ(r) = 0, and verify that grad φ = 0 at the origin. Is the surface smooth at that point?

7. Find an equation of the tangent plane to the surface in Question 3 at the point where x = 1 and y = −2. (Hint: don't forget z.)

The operator del and vector fields

The gradient of a scalar field can be conveniently expressed by introducing a new symbol, which also serves to define other important operators. We define the operator ∇, called del or nabla, by

∇ = (∂/∂x, ∂/∂y, ∂/∂z).

Then, symbolically, ∇φ = (∂φ/∂x, ∂φ/∂y, ∂φ/∂z), which is just grad φ. From the rules for differentiation it is easy to verify that ∇ is a linear operator, i.e., that ∇(cφ) = c(∇φ) and ∇(φ + ψ) = ∇φ + ∇ψ, for any constant c and any scalar functions φ and ψ. Since φ is a scalar function, but ∇φ is a vector function, it follows that ∇ acts on a scalar field to give a vector field. Vector fields were introduced in the section on trajectories in Calculus Chapter 2; the most important problem, which we shall deal with in the next section, is the inverse problem for ∇: i.e., to determine whether a given vector field can be written in the form ∇φ, and, if so, to find φ.

The symbol ∇ can be used to define two more operators, by using the notation of dot and cross products of vectors. Firstly, the divergence of a vector field v = (v₁, v₂, v₃) is written div v and is defined by

div(v₁, v₂, v₃) = ∂v₁/∂x + ∂v₂/∂y + ∂v₃/∂z = (∂/∂x, ∂/∂y, ∂/∂z) · (v₁, v₂, v₃),

so div v = ∇ · v, if the dot is interpreted symbolically as in dot products. (For two-dimensional functions we simply omit the third components in ∇ and v.)
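The symbolic definition translates directly into a computation. The following sketch checks div v for the field of Question 9(d) below (the helper function itself is our own code, not from the text):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def div(v):
    """Divergence of a 3-D vector field given as a tuple of expressions."""
    return sum(sp.diff(vi, w) for vi, w in zip(v, (x, y, z)))

v = (x - y, y - z, z - x)     # the field of Question 9(d)
div_v = sp.simplify(div(v))   # 1 + 1 + 1 = 3
```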
The divergence of a vector field is a scalar quantity, and, roughly speaking, measures the rate at which the trajectories are separating or diverging from one another. It can be shown by conservation of mass that the divergence of a velocity field can be non-zero only if the density is changing or if matter is being added to or taken away from the system. We shall explain divergence more precisely in Calculus Chapter 4.

Secondly, the curl of a vector field v = (v₁, v₂, v₃) is written curl v and is defined by

curl(v₁, v₂, v₃) = (∂v₃/∂y − ∂v₂/∂z, ∂v₁/∂z − ∂v₃/∂x, ∂v₂/∂x − ∂v₁/∂y)
                 = det [ i, j, k ; ∂/∂x, ∂/∂y, ∂/∂z ; v₁, v₂, v₃ ],

so curl v = ∇ × v, where the cross is interpreted symbolically as in cross products. Beware, however, that the vector function curl v is not necessarily perpendicular to the vector function v, because ∇ × v is only symbolically a cross product. If we have a two-dimensional field v = v(x, y), then we can put v₃ = 0, and observe that partial derivatives with respect to z are also zero, so the formula simplifies to

curl(v₁, v₂) = (0, 0, ∂v₂/∂x − ∂v₁/∂y).

For a velocity field v it will be shown later that the component of curl v in any direction is twice the angular velocity of the field in the plane perpendicular to that direction, so curl measures the rotation of the field. The curl of a force field will be shown to be related to the work done per unit area when traversing small closed paths.

Finally, since grad φ is a vector function, the operators curl and div can be applied to it. The curl of the gradient is always zero, since

curl(grad φ) = curl(∂φ/∂x, ∂φ/∂y, ∂φ/∂z)
             = (∂²φ/∂y∂z − ∂²φ/∂z∂y, ∂²φ/∂z∂x − ∂²φ/∂x∂z, ∂²φ/∂x∂y − ∂²φ/∂y∂x),

which is equal to the zero vector, since the mixed partial derivatives in each component are equal to each other, for all functions we meet.
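The identity curl(grad φ) = 0 can be verified symbolically. The sketch below does so for φ = x²y eᶻ from Question 11 below (the `curl` helper is our own code, written from the determinant formula):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def curl(v):
    """Curl of a 3-D vector field, from the determinant formula."""
    v1, v2, v3 = v
    return (sp.diff(v3, y) - sp.diff(v2, z),
            sp.diff(v1, z) - sp.diff(v3, x),
            sp.diff(v2, x) - sp.diff(v1, y))

phi = x**2 * y * sp.exp(z)                  # phi from Question 11
grad_phi = tuple(sp.diff(phi, w) for w in (x, y, z))
cg = tuple(sp.simplify(c) for c in curl(grad_phi))
# cg == (0, 0, 0): the mixed partial derivatives cancel in each component
```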
The divergence of the gradient is not necessarily zero; we have

div(grad φ) = div(∂φ/∂x, ∂φ/∂y, ∂φ/∂z) = ∂/∂x(∂φ/∂x) + ∂/∂y(∂φ/∂y) + ∂/∂z(∂φ/∂z) = ∂²φ/∂x² + ∂²φ/∂y² + ∂²φ/∂z².

The operator div grad is called Laplace's operator; it acts on a scalar function to give a scalar function, and div(grad φ) = ∇ · (∇φ) = ∇²φ, say, so symbolically we write ∇² for ∇ · ∇. Thus we have

∇²φ = ∇ · (∇φ) = ∂²φ/∂x² + ∂²φ/∂y² + ∂²φ/∂z²  or  ∂²φ/∂x² + ∂²φ/∂y².

Laplace's operator arises in problems of heat transfer, diffusion, or vibration in two or three dimensions, where it can be shown that ∇²φ is proportional to ∂φ/∂t or ∂²φ/∂t². For steady state (i.e., time-independent) situations we have ∂φ/∂t = 0, from which it follows that ∇²φ = 0, which is called Laplace's equation. A function φ satisfying Laplace's equation is said to be harmonic, so harmonic functions form the null space of ∇². They will reappear frequently.

From the rules for differentiation it is easy to show that the above operators are all linear; their properties can be summarized in the following table.

Operator    symbol   acts on   to give
grad        ∇        scalar    vector
div         ∇·       vector    scalar
curl        ∇×       vector    vector
Laplace's   ∇²       scalar    scalar

Tutorial questions — The operator del and vector fields

8. Find ∇φ for the scalar functions φ below:
(a) ½ ln(x² + y²)  (b) e^(xy) + sin(x + y)  (c) xyz + e^(x+y+z)  (d) x² + y² + z².

9. Find ∇ · v for each of the vector fields v below:
(a) (cosh x cos y, −sinh x sin y)  (b) eˣ(cos y, sin y)  (c) (yzeˣ, xzeʸ, xyeᶻ)  (d) (x − y, y − z, z − x).

10. Find ∇ × v for each of the vector fields v below:
(a) (x, y, z)  (b) (y, z, −x)  (c) (y, z, −y)  (d) (xy, yz, zx)  (e) (y² sin z, 2xy sin z, xy² cos z + z)  (f) (x − y, x + y, 1).
Determine (curl v) · v in each case, to see whether curl v is perpendicular to v.

11. Verify that curl grad φ = 0 for φ = 2xy − xy² and φ = x²yeᶻ.

* 12.
It was stated above that the mixed second partial derivatives of reasonable functions are equal. Here is an example of a function for which they differ at one point. Let a and b be constants, and define z = (ax + by)⁴/(x² + y²) when (x, y) ≠ (0, 0) and z = 0 when (x, y) = (0, 0).
(i) Show that z = b⁴y² when x = 0, for all y including y = 0. Deduce that ∂z/∂y = 0 when (x, y) = (0, 0).
(ii) Show by the rules for differentiation that ∂z/∂y = 2(ax + by)³(2bx² + by² − axy)/(x² + y²)², so that ∂z/∂y = 4a³bx when y = 0 and x ≠ 0. By part (i) this formula holds when x = 0 also. Deduce that ∂/∂x(∂z/∂y) = 4a³b when y = 0 for all x, in particular at the point (x, y) = (0, 0).
(iii) By symmetry, interchanging x and y, it follows that ∂/∂y(∂z/∂x) = 4ab³ at the point (x, y) = (0, 0). Thus at the origin ∂/∂x(∂z/∂y) ≠ ∂/∂y(∂z/∂x) unless ab = 0 or a = ±b.

13. Find ∇²φ for the scalar functions φ in Question 8, and state which functions are harmonic.

14. Show by the product rule for differentiation that ∇(φψ) = φ(∇ψ) + (∇φ)ψ for scalar functions φ and ψ, and deduce that ∇²(φψ) = φ∇²ψ + 2∇φ · ∇ψ + ψ∇²φ.

Potential functions

If a vector field v is such that there exists a scalar function φ satisfying grad φ = v, then we say that v is a conservative field or a gradient field, and we call φ a potential function for the vector field v. Thus a conservative field v is one for which the inverse problem for ∇ can be solved. Since ∇ is a linear operator, and the gradient of a constant is 0, it follows that an arbitrary constant can be added to a potential function, as with an indefinite integral. In fact, finding a potential function for a one-dimensional vector field is simply integrating it. The simplest examples of conservative fields are force fields (the word "conservative" comes from conservation of energy).
For a force field the potential function φ is the potential energy, i.e., the work done by the field in moving a particle to the point in question. The starting point (zero potential) from which the particle is moved is arbitrary, which explains the arbitrary constant in φ.

Theorem. A vector field v is conservative if and only if curl v = 0.

Proof. If v is conservative, then v = grad φ, so curl v = curl(grad φ), which is the zero vector, as shown in the previous section. We shall prove the converse once we have studied integration in vector fields.

Once we know from the above test that a field v is conservative, it is straightforward to find a potential function φ for v: if grad φ = v, then (∂φ/∂x, ∂φ/∂y, ∂φ/∂z) = (v₁, v₂, v₃), say, so

∂φ/∂x = v₁,  ∂φ/∂y = v₂,  ∂φ/∂z = v₃.

We can write each of these equations in integral form, remembering that each integral has an arbitrary "constant" that may involve the variables that are being temporarily kept constant, and we obtain

φ(x, y, z) = ∫v₁ dx + α(y, z) = ∫v₂ dy + β(z, x) = ∫v₃ dz + γ(x, y).

By using the three equations successively, in either derivative or integral form, it is possible to obtain a single expression for φ that involves only an arbitrary (genuine) constant. For a two-dimensional conservative field, finding a potential function is similar to solving an exact differential equation, and the test for conservativeness is essentially the same test as for exactness.

If v is a conservative vector field with potential function φ, then the curves or surfaces with equation φ(x, y, z) = constant are called equipotentials. The equipotentials are orthogonal trajectories or surfaces to the field, because v = grad φ and we have shown earlier in this chapter that grad φ is a normal to any curve or surface with equation φ = constant.

Tutorial questions — Potential functions

15.
Test whether each of the following vector fields is conservative (by finding its curl), and then, where possible, find a potential function.
(a) (2xy, x² + y²)  (b) (y, −x + √(xy))  (c) (xy − x⁻¹, x² + 1)  (d) (sin 2y, 2x cos 2y).

16. (i) Find the orthogonal trajectories of the field in Question 15(a), and verify that the equipotentials coincide with the orthogonal trajectories. (ii) Find an equation for the orthogonal trajectories of the field in Question 15(b). Write the equation in the form φ(x, y) = constant, and show that ∇φ is parallel to the original field. (Thus you have found a conservative field with the same trajectories, which is the same as finding an integrating factor for the differential equation satisfied by the orthogonal trajectories.)

17. (i) Show that the differential equation M(x, y) dx + N(x, y) dy = 0 is exact if and only if the vector field (M(x, y), N(x, y), 0) is conservative. (ii) Assuming that the differential equation in (i) is exact, show that solving it is the same procedure as finding the equipotentials of the vector field.

18. Find a potential function for each field in Question 10 for which ∇ × v = 0.

19. Find the potential function φ for the conservative field (x, y, −z) such that φ is zero at the origin. Identify and sketch roughly, on the same axes, the equipotential surfaces φ = 0 (passing through the origin), φ = 1, and φ = −1.

20. (To illustrate a possible danger in the amalgamation method, because of identities between functions.) Show that the field v = (−y, x)/(x² + y²) is conservative, and find the potential function that is zero on the positive x axis.

Stationary points and optimization

We have previously discussed stationary points on a curve y = y(x), i.e., points at which the derivative y′ is zero.
If y″ ≠ 0, then its sign can be used to distinguish whether the point is a maximum or minimum, but if y″ = 0, then there is no conclusion without considering nearby points: the point might be a point of inflexion, or a maximum or minimum. We now wish to extend the classification of stationary points to surfaces z = z(x, y), which will also allow us to maximize or minimize real functions of two real variables.

Firstly, to locate the stationary points we find the points where the tangent plane is horizontal, i.e., where ∂z/∂x = 0 and ∂z/∂y = 0 simultaneously. Secondly, the nature of a stationary point is determined by considering a second approximation to the surface near the point. This means that the surface is approximated by an explicit quadric surface. If the approximation is an elliptic paraboloid (cap or cup), then the point is a proper maximum or proper minimum, and if the approximation is a hyperbolic paraboloid, then the point is a saddle point. If, however, the approximation is a parabolic cylinder, then there is in general no conclusion about the surface itself, without considering nearby points.

Figure 3.3. Second approximation to a point on a surface

We therefore need to obtain a second approximation to z(x + ∆x, y + ∆y) (point C in Figure 3.3) in terms of z(x, y) (point A) and appropriate first and second partial derivatives. We know that, to a second approximation,

f(x + ∆x) = f(x) + f′(x)∆x + ½f″(x)∆x² + ···.

If we apply this result in the vertical section z = z(x, y + ∆y), where y + ∆y is constant, then we obtain

z(x + ∆x, y + ∆y) = z(x, y + ∆y) + zx(x, y + ∆y)∆x + ½zxx(x, y + ∆y)∆x² + ···,   (1)

where the subscripts denote partial derivatives. This approximates the z value at C in terms of z and its derivatives evaluated at B.
Similarly, working in the vertical section x = constant, we can approximate each of the functions on the right-hand side of (1) (which are evaluated at B) in terms of values at A, using y as the variable. We obtain

z(x, y + ∆y) = z(x, y) + zy(x, y)∆y + ½zyy(x, y)∆y² + ···,   (2)
zx(x, y + ∆y) = zx(x, y) + zxy(x, y)∆y + ···,   (3)
zxx(x, y + ∆y) = zxx(x, y) + ···.   (4)

Now substitute (2), (3), and (4) in (1) to get

z(x + ∆x, y + ∆y) = {z(x, y) + zy(x, y)∆y + ½zyy(x, y)∆y² + ···}   (from (2))
 + {zx(x, y) + zxy(x, y)∆y + ···}∆x   (from (3))
 + ½{zxx(x, y) + ···}∆x²   (from (4)),

i.e.,

z + ∆z = z + zx∆x + zy∆y + ½{zxx∆x² + 2zxy∆x∆y + zyy∆y²} + ···.   (5)

This result is true at any point where the surface is smooth; if we subtract z from both sides and remember that zx = 0 and zy = 0 at a stationary point, then we obtain

∆z ≈ ½{zxx∆x² + 2zxy∆x∆y + zyy∆y²},

which shows clearly how the second approximation to ∆z near a stationary point is a quadratic function of ∆x and ∆y. We can now classify the stationary point by determining the sign of ∆z in all possible vertical sections through the point. If ∆z is positive in all directions, then the point is a proper minimum; if ∆z is negative in all directions, then the point is a proper maximum; if ∆z is positive in some directions and negative in others, then the point is a saddle point.

By completing the square in the approximation for ∆z, and writing λ = ∆x/∆y, we obtain

∆z ≈ (∆y²/2zxx){(zxxλ + zxy)² + zxxzyy − zxy²}.

If zxxzyy − zxy² > 0, then the expression in curly brackets is always positive, so in every direction ∆z has the same sign as zxx, giving a proper maximum or minimum. If zxxzyy − zxy² < 0, then the expression in brackets is positive in some directions and negative in others, so ∆z can be both positive and negative, giving a saddle.
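The discriminant test just derived can be packaged as a routine. A sketch (the `classify` helper is our own, not from the text), applied to z = x² + cos y from Question 22(a) below:

```python
import sympy as sp

x, y = sp.symbols('x y')

def classify(zexpr, x0, y0):
    """Second-derivative test at a stationary point (x0, y0) of z(x, y)."""
    p = {x: x0, y: y0}
    zxx = sp.diff(zexpr, x, 2).subs(p)
    zyy = sp.diff(zexpr, y, 2).subs(p)
    zxy = sp.diff(zexpr, x, y).subs(p)
    D = zxx * zyy - zxy**2
    if D < 0:
        return "saddle"
    if D > 0:
        return "proper minimum" if zxx > 0 else "proper maximum"
    return "no conclusion"          # D == 0: must examine nearby points

zsurf = x**2 + sp.cos(y)
# Stationary points are at x = 0, y = k*pi; the test separates the families:
kind_at_0 = classify(zsurf, 0, 0)        # zxx = 2, zyy = -1: saddle
kind_at_pi = classify(zsurf, 0, sp.pi)   # zxx = 2, zyy = 1: proper minimum
```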
(The expression for ∆z is valid only if zxx ≠ 0, but there is a similar expression if zyy ≠ 0, and the situation where both are zero is easy to check.) The conclusions can be summarized in a table:

zxxzyy − zxy² < 0: Saddle.
zxxzyy − zxy² > 0: Proper minimum if zxx > 0 or zyy > 0; proper maximum if zxx < 0 or zyy < 0.
zxxzyy − zxy² = 0: See below.

If zxxzyy − zxy² = 0 at an isolated stationary point, then the second approximation to ∆z has the same sign as zxx in all directions except one, in which it is zero. This suggests that the surface is like a ridge or a trough. Unfortunately, third order effects may alter the surface near the stationary point, and the conclusion may not be valid. However, if there is a whole curve of stationary points along which zxxzyy − zxy² = 0, then we can indeed conclude that there is an improper maximum or minimum, roughly like a parabolic cylinder, at all points on the curve. The sign of zxx or zyy can be used, as above, to tell whether it is an improper maximum or minimum.

Tutorial questions — Stationary points and optimization

* 21. Show that if you approximate z at the point C in Figure 3.3 by using firstly the vertical section x + ∆x constant and secondly the vertical section y constant, then the end result (equation (5)) is the same. (Hint: interchange x and y in equations (1)–(4). It is essential that zxy and zyx be equal.)

22. Locate and classify the stationary points on the following surfaces, remembering that x, y, and z are real:
(a) z = x² + cos y  (b) z = x³ − 3x + cos y  (c) z = (x³ − x + 6)(y³ − y)
(d) z = (x³ − x)(y² + 1)  (e) z = (x³ − x + 6)(y² + 1)  (f) z = cos x cos y.

23. Find the maximum volume of a cuboid (i.e., box with rectangular faces) that will fit with its base on the (x, y) plane and its four upper corners touching the elliptic paraboloid z = 1 − x² − 2y². (Hint: the base vertices are at the points (±x, ±y).) Check that the volume is a proper maximum.

24.
An open tin box with five rectangular faces (i.e. no lid) is to have volume 32 cubic units. What are its dimensions if the minimum area of sheet metal is used? Check that the area is a proper minimum.

25. Show that the surface z = x² − 2xy + y² has an improper minimum at all points along the line y = x. Identify the surface, and sketch it roughly.

26. Classify, where possible, the stationary points on the following surfaces, remembering that x, y, and z are real:
(a) z = x²y² − 2xy  (b) z = x/y + y/x  (c) z = x²y² − x².

* 27. Solve the equation z = 0 if z = y(y² − x⁴). Hence sketch the horizontal section z = 0 on the surface z = y(y² − x⁴) and determine the sign of z in the regions between the portions of the contour. Show that z changes sign six times as you circle the origin. Deduce that the stationary point at the origin is not one of the standard types. (This is because it is an isolated stationary point at which zxxzyy − zxy² = 0. Verify this fact.)

28. Verify by means of a sketch that a torus (e.g. a motor car inner tube or doughnut or tenniquoit ring), when held vertically, has a proper maximum, a proper minimum, and two saddles. Describe the stationary points when it is held horizontally.

29. If P and Q are arbitrary points on the lines r = (1 + 2t, 1 + t, t) and r = (3 + u, 2 − u, 4 + 2u) respectively, express |PQ|² in terms of t and u. Find the minimum value of |PQ|² and verify that the minimum occurs when the vector PQ is perpendicular to (the direction vectors of) both lines.

* 30. After an experiment with n observations, it is required to draw the best line y = mx + c through the n observed points (x₁, y₁), . . . , (xₙ, yₙ) in the plane. Note that the vertical difference or error between the line y = mx + c and the point (xᵢ, yᵢ) is mxᵢ + c − yᵢ. Let S(m, c) = Σᵢ₌₁ⁿ (mxᵢ + c − yᵢ)², the sum of the squared errors. Find expressions for m and c that will minimize S.
(Hint: note that S ≥ 0 and S → ∞ as m → ±∞ or c → ±∞, so a proper minimum is the only possibility.) This is called the method of Least Squares, and was invented by Gauss.

* 31. It is possible to optimize a function f(r) over all points that satisfy the equation φ(r) = 0 (i.e. subject to some restriction on the points r) by optimizing the function f(r) + λφ(r), where λ is a dummy variable. This is called the method of Lagrange multipliers. Use Lagrange multipliers to find the point(s) on the curve x² + 16xy − 11y² = 100 that are closest to the origin. (Hint: let f(x, y) = x² + y² (the square of the distance from the origin) and let φ = x² + 16xy − 11y² − 100. Remember that the partial derivative with respect to λ must also be zero.) Notice that it is not easy to solve for y and express the distance in terms of x only, otherwise the problem could be solved by elementary calculus.

Answers

1. (a) Ellipsoid (|x| < 2, |y| < √2, |z| < 2/√3), (b) hyperboloid of one sheet (elliptical sections for all values of z), (c) hyperboloid of two sheets (elliptical sections for |z| > 2/√3), (d) elliptic cone (elliptical sections for z ≠ 0), (e) elliptic paraboloid (cap) (elliptical sections for z < 4/3), (f) hyperbolic paraboloid (saddle), (g) hyperbolic cylinder, (h) elliptic cylinder (elliptical sections for all values of z), (i) surface does not exist (sum of squares cannot be negative).

2. (a) −√2 π = ∂z/∂x, (b) π/√2 = ∂z/∂y, (c) −π/2, (d) 3π/2, (e) −√2 π.

3. (i) t = 1, u = (2, 3)/√13, and dz/ds = −48/√13. (ii) Direction (30, −7), maximum rate √949.

4. (ii) (u, v, w) · (∂z/∂x, ∂z/∂y, −1) = 0. (iii)* (u, v) ∥ grad z, so u = k ∂z/∂x and v = k ∂z/∂y for some k. By part (ii) it follows that w = k(∂z/∂x)² + k(∂z/∂y)² = k|grad z|².

5. r · n = d (or ax + by + cz = d) represents a plane, and n (or (a, b, c)) is a normal to the plane.

6. dφ/ds = 8xu₁ + 2yu₂ − 2zu₃. Hyperboloid of one sheet. grad φ = (8, 0, 0), (8, 4, −4), (0, 2√5, 2). Tangent plane 2x + y − z = 2.
Surface φ = 0 is a double elliptic cone, and the origin is the vertex, where it is not smooth.

7. z = 11; point on surface is (1, −2, 11); plane is 30x − 7y − z = 33.

8. (a) (x, y)/(x² + y²), (b) (ye^(xy) + cos(x + y), xe^(xy) + cos(x + y)), (c) (yz + e^(x+y+z), xz + e^(x+y+z), xy + e^(x+y+z)), (d) 2(x, y, z).

9. (a) 0, (b) 2eˣ cos y, (c) yzeˣ + xzeʸ + xyeᶻ, (d) 3.

10. (a) 0, (b) (−1, 1, −1), (c) (−2, 0, −1), (d) (−y, −z, −x), (e) 0, (f) (0, 0, 2). Dot product curl v · v = 0 only when curl v = 0.

13. (a) 0, harmonic, (b) (x² + y²)e^(xy) − 2 sin(x + y), (c) 3e^(x+y+z), (d) 6.

15. (a) x²y + ⅓y³ + c, (b) none, (c) none, (d) x sin 2y + c.

16. (i) Put x = uy; solution y(x² + ⅓y²) = c. (ii) Put x = uy; solution φ(x, y) = ln y + 2√(x/y) = c. ∇φ = x^(−1/2)y^(−3/2)(y, −x + √(xy)).

17. ∇ × (M, N, 0) = (0, 0, ∂N/∂x − ∂M/∂y), so the curl is zero if and only if ∂N/∂x = ∂M/∂y, which is the test for exactness.

18. (a) ½(x² + y² + z²) + c, (e) xy² sin z + ½z² + c.

19. φ = ½(x² + y² − z²). Equipotential surfaces are hyperboloid of two sheets inside cone inside hyperboloid of one sheet.

20. arctan(y/x) (i.e., the polar angle). NB arctan(y/x) = π/2 − arctan(x/y).

22. (a) Saddles at (0, 2kπ), proper minima at (0, (2k+1)π). (b) Saddles at (1, 2kπ) and at (−1, (2k+1)π), proper maxima at (−1, 2kπ), proper minima at (1, (2k+1)π). (c) Proper maximum at (−1/√3, −1/√3), proper minimum at (−1/√3, 1/√3), saddles at (1/√3, 1/√3), (1/√3, −1/√3), (−2, ±1), (−2, 0). (d) Saddles at (±1/√3, 0). (e) Saddle at (−1/√3, 0), proper minimum at (1/√3, 0). (f) Saddles at (π/2 + kπ, π/2 + mπ). Proper stationary points at (nπ, lπ): maxima if l + n is even and minima if l + n is odd.

23. V = 4xyz = 4xy(1 − x² − 2y²). Maximum when x = ½, y = 1/(2√2), V = 1/(2√2).

24. xyz = 32 and A = xy + 2yz + 2zx = xy + 2(x + y)·32/(xy). Dimensions for minimum area: 4 × 4 × 2.

25. z = (x − y)², parabolic cylinder. See sketch below.

Figure 3.4.
Sketches for Questions 25 and 27

26. (a) Saddle at (0, 0). Improper minimum along xy = 1. (b) Improper minimum along x = y. Improper maximum along x = −y. (c) Improper stationary points along x = 0; maxima if −1 < y < 1 and minima if |y| > 1.

27. Level curves z = 0 have equations y = 0 or y = ±x². See sketch above.

28. Improper maximum along upper surface and improper minimum along lower surface.

29. |PQ| = √3 when PQ = (−1, 1, 1).

30. ∂S/∂c = 2Σ(mxᵢ + c − yᵢ) = 0, so mΣxᵢ + cn = Σyᵢ. ∂S/∂m = 2Σxᵢ(mxᵢ + c − yᵢ) = 0, so mΣxᵢ² + cΣxᵢ = Σxᵢyᵢ. By Cramer's Rule,

m = (nΣxᵢyᵢ − ΣxᵢΣyᵢ)/(nΣxᵢ² − (Σxᵢ)²),  c = (Σxᵢ²Σyᵢ − ΣxᵢΣxᵢyᵢ)/(nΣxᵢ² − (Σxᵢ)²).

31. Minima when λ = −1/5, (x, y) = ±(4, 2).

Calculus Chapter 4

Vector integration

Path integrals in scalar fields

Suppose φ = φ(r) is a scalar field defined in some region in the plane or in space, and C is a finite curve or path in the region. If s denotes arc length along C, then on C we can use s as parameter and write r = r(s). Thus φ depends on s through x and y (and also z in three dimensions), so φ is a function of s, exactly as in the section on directional derivatives in Calculus Chapter 3, but now we wish to integrate with respect to s instead of differentiate. We define

∫_C φ ds = ∫ from s₀ to s₁ of φ(r(s)) ds,

where s = s₀ and s = s₁ respectively define the initial and final points r₀ and r₁ of the path C. Such an integral is called a path integral, or sometimes a line integral, even when the path C is not a straight line. If C is parametrized by a different variable, say t, then ds/dt = ±|dr/dt|, as remarked in Calculus Chapter 2, so, using integration by substitution, the formula becomes

∫_C φ ds = ±∫ from t₀ to t₁ of φ(r(t)) |dr/dt| dt,

where the parameter values t = t₀ and t = t₁ respectively define the initial and final points of the path C.
The minus sign before the integral must be inserted when ds/dt = −|dr/dt|, i.e., when the parameter t decreases as s increases.

If φ = 1 (constant), then the path integral ∫_C ds represents the arc length of C. The ratio ∫_C x ds / ∫_C ds is the x-component of the centre of mass of the curve, and there are similar expressions for the y- and z-components. Alternatively, if φ is the mass per unit length, then ∫_C φ ds is the total mass of the curve. For a positive two-dimensional field φ(x, y), the path integral ∫_C φ ds represents the area of the vertical cliff face exposed when the surface z = φ(x, y) is cut, as if by a band saw, along the curve C in the (x, y) plane. This is still true if φ(x, y) assumes negative values, provided that areas of regions below the (x, y) plane are taken to be negative.

Figure 4.1. Path integral representing area of a vertical face

Tutorial questions — path integrals in scalar fields

1. Evaluate ∫_C φ ds in the following cases.
(a) φ(x, y, z) = y e^{x+z} and C is the path r = (ln sin t, t√2, ln sec t) from the point where t = π/6 to the point where t = π/3.
(b) φ(x, y) = xy and C is the path from (1, 0) to (0, 1) along (i) a straight line, (ii) a quarter circle anti-clockwise, (iii) a three-quarter circle clockwise. (Choose your own parametrizations, and remember the sign.)

Path integrals in vector fields

The integrands in the path integrals of the previous section occur in regions in which a scalar field φ = φ(r) is defined. More important path integrals are those in which the integrand is a component of a vector field, say v = v(r), defined in the region. The unit tangent vector to the curve C is u = dr/ds, so the component of v in the direction tangential to C is v · dr/ds, and the integral along C of the tangential component of v is
$$\int_{s_0}^{s_1} \mathbf{v}\cdot\frac{d\mathbf{r}}{ds}\, ds, \quad\text{or}\quad \int_C \mathbf{v}\cdot d\mathbf{r},$$
by the usual process of symbolically “cancelling differentials” because of the chain rule.
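Integrals of both kinds lend themselves to a quick numerical sanity check once a parametrization is chosen. The sketch below (the helper names are ours, not the text's) approximates ∫_C φ ds for Question 1(b)(ii) above, whose exact value is 1/2, and the tangential integral of v = (y, −x) once anti-clockwise around the unit circle, whose exact value is −2π.

```python
import math

def line_integral_scalar(phi, r, rdot, t0, t1, n=20000):
    """Approximate the path integral of a scalar field, phi ds = phi(r(t)) |r'(t)| dt,
    by the midpoint rule."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        total += phi(*r(t)) * math.hypot(*rdot(t)) * h
    return total

def line_integral_tangential(v, r, rdot, t0, t1, n=20000):
    """Approximate the integral of the tangential component, v . dr = v(r(t)) . r'(t) dt."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        vx, vy = v(*r(t))
        dx, dy = rdot(t)
        total += (vx * dx + vy * dy) * h
    return total

# Question 1(b)(ii): phi = xy along the anti-clockwise quarter circle from (1, 0) to (0, 1)
r = lambda t: (math.cos(t), math.sin(t))
rdot = lambda t: (-math.sin(t), math.cos(t))
print(line_integral_scalar(lambda x, y: x * y, r, rdot, 0.0, math.pi / 2))       # close to 1/2

# Tangential component of v = (y, -x) once anti-clockwise around the unit circle
print(line_integral_tangential(lambda x, y: (y, -x), r, rdot, 0.0, 2 * math.pi))  # close to -2*pi
```

Refining n shrinks the midpoint-rule error quadratically, so even crude step counts reproduce the exact answers to several decimal places.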
The integral is evaluated in terms of a general parameter t by using the expression
$$\int_C \mathbf{v}\cdot d\mathbf{r} = \int_{t_0}^{t_1} \mathbf{v}\cdot\frac{d\mathbf{r}}{dt}\, dt.$$
Since no absolute values appear in the formula, no adjustment of signs is necessary if the parameter t decreases along the curve C. For a two-dimensional vector field v = (f, g), the dot product v · dr = f dx + g dy, since dr = (dx, dy), so we can also write
$$\int_C \mathbf{v}\cdot d\mathbf{r} = \int_C f\, dx + g\, dy.$$
For a force field, say F, the integral ∫_C F · dr represents the work done by the field in moving a particle along the path C, since Work = ∫ Force d(Distance) and at each point only the tangential component of the force is relevant. A familiar result from mechanics is that the work done is equal to the change in potential. This result is true for any conservative field F, i.e., any field that has a potential function.

Theorem. If the vector field F is conservative, and has potential function φ, then
$$\int_C \mathbf{F}\cdot d\mathbf{r} = \phi(\mathbf{r}_1) - \phi(\mathbf{r}_0),$$
where r₀ and r₁ are respectively the initial and final points of the curve C.

Proof. Since φ is a potential function for F, we have F = ∇φ. Thus
$$\mathbf{F}\cdot\frac{d\mathbf{r}}{ds} = \nabla\phi\cdot\mathbf{u} = \frac{d\phi}{ds},$$
by the formula for directional derivatives, where s denotes arc length along the path C. If the points r₀ and r₁ are given by parameter values s = s₀ and s = s₁ respectively, then
$$\int_C \mathbf{F}\cdot d\mathbf{r} = \int_{s_0}^{s_1} \mathbf{F}\cdot\frac{d\mathbf{r}}{ds}\, ds = \int_{s_0}^{s_1} \frac{d\phi}{ds}\, ds = \Big[\phi(\mathbf{r}(s))\Big]_{s_0}^{s_1} = \phi(\mathbf{r}_1) - \phi(\mathbf{r}_0),$$
as required.

This theorem shows that the value of an integral in a conservative field depends only on the initial and final points of the path, not on the path itself. We say that such integrals are independent of the path. This is rather like a definite integral in elementary calculus, with the potential function playing the part of the indefinite integral.

Corollary. If F is a conservative vector field defined throughout a plane region containing a closed curve C and its interior, then ∮_C F · dr = 0.
(Notice how a little circle is put on the integral sign to emphasize that the integral is around a closed curve.) Calculus Chapter 4 47 Proof. The result follows immediately from the theorem, since for a closed curve the initial and final points coincide, i.e., r0 = r1 . (We shall show after Green’s theorem why F must be defined throughout the interior of C, as well as on C itself.) Ra a (This is similar to the fact that −a x1 dx 6= ln |x| −a , because the integrand is undefined at the point x = 0 between the endpoints.) Tutorial questions — path integrals in vector fields 2. Evaluate R C 2 v · dr in the following cases. √ (a) v = (x − y 2 , 2xy) and C is the hyperbola x2 − y 2 = 1 from (1, 0) to (2, 3). (b) v = (cos x, y sec x) and C is the portion of the parabola y 2 = 4x from (0, 0) to √ ( π4 , π). (c) v = (xz, yz, 2xy) and C is the path r = (t cos t, t sin t, t) from (0, 0, 0) to (−π, 0, π). (d) v = (yz, xz, xy) and C is the same path as in (c). (e) v = (x, y) and C is the ellipse x2 a2 + y2 b2 = 1. (Choose a parametrization going anti- clockwise all the way around C.) (f) v = (y, −x) and C is the same ellipse as in (e). 3. Use the potential functions found in Calculus Chapter 3 Questions 15 and 18 to evaluate R F · dr for the following conservative fields F between the points given. (a) F = (2xy, x2 + y 2 ) from (0, 0) to (1, 3). (b) F = (sin 2y, 2x cos 2y) from (1, 0) to (2, 2π 3 ). (c) F = (x, y, z) from (1, 1, 1) to (2, −3, 6). (d) F = (y 2 sin z, 2xy sin z, xy 2 cos z + z) from (0, 0, 0) to (1, −1, π). Check your answers by integrating along the straight line between the points, using the parametrization r = (1 − t)r0 + tr1 from t = 0 to t = 1. 4. At time t = 0 a particle of unit mass is at rest at the point (0, 1) in the force field F = (2, −y). (i) Solve Newton’s equation F = mr̈ for r(t), using the initial conditions given to evaluate the arbitrary constants. (Hint: (2, −y) = 1(ẍ, ÿ); solve separately for x and y.) 
(ii) Evaluate the work done between time t = 0 and time t = π by integration along the path r = r(t). (iii) Find a potential function for F and use it to re-calculate the work done. (iv) Find ṙ and hence evaluate the kinetic energy 12 m|ṙ|2 of the particle at time t = π. Verify that the gain in kinetic energy is equal to the work done. 5. Prove the result of Question 4(iv) in general as follows: d 1 ṙ · ṙ = r̈ · ṙ, using rules for vector differentiation. (i) Show that dt R2 (ii) Deduce that F · dr = 21 m|ṙ|2 . (Hint: substitute F = mr̈ and put dr = ṙ dt.) 48 MATH2011/2/4 1 6. Show that the vector field F = x2 +y 2 (−y, x) is conservative where it is defined. Evaluate H F · dr around the unit circle without using the potential function. Why is the answer not zero? Double and repeated integrals An ordinary definite integral Rb a f (x) dx represents (if f (x) is positive) the area of a plane region bounded by the lines x = a, x = b, the x axis, and the curve y = f (x). If we now take a function of two variables f = f (x, y) (again assumed for simplicity to be positive), defined in a bounded region R in the (x, y) plane, then the double integral RR f (x, y) dA represents the volume of a solid region lying above the (x, y) plane, below R the surface z = f (x, y), and within “walls” erected on the boundary of R. In particular, if f (x, y) = 1 (constant), then the solid has constant height 1 and horizontal cross-sections identical with R, so its volume (i.e. cross-sectional area times height) is equal to the area of R. Thus we have ZZ dA = Area of R. R Figure 4.2. Double integral as the limit of a sum A double integral can be regarded as the limit of a sum, as shown in Figure 4.2. Imagine the base region R split up into small subregions numbered by two indices, say i and j. Suppose the (i, j)th subregion has area ∆Aij and contains the point (xi , yj ). 
Then f(xᵢ, yⱼ)ΔAᵢⱼ represents the volume of a pillar of height f(xᵢ, yⱼ) erected on this subregion and stretching up to a point on the surface. By adding up the volumes of all of these pillars, we obtain an approximation to the total volume, i.e.
$$\sum_{i=1}^{m}\sum_{j=1}^{n} f(x_i, y_j)\,\Delta A_{ij} \approx \iint_R f(x, y)\, dA.$$
This becomes an exact expression for the double integral if we let m and n tend to infinity in such a way that the subareas ΔAᵢⱼ all tend to zero.

The average value of a function f(x, y) over a region R is defined to be the ratio
$$\frac{\iint_R f(x, y)\, dA}{\iint_R dA} = \frac{\iint_R f(x, y)\, dA}{\text{Area of } R},$$
because of the obvious interpretation
$$\text{Average height} = \frac{\text{Total volume}}{\text{Base area}}.$$

For practical purposes we need a more efficient procedure for evaluating double integrals. This comes by slicing the solid region into plane vertical sections, which we first take parallel to the y axis. The area of each vertical section is a path integral, say $\int_{y_0(x)}^{y_1(x)} f(x, y)\, dy$, along a line on which x is constant and ds = dy. The endpoints y = y₀(x) and y = y₁(x) are equations of boundary curves of R, as shown in Figure 4.2. Since Volume = ∫ Area d(Length), and since there is a section of this type for each x from x = a to x = b, we obtain
$$\iint_R f(x, y)\, dA = \int_a^b \int_{y_0(x)}^{y_1(x)} f(x, y)\, dy\, dx,$$
which is called a repeated integral.

Figure 4.3. Repeated integrals

Similarly, if we take vertical sections parallel to the x axis, as also illustrated in Figure 4.3, then the area of a typical section is $\int_{x_0(y)}^{x_1(y)} f(x, y)\, dx$, where y is constant, and x = x₀(y) and x = x₁(y) are equations of the left and right boundaries of R. By integrating these areas with respect to y from y = c to y = d, as shown, we see that
$$\iint_R f(x, y)\, dA = \int_c^d \int_{x_0(y)}^{x_1(y)} f(x, y)\, dx\, dy,$$
which is a repeated integral in which the order of integration has been reversed.

Figure 4.4.
Reversing the order of integration

If the region R is not convex, so that some vertical sections cut the boundary more than twice, or if the equations for the boundary curves change, then R must be subdivided, the repeated integrals over each subregion evaluated separately, and the results added together.

Tutorial questions — double and repeated integrals

7. Evaluate ∬_R xy dA as the limit of a sum, where R is the rectangle with vertices (0, 0), (0, 1), (1, 0), and (1, 1). (Hint: take ΔAᵢⱼ = 1/(mn) and (xᵢ, yⱼ) = (i/m, j/n), where i goes from 1 to m, and j goes from 1 to n.) Check your answer by expressing the double integral as a repeated integral.

8. Evaluate the double integrals ∬_R f(x, y) dx dy as repeated integrals. Check your answers by reversing the order of integration.
(a) f(x, y) = x/y; R is bounded by the curves xy = 1, y = 1, and x = 2.
(b) f(x, y) = √x − y²; R is the region between the curves y = x² and x = y⁴.
(c) f(x, y) = cos(x + y); R is the rectangle with vertices (0, 0), (0, π), (π/2, 0), (π/2, π).
(d) f(x, y) = √(xy); R is the region between the curves y = x and y = √x.

9. Find the areas of the regions in Question 8. Hence calculate the average values of the given functions over the corresponding regions.

10. By sketching the region, then reversing the order of integration, evaluate:
$$\text{(a) } \int_0^4 \int_{\sqrt{y}}^{2} 3e^{x^3}\, dx\, dy \qquad \text{(b) } \int_0^4 \int_0^{\sqrt{4-x}} \sqrt{y^2 + x}\, dy\, dx.$$

11. If R is the quadrilateral with vertices (0, 0), (1, 1), (2, 6), (0, 4), write ∬_R f(x, y) dy dx as the sum of two repeated integrals, inserting the correct lower and upper endpoints on each integral sign. How many separate repeated integrals would you need if you reversed the order of integration?

12. Find the volumes of the solids enclosed by the surfaces below. (Hint: integrate the difference between the upper and lower z values.
The boundaries of the region of integration are found by using those equations that do not involve z (which represent vertical “walls” for the solid), or by equating z values to find where the upper and lower surfaces meet.) (a) x = 0, y = x, z = 0, z = 1 − y 3 . (c) z = x2 + y 2 , z = 2x. (b) z 2 = 4ax, x2 + y 2 = ax. (d) x2 + y 2 = 1, y 2 + z 2 = 1. Calculus Chapter 4 51 Change of variables in double integrals dx R R In integration by substitution, when we replace an integral f (x) dx by f x(u) du du, dx the factor du can be thought of as a linear scaling factor, since in general the u scale on the axis will differ from the x scale, and may also vary from point to point. Similarly, when the variables x and y in a double integral are replaced by new variables u and v, a scaling factor for area must be introduced, since increases of ∆u in u and ∆v in v will not in general increase the area by simply ∆u∆v. The scaling factor for area is called the Jacobian of the transformation, and it is denoted J or ∂(x,y) ∂(u,v) . Theorem. If r = (x, y), where x = x(u, v) and y = y(u, v), then the Jacobian J or is given by the formula ∂r ∂r ∂(x, y) . = × ∂(u, v) ∂u ∂v ∂(x,y) ∂(u,v) Proof. The lines u constant, v constant, u + ∆u constant, and v + ∆v constant transform into curves in the (x, y) plane with the same equations, interpreted implicitly, as shown in Figure 4.5. Thus the rectangle bounded by these lines in the (u, v) plane transforms into a region in the (x, y) plane with four curved boundaries as shown. However, using first ∂x approximations of the type x(u + ∆u, v) ≈ x(u, v) + ∂u ∆u, we obtain ∂x ∂y ∂r r(u + ∆u, v) = x(u + ∆u, v), y(u + ∆u, v) ≈ (x, y) + ∂u , ∂u ∆u = r + ∂u ∆u ∂r , ∂y ∆v = r + ∂v ∆v. r(u, v + ∆v) = x(u, v + ∆v), y(u, v + ∆v) ≈ (x, y) + ∂x ∂v ∂v Figure 4.6. Jacobian of a transformation It follows that the region with curved boundaries in the (x, y) plane can be approximated ∂r ∂r by the parallelogram defined by the vectors ∂u ∆u and ∂v ∆v. 
The parallelogram has area ∂r ∂u ∆u ∂r × ∂v ∆v , which is equal to |∆u| |∆v| is simply |∆u| |∆v|. Thus we have ∂r ∂u × ∂r ∂v , whereas the area of the rectangle area of parallelogram ∂r ∂r area in (x, y) plane . ≈ = × corresponding area in (u, v) plane area of rectangle ∂u ∂v In the limit, as ∆u and ∆v tend to zero, the result becomes exact, and gives the required formula. 52 MATH2011/2/4 Corollary 1. An alternative formula for the Jacobian J or ∂(x, y) = det ∂(u, v) xu yu xv yv ∂(x,y) ∂(u,v) is , where the subscripts denote partial derivatives. Proof. It is easy to show that ∂r ∂u × ∂r ∂v = (0, 0, xu yv − yu xv ), so ∂r ∂r = xu yv − yu xv = det × ∂u ∂v ∂(x,y) −1 , ∂(u,v) Corollary 2. ∂(u,v) ∂(x,y) = reciprocal of J. xu yu xv yv . i.e. the Jacobian of the inverse transformation is the Proof. The result is, in a sense, obvious, since the Jacobians are ratios of correspond xu xv ing areas in the two planes. More formally, we know that the matrices and yu yv ux uy are inverses of each other, so their determinants are reciprocals of each other, vx vy which by Corollary 1 gives the required result. With the use of the Jacobian, a double or repeated integral can be transformed as follows: ZZ ZZ ∂(x,y) du dv. f (x, y) dx dy = f x(u, v), y(u, v) ∂(u,v) R S The endpoints of the integrals with respect to u and v are obtained by expressing the equations of the boundary of the original region R in terms of u and v. The most important transformation is to polar co-ordinates r and θ, for which we have ∂(x,y) ∂(r,θ) = |xr yθ − yr xθ | = cos θ(r cos θ) − sin θ(−r sin θ) = r(cos2 θ + sin2 θ) = r, so a double integral transforms from cartesian to polar co-ordinates thus: ZZ ZZ f (x, y) dx dy = f (r cos θ, r sin θ) r dr dθ. R S Tutorial questions —√change √of variables in double integrals 13. 
For the transformation x = u − v, y = ∂(x,y) , (i) find the Jacobian ∂(u,v) (iii) find the Jacobian ∂(u,v) ∂(x,y) , u + v, (ii) solve for u and v in terms of x and y, (iv) verify that ∂r ∂u ∂(x,y) ∂(u,v) ∂(u,v) ∂(x,y) = 1. 14. For a transformation r = r(u, v), explain why the vector is a tangent to the curve with implicit equation v = constant, and why the vector ∇v is a normal to the same Calculus Chapter 4 53 curve. Find the dot product of these two vectors, and prove that the dot product is zero. ∂y ∂x ∂x ∂r 15. If ∂u = ∂y ∂v and ∂v = − ∂u (the Cauchy-Riemann equations), show that the vectors ∂u ∂r and ∂v are perpendicular and of the same magnitude. (It follows that a small square ∂(x,y) ∂(u,v) transforms approximately to a square.) Also show that = ∂r 2 . ∂u 16. If R is the triangle with vertices (0, 0), (3, 1), (1, 2), find the equations of the boundaries of R. Transform these equations by the substitutions x = 3u+v, y = u+2v, and find the ZZ dx dy . Jacobian of the transformation. Use the transformation to evaluate R x + 2y + 5 17. By making the transformation x = u2 − v 2 , y = uv (where u > 0 and v ≥ 0), evaluate ZZ p the integral x2 + 4y 2 dy dx, where R is the region bounded by the positive x axis, R the positive y axis, and the parabola y 2 = 1 − x. * 18. Sketch the region bounded by the rays with polar angles θ and θ +∆θ, and the circles with centre at the origin and radii r and r+∆r. Find the area of the region (difference between two sectors). Hence prove directly that ∂(x,y) = r. ∂(r,θ) 19. Re-do the integrals in Question 12(b)–(c) by changing to polar co-ordinates. Z 1 Z √x−x2 p 1 − x2 − y 2 dy dx and 20. Sketch the region of integration for the integral 0 0 then evaluate the integral by changing to polar co-ordinates. 21. Use the transformation x = 1 2 (v + u v ), y = 1 2 (v − u v) to evaluate ZZ R (x2 − y 2 ) dx dy, where R is the region bounded by the curves x = y, x2 − y 2 = 1, x + y = 1, x + y = 2. 22. 
Sketch the region R in the first quadrant bounded by the curves xy² = 1, xy² = 4, y² = 4x, y² = 9x. Find the area of R by using the transformation x² = u/v, y⁴ = uv, where u > 0 and v > 0.

23. The work done during one cycle of a Carnot engine is equal to the area of the region in the (p, v) plane enclosed by the curves pv = x₀, pv = x₁, pvᵞ = y₀, pvᵞ = y₁, where x₀, x₁, y₀, y₁ and γ are positive constants (and γ ≠ 1). Determine this area by using the substitutions x = pv, y = pvᵞ. (Hint: ∂(p,v)/∂(x,y) = (∂(x,y)/∂(p,v))⁻¹.)

* 24. (A proof that $\int_0^\infty e^{-x^2}\, dx = \tfrac12\sqrt{\pi}$.) If Q(R) denotes the region in the first quadrant lying inside the circle x² + y² = R², show, by changing to polar co-ordinates, that
$$\iint_{Q(R)} e^{-x^2-y^2}\, dx\, dy = \frac{\pi}{4}\big(1 - e^{-R^2}\big).$$
If S(R) denotes the square 0 ≤ x ≤ R, 0 ≤ y ≤ R, show that Q(R) ⊂ S(R) ⊂ Q(R√2) (use a sketch), and deduce that
$$\frac{\pi}{4}\big(1 - e^{-R^2}\big) < \int_0^R \int_0^R e^{-x^2-y^2}\, dx\, dy < \frac{\pi}{4}\big(1 - e^{-2R^2}\big).$$
Show that the integral in the middle is equal to $\big(\int_0^R e^{-x^2}\, dx\big)^2$, and deduce the final result by letting R → ∞ and then taking square roots.

Green's theorem

We proved in Chapter 3 that if a vector field is conservative, then its curl is zero, but we were unable to prove the converse. We now do so by means of Green's theorem, which connects a path integral around the boundary of a region with a double integral over the region itself.

Green's theorem. If f(x, y) and g(x, y) are functions defined on a simple closed curve C and throughout its interior region R, then
$$\iint_R \Big(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\Big)\, dA = \oint_C (f\, dx + g\, dy).$$
(The path integral on the right hand side can also be written as ∮_C (f, g) · dr, and must be taken in the positive, i.e. anticlockwise, direction.)

Proof. We suppose the region R is convex, from which it follows that it can be expressed either in the form a ≤ x ≤ b, y₀(x) ≤ y ≤ y₁(x) or in the form c ≤ y ≤ d, x₀(y) ≤ x ≤ x₁(y), as shown in Figure 4.4.
(Otherwise split R up into convex subregions, and treat them separately.) We write the double integral on the left hand side as the difference of two repeated integrals, choosing the order of integration appropriately for each one. This gives ZZ Z b Z y1 (x) Z d Z x1 (y) ∂g ∂f ∂g ∂f − ∂y dA = dx dy − dy dx ∂x ∂x ∂y R x0 (y) c a y0 (x) y1 (x) x1 (y) Z b Z d f (x, y) dx g(x, y) dy − = c = Z a x0 (y) d y0 (x) Z g(x1 (y), y) − g(x0 (y), y) dy − a I I I = g dy + f dx = (f dx + g dy). c C C b f (x, y1 (x)) − f (x, y0 (x)) dx C Corollary 1. If R is a plane region with boundary C, then I I I 1 Area of R = x dy = − y dx = 2 (−y dx + x dy). C C C RR Proof. Since the area of R is equal to R 1 dA, it follows from Green’s theorem that the H ∂g area is also equal to C f dx + g dy for any functions f and g such that ∂x − ∂f = 1. The ∂y above are just some examples (firstly f = 0 and g = x, secondly f = −y and g = 0, and thirdly f = − 21 y and g = 21 x). Calculus Chapter 4 55 Corollary 2. If F is a vector field in the (x, y) plane defined on a simple closed curve C and throughout its interior region R, then: H H RR (F · u) ds, where u is the unit anticlockwise tangent F · dr = (curl F · k) dA = (i) C C R vector to the curve C, H RR (ii) R div F dA = C (F · n) ds, where n is the unit outward normal vector to the curve C. In words, this result says that: (i) the double integral of curl F · k over the interior of C is equal to the path integral of the anticlockwise tangential component of F around C, (ii) the double integral of div F over the interior of C is equal to the path integral of the outward normal component of F around C. ∂F2 ∂x Proof. (i) Write F = (F1 , F2 , 0); then curl F = (0, 0, 1 − ∂F ∂y ) and curl F·k = ∂F2 ∂x Thus by Green’s theorem with f = F1 and g = F2 we have ZZ R (curl F · k) dA = ZZ ∂F2 ∂x R − ∂F1 ∂y F· dr ds dA = I F1 dx + F2 dy = C I C 1 − ∂F ∂y . F · dr. 
Next by definition we have I C F · dr = dr ds since the unit tangent vector u = I C ds = I F · u ds, C = ( dx , dy ). ds ds (ii) It is clear (see the section on plane curvature in Calculus Chapter 2) that the vecdx tor ( dy ds , − ds ) is perpendicular to u, has the same magnitude as u, and points outward dx from C. Thus n = ( dy ds , − ds ), or n ds = (dy, −dx), and by Green’s theorem with f = −F2 and g = F1 we have ZZ div F dA = R ZZ R = I C The quantity closed curve C. H C ∂F1 ∂x F· + ∂F2 ∂y dA = dx ( dy ds , − ds ) ds I (−F2 dx + F1 dy) C = I C F · n ds. F · dr is often called the circulation of the vector field F around the 56 MATH2011/2/4 Corollary 3 — interpretation of curl and div. If F is a smooth velocity field in the (x, y) plane, then at any point: (i) curl F · k is equal to the circulation per unit area, or twice the angular velocity of the field about a vertical axis through the point, (ii) div F is the radial outflow rate per unit area at that point. Proof. (i) Let C be a circle of radius ∆r centred at the point and let R denote its interior. Then RR (curl F · k) dA . Average value of curl F · k inside C = R π∆r 2 Now angular velocity is equal to tangential velocity divided by radius, so 2 × average tangential velocity around C ∆r H 2 C F · dr (since the length of C is 2π∆r) = ∆r 2π∆r H F · dr circulation = C = . 2 π∆r area 2 × average angular velocity around C = By Corollary 2(i) the two expressions are equal, and the result follows by letting ∆r → 0, because if curl F is continuous, then, as the circle shrinks down to the point, the averages on the left hand sides tend to the actual values at the point. (ii) Similarly we have Average value of div F inside C = RR div F dA . π∆r 2 R Since F · n is the outward radial component of velocity across C, it follows that is the total outflow across C, and therefore Total outflow across C per unit area = H C H C (F · n) ds (F · n) ds . 
π∆r 2 By Corollary 2(ii) the two right hand sides are equal, and the result follows by equating the left hand sides and letting ∆r → 0, as before, and using the fact that div F is continuous. Corollary 4. If a vector field F defined in a plane region R satisfies curl F = 0, then F is conservative. (Note that F must be defined throughout the interior of every simple closed curve lying in R.) H Proof. Since curl F = 0, we have curl F · k = 0, so by Corollary 2 F · dr = 0 for every R closed curve in R. It follows that F · dr is independent of the path, because if D1 and R R D2 are two paths with the same initial and final points, then D1 F · dr − D2 F · dr is an Calculus Chapter 4 57 integral around a closed curve (out along D1 and back along D2 ), so it is equal to zero. Rr Now choose some fixed point r0 in R, and define a scalar field φ by φ(r) = r0 F · dr for every point r in R. (The integral can be taken along any path from r0 to r.) If we take a path r = r(s), then, using the Fundamental Theorem of Calculus, we have dφ d = ds ds Z s s0 F· dr dr ds = F · = F · u, ds ds where u is the unit tangent vector to the path. But by the formula for directional derivatives we also have dφ = ∇φ · u. ds From these two expressions for dφ ds we see that F · u = ∇φ · u for all unit vectors u, from which it follows that F = ∇φ. Thus φ is a potential function for F, and so F is conservative. We have now proved that the following statements about a plane vector field F are all equivalent: • F is conservative, • F has a potential function φ such that F = ∇φ, R • F · dr depends only on the endpoints of the path of integration, • ∇ × F = 0. Tutorial questions — Green’s theorem 25. Verify Green’s theorem in the following cases, by evaluating the integrals on both sides and showing that they are equal: (a) f = xy, g = x + y, R is the region bounded by the curves y = x2 and y = x + 2. 
(b) f = x + y, g = xy, R is the region in the first quadrant bounded by the curves y = sin x, y = 1, and x = 0. 26. Evaluate the path integrals them as double integrals. R C f dx + g dy below, by using Green’s theorem to rewrite (a) f = cos x sin y − xy, g = sin x cos y, C is the circle x2 + y 2 = 1. (b) f = 1 y xe , g = ey ln x + 2x, C is the boundary of the region enclosed by the curves y = x4 + 1 and y = 2. (c) f = 12 y 2 sin x + x ln y, g = x − y cos x + 12 x2 y −1 , C is the circle x2 + (y − 2)2 = 1. 27. Use Green’s theorem to find the areas of the regions enclosed by the curves below. (a) The four cusped hypocycloid r = a(cos3 φ, sin3 φ) (−π ≤ φ ≤ π). (b) The loop in the curve r(t) = t3 − 3t, t210 +1 . (Hint: first find distinct values t0 and t1 such that r(t0 ) = r(t1 ), to find the crossing point.) 58 MATH2011/2/4 28. If F = (xy, x + y), and R is the region enclosed by the curves y = x and y = H RR evaluate R (curl F · k) dy dx and C F · dr, and verify that they are equal. RR H Similarly, evaluate R div F dy dx and C F · n ds, and verify that they are equal. Answers 1. (a) 2. (a) 5π √ 3 6 17 , 3 − √ x, √ √1 2 ln 3, (b)(i) 62 , (ii) 21 , (iii) − 21 . √ √ (b) 12 2 + 2 ln(1 + 2), (c) 13 π 3 − 21 π 2 , r = (a cos t, b sin t)), (f) −2πab. √ (c) 23, 3. (a) 12, (b) − 3, (d) 0 (e) 0 (putting (d) 21 π 2 . 4. (a) r = (t2 , cos t), ṙ = (2t, − sin t). 6. (a) 2π. (F is not defined at the origin.) 7. 14 . 2 (b) 71 , (c) −2, (d) 27 . 8. (a) 2 ln 2 − 34 , (b) 9. Average values are: (a) (2 ln 2 − 43 )/(1 − ln 2), 10. (a) e8 − 1, (b) 8. R 2 R x+4 R 1 R x+4 11. 0 x f (x, y) dy dx + 1 5x−4 f (x, y) dy dx. 12. (a) 13. (i) 3 10 , 1 2 2 (u 3 (b) 32 15 a , 2 −1/2 −v ) , π 2, 15 , 49 (c) − π42 , (d) 94 . Three integrals. (d) 16 3 . 2 2 (c) (b) u = 21 (x + y ), v = 12 (y 2 − x2 ), (iii) 2xy. ∂r 14. The curve is r = r(u, v), where v is constant and u is a parameter, so ∂u is a tangent. ∂v ∂x ∂v ∂y ∂r The curve is given implicitly by v = constant, so ∇v is a normal. 
∇v · ∂u = ∂x ∂u + ∂y ∂u = ∂v , ∂u which is zero since v is kept constant when differentiating partially with respect to u. ∂r ∂r ∂r ∂r = ∂v and ∂u · ∂v = 0, so the sides of the parallelogram are of equal length 15. (a) ∂u and at right angles to each other. 16. Boundaries y = 13 x, y = 2x, y = 21 (−x + 5). Integral = 1 − ln 2. 28 . 17. 45 18. Area = 12 (r + ∆r)2 ∆θ − 21 r 2 ∆θ = (r + 21 ∆r)∆r∆θ. Find the Jacobian by dividing through by ∆r∆θ and letting ∆r and ∆θ tend to zero. 20. π6 − 29 . 21. 41 ln 2. 22. J = 14 u−1/4 v −5/4 . Integral = 23. 1 γ−1 (x1 (a) 49 , 25. 26. (a) 0, − x0 ) ln yy10 . 27. (a) 38 πa2 , 1 28. 10 , 14 . (b) 1 − (b) 16 5 , 4 √ ( 3 6 √ 3− √ √ 2)(2 2 − 1). 3π 8 . (c) π. √ √ √ (b) t0 = − 3, t1 = 3, area = 20(2π − 3 3). Calculus Chapter 4 æ 59 58 MATH2011/2/4 æ Algebra Chapter 1 59 Algebra Chapter 1 Complex numbers Real-imaginary form A disadvantage of the real number system is that it is impossible to take the square root of a negative number. In the 18th century, mathematicians overcame the problem by adjoining what they called an imaginary number i such that i2 = −1. (Simplify i3 , i4 , and so on, until you see the pattern. Then repeat with i−n , for n = 1, 2, 3, . . . ) In any useful number system it must be possible to perform the usual algebraic operations, so if you adjoin i to your real number system R, then you must also include all real multiples iy (with y in R) and all sums x + iy (with x and y in R). These make up what is called the complex number system C. A general complex number z can thus be written z = x + iy (with x and y in R). When z is written like this, we say that z is in real-imaginary form, we call x the real part of z and y the imaginary part of z, and we write x = Re(z) and y = Im(z). Note that what we call the imaginary part of z is actually a real number: it is the coefficient of i. The multiple iy is called a pure imaginary number, and x is called a pure real number. 
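Python's built-in complex type (which writes the imaginary unit as j rather than i) can be used to check the pattern in the powers of i suggested above; a minimal sketch:

```python
i = 1j  # Python's notation for the imaginary unit

# The powers of i cycle with period 4: i, -1, -i, 1, i, ...
assert i ** 2 == -1 and i ** 3 == -i and i ** 4 == 1 and i ** 5 == i

# Negative powers run through the same cycle in reverse: 1/i = -i, 1/i^2 = -1, ...
assert i ** -1 == -i and i ** -2 == -1 and i ** -3 == i and i ** -4 == 1

# The "imaginary part" is a real number: the coefficient of i
z = 3 - 2j
print(z.real, z.imag)  # 3.0 -2.0
```

The cycle of four explains why any integer power of i can be reduced by taking the exponent modulo 4.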
For two complex numbers to be equal, their real parts and their imaginary parts must be equal. Thus from a complex equation we can obtain two real equations, by equating real and imaginary parts, but this should be regarded as a last resort, and, wherever possible, one should do all manipulations in complex form.

The algebraic operations can all be performed in C, using the usual rules and the fact that i² = −1. Thus if z = x + iy and w = u + iv, then
$$z + w = (x + iy) + (u + iv) = (x + u) + i(y + v),$$
$$z - w = (x + iy) - (u + iv) = (x - u) + i(y - v),$$
$$zw = (x + iy)(u + iv) = xu + iyu + ixv + i^2 yv = (xu - yv) + i(xv + yu).$$
It follows immediately that Re(z ± w) = Re(z) ± Re(w) and Im(z ± w) = Im(z) ± Im(w), i.e., the real or imaginary part of a sum or difference is equal to the sum or difference of the real or imaginary parts, but the real or imaginary part of a product is not the same as the product of the real or imaginary parts.

For division we have to multiply top and bottom by the conjugate surd of the denominator, i.e., the complex number obtained by changing the sign of the imaginary part only. Thus we have
$$\frac{z}{w} = \frac{x + iy}{u + iv} = \frac{(x + iy)(u - iv)}{(u + iv)(u - iv)} = \frac{1}{u^2 + v^2}\big((xu + yv) + i(yu - xv)\big).$$
Note that the last denominator u² + v² is non-zero unless both u and v are zero, so division by any non-zero complex number is possible.

Conjugation is so important that we have a special notation for it: if z = x + iy, then we define the complex conjugate $\overline{z}$ (pronounced "z bar") by $\overline{z} = x - iy$. It is easy to verify that $\overline{z \pm w} = \overline{z} \pm \overline{w}$, $\overline{z \times w} = \overline{z} \times \overline{w}$, and $\overline{z \div w} = \overline{z} \div \overline{w}$, so conjugation behaves well with all four algebraic operations. From the definition it also follows easily that $\overline{\overline{z}} = z$, i.e., conjugating twice gets you back to the number itself.
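These rules are easy to confirm numerically with Python's complex type; a brief sketch (the sample values 2 + 3i and 1 − 2i are arbitrary):

```python
z, w = 2 + 3j, 1 - 2j

# Division by multiplying top and bottom by the conjugate of the denominator:
# the new denominator w * conj(w) = u^2 + v^2 is real
denom = (w * w.conjugate()).real          # 1^2 + (-2)^2 = 5
quotient = z * w.conjugate() / denom
assert abs(quotient - z / w) < 1e-12      # agrees with Python's own division

# Conjugation behaves well with all four algebraic operations
zc, wc = z.conjugate(), w.conjugate()
assert (z + w).conjugate() == zc + wc
assert (z * w).conjugate() == zc * wc
assert abs((z / w).conjugate() - zc / wc) < 1e-12

# z * conj(z) = x^2 + y^2 is real and non-negative, and conjugating twice returns z
assert z * z.conjugate() == z.real ** 2 + z.imag ** 2 == 13
assert z.conjugate().conjugate() == z
```

Note that the additive and multiplicative identities hold exactly in floating point here, while the division checks use a small tolerance.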
Note, too, that z = z if and only if z is pure real, and that z z = (x + iy)(x − iy) = x2 + y 2 ≥ 0, so z z is real and non-negative for any complex number z, just as x2 is real and non-negative for any real number x. Algebra Chapter 1 61 However, it is important to remember that there are no restrictions on the square of a general complex number. It follows that every complex number has a square root (in fact, two square roots, which are negatives of each other). One method of finding the square roots of a complex number is as follows. Given a complex number c, we must show that the equation z 2 = c can always be solved. If z = x + iy and c = a + ib, then by equating real and imaginary parts we obtain x2 − y 2 = a (1) 2xy = b. (2) We can now use (2) to substitute for y in (1), which gives x4 − ax2 − 41 b2 = 0, which is a quadratic equation in x2 . We solve this and then substitute back in (2) to find y. Alternatively, by squaring and adding (1) and (2), then taking square roots, we have x2 + y 2 = p a 2 + b2 . (3) Now we can add and subtract (1) and (3), and take square roots again, to get x and y, from which we can write down z. Note that there are only two values of z, not four, because the signs of x and y have to be chosen in such a way that equation (2) is satisfied. Tutorial questions — Real-imaginary form 1. Express in real-imaginary form: (p + iq)2 (p − iq)2 p + iq (b) − (a) p − iq (p − iq)2 (p + iq)2 √ (1 − i)3 ( 3 − i) 17(1 + 2i) √ (c) (d) 2 (1 + 3i)(1 + 4i) (1 + i 3) √ − 2(1 − i) 2 √ (e) √ . (f) 1 + cos θ + i sin θ i( 3 + i)(1 + i 3) 2. By expressing each side separately in terms of x, y, u, v (where z = x+iy and w = u+iv), show that: (a) Im(zw) 6= Im(z) Im(w) (b) Re(z/w) 6= Re(z)/ Re(w). 3. Prove the following identities for a general complex number z: Re(z) = z+z z−z and Im(z) = . 2 2i 4. Some people find complex algebra difficult to believe in, but it corresponds to a part of matrix algebra, as follows. 
(i) If z = x + iy and Z = ( x  y ; −y  x ) (a 2 × 2 matrix, written row by row), show that the rows of Z are the real and imaginary parts of z and iz respectively. If w = u + iv and W is the matrix corresponding to w, using the same rule, write down W.
(ii) Use matrix algebra to find Z + W, ZW, and WZ. Note that ZW = WZ.
(iii) Use complex algebra to find z + w, zw, and wz. Note that zw = wz.
(iv) Show that, using the rule from part (i), the matrices in part (ii) correspond to the complex numbers in part (iii).

5. Using the notation of Question 4, show that the matrices ZW⁻¹ and W⁻¹Z are equal, and both correspond to the complex quotient z/w. What matrix operations correspond to conjugation? (There are two answers.)

6. Show, by writing z = x + iy and w = u + iv and then simplifying both sides of each equation, that (z ± w)‾ = z̄ ± w̄, (z × w)‾ = z̄ × w̄, (z ÷ w)‾ = z̄ ÷ w̄.

7. (a) If z̄ = w − 2z, show, without breaking into real and imaginary parts, that 3z = 2w − w̄. (Hint: take conjugates of the given equation, then eliminate z̄.)
(b) If z = (1 + i)w + (3 − i)w̄, express w in terms of z and z̄. (Take conjugates, and then eliminate w̄.)
(c) If z + w + z̄ w̄ = 0, express w in terms of z and z̄.

8. (a) Find the complex square roots of 2i, 3 − 4i, −5 + 12i, 2 + 2i.
(b) Use the quadratic formula and the results of (a) to solve the equations
(i) z² + (1 − i)z − i = 0
(ii) iz² + 5z − 1 − 7i = 0
(iii) iz² + (2 − 2i)z − 14 − 5i = 0
(iv) z² − 2(1 + i)z − 2 = 0.

9. (a) Find the three complex cube roots of 8, by solving the equation z³ − 2³ = 0. (Hint: find one obvious factor by the Remainder Theorem, then divide and solve the resulting quadratic equation.)
(b) Find the four complex fourth roots of −4. (Hint: find the square roots of the square roots of −4.)

The complex plane

One of many unforeseen benefits of the complex number system is its geometric structure, which is related to its algebraic properties in a way similar to that in which geometric results can be proved by vector algebra.
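The square-root method described above (equations (1)–(3)) can be turned directly into a short computation. The following sketch is an addition to these notes; it uses the numbers from Question 8(a) as test data.

```python
import math

# Square roots of c = a + ib by the method in the text: combine
# x^2 - y^2 = a with x^2 + y^2 = sqrt(a^2 + b^2), then choose the sign
# of y so that 2xy = b.
def complex_sqrt(c):
    a, b = c.real, c.imag
    m = math.hypot(a, b)                          # sqrt(a^2 + b^2)
    x = math.sqrt((m + a) / 2)
    y = math.copysign(math.sqrt((m - a) / 2), b)  # sign chosen so 2xy = b
    return complex(x, y)

# The numbers from Question 8(a); r and -r are the two square roots.
for c in (2j, 3 - 4j, -5 + 12j, 2 + 2j):
    r = complex_sqrt(c)
    assert abs(r * r - c) < 1e-9
```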
We know that the real numbers lie on a line. In order to represent the complex numbers geometrically, we need two real co-ordinates, one for the real parts and one for the imaginary parts. We therefore identify the complex number z = x + iy with the point (x, y) in the plane (or with its position vector, also written (x, y)), and we can write x + iy ←→ (x, y) or sometimes x + iy = (x, y). Algebra Chapter 1 63 This representation is called the complex plane or Argand diagram. Since x + 0i ↔ (x, 0), it follows that pure real numbers lie on the horizontal axis, which is called the real axis. This shows how the real number line R lies inside the complex number plane C. Similarly, since iy = 0 +iy ↔ (0, y), pure imaginary numbers lie on the vertical axis, which is called the imaginary axis. Addition and subtraction of complex numbers then correspond precisely with addition and subtraction of vectors in the plane, since (x + iy) ± (u + iv) = (x ± u) + i(y ± v) while (x, y) ± (u, v) = (x ± u, y ± v). These can be described geometrically by the parallelogram of vectors. The points 0, z, w, z + w thus form the vertices of a parallelogram whose diagonals are the vectors z + w and z − w. Note that the complex number z − w corresponds to the vector from the point w to the point z. Multiplication by real numbers corresponds to scalar multiplication of vectors, since x(u + iv) = (xu) + i(xv) while x(u, v) = (xu, xv), which always results in a parallel vector. On the other hand, multiplication by a pure imaginary number leads to a perpendicular vector, since (iy)(u + iv) = (−yv) + i(yu) and (u, v) · (−yv, yu) = 0. In general, complex multiplication is a combination of a rotation and a scaling, as we shall show later. Complex conjugation changes the sign of the imaginary part of a number, while leaving the real part unchanged, so it is easy to see that it corresponds to reflecting the number in the real axis. 
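The claim that multiplication by a pure imaginary number produces a perpendicular vector can be illustrated numerically. This example is an addition to the notes, using Python's complex type.

```python
# Multiplying u + iv by the pure imaginary iy gives -yv + iyu, whose
# position vector (-yv, yu) is perpendicular to (u, v).
w = 4 + 3j                      # the vector (4, 3)
t = 2j * w                      # multiply by the pure imaginary 2i
assert t == complex(-2 * w.imag, 2 * w.real)     # (-yv, yu) with y = 2
assert w.real * t.real + w.imag * t.imag == 0.0  # dot product is zero
```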
The important result z z = x2 + y 2 shows that z z is the square of the distance of the point z from the origin. An equation z = z(t) (where t is real) represents the parametric equations of a curve in the plane, since by equating real and imaginary parts we can obtain x = x(t) and y = y(t). Similarly, an expression f (z) = constant can be an implicit representation of a curve in the plane. Tutorial questions — The complex plane 10. Sketch all points z in the complex plane satisfying: (a) Re(z) = 2 (d) Im(z) > 1 (b) Im(z) = −1 −1 (e) e < Re(z) < e (c) Re(z) ≥ 0 (f) −π < Im(z) ≤ π. 11. Draw the triangle with vertices 1, 2, and 2 + 2i in the complex plane. Then draw the triangles obtained by 64 MATH2011/2/4 (i) adding 1 − i to each vertex, (ii) multiplying each vertex by 1 − i, (iii) conjugating each vertex. Note the geometrical effect of each operation. 12. Sketch in the complex plane the three cube roots of 8 and the four fourth roots of −4. (See Question 9.) 13. Show that the average of two complex numbers is the midpoint of the line segment joining them. By considering the parallelogram with vertices 0, z, w, and z + w, show that the diagonals of a parallelogram bisect each other. (Hint: find the midpoint of each diagonal.) 14. Show that the dot product of the position vectors of two complex numbers z and w is equal to Re(z w) or Re(z w). (Hint: write z = x + iy and w = u + iv.) Deduce that if z/w is pure imaginary, then the position vectors of z and w are perpendicular to each other. 15. Describe geometrically the effect of the following operations on a general complex number z, and write down the expression obtained after the operation. (a) conjugation (b) multiplication by i (c) multiplication by −1 (d) adding 2 − i (e) conjugation then multiplication by i (f) multiplication by i followed by conjugation. Why are (e) and (f) different? 16. (a) A curve in the complex plane has parametric equation z = t + it2 . 
Equate real and imaginary parts, eliminate the parameter, and identify the curve.
(b) Another curve has implicit equation Re(z²) = 1. Write the implicit equation in terms of x and y, and identify the curve.

17. Sketch the curve Re((2 + i)z) = −1. Show that a general straight line can be written in the form Re(αz) = c, where α is a complex constant and c is a real constant. (Hint: write α = a + ib.)

Modulus-argument form

We have shown that real-imaginary form (z = x + iy) corresponds to Cartesian co-ordinates for a point in the plane, since x + iy corresponds to the point (x, y). If we now change to polar co-ordinates (r, θ), where we know that x = r cos θ and y = r sin θ, we can substitute for x and y to give z = r(cos θ + i sin θ). We say z is written in modulus-argument form; r is called the modulus of z, written r = |z|, and θ is called the argument of z, written θ = arg(z). The argument of a complex number is not unique, since adding any whole number of 2πs to the angle does not change the point represented. The principal value of the argument is the value satisfying −π < θ ≤ π.

From polar co-ordinates we also know that r = √(x² + y²) and tan θ = y/x. Thus we have |x + iy| = √(x² + y²) and tan arg(x + iy) = y/x if x ≠ 0. Since |z| = r, the modulus of z is the distance between the point z and the origin 0, and similarly by the distance formula |a − b| is the distance between the points a and b. A very important consequence is that zz̄ = |z|². This formula again shows that zz̄ is real and non-negative (like the square of a real number), although, as we remarked previously, z² can take on any value. When real formulae are generalized to complex equivalents, x² must often be replaced by zz̄ (not simply z²) for the formulae to remain true.
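The conversion between real-imaginary and modulus-argument form is built into Python's standard library. The following added example checks the formulae above, including zz̄ = |z|²; note that `cmath.polar` returns the principal value of the argument.

```python
import cmath, math

# Modulus-argument form via the standard library: cmath.polar gives (r, theta)
# with theta the principal argument in (-pi, pi], and cmath.rect inverts it.
z = -1 + 1j
r, theta = cmath.polar(z)
assert abs(r - math.sqrt(2)) < 1e-12           # |z| = sqrt(x^2 + y^2)
assert abs(theta - 3 * math.pi / 4) < 1e-12    # principal value of arg(z)
assert abs(cmath.rect(r, theta) - z) < 1e-12   # back to real-imaginary form
assert abs(z * z.conjugate() - r * r) < 1e-12  # z z-bar = |z|^2
```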
Modulus-argument form is particularly convenient for multiplication, since if we have z = r(cos θ + i sin θ) and w = s(cos φ + i sin φ), then

zw = rs((cos θ cos φ − sin θ sin φ) + i(sin θ cos φ + cos θ sin φ)) = rs(cos(θ + φ) + i sin(θ + φ)),

using compound angle formulae. Thus to multiply two numbers we multiply their moduli and add their arguments. For complex division, since we multiply top and bottom by the conjugate of the denominator, it is easy to see that we divide one modulus by the other and subtract one argument from the other. From the modulus-argument form of the product zw above it follows that

|zw| = |z||w| and |z/w| = |z|/|w|,
arg(zw) = arg(z) + arg(w) and arg(z/w) = arg(z) − arg(w),

although the arguments may not be principal values. The laws for arguments are like the laws of logarithms, and we shall see the reason for this in a later section. If several complex numbers are multiplied by a fixed complex number z, then their moduli are all multiplied by |z| and their arguments increased by arg(z). This means geometrically that the position vectors of the points are all multiplied by a scaling factor |z| (which means an enlargement if |z| > 1 and a reduction if |z| < 1), and rotated through an angle arg(z).

There is no exact formula for the modulus of a sum; all we can say is that |z + w| ≤ |z| + |w|. This is called the Triangle Inequality, since the length of any one side of a triangle cannot exceed the sum of the lengths of the other two sides. The triangle with vertices 0, z, and z + w has sides of lengths |z + w|, |z|, and |w|.

Tutorial questions — Modulus-argument form

18. If z = 1 − i and w = −1 + i√3, express w and z in modulus-argument form (principal values).

19. Show that if z is in modulus-argument form, then the corresponding matrix Z (see Question 4) is |z| times a rotation matrix.

20. Prove by induction on n that (cos θ + i sin θ)ⁿ = cos nθ + i sin nθ. (This is called De Moivre's theorem.)

21.
Obtain expressions for |z̄| and arg(z̄) in terms of |z| and arg(z). Prove that z⁻¹ = z̄/|z|².

22. Sketch all points z in the complex plane satisfying:
(a) |z| = 3 (b) |z − 1| = 2 (c) |z − 2 + i| ≤ 1
(d) 1 < |z| ≤ 2 (e) |2z + 3| > 4 (f) |z| < 1 and 0 ≤ arg(z) ≤ π/2.

23. Explain geometrically the difference, if any, between the statements z = w and |z| = |w|. What is the meaning, if any, of the statements z < w and |z| < |w|?

24. Prove that |a + b|² + |a − b|² = 2(|a|² + |b|²). (Hint: |z|² = zz̄.) Re-state this result as a geometrical theorem about parallelograms.

25. Prove that a − ā is pure imaginary for any complex number a. Hence show that if |z| = |w|, then (z + w)/(z − w) is pure imaginary. (Hint: make the denominator real.) Deduce that the diagonals of a rhombus (i.e. a parallelogram with sides of equal length) intersect at right angles.

26. (i) Use the fact that (modulus)² = (real part)² + (imaginary part)² to identify the curves |z − i| = |z − 2 + 3i| and |z + 1| = 3|z − 2|.
(ii) Show that if α and β are fixed complex numbers and k is a positive real constant, then the equation |z − α| = k|z − β| represents a line if k = 1 and a circle if k ≠ 1.
(iii) Identify the polar curves 2π|z| = arg(z) and Re(z) = |z|² − |z|.

27. (a) If α, β, and z are complex numbers, show, using Figure 1.1, that the angle subtended at z by the line from α to β is equal in magnitude to arg((z − α)/(z − β)).

[Figure 1.1. Angle subtended at a point: the angles arg(z − α) and arg(z − β) at the point z.]

(b) Find a cartesian equation for the curve arg((z − 1)/(z + 1)) = c (constant), and hence describe the curve. (Hint: tan(argument) = (imaginary part)/(real part).) *Which theorem in Euclidean geometry does this illustrate?

28. If z = r(cos θ + i sin θ) and w = s(cos φ + i sin φ), prove that z w̄ = rs(cos(θ − φ) + i sin(θ − φ)).

29. If z and w are as in Question 18, find w³/z² first by using real-imaginary form and then by using modulus-argument form.

30.
Prove the triangle inequality |z + w| ≤ |z| + |w| algebraically as follows: square the left hand side, and then use the results |a|² = aā, a + ā = 2 Re(a), and Re(a) ≤ |a|.

31. If |z| < 1, prove that 1/|1 − z| ≤ 1/(1 − |z|). (Hint: write 1 = z + (1 − z), take moduli, and use the triangle inequality.)

Euler's formula

The equation z(θ) = cos θ + i sin θ (with θ real), which is a parametric equation for the unit circle, has many properties reminiscent of exponential functions. We have shown above that z(θ)z(φ) = z(θ + φ), like the law of exponents. Furthermore,

dz(θ)/dθ = −sin θ + i cos θ = i(cos θ + i sin θ) = iz(θ).

This differential equation has solution z(θ) = Ae^{iθ} (assuming that the usual rules of calculus still hold), and we can show that A = 1 by observing that z(0) = cos 0 + i sin 0 = 1. These results led Euler to the remarkable formula

e^{iθ} = cos θ + i sin θ.

Final confirmation of this result can be obtained by considering the Maclaurin series of each side. Important special cases are e^{iπ/2} = i, e^{iπ} = −1, e^{2iπ} = 1. With the aid of Euler's formula a general complex number can be written x + iy = re^{iθ}, which links real-imaginary and modulus-argument forms, i.e., cartesian and polar co-ordinates. This exponential version of modulus-argument form is preferable, since multiplication and division now obey the usual rules

re^{iθ} × se^{iφ} = rse^{i(θ+φ)} and re^{iθ} ÷ se^{iφ} = (r/s)e^{i(θ−φ)}.

By taking conjugates of Euler's formula (or replacing θ by −θ) we obtain e^{−iθ} = cos θ − i sin θ, and by adding and subtracting the expressions for e^{±iθ} we have

cos θ = (e^{iθ} + e^{−iθ})/2 and sin θ = (e^{iθ} − e^{−iθ})/(2i).

This explains the similar properties of circular and hyperbolic functions, since in first year we defined

cosh θ = (e^{θ} + e^{−θ})/2 and sinh θ = (e^{θ} − e^{−θ})/2.

The above formulae enable us to prove various identities concerning real functions much more simply.
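Euler's formula and the product rule "multiply moduli, add arguments" can both be spot-checked with Python's `cmath`. This is an added example, not part of the original notes.

```python
import cmath, math

# Euler's formula e^{i theta} = cos theta + i sin theta, checked at a few angles,
# together with cos and sin recovered from e^{+-i theta}.
for theta in (0.3, 1.0, 2.5):
    assert abs(cmath.exp(1j * theta)
               - complex(math.cos(theta), math.sin(theta))) < 1e-12
    e_plus, e_minus = cmath.exp(1j * theta), cmath.exp(-1j * theta)
    assert abs((e_plus + e_minus) / 2 - math.cos(theta)) < 1e-12
    assert abs((e_plus - e_minus) / (2j) - math.sin(theta)) < 1e-12

# r e^{i theta} * s e^{i phi} = rs e^{i(theta + phi)}
r, s, th, ph = 2.0, 0.5, 0.7, 1.1
assert abs(r * cmath.exp(1j * th) * s * cmath.exp(1j * ph)
           - r * s * cmath.exp(1j * (th + ph))) < 1e-12
```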
For example, to express cos nθ and/or sin nθ in terms of powers of cos θ or sin θ we may use the Binomial Theorem to write niθ e iθ n n = (e ) = (cos θ + i sin θ) = n X n r=0 r cosn−r θir sinr θ, and then simplify and equate real or imaginary parts. For the reverse process, we have for example that n cos θ = ( 21 (eiθ −iθ +e n )) = 2 −n n X n (n−r)iθ −riθ e e , r r=0 which can also be simplified. Other identities are obtained by substituting z = reiθ in familiar series expressions (geometric, binomial, or Maclaurin) derived in first year, and then equating real or imaginary parts. These identities, expressing functions in terms of cosines or sines of multiple angles, are called Fourier series, and we shall meet them again in Algebra Chapter 4. Tutorial questions — Euler’s formula 32. Express in the form reiθ : sin α − i cos α, e2π+i , eiπ/6 + eiπ/3 . P n ∞ 33. Put z = iθ in the Maclaurin series expansion ez = n=0 zn! . Split the right hand side into sums over even and odd integers, and note that if n = 2k (even), then in = (−1)k , while if n = 2k + 1 (odd), then in = (−1)k i. Hence verify Euler’s formula. (Hint: use the Maclaurin series for sin θ and cos θ.) 34. Express cos 4θ and sin 5θ in terms of powers of cos θ and sin θ respectively, and express sin5 θ and cos6 θ in terms of multiple angles. 35. Find the sum to n terms of the series below. (They all arise from geometric series.) (a) sin θ + sin 3θ + sin 5θ + · · · 2 (c) sin α + x sin(α + β) + x sin(α + 2β) + · · · 36. Find the sum to infinity of the Fourier series: (a) cos θ + 1 3 cos 2θ + 1 9 (b) cos θ + 1 3 cos 2θ + 1 9 cos 3θ + · · · (d) cosh 1 + cosh 2 + cosh 3 + · · · . cos 3θ + · · · (θ real) (b) sin α + x sin(α + β) + x2 sin(α + 2β) + · · · (α and β real, and |x| < 1) 1 1 (c) 1+cos θ+ 2! cos 2θ+ 3! cos 3θ+· · · . (Hint: this is the real part of a familiar Maclaurin series.) Algebra Chapter 1 69 Roots and polynomials We showed earlier that any non-zero complex number has two square roots. 
We can now show that it has n n-th roots for every natural number n. If a = re^{iθ}, then a = re^{i(θ+2kπ)}, so

a^{1/n} = (re^{i(θ+2kπ)})^{1/n} = r^{1/n} e^{iθ/n} e^{2kiπ/n}.

Here k can be any integer, but increasing k by n increases the argument of a^{1/n} by 2π, and therefore defines the same value of the n-th root. The n different roots can thus be found by taking any n successive values of k; the roots all lie on the circle of radius |a|^{1/n}, and are equally spaced like spokes on a wheel, since each one is obtained from the previous one by multiplying by e^{2iπ/n}, i.e., by rotating through 1/n of a revolution.

From the above it follows that the equation zⁿ = a has n solutions, i.e., that the polynomial zⁿ − a factorizes into n linear factors, since by the Remainder Theorem every solution z = α of a polynomial equation p(z) = 0 corresponds to a linear factor (z − α) of the polynomial p(z). Gauss proved the Fundamental Theorem of Algebra, which states that if complex coefficients are allowed, then every polynomial of degree n factorizes into n linear factors. Assuming this result, we can now prove the following theorem, which was used in finding the null space of a D-operator.

Theorem. If a polynomial p(z) has real coefficients, then the non-real solutions of p(z) = 0 (if any) occur in conjugate pairs. In other words, if p(α) = 0 then p(ᾱ) = 0 also.

Proof. Suppose p(z) = Σ_{r=0}^{n} aᵣzʳ, where the coefficients aᵣ are all real, so āᵣ = aᵣ. Then

0 = 0̄ = (p(α))‾ = (aₙαⁿ + ··· + a₁α + a₀)‾ = āₙ(ᾱ)ⁿ + ··· + ā₁ᾱ + ā₀ = aₙ(ᾱ)ⁿ + ··· + a₁ᾱ + a₀ = p(ᾱ),

as required. It follows, if α is not real, that p(z) is divisible by the product (z − α)(z − ᾱ) = z² − 2z Re(α) + |α|², which is a quadratic with real coefficients and negative discriminant. This explains why every polynomial with real coefficients factorizes into real linear or quadratic factors, as we used in first year partial fractions.
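The "spokes on a wheel" picture of the n-th roots translates directly into a short computation. This added sketch uses the cube roots of 8 from Question 9(a) as a test case.

```python
import cmath, math

# The n n-th roots of a = r e^{i theta} are r^{1/n} e^{i(theta + 2k pi)/n}
# for k = 0, ..., n-1, equally spaced on the circle of radius |a|^{1/n}.
def nth_roots(a, n):
    r, theta = cmath.polar(a)
    return [r ** (1 / n) * cmath.exp(1j * (theta + 2 * k * math.pi) / n)
            for k in range(n)]

roots = nth_roots(8, 3)                              # cube roots of 8
assert all(abs(z ** 3 - 8) < 1e-9 for z in roots)
assert abs(roots[0] - 2) < 1e-9                      # the obvious real root
assert all(abs(abs(z) - 2) < 1e-9 for z in roots)    # all on the circle |z| = 2
```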
Partial fractions can sometimes be simplified by using complex numbers, since then all factors are linear. 70 MATH2011/2/4 Tutorial questions — Roots and polynomials 37. Find in modulus-argument form and plot on the complex plane all the n-th roots of a for the following values of n and a: (a) n = 3 and a = 1 (b) n = 4 and a = 1 + i (c) n = 3 and a = −2 + 2i (d) n = 5 and a = i. 38. Find the cube roots of 8 and the fourth roots of −4. Compare with your answers to Question 9. 3z + 2 −w + 2 39. If w = , show that z = . Hence solve the equation (3z+2)3 = −27(2z+1)3 2z + 1 2w − 3 by first expressing it in terms of w. 40. If z is an n-th root of 1 but z 6= 1, show that 1 + z + z 2 + · · · + z n−1 = 0. (Hint: multiply through by z −1.) Solve the equation with n = 5, and hence factorize 1 +z +z 2 +z 3 +z 4 into linear factors with complex coefficients, and then into quadratic factors with real 4π 1 coefficients. By equating coefficients of z, prove that cos 2π 5 + cos 5 = − 2 . 41. By first putting w = z 3 , solve the equation z 6 + 4z 3 + 8 = 0. Indicate all six roots in the complex plane. Hence factorize z 6 + 4z 3 + 8 into quadratic factors with real coefficients. 42. Factorize the polynomials below into linear factors, given that z + 1 + i is a factor of each of them: (a) z 3 + (−3 + i)z 2 + (1 − 4i)z + 5 + 5i (b) z 4 + 4z 3 + 11z 2 + 14z + 10. 43. Find complex partial fractions for the following rational functions: 4 16 (a) 4 (b) 4 . z −1 z +4 Complex exponentials, logarithms, and powers Complex exponentials are defined using Euler’s formula: ez = ex+iy = ex eiy = ex cos y + iex sin y. Complex circular and hyperbolic functions can then be obtained from the familiar expressions in terms of e±iz and e±z . Complex logarithms and powers are similarly defined by means of Euler’s formula, but are in general many-valued, because of the fact that the argument is not unique. 
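A numeric taste of this many-valuedness, before it is made precise below: the added sketch lists several values of ln i and checks that each exponentiates back to i. Note that Python's `cmath.log` returns only the principal value.

```python
import cmath, math

# All values of ln z differ by integer multiples of 2 pi i.
principal = cmath.log(1j)
assert abs(principal - 1j * math.pi / 2) < 1e-12    # ln|i| = 0, arg(i) = pi/2

values = [principal + 2j * math.pi * k for k in (-1, 0, 1, 2)]
for v in values:
    assert abs(cmath.exp(v) - 1j) < 1e-12           # each value maps back to i

# i^i = e^{i ln i} is real; Python returns the principal value e^{-pi/2}.
assert abs(1j ** 1j - math.exp(-math.pi / 2)) < 1e-12
```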
Since the argument can be increased by any number of complete revolutions, the most general modulus-argument form is z = rei(θ+2kπ) . This immediately gives ln z = ln r + i(θ + 2kπ) = ln |z| + i(arg(z) + 2kπ), Algebra Chapter 1 71 so the infinitely many values of ln z all have the same real part, and therefore lie equally spaced on a vertical line. The principal value of ln z is the value arising from the principal value of the argument. Finally, the values of the complex power z w are given by z w = ew ln z = ew(ln |z|+i arg(z))+2kπi) = ew(ln |z|+i arg(z)) e(2πiw)k . The power z w has a unique value if e2πiw = 1, i.e., if w is an integer. In general, the many values lie on a logarithmic spiral of the form αeβt with α = ew(ln |z|+i arg(z)) and β = 2πiw. The spiral closes down to a circle if |e2πiw | = 1, and opens up to a straight line if e2πiw is pure real. The principal value of z w is the value arising from the principal value of the argument. Tutorial questions — Complex exponentials, logarithms, and powers 44. Prove from the definitions that: (a) cosh2 z − sinh2 z = 1 (b) cosh(z + w) = cosh z cosh w + sinh z sinh w − 1) (d) cosh2 z = 12 (cosh 2z + 1) (c) sinh z = (e) (cosh z + sinh z) = cosh nz + sinh nz (f) sinh(z + 2kπi) = sinh z 2z e −1 (g) tanh z = 2z (h) sech2 z + tanh2 z = 1. e +1 45. Evaluate: √ √ (a) cosh(2 ln(2 + 3)) (b) tanh( 21 π 2eiπ/4 ) (c) sin( 2π (d) tan(π + 12 i ln 2). 3 + i ln 5) 2 1 2 (cosh 2z n 46. If cos(x + iy) = u + iv, where x, y, u, and v are real, express u and v as functions of x and y. Hence show that | cos(x + iy)|2 = cos2 x + sinh2 y = − sin2 x + cosh2 y. Similarly, if z = x + iy, write sin z in real-imaginary form, and show that sin z 6= Im(eiz ). 47. Evaluate (in general) and plot in the complex plane, indicating which is the principal value: (a) ln i (b) ln(1 + i) √ (c) ln(−3 + i 3) (d) ln 1 1+i 1 ). (f) i(1/3) +i (g) (1 + i)(1/4)−i (h) tan( ln 2i 1 − i √ 48. 
Show that |1 + z| = 1 + 2r cos θ + r 2 and arg(1 + z) = arctan r sin θ/(1 + r cos θ) if P∞ n z = reiθ . By substituting in the Maclaurin series ln(1+z) = − n=1 (−1)n zn (assuming r < 1) and equating real and imaginary parts, show that (e) ii 1 2 2 ln(1 + 2r cos θ + r ) = − ∞ X n=1 (−r)n cos nθ n ∞ X sin nθ r sin θ =− (−r)n . arctan 1 + r cos θ n n=1 and 72 MATH2011/2/4 By letting r → 1, and using half angle formulae, show that ln(2| cos 1 2 θ|) = ∞ X (−1) n=1 n+1 cos nθ n and arctan(tan 1 2 θ) = ∞ X (−1)n+1 n=1 sin nθ . n Sketch the graph of the function arctan(tan 12 θ) for −2π ≤ θ ≤ 2π. (It is called a sawtooth wave, and the right hand side is its Fourier series representation.) Answers 1. (a) p2 −q 2 p2 +q 2 + i p22pq +q 2 , (b) 8ipq(p2 −q 2 ) (p2 +q 2 )2 , √ √ (c) 21 ( 3 − 1) + 12 ( 3 + 1)i, (d) 1 10 (3 − 29i), sin θ 1 − i), (f) 1 − i 1+cos θ = 1 − i tan 2 θ. u v 4. (i) W = . −v u 5. Conjugation corresponds to transposing or forming the adjoint. 7. (b) 18 (−1 + i)z + 18 (3 − i)z, (c) (z − z 2 )/(zz − 1). p p √ √ (b) −1, i; 1 + 8. (a) ±(1 + i); ±(2 − i); ±(2 + 3i); ±( 1 + 2 + i −1 + 2). p p √ √ 2i, −1 + 3i; 4 − i, −2 + 3i; 1 + i ± ( 1 + 2 + i −1 + 2) . √ √ 3 3 (b) ±2i = ±1 ± i. 9. (a) 2, −1 ± i 3, z − 2 = (z − 2)(z 2 + 2z + 4) 10. See sketches overleaf. √ 12. Spaced at 120◦ apart on a circle of radius 2, and at 90◦ on a circle of radius 2. 13. Average = 12 (x+u)+ 21 i(y+v); midpoint = 12 (x+u), 12 (y+v) . 12 0+(z+w) = 21 (z+w). 14. Re(z w) = Re(z w) = xu + yv = (x, y) · (u, v). z/w = (z w)/(u2 + v 2 ). (e) 1 √ (1 2 2 (c) Rotation through 15. (a) Reflection in real axis; z, (b) Rotation through π2 ; iz, π; −z, (d) Shift 2 units right and 1 unit down; z + 2 − i, (e) Reflection in line y = x; iz, (f) Reflection in line y = −x; iz = −iz. Figure 1.2. Sketches for Question 10. Algebra Chapter 1 73 16. (a) y = x2 , parabola. (b) z 2 = x2 − y 2 + 2ixy, so Re(z 2 ) = x2 − y 2 = 1, hyperbola. 17. Line y = 2x + 1. For y = mx + c let a = −m − i. 
For x = c (vertical line) let α = 1. √ 2π 18. z = 2 cos(− π4 ) + i sin(− π4 ) , w = 2(cos 2π 3 + i sin 3 ). cos θ sin θ 19. Z = r . − sin θ cos θ 21. |z| = |z|, arg(z) = − arg(z). 22. See sketches. Figure 1.3. Sketches for Question 22. 23. z = w means z and w are the same point, |z| = |w| means z and w are the same distance from the origin. z < w is meaningless, |z| < |w| means z is nearer to the origin than w. 24. The sum of the squares on the diagonals equals twice the sum of the squares on two adjacent sides. 25. a − a = 2i Im a. z+w z−w = |z − w|−2 (z w − z w). If the vertices are at the points 0, z, z + w, w, then the diagonal vectors of the rhombus are z + w and z − w, which are perpendicular because of Question 14. 26. (i) x2 + (y − 1)2 = (x − 2)2 + (y + 3)2 , giving y = 21 (x − 3) (line). 35 (x + 1)2 + y 2 = 9 (x − 2)2 + y 2 , giving x2 + y 2 − 19 4 x + 8 = 0. Complete the square to 9 get circle centre at ( 19 8 , 0) and radius 8 . (ii) If α = a1 + ia2 and β = b1 + ib2 , then (x − a1 )2 + (y − a2 )2 = k 2 (x − b1 )2 + (y − b2 )2 . Coefficient of (x2 + y 2 ) vanishes if k = 1. (iii) |z| = r and arg z = θ, so r = θ/2π (Archimedean spiral) and r = 1 + cos θ (cardioid). . 27. (a) Angle = arg(z−α) − arg(z−β) = arg z−α z−β (b) 2y x2 +y 2 −1 = tan c, giving x2 + y 2 − 2y cot c − 1 = 0, circle with centre at i cot c on the 74 MATH2011/2/4 imaginary axis and passing through points ±1. *Angle subtended by chord of circle is the same at any point on the arc. 29. 4i. 32. (a) ei(α−π/2) , √ 1+ √ 3 eiπ/4 . 2 5 e2π ei , 34. cos 4θ = 8 cos4 θ − 8 cos2 θ + 1, sin5 θ = 35. (a) (c) 1 cos6 16 (sin 5θ − 5 sin 3θ + 10 sin θ), 2 sin θ−sin(2n+1)θ+sin(2n−1)θ sin2 nθ = , 2(1−cos 2θ) sin θ θ= sin 5θ = 16 sin θ − 20 sin3 θ + 5 sin θ, 1 32 (cos 6θ + 6 cos 4θ + 15 cos 2θ + 10). 1−n cos(n+1)θ+3−n cos nθ) (b) 3(3 cos θ−1−3 2(5−3 , cos θ) xn+1 sin α+(n−1)β −xn sin α+nβ −x sin α−β +sin α , x2 −2x cos β+1 36. 
(a) 3(3 cos θ−1) 2(5−3 cos θ) , ±2iπ/3 (b) −x sin α−β +sin α x2 −2x cos β+1 , (d) en+1 −e−n −e+1 . 2(e−1) (c) Re(ecos θ+i sin θ ) = ecos θ cos(sin θ). √ 37. (a) 1, e or 1, 12 (−1 ± i 3), √ √ 11iπ/12 √ −5iπ/12 2e 2e (c) 2eiπ/4 , , , (d) eiπ/10 , √ √ 38. 2, 2e±2iπ/3 and 2e±iπ/4 , 2e±3iπ/4 . 39. (a) − 59 or − 12 ± 6√i 3 . (b) ±21/8 eiπ/16 , ±21/8 e9iπ/16 , i, e9iπ/10 , e−3iπ/10 , e−7iπ/10 . 40. z = e±2iπ/5 or z = e±4iπ/5 , (z 2 − 2z cos 2π + 1)(z 2 − 2z cos 4π 5 + 1). √ ±5iπ/12 √5 ±11iπ/12 √ ±iπ/4 = 1 ± i or z = 2e or z = 2e 41. z = 2e √ √ 5π 2 2 2 + 2). (z − 2z + 2)(z − 2 2z cos 12 + 2)(z − 2 2z cos 11π 12 42. (a) (z + 1 + i)(z − 2 + i)(z − 2 − i), (b) (z + 1 + i)(z + 1 − i)(z + 1 + 2i)(z + 1 − 2i). 1 i i 1+i 1−i −1+i −1−i 1 − z+1 + z−i − z+i , (b) z+1+i + z+1−i + z−1+i + z−1−i . 43. (a) z−1 √ π +1 1 45. (a) 7, (b) eeπ −1 , (c) 10 (13 3 − 12i), (d) 3i . 46. u = cos x cosh y, v = − sin x sinh y. sin z = sin x cosh y + i cos x sinh y. Im(eiz ) = e−y sin x. 47. (a) i( π2 + 2kπ), 1 π 2 ln 2 + i( 4 + 2kπ), −π(3−i)/6 2π(−1+i/3) k (b) (e) e−π(1+4k)/2 , (f) e π (h) tan( 4 + kπ) = 1. e (c) , 1 2 ln 12 + i( 5π 6 + 2kπ), (d) 2kiπ, k (g) 21/8 e(π/4)+i(π−8 ln 2)/16) ie2π , Algebra Chapter 1 æ 75 74 MATH2011/2/4 æ Algebra Chapter 2 75 Algebra Chapter 2 Convergence of series Indeterminate forms Before dealing with convergence itself, we need some techniques for dealing with indeterminate forms, i.e., limits of the form 00 . In first year we proved L’Hôpital’s Rule. If lim f (x) = 0 and lim g(x) = 0, and if f (x) and g(x) are sufficiently x→a x→a smooth near x = a, then f (x) f ′ (x) lim = lim ′ , x→a g(x) x→a g (x) provided that the limit on the right hand side exists. Proof. 
The functions f(x) and g(x) can be represented near x = a as Taylor series with constant term 0 = f(a) = g(a), so

f(x)/g(x) = (0 + (x − a)f′(a) + ½(x − a)²f″(a) + ···)/(0 + (x − a)g′(a) + ½(x − a)²g″(a) + ···) = (f′(a) + ½(x − a)f″(a) + ···)/(g′(a) + ½(x − a)g″(a) + ···),

by cancelling (x − a). The result follows by letting x → a.

If the limit on the right hand side of L'Hôpital's Rule is again an indeterminate form, then the process of differentiating numerator and denominator can be repeated. L'Hôpital's Rule can also be used to evaluate limits of the form 1^∞, by taking logarithms beforehand, and exponentials afterwards. L'Hôpital's Rule is also valid for limits of the form ∞/∞. For let F(x) = 1/f(x) and G(x) = 1/g(x). Then f(x)/g(x) = G(x)/F(x), to which L'Hôpital's Rule can be applied, since it is of the form 0/0. Thus

lim_{x→a} f(x)/g(x) = lim_{x→a} G(x)/F(x) = lim_{x→a} G′(x)/F′(x) = lim_{x→a} (g(x)⁻²g′(x))/(f(x)⁻²f′(x)) = (lim_{x→a} f(x)/g(x))² lim_{x→a} g′(x)/f′(x),

from which the result follows. Indeterminate forms f(x)/g(x) as x → ∞ can also be determined by L'Hôpital's Rule, by substituting x = 1/h and letting h → 0, provided f(1/h) and g(1/h) are smooth at and near h = 0. The proof appears as a tutorial question.

Tutorial questions — Indeterminate forms

1. Evaluate the following limits:
(a) lim_{x→0} (2 sin x − tan x)/(eˣ − 1)
(b) lim_{x→0} (sin x − tan x)/(ln(1 + x) − x)
(c) lim_{x→0} (e^{5x} − 2x)^{1/x}
(d) lim_{x→0} (2 ln(1 + x) + x² − 2x)/x³
(e) lim_{x→π/2} (sec x − tan x)
(f) lim_{x→1} (1 + 2 ln x)^{1/(x−1)}
(g) lim_{x→0} xˣ
(h) lim_{x→π} (sin x)/(1 − 2 sin(x/6))
(i) lim_{x→π/2} (cosec x)/(1 + cot x)
(j) lim_{x→0} (cosec x)/(1 + cot x)
(k) lim_{x→e} (ln(ln x))/(ln x − 1)
(l) lim_{x→1} ln(ln x) sin πx
(m) lim_{x→0} (arctan x − x)/x³
(n) lim_{z→i} (z − i)/(e^{πz} + 1).

2. By first taking logarithms, evaluate the limits of the following expressions as n → ∞:
(a) n^{1/n} (b) n^{1/√n} (c) n^{1/ln n} (d) n^{1/√(ln n)}.

3.
Prove that L’Hôpital’s Rule can (under suitable conditions) be used for indeterminate forms f (x) g(x) as x → ∞ as follows. Let h = 1 , x and let F (h) = f (x) and G(h) = g(x). Show that F ′ (h) = −x2 f ′ (x). (Hint: put y = f (x) = F (h) and use the fact that dy dx = dy dh dh dx .) Now apply L’Hôpital’s Rule to F (h) G(h) . ekx = ∞ for positive k and m, x→∞ xm however small k may be and however large m may be. (Hint: apply L’Hôpital’s Rule n 4. Prove that exponentials beat powers, i.e., prove that lim times, where n is an integer greater than m.) Deduce that powers beat logarithms by substituting y = ex . Prove that lim (1 + na )bn = eab . (Hint: take logarithms.) n→∞ Convergence of series I Convergence and divergence of series have been seen before: a series P an is said to converge to the number S if the partial sums a1 + a2 + · · · + aN tend to S as N → ∞. Roughly speaking, this means that the more terms you add on, the closer you get to the number S, which is sometimes called the sum to infinity. A series that does not converge is said to diverge. We shall describe some tests to determine whether or not a series converges; they are important because not many sums to infinity can be found exactly. We often use a computer to approximate a sum to infinity by the partial sum for say N = 100 or 1000. This is called truncating the series, but it is meaningless if the series does not converge. Even if the series converges, it is valuable to know how fast it converges, so that we can estimate the truncation error, which is the difference between Algebra Chapter 2 77 the sum to infinity and the partial sum. Note that in testing for convergence it is only the infinite tail of the series that matters: we can ignore any number of terms at the beginning, because a finite sum must have a finite value, and therefore cannot affect the overall convergence or divergence, though it obviously does affect the sum to infinity, if it exists. 
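Partial sums and truncation error are easy to see concretely. The added sketch below uses the geometric series with ratio 1/2, whose sum to infinity and truncation error are both known exactly.

```python
# Partial sums of the geometric series 1 + r + r^2 + ... with r = 1/2
# approach the sum to infinity 1/(1 - r) = 2; truncating after N terms
# leaves an error of exactly r^N / (1 - r).
def partial_sum(N, r=0.5):
    return sum(r ** n for n in range(N))   # 1 + r + ... + r^{N-1}

limit = 1 / (1 - 0.5)
for N in (5, 10, 20):
    error = limit - partial_sum(N)
    assert abs(error - 0.5 ** N / (1 - 0.5)) < 1e-12   # matches r^N / (1 - r)
assert abs(partial_sum(50) - limit) < 1e-12            # already very close
```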
It has been shown that if a series converges, then its n-th term tends to zero as n → ∞. This applies to series with real or complex terms, and it can be used as a basic test for divergence, because it can be stated in the following form.

Divergence Test. If aₙ does not tend to 0 as n → ∞, then Σ aₙ diverges.

Notice that if aₙ does tend to zero, then there is no conclusion: the series may still converge or diverge. There is another important general result, which also applies to series with real or complex terms.

Theorem. If Σ |aₙ| converges, then Σ aₙ also converges, and we say Σ aₙ is absolutely convergent.

This result seems obvious, since |a₁ + a₂ + ···| ≤ |a₁| + |a₂| + ···, but the proof is rather subtle, and will be omitted. The next test for series enables us to deal with fast converging or diverging series, usually those whose terms involve exponentials or factorials of n. The simplest are the geometric series Σ arⁿ⁻¹, in which r is the common ratio between the (n + 1)-th term and the n-th term. A geometric series with common ratio r converges if |r| < 1 and diverges if |r| ≥ 1. Almost the same applies to series in which the ratio tends (in modulus) to a constant, even if the ratio is not exactly constant.

Ratio Test. Suppose |a_{n+1}/aₙ| → L as n → ∞. Then the series Σ aₙ converges if L < 1 and diverges if L > 1. There is no conclusion if L = 1.

The proof is technical, and will be omitted, but the result is the same as that for geometric series, except that in the border area between convergence and divergence (i.e., L = 1) there is no conclusion. For this reason the Ratio Test is useless unless n appears in an exponent or factorial. Notice that since the Ratio Test deals with the absolute value of the ratio, it is actually a test for absolute convergence, and can also be applied to series with complex terms. The Ratio Test is particularly valuable for power series, i.e., series like Maclaurin or Taylor series involving powers of z or z − a.
The Ratio Test, if applied to a power series $\sum c_n (z-a)^n$, can lead to the conclusion that the power series converges if $|z-a| < R$, say, and diverges if $|z-a| > R$. This means that the series converges if the point $z$ is inside the circle with centre $a$ and radius $R$, and diverges if $z$ is outside the circle. There is no conclusion from the Ratio Test if $|z-a| = R$, i.e., if $z$ is on the circle, which is called the circle of convergence of the series.

Tutorial questions — Convergence of series I

5. Use one of the tests above to determine, if possible, whether the following series converge or diverge:
(a) $\sum \frac{n}{2^n}$ (b) $\sum \frac{3^n}{n^2}$ (c) $\sum \frac{(-3)^n \sqrt{n}}{n!}$ (d) $\sum \frac{1}{n}$ (e) $\sum \left(1 + \frac{1}{n}\right)^n$ (f) $\sum \binom{2n}{n} 2^{-2n}$ (g) $\sum \frac{n!}{n^n}$ (h) $\sum \frac{1}{n}\left(\frac{12(1+i)}{17}\right)^n$ (i) $\sum \frac{1}{n}\left(\frac{29(1+i)}{41}\right)^n$.

6. Use the Ratio Test to show that the binomial series $\sum \binom{\alpha}{n} z^n$ converges for $|z| < 1$ and diverges for $|z| > 1$. (The binomial coefficient $\binom{\alpha}{n} = \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!}$, and if the series converges, then its sum is $(1+z)^\alpha$.)

7. Use the Ratio Test to show that the Maclaurin series for $e^z$ converges for all $z$. Deduce that $z^n/n! \to 0$ as $n \to \infty$. (Hint: consider the $n$th term of the series.) This shows that factorials beat exponentials (which beat powers, which beat logarithms).

8. Use the Ratio Test to find the circles of convergence of the following power series. Then sketch the circles in the complex $z$ plane.
(a) $\sum n z^n$ (b) $\sum \frac{(2z-1)^n}{n}$ (c) $\sum \frac{n!\, z^n}{n^n}$ (d) $\sum \frac{(nz)^n}{n!}$ (e) $\sum \frac{(1)(3)(5)\cdots(2n-1)}{n!}\,(z-i)^{2n}$ (f) $\sum \binom{2n}{n} (z-2)^n$.
Show that $\binom{2n}{n} = (-4)^n \binom{-1/2}{n}$. Deduce that $\sum_{n=0}^{\infty} \binom{2n}{n} (z-2)^n = (9-4z)^{-1/2}$ where it converges. Indicate the point on the circle of convergence where the right hand side is undefined.

9. If $L < 1$, where $L$ is the limit found in the Ratio Test, show that if $\sum_{n=0}^{\infty} a_n$ is approximated by the partial sum $\sum_{n=0}^{N} a_n$, then the absolute value of the truncation error can be estimated by $\frac{|a_{N+1}|}{1-L}$.
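The estimate of Question 9 can be tried out numerically. The Python sketch below (an illustrative aside, not part of the original text) uses $a_n = n/2^n$, for which $L = \frac{1}{2}$ and the sum to infinity is 2; since the ratios only tend to $L$ rather than equalling it, the geometric bound is an approximation, slightly below the true error here.

```python
# Truncation-error estimate |a_{N+1}| / (1 - L) for a_n = n / 2^n,
# whose ratio tends to L = 1/2 and whose sum to infinity is 2.
def a(n):
    return n / 2.0**n

N = 10
partial = sum(a(n) for n in range(N + 1))
actual_error = 2.0 - partial           # true truncation error
estimate = a(N + 1) / (1.0 - 0.5)      # |a_{N+1}| / (1 - L)
```

For $N = 10$ the estimate agrees with the true error to within about ten per cent.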
(Hint: replace the $(N+1)$th and later terms in modulus by a geometric series with common ratio $L$, since $|a_{N+2}| \approx L\,|a_{N+1}|$, and so on.)

Convergence of series II

The next tests apply to series for which the Ratio Test fails, i.e., those for which the ratio of successive terms tends in modulus to 1. This includes series in which the $n$th term involves only powers or logarithms of $n$. Unfortunately these tests are restricted to series with positive terms only, which behave more predictably than general series.

Lemma. If $\sum a_n$ is a series with positive terms, then either it converges, or its partial sums tend to infinity.

The important point is that the partial sums cannot oscillate, and the reason is clear: since the terms are all positive, adding on an extra term must always increase the value of the partial sum, so the partial sums are increasing. (This is analogous to the fact that if the derivative of a function is positive, then the function itself is increasing.) There are now only two possibilities: either the partial sums continue increasing to infinity, or they must level out to a limiting value. The formal proof of this seemingly obvious result is again too theoretical for us.

The simplest series of this type are sums of a constant power of $n$.

Theorem (the $p$-series). If $p$ is constant, then the series $\sum \frac{1}{n^p}$ diverges if $p \le 1$ and converges if $p > 1$.

Figure 2.1. Comparison between series and integrals (the graphs $y = x^{-p}$ and $y = (x+1)^{-p}$, joined by horizontal steps)

Proof. If $p \le 0$, then the $n$th term $n^{-p}$ does not tend to zero, so the series diverges. If $p > 0$, we can draw the graphs $y = x^{-p}$ for $x \ge 1$ and $y = (x+1)^{-p}$ for $x \ge 0$, as shown in Figure 2.1. By joining the graphs with horizontal lines, as shown, we obtain the graph of a step function lying between the two graphs. The area under the step function between $x = 0$ and $x = N$ is the sum of areas of rectangles with base length 1 and heights $1^{-p}, 2^{-p}, \ldots, N^{-p}$ respectively.
The areas under the other two graphs can be found by integration. Since the step function lies between $(x+1)^{-p}$ and $x^{-p}$, it follows that
$$\int_0^N (x+1)^{-p}\,dx \;<\; \sum_{n=1}^N n^{-p} \;<\; 1 + \int_1^N x^{-p}\,dx.$$
If $p \ne 1$ then this becomes
$$\frac{1}{1-p}\left((N+1)^{1-p} - 1\right) \;<\; \sum_{n=1}^N n^{-p} \;<\; 1 + \frac{1}{1-p}\left(N^{1-p} - 1\right).$$
If $0 < p < 1$, then both $(N+1)^{1-p}$ and $N^{1-p}$ tend to infinity (because the exponent $1-p$ is positive), so the partial sums also tend to infinity, and the series diverges. On the other hand, if $p > 1$, then both $(N+1)^{1-p}$ and $N^{1-p}$ tend to zero (because the exponent $1-p$ is negative), so the expressions on the left and right tend to $\frac{1}{p-1}$ and $1 + \frac{1}{p-1}$ respectively. Thus the partial sums cannot tend to infinity, and, by the Lemma, the series must converge to a value lying between $\frac{1}{p-1}$ and $1 + \frac{1}{p-1}$.

The case $p = 1$ must be treated separately, because the integrals involve logarithms. The inequalities become
$$\ln(N+1) \;<\; \sum_{n=1}^N n^{-1} \;<\; 1 + \ln N,$$
and since $\ln N \to \infty$ as $N \to \infty$ it follows that the series diverges when $p = 1$.

Notice that the infinite series $\sum_{n=1}^{\infty} n^{-p}$ converges for the same values of $p$ as the infinite integral $\int_1^{\infty} x^{-p}\,dx$, as shown in Calculus Chapter 1. This theorem also shows that the divergent series $\sum \frac{1}{n}$, which is called the harmonic series, represents, roughly speaking, a borderline between convergence and divergence. It diverges incredibly slowly: if a computer summed ten million terms a second, and kept on calculating for a year, then the partial sum would be about $\ln(10^7 \times 60 \times 60 \times 24 \times 365)$, which is less than 34, yet the partial sums eventually go to infinity.

The $p$-series are used as a set of standards by which other series of positive terms can be compared. There are two tests for comparing series having positive terms, each saying essentially that if such a series converges, then so does any such series with smaller terms, and if it diverges, then so does any series with larger terms.

Comparison Test.
If $0 \le a_n \le b_n$ for all $n$ sufficiently large, then:
(i) if $\sum b_n$ converges, then $\sum a_n$ converges,
(ii) if $\sum a_n$ diverges, then $\sum b_n$ diverges.
(The phrase "all $n$ sufficiently large" draws attention to the fact that only the tail of the series matters when testing for convergence or divergence.)

Limit Comparison Test. If $a_n$ and $b_n$ are positive, and $\frac{a_n}{b_n}$ tends to a non-zero number $K$ as $n \to \infty$, then $\sum a_n$ and $\sum b_n$ either both converge or both diverge.

(Notice the difference between this and the Ratio Test: this test compares corresponding terms of two different series of positive real numbers, while the Ratio Test compares the ratio of absolute values of successive terms of the same series. In this test there is a conclusion for any limit $K$ not equal to zero or infinity, while the Ratio Test fails only when the limit $L = 1$.)

Proof. If $a_n \le b_n$ for all $n \ge M$ say, then $\sum_{n=M}^N a_n \le \sum_{n=M}^N b_n$ for all $N \ge M$, and the Comparison Test follows by the Lemma. For the Limit Comparison Test, note that if $\frac{a_n}{b_n} \to K$ then for $n$ sufficiently large we must have at least $\frac{1}{2}K \le \frac{a_n}{b_n} \le 2K$. This gives $b_n \le \frac{2}{K}\,a_n$ and $a_n \le 2K\,b_n$, and the result follows from the Comparison Test.

To use these tests in practice, we have to approximate a given series by a simpler series (usually a $p$-series) whose convergence or divergence is known. We can use inequalities like the fact that powers beat logarithms (i.e., eventually $n^m > (\ln n)^k$ for any positive constants $k$ and $m$). In addition, based on the fact that we can assume $n$ is large, so $\frac{1}{n}$ is small, we can use first or second approximations to obtain a simpler series. Once it has been found, we apply one of the comparison tests.

The final test for series with positive terms is also a kind of comparison, using the fact that integrals are in general easier to evaluate than sums.
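As a small numerical aside (Python, not part of the original text), the hypothesis of the Limit Comparison Test is easy to observe: taking $a_n = \sin\frac{1}{n}$ and $b_n = \frac{1}{n}$, the quotient $a_n/b_n$ tends to $K = 1$, so $\sum \sin\frac{1}{n}$ diverges along with the harmonic series.

```python
import math

# The quotient a_n / b_n for a_n = sin(1/n), b_n = 1/n tends to K = 1,
# so the two series share their fate (both diverge here).
quotients = [math.sin(1.0 / n) / (1.0 / n) for n in (1, 10, 1000)]
```

The quotients climb towards 1 (by the first approximation $\sin x \approx x$ for small $x$), confirming that the comparison series $\sum \frac{1}{n}$ was well chosen.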
It generalizes the proof used for $p$-series, and emphasizes the link between convergence of series and convergence of integrals, as discussed in Calculus Chapter 1.

Integral Test. If $f(x)$ is a positive decreasing function for $x \ge M$, then $\sum_{n=M}^{\infty} f(n)$ and $\int_M^{\infty} f(x)\,dx$ either both converge or both diverge.

Proof. Replace the graphs in Figure 2.1 by the graph $y = f(x)$ for $x \ge M$ and the graph $y = f(x+1)$ for $x \ge M-1$. Then by the same argument we obtain
$$\int_{M-1}^N f(x+1)\,dx \;<\; \sum_{n=M}^N f(n) \;<\; f(M) + \int_M^N f(x)\,dx,$$
and the result follows by the Lemma.

Tutorial questions — Convergence of series II

10. Use a second approximation to $\ln(1+x)$ (or a sketch graph) to show that $x \ge \ln(1+x)$. Deduce that $\sum_{n=1}^N \frac{1}{n} > \sum_{n=1}^N \ln\left(1 + \frac{1}{n}\right)$. Rewrite the sum on the right hand side without sigma notation, and show by the laws of logarithms that its value is $\ln(N+1)$. By letting $N \to \infty$, deduce from the above that the harmonic series diverges.

11. Use appropriate comparison tests to determine whether the following series converge or diverge.
(a) $\sum \frac{3}{n+1}$ (b) $\sum \frac{5\sqrt{n+1}}{n^2+1}$ (c) $\sum \frac{1}{n+2}$ (d) $\sum \left(\sqrt{n^2+1} - n\right)$ (e) $\sum \sin\frac{1}{n}$ (f) $\sum \frac{n^{1/2}+1}{n^{1/3}+5}$ (g) $\sum \frac{n^{-2}+1}{n^{-3}+5}$ (h) $\sum \frac{\ln n}{n}$ (i) $\sum \frac{\ln n}{n^2}$ (j) $\sum \frac{1}{n \ln n}$ (k) $\sum \frac{1}{n^2 \ln n}$ (l) $\sum \frac{1}{n(\ln n)^2}$.

*12. Which part of the Limit Comparison Test remains true if $K = 0$, and which part if $K = \infty$?

Convergence of series III

There is only one simple test, besides the Ratio Test, for series whose terms are not all positive. It applies to certain alternating series, i.e., series whose terms are alternately positive and negative.

Alternating Series Test. If $c_n$ is positive and decreasing, and if $c_n \to 0$ as $n \to \infty$, then the alternating series $\sum (-1)^n c_n$ and $\sum (-1)^{n+1} c_n$ both converge.

Proof. A partial sum of $\sum (-1)^{n+1} c_n$ can be written as $c_1 - (c_2 - c_3) - (c_4 - c_5) - \cdots$ or as $(c_1 - c_2) + (c_3 - c_4) + (c_5 - c_6) + \cdots$. Since the values of $c_n$ are decreasing, each bracketed term is positive.
It follows that the odd partial sums start at the value $c_1$ and decrease, while the even partial sums start at 0 and increase. However, the difference in magnitude between the $N$th partial sum $S_N$ and the $(N+1)$th partial sum $S_{N+1}$ is $c_{N+1}$, which tends to zero as $N \to \infty$. Therefore the odd and even partial sums must tend to the same limit, which means that the series $\sum (-1)^{n+1} c_n$ converges. Finally, $\sum (-1)^n c_n = -\sum (-1)^{n+1} c_n$, so if one series converges, then the other one converges also.

Corollary. In absolute value, the truncation error for an alternating series satisfying the conditions of the test is less than the first term omitted.

Proof. Suppose the series converges to the sum $S$. From the proof of the Alternating Series Test it follows that the odd and even partial sums oscillate from one side of $S$ to the other. Thus $|S_N - S| < |S_N - S_{N+1}| = c_{N+1}$, as required.

When using the test, it is important to check that the terms decrease in magnitude, as well as alternate in sign (and tend to zero). The test actually guarantees the convergence of $\sum e^{in\theta} c_n$ for any constant $\theta$ such that $0 < \theta < 2\pi$, though we cannot prove it here. What we have proved is convergence when $\theta = \pi$, since $e^{in\pi} = (-1)^n$. The general result makes it sometimes possible to establish the convergence or divergence of a power series at points on its circle of convergence.

If a series $\sum a_n$ converges, but $\sum |a_n|$ does not, then we say $\sum a_n$ is conditionally convergent. A surprising result is that the terms of a real conditionally convergent series can be rearranged, without omitting or repeating any terms, to sum to any desired value $T$. You simply take the next positive terms until the partial sum is greater than $T$, then the next negative terms until the partial sum is less than $T$, and so on.

Tutorial questions — Convergence of series III

13. Prove that the Maclaurin series for $\ln(1+x)$ converges at $x = 1$ but not at $x = -1$. What is the sum to infinity at $x = 1$?

14.
Roughly how many terms do you need in order to obtain an estimate of $\frac{\pi}{4}$, correct to four decimal places, by truncating the Maclaurin series for $\arctan x$ with $x = 1$? (Hint: the absolute truncation error must be less than $0{,}5 \times 10^{-4}$.) How many terms do you need if you use the identity $\frac{\pi}{4} = \arctan\frac{1}{2} + \arctan\frac{1}{3}$ and write the right hand side as a single alternating series? (Use a calculator, and trial and error.)

*15. Make up a real alternating series in which the terms tend to zero, but the series still diverges. (Hint: make the series of positive terms converge, and the series of negative terms diverge.)

*16. Determine the behaviour of the power series of Question 8(a) and (b) on their circles of convergence.

*17. The series $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}$ converges to $\ln 2$ (see Question 13). Rearrange the terms of the series to make a new series whose sum is 1, and write down the first few terms of the new series. Plan a computer program to generate any number of terms of this rearranged series.

18. Test the following assorted series for convergence:
(a) $\sum (2+i)^{-n}$ (b) $\sum (2n+1)^{-2}$ (c) $\sum \frac{1}{\sqrt{n(n+1)}}$ (d) $\sum \frac{(-1)^n}{\sqrt{n}}$ (e) $\sum \frac{1}{\sqrt{n+1} - \sqrt{n}}$ (f) $\sum \left(\frac{4+5i}{6}\right)^n$ (g) $\sum \frac{1}{n \ln n}$ (h) $\sum \frac{n}{n^2+1}$ (i) $\sum \sin\frac{\pi}{n^2}$ (j) $\sum \frac{n^n}{n!}$ (k) $\sum \frac{n!}{n^n}$ (l) $\sum \frac{1}{n^{1-(1/n)}}$ (m) $\sum \frac{1}{n^{2+(1/n)}}$ (n) $\sum \ln\left(1 + \frac{1}{\sqrt{n}}\right)$ (o) $\sum \left(\left(1 + \frac{1}{n}\right)^{1/3} - 1\right)$ (p) $\sum \arctan(n^{-2})$ (q) $\sum \frac{1}{n \ln n \ln\ln n}$ (r) $\sum \frac{\cos n\pi}{2n+1}$ (s) $\sum \frac{\cos(2n+1)\pi}{2n+1}$ (t) $\sum \frac{\sin n\pi}{2n+1}$ (u) $\sum (-1)^n \frac{n}{n^2-1}$ (v) $\sum \frac{1}{n(\ln n)^q}$ (w) $\sum \frac{1}{n^{1+(1/n)}}$ *(x) $\sum \frac{(-1)^n}{n^{2+(-1)^n}}$.

19. Determine for which (real) values of $x$ and $y$ the series $\sum x^n n^{-y}$ converges, and shade the region of convergence in the $(x, y)$ plane. (Hint: regard $x$ and $y$ as parameters: first use the Ratio Test to find regions of convergence and divergence, and use other tests on the boundary between the regions. Use solid lines in your sketch for portions of the boundary where the series converges, and dotted lines otherwise.)

Answers

1.
(a) 1, (b) 0, (c) $e^3$, (d) $\frac{2}{3}$, (e) 0, (f) $e^2$, (g) 1, (h) 0, (i) 1, (j) 1, (k) 1, (l) 0, (m) $-\frac{1}{3}$, (n) $-\frac{1}{\pi}$.

2. (a) 1, (b) 1, (c) $e$, (d) $\infty$.

4. After $n$ applications of L'Hôpital's Rule, the numerator is $k^n e^{kx}$, which still tends to infinity, but the denominator is a constant times $x^{m-n}$, which tends to zero.

5. (a) C, (b) D, (c) C, (d) no conclusion (so far), (e) D, (f) no conclusion, (g) C, (h) C, (i) D.

8. (a) $|z| = 1$, (b) $|z - \frac{1}{2}| = \frac{1}{2}$, (c) $|z| = e$, (d) $|z| = \frac{1}{e}$, (e) $|z - i| = \frac{1}{\sqrt{2}}$, (f) $|z - 2| = \frac{1}{4}$. Sum equals $(1 - 4(z-2))^{-1/2}$, undefined at $z = \frac{9}{4}$.

9. Absolute truncation error $\approx |a_{N+1}|\{1 + L + L^2 + \cdots\}$.

10. $\ln(1+x) = x - \frac{1}{2}x^2 + \cdots < x$, ∴ $\frac{1}{n} > \ln\left(1 + \frac{1}{n}\right)$, ∴ $\sum \frac{1}{n} > \sum \ln\left(1 + \frac{1}{n}\right)$.

11. (a) D, (b) C, (c) D, (d) D, (e) D, (f) D, (g) D, (h) D, (i) C, (j) D, (k) C, (l) C.

12. If $K = 0$, then $\sum b_n$ convergent implies $\sum a_n$ convergent, and $\sum a_n$ divergent implies $\sum b_n$ divergent only. If $K = \infty$, interchange $a_n$ and $b_n$.

13. $\ln 2$.

14. 10 000 terms, 5 terms.

15. *See Question 18(x).

16. *Converges (a) nowhere on the circle, (b) everywhere on the circle except at $z = 1$.

17. *$\left(1 + \frac{1}{3}\right) - \frac{1}{2} + \frac{1}{5} - \frac{1}{4} + \left(\frac{1}{7} + \frac{1}{9}\right) - \frac{1}{6} + \left(\frac{1}{11} + \frac{1}{13}\right) - \frac{1}{8} + \cdots$.

18. (a) C (RT), (b) C (CT/LCT), (c) D (CT/LCT), (d) C (AST), (e) D ($n$th term), (f) D (RT), (g) D (IT), (h) D (CT/LCT), (i) C (CT/LCT), (j) D (RT and L'H), (k) C (RT), (l) D (CT), (m) C (CT), (n) D (LCT and L'H), (o) D (LCT and L'H), (p) C (LCT and L'H), (q) D (IT), (r) C (AST), (s) D (CT), (t) C (all terms are zero), (u) C (AST), (v) C if $q > 1$ (IT), (w) D (LCT and L'H), *(x) D (drop sigma notation, and consider odd and even terms separately).

19. See sketch overleaf.

Figure 2.2. Sketch for Question 19

Algebra Chapter 3

Linear algebra

Linear spaces

In Calculus Chapter 1 we met the idea of a linear operator, i.e., an operator, say $P$, such that $P(cy) = c(Py)$ for any constant $c$, and $P(y_1 + y_2) = Py_1 + Py_2$ for any inputs $y_1$ and $y_2$.
By using these facts repeatedly, we can say that
$$P(\alpha_1 y_1 + \cdots + \alpha_n y_n) = \alpha_1 (P y_1) + \cdots + \alpha_n (P y_n), \qquad\text{or}\qquad P\sum_{j=1}^n \alpha_j y_j = \sum_{j=1}^n \alpha_j (P y_j),$$
for any constants $\alpha_1, \ldots, \alpha_n$ and any inputs $y_1, \ldots, y_n$. The expression $\alpha_1 y_1 + \cdots + \alpha_n y_n$ or $\sum_{j=1}^n \alpha_j y_j$ is called a linear combination of $y_1, \ldots, y_n$. Thus a linear combination of elements is a sum of constant multiples of those elements. (One reason for the word linear is that the expression is of degree one in the variables $y_1, \ldots, y_n$.)

A set of vectors or functions is called a linear space if it is closed under the formation of linear combinations, i.e., if every linear combination of elements of the set is again in the set. If only real constants are permissible, then the space is called a real linear space; if complex constants are allowed, then the space is called a complex linear space. The following results about linear spaces are important; brief proofs, where necessary, are given in brackets.

1. All vectors with $n$ real entries form a real linear space, denoted $\mathbb{R}^n$. Similarly, all vectors with $n$ complex entries form a complex linear space, denoted $\mathbb{C}^n$. Row vectors are denoted by bold lower case letters, e.g. $\mathbf{r}$ or $\mathbf{a}$ or $\mathbf{v}$; column vectors are written in upper case, e.g. $X$ or $Z$.

2. All real-valued functions form a real linear space, and all complex-valued functions form a complex linear space.

3. The inputs to a linear operator form a linear space, which is called the domain of the operator. (This is because in order to test for linearity, every linear combination of inputs must also be an input.)

4. Every linear space contains the zero element. (This is because if $y$ is in the space, then so is $y + (-y)$.)

5. The outputs of a linear operator form a linear space, which is called the range space or image space of the operator.
(This is because any linear combination of outputs, say $\alpha_1 (P y_1) + \cdots + \alpha_n (P y_n)$, can by the properties of linear operators be re-written $\alpha_1 (P y_1) + \cdots + \alpha_n (P y_n) = P(\alpha_1 y_1 + \cdots + \alpha_n y_n)$, so it is also an output.)

6. The null space of a linear operator is a linear space. (This is because if $y_1, \ldots, y_n$ are in the null space of $P$, then $P(y_1) = P(y_2) = \cdots = P(y_n) = 0$, and by linearity $P(\alpha_1 y_1 + \cdots + \alpha_n y_n) = \alpha_1 (P y_1) + \cdots + \alpha_n (P y_n) = \alpha_1 0 + \cdots + \alpha_n 0 = 0$, so the linear combination $\alpha_1 y_1 + \cdots + \alpha_n y_n$ is also in the null space.) This result is called the superposition principle, and was also proved in Calculus Chapter 1.

Tutorial questions — Linear spaces and subspaces

1. Prove that if $C_1, C_2, \ldots, C_n$ are $m \times 1$ column vectors, then the linear combination $\sum_{j=1}^n x_j C_j$ is the same as the matrix product $CX$, where $C$ is the $m \times n$ matrix whose columns are $C_1, C_2, \ldots, C_n$, and $X$ is the $n \times 1$ column vector with entries $x_1, x_2, \ldots, x_n$. (Hint: write out $CX$ and use the definition of matrix product.)

2. Determine whether the following sets of vectors $(z, w)$ in $\mathbb{C}^2$ are linear spaces. (In each case, take two general vectors $(z_1, w_1)$ and $(z_2, w_2)$ satisfying the given equation; then let $(z_3, w_3) = \alpha_1 (z_1, w_1) + \alpha_2 (z_2, w_2)$, and see whether or not $(z_3, w_3)$ also satisfies the same equation.)
(a) all $(z, w)$ such that $z + iw = 0$.
(b) all $(z, w)$ such that $z + iw = 1 + i$.
(c) all $(z, w)$ such that $z^2 - iw^2 = 1 + i$.
(d) all $(z, w)$ such that $z^2 - iw^2 = 0$.

3. Use Gauss-Jordan elimination to solve (where possible) the systems of equations $AZ = B$ with the following augmented matrices, giving the complete general solution, where it exists. If the solution is not unique, write it in the form $Z = Z_1 + Z_0$, where $Z_1$ is a particular solution and $Z_0$ is a general vector in the null space.
(a) $[\,1,\ 2+i\ :\ 1+i\ ;\ \ 1-i,\ 3\ :\ i\,]$
(b) $[\,1,\ -1+i\ :\ 1+i\ ;\ \ 1-i,\ 2i\ :\ 2\,]$
(c) $[\,1,\ 2+i\ :\ 1-i\ ;\ \ 1-i,\ 3-i\ :\ -2i\ ;\ \ i,\ -1+2i\ :\ 1+i\,]$
(d) $[\,1,\ i,\ 1-i\ :\ 1+i\ ;\ \ 1+i,\ -1+i,\ 2\ :\ 2i\,]$
4. (Revision.) (i) Find the inverses of the following matrices $T$ by reducing the augmented matrix $(T : I)$ by Gauss-Jordan elimination to $(I : T^{-1})$. (ii) Check your answers by working out $T T^{-1}$. (iii) Double-check your answers by finding $T^{-1}$ using the formula $T^{-1} = \frac{1}{\det(T)}\,\mathrm{adj}(T)$.
(a) $[\,1,\ 1\ ;\ 3,\ 2\,]$  (b) $[\,1,\ 1+i\ ;\ 1-2i,\ 3\,]$  (c) $[\,1,\ 2,\ 3\ ;\ 2,\ 3,\ 4\ ;\ 3,\ 4,\ 6\,]$  (d) $[\,1,\ i,\ 1+i\ ;\ i,\ 0,\ 1\ ;\ 1,\ -1+i,\ -1+i\,]$

Bases and dimension

A linear space (unless it consists of zero alone) contains infinitely many elements, but usually they can be expressed as linear combinations of relatively few of these elements. For example, any vector $(x, y, z)$ in $\mathbb{R}^3$ can be expressed as a linear combination of $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$, since $(x, y, z) = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, and this expression is, in fact, unique. Similarly, any function $y(t)$ in the null space of the operator $D^2 + 1$ can be written uniquely as a linear combination of $\cos t$ and $\sin t$, thus: $y(t) = P\cos t + Q\sin t$. We say that $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ form a basis of $\mathbb{R}^3$, and that $\cos t$ and $\sin t$ form a basis of the null space of $D^2 + 1$. More generally, a basis of a linear space is a set of elements in the space such that every element in the space can be written uniquely as a linear combination of the basis elements. Notice that:
• the basis elements must themselves be in the space,
• for every element in the space, an expression as a linear combination of basis elements must exist, and
• for every element in the space, the expression as a linear combination of basis elements must be unique.

Theorem. In $\mathbb{R}^n$ or $\mathbb{C}^n$, $n$ column vectors will form a basis if and only if the matrix of which they are the columns is invertible.

Proof. Suppose the vectors are $C_1, \ldots, C_n$, forming the matrix $(C_1 \ldots C_n) = C$. By matrix multiplication (see Question 1) it is easy to show that a linear combination $x_1 C_1 + \cdots + x_n C_n$ is the same as the matrix product $CX$, where $X$ is the column vector $(x_1 \ldots x_n)^T$. To say that $C_1, \ldots$
$C_n$ form a basis is to say that every $n \times 1$ vector $B$ has a unique expression as a linear combination $x_1 C_1 + \cdots + x_n C_n$. By the previous remarks this means precisely that the equation $CX = B$ has a unique solution for every $n \times 1$ vector $B$. From elementary matrix algebra we know that a solution of the equation $CX = B$ will exist for every $B$ and be unique if and only if $C$ is invertible, in which case $X = C^{-1}B$.

The above theorem is true if column is replaced by row throughout. In particular, the standard basis of $\mathbb{R}^n$ or $\mathbb{C}^n$ consists of the $n$ vectors forming the columns (or rows) of the identity matrix $I_n$, which is obviously invertible, since $I_n^{-1} = I_n$.

There are many ways to choose a basis of a linear space, but it can be proved that the number of elements in any basis will be the same. For example, in $\mathbb{R}^3$ any three non-coplanar vectors form a basis, as the above theorem shows, since three non-coplanar vectors form a matrix with non-zero determinant. With fewer than three vectors not every element of $\mathbb{R}^3$ will be expressible, and with more than three the expressions will not be unique. Similarly, $\cos(t-\alpha)$ and $\sin(t-\beta)$ in general form a basis of the null space of $D^2 + 1$, as do $e^{it}$ and $e^{-it}$ (if complex coefficients are allowed), but you always need two functions for a basis of this space. This suggests that, although the basis elements themselves can be chosen in many ways, the number of basis elements is always the same.

The dimension of a linear space is the number of elements in any basis of the space. Thus $\mathbb{R}^3$ has dimension three (so this meaning of dimension corresponds with our usual meaning), and the null space of $D^2 + 1$ has dimension two. A plane through the origin in $\mathbb{R}^3$ also has dimension two (any two non-collinear vectors in the plane form a basis), and a line through the origin in $\mathbb{R}^2$ or $\mathbb{R}^3$ has dimension one. The origin on its own is said to have dimension zero.

Theorem.
The dimension of the null space of a linear operator is equal to the number of arbitrary constants in the general solution of the inverse problem for that operator.

Proof. The general solution of $Py = b$ is of the form $y = y_1 + y_0$, where $y_1$ is a particular solution and $y_0$ is a general element of the null space. If $y_0$ involves $n$ arbitrary constants, then it is a linear combination of $n$ basis elements, so the null space has dimension $n$.

In particular, the solution of $Py = b$ is unique if the null space of $P$ consists of zero only (no arbitrary constants in the solution, and dimension zero for the null space). For an $n$th order linear differential operator we also see that the null space has dimension $n$, because the solution requires $n$ integrations and therefore involves $n$ arbitrary constants.

Tutorial questions — Bases and dimension

5. Use your working from Question 3 to find a basis of the null space of the coefficient matrices there. Then write down the dimension of each null space.

6. Find a basis of the null space of each of the differential operators below, and verify that the dimension of the null space is equal to the degree of the operator. (Use complex numbers where necessary for simplicity.)
(a) $D^2 - 4D + 3$ (b) $D^2 + 2D + 2$ (c) $D^3$ (d) $D^n$ (e) $D^4 + 4$.

7. (a) Prove that if $\mathbf{a}$ and $\mathbf{b}$ are row vectors in $\mathbb{R}^2$, and if $T$ is the matrix whose rows are $\mathbf{a}$ and $\mathbf{b}$, then $|\det(T)|$ is equal to the area of the parallelogram whose sides are formed by the vectors $\mathbf{a}$ and $\mathbf{b}$. (Hint: put the third component equal to 0 and use the fact that area $= |\mathbf{a} \times \mathbf{b}|$.) Use the theorem to prove that any two non-collinear vectors form a basis of $\mathbb{R}^2$.
(b) Similarly, prove that any three non-coplanar vectors in $\mathbb{R}^3$ form a basis of $\mathbb{R}^3$, by using the fact that the determinant of the matrix they form algebraically is plus or minus the volume of the parallelepiped they form geometrically.
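The determinant criterion of Question 7 lends itself to a numerical check. The sketch below (Python with NumPy; an illustrative aside with arbitrarily chosen vectors, not from the original text) confirms that three non-coplanar vectors form a basis of $\mathbb{R}^3$: the matrix they form has non-zero determinant, so any given vector has unique coordinates with respect to them.

```python
import numpy as np

# The columns of T are three non-coplanar vectors (chosen for illustration);
# det(T) != 0, so by the theorem they form a basis of R^3.
T = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
assert abs(np.linalg.det(T)) > 1e-12   # non-coplanar

# Any vector b then has unique coordinates x in this basis: T x = b.
b = np.array([2.0, 3.0, 4.0])
x = np.linalg.solve(T, b)
assert np.allclose(T @ x, b)
```

Solving $TX = B$ here is exactly the unique expression $B = x_1 C_1 + x_2 C_2 + x_3 C_3$ promised by the theorem.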
Independence and rank

The uniqueness of expressions for vectors in terms of a basis comes from a property called independence. Elements in a linear space are said to be independent if either of the following is true (in which case it can be shown that the other is also true):
• It is impossible to express any one of the elements as a linear combination of the others.
• The only way to express zero as a linear combination of the elements is to have all coefficients equal to zero.

The first definition makes it clear that in $\mathbb{R}^3$ two vectors are independent if they are not collinear, and three vectors are independent if they are not coplanar. Comparing these definitions with the definition of basis shows clearly that basis elements in any linear space must always be independent.

Theorem. (i) Column vectors $C_1, \ldots, C_n$ in $\mathbb{C}^m$ are independent if the equation $CX = 0$ has the unique solution $X = 0$ (i.e., the null space consists of 0 alone), where $C$ is the $m \times n$ matrix $(C_1 \ldots C_n)$. (ii) Functions $y_1(x), \ldots, y_n(x)$ are independent if their Wronskian is non-zero. The Wronskian $W(y_1, \ldots, y_n)$ is the determinant of a matrix whose entries are the functions and their derivatives of successive orders. For $n = 2$ and 3 we have:
$$W(y_1, y_2) = \det\begin{pmatrix} y_1 & y_2 \\ y_1' & y_2' \end{pmatrix} = y_1 y_2' - y_2 y_1', \qquad W(y_1, y_2, y_3) = \det\begin{pmatrix} y_1 & y_2 & y_3 \\ y_1' & y_2' & y_3' \\ y_1'' & y_2'' & y_3'' \end{pmatrix}.$$

Proof. (i) Column vectors. If $X = (x_1, \ldots, x_n)^T$, then, as in Question 1, $CX = x_1 C_1 + \cdots + x_n C_n$. If $CX = 0$ has the unique solution $x_1 = x_2 = \cdots = x_n = 0$, then $C_1, \ldots, C_n$ are independent by the second definition of independence above. (Note that the matrix $C$ need not be square, but, if it is, then its columns are independent if and only if it is non-singular, i.e., if $\det C \ne 0$ or $C^{-1}$ exists.)
(ii) Functions. Suppose $a_1 y_1 + a_2 y_2 + a_3 y_3 = 0$, where $a_1, a_2, a_3$ are numbers. Differentiate twice to get $a_1 y_1' + a_2 y_2' + a_3 y_3' = 0$ and $a_1 y_1'' + a_2 y_2'' + a_3 y_3'' = 0$.
These equations give the matrix equation
$$\begin{pmatrix} y_1 & y_2 & y_3 \\ y_1' & y_2' & y_3' \\ y_1'' & y_2'' & y_3'' \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
Since the Wronskian $W(y_1, y_2, y_3)$ is the determinant of the coefficient matrix, it follows that if $W(y_1, y_2, y_3) \ne 0$, then the coefficient matrix is invertible, so the numbers $a_1, a_2, a_3$ are all zero (as required for independence).

The rank of a matrix is the number of non-zero rows in the Gauss or Gauss-Jordan form of the matrix. It can be shown that the rank is equal to the dimension of the range or image space, and is also equal to the maximum number of independent rows or columns in the matrix. The rank also gives information about the existence or uniqueness of solutions of linear equations.

Theorem. Suppose the $m \times n$ matrix $A$ has rank $r$. Then:
(i) The null space of $A$, i.e., the space of solutions of the equation $AX = 0$, has dimension $n - r$.
(ii) If $r = m$, then the solution of any equation $AX = B$ may not be unique, but will always exist.
(iii) If $r = n$, then the solution of any equation $AX = B$ is unique, assuming it exists.

Proof. (i) After Gauss-Jordan elimination of $AX = 0$, there are $r$ non-zero rows or equations; together, these express $r$ of the $n$ unknowns in terms of the remaining $n - r$ unknowns, which are arbitrary. Thus there are $n - r$ arbitrary constants in the solution, so the null space has dimension $n - r$.
(ii) If $r = m$, then every row in the normal form of $A$ is non-zero, so a solution must exist.
(iii) If $r = n$, then there are no arbitrary constants, so the solution (if it exists) is unique.

Corollary. Column vectors $C_1, \ldots, C_n$ are independent if and only if the matrix $(C_1 \ldots C_n)$ has rank $n$.

Proof. The result follows from part (iii) of this theorem and part (i) of the previous theorem.

Tutorial questions — Independence and rank

8. Use your working from Question 3 to determine which coefficient matrices there have all their columns independent.

9.
Use Wronskians to test the following sets of functions for independence:
(a) $e^x$, $e^{-x}$ (b) $\cos x$, $\sin x$, 1 (c) $\cos^2 x$, $\sin^2 x$, 1 (d) 1, $\sec x$, $\tan x$.

10. Use Wronskians to show that the basis functions you found in Question 6(a)–(c) are independent.

11. Show that two general functions $y_1$ and $y_2$ are not independent if their quotient is a constant. Use the quotient rule for differentiation to show that this is the same as $W(y_1, y_2) = 0$.

12. (a) Find the rank of each of the coefficient matrices in Question 3 by counting the non-zero rows in the normal form. Verify in each case that the dimension of the null space (i.e., the number of arbitrary constants) is equal to the difference between the number of columns and the rank.
(b) Given a square matrix $A$, explain why existence of a solution for $AX = B$ for all possible choices of $B$ always goes together with uniqueness.

Eigenvalues and eigenvectors

A linear operator whose inputs and outputs come from the same linear space is said to be a linear transformation of that space. For example, multiplication by a (square) $n \times n$ matrix $A$ is a linear transformation of $\mathbb{R}^n$ or $\mathbb{C}^n$. Similarly, any D operator $P(D)$ is a linear transformation of the space of all infinitely differentiable functions. With a transformation, it is possible for an element to be taken to itself, or a multiple of itself. These elements, and the scalar multiples involved, turn out to be very important. Geometrically, the condition says that the output is parallel to the input, so the direction is the same, though the magnitude may be different.

A number $\alpha$ is said to be an eigenvalue of a linear transformation $P$ if there is a non-zero element $y$ such that $Py = \alpha y$. If $\alpha$ is an eigenvalue of $P$, then the eigenspace of $P$ corresponding to $\alpha$ consists of all elements $y$ such that $Py = \alpha y$. The elements of the eigenspace are called eigenvectors or eigenfunctions. For example, every number $\alpha$ is an eigenvalue of the differentiation operator $\frac{d}{dt}$, with
corresponding eigenfunctions $y = ce^{\alpha t}$, where $c$ is an arbitrary constant, since $\frac{dy}{dt} = \alpha y$. Similarly, 2 is an eigenvalue of the matrix $A = \begin{pmatrix} 0 & -4 \\ 1 & 4 \end{pmatrix}$, with corresponding eigenvectors $X = c\begin{pmatrix} -2 \\ 1 \end{pmatrix}$, since by direct calculation it is easy to show that $AX = 2X$.

If $\alpha$ is an eigenvalue of an operator $P$, then the corresponding eigenspace consists of all $y$ such that $Py = \alpha y$, or $Py - \alpha y = 0$, which we may write (by analogy with D operators) in the form $(P - \alpha)y = 0$. Thus the eigenspace corresponding to the eigenvalue $\alpha$ is the null space of the operator $P - \alpha$, and is therefore a linear space. The eigenspace must have dimension at least one, since by definition of eigenvalue the eigenspace contains a non-zero element.

Theorem. If $A$ is an $n \times n$ matrix, then $\lambda$ is an eigenvalue of $A$ if and only if $\det(A - \lambda I_n) = 0$.

Proof. The number $\lambda$ is an eigenvalue of $A$ if and only if there is a non-zero solution for the equation $AX = \lambda X$, i.e., $AX - \lambda X = 0$, i.e., $(A - \lambda I_n)X = 0$. (Notice how we must insert the identity matrix $I_n$ to make the matrix algebra meaningful.) This equation has a non-unique solution if and only if the coefficient matrix $A - \lambda I_n$ is singular, i.e., if and only if $\det(A - \lambda I_n) = 0$.

The equation $\det(A - \lambda I_n) = 0$ is called the characteristic equation of the matrix $A$, and the left hand side $\det(A - \lambda I_n)$ is called the characteristic polynomial of $A$, which we write as $c_A(\lambda)$, so $c_A(\lambda) = \det(A - \lambda I_n)$. The characteristic polynomial is found by subtracting the variable $\lambda$ from each of the diagonal entries of $A$, and then taking the determinant. Eigenvalues are sometimes called characteristic roots, since they are the roots of the characteristic equation $c_A(\lambda) = 0$.

Tutorial questions — Eigenvalues and eigenvectors

13. For each matrix $A$ and each vector $X$ below, show that the matrix product $AX$ is a scalar multiple of $X$. (This shows that $X$ is an eigenvector; the multiplying factor is the corresponding eigenvalue.)
(a) $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$; $X = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $X = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$.
1 0 1 −1 1 2 2 1 (b) A = ;X= and X = . 3 2 3 −1 1 −1 1 −1 (c) A = ;X= and X = . −2 2 1 2 1 −2 1+i 1−i (d) A = ;X= and X = . 1 3 −1 −1 0 −1 1 (e) A = ;X= . 1 2 −1 14. Find det(A − λI2 ) for the matrices in Question 13 and verify that the solutions of the equation det(A − λI2 ) = 0 are the eigenvalues as found above. 15. Find and solve the characteristic equations of the following matrices, and hence find their (real or complex) eigenvalues and corresponding eigenvectors. (Remember that for a real matrix, conjugate complex eigenvalues have conjugate eigenvectors.) 0 0 4 −1 −1 1 −2 −3 −3 −1 −1 −6 (a) 1 0 4 (b) 1 0 −2 (c) 3 4 3 (d) 1 0 3 . 0 1 −1 0 1 2 −3 −3 −2 0 1 1 Diagonalization A diagonal matrix is a square matrix for which all entries not on the main diagonal are zero. The notation diag (α1 , . . . , αn ) denotes an n × n diagonal matrix with entries α1 , . . . , αn on the main diagonal, and zeros everywhere else. If there is a non-singular matrix T such that T −1 AT is a diagonal matrix, then we say that A can be diagonalized. The main result is that the columns of T are eigenvectors of A, and the diagonal entries are the corresponding eigenvalues. 94 MATH2011/2/4 Theorem. An n × n matrix A can be diagonalized if and only if A has n independent eigenvectors. Proof. ⇒ Suppose T −1 AT = D, where D = diag (α1 , . . . , αn ). Let C1 , . . . , Cn be the columns of T , so they are independent, since T is non-singular. From the equation T −1 AT = D we have AT = T D, so α1 . A(C1 . . . Cn ) = (C1 . . . Cn ) .. 0 ... .. . ... 0 .. . . αn .˙. (AC1 . . . ACn ) = (α1 C1 . . . αn Cn ), which shows that C1 , . . . , Cn are eigenvectors of A. ⇐ Suppose the n independent eigenvectors are C1 , . . . , Cn , with corresponding eigenvalues α1 , . . . , αn . Let T = (C1 . . . Cn ), the matrix whose columns are C1 , . . . , Cn . Then the same calculations show that AT = T D, and T is invertible, since its columns are independent, so T −1 AT = D. 
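The theorem invites a numerical check. The sketch below (NumPy is my choice of tool; the text itself uses no software) builds T from the two independent eigenvectors of the matrix in Question 13(b), with eigenvalues 4 and -1, and confirms that T^{-1}AT is diagonal and that powers of A can then be read off from the diagonal form.

```python
import numpy as np

# Matrix of Question 13(b); its eigenvectors (2, 3)^T and (1, -1)^T,
# with eigenvalues 4 and -1, form the columns of T.
A = np.array([[1.0, 2.0],
              [3.0, 2.0]])
T = np.array([[2.0, 1.0],
              [3.0, -1.0]])

D = np.linalg.inv(T) @ A @ T
print(np.allclose(D, np.diag([4.0, -1.0])))            # True: T^{-1} A T = diag(4, -1)

# Powers via the diagonal form: A^n = T D^n T^{-1}
n = 5
An = T @ np.diag(np.diag(D) ** n) @ np.linalg.inv(T)
print(np.allclose(An, np.linalg.matrix_power(A, n)))   # True
```

Any independent scaling of the eigenvectors would do for T; only the order of the columns fixes the order of the eigenvalues on the diagonal.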
An n × n matrix A has n eigenvalues, because they are the zeros of the characteristic polynomial, which is of degree n; if the eigenvalues are all distinct, then it can be shown that the corresponding n eigenvectors are always independent, and by the theorem A can be diagonalized. Thus diagonalization can fail only if A has a repeated eigenvalue.

Diagonalization has many applications, since it simplifies matrix calculations. For example, if T^{-1}AT = D, then A = TDT^{-1} and A² = TDT^{-1}TDT^{-1} = TD²T^{-1}. In general, A^n = TD^nT^{-1}, which makes it easy to find A^n, since if D = diag(α1, α2, ...), then D^n = diag(α1^n, α2^n, ...). It follows that A^n → O as n → ∞ if the eigenvalues of A are all less than 1 in modulus.

Diagonalization can also be used to solve simultaneous linear differential equations to determine trajectories. Suppose (d/dt)X = AX, where X = (x, y)^T or (x, y, z)^T, and A is a constant matrix such that T^{-1}AT = diag(α1, α2, ...). Define Y = T^{-1}X, so X = TY, and (d/dt)X = T(d/dt)Y since T is constant. The equation then becomes (d/dt)Y = T^{-1}ATY = diag(α1, α2, ...)Y, which can easily be solved, since the variables in Y are now separate. Similar methods can be used to find the paths of particles in time-dependent fields, where (d/dt)X = AX + B(t), or to solve the D-operator equation P(D)y = f(t): if we define X = (y, y′, y″, ..., y^{(n-1)})^T, then we obtain an equation of the form (d/dt)X = AX + B(t).

Tutorial questions — Diagonalization

16. In Question 13(a) to (d), let T be the matrix whose columns are the given eigenvectors. Find T^{-1}, and verify by matrix multiplication that in each case T^{-1}AT is a diagonal matrix with the eigenvalues on the diagonal. Find all eigenvalues and corresponding eigenvectors of the matrix in Question 13(e). Why does this show that the matrix cannot be diagonalized?

17.
Use your working from Question 15(a) to (c) to verify that AT = TD, where A is the given matrix, T is a matrix of independent eigenvectors, and D is the diagonal matrix with the corresponding eigenvalues on the diagonal.

18. Diagonalize the matrix A = (1/5)[[1, 4], [3, 2]] (hint: first take the fraction inside the matrix), and hence find lim_{n→∞} A^n.

19. (a) Show by matrix multiplication that (I + A + A² + ··· + A^n)(I - A) = I - A^{n+1} for any square matrix A. Deduce that if A^n → O as n → ∞, then I + A + A² + ··· + A^n → (I - A)^{-1}. (Compare the geometric series formula.)
(b) In a recycle chemical reactor involving k reagents, the quantities of reagents present can be considered to form a vector in R^k. After n recycles the vector is denoted Xn, and it can be shown that Xn = X0 + AX_{n-1}, where X0 is the input vector and A is a fixed matrix representing the reaction and the removal of finished product.
(i) Show that Xn = (I + A + A² + ··· + A^n)X0. (Hint: use induction.)
(ii) By letting n → ∞ in (i), and using part (a), show that if all the eigenvalues of A have modulus less than 1, then Xn approaches a steady-state vector X∞, which is given by the formula X∞ = (I - A)^{-1}X0.

20. The voltages v0, v1, ... at the nodes in a ladder network of resistors satisfy the difference equation v_{n+1} - (5/2)v_n + v_{n-1} = 0. Find a constant matrix A such that (v_{n+1}, v_n)^T = A(v_n, v_{n-1})^T. By diagonalizing A, find an expression for v_n in terms of v0 and v1. Assuming v0 = 10 and v11 = 0, find the last non-zero voltage v10.

21. In a gas absorption column, the concentration of solute leaving the n-th plate satisfies the equation y_{n+1} - (Q + 1)y_n + Qy_{n-1} = 0. Use diagonalization to find an expression for y_n in terms of y0 (the input concentration) and y1.

22. Use diagonalization to find the trajectories in the vector fields below. (Hint: take transposes and re-write them in the form Ẋ = AX.)
(a) (x + y, 4x + y) (b) (-x - y + z, x - 2z, y + 2z).
(Hint for (b): use Question 15(b) and leave your answer in complex form.)

23. Use diagonalization to find the paths of particles in the time-dependent velocity fields:
(a) (x + y - e^t, 4x + y + e^t)
(b) (4z + e^{-2t}, x + 4z + e^{-2t}, y - z + e^{-2t}). (Hint for (b): use Question 15(a).)

The characteristic polynomial

The coefficients in the characteristic polynomial of a matrix can be expressed as sums of subdeterminants of the matrix. In particular, the trace of a square matrix A, denoted tr(A), is the sum of the diagonal entries, and is one of the coefficients in cA(λ).

Theorem. For an n × n matrix A = (aij):
(i) the constant term in cA(λ) is det(A),
(ii) the coefficient of λ^n in cA(λ) is (-1)^n,
(iii) the coefficient of λ^{n-1} in cA(λ) is (-1)^{n-1} tr(A).
(Note that for n > 2 these formulae do not give every coefficient.)

Proof. (i) The constant term in cA(λ) is cA(0), which is det(A - 0·In), i.e., det(A), as stated.
(ii) and (iii). These results are easily verified for n = 1 and n = 2. To prove them by induction, suppose they are true for n = k. Let n = k + 1, and expand det(A - λI_{k+1}) by its first row; we obtain cA(λ) = det(A - λI_{k+1}) = (a11 - λ) det(B - λI_k) + other terms, where B is the matrix obtained by deleting the first row and column of A. The other terms in the expansion of the determinant involve only powers lower than λ^k, since at least two appearances of λ are deleted in each of them. Next, det(B - λI_k) = cB(λ), and by the inductive hypothesis cB(λ) = (-1)^k λ^k + (-1)^{k-1} λ^{k-1} tr(B) + lower powers. Thus, representing lower powers by dots, we have

cA(λ) = (a11 - λ)((-1)^k λ^k + (-1)^{k-1} λ^{k-1} tr(B) + ···)
      = (-1)^{k+1} λ^{k+1} + (-1)^k (a11 + tr(B)) λ^k + ···
      = (-1)^{k+1} λ^{k+1} + (-1)^k tr(A) λ^k + ···,

as required.

Corollary. The sum of the eigenvalues of a square matrix is equal to the trace, and the product of the eigenvalues is equal to the determinant.

Proof. With the notation above, suppose the eigenvalues of A are α1, ..., αn.
Then by the Factor Theorem and the theorem above, cA(λ) = (α1 - λ)···(αn - λ), since the polynomials on each side have the same zeros and the same coefficient of λ^n. The result now follows by equating constant terms and coefficients of λ^{n-1}.

For the final result, we need to consider matrix polynomials p(A) of the form p(A) = p_k A^k + ··· + p_1 A + p_0 In, where the coefficients are constants and again we have to insert the identity matrix after p_0 for the expression to be meaningful. It is easy to show that p(A)q(A) = q(A)p(A), even though the factors in general matrix products cannot be interchanged. This means that the algebra of matrix polynomials (like that of D-operators) corresponds to that of ordinary polynomials.

Cayley-Hamilton theorem. A square matrix A satisfies its own characteristic equation, i.e., cA(A) = O, where O denotes the zero matrix.

Proof. Firstly,

cA(λ)In = det(A - λIn)In = (A - λIn) adj(A - λIn),   (1)

since P adj(P) = det(P)In for any n × n matrix P. Next, if we write cA(λ) = Σ_{j=0}^{n} c_j λ^j, then

cA(A) - cA(λ)In = Σ_{j=0}^{n} c_j (A^j - λ^j In) = (A - λIn)Q(λ), say,   (2)

since each term A^j - λ^j In has a factor A - λIn. (The term for j = 0 vanishes.) If we now add (1) and (2), then we obtain cA(A) = (A - λIn){adj(A - λIn) + Q(λ)} = (A - λIn)R(λ), say, which is impossible unless cA(A) = O, since the left-hand side does not involve λ, whereas the right-hand side is of degree at least one in λ. (This can be verified as follows: if R(λ) ≠ O, then write R(λ) = R_d λ^d + ··· + R_0, where R_d ≠ O, then multiply out and equate coefficients of λ^{d+1}.)

The Cayley-Hamilton theorem can be used in applications in which diagonalization is impossible because of insufficient independent eigenvectors. For example, if Ẋ = AX, where A is a constant matrix, then we can differentiate to get Ẍ = AẊ = A²X.
Repeating this process, we get (d^n/dt^n)X = A^nX for all n, and by taking linear combinations of these we obtain P(d/dt)X = P(A)X for any polynomial P(λ). In particular, if we take P(λ) = cA(λ), the characteristic polynomial, then P(A) = cA(A) = O by the Cayley-Hamilton theorem. Therefore cA(d/dt)X = OX = 0. This vector D-operator equation can be solved as in Calculus Chapter 1, but with arbitrary constant vectors, not numbers. The arbitrary constants can be found by equating corresponding vector coefficients in Ẋ and AX.

Tutorial questions — The characteristic polynomial

24. Verify with the matrices in Question 15 above that the coefficient of λ² in the characteristic polynomial is equal to the trace of the matrix and to the sum of the eigenvalues. Similarly, verify that the constant term in the characteristic polynomial is equal to the determinant of the matrix and to the product of the eigenvalues. It can be shown that (for 3 × 3 matrices only) the coefficient of -λ in the characteristic polynomial is the sum of the cofactors of the entries on the diagonal. Verify this fact with the matrices in Question 15.

25. Find the characteristic polynomial cA(λ) of a general 2 × 2 matrix A = [[a, b], [c, d]], and hence verify that cA(λ) = λ² - λ tr A + det A. Then show that cA(A) is the zero matrix, i.e., verify the Cayley-Hamilton theorem for 2 × 2 matrices.

26. A common howler is to give a false proof of the Cayley-Hamilton theorem by saying cA(A) = det(A - AIn) = det(O) = 0. Why is this wrong?

27. Verify the Cayley-Hamilton theorem for the matrix in Question 15(a), using the characteristic polynomial in factorized form.

28. (a) If T^{-1}AT = D and p(A) = p_k A^k + ··· + p_0 In is a general matrix polynomial, show that T^{-1}p(A)T = p(D). Also show that if D = diag(α1, ..., αn), then p(D) = diag(p(α1), ..., p(αn)).
(b) Deduce that if A is diagonalizable, then cA(A) = T diag(cA(α1), ..., cA(αn)) T^{-1}, where α1, ...
, αn are the eigenvalues of A. Now show that the right-hand side is the zero matrix. (You have just found another proof of the Cayley-Hamilton theorem, but this proof is valid only for diagonalizable matrices.)

29. Use the Cayley-Hamilton theorem to find parametric equations for the streamlines in the velocity fields below. (Hint: take transposes and solve Ẋ = v^T = AX.)
(a) v = (y, z, -x - 3y - 3z) (b) v = (x - 3y + 2z, 2x - 5y + 3z, 3x - 7y + 4z).
Verify in each case that the matrix cannot be diagonalized.

30. Solve the matrix differential equations in Calculus Chapter 1 Question 19 by using diagonalization or the Cayley-Hamilton theorem.

31. Two square matrices A and B are said to be similar if there is a non-singular matrix T such that T^{-1}AT = B. Thus to say a matrix can be diagonalized is the same as saying that it is similar to a diagonal matrix.
(a) Verify that the matrices in Question 15(a) to (d) have the same eigenvalues as the diagonal matrices to which they are similar.
(b) Prove that similar matrices have the same characteristic polynomial (and therefore the same eigenvalues). (Hint: if B = T^{-1}AT, show that B - λIn = T^{-1}(A - λIn)T and then find cB(λ). Remember that the determinant of a product is equal to the product of the determinants.)

Answers

2. Only (a) is a linear space.
3. (a) Z = Z1 = (1 - 4i, 1 + 2i)^T and Z0 = 0, (b) Z1 = (1 + i, 0)^T and Z0 = z2(1 - i, 1)^T, (c) Z1 = 0 and Z0 = z2(-i, 1, 0)^T + z3(-1 + i, 0, 1)^T, (d) Z1 = (1 - i, 0)^T and Z0 = z2(-2 - i, 1)^T.
4. Check your own answers!
5. (a) No basis elements, dimension zero. (b) Basis (1 - i, 1)^T, dimension one. (c) Basis (-i, 1, 0)^T and (-1 + i, 0, 1)^T, dimension two. (d) Basis (-2 - i, 1)^T, dimension one.
6. (a) Basis e^t and e^{3t}, dimension two, (b) Basis e^{(-1+i)t} and e^{(-1-i)t}, dimension two, (c) Basis 1, t, t², dimension three, (d) Basis 1, t, ..., t^{n-1}, dimension n, (e) Basis e^{(±1±i)t}, dimension four.
7. (a) Vectors non-collinear ⇒ area ≠ 0 ⇒ det ≠ 0 ⇒ matrix invertible.
8.
Only (a), because the solution is unique (the null space contains only the zero vector).
9. Independent: (a), (b), (d).
10. (Other answers are possible.) (a) W = -2e^{4t}, (b) W = e^{-2t}, (c) W = 2.
11. Not independent means y2 = c1y1 for some constant c1.
12. (a) Rank two, one, one, one. (b) If m = n, then r = m (existence) coincides with r = n (uniqueness).
13. Eigenvalues: (a) 1 and -1, (b) 4 and -1, (c) 0 and 3, (d) 2 - i and 2 + i, (e) 1.
14. (a) λ² - 1, (b) λ² - 3λ - 4, (c) λ² - 3λ, (d) λ² - 4λ + 5, (e) (λ - 1)².
15. (a) Eigenvalues 2, -2, -1: eigenvectors a(2, 3, 1)^T, b(-2, -1, 1)^T, c(-4, 0, 1)^T.
(b) Eigenvalues 1, i, -i: eigenvectors a(1, -1, 1)^T, b(1 - 2i, -2 + i, 1)^T, c(1 + 2i, -2 - i, 1)^T.
(c) Eigenvalues 1, 1, -2: eigenvectors a(-1, 0, 1)^T + b(-1, 1, 0)^T, c(1, -1, 1)^T.
(d) Eigenvalues 1, 1, -2: eigenvectors a(-3, 0, 1)^T, b(3, -3, 1)^T.
16. Eigenvalue 1: eigenvector a(1, -1)^T. T is not invertible.
18. Eigenvalues 1, -2/5. A^n → (1/7)[[3, 4], [3, 4]].
19. (a) (1 + A + ··· + A^n)(I - A) → I. (b)(i) X_{k+1} = X0 + AX_k = X0 + A(I + A + ··· + A^k)X0. (b)(ii) A^n → O as n → ∞.
20. A = [[5/2, -1], [1, 0]]; T = [[2, 1], [1, 2]], say (not unique). v_n = (1/3)((2^{1+n} - 2^{1-n})v1 - (2^n - 2^{2-n})v0). Put v0 = 10 and v11 = 0 and solve for v1. v10 = 15/(2^{11} - 2^{-11}).
21. A = [[Q + 1, -Q], [1, 0]]; T = [[Q, 1], [1, 1]], say (not unique). y_n = ((Q^n - 1)y1 - (Q^n - Q)y0)/(Q - 1).
22. (a) (x, y)^T = [[1, -1], [2, 2]](ae^{3t}, be^{-t})^T, (b) (x, y, z)^T = [[1, 1 - 2i, 1 + 2i], [-1, -2 + i, -2 - i], [1, 1, 1]](ae^t, be^{it}, ce^{-it})^T.
23. (a) (x, y)^T = [[1, 1], [2, -2]](ae^{3t} + e^t/8, be^{-t} - 3e^t/8)^T,
(b) T^{-1}B = (7/12, 3/4, -1/3)^T. X = (x, y, z)^T = [[2, -2, -4], [3, -1, 0], [1, 1, 1]](ae^{2t} - 7e^{-2t}/48, be^{-2t} + 3te^{-2t}/4, ce^{-t} + e^{-2t}/3)^T.
25. cA(A) = A² - (a + d)A + (ad - bc)I2, which simplifies to the zero matrix.
26. The definition cA(λ) = det(A - λIn) is valid only if λ is a scalar, so you cannot replace λ by the matrix A.
27. (A - 2I3)(A + 2I3)(A + I3) = O.
28. (a) T^{-1}p(A)T = T^{-1}(Σ p_j A^j)T = Σ p_j(T^{-1}A^jT) = Σ p_j D^j = p(D). (b) T^{-1}cA(A)T = cA(diag(α1, ..., αn)) = diag(cA(α1), ..., cA(αn)).
Finally, cA(α1) = ··· = cA(αn) = 0, since the eigenvalues are the roots of the characteristic equation.
29. (a) A = [[0, 1, 0], [0, 0, 1], [-1, -3, -3]], cA(λ) = -(λ + 1)³. (d/dt + 1)³X = O, so X = Pe^{-t} + Qte^{-t} + Rt²e^{-t}. Ẋ = (-P + Q)e^{-t} + (-Q + 2R)te^{-t} + (-R)t²e^{-t}. AX = APe^{-t} + AQte^{-t} + ARt²e^{-t}. By equating coefficients on the right-hand sides it follows that (A + I)R = O, (A + I)Q = 2R, (A + I)P = Q. Solve first for R, then for Q, then for P, to get R = r(1, -1, 1)^T (an eigenvector), Q = (4r + q, -2r - q, q)^T, P = (6r + 2q + p, -2r - q - p, p)^T.
(b) A = [[1, -3, 2], [2, -5, 3], [3, -7, 4]], cA(λ) = -λ³, X = [[a, b, c], [a, 2a + b, -4a + b + c], [a, 4a + b, -6a + 2b + c]](t², t, 1)^T.

Algebra Chapter 4

Orthonormality

Dot products and orthonormal bases

We have discussed bases and dimension in general linear spaces of vectors or functions, but we have not extended the geometrical ideas of lengths and angles, which in two or three real dimensions are obtained from the dot product. We now generalize the dot product to n real or complex dimensions, and we define

a · b = (a1, a2, ..., an) · (b1, b2, ..., bn) = a1b̄1 + a2b̄2 + ··· + anb̄n = Σ_{k=1}^{n} a_k b̄_k.

The conjugates in the second factors can be ignored if the vectors have real entries, but are essential for complex vectors, because the dot product of a vector with itself must still be the square of its magnitude, as with real vectors. The definition gives

a · a = Σ_{k=1}^{n} a_k ā_k = Σ_{k=1}^{n} |a_k|²,

which is real and non-negative, so we may take square roots and define |a| = √(a · a). Note that |a| = 0 only if a = 0, i.e., the only vector with zero magnitude is the zero vector itself. Although it is not possible to visualize complex vectors, we still say that vectors a and b are orthogonal (or normal or perpendicular) if a · b = 0.
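As a quick numerical companion (NumPy is an assumption of mine, not part of the text), the function below implements the dot product with the text's convention of conjugating the second factor, and checks it on the first pair of vectors from Question 1.

```python
import numpy as np

# Dot product in the text's convention: conjugate the SECOND factor.
# (Note that np.vdot conjugates its FIRST argument, so the order differs.)
def dot(a, b):
    return np.sum(a * np.conj(b))

a = np.array([1j, 1 + 2j])            # the pair from Question 1(a)
b = np.array([-1 - 3j, 1 + 1j])

print(dot(a, b))                      # 0j: the vectors are orthogonal
print(np.sqrt(dot(a, a).real))        # |a| = sqrt(1 + 5) = sqrt(6)
```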
If a and b are row vectors (i.e., 1 × n matrices), then the dot product is equal to a matrix product: a · b = a b̄^T, where the conjugate in the second factor must not be forgotten. It follows from matrix algebra that
(i) (a + d) · b = a · b + d · b for any vectors a, b, and d,
(ii) (ca) · b = c(a · b) for any vectors a and b, and any constant c,
(iii) b · a is the complex conjugate of a · b, for any vectors a and b.
Properties (i) and (ii) show that dotting with a fixed vector b is a linear operator. For column vectors A and B the dot product is given by a slightly different matrix product, A · B = A^T B̄, but properties (i)–(iii) remain true.

An orthonormal basis of a linear space is a basis consisting of mutually perpendicular unit vectors. For example, in R³ the standard basis vectors i, j, and k form an orthonormal basis, as do the unit tangent u, the unit normal n, and the binormal b at any point on a smooth curve. This means that orthonormal basis vectors have dot product equal to 0 with one another (because they are mutually perpendicular) and equal to 1 with themselves (because they are unit vectors). Thus we have that e1, e2, ..., en form an orthonormal basis if and only if e_k · e_m = 0 if k ≠ m, and e_k · e_m = 1 if k = m.

The most important thing about any orthonormal basis is that any vector can easily be expressed in terms of its components in the directions of the basis vectors, where components are obtained via dot products.

Theorem. If e1, e2, ..., en form an orthonormal basis of C^n, then a general vector a can be uniquely expressed in the form a = Σ_{k=1}^{n} λ_k e_k, where λ_k = a · e_k.

Proof. Since e1, e2, ..., en form a basis, it follows that there is a unique expression for a, say a = λ1e1 + λ2e2 + ··· + λnen. Therefore, for k = 1, ..., n we have a · e_k = (λ1e1 + λ2e2 + ··· + λnen) · e_k = λ1(e1 · e_k) + λ2(e2 · e_k) + ··· + λn(en · e_k), which equals λ_k, since all dot products are zero except for e_k · e_k, which is equal to 1.
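The component formula λk = a · ek can be tested with the orthonormal basis of Question 4 and the vector of Question 5 below; the NumPy sketch here (the library is my choice) reassembles a from its three components.

```python
import numpy as np

dot = lambda a, b: np.sum(a * np.conj(b))     # conjugate the second factor

u = np.array([2, 2, -1 + 4j]) / 5             # the orthonormal basis of Question 4
v = np.array([2, -1 + 4j, 2]) / 5
w = np.array([-1 + 4j, 2, 2]) / 5
a = np.array([1, 1j, -1])                     # the vector of Question 5

# components a.u, a.v, a.w, then the expansion of the theorem
recon = dot(a, u) * u + dot(a, v) * v + dot(a, w) * w
print(np.allclose(recon, a))                  # True: a = (a.u)u + (a.v)v + (a.w)w
```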
Tutorial questions — Dot products and orthonormal bases

1. Use dot products to find the lengths of the vectors below, and show that each pair of vectors is orthogonal.
(a) (i, 1 + 2i) and (-1 - 3i, 1 + i) in C².
(b) (-i, 2, 1 + i) and (1 + i, i, 2 - i) in C³.
(c) (1 + i, 1 - i, i, 4 + 2i) and (1 + 2i, 1 - i, -3 - 3i, -i) in C⁴.

2. Use the properties of dot products to show that a · (cb) = c̄(a · b).

3. If (1, α, β) is orthogonal to (α, β, 1), show that ᾱ + αβ̄ + β = 0. Deduce that β = (α² - ᾱ)/(1 - |α|²) if |α| ≠ 1. (Hint: take conjugates of the equation above, and then eliminate β̄.) If α = 1 + i, find β, verify that the resulting vectors are orthogonal, and find their length.

4. Show that the vectors u = (1/5)(2, 2, -1 + 4i), v = (1/5)(2, -1 + 4i, 2), and w = (1/5)(-1 + 4i, 2, 2) form an orthonormal basis of C³.

5. If u, v, and w are as in Question 4, and if a = (1, i, -1), find the components a · u, a · v, and a · w. Verify that a = (a · u)u + (a · v)v + (a · w)w.

6. (i) If ε = e^{iπ/N}, show that ε̄ = ε^{-1} and that ε^{2N} = 1. Deduce that Σ_{j=1}^{2N} ε^{(n-m)j} = 0 for any integers n and m for which ε^{n-m} ≠ 1. (Hint: the sum is a geometric series with common ratio ε^{n-m}.) Evaluate the sum if ε^{n-m} = 1 (the geometric series formula is not valid).
(ii) If we define e_n := (1/√(2N))(ε^n, ε^{2n}, ..., ε^{2Nn}), show that e_n · e_m = (1/2N) Σ_{j=1}^{2N} ε^{(n-m)j}. Deduce from part (i) that the vectors e_n, for any 2N successive values of n, form an orthonormal basis of C^{2N}.

7. If e1, e2, e3, ... form an orthonormal basis, prove that for any vector a, |a|² = |a · e1|² + |a · e2|² + |a · e3|² + ···. (Hint: write a = λ1e1 + λ2e2 + ···, then substitute for the first a in a · a, and simplify, using properties of dot products. Finally, remember that λ1 = a · e1, etc.)

Unitary and hermitian matrices

We say that an n × n matrix U is unitary if its rows or columns form an orthonormal basis of C^n. Real unitary matrices are also called orthogonal.
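A concrete family of unitary matrices comes from Question 6 above: for N = 2 the vectors e_n give a 4 × 4 discrete-Fourier-type matrix. The NumPy sketch below (my tooling choice, not the text's) verifies that U times its conjugate transpose is the identity.

```python
import numpy as np

# Rows are the vectors e_n of Question 6 with N = 2 (eps = e^{i pi/2} = i),
# for n = 1, ..., 4, each divided by sqrt(2N).
N = 2
eps = np.exp(1j * np.pi / N)
U = np.array([[eps ** (n * j) for j in range(1, 2 * N + 1)]
              for n in range(1, 2 * N + 1)]) / np.sqrt(2 * N)

print(np.allclose(U @ U.conj().T, np.eye(2 * N)))   # True: U is unitary
```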
Since we proved in Algebra Chapter 3 that an n × n matrix is invertible if and only if its columns (or rows) form a basis of C^n, it follows that a unitary matrix always has an inverse. In fact, the inverse of a unitary matrix is very easy to find.

Theorem. An n × n matrix U is unitary if and only if U^{-1} = Ū^T.

Proof. Suppose r1, r2, ..., rn are the rows of U. Then r̄1^T, r̄2^T, ..., r̄n^T are the columns of Ū^T, and by matrix multiplication we have

U Ū^T = [[r1 · r1, r1 · r2, ..., r1 · rn], [r2 · r1, r2 · r2, ..., r2 · rn], ..., [rn · r1, rn · r2, ..., rn · rn]],

since r_k r̄_m^T = r_k · r_m for all k and m. If r1, r2, ..., rn form an orthonormal basis of C^n, then the matrix of dot products has 1s on the diagonal and 0s elsewhere, so it is the identity matrix. Thus we have U Ū^T = In, i.e., Ū^T = U^{-1}. Conversely, if Ū^T = U^{-1}, then the matrix of dot products is the identity matrix, and it follows immediately that r1, r2, ..., rn form an orthonormal basis of C^n. The proof for column vectors is similar, and appears as a tutorial question.

The property that U^{-1} = Ū^T, which characterizes unitary matrices, was shown in first-year Algebra Chapter 4 to hold for rotation matrices. In fact, since rotations about the origin do not change lengths of vectors or the angles between them, it follows that rotations take orthonormal bases to orthonormal bases, and hence that any rotation of axes in R² or R³ is defined by a (real) unitary matrix. Reflections of the axes in a line or plane through the origin also leave lengths and angles unchanged (in magnitude) and also correspond to unitary matrices.

A square matrix A is said to be hermitian (after the French mathematician Hermite) if Ā^T = A. This means that a_{kj} = ā_{jk} for all j and k. A real hermitian matrix is called symmetric, since it is equal to its transpose.
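A quick check of the hermitian condition, using the 2 × 2 matrix of Question 13(c) (NumPy is my own choice of tool); its eigenvalues come out real, as the theorem below guarantees.

```python
import numpy as np

# The 2 x 2 hermitian matrix of Question 13(c): equal to its conjugate transpose,
# so in particular its diagonal entries are real.
A = np.array([[4, -1 + 1j],
              [-1 - 1j, 5]])

print(np.allclose(A, A.conj().T))            # True: A is hermitian

lam = np.linalg.eigvals(A)
print(np.allclose(lam.imag, 0))              # True: the eigenvalues (3 and 6) are real
```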
Hermitian matrices occur in various applications, as we shall show later; the important thing about them is that they can always be diagonalized by a unitary matrix. We start with a preliminary result.

Lemma. If A is an n × n hermitian matrix, then (uA) · v = u · (vA) for any 1 × n row vectors u and v. Similarly (AX) · Y = X · (AY) for any n × 1 column vectors X and Y.

Proof. By the expression for the dot product as a matrix product, and since the transpose of a matrix product is the product of the transposes in the reverse order, we have

u · (vA) = u (v̄Ā)^T = u Ā^T v̄^T = u A v̄^T = (uA) · v,

since Ā^T = A for a hermitian matrix. The proof for column vectors appears as a tutorial question.

Theorem. If A is an n × n hermitian matrix, then
(i) the eigenvalues of A are all real,
(ii) there is an orthonormal basis of C^n consisting of eigenvectors of A,
(iii) there is a unitary matrix U such that Ū^T A U is diagonal.

Proof. (i) Suppose X is a non-zero column eigenvector of A, with corresponding eigenvalue α. Then AX = αX, so (AX) · X = (αX) · X = α(X · X) = α|X|², using properties of dot products. But by the lemma we have (AX) · X = X · (AX) = X · (αX) = ᾱ(X · X) = ᾱ|X|². (By Question 2 the constant α must be conjugated when it is taken out of the second factor.) Since |X|² ≠ 0, we have α = ᾱ, i.e., α is real.

(ii) Suppose X and Y are column eigenvectors corresponding to distinct eigenvalues α and β. Then (AX) · Y = (αX) · Y = α(X · Y), and also (AX) · Y = X · (AY) = X · (βY) = β̄(X · Y) = β(X · Y), since A is hermitian and β is real by (i). By equating the right-hand sides we see that X · Y = 0, since α ≠ β. This shows that eigenvectors for distinct eigenvalues are orthogonal, and by dividing by their magnitudes we can make them unit vectors. The details for repeated eigenvalues are beyond us at this stage.

(iii) Let U be a matrix whose columns form an orthonormal basis of eigenvectors of A, as obtained in (ii).
Then U is unitary (because its columns are mutually perpendicular unit vectors), so U^{-1} = Ū^T. Also U^{-1}AU is diagonal (because the columns of U are eigenvectors). By combining these results we see that Ū^T A U is diagonal, as required.

The following table summarizes diagonalizability properties of square matrices:
• Any hermitian matrix is diagonalizable by a unitary matrix of eigenvectors.
• A non-hermitian matrix with distinct eigenvalues is diagonalizable, but the invertible matrix of eigenvectors may not be unitary.
• A non-hermitian matrix with one or more repeated eigenvalues may or may not be diagonalizable, depending on the total number of independent eigenvectors.

Tutorial questions — Unitary and hermitian matrices

8. Show that the matrix U below is unitary, by calculating U Ū^T.

U = (1/√5)[[1 + i, 1, -1 + i, 0], [0, 1 + i, 1, -1 + i], [-1 + i, 0, 1 + i, 1], [1, -1 + i, 0, 1 + i]].

9. For N = 1 and N = 2 write down the matrix whose rows are the vectors e_{-N+1}, ..., e_N given in Question 6. Verify that each matrix is unitary.

10. If C1, C2, ..., Cn are the column vectors of an n × n matrix U, show that U^T Ū is the matrix of dot products of C1, C2, ..., Cn. (Hint: remember that C_k · C_m = (C_k)^T C̄_m because they are column vectors.) Deduce that U^{-1} = Ū^T if and only if C1, C2, ..., Cn form an orthonormal basis.

11. (i) If U is a unitary matrix, show that |det U| = 1. (Hint: remember U Ū^T = In, and take determinants of both sides, noting that det(Ū) is the conjugate of det U. Why?) It follows that a real unitary matrix has determinant equal to ±1. The sign determines whether it is a rotation or a reflection matrix.
(ii) Show that the 2 × 2 rotation matrix [[cos α, sin α], [-sin α, cos α]] is unitary and has determinant +1.
(iii) Show that the 2 × 2 matrix [[cos α, sin α], [sin α, -cos α]] is unitary and has determinant -1. (It corresponds to a reflection of axes in the line with polar angle α/2.)

12.
If A is an n × n hermitian matrix and X and Y are n × 1 column vectors, prove that (AX) · Y = X · (AY). Also prove that the diagonal entries of A are real.

13. Find unitary matrices that will diagonalize the following hermitian matrices, i.e., find orthonormal bases of eigenvectors. Make sure that they are all unit vectors, and that the eigenvectors for a repeated eigenvalue are orthogonal. Distinct eigenvalues are given. (Save (e) and (f) for revision.)
(a) [[9, -2], [-2, 6]] (b) [[7, -9], [-9, -17]] (c) [[4, -1 + i], [-1 - i, 5]] (d) [[-1, 4, -8], [4, -7, -4], [-8, -4, -1]] (±9)
(e) [[185, 48, -12], [48, 313, -36], [-12, -36, 178]] (169, 338) (f) -1 1 - 2i 0 2 i 2i -2 -3 0 -3 0 3i 2 -i -3i -2 (3, -3, -6).

14. (i) If A = [[a, b], [b, d]], a general 2 × 2 real symmetric matrix, solve the characteristic equation cA(x) = 0, and show that the eigenvalues of A are λ1 = ½(a + d) + √(¼(a - d)² + b²) and λ2 = ½(a + d) - √(¼(a - d)² + b²).
(ii) Show that the circle with centre at the point P(½(a + d), 0) and passing through the point Q(a, b) has equation cA(x) + y² = 0. (Hint: let R(x, y) be on the circle; then |RP|² = |PQ|² = |PT|² + |TQ|², where T is the point (a, 0).) Let L (on the left) and R (on the right) be the points where the circle cuts the x-axis. Show that R is (λ1, 0) and L is (λ2, 0). (Hint: put y = 0 in the equation of the circle.) The circle is called Mohr's circle; it provides a graphical method for finding eigenvalues of 2 × 2 real symmetric matrices.
(iii) Draw (accurately) Mohr's circles for the matrices [[9, -2], [-2, 6]] and [[7, -9], [-9, -17]]. Hence find their eigenvalues and check your answers by solving the characteristic equations. Then find the eigenvectors and verify that they are parallel to LQ and RQ.
(iv)* Show that the eigenvectors of the general symmetric matrix A are parallel to LQ and RQ. (Hint: show that vector LQ = (a - λ2, b) and that (A - λ1I)(LQ)^T = 0.)

Applications

There are many applications of diagonalization of hermitian matrices by unitary matrices.
First of all, vector differential equations can be solved, as before. In particular, if we have a real force field F = AX, where A is a constant matrix and X = (x, y, z)^T, then, since force fields are conservative (unless the total energy in the system changes), it follows that A is symmetric. Hence there exists a real unitary matrix U such that U^TAU is diagonal. The path of a particle of mass m in such a field satisfies the differential equation F = mẌ, i.e., Ẍ = (1/m)AX. If we define Y = U^TX (which corresponds to a rotation of axes to the directions given by the columns of U), then X = UY and we obtain Ÿ = (1/m)(U^TAU)Y, which is easily solved, since U^TAU is diagonal.

Secondly, real quadratic forms in two or more variables can be reduced to canonical form by this method. If Q = ax² + by² + cz² + 2dxy + 2exz + 2fyz, then it is easy to check that Q = X^TAX, where X = (x, y, z)^T as before, and A = [[a, d, e], [d, b, f], [e, f, c]]. If we again define Y = U^TX, then we obtain Q = Y^T(U^TAU)Y, which is in canonical form, because U^TAU is diagonal. The new axes, in the directions of the columns of U, are called the principal axes of the quadratic form. The coefficients in the canonical form are the diagonal entries in U^TAU, i.e., the eigenvalues of A.

In particular, an explicit quadric surface, say

z = ax² + 2bxy + cy² = (x y)[[a, b], [b, c]](x, y)^T = X^TAX = αu² + βv²,

is a saddle if the eigenvalues α and β of A have opposite signs, a cup if they are both positive, and a cap if they are both negative. Since det A = αβ, the product of the eigenvalues, we have a saddle if det A < 0, and a cup or cap if det A > 0. In the latter case, the sign of the trace of A will distinguish cup from cap, since tr A = α + β, the sum of the eigenvalues.
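The det/trace test in the last paragraph is easy to mechanize. In the sketch below (NumPy and the three sample surfaces are my own assumptions, chosen to hit each case), a surface z = ax² + 2bxy + cy² is classified from the matrix [[a, b], [b, c]].

```python
import numpy as np

# Classify z = a x^2 + 2b xy + c y^2 using det A = alpha*beta and tr A = alpha+beta.
def classify(a, b, c):
    A = np.array([[a, b], [b, c]], dtype=float)
    d, t = np.linalg.det(A), np.trace(A)
    if d < 0:
        return "saddle"                      # eigenvalues of opposite signs
    if d > 0:
        return "cup" if t > 0 else "cap"     # both positive / both negative
    return "degenerate"                      # a zero eigenvalue: test inconclusive

print(classify(1, 2, 1))     # saddle (eigenvalues 3 and -1)
print(classify(2, 1, 2))     # cup (eigenvalues 3 and 1)
print(classify(-2, 1, -2))   # cap (eigenvalues -1 and -3)
```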
Near a stationary point on any surface, we showed in Calculus Chapter 3 how ∆z is approximated by a quadratic form in ∆x and ∆y, thus: ∆z ≈ z_xx ∆x² + 2z_xy ∆x∆y + z_yy ∆y², in which the coefficient matrix is [[z_xx, z_xy], [z_xy, z_yy]], which has determinant z_xx z_yy - z_xy² and trace z_xx + z_yy. From the previous paragraph it follows that there is a saddle if z_xx z_yy - z_xy² < 0, and if z_xx z_yy - z_xy² > 0, then there is a cup or cap (proper minimum or maximum), which can be distinguished by inspecting the sign of z_xx + z_yy. This confirms the conclusions reached in Calculus Chapter 3.

*Thirdly, if a rigid body is under stress but in equilibrium, then the stresses at any point can be expressed in the form of a matrix (strictly speaking, a tensor)

A = [[σx, τxy, τxz], [τyx, σy, τyz], [τzx, τzy, σz]],

where the σs are tensile or compressive stresses parallel to the axes, and the τs are shear stresses. Since the body is in equilibrium, it follows that τxy = τyx, etc., so the stress tensor A is symmetric. If the axes are rotated to the directions of a real unitary matrix U, then it can be shown that the stress tensor becomes U^TAU. In particular, if we take U to be a matrix of eigenvectors, then U^TAU is diagonal, so the shear stresses are all zero, and the longitudinal stresses are the eigenvalues of A, which are called the principal stresses. The corresponding axes (in the directions of the eigenvectors) are called the principal axes of stress.

*For plane stresses, where the stress tensor A is 2 × 2, Mohr's circle (see Question 14 with a = σx, b = τxy, and d = σy) is very useful. We have shown that the points Q(σx, τxy) and S(σy, τxy) lie on the circle. If the axes are rotated, then the entries in the new stress tensor U^TAU correspond to other points on the same circle. This was shown in Question 14 for the principal stresses, which are the eigenvalues of A.
(More precisely, if u and v axes are obtained by rotating through an angle θ, then the point Q′(σ_u, τ_uv) on the circle satisfies ∠QLQ′ = −θ, i.e., ∠QPQ′ = −2θ.)

*Fourthly, if A is an m × n matrix (not necessarily square), it is easy to show that the matrices A Ā^T and Ā^T A are both hermitian. Thus there exist unitary matrices U and V, and real diagonal matrices D and E, such that Ū^T A Ā^T U = D and V̄^T Ā^T A V = E. The non-zero entries in D and E are the same, are all positive, and are equal in number to the rank of A. It can also be shown that (after rearrangement of columns, if necessary) Ū^T A V = S say, an m × n diagonal matrix whose non-zero entries are the square roots of those in D and E. The diagonal entries in S are called the singular values of A, and the expression A = U S V̄^T is called the singular value decomposition of A.

Tutorial questions — Applications

15. (i) Find curl(rA) if A is a constant 3 × 3 matrix and r = (x, y, z). Deduce that the vector field rA is conservative if and only if A is symmetric.
(ii) Use the diagonalization obtained in Question 13(b) to find the general path of a particle of unit mass in the force field F = (7x − 9y, −9x − 17y).
(iii) Repeat part (ii) with the force field F = (−x + 4y − 8z, 4x − 7y − 4z, −8x − 4y − z), using the diagonalization obtained in Question 13(d).

16. Discuss how the signs of the eigenvalues of a 3 × 3 real symmetric matrix A determine the nature of the implicit quadric surface X^T A X = 1. (See Calculus Chapter 3.)

17. Use the eigenvalues from Question 13(d) to rewrite the left hand sides below in canonical form. Hence identify the implicit quadric surfaces defined by the equations.
(a) −x² − 7y² − z² + 8xy − 16xz − 8yz = 9
(b) −x² − 7y² − z² + 8xy − 16xz − 8yz = −9
(c) −x² − 7y² − z² + 8xy − 16xz − 8yz = 0.

*18.
Use Question 14(ii) and the fact that the matrices A and U^{−1} A U have the same characteristic polynomial (see Algebra Chapter 3 Question 31(b)) to show that rotation of axes does not change the Mohr circle.

19. If A is an m × n matrix, show that Ā^T A is hermitian, and that its eigenvalues are non-negative. (Hint: let X be a non-zero column eigenvector corresponding to an eigenvalue σ. Show that |AX|² = σ|X|².)

Fourier series

It is known that a smooth function can be approximated near 0 by a polynomial of degree n; then by letting n → ∞ we obtain the Maclaurin series, which gives an exact infinite series expression for the function, at least near 0. We now show how orthonormal bases can be used to find another series expression, called the Fourier series, for a function. The difference is that:
• to have a Maclaurin series the function must be differentiable infinitely many times,
• to have a Fourier series the function must be periodic, but need not even be continuous.

Suppose therefore that the function f(t) is periodic of period 2l, i.e., there is a positive number l such that f(t + 2l) = f(t) for all t. This simply means that the graph y = f(t) repeats itself after any interval of length 2l. We allow f(t) to take complex values, but t must always be real. For example, the function e^{it} is periodic of period 2π, since e^{i(t+2π)} = e^{it} e^{2iπ} = e^{it}. The sound wave of a musical note is periodic, but that of a noise is usually not periodic.

To obtain the Fourier series of a function of period 2l we use the method of interpolation or sampling, as in digital recordings. We subdivide any interval of length 2l, say the interval from t = 0 to t = 2l, into 2N equal subintervals of length ∆t, where ∆t = l/N. This gives 2N values f(k∆t), from k = 1 to k = 2N, which we divide by √(2N) and write as a vector

    f = (1/√(2N)) ( f(∆t), f(2∆t), . . . , f(2N∆t) ),

which we call the sampling vector of the function f(t).

Theorem.
The sampling vectors of the functions e^{niπt/l}, for n = −N+1, . . . , N, form an orthonormal basis of C^{2N}.

Proof. Let e_n denote the sampling vector of e^{niπt/l}. Then, by definition of sampling vector,

    e_n = (1/√(2N)) ( e^{niπ∆t/l}, e^{2niπ∆t/l}, . . . , e^{2Nniπ∆t/l} ).

But ∆t/l = 1/N, so

    e_n = (1/√(2N)) ( e^{niπ/N}, e^{2niπ/N}, . . . , e^{2Nniπ/N} )
        = (1/√(2N)) ( ε^n, ε^{2n}, . . . , ε^{2Nn} ),

if we put ε = e^{iπ/N}. Thus e_n is exactly the same as in Question 6, where we showed that we can form an orthonormal basis of C^{2N} by taking e_n for any 2N successive values of n, in particular, from n = −N+1 to n = N.

We can now express our sampling vector f in terms of this orthonormal basis, to get

    f = Σ_{n=−N+1}^{N} λ_n e_n,   where λ_n = f · e_n.

The left hand side is the sampling vector of f(t), and the right hand side is the sampling vector of the function Σ_{n=−N+1}^{N} λ_n e^{niπt/l}, which is also periodic of period 2l. Since the sampling vectors of the two functions are equal, it follows that the function values coincide whenever t is a multiple of l/N, i.e., at 2N points in every period. At intermediate values of t the functions may not be exactly equal, and the best we can do in general is to write

    f(t) ≈ Σ_{n=−N+1}^{N} λ_n e^{inπt/l}   for all t.   (∗)

[Figure 4.1. Fourier approximations to a square wave, for N = 4, N = 12, and N = 36.]

In order to improve the accuracy of this approximation, we let ∆t → 0, i.e., N → ∞. This means that the number of points in each period where the two functions are exactly equal tends to infinity, so the right hand side gets closer and closer to f(t). The effect of increasing N is shown in Figure 4.1: the function f(t) is a square wave, shown by the dotted line in each graph. The approximations for different values of N are given by solid lines: note how they oscillate from one side of f(t) to the other, and how they become more accurate as N increases, except near the discontinuities. In the limit, the result is exact, except at the discontinuities themselves.
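The sampling construction can be spot-checked numerically. The sketch below is our own illustration (with l = π, N = 8, and a square wave chosen arbitrarily): it tests the orthonormality of a couple of the vectors e_n, and confirms that the trigonometric polynomial Σ λ_n e^{inπt/l} reproduces f(t) exactly at the 2N sample points, as claimed for (∗):

```python
import numpy as np

l, N = np.pi, 8
dt = l / N
t = dt * np.arange(1, 2 * N + 1)          # sample points dt, 2dt, ..., 2N dt

def sample(g):
    """Sampling vector of g, as defined in the text."""
    return g(t) / np.sqrt(2 * N)

# The vectors e_n for n = -N+1, ..., N.
e = {n: sample(lambda s, n=n: np.exp(1j * n * np.pi * s / l))
     for n in range(-N + 1, N + 1)}

# Orthonormality: e_m . e_n is 1 if m == n, and 0 otherwise.
# (np.vdot conjugates its first argument, matching the dot product used here.)
print(abs(np.vdot(e[3], e[3])))           # ~ 1.0
print(abs(np.vdot(e[3], e[-2])))          # ~ 0.0

# Square wave f(t) = sign(sin t); lambda_n = f . e_n.
f = sample(lambda s: np.sign(np.sin(s)))
lam = {n: np.vdot(e[n], f) for n in e}

# The trigonometric polynomial agrees with f at every sample point:
approx = sum(lam[n] * np.exp(1j * n * np.pi * t / l) for n in lam)
print(np.max(np.abs(approx - np.sign(np.sin(t)))))  # ~ 0
```

Between the sample points the polynomial only approximates the square wave, oscillating about it as in Figure 4.1.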
Letting N → ∞ obviously turns the sum on the right hand side into an infinite series from n = −∞ to n = ∞. What is the effect on each coefficient λ_n, where n is kept constant?

    lim_{N→∞} λ_n = lim_{N→∞} f · e_n
                  = lim_{N→∞} (1/√(2N)) ( f(∆t), . . . , f(2N∆t) ) · (1/√(2N)) ( e^{inπ/N}, . . . , e^{2Ninπ/N} )
                  = lim_{N→∞} (1/(2N)) Σ_{k=1}^{2N} f(k∆t) e^{−iknπ/N}   (conjugates in the second factor!)
                  = (1/(2l)) lim_{∆t→0} Σ_{k=1}^{2N} f(k∆t) e^{−iknπ∆t/l} ∆t   (since ∆t = l/N)
                  = (1/(2l)) ∫_0^{2l} f(t) e^{−inπt/l} dt,

since an integral can be expressed as the limit of a sum. The limit of λ_n is called the nth complex Fourier coefficient of the function f(t), and we shall denote it by c_n. If we let N tend to infinity in equation (∗), then we expect to obtain f(t) = Σ_{n=−∞}^{∞} c_n e^{inπt/l}, which is called the complex Fourier series expansion of f(t). Although the two sides are exactly equal at infinitely many points in each period, this does not necessarily mean at every single point. However, all periodic functions we encounter are equal to their Fourier series expressions at all points of continuity.

Because f(t) and e^{−inπt/l} are both periodic of period 2l, the integral expression for c_n can be evaluated over any interval of length 2l. In practice, it is usually best to integrate from −l to l, in order to simplify the integration if f(t) is an even or odd function. Thus we have the following result, although we are unable to prove the details about convergence, and we assume the function is sufficiently well behaved.

Theorem. If f(t) is of period 2l, then the complex Fourier series of f(t) is

    Σ_{n=−∞}^{∞} c_n e^{inπt/l},   where c_n = (1/(2l)) ∫_{−l}^{l} f(t) e^{−inπt/l} dt.

At points where f(t) is continuous the Fourier series converges to f(t), and at finite discontinuities the Fourier series converges to the midpoint of the jump.

By pairing off the terms for ±1, ±2, . . .
, we can re-write the Fourier series as

    c_0 + Σ_{n=1}^{∞} ( c_n e^{inπt/l} + c_{−n} e^{−inπt/l} ),

and if f(t) is real, then it is easy to see from the integral formula for the Fourier coefficients that c_{−n} = c̄_n, so c_0 is real, and the Fourier series becomes

    c_0 + Σ_{n=1}^{∞} ( c_n e^{inπt/l} + conj(c_n e^{inπt/l}) ) = c_0 + 2 Σ_{n=1}^{∞} Re( c_n e^{inπt/l} ).

If we now define a_n = c_n + c̄_n = 2 Re(c_n) and b_n = i(c_n − c̄_n) = −2 Im(c_n), then c_0 = ½ a_0 and c_n = ½(a_n − i b_n), so

    2 Re( c_n e^{inπt/l} ) = Re( (a_n − i b_n)(cos(nπt/l) + i sin(nπt/l)) ) = a_n cos(nπt/l) + b_n sin(nπt/l).

Thus for a real-valued function f(t) of period 2l the Fourier series can be written as

    ½ a_0 + Σ_{n=1}^{∞} ( a_n cos(nπt/l) + b_n sin(nπt/l) ),

where

    a_n = (1/l) ∫_{−l}^{l} f(t) cos(nπt/l) dt   and   b_n = (1/l) ∫_{−l}^{l} f(t) sin(nπt/l) dt.

The results about convergence are, of course, the same as before. In practice it is generally easier to find the complex Fourier series and then combine the terms for ±n, since only one integration is required.

Tutorial questions — Fourier series

20. If f(t) has period 2 and f(t) = 1 − t for −1 < t < 1, find the complex Fourier series of f(t). Use the theorem about convergence of Fourier series to sketch the graph of the function to which the series converges between t = −4 and t = 4. Rewrite the series in real form.

21. Find the complex Fourier series of the function f(x) of period 1 such that f(x) = 0 for −½ < x < 0 and f(x) = x for 0 < x < ½. Write the series in real form, and sketch the graph of the function to which the series converges between x = 0 and x = 4. By considering the value to which the series converges when x = ½, show that 1 + 1/9 + 1/25 + 1/49 + · · · = π²/8.

22. Find the complex Fourier series of the function f(x) of period 2π such that f(x) = x² for −π < x < π. Write it in real form, and by considering x = 0 and x = π deduce that

    Σ_{n=1}^{∞} (−1)^{n−1}/n² = π²/12   and   Σ_{n=1}^{∞} 1/n² = π²/6.

23.
If f(t) has period 2π and f(t) = e^{iαt} for −π < t < π, where α is not an integer, find the complex Fourier series for f(t). (Hint: e^{±iπ} = −1.) To what value does the series converge when t = 0? (Hint: f(t) is continuous at t = 0.) Deduce that

    π/sin πα = Σ_{n=−∞}^{∞} (−1)^n/(α − n).

By what test can you be sure that the series on the right hand side converges? Show that f(t) jumps from the value e^{iαπ} to the value e^{−iαπ} at t = π. To what value does the Fourier series converge here? By pairing off the terms for ±1, ±2, . . . show that

    Σ_{n=1}^{∞} 1/(n² − α²) = (1 − πα cot πα)/(2α²).

By what test can you be sure that the series on the left hand side converges? Use L'Hôpital's Rule to find the limit of the right hand side as α → 0, and hence evaluate Σ_{n=1}^{∞} 1/n².

24. (i) Assuming f(t) is real-valued, apply the formulae a_n = 2 Re(c_n) and b_n = −2 Im(c_n) to the integral expression for c_n, and hence obtain the formulae for a_n and b_n given above.
(ii) If in addition f(t) is either an even function or an odd function, show that these formulae simplify further, as follows:
if f(t) is an even function, then a_n = (2/l) ∫_0^l f(t) cos(nπt/l) dt and b_n = 0;
if f(t) is an odd function, then a_n = 0 and b_n = (2/l) ∫_0^l f(t) sin(nπt/l) dt.

25. The following Fourier series of period 2π were found in Algebra Chapter 1 Question 48:

    ln(2|cos ½θ|) = Σ_{n=1}^{∞} ((−1)^{n+1}/n) cos nθ   and   arctan(tan ½θ) = Σ_{n=1}^{∞} ((−1)^{n+1}/n) sin nθ,

where the first function is even, and the second function is odd. Using the formulae for a_n and b_n given in Question 24, show without integrating that

    ∫_0^π ln(2|cos ½θ|) cos nθ dθ = (−1)^{n+1} π/(2n) = ∫_0^π arctan(tan ½θ) sin nθ dθ.

Confirm this result by simplifying arctan(tan ½θ) and evaluating the second integral. (The first integral cannot be evaluated by elementary methods.)

*26. Show that if f(t) is periodic of period 2l, and if f is the sampling vector, as above, then lim_{N→∞} f · f = (1/(2l)) ∫_{−l}^{l} |f(t)|² dt.
Hence adapt the result of Question 7 to prove that

    (1/(2l)) ∫_{−l}^{l} |f(t)|² dt = Σ_{n=−∞}^{∞} |c_n|².

(This is called Parseval's identity.)

Answers

1. (a) √6, 2√3; (b) √7, 2√2; (c) 5, √26.

2. a · (cb) = conj((cb) · a) = conj(c(b · a)) = conj(c) conj(b · a) = conj(c) (a · b).

3. β = 1 − 3i, |(1, α, β)| = √13.

5. a · u = (3/5)(1 + 2i), a · v = (1/5)(4 − i), a · w = (1/5)(−3 − 2i).

6. (i) The sum is 2N if ε^{n−m} = 1. (Every term equals 1.)

7. |a|² = a · a = (λ_1 e_1 + λ_2 e_2 + · · ·) · a = λ_1 (e_1 · a) + λ_2 (e_2 · a) + · · · = λ_1 conj(a · e_1) + λ_2 conj(a · e_2) + · · · = λ_1 conj(λ_1) + λ_2 conj(λ_2) + · · · = |λ_1|² + |λ_2|² + · · ·.

9. (1/√2) [ 1  1 ; i  −i ],   (1/2) [ 1  1  1  1 ; 1  i  −1  −i ; 1  −1  1  −1 ; 1  −i  −1  i ].

11. (i) Transposing does not change the determinant, and conjugating the matrix also conjugates the determinant, since the conjugate of a sum or product is equal to the sum or product of the conjugates.

13. N.B. The unitary matrices given are not the only correct ones.
(a) λ = 5, 10, U = (1/√5) [ 1  −2 ; 2  1 ];
(b) λ = −20, 10, U = (1/√10) [ 1  −3 ; 3  1 ];
(c) λ = 3, 6, U = (1/√3) [ 1−i  −1 ; 1  1+i ];
(d) λ = 9, −9, −9, U = (1/3) [ 2  1  2 ; 1  2  −2 ; −2  2  1 ];
(e) λ = 169, 169, 338, U = (1/13) [ −3  12  4 ; 4  −3  12 ; 12  4  −3 ];
(f) λ = 3, 3, −3, 6, U = (1/√3) [ 1  i  −1  0 ; 0  −1  i  1 ; i  1  0  1 ; 1  0  1  i ].

14. (i) c_A(x) = x² − (a + d)x + ad − b² = (x − ½(a + d))² − (¼(a − d)² + b²).
(ii) |RP|² = (x − ½(a + d))² + y². |PQ|² = ¼(a − d)² + b². Points L and R are where y = 0, so c_A(x) = 0.
(iii) Centre at (7½, 0), passing through (9, −2). Centre at (−5, 0), passing through (7, −9).
(iv) (A − λ_1 I)(LQ) = ( (a − λ_1)(a − λ_2) + b²,  b(a − λ_2) + (d − λ_1)b ) = ( c_A(a) + b²,  b((a + d) − (λ_1 + λ_2)) ). The first component is zero by part (ii), since Q(a, b) is on the circle, and the second component is zero because a + d = tr A = λ_1 + λ_2.

15. (i) If A = (a_ij), then curl(rA) = (a_23 − a_32, a_31 − a_13, a_12 − a_21).
(ii) (x, y)^T = U ( A cos √20 t + B sin √20 t,  C cosh √10 t + D sinh √10 t )^T. (U as in Question 13.)
(iii) (x, y, z)^T = U ( A cosh 3t + B sinh 3t,  C cos 3t + D sin 3t,  E cos 3t + F sin 3t )^T.
16. +, +, +: ellipsoid; +, +, −: hyperboloid of one sheet; +, −, −: hyperboloid of two sheets; +, +, 0: elliptic cylinder; +, −, 0: hyperbolic cylinder; +, 0, 0: pair of planes. Others are impossible with 1 on the right hand side.

17. (a) u² − v² − w² = 1, hyperboloid of two sheets; (b) u² − v² − w² = −1, hyperboloid of one sheet; (c) u² − v² − w² = 0, cone.

18. If B = U^T A U = U^{−1} A U, then c_B(x) = c_A(x) by Algebra Chapter 3 Question 31(b). The equation of the Mohr circle of B is c_B(x) + y² = 0, which is the same equation as the Mohr circle of A.

19. conj(Ā^T A)^T = (A^T Ā)^T = Ā^T A, so Ā^T A is hermitian. If (Ā^T A)X = σX, then X̄^T (Ā^T A)X = X̄^T (σX), i.e., conj(AX)^T (AX) = σ X̄^T X, i.e., |AX|² = σ|X|².

20. 1 + (1/π) Σ_{n≠0} ((−1)^n/(in)) e^{inπt} = 1 + (2/π) Σ_{n=1}^{∞} ((−1)^n/n) sin nπt.
[Sketch: the limit function is a sawtooth wave for −4 ≤ t ≤ 4.]

21. 1/8 + (1/4) Σ_{n≠0} ( (−1)^{n+1}/(niπ) − (1 − (−1)^n)/(n²π²) ) e^{2niπx}
  = 1/8 + (1/2) Σ_{n=1}^{∞} ( ((−1)^{n+1}/(nπ)) sin 2nπx − ((1 − (−1)^n)/(n²π²)) cos 2nπx ).
Converges to ¼ at x = ½ (jumps from ½ to 0), so ¼ = 1/8 + (1/π²)(1 + 1/3² + 1/5² + · · ·).

22. π²/3 + 2 Σ_{n≠0} ((−1)^n/n²) e^{inx} = π²/3 + 4 Σ_{n=1}^{∞} ((−1)^n/n²) cos nx. Converges to 0 at x = 0 and to π² at x = π. (Points of continuity.)

23. (sin πα / π) Σ_{n=−∞}^{∞} ((−1)^n/(α − n)) e^{int}. Converges to e^{iα·0} = 1 at t = 0 (point of continuity). Check convergence by the alternating series test. Converges to ½(e^{iαπ} + e^{−iαπ}) = cos απ (midpoint of the jump) at t = π. Separate the term for n = 0 and combine the terms for ±n. Use the limit comparison test with the p-series for p = 2. Σ_{n=1}^{∞} 1/n² = π²/6.

25. arctan(tan ½θ) = ½θ for −π < θ < π, so the second integral = ½ ∫_0^π θ sin nθ dθ = ½ [ −(1/n) θ cos nθ + (1/n²) sin nθ ]_0^π = π(−1)^{n+1}/(2n).

Algebra Chapter 4
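As a closing illustration, Parseval's identity from Question 26 can be checked numerically against the coefficients found in Question 20 (a sketch only; the truncation level is our own choice):

```python
import numpy as np

# Parseval check for Question 20: f(t) = 1 - t on (-1, 1), period 2 (l = 1).
# From the answer to Question 20, c_0 = 1 and |c_n|^2 = 1/(n*pi)^2 for n != 0.

# Left hand side, in closed form: (1/2) * integral_{-1}^{1} (1 - t)^2 dt = (1/2)(8/3).
lhs = 0.5 * (8.0 / 3.0)

# Right hand side, truncated at |n| = 100000:
n = np.arange(1, 100001)
rhs = 1.0 + 2.0 * np.sum(1.0 / (n * np.pi) ** 2)

print(lhs)   # 1.3333333333333333
print(rhs)   # close to 4/3, short of the limit by roughly 2e-6
```

Both sides agree with Σ |c_n|² = 1 + (2/π²)(π²/6) = 4/3, up to the truncation error in the partial sum.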