MAS102 Calculus II

Lecture Notes prepared by R. Tavakol

© Queen Mary, University of London 2001–2003

Table of Contents

1 Ordinary Differential Equations
  1.1 Introduction
  1.2 ODEs of First Order and First Degree
    1.2.1 Separable Differential Equations
    1.2.2 Homogeneous ODEs (First Order)
    1.2.3 ODEs Reducible to Homogeneous Differential Equations
    1.2.4 Linear, First Order ODEs
    1.2.5 Bernoulli Equations
  1.3 Linear ODEs with Constant Coefficients
    1.3.1 The D Notation
    1.3.2 Solution of Homogeneous Part
    1.3.3 Particular Solutions to Non-homogeneous Equations
      Functions of the form f(x) = Ax^r (A is constant, r is integer)
      Functions of the form f(x) = be^{kx} (b, k are constants)
      Functions of the form f(x) = x^k e^{ax} (a, k are constants)
      Functions of the form f(x) = cos nx or sin nx (n constant)
      Functions of sums of constant multiples of any of the above
  1.4 Differential Equations of the Euler Type
  1.5 Simultaneous Linear Equations with Constant Coefficients
  1.6 Series Solutions of ODEs
    1.6.1 Picard's Method
    1.6.2 Taylor Series Method
    1.6.3 Frobenius Method (partial)
2 Functions of More Than One Variable
  2.1 Functions of One Variable
  2.2 Functions of Two Variables
  2.3 Limits and Continuity
    2.3.1 Limits and Continuity for Functions of One Variable
    2.3.2 Limits and Continuity for Functions of Two Variables
  2.4 Partial Differentiation
  2.5 Joint Differentiability
  2.6 Directional Derivative
  2.7 Chain Rule
  2.8 Taylor's Theorem for Functions of Two Variables
  2.9 Quadratic Forms
  2.10 Stationary Points of Functions of Two Variables
  2.11 Lagrange Multipliers
  2.12 Inverse Functions
  2.13 Implicit Functions
3 Multiple Integrals
  3.1 Rectangular Regions
  3.2 Non-rectangular Regions
  3.3 Change of Variables in Area Integrals
  3.4 Changing Variables Twice
  3.5 Volume Integrals
  3.6 Change of Variables in Triple Integrals

1 Ordinary Differential Equations

1.1 Introduction

Differential
equations (DE) arise whenever we consider changes in a system or the evolution of a system, since a rate of change is represented by a derivative.

Definition: An ordinary differential equation (ODE) is an equation involving a single unknown function of one independent variable and a finite number of its derivatives.

Examples (the variables are listed as (unknown function, independent variable)):

\[ (y, x): \quad \frac{dy}{dx} + yx = x^2 \tag{1.1} \]

\[ (x, t): \quad \left(\frac{d^4x}{dt^4}\right)^3 + \sin t\,\frac{dx}{dt} + x = 0 \tag{1.2} \]

\[ (y, x): \quad \frac{dy}{dx} + p(x)y + q(x) = 0 \tag{1.3} \]

\[ (y, x): \quad \frac{d^3y}{dx^3} = f(x)\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{5/2} \tag{1.4} \]

where $p$, $q$ and $f$ are given functions of $x$.

Definition: The order of a differential equation is the order of the highest derivative appearing.

Definition: The degree of a differential equation is the power to which the highest derivative is raised.

Definition: A differential equation in $y$ is said to be linear if it is linear in $y, y', y'', \ldots$; otherwise it is called nonlinear.

Examples: The order, degree and type for the examples in Eqs. (1.1)–(1.4) are given in Table 1.1 below.

Table 1.1. The order and degree of Eqs. (1.1)–(1.4)

  Example   Order   Degree   Type
  1.1       1       1        linear
  1.2       4       3        nonlinear
  1.3       1       1        linear
  1.4       3       2        nonlinear

Note that in the example given in Eq. (1.4) we have to take the square of the equation before we can find the degree. This gives

\[ \left(\frac{d^3y}{dx^3}\right)^2 = [f(x)]^2\left[1 + \left(\frac{dy}{dx}\right)^2\right]^5 . \]

We shall be mostly concerned with linear ODEs.

Definition: To solve a differential equation we need to find the unknown function. Note that this unknown function is $y$ in Eqs. (1.1), (1.3) and (1.4), and $x$ in Eq. (1.2).

Examples:

\[ \frac{dy}{dx} = 0 \;\Rightarrow\; y = c_1 \]

\[ \frac{d^2y}{dx^2} = 0 \;\Rightarrow\; y = c_1x + c_2 \]

\[ \frac{d^3y}{dx^3} = 0 \;\Rightarrow\; y = \tfrac{1}{2}c_1x^2 + c_2x + c_3 \]

where $c_1$, $c_2$ and $c_3$ are constants. Similarly, $d^ny/dx^n = 0$ has a general solution involving $n$ arbitrary constants. More generally, any $n$th order ODE has $n$ arbitrary constants in its general solution.

Example: Consider the following differential equation and its general solution:

\[ \frac{dy}{dx} = 1 \;\Rightarrow\; y = x + c . \]

Note that the general solution, $y = x + c$, where $c$ is a constant, determines a family of solutions depending on $c$. This is illustrated in Fig. 1.1. A particular solution passing through the point $(x_0, y_0)$ is given by

\[ y = x + y_0 - x_0 . \]

Fig. 1.1. Lines representing the general solution of $dy/dx = 1$.

1.2 Ordinary Differential Equations of First Order and First Degree

Ordinary differential equations of the first order and first degree are the simplest ODEs, but it is not always possible to solve them. Here we consider five classes of these equations that we can solve.

1.2.1 Separable Differential Equations

These are equations which can be written as

\[ \frac{dy}{dx} = f(x)\,g(y) . \]

We can start the process of "separation" by bringing the $g(y)$ term over to the left-hand side. This gives

\[ \frac{1}{g(y)}\frac{dy}{dx} = f(x) . \]

Now we integrate each side with respect to $x$, giving

\[ \int \frac{1}{g(y)}\frac{dy}{dx}\,dx = \int f(x)\,dx \]

and hence the solution is given by

\[ \int \frac{dy}{g(y)} = \int f(x)\,dx . \]

This reduces to a pure integration of each side of the equation to obtain the solution.

Example: Solve the differential equation

\[ \frac{dy}{dx} = \frac{y+1}{x-1} . \]

We can write

\[ \frac{1}{y+1}\frac{dy}{dx} = \frac{1}{x-1} \]

\[ \Rightarrow\; \int \frac{dy}{y+1} = \int \frac{dx}{x-1} \]

\[ \Rightarrow\; \ln|y+1| + c_1 = \ln|x-1| + c_2 \]

\[ \Rightarrow\; \ln|y+1| = \ln|x-1| + c \qquad (c = c_2 - c_1) \]

\[ \Rightarrow\; \ln\left|\frac{y+1}{x-1}\right| = c \]

\[ \Rightarrow\; \frac{y+1}{x-1} = \pm e^{c} = d \]

\[ \Rightarrow\; y + 1 = d(x-1) \]

where $d$ is an arbitrary constant. Note that the ODE is first order and so we obtain one constant.

Remark: In order to fix the arbitrary constants we require further conditions. These are the initial or boundary conditions (IC or BC).

Example: Solve

\[ \frac{dy}{dx} = 1 + y^2 - 2x - 2xy^2 \]

subject to the initial condition $y = 0$ at $x = 0$. We can rewrite the equation as

\[ \frac{dy}{dx} = (1 - 2x) + y^2(1 - 2x) \]

\[ \Rightarrow\; \frac{dy}{dx} = (1 - 2x)(1 + y^2) \quad \text{i.e. separable!} \]

\[ \Rightarrow\; \int \frac{dy}{1 + y^2} = \int (1 - 2x)\,dx \]

\[ \Rightarrow\; \tan^{-1}y = x - x^2 + c . \]

The initial conditions, $x = 0$, $y = 0$, give

\[ \tan^{-1}0 = 0 - 0 + c \;\Rightarrow\; c = 0 . \]

Hence the solution is

\[ \tan^{-1}y = x - x^2 \]

or, taking the tan of each side,

\[ y = \tan(x - x^2) . \]

1.2.2 Homogeneous ODEs (First Order)

Homogeneous ordinary differential equations are ones that can be put in the form

\[ \frac{dy}{dx} = f\!\left(\frac{y}{x}\right) . \]

Example: If we take the differential equation

\[ \frac{dy}{dx} = \frac{2xy}{x^2 + y^2} \]

and divide the numerator and denominator of the right-hand side by $x^2$ we have

\[ \frac{dy}{dx} = \frac{2(y/x)}{1 + (y/x)^2} \equiv f\!\left(\frac{y}{x}\right) \]

and so the differential equation is homogeneous.

Homogeneous equations can be put into a separable form by writing

\[ v = \frac{y}{x} \;\Rightarrow\; y = vx \]

where $v$ is an unknown function of $x$,

\[ \Rightarrow\; \frac{dy}{dx} = x\frac{dv}{dx} + v . \]

Example: Solve the differential equation

\[ \left(2\sqrt{xy} - x\right)\frac{dy}{dx} + y = 0 . \]

To see if it is homogeneous we can rewrite the equation as

\[ \frac{dy}{dx} = \frac{-y}{2\sqrt{xy} - x} = \frac{-(y/x)}{2\sqrt{y/x} - 1} = f\!\left(\frac{y}{x}\right) \]

where we have divided top and bottom by $x$. Now let $y = vx$ such that

\[ \frac{dy}{dx} = x\frac{dv}{dx} + v . \]

Now substitute for $y/x = v$ and $dy/dx = x(dv/dx) + v$:

\[ \frac{dy}{dx} = x\frac{dv}{dx} + v = \frac{-v}{2\sqrt{v} - 1} \]

\[ \Rightarrow\; x\frac{dv}{dx} = \frac{-v - (2v^{3/2} - v)}{2\sqrt{v} - 1} = \frac{-2v^{3/2}}{2\sqrt{v} - 1} \]

\[ \Rightarrow\; \frac{dx}{x} = -\frac{2\sqrt{v} - 1}{2v^{3/2}}\,dv \quad \text{(separable)} . \]

This integrates to give

\[ \ln|x| = -\ln|v| - \frac{1}{\sqrt{v}} + c \;\Rightarrow\; \ln|vx| = -\frac{1}{\sqrt{v}} + c . \]

But $v = y/x$ and $vx = y$. Therefore our solution should be written as

\[ \ln|y| + \frac{1}{\sqrt{y/x}} = c . \]

This is the general solution of the differential equation. We cannot determine $c$ without knowing boundary conditions.

Note: The solution we have obtained is implicit, i.e. it cannot be expressed as $y = f(x)$. An implicit solution gives a relationship between $x$ and $y$ but not necessarily of the form $y = f(x)$.

Remark: Often the solutions of ODEs are implicit. In problems one should try to simplify as much as possible.

1.2.3 ODEs Reducible to Homogeneous Differential Equations

Differential equations that can be reduced to ones of the homogeneous type have the form

\[ \frac{dy}{dx} = \frac{ax + by + c}{fx + gy + h} \]

where $a$, $b$, $c$, $f$, $g$ and $h$ are constants. The right-hand side of the differential equation would be homogeneous if $c = h = 0$. Therefore we will make a linear transformation to new variables $(X, Y)$ such that these constant parts vanish. We set

\[ x = X + x_0 \quad \text{and} \quad y = Y + y_0 \]

and choose $(x_0, y_0)$ in order to make the right-hand side of the differential equation homogeneous. Now, with the new variables,

\[ \frac{dy}{dx} = \frac{dY}{dx} = \frac{dY}{dX} \]

so that the differential equation can now be written as

\[ \frac{dY}{dX} = \frac{a(X + x_0) + b(Y + y_0) + c}{f(X + x_0) + g(Y + y_0) + h} = \frac{aX + bY + (ax_0 + by_0 + c)}{fX + gY + (fx_0 + gy_0 + h)} . \]

In order to make the equation homogeneous we must set the terms in the brackets equal to zero. Hence the values of $x_0$ and $y_0$ will be determined by the solutions of the simultaneous, linear equations

\[ ax_0 + by_0 + c = 0 \]

\[ fx_0 + gy_0 + h = 0 . \]

The differential equation can now be written

\[ \frac{dY}{dX} = \frac{aX + bY}{fX + gY} = \frac{a + bY/X}{f + gY/X} . \]

This is now homogeneous and can be solved using the usual technique of setting $Y = VX$.

Example: Solve the differential equation

\[ \frac{dy}{dx} = \frac{y - x + 1}{y + x + 5} . \]

We set

\[ x = X + x_0, \qquad y = Y + y_0 \]

and so the differential equation becomes

\[ \frac{dY}{dX} = \frac{Y - X + (y_0 - x_0 + 1)}{Y + X + (y_0 + x_0 + 5)} . \]

In order to find $x_0$ and $y_0$ we need to solve the equations

\[ y_0 - x_0 = -1 \]

\[ y_0 + x_0 = -5 . \]

The solution is $x_0 = -2$, $y_0 = -3$. The differential equation to be solved is now

\[ \frac{dY}{dX} = \frac{Y - X}{Y + X} . \]

Now we let $Y = VX$ and the differential equation is written as

\[ \frac{dY}{dX} = V + X\frac{dV}{dX} = \frac{V - 1}{V + 1} \]

\[ \Rightarrow\; X\frac{dV}{dX} = -\frac{1 + V^2}{1 + V} \quad \text{(separable)} \]

\[ \Rightarrow\; \int \frac{1 + V}{1 + V^2}\,dV = -\int \frac{dX}{X} \]

\[ \Rightarrow\; \tan^{-1}V + \tfrac{1}{2}\ln|1 + V^2| = -\ln|X| + C . \]

Rearranging,

\[ \ln|1 + V^2| + 2\ln|X| = -2\tan^{-1}V + 2C \]

\[ \Rightarrow\; \ln|X^2(1 + V^2)| = -2\tan^{-1}V + 2C \]

\[ \Rightarrow\; X^2(1 + V^2) = D\,e^{-2\tan^{-1}V} \]

\[ \Rightarrow\; X^2 + X^2V^2 = D\,e^{-2\tan^{-1}V} . \]
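The implicit relation just obtained can be sanity-checked numerically. The following is a minimal sketch, not part of the original notes: it integrates $dy/dx = (y - x + 1)/(y + x + 5)$ with a classical fourth-order Runge-Kutta step and monitors the quantity $\ln(X^2(1+V^2)) + 2\tan^{-1}V$, which by the working above should stay equal to the constant $2C$ along a solution. The starting point $(0, 0)$, the step size and the function names are illustrative assumptions.

```python
import math

X0, Y0 = -2.0, -3.0  # the shift (x0, y0) found in the example above


def slope(x, y):
    """Right-hand side of dy/dx = (y - x + 1) / (y + x + 5)."""
    return (y - x + 1.0) / (y + x + 5.0)


def invariant(x, y):
    """ln(X^2 (1 + V^2)) + 2 atan(V), with X = x - x0, Y = y - y0, V = Y / X."""
    X, Y = x - X0, y - Y0
    V = Y / X
    return math.log(X * X * (1.0 + V * V)) + 2.0 * math.atan(V)


def rk4_step(x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = slope(x, y)
    k2 = slope(x + h / 2, y + h * k1 / 2)
    k3 = slope(x + h / 2, y + h * k2 / 2)
    k4 = slope(x + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def max_drift(x=0.0, y=0.0, h=0.01, steps=100):
    """Largest change of the invariant along the numerical solution."""
    start = invariant(x, y)
    worst = 0.0
    for _ in range(steps):
        y = rk4_step(x, y, h)
        x += h
        worst = max(worst, abs(invariant(x, y) - start))
    return worst


print("largest drift of the invariant:", max_drift())
```

On the chosen interval $X$ stays positive and the denominator $y + x + 5$ does not vanish, so a drift at the level of the integration error is consistent with the analytic result.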
Here $D = e^{2C}$. Now we substitute for $X$ and $V$ using $X = x + 2$, $Y = y + 3$ and $Y = VX$. This gives the general solution

\[ (x + 2)^2 + (y + 3)^2 = D\,\exp\!\left[-2\tan^{-1}\!\left(\frac{y + 3}{x + 2}\right)\right] . \]

Note that again we have an implicit solution.

Remark: Geometrically we are finding the intersection point between the lines $ax + by + c = 0$ and $fx + gy + h = 0$ and then shifting the origin to the intersection point by $X = x - x_0$, $Y = y - y_0$ (see Fig. 1.2).

Fig. 1.2. Intersection point of two lines is used to transform the coordinates and make the differential equation homogeneous.

The concept of finding the point of intersection in order to make the equation homogeneous is useful because there will be cases where the lines in Fig. 1.2 do not intersect. When the lines are parallel the simultaneous equations take the form

\[ ax_0 + by_0 + c = 0 \]

\[ k(ax_0 + by_0) + h = 0 \]

for some value of $k$. In this case the lines have the same gradient and the original differential equation is of the form

\[ \frac{dy}{dx} = \frac{ax + by + c}{k(ax + by) + h} . \]

Therefore in this case the previous transformation does not work. To solve this we use

\[ z = ax + by \]

as a new variable. Then

\[ \frac{dz}{dx} = a + b\frac{dy}{dx} = a + b\,\frac{z + c}{kz + h} \]

which is separable and therefore we can solve for $z$ as before. In practice such equations can be spotted by inspection.

Example: Solve the differential equation

\[ \frac{dy}{dx} = \frac{x - y - 1}{y - x - 1} . \]

Let $z = x - y$. Hence

\[ \frac{dz}{dx} = 1 - \frac{dy}{dx} . \]

Therefore the differential equation becomes

\[ \frac{dz}{dx} \equiv 1 - \frac{dy}{dx} = 1 - \frac{z - 1}{-(z + 1)} \]

\[ \Rightarrow\; \frac{dz}{dx} = 1 + \frac{z - 1}{z + 1} = \frac{2z}{1 + z} \]

\[ \Rightarrow\; \int \frac{1 + z}{z}\,dz = \int 2\,dx . \]

Integrating gives

\[ \ln|z| + z + \text{constant} = 2x . \]

Substituting back using $z = x - y$ gives

\[ \ln|x - y| + \text{constant} = x + y . \]

1.2.4 Linear, First Order ODEs

A first order linear ODE can be written as

\[ \frac{dy}{dx} + P(x)y = Q(x) \tag{1.5} \]

where $P(x)$ and $Q(x)$ are given functions of $x$. It turns out that we can 'always' solve Eq. (1.5) by first multiplying through by a function $R(x)$, the integrating factor (IF), such that the left-hand side of Eq. (1.5) becomes the exact derivative

\[ \frac{d}{dx}[R(x)y] . \]

To see how this is done we first multiply Eq. (1.5) by the unknown function $R(x)$:

\[ R(x)\frac{dy}{dx} + R(x)P(x)y = Q(x)R(x) . \tag{1.6} \]

Now, to get the exact derivative we require

\[ \frac{d}{dx}[R(x)y] \equiv R(x)\frac{dy}{dx} + R(x)P(x)y \quad \text{(LHS of (1.6))} \tag{1.7} \]

\[ \phantom{\frac{d}{dx}[R(x)y]} = Q(x)R(x) \quad \text{(RHS of (1.6))} . \tag{1.8} \]

Taking Eq. (1.8) and using the product rule on the left-hand side gives

\[ R(x)\frac{dy}{dx} + y\frac{dR(x)}{dx} = R(x)\frac{dy}{dx} + R(x)P(x)y . \]

We can cancel the first term on each side of the equation and hence $R(x)$ will be the solution of the differential equation

\[ \frac{dR(x)}{dx} = P(x)R(x) \]

\[ \Rightarrow\; \int \frac{dR(x)}{R(x)} = \int P(x)\,dx \]

\[ \Rightarrow\; \ln R(x) = \int P(x)\,dx \]

\[ \Rightarrow\; R(x) = e^{\int P(x)\,dx} . \tag{1.9} \]

Then we can write Eq. (1.8) as

\[ \frac{d}{dx}[R(x)y] = R(x)Q(x) \]

where $R(x)$ is already known from Eq. (1.9). This integrates directly to give

\[ R(x)y = \int R(x)Q(x)\,dx \]

or

\[ y = \frac{1}{R(x)}\int R(x)Q(x)\,dx \]

and this is the solution of our differential equation.

Remark: It is not necessary to include a constant of integration in the integrating factor $R(x)$, as such a constant would cancel in the final solution.

Remark: In solving problems, always reduce the linear first order differential equation into the standard form, $dy/dx + P(x)y = Q(x)$, with the coefficient of $dy/dx$ equal to 1.

Example: Solve the differential equation

\[ x\frac{dy}{dx} + 2y = x^4 . \]

In order to obtain a coefficient of 1 in the $dy/dx$ term we need to divide through by $x$. The equation becomes

\[ \frac{dy}{dx} + \frac{2}{x}y = x^3 . \]

This is now a linear differential equation with

\[ P(x) = \frac{2}{x}, \qquad Q(x) = x^3 . \]

From the theory given above the integrating factor is

\[ R(x) = e^{\int P(x)\,dx} = e^{\int (2/x)\,dx} = e^{2\ln x} = e^{\ln x^2} = x^2 . \tag{1.10} \]

We can check that this indeed gives a total derivative:

\[ \frac{d}{dx}[R(x)y] = \frac{d}{dx}(x^2y) = x^2\frac{dy}{dx} + 2xy = x^2\left(\frac{dy}{dx} + \frac{2}{x}y\right) = R(x)\left(\frac{dy}{dx} + \frac{2}{x}y\right) \]

as required. Now we can proceed to obtain the solution to the differential equation. From the theory given above we have

\[ R(x)y = \int R(x)Q(x)\,dx \]

where $R(x) = x^2$ and $Q(x) = x^3$. Hence

\[ x^2y = \int x^2\,x^3\,dx = \int x^5\,dx = \frac{x^6}{6} + c \]

\[ \Rightarrow\; y = \frac{1}{6}x^4 + cx^{-2} \]

and this is the general solution to the differential equation.

1.2.5 Bernoulli Equations

Bernoulli equations are of the form

\[ \frac{dy}{dx} + P(x)y = Q(x)y^n \]

where $n$ is a constant and $P(x)$ and $Q(x)$ are given functions of $x$.

Note: In the special case when $n = 0$ the equation reduces to a form studied in Sect. 1.2.4.

For $n \neq 0, 1$ this is a nonlinear equation since it involves a power of $y$ other than the first. To solve the equation we divide each side by $y^n$. Hence

\[ \frac{1}{y^n}\frac{dy}{dx} + P(x)y^{-n+1} = Q(x) . \tag{1.11} \]

This equation can be made linear by using the substitution

\[ z = \frac{1}{y^{n-1}} \]

where $n \neq 1$. Note that in the case where $n = 1$ the differential equation becomes separable. Hence,

\[ \frac{dz}{dx} = (1 - n)y^{-n}\frac{dy}{dx} \;\Rightarrow\; y^{-n}\frac{dy}{dx} = \frac{1}{1 - n}\frac{dz}{dx} . \]

Now we can substitute the expression for $y^{-n}\,dy/dx$ in Eq. (1.11). This gives

\[ \frac{1}{1 - n}\frac{dz}{dx} + P(x)z = Q(x) \]

\[ \Rightarrow\; \frac{dz}{dx} + (1 - n)P(x)z = (1 - n)Q(x) . \]

If we set $(1 - n)P(x) = \tilde{P}(x)$ and $(1 - n)Q(x) = \tilde{Q}(x)$ then the equation is a linear one of the standard form and can be solved using the integrating factor method (see Sect. 1.2.4).

Example: Solve the differential equation

\[ \frac{dy}{dx} - y = xy^5 . \]

Here we have $P(x) = -1$, $Q(x) = x$ and $n = 5$. We re-write the equation as

\[ \frac{1}{y^5}\frac{dy}{dx} - y^{-4} = x . \tag{1.12} \]

Let $z = y^{-4}$. Hence

\[ \frac{dz}{dx} = -4y^{-5}\frac{dy}{dx} . \]

We substitute the expression for $y^{-5}\,dy/dx$ in Eq. (1.12) to give, after multiplying through by $-4$,

\[ \frac{dz}{dx} + 4z = -4x . \tag{1.13} \]

Note: The coefficient of $dz/dx$ needs to be made 1 before calculating the IF.

Eq. (1.13) is now linear with $P(x) = 4$ and $Q(x) = -4x$. The integrating factor is

\[ R(x) = e^{\int P(x)\,dx} = e^{\int 4\,dx} = e^{4x} . \]
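Before continuing with the integration, the substitution $z = y^{-4}$ can be checked numerically. The sketch below is an illustration only and not part of the original notes: it integrates the Bernoulli equation $dy/dx = y + xy^5$ and the linearised equation (1.13) separately with a fourth-order Runge-Kutta scheme and compares $z$ with $y^{-4}$ at the end of the interval. The starting values $y = 1$, $z = 1$ at $x = 0$, the interval $[0, 0.2]$, the step size and the function names are assumptions chosen for the demonstration.

```python
def rk4(f, x, u, h, steps):
    """Integrate du/dx = f(x, u) with the classical fourth-order Runge-Kutta method."""
    for _ in range(steps):
        k1 = f(x, u)
        k2 = f(x + h / 2, u + h * k1 / 2)
        k3 = f(x + h / 2, u + h * k2 / 2)
        k4 = f(x + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return u


def bernoulli(x, y):
    """Original equation written as dy/dx = y + x*y^5."""
    return y + x * y**5


def linearised(x, z):
    """Eq. (1.13) written as dz/dx = -4z - 4x."""
    return -4.0 * z - 4.0 * x


# Start both problems from matching data: y(0) = 1 corresponds to z(0) = 1.
y_end = rk4(bernoulli, 0.0, 1.0, 0.001, 200)
z_end = rk4(linearised, 0.0, 1.0, 0.001, 200)

mismatch = abs(z_end - y_end ** -4)
print("difference between z and y^(-4) at x = 0.2:", mismatch)
```

A mismatch at the level of the integration error suggests that the two equations describe the same solution, which is what the substitution claims.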
Multiplying Eq. (1.13) by the integrating factor $e^{4x}$ gives

\[ \frac{d}{dx}\left(e^{4x}z\right) = -4x\,e^{4x} \]

which integrates to give

\[ e^{4x}z = -\int 4x\,e^{4x}\,dx = -\int x\,d(e^{4x}) = -x\,e^{4x} + \int e^{4x}\,dx \quad \text{(by parts)} \]

\[ \Rightarrow\; e^{4x}z = -x\,e^{4x} + \tfrac{1}{4}e^{4x} + c \]

\[ \Rightarrow\; z = -x + \tfrac{1}{4} + c\,e^{-4x} \equiv y^{-4} \]

\[ \Rightarrow\; y = \left(c\,e^{-4x} + \tfrac{1}{4} - x\right)^{-1/4} . \]

Note: If we were given boundary conditions then $c$ could be found. For example, if $y = 1$ at $x = 0$ then $1 = (c + \tfrac{1}{4})^{-1/4} \Rightarrow c = \tfrac{3}{4}$.

1.3 Linear ODEs with Constant Coefficients

In this section we consider differential equations of the form

\[ a_n\frac{d^ny}{dx^n} + a_{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_0y = f(x) \tag{1.14} \]

where $a_n, a_{n-1}, \ldots, a_0$ are constant coefficients and $f(x)$ is a function of $x$. The aim is to solve Eq. (1.14) where $y$ is unknown and $f(x)$ is a given function of $x$. Equations of this type with non-zero right-hand sides are called nonhomogeneous. (Do not confuse this with ODEs of the homogeneous type.)

1.3.1 The D Notation

When dealing with linear ODEs with constant coefficients it is convenient to introduce the differential operator $D$ to denote $d/dx$, so that

\[ Dy \equiv \frac{dy}{dx} . \]

Then $D^2$ means that we apply $D$ twice, such that

\[ D^2y = D(Dy) = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d^2y}{dx^2} . \]

Similarly

\[ D^3y = \frac{d^3y}{dx^3}, \;\ldots,\; D^ny = \frac{d^ny}{dx^n} . \]

More generally, if $P(x)$ is a polynomial, say

\[ P(x) = x^2 + a_1x + a_0 \]

then we define

\[ P(D) = D^2 + a_1D + a_0 \]

by replacing $x$ by $D$. This is a standard property of functions. Then

\[ P(D)y = (D^2 + a_1D + a_0)y \]

means

\[ D^2y + a_1Dy + a_0y = \frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0y . \]

$P(D)$ is called a linear differential operator. Operators like $P(D)$ which are polynomials in $D$ obey the usual algebraic laws of addition, multiplication and factorisation. This means that if, for example,

\[ P(D) = (D - m_1)(D - m_2) \]

then

\[ P(D)y = (D - m_1)(D - m_2)y = (D - m_2)(D - m_1)y . \]

We can check that this is true by applying the operator:

\[ (D - m_1)(D - m_2)y = (D - m_1)(Dy - m_2y) \]

\[ = D^2y - m_2Dy - m_1Dy + m_1m_2y \]

\[ = (D - m_2)(Dy - m_1y) \]

\[ = (D - m_2)(D - m_1)y . \]

So we have the same expression with the order of the brackets reversed. Therefore our general linear ODE with constant coefficients,

\[ a_n\frac{d^ny}{dx^n} + a_{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_0y = f(x) , \]

can now be written as

\[ P(D)y = (a_nD^n + a_{n-1}D^{n-1} + \cdots + a_0)y = f(x) \]

using the $D$ notation.

Now we consider the solution of our linear ODE with constant coefficients given in differential operator notation by

\[ (a_nD^n + a_{n-1}D^{n-1} + \cdots + a_0)y = f(x) . \tag{1.15} \]

The problem can be reduced to a two-step procedure:

1) Find the general solution of the corresponding homogeneous equation (the ODE with the right-hand side set to zero). This means solving the equation $P(D)y = 0$ or

\[ (a_nD^n + a_{n-1}D^{n-1} + \cdots + a_0)y = 0 . \tag{1.16} \]

The solution is referred to as the complementary function (CF) or $y_h$.

2) Find a particular or special solution of the full version of the ODE, Eq. (1.15); this is referred to as the particular integral (PI) or $y_p$.

Theorem: The general solution of the non-homogeneous differential equation given in Eq. (1.15) can be written as

\[ y(x) = y_p(x) + y_h(x) \]

i.e. the sum of the particular integral and the complementary function.

Proof: We have

\[ P(D)y = (a_nD^n + a_{n-1}D^{n-1} + \cdots + a_0)y = f(x) . \]

The CF satisfies

\[ P(D)y_h = 0 . \]

The PI satisfies

\[ P(D)y_p = f(x) . \]

Adding these gives

\[ P(D)(y_h + y_p) = f(x) . \]

Therefore

\[ P(D)y = f(x) , \]

our original equation, where $y = y_h + y_p$ is the general solution.

1.3.2 Solution of Homogeneous Part

Finding the solution of the homogeneous part of the ODE is equivalent to finding the complementary function (CF). We will approach the problem by considering a general second order ODE of the form

\[ (D^2 + a_1D + a_0)y = 0 . \]

We can factorize the differential operator part to give

\[ D^2 + a_1D + a_0 = (D - m_1)(D - m_2) . \]

So

\[ (D - m_1)(D - m_2)y = 0 \quad \text{or} \quad (D - m_2)(D - m_1)y = 0 . \tag{1.17} \]

Note that if

\[ (D - m_1)y = \frac{dy}{dx} - m_1y = 0 \]

then

\[ \frac{dy}{dx} = m_1y \]

with solution $y = c_1e^{m_1x}$. Therefore $y = c_1e^{m_1x}$ satisfies Eq. (1.17). However, we could have followed exactly the same procedure for $m_2$, and so $y = c_2e^{m_2x}$ also satisfies Eq. (1.17). Suppose $m_1 \neq m_2$. For linear differential equations we can add the solutions, such that the general solution of

\[ (D - m_1)(D - m_2)y = 0 \]

is

\[ y = c_1e^{m_1x} + c_2e^{m_2x} . \]

But what are $m_1$ and $m_2$? Now

\[ D^2 + a_1D + a_0 \equiv (D - m_1)(D - m_2) \]

and so $(m_1, m_2)$ are the roots of the quadratic equation

\[ x^2 + a_1x + a_0 = 0 . \]

In general, if $P(D)y = 0$ then $y = c\,e^{mx}$ is a solution if $P(m) = 0$, or, in other words, $m$ is a root of $P(x) = 0$. The equation $P(x) = 0$ is called the auxiliary equation and when $P(x)$ is a quadratic there are three cases to consider:

1) The roots $m_1$ and $m_2$ are real and distinct. In this case the solution is

\[ y = c_1e^{m_1x} + c_2e^{m_2x} . \]

2) The roots $m_1$ and $m_2$ are real and equal. In this case $m_1 = m_2 = m$ and the solution is

\[ y = (c_1x + c_2)e^{mx} . \]

This can be verified by showing that $(D - m)^2y = 0$.

3) The roots $m_1$ and $m_2$ are complex conjugates. In this case $m_1 = p + iq$, $m_2 = p - iq$ and the solution is

\[ y = c_1e^{(p+iq)x} + c_2e^{(p-iq)x} . \]

However, this can be expressed in a number of different ways. We can write

\[ y = e^{px}\left(c_1e^{iqx} + c_2e^{-iqx}\right) \]

\[ = e^{px}\left[(c_1 + c_2)\cos qx + i(c_1 - c_2)\sin qx\right] \]

\[ = e^{px}\left(E\cos qx + F\sin qx\right) \]

where $E = c_1 + c_2$ and $F = i(c_1 - c_2)$.

So far we have only considered second order linear differential equations. Now we will generalise the theory to $n$th order equations. Consider the general linear homogeneous equation

\[ P(D)y = (a_nD^n + a_{n-1}D^{n-1} + \cdots + a_0)y = 0 . \]

Because the right-hand side is zero, finding the general solution to this equation is equivalent to finding the complementary function of the more general problem. Let us try

\[ y = A\,e^{mx} . \]

This is a solution provided $m$ satisfies the auxiliary equation

\[ P(m) = a_nm^n + a_{n-1}m^{n-1} + \cdots + a_0 = 0 . \]

This equation has $n$ roots $m_1, m_2, \ldots, m_n$. Suppose that all the roots are distinct. Then the general solution is the superposition

\[ y_h = c_1e^{m_1x} + c_2e^{m_2x} + \cdots + c_ne^{m_nx} \]

where $c_1, c_2, \ldots, c_n$ are $n$ arbitrary constants.

Note: If $y_1(x), y_2(x), y_3(x), \ldots$ are solutions of the linear equation $P(D)y = 0$ then so is $y = y_1(x) + y_2(x) + y_3(x) + \cdots$. This can be verified by direct substitution and it follows from the linearity of the operator $P(D)$.

Therefore the general solution

\[ y_h = c_1e^{m_1x} + c_2e^{m_2x} + \cdots + c_ne^{m_nx} \]

is valid provided the roots are distinct (even if they are complex). Repeated roots require special attention. Suppose that the root $m_1$ is twice repeated. Then

\[ P(D) = Q(D)(D - m_1)^2 \]

where $Q(D)$ is a polynomial of degree $n - 2$ and we have used the fact that $(D - m_1)^2$ is a factor of $P(D)$. Hence

\[ (D - m_1)^2y = 0 \]

is satisfied by

\[ y = (c_0 + c_1x)e^{m_1x} \]

and so this also satisfies $P(D)y = 0$. Then the general solution is

\[ y = (c_0 + c_1x)e^{m_1x} + c_3e^{m_3x} + \cdots + c_ne^{m_nx} . \tag{1.18} \]

Each twice repeated root gives a contribution similar to the first term in Eq. (1.18). The remaining terms in Eq. (1.18) arise from the non-repeated roots.

Suppose that the root $m_1$ is repeated $r$ times. Then

\[ P(D) = Q(D)(D - m_1)^r \]

where $Q(D)$ is now a polynomial of degree $n - r$ and we have used the fact that $(D - m_1)^r$ is a factor of $P(D)$. Hence

\[ (D - m_1)^ry = 0 \]

is satisfied by

\[ y = (c_0 + c_1x + \cdots + c_{r-1}x^{r-1})e^{m_1x} . \]

Note that the term in brackets is a general polynomial of degree $r - 1$. Then the general solution is

\[ y = (c_0 + c_1x + \cdots + c_{r-1}x^{r-1})e^{m_1x} + c_{r+1}e^{m_{r+1}x} + \cdots + c_ne^{m_nx} . \tag{1.19} \]

Each $r$-times repeated root gives a contribution similar to the first term in Eq. (1.19). The remaining terms in Eq. (1.19) arise from the non-repeated roots.

Note: The general solution always has $n$ arbitrary constants regardless of repeated roots.

Example: Solve the differential equation

\[ \frac{d^2y}{dx^2} + \frac{dy}{dx} - 6y = 0 . \]

This can be written as $P(D)y = 0$ where $P(D) = D^2 + D - 6$. The auxiliary equation is given by $P(m) = 0$ or

\[ m^2 + m - 6 = 0 \;\Rightarrow\; (m - 2)(m + 3) = 0 \;\Rightarrow\; m = 2,\; m = -3 . \]

These are real, distinct roots and so the general solution is

\[ y = c_1e^{2x} + c_2e^{-3x} . \]

Example: Solve the differential equation

\[ \frac{d^2y}{dx^2} + 2\frac{dy}{dx} + y = 0 . \]

This can be written as $P(D)y = 0$ where $P(D) = D^2 + 2D + 1$. The auxiliary equation is given by $P(m) = 0$ or

\[ m^2 + 2m + 1 = 0 \;\Rightarrow\; (m + 1)^2 = 0 \;\Rightarrow\; m = -1 \text{ twice} . \]

Therefore we have a twice repeated root only and so the general solution is

\[ y = (c_1 + c_2x)e^{-x} . \]

Example: Solve the differential equation

\[ \frac{d^2y}{dx^2} - 6\frac{dy}{dx} + 13y = 0 . \]

This can be written as $P(D)y = 0$ where $P(D) = D^2 - 6D + 13$. The auxiliary equation is given by $P(m) = 0$ or

\[ m^2 - 6m + 13 = 0 \;\Rightarrow\; m = 3 \pm 2i . \]

Therefore the general solution is

\[ y = c_1e^{(3+2i)x} + c_2e^{(3-2i)x} . \]

Using the fact that $e^{\pm i\theta} = \cos\theta \pm i\sin\theta$ we can write this as

\[ y = e^{3x}\left(E\cos 2x + F\sin 2x\right) \]

where $E = c_1 + c_2$ and $F = (c_1 - c_2)i$ are two constants.

Example: Solve the differential equation

\[ \frac{d^3y}{dx^3} - 3\frac{d^2y}{dx^2} + 4y = 0 . \]

This can be written as $P(D)y = 0$ where $P(D) = D^3 - 3D^2 + 4$. The auxiliary equation is given by $P(m) = 0$ or

\[ m^3 - 3m^2 + 4 = 0 \;\Rightarrow\; (m + 1)(m - 2)^2 = 0 \;\Rightarrow\; m = -1,\; m = 2 \text{ twice} . \]

Therefore the general solution is

\[ y = (c_1 + c_2x)e^{2x} + c_3e^{-x} . \]

The first term in this equation comes from the repeated root while the second is from the non-repeated root.

1.3.3 Particular Solutions to Non-homogeneous Equations

The second part of finding the general solution to a non-homogeneous differential equation with constant coefficients is to find the particular integral, $y_p$. We need to do this when the differential equation has the form

\[ P(D)y = f(x) \]

where $f(x) \neq 0$. General methods for finding the particular integral are available for specific forms of the function $f(x)$. We will consider five forms of functions.

1.3.3.1 Functions of the form f(x) = Ax^r (A is constant, r is integer)

In this case we need to find $y_p$ such that

\[ P(D)y_p = Ax^r . \]

Try a general polynomial of degree $r$ of the form

\[ y_p = b_rx^r + b_{r-1}x^{r-1} + \cdots + b_0 . \]

The coefficients $b_r, b_{r-1}, \ldots, b_0$ will have to be determined in order to get the solution.

Example: Find the general solution of the differential equation

\[ \frac{d^2y}{dx^2} + 6\frac{dy}{dx} + 9y = 18x^2 . \]

In this example $A = 18$, $r = 2$ and $P(D) = D^2 + 6D + 9$. To find the particular integral we try

\[ y_p = b_2x^2 + b_1x + b_0 . \]

Then

\[ P(D)y_p = 2b_2 + 6(2b_2x + b_1) + 9(b_2x^2 + b_1x + b_0) \]

or

\[ P(D)y_p = 2b_2 + 6b_1 + 9b_0 + (12b_2 + 9b_1)x + 9b_2x^2 = 18x^2 . \]

Equating the coefficients of the powers of $x$ on each side of the equation gives

\[ x^2:\; 9b_2 = 18, \qquad x:\; 12b_2 + 9b_1 = 0, \qquad \text{const.}:\; 2b_2 + 6b_1 + 9b_0 = 0 . \]

Hence

\[ b_2 = 2, \qquad b_1 = -\frac{8}{3}, \qquad b_0 = \frac{4}{3} \]

and so the particular integral is

\[ y_p = 2x^2 - \frac{8}{3}x + \frac{4}{3} . \]

Now we consider the complementary function. The auxiliary equation is

\[ P(m) = m^2 + 6m + 9 = (m + 3)^2 = 0 \]

with roots $m = -3$ (twice). Hence

\[ y_h = (c_0 + c_1x)e^{-3x} \]

and the general solution is

\[ y = y_p + y_h = (c_0 + c_1x)e^{-3x} + 2x^2 - \frac{8}{3}x + \frac{4}{3} . \]

Note: If $m = 0$ is an $l$-times repeated root of the auxiliary equation $P(m) = 0$ (i.e. $P(m) = m^lQ(m)$) then for the particular integral try

\[ y_p = x^l(b_rx^r + b_{r-1}x^{r-1} + \cdots + b_0) . \]

Note the additional $x^l$ factor.

1.3.3.2 Functions of the form f(x) = be^{kx} (b, k are constants)

In this case we need to find $y_p$ such that

\[ P(D)y_p = b\,e^{kx} . \]

Here there are two cases to consider depending on whether or not $k$ satisfies the auxiliary equation $P(k) = 0$. If $k$ does not satisfy $P(k) = 0$ then try

\[ y_p = A\,e^{kx} \]

where $A$ has to be determined from the equation $P(k)A = b$. This gives $A = b/P(k)$. If $k$ satisfies $P(k) = 0$ and is a root repeated $r$ times then try

\[ y_p = Ax^re^{kx} \]

where $A$ has to be determined.

Example: Find the general solution of the differential equation

\[ \frac{d^2y}{dx^2} - 5\frac{dy}{dx} + 6y = 10e^{-2x} . \]

In this example $b = 10$, $k = -2$ and $P(D) = D^2 - 5D + 6$. The auxiliary equation is

\[ P(m) = m^2 - 5m + 6 = (m - 3)(m - 2) = 0 \]

with roots $m_1 = 2$ and $m_2 = 3$. Hence the complementary function is

\[ y_h = c_1e^{2x} + c_2e^{3x} . \]

Because $m_1 \neq -2$ and $m_2 \neq -2$ we have the case where $k = -2$ does not satisfy the auxiliary equation. Hence, for the particular integral we try

\[ y_p = A\,e^{-2x} . \]

Then

\[ P(k)A = 10 = 20A \;\Rightarrow\; A = \frac{1}{2} . \]

Hence

\[ y_p = \frac{1}{2}e^{-2x} \]

and the general solution is

\[ y = y_p + y_h = \frac{1}{2}e^{-2x} + c_1e^{2x} + c_2e^{3x} . \]

Example: Find the general solution of the differential equation

\[ \frac{d^3y}{dx^3} - 3\frac{dy}{dx} + 2y = e^x . \]

In this example $b = 1$, $k = 1$ and $P(D) = D^3 - 3D + 2$. The auxiliary equation is

\[ P(m) = m^3 - 3m + 2 = (m - 1)^2(m + 2) = 0 \]

with roots $m_1 = m_2 = 1$ and $m_3 = -2$. Hence the complementary function is

\[ y_h = (c_1 + c_2x)e^x + c_3e^{-2x} . \]

Because $m = k = 1$ is a twice repeated root we try

\[ y_p = Ax^2e^x . \]

We find $A$ by substituting our trial $y_p$ in $P(D)y_p = e^x$. This gives

\[ (D^3 - 3D + 2)Ax^2e^x = Ae^x(6 + 6x + x^2) - Ae^x(6x + 3x^2) + 2Ax^2e^x = 6Ae^x = e^x \]

and hence $A = \tfrac{1}{6}$. Therefore the general solution is

\[ y = y_p + y_h = \frac{1}{6}x^2e^x + (c_1 + c_2x)e^x + c_3e^{-2x} . \]

1.3.3.3 Functions of the form f(x) = x^k e^{ax} (a, k are constants)

In this case we need to find $y_p$ such that

\[ P(D)y_p = x^ke^{ax} . \]

Here there are two cases to consider depending on whether or not $a$ satisfies the auxiliary equation $P(a) = 0$. If $a$ does not satisfy $P(a) = 0$ then try

\[ y_p = (b_kx^k + b_{k-1}x^{k-1} + \cdots + b_0)e^{ax} \]

where $b_k, b_{k-1}, \ldots, b_0$ have to be determined from the equation $P(D)y_p = x^ke^{ax}$. If $a$ satisfies $P(a) = 0$ and is a root repeated $r$ times then try

\[ y_p = x^r(b_kx^k + b_{k-1}x^{k-1} + \cdots + b_0)e^{ax} \]

where $b_k, b_{k-1}, \ldots, b_0$ have to be determined from the equation $P(D)y_p = x^ke^{ax}$.

Example: Find the general solution of the differential equation

\[ \frac{d^3y}{dx^3} - 2\frac{d^2y}{dx^2} + \frac{dy}{dx} = x^2e^x . \]

In this example $a = 1$, $k = 2$ and $P(D) = D^3 - 2D^2 + D$. The auxiliary equation is

\[ P(m) = m^3 - 2m^2 + m = m(m - 1)^2 = 0 \]

with roots $m_1 = 0$ and $m_2 = m_3 = 1$. Hence the complementary function is

\[ y_h = c_0 + (c_1x + c_2)e^x . \]

Note that the $c_0 = c_0e^0$ term corresponds to the $m = 0$ root. Because $m = a = 1$ is a twice repeated root we try

\[ y_p = x^2(b_2x^2 + b_1x + b_0)e^x . \]

Hence

\[ (D - 1)y_p = (4b_2x^3 + 3b_1x^2 + 2b_0x)e^x \]

\[ \Rightarrow\; (D - 1)^2y_p = (12b_2x^2 + 6b_1x + 2b_0)e^x \]

\[ \Rightarrow\; D(D - 1)^2y_p = e^x\left[12b_2x^2 + (24b_2 + 6b_1)x + (6b_1 + 2b_0)\right] . \]

Therefore we require that

\[ 12b_2x^2 + (24b_2 + 6b_1)x + (6b_1 + 2b_0) = x^2 \]

giving

\[ b_2 = \frac{1}{12}, \qquad b_1 = -\frac{1}{3}, \qquad b_0 = 1 \]

and hence

\[ y_p = e^x\left(\frac{x^4}{12} - \frac{x^3}{3} + x^2\right) . \]

Therefore the general solution is

\[ y = y_p + y_h = c_0 + (c_1x + c_2)e^x + e^x\left(\frac{x^4}{12} - \frac{x^3}{3} + x^2\right) . \]

1.3.3.4 Functions of the form f(x) = cos nx or sin nx (n constant)

In this case we need to find $y_p$ such that

\[ P(D)y_p = b\,e^{inx} \]

where

\[ \text{real part of } e^{inx} = \Re\, e^{inx} = \cos nx \]

\[ \text{imaginary part of } e^{inx} = \Im\, e^{inx} = \sin nx . \]

Here we are making use of the fact that $e^{i\theta} = \cos\theta + i\sin\theta$. In this case the real part of the solution $y_p$ corresponds to taking $f(x) = \cos nx$ and the imaginary part of the solution corresponds to taking $f(x) = \sin nx$. The equation can then be solved using the procedure developed in Sect. 1.3.3.2 using $k = in$.

Example: Find the general solution of the differential equation

\[ \frac{d^2y}{dx^2} + \frac{dy}{dx} - 2y = \sin x . \]

Here we can write the equation as

\[ (D^2 + D - 2)y = e^{ix} \]

where it is understood that we will take the imaginary part when we have obtained the solution. The auxiliary equation is

\[ P(m) = m^2 + m - 2 = (m - 1)(m + 2) = 0 \]

with roots $m_1 = 1$ and $m_2 = -2$. Hence

\[ y_h = c_0e^x + c_1e^{-2x} . \]

Note that $m = i$ is not a root of the auxiliary equation. For the particular integral we try (using the method in Sect. 1.3.3.2)

\[ y_p = A\,e^{ix} \]

giving

\[ (-1 + i - 2)A = 1 \;\Rightarrow\; A = -\frac{1}{3 - i} = -\frac{3 + i}{(3 - i)(3 + i)} = -\frac{3 + i}{10} . \]
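The complex arithmetic for $A$ is easy to get wrong by hand, so a quick check with Python's built-in complex numbers may be useful. This is a minimal sketch rather than part of the original notes; the function name `P` simply mirrors the auxiliary polynomial $P(m) = m^2 + m - 2$ of the example, and `expected` is the hand-derived value.

```python
def P(m):
    """Auxiliary polynomial of the example, P(m) = m^2 + m - 2."""
    return m * m + m - 2


# Substituting y_p = A e^{ix} into (D^2 + D - 2) y = e^{ix} gives P(i) A = 1.
A = 1 / P(1j)

# Value obtained by hand above: A = -(3 + i) / 10.
expected = -(3 + 1j) / 10

print("A computed      :", A)
print("A derived above :", expected)
```

The two printed values should agree up to floating-point rounding.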
As in Sect. 1.3.3.2, the value of $A$ can also be found by simply writing $A = 1/P(i)$. Hence

\[ y_p = -\left(\frac{3}{10} + \frac{i}{10}\right)e^{ix} . \]

Now we need to take the imaginary part:

\[ y_p = -\left(\frac{3}{10} + \frac{i}{10}\right)e^{ix} = -\left(\frac{3}{10} + \frac{i}{10}\right)(\cos x + i\sin x) \]

\[ = \left(-\frac{3}{10}\cos x + \frac{1}{10}\sin x\right) + i\left(-\frac{1}{10}\cos x - \frac{3}{10}\sin x\right) . \]

Hence

\[ y_p = -\frac{1}{10}\cos x - \frac{3}{10}\sin x . \]

Therefore the general solution is

\[ y = y_p + y_h = -\frac{1}{10}\cos x - \frac{3}{10}\sin x + c_0e^x + c_1e^{-2x} . \]

1.3.3.5 Functions of sums of constant multiples of any of the above

For functions of this type we find a separate $y_p$ for each part, and the overall $y_p$ is the sum of the individual $y_p$'s.

Example: Find the general solution of the differential equation

\[ \frac{d^3y}{dx^3} + y = 1 + e^{-x}\sin x . \]

Here we can write the equation as

\[ (D^3 + 1)y = 1 + e^{-x}\sin x . \]

The auxiliary equation is

\[ P(m) = m^3 + 1 = 0 \]

with roots $m_1 = -1$, $m_{2,3} = (1 \pm i\sqrt{3})/2$. Hence

\[ y_h = c_1e^{-x} + e^{x/2}\left(c_2\cos\frac{\sqrt{3}\,x}{2} + c_3\sin\frac{\sqrt{3}\,x}{2}\right) . \]

For the particular integral we will consider each part separately.

For the 1 term we have $f(x) = \text{constant}$ and so we try $y_p = A$. Substituting we have

\[ P(D)y_p = (D^3 + 1)A = (0 + 1)A = 1 \;\Rightarrow\; A = 1 . \]

For the $e^{-x}\sin x$ term we recall that this is the imaginary part of $e^{-(1-i)x}$. This is a useful way of combining the $e^{-x}$ and $\sin x$ terms together. Hence we try

\[ y_p = b\,e^{-(1-i)x} . \]

Substituting we have

\[ P(D)y_p = (D^3 + 1)\,b\,e^{-(1-i)x} = b\left[(-(1 - i))^3 + 1\right]e^{-(1-i)x} = e^{-(1-i)x} \]

\[ \Rightarrow\; b\left[-(1 - 3i + 3i^2 - i^3) + 1\right] = 1 \]

\[ \Rightarrow\; b(3i + 3 - i) = 1 \]

\[ \Rightarrow\; b = \frac{1}{3 + 2i} = \frac{3 - 2i}{(3 + 2i)(3 - 2i)} = \frac{3 - 2i}{13} . \]

Therefore, taking the imaginary part,

\[ y_p = \Im\left[\frac{3 - 2i}{13}e^{-(1-i)x}\right] = e^{-x}\left(\frac{3}{13}\sin x - \frac{2}{13}\cos x\right) . \]

Now we can add the two contributions of $y_p$ to $y_h$, giving the general solution

\[ y = y_h + y_p = c_1e^{-x} + e^{x/2}\left(c_2\cos\frac{\sqrt{3}\,x}{2} + c_3\sin\frac{\sqrt{3}\,x}{2}\right) + e^{-x}\left(\frac{3}{13}\sin x - \frac{2}{13}\cos x\right) + 1 . \]

1.4 Differential Equations of the Euler Type

Euler-type differential equations have the form

\[ a_nx^n\frac{d^ny}{dx^n} + a_{n-1}x^{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_0y = f(x) \]

where the $a_n, a_{n-1}, \ldots, a_0$ are constants.
As it stands this equation does not have constant coefficients because of the powers of $x$ that are present. However, it can be transformed so that it does, and then standard methods can be used. We carry out a change of variable from $x$ to $t$ using

$$x = e^t \quad\text{or}\quad t = \ln x .$$

Then

$$\frac{dy}{dx} = \frac{dy}{dt}\frac{dt}{dx} .$$

From the definition of $t$ we have

$$\frac{dt}{dx} = \frac{1}{x}$$

and so

$$\frac{dy}{dx} = \frac{1}{x}\frac{dy}{dt} \quad\text{or}\quad x\frac{dy}{dx} = \frac{dy}{dt} . \tag{1.20}$$

Now

$$\frac{dy}{dx} = \frac{1}{x}\frac{dy}{dt}$$
$$\Rightarrow \frac{d^2 y}{dx^2} = -\frac{1}{x^2}\frac{dy}{dt} + \frac{1}{x}\frac{d^2 y}{dt^2}\frac{dt}{dx}$$
$$\Rightarrow \frac{d^2 y}{dx^2} = \frac{1}{x^2}\left(\frac{d^2 y}{dt^2} - \frac{dy}{dt}\right)$$
$$\Rightarrow x^2\frac{d^2 y}{dx^2} = \frac{d^2 y}{dt^2} - \frac{dy}{dt} . \tag{1.21}$$

Equation (1.20) can be written as

$$x D_x y = D_t y$$

where $D_x \equiv d/dx$ and $D_t \equiv d/dt$. Similarly Eq. (1.21) can be written

$$x^2 D_x^2 y = D_t(D_t - 1)y .$$

We can generalise this (using proof by induction) to give

$$x^r D_x^r y = D_t(D_t - 1)(D_t - 2)\cdots(D_t - (r-1))y .$$

Having transformed to the variable $t$ the Euler equation becomes linear with constant coefficients and we can solve it using the techniques developed in Sect. 1.3.

Example: Find the solution of the differential equation

$$x^2\frac{d^2 y}{dx^2} + x\frac{dy}{dx} - y = \ln x$$

with initial conditions $y = 0$, $dy/dx = 3$ at $x = 1$.

First we need to transform to the new variable $t$ using the substitution

$$x = e^t \;\Rightarrow\; xD_x = D_t, \quad x^2 D_x^2 = D_t^2 - D_t$$

and substitution in the differential equation gives

$$(D_t^2 y - D_t y) + D_t y - y = t \;\Rightarrow\; D_t^2 y - y = t .$$

This is now a linear differential equation with constant coefficients. We have the auxiliary equation

$$P(m) = m^2 - 1 = (m+1)(m-1) = 0$$

with solutions $m_1 = 1$ and $m_2 = -1$. This gives us

$$y_h = a\,e^t + b\,e^{-t}$$

where $a$ and $b$ are constants. For the particular integral we have $f(t) = t$ and this can be thought of as a polynomial of degree 1. Therefore, according to the method developed in Sect. 1.3.3.1 we need to try a general polynomial of the same degree. Hence we try

$$y_p = ct + d .$$

Substitution in the differential equation gives

$$P(D_t)y_p = (D_t^2 - 1)(ct + d) = -ct - d = t .$$

Comparing coefficients gives $c = -1$ and $d = 0$. Therefore $y_p = -t$ and the general solution is

$$y = y_h + y_p = a\,e^t + b\,e^{-t} - t$$

or, in terms of $x$,

$$y = ax + bx^{-1} - \ln x .$$

The initial condition $x = 1$, $y = 0$ gives $0 = a + b$ while the condition $x = 1$, $dy/dx = 3$ gives $3 = a - b - 1$. Hence $a = 2$ and $b = -2$. Therefore the solution satisfying the initial conditions is

$$y = 2x - 2x^{-1} - \ln x .$$

1.5 Simultaneous Linear Equations with Constant Coefficients

Recall that for linear simultaneous algebraic equations such as

$$a_1 x + b_1 y = c_1 \tag{1.22}$$
$$a_2 x + b_2 y = c_2 \tag{1.23}$$

we can find the solution by first eliminating one unknown, $y$ say, by taking ($b_2 \times$ Eq. (1.22)) $-$ ($b_1 \times$ Eq. (1.23)). The approach for solving simultaneous ODEs is similar except that the $a$'s and $b$'s are now differential operators and we can exploit the fact that these are now expressed as polynomials in $D$. It is best to follow the procedure by means of an example.

Example: Find the solution of the simultaneous differential equations

$$\frac{dx}{dt} - \frac{dy}{dt} + 2y = e^{-2t}$$
$$2\frac{dy}{dt} + \frac{dx}{dt} + 2x - 11y = 0$$

where $x(t)$ and $y(t)$ are unknown functions of $t$.

First we rewrite the equations using the notation $D \equiv d/dt$ and then collect the terms involving $x$ and $y$ respectively. This gives

$$Dx - (D-2)y = e^{-2t} \tag{1.24}$$
$$(D+2)x + (2D-11)y = 0 . \tag{1.25}$$

Now eliminate $x$ first by taking ($(D+2) \times$ Eq. (1.24) $-$ $D \times$ Eq. (1.25)), just as we do with simultaneous algebraic equations.

Note: Operators must be applied from the left. For example, $Dx$ makes sense because it is $dx/dt$, a function of $t$. However, $xD$ does not make sense because it is just $x(d/dt)$, which is an operator, not a function.

So $(D+2) \times$ Eq. (1.24) and $D \times$ Eq. (1.25) give

$$(D+2)Dx - (D+2)(D-2)y = (D+2)e^{-2t} \tag{1.26}$$
$$D(D+2)x + D(2D-11)y = 0 . \tag{1.27}$$

Now, since $D(D+2) = (D+2)D$ the first term on the left-hand side cancels when we subtract these equations. So subtracting Eq. (1.27) from Eq. (1.26) gives

$$-\left[(D+2)(D-2)y + D(2D-11)y\right] = (D+2)e^{-2t} . \tag{1.28}$$

But

$$(D+2)e^{-2t} = \frac{d}{dt}e^{-2t} + 2e^{-2t} = 0$$

and so Eq. (1.28) becomes

$$(D^2 - 4 + 2D^2 - 11D)y = 0 \;\Rightarrow\; (3D^2 - 11D - 4)y = 0 .$$

This can now be solved by the standard method using

$$P(D) = 3D^2 - 11D - 4 .$$

Hence the auxiliary equation is

$$P(m) = 3m^2 - 11m - 4 = 3(m-4)\left(m + \tfrac{1}{3}\right) = 0$$

and the roots are $m_1 = 4$ and $m_2 = -1/3$. Hence

$$y_h = Ae^{4t} + Be^{-t/3} .$$

There is no particular integral because the equation is homogeneous. Therefore

$$y = y_h = Ae^{4t} + Be^{-t/3} . \tag{1.29}$$

Now, as in the algebraic equations, we obtain $x$ from the original equations by substituting for $y$. Using Eq. (1.24) we have

$$Dx - (D-2)y = e^{-2t} .$$

Therefore substituting for $y$ from Eq. (1.29) we could integrate our expression for $Dx$ to obtain $x$. However, it is quicker to avoid integration by eliminating the $Dx$ term if possible. We can do this by subtracting Eq. (1.24) from Eq. (1.25) to give

$$2x + (D-2)y + (2D-11)y = -e^{-2t}$$

which gives $x$ directly by substituting for $y$ from Eq. (1.29) and simplifying. This gives the general solution

$$x = -\tfrac{1}{2}e^{-2t} + \tfrac{1}{2}Ae^{4t} + 7Be^{-t/3}$$
$$y = Ae^{4t} + Be^{-t/3} .$$

Note: The number of arbitrary constants in the solution is the same as the order of $y$ in the equation, when $x$ has been eliminated or vice versa.

Remark: Similar methods apply to systems of three or more ODEs (e.g. $x(t)$, $y(t)$, $z(t)$).

Remark: Sometimes inspection can suggest more convenient variables. For example,

$$D^2 x - D^2 y + 3x - 3y = e^{4t}$$

can be written as

$$D^2(x - y) + 3(x - y) = e^{4t}$$

which suggests solving for a new variable $w = x - y$.

1.6 Series Solutions of ODEs

So far we have mainly dealt with ODEs with constant coefficients. It turns out that there are no known types of second order linear ODEs, apart from those with constant coefficients or those like Euler's equations which are reducible to constant coefficients, that are solvable in terms of elementary functions.
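As a closing check on the constant-coefficient methods, the general solution of the simultaneous system in Sect. 1.5 can be substituted back into Eqs. (1.24) and (1.25) numerically. The following is only a sketch: the function names `solution` and `residuals` and the sample values of $A$, $B$ and $t$ are illustrative choices, not part of the notes, and the derivatives are written out by hand from the exponentials.

```python
import math


def solution(t, A, B):
    """Return x, y, dx/dt, dy/dt for the general solution of the Sect. 1.5 example."""
    e2 = math.exp(-2.0 * t)      # e^{-2t}
    e4 = math.exp(4.0 * t)       # e^{4t}
    e3 = math.exp(-t / 3.0)      # e^{-t/3}
    x = -0.5 * e2 + 0.5 * A * e4 + 7.0 * B * e3
    y = A * e4 + B * e3
    dx = e2 + 2.0 * A * e4 - (7.0 / 3.0) * B * e3
    dy = 4.0 * A * e4 - (1.0 / 3.0) * B * e3
    return x, y, dx, dy


def residuals(t, A, B):
    """Left-hand side minus right-hand side of Eqs. (1.24) and (1.25)."""
    x, y, dx, dy = solution(t, A, B)
    r1 = dx - dy + 2.0 * y - math.exp(-2.0 * t)   # Eq. (1.24)
    r2 = 2.0 * dy + dx + 2.0 * x - 11.0 * y       # Eq. (1.25)
    return r1, r2


if __name__ == "__main__":
    # Arbitrary (assumed) constants; any values of A and B should work.
    for t in (0.0, 0.3, 1.0):
        print(t, residuals(t, A=1.5, B=-2.0))
```

Both residuals should vanish up to rounding error for any choice of the two arbitrary constants, which is consistent with the solution containing exactly two of them.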
Definition: Elementary functions consist of (i) algebraic functions, such as $y = f(x)$, satisfying

$$P_n(x)y^n + P_{n-1}(x)y^{n-1} + \cdots + P_1(x)y + P_0(x) = 0$$

where the $P_i(x)$ are polynomials, plus (ii) elementary transcendental (or nonalgebraic) functions such as trigonometric, inverse trigonometric, exponential, logarithmic or combinations of these. For example, a function assembled from terms such as $\tan x$, $e^{1/x}$, $\tan^{-1}(1 + x^2)$, $\cos 3x$, $\log x$ and $\sin x$ by finitely many sums, products, quotients and compositions is an elementary function.

The idea here is to solve ODEs with non-constant coefficients using power series. We shall consider three methods.

1.6.1 Picard's Method

This method works with differential equations and boundary conditions of the form

$$\frac{dy}{dx} = f(x, y), \qquad y = y_0 \text{ at } x = x_0$$

where $f(x, y)$ is a given function with some reasonable properties (such as being a continuous function; see later). The idea is to obtain successive approximations using the formula

$$y_{n+1}(x) = y_0 + \int_{x_0}^{x} f(x, y_n)\,dx, \qquad n \ge 0 . \tag{1.30}$$

Therefore, given an approximate solution $y_n$ this formula gives the next approximation $y_{n+1}$ and so on. Note that if $y_{n+1} = y_n$ then we have a solution of $dy/dx = f(x, y)$. Picard proved that this method gives a convergent sequence $y_n$ tending to a unique solution $y$ for very general functions $f(x, y)$. For this method to work it is sufficient for $f(x, y)$ to be joint differentiable in a region $R$ of the $(x, y)$ plane in which

$$|f(x, y)| < M, \qquad \left|\frac{\partial f}{\partial y}\right| < A$$

where $M$ and $A$ are positive constants.

Example: Use Picard's method to find a series solution of the following differential equation:

$$\frac{dy}{dx} = 1 + y^2, \qquad y = 0 \text{ at } x = 0 .$$

Here $f(x, y) = 1 + y^2$ with $x_0 = 0$ and $y_0 = 0$. To solve the equation we start by using Eq. (1.30) with $n = 0$. This gives

$$(n = 0)\qquad y_1(x) = y_0 + \int_{x_0}^{x}(1 + y_0^2)\,dx = 0 + \int_0^x (1 + 0^2)\,dx = \int_0^x dx = x .$$

Hence the first iteration gives $y_1 = x$. Now use $y_1$ in the second iteration with $n = 1$. This gives

$$(n = 1)\qquad y_2(x) = y_0 + \int_{x_0}^{x}(1 + y_1^2)\,dx = 0 + \int_0^x (1 + x^2)\,dx = x + \frac{x^3}{3} .$$

Hence the second iteration gives

$$y_2 = x + \frac{x^3}{3} .$$

Now use $y_2$ in the third iteration with $n = 2$. This gives

$$(n = 2)\qquad y_3(x) = y_0 + \int_{x_0}^{x}(1 + y_2^2)\,dx = 0 + \int_0^x\left[1 + \left(x + \frac{x^3}{3}\right)^2\right]dx = \int_0^x\left(1 + x^2 + \frac{2x^4}{3} + \frac{x^6}{9}\right)dx .$$

Note: This step produced two new terms. For Picard's procedure to work we must take only one new term at each step. Hence

$$y_3 = \int_0^x\left(1 + x^2 + \frac{2x^4}{3}\right)dx$$

and the third iteration gives

$$y_3 = x + \frac{x^3}{3} + \frac{2x^5}{15} .$$

In this particular case the answer could also have been found by separation of variables because

$$\frac{dy}{dx} = 1 + y^2 \;\Rightarrow\; \frac{dy}{1 + y^2} = dx$$

with solution

$$y = \tan x = x + \frac{x^3}{3} + \frac{2x^5}{15} + \frac{17x^7}{315} + \cdots$$

which shows that the next term, $x^6/9$, would have been wrong to take.

Remark: In solving problems you are either told to (i) continue the procedure until the term involving $x^n$ first appears; this means stop as soon as $x^n$ appears in the solution for $y$, or (ii) continue until the term involving $x^n$ has been shown to be determined correctly; this means take one more term to see if the coefficient of the $x^n$ term remains the same. If so stop there, otherwise continue.

1.6.2 Taylor Series Method

To use this method we need to find, for a given differential equation, $y'$, $y''$, ..., etc. by differentiating the differential equation and then constructing a Taylor series. The method is best illustrated with an example.

Example: Solve the differential equation

$$(1 - x^2)\frac{d^2 y}{dx^2} - 5x\frac{dy}{dx} - 3y = 0 . \tag{1.31}$$

The differential equation cannot be solved easily. We assume a solution in the form of a Maclaurin series (about $x = 0$):

$$y(x) = y(0) + x\,y'(0) + \frac{x^2}{2!}y''(0) + \cdots + \frac{x^r}{r!}y^{(r)}(0) + \cdots$$

where

$$y^{(r)}(0) = \left.\frac{d^r y}{dx^r}\right|_{x=0} .$$

Now to calculate $y''$, ..., etc. at $x = 0$, we differentiate Eq. (1.31) $n$ times using Leibniz's rule. Recall that writing $D \equiv d/dx$, Leibniz's formula for the $n$th derivative of a product $uv$ is given by

$$D^n(uv) = u\,D^n v + n(Du)(D^{n-1}v) + \frac{n(n-1)}{2!}(D^2 u)(D^{n-2}v) + \cdots + v\,D^n u .$$

Equation (1.31) can be written as

$$(1 - x^2)D^2 y - 5x\,Dy - 3y = 0 \tag{1.32}$$

which we can differentiate $n$ times using Leibniz's formula. There are three terms to differentiate. The first gives

$$D^n\left((1 - x^2)D^2 y\right) = (1 - x^2)D^{n+2}y - 2xn\,D^{n+1}y - \frac{2n(n-1)}{2!}D^n y$$

where we have used $u = 1 - x^2$ and $v = D^2 y$. The second gives

$$D^n(5x\,Dy) = 5x\,D^{n+1}y + 5n\,D^n y$$

where we have used $u = 5x$ and $v = Dy$. The third gives

$$D^n(3y) = 3D^n y .$$

Substituting in Eq. (1.32) gives

$$(1 - x^2)D^{n+2}y - x(2n + 5)D^{n+1}y - (n+1)(n+3)D^n y = 0 .$$

Hence, setting $x = 0$ and using the notation $D^n y|_{x=0} = y^{(n)}(0)$ gives

$$y^{(n+2)}(0) = (n+1)(n+3)\,y^{(n)}(0) .$$

This establishes a recurrence relation between the coefficients. We use this with successive values of $n$ to get the coefficients in the Taylor series solution of the differential equation. With $n = 0$ we have

$$y^{(2)}(0) = 3y(0) .$$

With $n = 1$ we have

$$y^{(3)}(0) = 8y'(0) .$$

With $n = 2$ we have

$$y^{(4)}(0) = 15y^{(2)}(0) = 45y(0) .$$

With $n = 3$ we have

$$y^{(5)}(0) = 24y^{(3)}(0) = 192y'(0) .$$

The general form of the Taylor series is

$$y(x) = y(0) + x\,y'(0) + \frac{x^2}{2!}y^{(2)}(0) + \cdots .$$

Substitution gives

$$y(x) = y(0)\left[1 + \frac{3}{2!}x^2 + \frac{45}{4!}x^4 + \cdots\right] + y'(0)\left[x + \frac{8}{3!}x^3 + \frac{192}{5!}x^5 + \cdots\right] .$$

Note that only $y(0)$ and $y'(0)$ appear in our solution. Recognising the pattern in the coefficients allows us to write the general solution as

$$y(x) = y(0)\left[1 + \frac{1\cdot 3}{2!}x^2 + \frac{1\cdot 3^2\cdot 5}{4!}x^4 + \frac{1\cdot 3^2\cdot 5^2\cdot 7}{6!}x^6 + \cdots\right] + y'(0)\left[x + \frac{2\cdot 4}{3!}x^3 + \frac{2\cdot 4^2\cdot 6}{5!}x^5 + \frac{2\cdot 4^2\cdot 6^2\cdot 8}{7!}x^7 + \cdots\right] .$$

This is a linear combination of two series with arbitrary constants $y(0)$ and $y'(0)$; this is what should be expected for a general solution of a second order ODE.

Note: This method of finding the solution is equivalent to substituting into Eq. (1.31) a trial solution in the form of a series

$$y(x) = \sum_{n=0}^{\infty} a_n x^n$$

with unknown coefficients $a_n$. This gives a recurrence relation giving all of the $a_n$ in terms of the first two coefficients $a_0$ and $a_1$. Either method works for any (second order) differential equation with well-behaved coefficients.

1.6.3 Frobenius Method (partial)

This is a rather general method of solving ODEs with non-constant coefficients. We consider second order cases only. Suppose we have put our ODE into the standard form

$$\frac{d^2 y}{dx^2} + P(x)\frac{dy}{dx} + Q(x)y = 0$$

where $P(x)$ and $Q(x)$ are known functions of $x$ and where the initial conditions are $y = y_0$, $dy/dx = y_0'$ at $x = x_0$. Note that in the standard form the coefficient of $d^2 y/dx^2$ is unity.

The first step is to check what happens as $x \to x_0$. If $P(x)$ and $Q(x)$ have finite limits as $x \to x_0$ then $x_0$ is called an ordinary point; otherwise it is called a singular point. If $x = x_0$ is singular but $(x - x_0)P(x)$ and $(x - x_0)^2 Q(x)$ remain finite as $x \to x_0$ then $x = x_0$ is called a regular singular point. If $x = x_0$ does not satisfy this it is called an irregular singular point.

Examples: In the differential equation

$$\frac{d^2 y}{dx^2} + x^3\frac{dy}{dx} + 3y = 0$$

we have $P(x) = x^3$, $Q(x) = 3$ and $x = 0$ is an ordinary point. In the differential equation

$$\frac{d^2 y}{dx^2} + \frac{5}{x}\frac{dy}{dx} + \frac{3}{x^2}y = 0$$

we have $P(x) = 5/x$, $Q(x) = 3/x^2$ and $x = 0$ is a regular singular point because $xP(x) = 5$ and $x^2 Q(x) = 3$. In the differential equation

$$\frac{d^2 y}{dx^2} + \frac{7}{x^3}\frac{dy}{dx} + x^2 y = 0$$

we have $P(x) = 7/x^3$, $Q(x) = x^2$ and $x = 0$ is an irregular singular point because $xP(x) = 7/x^2 \to \infty$ as $x \to 0$.

The following theorems tell us how to find solutions in each case:

Theorem: If $x = 0$ is an ordinary point and $P(x)$ and $Q(x)$ can be expanded in an infinite power series around $x = 0$ (in which case they are called analytic) then all solutions of the differential equation can be written as power series of the form:

$$y = \sum_{n=0}^{\infty} a_n x^n .$$

Theorem: If $x = 0$ is a regular singular point then there is a solution of the form:

$$y = \sum_{n=0}^{\infty} a_n x^{n+c}, \qquad a_0 \ne 0$$

for some values of the constant $c$. It is important to note that $c$ can be fractional. Note that this theorem does not state that all solutions are of this form.

Example: (Legendre's Equation) Derive the series solution for the differential equation

$$(1 - x^2)\frac{d^2 y}{dx^2} - 2x\frac{dy}{dx} + \ell(\ell + 1)y = 0$$

about the origin $x = 0$ where $\ell$ is a constant.

To begin with we have to put the equation in standard form. This gives

$$\frac{d^2 y}{dx^2} - \frac{2x}{1 - x^2}\frac{dy}{dx} + \frac{\ell(\ell + 1)}{1 - x^2}y = 0$$

and hence

$$P(x) = -\frac{2x}{1 - x^2}, \qquad Q(x) = \frac{\ell(\ell + 1)}{1 - x^2} .$$

There are three main steps involved in the solution.

Step 1. Note that $x = 0$ is an ordinary point (but $x = \pm 1$ are regular singular points). Therefore, for a solution about $x = 0$ the first theorem applies and we let

$$y = \sum_{n=0}^{\infty} a_n x^n .$$

Step 2. We find the various derivatives. Using the $D$ and the $'$ notation to denote $d/dx$ we have

$$D(y) = y' = \sum_{n=0}^{\infty} n a_n x^{n-1}, \qquad D^2(y) = y'' = \sum_{n=0}^{\infty} n(n-1)a_n x^{n-2} .$$

These are substituted into the relevant terms in the differential equation giving

$$-x^2 y'' = -\sum_{n=0}^{\infty} n(n-1)a_n x^n$$
$$-2x y' = -2\sum_{n=0}^{\infty} n a_n x^n$$
$$\ell(\ell + 1)y = \sum_{n=0}^{\infty} \ell(\ell + 1)a_n x^n$$
$$y'' = \sum_{n=0}^{\infty} n(n-1)a_n x^{n-2} .$$

These terms have to be added to assemble the left-hand side of the equation and then set to zero to give the right-hand side.

Step 3. Substitute the various terms in the equation and equate the coefficients of all powers of $x$ to zero. Note that apart from the $y''$ term, all other terms have $x^n$ in the summation. Therefore

$$y'' = \sum_{n=0}^{\infty} n(n-1)a_n x^{n-2} = \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} \quad (\text{since the } n = 0, 1 \text{ terms are } 0) \tag{1.33}$$
$$= \sum_{n=0}^{\infty} (n+1)(n+2)a_{n+2}x^n \quad (\text{changing } n \to n + 2) . \tag{1.34}$$

We can check that Eqs. (1.33) and (1.34) are equivalent by calculating the first few terms in the series. We get

$$\text{Eq. (1.33)} \to 2(2-1)a_2 x^0 + 3(3-1)a_3 x^1 + \cdots$$
$$\text{Eq. (1.34)} \to (0+1)(0+2)a_2 x^0 + (1+1)(1+2)a_3 x^1 + \cdots$$

and so the series are the same. We can now substitute in the differential equation:

$$\sum_{n=0}^{\infty}\left[(n+1)(n+2)a_{n+2} - n(n-1)a_n - 2n a_n + \ell(\ell+1)a_n\right]x^n = 0$$

which implies that

$$\sum_{n=0}^{\infty}\left[(n+1)(n+2)a_{n+2} - n(n+1)a_n + \ell(\ell+1)a_n\right]x^n = 0 .$$

We can now equate the coefficients of all powers of $x$ to zero in order to obtain a relationship between the coefficients. We get

$$(n+1)(n+2)a_{n+2} + \left[\ell(\ell+1) - n(n+1)\right]a_n = 0$$
$$\Rightarrow a_{n+2} = -\frac{(\ell - n)(\ell + n + 1)}{(n+1)(n+2)}a_n . \tag{1.35}$$

This is our recurrence relation giving higher $a_n$'s in terms of lower $a_n$'s. Now we can substitute different values of $n$ in Eq. (1.35).

$$n = 0 \;\Rightarrow\; a_2 = -\frac{\ell(\ell+1)}{1\times 2}a_0$$
$$n = 1 \;\Rightarrow\; a_3 = -\frac{(\ell-1)(\ell+2)}{2\times 3}a_1$$
$$n = 2 \;\Rightarrow\; a_4 = -\frac{(\ell-2)(\ell+3)}{3\times 4}a_2 = \frac{\ell(\ell-2)(\ell+1)(\ell+3)}{4!}a_0$$
$$n = 3 \;\Rightarrow\; a_5 = -\frac{(\ell-3)(\ell+4)}{4\times 5}a_3 = \frac{(\ell-1)(\ell-3)(\ell+2)(\ell+4)}{5!}a_1 .$$

Therefore all the $a_n$'s are expressible in terms of $a_0$ and $a_1$. Substituting these expressions in our series solution gives:

$$y = a_0\left[1 - \frac{\ell(\ell+1)}{2!}x^2 + \frac{\ell(\ell-2)(\ell+1)(\ell+3)}{4!}x^4 + \cdots\right] + a_1\left[x - \frac{(\ell-1)(\ell+2)}{3!}x^3 + \frac{(\ell-1)(\ell-3)(\ell+2)(\ell+4)}{5!}x^5 + \cdots\right] .$$

Remark: When $\ell$ is not an integer, each bracketed series is an infinite series and a solution of the original differential equation. Now, since these series are linearly independent, the solution involving two arbitrary constants (i.e. $a_0$ and $a_1$) is a general solution.

Remark: Each series is convergent with the radius of convergence $R = 1$. To see this use the recurrence relation (with $n$ replaced by $2n$) for each series:

$$\left|\frac{a_{2n+2}x^{2n+2}}{a_{2n}x^{2n}}\right| = \left|-\frac{(\ell - 2n)(\ell + 2n + 1)}{(2n+1)(2n+2)}x^2\right| \to |x^2| \quad\text{as } n \to \infty .$$

The same is also true for the second series (with $n$ replaced by $2n - 1$).

Remark: Functions defined by these series are called Legendre's functions and in general they are not elementary functions. However, when $\ell$ is a non-negative integer, one of the series terminates and is therefore a polynomial.
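The termination of one of the series for integer $\ell$ can be seen directly by iterating the recurrence relation (1.35). The sketch below uses exact rational arithmetic; the function name `legendre_coeffs` and its arguments are illustrative choices rather than anything defined in the notes.

```python
from fractions import Fraction


def legendre_coeffs(ell, a0, a1, n_max):
    """Return [a_0, ..., a_n_max] generated by the recurrence (1.35).

    ell may be an int or a Fraction; a0 and a1 are the two free coefficients.
    """
    ell = Fraction(ell)
    a = [Fraction(a0), Fraction(a1)]
    for n in range(n_max - 1):
        # a_{n+2} = -(ell - n)(ell + n + 1) / ((n + 1)(n + 2)) * a_n
        factor = -(ell - n) * (ell + n + 1) / ((n + 1) * (n + 2))
        a.append(factor * a[n])
    return a[: n_max + 1]


if __name__ == "__main__":
    # ell = 2: the even series (a0 = 1, a1 = 0) stops after the x^2 term.
    print(legendre_coeffs(2, 1, 0, 6))
    # ell = 3: the odd series (a0 = 0, a1 = 1) stops after the x^3 term.
    print(legendre_coeffs(3, 0, 1, 7))
    # ell = 1/2: neither series terminates.
    print(legendre_coeffs(Fraction(1, 2), 1, 1, 6))
```

For $\ell = 2$ the even series reduces to $1 - 3x^2$, and for $\ell = 3$ the odd series reduces to $x - \tfrac{5}{3}x^3$; each is a constant multiple of the corresponding Legendre polynomial. The factor $(\ell - n)$ in (1.35) is what forces every later coefficient in that series to vanish once $n = \ell$ is reached.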
(The first series is a polynomial if $\ell$ is even and the second is a polynomial if $\ell$ is odd.) These lead to particular solutions of Legendre's equation referred to as Legendre's polynomials, which have nice properties and applications. For example, the solution of Laplace's equation, $\nabla^2\phi = 0$, leads to Legendre polynomials (see Calculus III).

Example: (Bessel's Equation) Derive the series solution for the differential equation

$$x^2\frac{d^2 y}{dx^2} + x\frac{dy}{dx} + (x^2 - p^2)y = 0$$

about the origin $x = 0$ where $p$ is a constant.

To begin with we have to put the equation in standard form. This gives

$$\frac{d^2 y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \frac{(x^2 - p^2)}{x^2}y = 0$$

and hence

$$P(x) = \frac{1}{x}, \qquad Q(x) = \frac{(x^2 - p^2)}{x^2} .$$

Therefore $x = 0$ is a regular singular point and the second theorem listed above applies. Hence we try a power series solution of the form

$$y = \sum_{n=0}^{\infty} a_n x^{n+r} .$$

Note that we must include the $r$ because of the fact that $x = 0$ is a regular singular point. Substituting this series for $y$ in the terms in the differential equation gives

$$x\frac{dy}{dx} = \sum_{n=0}^{\infty} a_n(n + r)x^{n+r}$$
$$x^2\frac{d^2 y}{dx^2} = \sum_{n=0}^{\infty} a_n(n + r)(n + r - 1)x^{n+r}$$

and adding all the terms together gives

$$x^2\frac{d^2 y}{dx^2} + x\frac{dy}{dx} + (x^2 - p^2)y = \sum_{n=0}^{\infty} a_n\left[(n+r)(n+r-1) + (n+r) - p^2\right]x^{n+r} + \sum_{n=0}^{\infty} a_n x^{n+r+2}$$
$$= \sum_{n=0}^{\infty} a_n\left[(n+r)^2 - p^2\right]x^{n+r} + \sum_{n=0}^{\infty} a_n x^{n+r+2} = 0 .$$

Now we adjust the equation to make the powers of $x$ the same in both series. We replace $n$ by $n + 2$ in the first series which we now have to start at $n = -2$ rather than $n = 0$. Hence we can write the first series as

$$\sum_{n=-2}^{\infty} a_{n+2}\left[(n+r+2)^2 - p^2\right]x^{n+r+2} = \sum_{n=0}^{\infty} a_{n+2}\left[(n+r+2)^2 - p^2\right]x^{n+r+2} + a_0\left[r^2 - p^2\right]x^r + a_1\left[(r+1)^2 - p^2\right]x^{r+1}$$

where we have separated off the first two terms corresponding to $n = -2$ and $n = -1$ so that the final series starts at $n = 0$ again. We are now in a position to group together all the terms in the series solution. This gives

$$\sum_{n=0}^{\infty}\left\{a_{n+2}\left[(n+r+2)^2 - p^2\right] + a_n\right\}x^{n+r+2} + a_0\left[r^2 - p^2\right]x^r + a_1\left[(r+1)^2 - p^2\right]x^{r+1} = 0 .$$

In order to satisfy this equation the coefficient of each power of $x$ should be zero. This condition gives

$$a_0\left[r^2 - p^2\right] = 0 \tag{1.36}$$
$$a_1\left[(r+1)^2 - p^2\right] = 0 \tag{1.37}$$

and the recurrence relation

$$\frac{a_{n+2}}{a_n} = -\frac{1}{(n+r+2)^2 - p^2}$$

or

$$\frac{a_{n+2}}{a_n} = -\frac{1}{(n+r+2-p)(n+r+2+p)} . \tag{1.38}$$

Therefore, in order to satisfy the equations we must satisfy Eqs. (1.36), (1.37) and (1.38). Eq. (1.36) gives

$$(r^2 - p^2)a_0 = 0 .$$

Note: We always take $a_0 \ne 0$ in the Frobenius method, which gives the undetermined $r$.

This gives $r = \pm p$. Note that two roots can be considered. The equation $r^2 - p^2 = 0$ is called the indicial equation and it is obtained by setting to zero the coefficient of the lowest power of $x$ and taking $a_0 \ne 0$.

Taking $r = \pm p$, Eq. (1.37) can only be satisfied if $a_1 = 0$. Therefore, by Eq. (1.38) we must have $a_3 = a_5 = \cdots = 0$. Therefore all odd powered terms are zero. Now with $r = +p$ we have a solution

$$y = \sum_{n=0}^{\infty} a_n x^{r+n}$$

which, using Eq. (1.38), gives

$$\frac{a_{n+2}}{a_n} = -\frac{1}{(n+2)(n+2+2p)}$$

and hence

$$\frac{a_2}{a_0} = -\frac{1}{2(2+2p)} .$$

The resulting series solution is

$$y_{[1]} = a_0 x^p\left[1 - \frac{x^2}{2(2p+2)} + \frac{x^4}{2\times 4\,(2p+2)(2p+4)} - \cdots\right]$$

which is an infinite series provided $p$ is not a negative integer. Now with $r = -p$ we have a similar procedure. Replacing $p$ by $-p$ in the above case we obtain the series solution

$$y_{[2]} = a_0 x^{-p}\left[1 + \frac{x^2}{2(2p-2)} + \frac{x^4}{2\times 4\,(2p-2)(2p-4)} + \cdots\right]$$

which is a well behaved infinite series provided $p$ is not a positive integer. Therefore a general solution is of the type

$$y = Ay_{[1]} + By_{[2]}$$

where $A$ and $B$ are arbitrary constants.

Remark: It is clear that whenever the constant $p$ in the Bessel equation is an integer or whenever the difference between the roots of the indicial equation is zero or a positive integer (i.e. if $p$ is a half integer), then we expect difficulties. To deal with these cases (and similar cases in other equations) we require the full Frobenius method that is not covered in this course. For example, when the roots of the indicial equation are equal, i.e. $r_1 = r_2$, the solutions coincide implying that there is only one solution,

$$y_{[1]} = x^{r_1}\sum_{n=0}^{\infty} a_n x^n .$$

The second solution is then of the form

$$y_{[2]} = \ln x\; y_{[1]} .$$

2 Functions of More than One Variable

2.1 Functions of One Variable

Previously we have considered functions of one variable. We have written these in terms of functional notation as, for example, $y = f(x)$, where $f$ is some function of the variable $x$. Therefore for each value of $x$ there is a corresponding value of $y$. For example, $y = \sin x$. The derivative of such a function can be written as $dy/dx = f'(x)$. This corresponds to the gradient of the tangent to the curve at the point $(x, y)$. For example, $dy/dx = \cos x$. We can also consider the integral of the function between $x = a$ and $x = b$. This is written as

$$\int_a^b f(x)\,dx = \left[F(x)\right]_a^b \quad\text{where}\quad \frac{dF(x)}{dx} = f(x) .$$

Fig. 2.1 Plot of a general function $y = f(x)$.

A function of $x$ can also be considered graphically. Here $y(x)$ or $y = f(x)$ represents the height above the $x$-axis of a point on the curve represented by $y = f(x)$. This is illustrated in Fig. 2.1. Figure 2.2 contains plots of several functions. In each case the value of $y$ for a particular value of $x$ defines the height above the $x$-axis. Note the relationship between the parabolas shown in Fig. 2.2(b), (c) and (d).

2.2 Functions of Two Variables

It is perfectly possible for some dependent variable to be a function of two variables. Here we use the notation $z = f(x, y)$ for such a function where the 'input' is $(x, y)$ and the 'output' is $z$.
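The 'input' and 'output' language can be made concrete by tabulating a function of two variables on a small grid. In the sketch below the choice $f(x, y) = x^2 + y^2$, the helper name `height_table` and the grid values are all illustrative assumptions.

```python
def f(x, y):
    """An illustrative function of two variables: z = x^2 + y^2."""
    return x**2 + y**2


def height_table(func, xs, ys):
    """Return rows of z = func(x, y); one row per y value, one column per x value."""
    return [[func(x, y) for x in xs] for y in ys]


if __name__ == "__main__":
    xs = [-1, 0, 1]
    ys = [-1, 0, 1]
    for y, row in zip(ys, height_table(f, xs, ys)):
        print("y =", y, "->", row)
```

Each entry of the table is the single output $z$ attached to one input pair $(x, y)$.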
If we consider the dependent variable z as the third dimension then it can be thought of as the height above the (x, y) plane (defined by z = 0) as illustrated in Fig. 2.3(a). Note that f (x, y) depends on both x and y and therefore z is a function of two variables. Hence the ‘graph’ of z = f (x, y) is now a two-dimensional surface (see Fig. 2.3(b)). Good analogies are the heights of a mountainous terrain above sea level, or the weather maps with temperature or atmospheric pressure plotted as a function of geographic location. Therefore we can imagine a function of two variables as a surface in three dimensional space. Now we consider some of the techniques for plotting surfaces defined by functions of two variables. 45 (c) -4 -3 -2 -1 2 3 4 x y = 2 - x2 (d) y = 5 + x2 1 20 15 -4 -3 -2 -1 -5 10 -10 5 1 2 3 4 x -15 -4 -3 -2 -1 1 2 3 4 x -20 Example: Plot the surface defined by z = f (x, y) = x 2 + y 2 . (e) (a) z (b) 5 z -2 surface defined by z = f (x,y) height above (x,y) plane -15 y x (x,y) -5 1 2 3 (f) 4 5 y = x 2 - cos x + 3 sin x 20 x 15 10 -10 z = f (x,y) in the plane (z = 0) -1 y = x3 - 4x 2 + 4 y x Fig. 2.3. (a) Each value of z = f (x, y) for a particular (x, y) can be thought of as defining the height above the (x, y) plane. (b) -20 5 -4 -2 -5 2 4 x Fig. 2.2. Plots of various functions of the variable x. (a) y = sin x, (b) y = x 2 , (c) y = 5 + x 2 , (d) y = 2 − x 2 , (e) y = x 3 − 4x 2 + 4 and (f) y = x 2 − cos x + 3 sin x. 46 2 Functions of More than One Variable 2.2 Functions of Two Variables 47 In order to plot this surface recall that, for a fixed y , y = a say, we have z = x 2 + a 2 . In this special case we have made it a function of one variable only (z is now only a function of x ); in fact z = x 2 + a 2 defines a parabola in the plane y = a , perpendicular to the y -axis. We can assemble the surface by plotting a set of one dimensional curves z = f (x, a) where we think of a as a constant (i.e. the curves lie on the y = a plane). 
Using three different values of a we obtain the plots shown in Fig. 2.4. Each value of a gives rise to a different parabola. For example, for a = 0 we obtain the parabola z = x 2 . Therefore the required surface is made up of all the parabolas. Note that w could equally well fix the value of x and plot the parabolas of the form z = a 2 + y 2 which lie in planes perpendicular to the first set of parabolas. The final surface is shown in Fig. 2.5. Likewise we can sketch the surface defined by (i) z = −(x 2 + y 2 ) which is the surface defined by z = x 2 + y 2 turned upside down (see Fig. 2.6) and (ii) z = x 2 + y 2 + 3 which is the surface defined z = x 2 + y 2 lifted up by three units along the z -axis (see Fig. 2.7). Consider the surface defined by z = y 2 − x 2 . Holding x = 0 but letting y vary we obtain a parabola z = y 2 in the y -z plane. Holding y = 0 but letting x vary we obtain a parabola z = −x 2 in the x -z plane. Extending these results we see that we obtain a set of inverted parabolas whose vertices increase in height as |y| increases. The resulting surface is a saddle and the origin is a saddle point. This is illustrated in Fig. 2.8 Unlike the case of functions of one variable, for saddles (arising from functions of two variables), the origin (in this case) is a maximum in x and a minimum in y. Before proceeding it is worthwhile reminding ourselves that in the plane we can choose two sets of coordinates. These are cartesian coordinates (x, y) and z z=x2+a2 Fig. 2.5 The surface defined by z = x 2 + y 2 . polar coordinates (r, θ) where 0 < r < ∞ and 0 ≤ θ ≤ 2π (see Fig. 2.9). The relationships between them are given by x = r cos θ, y = r sin θ z=x2+a2 z=x2 y=-a y=+a x Fig. 2.4 y Three of the parabolas that go to make up the surface defined by z = x 2 + y 2 . Fig. 2.6 The surface defined by z = −(x 2 + y 2 ). 48 2 Functions of More than One Variable 2.3 Limits and Continuity 49 y (x,y) (r,θ) r θ x Fig. 2.9 Cartesian and polar coordinates in the plane. Fig. 
2.7 The surface defined by z = x 2 + y 2 + 3. Polar coordinates can be useful when considering surfaces defined by functions of two variables. For example, consider again the surface defined by z = f (x, y) = x 2 + y 2 , the bowl shown previously in Fig. 2.5. In polar coordinates we can write the equation of the surface as z = r 2 . Therefore the value of z depends only on the distance to the origin. Furthermore, the surface must be rotationally symmetric about the z -axis. The surface is shown again in Fig. 2.10. 2.3 Limits and Continuity In this section the aim is to extend ideas from the one variable case to functions of more than one variable. We begin by examining the concepts for functions of one variable. Fig. 2.8 and r= The surface defined by z = y 2 − x 2 . x 2 + y2, θ = tan−1 y x . Fig. 2.10 The surface defined by z = x 2 + y 2 . 50 2 Functions of More than One Variable 2.3 Limits and Continuity 2.3.1 Limits and Continuity for Functions of One Variable (a) For functions of one variable we say that f (x) approaches the limit L as x → a whenever (b) 51 3 2 1 1 lim f (x) = L . x→a -10 ⇒ lim f (x) = f (0) x→0 and also because there is no break at x = 0. -3 Fig. 2.11 3 (b) -10 10 -10 10 -1 -1 Fig. 2.12. Functions of one variable. (a) f (x) = 1 when x > 0 and f (x) = 0 when x ≤ 0, (b) f (x) = 1/x, (c) f (x) = (sin x)/x when x = 0 and f (x) = 1 when x = 0, (d) f (x) = 0 when x = 0 and f (x) = 1 when x = 0. = (c) 1 1 1 3 -3 -2 -1 1 2 3 -3 -2 f (x) = 3 2 2 1 lim f (x) = 0 . x→0− Therefore the function f (x) is not continuous at x = 0. 2 1 (d) 1 lim f (x) = 1 2 -1 -3 Example: Determine whether or not the function defined by 3 -2 3 -2 x→0+ when x > 0 ; when x ≤ 0 . is continuous at x = 0. (a) 2 The function is shown in Fig. 2.12(a). Note that there exists a break at x = 0. As x → 0 from below (the negative side), f (x) → 0. As x → 0 from above (the positive side), f (x) → 1. 
Example: Determine whether or not the function defined by 1 f (x) = 0 1 x→a Intuitively by continuity we mean that there is a lack of breaks in the graph of the function f (x). Consider the three examples shown in Fig. 2.11. These are the functions f (x) = x , f (x) = x 2 and f (x) = x 3 . As x approaches 0 in each case the functions are continuous at x = 0. This is because -1 -1 (c) lim f (x) = f (a) . -2 -1 Therefore as x approaches a , so f (x) approaches L . The limits are the same as x → a from both directions. For functions of one variable we say that f (x) is continuous at x = a whenever limx→a f (x) exists, f (a) is defined and the limit L equals f (a). Therefore continuity of the function f (x) at x = a -3 10 -1 -1 -1 -2 -2 -2 -3 -3 -3 x is continuous at x = 0. 1 -1 1 2 3 Functions of one variable. (a) f (x) = x, (b) f (x) = x 2 and (c) f (x) = x 3 . The function defines a pair of hyperbolas and is shown in Fig. 2.12(b). The function f (x) is not continuous at x = 0 because limx→0 f (x) does not exist. Example: Determine whether or not the function defined by f (x) = sin x x 1 x = 0 ; x = 0. 52 2.3 Limits and Continuity 2 Functions of More than One Variable is continuous at x = 0. (a) 53 (b) The function is shown in Fig. 2.12(c). The function f (x) is continuous at x = 0 because lim f (x) = 1 . x→0 Note that we do not have to provide a special definition for the function at x = 0. This is because, for small x , sin x ≈ x − x3 3! + ··· and so sin x x =1− x2 6 + ··· → 1 as x → 0 . Example: Determine whether or not the function defined by f (x) = 0 1 whenx = 0; whenx = 0 . is continuous at x = 0. The function is shown in Fig. 2.12(d). The function is clearly not continuous at x = 0 because lim f (x) = 0 x→0 Fig. 2.13. To examine the continuity at a point we have to have the same limit as the point is approached from all directions, including (a) radial directions and (b) tangential directions. 
Definition: The function f (x, y) is said to be joint continuous at the point (x, y) = (x0 , y0 ) if: 1) f (x, y) is defined at (x0 , y0 ) 2) lim(x,y)→(x0 ,y0 ) f (x, y) exists 3) lim(x,y)→(x0 ,y0 ) f (x, y) = f (x0 , y0 ) . A function that is joint continuous at every point (x, y) is said to be a joint continuous function. In summary, for joint continuity check to see if: lim as approached from both sides but f (0) = 0 2.3.2 Limits and Continuity for Functions of Two Variables We wish to define continuity of the function f (x, y) at (x, y) = (0, 0). Note: In two dimensions we do not (as in one dimension) just approach the origin from the left and right. For continuity we now require there to be the same limit as the origin is approached in all directions (see Fig. 2.13). Note: With functions of one variable we consider smaller and smaller intervals about the origin. With functions of two variables we have to replace intervals by discs and let the radius, r = x 2 + y 2 → 0. We want all values of the function f inside the disc tend to the same limit. Therefore we wish to say that the function f (x, y) is continuous at the point (0, 0) provided the values of f (x, y) in a disc of radius d approaches the value of f (0, 0) as d → 0; we define (x, y) → (0, 0) to mean r = x 2 + y 2 → 0. (x,y)→(x0 ,y0 ) f (x, y) = f (x0 , y0 ) . Example: Show that the function f (x, y) = x 2 − y 2 is (joint) continuous at (0, 0). In polar coordinates, x = r cos θ , y = r sin θ we have f = r 2 (cos2 θ − sin2 θ ) where r becomes the radius of the disc discussed above. Therefore lim f → 0 = f (0, 0) = 0 r →0 independent of θ and therefore the function is joint continuous. Note that this function represents a saddle and therefore it is intuitively obvious that it is continuous. Example: 54 2 Functions of More than One Variable Determine the continuity of the function defined by f (x, y) = 1 1 0 for y = 0 (x -axis) ; for x = 0 ( y -axis) . 
away from both axes.

The values of the function in the $x$-$y$ plane are shown in Fig. 2.14.

Fig. 2.14. The function $f(x, y) = 1$ when $x = 0$ or $y = 0$, but 0 elsewhere.

Note that if we keep $y$ fixed equal to 0 and vary $x$ then $f(x, 0) = 1$ for all $x$. Therefore $f(x, y)$ is continuous in $x$ alone. Similarly, keeping $x$ fixed equal to 0 and varying $y$ gives $f(0, y) = 1$ for all $y$. Therefore $f(x, y)$ is continuous in $y$ alone. But $f(x, y) = 0$ anywhere off the two axes. Therefore the function is continuous in each variable separately, but it is not jointly continuous. This is because in any disc around the origin (see Fig. 2.14) the function always takes both the values zero and one, no matter how small the disc is. This function is not joint continuous but separately continuous, i.e. the functions $f(x, 0)$ and $f(0, y)$ are continuous functions of one variable.

In order to determine whether or not functions are joint continuous, there are two procedures to try:
1) Write the function in polar coordinates and see if $f \to L$ as $r \to 0$ independent of $\theta$. If it does, and $L = f(0, 0)$, then $f$ is joint continuous.
2) Try substitutions of the type $x = y^n$ or $y = x^n$ to see if $f(x, y)$ has different limits along different curves, in which case it is not continuous.

Example: Determine the continuity of the function defined by
\[ f(x, y) = \begin{cases} \dfrac{x + y}{(x^2 + y^2)^{1/4}} & (x, y) \neq (0, 0) ; \\ 0 & (x, y) = (0, 0) . \end{cases} \]
In polar coordinates we have
\[ x^2 + y^2 = r^2 (\cos^2\theta + \sin^2\theta) = r^2 , \qquad (x^2 + y^2)^{1/4} = (r^2)^{1/4} = \sqrt{r} \]
and hence
\[ f = \frac{r(\cos\theta + \sin\theta)}{\sqrt{r}} = \sqrt{r}\,(\cos\theta + \sin\theta) \to 0 \quad \text{as } r \to 0 . \]
Therefore $f$ is joint continuous at $(0, 0)$.

Example: Determine the continuity of the function defined by
\[ f(x, y) = \begin{cases} \dfrac{2xy}{x^2 + y^2} & (x, y) \neq (0, 0) ; \\ 0 & (x, y) = (0, 0) . \end{cases} \]
First method: In polar coordinates we have
\[ f = \frac{2 r^2 \cos\theta \sin\theta}{r^2 (\cos^2\theta + \sin^2\theta)} = \sin 2\theta \]
away from the origin. Therefore, as $r \to 0$, $f$ does not tend to 0 in all directions. For example, if we approach the origin along the line $\theta = \pi/4$, then $f|_{\theta = \pi/4} = \sin(\pi/2) = 1$. Therefore $f = 1$ everywhere along this line, except at the origin where it is equal to zero. Therefore the function is not joint continuous at $(0, 0)$.

Second method: We proceed by letting $y = mx$ and see how the function changes as we approach the origin along this line. We have
\[ f = \frac{2 m x^2}{x^2 + m^2 x^2} = \frac{2m}{1 + m^2} = \text{constant} . \]
Therefore $f$ is a constant along $y = mx$ everywhere except at the origin (note that we divided by $x^2$ in the above equation). Since this constant depends on $m$, $f$ is not joint continuous at $(0, 0)$.

Note: In this example if we keep $y = 0$ fixed, then $f(x, 0) = 0$ everywhere along the $x$-axis. Therefore $f$ is continuous in $x$ alone. Similarly $f$ is continuous in $y$ alone. Therefore $f$ is separately continuous in $x$ alone and $y$ alone, but not jointly continuous in $x$ and $y$.

Remark: Another way of expressing joint continuity is as follows. If we write $\Delta f = f(x + h, y + k) - f(x, y)$ then joint continuity at the point $(x, y)$ amounts to
\[ \Delta f \to 0 \quad \text{as } (h, k) \to (0, 0) . \]

Example: Determine the continuity of the function defined by
\[ f(x, y) = \begin{cases} \dfrac{x^2 y}{x^4 + y^2} & (x, y) \neq (0, 0) ; \\ 0 & (x, y) = (0, 0) . \end{cases} \]
First method: In polar coordinates we have
\[ f = \frac{r^3 \cos^2\theta \sin\theta}{r^4 \cos^4\theta + r^2 \sin^2\theta} . \]
It is not quite clear from this what happens as $r \to 0$, so we must try the second method.

Second method: We proceed by letting $y = x^2$ and see how the function changes as we approach the origin along this curve. We have
\[ f = \frac{x^2 \cdot x^2}{x^4 + (x^2)^2} = \frac{x^4}{2x^4} = \frac{1}{2} . \]
Therefore $f$ reduces to a constant along this curve, and this constant differs from $f(0, 0) = 0$. Therefore $f$ is not jointly continuous. We can draw a disc around the origin to see this.

Remark: We can in general prove that
\[ \text{joint continuity} \Rightarrow \text{separate continuity} \]
but that in general
\[ \text{separate continuity} \not\Rightarrow \text{joint continuity} . \]

Remark: Sums and products of continuous functions are always continuous.

Remark: All functions of the form
\[ f(x, y) = \begin{cases} \dfrac{x^{n/2} y^{m/2}}{x^n + y^m} & (x, y) \neq (0, 0) ; \\ 0 & (x, y) = (0, 0) \end{cases} \]
are not joint continuous, as letting $y = x^{n/m}$ easily demonstrates.

2.4 Partial Differentiation

For functions of one variable: $y = f(x)$. In this case the derivative at a point is the gradient of the tangent to the curve at that point (see Fig. 2.15).

Fig. 2.15. The tangent to the curve $y = f(x)$ at the point $x = x_0$.

For functions of two variables: $z = f(x, y)$. However, there exist an infinite number of tangents that can be drawn at a point $(x_0, y_0)$ (see Fig. 2.16). So how can we define a derivative in this case? The answer is that we define a partial derivative.

Fig. 2.16. There is an infinite number of tangents to the surface at the point $(x_0, y_0)$.

The idea is as follows: As we have seen, the function $z = f(x, y)$ can be represented as a surface. If we draw a plane through $(x_0, y_0)$ parallel to the $x$-$z$ plane, then the two surfaces intersect in a curve for which $y$ is fixed. We can then draw a tangent to this curve at the point $(x_0, y_0)$ which is unique (see Fig. 2.17). This is the partial derivative with respect to $x$ for fixed $y$. Similarly one can draw a plane parallel to the $z$-$y$ plane and define the partial derivative with respect to $y$ for fixed $x$.

Fig. 2.17. The tangent to the surface at the point $(x_0, y_0)$ parallel to the $x$-$z$ plane.

So, if we fix $y = y_0$ in $f(x, y)$ and let $x$ vary, then $f(x, y_0)$ depends only on $x$. Its derivative with respect to $x$ is called the partial derivative with respect to $x$ and is defined as
\[ \lim_{h \to 0} \frac{f(x + h, y_0) - f(x, y_0)}{h} . \]
Provided this limit exists, this partial derivative is denoted by
\[ \frac{\partial f}{\partial x} \quad \text{or} \quad f_x . \]
Similarly the derivative of $f(x_0, y)$ with respect to $y$ is defined as:
\[ \frac{\partial f}{\partial y} \equiv f_y = \lim_{k \to 0} \frac{f(x_0, y + k) - f(x_0, y)}{k} \]
and similarly for functions of three or more variables; we define partial derivatives by holding all but ONE variable fixed. For example, if $f = f(x, y, z)$ then
\[ f_x \equiv \frac{\partial f}{\partial x} \quad (y, z \text{ constant}), \qquad f_y \equiv \frac{\partial f}{\partial y} \quad (x, z \text{ constant}), \qquad f_z \equiv \frac{\partial f}{\partial z} \quad (x, y \text{ constant}) . \]

Remark: The symbol $\partial$ is the "curly d" and is NOT the same as $d$. Also, we cannot treat $\partial f/\partial x$ as a ratio, as in some sense we can do with $df/dx$.

Remark: Calculations are as easy (or even easier!) as for functions of one variable.

Example:
\[ f(x, y) = x^2 + y^2 \;\Rightarrow\; f_x = 2x, \quad f_y = 2y \]
Example:
\[ f(x, y, z) = x y^2 z^3 \;\Rightarrow\; f_x = y^2 z^3, \quad f_y = 2xyz^3, \quad f_z = 3xy^2 z^2 \]
Example:
\[ f(x, y) = e^{(x^2 + y^2)} \;\Rightarrow\; f_x = 2x\, e^{(x^2 + y^2)}, \quad f_y = 2y\, e^{(x^2 + y^2)} \]
Example:
\[ f(x, y) = \frac{1}{x + y} \;\Rightarrow\; f_x = \frac{-1}{(x + y)^2}, \quad f_y = \frac{-1}{(x + y)^2} \]

Note: Partial derivatives are themselves functions of $x$ and $y$, or $x$, $y$ and $z$, etc. Therefore we can take further partial derivatives. So we define:
\[ \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} \equiv (f_x)_x = f_{xx} \]
\[ \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y \partial x} \equiv (f_x)_y = f_{xy} \]
\[ \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x \partial y} \equiv (f_y)_x = f_{yx} \]
where $f_{xy}$, $f_{yx}$, etc. are called mixed partial derivatives. Note that $f_{xy}$ means (partial) differentiate with respect to $x$ first and then with respect to $y$.

Assuming joint continuity of the higher order derivatives, the order of differentiation can be interchanged, and this holds in higher dimensions as well.
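The partial derivatives worked out in the examples above can be checked numerically. The following is a minimal sketch in plain Python, not part of the original notes: the helper names `partial_x` and `partial_y`, the test point and the step sizes are illustrative assumptions. It approximates each partial derivative of $f(x, y) = e^{(x^2 + y^2)}$ by a central difference in one variable while the other is held fixed, compares the result with the formulas $f_x = 2x\,e^{(x^2+y^2)}$ and $f_y = 2y\,e^{(x^2+y^2)}$, and then compares the two mixed derivatives $f_{xy}$ and $f_{yx}$.

```python
import math


def f(x, y):
    # Example function from the notes: f(x, y) = exp(x^2 + y^2)
    return math.exp(x**2 + y**2)


def partial_x(g, x, y, h=1e-5):
    # Central difference in x with y held fixed
    return (g(x + h, y) - g(x - h, y)) / (2 * h)


def partial_y(g, x, y, k=1e-5):
    # Central difference in y with x held fixed
    return (g(x, y + k) - g(x, y - k)) / (2 * k)


def f_xy(x, y, h=1e-4):
    # Differentiate with respect to x first, then with respect to y
    return partial_y(lambda u, v: partial_x(f, u, v, h), x, y, h)


def f_yx(x, y, h=1e-4):
    # Differentiate with respect to y first, then with respect to x
    return partial_x(lambda u, v: partial_y(f, u, v, h), x, y, h)


x0, y0 = 0.3, -0.2                      # arbitrary test point
exact_fx = 2 * x0 * f(x0, y0)           # f_x = 2x exp(x^2 + y^2)
exact_fy = 2 * y0 * f(x0, y0)           # f_y = 2y exp(x^2 + y^2)

print("f_x:", partial_x(f, x0, y0), "formula:", exact_fx)
print("f_y:", partial_y(f, x0, y0), "formula:", exact_fy)
print("f_xy:", f_xy(x0, y0), "f_yx:", f_yx(x0, y0))
```

Finite differences only approximate the limits in the definitions, so agreement is expected to within a small tolerance rather than exactly.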
For example, if $f = f(x, y, z)$ and we assume joint continuity of the derivatives involved, then
\[ f_{xxyyz} = f_{xyxyz} = f_{zxxyy} = \cdots \text{ etc.} \]
Therefore, provided the conditions of joint continuity are met, the operational order of differentiation does not matter.

Example: Given the function
\[ f(x, y) = \sin(x - 2y^2) \]
find $f_x$, $f_y$, $f_{xx}$, $f_{yy}$, $f_{xy}$ and $f_{yx}$.
\[ f_x = \frac{\partial f}{\partial x} = \cos(x - 2y^2) \quad \text{(keeping $y$ constant)} \]
\[ f_y = \frac{\partial f}{\partial y} = -4y \cos(x - 2y^2) \quad \text{(keeping $x$ constant)} \]
\[ f_{xx} = \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = -\sin(x - 2y^2) \]
\[ f_{yy} = \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = -4\cos(x - 2y^2) + 4y(-4y)\sin(x - 2y^2) \]
\[ f_{xy} = \frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = 4y \sin(x - 2y^2) \]
\[ f_{yx} = \frac{\partial^2 f}{\partial x \partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = 4y \sin(x - 2y^2) . \]

Note: In this example $f_{xy} = f_{yx}$. This is not a coincidence. It turns out that the mixed derivatives $f_{xy}$ and $f_{yx}$ are equal for most functions in practice. The condition for equality involves the concept of joint continuous functions. Suppose $x = x_0 + h$ and $y = y_0 + k$. Then
\[ \Delta f = f(x, y) - f(x_0, y_0) = f(x_0 + h, y_0 + k) - f(x_0, y_0) . \]
Joint continuity means that $\Delta f \to 0$ as $(h, k) \to (0, 0)$, where $(h, k)$ may approach $(0, 0)$ along any curve. Furthermore, sums and products of joint continuous functions are joint continuous.

Theorem: If at a point $(x, y)$ both $f_{xy}$ and $f_{yx}$ exist (for a function $f(x, y)$) and are joint continuous, then we have $f_{xy} = f_{yx}$.

We can also define higher partial derivatives. For example,
\[ f_{xxx} \equiv \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x^2}\right) = \frac{\partial^3 f}{\partial x^3} \]
\[ f_{xxy} \equiv \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial x^2}\right) = \frac{\partial^3 f}{\partial y \partial x^2} = f_{yxx} = f_{xyx} . \]

Example: Show that (i) $f(x, y, z) = x^2 + y^2 - 2z^2$ and (ii) $f(x, y, z) = e^{(3x + 4y)} \cos 5z$ satisfy Laplace's equation
\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 0 . \]
(i) We have
\[ \frac{\partial f}{\partial x} = 2x, \quad \frac{\partial^2 f}{\partial x^2} = 2, \qquad \frac{\partial f}{\partial y} = 2y, \quad \frac{\partial^2 f}{\partial y^2} = 2, \qquad \frac{\partial f}{\partial z} = -4z, \quad \frac{\partial^2 f}{\partial z^2} = -4, \]
\[ \Rightarrow \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 2 + 2 - 4 = 0 . \]
(ii) We have
\[ \frac{\partial f}{\partial x} = 3 e^{(3x + 4y)} \cos 5z, \qquad \frac{\partial^2 f}{\partial x^2} = 9 e^{(3x + 4y)} \cos 5z = 9f, \]
\[ \frac{\partial f}{\partial y} = 4 e^{(3x + 4y)} \cos 5z, \qquad \frac{\partial^2 f}{\partial y^2} = 16 e^{(3x + 4y)} \cos 5z = 16f, \]
\[ \frac{\partial f}{\partial z} = -5 e^{(3x + 4y)} \sin 5z, \qquad \frac{\partial^2 f}{\partial z^2} = -25 e^{(3x + 4y)} \cos 5z = -25f, \]
\[ \Rightarrow \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 9f + 16f - 25f = 0 . \]

2.5 Joint Differentiability

Again we start with the idea of differentiability for functions of one variable. Consider a function $y = f(x)$ and two neighbouring points $x = a$ and $x = a + h$ (see Fig. 2.18). The question is: Does the slope
\[ \frac{f(a + h) - f(a)}{h} \]
tend towards a definite limit $A$? If so, $A$ is the required tangent slope (the derivative) at $x = a$ and the function is said to be differentiable at $x = a$. Alternatively we can write
\[ \frac{f(a + h) - f(a)}{h} - A = \rho \]
where $\rho$ determines how well the slope approximates $A$ ($\rho$ can be thought of as the "error" in the slope). This leads to the alternative definition of differentiability:

Definition: Let $f = f(x)$ be a function and let $\Delta f = f(a + h) - f(a)$. The function $f(x)$ is differentiable at $x = a$ provided
\[ \Delta f = Ah + \rho h \]
such that $\rho \to 0$ as $h \to 0$.

Therefore differentiability in one dimension implies that the tangent line with slope
\[ A = \left. \frac{df}{dx} \right|_{x = a} \]
gives an excellent approximation to the curve near $x = a$ (see Fig. 2.19).

Fig. 2.19. The gradient of the tangent at the point $x = a$ on the curve $y = f(x)$.

Note the form of the functions shown in Fig. 2.20. In each case there are no tangents to the curve at $x = x_0$. This implies that these functions are not differentiable.

When we are dealing with functions of two variables we require a generalisation of this idea, and we expect that joint differentiability of a function at a point $(x_0, y_0)$ implies that the surface $z = f(x, y)$ has a tangent plane at $(x_0, y_0)$ and that near this point the tangent plane gives an excellent approximation to the surface (see Fig. 2.21).

Fig.
2.18. The approximation to the slope at the point $x = a$ on the curve $y = f(x)$.

Fig. 2.20. Two functions which are not differentiable.

Fig. 2.21. The tangent plane to a surface $z = f(x, y)$ at the point $(x_0, y_0)$.

The equation of the plane passing through the point $P(x_0, y_0, z_0)$ is:
\[ A(x - x_0) + B(y - y_0) + C(z - z_0) = 0 . \]
Dividing by $C$ this can be rearranged to give
\[ z - z_0 = A'(x - x_0) + B'(y - y_0) \]
where $A' = -A/C$ and $B' = -B/C$. For this plane to represent the tangent plane at $(x_0, y_0)$, its intersection with the plane $y = y_0$ must be the tangent line with gradient given by the partial derivative $f_x|_{(x_0, y_0)}$, by definition. Similarly, its intersection with the plane $x = x_0$ must be the tangent line with gradient given by the partial derivative $f_y|_{(x_0, y_0)}$. Therefore the equation of the tangent plane to the surface $z = f(x, y)$ at $(x_0, y_0, z_0)$ is:
\[ z - z_0 = (x - x_0)\, f_x|_{(x_0, y_0)} + (y - y_0)\, f_y|_{(x_0, y_0)} . \]

Now we can consider the concept of joint differentiability in the case of functions of two variables.

Definition: The function $f(x, y)$ is jointly differentiable at $(a, b)$ provided
\[ \Delta f = f(a + h, b + k) - f(a, b) = Ah + Bk + h\rho_1 + k\rho_2 \qquad (2.1) \]
such that
\[ \lim_{(h,k) \to (0,0)} \rho_1 = 0 \quad \text{and} \quad \lim_{(h,k) \to (0,0)} \rho_2 = 0 \]
and where (i) $\rho_1$ and $\rho_2$ depend on $h$, $k$, $a$, $b$ ($\rho_1$ and $\rho_2$ are "error" terms) and (ii) the first and second terms in Eq. (2.1) represent the tangent plane itself:
\[ \Rightarrow A = f_x , \quad B = f_y . \]
It follows from Eq. (2.1) that
\[ \lim_{(h,k) \to (0,0)} \Delta f = 0 . \]
Therefore:
\[ \text{joint differentiability} \Rightarrow \text{joint continuity} . \]
The opposite need not be true. Note that even in functions of one variable continuity does not necessarily imply differentiability (see left-hand plot in Fig. 2.20).

Note: A function can have both partial derivatives but fail to be even joint continuous. Therefore the existence of partial derivatives is not enough for joint differentiability (or even joint continuity). The condition for this is given by:

Theorem: If $f_x$ and $f_y$ both exist and are both joint continuous then the function $f$ is joint differentiable.

Another way of writing the joint differentiability condition is to set (for increments)
\[ h = \delta x, \quad k = \delta y \quad \text{and} \quad \Delta f = \delta f . \]
Hence Eq. (2.1) becomes
\[ \delta f = \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + (\rho_1\,\delta x + \rho_2\,\delta y) \]
and so
(Total change in $f$) = (change due to $\delta x$) + (change due to $\delta y$) + (error terms).

2.6 Directional Derivative

Recall that partial derivatives are obtained by letting only one of the variables vary. Therefore partial derivatives are derivatives along the $x$ and $y$ directions respectively (see Fig. 2.22). Suppose we wish to find derivatives in any direction making an angle $\alpha$ with the $x$-axis (see Fig. 2.23). Then there are variations in both directions with
\[ \delta x = t \cos\alpha , \qquad \delta y = t \sin\alpha . \]

Fig. 2.22. Partial derivatives as derivatives along the $x$ and $y$ directions.

Fig. 2.23. The components of the length $t$ along the $x$ and $y$ axes.

Definition: The directional derivative of a function $f$ along a direction making an angle $\alpha$ with the $x$-axis is given by:
\[ D_\alpha f = \lim_{t \to 0} \frac{\delta f}{t} . \]
Substituting for $\delta f$ given by
\[ \delta f = \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + (\rho_1\,\delta x + \rho_2\,\delta y) \]
and for $\delta x$ and $\delta y$ gives
\[ \delta f = \frac{\partial f}{\partial x}\, t\cos\alpha + \frac{\partial f}{\partial y}\, t\sin\alpha + \rho_1 t \cos\alpha + \rho_2 t \sin\alpha . \]
Hence
\[ D_\alpha f = \lim_{t \to 0} \frac{\delta f}{t} = \frac{\partial f}{\partial x} \cos\alpha + \frac{\partial f}{\partial y} \sin\alpha \]
since $\rho_1$ and $\rho_2 \to 0$ as $t \to 0$.

Note: Clearly when $\alpha = 0$, $D_\alpha f = \partial f/\partial x$ and when $\alpha = \pi/2$, $D_\alpha f = \partial f/\partial y$.

Example: Calculate $D_{\pi/4} f$ for $f = \sin(x + y^2)$.
\[ D_{\pi/4} f = \cos\frac{\pi}{4}\,\frac{\partial f}{\partial x} + \sin\frac{\pi}{4}\,\frac{\partial f}{\partial y} = \frac{1}{\sqrt{2}} \cos(x + y^2) + \frac{1}{\sqrt{2}}\,(2y) \cos(x + y^2) = \left( \frac{1}{\sqrt{2}} + \sqrt{2}\, y \right) \cos(x + y^2) . \]

We can also generalise the expression
\[ \frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} \]
which is the expression for differentiating a function of a function of one variable. Here, instead of a straight line at an angle $\alpha$, consider the change in $f(x, y)$ along a curve given by $x = x(t)$, $y = y(t)$ (i.e. a parametric curve). For a general curve defined by $x(t)$ and $y(t)$ the value of $t$ defines the point along the curve (see Fig. 2.24).

Fig. 2.24. The parametric curve where $x = x(t)$ and $y = y(t)$. The value of $t$ determines the location along the curve.

For example, a circle of radius $a$ can be defined parametrically by
\[ x = x(t) = a \cos t , \qquad y = y(t) = a \sin t \]
where $0 \le t \le 2\pi$. Note that $x^2 + y^2 = a^2$, the square of the radius.

Theorem: Consider a joint differentiable function $f(x, y)$ along the curve $x = x(t)$, $y = y(t)$, where $x$ and $y$ are differentiable with respect to $t$. Then $f(x(t), y(t))$ depends on $t$ only and we have:
\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt} . \]
This represents the derivative of the function $f(x, y)$ along the parametric curve $(x(t), y(t))$ (see Fig. 2.25).

Proof: For a small change $t \to t + \delta t$, we have a small change $\delta x$ in $x$ and $\delta y$ in $y$. Using the one variable formulae:
\[ \delta x = \frac{dx}{dt}\,\delta t + \text{error}, \qquad \delta y = \frac{dy}{dt}\,\delta t + \text{error} . \]
Then the change in $f$ becomes
\[ \delta f = \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + \text{error terms} = \frac{\partial f}{\partial x}\left( \frac{dx}{dt}\,\delta t + \text{error} \right) + \frac{\partial f}{\partial y}\left( \frac{dy}{dt}\,\delta t + \text{error} \right) . \]
Therefore
\[ \frac{\delta f}{\delta t} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt} + \text{"error terms."} \]
In the limit as $\delta t \to 0$ the "error terms" $\to 0$. Hence
\[ \lim_{\delta t \to 0} \frac{\delta f}{\delta t} = \frac{df}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt} . \]

Note: $df/dt$ means "rate of change of height $z = f(x(t), y(t))$ as $t$ varies along the curve."

Note: There exist two ways of finding $df/dt$: either direct substitution of $x(t)$ and $y(t)$ in $f(x, y)$, or use of the above formula.

Example: Find the derivative of $f(x, y) = xy^3$ along the curve $x(t) = \cos t$, $y(t) = \sin t$.

We have
\[ \frac{\partial f}{\partial x} = y^3 = \sin^3 t , \qquad \frac{\partial f}{\partial y} = 3xy^2 = 3 \cos t \sin^2 t , \qquad \frac{dx}{dt} = -\sin t , \qquad \frac{dy}{dt} = \cos t \]
and hence, from the formula,
\[ \frac{df}{dt} = (\sin^3 t)(-\sin t) + (3 \cos t \sin^2 t)(\cos t) . \]
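The two ways of finding $df/dt$ mentioned in the note above can be compared numerically for this example. The sketch below is plain Python and is not part of the original notes; the function names `df_dt_chain` and `df_dt_direct`, the sample value of $t$ and the step size are illustrative assumptions. It evaluates the chain-rule formula for $f(x, y) = xy^3$ on the curve $x = \cos t$, $y = \sin t$ and compares it with a central-difference derivative of the substituted function $f(\cos t, \sin t)$.

```python
import math


def f(x, y):
    # Example function from the notes: f(x, y) = x * y^3
    return x * y**3


def df_dt_chain(t):
    # Chain rule: df/dt = f_x * dx/dt + f_y * dy/dt on x = cos t, y = sin t
    x, y = math.cos(t), math.sin(t)
    f_x = y**3
    f_y = 3 * x * y**2
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return f_x * dx_dt + f_y * dy_dt


def df_dt_direct(t, dt=1e-6):
    # Direct substitution: differentiate F(t) = f(cos t, sin t) numerically
    def F(s):
        return f(math.cos(s), math.sin(s))
    return (F(t + dt) - F(t - dt)) / (2 * dt)


t0 = 0.7                                 # arbitrary parameter value
chain_value = df_dt_chain(t0)
direct_value = df_dt_direct(t0)
print("chain rule:", chain_value)
print("direct    :", direct_value)
```

The two printed values should agree up to the finite-difference error, which is the numerical counterpart of the "error terms" that vanish in the proof as $\delta t \to 0$.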
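2.7 Chain Rule

Often we wish to express our functions in terms of new variables, say $f = f(x, y)$, $x = x(r, \theta) = r\cos\theta$, $y = y(r, \theta) = r\sin\theta$. How do we find $\partial f/\partial r$, $\partial f/\partial\theta$ in terms of $\partial f/\partial x$, $\partial f/\partial y$ or vice versa? Suppose in general that $x$ and $y$ are functions of two variables $s$ and $t$:
\[ x = x(t, s), \qquad y = y(t, s) . \]
Keeping $s$ fixed ($s = s_0$), these equations define a curve:
\[ x = x(t, s_0), \qquad y = y(t, s_0) . \]

Fig. 2.25. The derivative of the function $f(x, y)$ along the parametric curve $(x(t), y(t))$.

By the previous theorem we can find $df/dt$, which is the derivative along the curve. We have
\[ \left. \frac{df}{dt} \right|_{s = s_0} = \frac{\partial f}{\partial x} \left. \frac{dx}{dt} \right|_{s = s_0} + \frac{\partial f}{\partial y} \left. \frac{dy}{dt} \right|_{s = s_0} . \]
But calculating the total derivative of $x(t, s)$ and $y(t, s)$ with respect to $t$ while keeping $s$ at a fixed value is equivalent to taking the partial derivative. Hence
\[ \frac{\partial f}{\partial t} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial t} . \]
Similarly,
\[ \frac{\partial f}{\partial s} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial s} . \]
In both these expressions the $\partial f/\partial x$ term represents the change in $f$ due to $x$ changing and the $\partial f/\partial y$ term represents the change in $f$ due to $y$ changing.

Note: This is the generalisation of
\[ \frac{df}{dt} = \frac{df}{dx}\,\frac{dx}{dt} \]
in one dimension. Here we have two expressions, one for $s$ and one for $t$.

Example: Given $x = x(r, \theta) = r\cos\theta$ and $y = y(r, \theta) = r\sin\theta$ and $f = f(x, y)$, find expressions for $\partial f/\partial r$ and $\partial f/\partial\theta$.

Using the chain rule we have
\[ \frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial r} = f_x \cos\theta + f_y \sin\theta \]
\[ \frac{\partial f}{\partial \theta} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial \theta} = f_x (-r\sin\theta) + f_y (r\cos\theta) . \]
Therefore we have expressed $\partial f/\partial r$ and $\partial f/\partial\theta$ in terms of $\partial f/\partial x$ and $\partial f/\partial y$.

Similarly we have the chain rule for functions of more than two variables. Consider the example of a function $f = f(x, y, z)$ where $x = x(u, v, w)$, $y = y(u, v, w)$, $z = z(u, v, w)$. The chain rule for a function of three variables gives:
\[ \frac{\partial f}{\partial u} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial u} + \frac{\partial f}{\partial z}\,\frac{\partial z}{\partial u} \]
\[ \frac{\partial f}{\partial v} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial v} + \frac{\partial f}{\partial z}\,\frac{\partial z}{\partial v} \]
\[ \frac{\partial f}{\partial w} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial w} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial w} + \frac{\partial f}{\partial z}\,\frac{\partial z}{\partial w} . \]

Remark: Sometimes we can invert the functions $x = x(t, s)$ and $y = y(t, s)$ and find $t = t(x, y)$ and $s = s(x, y)$. One such example is polar coordinates, $r = r(x, y)$ and $\theta = \theta(x, y)$. Then
\[ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial t}\,\frac{\partial t}{\partial x} + \frac{\partial f}{\partial s}\,\frac{\partial s}{\partial x} , \qquad \frac{\partial f}{\partial y} = \frac{\partial f}{\partial t}\,\frac{\partial t}{\partial y} + \frac{\partial f}{\partial s}\,\frac{\partial s}{\partial y} . \]
However, often it is easier to find $\partial f/\partial x$ and $\partial f/\partial y$ from the above formula for $\partial f/\partial t$ and $\partial f/\partial s$.

Remark: We can also use the chain rule to obtain higher derivatives. For example, $\partial^2 f/\partial x^2$, $\partial^2 f/\partial y^2$, etc. in terms of $(r, \theta)$.

Example: Express
\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \]
where $f = f(x, y)$, in terms of polar coordinates $(r, \theta)$.

Recall that $x = r\cos\theta$, $y = r\sin\theta$. First we find $\partial f/\partial r$ and $\partial f/\partial\theta$ in terms of $\partial f/\partial x$ and $\partial f/\partial y$ using the chain rule:
\[ \frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial r} = \cos\theta\,\frac{\partial f}{\partial x} + \sin\theta\,\frac{\partial f}{\partial y} \qquad (2.2) \]
\[ \frac{\partial f}{\partial \theta} = \frac{\partial f}{\partial x}\,\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\,\frac{\partial y}{\partial \theta} = -r\sin\theta\,\frac{\partial f}{\partial x} + r\cos\theta\,\frac{\partial f}{\partial y} \qquad (2.3) \]
Use these to find $\partial f/\partial x$ and $\partial f/\partial y$ in terms of $\partial f/\partial r$ and $\partial f/\partial\theta$. Multiply Eq. (2.2) by $\cos\theta$ and Eq. (2.3) by $-\sin\theta/r$ and then add. This gives:
\[ (\cos^2\theta + \sin^2\theta)\,\frac{\partial f}{\partial x} = \cos\theta\,\frac{\partial f}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial f}{\partial \theta} \]
or
\[ \frac{\partial f}{\partial x} = \cos\theta\,\frac{\partial f}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial f}{\partial \theta} . \qquad (2.4) \]
Similarly:
\[ \frac{\partial f}{\partial y} = \sin\theta\,\frac{\partial f}{\partial r} + \frac{\cos\theta}{r}\,\frac{\partial f}{\partial \theta} . \qquad (2.5) \]
These expressions apply to any function $f$; we may therefore write them symbolically as operators applied to any function:
\[ \frac{\partial}{\partial x} = \cos\theta\,\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial}{\partial \theta} \qquad (2.6) \]
\[ \frac{\partial}{\partial y} = \sin\theta\,\frac{\partial}{\partial r} + \frac{\cos\theta}{r}\,\frac{\partial}{\partial \theta} \qquad (2.7) \]
Now applying the operator in Eq. (2.6) twice to $f$ gives:
\[ \frac{\partial^2 f}{\partial x^2} \equiv \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \left( \cos\theta\,\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial}{\partial \theta} \right) \frac{\partial f}{\partial x} = \cos\theta\,\frac{\partial}{\partial r}\left(\frac{\partial f}{\partial x}\right) - \frac{\sin\theta}{r}\,\frac{\partial}{\partial \theta}\left(\frac{\partial f}{\partial x}\right) \]
\[ = \cos\theta\,\frac{\partial}{\partial r}\left( \cos\theta\,\frac{\partial f}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial f}{\partial \theta} \right) - \frac{\sin\theta}{r}\,\frac{\partial}{\partial \theta}\left( \cos\theta\,\frac{\partial f}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial f}{\partial \theta} \right) . \]
We can now perform the differentiation using the product rule and using the fact that $f_{r\theta} = f_{\theta r}$. This gives:
\[ \frac{\partial^2 f}{\partial x^2} = \cos^2\theta\,\frac{\partial^2 f}{\partial r^2} - \frac{2\sin\theta\cos\theta}{r}\,\frac{\partial^2 f}{\partial r \partial \theta} + \frac{\sin^2\theta}{r}\,\frac{\partial f}{\partial r} + \frac{\sin^2\theta}{r^2}\,\frac{\partial^2 f}{\partial \theta^2} + \frac{2\sin\theta\cos\theta}{r^2}\,\frac{\partial f}{\partial \theta} . \]
Similarly we find
\[ \frac{\partial^2 f}{\partial y^2} = \sin^2\theta\,\frac{\partial^2 f}{\partial r^2} + \frac{2\sin\theta\cos\theta}{r}\,\frac{\partial^2 f}{\partial r \partial \theta} + \frac{\cos^2\theta}{r}\,\frac{\partial f}{\partial r} + \frac{\cos^2\theta}{r^2}\,\frac{\partial^2 f}{\partial \theta^2} - \frac{2\sin\theta\cos\theta}{r^2}\,\frac{\partial f}{\partial \theta} . \]
Adding these together gives
\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r}\,\frac{\partial f}{\partial r} + \frac{1}{r^2}\,\frac{\partial^2 f}{\partial \theta^2} . \]

2.8 Taylor's Theorem for Functions of Two Variables

Here we consider a generalisation of Taylor's theorem for functions of one variable. This will be useful for finding stationary points of functions of two variables. First we introduce a new notation. We define a differential operator $\tilde{D}$ by
\[ \tilde{D} \equiv h\,\frac{\partial}{\partial x} + k\,\frac{\partial}{\partial y} \]
where $h$ and $k$ are fixed constants. Therefore, if we apply the operator to a function $f$ it is taken to mean
\[ \tilde{D} f \equiv \tilde{D}(f) = \left( h\,\frac{\partial}{\partial x} + k\,\frac{\partial}{\partial y} \right) f = h\,\frac{\partial f}{\partial x} + k\,\frac{\partial f}{\partial y} . \]
Similarly
\[ \tilde{D}^2 f \equiv \tilde{D}(\tilde{D} f) = \left( h\,\frac{\partial}{\partial x} + k\,\frac{\partial}{\partial y} \right) \left( h\,\frac{\partial f}{\partial x} + k\,\frac{\partial f}{\partial y} \right) = h^2\,\frac{\partial^2 f}{\partial x^2} + hk\,\frac{\partial^2 f}{\partial x \partial y} + kh\,\frac{\partial^2 f}{\partial y \partial x} + k^2\,\frac{\partial^2 f}{\partial y^2} . \]
But the second and third terms in the last expression are equal and so
\[ \tilde{D}^2 f = h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy} . \]
Similarly we can define $\tilde{D}^3 f$, $\tilde{D}^4 f$, $\tilde{D}^n f$, etc.

Theorem: Assume the function $f(x, y)$ has joint continuous partial derivatives up to order $n$. Then
\[ f(a + h, b + k) = f(a, b) + (\tilde{D} f)_{(a,b)} + \frac{1}{2!} (\tilde{D}^2 f)_{(a,b)} + \cdots + \frac{1}{(n-1)!} (\tilde{D}^{(n-1)} f)_{(a,b)} + E \]
where $E$ is the small error term given by
\[ E = \frac{1}{n!} (\tilde{D}^n f)_{(a + \theta h,\, b + \theta k)} \]
for some $\theta$ with $0 < \theta < 1$.

Proof: Define a new function of one variable by
\[ F(t) = f(a + ht, b + kt) \]
where $t$ is varying. Note that $F(0) = f(a, b)$ and $F(1) = f(a + h, b + k)$ and
\[ \frac{dF}{dt} = \frac{d}{dt} \left[ f(a + ht, b + kt) \right] = \frac{\partial f}{\partial x}\,\frac{d(a + ht)}{dt} + \frac{\partial f}{\partial y}\,\frac{d(b + kt)}{dt} = h\,\frac{\partial f}{\partial x} + k\,\frac{\partial f}{\partial y} . \]
Similarly
\[ \frac{d^2 F}{dt^2} = \frac{d}{dt} (\tilde{D} f) = \tilde{D}(\tilde{D} f) = \tilde{D}^2 f \]
and
\[ \frac{d^n F}{dt^n} = \tilde{D}^n f . \]
Now, by Taylor's theorem for one variable, we have
\[ F(1) = F(0) + \frac{dF(0)}{dt} + \frac{1}{2!}\,\frac{d^2 F(0)}{dt^2} + \cdots + \frac{1}{(n-1)!}\,\frac{d^{n-1} F(0)}{dt^{n-1}} + \frac{1}{n!}\,\frac{d^n F(\theta)}{dt^n} \]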
dt (n − 1)! dt n−1 n ! dt n where 0 < θ < 1. Substituting into the formula for F(1) and the expressions for the higher derivatives we obtain the theorem. 2.9 Quadratic Forms Consider a quadratic function: 2 f (x, y) = ax + 2bx y + cy 2 where a , b and c are constants. Let 2 = ac − b = 0 . Fig. 2.26. Examples of a (a) minimum, (b) a maximum and (c) a saddle point in the surfaces defined by the function f (x, y). If = 0, we have proved nothing. One can in this case show that either f (x, y) = 0 for all (x, y) (i.e. the surface is the horizontal plane) or the surface resembles f (x, y) = cy 2 . This corresponds to a flat-bottomed valley – neither a Then: (i) If > 0, then a f (x, y) > 0 for all (x, y) = (0, 0). (ii) If < 0, then f (x, y) can take either sign for values of (x, y) near (0, 0). maximum, minimum or a saddle point. Proof : = ac − b2 > 0 implies a = 0. Consider For functions of one variable, f (x), the function is stationary at x = a whenever (d f /dx)|x=a = 0 (or f (a) = 0), i.e. the curve has a horizontal tangent. a f (x, y) = a 2 x 2 + 2abx y + acy 2 . 2.10 Stationary Points of Functions of Two Variables The nature of stationary points is given by the quadratic term in Taylor’s expansion: We can “complete the square” and rewrite this as a f (x, y) = (ax + by)2 + (ac − b2 )y 2 = (ax + by)2 + y 2 . But each of these two terms are positive, since > 0 by assumption. Therefore their sum can only vanish if x = y = 0; otherwise a f (x, y) > 0 which proves (i). = ac − b2 < 0 implies that we can no longer necessarily have a = 0. If a = c = 0, then f (x, y) = 2bx y which takes opposite signs when x = y and x = −y . Now suppose a = 0, say a > 0. Then the above expression for y = 0, x = 0, is f = ax 2 > 0. On the other hand for (x, y) = (0, 0) on the line ax + by = 0, f = (/a)y 2 < 0 (since < 0 by assumption). If c = 0, a similar argument holds. Hence this proves (ii). 
There are three important consequences of this: 1) If > 0 and a > 0, then f > 0 for all (x, y) = (0, 0) and the surface defined by f is a minimum (see Fig. 2.26(a)). 2) If > 0 and a < 0, then f < 0 for all (x, y) = (0, 0) and the surface defined by f is a maximum (see Fig. 2.26(b)). 3) If < 0, then f takes either sign arbitrarily close to (0, 0) and the surface defined by f is a saddle (see Fig. 2.26(c)). f (a + h) = f (a) + h f (a) + h2 f (a) + E . 2! The term f (a) determines the nature of the stationary point. If f (a) > 0 then the point is a minimum. If f (a) < 0 then the point is a maximum. For functions of two variables, f (x, y) has a maximum at the point (a, b) whenever f (a + h, b + k) − f (a, b) < 0 for sufficiently small (h, k). Note: Such a point is a maximum in x alone (with y fixed) and in y alone (with x fixed). Hence ∂ f /∂ x = 0 = ∂ f /∂ y by the one variable theory. This suggests the following definition: Definition: The point (a, b) is a stationary point of the function f (x, y) whenever ∂f ∂f =0= at (a, b) . ∂x ∂y Now consider Taylor’s theorem for a function of two variables: f (a + h, b + k) = f (a, b) + h f x + k f y 76 2 Functions of More than One Variable + 1 2 h f x x + 2hk f x y + k 2 f yy + Error 2! 2.11 Lagrange Multipliers Example: Find the stationary points and their nature for the function Adopting the notation p = fx , 77 f (x, y) = x 4 + 4x 2 y 2 − 2x 2 + 2 y 2 + 1 . q = fy, r = fx x , t = f yy , s = fx y we can write the Taylor expansion as f (a + h, b + k) = f (a, b) + ph + qk 1 2 + r h + 2shk + tk 2 + Error The stationary points are given by the solutions of p = f x = 4x 3 + 8x y 2 − 4x = 4x(x 2 + 2 y 2 − 1) = 0 2 2 q = f y = 8x y + 4 y = 4 y(1 + 2x ) = 0 . (2.8) (2.9) where the stationary points occur where p = q = 0. Therefore, at a stationary point, > 0 Eq. (2.9) implies that y = 0. Then Eq. (2.8) gives 4x(x 2 −1) = 0 giving the stationary points at x = 0, x = ±1 at y = 0. 
The three stationary points are (0, 0), (1, 0) and (−1, 0). 1 2 f (a + h, b + k) − f (a, b) = r h + 2hks + tk 2 + Error . 2 r = f x x = 12x 2 + 8 y 2 − 4 2! Now, by definition, this is a maximum, minimum, or saddle according to whether the quadratic term on the right-hand side is negative, positive or either for small (h, k). Using the properties of quadratic functions and analysing (r h 2 + 2shk + tk 2 ) we conclude that: Stationary points are given by p = q = 0 whenever the surface f (x, y) has horizontal tangents at (a, b) and there exist three possibilities as shown in Fig. 2.27. From this we conclude that stationary points are given by p = f x = 0 and q = f y = 0. Their nature depends on r = f x x , s = f x y and t = f yy . Define the quantity = r t − s 2 . Then we have: 1) If > 0 and r > 0, then the point is a minimum. 2) If > 0 and r < 0, then the point is a maximum. 3) If < 0, then the point is a saddle. Since 1+2x 2 For the nature of the points we need to look at: s = f x y = 16x y t = f yy = 8x 2 + 4 = r t − s2 These give us the values shown in Table 2.1 below. 2.11 Lagrange Multipliers Lagrange multipliers can be used to find the maximum and minimum of functions f (x, y) subject to some constraint φ(x, y) = 0, where φ is a given function. Note: Any condition such as x 2 +y 2 = 1 can be written as φ(x, y) = x 2 +y 2 −1 = 0. In principle, we can solve φ(x, y) = 0 for y in terms of x and then find the minimum or maximum of f (x, y(x)) – as a function of one variable. In practice this is often difficult or impossible, so we proceed as follows: Consider the families of curves f (x, y) = c for different values of c. The condition φ(x, y) = 0 also describes a fixed curve in the (x, y) plane. The diagrams in Fig. 2.28 suggest that the function f attains a maximum or minimum Table 2.1. The values of r , s, t and at the three stationary points. Point Fig. 2.27. 
Examples of a (a) maximum, (b) a minimum and (c) a saddle at the point $(a, b)$ on the surfaces defined by the function $f(x, y)$.

Point        r     s     t     rt − s²    Nature
(0, 0)      −4     0     4     −16        Saddle
(1, 0)       8     0    12      96        Minimum
(−1, 0)      8     0    12      96        Minimum

(subject to $\phi(x, y) = 0$) precisely when the curve $f = c$ ($f = c_3$ in the diagrams) is tangent to the curve $\phi(x, y) = 0$.

Fig. 2.28. The function $f$ attains a maximum (subject to $\phi(x, y) = 0$) when the curve $f = c_3$ is tangent to the curve $\phi(x, y) = 0$.

Therefore formally we can represent $\phi(x, y) = 0$ parametrically as $x = x(t)$, $y = y(t)$ so that $\phi(x(t), y(t)) = 0$ for all $t$. Then $f(x, y)$ is stationary whenever

\[ \frac{df}{dt} = 0. \tag{2.10} \]

Also we have $\phi(x(t), y(t)) = 0$ for all $t$. This implies (trivially by differentiation)

\[ \frac{d\phi}{dt} = 0 \tag{2.11} \]

so that at the stationary points we have:

\[ \frac{df}{dt} = f_x \frac{dx}{dt} + f_y \frac{dy}{dt} = 0 \tag{2.12} \]

\[ \frac{d\phi}{dt} = \phi_x \frac{dx}{dt} + \phi_y \frac{dy}{dt} = 0 \tag{2.13} \]

Hence at such points the following ratios all hold:

\[ \frac{f_x}{f_y} = \frac{\phi_x}{\phi_y} = -\frac{dy/dt}{dx/dt}. \]

Therefore say

\[ \frac{f_x}{\phi_x} = \frac{f_y}{\phi_y} = -\lambda \]

where $\lambda$ is a constant. Therefore we have three equations:

\[ f_x + \lambda \phi_x = 0 \tag{2.14} \]

\[ f_y + \lambda \phi_y = 0 \tag{2.15} \]

\[ \phi(x, y) = 0. \tag{2.16} \]

These are three conditions for three unknowns $x$, $y$ and $\lambda$. The constant $\lambda$ is called the Lagrange multiplier. These equations give the stationary points $(x, y)$. The quantity $\lambda$ is a constant in each case but different at different stationary points.

Example: Find the stationary points (and the values of the function at these points) of $f(x, y) = xy$ subject to the constraint $\phi(x, y) = x^2 + y^2 - 1 = 0$.

Substituting in Eqs. (2.14)–(2.16) above gives:

\[ \text{Eq. (2.14)} \rightarrow y + 2\lambda x = 0 \tag{2.17} \]

\[ \text{Eq. (2.15)} \rightarrow x + 2\lambda y = 0 \tag{2.18} \]

\[ \text{Eq. (2.16)} \rightarrow x^2 + y^2 - 1 = 0. \tag{2.19} \]

Equation (2.17) gives $-y/x = 2\lambda$. Equation (2.18) gives $-y/x = 1/(2\lambda)$. Equating these gives

\[ 2\lambda = \frac{1}{2\lambda} \;\Rightarrow\; 4\lambda^2 = 1. \]

Hence

\[ \lambda = \pm\frac{1}{2}. \]

Putting $\lambda = 1/2$ in Eq. (2.17) and Eq. (2.18) gives $y + x = 0$. Putting $\lambda = -1/2$ in Eq. (2.17) and Eq. (2.18) gives $-y + x = 0$. Therefore $x^2 = y^2$ always; substituting in Eq. (2.19) gives $2x^2 = 1$. Hence

\[ x = \pm\frac{1}{\sqrt{2}}, \qquad y = \pm x. \]

Therefore altogether there exist four possibilities:

\[ (x, y) = \left(+\tfrac{1}{\sqrt{2}}, +\tfrac{1}{\sqrt{2}}\right), \quad \lambda = -\tfrac{1}{2} \]

\[ (x, y) = \left(-\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right), \quad \lambda = -\tfrac{1}{2} \]

\[ (x, y) = \left(+\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right), \quad \lambda = \tfrac{1}{2} \]

\[ (x, y) = \left(-\tfrac{1}{\sqrt{2}}, +\tfrac{1}{\sqrt{2}}\right), \quad \lambda = \tfrac{1}{2} \]

For the stationary values of $f$ the above results give two distinct values:

\[ f(x, y) = \tfrac{1}{2} \ \text{(maximum)}, \qquad f(x, y) = -\tfrac{1}{2} \ \text{(minimum)}. \]

We can see what is happening from Fig. 2.29.

Note: In such problems do not investigate the nature of such points except by looking at values of $f$: $f = 1/2$ is larger so it is a maximum; $f = -1/2$ is smaller so it is a minimum.

Fig. 2.29. Plots of the hyperbolas defined by $f(x, y) = xy$ for $f = \pm 1$ and $\pm 1/2$, and the circle defined by the condition $x^2 + y^2 - 1 = 0$.

A similar method can be used for functions of three variables to find stationary points of $f(x, y, z)$ subject to the condition $\phi(x, y, z) = 0$. Here, by analogy with the case of two variables we obtain:

\[ f_x + \lambda \phi_x = 0 \tag{2.20} \]

\[ f_y + \lambda \phi_y = 0 \tag{2.21} \]

\[ f_z + \lambda \phi_z = 0 \tag{2.22} \]

\[ \phi(x, y, z) = 0. \tag{2.23} \]

Therefore we have four equations (Eqs. (2.20)–(2.23)) for the four unknowns $x$, $y$, $z$, $\lambda$.

Example: Find the maximum and minimum of $f(x, y, z) = x^2 y^2 z^2$ subject to $x^2 + y^2 + z^2 = a^2$ (i.e. points on the sphere of radius $a$).

Here $f = x^2 y^2 z^2$ and $\phi(x, y, z) = x^2 + y^2 + z^2 - a^2 = 0$. Hence

\[ f_x + \lambda \phi_x = 2x y^2 z^2 + 2\lambda x = 2x(y^2 z^2 + \lambda) = 0 \tag{2.24} \]

\[ f_y + \lambda \phi_y = 2x^2 y z^2 + 2\lambda y = 2y(x^2 z^2 + \lambda) = 0 \tag{2.25} \]

\[ f_z + \lambda \phi_z = 2x^2 y^2 z + 2\lambda z = 2z(x^2 y^2 + \lambda) = 0 \tag{2.26} \]

\[ x^2 + y^2 + z^2 - a^2 = 0. \tag{2.27} \]

Therefore, for non-zero $x$, $y$ and $z$ we must have

\[ y^2 z^2 = -\lambda, \qquad x^2 z^2 = -\lambda, \qquad x^2 y^2 = -\lambda \]

and so

\[ x^2 = y^2 = z^2. \]

Substituting in Eq. (2.27) gives $3x^2 = a^2$. Hence

\[ x = \pm\frac{a}{\sqrt{3}}, \qquad y = \pm x, \qquad z = \pm x, \qquad \lambda = -\frac{a^4}{9} \]

and so

\[ x = \pm\frac{a}{\sqrt{3}}, \qquad y = \pm\frac{a}{\sqrt{3}}, \qquad z = \pm\frac{a}{\sqrt{3}} \]

independently. Therefore there exist eight stationary points with $x$, $y$ and $z$ all non-zero. The remaining stationary points are the points of the sphere at which at least one of $x$, $y$, $z$ vanishes (Eqs. (2.24)–(2.26) are then satisfied with $\lambda = 0$); note that the origin $(0, 0, 0)$ itself does not lie on the sphere when $a \neq 0$. At these points $f = 0$ and since $f = x^2 y^2 z^2 \ge 0$ for all $(x, y, z)$, they must be minima. At the eight points above, $f(x, y, z) = a^6/27$ which is a maximum. Now since these are maximum points we have, everywhere on the sphere,

\[ \left(x^2 y^2 z^2\right)^{1/3} \le \frac{a^2}{3} = \frac{x^2 + y^2 + z^2}{3}. \]

Therefore the geometric mean is less than or equal to the arithmetic mean for the three positive numbers $x^2$, $y^2$, $z^2$.

2.12 Inverse Functions

A function with domain $D$ is called one-to-one if no two elements of $D$ have the same image; i.e. $f(x_1) \neq f(x_2)$ for all $x_1 \neq x_2$. For example, the function $f(x) = x^3$ is one-to-one (see Fig. 2.30a) because each value of the function $f(x)$ corresponds to a unique value of $x$. However, the function $f(x) = x^2$ is not one-to-one because each positive value of $f(x)$ corresponds to two values of $x$ (see Fig. 2.30b).

Definition: Let $f$ be a one-to-one function with domain $D$ and range $R$. Then the inverse function $f^{-1}$ has domain $R$ and range $D$: it expresses $x$ as a function of $y$ so that

\[ y = f(x) \;\Leftrightarrow\; x = f^{-1}(y). \]

Example: Find the inverse function of $f(x) = x^3$.

Theorem: If $f$ is a one-to-one and continuous function defined on an interval, then its inverse $f^{-1}$ is also continuous.

Theorem: Let $f$ be a continuously differentiable function with inverse $f^{-1}$ in some region $R$. Let $p$ be a point in $R$ and let $f'(p) \neq 0$. Then there exists an open interval containing $p$ such that $f$ is one-to-one and the inverse function $f^{-1}$ is continuously differentiable.

The idea with the condition $f'(p) \neq 0$ in the theorem above is to avoid maxima and minima when dealing with the inverse function.
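The role of the condition $f'(p) \neq 0$ can be illustrated with a small numerical sketch. It is not part of the notes: it assumes the test function $f(x) = x^2$, restricts its inverse to $x > 0$, and uses the standard result that the inverse has slope $1/f'(p)$ at the point $f(p)$. All function names are illustrative.

```python
import math

def f(x):
    """Example function with a minimum at x = 0, where f'(0) = 0."""
    return x ** 2

def f_prime(x):
    return 2 * x

def f_inverse(y):
    """Inverse of f restricted to x > 0 (an assumption made for this sketch)."""
    return math.sqrt(y)

def central_difference(g, y, h=1e-6):
    """Numerical estimate of the derivative g'(y)."""
    return (g(y + h) - g(y - h)) / (2 * h)

# Away from the minimum: at p = 1 we have f'(p) = 2, so the inverse should
# have slope 1/2 at the point f(p) = 1.
p = 1.0
slope_of_inverse = central_difference(f_inverse, f(p))
expected_slope = 1.0 / f_prime(p)

# At the minimum p = 0: two different inputs share one image, so f is not
# one-to-one on any interval containing 0.
h = 0.1
same_image = f(-h) == f(h)

print(slope_of_inverse, expected_slope, same_image)
```

Near $p = 1$ the two slopes agree, while at the minimum $p = 0$ the inputs $\pm h$ collide, which is the situation the theorem excludes.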
Fig. 2.30. (a) Plot of the function $f(x) = x^3$; this is a one-to-one function. (b) Plot of the function $f(x) = x^2$; this is not a one-to-one function because, for example, $f(x) = 1$ could arise from $x = 1$ or $x = -1$.

To find the inverse of $f(x) = x^3$, write $y = x^3$, so that $x = y^{1/3}$. If we then interchange $x$ and $y$ we have $y = x^{1/3}$ so that $f^{-1}(x) = x^{1/3}$. Note that the graph of the inverse function $f^{-1}$ can be obtained by reflecting the graph of $f$ about the line $y = x$ (see Fig. 2.31).

Fig. 2.31. Plots of the function $y = f(x) = x^3$ and its inverse $y = x^{1/3}$.

2.13 Implicit Functions

Often functions define implicit relations between variables as in

\[ x^5 + x^2 y^3 + y^6 + 18xy + e^y = 0 \]

such that it is not easy (or even possible!) to obtain $y = f(x)$ explicitly. In general the equation $f(x, y) = 0$ defines some curve in the $(x, y)$ plane. Consider the curve shown in Fig. 2.32. Near the point $A$ we have an implicit function $y = F(x)$ but we cannot define such a function near the point $B$, for example. We therefore require conditions on the function $f(x, y)$ such that we can assume the existence of such implicit functions. This leads to the following theorem.

Fig. 2.32. A curve in the $x$-$y$ plane where we can define an implicit function in the vicinity of the point $A$ but not in the vicinity of the point $B$.

Theorem: (The Implicit Function Theorem.) Let $f(x, y)$ be defined on an open disc containing the point $(a, b)$ where $f(a, b) = 0$, $f_y(a, b) \neq 0$ and $f_x$, $f_y$ are continuous on the disc. Then $f(x, y) = 0$ defines $y$ as a function of $x$ near the point $(a, b)$.

This theorem allows $dy/dx$ to be calculated in the following way. Let $y = F(x)$. Hence

\[ f(x, y(x)) = 0. \tag{2.28} \]

We can use the chain rule to differentiate both sides of Eq. (2.28). This gives:

\[ \frac{\partial f}{\partial x}\frac{dx}{dx} + \frac{\partial f}{\partial y}\frac{dy}{dx} = 0 \]

or, because $dx/dx = 1$,

\[ \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} = 0 \]

and hence

\[ \frac{dy}{dx} = -\frac{f_x}{f_y}. \tag{2.29} \]

Example: Find $dy/dx$ if $x^3 + y^3 = 6xy$.

We can write this as $f(x, y) = x^3 + y^3 - 6xy = 0$ and then, using Eq. (2.29), we have

\[ \frac{dy}{dx} = -\frac{f_x}{f_y} = -\frac{3x^2 - 6y}{3y^2 - 6x} = -\frac{x^2 - 2y}{y^2 - 2x}. \]

The implicit function theorem can be extended to functions of more than two variables.

3 Multiple Integrals

Recall that for functions of one variable, $y = f(x)$, we obtain $\int_a^b f(x)\,dx$ by dividing the range $[a, b]$ into small intervals of width $\delta x_i$ and adding up areas of strips above the intervals. The total area is given by

\[ \sum_{i=1}^{n} \delta A_i \approx \sum_{i=1}^{n} f(x_i)\,\delta x_i \]

where $n$ is the number of intervals (see Fig. 3.1).

Fig. 3.1. The area underneath the curve $y = f(x)$ can be approximated by the sum of the areas of individual strips.

As we take smaller and smaller intervals, i.e. as $n \to \infty$, $|\delta x_i| \to 0$ and in the limit we get the exact area under the curve $f(x)$ between $x = a$ and $x = b$ as

\[ \int_a^b f(x)\,dx. \]

This limit is well defined for continuous functions.

3.1 Rectangular Regions

For functions of one variable, $f(x)$, we therefore integrate over an interval $[a, b]$. For functions of two variables, $f(x, y)$, we start by integrating over a rectangular area $R$ in the $x$-$y$ plane. The question is: Given $z = f(x, y)$, what is the volume under the surface $z = f(x, y)$ over the rectangle $R$? (See Fig. 3.2.)

Fig. 3.2. The volume underneath the surface $z = f(x, y)$ can be approximated by the sum of the volumes with the same base area and height given by the value of $z$.

The approach is to divide the region $R$ into identical small rectangles of area $\delta A_{ij} = \delta x_i\,\delta y_j$ (see Fig. 3.3). Therefore the vertical box in Fig. 3.2 has an approximate volume

\[ \delta V_{ij} = \text{height} \times \text{area} = f(x_i, y_j)\,\delta A_{ij}. \]

Fig. 3.3. The region $R$ is divided into rectangles of dimensions $\delta x_i$ by $\delta y_j$ and area $\delta A_{ij} = \delta x_i\,\delta y_j$.

Hence the total volume is approximately

\[ V \approx \sum_{i,j} f(x_i, y_j)\,\delta A_{ij} = \sum_{i,j} f(x_i, y_j)\,\delta x_i\,\delta y_j. \]

It can be shown that in the limit where the number of areas $\delta A_{ij} \to \infty$ and $|\delta A_{ij}| \to 0$, these sums tend to the same limit, provided the function $f(x, y)$ is jointly continuous. In the limit we have, by definition, the exact volume, $\iint_R f(x, y)\,dA$. In other words

\[ \lim_{\delta x, \delta y \to 0} \sum_R f(x, y)\,\delta x\,\delta y \equiv \iint_R f(x, y)\,dA = \text{required volume, } V. \]

To do this sum we can either fix $x$ and first sum over $\delta y_j$ and then sum the result over $\delta x_i$, or the other way round. Hence

\[ V \approx \sum_R f(x, y)\,\delta x\,\delta y = \sum_{\delta x}\left(\sum_{\delta y} f(x, y)\,\delta y\right)\delta x = \sum_{\delta y}\left(\sum_{\delta x} f(x, y)\,\delta x\right)\delta y. \]

For the two different ways of doing the sum we might expect:

\[ V = \lim_{\delta x}\sum_{\delta x}\left(\lim_{\delta y}\sum_{\delta y} f(x, y)\,\delta y\right)\delta x \tag{3.1} \]

\[ \phantom{V} = \lim_{\delta y}\sum_{\delta y}\left(\lim_{\delta x}\sum_{\delta x} f(x, y)\,\delta x\right)\delta y \tag{3.2} \]

where the $x$ in $f(x, y)$ is fixed in Eq. (3.1) and the $y$ in $f(x, y)$ is fixed in Eq. (3.2). This gives us a way of evaluating $V = \iint_R f(x, y)\,dA$. In Eq. (3.1) we can write the limit where $x$ is fixed as

\[ \lim_{\delta y}\sum_{\delta y} f(x, y)\,\delta y = \int f(x, y)\,dy = g(x), \text{ say.} \]

Then the second (outer) limit in Eq. (3.1) is

\[ \lim_{\delta x}\sum_{\delta x}\left(\lim_{\delta y}\sum_{\delta y} f(x, y)\,\delta y\right)\delta x = \lim_{\delta x}\sum_{\delta x} g(x)\,\delta x = \int g(x)\,dx. \]

We can treat Eq. (3.2) in a similar way doing the $x$ integration first and then the $y$ integration.

Therefore we have two ways of evaluating double integrals as repeated ordinary integrals. Let the region $R$ be defined by

\[ a \le x \le b \quad\text{and}\quad c \le y \le d \]

as shown in Fig. 3.4. Then

\[ V = \iint_R f(x, y)\,dA = \int_{x=a}^{x=b}\left(\int_{y=c}^{y=d} f(x, y)\,dy\right)dx \tag{3.3} \]

\[ \phantom{V = \iint_R f(x, y)\,dA} = \int_{y=c}^{y=d}\left(\int_{x=a}^{x=b} f(x, y)\,dx\right)dy \tag{3.4} \]

When we do the $y$ integration first in Eq. (3.3) we keep $x$ fixed. This results in an integrand that is only a function of $x$ which we then integrate from $x = a$ to $x = b$. We can think of this pictorially as shown in Fig. 3.5a. We first sum over small rectangles (with $x$ fixed) and then sum the result over $a \le x \le b$ to cover the whole region $R$.
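The repeated sums behind Eqs. (3.1)–(3.4) can be tried numerically. The sketch below is only an illustration under stated assumptions: the integrand $f(x, y) = xy + y^2$, the rectangle $0 \le x \le 1$, $0 \le y \le 2$ and the midpoint rule are choices made here, not taken from the notes. It sums over $y$ first with $x$ fixed, then the other way round.

```python
def f(x, y):
    """Illustrative, non-separable integrand (an assumption of this sketch)."""
    return x * y + y ** 2

def midpoints(lo, hi, n):
    """Midpoints of n equal sub-intervals of [lo, hi], and their width."""
    h = (hi - lo) / n
    return [lo + (k + 0.5) * h for k in range(n)], h

def sum_y_first(func, a, b, c, d, n):
    """Inner sum over y with x fixed, then outer sum over x (cf. Eq. (3.1))."""
    xs, dx = midpoints(a, b, n)
    ys, dy = midpoints(c, d, n)
    total = 0.0
    for x in xs:
        g = sum(func(x, y) * dy for y in ys)   # approximates g(x)
        total += g * dx
    return total

def sum_x_first(func, a, b, c, d, n):
    """Inner sum over x with y fixed, then outer sum over y (cf. Eq. (3.2))."""
    xs, dx = midpoints(a, b, n)
    ys, dy = midpoints(c, d, n)
    total = 0.0
    for y in ys:
        inner = sum(func(x, y) * dx for x in xs)
        total += inner * dy
    return total

v_y_first = sum_y_first(f, 0.0, 1.0, 0.0, 2.0, 200)
v_x_first = sum_x_first(f, 0.0, 1.0, 0.0, 2.0, 200)
exact = 11.0 / 3.0   # from integrating by hand: 1 + 8/3

print(v_y_first, v_x_first, exact)
```

Both orders agree with each other up to rounding and approach the hand-computed value as the number of strips grows.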
When we do the $x$ integration first in Eq. (3.4) we keep $y$ fixed. This results in an integrand that is only a function of $y$ which we then integrate from $y = c$ to $y = d$. We can think of this pictorially as shown in Fig. 3.5b. We first sum over small rectangles (with $y$ fixed) and then sum the result over $c \le y \le d$ to cover the whole region $R$. Therefore double integrals become repeated single integrals.

Fig. 3.4. The region of integration, $R$, corresponding to $a \le x \le b$ and $c \le y \le d$.

Fig. 3.5. Pictorial representation of the order of the integrations in Eqs. (3.3) and (3.4). (a) First sum over small rectangles with $x$ fixed and then over $a \le x \le b$ as in Eq. (3.3). (b) First sum over small rectangles with $y$ fixed and then over $c \le y \le d$ as in Eq. (3.4).

Example: Find the volume $V$ under $z = f(x, y) = x^2 y$ with base $R$, when $R$ is the rectangle $1 \le x \le 2$, $-3 \le y \le 4$.

First do it by integrating with respect to $y$ first keeping $x$ fixed. We have

\[ V = \iint_R x^2 y\,dA = \int_{x=1}^{x=2}\left(\int_{y=-3}^{y=4} x^2 y\,dy\right)dx = \int_{x=1}^{x=2}\left[\frac{x^2 y^2}{2}\right]_{y=-3}^{y=4} dx = \int_{x=1}^{x=2}\frac{7x^2}{2}\,dx = \left[\frac{7x^3}{6}\right]_{x=1}^{x=2} = \frac{49}{6}. \]

Now do it by integrating with respect to $x$ first keeping $y$ fixed. We have

\[ V = \iint_R x^2 y\,dA = \int_{y=-3}^{y=4}\left(\int_{x=1}^{x=2} x^2 y\,dx\right)dy = \int_{y=-3}^{y=4}\left[\frac{1}{3}x^3 y\right]_{x=1}^{x=2} dy = \int_{y=-3}^{y=4}\frac{7y}{3}\,dy = \left[\frac{7y^2}{6}\right]_{y=-3}^{y=4} = \frac{49}{6}. \]

Therefore we get the same answer regardless of the order in which we do the integrations.

In this particular example we could have separated the integrand into $x$ and $y$ parts. Hence

\[ V = \int_{x=1}^{x=2}\int_{y=-3}^{y=4} x^2 y\,dy\,dx = \int_{x=1}^{x=2} x^2\,dx \int_{y=-3}^{y=4} y\,dy = \frac{7}{3}\cdot\frac{7}{2} = \frac{49}{6}. \]

More generally, if the function $f(x, y)$ can be written as $f(x, y) = g(x)\,h(y)$, i.e. if the function is separable, and if the region $R$ is a rectangle, then we can write:

\[ \iint_R g(x)\,h(y)\,dA = \int_{x=a}^{x=b}\int_{y=c}^{y=d} g(x)\,h(y)\,dy\,dx = \left(\int_{x=a}^{x=b} g(x)\,dx\right)\left(\int_{y=c}^{y=d} h(y)\,dy\right) \]

and so the integral factors into the product of two integrals. However, note that $f(x, y)$ can rarely be expressed as a product (i.e. separable). More often we have functions such as $x^2\cos(xy)$ or $\ln(x^2 + y^2)$.

3.2 Non-rectangular Regions

For regions more complicated than rectangles we must think carefully about the limits of the single integrations. Let the region $R$ be a triangle bounded by the $x$-axis ($y = 0$), the vertical line $x = 1$ and the straight line $y = 2x$. The region is shown in Fig. 3.6.

Fig. 3.6. The triangular region bounded by $y = 0$, $x = 1$ and $y = 2x$.

To proceed to calculate the double integral we go back to approximating sums. First we sum over the vertical strips (as indicated in Fig. 3.7) with fixed $x$ and $0 \le y \le 2x$. In these strips the bottom limit is $y = 0$ and the top limit is $y = 2x$. Therefore, in the limit the contribution from the vertical strip is:

\[ \int_{y=0}^{y=2x} f(x, y)\,dy \]

where we have fixed $x$. Adding up all the vertical strips by integrating the result over all allowed $x$, i.e. $0 \le x \le 1$, is equivalent to

\[ \int_{x=0}^{x=1}\left(\int_{y=0}^{y=2x} f(x, y)\,dy\right)dx. \]

This covers the whole of the triangular region. From Fig. 3.7 we have

\[ \iint_R f(x, y)\,dA = \int_{x=0}^{x=1}\left(\int_{y=0}^{y=2x} f(x, y)\,dy\right)dx \]

where the integral from $y = 0$ to $y = 2x$ is done for fixed $x$.

We can also try the integration the other way round whereby we fix $y$ and integrate over $x$ first. This is equivalent to taking horizontal strips from $x = y/2$ to $x = 1$ for fixed $y$ and then integrating from $y = 0$ to $y = 2$. This is illustrated in Fig. 3.7b. This gives

\[ \iint_R f(x, y)\,dA = \int_{y=0}^{y=2}\left(\int_{x=y/2}^{x=1} f(x, y)\,dx\right)dy. \]

Fig. 3.7. The triangular region bounded by $y = 0$, $x = 1$ and $y = 2x$, with (a) vertical strips for fixed $x$ and varying $y$, and (b) horizontal strips for fixed $y$ and varying $x$.

Remark: The two ways will give exactly the same result. In practice it makes sense to choose the simplest method.

Example: Find $\iint_R f(x, y)\,dA$ for the triangular region bounded by $y = 0$, $x = 1$ and $y = 2x$ for the function $f(x, y) = x^2 y^2$.

For the first way we will first do the $y$ integration keeping $x$ fixed. This corresponds to taking vertical strips as shown in Fig. 3.7a. Then we integrate over $x$ from $x = 0$ to $x = 1$. We have

\[ \iint_R x^2 y^2\,dA = \int_{x=0}^{x=1}\left(\int_{y=0}^{y=2x} x^2 y^2\,dy\right)dx = \int_{x=0}^{x=1}\left[\frac{x^2 y^3}{3}\right]_{y=0}^{y=2x} dx = \int_{x=0}^{x=1}\frac{8x^5}{3}\,dx = \left[\frac{8x^6}{18}\right]_{x=0}^{x=1} = \frac{4}{9}. \]

For the second way we will first do the $x$ integration keeping $y$ fixed. This corresponds to taking horizontal strips as shown in Fig. 3.7b. Then we integrate over $y$ from $y = 0$ to $y = 2$. We have

\[ \iint_R x^2 y^2\,dA = \int_{y=0}^{y=2}\left(\int_{x=y/2}^{x=1} x^2 y^2\,dx\right)dy = \int_{y=0}^{y=2}\left[\frac{x^3 y^2}{3}\right]_{x=y/2}^{x=1} dy = \int_{y=0}^{y=2}\left(\frac{y^2}{3} - \frac{y^5}{24}\right)dy = \left[\frac{y^3}{9} - \frac{y^6}{6\times 24}\right]_{y=0}^{y=2} = \frac{4}{9}. \]

Note: This example is not separable because of the non-rectangular ($x$, $y$ dependence) of the limits.

Now consider the region $R$ inside the semi-circle defined by $x^2 + y^2 = a^2$ and $y \ge 0$. The region and the two possible strips that could be taken are shown in Fig. 3.8. If we fix $x$ then the vertical strips have $y$-limits given by

\[ 0 \le y \le +\sqrt{a^2 - x^2} \]

as in Fig. 3.8a. We can then allow $x$ to vary within the limits $-a \le x \le a$ so that the whole region is covered. Hence

\[ \iint_R f(x, y)\,dA = \int_{x=-a}^{x=+a}\left(\int_{y=0}^{y=+\sqrt{a^2 - x^2}} f(x, y)\,dy\right)dx. \]

Alternatively, if we fix $y$ then the horizontal strips have $x$-limits given by

\[ -\sqrt{a^2 - y^2} \le x \le +\sqrt{a^2 - y^2} \]

as in Fig. 3.8b. We can then allow $y$ to vary within the limits $0 \le y \le a$ so that the whole region is covered. Hence

\[ \iint_R f(x, y)\,dA = \int_{y=0}^{y=a}\left(\int_{x=-\sqrt{a^2 - y^2}}^{x=+\sqrt{a^2 - y^2}} f(x, y)\,dx\right)dy. \]

Note that we will obtain the same answer in each case. In practice we need to use the way that provides the simpler integration.

Fig. 3.8. The semi-circular region bounded by $x^2 + y^2 = a^2$ and $y \ge 0$, with (a) vertical strips for fixed $x$ and varying $y$, and (b) horizontal strips for fixed $y$ and varying $x$.

Remark: It is more convenient to use a simpler notation. Since

\[ \iint_R f(x, y)\,dA = \iint_R f(x, y)\,dx\,dy = \iint_R f(x, y)\,dy\,dx \]

we can, for example, write the first integral over the semi-circular region above (integrating with respect to $y$ first) as:

\[ \int_{x=-a}^{x=+a} dx \int_{y=0}^{y=\sqrt{a^2 - x^2}} f(x, y)\,dy \]

where we have dropped the brackets. This is to be interpreted as "do the right-hand side integration (with respect to $y$) first and then do the integration with respect to $x$".

Example: Evaluate the double integral

\[ \iint_R (2x^2 + y)\,dx\,dy \]

where $R$ is the region bounded by the line $y = x$ and the curve $y = x^2$.

The first step is always to calculate where the curves intersect and draw a diagram of the region of integration. The curves intersect where $y = x = x^2$. This gives

\[ x - x^2 = 0 \;\Rightarrow\; x = 1, \quad x = 0. \]

Therefore the points of intersection are $(0, 0)$ and $(1, 1)$. The resulting region is shown in Fig. 3.9.

Fig. 3.9. The region bounded by the line $y = x$ and the curve $y = x^2$, with (a) vertical strips for fixed $x$ and varying $y$, and (b) horizontal strips for fixed $y$ and varying $x$.

In the first method we will integrate with respect to $y$ first. Therefore we are integrating along the vertical strip $AB$ with fixed $x$ and then allow $x$ to move from 0 to 1 to cover the whole region. Here $A$ represents the start of the strip at $y = x^2$ and $B$ represents the end of the strip at $y = x$. This gives

\[ \int_{x=0}^{x=1} dx \int_{y=x^2}^{y=x} (2x^2 + y)\,dy = \int_{x=0}^{x=1} dx \left[2x^2 y + \frac{y^2}{2}\right]_{y=x^2}^{y=x} = \int_{x=0}^{x=1}\left(2x^3 + \frac{x^2}{2} - \frac{5x^4}{2}\right)dx = \left[\frac{x^4}{2} + \frac{x^3}{6} - \frac{x^5}{2}\right]_{x=0}^{x=1} = \frac{1}{6}. \]

In the second method we will integrate with respect to $x$ first. Therefore we are integrating along the horizontal strip $CD$ with fixed $y$ and then allow $y$ to move from 0 to 1 to cover the whole region. Here $C$ represents the start of the strip at $x = y$ and $D$ represents the end of the strip at $x = \sqrt{y}$. This gives

\[ \int_{y=0}^{y=1} dy \int_{x=y}^{x=\sqrt{y}} (2x^2 + y)\,dx = \int_{y=0}^{y=1} dy \left[\frac{2x^3}{3} + xy\right]_{x=y}^{x=\sqrt{y}} = \int_{y=0}^{y=1}\left(\frac{5y^{3/2}}{3} - \frac{2y^3}{3} - y^2\right)dy = \left[\frac{2y^{5/2}}{3} - \frac{y^4}{6} - \frac{y^3}{3}\right]_{y=0}^{y=1} = \frac{1}{6}. \]

Note: In this case the two different ways are equally easy or difficult. In some cases, one way can be much simpler than the other. Therefore the choice of order in the integration can be important.

So far we have shown how, given the region $R$, we can find the limits for $x$ and $y$ in the integrals. However, sometimes we can have the limits of repeated integrals and be asked to find the corresponding region $R$ (and perhaps change the order of integration).

Example: Find the region $R$ for the double integration

\[ \int_{x=0}^{x=2a} dx \int_{y=\sqrt{ax}}^{y=\sqrt{6a^2 - x^2}} 2xy\,dy. \]

In the second integral the limits of $y$ are

\[ \sqrt{ax} \le y \le \sqrt{6a^2 - x^2}. \]

Therefore the lower limit defines a parabola, $y^2 = ax$, while the upper limit defines a circle, $x^2 + y^2 = 6a^2$ (the circle's radius is $\sqrt{6}\,a$). These two curves intersect at

\[ 6a^2 - x^2 = ax \;\Rightarrow\; x^2 + ax - 6a^2 = 0 \;\Rightarrow\; x = 2a \]

(the other root, $x = -3a$, lies outside the range $x \ge 0$). We are now in a position to draw the two curves. These are shown in Fig. 3.10. Therefore the integral can be written as

\[ \int_{x=0}^{x=2a} dx \int_{y=\sqrt{ax}}^{y=\sqrt{6a^2 - x^2}} 2xy\,dy = \int_{x=0}^{x=2a} dx \left[x y^2\right]_{y=\sqrt{ax}}^{y=\sqrt{6a^2 - x^2}} = \int_{x=0}^{x=2a}\left(6a^2 x - ax^2 - x^3\right)dx = \left[3a^2 x^2 - \frac{ax^3}{3} - \frac{x^4}{4}\right]_{x=0}^{x=2a} = \frac{16}{3}a^4. \]

Fig. 3.10. The region bounded by the curve $y = \sqrt{ax}$ and the curve $y = \sqrt{6a^2 - x^2}$, with vertical strips for fixed $x$ and varying $y$.

In this case it is actually possible to do the integration using horizontal strips keeping $y$ fixed for the first integration. However, the integration would have to be done in two parts. In the first part the limits for $x$ are $0 \le x \le y^2/a$ with $y$ going from 0 to $\sqrt{2}\,a$. In the second part the limits are $0 \le x \le \sqrt{6a^2 - y^2}$ with $y$ going from $\sqrt{2}\,a$ to $\sqrt{6}\,a$.

Remark: There are always two ways to do a double integral; choose the simpler because the other may be impossible!

Example: By changing the order of the integration, evaluate the integral

\[ \int_{y=0}^{y=1} dy \int_{x=\sqrt{y}}^{x=1} y\cos x^5\,dx. \]

Note that as it stands the integral is very difficult because of the $\cos x^5\,dx$ part. The limits of the second integral are from $x = \sqrt{y}$ to $x = 1$ implying that one limit is the curve $y = x^2$. The limits of the first integral are $y = 0$ (the $x$-axis) to $y = 1$. The resulting region and the associated horizontal strip are shown in Fig. 3.11a. Now change the order of the integration so that we do the $y$ integration first taking vertical strips for fixed $x$ as shown in Fig. 3.11b. This gives

\[ \int_{y=0}^{y=1} dy \int_{x=\sqrt{y}}^{x=1} y\cos x^5\,dx = \int_{x=0}^{x=1} dx \int_{y=0}^{y=x^2} y\cos x^5\,dy = \int_{x=0}^{x=1} dx \left[\frac{y^2}{2}\right]_{y=0}^{y=x^2}\cos x^5 = \int_{x=0}^{x=1}\frac{x^4}{2}\cos x^5\,dx = \left[\frac{\sin x^5}{10}\right]_{x=0}^{x=1} = \frac{\sin 1}{10}. \]

The last integration can be seen by substituting $u = x^5$ ($du = 5x^4\,dx$).

Fig. 3.11. The region bounded by the curve $y = x^2$ and the lines $x = 1$ and $y = 0$, with (a) horizontal strips for fixed $y$ and varying $x$, and (b) vertical strips for fixed $x$ and varying $y$.

Remark: Sometimes it is sensible to divide the region $R$ in order to do the integration. For example, if the region $R$ can be subdivided into regions $R_1$ and $R_2$ as shown in Fig. 3.12, where $R = R_1 \cup R_2$, then we have

\[ \iint_R f(x, y)\,dA = \iint_{R_1} f(x, y)\,dA + \iint_{R_2} f(x, y)\,dA. \]

Fig. 3.12. A region $R$ that can be subdivided into regions $R_1$ and $R_2$ to make integration easier.

Remark: Double integrals can be used to find the area of a particular region. This is because

\[ \iint_R dA \equiv \iint_R 1\cdot dA = \text{Area of } R. \]

Therefore to find the area of $R$ we just find $\iint_R f(x, y)\,dA$ with $f(x, y) = 1$.

Remark: In evaluating expressions such as $\iint_R f(x, y)\,dA$ the answer must be a number. Therefore, expressions such as

\[ \int_{y=0}^{y=2-2x} dy \int_{x=0}^{x=1} xy\,dx \]

are meaningless because they will depend on $x$. The outer or second set of limits should always be constants.

Remark: If we interpret $\delta A$ as an area element, then $f\,\delta A$ is the volume between the surface $z = f(x, y)$ and the surface element $\delta A$. Another interpretation is to imagine $R$ made of sheet metal of variable density $f(x, y)$ grammes/unit area. Then $f(x, y)\,\delta A = \text{density} \times \text{area} = \text{mass of } \delta A$. Summing gives

\[ \iint_R f(x, y)\,dA = \text{total mass of } R. \]

3.3 Change of Variables in Area Integrals

For functions of one variable it is often useful to integrate by a change of variable, e.g. $x = x(u)$. The rule is to replace $x$ by $x(u)$ and $dx$ by $(dx/du)\,du$ and then alter the $x$-limits to the $u$-limits. This is integration by substitution. This gives

\[ I = \int_{x=a}^{x=b} f(x)\,dx = \int_{u=u_1}^{u=u_2} f(x(u))\,\frac{dx}{du}\,du \]

where the limits $u_1$ and $u_2$ correspond to the limits $a$ and $b$ such that $a = x(u_1)$ and $b = x(u_2)$. This procedure is fine if $x(u)$ increases with $u$. If $x(u)$ is a decreasing function of $u$ the $u$-limits are then reversed and therefore, writing the $u$-limits in increasing order ($u_1 < u_2$), we have a change of sign:

\[ I = \int_{x=a}^{x=b} f(x)\,dx = -\int_{u=u_1}^{u=u_2} f(x(u))\,\frac{dx}{du}\,du. \]

But $dx/du < 0$ in this case, so we can combine both cases in one formula:

\[ \int_{x=a}^{x=b} f(x)\,dx = \int_{u=u_1}^{u=u_2} f(x(u))\left|\frac{dx}{du}\right| du. \tag{3.5} \]

Remark: On the right-hand side of Eq. (3.5) the function $f(x)$ is expressed as $f(x(u))$.

Remark: The right-hand side of Eq. (3.5) includes a magnification factor $|dx/du|$, multiplying the $du$; this comes from transforming from $dx$ to $du$.

For functions of two variables one would similarly expect that the change in variables $x = x(u, v)$, $y = y(u, v)$ (for example, for polar coordinates $u = r$ and $v = \theta$) would result in a change in the area by a magnification factor $M$ such that $dx\,dy = M\,du\,dv$. As an example consider a linear change of coordinates:

\[ x = x(u, v) = au + bv, \qquad y = y(u, v) = cu + dv \]

or

\[ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} \tag{3.6} \]

where $a$, $b$, $c$ and $d$ are constants.
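Before working out the magnification factor for Eq. (3.6) analytically, it can be instructive to measure it for one choice of constants. The sketch below is an illustration only: the values $a = 2$, $b = 1$, $c = 1$, $d = 3$ and the helper names are assumptions made here. It maps the corners of the unit square in $(u, v)$ into the $(x, y)$ plane and measures the area of the image with the shoelace formula.

```python
def linear_map(a, b, c, d, u, v):
    """The linear change of coordinates x = a*u + b*v, y = c*u + d*v."""
    return (a * u + b * v, c * u + d * v)

def polygon_area(points):
    """Shoelace formula: unsigned area of a polygon given its vertices in order."""
    twice_area = 0.0
    n = len(points)
    for k in range(n):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0

# Illustrative constants (assumed for this sketch).
a, b, c, d = 2.0, 1.0, 1.0, 3.0

# Corners of the unit square in the (u, v) plane, taken in order.
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [linear_map(a, b, c, d, u, v) for (u, v) in unit_square]

area_of_image = polygon_area(image)
magnification = abs(a * d - b * c)

print(image, area_of_image, magnification)
```

For these constants the unit square, of area 1, is mapped to a parallelogram whose measured area equals $|ad - bc|$.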
Now write M for the transformation matrix composed of a , b, c and d and recall that a unit square in (u, v) variables has sides 1 u = = e1 , 0 v u 0 = = e2 1 v 100 3 Multiple Integrals v (a) e2 (b, d) Therefore, P e1 (u=1,v=0) u b 1 a = 0 d c b 0 b = 1 d d a c a Me2 = e2 = c where (a, c) and (b, d) represent the coordinates of the new corners in the (x, y) plane (see Fig. 3.13b). Therefore, under the transformation M we find unit square in (u, v) based on e1 , e2 → parallelogram in (x, y) based on e1 , e2 . Note from the matrix and the diagram that the point (1, 1) in (u, v) transforms to the point (a + b, c + d) in (x, y). Now consider the area of the parallelogram P (see Fig. 3.14). We have Area P = [Total area of rectangle] y c b R d P T1 T2 c Fig. 3.14 T1 R b = det M δx δy = a c b d δu δv or, δx = a δu + b δv, δy = c δu + d δv, . Therefore, for a linear change of variables: (Rectangular area du dv in (u, v) plane) → (“Parallelogram” area, i.e. (det M)δu δv in (x, y) plane) . Now let us consider a nonlinear change of coordinates. We take the transformation to have the following form: x = x(u, v), y = y(u, v) where, neglecting small errors, the increments in x and y are given by or, in matrix form, δx δy = ∂ x/∂u ∂ y/∂u ∂x δv ∂v ∂y δv ∂v ∂ x/∂v ∂ y/∂v δu δv d Definition: The Jacobian matrix is defined to be b a ∂x δu + ∂u ∂y δu + δy = ∂u T2 b b d Since the unit square (area) gets multiplied by a factor of det M, a small rectangle of sides δu and δv , with area δu δv also gets multiplied by the same factor det M. Hence, replacing u and v in Eq. (3.6) by δu and δv gives the corresponding changes δx and δy as δx = a c a c = ad − bc = det x as shown in Fig. 3.13a. Now, to see what happens to this unit square under the transformation M, just apply M. This gives 1 1 Area P = (a + b)(c + d) − 2 · ac − 2 · bd − 2bc 2 2 (a, c) Fig. 3.13. (a) The unit square in the (u, v) coordinate system. (b) The transformed unit square in the (x, y) coordinate system. 
Me1 = e1 = − [Area of 2 rectangles R ] . (a+b, c+d) e2 e1 101 − [Area of 2 pairs of equal triangles T1 and T2 ] (b) y (u=1,v=1) (u=0,v=1) 3.3 Change of Variables in Area Integrals c x Individual areas in the transformed unit square. M x, y u, v ≡ ∂ x/∂u ∂ y/∂u and the Jacobian determinant is defined to be ∂ x/∂v ∂ y/∂v ∂(x, y) x, y ≡ det M . ∂(u, v) u, v . 102 3 Multiple Integrals 3.3 Change of Variables in Area Integrals 103 This should help when trying to remember the formula ∂(x, y) ∂ x/∂u = ∂ y/∂u ∂(u, v) So, for a nonlinear change of variables: (Rectangular area du dv in (u, v) plane ) ∂ x/∂v . ∂ y/∂v → (Parallelogram area, i.e. (det M)δu δv in (x, y) plane) . This is illustrated in Fig. 3.15. Therefore, the required formula for double integrals under a change of variables is: f (x, y) dx d y = R R where ∂(x, y) du dv f (x(u, v), y(u, v)) ∂(u, v) A= then a ∂(x, y) ∂(u, v) = det M det A = and can be thought of as the magnification factor. Remark: The factor Note: In matrix notation, vertical lines on either side of a matrix denote the determinant. However, vertical lines on either side of an expression also denote the absolute value. For example, if we let c a c b d b = ad − bc d det A = |ad − bc| . ∂(x, y) ∂(u, v) is det M, the absolute value of the determinant of the matrix M. Note that we take the modulus as in the one variable case. Remark: Note that (x 2 + y 2 ) dx d y I = ∂(x, y) du dv dx d y → ∂(u, v) is the analogue of Example: Evaluate the integral R where R is a circle x 2 + y 2 = a 2 , by changing to polar coordinates. dx dx → du . du In polar coordinates we have x = r cos θ, y = r sin θ . Therefore, taking u = r and v = θ , we can write the Jacobian matrix as (a) v (b) y v+δv R v u u+δu u v+δv v R' u+δu u x Fig. 3.15. The transformation of the rectangular area R in the (u, v) plane to the area R in the (x, y) plane. 
M= ∂ x/∂r ∂ y/∂r and the Jacobian determinant is ∂ x/∂θ ∂ y/∂θ ∂(x, y) cos θ = det M = sin θ ∂(r, θ) = cos θ sin θ −r sin θ r cos θ −r sin θ = r cos2 θ + sin2 θ = r r cos θ which is always positive and so we do not need to take the absolute value. The original area R and the transformed area R are shown in Fig. 3.16. Note that the circle in the (x, y) plane transforms into a rectangle in the (r, θ) plane. Note 104 3 Multiple Integrals 3.4 Changing Variables Twice y (a) (∂ x/∂u)(∂u/∂s) + (∂ x/∂v)(∂v/∂s) (∂ y/∂u)(∂u/∂s) + (∂ y/∂v)(∂v/∂s) ∂ x/∂s ∂ x/∂t = ∂ y/∂s ∂ y/∂t x, y =M . s, t = (b) θ 2π R' R a x Fig. 3.16. The transformation of the circular area R in the (x, y) plane to the rectangular area R in the (r, θ) plane. that here R is the region given by x 2 + y 2 ≤ a 2 and R is the region given by 0 ≤ r ≤ a , 0 ≤ θ ≤ 2π . Therefore r 2 (r ) dr dθ (x + y ) dx d y = 2 I = r =a r =0 θ=2π θ=0 r 3 dr dθ = r =a r =0 r 3 dr θ=2π θ=0 dθ = πa 4 M x, y u, v M u, v s, t =M x, y s, t M x, y u, v M u, v s, t = ∂ x/∂u ∂ y/∂u ∂ x/∂v ∂ y/∂v ∂u/∂s ∂v/∂s ∂(u, v) ∂(x, y) nor ∂(u, v) ∂(x, y) (3.7) ∂u/∂t ∂v/∂t x, y x, y =I (3.8) can ever be zero −1 . (3.9) The result in Eq. (3.9) is often useful in solving problems. I = This implies that instead of undergoing a two-stage transformation we can obtain the necessary Jacobian matrix directly from the relationship between (x, y) and (s, t). To prove this consider the left-hand side of Eq. (3.7). We have ∂(x, y) ∂(u, v) Example: Evaluate the integral v = v(s, t) . . =M Therefore the product of the magnification factors of a transformation and its inverse is unity. So, for reversible changes, ∂(x, y) = ∂(u, v) u = u(s, t), u, v x, y and Eq. (3.8) gives Then for the Jacobian matrices we obtain M we obtain neither 2 Suppose that we change variables twice, first from (x, y) to (u, v) and then from (u, v) to (s, t), such that and det AB = det A det B 3.4 Changing Variables Twice y = y(u, v) x, y u, v ∂(x, y) ∂(u, v) = 1. 
∂(u, v) ∂(x, y) where we note that the integral is separable. x = x(u, v), where I is the unit or identity matrix. Now taking determinants and recalling that R where the r 2 on the right-hand integral comes from the transformed x 2 + y 2 and the r dr dθ is from the transformed dx d y with r coming from the Jacobian determinant det M. Hence I = M 2 R (∂ x/∂u)(∂u/∂t) + (∂ x/∂v)(∂v/∂t) (∂ y/∂u)(∂u/∂t) + (∂ y/∂v)(∂v/∂t) Then, by definition the equations x = x(u, v) and y = y(u, v) can be solved for (u, v) in terms of (x, y) to give u = u(x, y) and v = v(x, y). Therefore, setting x = s and y = t in Eq. (3.7) we have ar 0 105 1 · dx d y R (i.e. the area of the region R ) where R is enclosed by y 2 = x , y 2 = 2x , x y = 1 and x y = 2. The region R bounded by the curves is shown in Fig. 3.17a. To solve the integral consider the change of variables defined by u = y 2 /x, v = xy . 106 3.5 Volume Integrals 3 Multiple Integrals (a) y 2 1 Volume integrals are integrations where the region of integration is a volume. The basic concepts are similar to those we introduced for two-dimensional (area) integrals, but now we have R' y2=x R 3.5 Volume Integrals (b) v v=2 y2=2x lim v=1 xy=2 δx,δy,δz→0 1 2 0 0 x u=1 u u=2 Fig. 3.17. The transformation of (a) the area R in the (x, y) plane bounded by the curves y 2 = x, y 2 = 2x, x y = 2 and x y = 1 to (b) the square R in the (u, v) plane bounded by u = 1, u = 2, v = 1 and v = 2. Then we can write the four bounding curves as y 2 = x ⇔ u = 1, y 2 = 2x ⇔ u = 2, x y = 1 ⇔ v = 1, xy = 2 ⇔ v = 2 . So the region becomes a square (the region R in Fig. 3.17b). Now, for the Jacobian determinant it is easier to use Eq. (3.9). So, to calculate ∂(x, y)/∂(u, v) we first calculate ∂(u, v)/∂(x, y) and then take the inverse. Using u = y 2 /x and v = x y we have ∂(u, v) ∂u/∂ x = ∂(x, y) ∂v/∂ x f (x, y, z) δx δy δz where δV = δx δy δz are now small volumes (see Fig. 3.18). 
The limit as the size of the volume element δV → 0 is written as xy=1 0 0 107 ∂u/∂ y −y 2 /x 2 = y ∂v/∂ y 2 y/x y2 = −3 x = −3u . x f (x, y, z) dx d y dz = f (x, y, z) dV V V where V is the three–dimensional region being integrated over. The integrals are, as in the two-dimensional case, evaluated by repeated integration where we integrate over one variable at a time. For example, we could start by integrating over z first (see Fig. 3.19). The procedure is as follows. 1) Fix (x, y) and integrate over the allowed values of z in the region V . The z -integral limits are the small, filled circles at the bottom and the top of the dashed line with, say, z = z 1 (x, y) at the bottom and z = z 2 (x, y) at the top as shown in Fig. 3.19. Therefore we are summing vertically over the boxes as shown in Fig. 3.20. 2) This result depends on the choice of (x, y) and is defined in the region R of the (x, y) plane which is the projection of V on to this plane as shown in Fig. 3.21. This now defines the region over which we must do the x and y integrations. Therefore, using Eq. (3.9), ∂(x, y) = ∂(u, v) Hence ∂(u, v) ∂(x, y) 1 · dx d y = R =− z 1 . 3u ∂(x, y) du dv ∂(u, v) R v=2 1 u=2 1 du dv = du dv 3 u=1 v=1 u I = −1 δV 1 · 1 − = 3u R u=2 v=2 v 1 du = 3 u=1 u v=1 u=2 1 1 1 ln 2 2 = du = [ln u ]u= u=1 = 3 u=1 u 3 3 V y x Fig. 3.18 The volume of integration, V , and the volume element δV = δx δy δz. 108 3 Multiple Integrals 3.5 Volume Integrals 109 Now we can take the double integral of the result of the z -integration over the region R in the (x, y) plane (see Fig. 3.22). Therefore z z2 V dx V z1 y x=b f (x, y, z) dV = x=a Example: Evaluate the integral y=y2 (x) y=y1 (x) dy z=z 2 (x,y) z=z 1 (x,y) f (x, y, z) dz . f (x, y, z) dV T x (x,y) over the tetrahedron T bounded by the planes x = 0, y = 0, z = 0 and x +y+z = 1. Fig. 3.19. The lower (z = z 1 ) and upper (z = z 2 ) limits on z for the first integration over the volume of integration, V . 
Note that the plane $x + y + z = 1$ passes through $x = 1$ (putting $y = z = 0$) and similarly through $y = 1$ and $z = 1$. The relevant planes are shown in Fig. 3.23. Now evidently for fixed $(x, y)$ the $z$-limits are the heavy dots corresponding to $z = 0$ at the bottom and $z = 1 - x - y$ at the top. This gives our $z$-limits. The projection $R$ of $T$ on to the $(x, y)$ plane is the triangle on which the tetrahedron rests, i.e. the triangle given by $x = 0$, $y = 0$ and $x + y = 1$ (obtained by setting $z = 0$). So

$$
I = \int_{x=0}^{x=1} dx \int_{y=0}^{y=1-x} dy \int_{z=0}^{z=1-x-y} f(x, y, z)\,dz .
$$

Fig. 3.23. The volume formed by the intersection of the planes $x = 0$, $y = 0$, $z = 0$ and $x + y + z = 1$. (a) The lower and upper bounds on $z$ (at each end of the dashed line) for fixed $x$ and $y$. (b) Vertical strips from the $(x, y)$ plane to the tetrahedron plane defined by $z = 1 - x - y$ as well as strips in the $(x, y)$ plane ($z = 0$).

For example, if $f(x, y, z) = 1$ then

$$
I = \iiint_T 1 \cdot dV = \iiint_T dV = \text{volume of } T .
$$

Therefore, in this case

$$
\begin{aligned}
I &= \int_{x=0}^{x=1} dx \int_{y=0}^{y=1-x} dy \int_{z=0}^{z=1-x-y} 1\,dz
   = \int_{x=0}^{x=1} dx \int_{y=0}^{y=1-x} dy\,\bigl[z\bigr]_{z=0}^{z=1-x-y} \\
  &= \int_{x=0}^{x=1} dx \int_{y=0}^{y=1-x} (1 - x - y)\,dy
   = \int_{x=0}^{x=1} dx \left[\,y - xy - \frac{y^2}{2}\right]_{y=0}^{y=1-x} \\
  &= \int_{x=0}^{x=1} \frac{(1 - x)^2}{2}\,dx
   = \frac{1}{6}
\end{aligned}
$$

and this is the volume of the tetrahedron. We can "picture" $\iiint_V f\,dV$ as a "four-dimensional" volume, but it is easier to think of $f(x, y, z)$ as a mass density (i.e. mass per unit volume), so that

$$
\iiint_V f\,dV = \text{total mass of } V .
$$

Fig. 3.20. The stack of volume elements from $z = z_1$ to $z = z_2$.

Fig. 3.21. The projection of the volume $V$ on to the $(x, y)$ plane defines the region $R$ over which we must do the $x$ and $y$ integrations.

Fig. 3.22. The region $R$ in the $(x, y)$ plane over which we must do the $x$ and $y$ integrations.

3.6 Change of Variables in Volume Integrals

Changing variables in volume integrals is similar to the procedure used for double integrals. Suppose

$$
x = x(u, v, w), \qquad y = y(u, v, w), \qquad z = z(u, v, w) .
$$

We define the Jacobian matrix for change of variables from $(x, y, z)$ to $(u, v, w)$ to be

$$
M\!\left(\frac{x, y, z}{u, v, w}\right) =
\begin{pmatrix}
\partial x/\partial u & \partial x/\partial v & \partial x/\partial w \\
\partial y/\partial u & \partial y/\partial v & \partial y/\partial w \\
\partial z/\partial u & \partial z/\partial v & \partial z/\partial w
\end{pmatrix} .
$$

We can also define the Jacobian determinant

$$
\frac{\partial(x, y, z)}{\partial(u, v, w)} \equiv \det M
$$

such that the transformation for volume is

$$
dx\,dy\,dz = \left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| du\,dv\,dw .
$$

As before, for reversible transformations $\partial(x, y, z)/\partial(u, v, w) \neq 0$ and we have

$$
\frac{\partial(x, y, z)}{\partial(u, v, w)} = \left[\frac{\partial(u, v, w)}{\partial(x, y, z)}\right]^{-1} .
$$

The integral under the change of variables becomes

$$
\iiint_V f(x, y, z)\,dx\,dy\,dz = \iiint_{V'} f\bigl(x(u, v, w),\, y(u, v, w),\, z(u, v, w)\bigr) \left|\frac{\partial(x, y, z)}{\partial(u, v, w)}\right| du\,dv\,dw
$$

where $V'$ is the transformed volume in $(u, v, w)$ coordinates.

Example: Find an expression for $dx\,dy\,dz$ for the coordinate transformation from cartesian coordinates to spherical polar coordinates given by

$$
\begin{aligned}
x &= x(r, \theta, \phi) = r \sin\theta \cos\phi \\
y &= y(r, \theta, \phi) = r \sin\theta \sin\phi \\
z &= z(r, \theta, \phi) = r \cos\theta .
\end{aligned}
$$

The relationship between the coordinate systems is shown in Fig. 3.24. Note that $OP' = r \sin\theta$. Let $u = r$, $v = \theta$ and $w = \phi$. Then

$$
\frac{\partial(x, y, z)}{\partial(u, v, w)} = \frac{\partial(x, y, z)}{\partial(r, \theta, \phi)} =
\begin{vmatrix}
\sin\theta \cos\phi & r \cos\theta \cos\phi & -r \sin\theta \sin\phi \\
\sin\theta \sin\phi & r \cos\theta \sin\phi & r \sin\theta \cos\phi \\
\cos\theta & -r \sin\theta & 0
\end{vmatrix}
= r^2 \sin\theta .
$$

Therefore

$$
dx\,dy\,dz = r^2 \sin\theta \,dr\,d\theta\,d\phi .
$$

Fig. 3.24. The spherical polar coordinate system $(r, \theta, \phi)$ for a point $P$ in relation to the cartesian coordinate system $(x, y, z)$.

Example: Consider the tetrahedron $T$ confined within the planes $x = 0$, $y = 0$, $z = 0$ and $x + y + z = 1$. Consider the change of variables

$$
x = u(1 - v), \qquad y = uv(1 - w), \qquad z = uvw .
$$

Calculate $u$, $v$ and $w$ in terms of $x$, $y$ and $z$. Deduce that in the new variables $T$ is given by $0 < u < 1$, $0 < v < 1$ and $0 < w < 1$. Hence calculate the volume of $T$.

From the definitions of $u$, $v$ and $w$ we have

$$
u = x + y + z \tag{3.10}
$$

$$
v = \frac{y + z}{x + y + z} \tag{3.11}
$$

$$
w = \frac{z}{y + z} \tag{3.12}
$$

Taking Eq. (3.10) with $(x, y, z) = (0, 0, 0)$ gives $u = 0$ while $x + y + z = 1$ gives $u = 1$. Taking Eq. (3.11) with $y = z = 0$ and $x \neq 0$ gives $v = 0$ while $x = 0$ and $(y, z) \neq (0, 0)$ gives $v = 1$. Taking Eq. (3.12) with $z = 0$ and $y \neq 0$ gives $w = 0$ while $y = 0$ and $z \neq 0$ gives $w = 1$. Therefore the new limits are $0 < u < 1$, $0 < v < 1$ and $0 < w < 1$.

The Jacobian determinant is

$$
\frac{\partial(x, y, z)}{\partial(u, v, w)} =
\begin{vmatrix}
1 - v & -u & 0 \\
v(1 - w) & u(1 - w) & -uv \\
vw & uw & uv
\end{vmatrix}
= u^2 v .
$$

The volume is given by

$$
\iiint_T 1 \cdot dx\,dy\,dz
= \int_{u=0}^{u=1} \int_{v=0}^{v=1} \int_{w=0}^{w=1} 1 \cdot u^2 v \,du\,dv\,dw
= \int_0^1 u^2\,du \int_0^1 v\,dv \int_0^1 dw
= \frac{1}{6}
$$

as before.
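As a rough numerical check of the tetrahedron example with the change of variables x = u(1 − v), y = uv(1 − w), z = uvw, the following sketch may help. It is plain Python and only an illustration: the helper names `to_xyz`, `det3`, `jacobian_fd` and `volume_midpoint` are hypothetical and not part of these notes, and the finite-difference step and grid size are arbitrary choices. It estimates the Jacobian determinant by central differences, so that it can be compared with the closed form u²v, and then approximates the integral of the Jacobian over the unit cube 0 < u, v, w < 1 with a midpoint rule.

```python
def to_xyz(u, v, w):
    """Map (u, v, w) to (x, y, z) using x = u(1-v), y = uv(1-w), z = uvw."""
    return (u * (1 - v), u * v * (1 - w), u * v * w)


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def jacobian_fd(u, v, w, step=1e-6):
    """Central-difference estimate of the determinant d(x,y,z)/d(u,v,w)."""
    point = [u, v, w]
    columns = []  # columns[k][j] approximates d(coordinate j)/d(variable k)
    for k in range(3):
        plus, minus = point[:], point[:]
        plus[k] += step
        minus[k] -= step
        f_plus, f_minus = to_xyz(*plus), to_xyz(*minus)
        columns.append([(p - q) / (2 * step) for p, q in zip(f_plus, f_minus)])
    # Rows are indexed by (x, y, z), columns by (u, v, w), as in the matrix M.
    rows = [[columns[k][j] for k in range(3)] for j in range(3)]
    return det3(rows)


def volume_midpoint(n=20):
    """Midpoint-rule estimate of the integral of |J| over the unit cube."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                u, v, w = (i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h
                total += abs(jacobian_fd(u, v, w)) * h ** 3
    return total


if __name__ == "__main__":
    u, v, w = 0.3, 0.5, 0.7
    print("finite-difference Jacobian:", jacobian_fd(u, v, w))
    print("closed form u^2 v:         ", u * u * v)
    print("midpoint volume estimate:  ", volume_midpoint())
    print("exact volume 1/6:          ", 1 / 6)
```

The midpoint estimate should approach 1/6 as `n` grows. The same `jacobian_fd` pattern could be pointed at the spherical polar map to compare against r² sin θ, by swapping in a different `to_xyz`.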