MAS102 Calculus II

Lecture Notes prepared by R. Tavakol
© Queen Mary, University of London 2001–2003

Table of Contents

1 Ordinary Differential Equations
  1.1 Introduction
  1.2 ODEs of First Order and First Degree
    1.2.1 Separable Differential Equations
    1.2.2 Homogeneous ODEs (First Order)
    1.2.3 ODEs Reducible to Homogeneous Differential Equations
    1.2.4 Linear, First Order ODEs
    1.2.5 Bernoulli Equations
  1.3 Linear ODEs with Constant Coefficients
    1.3.1 The D Notation
    1.3.2 Solution of Homogeneous Part
    1.3.3 Particular Solutions to Non-homogeneous Equations
      Functions of the form f(x) = Ax^r (A is constant, r is integer)
      Functions of the form f(x) = be^{kx} (b, k are constants)
      Functions of the form f(x) = x^k e^{ax} (a, k are constants)
      Functions of the form f(x) = cos nx or sin nx (n constant)
      Functions of sums of constant multiples of any of the above
  1.4 Differential Equations of the Euler Type
  1.5 Simultaneous Linear Equations with Constant Coefficients
  1.6 Series Solutions of ODEs
    1.6.1 Picard’s Method
    1.6.2 Taylor Series Method
    1.6.3 Frobenius Method (partial)
2 Functions of More Than One Variable
  2.1 Functions of One Variable
  2.2 Functions of Two Variables
  2.3 Limits and Continuity
    2.3.1 Limits and Continuity for Functions of One Variable
    2.3.2 Limits and Continuity for Functions of Two Variables
  2.4 Partial Differentiation
  2.5 Joint Differentiability
  2.6 Directional Derivative
  2.7 Chain Rule
  2.8 Taylor’s Theorem for Functions of Two Variables
  2.9 Quadratic Forms
  2.10 Stationary Points of Functions of Two Variables
  2.11 Lagrange Multipliers
  2.12 Inverse Functions
  2.13 Implicit Functions
3 Multiple Integrals
  3.1 Rectangular Regions
  3.2 Non-rectangular Regions
  3.3 Change of Variables in Area Integrals
  3.4 Changing Variables Twice
  3.5 Volume Integrals
  3.6 Change of Variables in Triple Integrals

1 Ordinary Differential Equations
1.1 Introduction
Differential equations (DE) arise whenever we consider changes in a system or
the evolution of a system (since rate of change is equivalent to the derivative).
Definition: An ordinary differential equation (ODE) is an equation involving
a single unknown function of an independent variable and a finite number of its
derivatives.
Examples (the variables of each equation are given in brackets):
\[ \frac{dy}{dx} + yx = x^2 \qquad (y, x) \tag{1.1} \]
\[ \left(\frac{d^4x}{dt^4}\right)^3 + \sin t\,\frac{dx}{dt} + x = 0 \qquad (x, t) \tag{1.2} \]
\[ \frac{dy}{dx} + p(x)y + q(x) = 0 \qquad (y, x) \tag{1.3} \]
\[ \frac{d^3y}{dx^3} = f(x)\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{5/2} \qquad (y, x) \tag{1.4} \]
where $p$, $q$ and $f$ are given functions of $x$.
Definition: The order of a differential equation is the order of the highest derivative appearing.
Definition: The degree of a differential equation is the power of the highest derivative.
Definition: A differential equation in $y$ is said to be linear if it is linear in $y$, $y'$, $y''$, ...; otherwise it is called nonlinear.
Examples:
The order, degree and type for the examples in Eqs. (1.1)–(1.4) are given in Table 1.1 below.

Table 1.1. The order and degree of Eqs. (1.1)–(1.4)
Example   Order   Degree   Type
1.1       1       1        linear
1.2       4       3        nonlinear
1.3       1       1        linear
1.4       3       2        nonlinear

Note that in the example given in Eq. (1.4) we have to take the square of the equation before we can find the degree. This gives
\[ \left(\frac{d^3y}{dx^3}\right)^2 = [f(x)]^2\left[1 + \left(\frac{dy}{dx}\right)^2\right]^5 . \]
We shall be mostly concerned with linear ODEs.
Definition: To solve a differential equation we need to find the unknown function.
Note that this unknown function is $y$ in Eqs. (1.1), (1.3) and (1.4), and $x$ in Eq. (1.2).
Examples:
\[ \frac{dy}{dx} = 0 \;\Rightarrow\; y = c_1 \]
\[ \frac{d^2y}{dx^2} = 0 \;\Rightarrow\; y = c_1 x + c_2 \]
\[ \frac{d^3y}{dx^3} = 0 \;\Rightarrow\; y = \tfrac{1}{2}c_1 x^2 + c_2 x + c_3 \]
where $c_1$, $c_2$ and $c_3$ are constants.
Similarly, $d^ny/dx^n = 0$ has a general solution involving $n$ arbitrary constants.
More generally, any $n$th order ODE has $n$ arbitrary constants in its general solution.
Example:
Consider the following differential equation and its general solution:
\[ \frac{dy}{dx} = 1 \;\Rightarrow\; y = x + c \]
Note that the general solution, $y = x + c$, where $c$ is a constant, determines a family of solutions depending on $c$. This is illustrated in Fig. 1.1. A particular solution passing through the point $(x_0, y_0)$ is given by
\[ y = x + y_0 - x_0 . \]
Fig. 1.1. Lines representing the general solution of $dy/dx = 1$ (the line $y = x$ and the point $(x_0, y_0)$ are marked).

1.2 Ordinary Differential Equations of First Order and First Degree
Ordinary differential equations of the first order and first degree are the simplest ODEs, but in general it is not always possible to solve them. Here we consider five classes of these equations that we can solve.
1.2.1 Separable Differential Equations
These are equations which can be written as:
\[ \frac{dy}{dx} = f(x)\,g(y) . \]
We can start the process of “separation” by bringing the $g(y)$ term over to the left-hand side. This gives:
\[ \frac{1}{g(y)}\frac{dy}{dx} = f(x) . \]
Now we integrate each side with respect to $x$ giving
\[ \int \frac{1}{g(y)}\frac{dy}{dx}\,dx = \int f(x)\,dx \]
and hence the solution is given by
\[ \int \frac{dy}{g(y)} = \int f(x)\,dx . \]
This reduces to a pure integration of each side of the equation to obtain the solution.
Example:
Solve the differential equation:
\[ \frac{dy}{dx} = \frac{y+1}{x-1} . \]
We can write
\[ \frac{1}{y+1}\frac{dy}{dx} = \frac{1}{x-1} \]
\[ \Rightarrow \int \frac{dy}{y+1} = \int \frac{dx}{x-1} \]
\[ \Rightarrow \ln|y+1| + c_1 = \ln|x-1| + c_2 \]
\[ \Rightarrow \ln|y+1| = \ln|x-1| + c \qquad (c = c_2 - c_1) \]
\[ \Rightarrow \ln\left|\frac{y+1}{x-1}\right| = c \]
\[ \Rightarrow \frac{y+1}{x-1} = \pm e^{c} = d \]
\[ \Rightarrow y + 1 = d(x-1) \]
where $d$ is an arbitrary constant. Note that the ODE is first order and so we obtain one constant.
Remark: In order to fix the arbitrary constants we require further conditions. These are the initial or boundary conditions (IC or BC).
Example:
Solve
\[ \frac{dy}{dx} = 1 + y^2 - 2x - 2xy^2 \]
subject to the initial conditions $y = 0$ at $x = 0$.
We can rewrite the equation as
\[ \frac{dy}{dx} = (1-2x) + y^2(1-2x) \]
\[ \Rightarrow \frac{dy}{dx} = (1-2x)(1+y^2) \qquad \text{i.e. separable!} \]
\[ \Rightarrow \int \frac{dy}{1+y^2} = \int (1-2x)\,dx \]
\[ \Rightarrow \tan^{-1}y = x - x^2 + c . \]
The initial conditions, $x = 0$, $y = 0$, give
\[ \tan^{-1}0 = 0 - 0 + c \;\Rightarrow\; c = 0 . \]
Hence the solution is
\[ \tan^{-1}y = x - x^2 \]
or, taking the tan of each side,
\[ y = \tan(x - x^2) . \]
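As a rough numerical cross-check of the separable example with $y = 0$ at $x = 0$, the sketch below integrates $dy/dx = 1 + y^2 - 2x - 2xy^2$ with a classical fourth-order Runge-Kutta scheme and compares the result with $\tan(x - x^2)$. The Runge-Kutta integrator, the helper names (`rk4`, `rhs`, `exact`), the end point $x = 0.4$ and the step count are illustrative assumptions and are not part of these notes.

```python
import math


def rk4(f, x0, y0, x_end, steps):
    """Integrate dy/dx = f(x, y) from x0 to x_end with classical RK4."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


def rhs(x, y):
    """Right-hand side of dy/dx = 1 + y^2 - 2x - 2xy^2."""
    return 1 + y**2 - 2 * x - 2 * x * y**2


def exact(x):
    """Closed-form solution through (0, 0) obtained by separation."""
    return math.tan(x - x**2)


if __name__ == "__main__":
    numeric = rk4(rhs, 0.0, 0.0, 0.4, 400)
    print("numerical:", numeric)
    print("analytic :", exact(0.4))
    print("difference:", abs(numeric - exact(0.4)))
```

The two values should agree to many decimal places; a large discrepancy would point to an algebra slip in the separation or in fixing the constant.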
1.2.2 Homogeneous ODEs (First Order)
Homogeneous ordinary differential equations are ones that can be put in the form
\[ \frac{dy}{dx} = f\!\left(\frac{y}{x}\right) . \]
Example:
If we take the differential equation
\[ \frac{dy}{dx} = \frac{2xy}{x^2 + y^2} \]
and divide the numerator and denominator of the right-hand side by $x^2$ we have
\[ \frac{dy}{dx} = \frac{2(y/x)}{1 + (y/x)^2} \equiv f\!\left(\frac{y}{x}\right) \]
and so the differential equation is homogeneous.
Homogeneous equations can be put into a separable form by writing
\[ v = \frac{y}{x} \;\Rightarrow\; y = vx \]
where $v$ is an unknown function of $x$. Then
\[ \frac{dy}{dx} = x\frac{dv}{dx} + v . \]
Example:
Solve the differential equation,
\[ \left(2\sqrt{xy} - x\right)\frac{dy}{dx} + y = 0 . \]
To see if it is homogeneous we can rewrite the equation as
\[ \frac{dy}{dx} = \frac{-y}{2\sqrt{xy} - x} = \frac{-(y/x)}{2\sqrt{y/x} - 1} = f\!\left(\frac{y}{x}\right) \]
where we have divided top and bottom by $x$. Now let $y = vx$ such that
\[ \frac{dy}{dx} = x\frac{dv}{dx} + v . \]
Now substitute for $y/x = v$ and $dy/dx = x(dv/dx) + v$:
\[ \frac{dy}{dx} = x\frac{dv}{dx} + v = \frac{-v}{2\sqrt{v} - 1} \]
\[ \Rightarrow x\frac{dv}{dx} = \frac{-v - (2v^{3/2} - v)}{2\sqrt{v} - 1} = \frac{-2v^{3/2}}{2\sqrt{v} - 1} \]
\[ \Rightarrow \frac{2\sqrt{v} - 1}{2v^{3/2}}\,dv = -\frac{dx}{x} \qquad \text{(separable)} \]
This integrates to give
\[ \ln|x| = -\ln|v| - \frac{1}{\sqrt{v}} + c \]
\[ \Rightarrow \ln|vx| = -\frac{1}{\sqrt{v}} + c . \]
But $v = y/x$ and $vx = y$. Therefore our solution should be written as
\[ \ln|y| + \frac{1}{\sqrt{y/x}} = c . \]
This is the general solution of the differential equation. We cannot determine $c$ without knowing boundary conditions.
Note: The solution we have obtained is implicit, i.e. it cannot be expressed as $y = f(x)$. An implicit solution gives a relationship between $x$ and $y$ but not necessarily of the form $y = f(x)$.
Remark: Often the solutions of ODEs are implicit. In problems one should try to simplify as much as possible.
1.2.3 ODEs Reducible to Homogeneous Differential Equations
Differential equations that can be reduced to ones of the homogeneous type have the form:
\[ \frac{dy}{dx} = \frac{ax + by + c}{fx + gy + h} \]
where $a$, $b$, $c$, $f$, $g$ and $h$ are constants.
The right-hand side of the differential equation would be homogeneous if $c = h = 0$. Therefore we will make a linear transformation to new variables $(X, Y)$ such that these constant parts vanish. We set
\[ x = X + x_0 \quad \text{and} \quad y = Y + y_0 \]
and choose $(x_0, y_0)$ in order to make the right-hand side of the differential equation homogeneous.
Now, with the new variables
\[ \frac{dy}{dx} = \frac{dY}{dx} = \frac{dY}{dX} \]
so that the differential equation can now be written as
\[ \frac{dY}{dX} = \frac{a(X + x_0) + b(Y + y_0) + c}{f(X + x_0) + g(Y + y_0) + h} = \frac{aX + bY + (ax_0 + by_0 + c)}{fX + gY + (fx_0 + gy_0 + h)} . \]
In order to make the equation homogeneous we must set the terms in the brackets equal to zero. Hence the values of $x_0$ and $y_0$ will be determined by the solutions of the simultaneous, linear equations
\[ ax_0 + by_0 + c = 0 \]
\[ fx_0 + gy_0 + h = 0 . \]
The differential equation can now be written
\[ \frac{dY}{dX} = \frac{aX + bY}{fX + gY} = \frac{a + bY/X}{f + gY/X} . \]
This is now homogeneous and can be solved using the usual technique of setting $Y = VX$.
Example:
Solve the differential equation
\[ \frac{dy}{dx} = \frac{y - x + 1}{y + x + 5} . \]
We set
\[ x = X + x_0, \qquad y = Y + y_0 \]
and so the differential equation becomes
\[ \frac{dY}{dX} = \frac{Y - X + (y_0 - x_0 + 1)}{Y + X + (y_0 + x_0 + 5)} . \]
In order to find $x_0$ and $y_0$ we need to solve the equations
\[ y_0 - x_0 = -1 \]
\[ y_0 + x_0 = -5 . \]
The solution is $x_0 = -2$, $y_0 = -3$.
The differential equation to be solved is now
\[ \frac{dY}{dX} = \frac{Y - X}{Y + X} . \]
Now we let $Y = VX$ and the differential equation is written as
\[ \frac{dY}{dX} = V + X\frac{dV}{dX} = \frac{V - 1}{V + 1} \]
\[ \Rightarrow X\frac{dV}{dX} = -\frac{1 + V^2}{1 + V} \qquad \text{(separable)} \]
\[ \Rightarrow \int \frac{1 + V}{1 + V^2}\,dV = -\int \frac{dX}{X} \]
\[ \Rightarrow \tan^{-1}V + \tfrac{1}{2}\ln|1 + V^2| = -\ln|X| + C . \]
This can be re-arranged to give
\[ \ln|1 + V^2| + 2\ln|X| = -2\tan^{-1}V + 2C \]
\[ \Rightarrow \ln|X^2(1 + V^2)| = -2\tan^{-1}V + 2C \]
\[ \Rightarrow X^2(1 + V^2) = D\,e^{-2\tan^{-1}V} \]
\[ \Rightarrow X^2 + X^2V^2 = D\,e^{-2\tan^{-1}V} \]
where $D = e^{2C}$.
Now we substitute for $X$ and $V$ using $X = x + 2$, $Y = y + 3$ and $Y = VX$. This gives the general solution
\[ (x + 2)^2 + (y + 3)^2 = D\exp\!\left[-2\tan^{-1}\!\left(\frac{y + 3}{x + 2}\right)\right] . \]
Note that again we have an implicit solution.
Remark: Geometrically we are finding the intersection point between the lines $ax + by + c = 0$ and $fx + gy + h = 0$ and then shifting the origin to the intersection point by $X = x - x_0$, $Y = y - y_0$. (See Fig. 1.2)
Fig. 1.2. Intersection point of two lines is used to transform the coordinates and make the differential equation homogeneous.
The concept of finding the point of intersection in order to make the equation homogeneous is useful because there will be cases where the lines in Fig. 1.2 do not intersect. When the lines are parallel the simultaneous equations take the form
\[ ax_0 + by_0 + c = 0 \]
\[ k(ax_0 + by_0) + h = 0 \]
for some value of $k$. In this case the lines have the same gradient and the original differential equation is of the form
\[ \frac{dy}{dx} = \frac{ax + by + c}{k(ax + by) + h} . \]
Therefore in this case the previous transformation does not work. To solve this we use
\[ z = ax + by \]
as a new variable. Then
\[ \frac{dz}{dx} = a + b\frac{dy}{dx} = a + b\,\frac{z + c}{kz + h} \]
which is separable and therefore we can solve for $z$ as before. In practice such equations can be spotted by inspection.
Example:
Solve the differential equation
\[ \frac{dy}{dx} = \frac{x - y - 1}{y - x - 1} . \]
Let
\[ z = x - y . \]
Hence
\[ \frac{dz}{dx} = 1 - \frac{dy}{dx} . \]
Therefore the differential equation becomes
\[ \frac{dy}{dx} \equiv 1 - \frac{dz}{dx} = \frac{z - 1}{-(z + 1)} \]
\[ \Rightarrow \frac{dz}{dx} = 1 + \frac{z - 1}{z + 1} = \frac{2z}{1 + z} \]
\[ \Rightarrow \int \frac{1 + z}{z}\,dz = 2\int dx . \]
Integrating gives
\[ \ln|z| + z + \text{constant} = 2x . \]
Substituting back using $z = x - y$ gives
\[ \ln|x - y| + \text{constant} = x + y . \]
1.2.4 Linear, First Order ODEs
A first order linear ODE can be written as
\[ \frac{dy}{dx} + P(x)y = Q(x) \tag{1.5} \]
where $P(x)$ and $Q(x)$ are arbitrary functions of $x$.
It turns out that we can ‘always’ solve Eq. (1.5) by first multiplying through by a function $R(x)$, the integrating factor (IF), such that the left-hand side of Eq. (1.5) becomes the exact derivative
\[ \frac{d}{dx}\left[R(x)y\right] . \]
To see how this is done we first multiply Eq. (1.5) by the unknown function $R(x)$:
\[ R(x)\frac{dy}{dx} + R(x)P(x)y = Q(x)R(x) . \tag{1.6} \]
Now, to get the exact derivative we require
\[ \frac{d}{dx}\left[R(x)y\right] \equiv R(x)\frac{dy}{dx} + R(x)P(x)y \qquad \text{(LHS of (1.6))} \tag{1.7} \]
\[ \phantom{\frac{d}{dx}\left[R(x)y\right]} = Q(x)R(x) \qquad \text{(RHS of (1.6))} \tag{1.8} \]
Taking Eq. (1.8) and using the product rule on the left-hand side gives
\[ R(x)\frac{dy}{dx} + \frac{dR(x)}{dx}\,y = R(x)\frac{dy}{dx} + R(x)P(x)y . \]
We can cancel the first term on each side of the equation and hence $R(x)$ will be the solution of the differential equation
\[ \frac{dR(x)}{dx} = P(x)R(x) \]
\[ \Rightarrow \int \frac{dR(x)}{R(x)} = \int P(x)\,dx \]
\[ \Rightarrow \ln R(x) = \int P(x)\,dx \]
\[ \Rightarrow R(x) = e^{\int P(x)\,dx} . \tag{1.9} \]
Then we can write Eq. (1.8) as
\[ \frac{d}{dx}\left[R(x)y\right] = R(x)Q(x) \]
where $R(x)$ is already known from Eq. (1.9). This integrates directly to give
\[ R(x)y = \int R(x)Q(x)\,dx \]
or
\[ y = \frac{1}{R(x)}\int R(x)Q(x)\,dx \]
and this is the solution of our differential equation.
Remark: It is not necessary to include a constant of integration in the integrating factor $R(x)$, as such a constant would cancel in the final solution.
Remark: In solving problems, always reduce the linear first order differential equation into the standard form, $dy/dx + P(x)y = Q(x)$, with the coefficient of $dy/dx$ equal to 1.
Example:
Solve the differential equation
\[ x\frac{dy}{dx} + 2y = x^4 . \]
In order to obtain a coefficient of 1 in the $dy/dx$ term we need to divide through by $x$. The equation becomes
\[ \frac{dy}{dx} + \frac{2}{x}\,y = x^3 . \]
This is now a linear differential equation with
\[ P(x) = \frac{2}{x}, \qquad Q(x) = x^3 . \]
From the theory given above the integrating factor is
\[ R(x) = e^{\int P(x)\,dx} = e^{\int (2/x)\,dx} \tag{1.10} \]
\[ \phantom{R(x)} = e^{2\ln x} = e^{\ln x^2} = x^2 . \]
We can check that this indeed gives a total derivative:
\[ \frac{d}{dx}\left[R(x)y\right] = \frac{d}{dx}(x^2 y) = x^2\frac{dy}{dx} + 2xy = x^2\left(\frac{dy}{dx} + \frac{2}{x}\,y\right) = R(x)\left(\frac{dy}{dx} + \frac{2}{x}\,y\right) \]
as required.
Now we can proceed to obtain the solution to the differential equation. From the theory given above we have
\[ R(x)y = \int R(x)Q(x)\,dx \]
where $R(x) = x^2$ and $Q(x) = x^3$. Hence
\[ x^2 y = \int x^2 x^3\,dx = \int x^5\,dx = \frac{x^6}{6} + c \]
\[ \Rightarrow y = \frac{1}{6}x^4 + cx^{-2} \]
and this is the general solution to the differential equation.
1.2.5 Bernoulli Equations
Bernoulli equations are of the form
\[ \frac{dy}{dx} + P(x)y = Q(x)y^n \]
where $n$ is a constant and $P(x)$ and $Q(x)$ are given functions of $x$.
Note: In the special case when $n = 0$ the equation reduces to a form studied in Sect. 1.2.4.
This is a nonlinear equation since it involves powers of $y$ that are greater than one.
To solve the equation we divide each side by $y^n$. Hence
\[ \frac{1}{y^n}\frac{dy}{dx} + P(x)y^{-n+1} = Q(x) . \tag{1.11} \]
This equation can be made linear by using the substitution
\[ z = \frac{1}{y^{n-1}} \]
where $n \neq 1$. Note that in the case where $n = 1$ the differential equation becomes separable.
Hence,
\[ \frac{dz}{dx} = (1 - n)y^{-n}\frac{dy}{dx} \]
\[ \Rightarrow y^{-n}\frac{dy}{dx} = \frac{1}{1 - n}\frac{dz}{dx} . \]
Now we can substitute the expression for $y^{-n}\,dy/dx$ in Eq. (1.11). This gives
\[ \frac{1}{1 - n}\frac{dz}{dx} + P(x)z = Q(x) \]
\[ \Rightarrow \frac{dz}{dx} + (1 - n)P(x)z = (1 - n)Q(x) . \]
If we set $(1 - n)P(x) = \bar{P}(x)$ and $(1 - n)Q(x) = \bar{Q}(x)$ then the equation is a linear one of the standard form and can be solved using the integrating factor method (see Sect. 1.2.4).
Example:
Solve the differential equation
\[ \frac{dy}{dx} - y = xy^5 . \]
Here we have $P(x) = -1$, $Q(x) = x$ and $n = 5$. We re-write the equation as
\[ \frac{1}{y^5}\frac{dy}{dx} - y^{-4} = x . \tag{1.12} \]
Let $z = y^{-4}$. Hence
\[ \frac{dz}{dx} = -4y^{-5}\frac{dy}{dx} . \]
We substitute the expression for $y^{-5}\,dy/dx$ in Eq. (1.12) to give
\[ \frac{dz}{dx} + 4z = -4x . \tag{1.13} \]
Note: The coefficient of $dz/dx$ needs to be made 1 before calculating the IF.
Eq. (1.13) is now linear with $P(x) = 4$ and $Q(x) = -4x$. The integrating factor is
\[ R(x) = e^{\int P(x)\,dx} = e^{\int 4\,dx} = e^{4x} . \]
Hence
\[ \frac{d}{dx}\left(e^{4x}z\right) = -4x\,e^{4x} \]
which integrates to give
\[ e^{4x}z = -\int 4x\,e^{4x}\,dx = -\int x\,d(e^{4x}) = -x\,e^{4x} + \int e^{4x}\,dx \qquad \text{(by parts)} \]
\[ \Rightarrow e^{4x}z = -x\,e^{4x} + \tfrac{1}{4}e^{4x} + c \]
\[ \Rightarrow z = -x + \tfrac{1}{4} + ce^{-4x} \equiv y^{-4} \]
\[ \Rightarrow y = \left(ce^{-4x} + \tfrac{1}{4} - x\right)^{-1/4} . \]
Note: If we were given boundary conditions then $c$ could be found. For example, if $y = 1$ at $x = 0$ then $1 = (c + \tfrac{1}{4})^{-1/4} \Rightarrow c = \tfrac{3}{4}$.
1.3 Linear ODEs with Constant Coefficients
In this section we consider differential equations of the form
\[ a_n\frac{d^ny}{dx^n} + a_{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_0 y = f(x) \tag{1.14} \]
where $a_n, a_{n-1}, \cdots, a_0$ are constant coefficients and $f(x)$ is a function of $x$.
The aim is to solve Eq. (1.14) where $y$ is unknown and $f(x)$ is a given function of $x$. Equations of this type with non-zero right-hand sides are called non-homogeneous. (Do not confuse this with ODEs of the homogeneous type.)
1.3.1 The D Notation
When dealing with linear ODEs with constant coefficients it is convenient to introduce the differential operator, $D$, to denote $d/dx$ so that
\[ Dy \equiv \frac{dy}{dx} . \]
Then $D^2$ means that we apply $D$ twice such that
\[ D^2y = D(Dy) = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d^2y}{dx^2} . \]
Similarly
\[ D^3y = \frac{d^3y}{dx^3}, \;\cdots,\; D^ny = \frac{d^ny}{dx^n} . \]
More generally, if $P(x)$ is a polynomial, say
\[ P(x) = x^2 + a_1 x + a_0 \]
then we define
\[ P(D) = D^2 + a_1 D + a_0 \]
by replacing $x$ by $D$. This is a standard property of functions. Then
\[ P(D)y = (D^2 + a_1 D + a_0)y \]
means
\[ D^2y + a_1 Dy + a_0 y = \frac{d^2y}{dx^2} + a_1\frac{dy}{dx} + a_0 y . \]
$P(D)$ is called a linear differential operator. Operators like $P(D)$ which are polynomials in $D$ obey the usual algebraic laws of addition, multiplication and factorisation. This means that, if for example,
\[ P(D) = (D - m_1)(D - m_2) \]
then
\[ P(D)y = (D - m_1)(D - m_2)y = (D - m_2)(D - m_1)y . \]
We can check that this is true by applying the operator:
\[ (D - m_1)(D - m_2)y = (D - m_1)(Dy - m_2 y) \]
\[ = D^2y - m_2 Dy - m_1 Dy + m_1 m_2 y \]
\[ = (D - m_2)(Dy - m_1 y) \]
\[ = (D - m_2)(D - m_1)y . \]
So we have the same expression with the order of the brackets reversed.
Therefore our general linear ODE with constant coefficients,
\[ a_n\frac{d^ny}{dx^n} + a_{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_0 y = f(x) \]
can now be written as
\[ P(D)y = (a_n D^n + a_{n-1}D^{n-1} + \cdots + a_0)y = f(x) \]
using the $D$ notation.
Now we consider the solution of our linear ODE with constant coefficients given in differential operator notation by
\[ (a_n D^n + a_{n-1}D^{n-1} + \cdots + a_0)y = f(x) . \tag{1.15} \]
The problem can be reduced to a two-step procedure:
1) Find the general solution of the corresponding homogeneous equation (the ODE with the right-hand side set to zero). This means solving the equation $P(D)y = 0$ or
\[ (a_n D^n + a_{n-1}D^{n-1} + \cdots + a_0)y = 0 . \tag{1.16} \]
This is referred to as the complementary function (CF) or $y_h$.
2) Find a particular or special solution of the full version of the ODE, Eq. (1.15); this is referred to as the particular integral (PI) or $y_p$.
Theorem: The general solution of the non-homogeneous differential equation given in Eq. (1.15) can be written as
\[ y(x) = y_p(x) + y_h(x) \]
i.e. the sum of the particular integral and the complementary function.
Proof: We have
\[ P(D)y = (a_n D^n + a_{n-1}D^{n-1} + \cdots + a_0)y = f(x) . \]
The CF satisfies
\[ P(D)y_h = 0 . \]
The PI satisfies
\[ P(D)y_p = f(x) . \]
Adding these gives
\[ P(D)(y_h + y_p) = f(x) . \]
Therefore
\[ P(D)y = f(x) \]
our original equation, where $y = y_h + y_p$ is the general solution.
1.3.2 Solution of Homogeneous Part
Finding the solution of the homogeneous part of the ODE is equivalent to finding the complementary function (CF). We will approach the problem by considering a general second order ODE of the form
\[ (D^2 + a_1 D + a_0)y = 0 . \]
We can factorize the differential operator part to give
\[ D^2 + a_1 D + a_0 = (D - m_1)(D - m_2) . \]
So
\[ (D - m_1)(D - m_2)y = 0 \quad \text{or} \quad (D - m_2)(D - m_1)y = 0 . \tag{1.17} \]
Note that if
\[ (D - m_1)y = \frac{dy}{dx} - m_1 y = 0 \]
then
\[ \frac{dy}{dx} = m_1 y \]
with solution
\[ y = c_1 e^{m_1 x} . \]
Therefore $y = c_1 e^{m_1 x}$ satisfies Eq. (1.17). However, we could have followed exactly the same procedure for $m_2$ and so $y = c_2 e^{m_2 x}$ also satisfies Eq. (1.17).
Suppose $m_1 \neq m_2$. For linear differential equations we can add the solutions such that the general solution of
\[ (D - m_1)(D - m_2)y = 0 \]
is
\[ y = c_1 e^{m_1 x} + c_2 e^{m_2 x} . \]
But what are $m_1$ and $m_2$? Now
\[ D^2 + a_1 D + a_0 \equiv (D - m_1)(D - m_2) \]
and so $(m_1, m_2)$ are the roots of the quadratic equation
\[ x^2 + a_1 x + a_0 = 0 . \]
In general, if $P(D)y = 0$ then $y = ce^{mx}$ is a solution if $P(m) = 0$, or, in other words, $m$ is a root of $P(x) = 0$. The equation $P(x) = 0$ is called the auxiliary equation and when $P(x)$ is a quadratic there are three cases to consider:
1) The roots $m_1$ and $m_2$ are real and distinct. In this case the solution is
\[ y = c_1 e^{m_1 x} + c_2 e^{m_2 x} . \]
2) The roots $m_1$ and $m_2$ are real and equal. In this case $m_1 = m_2 = m$ and the solution is
\[ y = (c_1 x + c_2)e^{mx} . \]
This can be verified by showing that $(D - m)^2 y = 0$.
3) The roots $m_1$ and $m_2$ are complex conjugate. In this case $m_1 = p + iq$, $m_2 = p - iq$ and the solution is
\[ y = c_1 e^{(p+iq)x} + c_2 e^{(p-iq)x} . \]
However, this can be expressed in a number of different ways. We can write
\[ y = e^{px}\left(c_1 e^{iqx} + c_2 e^{-iqx}\right) \]
\[ = e^{px}\left[(c_1 + c_2)\cos qx + i(c_1 - c_2)\sin qx\right] \]
\[ = e^{px}\left(E\cos qx + F\sin qx\right) \]
where $E = c_1 + c_2$ and $F = i(c_1 - c_2)$.
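The three cases for a quadratic auxiliary equation can be sketched in a few lines of code. This is only an illustration: the function name `classify_roots`, the tolerance `tol` and the sample coefficient pairs are assumptions made here, not part of the notes, and the coefficients $a_1$, $a_0$ are taken to be real.

```python
import cmath


def classify_roots(a1, a0, tol=1e-12):
    """Solve m^2 + a1*m + a0 = 0 and report which of the three cases applies.

    Returns (case, m1, m2) with the roots as complex numbers.
    """
    disc = a1 * a1 - 4 * a0
    sqrt_disc = cmath.sqrt(disc)
    m1 = (-a1 + sqrt_disc) / 2
    m2 = (-a1 - sqrt_disc) / 2
    if abs(disc) <= tol:
        case = "real and equal"        # y = (c1*x + c2) e^{m x}
    elif disc > 0:
        case = "real and distinct"     # y = c1 e^{m1 x} + c2 e^{m2 x}
    else:
        case = "complex conjugate"     # y = e^{p x}(E cos qx + F sin qx)
    return case, m1, m2


if __name__ == "__main__":
    # Sample (a1, a0) pairs chosen to hit each case once.
    for a1, a0 in [(1, -6), (2, 1), (-6, 13)]:
        print((a1, a0), classify_roots(a1, a0))
```

In the complex case the real part of either root plays the role of $p$ and the magnitude of the imaginary part plays the role of $q$ in $e^{px}(E\cos qx + F\sin qx)$.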
So far we have only considered second order linear differential equations. Now we will generalise the theory to $n$th order equations. Consider the general linear homogeneous equation
\[ P(D)y = (a_n D^n + a_{n-1}D^{n-1} + \cdots + a_0)y = 0 . \]
Because the right-hand side is zero, finding the general solution to this equation is equivalent to finding the complementary function of the more general problem. Let us try
\[ y = Ae^{mx} . \]
This is a solution provided $m$ satisfies the auxiliary equation
\[ P(m) = a_n m^n + a_{n-1}m^{n-1} + \cdots + a_0 = 0 . \]
This equation has $n$ roots $m_1, m_2, \cdots, m_n$. Suppose that all the roots are distinct. Then the general solution is the superposition
\[ y_h = c_1 e^{m_1 x} + c_2 e^{m_2 x} + \cdots + c_n e^{m_n x} \]
where $c_1, c_2, \cdots, c_n$ are $n$ arbitrary constants.
Note: If $y_1(x), y_2(x), y_3(x), \cdots$ are solutions of the linear equation $P(D)y = 0$ then so is
\[ y = y_1(x) + y_2(x) + y_3(x) + \cdots . \]
This can be verified by direct substitution and it follows from the linearity of the operator $P(D)$.
Therefore the general solution
\[ y_h = c_1 e^{m_1 x} + c_2 e^{m_2 x} + \cdots + c_n e^{m_n x} \]
is valid provided the roots are distinct (even if they are complex).
Repeated roots require special attention. Suppose that the root $m_1$ is twice repeated. Then
\[ P(D) = Q(D)(D - m_1)^2 \]
where $Q(D)$ is a polynomial of degree $n - 2$ and we have used the fact that $(D - m_1)^2$ is a factor of $P(D)$. Hence
\[ (D - m_1)^2 y = 0 \]
is satisfied by
\[ y = (c_0 + c_1 x)e^{m_1 x} \]
and so this also satisfies $P(D)y = 0$. Then the general solution is
\[ y = (c_0 + c_1 x)e^{m_1 x} + c_3 e^{m_3 x} + \cdots + c_n e^{m_n x} . \tag{1.18} \]
Each twice repeated root gives a contribution similar to the first term in Eq. (1.18). The remaining terms in Eq. (1.18) arise from the non-repeated roots.
Suppose that the root $m_1$ is repeated $r$ times. Then
\[ P(D) = Q(D)(D - m_1)^r \]
where $Q(D)$ is now a polynomial of degree $n - r$ and we have used the fact that $(D - m_1)^r$ is a factor of $P(D)$. Hence
\[ (D - m_1)^r y = 0 \]
is satisfied by
\[ y = (c_0 + c_1 x + \cdots + c_{r-1}x^{r-1})e^{m_1 x} . \]
Note that the term in brackets is a general polynomial of degree $r - 1$. Then the general solution is
\[ y = (c_0 + c_1 x + \cdots + c_{r-1}x^{r-1})e^{m_1 x} + c_{r+1}e^{m_{r+1}x} + \cdots + c_n e^{m_n x} . \tag{1.19} \]
Each $r$-times repeated root gives a contribution similar to the first term in Eq. (1.19). The remaining terms in Eq. (1.19) arise from the non-repeated roots.
Note: The general solution always has $n$ arbitrary constants regardless of repeated roots.
Example:
Solve the differential equation
\[ \frac{d^2y}{dx^2} + \frac{dy}{dx} - 6y = 0 . \]
This can be written as $P(D)y = 0$ where $P(D) = D^2 + D - 6$. The auxiliary equation is given by $P(m) = 0$ or
\[ m^2 + m - 6 = 0 \;\Rightarrow\; (m - 2)(m + 3) = 0 \;\Rightarrow\; m = 2, \; m = -3 . \]
These are real, distinct roots and so the general solution is
\[ y = c_1 e^{2x} + c_2 e^{-3x} . \]
Example:
Solve the differential equation
\[ \frac{d^2y}{dx^2} + 2\frac{dy}{dx} + y = 0 . \]
This can be written as $P(D)y = 0$ where $P(D) = D^2 + 2D + 1$. The auxiliary equation is given by $P(m) = 0$ or
\[ m^2 + 2m + 1 = 0 \;\Rightarrow\; (m + 1)^2 = 0 \;\Rightarrow\; m = -1 \text{ twice} . \]
Therefore we have a twice repeated root only and so the general solution is
\[ y = (c_1 + c_2 x)e^{-x} . \]
Example:
Solve the differential equation
\[ \frac{d^2y}{dx^2} - 6\frac{dy}{dx} + 13y = 0 . \]
This can be written as $P(D)y = 0$ where $P(D) = D^2 - 6D + 13$. The auxiliary equation is given by $P(m) = 0$ or
\[ m^2 - 6m + 13 = 0 \;\Rightarrow\; m = 3 \pm 2i . \]
Therefore the general solution is
\[ y = c_1 e^{(3+2i)x} + c_2 e^{(3-2i)x} . \]
Using the fact that $e^{\pm i\theta} = \cos\theta \pm i\sin\theta$ we can write this as
\[ y = e^{3x}\left(E\cos 2x + F\sin 2x\right) \]
where $E = c_1 + c_2$ and $F = (c_1 - c_2)i$ are two constants.
Example:
Solve the differential equation
\[ \frac{d^3y}{dx^3} - 3\frac{d^2y}{dx^2} + 4y = 0 . \]
This can be written as $P(D)y = 0$ where $P(D) = D^3 - 3D^2 + 4$. The auxiliary equation is given by $P(m) = 0$ or
\[ m^3 - 3m^2 + 4 = 0 \;\Rightarrow\; (m + 1)(m - 2)^2 = 0 \;\Rightarrow\; m = -1, \; m = 2 \text{ twice} . \]
Therefore the general solution is
\[ y = (c_1 + c_2 x)e^{2x} + c_3 e^{-x} . \]
The first term in this equation comes from the repeated root while the second is from the non-repeated root.
1.3.3 Particular Solutions to Non-homogeneous Equations
The second part of finding the general solution to a non-homogeneous differential equation with constant coefficients is to find the particular integral, $y_p$. We need to do this when the differential equation has the form
\[ P(D)y = f(x) \]
where $f(x) \neq 0$. General methods for finding the particular integral are available for specific forms of the function $f(x)$. We will consider five forms of functions.
1.3.3.1 Functions of the form f(x) = Ax^r (A is constant, r is integer)
In this case we need to find $y_p$ such that
\[ P(D)y_p = Ax^r . \]
Try a general polynomial of degree $r$ of the form
\[ y_p = b_r x^r + b_{r-1}x^{r-1} + \cdots + b_0 . \]
The coefficients $b_r, b_{r-1}, \cdots, b_0$ will have to be determined in order to get the solution.
Example:
Find the general solution of the differential equation
\[ \frac{d^2y}{dx^2} + 6\frac{dy}{dx} + 9y = 18x^2 . \]
In this example $A = 18$, $r = 2$ and $P(D) = D^2 + 6D + 9$. To find the particular integral we try
\[ y_p = b_2 x^2 + b_1 x + b_0 . \]
Then
\[ P(D)y_p = 2b_2 + 6(2b_2 x + b_1) + 9(b_2 x^2 + b_1 x + b_0) \]
or
\[ P(D)y_p = 2b_2 + 6b_1 + 9b_0 + (12b_2 + 9b_1)x + 9b_2 x^2 = 18x^2 . \]
Equating the coefficients of the powers of $x$ on each side of the equation gives
\[ x^2: \; 9b_2 = 18, \qquad x: \; 12b_2 + 9b_1 = 0, \qquad \text{const.}: \; 2b_2 + 6b_1 + 9b_0 = 0 . \]
Hence
\[ b_2 = 2, \qquad b_1 = -\frac{8}{3}, \qquad b_0 = \frac{4}{3} \]
and so the particular integral is
\[ y_p = 2x^2 - \frac{8}{3}x + \frac{4}{3} . \]
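The particular integral $y_p = 2x^2 - \frac{8}{3}x + \frac{4}{3}$ can be checked by substituting it back into $y'' + 6y' + 9y$ with exact rational arithmetic. The sketch below is only a sanity check; the coefficient-list representation `[b0, b1, b2]` and the helper names (`derivative`, `pad`, `apply_operator`) are assumptions made for illustration.

```python
from fractions import Fraction


def derivative(coeffs):
    """Differentiate a polynomial stored as [b0, b1, b2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [Fraction(0)]


def pad(coeffs, size):
    """Right-pad a coefficient list with zeros up to the given length."""
    return coeffs + [Fraction(0)] * (size - len(coeffs))


def apply_operator(coeffs):
    """Return the coefficients of y'' + 6y' + 9y for polynomial y."""
    size = len(coeffs)
    first = derivative(coeffs)
    second = derivative(first)
    return [
        s + 6 * f + 9 * y
        for s, f, y in zip(pad(second, size), pad(first, size), coeffs)
    ]


if __name__ == "__main__":
    yp = [Fraction(4, 3), Fraction(-8, 3), Fraction(2)]  # b0, b1, b2
    print(apply_operator(yp))
```

The constant and linear coefficients of the result vanish and the quadratic coefficient is 18, which reproduces the right-hand side $18x^2$.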
Now we consider the complementary function. The auxiliary equation is
\[ P(m) = m^2 + 6m + 9 = (m + 3)^2 = 0 \]
with roots $m = -3$ (twice). Hence
\[ y_h = (c_0 + c_1 x)e^{-3x} \]
and the general solution is
\[ y = y_p + y_h = (c_0 + c_1 x)e^{-3x} + 2x^2 - \frac{8}{3}x + \frac{4}{3} . \]
Note: If $m = 0$ is an $l$-repeated root of the auxiliary equation $P(m) = 0$ (i.e. $P(m) = m^l Q(m)$) then for the particular integral try
\[ y_p = x^l(b_r x^r + b_{r-1}x^{r-1} + \cdots + b_0) . \]
Note the additional $x^l$ factor.
1.3.3.2 Functions of the form f(x) = be^{kx} (b, k are constants)
In this case we need to find $y_p$ such that
\[ P(D)y_p = be^{kx} . \]
Here there are two cases to consider depending on whether or not $k$ satisfies the auxiliary equation $P(k) = 0$.
If $k$ does not satisfy $P(k) = 0$ then try
\[ y_p = Ae^{kx} \]
where $A$ has to be determined from the equation $P(k)A = b$. This gives $A = b/P(k)$.
If $k$ satisfies $P(k) = 0$ and is a repeated root $r$ times then try
\[ y_p = Ax^r e^{kx} \]
where $A$ has to be determined.
Example:
Find the general solution of the differential equation
\[ \frac{d^2y}{dx^2} - 5\frac{dy}{dx} + 6y = 10e^{-2x} . \]
In this example $b = 10$, $k = -2$ and $P(D) = D^2 - 5D + 6$. The auxiliary equation is
\[ P(m) = m^2 - 5m + 6 = (m - 3)(m - 2) = 0 \]
with roots $m_1 = 2$ and $m_2 = 3$. Hence the complementary function is
\[ y_h = c_1 e^{2x} + c_2 e^{3x} . \]
Because $m_1 \neq -2$ and $m_2 \neq -2$ we have the case where $k = -2$ does not satisfy the auxiliary equation. Hence, for the particular integral we try
\[ y_p = Ae^{-2x} . \]
Then
\[ P(k)A = 10 = 20A \;\Rightarrow\; A = \frac{1}{2} . \]
Hence
\[ y_p = \frac{1}{2}e^{-2x} \]
and the general solution is
\[ y = y_p + y_h = \frac{1}{2}e^{-2x} + c_1 e^{2x} + c_2 e^{3x} . \]
Example:
Find the general solution of the differential equation
\[ \frac{d^3y}{dx^3} - 3\frac{dy}{dx} + 2y = e^x . \]
In this example $b = 1$, $k = 1$ and $P(D) = D^3 - 3D + 2$. The auxiliary equation is
\[ P(m) = m^3 - 3m + 2 = (m - 1)^2(m + 2) = 0 \]
with roots $m_1 = m_2 = 1$ and $m_3 = -2$. Hence the complementary function is
\[ y_h = (c_1 + c_2 x)e^x + c_3 e^{-2x} . \]
Because $m = k = 1$ is a twice repeated root we try
\[ y_p = Ax^2 e^x . \]
We find $A$ by substituting our trial $y_p$ in $P(D)y_p = e^x$. This gives
\[ (D^3 - 3D + 2)Ax^2 e^x = Ae^x(6 + 6x + x^2) - Ae^x(6x + 3x^2) + 2Ax^2 e^x = 6Ae^x = e^x \]
and hence $A = \frac{1}{6}$. Therefore the general solution is
\[ y = y_p + y_h = \frac{1}{6}x^2 e^x + (c_1 + c_2 x)e^x + c_3 e^{-2x} . \]
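Both exponential examples can be reproduced with a short routine that finds the multiplicity $r$ of $k$ as a root of the auxiliary polynomial and then the coefficient $A$ in the trial $y_p = Ax^r e^{kx}$. The sketch relies on the identity $P(D)(x^r e^{kx}) = P^{(r)}(k)\,e^{kx}$, valid when $k$ is a root of multiplicity exactly $r$; this identity is not stated in the notes and is used here as an assumption, as are the helper names and the `[a0, a1, ...]` coefficient ordering.

```python
from fractions import Fraction


def poly_eval(coeffs, x):
    """Evaluate the polynomial [a0, a1, a2, ...] at x by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


def poly_derivative(coeffs):
    """Differentiate the polynomial [a0, a1, a2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def exponential_pi(coeffs, b, k):
    """Return (A, r) for the trial y_p = A x^r e^{kx} with P(D) y_p = b e^{kx}.

    coeffs holds the auxiliary polynomial P(m) as [a0, a1, ...]; it must not
    be the zero polynomial. r is the multiplicity of k as a root of P.
    """
    r = 0
    current = list(coeffs)
    while poly_eval(current, k) == 0:
        current = poly_derivative(current)
        r += 1
    # Assumed identity: P(D)(x^r e^{kx}) = P^{(r)}(k) e^{kx}.
    return Fraction(b) / poly_eval(current, k), r


if __name__ == "__main__":
    print(exponential_pi([6, -5, 1], 10, -2))     # P(m) = m^2 - 5m + 6
    print(exponential_pi([2, -3, 0, 1], 1, 1))    # P(m) = m^3 - 3m + 2
```

For the first example the routine returns $A = 1/2$ with $r = 0$, and for the second $A = 1/6$ with $r = 2$, matching the hand calculations.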
1.3.3.3 Functions of the form f(x) = x^k e^{ax} (a, k are constants)
In this case we need to find $y_p$ such that
\[ P(D)y_p = x^k e^{ax} . \]
Here there are two cases to consider depending on whether or not $a$ satisfies the auxiliary equation $P(a) = 0$.
If $a$ does not satisfy $P(a) = 0$ then try
\[ y_p = (b_k x^k + b_{k-1}x^{k-1} + \cdots + b_0)e^{ax} \]
where $b_k, b_{k-1}, \cdots, b_0$ have to be determined from the equation $P(D)y_p = x^k e^{ax}$.
If $a$ satisfies $P(a) = 0$ and is a repeated root $r$ times then try
\[ y_p = x^r(b_k x^k + b_{k-1}x^{k-1} + \cdots + b_0)e^{ax} \]
where $b_k, b_{k-1}, \cdots, b_0$ have to be determined from the equation $P(D)y_p = x^k e^{ax}$.
Example:
Find the general solution of the differential equation
\[ \frac{d^3y}{dx^3} - 2\frac{d^2y}{dx^2} + \frac{dy}{dx} = x^2 e^x . \]
In this example $a = 1$, $k = 2$ and $P(D) = D^3 - 2D^2 + D$. The auxiliary equation is
\[ P(m) = m^3 - 2m^2 + m = m(m - 1)^2 = 0 \]
with roots $m_1 = 0$ and $m_2 = m_3 = 1$. Hence the complementary function is
\[ y_h = c_0 + (c_1 x + c_2)e^x . \]
Note that the $c_0 = c_0 e^0$ term corresponds to the $m = 0$ root.
Because $m = a = 1$ is a twice repeated root we try
\[ y_p = x^2(b_2 x^2 + b_1 x + b_0)e^x \]
\[ \Rightarrow (D - 1)y_p = (4b_2 x^3 + 3b_1 x^2 + 2b_0 x)e^x \]
\[ \Rightarrow (D - 1)^2 y_p = (12b_2 x^2 + 6b_1 x + 2b_0)e^x \]
\[ \Rightarrow D(D - 1)^2 y_p = e^x\left[12b_2 x^2 + (24b_2 + 6b_1)x + (6b_1 + 2b_0)\right] . \]
Therefore we require that
\[ 12b_2 x^2 + (24b_2 + 6b_1)x + (6b_1 + 2b_0) = x^2 \]
giving
\[ b_2 = \frac{1}{12}, \qquad b_1 = -\frac{1}{3}, \qquad b_0 = 1 \]
and hence
\[ y_p = e^x\left(\frac{x^4}{12} - \frac{x^3}{3} + x^2\right) . \]
Therefore the general solution is
\[ y = y_p + y_h = c_0 + (c_1 x + c_2)e^x + e^x\left(\frac{x^4}{12} - \frac{x^3}{3} + x^2\right) . \]
1.3.3.4 Functions of the form f(x) = cos nx or sin nx (n constant)
In this case we need to find $y_p$ such that
\[ P(D)y_p = b\,e^{inx} \]
where
\[ \text{real part of } e^{inx} = \Re\,e^{inx} = \cos nx \]
\[ \text{imaginary part of } e^{inx} = \Im\,e^{inx} = \sin nx . \]
Here we are making use of the fact that $e^{i\theta} = \cos\theta + i\sin\theta$. In this case the real part of the solution $y_p$ corresponds to taking $f(x) = \cos nx$ and the imaginary part of the solution corresponds to taking $f(x) = \sin nx$. The equation can then be solved using the procedure developed in Sect. 1.3.3.2 using $k = in$.
Example:
Find the general solution of the differential equation
\[ \frac{d^2y}{dx^2} + \frac{dy}{dx} - 2y = \sin x . \]
Here we can write the equation as
\[ (D^2 + D - 2)y = e^{ix} \]
where it is understood that we will take the imaginary part when we have obtained the solution. The auxiliary equation is
\[ P(m) = m^2 + m - 2 = (m - 1)(m + 2) = 0 \]
with roots $m_1 = 1$ and $m_2 = -2$. Hence
\[ y_h = c_0 e^x + c_1 e^{-2x} . \]
Note that $m = i$ is not a root of the auxiliary equation. For the particular integral we try (using the method in Sect. 1.3.3.2)
\[ y_p = Ae^{ix} \]
26
1.4 Differential Equations of the Euler Type
1 Ordinary Differential Equations
giving
(−1 + i − 2)A = 1
⇒ A=−
(3 + i)
(3 + i)
1
=−
=−
.
3−i
(3 − i)(3 + i)
10
As in Sect. 1.3.3.2 this can also be found by simply writing A = 1/P(i). Hence
3
i
yp = −
−
eix .
10
For the e−x sin x term we recall that this is equivalent to e−(1−i)x . This is
a useful way of combining the e−x and sin x terms together. Hence we try
yp = be−(1−i)x . Substituting we have
10
3
i
yp = −
−
eix
10 10
3
i −
cos x + i sin x
= −
10 10
3
1
1
3
cos x +
sin x + i − cos x −
sin x
= −
10
For the 1 term we have f (x) = constant and so we try yp = A. Substituting
we have
P(D)yp = (D 3 + 1)A = (0 + 1)A = 1 ⇒ A = 1 .
P(D)yp = (D 3 + 1)be−(1−i)x = b (−(1 − i))3 + 1 e−(1−i)x = e−(1−i)x .
Now we need to take the imaginary part.
10
10
10
Hence
⇒ b −(1 − 3i + 3i2 − i3 ) + 1 = 1
⇒ b(3i + 3 − i) = 1
⇒b=
1
3
cos x −
sin x + c0 ex + c1 e−2x .
10
10
1
3 − 2i
3 − 2i
=
=
.
3 + 2i
(3 + 2i)(3 − 2i)
13
Therefore
3 − 2i −(1−i)x
e
13
3
2
sin x −
cos x .
= e−x
13
13
yp = 1
3
cos x −
sin x .
=−
10
10
Therefore the general solution is
y = yp + yh = −
27
Now we can add the two contributions of yp to yh giving the general solution
y = yh + yp
√
√
= c1 e−x + ex/2 c2 cos 3x/2 + c3 sin 3x/2
3
2
sin x −
cos x + 1 .
+ e−x
1.3.3.5 Functions of sums of constant multiples of any of the above
For functions of this we find each separate yp for each part and the overall yp is
the sum of the individual yp ’s.
13
Example:
Find the general solution of the differential equation
d3 y
+ y = 1 + e−x sin x .
dx 3
Here we can write the equation as
(D 3 + 1)y = 1 + e−x sin x .
The auxiliary equation is
P(m) = m 3 + 1 = 0
√
with roots m 1 = −1, m 2,3 = (1 ± i 3)/2. Hence
√
√
yh = c1 e−x + ex/2 c2 cos 3x/2 + c3 sin 3x/2 .
For the particular integral we will consider each part separately.
13
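The shortcut $A = 1/P(i)$ used in the examples above is easy to check numerically. The following is a minimal sketch using Python's built-in complex numbers; the helper name `particular_coefficient` is an illustrative choice, not part of the notes, and the approach assumes the exponent is not a root of the auxiliary equation.

```python
def particular_coefficient(P, s):
    """Coefficient A in y_p = A*exp(s*x) for P(D) y = exp(s*x).

    Assumes s is NOT a root of the auxiliary equation, i.e. P(s) != 0.
    """
    value = P(s)
    if value == 0:
        raise ValueError("s is a root of the auxiliary equation")
    return 1 / value

# (D^2 + D - 2) y = e^{ix}: expect A = -(3 + i)/10
A = particular_coefficient(lambda m: m**2 + m - 2, 1j)

# (D^3 + 1) y = e^{-(1-i)x}: expect b = (3 - 2i)/13
b = particular_coefficient(lambda m: m**3 + 1, -(1 - 1j))

# Imaginary part of (p + iq)(cos x + i sin x) is q*cos x + p*sin x,
# so the coefficients of cos x and sin x in y_p are (imag, real).
cos_coeff, sin_coeff = A.imag, A.real

print(A, b, cos_coeff, sin_coeff)
```

The real and imaginary parts of `A` reproduce the coefficients $-3/10$ and $-1/10$ found by hand, and `b` reproduces $(3 - 2i)/13$, up to floating-point rounding.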
1.4 Differential Equations of the Euler Type
Euler-type differential equations have the form
\[ a_n x^n \frac{d^n y}{dx^n} + a_{n-1} x^{n-1} \frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_0 y = f(x) \]
where the $a_n, a_{n-1}, \ldots, a_0$ are constants.
As it stands this equation does not have constant coefficients because of the powers of $x$ that are present. However, it can be transformed so that it does, and then standard methods can be used.
We carry out a change of variable from $x$ to $t$ using
\[ x = e^t \quad \text{or} \quad t = \ln x . \]
Then
\[ \frac{dy}{dx} = \frac{dy}{dt} \frac{dt}{dx} . \]
From the definition of $t$ we have
\[ \frac{dt}{dx} = \frac{1}{x} \]
and so
\[ \frac{dy}{dx} = \frac{1}{x} \frac{dy}{dt} \quad \text{or} \quad x \frac{dy}{dx} = \frac{dy}{dt} . \tag{1.20} \]
Now
\[ \frac{dy}{dx} = \frac{1}{x} \frac{dy}{dt} \]
\[ \Rightarrow\ \frac{d^2 y}{dx^2} = -\frac{1}{x^2} \frac{dy}{dt} + \frac{1}{x} \frac{d^2 y}{dt^2} \frac{dt}{dx} \]
\[ \Rightarrow\ \frac{d^2 y}{dx^2} = \frac{1}{x^2} \left( \frac{d^2 y}{dt^2} - \frac{dy}{dt} \right) \]
\[ \Rightarrow\ x^2 \frac{d^2 y}{dx^2} = \frac{d^2 y}{dt^2} - \frac{dy}{dt} . \tag{1.21} \]
Equation (1.20) can be written as
\[ x D_x y = D_t y \]
where $D_x \equiv d/dx$ and $D_t \equiv d/dt$. Similarly Eq. (1.21) can be written
\[ x^2 D_x^2 y = D_t (D_t - 1) y . \]
We can generalise this (using proof by induction) to give
\[ x^r D_x^r y = D_t (D_t - 1)(D_t - 2) \cdots (D_t - (r - 1))\, y . \]
Having transformed to the variable $t$ the Euler equation becomes linear with constant coefficients and we can solve it using the techniques developed in Sect. 1.3.

Example:
Find the solution of the differential equation
\[ x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} - y = \ln x \]
with initial conditions $y = 0$, $dy/dx = 3$ at $x = 1$.
First we need to transform to the new variable $t$ using the substitution $x = e^t$.
\[ \Rightarrow\ x D_x = D_t, \qquad x^2 D_x^2 = D_t^2 - D_t \]
and substitution in the differential equation gives
\[ (D_t^2 y - D_t y) + D_t y - y = t \ \Rightarrow\ D_t^2 y - y = t . \]
This is now a linear differential equation with constant coefficients. We have the auxiliary equation
\[ P(m) = m^2 - 1 = (m + 1)(m - 1) = 0 \]
with solutions $m_1 = 1$ and $m_2 = -1$. This gives us
\[ y_h = a e^t + b e^{-t} \]
where $a$ and $b$ are constants. For the particular integral we have $f(t) = t$ and this can be thought of as a polynomial of degree 1. Therefore, according to the method developed in Sect. 1.3.3.1 we need to try a general polynomial of the same degree. Hence we try
\[ y_p = ct + d . \]
Substitution in the differential equation gives
\[ P(D_t) y_p = (D_t^2 - 1)(ct + d) = -ct - d = t . \]
Comparing coefficients gives $c = -1$ and $d = 0$. Therefore
\[ y_p = -t \]
and the general solution is
\[ y = y_h + y_p = a e^t + b e^{-t} - t \]
or, in terms of $x$,
\[ y = ax + bx^{-1} - \ln x . \]
The initial condition $x = 1$, $y = 0$ gives $0 = a + b$ while the condition $x = 1$, $dy/dx = 3$ gives $3 = a - b - 1$. Hence $a = 2$ and $b = -2$. Therefore the required solution is
\[ y = 2x - 2x^{-1} - \ln x . \]

1.5 Simultaneous Linear Equations with Constant Coefficients
Recall that for linear simultaneous algebraic equations such as
\[ a_1 x + b_1 y = c_1 \tag{1.22} \]
\[ a_2 x + b_2 y = c_2 \tag{1.23} \]
we can find the solution by first eliminating one unknown, $y$ say, by taking ($b_2 \times$ Eq. (1.22)) $-$ ($b_1 \times$ Eq. (1.23)).
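As a reminder of how this elimination works for the algebraic pair (1.22), (1.23), here is a minimal sketch in Python. The function name `solve_by_elimination` is illustrative only, and integer coefficients are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def solve_by_elimination(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by elimination.

    b2*(1.22) - b1*(1.23) removes y:  (a1*b2 - a2*b1) * x = b2*c1 - b1*c2
    a2*(1.22) - a1*(1.23) removes x:  (a2*b1 - a1*b2) * y = a2*c1 - a1*c2
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("equations are not independent")
    x = Fraction(b2 * c1 - b1 * c2, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y

# x + y = 3 and x - y = 1
x, y = solve_by_elimination(1, 1, 3, 1, -1, 1)
print(x, y)
```

The same pattern, multiplying each equation by the other equation's coefficient and subtracting, carries over when the coefficients become polynomials in $D$.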
The approach for solving simultaneous ODEs is similar except that the a ’s
and b’s are now differential operators and we can exploit the fact that these are
now expressed as polynomials in D . It is best to follow the procedure by means
of an example.
Example:
Find the solution of the simultaneous differential equations
\[ \frac{dx}{dt} - \frac{dy}{dt} + 2y = e^{-2t} \]
\[ \frac{dx}{dt} + 2\frac{dy}{dt} + 2x - 11y = 0 \]
where $x(t)$ and $y(t)$ are unknown functions of $t$.
First we rewrite the equations using the notation $D \equiv d/dt$ and then collect the terms involving $x$ and $y$ respectively. This gives
\[ Dx - (D - 2)y = e^{-2t} \tag{1.24} \]
\[ (D + 2)x + (2D - 11)y = 0 . \tag{1.25} \]
Now eliminate $x$ first by taking ($(D + 2) \times$ Eq. (1.24) $-$ $D \times$ Eq. (1.25)), just as we do with simultaneous algebraic equations.

Note: Operators must be applied from the left. For example, $Dx$ makes sense because it is $dx/dt$, a function of $t$. However, $xD$ does not make sense because it is just $x(d/dt)$, which is an operator, not a function.

So $(D + 2) \times$ Eq. (1.24) and $D \times$ Eq. (1.25) give
\[ (D + 2)Dx - (D + 2)(D - 2)y = (D + 2)e^{-2t} \tag{1.26} \]
\[ D(D + 2)x + D(2D - 11)y = 0 . \tag{1.27} \]
Now, since $D(D + 2) = (D + 2)D$ the first term on the left-hand side cancels when we subtract these equations. So subtracting Eq. (1.27) from Eq. (1.26) gives
\[ -\left[ (D + 2)(D - 2)y + D(2D - 11)y \right] = (D + 2)e^{-2t} . \tag{1.28} \]
But
\[ (D + 2)e^{-2t} = \frac{d}{dt} e^{-2t} + 2e^{-2t} = 0 \]
and so Eq. (1.28) becomes
\[ (D^2 - 4 + 2D^2 - 11D)y = 0 \ \Rightarrow\ (3D^2 - 11D - 4)y = 0 . \]
This can now be solved by the standard method using
\[ P(D) = 3D^2 - 11D - 4 . \]
Hence the auxiliary equation is
\[ P(m) = 3m^2 - 11m - 4 = 3(m - 4)\left(m + \tfrac{1}{3}\right) = 0 \]
and the roots are $m_1 = 4$ and $m_2 = -1/3$. Hence
\[ y_h = A e^{4t} + B e^{-t/3} . \]
There is no particular integral because the equation is homogeneous. Therefore
\[ y = y_h = A e^{4t} + B e^{-t/3} . \tag{1.29} \]
Now, as in the algebraic equations, we obtain $x$ from the original equations by substituting for $y$. Using Eq. (1.24) we have
\[ Dx - (D - 2)y = e^{-2t} . \]
Therefore substituting for $y$ from Eq. (1.29) we could integrate our expression for $Dx$ to obtain $x$. However, it is quicker to avoid integration by eliminating the $Dx$ term if possible. We can do this by subtracting Eq. (1.24) from Eq. (1.25) to give
\[ 2x + (D - 2)y + (2D - 11)y = -e^{-2t} \]
which gives $x$ directly by substituting for $y$ from Eq. (1.29) and simplifying. This gives the general solution
\[ x = -\tfrac{1}{2} e^{-2t} + \tfrac{1}{2} A e^{4t} + 7B e^{-t/3} \]
\[ y = A e^{4t} + B e^{-t/3} . \]

Note: The number of arbitrary constants in the solution is the same as the order of the equation for $y$ obtained when $x$ has been eliminated (or vice versa).

Remark: Similar methods apply to systems of three or more ODEs (e.g. $x(t)$, $y(t)$, $z(t)$).

Remark: Sometimes inspection can suggest more convenient variables. For example,
\[ D^2 x - D^2 y + 3x - 3y = e^{4t} \]
can be written as
\[ D^2 (x - y) + 3(x - y) = e^{4t} \]
which suggests solving for a new variable $w = x - y$.

1.6 Series Solutions of ODEs
So far we have mainly dealt with ODEs with constant coefficients. It turns out that there are no known types of second order linear ODEs, apart from those with constant coefficients or those like Euler's equations which are reducible to constant coefficients, that are solvable in terms of elementary functions.

Definition: Elementary functions consist of (i) algebraic functions, such as $y = f(x)$, satisfying
\[ P_n(x) y^n + P_{n-1}(x) y^{n-1} + \cdots + P_1(x) y + P_0(x) = 0 \]
where the $P_i(x)$ are polynomials, plus (ii) elementary transcendental (or non-algebraic) functions such as trigonometric, inverse trigonometric, exponential, logarithmic or combinations of these. For example,
\[ y = \tan \left[ \frac{x e^{1/x} + \tan^{-1}(1 + x^2)}{\cos 3x - \log x \, \sin x} \right] \]
is an elementary function.
The idea here is to solve ODEs with non-constant coefficients using power series. We shall consider three methods.

1.6.1 Picard's Method
This method works with differential equations and boundary conditions of the form
\[ \frac{dy}{dx} = f(x, y), \qquad y = y_0 \ \text{at} \ x = x_0 \]
where $f(x, y)$ is a given function with some reasonable properties (such as being a continuous function; see later). The idea is to obtain successive approximations using the formula
\[ y_{n+1}(x) = y_0 + \int_{x_0}^{x} f(x, y_n)\, dx, \qquad n \ge 0 . \tag{1.30} \]
Therefore, given an approximate solution $y_n$ this formula gives the next approximation $y_{n+1}$ and so on. Note that if $y_{n+1} = y_n$ then we have a solution of $dy/dx = f(x, y)$.
Picard proved that this method gives a convergent sequence $y_n$ tending to a unique solution $y$ for very general functions $f(x, y)$. For this method to work it is sufficient for $f(x, y)$ to be joint differentiable in a region $R$ of the $(x, y)$ plane in which $|f(x, y)| < M$, $|\partial f / \partial y| < A$ where $M$ and $A$ are positive constants.

Example:
Use Picard's method to find a series solution of the following differential equation:
\[ \frac{dy}{dx} = 1 + y^2, \qquad y = 0 \ \text{at} \ x = 0 . \]
Here $f(x, y) = 1 + y^2$ with $x_0 = 0$ and $y_0 = 0$. To solve the equation we start by using Eq. (1.30) with $n = 0$. This gives
\[ (n = 0) \qquad y_1(x) = y_0 + \int_{x_0}^{x} (1 + y_0^2)\, dx = 0 + \int_0^x (1 + 0^2)\, dx = \int_0^x dx = x . \]
Hence the first iteration gives
\[ y_1 = x . \]
Now use $y_1$ in the second iteration with $n = 1$. This gives
\[ (n = 1) \qquad y_2(x) = y_0 + \int_{x_0}^{x} (1 + y_1^2)\, dx = 0 + \int_0^x (1 + x^2)\, dx = x + \frac{x^3}{3} . \]
Hence the second iteration gives
\[ y_2 = x + \frac{x^3}{3} . \]
Now use $y_2$ in the third iteration with $n = 2$. This gives
\[ (n = 2) \qquad y_3(x) = y_0 + \int_{x_0}^{x} (1 + y_2^2)\, dx = 0 + \int_0^x \left[ 1 + \left( x + \frac{x^3}{3} \right)^2 \right] dx = \int_0^x \left( 1 + x^2 + \frac{2x^4}{3} + \frac{x^6}{9} \right) dx . \]
Note: This step produced two new terms. For Picard's procedure to work we must take only one new term at each step.
Hence
\[ y_3 = \int_0^x \left( 1 + x^2 + \frac{2x^4}{3} \right) dx \]
and the third iteration gives
\[ y_3 = x + \frac{x^3}{3} + \frac{2x^5}{15} . \]
In this particular case the answer could also have been found by separation of variables because
\[ \frac{dy}{dx} = 1 + y^2 \ \Rightarrow\ \frac{dy}{1 + y^2} = dx \]
with solution
\[ y = \tan x = x + \frac{x^3}{3} + \frac{2x^5}{15} + \frac{17x^7}{315} + \cdots \]
which shows that the next term, $x^6/9$ in the integrand (giving $x^7/63$ after integration), would have been wrong to take.
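The three iterations above can be reproduced mechanically. The sketch below is one possible way to do so, assuming polynomials are stored as lists of exact coefficients (index $k$ holds the coefficient of $x^k$); the helper names `poly_mul` and `picard_step` are illustrative, and the code is specific to $f(x, y) = 1 + y^2$ with $x_0 = y_0 = 0$.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def picard_step(y):
    """One application of Eq. (1.30) for dy/dx = 1 + y^2, y(0) = 0."""
    integrand = poly_mul(y, y)   # y_n^2
    integrand[0] += 1            # 1 + y_n^2
    # Integrate term by term from 0 to x: c*x^k -> c*x^(k+1)/(k+1)
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]

y = [Fraction(0)]                # y_0 = 0
iterates = []
for _ in range(3):
    y = picard_step(y)
    iterates.append(y)

for n, coeffs in enumerate(iterates, start=1):
    print(n, [str(c) for c in coeffs])
```

Without any truncation the third iterate also carries an $x^7$ coefficient of $1/63$, which differs from the $17/315$ in the series for $\tan x$. This is the term the notes discard.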
Remark: In solving problems you are either told to (i) continue the procedure until the term involving $x^n$ first appears; this means stop as soon as $x^n$ appears in the solution for $y$, or (ii) continue until the term involving $x^n$ has been shown to be determined correctly; this means take one more term to see if the coefficient of the $x^n$ term remains the same. If so stop there, otherwise continue.

1.6.2 Taylor Series Method
To use this method we need to find, for a given differential equation, $y'$, $y''$, $\ldots$, etc. by differentiating the differential equation and then constructing a Taylor series. The method is best illustrated with an example.

Example:
Solve the differential equation
\[ (1 - x^2) \frac{d^2 y}{dx^2} - 5x \frac{dy}{dx} - 3y = 0 . \tag{1.31} \]
The differential equation cannot be solved easily. We assume a solution in the form of a Maclaurin series (about $x = 0$):
\[ y(x) = y(0) + x y'(0) + \frac{x^2}{2!} y''(0) + \cdots + \frac{x^r}{r!} y^{(r)}(0) + \cdots \]
where
\[ y^{(r)}(0) = \left. \frac{d^r y}{dx^r} \right|_{x=0} . \]
Now to calculate $y''$, $\ldots$, etc. at $x = 0$, we differentiate Eq. (1.31) $n$ times using Leibniz's rule.
Recall that writing $D \equiv d/dx$, Leibniz's formula for the $n$th derivative of a product $uv$ is given by
\[ D^n(uv) = u D^n v + n (Du)(D^{n-1} v) + \frac{n(n-1)}{2!} (D^2 u)(D^{n-2} v) + \cdots + v D^n u . \]
Equation (1.31) can be written as
\[ (1 - x^2) D^2 y - 5x Dy - 3y = 0 \tag{1.32} \]
which we can differentiate $n$ times using Leibniz's formula. There are three terms to differentiate.
The first gives
\[ D^n \left( (1 - x^2) D^2 y \right) = (1 - x^2) D^{n+2} y - 2xn D^{n+1} y - \frac{2n(n-1)}{2!} D^n y \]
where we have used $u = 1 - x^2$ and $v = D^2 y$.
The second gives
\[ D^n (5x Dy) = 5x D^{n+1} y + 5n D^n y \]
where we have used $u = 5x$ and $v = Dy$.
The third gives
\[ D^n (3y) = 3 D^n y . \]
Substituting in Eq. (1.32) gives
\[ (1 - x^2) D^{n+2} y - x(2n + 5) D^{n+1} y - (n + 1)(n + 3) D^n y = 0 . \]
Hence, setting $x = 0$ and using the notation $D^n y|_{x=0} = y^{(n)}(0)$ gives
\[ y^{(n+2)}(0) = (n + 1)(n + 3)\, y^{(n)}(0) . \]
This establishes a recurrence relation between the coefficients. We use this with successive values of $n$ to get the coefficients in the Taylor series solution of the differential equation.
With $n = 0$ we have
\[ y^{(2)}(0) = 3 y(0) . \]
With $n = 1$ we have
\[ y^{(3)}(0) = 8 y'(0) . \]
With $n = 2$ we have
\[ y^{(4)}(0) = 15 y^{(2)}(0) = 45 y(0) . \]
With $n = 3$ we have
\[ y^{(5)}(0) = 24 y^{(3)}(0) = 192 y'(0) . \]
The general form of the Taylor series is
\[ y(x) = y(0) + x y'(0) + \frac{x^2}{2!} y^{(2)}(0) + \cdots . \]
Substitution gives
\[ y(x) = y(0) \left[ 1 + \frac{3}{2!} x^2 + \frac{45}{4!} x^4 + \cdots \right] + y'(0) \left[ x + \frac{8}{3!} x^3 + \frac{192}{5!} x^5 + \cdots \right] . \]
Note that only $y(0)$ and $y'(0)$ appear in our solution. Recognising the pattern in the coefficients allows us to write the general solution as
\[ y(x) = y(0) \left[ 1 + \frac{1 \cdot 3}{2!} x^2 + \frac{1 \cdot 3^2 \cdot 5}{4!} x^4 + \frac{1 \cdot 3^2 \cdot 5^2 \cdot 7}{6!} x^6 + \cdots \right]
 + y'(0) \left[ x + \frac{2 \cdot 4}{3!} x^3 + \frac{2 \cdot 4^2 \cdot 6}{5!} x^5 + \frac{2 \cdot 4^2 \cdot 6^2 \cdot 8}{7!} x^7 + \cdots \right] . \]
This is a linear combination of two series with arbitrary constants $y(0)$ and $y'(0)$; this is what should be expected for a general solution of a second order ODE.
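The recurrence $y^{(n+2)}(0) = (n + 1)(n + 3)\, y^{(n)}(0)$ lends itself to a short loop. The following sketch generates the derivatives at $x = 0$ and the corresponding Maclaurin coefficients; the function names are illustrative, and the code is specific to Eq. (1.31).

```python
from fractions import Fraction
from math import factorial

def derivatives_at_zero(y0, y1, order):
    """Return [y(0), y'(0), ..., y^(order)(0)] for Eq. (1.31)."""
    derivs = [Fraction(y0), Fraction(y1)]
    for n in range(order - 1):
        # y^(n+2)(0) = (n + 1)(n + 3) y^(n)(0)
        derivs.append((n + 1) * (n + 3) * derivs[n])
    return derivs[: order + 1]

def maclaurin_coefficients(y0, y1, order):
    """Coefficients of x^k, i.e. y^(k)(0) / k!."""
    return [d / factorial(k)
            for k, d in enumerate(derivatives_at_zero(y0, y1, order))]

print([str(d) for d in derivatives_at_zero(1, 1, 7)])
print([str(c) for c in maclaurin_coefficients(1, 1, 7)])
```

With $y(0) = y'(0) = 1$ the derivative list contains 3, 45, 1575 in the even slots and 8, 192, 9216 in the odd slots, matching the factored products $1 \cdot 3$, $1 \cdot 3^2 \cdot 5$, $1 \cdot 3^2 \cdot 5^2 \cdot 7$ and $2 \cdot 4$, $2 \cdot 4^2 \cdot 6$, $2 \cdot 4^2 \cdot 6^2 \cdot 8$.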
Note: This method of finding the solution is equivalent to substituting into Eq. (1.31) a trial solution in the form of a series
\[ y(x) = \sum_{n=0}^{\infty} a_n x^n \]
with unknown coefficients $a_n$. This gives a recurrence relation giving all of the $a_n$ in terms of the first two coefficients $a_0$ and $a_1$. Either method works for any (second order) differential equation with well-behaved coefficients.

1.6.3 Frobenius Method (partial)
This is a rather general method of solving ODEs with non-constant coefficients. We consider second order cases only.
Suppose we have put our ODE into the standard form
\[ \frac{d^2 y}{dx^2} + P(x) \frac{dy}{dx} + Q(x) y = 0 \]
where $P(x)$ and $Q(x)$ are known functions of $x$ and where the initial conditions are $y = y_0$, $dy/dx = y_0'$ at $x = x_0$. Note that in the standard form the coefficient of $d^2 y/dx^2$ is unity.
The first step is to check what happens as $x \to x_0$. If $P(x)$ and $Q(x)$ have finite limits as $x \to x_0$ then $x_0$ is called an ordinary point; otherwise it is called a singular point. If $x = x_0$ is singular but $(x - x_0)P(x)$ and $(x - x_0)^2 Q(x)$ remain finite as $x \to x_0$ then $x = x_0$ is called a regular singular point. If $x = x_0$ does not satisfy this it is called an irregular singular point.

Examples:
In the differential equation
\[ \frac{d^2 y}{dx^2} + x^3 \frac{dy}{dx} + 3y = 0 \]
we have $P(x) = x^3$, $Q(x) = 3$ and $x = 0$ is an ordinary point.
In the differential equation
\[ \frac{d^2 y}{dx^2} + \frac{5}{x} \frac{dy}{dx} + \frac{3}{x^2} y = 0 \]
we have $P(x) = 5/x$, $Q(x) = 3/x^2$ and $x = 0$ is a regular singular point because $x P(x) = 5$ and $x^2 Q(x) = 3$.
In the differential equation
\[ \frac{d^2 y}{dx^2} + \frac{7}{x^3} \frac{dy}{dx} + x^2 y = 0 \]
we have $P(x) = 7/x^3$, $Q(x) = x^2$ and $x = 0$ is an irregular singular point because $x P(x) = 7/x^2 \to \infty$ as $x \to 0$.
The following theorems tell us how to find solutions in each case:

Theorem: If $x = 0$ is an ordinary point and $P(x)$ and $Q(x)$ can be expanded in an infinite power series around $x = 0$ (in which case they are called analytic) then all solutions of the differential equation can be written as power series of the form:
\[ y = \sum_{n=0}^{\infty} a_n x^n . \]

Theorem: If $x = 0$ is a regular singular point then there is a solution of the form:
\[ y = \sum_{n=0}^{\infty} a_n x^{n+c}, \qquad a_0 \ne 0 \]
for some values of the constant $c$. It is important to note that $c$ can be fractional. Note that this theorem does not state that all solutions are of this form.

Example: (Legendre's Equation)
Derive the series solution for the differential equation
\[ (1 - x^2) \frac{d^2 y}{dx^2} - 2x \frac{dy}{dx} + \ell(\ell + 1) y = 0 \]
about the origin $x = 0$ where $\ell$ is a constant.
To begin with we have to put the equation in standard form. This gives
\[ \frac{d^2 y}{dx^2} - \frac{2x}{1 - x^2} \frac{dy}{dx} + \frac{\ell(\ell + 1)}{1 - x^2} y = 0 \]
and hence
\[ P(x) = -\frac{2x}{1 - x^2}, \qquad Q(x) = \frac{\ell(\ell + 1)}{1 - x^2} . \]
There are three main steps involved in the solution.

Step 1.
Note that $x = 0$ is an ordinary point (but $x = \pm 1$ are regular singular points). Therefore, for a solution about $x = 0$ the first theorem applies and we let
\[ y = \sum_{n=0}^{\infty} a_n x^n . \]

Step 2.
We find the various derivatives. Using the $D$ and the prime notation to denote $d/dx$ we have
\[ D(y) = y' = \sum_{n=0}^{\infty} n a_n x^{n-1} \]
\[ D^2(y) = y'' = \sum_{n=0}^{\infty} n(n - 1) a_n x^{n-2} . \]
These are substituted into the relevant terms in the differential equation giving
\[ -x^2 y'' = -\sum_{n=0}^{\infty} n(n - 1) a_n x^n \]
\[ -2x y' = -2 \sum_{n=0}^{\infty} n a_n x^n \]
\[ \ell(\ell + 1) y = \sum_{n=0}^{\infty} \ell(\ell + 1) a_n x^n \]
\[ y'' = \sum_{n=0}^{\infty} n(n - 1) a_n x^{n-2} . \]
These terms have to be added to assemble the left-hand side of the equation and then set to zero to give the right-hand side.

Step 3.
Substitute the various terms in the equation and equate the coefficients of all powers of $x$ to zero. Note that apart from the $y''$ term, all other terms have $x^n$ in the summation. Therefore
\[ y'' = \sum_{n=0}^{\infty} n(n - 1) a_n x^{n-2} \]
\[ \phantom{y''} = \sum_{n=2}^{\infty} n(n - 1) a_n x^{n-2} \qquad \text{(since the } n = 0, 1 \text{ terms are 0)} \tag{1.33} \]
\[ \phantom{y''} = \sum_{n=0}^{\infty} (n + 1)(n + 2) a_{n+2} x^n \qquad \text{(changing } n \to n + 2 \text{)} . \tag{1.34} \]
We can check that Eqs. (1.33) and (1.34) are equivalent by calculating the first few terms in the series. We get
\[ \text{Eq. (1.33)} \to 2(2 - 1) a_2 x^0 + 3(3 - 1) a_3 x^1 + \cdots \]
\[ \text{Eq. (1.34)} \to (0 + 1)(0 + 2) a_2 x^0 + (1 + 1)(1 + 2) a_3 x^1 + \cdots \]
and so the series are the same. We can now substitute in the differential equation:
\[ \sum_{n=0}^{\infty} \left[ (n + 1)(n + 2) a_{n+2} - n(n - 1) a_n - 2n a_n + \ell(\ell + 1) a_n \right] x^n = 0 \]
\[ \Rightarrow\ \sum_{n=0}^{\infty} \left[ (n + 1)(n + 2) a_{n+2} - n(n + 1) a_n + \ell(\ell + 1) a_n \right] x^n = 0 . \]
We can now equate the coefficients of all powers of $x$ to zero in order to obtain a relationship between the coefficients. We get
\[ (n + 1)(n + 2) a_{n+2} + \left[ \ell(\ell + 1) - n(n + 1) \right] a_n = 0 \]
which implies that
\[ a_{n+2} = -\frac{(\ell - n)(\ell + n + 1)}{(n + 1)(n + 2)}\, a_n . \tag{1.35} \]
This is our recurrence relation giving higher $a_n$'s in terms of lower $a_n$'s.
Now we can substitute different values of $n$ in Eq. (1.35).
\[ n = 0 \ \Rightarrow\ a_2 = -\frac{\ell(\ell + 1)}{1 \times 2}\, a_0 \]
\[ n = 1 \ \Rightarrow\ a_3 = -\frac{(\ell - 1)(\ell + 2)}{2 \times 3}\, a_1 \]
\[ n = 2 \ \Rightarrow\ a_4 = -\frac{(\ell - 2)(\ell + 3)}{3 \times 4}\, a_2 = \frac{\ell(\ell - 2)(\ell + 1)(\ell + 3)}{4!}\, a_0 \]
\[ n = 3 \ \Rightarrow\ a_5 = -\frac{(\ell - 3)(\ell + 4)}{4 \times 5}\, a_3 = \frac{(\ell - 1)(\ell - 3)(\ell + 2)(\ell + 4)}{5!}\, a_1 . \]
Therefore all the $a_n$'s are expressible in terms of $a_0$ and $a_1$. Substituting these expressions in our series solution gives:
\[ y = a_0 \left[ 1 - \frac{\ell(\ell + 1)}{2!} x^2 + \frac{\ell(\ell - 2)(\ell + 1)(\ell + 3)}{4!} x^4 + \cdots \right]
 + a_1 \left[ x - \frac{(\ell - 1)(\ell + 2)}{3!} x^3 + \frac{(\ell - 1)(\ell - 3)(\ell + 2)(\ell + 4)}{5!} x^5 + \cdots \right] . \]

Remark: When $\ell$ is not an integer, each bracketed series is an infinite series and a solution of the original differential equation. Now, since these series are linearly independent, the solution involving two arbitrary constants (i.e. $a_0$ and $a_1$) is a general solution.

Remark: Each series is convergent with the radius of convergence $R = 1$. To see this use the recurrence relation (with $n$ replaced by $2n$) for each series:
\[ \left| \frac{a_{2n+2}\, x^{2n+2}}{a_{2n}\, x^{2n}} \right| = \left| -\frac{(\ell - 2n)(\ell + 2n + 1)}{(2n + 1)(2n + 2)}\, x^2 \right| \to |x^2| \quad \text{as } n \to \infty , \]
so by the ratio test the series converges for $|x| < 1$. The same is also true for the second series (with $n$ replaced by $2n - 1$).

Remark: Functions defined by these series are called Legendre's functions and in general they are not elementary functions. However, when $\ell$ is a non-negative integer, one of the series terminates and is therefore a polynomial. (The first series is a polynomial if $\ell$ is even and the second is a polynomial if $\ell$ is odd.) These lead to particular solutions of Legendre's equation referred to as Legendre's polynomials, which have nice properties and applications. For example, the solution of Laplace's equation, $\nabla^2 \phi = 0$, leads to Legendre polynomials (see Calculus III).
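The termination of the series for integer $\ell$ can be seen directly by iterating the recurrence (1.35). The following is a minimal sketch, assuming an integer $\ell$ so that exact fractions can be used; the function name `legendre_series` is illustrative.

```python
from fractions import Fraction

def legendre_series(ell, a0, a1, n_terms):
    """Coefficients a_0 .. a_{n_terms-1} generated by Eq. (1.35).

    Assumes ell is an integer so that exact arithmetic is possible.
    """
    a = [Fraction(a0), Fraction(a1)]
    for n in range(n_terms - 2):
        factor = Fraction((ell - n) * (ell + n + 1), (n + 1) * (n + 2))
        a.append(-factor * a[n])
    return a

# ell = 2 (even): the a_0 series stops after the x^2 term.
even_case = legendre_series(2, 1, 0, 8)
# ell = 3 (odd): the a_1 series stops after the x^3 term.
odd_case = legendre_series(3, 0, 1, 8)

print([str(c) for c in even_case])
print([str(c) for c in odd_case])
```

For $\ell = 2$ this gives $1 - 3x^2$ and for $\ell = 3$ it gives $x - \tfrac{5}{3} x^3$, which are constant multiples of the Legendre polynomials usually written $P_2(x) = (3x^2 - 1)/2$ and $P_3(x) = (5x^3 - 3x)/2$.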
Example: (Bessel's Equation)
Derive the series solution for the differential equation
\[ x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - p^2) y = 0 \]
about the origin $x = 0$ where $p$ is a constant.
To begin with we have to put the equation in standard form. This gives
\[ \frac{d^2 y}{dx^2} + \frac{1}{x} \frac{dy}{dx} + \frac{(x^2 - p^2)}{x^2} y = 0 \]
and hence
\[ P(x) = \frac{1}{x}, \qquad Q(x) = \frac{(x^2 - p^2)}{x^2} . \]
Therefore $x = 0$ is a regular singular point and the second theorem listed above applies. Hence we try a power series solution of the form
\[ y = \sum_{n=0}^{\infty} a_n x^{n+r} . \]
Note that we must include the $r$ because of the fact that $x = 0$ is a regular singular point. Substituting this series for $y$ in the terms in the differential equation gives
\[ x \frac{dy}{dx} = \sum_{n=0}^{\infty} a_n (n + r) x^{n+r} \]
\[ x^2 \frac{d^2 y}{dx^2} = \sum_{n=0}^{\infty} a_n (n + r)(n + r - 1) x^{n+r} \]
and adding all the terms together gives
\[ x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - p^2) y
 = \sum_{n=0}^{\infty} a_n \left[ (n + r)(n + r - 1) + (n + r) - p^2 \right] x^{n+r} + \sum_{n=0}^{\infty} a_n x^{n+r+2} \]
\[ \phantom{x^2 \frac{d^2 y}{dx^2}} = \sum_{n=0}^{\infty} a_n \left[ (n + r)^2 - p^2 \right] x^{n+r} + \sum_{n=0}^{\infty} a_n x^{n+r+2} = 0 . \]
Now we adjust the equation to make the powers of $x$ the same in both series. We replace $n$ by $n + 2$ in the first series which we now have to start at $n = -2$ rather than $n = 0$. Hence we can write the first series as
\[ \sum_{n=-2}^{\infty} a_{n+2} \left[ (n + r + 2)^2 - p^2 \right] x^{n+r+2}
 = \sum_{n=0}^{\infty} a_{n+2} \left[ (n + r + 2)^2 - p^2 \right] x^{n+r+2} + a_0 \left[ r^2 - p^2 \right] x^r + a_1 \left[ (r + 1)^2 - p^2 \right] x^{r+1} \]
where we have separated off the first two terms corresponding to $n = -2$ and $n = -1$ so that the final series starts at $n = 0$ again.
We are now in a position to group together all the terms in the series solution. This gives
\[ \sum_{n=0}^{\infty} \left\{ a_{n+2} \left[ (n + r + 2)^2 - p^2 \right] + a_n \right\} x^{n+r+2} + a_0 \left[ r^2 - p^2 \right] x^r + a_1 \left[ (r + 1)^2 - p^2 \right] x^{r+1} = 0 . \]
In order to satisfy this equation the coefficient of each power of $x$ should be zero. This condition gives
\[ a_0 \left[ r^2 - p^2 \right] = 0 \tag{1.36} \]
\[ a_1 \left[ (r + 1)^2 - p^2 \right] = 0 \tag{1.37} \]
and the recurrence relation
\[ \frac{a_{n+2}}{a_n} = -\frac{1}{(n + r + 2)^2 - p^2} \]
or
\[ \frac{a_{n+2}}{a_n} = -\frac{1}{(n + r + 2 - p)(n + r + 2 + p)} . \tag{1.38} \]
Therefore, in order to satisfy the equations we must satisfy Eqs. (1.36), (1.37) and (1.38). Eq. (1.36) gives
\[ (r^2 - p^2) a_0 = 0 . \]
Note: We always take $a_0 \ne 0$ in the Frobenius method, which determines $r$. This gives $r = \pm p$. Note that two roots can be considered. The equation $r^2 - p^2 = 0$ is called the indicial equation and it is obtained by setting to zero the coefficient of the lowest power of $x$ and taking $a_0 \ne 0$.
Taking $r = \pm p$, Eq. (1.37) can in general only be satisfied if $a_1 = 0$. Therefore, by Eq. (1.38) we must have $a_3 = a_5 = \cdots = 0$. Therefore all odd powered terms are zero.
Now with $r = +p$ we have a solution
\[ y = \sum_{n=0}^{\infty} a_n x^{r+n} \]
which, using Eq. (1.38), gives
\[ \frac{a_{n+2}}{a_n} = -\frac{1}{(n + 2)(n + 2 + 2p)} \]
and hence
\[ \frac{a_2}{a_0} = -\frac{1}{2(2 + 2p)} . \]
The resulting series solution is
\[ y_{[1]} = a_0 x^p \left[ 1 - \frac{x^2}{2(2p + 2)} + \frac{x^4}{2 \times 4 (2p + 2)(2p + 4)} - \cdots \right] \]
which is an infinite series provided $p$ is not a negative integer.
Now with $r = -p$ we have a similar procedure. Replacing $p$ by $-p$ in the above case we obtain the series solution
\[ y_{[2]} = a_0 x^{-p} \left[ 1 + \frac{x^2}{2(2p - 2)} + \frac{x^4}{2 \times 4 (2p - 2)(2p - 4)} + \cdots \right] \]
which is a well behaved infinite series provided $p$ is not a positive integer.
Therefore a general solution is of the type
\[ y = A y_{[1]} + B y_{[2]} \]
where $A$ and $B$ are arbitrary constants.

Remark: It is clear that whenever the constant $p$ in the Bessel equation is an integer or whenever the difference between the roots of the indicial equation is zero or a positive integer (i.e. if $p$ is a half integer), then we expect difficulties. To deal with these cases (and similar cases in other equations) we require the full Frobenius method that is not covered in this course.
For example, when the roots of the indicial equation are equal, i.e. $r_1 = r_2$, the solutions coincide implying that there is only one solution,
\[ y_{[1]} = x^{r_1} \sum_{n=0}^{\infty} a_n x^n . \]
The second solution then involves a logarithm and is of the form
\[ y_{[2]} = \ln x \; y_{[1]} \]
(plus, in general, a further power series).

2 Functions of More than One Variable

2.1 Functions of One Variable
Previously we have considered functions of one variable. We have written these in terms of functional notation as, for example, $y = f(x)$, where $f$ is some function of the variable $x$. Therefore for each value of $x$ there is a corresponding value of $y$. For example, $y = \sin x$.
The derivative of such a function can be written as $dy/dx = f'(x)$. This corresponds to the gradient of the tangent to the curve at the point $(x, y)$. For example, $dy/dx = \cos x$.
We can also consider the integral of the function between $x = a$ and $x = b$. This is written as
\[ \int_a^b f(x)\, dx = \left[ F(x) \right]_a^b \]
where
\[ \frac{dF(x)}{dx} = f(x) . \]
Fig. 2.1 Plot of a general function $y = f(x)$.
A function of x can also be considered graphically. Here y(x) or y = f (x)
represents the height above the x -axis of a point on the curve represented by
y = f (x). This is illustrated in Fig. 2.1.
Figure 2.2 contains plots of several functions. In each case the value of y for
a particular value of x defines the height above the x -axis. Note the relationship
between the parabolas shown in Fig. 2.2(b), (c) and (d).
2.2 Functions of Two Variables
It is perfectly possible for some dependent variable to be a function of two
variables. Here we use the notation z = f (x, y) for such a function where the
‘input’ is (x, y) and the ‘output’ is z . If we consider the dependent variable z as
the third dimension then it can be thought of as the height above the (x, y) plane
(defined by z = 0) as illustrated in Fig. 2.3(a).
Note that f (x, y) depends on both x and y and therefore z is a function of two
variables. Hence the ‘graph’ of z = f (x, y) is now a two-dimensional surface
(see Fig. 2.3(b)). Good analogies are the heights of a mountainous terrain above
sea level, or the weather maps with temperature or atmospheric pressure plotted
as a function of geographic location.
Therefore we can imagine a function of two variables as a surface in three
dimensional space.
Now we consider some of the techniques for plotting surfaces defined by
functions of two variables.
Example:
Plot the surface defined by
z = f (x, y) = x 2 + y 2 .
Fig. 2.2. Plots of various functions of the variable $x$. (a) $y = \sin x$, (b) $y = x^2$, (c) $y = 5 + x^2$, (d) $y = 2 - x^2$, (e) $y = x^3 - 4x^2 + 4$ and (f) $y = x^2 - \cos x + 3 \sin x$.
Fig. 2.3. (a) Each value of $z = f(x, y)$ for a particular $(x, y)$ can be thought of as defining the height above the $(x, y)$ plane. (b) The surface defined by $z = f(x, y)$.
In order to plot this surface recall that, for a fixed y , y = a say, we have
z = x 2 + a 2 . In this special case we have made it a function of one variable only
(z is now only a function of x ); in fact z = x 2 + a 2 defines a parabola in the plane
y = a , perpendicular to the y -axis.
We can assemble the surface by plotting a set of one dimensional curves
z = f (x, a) where we think of a as a constant (i.e. the curves lie on the y = a
plane). Using three different values of a we obtain the plots shown in Fig. 2.4.
Each value of a gives rise to a different parabola. For example, for a = 0 we
obtain the parabola z = x 2 . Therefore the required surface is made up of all
the parabolas. Note that we could equally well fix the value of x and plot the
parabolas of the form z = a 2 + y 2 which lie in planes perpendicular to the first
set of parabolas. The final surface is shown in Fig. 2.5.
Likewise we can sketch the surface defined by (i) z = −(x 2 + y 2 ) which is
the surface defined by z = x 2 + y 2 turned upside down (see Fig. 2.6) and (ii)
z = x 2 + y 2 + 3 which is the surface defined by z = x 2 + y 2 lifted up by three units
along the z -axis (see Fig. 2.7).
Consider the surface defined by z = y 2 − x 2 . Holding x = 0 but letting y vary
we obtain a parabola z = y 2 in the y -z plane. Holding y = 0 but letting x vary
we obtain a parabola z = −x 2 in the x -z plane. Extending these results we see
that we obtain a set of inverted parabolas whose vertices increase in height as
|y| increases. The resulting surface is a saddle and the origin is a saddle point.
This is illustrated in Fig. 2.8.
Unlike the case of functions of one variable, for saddles (arising from functions
of two variables), the origin (in this case) is a maximum in x and a minimum in
y.
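The saddle behaviour described above can be checked by sampling $z = y^2 - x^2$ along the two coordinate axes. This is only a numerical illustration on a hypothetical grid of points in $[-1, 1]$, not a proof.

```python
def f(x, y):
    """The saddle surface z = y^2 - x^2."""
    return y**2 - x**2

# Sample points -1.0, -0.9, ..., 1.0 (the grid includes 0.0 exactly).
grid = [k / 10 for k in range(-10, 11)]

# Along the x-axis (y = 0) the largest value is at the origin.
max_along_x = max(f(x, 0.0) for x in grid)
# Along the y-axis (x = 0) the smallest value is at the origin.
min_along_y = min(f(0.0, y) for y in grid)

print(max_along_x, min_along_y, f(0.0, 0.0))
```

Both extreme values equal $f(0, 0) = 0$: the origin is a maximum of the section $z = -x^2$ and a minimum of the section $z = y^2$.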
Before proceeding it is worthwhile reminding ourselves that in the plane we can choose two sets of coordinates. These are cartesian coordinates $(x, y)$ and polar coordinates $(r, \theta)$ where $0 < r < \infty$ and $0 \le \theta \le 2\pi$ (see Fig. 2.9). The relationships between them are given by
\[ x = r \cos\theta, \qquad y = r \sin\theta \]
and
\[ r = \sqrt{x^2 + y^2}, \qquad \theta = \tan^{-1} \left( \frac{y}{x} \right) . \]
Fig. 2.4 Three of the parabolas that go to make up the surface defined by $z = x^2 + y^2$ (the curves $z = x^2 + a^2$ in the planes $y = \pm a$ and $z = x^2$ in the plane $y = 0$).
Fig. 2.5 The surface defined by $z = x^2 + y^2$.
Fig. 2.6 The surface defined by $z = -(x^2 + y^2)$.
Fig. 2.7 The surface defined by $z = x^2 + y^2 + 3$.
Fig. 2.8 The surface defined by $z = y^2 - x^2$.
Fig. 2.9 Cartesian and polar coordinates in the plane.
Polar coordinates can be useful when considering surfaces defined by functions of two variables. For example, consider again the surface defined by $z = f(x, y) = x^2 + y^2$, the bowl shown previously in Fig. 2.5. In polar coordinates we can write the equation of the surface as $z = r^2$. Therefore the value of $z$ depends only on the distance to the origin. Furthermore, the surface must be rotationally symmetric about the $z$-axis. The surface is shown again in Fig. 2.10.
Fig. 2.10 The surface defined by $z = x^2 + y^2$.

2.3 Limits and Continuity
In this section the aim is to extend ideas from the one variable case to functions of more than one variable. We begin by examining the concepts for functions of one variable.
2.3.1 Limits and Continuity for Functions of One Variable
For functions of one variable we say that $f(x)$ approaches the limit $L$ as $x \to a$ whenever
\[ \lim_{x \to a} f(x) = L . \]
Therefore as $x$ approaches $a$, so $f(x)$ approaches $L$. The limits are the same as $x \to a$ from both directions.
For functions of one variable we say that $f(x)$ is continuous at $x = a$ whenever $\lim_{x \to a} f(x)$ exists, $f(a)$ is defined and the limit $L$ equals $f(a)$. Therefore
\[ \text{continuity of the function } f(x) \text{ at } x = a \ \Rightarrow\ \lim_{x \to a} f(x) = f(a) . \]
Intuitively by continuity we mean that there is a lack of breaks in the graph of the function $f(x)$.
Consider the three examples shown in Fig. 2.11. These are the functions $f(x) = x$, $f(x) = x^2$ and $f(x) = x^3$. As $x$ approaches 0 in each case the functions are continuous at $x = 0$. This is because
\[ \lim_{x \to 0} f(x) = f(0) \]
and also because there is no break at $x = 0$.
Fig. 2.11 Functions of one variable. (a) $f(x) = x$, (b) $f(x) = x^2$ and (c) $f(x) = x^3$.

Example:
Determine whether or not the function defined by
\[ f(x) = \begin{cases} 1 & \text{when } x > 0 ; \\ 0 & \text{when } x \le 0 \end{cases} \]
is continuous at $x = 0$.
The function is shown in Fig. 2.12(a). Note that there exists a break at $x = 0$. As $x \to 0$ from below (the negative side), $f(x) \to 0$. As $x \to 0$ from above (the positive side), $f(x) \to 1$. Hence
\[ \lim_{x \to 0^+} f(x) = 1, \qquad \lim_{x \to 0^-} f(x) = 0 . \]
Therefore the function $f(x)$ is not continuous at $x = 0$.

Example:
Determine whether or not the function defined by
\[ f(x) = \frac{1}{x} \]
is continuous at $x = 0$.
The function defines a pair of hyperbolas and is shown in Fig. 2.12(b). The function $f(x)$ is not continuous at $x = 0$ because $\lim_{x \to 0} f(x)$ does not exist.
Fig. 2.12. Functions of one variable. (a) $f(x) = 1$ when $x > 0$ and $f(x) = 0$ when $x \le 0$, (b) $f(x) = 1/x$, (c) $f(x) = (\sin x)/x$ when $x \ne 0$ and $f(x) = 1$ when $x = 0$, (d) $f(x) = 0$ when $x \ne 0$ and $f(x) = 1$ when $x = 0$.
Example:
Determine whether or not the function defined by
\[ f(x) = \begin{cases} \dfrac{\sin x}{x} & x \ne 0 ; \\ 1 & x = 0 \end{cases} \]
is continuous at $x = 0$.
The function is shown in Fig. 2.12(c). The function $f(x)$ is continuous at $x = 0$ because
\[ \lim_{x \to 0} f(x) = 1 = f(0) . \]
Note that the limit can be found without using the special definition of the function at $x = 0$. This is because, for small $x$,
\[ \sin x \approx x - \frac{x^3}{3!} + \cdots \]
and so
\[ \frac{\sin x}{x} = 1 - \frac{x^2}{6} + \cdots \to 1 \quad \text{as } x \to 0 . \]
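The limit can also be observed numerically. The sketch below evaluates the function on both sides of the origin and compares it with the two-term approximation $1 - x^2/6$; the step sizes are arbitrary choices made for illustration.

```python
import math

def f(x):
    """(sin x)/x for x != 0, with the value 1 assigned at x = 0."""
    return math.sin(x) / x if x != 0 else 1.0

rows = []
for h in (1e-1, 1e-2, 1e-3):
    # Values just above and just below 0, and the series estimate.
    rows.append((h, f(h), f(-h), 1 - h * h / 6))

for h, right, left, series in rows:
    print(h, right, left, series)
```

The values from the left and from the right agree, approach $f(0) = 1$ as the step shrinks, and stay close to $1 - x^2/6$.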
Example:
Determine whether or not the function defined by
\[ f(x) = \begin{cases} 0 & \text{when } x \ne 0 ; \\ 1 & \text{when } x = 0 \end{cases} \]
is continuous at $x = 0$.
The function is shown in Fig. 2.12(d). The function is clearly not continuous at $x = 0$ because
\[ \lim_{x \to 0} f(x) = 0 \]
as approached from both sides but $f(0) \ne 0$.

2.3.2 Limits and Continuity for Functions of Two Variables
We wish to define continuity of the function $f(x, y)$ at $(x, y) = (0, 0)$.

Note: In two dimensions we do not (as in one dimension) just approach the origin from the left and right. For continuity we now require there to be the same limit as the origin is approached in all directions (see Fig. 2.13).
Fig. 2.13. To examine the continuity at a point we have to have the same limit as the point is approached from all directions, including (a) radial directions and (b) tangential directions.

Note: With functions of one variable we consider smaller and smaller intervals about the origin. With functions of two variables we have to replace intervals by discs and let the radius, $r = \sqrt{x^2 + y^2} \to 0$. We want all values of the function $f$ inside the disc to tend to the same limit.
Therefore we wish to say that the function $f(x, y)$ is continuous at the point $(0, 0)$ provided the values of $f(x, y)$ in a disc of radius $d$ approach the value of $f(0, 0)$ as $d \to 0$; we define $(x, y) \to (0, 0)$ to mean $r = \sqrt{x^2 + y^2} \to 0$.

Definition: The function $f(x, y)$ is said to be joint continuous at the point $(x, y) = (x_0, y_0)$ if:
1) $f(x, y)$ is defined at $(x_0, y_0)$
2) $\lim_{(x,y) \to (x_0, y_0)} f(x, y)$ exists
3) $\lim_{(x,y) \to (x_0, y_0)} f(x, y) = f(x_0, y_0)$.
A function that is joint continuous at every point $(x, y)$ is said to be a joint continuous function.
In summary, for joint continuity check to see if:
\[ \lim_{(x,y) \to (x_0, y_0)} f(x, y) = f(x_0, y_0) . \]
Example:
Show that the function $f(x, y) = x^2 - y^2$ is (joint) continuous at (0, 0).
In polar coordinates, x = r cos θ, y = r sin θ, we have
\[ f = r^2 (\cos^2\theta - \sin^2\theta) \]
where r becomes the radius of the disc discussed above. Therefore
\[ \lim_{r \to 0} f = 0 = f(0, 0) \]
independent of θ and therefore the function is joint continuous. Note that this
function represents a saddle and therefore it is intuitively obvious that it is continuous.
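The polar-coordinate argument says that |f| is at most r² on the circle of radius r, whatever the angle. A rough numerical sketch of this (the helper name `max_abs_on_circle` and the sampling density are assumptions made for illustration):

```python
import math


def f(x: float, y: float) -> float:
    return x * x - y * y


def max_abs_on_circle(r: float, samples: int = 360) -> float:
    """Largest |f| found at equally spaced angles on the circle of radius r."""
    best = 0.0
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        best = max(best, abs(f(r * math.cos(theta), r * math.sin(theta))))
    return best


# As the disc shrinks, the largest value of |f| on its boundary shrinks like r^2.
for r in (1.0, 0.1, 0.01, 0.001):
    print(f"r = {r:<6} max |f| = {max_abs_on_circle(r):.3e}   r^2 = {r * r:.3e}")
```

Sampling only illustrates the bound; the proof is the identity f = r²(cos²θ − sin²θ).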
Example:
Determine the continuity of the function defined by
\[ f(x, y) = \begin{cases} 1 & \text{for } y = 0 \ (x\text{-axis}) ; \\ 1 & \text{for } x = 0 \ (y\text{-axis}) ; \\ 0 & \text{away from both axes.} \end{cases} \]
The values of the function in the x-y plane are shown in Fig. 2.14. Note that
if we keep y fixed equal to 0 and vary x then f(x, 0) = 1 for all x. Therefore f(x, y) is
continuous in x alone. Similarly if we keep x fixed equal to 0 and vary y then
f(0, y) = 1 for all y. Therefore f(x, y) is continuous in y alone. But f(x, y) = 0
anywhere off the two axes. Therefore the function is continuous in each variable
separately. But the function f(x, y) is not jointly continuous. This is because
in any disc around the origin (see Fig. 2.14) the function always takes both the values zero
and one no matter how small the disc is.
This function is not joint continuous but separately continuous, i.e. the functions f(x, 0) and f(0, y) are continuous functions of one variable.
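The disc argument can be made concrete with a short sketch (an illustration only; the choice of sample points is an assumption): however small the disc, it contains a point on an axis, where f = 1, and a point off both axes, where f = 0.

```python
def f(x: float, y: float) -> float:
    """1 on either coordinate axis, 0 everywhere else."""
    return 1.0 if x == 0.0 or y == 0.0 else 0.0


for radius in (1.0, 1e-3, 1e-9):
    on_axis = f(radius / 2, 0.0)          # lies on the x-axis, inside the disc
    off_axis = f(radius / 2, radius / 2)  # lies off both axes, inside the disc
    print(f"radius = {radius:g}: f on axis = {on_axis}, f off axes = {off_axis}")
```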
In order to determine whether or not functions are joint continuous, there are
two procedures to try:
1) Write the function in polar coordinates and see if f → L as r → 0
independent of θ. If it does, and L equals the value of f at the point, then f is joint continuous there.
2) Try substitutions of the type $x = y^n$ or $y = x^n$ to see if f(x, y) has different
limits along different curves, in which case it is not continuous.
Example:
Determine the continuity of the function defined by
\[ f(x, y) = \begin{cases} \dfrac{x + y}{(x^2 + y^2)^{1/4}} & (x, y) \neq (0, 0) ; \\ 0 & (x, y) = (0, 0) . \end{cases} \]
In polar coordinates we have
\[ x^2 + y^2 = r^2 (\cos^2\theta + \sin^2\theta) = r^2 , \qquad (x^2 + y^2)^{1/4} = (r^2)^{1/4} = \sqrt{r} \]
and hence
\[ f = \frac{r (\cos\theta + \sin\theta)}{\sqrt{r}} = \sqrt{r}\, (\cos\theta + \sin\theta) \to 0 \quad \text{as } r \to 0 . \]
Therefore f is joint continuous at (0, 0).
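Since |cos θ + sin θ| ≤ √2, the polar form gives the bound |f| ≤ √(2r) on the circle of radius r. A minimal numerical sketch of that bound (helper names and sample count are assumptions for illustration):

```python
import math


def f(x: float, y: float) -> float:
    if x == 0.0 and y == 0.0:
        return 0.0
    return (x + y) / (x * x + y * y) ** 0.25


def max_abs_on_circle(r: float, samples: int = 720) -> float:
    """Largest |f| found at equally spaced angles on the circle of radius r."""
    best = 0.0
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        best = max(best, abs(f(r * math.cos(theta), r * math.sin(theta))))
    return best


for r in (1.0, 1e-2, 1e-4, 1e-6):
    print(f"r = {r:g}: max |f| = {max_abs_on_circle(r):.6f}, bound sqrt(2r) = {math.sqrt(2 * r):.6f}")
```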
Fig. 2.14. The function f(x, y) = 1 when x = 0 or y = 0, but 0 elsewhere.
Example:
Determine the continuity of the function defined by
\[ f(x, y) = \begin{cases} \dfrac{2xy}{x^2 + y^2} & (x, y) \neq (0, 0) ; \\ 0 & (x, y) = (0, 0) . \end{cases} \]
First method:
In polar coordinates we have
\[ f = \frac{2 r^2 \cos\theta \sin\theta}{r^2 (\cos^2\theta + \sin^2\theta)} = \sin 2\theta \]
away from the origin.
Therefore, as r → 0, f does not tend to 0 in all directions. Therefore the function is not joint
continuous.
For example, if we approach the origin along the line θ = π/4, then
$f|_{\theta = \pi/4} = \sin(\pi/2) = 1$. Therefore f = 1 everywhere along this line, except at the origin
where it is equal to zero. Therefore the function is not joint continuous at (0, 0).
Second method:
We proceed by letting y = mx and see how the function changes as we
approach the origin along this line. We have
\[ f = \frac{2 m x^2}{x^2 + m^2 x^2} = \frac{2m}{1 + m^2} = \text{constant} . \]
Therefore f is a constant along y = mx everywhere except at the origin (note
that we divided by $x^2$ in the above equation), and this constant depends on m. Therefore f is not joint continuous
at (0, 0).
Note: In this example if we keep y = 0 fixed, then f(x, 0) = 0 everywhere
along the x-axis. Therefore f is continuous in x alone. Similarly f is continuous
in y alone. Therefore f is separately continuous in x alone and y alone, but not
jointly continuous in x and y.
Remark: Sums and products of continuous functions are always continuous.
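The substitution y = mx used for f = 2xy/(x² + y²) can be tried numerically. In this minimal sketch (the helper `value_along_line` is a hypothetical name), the value along each line stays at 2m/(1 + m²) however close to the origin we go, so different lines give different limits.

```python
def f(x: float, y: float) -> float:
    if x == 0.0 and y == 0.0:
        return 0.0
    return 2.0 * x * y / (x * x + y * y)


def value_along_line(m: float, x: float) -> float:
    """Value of f at the point (x, m*x) on the line y = m*x."""
    return f(x, m * x)


for m in (0.0, 0.5, 1.0, -1.0, 2.0):
    values = [value_along_line(m, x) for x in (1e-1, 1e-3, 1e-6)]
    print(f"m = {m:+.1f}: values near the origin = {values}, 2m/(1+m^2) = {2 * m / (1 + m * m):.4f}")
```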
Remark: Another way of expressing joint continuity is as follows. If we write
\[ \Delta f = f(x + h, y + k) - f(x, y) \]
then joint continuity at the point (x, y) amounts to
\[ \lim_{(h,k) \to (0,0)} (\Delta f) = 0 . \]
Example:
Determine the continuity of the function defined by
\[ f(x, y) = \begin{cases} \dfrac{x^2 y}{x^4 + y^2} & (x, y) \neq (0, 0) ; \\ 0 & (x, y) = (0, 0) . \end{cases} \]
First method:
In polar coordinates we have
\[ f = \frac{r^3 \cos^2\theta \sin\theta}{r^4 \cos^4\theta + r^2 \sin^2\theta} . \]
Therefore it is not quite clear what happens as r → 0. So we must try the second
method.
Second method:
We proceed by letting $y = x^2$ and see how the function changes as we approach
the origin along this curve. We have
\[ f = \frac{x^2 \cdot x^2}{x^4 + (x^2)^2} = \frac{x^4}{2 x^4} = \frac{1}{2} . \]
Therefore f reduces to a constant along this curve, and this constant differs from f(0, 0) = 0. Therefore f is not jointly
continuous. We can draw a disc around the origin to see this.
Remark: We can in general prove that
\[ \text{joint continuity} \;\Rightarrow\; \text{separate continuity} \]
but that in general
\[ \text{separate continuity} \;\not\Rightarrow\; \text{joint continuity} . \]
Remark: All functions of the form
\[ f(x, y) = \begin{cases} \dfrac{x^{n/2} y^{m/2}}{x^n + y^m} & (x, y) \neq (0, 0) ; \\ 0 & (x, y) = (0, 0) \end{cases} \]
are not joint continuous as letting $y = x^{n/m}$ easily demonstrates.
2.4 Partial Differentiation
For functions of one variable: y = f(x). In this case the derivative at a point is
the gradient of the tangent to the curve at that point (see Fig. 2.15).
Fig. 2.15. The tangent to the curve y = f(x) at the point x = x0.
For functions of two variables: z = f(x, y). However, there exist an infinite
number of tangents that can be drawn at a point (x0, y0) (see Fig. 2.16).
So how can we define a derivative in this case? The answer is that we define
a partial derivative. The idea is as follows: As we have seen, the function
z = f(x, y) can be represented as a surface. If we draw a plane through (x0, y0)
parallel to the x-z plane, then the two surfaces intersect in a curve for which y
is fixed. We can then draw a tangent to this curve at the point (x0, y0) which is
unique (see Fig. 2.17). This is the partial derivative with respect to x for fixed y.
Similarly one can draw a plane parallel to the z-y plane and define the partial
derivative with respect to y for fixed x.
Fig. 2.16. There is an infinite number of tangents to the surface z = f(x, y) at the point (x0, y0).
So, if we fix y = y0 in f(x, y) and let x vary, then f(x, y0) depends only on x.
Its derivative with respect to x is called the partial derivative with respect to x
and is defined as
\[ \lim_{h \to 0} \frac{f(x + h, y_0) - f(x, y_0)}{h} . \]
Provided this limit exists, this partial derivative is denoted by
\[ \frac{\partial f}{\partial x} \quad \text{or} \quad f_x . \]
Fig. 2.17. The tangent to the surface z = f(x, y) at the point (x0, y0) parallel to the x-z plane.
Similarly the derivative of f(x0, y) with respect to y is defined as:
\[ \frac{\partial f}{\partial y} \equiv f_y = \lim_{k \to 0} \frac{f(x_0, y + k) - f(x_0, y)}{k} \]
and similarly for functions of three or more variables; we define partial derivatives by holding all but ONE variable fixed.
For example, if f = f(x, y, z) then
\[ f_x \equiv \frac{\partial f}{\partial x} \quad (y, z \text{ constant}) , \qquad f_y \equiv \frac{\partial f}{\partial y} \quad (x, z \text{ constant}) , \qquad f_z \equiv \frac{\partial f}{\partial z} \quad (x, y \text{ constant}) . \]
Remark: The symbol ∂ is the “curly d” and is NOT the same as d. Also, we
cannot treat ∂f/∂x as a ratio as in some sense we can do with df/dx.
Remark: Calculations are as easy (or even easier!) as for functions of one
variable.
Example:
\[ f(x, y) = x^2 + y^2 \;\Rightarrow\; f_x = 2x , \quad f_y = 2y \]
Example:
\[ f(x, y, z) = x y^2 z^3 \;\Rightarrow\; f_x = y^2 z^3 , \quad f_y = 2 x y z^3 , \quad f_z = 3 x y^2 z^2 \]
Example:
\[ f(x, y) = e^{(x^2 + y^2)} \;\Rightarrow\; f_x = 2x\, e^{(x^2 + y^2)} , \quad f_y = 2y\, e^{(x^2 + y^2)} \]
Example:
\[ f(x, y) = \frac{1}{x + y} \;\Rightarrow\; f_x = \frac{-1}{(x + y)^2} , \quad f_y = \frac{-1}{(x + y)^2} \]
Note: Partial derivatives are themselves functions of x and y, or x, y and z, etc.
Therefore we can take further partial derivatives. So we define:
\[ \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial x} \right) = \frac{\partial^2 f}{\partial x^2} \equiv (f_x)_x = f_{xx} \]
\[ \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial x} \right) = \frac{\partial^2 f}{\partial y\, \partial x} \equiv (f_x)_y = f_{xy} \]
\[ \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial y} \right) = \frac{\partial^2 f}{\partial x\, \partial y} \equiv (f_y)_x = f_{yx} \]
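where $f_{xy}$, $f_{yx}$, etc. are called mixed partial derivatives. Note that $f_{xy}$ means
(partial) differentiate with respect to x first and then with respect to y.
Example:
Given the function
\[ f(x, y) = \sin(x - 2y^2) \]
find $f_x$, $f_y$, $f_{xx}$, $f_{yy}$, $f_{xy}$ and $f_{yx}$.
\[ f_x = \frac{\partial f}{\partial x} = \cos(x - 2y^2) \quad \text{(keeping } y \text{ constant)} \]
\[ f_y = \frac{\partial f}{\partial y} = -4y \cos(x - 2y^2) \quad \text{(keeping } x \text{ constant)} \]
\[ f_{xx} = \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial x} \right) = \frac{\partial^2 f}{\partial x^2} = -\sin(x - 2y^2) \]
\[ f_{yy} = \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial y} \right) = \frac{\partial^2 f}{\partial y^2} = -4 \cos(x - 2y^2) + 4y(-4y) \sin(x - 2y^2) \]
\[ f_{xy} = \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial x} \right) = \frac{\partial^2 f}{\partial y\, \partial x} = 4y \sin(x - 2y^2) \]
\[ f_{yx} = \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial y} \right) = \frac{\partial^2 f}{\partial x\, \partial y} = 4y \sin(x - 2y^2) . \]
Note: In this example $f_{xy} = f_{yx}$. This is not a coincidence. It turns out that
the mixed derivatives $f_{xy}$ and $f_{yx}$ are equal for most functions in practice. The condition for
equality involves the concept of joint continuous functions.
Suppose x = x0 + h and y = y0 + k. Then
\[ f(x, y) - f(x_0, y_0) = \Delta f \quad \text{and} \quad \Delta f = f(x_0 + h, y_0 + k) - f(x_0, y_0) . \]
Joint continuity means that Δf → 0 as (h, k) → (0, 0) where (h, k) may approach
(0, 0) along any curve. Furthermore, sums and products of joint continuous
functions are joint continuous.
Theorem: If at a point (x, y) both $f_{xy}$ and $f_{yx}$ exist (for a function f(x, y)) and
are joint continuous, then we have $f_{xy} = f_{yx}$.
We can also define higher partial derivatives. For example,
\[ f_{xxx} \equiv \frac{\partial}{\partial x} \left( \frac{\partial^2 f}{\partial x^2} \right) = \frac{\partial^3 f}{\partial x^3} \]
\[ f_{xxy} \equiv \frac{\partial}{\partial y} \left( \frac{\partial^2 f}{\partial x^2} \right) = \frac{\partial^3 f}{\partial y\, \partial x^2} = f_{yxx} = f_{xyx} . \]
Assuming joint continuity of higher order derivatives, similar theorems hold in
higher dimensions. For example, if f = f(x, y, z) and we assume joint continuity
of the derivatives of the function f, then
\[ f_{xxyyz} = f_{xyxyz} = f_{zxxyy} = \cdots \ \text{etc.} \]
Therefore provided the conditions of joint continuity are met the operational
order of differentiation does not matter.
Example:
Show that (i) $f(x, y, z) = x^2 + y^2 - 2z^2$ and (ii) $f(x, y, z) = e^{(3x + 4y)} \cos 5z$
satisfy Laplace’s equation
\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 0 . \]
(i) We have
\[ \frac{\partial f}{\partial x} = 2x , \quad \frac{\partial^2 f}{\partial x^2} = 2 , \qquad \frac{\partial f}{\partial y} = 2y , \quad \frac{\partial^2 f}{\partial y^2} = 2 , \qquad \frac{\partial f}{\partial z} = -4z , \quad \frac{\partial^2 f}{\partial z^2} = -4 , \]
\[ \Rightarrow \quad \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 2 + 2 - 4 = 0 . \]
(ii) We have
\[ \frac{\partial f}{\partial x} = 3 e^{(3x + 4y)} \cos 5z , \qquad \frac{\partial^2 f}{\partial x^2} = 9 e^{(3x + 4y)} \cos 5z = 9f , \]
\[ \frac{\partial f}{\partial y} = 4 e^{(3x + 4y)} \cos 5z , \qquad \frac{\partial^2 f}{\partial y^2} = 16 e^{(3x + 4y)} \cos 5z = 16f , \]
\[ \frac{\partial f}{\partial z} = -5 e^{(3x + 4y)} \sin 5z , \qquad \frac{\partial^2 f}{\partial z^2} = -25 e^{(3x + 4y)} \cos 5z = -25f , \]
\[ \Rightarrow \quad \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 9f + 16f - 25f = 0 . \]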
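Both parts of the Laplace example can be sanity-checked with finite differences. This is only a rough sketch under stated assumptions: the helper `laplacian`, the step size h and the test points are illustrative choices, and a central-difference estimate is approximate, so the result for (ii) is small rather than exactly zero.

```python
import math


def laplacian(f, x: float, y: float, z: float, h: float = 1e-3) -> float:
    """Central-difference estimate of f_xx + f_yy + f_zz at (x, y, z)."""
    neighbours = (
        f(x + h, y, z) + f(x - h, y, z)
        + f(x, y + h, z) + f(x, y - h, z)
        + f(x, y, z + h) + f(x, y, z - h)
    )
    return (neighbours - 6.0 * f(x, y, z)) / (h * h)


def f_i(x: float, y: float, z: float) -> float:
    return x * x + y * y - 2.0 * z * z


def f_ii(x: float, y: float, z: float) -> float:
    return math.exp(3.0 * x + 4.0 * y) * math.cos(5.0 * z)


print("(i)  estimated Laplacian:", laplacian(f_i, 0.3, -0.2, 0.5))
print("(ii) estimated Laplacian:", laplacian(f_ii, 0.1, 0.1, 0.2))
print("(ii) size of one term, 25 f:", 25.0 * f_ii(0.1, 0.1, 0.2))
```

The estimate for (ii) should be compared with the size of the individual terms 9f, 16f and 25f, which are of order ten at the chosen point.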
2.5 Joint Differentiability
Again we start with the idea of differentiability for functions of one variable.
Consider a function y = f(x) and two neighbouring points x = a and x = a + h
(see Fig. 2.18).
Fig. 2.18. The approximation to the slope at the point x = a on the curve y = f(x).
The question is: Does the slope
\[ \frac{f(a + h) - f(a)}{h} \]
tend towards a definite limit A? If so, A is the required tangent slope (the
derivative) at x = a and the function is said to be differentiable at x = a.
Alternatively we can write
\[ \frac{f(a + h) - f(a)}{h} - A = \rho \]
where ρ determines how well the slope approximates A (ρ can be thought of as the
“error” in the slope). This leads to the alternative definition of differentiability:
Definition: Let f = f(x) be a function and let Δf = f(a + h) − f(a). The
function f(x) is differentiable at x = a provided
\[ \Delta f = A h + \rho h \]
such that ρ → 0 as h → 0.
Therefore differentiability in one dimension implies that the tangent line with
slope
\[ A = \left. \frac{df}{dx} \right|_{x = a} \]
gives an excellent approximation to the curve near x = a (see Fig. 2.19).
Fig. 2.19. The gradient of the tangent at the point x = a on the curve y = f(x).
Note the form of the functions shown in Fig. 2.20. In each case there are
no tangents to the curve at x = x0. This implies that these functions are not
differentiable.
Fig. 2.20. Two functions which are not differentiable.
When we are dealing with functions of two variables we require a generalisation of this idea and we expect that joint differentiability of a function at a point
(x0, y0) implies that the surface z = f(x, y) has a tangent plane at (x0, y0) and
near this point the tangent plane gives an excellent approximation to the surface
(see Fig. 2.21).
Fig. 2.21. The tangent plane to a surface z = f(x, y) at the point (x0, y0).
The equation of the plane passing through the point P(x0, y0, z0) is:
\[ A(x - x_0) + B(y - y_0) + C(z - z_0) = 0 . \]
Dividing by C this can be rearranged to give
\[ z - z_0 = A'(x - x_0) + B'(y - y_0) \]
where A′ = −A/C and B′ = −B/C. For this plane to represent the tangent plane
at (x0, y0), its intersection with the plane y = y0 must be the tangent line with
gradient given by the partial derivative
\[ f_x \big|_{(x_0, y_0)} \]
by definition. Similarly the intersection with the plane x = x0 must be the
tangent line with gradient given by the partial derivative
\[ f_y \big|_{(x_0, y_0)} . \]
Therefore the equation of the tangent plane to the surface z = f(x, y) at
(x0, y0, z0) is:
\[ z - z_0 = (x - x_0)\, f_x \big|_{(x_0, y_0)} + (y - y_0)\, f_y \big|_{(x_0, y_0)} . \]
Now we can consider the concept of joint differentiability in the case of functions
of two variables.
Definition: The function f(x, y) is jointly differentiable at (a, b) provided
\[ \Delta f = f(a + h, b + k) - f(a, b) = A h + B k + h \rho_1 + k \rho_2 \tag{2.1} \]
such that
\[ \lim_{(h,k) \to (0,0)} \rho_1 = 0 \quad \text{and} \quad \lim_{(h,k) \to (0,0)} \rho_2 = 0 \]
and where (i) ρ1 and ρ2 depend on h, k, a, b (ρ1 and ρ2 are “error” terms) and
(ii) the first and second terms in Eq. (2.1) represent the tangent plane itself:
\[ \Rightarrow \quad A = f_x , \qquad B = f_y . \]
It follows from Eq. (2.1) that
\[ \lim_{(h,k) \to (0,0)} \Delta f = 0 . \]
Therefore:
\[ \text{joint differentiability} \;\Rightarrow\; \text{joint continuity} . \]
The opposite need not be true. Note that even in functions of one variable continuity does not necessarily imply differentiability (see left-hand plot in Fig. 2.20).
Note: A function can have both partial derivatives but fail to be even joint
continuous.
Therefore the existence of partial derivatives is not enough for joint differentiability (or even joint continuity). The condition for this is given by:
Theorem: If $f_x$ and $f_y$ both exist and are both joint continuous then the function
f is joint differentiable.
Another way of writing the joint differentiability condition is to set (for increments)
\[ h = \delta x , \qquad k = \delta y \qquad \text{and} \qquad \Delta f = \delta f . \]
Hence Eq. (2.1) becomes
\[ \delta f = \frac{\partial f}{\partial x} \delta x + \frac{\partial f}{\partial y} \delta y + (\rho_1 \delta x + \rho_2 \delta y) \]
and so
(Total change in f) = (change due to δx) + (change due to δy) + (error terms).
2.6 Directional Derivative
Recall that partial derivatives are obtained by letting only one of the variables
vary. Therefore partial derivatives are derivatives along the x and y directions
respectively (see Fig. 2.22).
Fig. 2.22. Partial derivatives as derivatives along the x and y directions.
Suppose we wish to find derivatives in any direction making an angle α with
the x-axis (see Fig. 2.23). Then there are variations in both directions with
\[ \delta x = t \cos\alpha , \qquad \delta y = t \sin\alpha . \]
Fig. 2.23. The components of the length t along the x and y axes.
Definition: The directional derivative of a function f along a direction making
an angle α with the x-axis is given by:
\[ D_\alpha f = \lim_{t \to 0} \frac{\delta f}{t} . \]
Substituting for δf given by
\[ \delta f = \frac{\partial f}{\partial x} \delta x + \frac{\partial f}{\partial y} \delta y + (\rho_1 \delta x + \rho_2 \delta y) \]
and for δx and δy gives
\[ \delta f = \frac{\partial f}{\partial x}\, t \cos\alpha + \frac{\partial f}{\partial y}\, t \sin\alpha + \rho_1 t \cos\alpha + \rho_2 t \sin\alpha . \]
Hence
\[ D_\alpha f = \lim_{t \to 0} \frac{\delta f}{t} = \cos\alpha\, \frac{\partial f}{\partial x} + \sin\alpha\, \frac{\partial f}{\partial y} \]
where ρ1 and ρ2 → 0 as t → 0.
Note: Clearly when α = 0, $D_\alpha f$ = ∂f/∂x and when α = π/2, $D_\alpha f$ = ∂f/∂y.
Example:
Calculate $D_{\pi/4} f$ for $f = \sin(x + y^2)$.
\[ \begin{aligned} D_{\pi/4} f &= \cos\frac{\pi}{4}\, \frac{\partial f}{\partial x} + \sin\frac{\pi}{4}\, \frac{\partial f}{\partial y} \\ &= \frac{1}{\sqrt{2}} \cos(x + y^2) + \frac{1}{\sqrt{2}} (2y) \cos(x + y^2) \\ &= \left( \frac{1}{\sqrt{2}} + \sqrt{2}\, y \right) \cos(x + y^2) . \end{aligned} \]
We can also generalise the expression
\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} \]
which is the expression for differentiating a function of a function of one variable.
Here, instead of a straight line at an angle α, consider the change in f(x, y) along
a curve given by x = x(t), y = y(t) (i.e. a parametric curve).
For a general curve defined by x(t) and y(t) the value of t defines the point
along the curve (see Fig. 2.24).
Fig. 2.24. The parametric curve where x = x(t) and y = y(t). The value of t
determines the location along the curve.
For example, a circle of radius a can be defined parametrically by
\[ x = x(t) = a \cos t , \qquad y = y(t) = a \sin t \]
where 0 ≤ t ≤ 2π. Note that $x^2 + y^2 = a^2$, the square of the radius.
Theorem: Consider a joint differentiable function f(x, y) along the curve x =
x(t), y = y(t), where x and y are differentiable with respect to t. Then f(x(t), y(t))
depends on t only and we have:
\[ \frac{df}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} \]
This represents the derivative of the function f(x, y) along the parametric curve
(x(t), y(t)) (see Fig. 2.25).
Proof: For a small change t → t + δt, we have a small change δx in x and δy
in y. Using the one variable formulae:
\[ \delta x = \frac{dx}{dt} \delta t + \text{error} , \qquad \delta y = \frac{dy}{dt} \delta t + \text{error} \]
Then the change in f becomes
\[ \begin{aligned} \delta f &= \frac{\partial f}{\partial x} \delta x + \frac{\partial f}{\partial y} \delta y + \text{error terms} \\ &= \frac{\partial f}{\partial x} \left( \frac{dx}{dt} \delta t + \text{error} \right) + \frac{\partial f}{\partial y} \left( \frac{dy}{dt} \delta t + \text{error} \right) . \end{aligned} \]
Therefore
\[ \frac{\delta f}{\delta t} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} + \text{“error terms.”} \]
In the limit as δt → 0 the “error terms” → 0. Hence
\[ \lim_{\delta t \to 0} \frac{\delta f}{\delta t} = \frac{df}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} . \]
Note: df/dt means “rate of change of height z = f(x(t), y(t)) as t varies along
the curve.”
Note: There exist two ways of finding df/dt; either direct substitution of x(t)
and y(t) in f(x, y) or by the use of the above formula.
Example:
Find the derivative of $f(x, y) = x y^3$ along the curve x(t) = cos t, y(t) = sin t.
We have
\[ \frac{\partial f}{\partial x} = y^3 = \sin^3 t , \qquad \frac{\partial f}{\partial y} = 3 x y^2 = 3 \cos t \sin^2 t , \qquad \frac{dx}{dt} = -\sin t , \qquad \frac{dy}{dt} = \cos t \]
and hence, from the formula,
\[ \frac{df}{dt} = (\sin^3 t)(-\sin t) + (3 \cos t \sin^2 t)(\cos t) . \]
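The two routes to df/dt mentioned in the Note, the formula and direct substitution, can be compared numerically for the example f = xy³ on the curve (cos t, sin t). A minimal sketch follows; the function names and the finite-difference step are assumptions made for illustration, and the substitution route is differentiated numerically rather than by hand.

```python
import math


def f(x: float, y: float) -> float:
    return x * y ** 3


def dfdt_formula(t: float) -> float:
    """df/dt from f_x dx/dt + f_y dy/dt on the curve x = cos t, y = sin t."""
    x, y = math.cos(t), math.sin(t)
    f_x = y ** 3
    f_y = 3.0 * x * y ** 2
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return f_x * dx_dt + f_y * dy_dt


def dfdt_substitution(t: float, dt: float = 1e-6) -> float:
    """Central difference of F(t) = f(cos t, sin t), i.e. direct substitution."""
    def big_f(s: float) -> float:
        return f(math.cos(s), math.sin(s))
    return (big_f(t + dt) - big_f(t - dt)) / (2.0 * dt)


for t in (0.3, 1.0, 2.5):
    print(f"t = {t}: formula = {dfdt_formula(t):.8f}, substitution = {dfdt_substitution(t):.8f}")
```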
Fig. 2.25. The derivative of the function f(x, y) along the parametric curve (x(t), y(t)).
2.7 Chain Rule
Often we wish to express our functions in terms of new variables, say f = f(x, y),
x = x(r, θ) = r cos θ, y = y(r, θ) = r sin θ. How do we find ∂f/∂r, ∂f/∂θ in terms
of ∂f/∂x, ∂f/∂y or vice versa?
Suppose in general that x and y are functions of two variables s and t:
\[ x = x(t, s) , \qquad y = y(t, s) . \]
Keeping s fixed (s = s0), these equations define a curve:
\[ x = x(t, s_0) , \qquad y = y(t, s_0) . \]
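By the previous theorem we can find df/dt which is the derivative along
the curve. We have
\[ \left. \frac{df}{dt} \right|_{s = s_0} = \frac{\partial f}{\partial x} \left. \frac{dx}{dt} \right|_{s = s_0} + \frac{\partial f}{\partial y} \left. \frac{dy}{dt} \right|_{s = s_0} . \]
But calculating the total derivative of x(t, s) and y(t, s) with respect to t while
keeping s at a fixed value is equivalent to taking the partial derivative. Hence
\[ \frac{\partial f}{\partial t} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial t} . \]
Similarly,
\[ \frac{\partial f}{\partial s} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial s} . \]
In both these expressions the ∂f/∂x term represents the change in f due to x
changing and the ∂f/∂y term represents the change in f due to y changing.
Note: This is the generalisation of
\[ \frac{df}{dt} = \frac{df}{dx} \frac{dx}{dt} \]
in one dimension. Here we have two expressions, one for s and one for t.
Similarly we have the chain rule for functions of more than two variables.
Consider the example of a function f = f(x, y, z) where x = x(u, v, w), y =
y(u, v, w), z = z(u, v, w). The chain rule for a function of three variables gives:
\[ \frac{\partial f}{\partial u} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial u} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial u} + \frac{\partial f}{\partial z} \frac{\partial z}{\partial u} \]
\[ \frac{\partial f}{\partial v} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial v} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial v} + \frac{\partial f}{\partial z} \frac{\partial z}{\partial v} \]
\[ \frac{\partial f}{\partial w} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial w} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial w} + \frac{\partial f}{\partial z} \frac{\partial z}{\partial w} \]
Remark: Sometimes we can invert functions of x = x(t, s) and y = y(t, s)
and find t = t(x, y) and s = s(x, y). One such example is polar coordinates
r = r(x, y) and θ = θ(x, y). Then
\[ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial t} \frac{\partial t}{\partial x} + \frac{\partial f}{\partial s} \frac{\partial s}{\partial x} , \qquad \frac{\partial f}{\partial y} = \frac{\partial f}{\partial t} \frac{\partial t}{\partial y} + \frac{\partial f}{\partial s} \frac{\partial s}{\partial y} . \]
However, often it is easier to find ∂f/∂x and ∂f/∂y from the above formula for
∂f/∂t and ∂f/∂s.
Example:
Given x = x(r, θ) = r cos θ and y = y(r, θ) = r sin θ and f = f(x, y), find
expressions for ∂f/∂r and ∂f/∂θ.
Using the chain rule we have
\[ \frac{\partial f}{\partial r} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial r} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial r} = f_x \cos\theta + f_y \sin\theta \]
\[ \frac{\partial f}{\partial \theta} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial \theta} = f_x (-r \sin\theta) + f_y (r \cos\theta) \]
Therefore we have expressed ∂f/∂r and ∂f/∂θ in terms of ∂f/∂x and ∂f/∂y.
Remark: We can also use the chain rule to obtain higher derivatives. For
example, $\partial^2 f / \partial x^2$, $\partial^2 f / \partial y^2$, etc. in terms of (r, θ).
Example:
Express
\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \]
where f = f(x, y) in terms of polar coordinates (r, θ).
Recall that x = r cos θ, y = r sin θ. First we find ∂f/∂r and ∂f/∂θ in terms of
∂f/∂x and ∂f/∂y using the chain rule:
\[ \frac{\partial f}{\partial r} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial r} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial r} = \cos\theta\, \frac{\partial f}{\partial x} + \sin\theta\, \frac{\partial f}{\partial y} \tag{2.2} \]
\[ \frac{\partial f}{\partial \theta} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial \theta} = -r \sin\theta\, \frac{\partial f}{\partial x} + r \cos\theta\, \frac{\partial f}{\partial y} \tag{2.3} \]
Use these to find ∂f/∂x and ∂f/∂y in terms of ∂f/∂r and ∂f/∂θ. Multiply Eq. (2.2)
by cos θ and Eq. (2.3) by −sin θ/r and then add. This gives:
\[ (\cos^2\theta + \sin^2\theta)\, \frac{\partial f}{\partial x} = \cos\theta\, \frac{\partial f}{\partial r} - \frac{\sin\theta}{r} \frac{\partial f}{\partial \theta} \]
or
\[ \frac{\partial f}{\partial x} = \cos\theta\, \frac{\partial f}{\partial r} - \frac{\sin\theta}{r} \frac{\partial f}{\partial \theta} . \tag{2.4} \]
Similarly:
\[ \frac{\partial f}{\partial y} = \sin\theta\, \frac{\partial f}{\partial r} + \frac{\cos\theta}{r} \frac{\partial f}{\partial \theta} . \tag{2.5} \]
These expressions apply to any function f; we may therefore write them
symbolically as operators applied to any function:
\[ \frac{\partial}{\partial x} = \cos\theta\, \frac{\partial}{\partial r} - \frac{\sin\theta}{r} \frac{\partial}{\partial \theta} \tag{2.6} \]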
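\[ \frac{\partial}{\partial y} = \sin\theta\, \frac{\partial}{\partial r} + \frac{\cos\theta}{r} \frac{\partial}{\partial \theta} \tag{2.7} \]
Now applying the operator in Eq. (2.6) twice to f gives:
\[ \begin{aligned} \frac{\partial^2 f}{\partial x^2} \equiv \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial x} \right) &= \left( \cos\theta\, \frac{\partial}{\partial r} - \frac{\sin\theta}{r} \frac{\partial}{\partial \theta} \right) \frac{\partial f}{\partial x} \\ &= \cos\theta\, \frac{\partial}{\partial r} \left( \frac{\partial f}{\partial x} \right) - \frac{\sin\theta}{r} \frac{\partial}{\partial \theta} \left( \frac{\partial f}{\partial x} \right) \\ &= \cos\theta\, \frac{\partial}{\partial r} \left( \cos\theta\, \frac{\partial f}{\partial r} - \frac{\sin\theta}{r} \frac{\partial f}{\partial \theta} \right) - \frac{\sin\theta}{r} \frac{\partial}{\partial \theta} \left( \cos\theta\, \frac{\partial f}{\partial r} - \frac{\sin\theta}{r} \frac{\partial f}{\partial \theta} \right) . \end{aligned} \]
We can now perform the differentiation using the product rule and using the fact
that $f_{r\theta} = f_{\theta r}$. This gives:
\[ \frac{\partial^2 f}{\partial x^2} = \cos^2\theta\, \frac{\partial^2 f}{\partial r^2} - \frac{2 \sin\theta \cos\theta}{r} \frac{\partial^2 f}{\partial r\, \partial \theta} + \frac{\sin^2\theta}{r^2} \frac{\partial^2 f}{\partial \theta^2} + \frac{\sin^2\theta}{r} \frac{\partial f}{\partial r} + \frac{2 \sin\theta \cos\theta}{r^2} \frac{\partial f}{\partial \theta} . \]
Similarly we find
\[ \frac{\partial^2 f}{\partial y^2} = \sin^2\theta\, \frac{\partial^2 f}{\partial r^2} + \frac{2 \sin\theta \cos\theta}{r} \frac{\partial^2 f}{\partial r\, \partial \theta} + \frac{\cos^2\theta}{r^2} \frac{\partial^2 f}{\partial \theta^2} + \frac{\cos^2\theta}{r} \frac{\partial f}{\partial r} - \frac{2 \sin\theta \cos\theta}{r^2} \frac{\partial f}{\partial \theta} . \]
Adding these together gives
\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r} \frac{\partial f}{\partial r} + \frac{1}{r^2} \frac{\partial^2 f}{\partial \theta^2} . \]
2.8 Taylor’s Theorem for Functions of Two Variables
Here we consider a generalisation of Taylor’s theorem for functions of one
variable. This will be useful for finding stationary points of functions of two
variables.
First we introduce a new notation. We define a differential operator D̃ by
\[ \tilde{D} \equiv h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \]
where h and k are fixed constants. Therefore, if we apply the operator to a
function f it is taken to mean
\[ \tilde{D} f \equiv \tilde{D}(f) = \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right) f = h \frac{\partial f}{\partial x} + k \frac{\partial f}{\partial y} . \]
Similarly
\[ \begin{aligned} \tilde{D}^2 f \equiv \tilde{D}(\tilde{D} f) &= \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right) \left( h \frac{\partial f}{\partial x} + k \frac{\partial f}{\partial y} \right) \\ &= h^2 \frac{\partial^2 f}{\partial x^2} + hk \frac{\partial^2 f}{\partial x\, \partial y} + kh \frac{\partial^2 f}{\partial y\, \partial x} + k^2 \frac{\partial^2 f}{\partial y^2} . \end{aligned} \]
But the second and third terms in the last expression are equal and so
\[ \tilde{D}^2 f = h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy} . \]
Similarly we can define $\tilde{D}^3 f$, $\tilde{D}^4 f$, $\tilde{D}^n f$, etc.
Theorem: Assume the function f(x, y) has joint continuous partial derivatives
up to order n. Then
\[ f(a + h, b + k) = f(a, b) + (\tilde{D} f)_{(a,b)} + \frac{1}{2!} (\tilde{D}^2 f)_{(a,b)} + \cdots + \frac{1}{(n - 1)!} (\tilde{D}^{(n-1)} f)_{(a,b)} + E \]
where E is the small error term given by
\[ E = \frac{1}{n!} (\tilde{D}^n f)_{(a + \theta h,\, b + \theta k)} \]
for some θ with 0 < θ < 1.
Proof: Define a new function of one variable by
\[ F(t) = f(a + ht, b + kt) \]
where t is varying. Note that F(0) = f(a, b) and F(1) = f(a + h, b + k) and
\[ \begin{aligned} \frac{dF}{dt} &= \frac{d}{dt} [ f(a + ht, b + kt) ] \\ &= \frac{\partial f}{\partial x} \frac{d(a + ht)}{dt} + \frac{\partial f}{\partial y} \frac{d(b + kt)}{dt} \\ &= h \frac{\partial f}{\partial x} + k \frac{\partial f}{\partial y} . \end{aligned} \]
Similarly
\[ \frac{d^2 F}{dt^2} = \frac{d}{dt} (\tilde{D} f) = \tilde{D}(\tilde{D} f) = \tilde{D}^2 f \]
and
\[ \frac{d^n F}{dt^n} = \tilde{D}^n f . \]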
Now, by Taylor’s theorem for one variable, we have
\[ F(1) = F(0) + \frac{dF(0)}{dt} + \frac{1}{2!} \frac{d^2 F(0)}{dt^2} + \cdots + \frac{1}{(n - 1)!} \frac{d^{n-1} F(0)}{dt^{n-1}} + \frac{1}{n!} \frac{d^n F(\theta)}{dt^n} \]
where 0 < θ < 1. Substituting into the formula for F(1) and the expressions for
the higher derivatives we obtain the theorem.
2.9 Quadratic Forms
Consider a quadratic function:
\[ f(x, y) = a x^2 + 2b x y + c y^2 \]
where a, b and c are constants. Let
\[ \Delta = ac - b^2 \neq 0 . \]
Then:
(i) If Δ > 0, then a f(x, y) > 0 for all (x, y) ≠ (0, 0).
(ii) If Δ < 0, then f(x, y) can take either sign for values of (x, y) near (0, 0).
Proof: $\Delta = ac - b^2 > 0$ implies a ≠ 0. Consider
\[ a f(x, y) = a^2 x^2 + 2ab x y + ac y^2 . \]
We can “complete the square” and rewrite this as
\[ a f(x, y) = (ax + by)^2 + (ac - b^2) y^2 = (ax + by)^2 + \Delta y^2 . \]
But each of these two terms is non-negative, since Δ > 0 by assumption. Therefore
their sum can only vanish if x = y = 0; otherwise a f(x, y) > 0 which proves (i).
$\Delta = ac - b^2 < 0$ implies that we can no longer necessarily have a ≠ 0. If
a = c = 0, then f(x, y) = 2bxy which takes opposite signs when x = y and
x = −y. Now suppose a ≠ 0, say a > 0. Then the above expression for
y = 0, x ≠ 0, is $f = a x^2 > 0$. On the other hand for (x, y) ≠ (0, 0) on the line
ax + by = 0, $f = (\Delta / a) y^2 < 0$ (since Δ < 0 by assumption). If c ≠ 0, a similar
argument holds. Hence this proves (ii).
There are three important consequences of this:
1) If Δ > 0 and a > 0, then f > 0 for all (x, y) ≠ (0, 0) and the surface defined
by f is a minimum (see Fig. 2.26(a)).
2) If Δ > 0 and a < 0, then f < 0 for all (x, y) ≠ (0, 0) and the surface defined
by f is a maximum (see Fig. 2.26(b)).
3) If Δ < 0, then f takes either sign arbitrarily close to (0, 0) and the surface
defined by f is a saddle (see Fig. 2.26(c)).
Fig. 2.26. Examples of (a) a minimum, (b) a maximum and (c) a saddle point in the
surfaces defined by the function f(x, y).
If Δ = 0, we have proved nothing. One can in this case show that either
f(x, y) = 0 for all (x, y) (i.e. the surface is the horizontal plane) or the surface
resembles $f(x, y) = c y^2$. This corresponds to a flat-bottomed valley, which is neither a
maximum, a minimum nor a saddle point.
2.10 Stationary Points of Functions of Two Variables
For functions of one variable, f(x), the function is stationary at x = a whenever
$(df/dx)|_{x=a} = 0$ (or f′(a) = 0), i.e. the curve has a horizontal tangent.
The nature of stationary points is given by the quadratic term in Taylor’s
expansion:
\[ f(a + h) = f(a) + h f'(a) + \frac{h^2}{2!} f''(a) + E . \]
The term f″(a) determines the nature of the stationary point. If f″(a) > 0 then
the point is a minimum. If f″(a) < 0 then the point is a maximum.
For functions of two variables, f(x, y) has a maximum at the point (a, b)
whenever f(a + h, b + k) − f(a, b) < 0 for sufficiently small (h, k).
Note: Such a point is a maximum in x alone (with y fixed) and in y alone (with
x fixed). Hence ∂f/∂x = 0 = ∂f/∂y by the one variable theory. This suggests
the following definition:
Definition: The point (a, b) is a stationary point of the function f(x, y) whenever
\[ \frac{\partial f}{\partial x} = 0 = \frac{\partial f}{\partial y} \quad \text{at } (a, b) . \]
Now consider Taylor’s theorem for a function of two variables:
\[ f(a + h, b + k) = f(a, b) + h f_x + k f_y \]
\[ \qquad + \frac{1}{2!} \left( h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy} \right) + \text{Error} \]
Adopting the notation
\[ p = f_x , \quad q = f_y , \quad r = f_{xx} , \quad t = f_{yy} , \quad s = f_{xy} \]
we can write the Taylor expansion as
\[ f(a + h, b + k) = f(a, b) + ph + qk + \frac{1}{2!} \left( r h^2 + 2shk + t k^2 \right) + \text{Error} \]
where the stationary points occur where p = q = 0. Therefore, at a stationary
point,
\[ f(a + h, b + k) - f(a, b) = \frac{1}{2!} \left( r h^2 + 2hks + t k^2 \right) + \text{Error} . \]
Now, by definition, this is a maximum, minimum, or saddle according to whether
the quadratic term on the right-hand side is negative, positive or either for small
(h, k).
Using the properties of quadratic functions and analysing $(r h^2 + 2shk + t k^2)$
we conclude that:
Stationary points are given by p = q = 0 whenever the surface f(x, y) has horizontal tangents at (a, b) and there exist three possibilities as shown in Fig. 2.27.
Fig. 2.27. Examples of (a) a maximum, (b) a minimum and (c) a saddle at the point
(a, b) on the surfaces defined by the function f(x, y).
From this we conclude that stationary points are given by $p = f_x = 0$ and
$q = f_y = 0$. Their nature depends on $r = f_{xx}$, $s = f_{xy}$ and $t = f_{yy}$. Define the
quantity $\Delta = rt - s^2$. Then we have:
1) If Δ > 0 and r > 0, then the point is a minimum.
2) If Δ > 0 and r < 0, then the point is a maximum.
3) If Δ < 0, then the point is a saddle.
Example:
Find the stationary points and their nature for the function
\[ f(x, y) = x^4 + 4x^2 y^2 - 2x^2 + 2y^2 + 1 . \]
The stationary points are given by the solutions of
\[ p = f_x = 4x^3 + 8x y^2 - 4x = 4x (x^2 + 2y^2 - 1) = 0 \tag{2.8} \]
\[ q = f_y = 8x^2 y + 4y = 4y (1 + 2x^2) = 0 . \tag{2.9} \]
Since $1 + 2x^2 > 0$ Eq. (2.9) implies that y = 0. Then Eq. (2.8) gives $4x(x^2 - 1) = 0$
giving the stationary points at x = 0, x = ±1 at y = 0. The three stationary
points are (0, 0), (1, 0) and (−1, 0).
For the nature of the points we need to look at:
\[ r = f_{xx} = 12x^2 + 8y^2 - 4 , \qquad s = f_{xy} = 16xy , \qquad t = f_{yy} = 8x^2 + 4 , \qquad \Delta = rt - s^2 . \]
These give us the values shown in Table 2.1 below.
Table 2.1. The values of r, s, t and Δ at the three stationary points.
Point       r     s     t     Δ      Nature
(0, 0)     −4     0     4    −16     Saddle
(1, 0)      8     0    12     96     Minimum
(−1, 0)     8     0    12     96     Minimum
2.11 Lagrange Multipliers
Lagrange multipliers can be used to find the maximum and minimum of functions
f(x, y) subject to some constraint φ(x, y) = 0, where φ is a given function.
Note: Any condition such as $x^2 + y^2 = 1$ can be written as $\phi(x, y) = x^2 + y^2 - 1 = 0$.
In principle, we can solve φ(x, y) = 0 for y in terms of x and then find the
minimum or maximum of f(x, y(x)) as a function of one variable.
In practice this is often difficult or impossible, so we proceed as follows:
Consider the families of curves f(x, y) = c for different values of c. The
condition φ(x, y) = 0 also describes a fixed curve in the (x, y) plane. The
diagrams in Fig. 2.28 suggest that the function f attains a maximum or minimum
78
2 Functions of More than One Variable
2.11 Lagrange Multipliers
(subject to φ(x, y) = 0) precisely when the curve f = c ( f = c3 in the diagrams)
is tangent to the curve φ(x, y) = 0.
Therefore formally we can represent φ(x, y) = 0 parametrically as x = x(t),
y = y(t) so that φ(x(t), y(t)) = 0 for all t. Then f(x, y) is stationary whenever
\[ \frac{df}{dt} = 0 . \tag{2.10} \]
Also we have φ(x(t), y(t)) = 0 for all t. This implies (trivially by differentiation)
\[ \frac{d\phi}{dt} = 0 \tag{2.11} \]
so that at the stationary points we have:
\[ \frac{df}{dt} = f_x \frac{dx}{dt} + f_y \frac{dy}{dt} = 0 \tag{2.12} \]
\[ \frac{d\phi}{dt} = \phi_x \frac{dx}{dt} + \phi_y \frac{dy}{dt} = 0 \tag{2.13} \]
Hence at such points the following ratios all hold:
\[ \frac{f_x}{f_y} = \frac{\phi_x}{\phi_y} = -\frac{dy/dt}{dx/dt} , \]
so that
\[ \frac{f_x}{\phi_x} = \frac{f_y}{\phi_y} = -\lambda, \ \text{say,} \]
where λ is a constant.
Therefore we have three equations:
\[ f_x + \lambda \phi_x = 0 \tag{2.14} \]
\[ f_y + \lambda \phi_y = 0 \tag{2.15} \]
\[ \phi(x, y) = 0 . \tag{2.16} \]
These are three conditions for three unknowns x, y and λ. The constant λ is called
the Lagrange multiplier. These equations give the stationary points (x, y). The
quantity λ is a constant in each case but different at different stationary points.

Fig. 2.28. The function f attains a maximum (subject to φ(x, y) = 0) when the curve
f = c3 is tangent to the curve φ(x, y) = 0.

Example:
Find the stationary points (and the values of the function at these points) of
f(x, y) = xy subject to the constraint φ(x, y) = x² + y² − 1 = 0.
Substituting in Eqs. (2.14)–(2.16) above gives:
\[ \text{Eq. (2.14)} \rightarrow y + 2\lambda x = 0 \tag{2.17} \]
\[ \text{Eq. (2.15)} \rightarrow x + 2\lambda y = 0 \tag{2.18} \]
\[ \text{Eq. (2.16)} \rightarrow x^2 + y^2 - 1 = 0 . \tag{2.19} \]
Equation (2.17) gives −y/x = 2λ. Equation (2.18) gives −y/x = 1/(2λ). Equating these gives
\[ \frac{1}{2\lambda} = 2\lambda \ \Rightarrow\ 4\lambda^2 = 1 . \]
Hence
\[ \lambda = \pm\frac{1}{2} . \]
Putting λ = 1/2 in Eq. (2.17) and Eq. (2.18) gives y + x = 0. Putting
λ = −1/2 in Eq. (2.17) and Eq. (2.18) gives −y + x = 0. Therefore x² = y²
always; substituting in Eq. (2.19) gives 2x² = 1. Therefore
\[ x = \pm\frac{1}{\sqrt{2}}, \qquad y = \pm x . \]
Therefore altogether there exist four possibilities:
\[ (x, y) = \left(+\tfrac{1}{\sqrt{2}}, +\tfrac{1}{\sqrt{2}}\right), \quad \lambda = -\tfrac{1}{2} \]
\[ (x, y) = \left(-\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right), \quad \lambda = -\tfrac{1}{2} \]
\[ (x, y) = \left(+\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right), \quad \lambda = \tfrac{1}{2} \]
\[ (x, y) = \left(-\tfrac{1}{\sqrt{2}}, +\tfrac{1}{\sqrt{2}}\right), \quad \lambda = \tfrac{1}{2} \]
For the stationary values of f the above results give two distinct values:
\[ f(x, y) = \tfrac{1}{2} \ \text{(maximum)}, \qquad f(x, y) = -\tfrac{1}{2} \ \text{(minimum)} . \]
We can see what is happening from Fig. 2.29.
Note: In such problems do not investigate the nature of such points except by
looking at values of f: f = 1/2 is larger so it is a maximum; f = −1/2 is
smaller so it is a minimum.
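As a rough numerical cross-check of the example above, the short Python sketch below substitutes the four candidate points into Eqs. (2.17)–(2.19) and then scans the unit circle x = cos θ, y = sin θ for the largest and smallest values of f. It is not part of the original notes: the names `f`, `lagrange_residuals` and the sample count `n` are illustrative choices only.

```python
import math

def f(x, y):
    return x * y

def lagrange_residuals(x, y, lam):
    # Left-hand sides of Eqs. (2.17)-(2.19); all three should vanish
    # at a constrained stationary point.
    return (y + 2 * lam * x, x + 2 * lam * y, x * x + y * y - 1)

r = 1 / math.sqrt(2)
# (x, y, lambda) for the four candidate points of the example.
candidates = [(r, r, -0.5), (-r, -r, -0.5), (r, -r, 0.5), (-r, r, 0.5)]
worst_residual = max(
    abs(v) for x, y, lam in candidates for v in lagrange_residuals(x, y, lam)
)

# Brute-force scan of the constraint curve (assumed sample count n).
n = 100_000
values = [
    f(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
    for k in range(n)
]
f_max, f_min = max(values), min(values)

print("largest residual:", worst_residual)
print("max of f on the circle:", f_max)
print("min of f on the circle:", f_min)
```

The scan only compares values of f on the constraint curve, which is the approach recommended in the Note above.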
Fig. 2.29. Plots of the hyperbolas defined by f(x, y) = xy for f = ±1 and ±1/2, and
the circle defined by the condition x² + y² − 1 = 0.

A similar method can be used for functions of three variables to find stationary
points of f(x, y, z) subject to the condition φ(x, y, z) = 0. Here, by analogy
with the case of two variables we obtain:
\[ f_x + \lambda\phi_x = 0 \tag{2.20} \]
\[ f_y + \lambda\phi_y = 0 \tag{2.21} \]
\[ f_z + \lambda\phi_z = 0 \tag{2.22} \]
\[ \phi(x, y, z) = 0 . \tag{2.23} \]
Therefore we have four equations (Eqs. (2.20)–(2.23)) for the four unknowns x,
y, z, λ.
Example:
Find the maximum and minimum of f(x, y, z) = x²y²z² subject to x² + y² + z² = a²
(i.e. points on the sphere of radius a).
Here f = x²y²z² and φ(x, y, z) = x² + y² + z² − a² = 0. Hence
\[ f_x + \lambda\phi_x = 2xy^2z^2 + 2\lambda x = 2x(y^2z^2 + \lambda) = 0 \tag{2.24} \]
\[ f_y + \lambda\phi_y = 2x^2yz^2 + 2\lambda y = 2y(x^2z^2 + \lambda) = 0 \tag{2.25} \]
\[ f_z + \lambda\phi_z = 2x^2y^2z + 2\lambda z = 2z(x^2y^2 + \lambda) = 0 \tag{2.26} \]
\[ x^2 + y^2 + z^2 - a^2 = 0 \tag{2.27} \]
Therefore, for non-zero x, y and z we must have
\[ y^2z^2 = -\lambda, \qquad x^2z^2 = -\lambda, \qquad x^2y^2 = -\lambda \]
and so
\[ x^2 = y^2 = z^2 . \]
Substituting in Eq. (2.27) gives 3x² = a². Hence
\[ x = \pm\frac{a}{\sqrt{3}}, \qquad y = \pm x, \qquad z = \pm x, \qquad \lambda = -\frac{a^4}{9} \]
and so
\[ x = \pm\frac{a}{\sqrt{3}}, \qquad y = \pm\frac{a}{\sqrt{3}}, \qquad z = \pm\frac{a}{\sqrt{3}} \]
independently. Therefore there exist eight stationary points with x, y and z all
non-zero.
If one of x, y, z is zero, Eqs. (2.24)–(2.26) force λ = 0, and every such point on
the sphere is stationary with f = 0. Since f = x²y²z² ≥ 0 for all (x, y, z), these
points must be minima. (The origin (0, 0, 0) itself does not satisfy the constraint
when a ≠ 0.)
At the eight points above, f(x, y, z) = a⁶/27, which is a maximum. Now since these
are maximum points we have
\[ \left(x^2y^2z^2\right)^{1/3} \le \frac{a^2}{3} = \frac{x^2 + y^2 + z^2}{3} . \]
Therefore the geometric mean is less than or equal to the arithmetic mean for the
three positive numbers x², y², z².
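The bound f ≤ a⁶/27 on the sphere can be probed numerically. The sketch below is an illustration rather than part of the notes: the radius `a = 2.0`, the fixed seed and the helper `random_point_on_sphere` are all assumptions made for the example. It samples random points on the sphere and compares the largest sampled value of f with a⁶/27.

```python
import math
import random

def f(x, y, z):
    return (x * y * z) ** 2

def random_point_on_sphere(a, rng):
    # Normalise a Gaussian vector to get a point on the sphere of radius a.
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [a * c / norm for c in v]

a = 2.0                    # assumed radius for the illustration
bound = a ** 6 / 27        # claimed maximum of f on the sphere
rng = random.Random(0)     # fixed seed so the run is repeatable

sampled_max = max(f(*random_point_on_sphere(a, rng)) for _ in range(20_000))
s = a / math.sqrt(3)
value_at_stationary_point = f(s, s, s)

print("a^6/27                 :", bound)
print("f at (a/√3, a/√3, a/√3):", value_at_stationary_point)
print("largest sampled f      :", sampled_max)
```

Random sampling cannot prove the inequality, but no sampled value should exceed the bound, and the value at the stationary point should match it.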
2.12 Inverse Functions
A function with domain D is called one-to-one if no two elements of D have the
same image; i.e. f(x1) ≠ f(x2) for all x1 ≠ x2.
For example, the function f(x) = x³ is one-to-one (see Fig. 2.30a) because
each value of the function f(x) corresponds to a unique value of x. However,
the function f(x) = x² is not one-to-one because each positive value of f(x)
corresponds to two values of x (see Fig. 2.30b).
Definition: Let f be a one-to-one function with domain D and range R. Then
the inverse function f⁻¹ has domain R and range D. It expresses x as a function
of y so that
y = f(x)  ⇔  x = f⁻¹(y).
Example: Find the inverse function of f(x) = x³.
If we write y = x³ then x = y^(1/3). If we then change x to y we have y = x^(1/3) so
that
f⁻¹(x) = x^(1/3).
Note that the graph of the inverse function f⁻¹ can be obtained by reflecting the
graph of f about the line y = x (see Fig. 2.31).

Fig. 2.30. (a) Plot of the function f(x) = x³; this is a one-to-one function. (b) Plot of
the function f(x) = x²; this is not a one-to-one function because, for example, f(x) = 1
could arise from x = 1 or x = −1.

Fig. 2.31. Plots of the function y = f(x) = x³ and its inverse y = x^(1/3).

Theorem: If f is a one-to-one and continuous function defined on an interval,
then its inverse f⁻¹ is also continuous.
Theorem: Let f be a continuously differentiable function with inverse f⁻¹ in
some region R. Let p be a point in R and let f′(p) ≠ 0. Then there exists an
open interval containing p such that f is one-to-one and the inverse function f⁻¹
is continuously differentiable.
The idea with the condition f′(p) ≠ 0 in the theorem above is to avoid maxima
and minima when dealing with the inverse function.
2.13 Implicit Functions
Often functions define implicit relations between variables as in
x⁵ + x²y³ + y⁶ + 18xy + e^y = 0
such that it is not easy (or even possible!) to obtain y = f(x) explicitly. In
general the equation f(x, y) = 0 defines some curve in the (x, y) plane.
Consider the curve shown in Fig. 2.32. Near the point A we have an implicit
function y = F(x) but we cannot define such a function near the point B, for
example. We therefore require conditions on the function f(x, y) such that we
can assume the existence of such implicit functions. This leads to the following
theorem.

Fig. 2.32. A curve in the x-y plane where we can define an implicit function in the
vicinity of the point A but not in the vicinity of the point B.
Theorem: (The Implicit Function Theorem.) Let f(x, y) be defined on an open
disc containing the point (a, b) where f(a, b) = 0, f_y(a, b) ≠ 0 and f_x, f_y are
continuous on the disc. Then f(x, y) = 0 defines y as a function of x near the
point (a, b).
This theorem allows dy/dx to be calculated in the following way. Let y = F(x).
Hence
\[ f(x, y(x)) = 0 . \tag{2.28} \]
We can use the chain rule to differentiate both sides of Eq. (2.28). This gives:
\[ \frac{\partial f}{\partial x}\frac{dx}{dx} + \frac{\partial f}{\partial y}\frac{dy}{dx} = 0 \]
or, because dx/dx = 1,
\[ \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} = 0 \]
and hence
\[ \frac{dy}{dx} = -\frac{f_x}{f_y} . \tag{2.29} \]
Example: Find dy/dx if x³ + y³ = 6xy.
We can write this as
\[ f(x, y) = x^3 + y^3 - 6xy = 0 \]
and then, using Eq. (2.29), we have
\[ \frac{dy}{dx} = -\frac{f_x}{f_y} = -\frac{3x^2 - 6y}{3y^2 - 6x} = -\frac{x^2 - 2y}{y^2 - 2x} . \]
The implicit function theorem can be extended to functions of more than two
variables.

3
Multiple Integrals
Recall that for functions of one variable, y = f(x), we obtain ∫_a^b f(x) dx by
dividing the range [a, b] into small intervals of width δx_i and adding up areas of
strips above the intervals. The total area is given by
\[ \sum_{i=1}^{n} \delta A_i \approx \sum_{i=1}^{n} f(x_i)\,\delta x_i \]
where n is the number of intervals (see Fig. 3.1).

Fig. 3.1. The area underneath the curve y = f(x) can be approximated by the sum of
the areas of individual strips.
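The strip sum above is easy to try out numerically. The following Python sketch is an illustration only: the function name `strip_sum`, the choice of equal-width strips, the use of each strip's midpoint as the sample point x_i, and the test integrand x² on [0, 1] are all assumptions rather than anything fixed by the notes.

```python
def strip_sum(f, a, b, n):
    """Sum of f(x_i) * dx over n equal strips on [a, b].

    Each strip is sampled at its midpoint (one possible choice of x_i).
    """
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

# Area under y = x^2 between x = 0 and x = 1; the exact integral is 1/3.
for n in (10, 100, 1000):
    print(n, strip_sum(lambda x: x * x, 0.0, 1.0, n))
```

As n grows the printed sums settle towards 1/3, which is the behaviour the limiting process in the text relies on.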
3.1 Rectangular Regions
As we take smaller and smaller intervals, i.e. as n → ∞, |δx_i| → 0 and in the
limit we get the exact area under the curve f(x) between x = a and x = b as
\[ \int_a^b f(x)\,dx . \]
This limit is well defined for continuous functions.
For functions of one variable, f(x), we therefore integrate over an interval
[a, b].
For functions of two variables, f(x, y), we start by integrating over a rectangular area R in the x-y plane.
The question is: Given z = f(x, y), what is the volume under the surface z =
f(x, y) over the rectangle R? (See Fig. 3.2.)

Fig. 3.2. The volume underneath the surface z = f(x, y) can be approximated by the
sum of the volumes with the same base area and height given by the value of z.

The approach is to divide the region R into identical small rectangles of area
δA_ij = δx_i δy_j (see Fig. 3.3).

Fig. 3.3. The region R is divided into rectangles of dimensions δx_i by δy_j and area
δA_ij = δx_i δy_j.

Therefore the vertical box in Fig. 3.2 has an approximate volume
\[ \delta V_{ij} = \text{height} \times \text{area} = f(x_i, y_j)\,\delta A_{ij} . \]
Hence the total volume is approximately
\[ V \approx \sum_{i,j} f(x_i, y_j)\,\delta A_{ij} = \sum_{i,j} f(x_i, y_j)\,\delta x_i\,\delta y_j . \]
To do this sum we can either fix x and first sum over δy_j and then sum the
result over δx_i, or the other way round. Hence
\[ V \approx \sum_{R} f(x, y)\,\delta x\,\delta y = \sum_{\delta x}\Big(\sum_{\delta y} f(x, y)\,\delta y\Big)\delta x = \sum_{\delta y}\Big(\sum_{\delta x} f(x, y)\,\delta x\Big)\delta y . \]
It can be shown that in the limit where the number of areas δA_ij → ∞ and
|δA_ij| → 0, these sums tend to the same limit, provided the function f(x, y) is
jointly continuous.
In the limit we have, by definition, the exact volume,
\[ \iint_R f(x, y)\,dA . \]
In other words
\[ \lim_{\delta x, \delta y \to 0} \sum_{R} f(x, y)\,\delta x\,\delta y \equiv \iint_R f(x, y)\,dA = \text{required volume, } V . \]
For the two different ways of doing the sum we might expect:
\[ V = \lim_{\delta x \to 0} \sum_{\delta x}\Big(\lim_{\delta y \to 0} \sum_{\delta y} f(x, y)\,\delta y\Big)\delta x \tag{3.1} \]
\[ \phantom{V} = \lim_{\delta y \to 0} \sum_{\delta y}\Big(\lim_{\delta x \to 0} \sum_{\delta x} f(x, y)\,\delta x\Big)\delta y \tag{3.2} \]
where the x in f(x, y) is fixed in Eq. (3.1) and
the y in f(x, y) is fixed in Eq. (3.2).
This gives us a way of evaluating V = ∬_R f(x, y) dA. In Eq. (3.1) we can
write the limit where x is fixed as
\[ \lim_{\delta y \to 0} \sum_{\delta y} f(x, y)\,\delta y = \int f(x, y)\,dy = g(x), \ \text{say.} \]
Then the second (outer) limit in Eq. (3.1) is
\[ \lim_{\delta x \to 0} \sum_{\delta x}\Big(\lim_{\delta y \to 0} \sum_{\delta y} f(x, y)\,\delta y\Big)\delta x = \lim_{\delta x \to 0} \sum_{\delta x} g(x)\,\delta x = \int g(x)\,dx . \]
We can treat Eq. (3.2) in a similar way doing the x integration first and then the
y integration.
Therefore we have two ways of evaluating double integrals as repeated ordinary integrals.
Let the region R be defined by
a ≤ x ≤ b and c ≤ y ≤ d
as shown in Fig. 3.4. Then
\[ V = \iint_R f(x, y)\,dA = \int_{x=a}^{x=b}\left(\int_{y=c}^{y=d} f(x, y)\,dy\right)dx \tag{3.3} \]
\[ \phantom{V = \iint_R f(x, y)\,dA} = \int_{y=c}^{y=d}\left(\int_{x=a}^{x=b} f(x, y)\,dx\right)dy \tag{3.4} \]
When we do the y integration first in Eq. (3.3) we keep x fixed. This results in
an integrand that is only a function of x which we then integrate from x = a
to x = b. We can think of this pictorially as shown in Fig. 3.5a. We first sum
over small rectangles (with x fixed) and then sum the resultant over a ≤ x ≤ b
to cover the whole region R .
When we do the x integration first in Eq. (3.4) we keep y fixed. This results
in an integrand that is only a function of y which we then integrate from y = c
to y = d . We can think of this pictorially as shown in Fig. 3.5b. We first sum
over small rectangles (with y fixed) and then sum the resultant over c ≤ y ≤ d
to cover the whole region R .
Therefore double integrals become repeated single integrals.
Fig. 3.5. Pictorial representation of the order of the integrations in Eq. (3.3) and (3.4).
(a) First sum over small rectangles with x fixed and then over a ≤ x ≤ b as in Eq. (3.3).
(b) First sum over small rectangles with y fixed and then over c ≤ y ≤ d as in Eq. (3.4).
Example:
Find the volume V under z = f(x, y) = x²y with base R, when R is the rectangle
1 ≤ x ≤ 2, −3 ≤ y ≤ 4.
First do it by integrating with respect to y first keeping x fixed. We have
\[ V = \iint_R x^2 y\,dA = \int_{x=1}^{x=2}\left(\int_{y=-3}^{y=4} x^2 y\,dy\right)dx = \int_{x=1}^{x=2}\left[\frac{x^2y^2}{2}\right]_{y=-3}^{y=4} dx = \int_{x=1}^{x=2} \frac{7x^2}{2}\,dx = \left[\frac{7x^3}{6}\right]_{x=1}^{x=2} = \frac{49}{6} . \]
Now do it by integrating with respect to x first keeping y fixed. We have
\[ V = \iint_R x^2 y\,dA = \int_{y=-3}^{y=4}\left(\int_{x=1}^{x=2} x^2 y\,dx\right)dy = \int_{y=-3}^{y=4}\left[\frac{1}{3}x^3 y\right]_{x=1}^{x=2} dy = \int_{y=-3}^{y=4} \frac{7y}{3}\,dy = \left[\frac{7y^2}{6}\right]_{y=-3}^{y=4} = \frac{49}{6} . \]

Fig. 3.4. The region of integration, R, corresponding to a ≤ x ≤ b and c ≤ y ≤ d.

Therefore we get the same answer regardless of the order in which we do the
integrations.
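The agreement between the two orders can also be seen numerically. The sketch below is only an illustration under stated assumptions: `midpoint_sum` is a hypothetical helper that approximates a single integral by a midpoint strip sum, and `n = 200` strips per direction is an arbitrary choice. It nests the helper in both orders, mirroring Eqs. (3.3) and (3.4), for f(x, y) = x²y on 1 ≤ x ≤ 2, −3 ≤ y ≤ 4.

```python
def midpoint_sum(g, lo, hi, n):
    # Approximate the integral of g over [lo, hi] with n midpoint strips.
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h

def f(x, y):
    return x * x * y

n = 200  # assumed number of strips in each direction

# Eq. (3.3): y integration first (x fixed), then x.
v_y_first = midpoint_sum(
    lambda x: midpoint_sum(lambda y: f(x, y), -3.0, 4.0, n), 1.0, 2.0, n
)
# Eq. (3.4): x integration first (y fixed), then y.
v_x_first = midpoint_sum(
    lambda y: midpoint_sum(lambda x: f(x, y), 1.0, 2.0, n), -3.0, 4.0, n
)

print("y first:", v_y_first)
print("x first:", v_x_first)
print("49/6   :", 49 / 6)
```

Both nested sums add up the same small boxes, only in a different order, so they agree with each other and approach 49/6 as n increases.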
In this particular example we could have separated the integrand into x and y
parts. Hence
\[ V = \int_{x=1}^{x=2}\int_{y=-3}^{y=4} x^2 y\,dy\,dx = \left(\int_{x=1}^{x=2} x^2\,dx\right)\left(\int_{y=-3}^{y=4} y\,dy\right) = \frac{7}{3}\cdot\frac{7}{2} = \frac{49}{6} . \]
More generally, if the function f(x, y) can be written as f(x, y) = g(x) h(y),
i.e. if the function is separable, and if the region R is a rectangle, then we can
write:
\[ \iint_R g(x)\,h(y)\,dA = \int_{x=a}^{x=b}\int_{y=c}^{y=d} g(x)\,h(y)\,dy\,dx = \left(\int_{x=a}^{x=b} g(x)\,dx\right)\left(\int_{y=c}^{y=d} h(y)\,dy\right) \]
and so the integral factors into the product of two integrals.
However, note that f(x, y) can rarely be expressed as a product (i.e. separable).
More often we have functions such as x² cos(xy) or ln(x² + y²).
3.2 Non-rectangular Regions
For regions more complicated than rectangles we must think carefully about the
limits of the single integrations.
Let the region R be a triangle bounded by the x-axis (y = 0), the vertical line
x = 1 and the straight line y = 2x. The region is shown in Fig. 3.6.

Fig. 3.6. The triangular region bounded by y = 0, x = 1 and y = 2x.

To proceed to calculate the double integral we go back to approximating sums.
First we sum over the vertical strips (as indicated in Fig. 3.7) with fixed x and
0 ≤ y ≤ 2x. In these strips the bottom limit is y = 0 and the top limit is y = 2x.
Therefore, in the limit the contribution from the vertical strip is:
\[ \int_{y=0}^{y=2x} f(x, y)\,dy \]
where we have fixed x. Adding up all the vertical strips by integrating the result
over all allowed x, i.e. 0 ≤ x ≤ 1, is equivalent to
\[ \int_{x=0}^{x=1}\left(\int_{y=0}^{y=2x} f(x, y)\,dy\right)dx . \]
This covers the whole of the triangular region. From Fig. 3.7 we have
\[ \iint_R f(x, y)\,dA = \int_{x=0}^{x=1}\int_{y=0}^{y=2x} f(x, y)\,dy\,dx \]
where the integral from y = 0 to y = 2x is done for fixed x.
We can also try the integration the other way round whereby we fix y and
integrate over x first. This is equivalent to taking horizontal strips from x = y/2
to x = 1 for fixed y and then integrating from y = 0 to y = 2. This is illustrated
in Fig. 3.7b. This gives
\[ \iint_R f(x, y)\,dA = \int_{y=0}^{y=2}\int_{x=y/2}^{x=1} f(x, y)\,dx\,dy . \]
Remark: The two ways will give exactly the same result. In practice it makes
sense to choose the simplest method.
Example:
Find ∬_R f(x, y) dA for the triangular region bounded by y = 0, x = 1 and
y = 2x for the function f(x, y) = x²y².
For the first way we will first do the y integration keeping x fixed. This
corresponds to taking vertical strips as shown in Fig. 3.7a. Then we integrate
over x from x = 0 to x = 1. We have
\[ \iint_R x^2y^2\,dA = \int_{x=0}^{x=1}\int_{y=0}^{y=2x} x^2y^2\,dy\,dx = \int_{x=0}^{x=1}\left[\frac{x^2y^3}{3}\right]_{y=0}^{y=2x} dx = \int_{x=0}^{x=1} \frac{8x^5}{3}\,dx = \left[\frac{8x^6}{18}\right]_{x=0}^{x=1} = \frac{4}{9} . \]
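The strip limits set out above for this triangle can be checked numerically. In the sketch below, which is an illustration and not part of the notes, `midpoint_sum` is a hypothetical helper for a single midpoint strip sum and `n = 400` is an arbitrary resolution; the inner limits depend on the outer variable exactly as in the two repeated integrals for the triangular region.

```python
def midpoint_sum(g, lo, hi, n):
    # Approximate the integral of g over [lo, hi] with n midpoint strips.
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h

def f(x, y):
    return x * x * y * y

n = 400  # assumed number of strips in each direction

# Vertical strips: for fixed x, y runs from 0 to 2x; then x runs from 0 to 1.
vertical = midpoint_sum(
    lambda x: midpoint_sum(lambda y: f(x, y), 0.0, 2.0 * x, n), 0.0, 1.0, n
)
# Horizontal strips: for fixed y, x runs from y/2 to 1; then y runs from 0 to 2.
horizontal = midpoint_sum(
    lambda y: midpoint_sum(lambda x: f(x, y), y / 2.0, 1.0, n), 0.0, 2.0, n
)

print("vertical strips  :", vertical)
print("horizontal strips:", horizontal)
print("4/9              :", 4 / 9)
```

Note that only the inner limits depend on a variable; the outer limits are constants in both orders.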
Fig. 3.7. The triangular region bounded by y = 0, x = 1 and y = 2x, with (a) vertical
strips for fixed x and varying y, and (b) horizontal strips for fixed y and varying x.

For the second way we will first do the x integration keeping y fixed. This
corresponds to taking horizontal strips as shown in Fig. 3.7b. Then we integrate
over y from y = 0 to y = 2. We have
\[ \iint_R x^2y^2\,dA = \int_{y=0}^{y=2}\int_{x=y/2}^{x=1} x^2y^2\,dx\,dy = \int_{y=0}^{y=2}\left[\frac{x^3y^2}{3}\right]_{x=y/2}^{x=1} dy = \int_{y=0}^{y=2}\left(\frac{y^2}{3} - \frac{y^5}{24}\right)dy = \left[\frac{y^3}{9} - \frac{y^6}{6 \times 24}\right]_{y=0}^{y=2} = \frac{4}{9} . \]
Note: This example is not separable because of the non-rectangular (x, y dependence) of the limits.
Now consider the region R inside the semi-circle defined by x² + y² = a² and
y ≥ 0. The region and the two possible strips that could be taken are shown in
Fig. 3.8.

Fig. 3.8. The semi-circular region bounded by x² + y² = a² and y ≥ 0, with (a) vertical
strips for fixed x and varying y, and (b) horizontal strips for fixed y and varying x.

If we fix x then the vertical strips have y-limits given by
\[ 0 \le y \le +\sqrt{a^2 - x^2} \]
as in Fig. 3.8a. We can then allow x to vary within the limits −a ≤ x ≤ a so that
the whole region is covered. Hence
\[ \iint_R f(x, y)\,dA = \int_{x=-a}^{x=+a}\left[\int_{y=0}^{y=+\sqrt{a^2 - x^2}} f(x, y)\,dy\right]dx . \]
Alternatively, if we fix y then the horizontal strips have x-limits given by
\[ -\sqrt{a^2 - y^2} \le x \le +\sqrt{a^2 - y^2} \]
as in Fig. 3.8b. We can then allow y to vary within the limits 0 ≤ y ≤ a so that
the whole region is covered. Hence
\[ \iint_R f(x, y)\,dA = \int_{y=0}^{y=a}\left[\int_{x=-\sqrt{a^2 - y^2}}^{x=+\sqrt{a^2 - y^2}} f(x, y)\,dx\right]dy . \]
Note that we will obtain the same answer in each case. In practice we need
to use the way that provides the simpler integration.
Remark: It is more convenient to use a simpler notation. Since
\[ \iint_R f(x, y)\,dA = \iint_R f(x, y)\,dx\,dy = \iint_R f(x, y)\,dy\,dx \]
we can, for example, write the first integral over the semi-circular region above
(integrating with respect to y first) as:
\[ \int_{x=-a}^{x=+a} dx \int_{y=0}^{y=\sqrt{a^2 - x^2}} f(x, y)\,dy \]
where we have dropped the brackets. This is to be interpreted as “do the right-hand side integration (with respect to y) first and then do the integration with
respect to x”.
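To see the two sets of semi-circle limits in action, the sketch below applies them to one particular integrand. Everything specific here is an assumption made for illustration: the integrand f(x, y) = y, the radius `a = 1.0`, the helper `midpoint_sum` and the resolution `n = 400`. For this integrand a direct calculation gives ∬_R y dA = 2a³/3, which serves as the reference value.

```python
import math

def midpoint_sum(g, lo, hi, n):
    # Approximate the integral of g over [lo, hi] with n midpoint strips.
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h

a = 1.0   # assumed radius
n = 400   # assumed number of strips in each direction

def f(x, y):
    return y  # illustrative integrand, not taken from the notes

def half_width(t):
    # sqrt(a^2 - t^2): top of a vertical strip, or half-length of a horizontal one.
    return math.sqrt(a * a - t * t)

# Vertical strips: 0 <= y <= sqrt(a^2 - x^2), then -a <= x <= a.
vertical = midpoint_sum(
    lambda x: midpoint_sum(lambda y: f(x, y), 0.0, half_width(x), n), -a, a, n
)
# Horizontal strips: -sqrt(a^2 - y^2) <= x <= +sqrt(a^2 - y^2), then 0 <= y <= a.
horizontal = midpoint_sum(
    lambda y: midpoint_sum(lambda x: f(x, y), -half_width(y), half_width(y), n),
    0.0, a, n,
)
exact = 2 * a ** 3 / 3

print("vertical strips  :", vertical)
print("horizontal strips:", horizontal)
print("2a^3/3           :", exact)
```

The horizontal-strip sum converges a little more slowly here because the strip length has a square-root behaviour near y = a, a small reminder that one order of integration can be more convenient than the other.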
Example:
Evaluate the double integral
\[ \iint_R (2x^2 + y)\,dx\,dy \]
where R is the region bounded by the line y = x and the curve y = x².
The first step is always to calculate where the curves intersect and draw a
diagram of the region of integration.
The curves intersect where y = x = x². This gives
x − x² = 0  ⇒  x = 1,  x = 0.
Therefore the points of intersection are (0, 0) and (1, 1). The resulting region is
shown in Fig. 3.9.

Fig. 3.9. The region bounded by the line y = x and the curve y = x², with (a) vertical
strips for fixed x and varying y, and (b) horizontal strips for fixed y and varying x.

In the first method we will integrate with respect to y first. Therefore we are
integrating along the vertical strip AB with fixed x and then allow x to move
from 0 to 1 to cover the whole region. Here A represents the start of the strip at
y = x² and B represents the end of the strip at y = x. This gives
\[ \int_{x=0}^{x=1} dx \int_{y=x^2}^{y=x} (2x^2 + y)\,dy = \int_{x=0}^{x=1} dx \left[2x^2y + \frac{y^2}{2}\right]_{y=x^2}^{y=x} = \int_{x=0}^{x=1}\left(2x^3 + \frac{x^2}{2} - \frac{5x^4}{2}\right)dx = \left[\frac{x^4}{2} + \frac{x^3}{6} - \frac{x^5}{2}\right]_{x=0}^{x=1} = \frac{1}{6} . \]
In the second method we will integrate with respect to x first. Therefore we
are integrating along the horizontal strip CD with fixed y and then allow y to
move from 0 to 1 to cover the whole region. Here C represents the start of the
strip at x = y and D represents the end of the strip at x = √y. This gives
\[ \int_{y=0}^{y=1} dy \int_{x=y}^{x=\sqrt{y}} (2x^2 + y)\,dx = \int_{y=0}^{y=1} dy \left[\frac{2x^3}{3} + xy\right]_{x=y}^{x=\sqrt{y}} = \int_{y=0}^{y=1}\left(\frac{5y^{3/2}}{3} - \frac{2y^3}{3} - y^2\right)dy = \left[\frac{2y^{5/2}}{3} - \frac{y^4}{6} - \frac{y^3}{3}\right]_{y=0}^{y=1} = \frac{1}{6} . \]
Note: In this case the two different ways are equally easy or difficult. In some
cases, one way can be much simpler than the other. Therefore the choice of
order in the integration can be important.
So far we have shown how, given the region R, we can find the limits for x
and y in the integrals. However, sometimes we can have the limits of repeated
integrals and be asked to find the corresponding region R (and perhaps change
the order of integration).
Example:
Find the region R for the double integration
\[ \int_{x=0}^{x=2a} dx \int_{y=\sqrt{ax}}^{y=\sqrt{6a^2 - x^2}} 2xy\,dy . \]
In the second integral the limits of y are
\[ \sqrt{ax} \le y \le \sqrt{6a^2 - x^2} . \]
Therefore the lower limit defines a parabola, y² = ax, while the upper limit
defines a circle, x² + y² = 6a² (the circle’s radius is √6 a). These two curves
intersect at
6a² − x² = ax  ⇒  x² + ax − 6a² = 0  ⇒  x = 2a
(taking the root with x ≥ 0, assuming a > 0).
We are now in a position to draw the two curves. These are shown in Fig. 3.10.
Therefore the integral can be written as
\[ \int_{x=0}^{x=2a} dx \int_{y=\sqrt{ax}}^{y=\sqrt{6a^2 - x^2}} 2xy\,dy = \int_{x=0}^{x=2a} dx \left[x y^2\right]_{y=\sqrt{ax}}^{y=\sqrt{6a^2 - x^2}} \]
\[ = \int_{x=0}^{x=2a}\left(6a^2x - ax^2 - x^3\right)dx = \left[3a^2x^2 - \frac{ax^3}{3} - \frac{x^4}{4}\right]_{x=0}^{x=2a} = \frac{16}{3}a^4 . \]

Fig. 3.10. The region bounded by the curve y = √(ax) and the curve y = √(6a² − x²),
with vertical strips for fixed x and varying y.

In this case it is actually possible to do the integration using horizontal strips
keeping y fixed for the first integration. However, the integration would have to
be done in two parts.
In the first part the limits for x are 0 ≤ x ≤ y²/a with y
going from 0 to √2 a. In the second part the limits are 0 ≤ x ≤ √(6a² − y²) with
y going from √2 a to √6 a.
Remark: There are always two ways to do a double integral; choose the simpler
because the other may be impossible!
Example:
By changing the order of the integration, evaluate the integral
\[ \int_{y=0}^{y=1} dy \int_{x=\sqrt{y}}^{x=1} y\cos x^5\,dx . \]
Note that as it stands the integral is very difficult because of the cos x⁵ dx
part.
The limits of the second integral are from x = √y to x = 1, implying that one
limit is the curve y = x². The limits of the first integral are y = 0 (the x-axis)
to y = 1. The resulting region and the associated horizontal strip are shown in
Fig. 3.11a.
Now change the order of the integration so that we do the y integration first
taking vertical strips for fixed x as shown in Fig. 3.11b. This gives
\[ \int_{y=0}^{y=1} dy \int_{x=\sqrt{y}}^{x=1} y\cos x^5\,dx = \int_{x=0}^{x=1} dx \int_{y=0}^{y=x^2} y\cos x^5\,dy = \int_{x=0}^{x=1}\left[\frac{y^2}{2}\right]_{y=0}^{y=x^2}\cos x^5\,dx = \int_{x=0}^{x=1}\frac{x^4}{2}\cos x^5\,dx = \left[\frac{\sin x^5}{10}\right]_{x=0}^{x=1} = \frac{\sin 1}{10} . \]
The last integration can be seen by substituting u = x⁵ (du = 5x⁴ dx).

Fig. 3.11. The region bounded by the curve y = x² and the lines x = 1 and y = 0,
with (a) horizontal strips for fixed y and varying x, and (b) vertical strips for fixed x and
varying y.

Remark: Sometimes it is sensible to divide the region R in order to do the
integration. For example, if the region R can be subdivided into regions R1 and
R2 as shown in Fig. 3.12, where R = R1 ∪ R2, then we have
\[ \iint_R f(x, y)\,dA = \iint_{R_1} f(x, y)\,dA + \iint_{R_2} f(x, y)\,dA . \]

Fig. 3.12. A region R that can be subdivided into regions R1 and R2 to make integration
easier.

Remark: Double integrals can be used to find the area of a particular region.
This is because
\[ \iint_R dA \equiv \iint_R 1 \cdot dA = \text{Area of } R . \]
Therefore to find the area of R we just find ∬_R f(x, y) dA with f(x, y) = 1.
Remark: In evaluating ∬_R f(x, y) dA the answer must be a number. Therefore,
expressions such as
\[ \int_{y=0}^{y=2-2x} dy \int_{x=0}^{x=1} xy\,dx \]
are meaningless because they will depend on x. The outer or second set of limits
should always be constants.
Remark: If we interpret δA as an area element, then f dA is the volume between
the surface z = f(x, y) and the surface element δA. Another interpretation is to
imagine R made of sheet metal of variable density f(x, y) grammes/unit area.
Then f(x, y) δA = density × area = mass of δA. Summing gives
\[ \iint_R f(x, y)\,dA = \text{total mass of } R . \]
3.3 Change of Variables in Area Integrals
For functions of one variable it is often useful to integrate by a change of variable,
e.g. x = x(u). The rule is to replace x by x(u) and dx by (dx/du) du and then
alter the x-limits to the u-limits. This is integration by substitution. This gives
\[ I = \int_{x=a}^{x=b} f(x)\,dx = \int_{u=u_1}^{u=u_2} f(x(u))\,\frac{dx}{du}\,du \]
where the limits u1 and u2 correspond to the limits a and b such that a = x(u1)
and b = x(u2).
This procedure is fine if x(u) increases with u. If x(u) is a decreasing function
of u the u-limits are then reversed and therefore we have a change of sign:
\[ I = \int_{x=a}^{x=b} f(x)\,dx = -\int_{u=u_1}^{u=u_2} f(x(u))\,\frac{dx}{du}\,du . \]
But dx/du < 0 in this case, so we can combine both cases in one formula, with
the u-limits taken in increasing order:
\[ \int_{x=a}^{x=b} f(x)\,dx = \int_{u=u_1}^{u=u_2} f(x(u))\left|\frac{dx}{du}\right| du . \tag{3.5} \]
Remark: On the right-hand side of Eq. (3.5) the function f(x) is expressed as
f(x(u)).
Remark: The right-hand side of Eq. (3.5) includes a magnification factor
|dx/du|, multiplying the du; this comes from transforming from dx to du.
For functions of two variables one would similarly expect that the change in
variables
x = x(u, v),  y = y(u, v)
(for example, for polar coordinates u = r and v = θ) would result in a change in
the area by a magnification factor M such that
dx dy = M du dv.
As an example consider a linear change of coordinates:
x = x(u, v) = au + bv,  y = y(u, v) = cu + dv
or
\[ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} \tag{3.6} \]
where a, b, c and d are constants.
Now write M for the transformation matrix composed of a, b, c and d and
recall that a unit square in (u, v) variables has sides
\[ \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = e_1, \qquad \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = e_2 \]
as shown in Fig. 3.13a. Now, to see what happens to this unit square under the
transformation M, just apply M. This gives
\[ M e_1 = e_1' = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix}, \qquad M e_2 = e_2' = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} b \\ d \end{pmatrix} \]
where (a, c) and (b, d) represent the coordinates of the new corners in the (x, y)
plane (see Fig. 3.13b).

Fig. 3.13. (a) The unit square in the (u, v) coordinate system. (b) The transformed unit
square in the (x, y) coordinate system.

Therefore, under the transformation M we find
unit square in (u, v) based on e1, e2 → parallelogram in (x, y) based on e1′, e2′.
Note from the matrix and the diagram that the point (1, 1) in (u, v) transforms to
the point (a + b, c + d) in (x, y).
Now consider the area of the parallelogram P (see Fig. 3.14). We have
Area P = [Total area of rectangle]
         − [Area of 2 pairs of equal triangles T1 and T2]
         − [Area of 2 rectangles R].
Therefore,
\[ \text{Area } P = (a + b)(c + d) - 2\cdot\tfrac{1}{2}ac - 2\cdot\tfrac{1}{2}bd - 2bc = ad - bc = \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \det M . \]

Fig. 3.14. Individual areas in the transformed unit square.

Since the unit square (area) gets multiplied by a factor of det M, a small
rectangle of sides δu and δv, with area δu δv, also gets multiplied by the same
factor det M. Hence, replacing u and v in Eq. (3.6) by δu and δv gives the
corresponding changes δx and δy as
\[ \begin{pmatrix} \delta x \\ \delta y \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} \delta u \\ \delta v \end{pmatrix} \]
or,
δx = a δu + b δv,  δy = c δu + d δv.
Therefore, for a linear change of variables:
(Rectangular area δu δv in (u, v) plane)
→
(“Parallelogram” area, i.e. (det M) δu δv in (x, y) plane).
Now let us consider a nonlinear change of coordinates. We take the transformation to have the following form:
x = x(u, v),  y = y(u, v)
where, neglecting small errors, the increments in x and y are given by
\[ \delta x = \frac{\partial x}{\partial u}\,\delta u + \frac{\partial x}{\partial v}\,\delta v, \qquad \delta y = \frac{\partial y}{\partial u}\,\delta u + \frac{\partial y}{\partial v}\,\delta v \]
or, in matrix form,
\[ \begin{pmatrix} \delta x \\ \delta y \end{pmatrix} = \begin{pmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{pmatrix}\begin{pmatrix} \delta u \\ \delta v \end{pmatrix} . \]
Definition: The Jacobian matrix is defined to be
\[ M\!\left(\frac{x, y}{u, v}\right) \equiv \begin{pmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{pmatrix} \]
and the Jacobian determinant is defined to be
\[ \frac{\partial(x, y)}{\partial(u, v)} \equiv \det M\!\left(\frac{x, y}{u, v}\right) . \]
This should help when trying to remember the formula
\[ \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{vmatrix} . \]
So, for a nonlinear change of variables:
(Rectangular area δu δv in (u, v) plane)
→
(Parallelogram area, i.e. (det M) δu δv in (x, y) plane).
This is illustrated in Fig. 3.15.

Fig. 3.15. The transformation of the rectangular area R in the (u, v) plane to the area
R′ in the (x, y) plane.

Therefore, the required formula for double integrals under a change of variables is:
\[ \iint_R f(x, y)\,dx\,dy = \iint_{R'} f(x(u, v), y(u, v))\left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv \]
where
\[ \left|\frac{\partial(x, y)}{\partial(u, v)}\right| = |\det M| \]
can be thought of as the magnification factor.
Remark: The factor |∂(x, y)/∂(u, v)| is |det M|, the absolute value of the
determinant of the matrix M. Note that we take the modulus as in the one
variable case.
Remark: Note that
\[ dx\,dy \rightarrow \left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv \quad \text{is the analogue of} \quad dx \rightarrow \left|\frac{dx}{du}\right| du . \]
Note: In matrix notation, vertical lines on either side of a matrix denote the
determinant. However, vertical lines on either side of an expression also denote
the absolute value. For example, if we let
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]
then
\[ \det A = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \quad \text{and} \quad |\det A| = |ad - bc| . \]
Example:
Evaluate the integral
\[ I = \iint_R (x^2 + y^2)\,dx\,dy \]
where R is a circle x² + y² = a², by changing to polar coordinates.
In polar coordinates we have
x = r cos θ,  y = r sin θ.
Therefore, taking u = r and v = θ, we can write the Jacobian matrix as
\[ M = \begin{pmatrix} \partial x/\partial r & \partial x/\partial\theta \\ \partial y/\partial r & \partial y/\partial\theta \end{pmatrix} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \]
and the Jacobian determinant is
\[ \det M = \frac{\partial(x, y)}{\partial(r, \theta)} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r\left(\cos^2\theta + \sin^2\theta\right) = r \]
which is never negative and so we do not need to take the absolute value. The
original area R and the transformed area R′ are shown in Fig. 3.16. Note that
the circle in the (x, y) plane transforms into a rectangle in the (r, θ) plane. Note
that here R is the region given by x² + y² ≤ a² and R′ is the region given by
0 ≤ r ≤ a, 0 ≤ θ ≤ 2π.

Fig. 3.16. The transformation of the circular area R in the (x, y) plane to the rectangular
area R′ in the (r, θ) plane.

Therefore
\[ I = \iint_R (x^2 + y^2)\,dx\,dy = \iint_{R'} r^2\,(r)\,dr\,d\theta \]
where the r² on the right-hand integral comes from the transformed x² + y²
and the r dr dθ is from the transformed dx dy with r coming from the Jacobian
determinant det M. Hence
\[ I = \int_{r=0}^{r=a}\int_{\theta=0}^{\theta=2\pi} r^3\,dr\,d\theta = \int_{r=0}^{r=a} r^3\,dr \int_{\theta=0}^{\theta=2\pi} d\theta = \frac{\pi a^4}{2} \]
where we note that the integral is separable.
3.4 Changing Variables Twice
Suppose that we change variables twice, first from (x, y) to (u, v) and then from
(u, v) to (s, t), such that
x = x(u, v),  y = y(u, v)
and
u = u(s, t),  v = v(s, t).
Then for the Jacobian matrices we obtain
\[ M\!\left(\frac{x, y}{u, v}\right) M\!\left(\frac{u, v}{s, t}\right) = M\!\left(\frac{x, y}{s, t}\right) . \tag{3.7} \]
This implies that instead of undergoing a two-stage transformation we can obtain
the necessary Jacobian matrix directly from the relationship between (x, y) and
(s, t). To prove this consider the left-hand side of Eq. (3.7). We have
\[ M\!\left(\frac{x, y}{u, v}\right) M\!\left(\frac{u, v}{s, t}\right) = \begin{pmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{pmatrix}\begin{pmatrix} \partial u/\partial s & \partial u/\partial t \\ \partial v/\partial s & \partial v/\partial t \end{pmatrix} \]
\[ = \begin{pmatrix} (\partial x/\partial u)(\partial u/\partial s) + (\partial x/\partial v)(\partial v/\partial s) & (\partial x/\partial u)(\partial u/\partial t) + (\partial x/\partial v)(\partial v/\partial t) \\ (\partial y/\partial u)(\partial u/\partial s) + (\partial y/\partial v)(\partial v/\partial s) & (\partial y/\partial u)(\partial u/\partial t) + (\partial y/\partial v)(\partial v/\partial t) \end{pmatrix} \]
\[ = \begin{pmatrix} \partial x/\partial s & \partial x/\partial t \\ \partial y/\partial s & \partial y/\partial t \end{pmatrix} = M\!\left(\frac{x, y}{s, t}\right) . \]
Then, by definition of a reversible change of variables, the equations x = x(u, v)
and y = y(u, v) can be solved for (u, v) in terms of (x, y) to give u = u(x, y) and
v = v(x, y). Therefore, setting s = x and t = y in Eq. (3.7) we have
\[ M\!\left(\frac{x, y}{u, v}\right) M\!\left(\frac{u, v}{x, y}\right) = M\!\left(\frac{x, y}{x, y}\right) = I \tag{3.8} \]
where I is the unit or identity matrix. Now taking determinants and recalling
that
det AB = det A det B
we obtain
\[ \frac{\partial(x, y)}{\partial(u, v)}\,\frac{\partial(u, v)}{\partial(x, y)} = 1 . \]
Therefore the product of the magnification factors of a transformation and its
inverse is unity.
So, for reversible changes,
neither ∂(x, y)/∂(u, v) nor ∂(u, v)/∂(x, y) can ever be zero
and Eq. (3.8) gives
\[ \frac{\partial(x, y)}{\partial(u, v)} = \left[\frac{\partial(u, v)}{\partial(x, y)}\right]^{-1} . \tag{3.9} \]
The result in Eq. (3.9) is often useful in solving problems.
Example:
Evaluate the integral
\[ I = \iint_R 1 \cdot dx\,dy \]
(i.e. the area of the region R) where R is enclosed by y² = x, y² = 2x, xy = 1
and xy = 2.
The region R bounded by the curves is shown in Fig. 3.17a.
To solve the integral consider the change of variables defined by
u = y²/x,  v = xy.
Fig. 3.17. The transformation of (a) the area R in the (x, y) plane bounded by the
curves y² = x, y² = 2x, xy = 2 and xy = 1 to (b) the square R′ in the (u, v) plane
bounded by u = 1, u = 2, v = 1 and v = 2.

Then we can write the four bounding curves as
y² = x ⇔ u = 1,
y² = 2x ⇔ u = 2,
xy = 1 ⇔ v = 1,
xy = 2 ⇔ v = 2.
So the region becomes a square (the region R′ in Fig. 3.17b).
Now, for the Jacobian determinant it is easier to use Eq. (3.9). So, to calculate
∂(x, y)/∂(u, v) we first calculate ∂(u, v)/∂(x, y) and then take the inverse. Using
u = y²/x and v = xy we have
\[ \frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{vmatrix} = \begin{vmatrix} -y^2/x^2 & 2y/x \\ y & x \end{vmatrix} = -\frac{3y^2}{x} = -3u . \]
Therefore, using Eq. (3.9),
\[ \frac{\partial(x, y)}{\partial(u, v)} = \left[\frac{\partial(u, v)}{\partial(x, y)}\right]^{-1} = -\frac{1}{3u} . \]
Hence
\[ I = \iint_R 1 \cdot dx\,dy = \iint_{R'} 1 \cdot \left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv = \iint_{R'} \left|-\frac{1}{3u}\right| du\,dv = \frac{1}{3}\int_{u=1}^{u=2}\frac{1}{u}\,du \int_{v=1}^{v=2} dv \]
\[ = \frac{1}{3}\int_{u=1}^{u=2}\frac{1}{u}\,\big[v\big]_{v=1}^{v=2}\,du = \frac{1}{3}\int_{u=1}^{u=2}\frac{1}{u}\,du = \frac{1}{3}\big[\ln u\big]_{u=1}^{u=2} = \frac{\ln 2}{3} . \]
3.5 Volume Integrals
Volume integrals are integrations where the region of integration is a volume.
The basic concepts are similar to those we introduced for two-dimensional (area)
integrals, but now we have
\[ \lim_{\delta x, \delta y, \delta z \to 0} \sum f(x, y, z)\,\delta x\,\delta y\,\delta z \]
where δV = δx δy δz are now small volumes (see Fig. 3.18).

Fig. 3.18. The volume of integration, V, and the volume element δV = δx δy δz.

The limit as the size of the volume element δV → 0 is written as
\[ \iiint_V f(x, y, z)\,dx\,dy\,dz = \iiint_V f(x, y, z)\,dV \]
where V is the three-dimensional region being integrated over.
The integrals are, as in the two-dimensional case, evaluated by repeated integration where we integrate over one variable at a time. For example, we could
start by integrating over z first (see Fig. 3.19).
The procedure is as follows.
1) Fix (x, y) and integrate over the allowed values of z in the region V. The
z-integral limits are the small, filled circles at the bottom and the top of the
dashed line with, say, z = z1(x, y) at the bottom and z = z2(x, y) at the top as
shown in Fig. 3.19. Therefore we are summing vertically over the boxes as
shown in Fig. 3.20.
2) This result depends on the choice of (x, y) and is defined in the region R
of the (x, y) plane which is the projection of V on to this plane as shown in
Fig. 3.21. This now defines the region over which we must do the x and y
integrations.
Now we can take the double integral of the result of the $z$-integration over the
region $R$ in the $(x, y)$ plane (see Fig. 3.22).
Therefore
$$\iiint_V f(x, y, z)\,dV = \int_{x=a}^{x=b} dx \int_{y=y_1(x)}^{y=y_2(x)} dy \int_{z=z_1(x,y)}^{z=z_2(x,y)} f(x, y, z)\,dz .$$

Fig. 3.19. The lower ($z = z_1$) and upper ($z = z_2$) limits on $z$ for the first integration
over the volume of integration, $V$.
Fig. 3.20. The stack of volume elements from $z = z_1$ to $z = z_2$.
Fig. 3.21. The projection of the volume $V$ on to the $(x, y)$ plane defines the region $R$
over which we must do the $x$ and $y$ integrations.
Fig. 3.22. The region $R$ in the $(x, y)$ plane over which we must do the $x$ and $y$ integrations.

Example:
Evaluate the integral
$$\iiint_T f(x, y, z)\,dV$$
over the tetrahedron $T$ bounded by the planes $x = 0$, $y = 0$, $z = 0$ and $x + y + z = 1$.
Note that the plane $x + y + z = 1$ passes through $x = 1$ (putting $y = z = 0$) and
similarly through $y = 1$ and $z = 1$. The relevant planes are shown in Fig. 3.23.
Now evidently for fixed $(x, y)$ the $z$-limits are the heavy dots corresponding
to $z = 0$ at the bottom and $z = 1 - x - y$ at the top. This gives our $z$-limits.
The projection $R$ of $T$ on to the $(x, y)$ plane is the triangle on which the
tetrahedron rests, i.e. the triangle given by $x = 0$, $y = 0$ and $x + y = 1$ (obtained
by setting $z = 0$). So
$$I = \int_{x=0}^{x=1} dx \int_{y=0}^{y=1-x} dy \int_{z=0}^{z=1-x-y} f(x, y, z)\,dz .$$
For example, if $f(x, y, z) = 1$ then
$$I = \iiint_T 1 \cdot dV = \iiint_T dV = \text{volume of } T .$$
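The repeated-integration recipe for the tetrahedron can be sketched numerically. The following is a minimal illustration and not part of the original notes: the function name `iterated_integral`, the use of the midpoint rule and the resolution `n` are assumptions chosen for simplicity. It is applied to the tetrahedron T bounded by x = 0, y = 0, z = 0 and x + y + z = 1 with f(x, y, z) = 1, so the result estimates the volume of T.

```python
def iterated_integral(f, a, b, y1, y2, z1, z2, n=40):
    """Midpoint-rule estimate of
    int_{x=a}^{b} dx int_{y=y1(x)}^{y2(x)} dy int_{z=z1(x,y)}^{z2(x,y)} f(x,y,z) dz.
    The limits y1, y2 are functions of x; z1, z2 are functions of (x, y)."""
    total = 0.0
    hx = (b - a) / n
    for i in range(n):
        x = a + (i + 0.5) * hx            # midpoint of the i-th x-strip
        y_lo, y_hi = y1(x), y2(x)
        hy = (y_hi - y_lo) / n
        for j in range(n):
            y = y_lo + (j + 0.5) * hy     # midpoint in y for this fixed x
            z_lo, z_hi = z1(x, y), z2(x, y)
            hz = (z_hi - z_lo) / n
            column = 0.0                  # sum over the vertical stack of boxes
            for k in range(n):
                z = z_lo + (k + 0.5) * hz
                column += f(x, y, z)
            total += column * hz * hy * hx
    return total


# Tetrahedron T: 0 <= x <= 1, 0 <= y <= 1 - x, 0 <= z <= 1 - x - y, with f = 1.
volume = iterated_integral(
    lambda x, y, z: 1.0,
    0.0, 1.0,
    lambda x: 0.0, lambda x: 1.0 - x,
    lambda x, y: 0.0, lambda x, y: 1.0 - x - y,
)
print(volume)
```

The innermost loop plays the role of summing vertically over the stack of volume elements for fixed (x, y); the two outer loops then cover the projection R in the (x, y) plane. Replacing the first argument by another function of (x, y, z) estimates the corresponding integral over the same region.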
Fig. 3.23. The volume formed by the intersection of the planes $x = 0$, $y = 0$, $z = 0$
and $x + y + z = 1$. (a) The lower and upper bounds on $z$ (at each end of the dashed
line) for fixed $x$ and $y$. (b) Vertical strips from the $(x, y)$ plane to the tetrahedron plane
defined by $z = 1 - x - y$ as well as strips in the $(x, y)$ plane ($z = 0$).

Therefore, in this case
$$\begin{aligned}
I &= \int_{x=0}^{x=1} dx \int_{y=0}^{y=1-x} dy \int_{z=0}^{z=1-x-y} 1\,dz
   = \int_{x=0}^{x=1} dx \int_{y=0}^{y=1-x} dy\, \Big[ z \Big]_{z=0}^{z=1-x-y} \\
  &= \int_{x=0}^{x=1} dx \int_{y=0}^{y=1-x} (1 - x - y)\,dy
   = \int_{x=0}^{x=1} dx \left[ y - xy - \frac{y^2}{2} \right]_{y=0}^{y=1-x} \\
  &= \int_{x=0}^{x=1} \frac{(1 - x)^2}{2}\,dx = \frac{1}{6}
\end{aligned}$$
and this is the volume of the tetrahedron.
We can "picture" $\iiint_V f\,dV$ as a "four-dimensional" volume, but it is easier
to think of $f(x, y, z)$ as a mass density (i.e. mass per unit volume), so that
$$\iiint_V f\,dV = \text{total mass of } V .$$

3.6 Change of Variables in Volume Integrals

Changing variables in volume integrals is similar to the procedure used for double
integrals. Suppose
$$x = x(u, v, w), \qquad y = y(u, v, w), \qquad z = z(u, v, w) .$$
We define the Jacobian matrix for change of variables from $(x, y, z)$ to $(u, v, w)$
to be
$$M\!\left( \frac{x, y, z}{u, v, w} \right) =
\begin{pmatrix}
\partial x/\partial u & \partial x/\partial v & \partial x/\partial w \\
\partial y/\partial u & \partial y/\partial v & \partial y/\partial w \\
\partial z/\partial u & \partial z/\partial v & \partial z/\partial w
\end{pmatrix} .$$
We can also define the Jacobian determinant
$$\frac{\partial(x, y, z)}{\partial(u, v, w)} \equiv \det M$$
such that the transformation for volume is
$$dx\,dy\,dz = \left| \frac{\partial(x, y, z)}{\partial(u, v, w)} \right| du\,dv\,dw .$$
As before, for reversible transformations $\partial(x, y, z)/\partial(u, v, w) \neq 0$ and we have
$$\frac{\partial(x, y, z)}{\partial(u, v, w)} = \left[ \frac{\partial(u, v, w)}{\partial(x, y, z)} \right]^{-1} .$$
The integral under the change of variables becomes
$$\iiint_V f(x, y, z)\,dx\,dy\,dz = \iiint_{V'} f\big(x(u, v, w),\, y(u, v, w),\, z(u, v, w)\big) \left| \frac{\partial(x, y, z)}{\partial(u, v, w)} \right| du\,dv\,dw$$
where $V'$ is the transformed volume in $(u, v, w)$ coordinates.

Example:
Find an expression for $dx\,dy\,dz$ for the coordinate transformation from cartesian
coordinates to spherical polar coordinates given by
$$\begin{aligned}
x &= x(r, \theta, \phi) = r \sin\theta \cos\phi \\
y &= y(r, \theta, \phi) = r \sin\theta \sin\phi \\
z &= z(r, \theta, \phi) = r \cos\theta .
\end{aligned}$$
The relationship between the coordinate systems is shown in Fig. 3.24. Note
that $OP' = r \sin\theta$.

Fig. 3.24. The spherical polar coordinate system $(r, \theta, \phi)$ for a point $P$ in relation to
the cartesian coordinate system $(x, y, z)$.

Let $u = r$, $v = \theta$ and $w = \phi$. Then
$$\frac{\partial(x, y, z)}{\partial(u, v, w)} = \frac{\partial(x, y, z)}{\partial(r, \theta, \phi)} =
\begin{vmatrix}
\sin\theta \cos\phi & r \cos\theta \cos\phi & -r \sin\theta \sin\phi \\
\sin\theta \sin\phi & r \cos\theta \sin\phi & r \sin\theta \cos\phi \\
\cos\theta & -r \sin\theta & 0
\end{vmatrix}
= r^2 \sin\theta .$$
Therefore
$$dx\,dy\,dz = r^2 \sin\theta\,dr\,d\theta\,d\phi .$$

Example:
Consider the tetrahedron $T$ confined within the planes $x = 0$, $y = 0$, $z = 0$ and
$x + y + z = 1$. Consider the change of variables
$$x = u(1 - v), \qquad y = uv(1 - w), \qquad z = uvw .$$
Calculate $u$, $v$ and $w$ in terms of $x$, $y$ and $z$. Deduce that in the new variables $T$
is given by $0 < u < 1$, $0 < v < 1$, and $0 < w < 1$. Hence calculate the volume
of $T$.
From the definitions of $u$, $v$ and $w$ we have
$$u = x + y + z \tag{3.10}$$
$$v = \frac{y + z}{x + y + z} \tag{3.11}$$
$$w = \frac{z}{y + z} \tag{3.12}$$
Taking Eq. (3.10) with $(x, y, z) = (0, 0, 0)$ gives $u = 0$ while $x + y + z = 1$
gives $u = 1$.
Taking Eq. (3.11) with $y = z = 0$ and $x \neq 0$ gives $v = 0$ while $x = 0$ and
$(y, z) \neq (0, 0)$ gives $v = 1$.
Taking Eq. (3.12) with $z = 0$ and $y \neq 0$ gives $w = 0$ while $y = 0$ and $z \neq 0$
gives $w = 1$.
Therefore the new limits are $0 < u < 1$, $0 < v < 1$ and $0 < w < 1$.
The Jacobian determinant is
$$\frac{\partial(x, y, z)}{\partial(u, v, w)} =
\begin{vmatrix}
1 - v & -u & 0 \\
v(1 - w) & u(1 - w) & -uv \\
vw & uw & uv
\end{vmatrix}
= u^2 v .$$
The volume is given by
$$\iiint_T 1 \cdot dx\,dy\,dz = \int_0^1 \!\! \int_0^1 \!\! \int_0^1 1 \cdot u^2 v\,du\,dv\,dw
= \int_{u=0}^{u=1} u^2\,du \int_{v=0}^{v=1} v\,dv \int_{w=0}^{w=1} dw = \frac{1}{6}$$
as before.
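The two Jacobian determinants of Section 3.6, for spherical polar coordinates and for the tetrahedron substitution x = u(1 − v), y = uv(1 − w), z = uvw, can be cross-checked numerically. The sketch below is an illustration only and not part of the original notes: the helper `jacobian_det`, the central-difference step `h` and the sample points are assumptions. It builds the 3×3 matrix of partial derivatives by finite differences and expands the determinant along the first row.

```python
import math


def jacobian_det(transform, p, h=1e-6):
    """Central-difference estimate of det d(x,y,z)/d(u,v,w) at the point p = (u, v, w)."""
    cols = []
    for k in range(3):
        plus, minus = list(p), list(p)
        plus[k] += h
        minus[k] -= h
        f_plus, f_minus = transform(*plus), transform(*minus)
        # Column k holds the derivatives of (x, y, z) with respect to the k-th new variable.
        cols.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    # m[i][k] = d(x_i)/d(u_k), matching the layout of the Jacobian matrix M.
    m = [[cols[k][i] for k in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def spherical(r, theta, phi):
    """Spherical polar coordinates (r, theta, phi) -> cartesian (x, y, z)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def tetra(u, v, w):
    """The substitution used for the tetrahedron T."""
    return (u * (1 - v), u * v * (1 - w), u * v * w)


# Sample points chosen arbitrarily for the check.
r, theta, phi = 2.0, 0.7, 1.1
det_spherical = jacobian_det(spherical, (r, theta, phi))
print(det_spherical, r**2 * math.sin(theta))   # numerical value vs r^2 sin(theta)

u, v, w = 0.5, 0.4, 0.3
det_tetra = jacobian_det(tetra, (u, v, w))
print(det_tetra, u**2 * v)                     # numerical value vs u^2 v
```

Agreement at a few sample points is a sanity check on the algebra rather than a proof; the hand calculation of the determinant remains the actual derivation.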