Section 14.5
The Chain Rule for Multivariate Functions
In Calculus 1, we often considered functions f (x) where x was actually a function of another
variable, say x = g(t). To differentiate f with respect to t (as opposed to differentiating it with
respect to x), we must use the chain rule:
\[
\frac{d}{dt} f(g(t)) = f'(g(t))\,g'(t) = f'(x)\,\frac{dx}{dt}.
\]
For example, if $f(x) = \sin x$ and $x = g(t) = e^t$, then the rate of change of $f$ with respect to $x$
is given by
\[
\frac{d}{dx} f(x) = \frac{d}{dx} \sin x = \cos x,
\]
while the rate of change of $f$ with respect to $t$ is given by
\[
\frac{d}{dt} f(x) = \frac{d}{dt} f(g(t)) = \frac{d}{dt} \sin(e^t) = e^t \cos(e^t).
\]
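As a quick numerical sanity check of this example, the following Python sketch compares the chain-rule answer $e^t \cos(e^t)$ with a central-difference estimate of the derivative. The helper name `central_diff`, the step size `h`, and the sample point `t0 = 0.5` are illustrative assumptions, not part of the original notes.

```python
import math

def central_diff(func, t, h=1e-6):
    """Estimate func'(t) with a symmetric difference quotient."""
    return (func(t + h) - func(t - h)) / (2 * h)

def composite(t):
    """f(g(t)) with f(x) = sin x and g(t) = e^t."""
    return math.sin(math.exp(t))

def chain_rule_derivative(t):
    """f'(g(t)) * g'(t) = cos(e^t) * e^t."""
    return math.cos(math.exp(t)) * math.exp(t)

t0 = 0.5  # arbitrary sample point, chosen only for illustration
numeric = central_diff(composite, t0)
exact = chain_rule_derivative(t0)
print(numeric, exact)  # the two values should agree to several digits
```

Agreement at one point does not prove the formula, but a mismatch would flag an algebra slip.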
Now that we are working with multivariate functions, we can consider similar questions for
f (x, y) when x and y are actually functions of another variable (or of multiple variables). There
are many different such situations, but all of the outcomes look very similar to the chain rule for a
single-variable function.
The Chain Rule for f (x, y) when x and y are functions of a parameter t
We have seen that, if f is a function of two variables x and y, then each of the variables x and
y plays a role in the rate of change of the height of the surface f (x, y). Because of this, we consider
the rate of change of f with respect to x, fx , and the rate of change of f with respect to y, fy ,
separately.
Now it may be the case that each of x and y is actually controlled by a single variable t; if so,
we think of x as x = x(t) and y as y = y(t). Then we may rewrite f (x, y) = f (x(t), y(t)). While we
can still consider the rate of change of f with respect to x or with respect to y (the intermediate
variables), thinking of f as a function of the single variable t allows us to consider the rate of change
of f with respect to t.
Let's break down the way the chain rule for one intermediate variable works:
\[
\frac{d}{dt} f(g(t))
\]
is determined by finding the "intermediate derivative" $f'$, as well as the derivative $g'$ of the inside
function $g$. We put all of this information together by multiplying $f'(g(t))$ by $g'(t)$, so that
\[
\frac{d}{dt} f(g(t)) = f'(g(t))\,g'(t).
\]
The chain rule for $f(x(t), y(t))$ works nearly the same way: we calculate
\[
\frac{d}{dt} f(x(t), y(t))
\]
by finding the "intermediate derivatives" $f_x$ and $f_y$, as well as the derivatives $x'(t)$ and $y'(t)$ of the
inside functions $x(t)$ and $y(t)$. Finally, we put all of the information together:
Theorem 2. If $f(x, y)$ has continuous partial derivatives $f_x$ and $f_y$, and if $x = x(t)$ and $y = y(t)$
are differentiable functions of $t$, then the composite function $f(x(t), y(t))$ is a differentiable function
of $t$ and
\[
\frac{d}{dt} f(x(t), y(t)) = f_x(x(t), y(t))\,x'(t) + f_y(x(t), y(t))\,y'(t)
= \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.
\]
Notice that our final answer is no longer a partial derivative; since we are thinking of f as being
controlled by the single variable t, our answer is a full derivative, not a partial one.
Example:
Given the function $f(x, y) = x^3 \sin y$, where $x = \ln t$ and $y = t^2$, find:
1. The rate of change of f with respect to x
2. The rate of change of f with respect to y
3. The rate of change of f with respect to t.
To answer (1), we calculate the partial $f_x$:
\[
f_x = 3x^2 \sin y.
\]
Similarly, for (2),
\[
f_y = x^3 \cos y.
\]
To find the rate of change of $f$ with respect to $t$, we will need to know the rates of change of $x$
and $y$ with respect to $t$:
\[
\frac{d}{dt} x(t) = \frac{d}{dt} \ln t = \frac{1}{t}
\]
and
\[
\frac{d}{dt} y(t) = \frac{d}{dt} t^2 = 2t.
\]
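So the derivative of $f$ with respect to $t$ is
\[
\begin{aligned}
\frac{df}{dt} &= \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \\
&= (3x^2 \sin y)\left(\frac{1}{t}\right) + (x^3 \cos y)(2t) \\
&= \left(3(\ln t)^2 \sin(t^2)\right)\left(\frac{1}{t}\right) + \left((\ln t)^3 \cos(t^2)\right)(2t) \\
&= \frac{3(\ln t)^2 \sin(t^2)}{t} + 2t(\ln t)^3 \cos(t^2).
\end{aligned}
\]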
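This is actually not the only way to calculate $\frac{df}{dt}$. Since we already know that $x = \ln t$ and
$y = t^2$, we could begin by rewriting $f(x, y)$ as $f(\ln t, t^2) = (\ln t)^3 \sin(t^2)$. Then, by the product rule and the single-variable chain rule,
\[
\begin{aligned}
\frac{df}{dt} &= \frac{d}{dt}\left[(\ln t)^3 \sin(t^2)\right] \\
&= \frac{3(\ln t)^2 \sin(t^2)}{t} + 2t(\ln t)^3 \cos(t^2).
\end{aligned}
\]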
This is exactly the same answer we got in the previous computation, and either method is
perfectly acceptable to use (although the first calculation may be a bit simpler since it breaks the
derivative down into more manageable pieces).
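The agreement between the two methods can also be checked numerically. The sketch below evaluates the multivariate chain-rule expression, the direct single-variable formula, and a central-difference estimate for $f(x, y) = x^3 \sin y$ with $x = \ln t$, $y = t^2$. The function names, the step size `h`, and the sample point `t0 = 2.0` are assumptions made for illustration.

```python
import math

def x_of_t(t):
    return math.log(t)

def y_of_t(t):
    return t ** 2

def f(x, y):
    return x ** 3 * math.sin(y)

def df_dt_chain_rule(t):
    """f_x * dx/dt + f_y * dy/dt, evaluated at (x(t), y(t))."""
    x, y = x_of_t(t), y_of_t(t)
    f_x = 3 * x ** 2 * math.sin(y)
    f_y = x ** 3 * math.cos(y)
    dx_dt = 1 / t
    dy_dt = 2 * t
    return f_x * dx_dt + f_y * dy_dt

def df_dt_direct(t):
    """Derivative of (ln t)^3 sin(t^2) computed after substituting for x and y."""
    return (3 * math.log(t) ** 2 * math.sin(t ** 2) / t
            + 2 * t * math.log(t) ** 3 * math.cos(t ** 2))

def df_dt_numeric(t, h=1e-6):
    """Central-difference estimate of d/dt f(x(t), y(t))."""
    forward = f(x_of_t(t + h), y_of_t(t + h))
    backward = f(x_of_t(t - h), y_of_t(t - h))
    return (forward - backward) / (2 * h)

t0 = 2.0  # arbitrary sample point with t > 0 so that ln t is defined
print(df_dt_chain_rule(t0), df_dt_direct(t0), df_dt_numeric(t0))
```

All three printed values should agree up to the finite-difference error.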
The Chain Rule for f (x, y) when x and y are functions of two variables s and t
In the case above, the intermediate variables $x$ and $y$ were controlled by a single variable $t$; in
other words, we can view $x$ and $y$ as single-variable functions. However, the intermediate variables
$x$ and $y$ could each be multivariate functions: if $x = g(s, t)$ and $y = h(s, t)$, then we can either choose
to think of $f(x, y)$ as a function of the two variables $x$ and $y$, or as $f(g(s, t), h(s, t))$, a function of $s$ and $t$. In particular, we can now
consider the rate at which $f$ changes with respect to either of the variables $s$ or $t$, i.e. we can find
$\frac{\partial f}{\partial s}$ and $\frac{\partial f}{\partial t}$.
We need another version of the chain rule for this situation:
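Theorem 3. If the functions $f(x, y)$, $x = g(s, t)$, and $y = h(s, t)$ are differentiable, then the partial
derivatives of $f$ with respect to $s$ and $t$ are given by
\[
\frac{\partial}{\partial s} f(x, y) = \frac{\partial}{\partial s} f(g(s, t), h(s, t))
= \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}
\]
and
\[
\frac{\partial}{\partial t} f(x, y) = \frac{\partial}{\partial t} f(g(s, t), h(s, t))
= \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}.
\]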
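To see the two formulas in action, here is a small Python sketch on a hypothetical example that does not appear in the notes: $f(x, y) = x^2 y$ with $x = st$ and $y = s + t$. Substituting first gives $f = s^3 t^2 + s^2 t^3$, so the chain-rule results can be compared against partials computed directly. All function names and the sample point $(s, t) = (2, 3)$ are assumptions for illustration.

```python
def partials_chain_rule(s, t):
    """Partials of f(x, y) = x^2 * y with x = s*t, y = s + t, via the chain rule."""
    x, y = s * t, s + t
    f_x, f_y = 2 * x * y, x ** 2   # partials of f in the intermediate variables
    x_s, x_t = t, s                # partials of x = s*t
    y_s, y_t = 1, 1                # partials of y = s + t
    df_ds = f_x * x_s + f_y * y_s
    df_dt = f_x * x_t + f_y * y_t
    return df_ds, df_dt

def partials_direct(s, t):
    """Partials of s^3 t^2 + s^2 t^3, obtained by substituting before differentiating."""
    df_ds = 3 * s ** 2 * t ** 2 + 2 * s * t ** 3
    df_dt = 2 * s ** 3 * t + 3 * s ** 2 * t ** 2
    return df_ds, df_dt

print(partials_chain_rule(2, 3))  # (216, 156)
print(partials_direct(2, 3))      # (216, 156)
```

Integer inputs keep the arithmetic exact, so the two approaches can be compared with plain equality.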
We can make the above discussion far more general: we can consider a function $u$ of $n$ variables
$x_1, x_2, \ldots, x_n$, each of which is a function of the $m$ variables $t_1, t_2, \ldots, t_m$. Then the function $u$ is
actually a function of $t_1, t_2, \ldots, t_m$, and we can evaluate the partial derivative of $u$ with respect to
any of the $t_i$.
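Theorem 4. The partial derivative $\frac{\partial u}{\partial t_i}$ is given by
\[
\frac{\partial u}{\partial t_i} = \frac{\partial u}{\partial x_1}\frac{\partial x_1}{\partial t_i}
+ \frac{\partial u}{\partial x_2}\frac{\partial x_2}{\partial t_i}
+ \cdots
+ \frac{\partial u}{\partial x_n}\frac{\partial x_n}{\partial t_i}.
\]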
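The sum in Theorem 4 has the same shape for every $n$ and $m$, which makes it easy to express as a short routine. The sketch below assumes the partials have already been evaluated at a point: `grad_u[j]` holds $\partial u / \partial x_{j+1}$ and `jacobian[j][i]` holds $\partial x_{j+1} / \partial t_{i+1}$ (zero-based indices). The function name and the sample numbers are hypothetical; the numbers come from $u = x_1^2 x_2$, $x_1 = t_1 t_2$, $x_2 = t_1 + t_2$ at $(t_1, t_2) = (2, 3)$.

```python
def chain_rule_partial(grad_u, jacobian, i):
    """Return sum over j of (du/dx_j) * (dx_j/dt_i) for a fixed index i."""
    return sum(du_dxj * row[i] for du_dxj, row in zip(grad_u, jacobian))

# At (t1, t2) = (2, 3): x1 = 6, x2 = 5, so du/dx1 = 2*x1*x2 = 60 and du/dx2 = x1^2 = 36.
grad_u = [60, 36]
# Rows are intermediate variables, columns are t1 and t2.
jacobian = [
    [3, 2],  # dx1/dt1 = t2, dx1/dt2 = t1
    [1, 1],  # dx2/dt1 = 1,  dx2/dt2 = 1
]

print(chain_rule_partial(grad_u, jacobian, 0))  # 216, i.e. du/dt1
print(chain_rule_partial(grad_u, jacobian, 1))  # 156, i.e. du/dt2
```

Seen this way, each partial is the dot product of the gradient of $u$ with one column of the matrix of partials of the $x_j$.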
Example:
Consider the function $f(x, y, z) = xyz$, where each of $x$, $y$, and $z$ is a function of the variables $r$ and $s$,
given by $x = r^2 + s$, $y = r \cos s$, and $z = \sin(rs)$. Find the partials $\frac{\partial f}{\partial r}$ and $\frac{\partial f}{\partial s}$.
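To use the formulas given above, we first need to calculate $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial f}{\partial z}$:
\[
\frac{\partial f}{\partial x} = yz, \quad
\frac{\partial f}{\partial y} = xz, \quad \text{and} \quad
\frac{\partial f}{\partial z} = xy.
\]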
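We also need to find the partials of each intermediate variable $x$, $y$, and $z$ with respect to $r$ and
$s$:
\[
\frac{\partial x}{\partial r} = 2r, \quad
\frac{\partial y}{\partial r} = \cos s, \quad \text{and} \quad
\frac{\partial z}{\partial r} = s \cos(rs);
\]
and
\[
\frac{\partial x}{\partial s} = 1, \quad
\frac{\partial y}{\partial s} = -r \sin s, \quad \text{and} \quad
\frac{\partial z}{\partial s} = r \cos(rs).
\]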
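Finally, we have
\[
\begin{aligned}
\frac{\partial}{\partial r} f(x, y, z)
&= \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial r} \\
&= (yz)(2r) + (xz)(\cos s) + (xy)(s \cos(rs)) \\
&= (r \cos s)(\sin(rs))(2r) + (r^2 + s)(\sin(rs))(\cos s) + (r^2 + s)(r \cos s)(s \cos(rs)) \\
&= 2r^2 (\cos s)(\sin(rs)) + (r^2 + s)(\cos s)(\sin(rs)) + (r^3 s + r s^2)(\cos s)(\cos(rs))
\end{aligned}
\]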
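and
\[
\begin{aligned}
\frac{\partial}{\partial s} f(x, y, z)
&= \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial s} \\
&= (yz)(1) + (xz)(-r \sin s) + (xy)(r \cos(rs)) \\
&= (r \cos s)(\sin(rs)) + (r^2 + s)(\sin(rs))(-r \sin s) + (r^2 + s)(r \cos s)(r \cos(rs)) \\
&= r(\cos s)(\sin(rs)) - (r^3 + rs)(\sin s)(\sin(rs)) + (r^4 + r^2 s)(\cos s)(\cos(rs)).
\end{aligned}
\]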
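With three intermediate variables it is easy to drop or mistype a factor, so a numerical cross-check is worthwhile. The following Python sketch evaluates the chain-rule sums for $f = xyz$ with $x = r^2 + s$, $y = r \cos s$, $z = \sin(rs)$ and compares them against central-difference estimates. The function names, the step size `h`, and the sample point $(r, s) = (1.2, 0.7)$ are illustrative assumptions.

```python
import math

def intermediates(r, s):
    """Return (x, y, z) for x = r^2 + s, y = r cos s, z = sin(rs)."""
    return r ** 2 + s, r * math.cos(s), math.sin(r * s)

def f_of_rs(r, s):
    x, y, z = intermediates(r, s)
    return x * y * z

def partials_chain_rule(r, s):
    """Compute (df/dr, df/ds) as sums of products of partials."""
    x, y, z = intermediates(r, s)
    f_x, f_y, f_z = y * z, x * z, x * y
    x_r, y_r, z_r = 2 * r, math.cos(s), s * math.cos(r * s)
    x_s, y_s, z_s = 1.0, -r * math.sin(s), r * math.cos(r * s)
    df_dr = f_x * x_r + f_y * y_r + f_z * z_r
    df_ds = f_x * x_s + f_y * y_s + f_z * z_s
    return df_dr, df_ds

def partials_numeric(r, s, h=1e-6):
    """Central-difference estimates of (df/dr, df/ds)."""
    df_dr = (f_of_rs(r + h, s) - f_of_rs(r - h, s)) / (2 * h)
    df_ds = (f_of_rs(r, s + h) - f_of_rs(r, s - h)) / (2 * h)
    return df_dr, df_ds

r0, s0 = 1.2, 0.7  # arbitrary sample point, chosen only for illustration
print(partials_chain_rule(r0, s0))
print(partials_numeric(r0, s0))
```

The two printed pairs should match to roughly the accuracy of the difference quotient.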
Implicit Differentiation
The information we have seen in this section can help us simplify the process of implicit differentiation that we learned in Calculus 1. Recall that an equation in terms of $x$ and $y$, such
as
\[
x \sin(xy) = 0,
\]
gives us a relationship between the two variables; in particular, the equation defines $y$ as a function
of $x$, so that we can find $\frac{dy}{dx}$.
On the other hand, if $F(x, y)$ is a multivariate function, setting $F(x, y) = 0$ and thinking of $y$
as a function of $x$, $y = y(x)$, we can use the chain rule formula above to calculate $\frac{dy}{dx}$:
\[
\begin{aligned}
\frac{d}{dx} 0 &= \frac{d}{dx} F(x, y) \\
&= F_x \frac{dx}{dx} + F_y \frac{dy}{dx} \\
&= F_x + F_y \frac{dy}{dx}.
\end{aligned}
\]
Since $\frac{d}{dx} 0 = 0$, we have $F_x + F_y \frac{dy}{dx} = 0$; solving for $\frac{dy}{dx}$, we see that
\[
\frac{dy}{dx} = -\frac{F_x}{F_y}.
\]
Theorem 0.0.1. If $F(x, y)$ is a differentiable function and $F(x, y) = 0$ defines $y$ as a differentiable function
of $x$, then, wherever $F_y \neq 0$, the rate of change of $y$ with respect to $x$ is given by
\[
\frac{dy}{dx} = -\frac{F_x}{F_y}.
\]
Example:
Given the equation $\sin(xy) - x^3 y = 0$, find $\frac{dy}{dx}$.
Thinking of $F(x, y) = \sin(xy) - x^3 y$, we can use the formula in the theorem above: since $F_x = y \cos(xy) - 3x^2 y$ and $F_y = x \cos(xy) - x^3$, we have
\[
\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{y \cos(xy) - 3x^2 y}{x \cos(xy) - x^3}.
\]
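One way to test this formula numerically is to solve $F(x, y) = 0$ for $y$ at nearby values of $x$ and compare the resulting slope with $-F_x / F_y$. The Python sketch below does this with Newton's method in $y$. The starting guess `1.2` and the point `x0 = 0.9` are assumptions chosen so that the iteration lands on a solution with $y \neq 0$ (the curve also contains the trivial branch $y = 0$); the helper `solve_y` is hypothetical and assumes $F_y \neq 0$ near the root.

```python
import math

def F(x, y):
    return math.sin(x * y) - x ** 3 * y

def F_x(x, y):
    return y * math.cos(x * y) - 3 * x ** 2 * y

def F_y(x, y):
    return x * math.cos(x * y) - x ** 3

def solve_y(x, y_guess, tol=1e-14, max_iter=50):
    """Newton's method in y for fixed x; assumes F_y is nonzero near the root."""
    y = y_guess
    for _ in range(max_iter):
        step = F(x, y) / F_y(x, y)
        y -= step
        if abs(step) < tol:
            break
    return y

x0 = 0.9                # sample x, chosen only for illustration
y0 = solve_y(x0, 1.2)   # point on the curve with y != 0

slope_formula = -F_x(x0, y0) / F_y(x0, y0)

h = 1e-5
slope_numeric = (solve_y(x0 + h, y0) - solve_y(x0 - h, y0)) / (2 * h)

print(y0, slope_formula, slope_numeric)
```

The two slopes should agree closely; a large discrepancy would suggest either an error in the partials or that the solver jumped to a different branch of the curve.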