1 The Chain Rule

• With functions of several variables, each of which is defined in terms of other variables (for example, f (x, y)
where x and y are themselves functions of s and t), we recover versions of the Chain Rule specific to the
relations between the variables involved.
◦ Recall that the Chain Rule for functions of one variable states that $\frac{dg}{dx} = \frac{dg}{dy} \cdot \frac{dy}{dx}$, if g is a function of y and y is a function of x.
◦ The various Chain Rules for functions of more than one variable have a similar form, but they will involve
more terms that depend on the relationships between the variables.
◦ Here is the rough idea of the formal proof, for the situation of finding $\frac{df}{dt}$ for a function f(x, y) where x and y are both functions of t:
∗ If t changes to t + ∆t then x changes to x + ∆x and y changes to y + ∆y.
∗ Then $\Delta f = f(x + \Delta x,\, y + \Delta y) - f(x, y)$ is roughly equal to the directional derivative of f(x, y) in the direction of the (non-unit) vector $\langle \Delta x, \Delta y \rangle$.
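∗ From the results about the gradient and directional derivatives, we know that this directional derivative is equal to the dot product of $\langle \Delta x, \Delta y \rangle$ with the gradient $\nabla f = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle$.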
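∗ Then $\frac{\Delta f}{\Delta t} \approx \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle \cdot \left\langle \frac{\Delta x}{\Delta t}, \frac{\Delta y}{\Delta t} \right\rangle$. Taking the limit as $\Delta t \to 0$ (and verifying a few other details) then gives us that $\frac{df}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt}$.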
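The limiting step in this sketch can be checked numerically. The following is a minimal illustration, not part of the proof: the functions f, x(t), and y(t) are hypothetical choices made only for this example. It compares the difference quotient Δf/Δt with the Chain Rule value for a shrinking Δt.

```python
import math

# Hypothetical sample functions, chosen only for illustration.
def f(x, y):
    return x**2 * y + math.sin(y)

def x_of(t):
    return math.cos(t)

def y_of(t):
    return t**3

def grad_f(x, y):
    # Exact partial derivatives (df/dx, df/dy) of the sample f above.
    return (2 * x * y, x**2 + math.cos(y))

def difference_quotient(t, dt):
    # Delta f / Delta t along the curve (x(t), y(t)).
    return (f(x_of(t + dt), y_of(t + dt)) - f(x_of(t), y_of(t))) / dt

def chain_rule_value(t):
    # (df/dx)(dx/dt) + (df/dy)(dy/dt), using exact derivatives.
    fx, fy = grad_f(x_of(t), y_of(t))
    dx_dt = -math.sin(t)
    dy_dt = 3 * t**2
    return fx * dx_dt + fy * dy_dt

if __name__ == "__main__":
    t = 0.7
    for dt in (1e-1, 1e-3, 1e-5):
        print(dt, difference_quotient(t, dt), chain_rule_value(t))
```

As Δt shrinks, the first printed value should approach the second, which is what the limit argument predicts.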
• The simplest method for generating the statement of the Chain Rule specific to any particular set of dependencies of variables is to draw a “tree diagram” as follows:
◦ Step 1: Start with the initial function f, and draw an arrow pointing from f to each of the variables it depends on.
◦ Step 2: For each variable listed, draw new arrows branching from that variable to any other variables it depends on. Repeat the process until all dependencies are shown in the diagram.
◦ Step 3: Associate each arrow from one variable to another with the derivative $\frac{\partial[\text{top}]}{\partial[\text{bottom}]}$.
◦ Step 4: To write the version of the Chain Rule that gives the derivative $\frac{\partial v_1}{\partial v_2}$ for any variables $v_1$ and $v_2$ in the diagram (where $v_1$ depends on $v_2$), first find all paths from $v_1$ to $v_2$.
◦ Step 5: For each path from $v_1$ to $v_2$, multiply all of the derivatives that appear along that path. Then sum the results over all of the paths: this is $\frac{\partial v_1}{\partial v_2}$.
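The five steps above can be sketched as a short program. This is only an illustration under stated assumptions: the dictionary `tree` and the helper names `chain_rule_terms` and `chain_rule_string` are hypothetical, and the printed `d` stands for whichever of d or ∂ is appropriate.

```python
def chain_rule_terms(tree, top, bottom):
    """Return one list of (upper, lower) derivative factors per path from top to bottom."""
    if top == bottom:
        return [[]]  # a single empty path: nothing left to multiply
    paths = []
    for child in tree.get(top, []):
        # Each arrow top -> child contributes the factor d(top)/d(child) (Step 3).
        for rest in chain_rule_terms(tree, child, bottom):
            paths.append([(top, child)] + rest)
    return paths

def chain_rule_string(tree, top, bottom):
    # Multiply the factors along each path, then sum over all paths (Step 5).
    terms = []
    for path in chain_rule_terms(tree, top, bottom):
        terms.append(" * ".join(f"d{u}/d{v}" for u, v in path))
    return f"d{top}/d{bottom} = " + " + ".join(terms)

# Steps 1-2: f depends on x and y, each of which depends on s and t.
tree = {"f": ["x", "y"], "x": ["s", "t"], "y": ["s", "t"]}
print(chain_rule_string(tree, "f", "t"))
# prints: df/dt = df/dx * dx/dt + df/dy * dy/dt
```

Storing the arrows as a dictionary from each variable to the variables it depends on keeps Step 4 (finding all paths) a plain recursive search.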
• Example: State the Chain Rule that computes $\frac{df}{dt}$ for the function f(x, y, z), where each of x, y, and z is a function of the variable t.
◦ Steps 1-2: The tree diagram looks like this:
          f
       ↙  ↓  ↘
      x   y   z
      ↓   ↓   ↓
      t   t   t
◦ Step 4: In this diagram, there are 3 paths from f to t: they are f → x → t, f → y → t, and f → z → t.
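◦ Step 5: The path f → x → t gives the product $\frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial t}$, while the path f → y → t gives the product $\frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial t}$, and the path f → z → t gives the product $\frac{\partial f}{\partial z} \cdot \frac{\partial z}{\partial t}$.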
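∗ Then the statement of the Chain Rule here is $\frac{df}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt} + \frac{\partial f}{\partial z} \cdot \frac{dz}{dt}$. (Since x, y, and z each depend only on t, their derivatives with respect to t are ordinary derivatives, and f is ultimately a function of t alone.)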
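As a quick sanity check of this three-term formula, here is a worked instance. The specific functions f = xyz, x = t, y = t², z = t³ are a hypothetical choice made only for illustration.

```latex
\[
\frac{df}{dt}
  = \frac{\partial f}{\partial x}\cdot\frac{dx}{dt}
  + \frac{\partial f}{\partial y}\cdot\frac{dy}{dt}
  + \frac{\partial f}{\partial z}\cdot\frac{dz}{dt}
  = (yz)(1) + (xz)(2t) + (xy)(3t^2)
  = t^5 + 2t^5 + 3t^5
  = 6t^5 .
\]
% Direct computation: f = t \cdot t^2 \cdot t^3 = t^6, so df/dt = 6t^5, which agrees.
```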
• Example: State the Chain Rule that computes $\frac{\partial f}{\partial s}$ and $\frac{\partial f}{\partial t}$ for the function f(x, y), where x = x(s, t) and y = y(s, t) are both functions of s and t.
◦ Steps 1-2: The tree diagram here is
            f
          ↙   ↘
        x       y
       ↙ ↘     ↙ ↘
      s   t   s   t
◦ Step 4: In this diagram, there are 2 paths from f to s: they are f → x → s and f → y → s, and also
two paths from f to t: f → x → t and f → y → t.
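◦ Step 5: The path f → x → t gives the product $\frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial t}$, while the path f → y → t gives the product $\frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial t}$. Similarly, the path f → x → s gives the product $\frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial s}$, while the path f → y → s gives the product $\frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial s}$.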
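∗ Then the two statements of the Chain Rule here are $\frac{\partial f}{\partial s} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial s}$ and $\frac{\partial f}{\partial t} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial t}$.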
• Once we have the appropriate statement of the Chain Rule, it is easy to compute examples with specific
functions.
• Example: For $f(x, y) = x^2 + y^2$, with $x = t^2$ and $y = t^4$, find $\frac{df}{dt}$, both directly and via the Chain Rule.
◦ In this instance, the Chain Rule says that $\frac{df}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt}$.
◦ Computing the derivatives shows $\frac{df}{dt} = (2x) \cdot (2t) + (2y) \cdot (4t^3)$.
◦ Plugging in $x = t^2$ and $y = t^4$ yields $\frac{df}{dt} = (2t^2) \cdot (2t) + (2t^4) \cdot (4t^3) = 4t^3 + 8t^7$.
◦ To do this directly, we would plug in $x = t^2$ and $y = t^4$: this gives $f(x, y) = t^4 + t^8$, so that $\frac{df}{dt} = 4t^3 + 8t^7$.
◦ As should be true if the Chain Rule formula is correct, we obtain the same answer either way.
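The agreement between the two methods can also be confirmed numerically. The sketch below uses only the functions from this example; the helper names and the central-difference check are additions for illustration.

```python
def f(x, y):
    return x**2 + y**2

def chain_rule_df_dt(t):
    # (df/dx)(dx/dt) + (df/dy)(dy/dt) with x = t^2 and y = t^4.
    x, y = t**2, t**4
    return (2 * x) * (2 * t) + (2 * y) * (4 * t**3)

def direct_df_dt(t):
    # Derivative of f = t^4 + t^8 after substituting x and y.
    return 4 * t**3 + 8 * t**7

def central_difference(t, h=1e-6):
    # Numerical derivative of t -> f(t^2, t^4), independent of both formulas.
    def g(u):
        return f(u**2, u**4)
    return (g(t + h) - g(t - h)) / (2 * h)

if __name__ == "__main__":
    t = 1.5
    print(chain_rule_df_dt(t), direct_df_dt(t), central_difference(t))
```

At t = 1.5 both formulas give 150.1875, and the numerical estimate should agree to several decimal places.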
• Example: For $f(x, y) = x^2 + y^2$, with $x = s^2 + t^2$ and $y = s^3 + t^4$, find $\frac{\partial f}{\partial s}$ and $\frac{\partial f}{\partial t}$ both directly and via the Chain Rule.
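◦ By the Chain Rule we have $\frac{\partial f}{\partial s} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial s} = (2x) \cdot (2s) + (2y) \cdot (3s^2)$. Plugging in $x = s^2 + t^2$ and $y = s^3 + t^4$ yields $\frac{\partial f}{\partial s} = (2s^2 + 2t^2) \cdot (2s) + (2s^3 + 2t^4) \cdot (3s^2) = 4s^3 + 4st^2 + 6s^5 + 6s^2 t^4$.
◦ We also have $\frac{\partial f}{\partial t} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial t} = (2x) \cdot (2t) + (2y) \cdot (4t^3)$. Plugging in $x = s^2 + t^2$ and $y = s^3 + t^4$ yields $\frac{\partial f}{\partial t} = (2s^2 + 2t^2) \cdot (2t) + (2s^3 + 2t^4) \cdot (4t^3) = 4s^2 t + 4t^3 + 8s^3 t^3 + 8t^7$.
◦ To do this directly, we plug in $x = s^2 + t^2$ and $y = s^3 + t^4$: this gives $f(x, y) = (s^2 + t^2)^2 + (s^3 + t^4)^2 = s^4 + 2s^2 t^2 + t^4 + s^6 + 2s^3 t^4 + t^8$, so that $\frac{\partial f}{\partial t} = 4s^2 t + 4t^3 + 8s^3 t^3 + 8t^7$ and $\frac{\partial f}{\partial s} = 4s^3 + 4st^2 + 6s^5 + 6s^2 t^4$.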
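Both partial derivatives in this example can be spot-checked at a sample point. The sketch below is illustrative only: the helper names are hypothetical, and the point (s, t) = (2, 3) is an arbitrary choice.

```python
def x_of(s, t):
    return s**2 + t**2

def y_of(s, t):
    return s**3 + t**4

def f(x, y):
    return x**2 + y**2

def chain_rule_partials(s, t):
    # Sum over the two paths through x and y for each of s and t.
    x, y = x_of(s, t), y_of(s, t)
    df_ds = (2 * x) * (2 * s) + (2 * y) * (3 * s**2)
    df_dt = (2 * x) * (2 * t) + (2 * y) * (4 * t**3)
    return df_ds, df_dt

def expanded_partials(s, t):
    # The fully expanded polynomials obtained in the example.
    df_ds = 4 * s**3 + 4 * s * t**2 + 6 * s**5 + 6 * s**2 * t**4
    df_dt = 4 * s**2 * t + 4 * t**3 + 8 * s**3 * t**3 + 8 * t**7
    return df_ds, df_dt

def numeric_partials(s, t, h=1e-6):
    # Central differences in s and in t, holding the other variable fixed.
    def F(a, b):
        return f(x_of(a, b), y_of(a, b))
    df_ds = (F(s + h, t) - F(s - h, t)) / (2 * h)
    df_dt = (F(s, t + h) - F(s, t - h)) / (2 * h)
    return df_ds, df_dt

if __name__ == "__main__":
    print(chain_rule_partials(2, 3))   # (2240, 19380)
    print(expanded_partials(2, 3))     # (2240, 19380)
    print(numeric_partials(2.0, 3.0))
```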