1 The Chain Rule

• With functions of several variables, each of which is defined in terms of other variables (for example, $f(x, y)$ where $x$ and $y$ are themselves functions of $s$ and $t$), we recover versions of the Chain Rule specific to the relations between the variables involved.
  ◦ Recall that the Chain Rule for functions of one variable states that $\frac{dg}{dx} = \frac{dg}{dy} \cdot \frac{dy}{dx}$, if $g$ is a function of $y$ and $y$ is a function of $x$.
  ◦ The various Chain Rules for functions of more than one variable have a similar form, but they involve more terms, which depend on the relationships between the variables.
  ◦ Here is the rough idea of the formal proof, for the situation of finding $\frac{df}{dt}$ for a function $f(x, y)$ where $x$ and $y$ are both functions of $t$:
    ∗ If $t$ changes to $t + \Delta t$ then $x$ changes to $x + \Delta x$ and $y$ changes to $y + \Delta y$.
    ∗ Then $\Delta f = f(x + \Delta x, y + \Delta y) - f(x, y)$ is roughly equal to the directional derivative of $f(x, y)$ in the direction of the (non-unit) vector $\langle \Delta x, \Delta y \rangle$.
    ∗ From the results about the gradient and directional derivatives, we know that this directional derivative is equal to the dot product of $\langle \Delta x, \Delta y \rangle$ with the gradient $\nabla f = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle$.
    ∗ Then $\frac{\Delta f}{\Delta t} \approx \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle \cdot \left\langle \frac{\Delta x}{\Delta t}, \frac{\Delta y}{\Delta t} \right\rangle$. Taking the limit as $\Delta t \to 0$ (and verifying a few other details) then gives us that $\frac{df}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt}$.

• The simplest method for generating the statement of the Chain Rule specific to any particular set of dependencies of variables is to draw a "tree diagram" as follows:
  ◦ Step 1: Start with the initial function $f$, and draw an arrow pointing from $f$ to each of the variables it depends on.
  ◦ Step 2: For each variable listed, draw new arrows branching from that variable to any other variables it depends on. Repeat the process until all dependencies are shown in the diagram.
  ◦ Step 3: Associate each arrow from one variable to another with the derivative $\frac{\partial[\text{top}]}{\partial[\text{bottom}]}$.
  ◦ Step 4: To write the version of the Chain Rule that gives the derivative $\frac{\partial v_1}{\partial v_2}$ for any variables $v_1$ and $v_2$ in the diagram (where $v_1$ depends on $v_2$), first find all paths from $v_1$ to $v_2$.
  ◦ Step 5: For each path from $v_1$ to $v_2$, multiply all of the derivatives that appear along that path. Then sum the results over all of the paths: this is $\frac{\partial v_1}{\partial v_2}$.

• Example: State the Chain Rule that computes $\frac{df}{dt}$ for the function $f(x, y, z)$, where each of $x$, $y$, and $z$ is a function of the variable $t$.
  ◦ Steps 1-2: The tree diagram looks like this:

            f
          ↙ ↓ ↘
         x  y  z
         ↓  ↓  ↓
         t  t  t

  ◦ Step 4: In this diagram, there are 3 paths from $f$ to $t$: they are $f \to x \to t$, $f \to y \to t$, and $f \to z \to t$.
  ◦ Step 5: The path $f \to x \to t$ gives the product $\frac{\partial f}{\partial x} \cdot \frac{dx}{dt}$, while the path $f \to y \to t$ gives the product $\frac{\partial f}{\partial y} \cdot \frac{dy}{dt}$, and the path $f \to z \to t$ gives the product $\frac{\partial f}{\partial z} \cdot \frac{dz}{dt}$. (The derivatives of $x$, $y$, $z$, and $f$ with respect to $t$ are ordinary derivatives, since each depends on $t$ alone.)
    ∗ Then the statement of the Chain Rule here is $\frac{df}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt} + \frac{\partial f}{\partial z} \cdot \frac{dz}{dt}$.

• Example: State the Chain Rule that computes $\frac{\partial f}{\partial s}$ and $\frac{\partial f}{\partial t}$ for the function $f(x, y)$, where $x = x(s, t)$ and $y = y(s, t)$ are both functions of $s$ and $t$.
  ◦ Steps 1-2: The tree diagram here is:

              f
            ↙   ↘
           x     y
          ↙ ↘   ↙ ↘
         s   t s   t

  ◦ Step 4: In this diagram, there are 2 paths from $f$ to $s$: they are $f \to x \to s$ and $f \to y \to s$. There are also 2 paths from $f$ to $t$: they are $f \to x \to t$ and $f \to y \to t$.
  ◦ Step 5: The path $f \to x \to t$ gives the product $\frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial t}$, while the path $f \to y \to t$ gives the product $\frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial t}$. Similarly, the path $f \to x \to s$ gives the product $\frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial s}$, while the path $f \to y \to s$ gives the product $\frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial s}$.
    ∗ Then the two statements of the Chain Rule here are $\frac{\partial f}{\partial s} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial s}$ and $\frac{\partial f}{\partial t} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial t}$.

• Once we have the appropriate statement of the Chain Rule, it is easy to compute examples with specific functions.

• Example: For $f(x, y) = x^2 + y^2$, with $x = t^2$ and $y = t^4$, find $\frac{df}{dt}$, both directly and via the Chain Rule.
  ◦ In this instance, the Chain Rule says that $\frac{df}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt}$.
  ◦ Computing the derivatives shows $\frac{df}{dt} = (2x) \cdot (2t) + (2y) \cdot (4t^3)$.
  ◦ Plugging in $x = t^2$ and $y = t^4$ yields $\frac{df}{dt} = (2t^2) \cdot (2t) + (2t^4) \cdot (4t^3) = 4t^3 + 8t^7$.
  ◦ To do this directly, we would plug in $x = t^2$ and $y = t^4$: this gives $f(x, y) = t^4 + t^8$, so that $\frac{df}{dt} = 4t^3 + 8t^7$.
  ◦ As should be true if the Chain Rule formula is correct, we obtain the same answer either way.

• Example: For $f(x, y) = x^2 + y^2$, with $x = s^2 + t^2$ and $y = s^3 + t^4$, find $\frac{\partial f}{\partial s}$ and $\frac{\partial f}{\partial t}$ both directly and via the Chain Rule.
  ◦ By the Chain Rule we have $\frac{\partial f}{\partial s} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial s} = (2x) \cdot (2s) + (2y) \cdot (3s^2)$. Plugging in $x = s^2 + t^2$ and $y = s^3 + t^4$ yields $\frac{\partial f}{\partial s} = (2s^2 + 2t^2) \cdot (2s) + (2s^3 + 2t^4) \cdot (3s^2) = 4s^3 + 4st^2 + 6s^5 + 6s^2 t^4$.
  ◦ We also have $\frac{\partial f}{\partial t} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial t} = (2x) \cdot (2t) + (2y) \cdot (4t^3)$. Plugging in $x = s^2 + t^2$ and $y = s^3 + t^4$ yields $\frac{\partial f}{\partial t} = (2s^2 + 2t^2) \cdot (2t) + (2s^3 + 2t^4) \cdot (4t^3) = 4s^2 t + 4t^3 + 8s^3 t^3 + 8t^7$.
  ◦ To do this directly, we plug in $x = s^2 + t^2$ and $y = s^3 + t^4$: this gives $f(x, y) = (s^2 + t^2)^2 + (s^3 + t^4)^2 = s^4 + 2s^2 t^2 + t^4 + s^6 + 2s^3 t^4 + t^8$, so that $\frac{\partial f}{\partial t} = 4s^2 t + 4t^3 + 8s^3 t^3 + 8t^7$ and $\frac{\partial f}{\partial s} = 4s^3 + 4st^2 + 6s^5 + 6s^2 t^4$.
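
• The two computational examples can also be checked numerically. The following is a minimal Python sketch (standard library only) that compares the Chain Rule expressions against a symmetric difference quotient applied to the fully substituted function. The helper names (central_diff, df_dt_chain, df_dt_direct, grad_chain, grad_direct), the step size h, and the sample points are illustrative choices, not part of the notes.

```python
# Numerical sanity check of the two worked examples with f(x, y) = x^2 + y^2.
# All helper names, the step size h, and the sample points are illustrative.

def f(x, y):
    return x**2 + y**2

def central_diff(g, u, h=1e-6):
    """Approximate g'(u) with a symmetric difference quotient."""
    return (g(u + h) - g(u - h)) / (2 * h)

# Example with x = t^2, y = t^4.
def df_dt_chain(t):
    x, y = t**2, t**4
    # df/dt = (df/dx)(dx/dt) + (df/dy)(dy/dt)
    return (2 * x) * (2 * t) + (2 * y) * (4 * t**3)

def df_dt_direct(t):
    # Substitute first, then differentiate numerically in t.
    return central_diff(lambda u: f(u**2, u**4), t)

# Example with x = s^2 + t^2, y = s^3 + t^4.
def grad_chain(s, t):
    x, y = s**2 + t**2, s**3 + t**4
    df_ds = (2 * x) * (2 * s) + (2 * y) * (3 * s**2)
    df_dt = (2 * x) * (2 * t) + (2 * y) * (4 * t**3)
    return df_ds, df_dt

def grad_direct(s, t):
    # Substitute first, then differentiate numerically in each variable,
    # holding the other one fixed.
    def F(a, b):
        return f(a**2 + b**2, a**3 + b**4)
    return (central_diff(lambda a: F(a, t), s),
            central_diff(lambda b: F(s, b), t))

if __name__ == "__main__":
    print(df_dt_chain(1.5), df_dt_direct(1.5))
    print(grad_chain(0.7, 1.2), grad_direct(0.7, 1.2))
```

  ◦ In each printed pair, the Chain Rule value and the difference-quotient value should agree to several decimal places. The agreement is only approximate, since the difference quotient carries truncation and rounding error.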