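Chain Rule for Functions of Several Variables
June 21, 2011

1 Functions of Several Variables

We write $f : \mathbb{R}^n \to \mathbb{R}^m$ for a rule assigning to each vector in a domain $D \subseteq \mathbb{R}^n$ a unique vector in $\mathbb{R}^m$.

Examples:

1. Suppose your position at time $t$ is given by $p(t) = \langle 2t + 1,\ 3t,\ 1 - t \rangle$. This gives a function $p : \mathbb{R} \to \mathbb{R}^3$. Observe that any vector line equation $r = r_0 + tv$ is a function $r : \mathbb{R} \to \mathbb{R}^n$, where $r_0, v \in \mathbb{R}^n$.

2. On a windy day, the force of the wind at a point $(x, y, z)$ might be given by a function $F : \mathbb{R}^3 \to \mathbb{R}^2$,
\[ F(x, y, z) = \langle 2xz + 1,\ y + z \rangle. \]

For a function $f : \mathbb{R}^n \to \mathbb{R}^m$, we write
\[ f(x_1, x_2, \ldots, x_n) = \big(f_1(x_1, x_2, \ldots, x_n),\ f_2(x_1, x_2, \ldots, x_n),\ \ldots,\ f_m(x_1, x_2, \ldots, x_n)\big). \]
For the function $p$ in Example 1, we have $p_1(t) = 2t + 1$, $p_2(t) = 3t$, and $p_3(t) = 1 - t$.

2 Derivative Matrix

We can organize the partial derivatives of $f : \mathbb{R}^2 \to \mathbb{R}$ into a $1 \times 2$ matrix called the derivative matrix,
\[ Df = \begin{bmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \end{bmatrix}. \]
More generally, for $f : \mathbb{R}^n \to \mathbb{R}^m$, the derivative matrix of $f(x_1, \ldots, x_n) = (f_1(x_1, \ldots, x_n), \ldots, f_m(x_1, \ldots, x_n))$ is the $m \times n$ matrix
\[ Df = \begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{bmatrix}. \]

Examples: Find the derivative matrix.

1. $r(t) = \langle 2t + 1,\ 3t,\ 1 - t \rangle$
\[ Dr = \begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix} \]

2. $F(x, y) = \langle \sin x,\ \cos(xy),\ \ln x \rangle$
\[ DF = \begin{bmatrix} \cos x & 0 \\ -y \sin(xy) & -x \sin(xy) \\ \dfrac{1}{x} & 0 \end{bmatrix} \]

3 Composition for Functions of Several Variables

Recall from single variable calculus that for two functions $f, g : \mathbb{R} \to \mathbb{R}$, we may find the composition $f \circ g(x) = f(g(x))$. We can illustrate this as follows:
\[ \mathbb{R} \xrightarrow{\ g\ } \mathbb{R} \xrightarrow{\ f\ } \mathbb{R} \]
To take derivatives of these compositions, we use the Chain Rule:
\[ \frac{d}{dx}\big(f \circ g(x)\big) = \frac{df}{dx}(g(x))\,\frac{dg}{dx}(x). \]
For multivariable functions $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^k \to \mathbb{R}^n$, we can take the composition $f \circ g(t) = f(g(t))$ illustrated by
\[ \mathbb{R}^k \xrightarrow{\ g\ } \mathbb{R}^n \xrightarrow{\ f\ } \mathbb{R}^m. \]
Our next task will be to find a rule for expressing the derivative matrix of $f \circ g$ in terms of the matrices $Df$ and $Dg$.

Examples:

1.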
Find $F \circ g(x, y)$, where $F(t) = \langle t^2,\ 2t,\ 3t + 1 \rangle$ and $g(x, y) = 3x + 2y$.
\[ F \circ g(x, y) = F(g(x, y)) = F(3x + 2y) = \langle (3x + 2y)^2,\ 2(3x + 2y),\ 3(3x + 2y) + 1 \rangle \]

2. Express $T$ in terms of $t$, when $T(x, y) = \langle xy,\ 3x + y \rangle$, and $x = t^2$, $y = 2t$.
\[ T(t) = \langle (t^2)(2t),\ 3(t^2) + (2t) \rangle \]
We can view this as a composition by defining a function $f(t) = \langle x(t), y(t) \rangle = \langle t^2, 2t \rangle$. Then expressing $T$ in terms of $t$ is the same as finding $T \circ f(t)$.

4 Chain Rule, Case 1

Suppose the function $f(x)$ gives the height of a mountain range at position $x$. We are thinking of a cross section through the mountains, like the following graph.

[Figure: graph of a cross section through the mountains]

Now suppose you are traveling through the mountains along this cross section, with position at time $t$ given by a function $g(t)$. Then your height at time $t$ is $f(g(t))$. We can define a function $h(t) = f(g(t))$. Sometimes we write $h = f \circ g$ or $h(t) = (f \circ g)(t)$. Recall that the one-dimensional chain rule is
\[ \frac{dh}{dt}(t) = \frac{df}{dx}(g(t))\,\frac{dg}{dt}(t). \]
We can state this in terms of the derivative matrices of $f$ and $g$. Since $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$, the matrices $Df$ and $Dg$ are $1 \times 1$ matrices:
\[ Df = \begin{bmatrix} \dfrac{df}{dx} \end{bmatrix}, \qquad Dg = \begin{bmatrix} \dfrac{dg}{dt} \end{bmatrix}. \]
Since multiplication of $1 \times 1$ matrices is just scalar multiplication, the one-dimensional chain rule can be written
\[ Dh(t) = Df(g(t))\,Dg(t). \]

5 Chain Rule, Case 2

Now redefine the mountain range by a function of two variables $f(x, y)$. As before, you travel across the mountain range in a path $g(t) = \langle g_1(t), g_2(t) \rangle$. Now your height at time $t$ is given by $h(t) = f(g(t))$, which means that we evaluate $f(x, y)$ at $x = g_1(t)$ and $y = g_2(t)$. The derivative $h'(t)$ gives the rate of change of height with respect to time. We compute it by taking the product of derivative matrices
\[ Dh(t) = Df(g(t))\,Dg(t). \]
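This is what the computation looks like:
\[
\begin{aligned}
Dh(t) &= \begin{bmatrix} \dfrac{\partial f}{\partial x}(x, y) & \dfrac{\partial f}{\partial y}(x, y) \end{bmatrix}_{g(t)}
\begin{bmatrix} \dfrac{\partial g_1}{\partial t}(t) \\[2mm] \dfrac{\partial g_2}{\partial t}(t) \end{bmatrix} \\
&= \begin{bmatrix} \dfrac{\partial f}{\partial x}(g(t)) & \dfrac{\partial f}{\partial y}(g(t)) \end{bmatrix}
\begin{bmatrix} \dfrac{\partial g_1}{\partial t}(t) \\[2mm] \dfrac{\partial g_2}{\partial t}(t) \end{bmatrix} \\
&= \frac{\partial f}{\partial x}(g(t))\,\frac{\partial g_1}{\partial t}(t) + \frac{\partial f}{\partial y}(g(t))\,\frac{\partial g_2}{\partial t}(t)
\end{aligned}
\]

6 Chain Rule for Multivariable Functions

For general functions $f : \mathbb{R}^n \to \mathbb{R}^m$, $f(x) = \langle f_1(x), \ldots, f_m(x) \rangle$, and $g : \mathbb{R}^k \to \mathbb{R}^n$, $g(t) = \langle g_1(t), \ldots, g_n(t) \rangle$, we can take the composition $h(t) = f(g(t)) = f(g_1(t), \ldots, g_n(t))$. We can find the derivative matrix of the composition $h$ by taking the product of derivative matrices:
\[ Dh(t) = Df(g(t))\,Dg(t). \]
Here $Df(g(t))$ is $m \times n$ and $Dg(t)$ is $n \times k$, so the product $Dh(t)$ is $m \times k$.

7 Examples

1. Suppose in our example from the beginning, the height of the mountain range at point $(x, y)$ is given by $f(x, y) = 2x \sin(xy)$, and the path you travel is given by $g(t) = \langle t, t^2 \rangle$. Then your height at time $t$ is $h(t) = f(g(t))$. We can compute the rate of change of height at time $t$ using the chain rule.
\[
\begin{aligned}
Dh(t) &= Df(g(t))\,Dg(t) \\
&= \begin{bmatrix} 2xy \cos(xy) + 2 \sin(xy) & 2x^2 \cos(xy) \end{bmatrix}_{g(t)} \begin{bmatrix} 1 \\ 2t \end{bmatrix} \\
&= \begin{bmatrix} 2t^3 \cos(t^3) + 2 \sin(t^3) & 2t^2 \cos(t^3) \end{bmatrix} \begin{bmatrix} 1 \\ 2t \end{bmatrix} \\
&= 2t^3 \cos(t^3) + 2 \sin(t^3) + 4t^3 \cos(t^3) \\
&= 6t^3 \cos(t^3) + 2 \sin(t^3)
\end{aligned}
\]
We can also compute this derivative by first composing the functions, then taking the derivative of the composition.
\[ h(t) = f(g(t)) = 2t \sin(t^3) \]
\[ Dh(t) = 6t^3 \cos(t^3) + 2 \sin(t^3) \]

2. Find $Dh$ and $Dh(1, -1)$ where $h = f(g(s, t))$, and $f(x, y, z) = \langle 2x + z^2,\ xyz \rangle$, $g(s, t) = \langle t,\ 2st^2,\ s^3 \rangle$.

Using the chain rule, we compute:
\[
\begin{aligned}
Dh(s, t) &= Df(g(s, t))\,Dg(s, t) \\
&= \begin{bmatrix} 2 & 0 & 2z \\ yz & xz & xy \end{bmatrix}_{g(s,t)} \begin{bmatrix} 0 & 1 \\ 2t^2 & 4st \\ 3s^2 & 0 \end{bmatrix} \\
&= \begin{bmatrix} 2 & 0 & 2s^3 \\ 2s^4 t^2 & s^3 t & 2st^3 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 2t^2 & 4st \\ 3s^2 & 0 \end{bmatrix} \\
&= \begin{bmatrix} 6s^5 & 2 \\ 2s^3 t^3 + 6s^3 t^3 & 2s^4 t^2 + 4s^4 t^2 \end{bmatrix} \\
&= \begin{bmatrix} 6s^5 & 2 \\ 8s^3 t^3 & 6s^4 t^2 \end{bmatrix}
\end{aligned}
\]
Without using the chain rule, we can first compose the functions, then find the derivative matrix of the composition.
\[ h(s, t) = (f \circ g)(s, t) = \langle 2t + s^6,\ 2s^4 t^3 \rangle \]
\[ Dh(s, t) = D(f \circ g)(s, t) = \begin{bmatrix} 6s^5 & 2 \\ 8s^3 t^3 & 6s^4 t^2 \end{bmatrix} \]
From this, we compute
\[ Dh(1, -1) = \begin{bmatrix} 6 & 2 \\ -8 & 6 \end{bmatrix}. \]

3. Find $D(F \circ g)(3, 5)$, if $F(x, y) = \langle e^{xy+1},\ e^{-x} \rangle$, $Dg(3, 5) = \begin{bmatrix} -1 \\ 2 \end{bmatrix}$, and $g(3, 5) = \langle 1, 0 \rangle$.
\[
\begin{aligned}
D(F \circ g)(3, 5) &= DF(g(3, 5))\,Dg(3, 5) \\
&= \begin{bmatrix} y e^{xy+1} & x e^{xy+1} \\ -e^{-x} & 0 \end{bmatrix}_{(1,0)} \begin{bmatrix} -1 \\ 2 \end{bmatrix} \\
&= \begin{bmatrix} 0 & e \\ -\dfrac{1}{e} & 0 \end{bmatrix} \begin{bmatrix} -1 \\ 2 \end{bmatrix} \\
&= \begin{bmatrix} 2e \\ \dfrac{1}{e} \end{bmatrix}
\end{aligned}
\]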
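
The matrix product in Example 2 can also be checked numerically. The following is a minimal Python sketch, not part of the original notes. It hard-codes the derivative matrices computed by hand for $f(x, y, z) = \langle 2x + z^2, xyz \rangle$ and $g(s, t) = \langle t, 2st^2, s^3 \rangle$, multiplies them, and compares the result with a finite-difference estimate of the derivative matrix of the composition. The helper names `matmul`, `chain_rule_Dh`, and `numeric_Dh`, and the step size `eps`, are illustrative assumptions.

```python
# Example 2: f(x, y, z) = <2x + z^2, xyz>, g(s, t) = <t, 2 s t^2, s^3>.

def f(x, y, z):
    return [2 * x + z ** 2, x * y * z]

def g(s, t):
    return [t, 2 * s * t ** 2, s ** 3]

def Df(x, y, z):
    # 2 x 3 derivative matrix of f, computed by hand.
    return [[2, 0, 2 * z],
            [y * z, x * z, x * y]]

def Dg(s, t):
    # 3 x 2 derivative matrix of g, computed by hand.
    return [[0, 1],
            [2 * t ** 2, 4 * s * t],
            [3 * s ** 2, 0]]

def matmul(A, B):
    # Plain row-by-column product of two matrices stored as nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def chain_rule_Dh(s, t):
    # Dh(s, t) = Df(g(s, t)) Dg(s, t)
    return matmul(Df(*g(s, t)), Dg(s, t))

def numeric_Dh(s, t, eps=1e-6):
    # Central-difference estimate of the derivative matrix of h = f o g
    # (hypothetical helper, used only as an independent check).
    def h(s, t):
        return f(*g(s, t))
    ds = [(a - b) / (2 * eps) for a, b in zip(h(s + eps, t), h(s - eps, t))]
    dt = [(a - b) / (2 * eps) for a, b in zip(h(s, t + eps), h(s, t - eps))]
    return [[ds[i], dt[i]] for i in range(len(ds))]

print(chain_rule_Dh(1, -1))  # [[6, 2], [-8, 6]]
print(numeric_Dh(1, -1))     # approximately the same matrix
```

The first printed matrix agrees with $Dh(1, -1)$ found above. The finite-difference estimate matches it only up to rounding error, so it serves as a sanity check and not as a proof.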