MSc Maths and Statistics 2008
UCL Department of Economics
Chapter 2: Calculus
Jidong Zhou

1 Univariate Calculus

This section studies real functions of one variable $f : \mathbb{R} \to \mathbb{R}$ in the Euclidean space with metric $d(x, y) = |x - y|$.

1.1 Differentiation and derivatives

• A function is differentiable at $x$ if the following limit exists:
$$\lim_{z \to 0} \frac{f(z + x) - f(x)}{z}.$$
We denote this limit by $f'(x)$ or $\frac{df(x)}{dx}$, and call it the derivative of the function $f$ at $x$. A function is differentiable if it is differentiable at every point in its domain.
— the derivative $f'(x)$ is the slope of the tangent line of $f$ at $x$. Roughly speaking, it measures the rate of change of $f(x)$ when $x$ changes.
— the differential of $f$ is $df(x) = f'(x)\,dx$.

• Differentiability and continuity:
— if a function $f$ is differentiable at a point $x$, then it must be continuous at this point. The proof is simple: as $z \to 0$,
$$f(z + x) - f(x) = \frac{f(z + x) - f(x)}{z} \cdot z \to f'(x) \cdot 0 = 0.$$
— a differentiable function is always continuous, but a continuous function may not be differentiable. For example, $f(x) = |x|$ is continuous but not differentiable at $x = 0$. [1]

• A function $f$ is said to be continuously differentiable or of class $C^1$ if it is differentiable and $f'$ is a continuous function. [2]
— polynomial functions are continuously differentiable.

[1] In fact, there exist functions which are continuous but nowhere differentiable. See, for instance, the example on p. 154 of Rudin (1976).
[2] The following function is differentiable everywhere but its derivative is not continuous at $x = 0$:
$$f(x) = \begin{cases} x^2 \sin\frac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0. \end{cases}$$
However, $f'$ cannot be "too" discontinuous, in the sense that $f'$ cannot have a discontinuity at a point $x_0$ at which both one-sided limits $f'(x_0^-)$ and $f'(x_0^+)$ exist. For instance, in the above example, $f'(0^+)$ and $f'(0^-)$ do not exist. (See the formal statement on p. 109 of Rudin (1976).)

• Higher order derivatives can be obtained by sequential differentiation. Denote by $f''$ the second (order) derivative, $f'''$ the third (order) derivative, and in general $f^{(n)}$ the derivative of order $n$. More explicitly,
$$f''(x) = \frac{d}{dx} f'(x), \quad \ldots, \quad f^{(n)}(x) = \frac{d}{dx} f^{(n-1)}(x).$$
— a function is of class $C^n$ if its $n$th derivative exists and is a continuous function.

1.2 Computing derivatives

• Useful rules ($k$ is a real constant):
— $(kf)' = kf'$
— $(f \pm g)' = f' \pm g'$
— $(f \cdot g)' = f' \cdot g + f \cdot g'$ (product rule)
— $\left(\frac{f}{g}\right)' = \frac{1}{g^2}(f' \cdot g - f \cdot g')$ (quotient rule)
— $\frac{d}{dx} f(g(x)) = f'(g(x)) \cdot g'(x)$ (chain rule)
— $\frac{d}{dx} f^{-1}(x) = \frac{1}{f'(f^{-1}(x))}$ (inverse function rule) [3]

• Useful formulas ($k$ is a real constant):
— the derivative of a constant function is $0$
— $(x^k)' = kx^{k-1}$
— $(e^x)' = e^x$
— $(\ln x)' = \frac{1}{x}$
— $(\sin x)' = \cos x$
— $(\cos x)' = -\sin x$

[3] Notice that $f^{-1}(x)$ is well defined only when $f(x)$ is strictly monotonic on some domain.

Exercise 1 Use the above results to show
(i) $(\sqrt{x})' = \frac{1}{2\sqrt{x}}$
(ii) $(a^x)' = a^x \ln a$ for $a > 0$
(iii) $(\log_a x)' = \frac{1}{x \ln a}$
(iv) $(\ln f(x))' = \frac{f'(x)}{f(x)}$
(v) $\left(f(x)^{g(x)}\right)' = f(x)^{g(x)} \left[ g'(x) \ln f(x) + g(x) \frac{f'(x)}{f(x)} \right]$

Exercise 2 Let $Q(P)$ be the demand for a good at price $P$. Show that the price elasticity is
$$\eta = -\frac{d \ln Q}{d \ln P}.$$
Let $R = PQ(P)$ be the revenue. Show that the marginal revenue with respect to price is
$$Q(P)(1 - \eta),$$
and the marginal revenue with respect to quantity is
$$P\left(1 - \frac{1}{\eta}\right).$$

1.3 Important results

• The Mean Value Theorem: If $f$ is a continuous function on $[a, b]$ which is differentiable in $(a, b)$, then there exists a point $x \in (a, b)$ such that
$$f(b) - f(a) = (b - a) f'(x).$$
In particular, if $f(a) = f(b)$, then there exists $x \in (a, b)$ such that $f'(x) = 0$.

• L'Hospital's rule: Suppose that, as $x$ approaches some point $x_0$, $f(x)$ and $g(x)$ both tend to zero, or $|f(x)|$ and $|g(x)|$ both tend to $\infty$, so that $f(x_0)/g(x_0)$ is indeterminate.
Then
$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)}$$
if $\lim_{x \to x_0} f'(x)/g'(x)$ exists (including $\infty$). This rule is very useful in evaluating limits.

• Taylor's expansion: If $f$ is a $C^{n+1}$ function defined on $(a, b)$, then for any $x, x + \varepsilon \in (a, b)$, we have
$$f(x + \varepsilon) = f(x) + f'(x)\varepsilon + \frac{1}{2} f''(x)\varepsilon^2 + \cdots + \frac{1}{n!} f^{(n)}(x)\varepsilon^n + \frac{1}{(n+1)!} f^{(n+1)}(\tilde{x})\varepsilon^{n+1}$$
for some $\tilde{x}$ between $x$ and $x + \varepsilon$.
— when $n = 0$, this is just the mean value theorem.
— notice that, as $\varepsilon \to 0$,
$$\frac{\frac{1}{(n+1)!} f^{(n+1)}(\tilde{x})\varepsilon^{n+1}}{\varepsilon^n} \to \frac{1}{(n+1)!} f^{(n+1)}(x) \cdot 0 = 0.$$
That is, the last term on the right-hand side decreases faster than $\varepsilon^n$ as $\varepsilon$ decreases to zero. Therefore, when $\varepsilon$ is relatively small, the right-hand side without the last term is a good approximation of $f(x + \varepsilon)$. The accuracy of the approximation increases as $n$ becomes larger or $\varepsilon$ becomes smaller.
— this theorem can be understood by appealing to the mean value theorem:
∗ $f(x + \varepsilon) = f(x) + f'(x_1)\varepsilon$ for some $x_1$ between $x$ and $x + \varepsilon$;
∗ $f'(x_1) = f'(x) + f''(x_2)(x_1 - x) \approx f'(x) + f''(x_2)\frac{\varepsilon}{2}$ for some $x_2$ between $x$ and $x_1$ if $\varepsilon$ is small;
∗ these two steps imply
$$f(x + \varepsilon) \approx f(x) + f'(x)\varepsilon + \frac{1}{2} f''(x_2)\varepsilon^2;$$
∗ we further approximate $f''(x_2) \approx f''(x) + f^{(3)}(x_3)\frac{\varepsilon}{3}$ for some $x_3$ between $x$ and $x_2$, and so
$$f(x + \varepsilon) \approx f(x) + f'(x)\varepsilon + \frac{1}{2} f''(x)\varepsilon^2 + \frac{1}{3!} f^{(3)}(x_3)\varepsilon^3.$$
We can continue this process up to $f^{(n+1)}$.

Exercise 3 (i) Show $\lim_{x \to 0} \frac{\sin x}{x} = 1$; $\lim_{x \to 0} \frac{e^x - 1}{x} = 1$; $\lim_{x \to \infty} \frac{\sqrt{x}}{\ln x} = \infty$; $\lim_{x \to 0^+} x \ln x = 0$; and $\lim_{x \to 0^+} x^x = 1$.
(ii) Approximate $e^x$ around $x = 0$ by Taylor's expansion.

1.4 The indefinite integral

• For a function $f : \mathbb{R} \to \mathbb{R}$ which is the derivative of some differentiable function, we call
$$\int f(x)\,dx$$
the indefinite integral (or antiderivative) of $f$. Its meaning is that the derivative of $\int f(x)\,dx$ should be $f$.
— the indefinite integral is the reverse operation of differentiation: we want to recover a function from its derivative.
— the indefinite integral may not exist for some (discontinuous) functions, but it always exists for continuous functions.
— clearly, $\int f(x)\,dx$ does not represent a unique function. If $F(x) = \int f(x)\,dx$, then the derivative of $F(x) + k$ for any constant $k$ is also $f$.

• Some integration formulas (where $c$ and $k$ are real constants):
— $\int c\,dx = cx + k$
— $\int x^n\,dx = \frac{x^{n+1}}{n+1} + k$ for $n \neq -1$
— $\int \frac{1}{x}\,dx = \begin{cases} \ln x + k & \text{for } x > 0 \\ \ln(-x) + k & \text{for } x < 0 \end{cases}$
— $\int e^x\,dx = e^x + k$
— $\int c^x\,dx = \frac{c^x}{\ln c} + k$ (for $c > 0$, $c \neq 1$)
— $\int \sin x\,dx = -\cos x + k$
— $\int \cos x\,dx = \sin x + k$
— They can be derived from the derivative formulas. But many indefinite integrals cannot be written in closed form (i.e., we are unable to derive their formulas explicitly). Examples include $\int e^{-x^2}dx$, $\int \frac{1}{\ln x}dx$, $\int \frac{e^{-x}}{x}dx$, $\int \frac{\sin x}{x}dx$, $\int \frac{\cos x}{x}dx$, etc.

Exercise 4 Calculate
$$\int \frac{e^x + 1}{e^x + x}\,dx; \qquad \int (x^2 + 2x + 4)^{1/2}(x + 1)\,dx.$$

1.5 The definite integral

We only review the Riemann integral in this course.

• For a bounded real function $f$ defined on $[a, b]$, we denote by
$$\int_a^b f(x)\,dx$$
the Riemann integral of $f$ over $[a, b]$. Roughly speaking, it measures the area under the graph of $f$ on $[a, b]$.

• A more precise definition goes as follows:
— let $P$ be a partition of $[a, b]$: $\{x_i\}_{i=0}^n$ such that $a = x_0 \leq x_1 \leq \cdots \leq x_{n-1} \leq x_n = b$ and
$$\bigcup_{i=0,\ldots,n-1} [x_i, x_{i+1}] = [a, b].$$
— for $i = 1, \ldots, n$, define $\Delta_i = x_i - x_{i-1}$ and
$$M_i = \sup_{x \in [x_{i-1}, x_i]} f(x), \qquad m_i = \inf_{x \in [x_{i-1}, x_i]} f(x).$$
— we further define
$$U(P, f) = \sum_{i=1}^n M_i \Delta_i, \qquad L(P, f) = \sum_{i=1}^n m_i \Delta_i.$$
— $f$ is Riemann integrable if
$$\sup L(P, f) = \inf U(P, f),$$
where inf and sup are taken over all partitions of $[a, b]$, [4] and we denote this common value by
$$\int_a^b f(x)\,dx.$$

• Do we have easier ways to identify whether $f$ is Riemann integrable? The bounded function $f$ is integrable on $[a, b]$ if $f$ is continuous, monotonic, or has only finitely many discontinuous points. [5]

Example 1
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \cap [a, b] \\ 0 & \text{if } x \in [a, b] \setminus \mathbb{Q} \end{cases}$$
is not Riemann integrable on $[a, b]$.

[4] One can show that $\sup L(P, f) \leq \inf U(P, f)$. See, for instance, p. 124 of Rudin (1976).
[5] In general, a bounded real function on $[a, b]$ is Riemann integrable if and only if $f$ is continuous "almost" everywhere on $[a, b]$.

• Properties of the Riemann integral:
— $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$ for any $c \in [a, b]$.
— $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$.
— $\int_a^b (f_1 + f_2)\,dx = \int_a^b f_1\,dx + \int_a^b f_2\,dx$.
— if $f_1 \leq f_2$ on $[a, b]$, then $\int_a^b f_1\,dx \leq \int_a^b f_2\,dx$.
— if $f$ is integrable, $|f|$ is integrable as well, and $\left| \int_a^b f\,dx \right| \leq \int_a^b |f|\,dx$.
— if both $f_1$ and $f_2$ are integrable, $f_1 f_2$ is integrable as well.

• The first fundamental theorem of calculus: Suppose $f$ is Riemann integrable on $[a, b]$. For $x \in [a, b]$, put
$$F(x) = \int_a^x f(t)\,dt.$$
Then $F$ is continuous on $[a, b]$. Moreover, if $f$ is continuous at $x$, then $F$ is differentiable at $x$ and $F'(x) = f(x)$. In particular, if $f$ is continuous on $[a, b]$, then $F' = f$ on $[a, b]$.
— this result indicates that integration and differentiation are, in some sense, inverse operations.

• The second fundamental theorem of calculus: Suppose $f$ is Riemann integrable on $[a, b]$, and there is a differentiable function $F$ on $[a, b]$ such that $F' = f$. Then
$$\int_a^b f(x)\,dx = F(x)\big|_a^b \equiv F(b) - F(a).$$

• The integral mean value theorem: If $f$ is continuous on $[a, b]$, then there exists some $c \in (a, b)$ such that
$$\int_a^b f(x)\,dx = (b - a) f(c).$$
(Think about why continuity is needed.)

• Leibniz's rule: if
$$F(t) = \int_{a(t)}^{b(t)} f(x, t)\,dx$$
where all functions are $C^1$, then
$$F'(t) = \int_{a(t)}^{b(t)} \frac{\partial f}{\partial t}(x, t)\,dx + f(b(t), t)\,b'(t) - f(a(t), t)\,a'(t).$$
In particular, we have
$$\frac{d}{dt} \int_a^b f(x, t)\,dx = \int_a^b \frac{\partial f}{\partial t}(x, t)\,dx$$
if $f$ is $C^1$.

• Some integration rules:
— integration by parts: if both $f$ and $g$ are differentiable, then
$$\int_a^b f \cdot g'\,dx = (f \cdot g)\big|_a^b - \int_a^b f' \cdot g\,dx.$$
This follows from the product rule together with $\int_a^b (f \cdot g)'\,dx = (f \cdot g)\big|_a^b$, which holds by the second fundamental theorem of calculus.
— change of variables: suppose the function $g(x)$ is monotonic and differentiable. Then
$$\int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(z)\,dz.$$
This is because, if we let $z = g(x)$, then $dz = g'(x)\,dx$.

Exercise 5 (i) Calculate
$$\int_0^1 x^n \ln x\,dx, \qquad \int_1^e \frac{\ln x}{x}\,dx, \qquad \int_{-\infty}^0 x^2 e^{2x}\,dx, \qquad \text{and} \qquad \int_0^{\pi^2/4} \frac{\sin\sqrt{x}}{\sqrt{x}}\,dx.$$
(ii) Suppose $f$ is a continuously differentiable function on $[a, b]$ with $f(a) = f(b) = 0$. Prove that
$$\int_a^b f f'\,dx = 0.$$
If we further have
$$\int_a^b f^2\,dx = 1,$$
prove that
$$\int_a^b x f f'\,dx = -\frac{1}{2} \qquad \text{and} \qquad \int_a^b (f')^2\,dx \cdot \int_a^b x^2 f^2\,dx \geq \frac{1}{4}.$$
(iii) For $0 < t < \infty$, define the Gamma function as
$$\Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\,dx.$$
Show that $\Gamma(t + 1) = t\Gamma(t)$ and $\Gamma(n + 1) = n!$ for $n \in \mathbb{N}$.
(iv) Suppose the demand function is $Q(P)$ with $Q'(P) < 0$. We define consumer surplus at price $P$ as
$$V(P) = \int_0^{Q(P)} [P(t) - P]\,dt$$
where $P(\cdot)$ is the inverse demand function. Show that $V'(P) = -Q(P)$ and $V(P)$ is convex in $P$.

2 Multivariate Calculus

We will now study functions of several variables. In general, we are interested in functions mapping $\mathbb{R}^n$ into $\mathbb{R}^m$. We continue to work with the Euclidean distance as metric. For $x, y \in \mathbb{R}^n$, recall that it is defined as
$$d(x, y) = \|x - y\| = \sqrt{\sum_{i=1}^n (x_i - y_i)^2}.$$
Consider a (vector-valued) function $f : S \subset \mathbb{R}^n \to \mathbb{R}^m$. We can write it as
$$f(x) = [f_1(x) \;\; f_2(x) \;\; \cdots \;\; f_m(x)]^T$$
where each $f_i : S \to \mathbb{R}$ is a real-valued function.
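As a concrete illustration of the notation just introduced, the following minimal Python sketch computes the Euclidean distance and evaluates a vector-valued function component by component. The particular function `f` (with components $f_1(x) = x_1^2 + 1$ and $f_2(x) = x_1 x_2$) and the helper name `euclidean_distance` are hypothetical choices made only for this example.

```python
import math


def euclidean_distance(x, y):
    """Return d(x, y) = ||x - y|| for two points of the same dimension."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def f(x):
    """Hypothetical f: R^2 -> R^2, returned as the list [f_1(x), f_2(x)]."""
    x1, x2 = x
    return [x1 ** 2 + 1, x1 * x2]


distance = euclidean_distance((0.0, 0.0), (3.0, 4.0))  # the 3-4-5 triangle
value = f((1.0, 2.0))                                  # components f_1, f_2
```

Each component of `f` is an ordinary real-valued function of the whole vector `x`, which is exactly the decomposition $f(x) = [f_1(x) \; \cdots \; f_m(x)]^T$ used in the text.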
2.1 Differentiation

2.1.1 Derivatives

• The partial derivative of $f_i$ with respect to $x_j$ is obtained by holding $x_k$ fixed for all $k \neq j$ and differentiating $f_i$ as if it were a single-variable function of $x_j$. We write
$$\frac{\partial f_i(x)}{\partial x_j} = \lim_{z \to 0} \frac{f_i(x + z e_j) - f_i(x)}{z}$$
where $z$ is a real number and $e_j$ is the $n$-dimensional unit vector with a 1 in position $j$ and zeros everywhere else.
— this partial derivative reflects the impact of a small change of $x_j$ on the value of $f_i$ when all other variables are held constant, or measures the slope of the curve in the $x_j$-direction at the point $x$.

• The matrix
$$Df(x) = \begin{pmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{pmatrix},$$
where every entry is a partial derivative of a component of $f$ with respect to an argument, is the derivative or the Jacobian derivative of $f$ at $x$. (When $m = 1$, i.e., when $f$ is real-valued, the column vector $\nabla f(x) = Df(x)^T$ is the gradient vector of $f$.)

• $f$ is $C^1$ if all its partial derivatives exist and are continuous.

• The extended chain rule: the chain rule can be naturally extended to the multivariate case. Here we only present the simplest case: Suppose we have a function $f : \mathbb{R}^n \to \mathbb{R}$ where the arguments $(x_1(t), \cdots, x_n(t))$ are themselves functions of another real variable $t$. Then
$$\frac{df}{dt} = \sum_{i=1}^n \frac{\partial f}{\partial x_i} \frac{dx_i}{dt}.$$
— in particular, if the arguments $(x_2, \cdots, x_n)$ can be written as functions of $x_1$, and we wish to know how $f$ changes with $x_1$ allowing for all the indirect effects of $x_1$ on the remaining arguments, then the above chain rule yields:
$$\frac{df}{dx_1} = \underbrace{\frac{\partial f}{\partial x_1}}_{\text{direct effect}} + \underbrace{\frac{\partial f}{\partial x_2} \frac{dx_2}{dx_1} + \cdots + \frac{\partial f}{\partial x_n} \frac{dx_n}{dx_1}}_{\text{indirect effects}}.$$

Exercise 6 (i) Compute the partial derivative of the following functions with respect to $x$:
$$e^{xy + x^2}; \qquad \frac{x + y}{x^2 - y}; \qquad [x^2 + y^2]^{1/2}.$$
(ii) Let $Q_1(P_1, P_2, I)$ be the demand function for good 1, where $P_i$ is the price of good $i$ and $I$ is the income.
Show that the cross price elasticity of demand for good 1 and its income elasticity are
$$\eta_{1,2} = \frac{\partial \ln Q_1}{\partial \ln P_2} \qquad \text{and} \qquad \eta_{1,I} = \frac{\partial \ln Q_1}{\partial \ln I},$$
respectively. If $Q_1 = k P_1^\alpha P_2^\beta I^\gamma$, show that all elasticities are constant.
(iii) Given the two vector-valued functions $f(x, y) = (x^2 + 1, y^2)$ and $g(u, v) = (u + v, v^2)$, compute the Jacobian derivative matrix of $g(f(x, y))$ at the point $(x = 1, y = 1)$.

In the following, we mainly focus on real-valued functions mapping $\mathbb{R}^n$ into $\mathbb{R}$.

• Higher order derivatives: Let us consider a differentiable real-valued function $f : S \subset \mathbb{R}^n \to \mathbb{R}$. Its derivative $Df(x)$ is a vector $\left( \frac{\partial f}{\partial x_1} \; \cdots \; \frac{\partial f}{\partial x_n} \right)^T$ and it is also a function mapping $S \subset \mathbb{R}^n$ into $\mathbb{R}^n$. Then we can define its second order derivative at $x$ as
$$D^2 f(x) = \begin{pmatrix} \frac{\partial^2 f(x)}{\partial x_1^2} & \cdots & \frac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f(x)}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_n^2} \end{pmatrix}$$
where
$$\frac{\partial^2 f(x)}{\partial x_i \partial x_j} = \frac{\partial}{\partial x_j} \left( \frac{\partial f(x)}{\partial x_i} \right).$$
If each entry exists, we say $f$ is twice differentiable at $x$. If $\frac{\partial^2 f}{\partial x_i \partial x_j}$ for any $i$ and $j$ is also continuous at any $x$, $f$ is $C^2$.
— $D^2 f(x)$ is also called the Hessian of $f$ at $x$.
— $\frac{\partial^2 f}{\partial x_i^2}$ measures the curvature of $f$ in the $x_i$-direction, and $\frac{\partial^2 f}{\partial x_i \partial x_j}$ measures the rate at which the slope in the $x_i$-direction changes as we change $x_j$.
— (Young's Theorem) if $f$ is $C^2$, then $\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}$. That is, the order of differentiation does not matter for twice continuously differentiable functions. [6]
— higher order derivatives can be obtained by applying differentiation sequentially, though the expressions become complicated.

2.1.2 The implicit function theorem

• Basic idea: an illustration in $\mathbb{R}^2$
— $f(x, y) = 0$ defines $y$ as an implicit function of $x$ or $x$ as an implicit function of $y$. In many circumstances, $f(x, y) = 0$ is rather complicated so that we cannot solve, say, $y$ as an explicit function of $x$. For example, $e^{xy} + x^2 y = 1$. But we still want to know how a change of $x$ affects $y$.
— applying total differentiation to $f(x, y) = 0$ yields
$$\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = 0.$$
Then
$$\frac{dy}{dx} = -\frac{\partial f}{\partial x} \Big/ \frac{\partial f}{\partial y}$$
if $\frac{\partial f}{\partial y} \neq 0$.
— or we can write $f(x, y(x)) = 0$ since $y$ is an implicit function of $x$. Then the chain rule implies the same result:
$$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,y'(x) = 0 \implies y'(x) = -\frac{\partial f}{\partial x} \Big/ \frac{\partial f}{\partial y}.$$

• The implicit function theorem in general
Let $f_1, \cdots, f_n : \mathbb{R}^{n+m} \to \mathbb{R}$ be $C^1$ functions. Consider the system of $n$ equations
$$\begin{aligned} f_1(y_1, \cdots, y_n; x_1, \cdots, x_m) &= 0 \\ &\;\;\vdots \\ f_n(y_1, \cdots, y_n; x_1, \cdots, x_m) &= 0 \end{aligned}$$
as possibly defining $y_1, \cdots, y_n$ as implicit functions of $x_1, \cdots, x_m$. Suppose $(y^*, x^*)$ is a solution. If the matrix
$$Df_y(y, x) = \begin{pmatrix} \frac{\partial f_1}{\partial y_1} & \cdots & \frac{\partial f_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial y_1} & \cdots & \frac{\partial f_n}{\partial y_n} \end{pmatrix}$$
evaluated at $(y^*, x^*)$ is nonsingular, then there exist $C^1$ functions $y_i = y_i(x)$ for $i = 1, \cdots, n$ defined on an open ball (or a neighborhood) $B$ around $x^*$ such that:
(a) $f_i(y_1(x), \cdots, y_n(x); x_1, \cdots, x_m) = 0$ for all $x \in B$ and $i = 1, \cdots, n$,
(b) $y^* = y(x^*)$, and
(c)
$$Dy_{x_j}(x^*) = -[Df_y(y^*, x^*)]^{-1} \begin{pmatrix} \frac{\partial f_1(y^*, x^*)}{\partial x_j} \\ \vdots \\ \frac{\partial f_n(y^*, x^*)}{\partial x_j} \end{pmatrix}$$
or
$$\frac{\partial y_i(x^*)}{\partial x_j} = -\frac{|A_i|}{|Df_y(y^*, x^*)|}$$
where $A_i$ is the matrix $Df_y(y^*, x^*)$ with its $i$th column replaced by $\left( \frac{\partial f_1(y^*, x^*)}{\partial x_j} \; \cdots \; \frac{\partial f_n(y^*, x^*)}{\partial x_j} \right)^T$.
— again, the expression for $Dy_{x_j}(x^*)$ (i.e., how $x_j$ affects $y$ at the point $x^*$) is derived by differentiating the system of equations with respect to $x_j$ (remember that all $y_i$ are functions of $x_j$). (Show it as an exercise.)
— this implicit function theorem is very important in solving optimization problems, as we will see in the next chapter.

[6] There are examples of weird functions which are twice differentiable but not continuously twice differentiable and whose cross partial derivatives are not equal. See Exercise 14.28 on p. 332 of Simon & Blume (1994).

Exercise 7 Suppose $x$ and $y$ satisfy $e^{xy} + x^2 y = 1$.
Evaluate $\frac{dy}{dx}$ at $(x = 1, y = 0)$.

2.1.3 Taylor's expansion in $\mathbb{R}^n$

The spirit of Taylor's expansion in the multi-dimensional case is the same as that in the unidimensional case.

• Taylor's expansion of order one: Suppose $f$ is a $C^1$ real-valued function defined on an open set $A \subset \mathbb{R}^n$. For any $x, x + \varepsilon \in A$, we have
$$f(x + \varepsilon) = f(x) + Df(x) \cdot \varepsilon + R_1(\varepsilon; x)$$
where
$$\frac{R_1(\varepsilon; x)}{\|\varepsilon\|} \to 0 \text{ as } \varepsilon \to 0.$$

• Taylor's expansion of order two: Suppose $f$ is a $C^2$ real-valued function defined on an open set $A \subset \mathbb{R}^n$. For any $x, x + \varepsilon \in A$, we have
$$f(x + \varepsilon) = f(x) + Df(x) \cdot \varepsilon + \frac{1}{2} \varepsilon^T D^2 f(x)\,\varepsilon + R_2(\varepsilon; x)$$
where
$$\frac{R_2(\varepsilon; x)}{\|\varepsilon\|^2} \to 0 \text{ as } \varepsilon \to 0.$$

• The expansions of higher orders have similar but more complicated forms. See, for example, p. 835 of Simon & Blume (1994).

• When $\varepsilon$ is relatively small, we can use the expansion without the last term to approximate a function at some point.

Exercise 8 Use the second order Taylor's expansion about $(1, 1)$ to approximate $f(x, y) = \sqrt{xy}$ at $(x = 1.2, y = 0.9)$. (That is, $x = (1, 1)$ and $\varepsilon = (0.2, -0.1)$.)

2.2 Integrals

Since integration in multi-dimensional space is usually complicated, in this course we will only deal with double integration with $f(x, y) : \Omega \subset \mathbb{R}^2 \to \mathbb{R}$ and well-behaved $\Omega$ (as we will specify). We will also content ourselves with a not very precise exposition.

The domain $\Omega$ can be drawn in a plane with the $x$-axis as the horizontal axis and the $y$-axis as the vertical one. Similarly to the definition of single integration, we can partition the domain $\Omega$ into grids by drawing horizontal and vertical lines on the plane. Let us denote by $x_i$ (with $x_{i-1} < x_i$) the points where the vertical lines cut the $x$-axis and by $y_j$ (with $y_{j-1} < y_j$) the points where the horizontal lines cut the $y$-axis. Then we form the Riemann sum
$$\sum_i \sum_j f(x_i, y_j) \cdot \Delta x_i \cdot \Delta y_j$$
where $\Delta x_i = x_i - x_{i-1}$, $\Delta y_j = y_j - y_{j-1}$.
When the partition gets finer and finer, if the limit of this sum exists, then we say $f(x, y)$ is integrable on $\Omega$ and denote the limit by
$$\iint_\Omega f(x, y)\,dx\,dy.$$
An intuitive interpretation of this integral is that, if $f(x, y) \geq 0$, it is just the volume of the solid over $\Omega$ and beneath the graph of $f$.

We can calculate the double integral conveniently in the following three cases:

• $\Omega$ is a rectangle on the plane. That is,
$$\Omega = \{(x, y) : x \in (a, b) \text{ and } y \in (c, d)\}.$$
(This is a special case of the following two more general cases.) In this case, the double integral is written as
$$\int_c^d \int_a^b f(x, y)\,dx\,dy$$
and it can be calculated by first keeping $y$ fixed and integrating over $x$ and then integrating over $y$ (or in the opposite order). That is,
$$\int_c^d \int_a^b f(x, y)\,dx\,dy = \int_c^d \underbrace{\left( \int_a^b f(x, y)\,dx \right)}_{\text{a function of } y} dy.$$

Example 2
$$\begin{aligned} \int_0^2 \int_0^1 (x^2 y + \sqrt{x})\,dx\,dy &= \int_0^1 \left( \int_0^2 (x^2 y + \sqrt{x})\,dy \right) dx \\ &= \int_0^1 2(x^2 + \sqrt{x})\,dx \\ &= 2\left( \frac{x^3}{3} + \frac{2}{3} x^{3/2} \right) \Big|_0^1 = 2. \end{aligned}$$

• $\Omega$ has the following form:
$$\Omega = \{(x, y) : x \in (a, b) \text{ and } g(x) < y < h(x)\}.$$
Then the double integral can be calculated as
$$\iint_\Omega f(x, y)\,dx\,dy = \int_a^b \underbrace{\left( \int_{g(x)}^{h(x)} f(x, y)\,dy \right)}_{\text{a function of } x} dx.$$
That is, integrate over $y$ for any given $x$ first, then integrate over $x$.

• A similar case is that $\Omega$ has the following form:
$$\Omega = \{(x, y) : y \in (c, d) \text{ and } g(y) < x < h(y)\}.$$
Then the double integral can be calculated as
$$\iint_\Omega f(x, y)\,dx\,dy = \int_c^d \underbrace{\left( \int_{g(y)}^{h(y)} f(x, y)\,dx \right)}_{\text{a function of } y} dy.$$

Example 3 $f(x, y) = \frac{x}{\sqrt{16 + y^5}}$ and $\Omega = \{(x, y) : y \in (0, 2) \text{ and } 0 < x < y^2\}$. Then
$$\begin{aligned} \iint_\Omega f(x, y)\,dx\,dy &= \int_0^2 \left( \int_0^{y^2} \frac{x}{\sqrt{16 + y^5}}\,dx \right) dy \\ &= \int_0^2 \left( \frac{y^4}{2\sqrt{16 + y^5}} \right) dy \\ &= \frac{1}{10} \int_{16}^{48} \frac{1}{\sqrt{t}}\,dt = \frac{4}{5}(\sqrt{3} - 1), \end{aligned}$$
where the last line uses the substitution $t = 16 + y^5$.

Some more complicated cases can be handled if $\Omega$ can be divided into several parts, each of which belongs to one of the above three cases, by using the result that
$$\iint_\Omega f(x, y)\,dx\,dy = \iint_{\Omega_1} f(x, y)\,dx\,dy + \iint_{\Omega_2} f(x, y)\,dx\,dy$$
if $\Omega = \Omega_1 \cup \Omega_2$ and $\Omega_1 \cap \Omega_2 = \emptyset$.

As a final remark, in some cases the order in which we take integration matters. Sometimes the calculation involved in one order is much simpler than in the other; sometimes the integration can be calculated explicitly only in a certain order.

Exercise 9 Compute
$$\iint_\Omega xy\,dx\,dy$$
where $\Omega = \{(x, y) : x \in (0, 2) \text{ and } x^2 < y < \sqrt{8x}\}$.

3 Using Calculus to Characterize Functions

3.1 Monotonic functions

• A differentiable function $f : (a, b) \to \mathbb{R}$ is increasing iff $f'(x) \geq 0$ for all $x \in (a, b)$. If the inequality is strict, the function is strictly increasing. Decreasing and strictly decreasing functions can be defined with the inequality reversed.
— notice that a monotonic function need not be differentiable, or even continuous.
— the sum of two increasing (decreasing) functions is still increasing (decreasing).

3.2 Concave and convex functions

We mainly characterize concave functions, since convex functions can be similarly treated but with all inequalities reversed.

• Definition A real-valued function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is said to be concave if
$$f(\alpha x + (1 - \alpha) y) \geq \alpha f(x) + (1 - \alpha) f(y)$$
for all $x$ and $y \in A$ and all $\alpha \in [0, 1]$. It is strictly concave if the inequality is strict for $x \neq y$ and $\alpha \in (0, 1)$. Graphically, the line segment connecting two points in the graph of a concave function lies below the graph.

• Properties:
— $f$ is concave iff $-f$ is convex.
— the sum of two concave (or convex) functions is still concave (or convex).
— a concave or convex function must be continuous on the interior of its domain $A^o$. [7]
— (Jensen's inequality) if $f : \mathbb{R} \to \mathbb{R}$ is concave, then
$$f\left( \int x\,dG(x) \right) \geq \int f(x)\,dG(x)$$
for any distribution function $G(x)$.

[7] Moreover, a concave or convex function is differentiable "almost" everywhere.

We then present two (more practical) tests for concavity:

• A $C^1$ function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is concave if and only if
$$f(x + z) \leq f(x) + Df(x) \cdot z$$
for all $x$ and $x + z \in A$. It is strictly concave if the inequality is strict for $z \neq 0$.

• A twice differentiable function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is concave if and only if its Hessian
$$D^2 f(x) = \begin{pmatrix} \frac{\partial^2 f(x)}{\partial x_1^2} & \cdots & \frac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f(x)}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_n^2} \end{pmatrix}$$
is negative semidefinite for any $x \in A$. The function is convex iff its Hessian is positive semidefinite.
— in the case $A \subset \mathbb{R}$, $f$ is concave iff $f''(x) \leq 0$ for all $x \in A$, and it is convex iff $f''(x) \geq 0$ for all $x \in A$.
— this result can be easily understood by using the previous results. For example, let us consider the single-variable case: Taylor's expansion implies
$$f(x + z) = f(x) + f'(x) z + \frac{1}{2} f''(\tilde{x}) z^2$$
for some $\tilde{x}$ between $x$ and $x + z$. $f$ is concave iff $f(x + z) \leq f(x) + f'(x) z$ for all $x$ and $x + z \in A$, which is equivalent to $f'' \leq 0$.

• A twice differentiable function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is strictly concave if the Hessian $D^2 f(x)$ is negative definite for any $x \in A$. The function is strictly convex if the Hessian is positive definite.
— notice that negative definiteness or positive definiteness is only sufficient but not necessary for strict concavity or strict convexity. For example, $f(x) = -x^4$ is strictly concave but $f''(0) = 0$ is not strictly negative.
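The last remark can be checked numerically. The sketch below is only an illustration under stated assumptions: it tests the strict concavity inequality for $f(x) = -x^4$ on a small, arbitrarily chosen grid of points and weights (which is evidence, not a proof), and it approximates $f''$ with a central finite difference. The helper names `second_derivative` and `strictly_concave_on_samples` are hypothetical.

```python
def f(x):
    """The example from the text: f(x) = -x^4."""
    return -x ** 4


def second_derivative(func, x, h=1e-3):
    """Central finite-difference approximation of func''(x)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h ** 2


def strictly_concave_on_samples(func, points, alphas):
    """Check f(a*x + (1-a)*y) > a*f(x) + (1-a)*f(y) for all sampled x != y."""
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            for a in alphas:
                combo = a * x + (1 - a) * y
                if not func(combo) > a * func(x) + (1 - a) * func(y):
                    return False
    return True


points = [-2.0, -1.0, 0.0, 1.0, 2.0]   # assumed sample points
alphas = [0.25, 0.5, 0.75]             # assumed interior weights

passes_strict_test = strictly_concave_on_samples(f, points, alphas)
curvature_at_zero = second_derivative(f, 0.0)   # close to 0, not strictly negative
curvature_at_one = second_derivative(f, 1.0)    # close to -12
```

On these samples the strict inequality holds everywhere, while the approximate second derivative at $0$ is essentially zero, which is consistent with a negative definite Hessian being sufficient but not necessary for strict concavity.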
Exercise 10 (i) Show in different ways that (a) for $k > 0$, $x^k$ is strictly convex on $(0, \infty)$ if $k > 1$, and it is strictly concave if $k < 1$; (b) $\ln x$ is concave; (c) $e^x$ is convex.
(ii) For $a, b > 0$, show that the Cobb-Douglas function $f(x, y) = x^a y^b$ defined on $\mathbb{R}^2_+$ is concave iff $a + b \leq 1$.
(iii) Give an example in which $f - g$ is not concave though both $f$ and $g$ are concave.
(iv) Let $f$ and $g : \mathbb{R} \to \mathbb{R}$ be two twice differentiable functions. When will $fg$ be convex or concave?

3.3 Quasiconcave and quasiconvex functions

• Definition A real-valued function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is said to be quasiconcave if its upper contour sets $\{x \in A : f(x) \geq t\}$ are convex sets. That is, for any $t \in \mathbb{R}$, if $x$ and $y \in A$, $f(x) \geq t$ and $f(y) \geq t$, then $f(\alpha x + (1 - \alpha) y) \geq t$ for any $\alpha \in [0, 1]$. Analogously, $f$ is quasiconvex if its lower contour sets $\{x \in A : f(x) \leq t\}$ are convex sets.
— the definition implies that $f$ is quasiconcave iff
$$f(\alpha x + (1 - \alpha) y) \geq \min\{f(x), f(y)\}$$
for all $x$ and $y \in A$, and $\alpha \in [0, 1]$, and $f$ is quasiconvex iff
$$f(\alpha x + (1 - \alpha) y) \leq \max\{f(x), f(y)\}$$
for all $x$ and $y \in A$, and $\alpha \in [0, 1]$. (Show them as an exercise.)
— the two concepts are not mutually exclusive. For example, all monotonic functions of one variable defined on a convex set are both quasiconcave and quasiconvex.
— quasiconcavity is a "weaker" requirement than concavity. A concave function defined on a convex set must be quasiconcave; a convex function must also be quasiconvex. (Show them as an exercise.) But, again, a convex function can also be quasiconcave, and a concave function can also be quasiconvex. For example, $f(x) = x^2$ on $[0, \infty)$ is both convex and quasiconcave.
— quasiconcave or quasiconvex functions can be discontinuous (unlike concave or convex functions, which are continuous on the interior of their domain).

• Properties:
— $f$ is quasiconcave iff $-f$ is quasiconvex.
— any nondecreasing transformation of a quasiconcave function is still quasiconcave. [8] In particular, any nondecreasing transformation of a concave function results in a quasiconcave function. [9] (Similar properties hold for quasiconvexity.)

[8] This is an advantage of the concept of quasiconcavity relative to concavity. Concavity is only a cardinal property, which means that an increasing transformation of a concave function can become convex. But quasiconcavity does not suffer from this problem.
[9] But not every quasiconcave function can be obtained from a monotone transformation of some concave function. Otherwise, quasiconcavity would add nothing to concavity in dealing with the optimization problem.

We then present two tests for quasiconcavity:

• A $C^1$ function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is quasiconcave if and only if
$$f(y) \geq f(x) \implies Df(x) \cdot (y - x) \geq 0$$
for all $x$ and $y \in A$. If the second inequality is strict for $x \neq y$, then it is strictly quasiconcave.
— this result has a nice geometric interpretation: the gradient vector at $x$ and the vector $y - x$ must form an angle of at most 90 degrees if $y$ brings a higher value of $f$. (See, for instance, the graph on p. 935 of MWG.)

• A $C^2$ function $f$ defined on a convex set $A \subset \mathbb{R}^n$ is quasiconcave iff the Hessian $D^2 f(x)$ is negative semidefinite in the subspace $\{z \in \mathbb{R}^n : Df(x) \cdot z = 0\}$ for any $x \in A$. It is strictly quasiconcave if the Hessian $D^2 f(x)$ is negative definite in that subspace for any $x \in A$.
— since checking negative semidefiniteness is quite complicated, we here only present the practical way to check "the Hessian $D^2 f(x)$ is negative definite in the subspace $\{z \in \mathbb{R}^n : Df(x) \cdot z = 0\}$."
∗ define a bordered Hessian as
$$H_n = \begin{pmatrix} 0 & f_1 & \cdots & f_n \\ f_1 & f_{11} & \cdots & f_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ f_n & f_{n1} & \cdots & f_{nn} \end{pmatrix}$$
where $f_i$ is the partial derivative with respect to $x_i$ at $x$ and $f_{ij}$ is the cross partial derivative at $x$.
∗ its leading principal minors of size $\geq 3$ alternate in sign, with the first one (which has size three) being positive. That is, $(-1)^k |H_k| > 0$ for $k = 2, \cdots, n$, where $H_k$ denotes the leading principal submatrix of size $k + 1$. [10]

[10] Notice that $|H_1|$ must be nonpositive.

• For $f(x_1, x_2)$, it is strictly quasiconcave if
$$\begin{vmatrix} 0 & f_1 & f_2 \\ f_1 & f_{11} & f_{12} \\ f_2 & f_{21} & f_{22} \end{vmatrix} > 0;$$
and it is quasiconcave iff
$$\begin{vmatrix} 0 & f_1 & f_2 \\ f_1 & f_{11} & f_{12} \\ f_2 & f_{21} & f_{22} \end{vmatrix} \geq 0.$$

Exercise 11 (i) Give an example in which the sum of two quasiconcave functions is not quasiconcave. (Compare with concavity.)
(ii) For $a, b > 0$, show that the Cobb-Douglas function $f(x, y) = x^a y^b$ must be quasiconcave.

3.4 Homogeneous functions

• Definition A real-valued function $f(x_1, \cdots, x_n)$ defined on a cone is homogeneous of degree $k$ if
$$f(t x_1, \cdots, t x_n) = t^k f(x_1, \cdots, x_n)$$
for all $(x_1, \cdots, x_n)$ and all $t > 0$. [11] For example, $f(x, y) = x^a y^b$ is homogeneous of degree $a + b$.

• Properties:
— if a $C^1$ function $f$ is homogeneous of degree $k$, then its first order partial derivatives are homogeneous of degree $k - 1$.
— $\frac{f_i}{f_j}$ is homogeneous of degree zero.
— (Euler's theorem)
$$\sum_{i=1}^n f_i(x)\,x_i = k f(x).$$

[11] A cone is a set with the property that whenever $x$ is in this set, every positive scalar multiple $tx$ of $x$ is also in the set.

Exercise 12 Prove the above three properties.

A Appendix

A.1 Directional Derivatives

Consider a function $f : \mathbb{R}^n \to \mathbb{R}$. We want to measure the rate of its change at a given point $x^*$ in a given direction $v = (v_1, \cdots, v_n)$. [12] To parameterize the direction $v$ from the point $x^*$, we write the line through $x^*$ in the direction $v$ as
$$x = x^* + tv$$
where $t$ is a real number.

[12] In a unidimensional domain, the direction is unique. But in a multi-dimensional domain, we have infinitely many directions, which can be represented by vectors.
The rate of change of $f$ along that line can be evaluated as
$$\frac{df(x^* + tv)}{dt} \bigg|_{t=0} = \sum_{i=1}^n \frac{\partial f(x^*)}{\partial x_i} v_i = Df(x^*) \cdot v,$$
where $Df(x^*)$ is the derivative vector or gradient vector at $x^*$. This is the derivative of $f$ at $x^*$ in the direction $v$. In particular, if $v$ is the unit coordinate vector $e_j$, then the directional derivative reduces to the partial derivative with respect to $x_j$.

Since
$$Df(x^*) \cdot v = \|Df(x^*)\| \, \|v\| \cos\theta$$
where $\|\cdot\|$ is the length of the vector and $\theta$ is the angle between the vectors $Df(x^*)$ and $v$ at the base point $x^*$, we can see that, given $x^*$ and $\|v\|$, $f$ increases most rapidly when $v$ has the same direction as $Df(x^*)$ (i.e., $\theta = 0$). That is, the gradient vector $Df(x^*)$ points at $x^*$ into the direction in which $f$ increases most rapidly.
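The two claims above (the directional derivative equals $Df(x^*) \cdot v$, and the gradient is the direction of steepest increase) can be illustrated numerically. The following is a minimal sketch, not a general-purpose routine: the function $f(x, y) = x^2 + 3y$, the base point $(1, 2)$, the step size `h`, and the one-degree grid of unit directions are all assumptions made for this example, and derivatives are approximated by central differences.

```python
import math


def f(x, y):
    """Hypothetical example function; its exact gradient is (2x, 3)."""
    return x ** 2 + 3 * y


def gradient(func, x, y, h=1e-6):
    """Central-difference approximation of the gradient of func at (x, y)."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy


def directional_derivative(func, x, y, v, h=1e-6):
    """Approximate d/dt func(x* + t v) at t = 0 by a central difference."""
    forward = func(x + h * v[0], y + h * v[1])
    backward = func(x - h * v[0], y - h * v[1])
    return (forward - backward) / (2 * h)


x_star, y_star = 1.0, 2.0
grad = gradient(f, x_star, y_star)

# Direct computation versus the formula Df(x*) . v for v = (1, 1).
v = (1.0, 1.0)
direct = directional_derivative(f, x_star, y_star, v)
via_gradient = grad[0] * v[0] + grad[1] * v[1]

# Among unit vectors on a one-degree grid, find the steepest direction.
best_degree = max(
    range(360),
    key=lambda d: directional_derivative(
        f, x_star, y_star,
        (math.cos(math.radians(d)), math.sin(math.radians(d))),
    ),
)
gradient_degree = math.degrees(math.atan2(grad[1], grad[0]))
```

Up to the grid resolution, `best_degree` agrees with the angle of the gradient vector, matching the $\cos\theta$ argument in the text.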