ARE211, Fall2012 Contents 3. Univariate and Multivariate

advertisement
ARE211, Fall2012
CALCULUS1: TUE, OCT 2, 2012
PRINTED: OCTOBER 9, 2012
(LEC# 12)
Contents
3. Univariate and Multivariate Differentiation
1
3.1.
Univariate calculus: the derivative
1
3.2.
The fundamental idea: linear approximations to nonlinear functions
2
3.3.
Univariate Calculus: the differential
4
3.4.
Multivariate calculus: functions from Rn to R
5
3.4.1.
Partial Derivative
5
3.4.2.
The Gradient
6
3.4.3.
Crosspartial Derivative
6
3.4.4.
The differential for functions from Rn to R
7
3. Univariate and Multivariate Differentiation
3.1. Univariate calculus: the derivative
Assume you all know how to calculate the derivative of a single variable function, i.e., given f ,
calculate
df (·)
dx ,
denoted also f ′ (·). Important to know the difference between f ′ (·), which is a
function and f ′ (x), which is a number, the function evaluated at a point.
I’ll try to be careful to use this notation from now on: g(·) is a RULE, represents a function.
1
2
CALCULUS1: TUE, OCT 2, 2012
PRINTED: OCTOBER 9, 2012
(LEC# 12)
Since f ′ (·) is a function, like any other, it may have a derivative; if it does, call it f ′′ (·). Do example
f (x) = x2 .
Lots of standard kinds of functions you have to be able to differentiate in your sleep. Equivalent
of being able to spell. Brainless activity.
3.2. The fundamental idea: linear approximations to nonlinear functions
The most important thing to keep in mind about calculus is:
(Calculus Mantra) When you do calculus, you are, ALWAYS, approximating a small change in a
differentiable function by evaluating THE (unique) appropriate LINEAR function at the magnitude
of the change. That is, for sufficiently small dx, f (x + dx) − f (x) ≈ f ′ (x)dx.
df x (·) = f ′ (x)(·) is the linear function that approximates the difference between f (x + ·) and f (x).
Draw picture of the comparison: concave example.
• graph of f (·) with points x∗ and x∗ + dx∗ .
• look at difference in the value of the function at the two points.
• look at the graph of the linear function df = f ′ (x∗ )dx:
• note that the difference between (f (x∗ + dx∗ ) − f (x∗ )) and f ′ (x∗ )dx∗ will be arbitrarily
small provided that dx∗ is sufficiently small.
• Explain the enormous practical importance of this.
ARE211, Fall2012
3
f
df
∗
df x = f ′ (x∗ )dx
x∗
f (x∗ )
Estimated change in f
Actual change in f
df (dx† )
f ′ (x∗ )dx†
+
f (x∗ + dx† )
dx†
f (x∗ )
dx
x
x∗
x∗
+
dx†
Figure 1. Linear Approximations of the change in a nonlinear function
When dx is bigger, don’t do so well at approximation. Could do better if you threw in the second
derivative, i.e.,
f (x + dx) − f (x)
≈ even better
1
f ′ (x)dx + f ′′ (x)dx2 .
2
Note that in the above figure, f ′′ (x∗ ) < 0. Note also that the linear approximation gives an
overestimate of the change, so that adding the second term helps.
This is the germ of a crucial theorem called Taylor’s theorem, that we’ll see more of later.
4
CALCULUS1: TUE, OCT 2, 2012
PRINTED: OCTOBER 9, 2012
(LEC# 12)
3.3. Univariate Calculus: the differential
You’ve probably seen the expression dy = f ′ (x)dx, known as the differential, before. What sort
of object is this? What’s the relationship between the differential and the derivative? Recall from
linear algebra about the relationship bewteen linear functions from R1 to R1 and scalars; every
such function is uniquely defined by a single scalar. E.g., I sometimes talked about the function p,
when I really meant the linear function f (·) defined by f (x) = px. Well... The differential is to the
derivative as the linear function is to the scalar/vector that defines it.
The relationship between the differential and the derivative is a particular example of the relationship between any linear function and the scalar/vector/matrix that defines it.
Defn: the differential of f at x is the linear function defined by the scalar f ′ (x). In other words, the
differential of f at x is the unique linear function whose coefficient (slope) is equal to the derivative
of f at x.
Restate: The most important thing to keep in mind about calculus is: when you do calculus on
f at x, you are, ALWAYS, approximating a small change in f , starting from x, by evaluating
the differential of f at x at the magnitude of the change. This is the idea that underlies all of
comparative statics.
That is, dy = f ′ (x)dx is the value of the approximation to the change in f when you shift from x
to x + dx.
Economic example: have π(p), i.e., profits are a function of prices (assuming optimal input choices
for prices p.) If p change is small enough, dπ p ≈ π ′ (p)dp.
ARE211, Fall2012
5
It is the ability to make computations like this—we call them comparative statics—that distinguish
economists from art historians. Example:
• Let f (x) = x3 + 2x2
• set x∗ = 1, dx∗ = 0.1
• then f (x∗ ) = 3, f ′ (x∗ ) = 3x∗ 2 + 4x∗ = 7, f (x∗ + dx∗ ) = 1.13 + 2 ∗ 1.21 = 3.751.
• The true difference f (x∗ + dx∗ ) − f (x∗ ) is .751.
• the estimated difference is f ′ (x∗ )dx∗ = (3 × 12 + 4 × 1) × dx∗ = 0.70, not bad.
3.4. Multivariate calculus: functions from Rn to R
3.4.1. Partial Derivative. Cookbook approach: If f : Rn → R, then fi (·) ≡
∂f (·)
∂xi
is the (usual)
derivative of the single variable function you get by treating all other variables as constant.
Example:
f (x) = x21 + 2x1 x2 + x22
= x21 + 2x1 α + β
f1 (x) = 2x1 + 2α + 0
= 2x1 + 2x2
A more graphical view of partial derivatives. Take a cross-section of the graph of the function along
the i’th axis, and look at the slope of the single-dimensional function you obtain in this way. This
slope is the i’th partial derivative.
6
CALCULUS1: TUE, OCT 2, 2012
PRINTED: OCTOBER 9, 2012
(LEC# 12)
Similarly, you could take a diagonal cross-section, and get a different one-dimensional slope. Generally, you get what is called a directional derivative. Partial derivatives are just special kinds
of directional derivatives: in particular, they tell you the slopes you obtain when you take crosssections along the various axes. We’ll talk about directional derivatives much more in the next
lecture.
3.4.2. The Gradient. Given a function f : Rn → R, the gradient of f is just the function from Rn
to Rn that maps each x ∈ Rn to the vector of partial derivatives of f at x. That is,


 f1 (·)

 .
 ..



▽f (·) = 
 fi (·)

 .
 ..



fn (·)















▽f (·) is the exact analog for f : Rn → R. as f ′ (·) is for f : R → R. In fact the words “slope of f,”
“gradient of f” and “derivative of f” are synonomous.
The important distinction between the “gradient of f ” which is a function, written ▽f or ▽f (·)
and the “gradient of f at x,” which is a vector.
3.4.3. Crosspartial Derivative. A cross-partial derivative is just an entry in the matrix which is the
derivative of the derivative. That is, take a function f : Rn → R. The i’th partial derivative of this
function fi (·) is a function, like any other, and if the function is differentiable, it has derivatives:
ARE211, Fall2012
fij (·) ≡
∂2f
∂xi ∂xj (·)
is the j’th partial derivative of the function fi (·) ≡
7
∂f
∂xi (·).
Thus, the Hessian of f just consists of the matrix of crosspartials of f .
3.4.4. The differential for functions from Rn to R. Recall from earlier: the most important thing
to keep in mind about calculus is that when you do calculus on f at x, you are, ALWAYS, approximating a small change in f , starting from x, by evaluating the differential of f at x at the
magnitude of the change. We’ll look at two numerical examples, then look at the picture.
Numerical example:
• f (x, y) = x2 + y 2 ;
• ▽f (x, y) = (2x, 2y);
• Consider (dx, dy) = (0.1, 0.1) and set (x∗ , y ∗ ) = (100, 100)
• approx f (x∗ +dx, y ∗ +dy)−f (x∗ , y ∗ ) by ▽f (x∗ , y ∗ )×(0.1, 0.1) = (200, 200)×(0.1, 0.1) = 40.
• Now evaluate at f at both (x∗ , y ∗ ) = (100, 100) and (x∗ + dx, y ∗ + dy) = (100.1, 100.1), and
compare: f (x∗ , y ∗ ) = 20, 000; f (x∗ + dx, y ∗ + dy) = 20040.02; Actual difference is 40.02.
Symbolic example:
• y = f (ℓ, k);
• ▽f (ℓ, k) = (fℓ (ℓ, k), fk (ℓ, k)).
• Fix (ℓ∗ , k∗ ) and let (ℓ, k) = (ℓ∗ , k∗ ) + (dℓ, dk).
• Approximate the difference f (ℓ, k) − f (ℓ∗ , k∗ ) by the value of the differential, evaluated at
(dℓ, dk) = (ℓ, k) − (ℓ∗ , k∗ ):
f (ℓ, k) − f (ℓ∗ , k∗ ) = dy = fℓ (ℓ∗ , k∗ )dℓ + fk (ℓ∗ , k∗ )dk
8
CALCULUS1: TUE, OCT 2, 2012
PRINTED: OCTOBER 9, 2012
(LEC# 12)
Graphical example: In Fig. 2, we’re doing exactly the same thing for functions from Rn to R as
we did in Fig. 1 for functions from R to R. We evaluate the difference f (x∗ + dx† ) − f (x∗ ), where
here dx† and x∗ ’s are both vectors, by the value of the differential, evaluated at the difference, dx† .
What’s the differential, in terms that we’ve used before? Recall that it is a linear function from Rn
∗
to R; so it is characterized by a vector of coefficients. That is, df x = ▽f (x∗ ) · dx.
As Fig. 2 indicates, the differential here plays exactly the same role as it played for functions from
R to R. In Fig. 2, the tangent plane lies below the function, at least a neighborhood of x∗ . Then,
as we move from x∗ to x∗ + dx† , the value of the function declines; but the height of the tangent
plane declines even further. Applying the calculus mantra, we approximate the change in f , i.e.,
f (x∗ + dx† ) − f (x∗ ), by translating the tangent plane so that x∗ becomes the new origin, then
approximate the change in f when you move from x∗ to x∗ + dx† by evaluating the height of this
∗
translated plane at the point dx† . Notice from the top panel that the absolute value of df x (dx† )
over-estimates the absolute magnitude of the change in the function.
The differential is sometimes refered to as the total differential. I’ll hopefully never use this term,
because it is too close to something that’s different, i.e., the total derivative. The total derivative
is related to (but not the same as) a directional derivative (i.e., you take the cross section of the
function in a special direction). The total derivative is a either number
function
df (·)
dx .
df (x)
dx
∗
or possibly a non-linear
The total differential is the name of a linear function,. df x (dx) = ▽f (x∗ )dx.
ARE211, Fall2012
9
Graph of f , with tangent plane at x∗
0
f (x∗ )
f (x∗ +dx† )
f (x∗ )+▽f (x∗ )·dx†
x∗ + dx†
x2
x1
x∗
Graph of f ; rotated
0
f (x∗ )
f (x∗ +dx† )
x∗ + dx†
f (x∗ )+▽f (x∗ )·dx†
x∗
x1
x2
Differential of f at x∗ , eval at dx†
0
∗
df x (dx† )
dx2
0
dx1
0
Figure 2. Linear Approximation of the change in a nonlinear multivariate function
Download