Chapter 2 Mathematics of Optimization • Many economic concepts can be expressed as functions (eg., cost, utility, profit), – To be optimized (maximized or minimized), – Subject to or not subject to constraints (eg., income or resource constraints). • Optimizing a function of one variable (unconstrained): Maximization (one variable and no constraint) • To maximize π=f(q), look at changes in π as q increases. • As long as Δ π > 0, π increases. • Can express Δ’s as a ratio ΔΠ ; could be + or -. Δq • Limit of ratio as Δq 0 is derivative of π=f(q). • For q<q*, derivative >0. • For q>q*, derivative <0. • At q*, derivative = 0. Π Π =f(q) is concave. ∆Π=0 Π=f(q) ∆Π>0 ∆Π<0 ∆q>0 q* ∆q>0 q Minimization (one variable and no constraint) • To minimize AC=f(q), look at ∆AC as q increases. • As long as ∆AC<0, AC is decreasing. The change ratio is AC =f(q) is convex. AC dAC <0 dq dAC >0 dq ∆AC ∆q Limit of ratio as ∆q 0 is derivative of AC=f(q). For q<q**, derivative < 0. For q>q**, derivative > 0. At q= q** , derivative = 0. AC = f(q) q** q • First derivative is the slope of a line tangent to the Slope = dπ/dq at the point of tangency. function at a point. Π • Derivative is denoted as: dΠ dAC df or or or f ′ (q) = f ′ dq dq dq 0 q • To maximize or minimize f(q), find q value where: df = 0 dq • Derivative will be a constant if the original function is linear; thus, the function has no Max or Min. • But if the function is nonlinear, its derivative will be a function of q itself (the derivative varies as q varies). • For derivative at a specific value of q, say q1, denote dΠ At q*, dΠ dq q = q1 . dq q =q* = 0. • To find the q where Π is maximized or AC is minimized, find the derivative of f(q), and set it equal to 0 and solve for q to get q*. • The “First Order Condition (FOC)” for a maximum or a minimum of an unconstrained function of one variable is: df FOC is = 0. dq Solve the FOC for q to get q* (the optimal q). The FOC is a necessary condition for a maximum (or minimum) but not a sufficient condition. Second Order Condition (SOC) The FOC is dy dx = 0. Convex function y 0 x* dy = 0 dx Concave function y 0 x y 0 x* dy = 0 dx x FOC is necessary but not sufficient dy dx 0 x Increases through 0 0 Convex then concave dyx* =0 dx x 0 x x* x* Concave then convex dy dx Decreases through 0 dy dx Inflection point x x* Take the derivative (second derivative) of the derivative (first derivative). d2y dx 2 + 0 d2y dx 2 d2y x*> 0 dx 2 Min x* x 0 - d2y x*< 0 dx 2 x* Max x d2y dx 2 Inflection 0 d2y x* = 0 dx 2 x x* Examples: y dy =0 dx MAX 2 d y d2y = 0 dx 2 = 0 2 dx d2y =0 dx 2 Local maximum and local minimum dy =0 dx MIN dy >0 dx dy <0 dx d2y <0 2 dx x dy >0 dx d2y >0 2 dx dy >0 dx 2 d2y d y <0 2 >0 2 dx dx An inflection point is where the second derivative equals zero when evaluated at the q* identified by the FOC. When the first derivative equals zero, you have a minimum, maximum, or an inflection point. Not all inflections have f ′ = 0. SOC if f ′′ q* < 0, have a maximum at q* if f ′′ * > 0, have a minimum at q* q if f ′′ * = 0, have an inflection at q* q Max FOC f′ = 0 SOC f ′′ q* < 0 For an unconstrained function of one variable Min f′ = 0 f ′′ q* > 0 Solve FOC for q* and evaluate f ′′ at q*. Rules for Finding Derivatives 1.The derivative of a constant is 0. If b is a constant in y = b, then dy db = = 0. dx dx Parameters Variable y = a + bx, the derivative of a with respect to x is 0. 2. The Power Rule – The derivative of a variable raise to b dax dax a power is = bax b −1 , = ax1−1 = a , dx dx = 1, dx dx b = bx b −1. dx dx • Examples: For y = a+bx2 dy = 2bx dx 2 FOC 2bx=0 ⇒x*=0 d y = 2b 2 dx f ′′ x * = 2b y = a+bx+cx2+ex3 dy = b + 2cx + 3ex 2 dx Anything raised to the 0 power is 1! 2 d y = 2c + 6ex 2 dx Review of the FOC and SOC for an unconstrained function of one variable: 2 dy d y = 0 ), then look at SOC by substituting x* into Find x* from the FOC ( dx dx 2 to see if second derivative evaluated at x* is >, <, or = 0. • The derivative of the natural log of x is the inverse of X, dy d ln x 1 = = . dx dx x c dbln(x ) bc because The actual rule is = dx x dclnx c = ln(x ) = clnx and dx x dbclnx bc c = bln(x ) = bclnx and dx x May not always dy 5 ⋅ 2 If y = 3 + 5ln(x 2 ), = , have the dx x parentheses. c The exponential function rule dy bx For y = a , dx = the derivative of the exponent bx (bx) times the original function ( a times the natural log of the constant (a). ) bx da dbx bx = a lna = ba bx lna, dx dx x da x = a ln a dx Can give constants different labels than above. x dy x If y = ac , then = ac ln c. dx x bx de x x x de bx bx bx = e ln e = e (1) = e = be ln e = be (1) = be , dx dx e is the base of natural logarithms = 2.71828, so that lne = 1. The Sum of Two Functions Rule – d[f ( x ) + g( x )] = f ′( x ) + g′( x ). dx The derivative of the sum of two functions of x is the sum of derivatives of the two functions: Example: If y = ax + bx2, dy then = a +2bx. dx The Product Rule – The derivative of the product of two functions of x is the first times the derivative of the second plus the second times the derivative of the first. d[f ( x ) ⋅ g( x )] = f ( x )g′( x ) + g( x )f ′( x ) dx Example: y = (2x-1)(2-9X2) dy 2 = (2X − 1)(−18X) + (2 − 9 x )(2) dx = (−36 x 2 + 18x ) + (4 − 18x 2 ) = 4 + 18x − 54 x 2 The Quotient Rule - The derivative of the quotient of two functions of x is the derivative of the top times the bottom minus the top times the derivative of the bottom all divided by the bottom squared. f (x) d g ( x ) = f ′( x )g ( x ) − f ( x )g′( x ) dx [g( x )]2 Ex 1: If Ex 2: If Or ( ) 2 2 3 4 2 x3 + − + dy 3 x x 2 x ( 2 x ) x 6 x y= 2 , = = 4 4 2 x + 2 dx x + 4x + 4 x + 4x 2 + 4 ( ) dy 0 x 4 − 1( 4 x 3 ) − 4 x 3 −4 1 −5 y= 4 , 4 x − = = = = x dx x8 x8 x5 y= 1 = x −4 , 4 x dy = −4 x −5 dx The Chain Rule - Allows differentiation of complex functions in terms of elementary functions. If y = f ( x ) and x = g ( z ) so that y = f (g (z)), then dy dy dx df dg df dg = ⋅ = ⋅ = ⋅ dz dx dz dg (z) dz dx dz Examples: 1 3 y = ( z + 2 z ) , x = z + 2z, so y = x 1. 2 2 1 3 − 2 dx dy dy 1 2 3 = = (z + 2z) = 2z + 2 dz dx 3 dz −2 2(z + 1) 1 2 3 = z + 2 z ( 2 z + 2) = 2 2 3 3(z + 2z) 3 ( ) Examples: w w dy = 2(3x + 1)3 = 18x + 6 2. y = (3x + 1) , dx 2 w 3. y = ln(x − 4x + 2) 2 dw dx dy 1 2x − 4 = 2 ⋅ (2x − 4) = 2 x − 4x + 2 dx x − 4x + 2 dy dw Functions of More than One Variable “Partial” derivatives – May want to know how y changes as one of several x variables changes, ceteris paribus (as in demand curve). y = f (x 1 , x 2 , x 3 ,..., x n ) Several derivatives can be taken; one with respect to each Xi holding the other variables constant. Denoted as: ∂y = f xi = fi ∂x i When taking this derivative, all other xi’s are treated as constants. Fortunately, the rules for derivatives of functions of one variable apply to partial derivatives also. We simply treat the other x’s as constants. Second Order Partial Derivatives are analogous to the early one-variable case. Denoted as: “cross second partial” i≠j ∂2y ∂x i ∂x j = f ij “own second partial” i=j There are four combinations in the two variable case, fij, fii, fji, and fjj; but fij and fji are equal (Young’s Theorem). The order of differentiation does not matter. Partial derivatives reflect the units of measurement for y and x. If y = quantity of apples demanded per year in 100 pound units and x is the price of apples per pound, ∂y/∂x measures the change in the quantity of apples demanded in 100 pound units per year for a dollar change in the apple price, ceteris paribus. If demand were measured in pounds per year, the partial derivative would be 100 time larger than when demand is measured in 100 pounds per year. Example 1: Q = 1000 − 20P 2 + 36I 2 ∂ 2Q = −40 ∂P∂P ∂ Q ∂Q = 0, = −40P, ∂P∂I ∂P 2 ∂Q ∂ Q = 72I, = 0, ∂I ∂I∂P 2 ∂ 2Q = 72 ∂I∂I Example 2: Q = 1000 − 20P + 36I + IP 2 ∂Q = −40P + I, ∂P ∂ Q = 1, ∂P∂I ∂Q = 72I + P, ∂I ∂ 2Q = 1, ∂I∂P 2 2 ∂ 2Q = −40 ∂P∂P ∂ 2Q = 72 ∂I∂I Total Differential The total differential shows what happens to y as both x1 and x2 change simultaneously. The total differential is dy = ∂f dx1 + ∂f dx 2 ∂x1 ∂x 2 To find a maximum or minimum for y (where dy = 0 as x1 and x2 change), the FOC is that all ∂f ∂x = 0 because then dy = 0! That is, all the first partial derivatives must be 0. This happens at f2 the “top of the Hill” in x2 three dimensions or f1 the “bottom of the hole”. i x1 Example 1: To find max. or min. for y = − x1 + 2x1 − x 2 + 4x 2 + 5, 2 2 we must find where both first partials equal zero (FOCs) and solve the FOCs simultaneously for x1* and x *2 . Plug these ∂y * 2 x1 = 2 x1 = 1 optimal values = −2 x1 + 2 = 0 FOC 1: ∂x1 into y to get y ∂ FOC 2: 2x 2 = 4 x *2 = 2 y* = 10. = −2 x 2 + 4 = 0 ∂x 2 Example 2: Optimize y = − x1 + 2x1 − x 2 + 4x 2 + 5 − x1x 2 , FOC 1: ∂y = −2x1 + 2 − x 2 = 0 ⇒ x 2 = −2x1 + 2 ∂x1 FOC 2: ∂y = −2x + 4 − x = 0 ⇒ -2(-2x + 2) + 4 − x = 0 2 1 1 1 ∂x 2 ⇒ x1* = 0 and x *2 = 2 Plug these optimal values into y to get y* = 9. 2 2 These FOCs are not sufficient; could be a maximum, a minimum, or a saddle point. All we know is that it is a flat spot! Need SOCs (later). Implicit Functions • Usually express functions explicitly as y = f(x1, x2 ,…, xn) where y is a function of xi. • However, some functions will be expressed implicitly as 0 = f(x1, x2 ,…, xn, y) where there is no dependent variable. • Can find: dy fi =− dx i fy dx i fy =− and dy fi or Remember the negative sign. fj dx i =− dx j fi Example f(x, y) = x + y − 2x + 4y + 4 = 0 2 f x = 2x − 2 2 f y = 2y + 4 dy 2x − 2 x −1 1− x =− =− = dx 2y + 4 y+2 y+2 dx 2y + 4 y+2 Remember the sign! =− = dy 2x − 2 1 − x Examples of implicit functions are production possibility curves, indifference curves, budget constraints. Envelope Theorem • Deals with how the optimal value of the function (y*) changes as the value of one of the parameters changes (What are parameters?) dy • Theorem states that one can calculate da by holding independent variables at their optimal values in the function and ∗ ∂ y ∗ * * * finding y = f(x , x ,..., x . 1 2 n) ∂a Example: y = − x + ax + bx 2 dy = −2x + a + b = 0, First, find the FOC: dx a+b * Second, solve FOC for x: x = 2 Third, 2 2 substitute x* + a b ( a b ) a b a b + + + ∗ into y to get: y = − = + b + a 2 2 ∗ 2 Forth, partially differentiate y with respect to a: (Partial derivative because b is constant) ∗ 4 ∂y a + b = 2 ∂a Example cont’d ∗ From previous slide ∗ Thus, ∂y ∂a ∂y a + b = ∂a 2 depends only on values of a and b, but no variables! ∂y* =3 a = 4, Assume b=2 then: ∂a ∂y* ∂y* a+2 = = a = 2, = 2 ∂a ∂a 2 ∂y* =1 a = 0, ∂a In many-variable case, solve all FOC simultaneously for xi*(a) * and substitute into y to get y* = f(x1*, x2*, …, xn*), then dy = ∂y . da ∂a Constrained Optimization • Because we are looking at scarce resources, we usually must find an optimum value subject to one or more constraints (limits). These may prevent us from achieving the maximum (or minimum) of the objective function. Lagrangian Multiplier Method Mathematical “voodoo” to short-cut optimization subject to a constraint. The following is an important format for setting up a constrained optimization problem. Max (Min):y = f ( x1 , x 2 , x 3 ,..., x n ) Objective function to be optimized s.t. 0 = g( x1 , x 2 , x 3 ,..., x n ) Constraint Implicit form! Set up L = f ( x1 , x 2 , x 3 ,..., x n ) + λg( x1 , x 2 , x 3 ,..., x n ) Where λ is the Lagrangian multiplier. Constraint example: x1 + 2x2 + x3 = 3 0 = 3 − x1 − 2x 2 − x 3 = g( x1 , x 2 , x 3 ,..., x n ) Value of constraint Find FOC for each xi and λ: ∂L = f1 + λ g 1 = 0 ∂x 1 ∂L = f 2 + λg 2 = 0 ∂x 2 ∂L = f 3 + λg 3 = 0 ∂x 3 ∂L = f n + λg n = 0 ∂x n ∂L = g ( x1 , x 2 ,..., x n ) = 0 ∂λ FOC Lagrangian Multiplier Method (cont) Solve these n + 1 FOCs simultaneously to find * * * * * values for x1 , x 2 , x 3 ,..., x n , and λ . These are optimum values of the xis given the constraint. * i * Substituting the x into y will give y . These conditions are necessary, but not sufficient for an optimum. Look at Second Order Conditions (SOC) later. You can use the Lagrangian method for any number of xi (i=1…n)and for any number of constraints. Just add another λj (j=1…m) for each constraint, gj (j=1…m). L = f(x1 ,..., x n ) + λ1g1(x1 ,..., x n ) + λ2g2(x1 ,..., x n ) Interpretation of λ Solve the first FOC for λ to get: f1 f1 + λg1 = 0, λg1 = −f1 , λ = − g1 fi Do it for all n FOC to get: λ = − gi Because all ratios equal λ, all ratios are equal. f1 f2 fn λ= = = ... = − g1 − g 2 − gn • At optimum (Max or Min of y), the negative of the ratios of the partial derivatives (objective/constraint) with respect to each xi are equal and equal to λ. • The fi shows the marginal contribution (benefit) of xi to y (the objective function), while the –gi shows the marginal cost of xi in the constraint. That is, –gi shows how much the other xis must be reduced when xi is increased slightly, and therefore, how much y is foregone if the constraint is to hold. • Because λ = the ratio of marginal benefit to marginal cost for all xi, all these ratios are equal. This condition is necessary for a maximum (or minimum). That is, the Margin Benefit/Marginal Cost ratios must be equal for all xi and equal to λ. If one xi ratio were larger than the others, then more xi should be used and less of other xis should be used until the ratios are equal again. • λ* indicates how y * would be affected by relaxing the constraint slightly. λ* is the shadow price for a unit of the resource represented by the constraint (the value of an additional unit of the resource). If λ* = 0, the constraint is redundant and the optimum is unconstrained. An additional unit of the resource would be worth nothing in regard to increasing y. • If λ* is large, relaxing the constraint would be very beneficial because the benefit/cost ratio is high. • Relaxing the constraint refers to increasing (decreasing) the value of the constraint in a maximization (minimization) problem! Duality • Any constrained Max (Min) problem (called the primal problem) is associated with a constrained Min (Max) problem (called the dual problem). – Primal (Dual) may be: Max Utility s.t. Income; or Min Cost s.t. Output level. – Dual (Primal) would be: Min Expenditures s.t. Utility level; or Max Output s.t. Cost level. Always two ways to look at any constrained optimization problem. Example: Primal Min: f ( x, y) = z = x 2 − y 2 + 3xy + 5x x − 2y = 0 s.t. L = x − y + 3xy + 5x + λ(0 − x + 2y) 2 1. 2 ∂L = 2 x + 3y + 5 − λ = 0 ∂x 2. ∂L = −2 y + 3x + 2λ = 0 ∂y 3. ∂L = −x + 2y = 0 ∂λ FOCs Primal example (cont.) Simultaneous Solution 1. Multiply FOC 1 by 2 and add it to FOC 2. 4x + 6y + 10 − 2λ = 0 3x − 2y + 0 + 2λ = 0 2. Solve FOC 3 for x to get x = 2 y. 7x + 4y + 10 = 0 3. Substitute x into above result to get: 7(2 y) + 4 y + 10 = 0. This gives: y* = − 5 9 . 4. Substitute y* into x=2y from FOC 3 to get: x * = 2( − 5 9) = − 10 9 . 5. Substitute x * and y*into FOC 1 to get: λ* = 10 9 . 6. Substitute x * and y*into z to get: z * = − 225 81. * * * 7. Optimal solution is: x * = − 10 9 , y = − 5 9 , z = − 225 81, λ = 10 9 . If you could relax the constraint by one unit (decrease it by one unit from 0 to -1), λ* shows that z*would decrease by 10/9 of a unit. Example: Dual Max: w = x − 2y s.t.: x 2 − y 2 + 3xy + 5x = − 225 81 From the constraint of the primal. From the objective function of the primal assuming optimality. The constraint in implicit form is: 0 = –225/81 – f(x,y). L = x − 2y + λ (− 225 81 − x + y − 3xy − 5x) D 2 2 Dual example (cont.) FOCs ∂L = 1 − 2λ D x − 3λ D y − 5λ D = 0 ∂x ∂L = −2 + 2λ D y − 3λ D x = 0 ∂y ∂L 225 − x 2 + y 2 − 3xy − 5x = 0 = − 81 ∂λ D Solve these FOCs simultaneously to get: x * = − 10 9 , y* = − 5 9 , λ D* = 9 10 , w * = 0 If we could relax (increase) the constraint by 1 unit, the optimal value of the objective function ( w * ) would increase by 9/10 of a unit. Notice that values for x* and y* are common to the primal and dual, while the value of λD* for the dual is the reciprocal of λ* for the primal. Envelope Theorem in Constrained Max. • Theorem can be used as discussed earlier to examine how a function’s optimum value (y*) changes as a parameter in the problem changes. In the case at hand, differentiate optimal Lagrangian expression with respect to the parameter: Max: y = f ( x1 , x 2 ; a ) st: 0 = g( x1 , x 2 ; a ) L = f ( x 1 , x 2 ; a ) + λg ( x 1 , x 2 ; a ) Solve FOC for x1*, x2* and substitute into L to get L*, which is a function of parameters only. Envelope Theorem …(cont.) dy ∂L = da ∂a * * Only difference is that y* is a constrained optimum. dy* * ∂L* If a is the value of the constraint, then =λ = da ∂a is the shadow price of the resource in the constraint. See the section in the text on Inequality Constraints (pp. 4547) for the Lagranian methods for solving problems with inequality constraints. Warnings: Not all optimization problems can be solved with calculus. Functions must be continuous at optimum. If not, use Linear Programming methods. Further discussion of FOC and SOC We have already talked about FOC and SOC for functions of one independent variable and no constraints: y = f(x). y ′ f = 0 FOC Max SOC f ′′ x * < 0}concave y FOC f ′ = 0 Min SOC f ′′ x * > 0}convex y* concave x X* convex y* X* x Suppose y = f ( x1 , x 2 ) Two independent variables, no constraint. FOC f1 = 0 Indicates a flat spot— f 2 = 0 max, min, saddle point. Logic suggests that the own second partials * * evaluated at x1 and x 2 would indicate which (max or Min), but…. f11 x * , x * < 0 SOC 1 2 Max f 22 x * , x * < 0 1 2 f11 x * , x * > 0 1 2 Min f 22 x * , x * > 0 1 2 Assumes only x1 or x2 is changing! However, what if both x1 and x2 are changing? Must consider a third SOC condition (partial derivatives are evaluated at X1* and X2* for SOCs): 2 11 22 12 f f −f > 0 This condition must hold when evaluated at the optimal values of x1 and x2. This condition is one of three sufficient conditions (SOCs) when f is a function of two variables (unconstrained). Summary of FOC and SOC for functions of two independent variables and no constraint. y = f(x1 , x 2 ) FOC for maximum or minimum f1 = 0 max,min, saddle point f 2 = 0 SOC for a maximum: f11 < 0, f 22 < 0 2 f11f 22 − f12 > 0 Concave function SOC for a minimum: f11 > 0, f 22 > 0 Convex function 2 f11f 22 − f12 > 0 • f1 and f2 locate a flat spot. • f11 and f22 distinguish between a minimum and a maximum. • Think of a saddle in case of a minimum. The flat spot on the saddle is not the lowest point on the saddle. • f11f22 - f122 > 0 confirms that it is not a saddle point. Constrained Optimization FOC The SOC for constrained maximization of a function of two variables, subject to a linear constraint (or its dual) is: f11f 22 − 2f12 f1f 2 + f 22 f12 < 0, evaluated at x1*and x *2 . ∂L =0 If a constrained function has this property, it is strictly quasi-concave. ∂x1 In economic theory, we usually use this problem (e.g., Max. a utility ∂L = 0 function, s.t. a linear budget constraint (primal), or Min. linear ∂x 2 expenditures, s.t. a fixed level of the utility function (dual). … The SOC for constrained minimization of a function of two ∂L = 0 variables, subject to a linear constraint (or its dual) is: ∂x n f11f 22 − 2f12 f1f 2 + f 22 f12 > 0, evaluated at x1* and x *2 . ∂L = 0 If a constrained function has this property, it is strictly quasi-convex. ∂λ 1 Solve the FOC simultaneously to get the optimal solution and plug … ∂L =0 ∂λ m the xi* into the appropriate expression above (SOC) to see if it is satisfied (or just determine the signs of the partial derivatives and determine if the expression has the appropriate sign).