BEE1020 - Basic Mathematical Economics
Week 7, Lecture Tuesday 17.11.05
Convexity, concavity. Sign diagrams

0.1 Strict convexity and concavity

If we look again at the cost increases in Example 2 we notice that the cost increases are themselves increasing:

$$
\begin{array}{c|ccc}
Q & TC & \Delta TC & \Delta^2 TC \\
\hline
0 & 110 & & \\
1 & 135 & 25 & \\
2 & 170 & 35 & 10 \\
3 & 215 & 45 & 10 \\
4 & 270 & 55 & 10 \\
5 & 335 & 65 & 10 \\
6 & 410 & 75 & 10 \\
7 & 495 & 85 & 10
\end{array}
$$

(Each difference is entered in the row of the larger quantity, e.g. $\Delta TC = 135 - 110 = 25$ in row $Q = 1$ and $\Delta^2 TC = 35 - 25 = 10$ in row $Q = 2$.)

In other words, the increase of the increase (written as $\Delta^2 TC = \Delta(\Delta TC)$) is always positive. Costs are accelerating: the more is already produced, the more costly it is to further increase production. Economists speak of increasing marginal costs: the cost of producing one more unit is higher when more is produced. Mathematicians speak here of a strictly convex function. In the graphs we see this as follows:

– The graph is upward-bowed.
– The tangents get steeper from left to right, i.e., their slopes are increasing.

Accordingly, the marginal costs $MC(Q) = 10Q + 20$ are not only positive but increasing, as the graph of the marginal cost curve shows:

[Figure: left panel, the total costs TC against Q in Example 2; right panel, the marginal costs MC against Q (increasing marginal costs).]

Mathematicians call a function with a graph which is upward-bowed (like a cup) strictly convex. In contrast, a function with a downward-bowed graph (like a cap) is called strictly concave.¹ The word “strictly” is used here to indicate that the graph is properly curved and not, at least partly, a straight line. Correspondingly, a linear function is regarded as both convex and concave, but not as strictly convex or as strictly concave.

¹ [?] use “upward concave” instead of “strictly convex” and “downward concave” instead of “strictly concave”. I have never seen this terminology in any other book. Hence I prefer to stick with the terminology your future teachers will understand. I guess the authors did not know the “cave-rule”.
It is easy to memorize what concave is, as opposed to convex, because the word “cave” appears in concave. [Figure: illustration labelled “concave”.]

Example 3 does not exhibit increasing marginal costs: The cost increases $\Delta TC$ are first decreasing and then increasing.

$$
\begin{array}{c|ccc}
Q & TC & \Delta TC & \Delta^2 TC \\
\hline
0 & 50 & & \\
1 & 94 & 44 & \\
2 & 114 & 20 & -24 \\
3 & 122 & 8 & -12 \\
4 & 130 & 8 & 0 \\
5 & 150 & 20 & 12 \\
6 & 194 & 44 & 24 \\
7 & 274 & 80 & 36
\end{array}
$$

[Figure: left panel, the total costs TC against Q in Example 3; right panel, the U-shaped marginal costs MC against Q.]

In the graph of the total cost function this is reflected by the fact that the graph is first downward-bowed and then upward-bowed. The slopes of the tangents are first decreasing and then increasing. We say that the total cost function is strictly concave for $0 \le Q \le 3$ and strictly convex for $3 \le Q$. The graph of the marginal cost curve is given above. For obvious reasons economists speak of a U-shaped marginal cost curve.

Again, calculus can help to decide whether a function is convex or concave on an interval. Since we have been looking here at differences of cost differences, we must now use the second derivative of a function. This is simply the derivative of the derivative of the function. We write $y''(x)$ for the second derivative of a function; Leibniz used $\frac{d^2 y}{dx^2}$.

In Example 2 we have

$$\frac{d^2 TC}{dQ^2} = \frac{dMC}{dQ} = \frac{d(10Q + 20)}{dQ} = 10 > 0.$$

In Example 3 we have

$$\frac{d^2 TC}{dQ^2} = \frac{dMC}{dQ} = \frac{d(6Q^2 - 36Q + 60)}{dQ} = 12Q - 36 = 12(Q - 3)$$

which is negative for $Q < 3$ and positive for $Q > 3$. This information allows us to deduce immediately on which intervals the total cost functions are concave or convex and where, correspondingly, marginal costs are increasing or decreasing. The result we can use here is:

Theorem 1 The following statements are equivalent for a twice continuously differentiable function on an interval:
a) The function is strictly convex on the interval.
b) Its first derivative is (strictly) increasing on the interval.
c) Its second derivative is nonnegative on the interval and never constantly zero on any subinterval.

Theorem 2 The following statements are equivalent for a twice continuously differentiable function on an interval:
a) The function is strictly concave on the interval.
b) Its first derivative is (strictly) decreasing on the interval.
c) Its second derivative is nonpositive on the interval and never constantly zero on any subinterval.

Summary: A function is convex (upward-bowed) if its tangents get steeper from left to right. The latter means that its first derivative is increasing, i.e., that the graph of the first derivative is positively sloped. Thus a convex function corresponds to an increasing first derivative, and the latter to a positive second derivative. Correspondingly, concave (downward-bowed) functions have decreasing first derivatives and negative second derivatives.

Sign diagrams

Roughly speaking: A function is increasing (decreasing) where its first derivative is positive (negative). A function is strictly convex (strictly concave) where its second derivative is positive (negative). In order to apply this to a polynomial function $y = f(x)$ we have to solve inequalities of the type

$$f'(x) > 0 \quad \text{and} \quad f''(x) > 0$$

for $x$. In general, solving inequalities is messy and prone to errors. Consider, for instance,

$$f(x) = (x^2 + 1)\,x > 0. \tag{1}$$

We may be tempted to divide by $x$, as we would do with an algebraic equation, to obtain

$$x^2 + 1 > 0 \tag{2}$$

which is true for all $x$ since $x^2$ is non-negative. We may conclude that $f(x) > 0$ for all $x \neq 0$. (For $x = 0$ we have $f(x) = 0$.) However, this reasoning is wrong: if $x$ is a negative number, then division by $x$ reverses the inequality in (1). So, if $x < 0$ then $x^2 + 1$ must also be negative to have $f(x) > 0$, i.e., when we divide by a negative $x$ the inequality sign is reversed and we obtain

$$x^2 + 1 < 0 \tag{3}$$

instead of (2). Inequality (3) is not satisfied for any $x$.
Putting together our results for positive and negative $x$ we see that $f(x) > 0$ if and only if $x > 0$.

There is a much simpler way to get to this result which avoids any algebraic manipulation of inequalities: $f(x)$ is the product of the two factors $x^2 + 1$ and $x$. In order for $f(x)$ to be positive, either both factors must be positive or both must be negative. Since $x^2 + 1$ is always positive, $x$ must be positive.

The method of sign diagrams uses this simple type of consideration. It requires us, however, to find a “nice” factorization of the relevant polynomial $f$, which, in turn, requires us to find its roots. In this lecture we will first discuss

1. How sign diagrams can be used to decide where a polynomial takes positive or negative values.
2. How we can find the roots of a polynomial (i.e., the solutions to the equation $f(x) = 0$) in special cases.
3. How roots help us to factorize a polynomial.

Thereafter we demonstrate how these methods can be used to determine the important qualitative features of a polynomial function without drawing tables or evaluating the polynomial at many points. We present this information with the aid of a summary sign diagram.

1 Sign diagrams for polynomial functions

Consider the polynomial

$$f(x) = (x + 5)(x - 2)^2(-2x + 6) = -2x^4 + 4x^3 + 38x^2 - 136x + 120.$$

Obviously, the roots are $x = -5$, $x = 2$ and $x = 3$. To find out where $f(x)$ is positive or negative we draw a sign diagram. This is a table with one column for each root, one column for each interval between the roots, one column for the numbers to the left of all roots and one column for the numbers to the right of all roots. There is one row for each factor of the polynomial and a final row for the polynomial itself. The entries in the table are $+$, $-$ or $0$. For each factor it is easy to decide where it is positive, negative or zero and hence to make the corresponding entry in the table. Once we know the signs of all factors in an interval, we know the sign of $f(x)$ in this interval.
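The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `sign` and `sign_diagram` and the representation of a linear factor $ax + b$ as a pair `(a, b)` are choices made for this sketch and do not appear in the lecture, and the sketch only handles products of linear factors. It is applied to the factors $(x + 5)$, $(x - 2)$, $(x - 2)$ and $(-2x + 6)$.

```python
from fractions import Fraction


def sign(value):
    """Return '+', '-' or '0' depending on the sign of a number."""
    if value > 0:
        return "+"
    if value < 0:
        return "-"
    return "0"


def sign_diagram(factors):
    """Sign diagram for a product of linear factors (hypothetical helper).

    `factors` is a list of pairs (a, b), each standing for a*x + b with a != 0;
    a repeated factor is simply listed repeatedly.
    Returns (columns, rows): the column labels and, for every factor and for
    the product, the list of signs in those columns.
    """
    roots = sorted({Fraction(-b, a) for a, b in factors})

    # One test point per column: left of all roots, each root,
    # the midpoint of each gap between roots, right of all roots.
    points = [roots[0] - 1]
    columns = [f"x < {roots[0]}"]
    for i, r in enumerate(roots):
        points.append(r)
        columns.append(f"x = {r}")
        if i + 1 < len(roots):
            points.append((r + roots[i + 1]) / 2)
            columns.append(f"{r} < x < {roots[i + 1]}")
    points.append(roots[-1] + 1)
    columns.append(f"{roots[-1]} < x")

    rows = []
    for a, b in factors:
        rows.append((f"{a}*x{b:+d}", [sign(a * x + b) for x in points]))

    # The sign of the product follows from the signs of the factors.
    product_signs = []
    for x in points:
        value = 1
        for a, b in factors:
            value *= a * x + b
        product_signs.append(sign(value))
    rows.append(("product", product_signs))
    return columns, rows


factors = [(1, 5), (1, -2), (1, -2), (-2, 6)]
columns, rows = sign_diagram(factors)
print(" | ".join(columns))
for label, signs in rows:
    print(f"{label:>8}: {' '.join(signs)}")
```

Evaluating each factor at one test point per column is enough because a linear factor changes sign only at its root. Exact fractions are used so that a root such as $3 = 6/2$ is hit exactly rather than approximately.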
In our example

$$
\begin{array}{l|ccccccc}
 & x < -5 & x = -5 & -5 < x < 2 & x = 2 & 2 < x < 3 & x = 3 & 3 < x \\
\hline
x + 5 & - & 0 & + & + & + & + & + \\
x - 2 & - & - & - & 0 & + & + & + \\
x - 2 & - & - & - & 0 & + & + & + \\
-2x + 6 & + & + & + & + & + & 0 & - \\
\hline
f(x) & - & 0 & + & 0 & + & 0 & -
\end{array}
$$

The signs for the factor $-2x + 6$ are obtained as follows: A linear factor changes sign only once, namely at its root, which is here $x = 3$ (since $-2x + 6 = 0$ yields $6 = 2x$). For $x = 4$ we have $-2x + 6 = -2 < 0$. Therefore $-2x + 6$ is negative to the right of $x = 3$, and it must be positive to the left of the root. (Check: For $x = 2$ we have indeed $-2x + 6 = 2 > 0$.)

For $x < -5$ and for $3 < x$ the polynomial $f(x)$ is negative because it has an odd number of negative factors. For $-5 < x < 2$ and for $2 < x < 3$ the polynomial is positive because it has an even number of negative factors. A look at the graph of $y = f(x)$ confirms our results:

[Figure: graph of y = f(x), shown over a wide range of x and in a close-up for x between 1.6 and 3.2.]

Problem 3 Construct the sign diagram of the polynomial

$$f(x) = -3(x + 1)^3(x - 1)^2(x - 4) = -3x^6 + 9x^5 + 18x^4 - 18x^3 - 27x^2 + 9x + 12$$

2 Finding roots of a polynomial

The hard work is to find the roots of a polynomial and to factorize it. Except for linear or quadratic polynomials, we restrict ourselves to methods which work only in special cases. Nonetheless, we start with a very deep and general result in algebra.

2.1 The fundamental theorem of algebra

Gauss (1777–1855): Every non-constant polynomial can be written as a product of linear factors and quadratic factors with no real roots.² As a consequence, the roots of a polynomial are precisely the roots of its linear factors.

Example 1 Solution 1

$$x^4 - 1 = (x^2 + 1)(x^2 - 1) = (x^2 + 1)(x + 1)(x - 1)$$

using twice the always important formula $a^2 - b^2 = (a + b)(a - b)$. Here the quadratic factor $x^2 + 1$ has no real roots.

Example 2

$$x^8 - 1 = (x^4 + 1)(x^4 - 1) = (x^4 + 1)(x^2 + 1)(x + 1)(x - 1)$$

where the polynomial $x^4 + 1$ has no real roots and must hence be the product of two quadratic polynomials with no real roots.
This factorization is harder to find; however,

$$
\begin{aligned}
\left(x^2 + \sqrt{2}x + 1\right)\left(x^2 - \sqrt{2}x + 1\right) ={}& x^4 + \sqrt{2}x^3 + x^2 \\
& - \sqrt{2}x^3 - 2x^2 - \sqrt{2}x \\
& + x^2 + \sqrt{2}x + 1 \\
={}& x^4 + 1
\end{aligned}
$$

so

$$x^8 - 1 = \left(x^2 + \sqrt{2}x + 1\right)\left(x^2 - \sqrt{2}x + 1\right)(x^2 + 1)(x + 1)(x - 1)$$

where the quadratic factors are easily seen to have no real roots.

² The term “real roots” is used to emphasise that we do not consider “imaginary roots” like $\sqrt{-6}$. One can actually calculate with such numbers in a meaningful way. However, they do not represent points on the number line and are hence difficult to interpret economically.

2.2 Roots of linear polynomials

The root of a linear polynomial $f(x) = ax + b$ with $a \neq 0$ is $x_0 = -\frac{b}{a}$.

2.3 Roots of quadratic polynomials

The roots of a quadratic polynomial $f(x) = ax^2 + bx + c$ with $a \neq 0$ are given by

$$x_{1/2} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

When the discriminant $b^2 - 4ac$ is negative there are no real roots.

Suppose $x_1, x_2$ are the roots of a quadratic polynomial. Then one has the formulas of Vieta (1540–1603)

$$x_1 + x_2 = -\frac{b}{a} \quad \text{and} \quad x_1 x_2 = \frac{c}{a}$$

and the factorization is

$$f(x) = a(x - x_1)(x - x_2)$$

since

$$a(x - x_1)(x - x_2) = a\left[x^2 - (x_1 + x_2)x + x_1 x_2\right] = ax^2 + bx + c$$

by Vieta’s formulas.

Supplementary useful information on quadratic functions: The graph of a quadratic function is called a parabola. If $a > 0$ the function is strictly convex with a unique minimum at $x^* = -\frac{b}{2a}$. If $a < 0$ the function is strictly concave with a unique maximum at $x^* = -\frac{b}{2a}$. The parabola is mirror-symmetric about the vertical line $x = x^*$ through the maximum/minimum, i.e., one has $f(z + x^*) = f(-z + x^*)$ for all $z$. The minimum or maximum is always in the middle between the two roots $x_1, x_2$, when the two exist, because $x^* = \frac{x_1 + x_2}{2}$ by Vieta’s formula.

[Figure: left panel, a convex parabola; right panel, a concave parabola.]

Problem 4 Suppose the government imposes an excise tax $t$, where $t$ is the percentage of the price charged to consumers.
a) What is tax revenue when the tax is t = 0%?
b) What is tax revenue when the tax is t = 100%?
c) Suppose tax revenue is a quadratic function of the excise tax t imposed. What excise tax then maximizes tax revenue?

Solution 2 ?

2.4 Polynomials of higher order: historic remarks

Greece (ca. 1000 BC – 600 AD): The early Greeks were the masters of geometry. They studied the quadratic curves (parabolas, hyperbolas, ellipses) as intersections of a plane with a cone.

[Figure: conic sections: hyperbolas, parabolas, circles or ellipses.]

Orient (ca. 600 AD – 1500 AD): Algebra was invented in the Islamic countries.

Muhamed ibn Musa al-Khwarizmi (∼825): “Hisab al-jabr wal-muqabala” (“The Science of Reduction and Mutual Cancellations”); ‘al-jabr’ → ‘algebra’. He verbally discussed equations like $3x + 4 = x^2$.

Omar Khayyam (ca. 1038–1123): He found solutions to cubic equations in special cases.

Europe (Renaissance): “mathematical entrepreneurs”

Scipio del Ferro (died 1526): reported to have known the solutions to all cubic equations, but never told anyone how to do it.

Tartaglia (the “stutterer”) rediscovered the general method to solve all cubic equations and announced in a public lecture that he could do it (but not how). He explained his findings to Hieronimo Cardano in a private conversation. To Tartaglia’s great dismay Cardano published the results in a book (1545). The formulae are now named after Cardano. For instance, the equation

$$x^3 + px = q, \qquad p, q > 0$$

has a unique real root, namely

$$x = \sqrt[3]{\sqrt{\frac{p^3}{27} + \frac{q^2}{4}} + \frac{q}{2}} - \sqrt[3]{\sqrt{\frac{p^3}{27} + \frac{q^2}{4}} - \frac{q}{2}}$$

These formulae are too bulky for exam purposes!

In 1547 Ferrari found the general solution to polynomial equations of order 4 (‘biquadratic equations’).

Evariste Galois (1811–1832): Was not considered to be a student who could express himself very well. The night before he died in a duel (aged 20) he wrote down his mathematical ideas and asked for them to be sent to Gauss. This “gibberish” (his own words) is now known as “Galois theory”.
It has been used to settle many open problems, for instance that squaring the circle is impossible and that polynomial equations of degree 5 or higher cannot be solved by formulae using roots ($\sqrt{\ }$, $\sqrt[3]{\ }$, etc.).

Niels Henrik Abel (1802–1829): As a student he believed he had solved polynomial equations of degree 5, but he later proved that this is not possible. He died from consumption and poverty.

Of course, numerical methods to solve such equations approximately exist.

2.5 Integer roots of integer polynomials

Integer roots of a polynomial with integer coefficients divide the constant term.

Proof: Suppose that $x_0$ is an integer root of the polynomial $f(x) = a_n x^n + \dots + a_1 x + a_0$ with integer coefficients. Then

$$f(x_0) = a_n x_0^n + \dots + a_1 x_0 + a_0 = 0$$

which can be rewritten as

$$-a_n x_0^n - \dots - a_1 x_0 = a_0$$

or as

$$\underbrace{\left(-a_n x_0^{n-1} - \dots - a_1\right)}_{\text{integer}} \times \underbrace{x_0}_{\text{integer}} = \underbrace{a_0}_{\text{integer}}$$

So $x_0$ is a factor of $a_0$.

Problem 5 Find a root of $f(x) = x^3 + 10x^2 + 31x + 30$.

Solution 3 $30 = 2 \times 3 \times 5$. Try $x = \pm 1, \pm 2, \pm 3, \pm 5, \pm 6, \pm 10, \pm 15, \pm 30$. Since all coefficients are positive, $f(x) > 0$ for every positive $x$.

$$
\begin{array}{lcl}
f(1) & > & 0 \\
f(-1) & = & -1 + 10 - 31 + 30 = 8 > 0 \\
f(2) & > & 0 \\
f(-2) & = & -8 + 40 - 62 + 30 = 0 \quad \text{HIT!} \\
f(3) & > & 0 \\
f(-3) & = & -27 + 90 - 93 + 30 = 0 \quad \text{HIT!} \\
f(5) & > & 0 \\
f(-5) & = & -125 + 250 - 155 + 30 = 0 \quad \text{HIT!}
\end{array}
$$

The roots are $x = -2$, $x = -3$ and $x = -5$, so the factorization must be $(x + 2)(x + 3)(x + 5)$.

Problem 6 Find all roots of $f(x) = x^3 - 3x^2 - 25x + 75$. (There can be at most 3!)

Solution 4 ?

2.6 Reducing the degree for even polynomials

A polynomial is called even if it has only even powers. For such polynomials the substitution $z = x^2$ leads to a polynomial of only half the degree, for which roots are easier to find.

Problem 7 Find at least two roots of $f(x) = x^5 - 5x^3 + 6x$.

Solution 5 This is an odd polynomial with factor $x$ and root $x = 0$. To find further roots we divide by $x$ and obtain the polynomial $f(x) = x^4 - 5x^2 + 6x^0$, which is even (we reuse the name $f$ for this quotient). Using the substitution $z = x^2$ we have

$$f\left(\pm\sqrt{z}\right) = z^2 - 5z + 6.$$

This is a quadratic polynomial. Since $6 = 2 \times 3$ and $-2 - 3 = -5$ it has the roots $z = 2$ and $z = 3$.
Hence $x = \pm\sqrt{2}$ and $x = \pm\sqrt{3}$ are roots of the original polynomial. In fact,

$$\left(x - \sqrt{2}\right)\left(x + \sqrt{2}\right)\left(x - \sqrt{3}\right)\left(x + \sqrt{3}\right) = (x^2 - 2)(x^2 - 3) = x^4 - 2x^2 - 3x^2 + 6 = f(x)$$

3 Factorization

3.1 Polynomial division

Polynomial division is similar to the division of two integers with remainder ($100 \div 7 = 14\tfrac{2}{7}$ etc.), only simpler. To divide the polynomial $f(x) = 6x^6 + 5x^5 + 4x^4 + 3x^3 + 2x^2 + x + 1$ by the polynomial $g(x) = x^3 + x^2 + x + 1$ we proceed as follows:

$$
\begin{array}{r|rrrrrrrl}
 & 6x^3 & -x^2 & -x & -1 & & & & \\
\hline
x^3 + x^2 + x + 1 & 6x^6 & +5x^5 & +4x^4 & +3x^3 & +2x^2 & +x & +1 & R_0 = f \\
 & 6x^6 & +6x^5 & +6x^4 & +6x^3 & & & & \\
\hline
 & & -x^5 & -2x^4 & -3x^3 & +2x^2 & +x & +1 & R_1 \\
 & & -x^5 & -x^4 & -x^3 & -x^2 & & & \\
\hline
 & & & -x^4 & -2x^3 & +3x^2 & +x & +1 & R_2 \\
 & & & -x^4 & -x^3 & -x^2 & -x & & \\
\hline
 & & & & -x^3 & +4x^2 & +2x & +1 & R_3 \\
 & & & & -x^3 & -x^2 & -x & -1 & \\
\hline
 & & & & & 5x^2 & +3x & +2 & R_4
\end{array}
$$

We divide the leading term $6x^6$ of the polynomial $f(x)$ by the leading term $x^3$ of $g(x)$, which gives $6x^3$. We write $6x^3$ above the term $6x^6$ of $f(x)$. Then we multiply each term of $g(x)$ with $6x^3$ and write it below $f(x)$. Then we form the difference between $f(x)$ and the polynomial written below. This gives our first, intermediate remainder $R_1(x)$. $R_1(x)$ has one degree less than $f$ because the term $6x^6$ cancels in the subtraction.

We now proceed with $R_1(x)$, and with every successive remainder, in the same fashion as we did before with $f(x)$. Namely, we divide the leading term $-x^5$ of $R_1(x)$ by the leading term $x^3$ of $g(x)$ and write the result $-x^2$ above $f(x)$. Then we multiply each term of $g(x)$ with $-x^2$ and write the result below $R_1(x)$. We subtract to obtain the next remainder $R_2(x)$, which is a polynomial of degree 4.

We continue in this fashion to obtain $R_3(x)$ and $R_4(x)$. $R_4(x) = 5x^2 + 3x + 2$ is a polynomial of degree 2. If we tried to divide the leading term $5x^2$ of $R_4(x)$ by the leading term $x^3$ of $g(x)$ we would get $\frac{5}{x}$, which is no longer a polynomial. Hence we stop once the degree of the remainder is less than the degree of the polynomial $g(x)$ by which we want to divide.

Our final remainder is $R(x) = R_4(x)$. Denoting the polynomial above $f(x)$ by $h(x) = 6x^3 - x^2 - x - 1$ we have

$$\frac{f(x)}{g(x)} = h(x) + \frac{R(x)}{g(x)}$$

where in the last fraction the degree of the numerator $R(x)$ is smaller than the degree of the denominator. Notice again the similarity to the division of integers: If we divide 100 by 7 we obtain $14\tfrac{2}{7}$, where in the fraction $\tfrac{2}{7}$ the numerator is smaller than the denominator.

Suppose we can divide a polynomial $f(x)$ by a polynomial $g(x)$ without remainder, i.e., the remainder $R(x)$ is zero. Then

$$\frac{f(x)}{g(x)} = h(x) \quad \text{or} \quad f(x) = g(x)\,h(x).$$

So we have factorized $f(x)$.
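The division procedure of this section can also be sketched in Python, which gives a way to check a hand computation. This is a minimal sketch, not a definitive implementation: the function names `poly_divmod` and `poly_eval` and the representation of a polynomial as a list of coefficients (highest power first) are assumptions made for this illustration. The coefficients of $f$ and $g$ are read off the first row and the divisor of the division scheme above.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Divide polynomial f by polynomial g with remainder (hypothetical helper).

    Both polynomials are coefficient lists, highest power first, and the
    leading coefficient of g must be nonzero.
    Returns (quotient, remainder) with len(remainder) < len(g).
    """
    remainder = [Fraction(c) for c in f]
    quotient = []
    while len(remainder) >= len(g):
        # Divide the leading term of the current remainder by that of g.
        factor = remainder[0] / g[0]
        quotient.append(factor)
        # Subtract factor * g, aligned at the leading term.
        for i, c in enumerate(g):
            remainder[i] -= factor * c
        # The leading term has cancelled, so the degree drops by one.
        remainder.pop(0)
    return quotient, remainder


def poly_eval(p, x):
    """Evaluate a coefficient list at x using Horner's scheme."""
    value = 0
    for c in p:
        value = value * x + c
    return value


f = [6, 5, 4, 3, 2, 1, 1]   # 6x^6 + 5x^5 + 4x^4 + 3x^3 + 2x^2 + x + 1
g = [1, 1, 1, 1]            # x^3 + x^2 + x + 1

quotient, remainder = poly_divmod(f, g)
print("quotient: ", [int(c) for c in quotient])
print("remainder:", [int(c) for c in remainder])

# Check f(x) = h(x) * g(x) + R(x) at several points.
for x in range(-3, 4):
    assert poly_eval(f, x) == poly_eval(quotient, x) * poly_eval(g, x) + poly_eval(remainder, x)
```

The loop stops as soon as the remainder has fewer coefficients than $g$, which is the stopping rule stated above: the degree of the remainder is then less than the degree of the divisor. A remainder consisting only of zeros would mean that $g$ is a factor of $f$.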