Intermediate Microeconomic Theory
Tools and Step-by-Step Examples

Ana Espinola-Arredondo and Felix Muñoz-Garcia

The MIT Press
Cambridge, Massachusetts
London, England

© 2020 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

This book was set in Times New Roman by Westchester Publishing Services.

Library of Congress Cataloging-in-Publication Data
Names: Espinola-Arredondo, Ana, author. | Muñoz-Garcia, Felix, author.
Title: Intermediate microeconomic theory : tools and step-by-step examples / Ana Espinola-Arredondo and Felix Muñoz-Garcia.
Description: Cambridge, Massachusetts : MIT Press, [2020] | Includes bibliographical references and index.
Identifiers: LCCN 2019053969 | ISBN 9780262044233 (hardcover)
Subjects: LCSH: Microeconomics.
Classification: LCC HB172 .E855 2020 | DDC 338.5--dc23
LC record available at https://lccn.loc.gov/2019053969

Contents

Chapter Examples
Preface
  Organization of the Book
  How to Use This Textbook
  Ancillary Materials
  Acknowledgments

1 Introduction
  1.1 What Is Microeconomics?
  1.2 Comparative Statics
  1.3 Overview of the Book
    1.3.1 Consumer Theory
    1.3.2 Production Theory
    1.3.3 Markets—Putting Consumers and Producers Together
    1.3.4 Strategy—Let’s Play Games!
    1.3.5 Putting Game Theory to Work
    1.3.6 More Market Failures—When Markets Work Well and When They Don’t

2 Consumer Preferences and Utility
  2.1 Introduction
  2.2 Bundles
  2.3 Preferences for Bundles
    2.3.1 Ranking Bundles with More Units
    2.3.2 Satiation and Bliss Points
  2.4 Utility Functions
  2.5 Marginal Utility
    2.5.1 Diminishing Marginal Utility
  2.6 Indifference Curves
    2.6.1 Properties of Indifference Curves
  2.7 Marginal Rate of Substitution
    2.7.1 Diminishing MRS
  2.8 Special Types of Utility Functions
    2.8.1 Perfect Substitutes
    2.8.2 Perfect Complements
    2.8.3 Cobb-Douglas
    2.8.4 Quasilinear
    2.8.5 Stone-Geary
  2.9 A Look at Behavioral Economics—Social Preferences
    2.9.1 Fehr-Schmidt Social Preferences
    2.9.2 Bolton and Ockenfels Social Preferences
  Appendix. Finding the Marginal Rate of Substitution
  Exercises

3 Consumer Choice
  3.1 Introduction
  3.2 Budget Constraint
  3.3 Utility Maximization Problem
  3.4 Utility Maximization Problem in Extreme Scenarios
  3.5 Revealed Preference
  3.6 Kinked Budget Lines
    3.6.1 Quantity Discounts
    3.6.2 Introducing Coupons
  Appendix A. Applying the Lagrange Method to Solve the Utility Maximization Problem
  Appendix B. Expenditure Minimization Problem
    Relationship between the Utility Maximization Problem and the Expenditure Minimization Problem
  Exercises

4 Substitution and Income Effects
  4.1 Introduction
  4.2 Income Changes
    4.2.1 Using the Derivative of Demand
    4.2.2 Using Income Elasticity
    4.2.3 Using the Income-Consumption Curve
    4.2.4 Using the Engel Curve
  4.3 Price Changes
    4.3.1 Using the Derivative of Demand
    4.3.2 Using the Price-Elasticity of Demand
    4.3.3 Using Price-Consumption Curves
  4.4 Income and Substitution Effects
  4.5 Putting Income and Substitution Effects Together
    4.5.1 Income and Substitution Effects on the Labor Market
  Appendix A. Not All Goods Can Be Inferior
  Appendix B. An Alternative Representation of Income and Substitution Effects
    Using Elasticities to Represent the Slutsky Equation
  Exercises

5 Measuring Welfare Changes
  5.1 Introduction
  5.2 Consumer Surplus
  5.3 Compensating Variation
  5.4 Equivalent Variation
  5.5 Measuring Welfare Changes with No Income Effects
  Appendix. An Alternative Representation of the Compensating and Equivalent Variations
    A.1 Compensating Variation
    A.2 Equivalent Variation
  Exercises

6 Choice under Uncertainty
  6.1 Introduction
  6.2 Lotteries
  6.3 Expected Value
  6.4 Variance
  6.5 Expected Utility
  6.6 Risk Attitudes
    6.6.1 Risk Aversion
    6.6.2 Risk Loving
    6.6.3 Risk Neutrality
  6.7 Measuring Risk
    6.7.1 Risk Premium
    6.7.2 Certainty Equivalent
    6.7.3 Arrow-Pratt Coefficient of Absolute Risk Aversion
  6.8 A Look at Behavioral Economics—Nonexpected Utility
    6.8.1 Weighted Utility
    6.8.2 Prospect Theory
  Exercises

7 Production Functions
  7.1 Introduction
  7.2 Production Function
  7.3 Marginal and Average Product
  7.4 Relationship between AP_L and MP_L
  7.5 Isoquants
  7.6 Marginal Rate of Technical Substitution
  7.7 Special Types of Production Functions
    7.7.1 Linear Production Function
    7.7.2 Fixed-Proportions Production Function
    7.7.3 Cobb-Douglas Production Function
    7.7.4 Constant Elasticity of Substitution Production Function
  7.8 Returns to Scale
  7.9 Technological Progress
    7.9.1 Types of Technological Progress
  Appendix A. MRTS as the Ratio of Marginal Products
  Appendix B. Elasticity of Substitution
  Exercises

8 Cost Minimization
  8.1 Introduction
  8.2 Isocost Lines
  8.3 Cost-Minimization Problem
  8.4 Input Demands
    8.4.1 Input Demand—Responses
  8.5 Cost Functions
  8.6 Types of Costs
  8.7 Average and Marginal Cost
    8.7.1 Output Elasticity to Total Cost
  8.8 Economies of Scale, Scope, and Experience
    8.8.1 Economies of Scale
    8.8.2 Economies of Scope
    8.8.3 Economies of Experience
  Appendix. Cost-Minimization Problem—A Lagrangian Analysis
  Exercises

9 Partial and General Equilibrium
  9.1 Introduction
  9.2 Features of Perfectly Competitive Markets
  9.3 Profit Maximization Problem
  9.4 Supply Curves
    9.4.1 Individual Firm Supply
    9.4.2 Market Supply
  9.5 Short-Run Supply Curve
  9.6 Market Equilibrium
    9.6.1 Short-Run Equilibrium
    9.6.2 Long-Run Equilibrium
  9.7 Producer Surplus
  9.8 General Equilibrium
    9.8.1 Equilibrium Prices
    9.8.2 Efficient Allocations
    9.8.3 Equilibrium versus Efficiency
    9.8.4 Adding Production to the Economy
  9.9 A Look at Behavioral Economics—Market Experiments
  Appendix. Efficient Allocations and Marginal Rate of Substitution
  Exercises

10 Monopoly
  10.1 Introduction
  10.2 Why Do Monopolies Exist?
  10.3 The Monopolist’s Profit Maximization Problem
    10.3.1 A Closer Look at Marginal Revenue
    10.3.2 Solving the Monopolist’s Problem
  10.4 Common Misunderstandings of Monopoly Markets
  10.5 The Lerner Index and Inverse Elasticity Pricing Rule
  10.6 Multiplant Monopoly
  10.7 Welfare Analysis under Monopoly
  10.8 Advertising in Monopoly
  10.9 Monopsony
  Exercises

11 Price Discrimination and Bundling
  11.1 Introduction
  11.2 Price Discrimination
    11.2.1 First-Degree Price Discrimination
    11.2.2 Second-Degree Price Discrimination
    11.2.3 Third-Degree Price Discrimination
  11.3 Bundling
  Exercises

12 Simultaneous-Move Games
  12.1 Introduction
  12.2 What Is a Game?
  12.3 Strategic Dominance
  12.4 Nash Equilibrium
  12.5 Common Games
  12.6 Mixed-Strategy Nash Equilibrium
    12.6.1 Graphical Representation of Best Responses
  Exercises

13 Sequential and Repeated Games
  13.1 Introduction
  13.2 Game Trees
  13.3 Why Don’t We Just Find the Nash Equilibrium of the Game Tree?
  13.4 Subgame-Perfect Equilibrium
    13.4.1 Subgame Perfect Equilibrium in More Involved Games
  13.5 Repeated Games
    13.5.1 Finite Repetitions
    13.5.2 Infinite Repetitions
  13.6 A Look at Behavioral Economics—Cooperation in the Experimental Lab?
  Exercises

14 Imperfect Competition
  14.1 Introduction
  14.2 Measuring Market Power
  14.3 Models of Imperfect Competition
    14.3.1 Cournot Model—Simultaneous Quantity Competition
    14.3.2 Bertrand Model—Simultaneous Price Competition
    14.3.3 Cartels and Collusion
  14.4 Stackelberg Model—Sequential Quantity Competition
  14.5 Product Differentiation
  Appendix. Cournot Model with N Firms
  Exercises

15 Games of Incomplete Information and Auctions
  15.1 Introduction
  15.2 Extending Nash Equilibria to Games of Incomplete Information
  15.3 Auctions
    15.3.1 Auctions as Allocation Mechanisms
  15.4 Second-Price Auctions
  15.5 First-Price Auctions
    15.5.1 Privately Observed Valuations
    15.5.2 Equilibrium Bidding in First-Price Auctions
    15.5.3 Extending the First-Price Auction to N Bidders
    15.5.4 First-Price Auctions with Risk-Averse Bidders
  15.6 Efficiency in Auctions
  15.7 Common-Value Auctions
  15.8 A Look at Behavioral Economics—Experiments with Auctions
  Appendix. First-Price Auctions in More General Settings
  Exercises

16 Contract Theory
  16.1 Introduction
  16.2 Moral Hazard
    16.2.1 Contracts When Effort Is Observable
    16.2.2 Contracts When Effort Is Unobservable
    16.2.3 Preventing Moral Hazard
  16.3 Adverse Selection
    16.3.1 Market for Lemons
    16.3.2 Market for Lemons—Symmetric Information
    16.3.3 Market for Lemons—Asymmetric Information
    16.3.4 Principal-Agent Model
    16.3.5 Principal-Agent Model—Symmetric Information
    16.3.6 Principal-Agent Model—Asymmetric Information
    16.3.7 Principal-Agent Model—Comparing Information Settings
    16.3.8 Preventing Adverse Selection
  Appendix. Showing That PC_H and IC_L Hold with Equality
  Exercises

17 Externalities and Public Goods
  17.1 Introduction
  17.2 Externalities
    17.2.1 Unregulated Equilibrium
    17.2.2 Social Optimum
  17.3 Restoring the Social Optimum
    17.3.1 Bargaining between the Affected Parties
    17.3.2 Government Intervention
  17.4 Public Goods
    17.4.1 A Look at Behavioral Economics—Public-Good Experiments
  17.5 Common-Pool Resources
    17.5.1 Finding Equilibrium Appropriation
    17.5.2 Common-Pool Resources—Joint Profit Maximization
  Exercises

References
Index

Chapter Examples

Chapter 2: Consumer Preferences and Utility
  Example 2.1: Monotonic and strictly monotonic preferences.
  Example 2.2: Nonsatiated preferences.
  Example 2.3: Utility ranking and increasing transformations of the utility function.
  Example 2.4: Testing properties of preference relations.
  Example 2.5: Finding marginal utility, MU.
  Example 2.6: Diminishing marginal utility.
  Example 2.7: Finding ICs for two utility functions.
  Example 2.8: Finding MRS.

Chapter 3: Consumer Choice
  Example 3.1: UMP with interior solutions–I.
  Example 3.2: UMP with interior solutions–II.
  Example 3.3: UMP with corner solutions.
  Example 3.4: Testing for WARP.
  Example 3.5: Quantity discounts.
  Example 3.6: Coupons.
  Example 3.7: EMP with a Cobb-Douglas utility function.
  Example 3.8: EMP with a quasilinear utility.
  Example 3.9: Dual problems.

Chapter 4: Substitution and Income Effects
  Example 4.1: Increasing income in a Cobb-Douglas utility function.
  Example 4.2: Finding income elasticity in the Cobb-Douglas scenario.
  Example 4.3: Finding income-consumption curves.
  Example 4.4: Finding Engel curves.
  Example 4.5: Demand and price changes.
  Example 4.6: Price elasticity and demand.
  Example 4.7: Finding price-consumption curves.
  Example 4.8: Finding IE and SE with a Cobb-Douglas utility function.
  Example 4.9: Finding IE and SE with a quasilinear utility.
  Example 4.10: Applying the Slutsky equation to the Cobb-Douglas case.

Chapter 5: Measuring Welfare Changes
  Example 5.1: Finding CS with linear demand.
  Example 5.2: Finding CS with nonlinear demand.
  Example 5.3: Finding the CV of a price decrease.
  Example 5.4: Finding the EV of a price decrease.
  Example 5.5: CS, CV, and EV with a quasilinear utility function.
  Example 5.6: An alternative representation of CV.
  Example 5.7: An alternative representation of EV.

Chapter 6: Choice under Uncertainty
  Example 6.1: Finding the EV of a lottery.
  Example 6.2: Finding the variance of a lottery.
  Example 6.3: Finding the EU of a lottery.
  Example 6.4: Finding the EU of a lottery under risk-loving preferences.
  Example 6.5: Finding the EU of a lottery under risk-neutral preferences.
  Example 6.6: Finding the RP of a lottery.
  Example 6.7: Measuring RP and CE with other risk attitudes.
  Example 6.8: Finding the AP coefficient.
  Example 6.9: The certainty effect.
  Example 6.10: Weighted utility.
  Example 6.11: Using WU to explain the certainty effect.
  Example 6.12: Prospect theory.
  Example 6.13: Using prospect theory to explain the certainty effect.

Chapter 7: Production Functions
  Example 7.1: Examples of production functions.
  Example 7.2: Finding average product.
  Example 7.3: Finding marginal product.
  Example 7.4: Relationship between AP_L and MP_L.
  Example 7.5: Finding isoquant curves for a Cobb-Douglas production function.
  Example 7.6: Finding the MRTS of a Cobb-Douglas production function.
  Example 7.7: Finding the MRTS of a linear production function.
  Example 7.8: Testing for returns to scale.
  Example 7.9: Testing for technological progress.
  Example 7.10: Identifying the type of technological progress.

Chapter 8: Cost Minimization
  Example 8.1: A particular isocost.
  Example 8.2: Cost minimization with Cobb-Douglas production functions.
  Example 8.3: Cost minimization with linear production functions.
  Example 8.4: Finding input demands with a Cobb-Douglas production function.
  Example 8.5: Finding input demands with a linear production function.
  Example 8.6: Finding total cost in the Cobb-Douglas case.
  Example 8.7: Finding total costs in the linear production case.
  Example 8.8: Comparing long- and short-run costs.
  Example 8.9: Finding average and marginal cost.
  Example 8.10: Output elasticity in the Cobb-Douglas case.
  Example 8.11: Testing for economies of scale.
  Example 8.12: Economies of scope.
  Example 8.13: Slope of the experience curve.

Chapter 9: Partial and General Equilibrium
  Example 9.1: PMP in the Cobb-Douglas case.
  Example 9.2: Finding the long-run supply curve.
  Example 9.3: Finding market supply.
  Example 9.4: Finding the short-run supply curve.
  Example 9.5: Finding short-run equilibrium output and price.
  Example 9.6: Finding long-run equilibrium output and price.
  Example 9.7: Finding producer surplus.
  Example 9.8: Finding an equilibrium allocation and price.
  Example 9.9: Finding efficient allocations.
  Example 9.10: Testing the First Welfare Theorem.
  Example 9.11: Testing the Second Welfare Theorem.

Chapter 10: Monopoly
  Example 10.1: Positive and negative effects of selling more units.
  Example 10.2: Finding marginal revenue with linear demand.
  Example 10.3: Finding monopoly output with linear demand.
  Example 10.4: Price elasticity of output q^M under a linear demand.
  Example 10.5: Lerner index with a linear demand.
  Example 10.6: Lerner index with constant elasticity demand.
  Example 10.7: Multiplant monopoly.
  Example 10.8: Finding the deadweight loss of a monopoly.
  Example 10.9: Finding the monopolist’s optimal advertising ratio.
  Example 10.10: Finding optimal L in monopsony.

Chapter 11: Price Discrimination and Bundling
  Example 11.1: First-degree price discrimination.
  Example 11.2: Second-degree price discrimination.
  Example 11.3: Third-degree price discrimination.
  Example 11.4: Bundling.

Chapter 12: Simultaneous-Move Games
  Example 12.1: Finding strictly dominant strategies.
  Example 12.2: When IDSDS does not provide a unique equilibrium.
  Example 12.3: When IDSDS does not have a bite.
  Example 12.4: Finding best responses and NEs.
  Example 12.5: Prisoner’s Dilemma game.
  Example 12.6: Battle of the Sexes game.
  Example 12.7: Coordination game.
  Example 12.8: Anticoordination game.
  Example 12.9: Penalty kicks in soccer.

Chapter 13: Sequential and Repeated Games
  Example 13.1: Applying NE to the Entry game.
  Example 13.2: Backward induction in the Entry game.
  Example 13.3: Applying backward induction in more involved game trees.
  Example 13.4: Sustaining cooperation with a Grim-Trigger Strategy.

Chapter 14: Imperfect Competition
  Example 14.1: Cournot model with symmetric costs.
  Example 14.2: Cournot model with asymmetric costs.
  Example 14.3: Bertrand model.
  Example 14.4: Collusion when firms compete in quantities.
  Example 14.5: Sustaining cooperation within the cartel.
  Example 14.6: Stackelberg model.
  Example 14.7: Output competition with product differentiation.

Chapter 15: Games of Incomplete Information and Auctions
  Example 15.1: Cournot competition, with asymmetric information about costs.

Chapter 16: Contract Theory
  Example 16.1: Finding optimal contracts when effort is observable.
  Example 16.2: Finding optimal contracts when effort is unobservable.
  Example 16.3: Principal-agent problem under symmetric information.
  Example 16.4: Principal-agent problem under asymmetric information.

Chapter 17: Externalities and Public Goods
  Example 17.1: Unregulated equilibrium.
  Example 17.2: Finding the social optimum.
  Example 17.3: Prohibiting pollution.
  Example 17.4: Finding optimal emission fees.
  Example 17.5: Free-riding of public goods.

Preface

This textbook offers an introduction to intermediate microeconomic theory for undergraduate students. Our presentation differs from current intermediate microeconomics textbooks, including Besanko and Braeutigam (2013), Varian (2014), Goolsbee, Levitt, and Syverson (2015), and Perloff (2016), along several dimensions:

• Length. The book is significantly shorter than most current books on this topic, which often exceed 830 pages. Most current textbooks include lengthy presentations, such as 45-page-long chapters. By providing shorter chapters, we seek to make the material more attractive to students, who can read each chapter (the material corresponding to approximately a week of the course) in less than one hour.

• Worked-out examples. Every chapter provides the basic theoretical elements, reducing them to their main ingredients, and includes several detailed examples and applications. The chapters also present the intuition behind each mathematical assumption and result.

• Tools. We provide step-by-step tools on how to solve standard exercises, so students can apply a common approach to solve similar exercises.

• Algebra support and step-by-step calculations. We assume readers have little mathematical background in algebra and calculus, so we walk them through each algebra step and simplification, helping them reproduce all the results on their own. From our recent experience, students’ calculus for this course is appropriate, but their algebra is often rusty. Hence, we give algebra steps and simplifications, making sure that students can more easily follow every step and recall basic algebra properties.

• Self-assessment exercises. The book includes 140 self-assessment exercises, which give readers the opportunity to review concepts from previous examples.
These questions encourage readers to repeat the step-by-step approach presented in each example, considering slightly new scenarios to gain extra practice. Students can then check their answers with the Practice Exercises for Intermediate Microeconomic Theory book.

• Practice Exercises for Intermediate Microeconomic Theory. This accompanying book provides detailed answer keys to all the self-assessment exercises, as well as the 173 odd-numbered end-of-chapter exercises. In addition, it offers step-by-step explanations that show students how to approach similar exercises on their own and emphasize the economic intuition behind the mathematical results. This is, then, radically different from solution manuals, which rarely provide detailed explanations, are difficult to read on their own, and are distributed only to instructors.

The combination of both textbooks seeks to help undergraduate students improve both their theoretical and practical preparation in intermediate microeconomics. Therefore, this book is especially attractive for students in programs in economics, business administration, finance, or related fields in social sciences. Given its step-by-step approach to examples and intuition, it should be appropriate for Intermediate Microeconomics courses, with or without calculus, as we kept the amount of calculus to a minimum.

Organization of the Book

Chapter 1 defines microeconomics, explains how it is used to examine different real-world problems, and provides an outline of the book. Chapters 2 and 3 are dedicated to consumer theory, first describing preference relations and utility functions (chapter 2), followed by a presentation of how individuals choose optimal bundles (chapter 3).
We then take a more applied approach by using the tools presented in previous chapters to examine how income or price changes affect consumer purchases (chapter 4), and how to evaluate the welfare gain or loss that consumers experience from a price change (chapter 5). Chapter 6 then investigates how to represent individual attitudes toward risk and uncertainty, and how to measure different risky situations.

Chapters 7 and 8 switch the focus toward the analysis of firms, first analyzing their production decisions, inputs, and technology (chapter 7), and then how to represent the firm’s costs and how to minimize them to find the optimal combination of inputs (chapter 8). Chapters 9 and 10 study two extreme types of markets: perfectly competitive markets in partial and general equilibrium (chapter 9) and monopolies where a single firm operates (chapter 10). Chapter 11 expands on the monopolist’s analysis by considering forms of price discrimination that the monopolist can practice, as well as bundling.

We then explain some basic game theory in chapter 12 (simultaneous-move games) and chapter 13 (sequential and repeated games). These two chapters serve as the building blocks for most of the subsequent chapters, starting with chapter 14, which examines markets with few firms, either competing in quantities or prices, choosing their actions simultaneously or sequentially, and selling products that are regarded as homogeneous or heterogeneous by their customers. Chapter 15 extends the analysis of games to contexts in which one (or all) players cannot observe some relevant piece of information (games of incomplete information). Auctions are an interesting application of this type of game, because every bidder observes her valuation for the object being sold, but cannot observe her rivals’ valuations for the object before submitting her bid. Chapter 16 examines contract theory and incentives, which are natural applications of game-theoretic tools as well.
Unfortunately, the presentation of this topic in most Intermediate Microeconomics textbooks is either too verbal, and thus does not provide precise equilibrium results, or too formal and difficult for the average student to grasp. We hope that this chapter strikes a balance between rigor and intuition. Finally, chapter 17 also uses the game theory tools practiced in previous chapters, applying them now to the study of externalities, public goods, and common-pool resources.

How to Use This Textbook

The writing style of the textbook, as well as the possibility of combining it with the Practice Exercises book, allows for flexible uses by instructors of the following courses:

• Intermediate Microeconomics with Calculus. This book probably fits best with this type of course. Instructors can recommend most chapters in the book as the main reading reference for students. In addition, instructors could assign the reading of specific exercises in the Practice Exercises book, which should help students better understand the application of the theoretical foundations, ultimately allowing them to become better prepared for homework assignments and exams.

• Intermediate Microeconomics without Calculus, or Managerial Economics. The book is also appropriate for instructors in this type of course because it includes little calculus, mostly contained in a few worked-out examples and end-of-chapter appendices. The instructor can also recommend some exercises from the Practice Exercises book, as some of these exercises do not require the use of calculus tools.

• Introduction to Microeconomics (for students in Honors programs, or with some mathematical background). The book can also be used in Introductory Microeconomics courses in programs with students with an algebra background. We believe that students should be comfortable with the style of this book after just one class in Mathematical Economics, which they often take in their freshman year of college.
Otherwise, students who took at least one algebra and one calculus course in high school should also be relatively comfortable given our step-by-step approach in all the calculations and worked-out examples.

• Managerial Economics (Master’s level). For instructors teaching Managerial Economics courses in Master of Science (MS) programs in finance, business administration, or related fields, the book can also serve as a direct, easy-to-follow reading reference. Some of the numerical exercises from the Practice Exercises book can also support this pedagogical strategy.

The length of this book can be attractive to instructors of the abovementioned courses who use a “flipped classroom” approach, as students can easily read the main theory and examples on their own before class, moving activities such as working on exercises and homework to class time. It can also be appealing to instructors using “case-based” teaching, who cover real-life case studies from companies in class, leaving the learning of the main theory and applications to students at home.

Ancillary Materials

• Practice Exercises for Intermediate Microeconomic Theory. This book includes step-by-step answer keys with intuitive explanations to all self-assessment exercises, and all odd-numbered exercises at the end of every chapter (342 exercises in total). This can be useful for students to practice with more exercises, seeing the common approach that we follow to solve them here.

• Solutions Manual for Intermediate Microeconomic Theory (available only to instructors). It includes step-by-step answer keys to the 140 self-assessment exercises and 341 end-of-chapter exercises (481 exercises in total). These exercises are ranked in order of difficulty, with A next to the title for the easiest exercises (often involving no calculus), B for the intermediate-level exercises, and C for the most difficult exercises, which require some calculus or several algebra steps.
• Microsoft PowerPoint slides (available only to instructors). They cover the main topics in every chapter of the textbook. The slides include all definitions, equations, short explanations, and figures, thus facilitating class preparation. Slides can also be distributed to students as a first set of lecture notes that they can complement with in-class explanations.

Acknowledgments

We would first like to thank several colleagues who encouraged us in the preparation of this manuscript: Alan Love, Ron Mittelhammer, Jill McCluskey, and Antonio Manresa. We are, of course, especially grateful to our teachers and advisors at the University of Pittsburgh and at the University of Barcelona for instilling a passion for research and teaching in microeconomics. We are thankful to Emily Taber, Laura Keeler, Melody Negron, and all the publishing team at the MIT Press, for their constant encouragement and support; to several teaching assistants at Washington State University, who helped us with this project over several years; and to a number of students who provided feedback on earlier versions of the manuscript: Eric Dunaway, John Strandholm, Pak-Sing Choi, Kiriti Kanjilal, Samantha Johnson, Hyoenjun Hwang, Loles Garrido, Mursaleen Shiraj, Chelsea Pardini, and Casey Bolt. Last, but not least, we would like to thank our family and friends for encouraging us during the preparation of the manuscript.

Ana Espinola-Arredondo and Felix Muñoz-Garcia
School of Economic Sciences, Washington State University, Pullman, WA

1 Introduction

1.1 What Is Microeconomics?

What is microeconomics, anyway? This is often the first question that many noneconomists ask you when you tell them that you are taking a microeconomics course or doing research in microeconomics. One of our friends humorously says that we study “small economies,” or goods and services that you can buy with a few cents. But, seriously, what is microeconomics, and how can we use it?
Microeconomics seeks to understand individual behavior, where we understand the “individual” to mean a consumer, a firm, a voter, a group of friends, a public official, or a regulator. This behavior is mainly in economic contexts, but also in other social situations. Here are some examples:

• Consumers. We seek to study consumer purchasing decisions. If you find out that your favorite singer will be in town, and that tickets are sold for $45, will you buy one? What about buying one more ticket so you can invite a friend? The answers to these questions depend not only on how much you like this singer (a lot!) but also on your budget (whether you can afford to spend money on one or two tickets this month).

• Firms. We also analyze firms’ input decisions (e.g., how many workers to hire and computers to purchase); how firms use these inputs to produce units of output with different technologies; how many units each firm chooses to produce; and at what price each firm sells these units. The answers to these questions will change, of course, depending on how many other firms compete in that market and on whether consumers regard the firm’s goods as relatively similar to those of its competitors.

• Regulators. With no regulation, public officials can anticipate how firms and consumers behave in different markets. The officials then ask whether policy tools, such as taxes or quotas on consumers or firms, can be beneficial. We return to these questions at different points throughout the book.

We investigate the behavior of these agents under the assumption of rationality: each agent seeks to maximize her payoff (i.e., utility for the consumer, or profits for the firm) given her resources, and given the information to which she has access.
Importantly, this understanding of rationality is sufficiently broad, so we can consider situations where an agent seeks to maximize her own material payoff, as well as other contexts where she maximizes a combination of her own and other agents’ payoffs. In other words, rationality implies only that the agent seeks to maximize some kind of payoff mix, allowing her to be selfish or altruistic. We consider consumers with different motivations (often referred to as “other-regarding motivations” and “biases”) in several sections of the book with the title “A Look at Behavioral Economics,” where we mainly present alternative theories of consumer behavior that the literature has proposed in recent decades; see the end of chapters 2, 6, 9, 13, 15, and 17.

1.2 Comparative Statics

Besides analyzing individual behavior under given conditions (e.g., a specific ticket price of $45), we seek to predict how this behavior varies when some of these conditions change. Generally, in economics we use the term “comparative statics” to measure how an individual’s behavior changes when we vary one, and only one, variable (such as the price of an item). In the above example about concert tickets, if your answer was to purchase two tickets (one for you and another for a friend) when the price is $45, would you make a different choice if the ticket price increases to $55?[1]

For an example concerning firms, imagine that a technology is discovered that decreases ice cream production costs. We then seek to understand the extent to which ice cream sellers respond by lowering their prices, and how the answer to this question changes depending on the competitive pressures that they face in the industry.
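The ticket question above can be written compactly. The following is only a sketch under our own illustrative notation, which is an assumption and not taken from later chapters: x(p, I) denotes the number of tickets you buy when the ticket price is p and your income is I.

```latex
% Hypothetical notation: x(p, I) is the number of tickets demanded
% at ticket price p and income I. The text states x(45, I) = 2.
\[
  \Delta x = x(55, I) - x(45, I), \qquad x(45, I) = 2 .
\]
% For a small price change, the same question is asked of the derivative,
% holding income I (and every other condition) fixed:
\[
  \frac{\partial x(p, I)}{\partial p} .
\]
```

A negative value of Δx would mean that you buy fewer tickets at the higher price. Because only p changes between the two scenarios, the comparison says nothing about what would happen if your income changed at the same time.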
1. Alternatively, think about ice cream purchases. If your favorite ice cream brand becomes cheaper, by how much would you increase your purchases of that brand? What about your purchases of other ice cream brands which become, in relative terms, more expensive?

1.3 Overview of the Book

1.3.1 Consumer Theory

Chapters 2–6 examine a consumer’s preferences for different goods; her choices about how many units of each good to purchase; and how these choices vary when the consumer operates under uncertainty (e.g., not knowing the exact return she will receive in her investment portfolio).

In chapter 2, you will find the simplified model that economists use to represent you as a consumer. Using the notion of a consumption bundle (a list of goods and services), we analyze how a consumer’s preferences rank different bundles of goods and how to represent these preferences in a utility function, which helps us measure the consumer’s well-being from each bundle. We then discuss the various properties that utility functions can satisfy.

Chapter 3 studies the consumer’s optimal purchasing decision. In that regard, we start by making an obvious, yet important, point: individuals may like more units of all (or at least some) goods, but they cannot afford all of them! We then describe the budget constraint that we all face as consumers, which is essentially dictated by prices of goods and our available income. Given this budget constraint, the consumer’s purchasing decision can be informally understood as follows: Buy the bundle that increases my utility as much as possible but… without breaking the bank! For compactness, we refer to this purchasing decision as the consumer’s “demand” for the good.²

In chapter 4, we essentially look at our results from the consumer’s problem (her demand for the good) and check how it varies when we increase her income by a small amount.
After winning the lottery, you may increase your purchases for most goods (such as a new house or a nicer car), and yet you may decrease your purchases of some goods, such as fast food. We then spend some time evaluating how purchases of a good change when its price experiences a small increase.

Chapter 5 follows up on the discussion in chapter 4, evaluating now the welfare loss that the consumer suffers once the price of a good increases. For completeness, we present three common measures that economists use to evaluate this welfare loss: the change in consumer surplus, the compensating variation, and the equivalent variation. We analyze their similarities and differences, and provide several numerical examples to illustrate how you can apply them to other contexts.

The previous chapters assume, for simplicity, that the consumer operates under certainty. Coming back to the ice cream example, the analysis assumed that you know each flavor. However, what if you are buying an ice cream cone at an unfamiliar place? Chapter 6 focuses on situations where the consumer faces uncertainty about some element that affects her utility, such as the ice cream flavor she buys. Another example is that of accepting a job paying a salary of $60,000 a year with certainty, or working for a start-up company that will pay you $95,000 if the company makes it to the New York Stock Exchange (which, according to your information, will happen with a probability of 30 percent) or $15,000 if the company does not (which would occur with the remaining probability, 70 percent). In this chapter, we ask a simple question: Which job would you choose? To answer this question, we introduce the “expected utility” that a worker can obtain from each job. Using the tools that we learned in this chapter, we present different risk attitudes that an individual may have towards a risky investment (or a risky job at a start-up company), and finish this chapter by introducing several measures of risk aversion that economists often use, such as the risk premium of an investment, or its certainty equivalent.

2. We also consider an alternative way to approach consumer purchasing decisions in which, rather than buying the bundle that maximizes her utility given a budget constraint, the consumer chooses the bundle that minimizes her budget (her expenditure) while reaching a minimum utility level. As you probably suspect, both approaches to the consumer’s problem usually yield the same results (i.e., the same optimal bundle).

1.3.2 Production Theory

Chapters 7 and 8 focus on firms, rather than consumers. We first analyze a firm’s production decision, such as its use of inputs (how many workers to hire or machines to purchase) and its production level (how many units of output to produce). For simplicity, our presentation is as similar as possible to consumer theory, which should help readers understand that agents (whether consumers or firms) face relatively similar problems from a mathematical standpoint: agents seek to maximize their payoff (either utility for the consumer, or profits for the firm) and face constraints (the budget constraint for the consumer, reflecting all the bundles she can afford, or technological constraints for the firm, indicating those output levels the firm can produce given its technology).

While chapter 7 helps us find the production decision that maximizes a firm’s profit, chapter 8 evaluates the cost that the firm incurs from this output decision. We find the units of each input that the firm hires; its average cost (i.e., cost per unit of output); its marginal cost (i.e., increase in cost when the firm increases its output by one unit); and how the firm’s average cost is affected when its scale expands (economies of scale) or when it offers more product lines (economies of scope).
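As a rough illustration of the two cost measures just mentioned, the sketch below computes average cost and marginal cost for a hypothetical cost function C(q) = 10 + 2q + q². The cost function, its coefficients, and the helper names are assumptions made for illustration and do not come from the text; marginal cost is measured, as described above, as the extra cost of producing one more unit.

```python
def total_cost(q: float) -> float:
    """Hypothetical cost function C(q) = 10 + 2q + q^2 (an illustrative assumption)."""
    return 10 + 2 * q + q ** 2


def average_cost(q: float) -> float:
    """Cost per unit of output, C(q) / q, defined only for positive output."""
    if q <= 0:
        raise ValueError("average cost requires positive output")
    return total_cost(q) / q


def marginal_cost(q: float) -> float:
    """Increase in cost when output rises by one unit: C(q + 1) - C(q)."""
    return total_cost(q + 1) - total_cost(q)


if __name__ == "__main__":
    for q in (1, 2, 5):
        print(f"q={q}: average cost={average_cost(q)}, marginal cost={marginal_cost(q)}")
```

With this particular cost function, marginal cost rises with output, while average cost first falls and later rises; a different cost function would give a different pattern.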
1.3.3 Markets—Putting Consumers and Producers Together

Chapters 9–11 combine the results from previous chapters about which bundles of goods consumers purchase and which bundles firms produce, placing these agents into two types of markets: perfectly competitive markets (chapter 9) and monopolies (chapters 10 and 11). These are, of course, extreme market structures. Indeed, as covered in chapter 9, perfectly competitive markets encompass many firms, each producing a small share of industry output. Therefore, when choosing to produce a larger output, every firm can anticipate that its decision will not affect market prices. In other words, firms are “price takers” as they take prices as given. In monopoly markets, in contrast, a single firm operates, choosing the output that maximizes its profits. While firms are price takers in competitive industries, the monopolist is a “price setter” because its output decision uniquely determines market price.

We start our analysis of monopolies in chapter 10 with a natural question: why do monopolies exist? Using this question as a motivation, we examine the optimal output decision for the monopolist. We then apply our analysis to multiplant monopolies, in which a firm is the only seller of a product, which is made at two or more plants. Finally, we evaluate the welfare loss that society experiences when an industry is monopolized rather than operated under perfect competition.

Chapter 11 expands our analysis of monopolistic firms by asking a provocative question: how can monopolies further increase their profits? As we discuss, this firm can practice three forms of price discrimination, which informally can be understood as charging different prices to different types of customers.
We also explore other tools the monopolist can use to increase its profits: advertising, which makes its product known to a larger pool of customers (chapter 10); and bundling, where the firm offers customers a “bundled product” (such as a PC tower and monitor).

1.3.4 Strategy—Let’s Play Games!

Previous chapters have considered extreme types of industries, such as perfectly competitive markets, where the output decision of an individual firm does not affect market prices, and thus does not affect the profits of other firms in the same industry; and monopolies, where a single firm sells its product, and thus is not affected by competition. In subsequent chapters of the book, we analyze other, less extreme industries, where a few firms compete against each other. Importantly, their competition gives rise to strategic effects. Consider, for instance, an industry with two firms. If one of them increases its output, it can sell more units, but now the market becomes a bit more flooded with products, decreasing market prices and ultimately reducing its rival’s profit. Generally, every industry having more than one firm (but not an infinite number of firms) will experience these strategic effects from firms’ interactions: the actions of one firm affect its rivals’ profits!

Before exploring these types of markets, we must equip ourselves with the tools to better understand strategic interactions among firms. The branch of economics that studies strategic behaviors is known as “game theory” because it examines the interactions among “players” (such as firms, consumers, or governments) in situations where the action of one player affects the payoffs of other players. Chapters 12 and 13 present us with these tools, starting with games in which all players (e.g., firms) choose their actions (e.g., output levels) simultaneously (chapter 12); and continuing with games in which players act sequentially (chapter 13).
For all the games we analyze, we seek to predict how players behave so that we can anticipate which is the “equilibrium behavior” in each game.

1.3.5 Putting Game Theory to Work

Chapter 14 uses game theory tools from chapters 12 and 13 to study industries with a limited number of firms (i.e., imperfectly competitive markets). We start by analyzing markets in which firms simultaneously choose their actions, competing either in price or in quantity. We then move on to examine industries where firms act sequentially, and extend our results to settings where firms sell products that consumers regard as close (but not perfect) substitutes.

Chapter 15 extends many of the game theory tools from previous chapters to situations with incomplete information (i.e., contexts where one player has more information than its rivals). A common example is that of a firm that observes its own production cost while its rivals cannot perfectly observe it. Another typical example is an auction where, as a bidder, you know how much you are willing to pay for the object on sale (e.g., a Picasso painting), but you do not get to know how much other bidders are willing to pay for it. We dedicate special attention to different auction formats, such as first-price auctions (where the winning bidder pays the highest bid) and second-price auctions (where the winning bidder only pays the second highest bid). We then analyze what the optimal bidding strategy is for each one (i.e., how much money you should bid if you participate in one of these auctions).

1.3.6 More Market Failures—When Markets Work Well and When They Don’t

In our analysis of monopolies, we highlight the fact that they reduce social welfare relative to what arises under a competitive industry. However, is this the only case of a market failure (i.e., a market producing less-than-optimal outcomes)?
In the final chapters of the book, we provide a negative answer to that question, as we identify several other contexts where market failures exist. Specifically, we use game theory again to analyze two other market failures: those emerging in contracts where one party is better informed than the other (chapter 16), and those arising in situations where the actions of one agent produce external effects on another agent’s well-being (chapter 17). In this final chapter, we also examine sustainability issues in common-pool resources, such as a fishing ground or a forest. In a short-run analysis, where agents ignore the long-term effects of their behavior, they may choose to exploit the resource intensively. However, when agents consider these long-run effects, their optimal behavior dictates a less intense exploitation.

2 Consumer Preferences and Utility

2.1 Introduction

In this chapter, we explore how to represent consumer preferences for different goods. We start by discussing properties of consumer preferences, such as wanting more units of some goods, and then mathematically describe how to measure the satisfaction that a person enjoys with a utility function. Next, we analyze this function in detail, discussing how to represent preferences over various types of goods, such as goods regarded as substitutes or complements by the consumer. We also discuss other types of utility functions often used in economic applications, such as the Cobb-Douglas utility function, the Stone-Geary utility function, and quasilinear utility functions. We also discuss how to depict these utility functions graphically. Finally, we describe utility functions representing “social” rather than “selfish” preferences.

2.2 Bundles

In this chapter, we describe how to represent preferences over bundles of goods and services that the individual considers consuming. First, we need to define what we mean by a “bundle.”

Bundle. A list of goods and services.
For instance, if an individual consumes only two goods (apples and oranges, represented by goods x and y, respectively), a bundle could be A = (40, 30), indicating that the individual consumes x = 40 apples and y = 30 oranges. Figure 2.1 represents apples on the horizontal axis and oranges on the vertical axis, implying that bundles can be depicted by a point in the positive quadrant. Bundle A = (40, 30) would then have a length of 40 on the x-axis, and a height of 30 on the y-axis.

Figure 2.1 Bundle A = (40, 30). [Axes: x, apples (horizontal); y, oranges (vertical).]

2.3 Preferences for Bundles

We can now start our analysis of consumer preferences over bundles, which will help us understand how a consumer ranks different bundles. For instance, a consumer might prefer bundles with more units of all goods. However, she may dislike other goods (i.e., “bads”), such as garbage or pollution, thus preferring bundles with the smallest possible amount of those goods. Most of the examples in this book will nonetheless consider goods rather than bads, unless otherwise stated.

We next provide a list of properties satisfied by most of the preference relations we examine in this and future chapters. We refer to two bundles, A and B, each with units of goods x and y, A = (x_A, y_A) and B = (x_B, y_B). Each bundle could then be depicted as a point in figure 2.1. In addition, our explanations use the following notation:

• A ≻ B denotes that the individual “strictly prefers” bundle A to B (so “strictly” rules out the possibility that she is indifferent between the two bundles).
• A ∼ B means that the individual is indifferent between bundles A and B (i.e., she is equally happy with either of them).
• A ≿ B denotes that the individual “weakly prefers” bundle A to B, which allows her to be indifferent between the two bundles or to strictly prefer A to B.

Next, we describe our first property on preferences.
Completeness. A preference relation is complete if the consumer has the ability to compare every two bundles A and B. Formally, the consumer strictly prefers bundle A to B (represented as A ≻ B), strictly prefers bundle B to A (denoted as B ≻ A), or is indifferent between these two bundles (represented as A ∼ B).

That is, we do not allow the consumer to respond, “I don’t know how to compare these two bundles!” While we have all found ourselves in situations where comparing two completely new options was rather difficult (think about the last time you ordered food in an ethnic restaurant with which you were unfamiliar), completeness implies that the consumer has enough time to be able to compare and rank the two bundles. In other words, completeness requires only that the individual is capable of ranking bundles (allowing her to prefer one over the other, or be indifferent between them), and does not allow her to be unable to compare both bundles.

Transitivity. For every three bundles A, B, and C, if the consumer prefers A to B (A ≿ B), and B to C (B ≿ C), then she must also prefer A to C (A ≿ C).

Intuitively, if the consumer prefers a first option to a second option, and the second option to a third, it must be that she prefers the first to the third. A consumer with intransitive preferences would have A ≿ B and B ≿ C (the same premise as in the definition of transitivity), but state that C ≻ A (the opposite of the conclusion in that definition). Hence, her preferences would exhibit a cycle, which becomes evident when we summarize all the previous information as follows: A ≿ B ≿ C ≻ A, as she both starts and finishes at bundle A. Importantly, individuals with intransitive preferences would be subject to exploitation, as we discuss next.
Exploitation of intransitive individuals. Consider three goods, an orange, an apple, and a banana, and a consumer with the following preferences: Orange ≻ Banana and Banana ≻ Apple, but Apple ≻ Orange, where her preference Apple ≻ Orange violates transitivity. For her preferences to satisfy transitivity, we would need her to prefer the opposite, Orange ≻ Apple.

To illustrate this point, assume that the individual initially owns an orange, and she plays a game with a fruit seller. If the fruit seller gives the individual her preferred fruit, she pays $1. In this scenario, a seller could offer her an apple for $1, which the individual would accept because she strictly prefers the apple to the orange, Apple ≻ Orange. Once the consumer has the apple, the seller could approach her again, offering a banana for $1, which she would also accept because she strictly prefers a banana to an apple, Banana ≻ Apple. Now that the consumer owns a banana, the seller could approach her again, offering an orange for $1, which she would accept, given that she reported Orange ≻ Banana. At this point, she has the same fruit as at the beginning of the exchange (orange), but has lost $3 in the process due to her intransitive preferences.

Of course, the seller can start the process all over again and continue it ad infinitum, taking all the money from the consumer. As a result, individuals with intransitive preferences would be subject to exploitation by sellers (or heartless microeconomics students!), and ultimately be eliminated from the marketplace. Given this rationale, transitivity does not seem to be a very demanding property for preferences to satisfy.

2.3.1 Ranking Bundles with More Units

The next two properties (strict monotonicity and monotonicity) describe how an individual ranks a bundle that contains more units of one good, or more units of all goods, than another bundle.
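Before turning to these properties, the exploitation argument from the fruit example above can be sketched in a few lines of code. The fruits, the cyclic ranking, and the $1 fee per trade follow that example; the function name and the dictionary encoding are illustrative assumptions.

```python
from typing import Tuple

# Strict preferences from the fruit example, read as "key is strictly preferred to value":
# Orange > Banana, Banana > Apple, and (violating transitivity) Apple > Orange.
PREFERRED_OVER = {"orange": "banana", "banana": "apple", "apple": "orange"}


def money_pump(start_fruit: str, fee: float, rounds: int) -> Tuple[str, float]:
    """Simulate a seller who repeatedly offers the fruit the consumer strictly
    prefers to the one she currently holds, charging `fee` for each trade."""
    # Invert the mapping: for each fruit she holds, which fruit does she prefer to it?
    better_than = {worse: better for better, worse in PREFERRED_OVER.items()}
    fruit, paid = start_fruit, 0.0
    for _ in range(rounds):
        fruit = better_than[fruit]  # she accepts: the offer is strictly preferred
        paid += fee
    return fruit, paid


if __name__ == "__main__":
    print(money_pump("orange", 1.0, 3))
```

Starting from an orange, three trades at $1 each bring her back to an orange while the total paid keeps growing with every additional round, which is the cycle described in the example.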
Strict monotonicity. Consider an initial bundle A and a new bundle B, where bundle B has the same amount of good x as bundle A (x_B = x_A), but it contains more units of good y (y_B > y_A). We say that a consumer’s preferences satisfy strict monotonicity if she strictly prefers B to A (B ≻ A).

Therefore, increasing the units of even a single good, as we do with good y in bundle B, produces a new bundle that is strictly preferred to the original bundle A. Informally, strict monotonicity can be understood as “more is strictly better” (or “more of anything is strictly preferred”) because the consumer prefers bundles containing more units of any good. We next explore a weaker version of strict monotonicity, which allows the consumer to be indifferent between the new bundle B and the original bundle A.

Monotonicity. Consider an initial bundle A and new bundles B and C, where bundle B has the same amount of good x as bundle A (x_B = x_A), but it contains more units of good y (y_B > y_A), whereas bundle C has more units of both goods than bundle A does (i.e., x_C > x_A and y_C > y_A). We say that a consumer’s preferences satisfy monotonicity if she weakly prefers B to A (B ≿ A), but strictly prefers C to A (C ≻ A).

Intuitively, with monotonicity, the consumer can be indifferent between bundles A and B, despite B containing more units of good y than bundle A does. With strict monotonicity, however, increasing the number of units of any good was strictly preferred. Informally, monotonicity can be interpreted as “more is weakly better” because the consumer is either indifferent about receiving a bundle that contains more units of at least one good, or strictly better off (but never worse off!). Furthermore, monotonicity states that, if the amounts of all goods are higher, as in bundle C, then the consumer is strictly better off.
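The two definitions above differ only in what they imply when some, but not all, goods increase. A minimal sketch of that distinction follows; the function name, the returned labels, and the sample bundles are hypothetical choices made for illustration.

```python
from typing import Sequence


def implied_ranking(a: Sequence[float], b: Sequence[float], strict: bool) -> str:
    """Ranking of bundle b relative to bundle a implied by monotonicity
    (strict=False) or by strict monotonicity (strict=True)."""
    no_less = all(bi >= ai for ai, bi in zip(a, b))
    # b has at least as much of every good, and strictly more of at least one.
    more_of_some = no_less and any(bi > ai for ai, bi in zip(a, b))
    # b has strictly more of every good.
    more_of_all = all(bi > ai for ai, bi in zip(a, b))
    if more_of_all or (strict and more_of_some):
        return "b strictly preferred to a"
    if more_of_some:
        return "b weakly preferred to a"
    return "no implication"


if __name__ == "__main__":
    A, B, C = (2, 3), (2, 4), (3, 4)  # hypothetical bundles (x, y)
    print(implied_ranking(A, B, strict=True))   # only good y increases
    print(implied_ranking(A, B, strict=False))
    print(implied_ranking(A, C, strict=False))  # both goods increase
```

When one bundle has more of one good but less of another, neither property says anything about the ranking, which is why the function returns "no implication" in that case.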
Monotonicity then says, informally, that “more of everything is strictly preferred,” whereas strict monotonicity says that “more of anything is strictly preferred.”

Example 2.1: Monotonic and strictly monotonic preferences

Consider the following scenario. We present two bundles A = (2, 3) and B = (2, 4) to Eric, an undergraduate student in your Microeconomics class. While the amount of good x is the same in both bundles, there is more of good y in bundle B than in A. We then ask him which bundle he prefers. He responds that he strictly prefers bundle B to A, which we write as B ≻ A. If this ranking holds for any two bundles we present to him, where only one of the two goods is increased, we can say that his preferences are strictly monotonic. Relative to bundle A, bundle B only increased the amount of good y, and that was enough for Eric to strictly prefer B to A.

We then present the same two bundles to Chelsea, a classmate of Eric, who reports being indifferent between bundles B and A (i.e., B ∼ A). However, if we replace bundle B with bundle C = (3, 4), Chelsea tells us that she strictly prefers C to A (i.e., C ≻ A). If this ranking holds for any two bundles in which one has more units of all goods than the other, then we can say that her preferences satisfy monotonicity. Intuitively, bundle C has more units of both goods x and y than A does, leading Chelsea to say that C ≻ A. In contrast, increasing the amount of only good y (as we did in bundle B) made her indifferent between the two bundles (B ∼ A).

We next present the first of the book’s “self-assessments,” short questions that check your understanding by changing one of the features in a previous example. We strongly encourage you to work on these questions.
You can check your answers with the Practice Exercises in Intermediate Microeconomic Theory book, which includes detailed answer keys to all these questions, along with the odd-numbered exercises at the end of every chapter.

Self-assessment 2.1 Consider now that Eric prefers bundle A = (1, 1) to B = (2, 1). Assume that this ranking holds for any two bundles we present to him, where only the amount of good x increases from bundle A to B. Are Eric’s preferences monotonic? Are they strictly monotonic? What if he prefers bundle A = (3, 3) to B = (2, 2), and a similar ranking holds for any two bundles where the amount of both goods decreases from A to B?

From the previous discussion, if a consumer becomes strictly better off when we increase any one of the goods she consumes, then she is not worse off, which is the minimal requirement we need for her preferences to satisfy monotonicity. That is, if a consumer’s preferences satisfy strict monotonicity, then they also satisfy monotonicity:

Strict monotonicity ⟹ Monotonicity

In addition, monotonicity and strict monotonicity require that the consumer regards all items in her bundle as goods rather than bads (such as pollution or garbage). To see this, recall that if some good were a bad, increasing the number of units of that good in the initial bundle A would produce a new bundle B that would be less preferred than the original bundle A, thus violating the definitions of monotonicity and strict monotonicity. To better understand this type of preferences, the next property allows for bads.

2.3.2 Satiation and Bliss Points

Nonsatiation. Preferences satisfy nonsatiation if, for every bundle A, we can find another bundle B for which the consumer is strictly better off. Formally, for every bundle A, there is another bundle B for which B ≻ A.
Intuitively, nonsatiation means that there is no “bliss bundle” where the consumer cannot be made any happier by consuming an alternative bundle.¹ An alternative explanation of nonsatiation is the following: For every bundle A, we can always find another bundle B that is strictly preferred to A. Importantly, this definition allows us to search for the “more preferred” bundle B anywhere we need to.² Therefore, nonsatiation allows the consumer to regard some goods as “bads,” as opposed to monotonicity (strict monotonicity), where more units of all goods (at least one good) are desirable. Starting from an initial bundle A, the consumer can identify other bundles preferred to A, such as B, that contain more units of one of the goods (e.g., food) but fewer units of the other good (e.g., pollution or garbage). Nonsatiation only requires the consumer to, essentially, always find more preferred bundles.

Lastly, note that if a consumer’s preferences satisfy monotonicity, they also satisfy nonsatiation:

Monotonicity ⟹ Nonsatiation

Recall that, by definition, monotonicity requires that, starting from any bundle A, we can increase the amount of all goods (creating a new bundle B) and make the individual better off. Therefore, the consumer is not satiated at bundle A, because we can still find other bundles, such as B, which make her better off.

The opposite relationship, however, does not necessarily hold; that is,

Nonsatiation ⇏ Monotonicity

We can find consumers whose preferences satisfy nonsatiation but violate monotonicity, as the following example illustrates.

1. As a consequence, the utility function representing these preferences (a topic we discuss later in the chapter) cannot have a maximum, as that would be a satiation (i.e., bliss) point. In most applications, we guarantee this requirement by asking the consumer to choose a bundle from an affordable (or feasible) set of bundles where no bliss point exists.

2. That is, we can search for the “more preferred” bundle B to the northeast of A in figure 2.1 (i.e., bundles with more units of both goods than bundle A); to the southeast of A (i.e., bundles with more units of good x, but fewer units of y); and to the northwest of A (i.e., bundles with more units of good y, but fewer units of x).

Example 2.2: Nonsatiated preferences

Consider again your classmate Eric. We present bundles A = (2, 3) and D = (2, 1) to him, as depicted in figure 2.2. After asking which bundle he prefers, Eric responds that he strictly prefers bundle D to A, which we write as D ≻ A. In addition, he says that no other bundle makes him as happy as D does. We can then conclude that his preferences satisfy nonsatiation, but violate monotonicity. Why? First, note that, relative to bundle A, bundle D decreased the amount of good y, keeping good x unaffected. If Eric strictly prefers bundle D, it must be that he regards good y as a bad, seeking to reduce the amount of it that he consumes.

Figure 2.2 Bundles A and D. [Axes: x (horizontal); y (vertical). Bundle A = (2, 3) lies directly above bundle D = (2, 1).]

Self-assessment 2.2 Assume that Eric prefers bundles with more units of good x but fewer units of y. If he cannot consume negative amounts of either good, can you find a bliss point (i.e., a bundle where he is satiated)? Given your answer, do Eric’s preferences satisfy nonsatiation? What about monotonicity?

In the next section, we describe how to represent consumer preferences using utility functions, and then provide examples of utility functions where some of (or all) these properties hold.

2.4 Utility Functions

We use utility functions to mathematically represent consumer preferences, as defined next.
Utility function. The level of satisfaction that an individual enjoys from consuming a bundle of goods.

For instance, if the individual consumes bundle A = (40, 30) and her utility function is u(x, y) = 3x + 5y, we can evaluate this utility function at bundle A to obtain a utility level of u(40, 30) = (3 × 40) + (5 × 30) = 270.

Importantly, the utility level that we obtain from a bundle (e.g., 270 for bundle A above) is not as important as the ranking of utilities across bundles. In other words, only the utility ranking matters, which is often known as “ordinality” because it focuses on how the consumer orders bundles. In contrast, the specific utility level that the consumer reaches with each bundle, which is referred to as “cardinality,” does not matter. The following example illustrates this point.

Example 2.3: Utility ranking and increasing transformations of the utility function

Consider utility function u(x, y) = xy. Bundle A = (40, 30) produces in this context a utility level of u(40, 30) = 1,200, while a new bundle B = (20, 30) generates a lower utility level of u(20, 30) = 600, implying that the individual prefers bundle A to B (A ≻ B). Imagine now that, rather than using utility function u(x, y) = xy to represent the preferences of this consumer, we use v(x, y) = 3xy + 8, which is just an increasing transformation of the original utility function u(x, y).³ In this situation, bundle A yields a utility level of v(40, 30) = 3,608, whereas bundle B still generates a lower utility level of v(20, 30) = 1,808, entailing that the individual still prefers bundle A to B (A ≻ B). In summary, a consumer’s preference over bundles A and B is unaffected (i.e., her ranking of A and B does not change) if we apply an increasing transformation on her initial utility function. These increasing transformations are also known as “monotonic” transformations because, graphically, they produce an upward shift on the initial utility function.

3. Graphically, you can interpret function v(x, y) = 3u(x, y) + 8 as an upward shift of the initial function u(x, y), originating at 8 and with its slope multiplied by 3. Other increasing transformations include v(x, y) = au(x, y) + b, where a and b are positive constants; v(x, y) = u(x, y)²; and, more generally, functions where v(x, y) is increasing in u(x, y).

Self-assessment 2.3 Consider again bundles A = (40, 30) and B = (20, 30) from Example 2.3, and assume that Chelsea’s utility function is u(x, y) = 2x + 3y. Does she prefer bundle A, or B, or is she indifferent between them? What if her preferences are represented with utility function v(x, y) = 4x + 6y? What if they are represented with v′(x, y) = 4x − 6y? Hint: Utility function v′(x, y) is not an increasing transformation of Chelsea’s original utility function u(x, y).

Next, for practice, we consider a specific utility function and test the above properties of preference relations.

Example 2.4: Testing properties of preference relations

Consider again the utility function u(x, y) = xy from Example 2.3. Let us check if the preference relation that this utility function represents satisfies (a) completeness, (b) transitivity, (c) strict monotonicity, (d) monotonicity, and (e) nonsatiation.

Completeness. For every two bundles A = (x_A, y_A) and B = (x_B, y_B), completeness holds when either u(x_A, y_A) ≥ u(x_B, y_B), u(x_B, y_B) ≥ u(x_A, y_A), or both (thus implying u(x_A, y_A) = u(x_B, y_B)). This indeed holds because the utility level that we obtain from bundle A, u(x_A, y_A), is a real number (e.g., 1,200 as in Example 2.3), and so is the utility level that we obtain from bundle B, u(x_B, y_B). We can then verify this by comparing these numbers and showing that either u(x_A, y_A) ≥ u(x_B, y_B), u(x_B, y_B) ≥ u(x_A, y_A), or u(x_A, y_A) = u(x_B, y_B).⁴

Transitivity.
For every three bundles A, B, and C, where u(x_A, y_A) ≥ u(x_B, y_B) and u(x_B, y_B) ≥ u(x_C, y_C), transitivity holds when u(x_A, y_A) ≥ u(x_C, y_C). This follows the same argument as with completeness: since utility levels are real numbers, u(x_A, y_A) ≥ u(x_B, y_B) and u(x_B, y_B) ≥ u(x_C, y_C) imply that u(x_A, y_A) ≥ u(x_C, y_C) must also hold. For example, if u(x_A, y_A) = 1,200, u(x_B, y_B) = 600, and u(x_C, y_C) = 300, we know that 1,200 ≥ 600, 600 ≥ 300, and 1,200 ≥ 300, thus implying that transitivity is satisfied.

4. For the bundles in Example 2.3, it is easy to check that u(x_A, y_A) ≥ u(x_B, y_B) because 1,200 > 600, but a similar argument applies to any pair of bundles A and B.

Strict monotonicity. For this property to hold, we need utility function u(x, y) = xy to be strictly increasing in both goods. (Recall that consumers with strictly monotonic preferences prefer bundles with more units of any good.) We can formally check this by differentiating the utility function with respect to x and with respect to y, and confirming that these derivatives are positive. For this example, we obtain:

∂u(x, y)/∂x = y and ∂u(x, y)/∂y = x.

Therefore, we can say that increasing the units of good x produces a strict increase in the consumer’s utility level, so long as she consumes positive units of good y (i.e., if y > 0). If, instead, she does not consume good y at all, y = 0, increasing good x does not alter the individual’s utility level. Therefore, strict monotonicity does not hold because an increase in good x does not necessarily increase the consumer’s utility.⁵

Monotonicity. For this property to hold, we need the utility function u(x, y) = xy to be weakly increasing in both goods. That is, separately increasing the amount of either good does not hurt.
From the previous discussion, we know that an increase in good x either produces a strict increase in the consumer’s utility (which occurs when y > 0) or does not affect her utility (when y = 0), but it never reduces her utility. A similar argument applies for an increase in good y, which never produces a decrease in her utility level. Lastly, an increase in both x and y produces a new bundle that generates a strictly greater utility. To see this, consider that good x is increased by a > 0 units (from x to x + a units) and good y is increased by b > 0 units (from y to y + b units). This yields a utility level of $u(x + a, y + b) = (x + a)(y + b) = xy + bx + ay + ab$, which strictly exceeds the utility of the original bundle, $u(x, y) = xy$, for any positive increments a and b, because $ab > 0$ while $bx$ and $ay$ are never negative.

Nonsatiation. This property holds by monotonicity. Indeed, we found that increasing the amounts of both goods produces a new bundle (x + a, y + b) that is strictly preferred to the original bundle (x, y); this result holds for any original bundle (x, y) that we analyze. In other words, starting from any bundle (x, y), we can always find another bundle, such as (x + a, y + b), for which the consumer is better off. As a consequence, the consumer is never satiated (i.e., she does not reach a bliss point), as required by nonsatiation.

5. A similar argument applies to the effect of increasing y, which strictly increases the consumer’s utility level if she consumes a positive amount of good x, and does not modify her utility level if she consumes no units of good x at all.

Table 2.1 Utility functions and their properties.
Utility function           Completeness   Transitivity   Strict monotonicity   Monotonicity   Nonsatiation
u(x, y) = by               √              √              X                     √              √
u(x, y) = ax               √              √              X                     √              √
u(x, y) = ax − by          √              √              X                     X              √
u(x, y) = ax + by          √              √              √                     √              √
u(x, y) = A min{ax, by}    √              √              X                     √              √
u(x, y) = A x^α y^β        √              √              X                     √              √

Table 2.1 shows which of these five properties hold for common utility functions, where we assume that parameters a, b, A, α, and β are all strictly positive. You can replicate the steps in example 2.4 for each utility function to identify which properties hold, and confirm that you obtain the same results as in table 2.1.

Let us briefly describe the intuition behind each utility function and the goods that each of them normally represents. First, utility function $u(x, y) = by$ considers an individual who regards good x as irrelevant, and thus increasing x has no effect on her utility; $u(x, y) = ax$ exhibits a similar pattern, but instead good y is now irrelevant. In utility function $u(x, y) = ax - by$, good y is a bad because increasing y decreases her utility, so monotonicity fails even though she is never satiated (more units of x always raise her utility). The other utility functions represent goods that the consumer regards as perfect substitutes, $u(x, y) = ax + by$; complements, $u(x, y) = A \min\{ax, by\}$; or neither perfect substitutes nor complements, $u(x, y) = A x^{\alpha} y^{\beta}$. These are explained in detail in the last sections of this chapter.

Following the discussion in section 2.2, if a preference relation satisfies strict monotonicity, it then satisfies monotonicity and nonsatiation, such as in utility function $u(x, y) = ax + by$. Similarly, if a utility function satisfies monotonicity, then it must satisfy nonsatiation, which holds for every utility function in the table that satisfies monotonicity. The converse of these relationships is not necessarily true, however. In other words, a preference relation can satisfy monotonicity, but not necessarily strict monotonicity, such as in $u(x, y) = A \min\{ax, by\}$.
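The steps in example 2.4 can also be explored numerically. The following is a minimal sketch, not a proof: it evaluates each utility function of table 2.1 on a small grid of bundles and reports whether strict monotonicity and monotonicity survive on that grid. The parameter values, the grid, the unit increment, and the helper name `check` are all illustrative assumptions, not part of the text.

```python
from itertools import product

# Illustrative parameter values (assumptions; the text only requires them to be positive).
a, b, A, alpha, beta = 2.0, 3.0, 1.0, 0.5, 0.5

utilities = {
    "by": lambda x, y: b * y,
    "ax": lambda x, y: a * x,
    "ax - by": lambda x, y: a * x - b * y,
    "ax + by": lambda x, y: a * x + b * y,
    "A min{ax, by}": lambda x, y: A * min(a * x, b * y),
    "A x^alpha y^beta": lambda x, y: A * x**alpha * y**beta,
}

grid = [0.0, 1.0, 2.0, 5.0]  # includes 0 so that boundary bundles are tested
step = 1.0                   # hypothetical increment applied to a good


def check(u):
    """Return (strict_monotonicity, monotonicity) as observed on the grid."""
    strict, weak = True, True
    for x, y in product(grid, grid):
        base = u(x, y)
        one_good = [u(x + step, y), u(x, y + step)]  # raise one good at a time
        both = u(x + step, y + step)                 # raise both goods
        if any(v <= base for v in one_good):
            strict = False  # more of one good failed to strictly raise utility
        if any(v < base for v in one_good) or both <= base:
            weak = False    # more of a good hurt, or more of both did not help
    return strict, weak


results = {name: check(u) for name, u in utilities.items()}
for name, (strict, weak) in results.items():
    print(f"{name:18s} strict monotonicity: {strict!s:5s}  monotonicity: {weak}")
```

A grid check can only reveal violations; when a property appears to hold on the grid, the analytical argument of example 2.4 is still needed to confirm it for every bundle.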
Self-assessment 2.4 Consider that Eric’s utility function is $u(x, y) = 2x + 3y$, which is just an example of $u(x, y) = ax + by$, where a = 2 and b = 3. Following the steps in example 2.4, show that this utility function satisfies completeness, transitivity, monotonicity, strict monotonicity, and nonsatiation, as summarized in table 2.1. Then consider one of Eric’s friends, John, who has a utility function $u(x, y) = \min\{2x, 3y\}$. Following the steps in example 2.4, show that the utility function satisfies all the properties in table 2.1, except for strict monotonicity.

2.5 Marginal Utility

We next describe how an individual’s utility increases as we increase the amount of one, and only one, of the goods she consumes.

Marginal utility of a good: The rate at which utility changes as consumption of a good increases.

Intuitively, marginal utility answers the question: how much better off do you become by consuming 1 more unit of good x? Mathematically, we measure the marginal utility of good x, $MU_x$, by partially differentiating the utility function $u(x, y)$ with respect to x:

$$MU_x = \frac{\partial u(x, y)}{\partial x},$$

and similarly for the marginal utility of good y, $MU_y = \frac{\partial u(x, y)}{\partial y}$. Graphically, we measure the slope (rate of change) of the utility function as we increase the amount of good x, holding the amount of other goods constant. Recall that, when differentiating with respect to good x, we keep the amount of good y constant, so we only change one thing at a time (that is, we only increase the amount of good x). Similarly, when we differentiate with respect to good y, we only increase the amount of good y, while leaving good x unchanged. The next example illustrates how to find marginal utility in a common utility function.

Example 2.5: Finding marginal utility, MU. Consider the utility function $u(x, y) = x^{1/2} y^{1/2}$. The marginal utility of good x is

$$MU_x = \frac{1}{2} x^{\frac{1}{2} - 1} y^{\frac{1}{2}} = \frac{1}{2} x^{-\frac{1}{2}} y^{\frac{1}{2}}$$

or, after rearranging, $MU_x = \frac{1}{2} \frac{y^{1/2}}{x^{1/2}}$.
(Recall that, for any exponent α > 0, expression $x^{-\alpha}$ can alternatively be written as a ratio, as follows: $\frac{1}{x^{\alpha}}$.) This $MU_x$ is positive when the individual consumes positive amounts of goods x and y, thus indicating that 1 more unit of good x raises her utility. Similarly, the marginal utility of good y is

$$MU_y = \frac{1}{2} x^{\frac{1}{2}} y^{\frac{1}{2} - 1} = \frac{1}{2} x^{\frac{1}{2}} y^{-\frac{1}{2}},$$

or, $MU_y = \frac{1}{2} \frac{x^{1/2}}{y^{1/2}}$. As for good x, we find here that when the individual consumes positive amounts of goods x and y, $MU_y$ is positive.

Self-assessment 2.5 Chelsea’s utility function is $u(x, y) = 5x + 2y$. Find the marginal utility for goods x and y. Repeat your analysis, assuming that her utility function is $u(x, y) = 5x - 2y$. Interpret.

2.5.1 Diminishing Marginal Utility

An interesting property of many utility functions is that their marginal utilities are decreasing in the amount of the good that the individual consumes. That is, $MU_x$ decreases in x, or

$$\frac{\partial MU_x}{\partial x} \leq 0;$$

and the same applies for good y, where $MU_y$ decreases in y. Intuitively, this entails that, while more units of good x increase the individual’s utility level, further increments in x produce smaller utility gains. In other words, when the consumer has few units of a good (such as food), providing her with 1 more unit increases her utility by a great deal. When she already has large amounts of the food, however, giving her 1 more unit of food produces a small utility gain (or no gain at all!). Example 2.6 illustrates this property.

Example 2.6: Diminishing marginal utility. Consider the individual in example 2.5, where we found a marginal utility from good x of $MU_x = \frac{1}{2} \frac{y^{1/2}}{x^{1/2}}$. Because good x only shows up in the denominator of $MU_x$, this marginal utility is decreasing in the amount that the consumer enjoys of good x. More formally, we can differentiate $MU_x$ with respect to x, obtaining

$$\frac{\partial MU_x}{\partial x} = -\frac{y^{1/2}}{4 x^{3/2}},$$

which is negative for any combination of positive amounts of x and y.
Similarly, her marginal utility from good y, $MU_y = \frac{1}{2} \frac{x^{1/2}}{y^{1/2}}$, is decreasing in good y because $\frac{\partial MU_y}{\partial y} = -\frac{x^{1/2}}{4 y^{3/2}} < 0$ for all positive values of x and y.

Self-assessment 2.6 Are the expressions of $MU_x$ and $MU_y$ that you found in self-assessment 2.5 decreasing or increasing?

2.6 Indifference Curves

Figure 2.3a depicts the utility function analyzed in example 2.5, $u(x, y) = x^{1/2} y^{1/2}$. Graphically, it resembles a mountain where the height is the utility that the individual achieves by consuming a specific amount of x and y. For instance, with a bundle such as (x, y) = (4, 9), the consumer reaches a utility level of $u(4, 9) = 4^{1/2} 9^{1/2} = 6$; but she can also obtain this utility level at other bundles, such as (x, y) = (6, 6), which yields a utility of $u(6, 6) = 6^{1/2} 6^{1/2} = 6$, or at (x, y) = (9, 4), which also generates a utility of $u(9, 4) = 9^{1/2} 4^{1/2} = 6$.

Figure 2.3b depicts a “slice” of the utility mountain in Figure 2.3a at a height of u = 6. As discussed previously, we can reach a height of u = 6 at bundles such as (4, 9), (6, 6), and (9, 4). Figure 2.3b connects these bundles with a curve and indicates that the consumer obtains the same utility u = 6 in all of them. That is, she is indifferent between consuming any of these bundles, as she achieves the same utility with each of them. For this reason, this curve is referred to as an indifference curve, as defined next. For practice, you can repeat this process for a different utility level, such as u = 14.

[Figure 2.3 (a) Utility function. (b) Indifference curve for u = 6, passing through bundles (4, 9), (6, 6), and (9, 4).]

Indifference curve (IC): A curve connecting consumption bundles that yield the same utility level.

Figure 2.3b illustrates the indifference curve for utility function $u(x, y) = x^{1/2} y^{1/2}$, evaluated at a utility level of u = 6.
Because we represent good y on the vertical axis, we only need to solve for y to obtain the expression of an indifference curve. That is, rearranging $x^{1/2} y^{1/2} = 6$, we find $y^{1/2} = \frac{6}{x^{1/2}}$ which, after squaring both sides, yields the equation of the indifference curve $y = \frac{36}{x}$. The next example describes how to find ICs in other common utility functions.

Example 2.7: Finding ICs for two utility functions. Consider again utility function $u(x, y) = x^{1/2} y^{1/2}$. Let us obtain the expression for the indifference curve when the consumer reaches utility level u = 10. This indifference curve entails that $x^{1/2} y^{1/2} = 10$. We now seek to solve for y. First, we find that $y^{1/2} = \frac{10}{x^{1/2}}$, and then we square both sides to obtain our indifference curve:

$$y = \frac{100}{x}.$$

To identify a few bundles on the indifference curve, we can plug in several values for good x, such as x = 4, which produces $y = \frac{100}{4} = 25$; x = 8, which yields $y = \frac{100}{8} = 12.5$; or x = 10, which entails $y = \frac{100}{10} = 10$. Finally, we can plot the three bundles that we obtained, (4, 25), (8, 12.5), and (10, 10), as points on the positive quadrant, and connect these points to form the indifference curve for u = 10.

We can follow a similar approach for a different utility function, such as $u(x, y) = 5x + 3y$, and the utility level of u = 9. Solving for y in $5x + 3y = 9$, we obtain

$$y = 3 - \frac{5}{3} x.$$

This indifference curve, hence, originates at y = 3 and decreases at a rate of 5/3, crossing the horizontal axis at 9/5. Recall that, to obtain the horizontal intercept of a function, we just need to set the function equal to zero (as its height becomes zero when crossing the horizontal axis) and solve for x. In this example, we set the function of the indifference curve equal to zero, $3 - \frac{5}{3} x = 0$. Rearranging, we obtain 9 = 5x, and solving for x we find a horizontal intercept of x = 9/5.
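The bundles and intercepts computed in this example can be double-checked with a few lines of code. This is a small sketch in which the function names `ic_cobb_douglas` and `ic_linear` are hypothetical labels chosen for illustration; each function simply returns the amount of good y that keeps utility at level u for a given amount of good x.

```python
def ic_cobb_douglas(x, u):
    """Amount of y on the indifference curve of u(x, y) = x^(1/2) y^(1/2) at level u."""
    return u**2 / x


def ic_linear(x, u, a=5.0, b=3.0):
    """Amount of y on the indifference curve of u(x, y) = a*x + b*y at level u."""
    return (u - a * x) / b


# Bundles on the indifference curve y = 100 / x (utility level u = 10).
cd_points = [(x, ic_cobb_douglas(x, 10)) for x in (4, 8, 10)]
print(cd_points)  # [(4, 25.0), (8, 12.5), (10, 10.0)]

# Intercepts of the indifference curve y = 3 - (5/3) x (utility level u = 9).
vertical_intercept = ic_linear(0, 9)  # value of y when x = 0
horizontal_intercept = 9 / 5          # value of x at which y = 0
print(vertical_intercept, horizontal_intercept)
```

Plugging each bundle back into the utility function is a quick sanity check: every point on the first curve should return a utility of 10.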
We can now evaluate the indifference curve $y = 3 - \frac{5}{3} x$ at several values for good x (of course, x needs to be smaller than the horizontal intercept 9/5 = 1.8, since otherwise we would obtain negative amounts of good y). For instance, if x = 1/4, we find that $y = 3 - \frac{5}{3} \times \frac{1}{4} \approx 2.58$; and similarly for x = 1/2, we obtain y ≈ 2.17; and for x = 1, where we have y ≈ 1.33.

Self-assessment 2.7 Repeat example 2.7, but using utility function $u(x, y) = x^{1/3} y^{2/3}$ and utility level u = 16.

2.6.1 Properties of Indifference Curves

ICs are negatively sloped. In our previous examples, all indifference curves were negatively sloped. This is a property that follows from monotonicity. To see this, consider a bundle $A = (x_A, y_A)$, such as that depicted in Figure 2.4. The indifference curve passing through bundle A cannot go through region I because bundles in this region contain strictly more units of both goods x and y than bundle A does. Hence, the individual is not indifferent between bundles in region I and bundle A; instead, she strictly prefers the former to the latter. A similar argument applies to bundles in region II, which contain strictly fewer units of both goods x and y than in bundle A. As a consequence, the consumer is not indifferent between bundles in region II and bundle A, but she strictly prefers the latter to the former. The only two regions where the indifference curve passing through A can lie are regions III (where the consumer gets more units of x, but fewer of y) and IV (where she has more units of y, but fewer of x). As an exercise, note that the indifference curves we found in example 2.7 were both strictly decreasing in x because both originated from a monotonic utility function.⁶

[Figure 2.4 A positively sloped IC. Regions I (northeast), II (southwest), III (southeast), and IV (northwest) around bundle $A = (x_A, y_A)$.]

[Figure 2.5 Two ICs intersecting at bundle A, with bundles B, C, D, and E.]

Preferences whose indifference curves are negatively sloped and either bowed in toward the origin or straight lines are often referred to as “convex” preferences.
Mathematically, this means that, for any two bundles A and B on the curve, such as (4, 9) and (9, 4) in figure 2.3b, a straight line connecting them lies: (1) strictly above the curve, thus yielding a higher utility level than bundles A or B, which occurs when the indifference curve is bowed in toward the origin, such as that in figure 2.3b; or (2) on the indifference curve, yielding the same utility level as either of the two points we connected, which happens when the indifference curve is a straight line. We return to this property, and its economic interpretation, in section 2.7.1.

Self-assessment 2.8 Consider a consumer with utility function $u(x, y) = 3y + 2x$ who seeks to reach a utility level u = 20. Solve for y to find her indifference curve. Is it increasing or decreasing? What if her utility function is $u(x, y) = 3y - 2x$?

6. Utility functions like $u(x, y) = by - ax$, where a, b > 0, have a positively sloped IC. To see this, consider a utility level of u = 10, and solve for good y, to obtain an indifference curve $y = \frac{10 + ax}{b} = \frac{10}{b} + \frac{a}{b} x$, which increases in x as in figure 2.4.

[Figure 2.6 Thick ICs violate monotonicity. A thick indifference curve containing bundle A and bundle B, which lies to the northeast of A.]

ICs cannot intersect. This property also follows from monotonicity. To illustrate why, figure 2.5 depicts a situation where two indifference curves intersect at bundle A. Let’s examine why these indifference curves violate monotonicity. First, bundle B lies to the northeast of bundle C, implying that the former contains larger amounts of both goods than bundle C does. With monotonicity, the consumer prefers the bundle with more units of both goods, B, entailing that $u_B > u_C$.⁷ Likewise, bundle D lies northeast of E. With monotonicity, the utility from consuming D is larger than that of E, so $u_D > u_E$. Finally, because bundles C and D lie on the same indifference curve, we must have that $u_C = u_D$. Similarly, bundles B and E lie on the same indifference curve, implying that $u_B = u_E$.
Combining these equalities with inequality $u_B > u_C$, we obtain that $u_E = u_B > u_C = u_D$, which implies $u_E > u_D$, contradicting the result found about bundles E and D ($u_D > u_E$). Therefore, monotonicity implies that indifference curves cannot intersect. As a corollary of this property, note that every consumption bundle lies on one, and only one, indifference curve. If, instead, a bundle could lie on two indifference curves (as bundle A does in figure 2.5), we would experience contradictions like the one just discussed.

ICs are not thick. This property also follows from monotonicity. To see this, figure 2.6 depicts a thick indifference curve. Starting from any bundle, such as A in the figure, we could find other bundles to the northeast of A, such as B. This bundle contains more units of both goods than A does, implying that, by monotonicity, the consumer reaches a higher utility level at B than at A. As a consequence, the consumer is not indifferent between bundles A and B; instead, she strictly prefers B to A. Therefore, monotonicity implies that indifference curves cannot be thick.⁸

7. The consumer is not indifferent between bundles B and C, which could occur only if both bundles yield the same utility level. If these two bundles lie on different indifference curves, they must yield strictly different utility levels.

8. When indifference curves are nonthick (such as that depicted in figure 2.3b), we cannot reproduce this argument. Starting from a bundle such as A, we cannot find other bundles to the northeast of A that lie on the same indifference curve. Instead, bundles to the northeast of A lie on indifference curves associated with higher utility levels.

[Figure 2.7 Interpreting $MRS_{x,y}$. Moving along the IC from bundle A to bundle B, the consumer gains 1 unit of x and gives up 3 units of y.]

2.7 Marginal Rate of Substitution

As noted in the previous discussion, indifference curves are negatively sloped when monotonicity holds.
We next present a more formal expression of the slope of the indifference curve.

Marginal rate of substitution (MRS): The rate at which a consumer is willing to give up units of good y as she receives an additional unit of good x, in order to keep her utility level constant. Formally, the MRS of good x for y is given by the ratio of marginal utilities:

$$MRS_{x,y} = \frac{MU_x}{MU_y}.$$

Intuitively, we start at a bundle on an indifference curve, such as bundle A of figure 2.7, and ask the consumer: If you could receive 1 more unit of good x, how many units of good y would you be willing to give up to keep your utility level unaffected? As depicted in figure 2.7, this question means that we move along the indifference curve, from a bundle like A, to a new bundle B, where the consumer has 1 more unit of good x but gives up 3 units of y to maintain her utility level. In other words, $MRS_{x,y}$ measures how much disutility from consuming fewer units of good y the individual is willing to suffer to receive 1 more unit of good x. Because the utility change from receiving more of good x is positive while the utility change from giving up units of good y is negative, the ratio of these two changes is negative, $\frac{(+)}{(-)} = (-)$, as depicted by the negative slope of the indifference curve in figure 2.7.⁹ (The appendix at the end of this chapter mathematically proves why the $MRS_{x,y}$ coincides with the ratio of marginal utilities.)

[Figure 2.8 Diminishing $MRS_{x,y}$, first interpretation: Preference for variety. Bundles A and B lie on indifference curve $u_1$; bundle C lies on the higher indifference curve $u_2$.]

2.7.1 Diminishing MRS

One interesting property of common utility functions is that they exhibit a diminishing MRS. Because the MRS graphically represents the slope of the indifference curve, this property implies that the indifference curve is relatively steep for small amounts of good x (on the left side of figure 2.7), but becomes flatter as we move rightward toward greater amounts of good x. This property can be interpreted according to two economic intuitions we discuss next.

1. First interpretation: Preference for variety.
A diminishing MRS implies that indifference curves are bowed in toward the origin. In this context, the consumer is indifferent between two extreme bundles, such as A in figure 2.8 (which contains many units of y, but few of x) and B (which has many units of x, but few of y), both bundles yielding the same utility level $u_1$. The consumer, however, prefers more balanced bundles, such as C, which yields a higher utility level $u_2$, where she consumes an intermediate amount of both goods x and y.

2. Second interpretation: Decreasing willingness to substitute. Starting at a bundle like A in figure 2.9, the individual is willing to give up several units of good y in order to obtain 1 more unit of x, because she has several units of y, but few of x. Her willingness to give up units of good y, however, decreases once she has more units of good x and few units of y (as in bundle C). That is, the consumer is willing to give up several units of the good that is relatively more abundant to obtain 1 unit of the good she lacks (this is the case of good x in bundle A). However, she becomes less willing to give up units of good y once she has few units of this good (as in bundle C).

[Figure 2.9 Diminishing $MRS_{x,y}$, second interpretation: Decreasing willingness to substitute. Moving along the IC through bundles A, B, C, and D, each additional unit of x is obtained by giving up fewer units of y (3 units between A and B).]

9. If, upon receiving 1 more unit of good x, the consumer did not give up units of y, her utility level would not necessarily be the same as at bundle A. Her utility level would be strictly higher when her preferences satisfy strict monotonicity, because she has 1 more unit of good x and the same number of units of good y. Her utility level, however, could coincide with that at bundle A if the individual’s preferences satisfy monotonicity, and if she is indifferent between A and a bundle containing more units of x than A does.

This interpretation can be seen in the MRS definition, $MRS_{x,y} = \frac{MU_x}{MU_y}$.
At a point like A in figure 2.9, the marginal utility from additional units of x is relatively high (because this good is scarce), while the marginal disutility from giving up y is relatively low (as the good is abundant), yielding a large ratio $\frac{MU_x}{MU_y}$, which entails a large MRS and a steep indifference curve. At point C, in contrast, the marginal utility from additional units of good x is now low (because this good became more abundant than at point A) and the marginal disutility from giving up y becomes high (as the good is now relatively scarce), yielding a small ratio $\frac{MU_x}{MU_y}$, which entails a low MRS and an almost flat indifference curve.

Example 2.8: Finding MRS. In this example, we examine three utility functions, where MRS is decreasing, constant, or increasing in good x, respectively. First, consider utility function $u(x, y) = x^{1/2} y^{1/2}$ from example 2.5 again, where we found the marginal utilities for goods x and y, $MU_x$ and $MU_y$. These can now be used to obtain:

$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{\frac{1}{2} x^{-\frac{1}{2}} y^{\frac{1}{2}}}{\frac{1}{2} x^{\frac{1}{2}} y^{-\frac{1}{2}}} = \frac{y^{\frac{1}{2} - \left(-\frac{1}{2}\right)}}{x^{\frac{1}{2} - \left(-\frac{1}{2}\right)}} = \frac{y}{x},$$

where we canceled the 1/2 on the numerator and denominator, and used the property that $\frac{x^a}{x^b} = x^{a-b}$ for exponents a and b. Therefore, we found that $MRS_{x,y} = \frac{y}{x}$, which is decreasing in good x, yielding indifference curves that are bowed in toward the origin, such as those in figures 2.7–2.9.

Consider now the linear utility function $u(x, y) = ax + by$, where a and b are positive parameters. In this situation, marginal utilities are $MU_x = a$ and $MU_y = b$, yielding

$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{a}{b},$$

which is constant in x. For instance, if a = 10 and b = 4, then $MRS_{x,y} = 2.5$, indicating that the slope of the indifference curve is −2.5 along all its points (i.e., a straight line).¹⁰

Lastly, consider a consumer with utility function $u(x, y) = ax^2 + by^3$.
In this context, marginal utilities are $MU_x = 2ax$ and $MU_y = 3by^2$, yielding

$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{2ax}{3by^2},$$

which is increasing in x.¹¹ Therefore, the indifference curve is relatively flat for low values of x, but becomes steeper as we move rightward along the x-axis, eventually becoming almost vertical. Graphically, indifference curves are bowed away from the origin.

Self-assessment 2.9 Eric’s utility function is $u(x, y) = x^{1/3} y^{2/3}$. Find his MRS and show whether it is increasing or decreasing in x. Repeat your analysis for Pam’s utility function, $u(x, y) = 3x + 2y$, and for Maria’s utility function, $u(x, y) = 3x - 2y$.

10. As an exercise, we can find the equation of the indifference curve in this utility function. For a given utility level u, we have $u = 10x + 4y$. Rearranging, we obtain $u - 10x = 4y$, which, solving for good y, yields $y = \frac{u}{4} - \frac{10}{4} x$. Graphically, this means that the indifference curve originates at a vertical intercept of $\frac{u}{4}$ (e.g., $\frac{20}{4} = 5$ if u = 20) and decreases with a slope of $-\frac{10}{4} = -2.5$, confirming what we found with the MRS. Of course, this slope is constant, as it does not depend on the amount of good x that the individual consumes.

11. For instance, if a = 10 and b = 5, then the MRS becomes $MRS_{x,y} = \frac{4x}{3y^2}$.

2.8 Special Types of Utility Functions

This section presents, in detail, five types of utility functions often used in economic applications, each capturing a consumer’s preference for different classes of goods.

2.8.1 Perfect Substitutes

[Figure 2.10 Perfect substitutes in consumption. Two straight-line indifference curves with slope $-\frac{a}{b}$, vertical intercepts $\frac{1}{b}$ and $\frac{2}{b}$, and horizontal intercepts $\frac{1}{a}$ and $\frac{2}{a}$.]

The consumer may regard two goods as close substitutes, such as two brands of unflavored mineral water (e.g., Aquafina versus Dasani), or two universal serial bus (USB) memory sticks (e.g., Sandisk and Kingston), as the consumer can use either good without significantly affecting her utility.
Other goods that are often regarded as substitutes are coffee and black tea, or butter and margarine. For these goods, the consumer’s utility function takes the form

$$u(x, y) = ax + by,$$

where a and b are positive parameters. (For instance, if a = 4 and b = 2, one unit of x gives the consumer twice as much utility as one unit of y, because $MU_x = 4$ and $MU_y = 2$, so that $MRS_{x,y} = 4/2 = 2$.) This utility is linear in both good x, because its marginal utility $MU_x = a$ is constant, and good y, because $MU_y = b$ is also a constant. As discussed in example 2.8, the MRS in linear utility functions is

$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{a}{b}.$$

Graphically, a constant MRS results in indifference curves that are represented by a straight line with slope $-\frac{a}{b}$. To see this point, solve for y in utility function $u(x, y) = ax + by$, which yields the equation of the indifference curve, $y = \frac{u}{b} - \frac{a}{b} x$, originating at $\frac{u}{b}$, decreasing at a rate of $\frac{a}{b}$, and crossing the horizontal axis at $\frac{u}{a}$. Figure 2.10 illustrates two indifference curves: one evaluated at u = 1, and another at u = 2.¹²

Recall that, intuitively, MRS measures the consumer’s willingness to give up units of good y to obtain 1 more unit of x, while keeping her utility level unaffected. Therefore, a constant MRS (i.e., a number) implies that the consumer’s willingness to substitute y for additional units of x is, in plain terms, “always the same,” that is to say, it remains unaffected by the relative scarcity of each good. In contrast, when MRS is decreasing, the consumer is willing to give up more units of good y when x becomes relatively scarce (in other words, good x becomes more attractive, in relative terms, as compared to the abundant good y).

12. In some settings, the utility function takes the form $u(x, y) = ax + ay = a(x + y)$, indicating that parameters a and b coincide. In this situation, the slope of the indifference curve simplifies to $-\frac{a}{a} = -1$.

Self-assessment 2.10 Chelsea’s utility function is $u(x, y) = 3x + 2y$.
Graph her indifference curve for utility levels u = 10 and u = 20.

2.8.2 Perfect Complements

The individual in this case must consume goods in fixed proportions, such as cars and gasoline, left and right shoes, or peanut butter and jelly sandwiches. In particular, her utility function takes the form

$$u(x, y) = A \min\{ax, by\},$$

where A, a, and b are positive parameters.¹³ For example, if A = 1 and a = b = 2, the utility function reduces to $u(x, y) = \min\{2x, 2y\} = 2 \min\{x, y\}$.

One interesting property of this utility function is that, if the consumer increases the amount of good x by one unit without increasing the amount of y, her utility does not necessarily increase. Specifically, when the consumer has at least as many units of x as of y ($x \geq y$), an increase in x does not increase her utility at all. However, when she has more units of y (y > x), an increase in x, the least abundant good, does increase her utility level. To illustrate this point, consider that the consumer has 10 units of each good, yielding a utility level of

$$u(10, 10) = \min\{2 \times 10, 2 \times 10\} = \min\{20, 20\} = 20.$$

If good x is now increased from 10 to 11 units, but good y is unaffected, her utility remains at the same level because

$$u(11, 10) = \min\{2 \times 11, 2 \times 10\} = \min\{22, 20\} = 20.$$

13. This utility function is often referred to as “Leontief,” after the economist who first conceptualized it, Wassily Leontief.

[Figure 2.11 Perfect complements in consumption. L-shaped indifference curves $u_1 = Aa$ and $u_2 = 2Aa$ with kinks along the ray $y = \frac{a}{b} x$; bundles C, D, and E lie on $u_1$.]

In other words, increasing the amount of one of the goods alone does not yield utility gains, as this consumer needs to enjoy both goods in fixed proportions. (Think about having more fuel-powered cars without having more gasoline!) Formally, this means that preferences for complementary goods violate the strict monotonicity property, because giving the consumer more units of only one good does not necessarily increase her utility.
Graphically, the indifference curves associated with this utility function have an L-shape, as figure 2.11 illustrates. Starting from the bundle at the kink, like C, an increase in good x alone (moving rightward) does not increase the consumer’s utility, as depicted by bundle E, which lies on the same indifference curve as bundle C. A similar argument applies if we increase good y alone (moving upward from C) as depicted by bundle D, which also lies on the same indifference curve as bundle C.

The kink occurs at points where the two arguments inside $\min\{ax, by\}$ coincide, that is, at $ax = by$. Solving for y, we find $y = \frac{a}{b} x$, as depicted in the ray in figure 2.11 that crosses all indifference curves at their kinks.¹⁴ While the slope of this indifference curve is zero in its flat segment (see the portion on the right side of bundle C), and −∞ in the vertical segment (the portion above bundle C), it is undefined at the kink. Graphically, we could depict infinitely many tangent lines at the kink to define the slope of the indifference curve, so we could not identify a unique and precise slope at this kink.

14. In addition, at bundle $C = \left(1, \frac{a}{b}\right)$ of figure 2.11, the consumer’s utility becomes $u\left(1, \frac{a}{b}\right) = A \min\left\{a \times 1, b \frac{a}{b}\right\} = A \min\{a, a\} = Aa$, as illustrated in utility level $u_1 = Aa$ in the figure. A similar argument applies to bundle $E = \left(2, \frac{a}{b}\right)$, where her utility is $u\left(2, \frac{a}{b}\right) = A \min\left\{a \times 2, b \frac{a}{b}\right\} = A \min\{2a, a\}$. Because a < 2a, we find that her utility is $A \min\{2a, a\} = Aa$, which coincides with her utility in bundle C. As a practice exercise, find her utility at bundle D, confirming that it is also Aa.

Self-assessment 2.11 John’s utility function is given by $u(x, y) = 3 \min\{x, 2y\}$. Graph his indifference curve for utility levels u = 10 and u = 20.
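The behavior of perfect complements described in this subsection is easy to verify numerically. The sketch below is only illustrative: the function names `leontief` and `kink_y` and the default parameter values are assumptions chosen to match the example with A = 1 and a = b = 2, and the second check uses a = 2 and b = 4 as arbitrary positive parameters.

```python
def leontief(x, y, A=1.0, a=2.0, b=2.0):
    """Utility u(x, y) = A * min{a*x, b*y} for perfect complements."""
    return A * min(a * x, b * y)


def kink_y(x, a=2.0, b=2.0):
    """Amount of y at the kink for a given x, i.e., on the ray a*x = b*y."""
    return a * x / b


# More of good x alone leaves utility unchanged; more of both goods raises it.
u_base = leontief(10, 10)       # min{20, 20} = 20
u_more_x = leontief(11, 10)     # min{22, 20} = 20
u_more_both = leontief(11, 11)  # min{22, 22} = 22
print(u_base, u_more_x, u_more_both)

# Bundles C = (1, a/b) and E = (2, a/b) yield the same utility, A*a (here a = 2, b = 4).
u_C = leontief(1, kink_y(1, a=2.0, b=4.0), a=2.0, b=4.0)
u_E = leontief(2, kink_y(1, a=2.0, b=4.0), a=2.0, b=4.0)
print(u_C, u_E)
```

Because bundle E only adds units of x relative to the kink bundle C, the `min` operator keeps returning the value of the scarcer argument, which is why both bundles sit on the same L-shaped indifference curve.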
2.8.3 Cobb-Douglas

The Cobb-Douglas utility function¹⁵ is an intermediate case lying between the utility functions in subsections 2.8.1 and 2.8.2, because the consumer regards goods as neither perfectly substitutable nor complementary. We have encountered this utility function in previous examples throughout this chapter, but we present a more general form here:

$$u(x, y) = A x^{\alpha} y^{\beta},$$

where A, α, and β are positive parameters. As described in previous examples, $MU_x = A \alpha x^{\alpha - 1} y^{\beta}$ and $MU_y = A \beta x^{\alpha} y^{\beta - 1}$, and they yield

$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{A \alpha x^{\alpha - 1} y^{\beta}}{A \beta x^{\alpha} y^{\beta - 1}} = \frac{\alpha y^{\beta - (\beta - 1)}}{\beta x^{\alpha - (\alpha - 1)}} = \frac{\alpha y}{\beta x}.$$

Its MRS is then decreasing in x (i.e., as x increases, the ratio goes down), thus producing indifference curves that are bowed in toward the origin. Graphically, indifference curves become flatter as we move rightward to higher values of x. This type of utility function embodies the following functions as special cases, which are often used in economic analysis:

1. A = α = β = 1, which reduces the utility function to $u(x, y) = xy$. In this case, the MRS simplifies to $MRS_{x,y} = \frac{y}{x}$ because α = β.
2. A = 1 and α = β, which simplifies the function to $u(x, y) = x^{\alpha} y^{\alpha} = (xy)^{\alpha}$. In this situation, the MRS also reduces to $MRS_{x,y} = \frac{y}{x}$.
3. A = 1 and β = 1 − α, which yields a utility function of $u(x, y) = x^{\alpha} y^{1 - \alpha}$. The MRS now simplifies to $MRS_{x,y} = \frac{\alpha}{1 - \alpha} \frac{y}{x}$.

Common examples of case (2) are $u(x, y) = (xy)^{1/2}$ and $u(x, y) = (xy)^{1/3}$, and examples of case (3) include $u(x, y) = x^{1/3} y^{2/3}$ and $u(x, y) = x^{1/4} y^{3/4}$.

Self-assessment 2.12 Maria’s utility function is $u(x, y) = 5 x^{1/2} y^{1/4}$. Graph her indifference curve for utility levels u = 10 and u = 20.

15. This function was developed in 1927 by Paul Douglas (an economist and U.S. senator) and Charles Cobb (mathematician and economist).

Lastly, note that the exponents in the Cobb-Douglas utility function can be interpreted as elasticities.
Before we show this result, we define “utility elasticity.”

Utility elasticity of good x, $\varepsilon_{u,x}$. The percentage increase in utility (if $\varepsilon_{u,x} > 0$) or percentage decrease in utility (if $\varepsilon_{u,x} < 0$) that the consumer experiences after increasing the amount of good x she consumes by 1 percent.

More formally,

$$\varepsilon_{u,x} = \frac{\%\Delta u(x, y)}{\%\Delta x}.$$

Rearranging this expression, we obtain

$$\varepsilon_{u,x} = \frac{\%\Delta u(x, y)}{\%\Delta x} = \frac{\Delta u(x, y)/u(x, y)}{\Delta x/x} = \frac{\Delta u(x, y)}{\Delta x}\,\frac{x}{u(x, y)}.$$

When the increase in the amount of good x is marginally small, we can rewrite this expression as follows:

$$\varepsilon_{u,x} = \frac{\partial u(x, y)}{\partial x}\,\frac{x}{u(x, y)},$$

where the first term simply represents the marginal utility of good x and the second term is a ratio with the amount of good x that the individual consumes in the numerator and her utility function in the denominator. Equipped with the definition of utility elasticity $\varepsilon_{u,x}$, we can apply it to the Cobb-Douglas utility function to find that

$$\varepsilon_{u,x} = \frac{\partial u(x, y)}{\partial x}\,\frac{x}{u(x, y)} = A\alpha x^{\alpha-1}y^{\beta}\,\frac{x}{Ax^{\alpha}y^{\beta}},$$

because $\frac{\partial u(x, y)}{\partial x} = A\alpha x^{\alpha-1}y^{\beta}$. Simplifying this expression yields

$$\varepsilon_{u,x} = \frac{A\alpha x^{\alpha-1+1}y^{\beta}}{Ax^{\alpha}y^{\beta}} = \frac{A\alpha x^{\alpha}y^{\beta}}{Ax^{\alpha}y^{\beta}} = \alpha.$$

Hence, when facing a utility function like $u(x, y) = Ax^{\alpha}y^{\beta}$, we can claim that the exponent in good x, α, represents the utility elasticity of a marginal increase in x. Intuitively, a 1 percent increase in the amount of good x increases utility by α percent. A similar argument applies to good y, whose utility elasticity is β.

Self-assessment 2.13 Consider Maria’s utility function again, $u(x, y) = 5x^{1/2}y^{1/4}$. What is the utility elasticity of good x? And of good y? Interpret.

2.8.4 Quasilinear

The quasilinear utility function is often used in economic applications analyzing consumers who use all their additional income on one good alone (e.g., good y, or video games). Alternatively, additional income is never spent on good x.
This occurs for goods such as garlic and toothpaste, whose consumption is relatively unaffected by an individual’s income.16 Generally, the quasilinear utility function has the form

$$u(x, y) = v(x) + by,$$

where b is a positive constant, and v(x) is a nonlinear function in x, such as $v(x) = x^{1/2}$ or $v(x) = \ln x$.17 Other commonly used nonlinear functions in x include $v(x) = axy$ because its derivative is $v'(x) = ay$, which is not a constant, or generally, any function v(x) whose derivative with respect to x, $v'(x)$, is not a constant, but instead depends on the units of good x, good y, or both. In this context, marginal utilities are $MU_x = v'(x)$ and $MU_y = b$, which yield

$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{v'(x)}{b}.$$

Hence, for a given value of x, the MRS is constant because it does not depend on the amount of good y. To illustrate this result, consider a consumer with quasilinear utility function $u(x, y) = x^{1/2} + 3y$, so that $v(x) = x^{1/2}$ and b = 3. Hence, $v'(x) = \frac{1}{2}x^{-1/2}$, and the MRS becomes

$$MRS_{x,y} = \frac{\frac{1}{2}x^{-1/2}}{3} = \frac{1}{6\sqrt{x}}.$$

Therefore, for a given value of x, such as x = 16, we obtain $MRS_{x,y} = \frac{1}{6\sqrt{16}} = \frac{1}{24}$, which is constant in y (i.e., it does not depend on y). Graphically, this result says that if we fix the value of good x (e.g., x = 16 units, as depicted in figure 2.12), the slope of the indifference curve ($MRS_{x,y}$) is unaffected by the amount of good y. In other words, if we extend a vertical line at a given value of x, such as x = 16, the slope is the same at every indifference curve crossed by this vertical line. This implies that indifference curves are parallel shifts of each other (in this case, vertical shifts), indicating that the consumer will use additional income to buy good y alone (the good that entered linearly in her utility function), but additional income is not used to purchase more units of good x (the good that entered nonlinearly).

[Figure 2.12 Quasilinear utility: parallel ICs. Good x on the horizontal axis (with x = 16 marked), good y on the vertical axis; indifference curves u1, u2, and u3.]

16. Other examples may include fads, such as hula hoops or pet rocks, for which all additional income is spent on the fad item. A similar argument applies for recent fads, such as Star Wars figures or Pokemon Go. While the Pokemon Go app is free, players need PokeCoins to buy useful items and for inventory upgrades, generating a revenue of $1.8 billion in the two years after its launch.

17. Recall that when we say that function v(x) is nonlinear, we mean that its derivative with respect to x, $v'(x)$, is not a constant (i.e., it depends on the amount of good x or y). In the previous examples, this derivative is $v'(x) = \frac{1}{2}x^{-1/2}$ and $v'(x) = \frac{1}{x}$, respectively; both are functions of x, as required for v(x) to be nonlinear.

Self-assessment 2.14 Eric’s utility function is $u(x, y) = x^{1/3} + \frac{1}{4}y$. Find his MRS and depict his indifference curve for utility levels u = 10 and u = 20.

2.8.5 Stone-Geary

The Stone-Geary utility function18 takes a Cobb-Douglas shape, but requires that individuals have a minimum amount of each good they require to live, such as half a gallon of water or 2,200 food calories per day. Using $\bar{x}$ and $\bar{y}$ to represent the minimal amounts of goods x and y that the individual needs, the utility function is written as

$$u(x, y) = A(x - \bar{x})^{\alpha}(y - \bar{y})^{\beta},$$

where A, α, and β are positive constants. Intuitively, the individual obtains a positive utility from good x only after exceeding her minimal consumption $\bar{x}$ (i.e., when $x > \bar{x}$); and the same applies for the utility from good y, when $y > \bar{y}$. Otherwise, her utility from good x, good y, or both is negative. Note that when the minimal amounts of x and y are both zero ($\bar{x} = \bar{y} = 0$), this utility function reduces to $u(x, y) = Ax^{\alpha}y^{\beta}$, thus coinciding with the standard Cobb-Douglas expression discussed previously. For the general Stone-Geary function, the marginal utilities are

$$MU_x = A\alpha(x - \bar{x})^{\alpha-1}(y - \bar{y})^{\beta} \quad \text{and} \quad MU_y = A\beta(x - \bar{x})^{\alpha}(y - \bar{y})^{\beta-1},$$

thus implying that the MRS is

$$MRS_{x,y} = \frac{MU_x}{MU_y} = \frac{A\alpha(x - \bar{x})^{\alpha-1}(y - \bar{y})^{\beta}}{A\beta(x - \bar{x})^{\alpha}(y - \bar{y})^{\beta-1}} = \frac{\alpha(y - \bar{y})^{\beta-(\beta-1)}}{\beta(x - \bar{x})^{\alpha-(\alpha-1)}} = \frac{\alpha(y - \bar{y})}{\beta(x - \bar{x})}.$$

Interestingly, when the minimal amounts of x and y that the individual must consume are zero ($\bar{x} = \bar{y} = 0$), this MRS collapses to

$$MRS_{x,y} = \frac{\alpha(y - 0)}{\beta(x - 0)} = \frac{\alpha y}{\beta x},$$

which coincides with the MRS found in the Cobb-Douglas case. As in that case, indifference curves are here bowed in toward the origin.

18. This utility function was first derived by Roy C. Geary, and later, it was empirically estimated by Richard Stone.

Self-assessment 2.15 Ana’s utility function is $u(x, y) = 5(x - 2)^{1/2}(y - 1)^{1/3}$. Find her marginal utilities and her MRS, and check if it is decreasing in x.

2.9 A Look at Behavioral Economics: Social Preferences

The utility functions considered in the previous sections assume that the individual cares about the bundle she receives but ignores the bundle (or money) that other individuals enjoy. However, we can imagine many scenarios where we care about the well-being of family members, friends, or even strangers we just watched on television. In this section, we explore two of the utility functions suggested by the literature to account for this behavioral pattern, where individuals exhibit social, rather than selfish, preferences. Generally, the field of behavioral economics relaxes standard assumptions in economics, such as selfishness, unbounded rationality, or the individual’s unlimited willpower to resist temptations.19 We examine a few topics in behavioral economics in chapters 6, 9, 13, 15, and 17, providing reading recommendations.

19. For an introduction to behavioral economics, see Just (2013) and Angner (2016), or the more advanced presentation in Camerer (2003).

2.9.1 Fehr-Schmidt Social Preferences

Fehr and Schmidt (1999) suggested a utility function that captures social preferences. For simplicity, let us consider a context with two individuals, 1 and 2, and let $x_1$ and $x_2$ represent their respective incomes. When individual 2 is richer than 1, $x_2 > x_1$, the utility of individual 1 becomes

$$x_1 - \underbrace{\alpha(x_2 - x_1)}_{\text{Disutility from envy}},$$

where parameter $\alpha \geq 0$ denotes the disutility that individual 1 suffers from envy because she is poorer than individual 2. In contrast, when individual 2 is poorer than 1, $x_2 < x_1$ (so individual 1 is richer), the utility of individual 1 is

$$x_1 - \underbrace{\beta(x_1 - x_2)}_{\text{Disutility from guilt}},$$

where parameter $\beta \geq 0$ reflects the disutility that individual 1 suffers from guilt because she is richer than individual 2.20 As a special case, note that when all parameters are zero, α = β = 0, this utility function reduces to $x_1$, both when $x_2 > x_1$ and otherwise, which reflects standard (selfish) preferences as the individual only cares about her income, $x_1$, and does not suffer from envy or guilt.

20. Fehr and Schmidt (1999) assumed that individuals suffer more envy than guilt, which means that parameters α and β satisfy $\alpha \geq \beta$.

2.9.2 Bolton and Ockenfels Social Preferences

Bolton and Ockenfels (2000) proposed a relatively more general utility function than did Fehr and Schmidt (1999). For the case of two individuals, 1 and 2, the utility function of individual 1 can be expressed as

$$u_1\left(x_1, \frac{x_1}{x_1 + x_2}\right),$$

where the first argument in the parentheses is interpreted as the selfish component because individual 1 considers only her own wealth $x_1$. The second argument measures the share that individual 1’s wealth represents of the total wealth in the group, $\frac{x_1}{x_1 + x_2}$, or her situation relative to the group, thus capturing social preferences. An example of this utility function can be

$$u_1\left(x_1, \frac{x_1}{x_1 + x_2}\right) = x_1 + \alpha\left(\frac{x_1}{x_1 + x_2}\right)^{1/2}.$$

If parameter $\alpha \geq 0$, individual 1 enjoys a utility from owning a larger share of total wealth. If, instead, α < 0, individual 1 suffers from owning a larger share of wealth.

Appendix. Finding the Marginal Rate of Substitution

We increase good x by 1 unit and seek to measure how many units of good y the consumer must give up to preserve her current utility level. Because we simultaneously alter the amount of x and y, we totally differentiate the utility function u(x, y) to obtain

$$du = \frac{\partial u(x, y)}{\partial x}dx + \frac{\partial u(x, y)}{\partial y}dy.$$

Because the consumer is moving along an indifference curve, her utility level does not vary, implying that du = 0. Plugging this result into the left side, and using $MU_x = \frac{\partial u(x, y)}{\partial x}$ and $MU_y = \frac{\partial u(x, y)}{\partial y}$, we obtain

$$\underbrace{0}_{du = 0} = MU_x\,dx + MU_y\,dy,$$

or, rearranging, $-MU_y\,dy = MU_x\,dx$. Lastly, because we are interested in the rate at which y changes for a 1-unit increase in good x, we solve for $\frac{dy}{dx}$,21 to obtain

$$-\frac{dy}{dx} = \frac{MU_x}{MU_y},$$

as required. Therefore, the slope of the indifference curve, $-\frac{dy}{dx}$, coincides with the ratio of marginal utilities $\frac{MU_x}{MU_y}$. For compactness, this ratio is referred to as the marginal rate of substitution between goods x and y, or $MRS_{x,y}$.

Exercises

1. Indifference Curves.B Answer the following questions for each of the utility functions in table 2.1.
(a) Find the marginal utility for good x and y, $MU_x$ and $MU_y$.
(b) Are these marginal utilities positive? Are they strictly positive? Connect your results with the properties of monotonicity and strict monotonicity.
(c) Find $MRS = \frac{MU_x}{MU_y}$. Does MRS increase in the amount of good x?
(d) Depict an indifference curve reaching a utility level of u = 10, and another indifference curve of u = 20. Do the indifference curves cross either axis?
(e) Provide an example of goods that you think can be represented with each utility function in table 2.1.

2. Plotting Curves.A Find the indifference curve for each of the utility functions in table 2.1, evaluating all of them at a utility level of u = 10 units. (Hint: You just need to plug in u = 10 and solve for y.) Are these indifference curves negatively sloped?
Plot each indifference curve by considering three values for good x, such as x = 1, 2, and 4, and finding the corresponding value of good y.

21. Starting from $-MU_y\,dy = MU_x\,dx$, we divide both sides by dx, which yields $MU_y\left(-\frac{dy}{dx}\right) = MU_x$; and then, dividing both sides by $MU_y$, we find $-\frac{dy}{dx} = \frac{MU_x}{MU_y}$.

For all utility functions in table 2.1, assume that the parameters take the values A = 1, a = 2, b = 3, α = 0.5, and β = 0.5.

3. One Way or Another.B Consider an individual with utility function $u(x, y) = \min\{x + 2y, 2x + y\}$. Plot her indifference curve at a utility level of u = 10 units. Interpret.

4. Perfect Complements.A Consider a consumer with utility function $u(x, y) = \min\{3x, 4y\}$.
(a) Depict her indifference curve at a utility level of u = 20.
(b) Depict bundles A = (10, 10), B = (14, 10), and C = (10, 14).
(c) Find the utility levels that the consumer reaches at bundles A, B, and C.

5. Cobb-Douglas.A Consider an individual with the Cobb-Douglas utility function $u(x, y) = \sqrt{x}\sqrt{y}$. Assume that her income is $I = \$120$, the price of good x is $p_x = \$4$, and the price of good y is $p_y = \$10$.
(a) Find the marginal utility of good x, $MU_x$, and that of good y, $MU_y$.
(b) Given the results in part (a), does this utility function satisfy monotonicity? What about strict monotonicity?
(c) Using the marginal utilities you found in part (a), find the MRS.

6. Marginal Rate of Substitution–I.A Find the MRS for each of the utility functions in table 2.1. Are the MRS that you found diminishing? Provide an economic interpretation for each MRS.

7. Perfect Substitutes.A Consider a consumer with utility function $u(x, y) = ax + by$.
(a) For a given utility level u(x, y) = 10, find the equation of the indifference curve. (Hint: Set u = 10 and solve for y.)
(b) Find the marginal utilities $MU_x$ and $MU_y$.
(c) Find MRS. Does it increase in the amount of good x?
(d) Does this utility function satisfy strict monotonicity? What about monotonicity?
And local nonsatiation?

8. Examples of Goods Fitting Each Utility Function.A Consider a scenario with only two goods, x and y. For each of the following utility functions, provide two examples (other than those given in this book), justifying why each utility function represents preferences for that type of good.
(a) Perfect substitutes.
(b) Perfect complements.
(c) Cobb-Douglas.
(d) Stone-Geary.

9. Increasing Transformations.B Chelsea has the following Cobb-Douglas utility function: $u(x, y) = xy$. Assume that we apply any of the following transformations. Show that when we consider increasing transformations, Chelsea’s ordering of bundles A = (1, 2) and B = (3, 8) is unaffected (i.e., she still prefers B to A). When we consider decreasing transformations, show that Chelsea’s ordering of bundles A and B may be affected.
(a) $v(x, y) = [u(x, y)]^2$
(b) $v(x, y) = \ln[u(x, y)]$
(c) $v(x, y) = 5[u(x, y)]$
(d) $v(x, y) = \frac{1}{u(x, y)}$
(e) $v(x, y) = 7[u(x, y)] - 2$

10. Finding Properties–I.A Eric’s preferences for books, x, and computers, y, can be represented with the following Cobb-Douglas utility function: $u(x, y) = x^3y^2$.
(a) Find Eric’s marginal utility for books, $MU_x$, and for computers, $MU_y$.
(b) Are his preferences monotonic (i.e., weakly increasing in both goods)?
(c) For a given utility level u, solve the utility function for y to obtain Eric’s indifference curve.
(d) Find Eric’s MRS between x and y. Interpret your results.
(e) Are his preferences convex (i.e., bowed in toward the origin)?
(f) Consider a given utility level of 10 utils. Plot his indifference curve in this case.

11. Finding Properties–II.C Repeat exercise 10, but assume now that Eric’s preferences are represented with the following (Stone-Geary) utility function: $u(x, y) = 2\left(x^3 - 1\right)\left(y^2 - 2\right)$.
(a) Find Eric’s marginal utility for books, $MU_x$, and for computers, $MU_y$.
(b) Are his preferences monotonic (i.e., weakly increasing in both goods)?
(c) For a given utility level u, solve the utility function for y to obtain Eric’s indifference curve.
(d) Find Eric’s MRS between x and y. Interpret your results.
(e) Are his preferences convex (i.e., bowed in toward the origin)?
(f) Consider a given utility level of 10 utils. Plot his indifference curve in this case.

12. Finding Properties–III.B Repeat exercise 10, but assume now that Eric’s preferences are represented with the following (linear) utility function: $u(x, y) = 3x + 4y$.
(a) Find Eric’s marginal utility for books, $MU_x$, and for computers, $MU_y$.
(b) Are his preferences monotonic (i.e., weakly increasing in both goods)?
(c) For a given utility level u, solve the utility function for y to obtain Eric’s indifference curve.
(d) Find Eric’s MRS between x and y. Interpret your results.
(e) Are his preferences convex (i.e., bowed in toward the origin)?
(f) Consider a given utility level of 10 utils. Plot his indifference curve in this case.

13. Finding Properties–IV.B Repeat the previous exercise, but assume now that Eric’s preferences are represented with the following utility function for two goods regarded as complements in consumption: $u(x, y) = \min\{3x, 4y\}$.
(a) Find Eric’s marginal utility for books, $MU_x$, and for computers, $MU_y$.
(b) Are his preferences monotonic (i.e., weakly increasing in both goods)?
(c) For a given utility level u, solve the utility function for y to obtain Eric’s indifference curve.
(d) Find Eric’s MRS between x and y. Interpret your results.
(e) Are his preferences convex (i.e., bowed in toward the origin)?
(f) Consider a given utility level of 10 utils. Plot his indifference curve in this case.

14. Envious Preferences.B Peter’s preferences are represented by the utility function $u(x, y) = 4x + 2(x - y)$, where x denotes the amount of books he has, while y represents the amount of books his friend has. Intuitively, when x > y, he owns more books than his friend, and his utility increases.
When x < y, he owns fewer books than his friend and his utility decreases (he suffers from envy). (a) Find Peter’s marginal utility for the books he owns, MUx , and for his friend’s, MUy . (b) Are his preferences monotonic (i.e., weakly increasing in both goods)? (c) For a given utility level u, solve the utility function for y to obtain Peter’s indifference curve. (d) Find Peter’s MRS between x and y. Interpret your results. (e) Are his preferences convex (i.e., bowed in toward the origin)? (f) Consider a given utility level of 10 utils. Plot his indifference curve in this case. 15. Guilty Preferences.B Repeat exercise 14, but assume that Peter’s utility function is now u(x, y) = 4x − (x − y); that is, he suffers from guilt when he owns more books than his friend (x > y). Relative to envy aversion in exercise 14, guilt aversion reduces Peter’s utility less dramatically. Stated in words, Peter cares more about feeling envy than about feeling guilt. (a) Find Peter’s marginal utility for the books he owns, MUx , and for his friend’s, MUy . (b) Are his preferences monotonic (i.e., weakly increasing in both goods)? (c) For a given utility level u, solve the utility function for y to obtain Peter’s indifference curve. (d) Find Peter’s MRS between x and y. Interpret your results. (e) Are his preferences convex (i.e., bowed in toward the origin)? (f) Consider a given utility level of 10 utils. Plot his indifference curve in this case. 16. Toddler Rationality.A Eric’s daughter is 4 years old. He has spent a long time trying to figure out her favorite animal. Today he is asking her to pick which animal she prefers among two animals at a time. 
He learned the following strict preferences:

Animal 1    Animal 2    Preferred Animal
Unicorn     Dog         Unicorn
Rabbit      Pig         Rabbit
Cat         Pig         Pig
Rabbit      Unicorn     Rabbit
Duck        Cat         Duck
Pony        Cat         Cat
Rabbit      Duck        Duck
Pig         Dog         Pig
Pony        Rabbit      Pony
Unicorn     Pony        Pony
Dog         Rabbit      Dog

In this table, the first row indicates that when presented with a unicorn or a dog, his daughter would strictly prefer a unicorn. Assume that Eric’s daughter could compare any two animals (i.e., that she has complete preferences). Is Eric’s daughter rational? Explain why or why not. (Note: Some of these relations are redundant.)

17. Protein Preferences.A John is out to dinner and is presented a menu of items that contains a beef dish and a chicken dish, and John chooses the beef dish. Before the server walks away with John’s order, she remembers that the menu is actually out of date and a fish dish is available, as well.
(a) Suppose that John remains with his original order of the beef dish. What does this imply about John’s preferences for all three dishes?
(b) Suppose that John switches his order to the fish dish. What does this imply about John’s preferences for all three dishes?
(c) Suppose that John switches his order to the chicken dish. What does this imply about John’s preferences for all three dishes?

18. Inverse Utility.C Consider a consumer with the utility function $u(x, y) = \frac{1}{xy}$.
(a) Does this utility function satisfy monotonicity?
(b) Does this utility function satisfy local nonsatiation?

19. Eating Pizza.A While out for dinner one night, Peter orders a large pepperoni pizza for himself. After eating the first slice, he remarks that the pizza is delicious, and he’ll have another slice. Slices two and three continue to receive accolades, but less so, with Peter expressing that he is starting to feel full after the third slice. Peter decides to have a fourth slice, after which he decides that he is full and prefers to eat no more pizza.
(a) What is happening to Peter regarding his utility?
(b) Suppose that Peter was dared to eat a fifth slice of pizza and accepted. Afterward, he complains that he feels ill and leaves for the bathroom (rather hurriedly). What has happened to Peter’s utility?

20. Discrete Marginal Rate of Substitution.A On the weekend, Eric traveled to a local barter market to exchange some of his apples for cheese. Bringing 20 apples with him, he offers 5 apples to the first cheese merchant he sees, in exchange for a wedge of cheese. Eric offers only 4 of his apples to another cheese merchant for his second wedge of cheese, and only 2 apples to a third merchant for his third wedge of cheese.
(a) What is Eric’s MRS as he moves from 0 to 1, then 1 to 2, and finally 2 to 3 wedges of cheese?
(b) Why does Eric offer fewer apples for each additional wedge of cheese he obtains?

3 Consumer Choice

3.1 Introduction

In chapter 2, we learned about consumer preferences over different bundles, and how they help us represent an individual’s ranking over different alternatives. In addition, we discussed how to represent these preferences with a utility function to measure how much utility a consumer derives from different bundles. However, we were silent about which bundles are affordable for the consumer to buy (as if she had unlimited resources!). To determine how a consumer chooses among different bundles, we need to consider not only her preferences, but also her budget. In this chapter, we first describe how to represent such a budget constraint, and then combine the consumer’s utility function and her budget constraint to identify her optimal consumption choices.

3.2 Budget Constraint

Budget constraint. The set of bundles that a consumer can afford, given the price of each good and her income.
For instance, the budget set for two goods (units of food, x, and units of clothing, y) is

$$p_x x + p_y y \leq I,$$

where $p_x$ represents the price of each unit of food, $p_y$ denotes the price of each unit of clothing, and I represents the consumer’s available income to spend on food and clothing. Intuitively, the budget set says that the total dollar amount that the consumer spends on food, $p_x x$, plus the total dollar amount she spends on clothing, $p_y y$, cannot exceed her available income, I. For example, if the price of good x is $p_x = \$10$, that of good y is $p_y = \$20$, and the consumer has an income of $I = \$400$ to spend on either good, her budget constraint is $10x + 20y \leq 400$.

[Figure 3.1 A budget line. Good x on the horizontal axis, good y on the vertical axis; vertical intercept $I/p_y$, horizontal intercept $I/p_x$, slope $-p_x/p_y$.]

Bundles (x, y) that satisfy this budget constraint strictly, $p_x x + p_y y < I$, imply that the consumer does not use all her income, whereas bundles for which the constraint holds with equality, $p_x x + p_y y = I$, mean that the consumer spends all of her income. We often refer to this last equation, $p_x x + p_y y = I$, as the budget line (see figure 3.1).

Because good y is on the vertical axis, we can rearrange budget line $p_x x + p_y y = I$ to $p_y y = I - p_x x$, and then solve for y, to obtain

$$y = \frac{I}{p_y} - \frac{p_x}{p_y}x,$$

where $\frac{I}{p_y}$ represents the vertical intercept, and $-\frac{p_x}{p_y}$ is the slope of the budget line. We can also find the horizontal intercept in figure 3.1 by setting y = 0 in the equation for the budget line $p_x x + p_y y = I$ (because at this point, its height is zero). This yields $p_x x + p_y \cdot 0 = I$, or $p_x x = I$. After solving for x, we find $x = \frac{I}{p_x}$, as depicted on the horizontal axis of figure 3.1.

At the vertical (horizontal) intercept, the consumer spends all her income on good y (good x), so she can afford $\frac{I}{p_y}$ units ($\frac{I}{p_x}$ units) of this good. At all other points along the budget line, however, she purchases positive units of both goods.
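The intercepts and slope of the budget line in the numerical example above ($p_x = \$10$, $p_y = \$20$, $I = \$400$) can be computed directly. The following is a minimal sketch; the function names are hypothetical, and the two bundles tested for affordability are chosen only for illustration.

```python
def budget_line(px, py, income):
    """Return (vertical intercept, horizontal intercept, slope)
    of the budget line px*x + py*y = income, with y on the vertical axis."""
    return income / py, income / px, -px / py

def is_affordable(x, y, px, py, income):
    """True if bundle (x, y) satisfies px*x + py*y <= income."""
    return px * x + py * y <= income

px, py, income = 10, 20, 400
v_int, h_int, slope = budget_line(px, py, income)
print(v_int, h_int, slope)

# Illustrative bundles (not from the text): one inside the budget set, one outside.
inside = is_affordable(10, 10, px, py, income)   # costs 300
outside = is_affordable(30, 10, px, py, income)  # costs 500
print(inside, outside)
```

In this example the consumer can afford at most 20 units of good y or 40 units of good x, and each additional unit of x costs her half a unit of y.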
The slope of the budget line, $-\frac{p_x}{p_y}$, tells us how many units of the good on the y-axis the consumer must give up to buy 1 more unit of the good on the x-axis, as we move from left to right on figure 3.1.

Continuing with our previous example, if $p_x = \$10$ and $p_y = \$20$, the slope of the budget line is $-\frac{p_x}{p_y} = -\frac{10}{20}$, or $-\frac{1}{2}$. In this case, the slope tells us that the consumer must give up 1/2 units of good y to acquire 1 more unit of good x, because good y is twice as expensive as good x. Alternatively, she must give up 1 unit of good y to purchase 2 more units of good x.

Self-assessment 3.1 Eric faces prices $p_x = \$13$ and $p_y = \$18$, and income $I = \$250$. Plot his budget line, finding the vertical and horizontal intercepts, and its slope. Interpret.

Changes in income. An increase in income, I, shifts the budget line outward in a parallel fashion. To see this effect, notice that when income increases from I to $I'$, where $I' > I$, the horizontal intercept increases from $\frac{I}{p_x}$ to $\frac{I'}{p_x}$, and so does the vertical intercept from $\frac{I}{p_y}$ to $\frac{I'}{p_y}$, as depicted in figure 3.2. Graphically, the increase in income moves the vertical intercept upward and the horizontal intercept rightward. In addition, this shift is parallel to the initial budget line because its slope, $-\frac{p_x}{p_y}$, is unaffected by a change in income (i.e., the slope is not a function of the individual’s income I).

[Figure 3.2 Budget line (BL) after an increase in income. Good x on the horizontal axis, good y on the vertical axis; vertical intercepts $I/p_y$ and $I'/p_y$, horizontal intercepts $I/p_x$ and $I'/p_x$; both lines have slope $-p_x/p_y$.]

Intuitively, as her income increases (holding prices constant), the consumer can afford a larger set of bundles. That is, she can afford more units of good x (because the horizontal intercept moves rightward), more units of good y (because the vertical intercept moves upward), or more units of both goods.1 A decrease in income would have the opposite effect on the budget line, of course, shifting it inward (closer to the origin) in a parallel fashion.

1. To see this last point graphically, you can depict a 45-degree line (upward diagonal) on figure 3.2. This 45-degree line crosses both the original and the new budget lines at a point where the consumer purchases the same amount of both goods (i.e., x = y). However, at the crossing point with the new budget line, the individual can purchase more units than at the crossing point with the original budget line.

Self-assessment 3.2 Consider self-assessment 3.1 again. If Eric’s income increases to $I' = \$540$, find his budget line. What are the new vertical and horizontal intercepts? Does the slope of the budget line change?

Changes in prices. An increase in the price of one good, such as $p_x$, pivots the budget line inward, as illustrated in figure 3.3a. In particular, the vertical intercept $\frac{I}{p_y}$ is unaffected by changes in $p_x$, whereas the horizontal intercept $\frac{I}{p_x}$ moves leftward when $p_x$ increases. Indeed, when the price of good x increases from $p_x$ to $p_x'$, the horizontal intercept decreases from $\frac{I}{p_x}$ to $\frac{I}{p_x'}$. To interpret the economic intuition behind this result, recall that the horizontal intercept measures the amount of good x that the individual can afford when spending all her income I on good x alone. As x becomes more expensive, she cannot afford as many units of x. Intuitively, the individual now faces a more expensive good x, thus shrinking the set of bundles that she can afford.2

[Figure 3.3 (a) Increase in price $p_x$. (b) Increase in price $p_y$. Good x on the horizontal axis, good y on the vertical axis; in panel (a) the horizontal intercept falls from $I/p_x$ to $I/p_x'$, and in panel (b) the vertical intercept falls from $I/p_y$ to $I/p_y'$.]

A similar argument applies if the price of good y increases (decreases), as depicted in figure 3.3b. In this case, the horizontal intercept $\frac{I}{p_x}$ remains unaffected, but the vertical intercept $\frac{I}{p_y}$ moves down (up).

2. The opposite argument applies if good x becomes cheaper, where the horizontal intercept would move rightward, thus expanding the set of bundles that she can afford.

Self-assessment 3.3 Consider Eric’s situation in self-assessment 3.1 again. If the price of good x doubles from $p_x = \$13$ to $p_x' = \$26$, while $p_y$ and I are unaffected, what is the new position of Eric’s budget line? What if, instead, the price of good y doubles from $p_y = \$18$ to $p_y' = \$36$, while $p_x$ and I are unchanged?

Query. What would happen if both income and the price of all goods were doubled? This is a tricky question. If all prices and income double, the budget line is unaffected. Indeed, you can confirm this result by noticing that (1) the vertical intercept of the budget line, $\frac{I}{p_y}$, would now become $\frac{2I}{2p_y}$, which simplifies to $\frac{I}{p_y}$, thus indicating no change in its position; (2) the horizontal intercept $\frac{I}{p_x}$ is now $\frac{2I}{2p_x}$, which reduces to $\frac{I}{p_x}$, also reflecting no change in its position; and (3) the slope $-\frac{p_x}{p_y}$ is now $-\frac{2p_x}{2p_y}$, which simplifies to $-\frac{p_x}{p_y}$, implying that the slope of the budget line does not change either. As a consequence, no term in the budget line is affected by a simultaneous doubling of all prices and income.

Note that this argument not only applies to a doubling of all prices and income, but also extends to any common increase in all prices and income (i.e., multiplying $p_x$, $p_y$, and I by a common factor α > 1, such as α = 3), and to any common decrease in all prices and income (i.e., multiplying $p_x$, $p_y$, and I by a common factor α, where 0 < α < 1, such as α = 1/2).

3.3 Utility Maximization Problem

After using the budget line to describe which bundles the consumer can afford, we are ready to present the process by which the consumer chooses utility-maximizing bundles. In particular, the consumer chooses bundles that maximize her utility among all those that she can afford.
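The idea of picking the best affordable bundle can be sketched with a simple brute-force search. The example below is only an illustration under stated assumptions: it borrows the utility function u(x, y) = xy, prices $p_x = \$20$ and $p_y = \$40$, and income $I = \$800$ from example 3.1 later in this chapter, and it assumes that more of either good is better, so the consumer spends all her income and we only need to search along the budget line. The function names and the grid size are hypothetical choices.

```python
def utility(x, y):
    """Illustrative Cobb-Douglas utility u(x, y) = x * y."""
    return x * y

def best_bundle_on_budget_line(px, py, income, steps=400):
    """Grid search over bundles that exhaust income; returns (u, x, y)
    for the grid point with the highest utility."""
    best = None
    for i in range(steps + 1):
        x = (income / px) * i / steps   # from 0 up to the horizontal intercept
        y = (income - px * x) / py      # remaining income is spent on good y
        candidate = (utility(x, y), x, y)
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best

u_star, x_star, y_star = best_bundle_on_budget_line(20, 40, 800)
print(x_star, y_star, u_star)
```

A grid search like this only approximates the optimum in general; here the grid happens to contain the bundle (20, 10), which is the one that the tangency condition identifies in example 3.1.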
Figure 3.4 illustrates this idea by superimposing indifference curves, which represent the utility that the consumer obtains from different bundles, on top of her budget line, which depicts her affordable bundles. We can test whether points A–D in this graph are utility-maximizing. First, point C cannot be optimal because, although the consumer reaches utility level u1 , and exhausts her income because px x + py y = I, she could find other bundles, such as A, where she still spends her income and obtains a higher utility u2 , where u2 > u1 . Bundles like B cannot be optimal either because, despite spending all her income, the consumer reaches a lower utility level than at bundle A. Finally, bundles such as D, lying strictly above the budget line, cannot be optimal either. Despite yielding a higher utility level than A, they are unaffordable and thus violate the budget constraint. As a consequence, only bundles such as A, where the budget line and the indifference curves are tangent to each other, can be optimal for the consumer. 50 Chapter 3 y B D BL A u3 C u2 u1 x Figure 3.4 Utility maximization problem. This tangency condition requires that the slope of the budget x 3 line at bundle A, ppxy , is equal to the slope of the indifference curve, MRS = MU MUy . That is, utility-maximizing bundles must satisfy Same “bang for the buck.” MUx MUy MUx px = , or after rearranging = . MUy py px py MU y x Intuitively, condition MU px = py states that the marginal utility per dollar spent on the last unit of good x must be equal to that of good y; or more informally, the bang for the buck must coincide across all goods. (Appendix A at the end of this chapter proves this result.) MUy x Otherwise, if in a bundle we have MU px > py , the consumer would obtain a larger bang for the buck from x than from y, ultimately providing her with incentives to spend more dolMUy x lars in x and fewer in y. 
Hence, the initial bundle for which $\frac{MU_x}{p_x} > \frac{MU_y}{p_y}$ cannot be optimal because the consumer has incentives to readjust her consumption bundle. This inequality can hold at the optimum, however, at corner solutions where the consumer spends all her income purchasing one good alone (as discussed in example 3.3 later in this chapter).

We next present a general procedure on how to solve the utility maximization problem (UMP), and afterward illustrate the use of the procedure with three step-by-step examples. This procedure applies to relatively general situations, but it does not apply to utility functions with one or more goods having a negative marginal utility, such as $u(x, y) = ax^2 - by$, where $a, b > 0$, so that $MU_x = 2ax$ and $MU_y = -b$. In this type of utility function, the consumer reduces her purchases of the good with a negative marginal utility to zero (in the previous example, $y = 0$) and buys as many units as possible of the other good (i.e., $x = \frac{I}{p_x}$).

3. Recall from chapter 2 (section 2.5) that $MU_x$ denotes the marginal utility of good x, $MU_x = \frac{\partial u(x,y)}{\partial x}$, while $MU_y$ represents the marginal utility of good y, $MU_y = \frac{\partial u(x,y)}{\partial y}$.

Tool 3.1. Procedure to solve the UMP
1. Set the tangency condition as $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$. Cross-multiply and simplify.
2. If the expression found from the tangency condition:
   a. Contains both unknowns (good x and y), then solve for x, and insert the resulting expression into the budget line $p_x x + p_y y = I$.
   b. Contains only one unknown (good x or y, but not both), then solve for that unknown. Afterward, insert your result into the budget line $p_x x + p_y y = I$ to obtain the remaining unknown.
   c. Contains no good x or y, then compare $\frac{MU_x}{p_x}$ against $\frac{MU_y}{p_y}$. If $\frac{MU_x}{p_x} > \frac{MU_y}{p_y}$, then set good y = 0 in the budget line and solve for good x. (You found a corner solution where the consumer purchases only good x.) If, instead, $\frac{MU_x}{p_x} < \frac{MU_y}{p_y}$, then set x = 0 in the budget line and solve for good y.
(In this case, you found a corner solution where she purchases only good y.)
3. If, in step 2, you find that one of the goods is consumed in negative amounts (e.g., $x = -2$), then set the amount of this good equal to zero on the budget line (e.g., $p_x 0 + p_y y = I$), and solve for the remaining good.
4. If you haven't found the values for all the unknowns (goods x and y) yet, use the tangency condition from step 1 to find the remaining unknown.

Example 3.1: UMP with interior solutions–I Consider an individual with a Cobb-Douglas utility function $u(x, y) = xy$, facing market prices $p_x = \$20$ and $p_y = \$40$, and income $I = \$800$.

Step 1. Let us use the tangency condition to find the optimal consumption bundle of this consumer. In this case, $\frac{MU_x}{MU_y} = \frac{y}{x}$, which implies that the tangency condition $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$ becomes $\frac{y}{x} = \frac{20}{40}$. Simplifying, we find $\frac{y}{x} = \frac{1}{2}$, or $2y = x$. This result contains both x and y, so we can now move on to step 2a, ignoring steps 2b and 2c.

Step 2a. From the budget line, we have that $20x + 40y = 800$.⁴ Inserting $2y = x$ into the budget line, we obtain

$$20\underbrace{(2y)}_{x} + 40y = 800,$$

or, rearranging, $80y = 800$, which yields $y = \frac{800}{80} = 10$ units. Because we found that the consumer purchases 10 units of good y, we can move on to step 4. (Recall that we need to stop at step 3 only if, at the end of step 2, you find that x or y is negative.)

Step 4. Lastly, to find the optimal consumption of good x, we use the tangency condition $x = 2y = 2 \times 10 = 20$ units.

Summary. The optimal consumption bundle is (20, 10). As confirmation, note that at bundle (20, 10), the slope of the indifference curve, $\frac{y}{x} = \frac{10}{20}$, coincides with that of the budget line, $\frac{p_x}{p_y} = \frac{1}{2}$, because $\frac{10}{20} = \frac{1}{2}$.

4. Mathematically, we have a system of two equations, $2y = x$ and $20x + 40y = 800$, and two unknowns, x and y. Because this example illustrates Tool 3.1, we continue applying step 2a. However, you could directly solve for x and y in these two equations.

Example 3.2: UMP with interior solutions–II Consider a variation of example 3.1 in which the individual now has a Cobb-Douglas utility function $u(x, y) = x^{1/3} y^{2/3}$, facing market prices $p_x = \$10$ and $p_y = \$20$, and income $I = \$100$. We seek to use the tangency condition $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$, but first we must find $\frac{MU_x}{MU_y}$, as follows:

$$\frac{MU_x}{MU_y} = \frac{\frac{1}{3} x^{\frac{1}{3}-1} y^{\frac{2}{3}}}{\frac{2}{3} x^{\frac{1}{3}} y^{\frac{2}{3}-1}} = \frac{\frac{1}{3} x^{-\frac{2}{3}} y^{\frac{2}{3}}}{\frac{2}{3} x^{\frac{1}{3}} y^{-\frac{1}{3}}} = \frac{y^{\frac{2}{3}+\frac{1}{3}}}{2x^{\frac{1}{3}+\frac{2}{3}}} = \frac{y}{2x}.$$

Step 1. Plugging this result into the tangency condition $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$ yields $\frac{y}{2x} = \frac{10}{20}$, or, rearranging, $y = x$. This result contains both x and y, so we can move on to step 2a.

Step 2a. Inserting $y = x$ into the budget line, $10x + 20y = 100$, we obtain

$$10\underbrace{(y)}_{x} + 20y = 100,$$

or, rearranging, $30y = 100$, which yields $y = \frac{100}{30} = 3.33$ units. Because we found that the consumer purchases a positive amount of good y, we can move on to step 4.

Step 4. The optimal consumption of good x can be found by using the tangency condition $y = x = 3.33$ units.

Summary. Therefore, the optimal consumption bundle is (3.33, 3.33), where the consumer purchases the same amount of each good.⁵

5. As confirmation, note that, at this optimal bundle (3.33, 3.33), the slope of the IC, $\frac{y}{2x} = \frac{3.33}{2 \times 3.33} = \frac{1}{2}$, coincides with that of the budget line, $\frac{p_x}{p_y} = \frac{10}{20}$, because $\frac{10}{20} = \frac{1}{2}$.

In example 3.2, we can find the budget shares of each good; that is, the percentage of income that the consumer spends on good x and on good y. In particular, the budget share of good x is

$$\frac{p_x x}{I} = \frac{10 \times 3.33}{100} = \frac{1}{3},$$

and similarly for the budget share of good y, where

$$\frac{p_y y}{I} = \frac{20 \times 3.33}{100} = \frac{2}{3},$$

which coincides with the exponent of each good in the consumer's Cobb-Douglas utility function, $u(x, y) = x^{1/3} y^{2/3}$.
This is a useful result that generalizes to all types of Cobb-Douglas utility functions, $u(x, y) = A x^{\alpha} y^{\beta}$, where $A, \alpha, \beta > 0$: the budget share of good x is $\frac{\alpha}{\alpha+\beta}$, while that of good y is $\frac{\beta}{\alpha+\beta}$. When the exponents sum to 1, as in example 3.2, these shares reduce to $\alpha$ and $\beta$, allowing us to infer the budget shares just by looking at the exponents of the utility function.

Self-assessment 3.4 Chelsea has utility function $u(x, y) = x^{1/2} y^{1/4}$, facing prices $p_x = \$3$ and $p_y = \$2$, and income $I = \$16$. Using the same steps as in example 3.2, find Chelsea's optimal consumption of goods x and y.

Examples 3.1 and 3.2 examined a scenario in which the consumer purchases positive amounts of all goods (e.g., 20 units of x and 10 of y in example 3.1). However, we can encounter scenarios in which consumers prefer to consume zero units of either good. We now explore such a situation.

Example 3.3: UMP with corner solutions Assume that a consumer has the utility function $u(x, y) = xy + 7x$, faces market prices $p_x = \$1$ and $p_y = \$2$, and has an income $I = \$10$.

Step 1. Using the tangency condition $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$, we find that $\frac{y+7}{x} = \frac{1}{2}$, which collapses to $2y + 14 = x$. This result contains both x and y, so we can now move on to step 2a.

Step 2a. From the budget line, we have that $x + 2y = 10$. Plugging $2y + 14 = x$ into the budget line, we obtain

$$\underbrace{(2y + 14)}_{x} + 2y = 10,$$

or, rearranging, $4y = -4$, which yields $y = -1$ units. Using Tool 3.1, we now move on to step 3.

Step 3. Because the amounts of goods x and y cannot be negative, this result entails that the individual that we consider would like to reduce her consumption of good y as much as possible (i.e., $y = 0$). We can insert this result into the budget line to obtain $x + (2 \times 0) = 10$, or $x = 10$ units.

[Figure 3.5 Optimal bundle with quasilinear utility. Indifference curves $u_1$, $u_2$, $u_3$ and budget line BL, with the optimal bundle (10, 0) on the horizontal axis.]

Summary. We have thus found a corner solution, where the consumer in this case uses all her income to purchase good x alone (i.e., $x = \frac{I}{p_x} = \frac{10}{1} = 10$ units).
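As a numerical cross-check on examples 3.1 through 3.3, the following minimal sketch scans bundles along the budget line and keeps the one with the highest utility. It is a brute-force grid search, not the tangency procedure of Tool 3.1; the function name `best_bundle_on_budget_line` and the grid size `steps` are assumptions introduced here for illustration.

```python
def best_bundle_on_budget_line(u, px, py, income, steps=10_000):
    """Scan bundles with px*x + py*y = income and return (x, y, utility) of the best one.

    Hypothetical helper: a grid search over x, so results are approximate.
    """
    x_max = income / px  # horizontal intercept of the budget line
    best = None
    for i in range(steps + 1):
        x = x_max * i / steps
        # Remaining income is spent on y; guard against tiny negative rounding errors.
        y = max((income - px * x) / py, 0.0)
        value = u(x, y)
        if best is None or value > best[2]:
            best = (x, y, value)
    return best


# Example 3.1: u = xy, px = 20, py = 40, I = 800 (interior solution).
print(best_bundle_on_budget_line(lambda x, y: x * y, 20, 40, 800))

# Example 3.2: u = x^(1/3) y^(2/3), px = 10, py = 20, I = 100 (interior solution).
print(best_bundle_on_budget_line(lambda x, y: x ** (1 / 3) * y ** (2 / 3), 10, 20, 100))

# Example 3.3: u = xy + 7x, px = 1, py = 2, I = 10 (corner solution).
print(best_bundle_on_budget_line(lambda x, y: x * y + 7 * x, 1, 2, 10))
```

Because the search never leaves the budget line with nonnegative amounts, the corner solution of example 3.3 appears directly as the end point of the grid, without the negative value of y that step 2a produces along the way.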
Graphically, her optimal bundle $(x, y) = (10, 0)$ is located on the horizontal intercept of her budget line. Finally, note that at the corner solution, the tangency condition does not hold. The condition requires

$$\frac{MU_x}{p_x} = \frac{MU_y}{p_y}, \quad \text{or in this case} \quad \frac{y+7}{1} = \frac{x}{2},$$

but, evaluating both sides at the corner solution $(x, y) = (10, 0)$, we obtain $\frac{0+7}{1} > \frac{10}{2}$ (we know that the inequality holds with sign > because it simplifies to $7 > 5$). As expected from these results, the marginal utility per dollar spent on good x is larger than that on good y, thus inducing the individual to increase her consumption of good x and decrease that of y. Figure 3.5 depicts this result.⁶ Intuitively, she would like to further decrease her consumption of good y and use the money saved to buy more units of x, but she can no longer decrease her consumption of y once she reaches $y = 0$.

6. As confirmation, note that, at this optimal consumption bundle (10, 0), the slope of the IC, $\frac{y+7}{x}$, becomes $\frac{0+7}{10} = \frac{7}{10}$, thus being larger than the slope of the budget line, $\frac{1}{2} = \frac{5}{10}$. In other words, the IC passing through bundle (10, 0) is steeper than the budget line, as depicted in figure 3.5.

Self-assessment 3.5 Repeat the analysis of example 3.3, but assume now that prices change to $p_x = \$2$ and $p_y = \$1$. How are the results affected?

3.4 Utility Maximization Problem in Extreme Scenarios

Goods are regarded as perfect substitutes. Corner solutions commonly arise when goods are perfect substitutes in consumption, such as two brands of unflavored mineral water (e.g., Dasani and Aquafina). As described in chapter 2 (section 2.8.1), this utility function has the form $u(x, y) = ax + by$, where a and b are positive parameters. In this case, $\frac{MU_x}{MU_y} = \frac{a}{b}$. Hence, one of the following three cases emerges:

• If $\frac{a}{b} > \frac{p_x}{p_y}$, the indifference curve is steeper than the budget line, thus producing a corner solution in which the consumer purchases only units of good x, as in example 3.3.
This can be seen more easily if we represent the tangency condition using the "bang for the buck" approach,

$$\frac{a}{p_x} > \frac{b}{p_y},$$

which indicates that the bang for the buck from good x is larger than that of y, thus implying that the consumer would like to increase her consumption of good x while decreasing that of y. (An example of this would be when $a = b = 1$ and prices are $p_x = \$1$ and $p_y = \$3$.)

• The opposite argument applies if $\frac{a}{b} < \frac{p_x}{p_y}$, where a corner solution exists in which the consumer now spends all her income on good y. In this case, the optimal consumption bundle lies on the vertical intercept of the budget line. (This occurs, for instance, if $a = b = 1$ and prices are $p_x = \$3$ and $p_y = \$2$, where the inequality becomes $\frac{1}{1} = 1 < 1.5 = \frac{3}{2}$.)

• Lastly, if $\frac{a}{b} = \frac{p_x}{p_y}$, the slope of the indifference curves and that of the budget line coincide, yielding a complete overlap between an indifference curve and the budget line. In this case, tangency occurs at all the points of the budget line, implying that all the points are optimal consumption bundles. Formally, in this case we say that a continuum of solutions exists, because any bundle $(x, y)$ satisfying $p_x x + p_y y = I$ is utility maximizing.

Self-assessment 3.6 Eric's utility function is $u(x, y) = 3x + 4y$, and he faces prices $p_x = \$1$ and $p_y = \$2.5$ and income $I = \$23$. Comparing his $MRS_{x,y}$ and the price ratio, find his optimal consumption of goods x and y.

Goods are regarded as perfect complements. When, in contrast, the consumer regards goods as perfect complements, such as cars and gasoline, her utility function takes the form $u(x, y) = A \min\{ax, by\}$, where A, a, and b are all positive parameters. Chapter 2 (section 2.8.2) presented this utility function, discussing that its indifference curves are L-shaped and have a kink at a ray from the origin with slope $a/b$. In addition, we described that the MRS of this function is undefined at the kink, because the kink would admit any slope. As a consequence, we cannot use the tangency condition $MRS = \frac{p_x}{p_y}$, given that we cannot guarantee that the MRS takes a specific number. Optimal bundles in this context, therefore, require us to identify bundles for which we cannot increase the consumer's utility level, given her budget constraint. This occurs, in particular, when she consumes the bundle at the kink of her indifference curve where it intersects her budget line. Mathematically, that requires $ax = by$ for the bundle to be at the kink or, after rearranging, $y = \frac{a}{b} x$; and $p_x x + p_y y = I$ for the bundle to be on the budget line. Hence, we have a system of two equations, $y = \frac{a}{b} x$ and $p_x x + p_y y = I$, and two unknowns, goods x and y. Inserting $y = \frac{a}{b} x$ into the budget line, we obtain

$$p_x x + p_y \underbrace{\frac{a}{b} x}_{y} = I,$$

which, solving for x, yields $x = \frac{I}{p_x + p_y \frac{a}{b}} = \frac{bI}{bp_x + ap_y}$. Therefore, the optimal amount of y becomes

$$y = \frac{a}{b} \cdot \frac{bI}{bp_x + ap_y} = \frac{aI}{bp_x + ap_y}.$$

For instance, if $a = b = 2$ (which occurs when the individual needs to consume the same amount of each good), prices are $p_x = \$10$ and $p_y = \$20$, and her income is $I = \$100$, the optimal consumption of good x is

$$x = \frac{bI}{bp_x + ap_y} = \frac{2 \times 100}{(2 \times 10) + (2 \times 20)} = \frac{10}{3} \text{ units},$$

and similarly for good y, where $y = \frac{aI}{bp_x + ap_y} = \frac{2 \times 100}{(2 \times 10) + (2 \times 20)} = \frac{10}{3}$ units.

Self-assessment 3.7 John's utility function is $u(x, y) = 5 \min\{2x, 3y\}$, and he faces prices $p_x = \$1$ and $p_y = \$2$ and income $I = \$100$. Using the previous argument, find his optimal consumption of goods x and y.

3.5 Revealed Preference

In previous sections, we analyzed how to find optimal consumption bundles, assuming that we could observe the consumer's preferences represented with her utility function. But what if we only know which choices she made when facing different combinations of prices and income? Can we still say whether an individual made optimal consumption choices?
The answer to this question is yes, thanks to the Weak Axiom of Revealed Preference (WARP). Before we state this axiom, let bundle $A = (x_A, y_A)$ be the optimal consumption bundle that the individual selects when facing initial prices and income $(p_x, p_y, I)$ and, similarly, let bundle $B = (x_B, y_B)$ be her optimal consumption bundle when facing final prices and income $(p'_x, p'_y, I')$.

Weak Axiom of Revealed Preference (WARP) If optimal consumption bundles A and B are both affordable under initial prices and income $(p_x, p_y, I)$, then bundle A cannot be affordable under final prices and income $(p'_x, p'_y, I')$. That is,

if $p_x x_A + p_y y_A \le I$ and $p_x x_B + p_y y_B \le I$, then $p'_x x_A + p'_y y_A > I'$.

Intuitively, if both bundles A and B are initially affordable, and the consumer selects A as optimal, she is "revealing" a preference for bundle A over B. WARP then requires that bundle A is not affordable under final prices and income $(p'_x, p'_y, I')$; otherwise, the consumer should still select the original bundle A rather than B. Hence, WARP can be interpreted as a consistency requirement in an individual's choices when facing different prices and incomes: if she chooses bundle A when other bundles are affordable, she should keep choosing bundle A if it is still affordable. If, instead, she chooses bundle B when facing new prices and income, it must be that the original bundle A is no longer affordable. We next provide a tool to test for WARP.

Tool 3.2. Checking for WARP. Let us follow this two-step procedure:
1. Checking the premise. Check if bundles A and B lie on or below the initial budget line BL, which represents initial prices and income $(p_x, p_y, I)$. That is, make sure that both bundles are initially affordable.
   1a. If step 1 holds, move to step 2.
   1b. If step 1 does not hold, then stop. We can only claim that the individual choices do not violate WARP.⁷
2. Checking the conclusion. Check if bundle A lies strictly above the final budget line BL', which represents final prices and income $(p'_x, p'_y, I')$. That is, check that bundle A is no longer affordable.
   2a. If step 2 holds, then WARP is satisfied.
   2b. If step 2 does not hold, then WARP is violated.

7. In this case, the premise of WARP does not hold, which means that we cannot claim that WARP is satisfied or violated. We can only claim that WARP is satisfied if steps 1 and 2 hold, and we can only claim that WARP is violated if step 1 holds but 2 does not.

Hence, if step 1 holds, the premise of WARP is satisfied, and we can move on to check its conclusion, as stated in step 2. In summary, WARP is either: (i) satisfied if steps 1 and 2 hold; (ii) violated if step 1 holds but 2 does not; or (iii) not violated if step 1 does not hold. Example 3.4 illustrates several consumer choices: some satisfying and some violating WARP.

Example 3.4: Testing for WARP Figures 3.6a–d represent the same change in the budget line, from BL to BL'. This change may be due to a simultaneous decrease in $p_x$ (so BL' becomes flatter than BL) and an income reduction (shifting the budget line closer to the origin). For instance, the initial budget line BL could depict a situation where $p_x = p_y = \$2$ and $I = \$10$, whereas the final budget line BL' illustrates the case where $p_x$ decreases to $p'_x = \$1$ and income decreases to $I' = \$7$, leaving $p_y$ unchanged. In this context, the vertical intercept of the budget line decreases from $\frac{I}{p_y} = \frac{10}{2} = 5$ units to $\frac{I'}{p_y} = \frac{7}{2} = 3.5$ units, and the horizontal intercept increases from $\frac{I}{p_x} = \frac{10}{2} = 5$ units to $\frac{I'}{p'_x} = \frac{7}{1} = 7$ units.

Figure 3.6a depicts a scenario where WARP is satisfied. To see this, we start by noting that step 1 holds because bundle A lies on the initial budget line BL, while bundle B lies strictly below BL, thus implying that both bundles are affordable under initial prices and income. We can then move on to step 2, and notice that bundle A lies strictly above the final budget line BL', making this bundle unaffordable under the final prices and income.
As a consequence, WARP is satisfied.

Figure 3.6b, however, depicts choices that violate WARP. To see this, first note that the premise of WARP, as stated in step 1, holds because bundle A lies on the initial budget line BL and bundle B lies strictly below BL. However, step 2 does not hold because bundle A lies below the final budget line BL', making A affordable under final prices and income. Therefore, the consumer is not consistent in her choices, given that both bundles A and B are affordable under BL and BL', but she changes her choice from one budget set to the other. In this scenario, WARP is violated.

Figure 3.6c illustrates a situation in which WARP is not violated. Indeed, step 1 in the procedure to test WARP does not hold because, while bundle A lies on the initial budget line BL, bundle B lies strictly above BL, making the latter unaffordable under the initial prices and income. Because step 1 does not hold, the premise of WARP does not hold either, implying that WARP is not violated. A similar argument applies to figure 3.6d.

[Figure 3.6 (a) WARP holds. (b) WARP is violated. (c) WARP is not violated. (d) WARP is not violated. Each panel shows bundles A and B together with budget lines BL and BL'.]

Self-assessment 3.8 Consider figures 3.6a–3.6d again. Assume for each figure that bundle A lies at the crossing point between budget lines BL and BL'. How are the results of example 3.4 affected? What if B is the bundle lying at the crossing point between BL and BL'?

Exercise 3.14, at the end of the chapter, provides a numerical example where you can apply the above procedure to check for WARP.

3.6 Kinked Budget Lines

In previous sections of this chapter, we considered a linear budget line, which assumes that consumers face a constant price for goods x and y, regardless of how many units they purchase. We now examine budget lines that relax this assumption.
3.6.1 Quantity Discounts

Sellers often offer quantity discounts that make the first few units (such as the first 2 units) more expensive than each unit afterwards. Formally, this means that the consumer faces a price $p_x$ for all units of good x between 0 and $\bar{x}$ (i.e., for all $x \le \bar{x}$), but a lower price $p'_x$, where $p'_x < p_x$, for each subsequent unit (i.e., for all $x > \bar{x}$). Figure 3.7 depicts this budget line, which originates at $\frac{I}{p_y}$ and decreases at a price ratio of $\frac{p_x}{p_y}$ for all $x \le \bar{x}$, as in the standard budget lines in this chapter. However, when the consumer purchases more than $\bar{x}$ units, she benefits from the quantity discount, lowering the price of x to $p'_x$, which decreases the price ratio from $\frac{p_x}{p_y}$ to $\frac{p'_x}{p_y}$. Graphically, this lower price ratio entails a flatter budget line for all units to the right side of $\bar{x}$. Mathematically, the equation of the budget line in this scenario is

$$y = \underbrace{\frac{I}{p_y}}_{\text{Vertical intercept}} - \underbrace{\frac{p_x}{p_y}}_{\text{Slope}} x \quad \text{for all } x \le \bar{x},$$

which coincides with the budget line in section 3.2 for all units of x to the left of the kink in figure 3.7, $x \le \bar{x}$, originating at $\frac{I}{p_y}$, having a slope of $-\frac{p_x}{p_y}$, and (if extended) crossing the horizontal axis at $\frac{I}{p_x}$. However, for all units to the right side of the kink, $x > \bar{x}$, the equation of the budget line is

$$y = \underbrace{\frac{I}{p_y} - \frac{p_x - p'_x}{p_y} \bar{x}}_{\text{Vertical intercept}} - \underbrace{\frac{p'_x}{p_y}}_{\text{Slope}} x \quad \text{for all } x > \bar{x}.$$

[Figure 3.7 Budget line with quantity discounts. The solid kinked line BL has slope $-\frac{p_x}{p_y}$ to the left of $\bar{x}$ and slope $-\frac{p'_x}{p_y}$ to its right; the vertical axis marks $\frac{I}{p_y}$ and $\frac{I}{p_y} - \frac{p_x - p'_x}{p_y}\bar{x}$, and the horizontal axis marks $\bar{x}$, $\frac{I}{p_x}$, and $\frac{I}{p'_x} - \frac{p_x - p'_x}{p'_x}\bar{x}$.]

Relative to the equation with no price discounts, this expression differs in two ways. First, it is flatter, because its slope is $\frac{p'_x}{p_y}$ rather than $\frac{p_x}{p_y}$, where $\frac{p'_x}{p_y} < \frac{p_x}{p_y}$, as depicted in figure 3.7. Second, it originates at a lower vertical intercept, $\frac{I}{p_y} - \frac{p_x - p'_x}{p_y} \bar{x}$ rather than $\frac{I}{p_y}$ (see the vertical intercepts in figure 3.7).⁸ Figure 3.7 also helps us understand the effect of a large or small price discount.
A large price discount makes the difference $p_x - p'_x$ larger, shifting the vertical intercept downward and flattening the right segment of the budget line. In contrast, a small price discount produces a small difference $p_x - p'_x$, pushing the vertical intercept upward (closer to that of the original budget line with no discounts, $\frac{I}{p_y}$) and steepening the right segment of the budget line (so that it is almost as steep as the left segment).⁹

8. The horizontal intercept of this line can be found by setting $y = 0$, which yields $0 = \frac{I}{p_y} - \frac{p_x - p'_x}{p_y} \bar{x} - \frac{p'_x}{p_y} x$ or, after rearranging, $\frac{p'_x}{p_y} x = \frac{I}{p_y} - \frac{p_x - p'_x}{p_y} \bar{x}$. Solving for x, we obtain a horizontal intercept of $x = \frac{I}{p'_x} - \frac{p_x - p'_x}{p'_x} \bar{x}$, as depicted in figure 3.7.

9. As a remark, note that the equations of both segments of the budget line (to the left and the right of $\bar{x}$) coincide if the seller offers no price discount (that is, $p'_x = p_x$). In this case, equation $y = \frac{I}{p_y} - \frac{p_x - p'_x}{p_y} \bar{x} - \frac{p'_x}{p_y} x$ becomes $y = \frac{I}{p_y} - \frac{p_x - p_x}{p_y} \bar{x} - \frac{p_x}{p_y} x$, which simplifies to $y = \frac{I}{p_y} - \frac{p_x}{p_y} x$.

Example 3.5: Quantity discounts Eric has an income of \$100 to purchase video games (good x) and food (good y). The price of food is $p_y = \$5$, regardless of how many units he buys, while that of video games is $p_x = \$4$ for the first two units, but $p'_x = \$1$ for the third unit and beyond. Because the cutoff here is at $\bar{x} = 2$ units, Eric's budget line is then

$$y = \frac{100}{5} - \frac{4}{5} x = 20 - \frac{4}{5} x \quad \text{for all } x \le 2, \text{ and}$$

$$y = \frac{100}{5} - \frac{4 - 1}{5} \times 2 - \frac{1}{5} x = 20 - \frac{3}{5} \times 2 - \frac{1}{5} x = \frac{94}{5} - \frac{1}{5} x \quad \text{for all } x > 2.$$

Graphically, Eric's budget line originates at $\frac{I}{p_y} = \frac{100}{5} = 20$ units on the vertical axis and decreases at a rate of $-\frac{p_x}{p_y} = -\frac{4}{5} = -0.8$ for the first two units. For all units $x > 2$, however, his budget line originates at $y = \frac{94}{5} = 18.8$ units, has a slope of $-\frac{p'_x}{p_y} = -\frac{1}{5} = -0.2$, thus becoming flatter, and crosses the horizontal axis at $x = \frac{I}{p'_x} - \frac{p_x - p'_x}{p'_x} \bar{x} = \frac{100}{1} - \frac{(4 - 1)}{1} \times 2 = 100 - 6 = 94$ units.

3.6.2 Introducing Coupons

Consider a market where the government offers coupons that let consumers purchase the first $\bar{x}$ units of good x for free. Figure 3.8 depicts the budget line in this situation, $BL_C$, where subscript C denotes "coupons." The budget line is flat for all units between 0 and $\bar{x}$ and then decreases at the usual price ratio $\frac{p_x}{p_y}$, having a kink at exactly the number of units of good x that the consumer enjoys for free, $\bar{x}$. For comparison purposes, the figure also includes a budget line without coupons, $BL_{NC}$, where the subscript NC denotes "no coupons." Intuitively, the coupons expand the set of bundles that the consumer can afford.

[Figure 3.8 Budget lines with and without coupons. $BL_C$ (solid kinked line) is flat at $\frac{I}{p_y}$ up to $\bar{x}$ and then has slope $-\frac{p_x}{p_y}$, reaching the horizontal axis at $\frac{I}{p_x} + \bar{x}$; $BL_{NC}$ has slope $-\frac{p_x}{p_y}$ and intercepts $\frac{I}{p_y}$ and $\frac{I}{p_x}$; the dashed extension of $BL_C$ meets the vertical axis at $\frac{I}{p_y} + \frac{p_x}{p_y}\bar{x}$.]

Mathematically, the kinked budget line $BL_C$ in figure 3.8 can be expressed as

$$BL_C = \begin{cases} p_y y = I & \text{for all } x < \bar{x}, \text{ and} \\ p_x (x - \bar{x}) + p_y y = I & \text{for all } x \ge \bar{x}. \end{cases}$$

This condition says that the equation of $BL_C$ is $p_y y = I$ to the left of the kink, $x < \bar{x}$, because, in this case, the consumer effectively faces a price of zero for good x, $p_x = \$0$, thanks to the coupons. For bundles to the right of the kink, $x \ge \bar{x}$, the budget line becomes $p_x (x - \bar{x}) + p_y y = I$, because the consumer exhausted all coupons at that point and faces market prices $p_x$ and $p_y$. Solving for y, we can also represent budget line $BL_C$ as $y = \frac{I}{p_y}$ for all $x < \bar{x}$, and as $y = \frac{I}{p_y} - \frac{p_x}{p_y} (x - \bar{x})$ for all $x \ge \bar{x}$, which can be written alternatively as

$$y = \underbrace{\left( \frac{I}{p_y} + \frac{p_x}{p_y} \bar{x} \right)}_{\text{Vertical intercept}} - \underbrace{\frac{p_x}{p_y}}_{\text{Slope}} x.$$

Graphically, the term in parentheses represents the vertical intercept of the downward-sloping segment of $BL_C$, as depicted in figure 3.8 (see dashed line). We can then use this expression to find the horizontal intercept of $BL_C$. Setting $y = 0$ yields $0 = \frac{I}{p_y} + \frac{p_x}{p_y} \bar{x} - \frac{p_x}{p_y} x$, which simplifies to $\frac{p_x}{p_y} x = \frac{I}{p_y} + \frac{p_x}{p_y} \bar{x}$ and, solving for x, we find the horizontal intercept $x = \frac{I}{p_x} + \bar{x}$, as figure 3.8 indicates.
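The two kinked budget lines of this section can be sketched in a few lines of code. The following is a minimal illustration, assuming the consumer spends all her income; the function names `y_quantity_discount` and `y_coupons` and their parameter names are hypothetical and introduced here only for this sketch. The numerical checks use the figures of examples 3.5 and 3.6.

```python
def y_quantity_discount(x, income, px, px_discount, py, x_bar):
    """Units of y affordable after buying x units of good x,
    when units beyond x_bar cost px_discount instead of px."""
    if x <= x_bar:
        spent_on_x = px * x
    else:
        # First x_bar units at the full price, the rest at the discounted price.
        spent_on_x = px * x_bar + px_discount * (x - x_bar)
    return (income - spent_on_x) / py


def y_coupons(x, income, px, py, x_bar):
    """Units of y affordable after consuming x units of good x,
    when coupons make the first x_bar units of x free."""
    paid_units = max(x - x_bar, 0.0)
    return (income - px * paid_units) / py


# Example 3.5: I = 100, px = 4, discounted price 1, py = 5, cutoff at 2 units.
print(y_quantity_discount(0, 100, 4, 1, 5, 2))   # vertical intercept
print(y_quantity_discount(2, 100, 4, 1, 5, 2))   # height at the kink
print(y_quantity_discount(94, 100, 4, 1, 5, 2))  # horizontal intercept

# Example 3.6: I = 100, px = 1, py = 4, first 200 units free.
print(y_coupons(100, 100, 1, 4, 200))  # on the flat segment
print(y_coupons(300, 100, 1, 4, 200))  # horizontal intercept
```

Writing each budget line in terms of total spending on good x, rather than as two separate intercept-slope equations, makes it easy to confirm that both segments meet at the kink.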
Example 3.6: Coupons John's income is \$100, the price of electricity is $p_x = \$1$, and that of bikes is $p_y = \$4$. Assume that a government agency distributes coupons for the first 200 kWh per month, making them free. Because $\bar{x} = 200$, John's budget line $BL_C$ is $y = \frac{I}{p_y}$ for all $x < 200$, which in this scenario becomes $y = \frac{100}{4} = 25$ units. This is graphically represented by a horizontal line at a height of $y = 25$ from $x = 0$ to $x = 200$. For units of x beyond $\bar{x} = 200$, however, John's budget line is

$$y = \frac{I}{p_y} + \frac{p_x}{p_y} \bar{x} - \frac{p_x}{p_y} x = \frac{100}{4} + \frac{1}{4} \times 200 - \frac{1}{4} x = 75 - \frac{1}{4} x.$$

Graphically, this means that the dashed segment in figure 3.8 originates at $y = 75$, decreases at a rate of $\frac{1}{4}$, and hits the horizontal axis at $x = \frac{I}{p_x} + \bar{x} = \frac{100}{1} + 200 = 300$ units.

Appendix A. Applying the Lagrange Method to Solve the Utility Maximization Problem

In the presentation of the UMP in this chapter, we used the tangency condition $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$ to find optimal consumption bundles. However, we never formally showed that this condition must be satisfied at the optimum of the UMP. We now demonstrate this result. First, note that the UMP can be expressed as

$$\max_{x,y} \; u(x, y) \quad \text{subject to } p_x x + p_y y = I.$$

Therefore, the consumer chooses the bundle $(x, y)$ that maximizes her utility function $u(x, y)$ subject to the budget line. As described previously, we use the budget line $p_x x + p_y y = I$, rather than the budget constraint $p_x x + p_y y \le I$, because the consumer will always spend all of her available income.¹⁰ Hence, the consumer faces a "constrained maximization problem," in which her constraint is $p_x x + p_y y = I$ (i.e., choosing a point along the budget line), and the objective function that she seeks to maximize is her utility function. Constrained maximization problems are often solved by setting up a Lagrangian function, which in this UMP is

$$\mathcal{L}(x, y; \lambda) = u(x, y) + \lambda \left[ I - p_x x - p_y y \right],$$

where $\lambda$ represents the Lagrange multiplier, which multiplies the budget line.
To solve this problem, we take first-order conditions with respect to x, y, and $\lambda$, which yields

$$\frac{\partial \mathcal{L}}{\partial x} = MU_x - \lambda p_x = 0,$$
$$\frac{\partial \mathcal{L}}{\partial y} = MU_y - \lambda p_y = 0, \text{ and}$$
$$\frac{\partial \mathcal{L}}{\partial \lambda} = I - p_x x - p_y y = 0.$$

The first condition can be rearranged as $\frac{MU_x}{p_x} = \lambda$; and, similarly, the second condition can be expressed as $\frac{MU_y}{p_y} = \lambda$. Because both conditions are equal to $\lambda$, we obtain

$$\frac{MU_x}{p_x} = \lambda = \frac{MU_y}{p_y}.$$

This is the "bang for the buck" coinciding across goods, as described in section 3.3. Alternatively, this condition can be expressed as $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$, which coincides with the tangency condition used in the previous analysis (i.e., at the optimum, the slope of the indifference curve coincides with the slope of the budget line).

10. This happens even if some of the goods are regarded as bads for the consumer. In that case, she would spend all her income I on the other good, still leaving no money unspent. In other words, even if a corner solution emerges as the solution of the UMP, the consumer spends all her money. This was the case, for instance, in example 3.3, where the consumer's utility function was quasilinear and a corner solution emerged.

Appendix B. Expenditure Minimization Problem

The UMP considers a fixed budget and finds which bundle provides the consumer with the highest utility level. Alternatively, one could approach the consumer's problem as if she sought to reach a minimal utility level (i.e., a target utility), but wanted to do that by spending the least amount of money. In other words, the consumer alternatively could minimize her expenditure while reaching a fixed utility level. This is the approach that the expenditure minimization problem (EMP) follows, which we describe next.

[Figure 3.9 The EMP. Indifference curves $\bar{u}$ and $u_1$, where $u_1 > \bar{u}$; budget lines $BL_1$, $BL_2$, and $BL_3$; bundles A, B, C, and D; an arrow toward the origin indicates the direction of cheaper budgets.]
As figure 3.9 depicts, the EMP can be graphically understood as the consumer seeking to reach an indifference curve with a target utility level $\bar{u}$, but shifting her budget line as close to the origin as possible (because lower income levels shift the budget line downward). Bundles like B or C cannot be optimal because the consumer, despite reaching the target utility level $\bar{u}$, spends more income than at other bundles, such as A, given that budget line $BL_2$ lies below $BL_3$. Bundle A, in contrast, must be optimal (i.e., expenditure minimizing) because we cannot find other bundles for which the consumer still reaches the "target" utility level $\bar{u}$ at a lower expenditure than $BL_2$. As figure 3.9 illustrates, at the optimal bundle A, the indifference curve and the budget line are tangent to each other (i.e., their slopes coincide), thus providing us with the same equilibrium condition, $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$, as when solving the UMP.

Finally, note that the consumer's constraint in this setting becomes $u(x, y) = \bar{u}$, rather than $u(x, y) \ge \bar{u}$, because she would never choose bundles satisfying $u(x, y) > \bar{u}$. Intuitively, bundles providing the consumer with more than the minimal target utility $\bar{u}$, such as D in figure 3.9, cannot be optimal, since the consumer can find cheaper bundles that reach the target utility level $\bar{u}$. These bundles still satisfy the constraint and can be purchased at a lower cost.

As we did for the UMP, we next offer a step-by-step procedure to find the optimal consumption bundle that solves the EMP, and subsequently illustrate the application of the procedure with two numerical examples.

Tool 3.3. Procedure to solve the EMP
1. Set the tangency condition $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$. Cross-multiply and simplify.
2. If the expression found from the tangency condition:
   a. Contains both unknowns (good x and y), then solve for y, and insert the resulting expression into the utility constraint $u(x, y) = \bar{u}$.
   b. Contains only one unknown (good x or y, but not both), then solve for that unknown. Afterward, insert your result into the utility constraint $u(x, y) = \bar{u}$ to obtain the remaining unknown.
   c. Contains no good x or y, then compare $\frac{MU_x}{p_x}$ against $\frac{MU_y}{p_y}$. If $\frac{MU_x}{p_x} > \frac{MU_y}{p_y}$, set good y = 0 in the utility constraint $u(x, y) = \bar{u}$, and then solve for good x. If, instead, $\frac{MU_x}{p_x} < \frac{MU_y}{p_y}$, set x = 0 in the utility constraint $u(x, y) = \bar{u}$ and then solve for good y.
3. If in step 2 you find that one of the goods is consumed in a negative amount (e.g., $x = -2$), then set the amount of this good equal to zero on the utility constraint (e.g., $u(0, y) = \bar{u}$), and solve for the remaining good.
4. If you have not found the values for all the unknowns (goods x and y) yet, use the tangency condition from step 1 to find the remaining unknown.

The optimal consumption bundle that we find after applying Tool 3.3 (i.e., after solving the EMP) is usually referred to as "compensated demand"¹¹ to differentiate it from the consumption bundles we obtain from solving the UMP, known as "uncompensated demand."¹² Example 3.7 applies Tool 3.3 to a Cobb-Douglas utility function.

11. This demand is also known as "Hicksian demand," after British economist Sir John Richard Hicks.

12. This demand is also known as "Marshallian demand," after economist Alfred Marshall, or "Walrasian demand," after economist and mathematician Léon Walras.

Example 3.7: EMP with a Cobb-Douglas utility function Consider a Cobb-Douglas utility function $u(x, y) = x^{1/3} y^{2/3}$, where the consumer faces prices $p_x = \$10$ and $p_y = \$20$, and a utility target of $\bar{u}$. To find the consumption bundle that solves the EMP, we apply the tangency condition $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$. However, we first need to find $\frac{MU_x}{MU_y}$, as follows:

$$\frac{MU_x}{MU_y} = \frac{\frac{1}{3} x^{-\frac{2}{3}} y^{\frac{2}{3}}}{\frac{2}{3} x^{\frac{1}{3}} y^{-\frac{1}{3}}} = \frac{y}{2x}.$$

We apply the steps in Tool 3.3 next.

Step 1. Tangency condition $\frac{MU_x}{MU_y} = \frac{p_x}{p_y}$ reduces to $\frac{y}{2x} = \frac{10}{20}$, or $y = x$. This result contains both x and y, so we can move on to step 2a.
y 2x = 10 20 , or y = x. This result 1 2 Step 2a. The constraint u(x, y) = u in this context becomes x 3 y 3 = u. Inserting our result from step 1, y = 2x, in the constraint yields 1 2 x 3 (x) 3 = u, y and, rearranging, x = u. For instance, if the target utility level is u = 5, the optimal amount of good x becomes x = 5 units. Because we found a positive amount of good x, we can move on to step 4. (We need to stop at step 3 only if we obtain negative amounts of either good.) Step 4. Finally, using the tangency condition, y = x, we find y = u. Summary. The optimal consumption bundle is x = y = u, consuming the same amount of each good. For instance, if the consumer seeks to reach a utility level u = 5, the optimal bundle is (5, 5). 1 1 Self-assessment 3.9 Eric’s utility function is u(x, y) = x 2 y 2 , and he faces prices px = $1 and py = $3 and has a utility target of u = 20. Using the same steps as in example 3.7, solve Eric’s EMP, finding his optimal purchases of goods x and y. Example 3.8: EMP with a quasilinear utility Consider the same quasilinear demand from example 3.3, u(x, y) = xy + 7x, facing market prices px = $1 and py = $2. In addition, assume that the consumer targets a utility level of u = 70 (which is the utility that the individual achieves when consuming her optimal consumption bundle at the UMP). 68 Chapter 3 We now apply the steps in Tool 3.3. y+7 px 1 x Step 1. In this scenario, the tangency condition MU MUy = py becomes x = 2 , which collapses to 2y + 14 = x. Because this result contains x and y, we move on to step 2a. Step 2a. Inserting the result from the tangency condition, 2y + 14 = x, into the utility target xy + 7x = 70, we obtain (2y + 14)y + 7(2y + 14) = 70, x x 2 2 which simplifies to 2(7 √ + y) = 70, or (7 + y) = 35. Taking the square roots of both sides yields 7 + y = 35, and solving for good y, we obtain y = −1.08 units. Because we found negative units of at least one good, we need to apply step 3 next. Step 3. 
As in the UMP, these results indicate that the individual consumes a zero amount of good y and dedicates all her expenditure to buying only units of good x. As described in example 3.3, the marginal utility per dollar (its bang for the buck) is larger from good x than from y, which drives her to purchase only good x. Because y = 0, her utility constraint becomes u(x, 0) = 70, or \(x \cdot 0 + 7x = 70\), which, solving for x, yields x = 10 units.
Summary. The optimal consumption bundle is x = 10 and y = 0 at the utility target ū = 70. This result, of course, is consistent with that of example 3.3, where we solved the UMP of this consumer finding the same optimal bundle.

Self-assessment 3.10 John’s utility function is \(u(x, y) = 4x^{1/2} + 2y\), he faces prices px = $2 and py = $3, and has a utility target of ū = 10. Using the same steps as in example 3.8, solve John’s EMP, finding his optimal purchases of goods x and y.

Relationship between the Utility Maximization Problem and the Expenditure Minimization Problem
After reading example 3.8, you probably found some similarities and differences between the UMP and the EMP. In both approaches, we start by writing down the tangency condition \(MU_x/MU_y = p_x/p_y\), because in both the UMP and the EMP we require that the consumer chooses a bundle where her budget line is tangent to her indifference curve. However, the UMP takes the result from the tangency condition and inserts it in the constraint of the UMP, the budget line \(p_x x + p_y y = I\), whereas the EMP takes that result and plugs it into its corresponding constraint, the utility target u(x, y) = ū.¹³ Figure 3.10 depicts the similarities and differences of these two approaches.
[Figure 3.10 Comparing UMP and EMP. The tangency condition, MRS = px/py, is inserted into the budget constraint px x + py y = I (UMP) or into the utility constraint u(x, y) = ū (EMP).]
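The comparison above can be sketched numerically. The following is a minimal Python sketch, assuming a Cobb-Douglas utility u(x, y) = x^a y^b with an interior solution (both goods consumed in positive amounts); the function names `solve_ump` and `solve_emp` are illustrative and do not come from the text. Both functions start from the same tangency condition and differ only in the constraint into which it is inserted.

```python
# Sketch only: Cobb-Douglas u(x, y) = x**a * y**b, interior solutions assumed.
# Function names are hypothetical, chosen for this illustration.

def solve_ump(a, b, px, py, income):
    """Insert the tangency result y = k*x into the budget line px*x + py*y = I."""
    k = (b * px) / (a * py)            # tangency: (a*y)/(b*x) = px/py  =>  y = k*x
    x = income / (px + py * k)         # budget line with y replaced by k*x
    return x, k * x

def solve_emp(a, b, px, py, u_target):
    """Insert the same tangency result into the utility constraint u(x, y) = u_target."""
    k = (b * px) / (a * py)
    x = (u_target / k**b) ** (1.0 / (a + b))   # x**a * (k*x)**b = u_target
    return x, k * x

# Setting of example 3.7: exponents 1/3 and 2/3, px = 10, py = 20, target utility 5.
x_e, y_e = solve_emp(1/3, 2/3, 10, 20, 5)
needed_income = 10 * x_e + 20 * y_e

# Duality check: giving the consumer exactly that income returns the same bundle.
x_u, y_u = solve_ump(1/3, 2/3, 10, 20, needed_income)
print(x_e, y_e, needed_income, x_u, y_u)
```

With the values of example 3.7, the EMP bundle is approximately (5, 5), and solving the UMP with the income needed to buy that bundle returns approximately the same bundle. Corner solutions, such as the one in example 3.8, are not handled by this sketch.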
This description of the EMP, as well as examples 3.7–3.8, probably made you interpret the EMP as the mirror image of the UMP, as both approaches lead us to the same optimal consumption bundle. (Formally, we say that the EMP is the dual representation of the UMP.) Indeed, consider a consumer who solves her UMP and finds bundle \((x^U, y^U)\) to be optimal. In this situation, the utility that she can reach when purchasing bundle \((x^U, y^U)\) is \(u(x^U, y^U)\). In this context, if we ask the consumer to solve her EMP and we require her to reach a target utility level of exactly \(\bar{u} = u(x^U, y^U)\), the bundle that solves her EMP will coincide with that of solving her UMP.
We can draw the opposite relationship, but starting now from the EMP. Specifically, let \((x^E, y^E)\) be the bundle that solves the EMP, and let \(I^E\) be the income that the consumer needs to purchase such optimal bundle (i.e., \(p_x x^E + p_y y^E = I^E\)). Then, if we ask the consumer to solve her UMP, giving her an income of \(I = I^E\), the optimal bundle solving her UMP, \((x^U, y^U)\), coincides with that solving her EMP, \((x^E, y^E)\).
13. For convenience, we solve for y in the tangency condition and insert our result into the y term of the budget line when we solve the UMP, or in the y term of the utility function when solving the EMP. Alternatively, you can solve for good x and obtain similar results.

Example 3.9: Dual problems
We now return to examples 3.2 and 3.7 to illustrate these equivalences between the UMP and EMP.
From UMP to EMP. First, recall that solving the UMP in example 3.2, we found \((x^U, y^U) = (3.33, 3.33)\), which yields a utility level of u = 3.33. If we go to the EMP in example 3.7 (where the consumer still faces the same utility function and prices), and ask her to target a utility level of ū = 3.33, then her optimal bundle becomes \((x^E, y^E) = (3.33, 3.33)\) because in example 3.7 we found that x = y = ū. Hence, optimal bundles in the UMP and EMP coincide.
From EMP to UMP.
Similarly, consider that we approach the consumer again, giving her the income that she would need to purchase the optimal bundle found in the EMP of example 3.7, \(p_x x^E + p_y y^E = \$100\), as described previously. Solving her UMP, she obtains \((x^U, y^U) = (3.33, 3.33)\), which coincides with the optimal bundle solving the EMP.

Exercises
1. Budget Line (A). Peter has an income of I = $100, which he dedicates to purchasing soda and pizza. The price of soda is $1 per can, while that of pizza is $2 per slice.
   (a) Find the equation of his budget line, and represent it graphically.
   (b) How does Peter’s budget set change when his income increases to I = $150?
   (c) Consider that the university that Peter attends subsidizes pizza, decreasing its price by $1. What is Peter’s budget set now?
   (d) What if, instead, the university gives Peter 25 vouchers that he can use to get 25 free slices of pizza?
2. Choosing the Best Deal (B). Peter has a monthly income of I = $1,000 to spend on pizza (good x, with a price of $20 per unit), or other goods (composite good y, whose price is $1). The pizza place where he always goes announced an attractive offer: “Pay $800 and you will get 30 pizzas for an entire month (1 pizza per day).”
   (a) Assuming that Peter’s preferences are represented with a Cobb-Douglas utility function \(u(x, y) = x^{1/3} y^{2/3}\), will Peter accept the offer?
   (b) What if, instead, Peter’s preferences are represented with a linear utility function u(x, y) = 2x + y? Will he accept the offer then?
3. Composite Good (A). Consider an individual with utility function u(x, y) = (x + 3)y, and income I = $30. The price of good x is px = $2, while that of good y is normalized to py = $1 (that is, good y represents the money left for purchasing all other goods but x, which we refer to as the “numeraire”).
   (a) Find the optimal consumption bundle of this individual. Evaluate his utility function at this optimal bundle.
   (b) Assume now that his income was increased by $10 (for a total of I = $40).
What is his new optimal consumption bundle? What is the new utility level that he can reach?
   (c) Assume now that the price of good x decreases by $1, so its new price is px = $1. What is his new optimal consumption bundle? What is the new utility level that he can reach?
   (d) Assume that he receives a coupon allowing him to consume 4 units of good x for free. What is his new optimal consumption bundle? What is the new utility level that he can reach?
   (e) In which of parts (b)–(d) is the consumer better off? That is, describe whether the consumer prefers the change in income from part (b), the change in prices from part (c), or the coupon from part (d).
4. Checking Statements (A). One of your classmates approaches you before an exam, saying that he figured out how to tackle several questions in microeconomics. For each of the following sentences, argue whether your classmate is wrong and give the reason behind your answer.
   (a) Perfect substitutes. “If a consumer regards two goods as substitutes in consumption, he will always choose the cheaper good.”
   (b) Perfect complements. “If a consumer regards two goods as perfect complements in consumption, his compensated demand must be flat.”
   (c) Demand and compensated demand. “Demand and compensated demand curves are different for all types of goods.”
5. Expenditure Function (A). Consider that a consumer’s expenditure function is given by
   \[e(p_x, p_y, u) = 3\left(\frac{u\, p_x^2\, p_y}{4}\right)^{1/3}.\]
   Find the demand for good y, y(px, py, u). [Hint: Because \(p_x h_x(p_x, p_y, u) + p_y h_y(p_x, p_y, u) = e(p_x, p_y, u)\), you may differentiate e(px, py, u) with respect to py to find \(h_y(p_x, p_y, u)\).] The demand function you find is the one that solves the EMP. In this scenario, however, we find it without needing to solve the EMP. As we have information about expenditure function e(px, py, u), we can find this demand function more directly by just differentiating e(px, py, u) with respect to py.
6. Quasilinear Utility–I (B). John has a monthly income of I = $700 and a quasilinear utility function of the type \(u(x, y) = x^{1/2} + 7y\). The price of good x is $2, while the price of good y is $3.
   (a) Find John’s tangency condition, following step 1 of the utility maximization procedure.
   (b) Find John’s equilibrium quantities for goods x and y.
7. Cobb-Douglas–I (A). Eric has a monthly income of I = $500 and a Cobb-Douglas utility function of the type \(u(x, y) = x^{1/2} y^{1/2}\). The price for good x is $1, and the price for good y is $3.
   (a) Find Eric’s tangency condition, following step 1 of the utility maximization procedure.
   (b) Find Eric’s equilibrium quantities for goods x and y.
8. Cobb-Douglas–II (A). Chelsea has a monthly income of I = $1,000 and a Cobb-Douglas utility function of the type \(u(x, y) = x^{0.6} y^{0.4}\). The price for good x is $3, and the price for good y is $1.
   (a) Which good does Chelsea prefer, x or y? How do you know this?
   (b) Find Chelsea’s tangency condition, following step 1 of the utility maximization procedure.
   (c) Find Chelsea’s equilibrium quantities for goods x and y.
   (d) Compare your results from parts (a) and (c). Are they in line with Chelsea’s preferences? Why or why not?
9. Cobb-Douglas–III (C). Eric has a Cobb-Douglas utility function of the type \(u(x, y) = x^{1/2} y^{1/2}\). Suppose that Eric has a general value for his income, I, and general values for the prices of goods x and y (px and py, respectively).
   (a) Find Eric’s tangency condition, following step 1 of the utility maximization procedure.
   (b) Find Eric’s equilibrium quantities for goods x and y as a function of I, px, and py.
   (c) Compare your results with those in exercise 7 by setting I = $500, px = $1, and py = $3. Are they identical?
10. Kinks in the Curve (B). John has a weekly income of I = $50, which he dedicates to purchasing soda and other goods (a composite good, whose price is $1). The price of soda is $2 per can.
   (a) Find the equation of his budget line, and represent it graphically.
   (b) Suppose that the price of soda changes to $2/can for the first 10 cans, and then $1 for each additional can. Depict this new budget line graphically.
   (c) Before the price change in part (b), John purchased 20 cans of soda and 10 units of the composite good. Describe generally how John’s consumption of soda and the composite good change as the price changes.
11. Different Relationships (C). Eric has a weekly income of I = $50 and a utility function of the type
   \[u(x, y) = \frac{x^{1/2}}{y + 1} = x^{1/2} (y + 1)^{-1}.\]
   The price for good x is $1, and the price for good y is also $1.
   (a) Are x and y both goods? If not, which is a “bad?”
   (b) Find Eric’s tangency condition, following step 1 of the utility maximization procedure.
   (c) Find Eric’s equilibrium quantities for goods x and y.
12. Perfect Substitutes–I (A). Chelsea has a monthly income of I = $800 and a utility function of the type u(x, y) = 3x + 4y. The price for good x is $1, and the price for good y is $2.
   (a) Find Chelsea’s tangency condition, following step 1 of the utility maximization procedure.
   (b) Find Chelsea’s equilibrium quantities for goods x and y.
   (c) Suppose that the price of good y falls to $1. What happens to Chelsea’s equilibrium quantities?
13. Perfect Complements–I (A). John has a monthly income of I = $400 and a utility function of the type u(x, y) = min{2x, y}. The price of good x is $3, while the price of good y is $4.
   (a) What is John’s most preferred ratio of consuming good x to good y?
   (b) Find John’s equilibrium quantities for goods x and y.
14. WARP (B). Eric has a weekly income of I = $40, which he allocates between purchasing goods x and y. When the price of good x is $4 and the price of good y is also $4, Eric purchases 3 units of good x and 7 units of good y in equilibrium. Now suppose that the price of good x falls to $2.
   (a) Find the equations of his original and new budget lines and represent them graphically.
   (b) Suppose that Eric’s new equilibrium bundle is 5 units of good x and 5 units of good y. Does this new bundle violate WARP? Explain why or why not.
   (c) Suppose that Eric’s new equilibrium bundle contains 4 units of good y. How many units of good x must be consumed such that our equilibrium allocation does not violate WARP?
15. Expenditure Minimization Problem (A). Peter wishes to reach a utility level of U = 50 and has a Cobb-Douglas utility function of the type \(u(x, y) = x^{0.4} y^{0.6}\). The price for good x is $1, and the price for good y is $4.
   (a) Find Peter’s tangency condition, following step 1 of the expenditure minimization procedure.
   (b) Find Peter’s equilibrium quantities for goods x and y.
   (c) How much income does Peter require to reach his target utility level?
16. Perfect Substitutes–II (B). Chelsea wishes to reach a utility level of U = 100 and has a utility function of the type u(x, y) = 5x + 2y. The price for good x is $3, and the price for good y is $1.
   (a) Find Chelsea’s tangency condition, following step 1 of the expenditure minimization procedure.
   (b) Find Chelsea’s equilibrium quantities for goods x and y.
   (c) How much income does Chelsea require to reach her target utility level?
17. Perfect Complements–II (B). John wishes to reach a utility level of U = 75 and has a utility function of the type u(x, y) = min{3x, 2y}. The price of good x is $2, while the price of good y is $3.
   (a) What is John’s most preferred ratio of consuming good x to good y?
   (b) Find John’s equilibrium quantities for goods x and y.
   (c) How much income does John require to reach his target utility level?
18. Quasilinear Utility–II (B). Eric wishes to reach a utility level of U = 50 and has a quasilinear utility function of the type \(u(x, y) = 4x + y^{1/2}\). The price of good x is $2, while the price of good y is also $2.
   (a) Find Eric’s tangency condition, following step 1 of the expenditure minimization procedure.
   (b) Find Eric’s equilibrium quantities for goods x and y.
   (c) How much income does Eric require to reach his target utility level?
19. Utility Maximization (B). Suppose that you are in a situation where you can afford purchasing 3 units of good x and 4 units of good y. You discover that another affordable bundle of 5 units of good x and 2 units of good y causes you to reach the same level of utility. Assume that your utility function and budget line are both well behaved.
   (a) Are you maximizing your utility by consuming your original bundle? Why or why not?
   (b) Propose an alternative, affordable bundle that would yield a higher utility level than either of the original bundles.
20. Goods and Bads (A). Consider a situation where there are two goods in the world: food and garbage. Naturally, garbage in this context would be considered a “bad,” while food is considered a good. Suppose that the price for both types of goods is positive.
   (a) Would a rational person consume any garbage in equilibrium? Why or why not?
   (b) Under what condition or conditions would a consumer be willing to consume a positive amount of garbage?

4 Substitution and Income Effects

4.1 Introduction
In this chapter, we use the solution to the consumer’s utility maximization problem from chapter 3 (the optimal consumption bundle) to analyze how it changes with the individual’s income, as well as with the price of each good. We start by examining how consumption bundles are affected by income. When the consumer becomes richer, her purchases of most goods will likely increase, but consumption of some goods, such as fast food, may fall. In addition, the consumer could use part of her larger purchasing power to buy other goods. For instance, individuals may need less of basic staples, such as bread or rice, as their income increases, but update their smartphones more frequently. This pattern has been well documented in countries moving from underdeveloped to developed status.
Furthermore, we examine how consumers respond to a less expensive good by increasing their consumption of that good, and how they may respond by decreasing their purchases of other goods, which now become relatively more expensive. We then analyze how this increased consumption can be disaggregated (separated) into a substitution and an income effect. The substitution effect measures how the consumer shifts her purchases toward the good that became relatively less expensive. The income effect reflects the consumer’s increased purchases due to her larger purchasing power.

4.2 Income Changes
In this section, we take the demand for a good (the optimal consumption bundle described in chapter 3), and analyze how it changes as the consumer’s income increases. For most goods, we expect such demand to increase as the individual becomes richer (i.e., the number of units she demands at a given price increases as her income grows). We next present four ways to measure such a change in demand.

4.2.1 Using the Derivative of Demand
Recall that, formally, we use x(px, py, I) to represent a consumer’s demand for good x, which is a function of the price of goods x and y, and the consumer’s income. Next, we describe normal and inferior goods.

Normal goods A consumer’s demand for good x, x(px, py, I), is normal if its derivative with respect to income is positive; that is,
\[\frac{\partial x(p_x, p_y, I)}{\partial I} > 0.\]
Intuitively, she demands more units of good x as her income, I, increases. Most goods satisfy this property, and thus they are normal goods. You can think about how your purchases of holiday packages, restaurant meals, and cars would increase if you won the lottery. However, not all goods are normal, because some products see their demand fall as individuals become richer, as we define next.

Inferior goods A consumer’s demand for good x, x(px, py, I), is inferior if its derivative with respect to income is negative; that is,
\[\frac{\partial x(p_x, p_y, I)}{\partial I} < 0.\]
Intuitively, consumers cut their consumption of inferior goods as soon as they can afford to do so. Examples include basic food staples, such as canned meats or rice, and intercity bus service. If your income were to double (or you won the lottery), wouldn’t you reduce your purchases of Spam (canned, precooked, pork and ham meat), and bus tickets to travel to another city (rather than taking a plane)?
Recall that when we say that a derivative is negative, the terms in the numerator and denominator move in opposite directions. In this context, the negative derivative implies that when income increases, quantity demanded decreases; and vice versa.¹ This concept indicates that inferior goods, such as basic staples, experience demand increases if individuals (or entire countries) become poorer, such as after an economic crisis. This could have been the case, for instance, in Mexico after the 1994 currency crisis (when individuals started consuming more tortillas than in previous years, cutting their purchases of high-quality meats), or the US after the 2008 crisis (when Walmart saw its sales increase, whereas high-end supermarkets saw their sales decrease).²
1. This argument also applies to the situation where such a derivative is positive. In that context, an increase in income raises the quantity demanded, and a decrease in income entails a reduction in the quantity demanded. That is, income and demand move in the same direction, either both going up, or both going down.

Example 4.1: Increasing income in a Cobb-Douglas utility function
Consider an individual with a Cobb-Douglas utility function u(x, y) = xy, who faces prices px, py, and income I. As shown in previous chapters, her optimal consumption for good x (i.e., her demand) is³
\[x = \frac{I}{2p_x}.\]
This demand is increasing in income because I shows up in the numerator (more formally, you can check that, indeed, \(\frac{\partial x}{\partial I} = \frac{1}{2p_x} > 0\)).
Hence, good x is normal in consumption because its demand increases in income. A similar argument applies to the demand of good y, \(y = \frac{I}{2p_y}\), which is also increasing in income.

Self-assessment 4.1 Assume that Eric’s demand function for good x is \(x = \frac{5I}{\sqrt{p_x - 3p_y}}\). Find the derivative \(\frac{\partial x}{\partial I}\) and its sign, and interpret your results.

4.2.2 Using Income Elasticity
An alternative way to represent the relationship between income and demand is found by inserting the derivative examined in the previous section, \(\frac{\partial x(p_x, p_y, I)}{\partial I}\), in the formula of income elasticity, as follows:
\[\varepsilon_{x,I} = \frac{\partial x(p_x, p_y, I)}{\partial I} \cdot \frac{I}{x(p_x, p_y, I)},\]
which measures the percentage change in quantity demanded per 1 percent change in income. In other words, if we increase income by 1 percent, quantity demanded changes by εx,I percent. Because the elements in the second ratio, I and x(px, py, I), are both positive, the sign of the income elasticity ultimately depends on the sign of the derivative in the first term. That is, income elasticity is positive, εx,I > 0, when the good is normal, \(\frac{\partial x(p_x, p_y, I)}{\partial I} > 0\); but it becomes negative, εx,I < 0, when the good is inferior, \(\frac{\partial x(p_x, p_y, I)}{\partial I} < 0\).
2. See McKenzie (2002).
3. If you do not remember this result, now is a good time to practice. First, set the tangency condition in this setting, \(\frac{MU_x}{MU_y} = \frac{y}{x} = \frac{p_x}{p_y}\), or \(y = \frac{p_x}{p_y} x\). Second, insert this result into the budget line, \(p_x x + p_y y = I\), to obtain \(p_x x + p_y \frac{p_x}{p_y} x = I\), which simplifies to \(2p_x x = I\). Finally, solving for x yields the demand function for x, \(x = \frac{I}{2p_x}\), as required.

Table 4.1 Types of goods according to their income elasticity.
Income Elasticity, εx,I    Type of Good    Example
εx,I < 0                   Inferior        Canned food
0 < εx,I < 1               Necessity       Water
εx,I > 1                   Luxury          Yachts

In addition, when income elasticity is positive and larger than 1 (i.e., εx,I > 1), we say that the good is regarded as a luxury by consumers.
Indeed, an income elasticity of εx,I > 1 (e.g., εx,I = 2) indicates that the consumer responds to a 1 percent increase in her income by increasing her consumption of the good by more than 1 percent (e.g., 2 percent). In other words, an increase in income produces a more-than-proportional increase in the quantity demanded of the good, which occurs with goods such as housing (particularly second residences), electronic gadgets, or yachts. Instead, a good whose income elasticity is less than 1 (0 < εx,I < 1) is regarded as a necessity, as a 1 percent increase in income yields a less-than-proportional increase in demand (e.g., by 0.5 percent). As an extreme example, when εx,I = 0, the consumer is insensitive to changes in her income, purchasing the same amount of the good (water or natural gas) regardless of her income. Table 4.1 summarizes the three types of goods.

Example 4.2: Finding income elasticity in the Cobb-Douglas scenario
From example 4.1, the demand for good x is \(x = \frac{I}{2p_x}\), and its derivative with respect to income is \(\frac{\partial x}{\partial I} = \frac{1}{2p_x}\). We can use these results to evaluate the income elasticity of good x, as follows:
\[\varepsilon_{x,I} = \frac{\partial x(p_x, p_y, I)}{\partial I} \cdot \frac{I}{x(p_x, p_y, I)} = \frac{1}{2p_x} \cdot \frac{I}{\frac{I}{2p_x}} = \frac{1}{2p_x} \cdot 2p_x = 1,\]
thus indicating that the good is normal, because εx,I > 0, but it is neither a luxury (which requires εx,I > 1) nor a necessity (which requires εx,I < 1).

[Figure 4.1 Income-consumption curve. Axes: x (horizontal) and y (vertical); budget lines BL1–BL3, indifference curves IC1–IC3, and optimal bundles A, B, and C joined by the income-consumption curve.]

Self-assessment 4.2 Consider again Eric’s demand function \(x = \frac{5I}{\sqrt{p_x - 3p_y}}\). Find his income elasticity εx,I, its sign, and interpret your results.

4.2.3 Using the Income-Consumption Curve
To find the income-consumption curve, we first depict the optimal consumption bundle at an initial income level I1. As illustrated in figure 4.1, bundle A is where the indifference curve IC1 becomes tangent to the budget line BL1.
The individual’s income is then increased, which shifts her budget line upward, to BL2. Then she selects a new consumption bundle that maximizes her utility, as depicted by point B, where indifference curve IC2 becomes tangent to her new budget line BL2. We repeat the process again, increasing her income, which produces a new budget line BL3, leading the consumer to choose optimal consumption bundle C. Lastly, we connect all her optimal consumption bundles with a curve, which we refer to as the “income-consumption curve.”
When the income-consumption curve has a positive slope, as in the segment between bundles A and B, it means that the individual increases her purchases of both goods x and y as she becomes richer. As a consequence, we interpret positively sloped income-consumption curves as being characteristic of normal goods. However, when these curves have a negative slope, as in the segment between bundles B and C in figure 4.1, the consumer decreases her purchases of good x (graphically, we move leftward as we jump from B to C), but she increases her purchases of good y (graphically, we move upward when jumping from B to C). Therefore, negatively sloped income-consumption curves indicate that one of the goods must be inferior.

Example 4.3: Finding income-consumption curves
From example 4.1, the demand for good x is \(x = \frac{I}{2p_x}\), and the demand for good y is \(y = \frac{I}{2p_y}\). Hence, the ratio of these demands is
\[\frac{y}{x} = \frac{\frac{I}{2p_y}}{\frac{I}{2p_x}} = \frac{p_x}{p_y},\]
which is the slope of the ray connecting the origin (0, 0) with any optimal consumption bundle. For instance, when px = $4 and py = $2, this ratio becomes \(\frac{y}{x} = \frac{4}{2} = 2\), thus indicating that the optimal consumption of goods y and x maintain a two-to-one relationship. Graphically, this result implies that the income-consumption curve is a straight line from the origin, with a constant slope of 2.
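As a quick numerical check of example 4.3, the following Python sketch evaluates the Cobb-Douglas demands x = I/(2px) and y = I/(2py) from example 4.1 at several income levels and computes the ratio y/x at each resulting bundle. The income levels (40, 80, 120) and the function name `demand` are assumptions chosen for illustration; only the prices px = $4 and py = $2 come from the example.

```python
# Sketch only: demands from example 4.1, u(x, y) = x*y.
# The income levels below are hypothetical, picked for illustration.

def demand(px, py, income):
    """Return the optimal bundle (x, y) = (I/(2*px), I/(2*py))."""
    return income / (2 * px), income / (2 * py)

px, py = 4, 2
bundles = [demand(px, py, income) for income in (40, 80, 120)]

# Slope of the ray from the origin to each optimal bundle.
slopes = [y / x for x, y in bundles]
print(bundles)
print(slopes)
```

Every bundle lies on the same ray from the origin, so the ratio y/x equals px/py = 2 at each income level considered.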
After a given increase in income, the consumer responds by increasing her demand for good y more significantly than for good x, which comes as no surprise because x is twice as expensive as y.⁴

Self-assessment 4.3 Consider Maria’s demand function \(x = \frac{5I}{\sqrt{p_x}}\) and assume that her demand for good y is symmetric, so that \(y = \frac{5I}{\sqrt{p_y}}\). Find Maria’s income-consumption curve, and evaluate it at prices px = $4 and py = $9.

4.2.4 Using the Engel Curve
An alternative approach to represent how income affects the demand for a good plots the demand for good x, x(px, py, I), on the vertical axis and income I on the horizontal axis, obtaining the so-called Engel curve of the good. The left panel of figure 4.2 depicts a positively sloped Engel curve, which implies that the good is normal, as the number of units purchased increases with income (i.e., purchases of x increase as we move rightward on the graph). The right panel depicts an Engel curve that has a positive slope for low-income levels (implying that the good is normal when the individual is not very rich), but eventually becomes negatively sloped (indicating that the individual starts regarding the good as inferior once she is sufficiently rich).
4. Alternatively, you can see this result by going back to the demand function of good x, \(x = \frac{I}{2p_x}\), and evaluating it at the price considered in the current example, px = $4, to obtain \(x = \frac{I}{8}\). Similarly, for the demand of good y, \(y = \frac{I}{2p_y}\), we find \(y = \frac{I}{4}\). We can now see that a given increase in income induces an increase of \(\frac{\partial x}{\partial I} = \frac{1}{8}\) in the demand of good x and a larger increase of \(\frac{\partial y}{\partial I} = \frac{1}{4}\) in the demand of good y.

Figure 4.2 Two Engel curves. [Panels (a) and (b), each with income on the horizontal axis and x on the vertical axis.] The left panel depicts the Engel curve for products such as real estate, whose demand keeps increasing regardless of the individual’s income (e.g., second and third residences).
In contrast, the Engel curve on the right panel may illustrate canned food or public transportation (which increases for a while as the individual’s income grows, but eventually decreases when her income is sufficiently high).

Example 4.4: Finding Engel curves
Recall from example 4.1 that the demand for good x is \(x = \frac{I}{2p_x}\). To express the Engel curve with income as a function of x (that is, placing units of x on the horizontal axis and income I on the vertical axis), we solve for I in \(x = \frac{I}{2p_x}\), obtaining an Engel curve of \(I = (2p_x)\,x\). That is, the Engel curve for this good originates at zero, and has a slope of 2px (e.g., a slope of 6 if px = $3). In addition, such a slope is positive and constant in x, thus indicating that the consumer regards good x as normal (demand increases in income) for all income levels.

Self-assessment 4.4 Consider again Eric’s demand function, \(x = \frac{5I}{\sqrt{p_x - 3p_y}}\). Solve for income I to find his Engel curve. Is its slope positive or negative? What does your result mean in terms of good x being normal or inferior?

[Figure 4.3 Not all goods can be inferior. Axes: x (horizontal) and y (vertical); budget lines BL1 and BL2, initial bundle A, and candidate bundles B in regions a, b, and c.]

Remark: Not All Goods Can Be Inferior. While our previous discussion allows some goods to be inferior, it is important to note that not every good can be inferior. Figure 4.3 depicts an individual facing income I1 at budget line BL1, who chooses an optimal consumption bundle A. When her income increases to I2, her budget line shifts upward to BL2. Which bundle B does the consumer select at this new income level? Because the individual must exhaust all her income, her new bundle B must lie along BL2, thus allowing three possibilities:
• If bundle B lies in region a, to the northeast of bundle A, the individual increases her consumption of both goods x and y.
• If bundle B lies in region b, northwest of A, the consumer purchases more units of y but fewer of x, thus regarding good y as normal and x as inferior.
• If bundle B lies in region c, southeast of A, the individual buys fewer units of y but more of x, indicating that good y is inferior, whereas x is normal.
Hence, either both goods are normal, or only one of them is inferior. For both to be inferior, the new bundle B should lie in the shaded region to the southwest of the initial bundle A. In this region, however, the consumer does not spend all her income. As described in chapter 3, bundles in the shaded region are not optimal because the consumer can still afford bundles that increase her utility. (Appendix A, at the end of this chapter, provides a more formal proof of this result using income elasticities.)

4.3 Price Changes
In this section, we analyze how demand changes as the price of one good changes. Similar to the previous section, we can use different measures, as described next.

4.3.1 Using the Derivative of Demand
Because x(px, py, I) represents a consumer’s demand for good x, we say that her demand curve for the good is negatively sloped if its derivative with respect to its own price px is negative:
\[\frac{\partial x(p_x, p_y, I)}{\partial p_x} < 0.\]
In this case, the consumer purchases fewer units as the good becomes more expensive, where we keep her income and the price of all other goods constant. If, instead, the opposite relationship holds, \(\frac{\partial x(p_x, p_y, I)}{\partial p_x} > 0\), the demand function has a positive slope. This indicates that the quantity demanded and price go in the same direction. In this case, an increase (decrease) in price leads the consumer to increase (decrease) her purchases of this good. We refer to this type of goods as “Giffen goods.” (What we just said sounds crazy, but we will return to this type of goods later in this chapter.)

Example 4.5: Demand and price changes
Consider the demand function from example 4.1, \(x = \frac{I}{2p_x}\).
If the price of good x increases by a small amount, the consumer’s purchases respond as follows:
\[ \frac{\partial x(p_x, p_y, I)}{\partial p_x} = -\frac{I}{2p_x^2}, \]
which is negative, given that I and \(p_x\) are both positive by assumption. Hence, demand function \(x = \frac{I}{2p_x}\) decreases in price \(p_x\). Graphically, this demand function has a negative slope.

Self-assessment 4.5 Consider again Eric’s demand function, x = √px5I−3py . Find the derivative \(\frac{\partial x(p_x, p_y, I)}{\partial p_x}\) and its sign, and interpret.

4.3.2 Using the Price-Elasticity of Demand
An alternative way to represent the relationship between the price of good x and its demand is by inserting the derivative described in section 4.3.1, \(\frac{\partial x(p_x, p_y, I)}{\partial p_x}\), in the formula of price elasticity, as follows:
\[ \varepsilon_{x,p_x} = \frac{\partial x(p_x, p_y, I)}{\partial p_x} \frac{p_x}{x(p_x, p_y, I)}. \]
Intuitively, if we increase price \(p_x\) by 1 percent, quantity demanded changes by \(\varepsilon_{x,p_x}\) percent. As with income elasticity, the elements in the second ratio are all positive, implying that the sign of \(\varepsilon_{x,p_x}\) depends only on the sign of the derivative \(\frac{\partial x(p_x, p_y, I)}{\partial p_x}\) (i.e., whether the demand function has a positive or negative slope). For most goods, the demand function has a negative slope (i.e., the quantity demanded and price move in opposite directions), entailing that price elasticity must also be negative. Example 4.6 evaluates the price elasticity of the demand function we found in example 4.1.

Example 4.6: Price elasticity and demand
Consider again the demand function from example 4.1, \(x = \frac{I}{2p_x}\). Using the formula for price elasticity, and recalling from example 4.5 that \(\frac{\partial x(p_x, p_y, I)}{\partial p_x} = -\frac{I}{2p_x^2}\), we obtain
\[ \varepsilon_{x,p_x} = \frac{\partial x(p_x, p_y, I)}{\partial p_x} \frac{p_x}{x(p_x, p_y, I)} = -\frac{I}{2p_x^2} \cdot \frac{p_x}{\frac{I}{2p_x}} = -1. \]
Intuitively, a 1 percent increase in price \(p_x\) produces a proportional reduction in its own demand (i.e., purchases of good x decrease by exactly 1 percent).

Self-assessment 4.6 Consider again Eric’s demand function, x = √px5I−3py . Find its price elasticity \(\varepsilon_{x,p_x}\), and interpret your result.
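The unit elasticity found in example 4.6 can also be checked numerically. The following is a minimal sketch, not part of the original text: the function names `demand_x` and `price_elasticity` are hypothetical, the values px = 3 and I = 100 are illustrative choices, and the derivative is approximated with a central finite difference rather than computed analytically.

```python
def demand_x(p_x, income):
    """Demand for good x from example 4.1: x = I / (2 * p_x)."""
    return income / (2 * p_x)


def price_elasticity(demand, p_x, income, step=1e-6):
    """Approximate (dx/dp_x) * (p_x / x) using a central finite difference."""
    slope = (demand(p_x + step, income) - demand(p_x - step, income)) / (2 * step)
    return slope * p_x / demand(p_x, income)


# Illustrative values (assumptions): p_x = 3 and I = 100.
elasticity = price_elasticity(demand_x, p_x=3.0, income=100.0)
print(round(elasticity, 4))
```

Trying other positive prices and incomes gives the same value, in line with the analytical result that this demand function has a price elasticity of −1 everywhere.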
Cross-price elasticity
The “cross-price elasticity” of the demand for good y to \(p_x\) is
\[ \varepsilon_{y,p_x} = \frac{\partial y(p_x, p_y, I)}{\partial p_x} \frac{p_x}{y(p_x, p_y, I)}. \]
This expression says that, if we increase the price of good x by 1 percent, the quantity demanded of good y changes by \(\varepsilon_{y,p_x}\) percent. In example 4.1, the demand for good y is \(y = \frac{I}{2p_y}\), implying that the quantity demanded of y is independent of \(p_x\). Therefore, the first term in the cross-price elasticity formula becomes \(\frac{\partial y(p_x, p_y, I)}{\partial p_x} = 0\), entailing that \(\varepsilon_{y,p_x} = 0\). Intuitively, a 1 percent increase in the price of good x does not affect the demand for good y at all.

4.3.3 Using Price-Consumption Curves
Figure 4.4a illustrates a decrease in the price of good x, \(p_x\). Starting from \(p_x\) = $2 in budget line \(BL_1\), a decrease to \(p_x\) = $1 makes the budget line \(BL_2\) flatter, implying that the individual can now afford larger amounts of good x. A similar argument applies when this price further decreases to \(p_x\) = $0.50 in budget line \(BL_3\).⁵ Figure 4.4a also depicts the optimal consumption bundles that the consumer selects at each price \(p_x\): bundle A at \(p_x\) = $2 (where she consumes only 1 unit of good x), bundle B at \(p_x\) = $1 (where she now consumes 2 units of x), and bundle C at \(p_x\) = $0.50 (where she consumes 4 units of x). The graph connects optimal bundles A–C with a curve, often referred to as the “price-consumption curve.” In this example, as good x becomes cheaper, the individual increases her purchases of both goods x and y.

[Figure 4.4a: Price-consumption curve–I. Top panel: good x on the horizontal axis and good y on the vertical axis, with budget lines BL1, BL2, and BL3 and the price-consumption curve through bundles A, B, and C. Lower panel: good x on the horizontal axis and price px on the vertical axis, with bundles A, B, and C at prices 2, 1, and 0.5.]

5. Recall from chapter 3 that the horizontal intercept of the budget line is \(\frac{I}{p_x}\). A decrease in \(p_x\) then produces an increase in ratio \(\frac{I}{p_x}\), ultimately shifting this horizontal intercept rightward. In contrast, the vertical intercept, \(\frac{I}{p_y}\), is unaffected by a cheaper good x, because it is independent of \(p_x\).
Figure 4.4a (top panel) shows a price-consumption curve with a positive slope at all levels of price \(p_x\). This could occur, for instance, if good x represents housing: as houses and apartments become cheaper (per square foot), you can probably afford a larger house.

The lower panel of figure 4.4a summarizes these results. For clarity, it depicts good x on the horizontal axis (as in the top panel) but has the price of this good, \(p_x\), on the vertical axis (rather than the units of y). This helps us focus on good x, by directly showing how the purchases of good x are affected by changes in its own price \(p_x\). The lower panel represents, of course, the demand curve for x. (For instance, the demand function found in example 4.1, \(x = \frac{I}{2p_x}\), would have a graphical representation similar to that in the lower panel of figure 4.4a, because \(x = \frac{I}{2p_x}\) decreases in \(p_x\) and at a decreasing rate.)⁶

Figure 4.4b illustrates a situation in which the individual also increases her consumption of good x, the good that became relatively cheaper. As for good y, however, the consumer in this case decreases her purchases when moving from bundle A to B (when \(p_x\) decreases from $2 to $1), but increases her purchases of y afterwards (when \(p_x\) further decreases to $0.50).⁷

[Figure 4.4b: Price-consumption curve–II. Good x on the horizontal axis and good y on the vertical axis, with budget lines BL1, BL2, and BL3 and the price-consumption curve through bundles A, B, and C.]

6. Assume that income is I = $10, and depict the demand function \(x = \frac{10}{2p_x} = \frac{5}{p_x}\). You can then plot this expression in a graphing calculator. Alternatively, note that demand decreases in \(p_x\) because \(\frac{\partial x}{\partial p_x} = -\frac{5}{(p_x)^2}\) is negative, given that \(p_x > 0\) by assumption, thus indicating that demand \(x = \frac{5}{p_x}\) has a negative slope. In addition, such a negative slope increases (becoming closer to zero) as \(p_x\) increases (i.e., the second derivative is \(\frac{\partial^2 x}{\partial p_x^2} = \frac{10}{(p_x)^3}\), which is positive). To graphically interpret these two results, first note that we are differentiating x (on the horizontal axis) with respect to \(p_x\) (on the vertical axis). We can then check the slope of the figure once we rotate it 90 degrees counterclockwise, so x is now on the vertical axis and \(p_x\) is on the horizontal axis. At this point, it is clear that the demand has a negative slope (smaller purchases of x as \(p_x\) increases), and becomes flatter as we further increase prices.

7. Graphically, the price-consumption curve unambiguously moves rightward, indicating an increase in the purchases of good x, but it has a negative slope followed by a positive slope, reflecting that the individual initially decreases her consumption of y, but subsequently increases it.

Example 4.7: Finding price-consumption curves
Recall from example 4.3 that the demand for good x is \(x = \frac{I}{2p_x}\), and that of good y is \(y = \frac{I}{2p_y}\). Hence, the ratio of these demands is
\[ \frac{y}{x} = \frac{\frac{I}{2p_y}}{\frac{I}{2p_x}} = \frac{p_x}{p_y}, \]
which gives the slope of the ray connecting the origin (0, 0) with any optimal consumption bundle. Therefore, an increase in the price of good x, \(p_x\), increases the value of the above ratio \(\frac{y}{x} = \frac{p_x}{p_y}\), showing that the consumer moves to optimal bundles that contain more units of good y relative to good x: she buys fewer units of good x (the good that became more expensive), while good y became relatively cheaper. Graphically, this entails that the ray to the optimal bundle pivots northwest as \(p_x\) increases. The opposite argument applies if \(p_y\) increases, where now ratio \(\frac{y}{x} = \frac{p_x}{p_y}\) goes down, illustrating that the consumer purchases more units of good x relative to good y.

Self-assessment 4.7 Maria’s demand function for good x is x = √5Ipx , and her demand for good y is y = √5Ipy . Find her price-consumption curve and interpret your results.

4.4 Income and Substitution Effects
The previous discussion uses the demand curve (and the price-consumption curve) to measure how much the individual increases her quantity demanded of a good after a price decrease. Such an increase in the quantity demanded, however, includes two simultaneous effects, which we examine separately next.

Income effect: The change in the quantity demanded due to an increase in purchasing power, with the price of the item held constant.

A cheaper good x allows the individual to afford more units of all goods (i.e., larger purchasing power).⁸ The increase in the quantity demanded of good x due to greater purchasing power is referred to as the “income effect.” As described at the beginning of this chapter, an increase in income can induce the consumer to increase (decrease) her purchases of a good if she regards it as normal (inferior), thus allowing for positive (negative) income effects.

The substitution effect seeks to measure how much quantity demanded changes due to the change in the price ratio (making one good relatively cheaper than the other), while fixing the utility of the individual at the level reached before the price change.

Substitution effect: The change in quantity demanded due to a change in its price, holding the utility level constant.

In contrast to the income effect, we now keep the utility constant at the level reached before the price change. That is, we adjust the individual’s income so she can reach the same utility level as before the price change, and then she can choose an optimal bundle at the new price ratio. After a price decrease (increase), the substitution effect always leads to an increase (decrease) in the quantity demanded of the good that became relatively less (more) expensive. This quantity change is, importantly, not due to an increase in the consumer’s purchasing power. Instead, it is due to a change in the relative price of the two goods, inducing the consumer to increase (decrease) her purchases of the good that became relatively cheaper (more expensive).
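The two definitions above can be turned into a small numerical sketch. This is an illustration under stated assumptions rather than the authors’ procedure: it assumes a Cobb-Douglas utility u(x, y) = xy, for which the tangency condition y/x = px/py gives demands x = I/(2px) and y = I/(2py), and for which the bundle that keeps utility at u under given prices has x = sqrt(u · py / px). The function names and the numbers (I = 100, py = 1, px falling from 3 to 2) are hypothetical choices made for the example.

```python
import math


def ordinary_x(p_x, income):
    """Utility-maximizing x for u = x*y (assumed utility): x = I / (2 * p_x)."""
    return income / (2 * p_x)


def ordinary_y(p_y, income):
    """Utility-maximizing y for u = x*y (assumed utility): y = I / (2 * p_y)."""
    return income / (2 * p_y)


def compensated_x(p_x, p_y, utility):
    """x on the indifference curve x*y = utility where y/x = p_x/p_y."""
    return math.sqrt(utility * p_y / p_x)


def decompose(p_x_old, p_x_new, p_y, income):
    """Return (substitution effect, income effect, total effect) on good x."""
    x_a = ordinary_x(p_x_old, income)             # initial bundle A
    utility_a = x_a * ordinary_y(p_y, income)     # utility reached at A
    x_b = compensated_x(p_x_new, p_y, utility_a)  # same utility, new price ratio
    x_c = ordinary_x(p_x_new, income)             # final bundle C
    return x_b - x_a, x_c - x_b, x_c - x_a


se, ie, te = decompose(p_x_old=3.0, p_x_new=2.0, p_y=1.0, income=100.0)
print(round(se, 2), round(ie, 2), round(te, 2))
```

With these illustrative numbers, both effects are positive (roughly 3.75 and 4.59 units) and add up to the total change in x, as expected for a normal good after a price decrease.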
8. For example, imagine that housing prices were to fall by 50 percent tomorrow. You could probably afford a larger apartment (or even a house!), more units of other goods, or both.

4.5 Putting Income and Substitution Effects Together
Figure 4.5 depicts the income and substitution effects of a price decrease for normal goods. First, when facing budget line \(BL_1\), the consumer selects an optimal consumption bundle A, where she reaches an indifference curve \(IC_1\), and she purchases \(x_A\) units of good x. When the price of x decreases, the budget line pivots upward to \(BL_2\), and in this situation the consumer chooses an optimal bundle C, where she purchases \(x_C\) units of good x. The difference \(x_C - x_A\) represents the total effect (or total change in consumption due to the cheaper price of x).

To separate the total effect into the substitution and income effects, we first need to shift \(BL_2\) downward (so we reduce the individual’s income) to make her reach the same utility level as before the price change. The resulting budget line, \(BL_d\), is parallel to \(BL_2\), thus having the final price ratio, and is tangent to indifference curve \(IC_1\) at bundle B, where the consumer purchases \(x_B\) units of good x. We are done! At this point, we only need to summarize our results:
1. The increase in consumption from \(x_A\) to \(x_B\) reflects the substitution effect (buying more units of x solely due to the fact that good x became relatively cheaper).
2. The increase in consumption from \(x_B\) to \(x_C\) represents the income effect (buying more units of x because the consumer’s income has increased, but fixing prices at the final price ratio).
3. The total effect is, of course, the sum of the substitution and income effects.

[Figure 4.5: Income and substitution effects for normal goods. Good x on the horizontal axis and good y on the vertical axis; budget lines BL1, BLd, and BL2; bundles A, B, and C; the substitution effect (SE) runs from xA to xB, the income effect (IE) from xB to xC, and the total effect (TE) from xA to xC.]

Figure 4.6a illustrates the income and substitution effect when the consumer regards good x as inferior.
In this scenario, the income effect is negative, but it only partially offsets the substitution effect, thus producing a total effect that is positive overall. Figure 4.6b depicts a Giffen good, which also exhibits a negative income effect, in this case large enough to more than offset the substitution effect and generate an overall negative total effect.

[Figure 4.6a: Income and substitution effects for inferior goods. Good x on the horizontal axis and good y on the vertical axis; budget lines BL1, BLd, and BL2; bundles A, B, and C, with xA < xC < xB.]

[Figure 4.6b: Income and substitution effects for Giffen goods. Good x on the horizontal axis and good y on the vertical axis; budget lines BL1, BLd, and BL2; bundles A, B, and C, with xC < xA < xB.]

Recall that the total effect measures how the quantity demanded responds to a change in price (in these figures, a decrease in the price of x). Hence, for both normal and inferior goods the consumer responds to a cheaper good x by increasing her purchases (i.e., positive total effect), whereas for a Giffen good the consumer responds by reducing her purchases of good x (i.e., negative total effect). That is, demand curves are negatively sloped for normal and inferior goods, but become positively sloped for Giffen goods.⁹

Table 4.2 Substitution and income effects.

                 Price Decrease      Price Increase
Type of good     SE    IE    TE      SE    IE    TE
Normal           +     +     +       −     −     −
Inferior         +     −     +       −     +     −
Giffen           +     −     −       −     +     +

In the case of a price decrease, the left panel of table 4.2 summarizes the signs of the substitution effect (SE), income effect (IE), and total effect (TE) for normal, inferior, and Giffen goods. A positive sign in the table indicates that, after a price decrease (such as those in figures 4.5–4.6), the individual responds by increasing her purchases of the good, while a negative sign represents a decrease in her purchases of that good. The right side of table 4.2 analyzes a price increase, showing that all signs are reversed.
For instance, inferior goods exhibit a negative substitution effect (fewer purchases), a positive income effect (more purchases), and, overall, a negative total effect.

9. Alternatively, a Giffen good requires its sales to increase as the good becomes more expensive. Empirical evidence for Giffen goods has proven elusive, but some references exist for basic staples, such as tortillas in Mexico (McKenzie, 2002) and rice in China (Jensen and Miller, 2007). Intuitively, a Giffen good must be clearly inferior, in the sense of reducing its quantity demanded significantly after small increases in income. In this case, the good would have a sufficiently negative income effect to offset the positive substitution effect, leading to a negative total effect.

Example 4.8: Finding IE and SE with a Cobb-Douglas utility function
Consider a utility function u(x, y) = xy, income I = $100, and a price for good y of \(p_y\) = $1. The price of good x decreases from \(p_x\) = $3 to \(p_x\) = $2. Let us find the total effect of the price change, and then decompose it into the income and substitution effects.

Finding initial bundle A. Under the initial price, \(p_x\) = $3, the tangency condition \(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\) is \(\frac{y}{x} = \frac{3}{1}\) or, after rearranging, y = 3x. Inserting this expression in the budget line, 3x + y = 100, we obtain 3x + 3x = 100, or \(x = \frac{100}{6}\) units, which then yields \(y = 3 \times \frac{100}{6} = 50\) units. Hence, with the initial price of x, the optimal bundle is \(A = \left(\frac{100}{6}, 50\right)\).

Finding final bundle C. Similarly, under the final price, \(p_x\) = $2, the tangency condition \(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\) is \(\frac{y}{x} = \frac{2}{1}\) or, after rearranging, y = 2x. Inserting this result into the new budget line, 2x + y = 100, we obtain 2x + 2x = 100, or \(x = \frac{100}{4} = 25\) units, which yields y = 2 × 25 = 50 units. Hence, under the final price of good x, the optimal bundle is C = (25, 50). The total effect of the decrease in \(p_x\) is, therefore, an increase of
\[ TE = x_C - x_A = 25 - \frac{100}{6} \approx 8.3 \text{ units.} \]

Finding decomposition bundle B.
We can now decompose this total effect into the substitution and income effects. First, we need to find the decomposition bundle B. From the previous discussion, this bundle satisfies two conditions:
1. First, at bundle B, the consumer reaches the same utility level as at the initial bundle A. Because bundle A produces a utility of \(u\left(\frac{100}{6}, 50\right) = \frac{100}{6} \times 50 \approx 833.3\), bundle B must satisfy xy = 833.3.
2. Second, at bundle B, the decomposition budget line \(BL_d\) (which has the same slope as \(BL_2\), \(\frac{p_x}{p_y} = \frac{2}{1}\)) is tangent to the consumer’s indifference curve. That is, we need the slope of the indifference curve, \(\frac{MU_x}{MU_y} = \frac{y}{x}\), to coincide with the slope of budget line \(BL_d\), \(\frac{p_x}{p_y} = \frac{2}{1}\), which implies \(\frac{y}{x} = \frac{2}{1}\), or y = 2x.

In summary, these two conditions state that xy = 833.3 and y = 2x. Inserting one equation into the other, we obtain x(2x) = 833.3. After rearranging, we find \(x^2 \approx 416.7\); applying square roots to both sides yields x ≈ 20.4 units. Hence, y = 2 × 20.4 = 40.8 units, which entails that bundle B is B = (20.4, 40.8).

In summary, the substitution effect of the decrease in \(p_x\) is given by the rightward move from bundle A to B; that is,
\[ SE = x_B - x_A = 20.4 - \frac{100}{6} \approx 3.74 \text{ units,} \]
whereas the income effect of this price decrease is captured by the move from bundle B to C; that is,
\[ IE = x_C - x_B = 25 - 20.4 = 4.6 \text{ units.} \]
In particular, the substitution effect indicates that the individual increases her purchases of good x by 3.74 units only due to the lower price of this good (but still reaching the same utility level as before the price change). The income effect reflects that, for a given price ratio, the individual increases her consumption of good x by 4.6 units because a cheaper good x increases her purchasing power.

Self-assessment 4.8 John’s utility function is \(u(x, y) = x^{1/3} y^{2/3}\), his income is I = $150, and the price of good y is \(p_y\) = $1. The price of good x decreases from \(p_x\) = $3 to \(p_x\) = $1.
Using the steps in example 4.8, find the substitution and income effects.

Next, we alter example 4.8 by considering a quasilinear utility function. As we show, the income effect of a price decrease is zero for the good that enters nonlinearly (good x below), implying that the substitution and total effects coincide.

Example 4.9: Finding IE and SE with a quasilinear utility
Consider now a utility function \(u(x, y) = 2x^{1/2} + y\), and the same income and price changes as in example 4.8. In this example, the marginal utilities are \(MU_x = 2 \cdot \frac{1}{2} x^{-1/2} = \frac{1}{x^{1/2}}\) and \(MU_y = 1\).

Finding initial bundle A. At initial price \(p_x\) = $3, the tangency condition \(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\) becomes
\[ \frac{\frac{1}{x^{1/2}}}{1} = \frac{3}{1}, \]
or \(\frac{1}{x^{1/2}} = 3\). After rearranging, we find \(\frac{1}{3} = x^{1/2}\). Squaring both sides yields \(x = \frac{1}{9} \approx 0.11\) units. Inserting this expression in the budget line, 3x + y = 100, we obtain (3 × 0.11) + y = 100, or y = 100 − 0.33 ≈ 99.67 units. Hence, under the initial price of x, the optimal bundle is A = (0.11, 99.67).

Finding final bundle C. Similarly, at the final price, \(p_x\) = $2, the tangency condition \(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\) yields \(\frac{\frac{1}{x^{1/2}}}{1} = \frac{2}{1}\) or, after rearranging, \(\frac{1}{x^{1/2}} = 2\), that is, \(x^{1/2} = \frac{1}{2}\). Squaring both sides yields \(x = \frac{1}{4} = 0.25\) units. Inserting this result into the new budget line, 2x + y = 100, yields (2 × 0.25) + y = 100. After solving for y, we find y = 100 − 0.5 = 99.5 units. Hence, under the final price of good x, the optimal bundle is C = (0.25, 99.5). The total effect of the decrease in \(p_x\) is, therefore, an increase of
\[ TE = x_C - x_A = 0.25 - 0.11 = 0.14 \text{ units.} \]

Finding decomposition bundle B. We can now break down this total effect into the substitution and income effects. To do that, we first need to find the decomposition bundle B. From our discussion, this bundle satisfies two conditions:
1. First, at bundle B, the consumer reaches the same utility level as at the initial bundle A. Because bundle A produces a utility of \(u(0.11, 99.67) = (2 \times 0.11^{1/2}) + 99.67 \approx 100.33\), bundle B must satisfy \(2x^{1/2} + y = 100.33\).
2. Second, at bundle B, the decomposition budget line \(BL_d\) (which has the same slope as \(BL_2\), \(\frac{p_x}{p_y} = \frac{2}{1}\)) is tangent to the consumer’s indifference curve. That is, the slope of the indifference curve, \(\frac{MU_x}{MU_y} = \frac{\frac{1}{x^{1/2}}}{1}\), must coincide with that of budget line \(BL_d\), \(\frac{p_x}{p_y} = \frac{2}{1}\), which simplifies to \(\frac{1}{x^{1/2}} = 2\) or, after rearranging, x = 0.25 units.

We now insert x = 0.25 into the utility target that bundle B must reach, \(2x^{1/2} + y = 100.33\), to obtain \((2 \times 0.25^{1/2}) + y = 100.33\). After solving for y, we find y = 100.33 − 1 = 99.33 units.

In summary, the substitution effect of the decrease in \(p_x\) is given by the rightward move from bundle A to B; that is,
\[ SE = x_B - x_A = 0.25 - 0.11 = 0.14 \text{ units,} \]
whereas the income effect of this price decrease is captured by the move from bundle B to C; that is,
\[ IE = x_C - x_B = 0.25 - 0.25 = 0 \text{ units.} \]
Therefore, the income effect is zero in this example (IE = 0), implying that the substitution effect is equal to the total effect, SE = TE. This means that the consumer, after experiencing a cheaper good x, uses her increased purchasing power to buy more units of good y alone, rather than increasing her purchases of good x. Nonetheless, she understands that good x became relatively cheaper than y, inducing her to increase her purchases of x by 0.14 units, as reflected by the substitution effect.

Self-assessment 4.9 Chelsea’s utility function is \(u(x, y) = 3x + 4y^{1/2}\), her income is I = $220, and \(p_y\) = $1. The price of good x decreases from \(p_x\) = $3 to \(p_x\) = $2. Using the steps in example 4.9, find the substitution and income effects.

4.5.1 Income and Substitution Effects on the Labor Market
The previous analysis of income and substitution effects can be readily applied to any good or service, such as the number of hours of leisure that an individual enjoys, L.
Because the day has only 24 hours, the analysis of leisure choices allows us to examine its counterpart, working hours, H. That is, given that L + H = 24, then H = 24 − L.

[Figure 4.7: (a) Labor decisions when facing wage w. (b) Labor decisions when facing a higher wage w′. Leisure L (from 0 to 24 hours) on the horizontal axis and good y on the vertical axis; panel (a) shows budget line BL1 with vertical intercept 24w and bundle A at (LA, yA) on indifference curve u1; panel (b) adds budget line BL2 with vertical intercept 24w′ and bundle C at LC on indifference curve u2.]

Figure 4.7a represents an individual facing a salary of w per working hour. Her budget line \(BL_1\) originates at the horizontal intercept, representing 24 hours of leisure per day (or, equivalently, zero hours of work). At this point, her total income is zero, so she cannot purchase units of good y on the vertical axis (which can be understood as a composite good, aggregating all goods and services, as opposed to leisure). If she chooses to work 1 hour, moving leftward, her income increases to w; and if she works 2 hours (moving farther to the left) her income increases to 2w. If she works all day (24 hours), her income becomes 24w, as depicted on the vertical intercept of her budget line, whereby she does not enjoy any leisure but can purchase the largest amount of goods (represented on the vertical axis). Finally, note that her utility increases as indifference curves move northeast, that is, as she enjoys more hours of leisure (moving to the right) and more units of goods (moving upward). At an hourly wage of w, she chooses an optimal consumption bundle, in which her budget line \(BL_1\) is tangent to her indifference curve \(u_1\), which occurs at bundle A, where she enjoys \(L_A\) hours of leisure and \(y_A\) goods.

[Figure 4.8: Income and substitution effects in the labor market. Leisure L on the horizontal axis and good y on the vertical axis; budget lines BL1, BLd, and BL2; indifference curves u1 and u2; bundles A, B, and C, with LB < LC < LA.]

Figure 4.7b depicts an increase in the worker’s hourly salary to w′, where w′ > w.
As a consequence, the new budget line \(BL_2\) becomes steeper than \(BL_1\) because, starting at 24 hours of leisure, every hour of work (moving leftward) entails a larger salary, which allows the worker to afford more units of good y. If she were to work all day (24 hours), her income would be 24w′, as depicted on the vertical intercept of \(BL_2\), which lies above that of \(BL_1\) because 24w′ > 24w. Facing this more generous salary, the worker chooses an optimal bundle C, where she enjoys \(L_C\) hours of leisure. The change in leisure \(L_C - L_A\) (a decrease in figure 4.7b) is, therefore, the total effect that arises from the salary increase.

This total effect can be broken into a substitution and an income effect, as we did in similar applications in this chapter. To examine the substitution effect, we shift the final budget line \(BL_2\) downward so it becomes tangent to the initial indifference curve, \(u_1\). This downward parallel shift gives us the so-called decomposition budget line \(BL_d\), which is tangent to \(u_1\) at bundle B, where the worker enjoys \(L_B\) hours of leisure. Figure 4.8 superimposes this effect on figure 4.7b. The substitution effect in this case is given by the change in leisure from the initial bundle A to the decomposition bundle B, \(L_B - L_A\), whereas the income effect is represented by the change in leisure from the decomposition bundle B to the final bundle C, \(L_C - L_B\). The sum of the substitution and income effects coincides with the total effect; that is, \((L_B - L_A) + (L_C - L_B) = L_C - L_A\).

In figure 4.8, the income effect of the salary increase moves in the opposite direction of the substitution effect. Intuitively, a more generous salary per hour induces the worker to work more hours (the substitution effect moves her choice leftward, toward more work and less leisure). The opportunity cost of leisure increases along with the wage, because this opportunity cost is captured by the higher hourly wage that the worker forgoes if she does not work 1 more hour.
In other words, every hour of leisure becomes more expensive. However, the income effect reflects the fact that, as the worker becomes richer, she can afford to work fewer hours and enjoy more leisure. If the income effect is sufficiently large, it more than offsets the substitution effect, resulting in an overall positive total effect on leisure. In this case, a more generous salary increases the hours of leisure that the worker enjoys, thus reducing the number of hours she chooses to work.¹⁰

10. If we plot the number of working hours H on the horizontal axis, and the salary per hour w on the vertical axis, we can visually understand the relationship between H and w with the labor supply curve of the worker. When the total effect of an increase in w on working hours is positive, working hours increase in w, thus producing a positively sloped labor supply curve. In contrast, when the total effect is negative, working hours decrease in w, entailing a negatively sloped labor supply.

Appendix A. Not All Goods Can Be Inferior
In this appendix, we use income elasticities to prove that not all goods can be inferior. Let us begin by writing a property that holds in all the previous analysis: when the consumer chooses her optimal bundle, this bundle must lie on the budget line. Formally, we say that, at optimal consumption bundles \(x(p_x, p_y, I)\) and \(y(p_x, p_y, I)\), the individual must exhaust her income, or
\[ p_x x(p_x, p_y, I) + p_y y(p_x, p_y, I) = I. \]
Because we seek to analyze the effect of an income change, let us differentiate this expression with respect to I:
\[ p_x \frac{\partial x(p_x, p_y, I)}{\partial I} + p_y \frac{\partial y(p_x, p_y, I)}{\partial I} = 1. \]
To obtain the expression of income elasticity, \(\varepsilon_{x,I} = \frac{\partial x(p_x, p_y, I)}{\partial I} \frac{I}{x(p_x, p_y, I)}\), in each term, we multiply the first term by \(\frac{x(p_x, p_y, I) \times I}{x(p_x, p_y, I) \times I}\), and multiply the second term by \(\frac{y(p_x, p_y, I) \times I}{y(p_x, p_y, I) \times I}\). This multiplication yields
\[ p_x \frac{\partial x(p_x, p_y, I)}{\partial I} \frac{x(p_x, p_y, I)}{x(p_x, p_y, I)} \frac{I}{I} + p_y \frac{\partial y(p_x, p_y, I)}{\partial I} \frac{y(p_x, p_y, I)}{y(p_x, p_y, I)} \frac{I}{I} = 1. \]
Rearranging, we obtain
\[ \underbrace{\frac{p_x x(p_x, p_y, I)}{I}}_{\theta_x} \underbrace{\frac{\partial x(p_x, p_y, I)}{\partial I} \frac{I}{x(p_x, p_y, I)}}_{\varepsilon_{x,I}} + \underbrace{\frac{p_y y(p_x, p_y, I)}{I}}_{\theta_y} \underbrace{\frac{\partial y(p_x, p_y, I)}{\partial I} \frac{I}{y(p_x, p_y, I)}}_{\varepsilon_{y,I}} = 1, \]
or, more compactly,
\[ \theta_x \varepsilon_{x,I} + \theta_y \varepsilon_{y,I} = 1, \]
where \(\theta_x = \frac{p_x x(p_x, p_y, I)}{I}\) represents the budget share that the individual spends on good x (which entails that \(\theta_x\) is a percentage, such that \(0 \le \theta_x \le 1\)). In addition, \(\varepsilon_{x,I} = \frac{\partial x(p_x, p_y, I)}{\partial I} \frac{I}{x(p_x, p_y, I)}\) is the definition of income elasticity discussed in previous sections of this chapter. An analogous interpretation applies to the budget share of good y, \(\theta_y\), as well as to its income elasticity, \(\varepsilon_{y,I}\).

At this point, we are ready to tackle our initial question: Can all goods be inferior? For that to occur, we would need their income elasticities to be negative (i.e., \(\varepsilon_{x,I} < 0\) and \(\varepsilon_{y,I} < 0\)). However, that would require the left side of the previous expression, \(\theta_x \varepsilon_{x,I} + \theta_y \varepsilon_{y,I}\), to be negative or zero, given that budget shares are both positive or zero,¹¹ whereas the right side equals 1. Hence, this equality could not hold if both goods were inferior. As a consequence, one or both goods must be normal, but both cannot be inferior.

Appendix B. An Alternative Representation of Income and Substitution Effects
In previous sections of this chapter, we analyzed the increase in demand coming from a price decrease, and how to break down this increase (total effect of the price decrease) into the income and substitution effects. We next present a more compact approach to express these two effects. First, from the discussion of the utility maximization problem (UMP) and the expenditure minimization problem (EMP) in chapter 3 (see Appendix B), we take the demand function that results from the UMP, \(x^U(p_x, p_y, I)\), and evaluate it at income level \(I = e(p_x, p_y, u)\). Recall that this is the necessary income to purchase the optimal bundle that solves the EMP.
Therefore, we obtain
\[ x^U(p_x, p_y, e(p_x, p_y, u)) = x^E(p_x, p_y, u), \tag{4.1} \]
where \(e(p_x, p_y, u) = p_x x^E(p_x, p_y, u) + p_y y^E(p_x, p_y, u)\). That is, the optimal bundle that solves the UMP (left side of 4.1) coincides with the bundle solving the EMP (right side), where the solution to the UMP is evaluated at income level \(I = e(p_x, p_y, u)\).

Because the income and substitution effects measure how purchases of good x are affected by a change in its price, \(p_x\), we next differentiate both sides of equation (4.1) with respect to price \(p_x\), to obtain
\[ \frac{\partial x^U}{\partial p_x} + \frac{\partial x^U}{\partial e} \frac{\partial e}{\partial p_x} = \frac{\partial x^E}{\partial p_x}. \tag{4.2} \]

11. Recall that budget shares are percentages, \(\theta_x \in [0, 1]\) and \(\theta_y \in [0, 1]\), implying that \(\theta_x\) and \(\theta_y\) are either positive numbers, or zero.

To understand the left side of equation (4.2), recall that price \(p_x\) shows up in the first and third arguments of \(x^U(p_x, p_y, e(p_x, p_y, u))\), implying that we need to differentiate separately in each of them. In addition, \(p_x\) is inside \(e(p_x, p_y, u)\), meaning that we need to apply the chain rule.¹²

Note that \(e(p_x, p_y, u) = p_x x^E(p_x, p_y, u) + p_y y^E(p_x, p_y, u)\) by definition; that is, we use \(e(p_x, p_y, u)\) to refer to the income that we need to buy \(x^E(p_x, p_y, u)\) and \(y^E(p_x, p_y, u)\). Differentiating with respect to \(p_x\) in \(e(p_x, p_y, u)\) yields
\[ \frac{\partial e}{\partial p_x} = x^E(p_x, p_y, u), \]
and, from equation (4.1), \(x^U(p_x, p_y, e(p_x, p_y, u)) = x^E(p_x, p_y, u)\). We can insert this result at the end of the left side of equation (4.2), to obtain
\[ \frac{\partial x^U}{\partial p_x} + \frac{\partial x^U}{\partial e} x^U = \frac{\partial x^E}{\partial p_x}. \tag{4.3} \]
Finally, because \(e(p_x, p_y, u) = I\), we can ultimately express equation (4.3) as
\[ \frac{\partial x^U}{\partial p_x} + \frac{\partial x^U}{\partial I} x^U = \frac{\partial x^E}{\partial p_x}. \]
Rearranging yields the so-called Slutsky equation:
\[ \underbrace{\frac{\partial x^U}{\partial p_x}}_{TE} = \underbrace{\frac{\partial x^E}{\partial p_x}}_{SE} \underbrace{- \frac{\partial x^U}{\partial I} x^U}_{IE}, \]
indicating that the total effect of a decrease in \(p_x\) (as measured by the effect on the demand function found after solving the UMP) is the sum of the substitution effect (as captured by the change in the demand found after solving the EMP) and the income effect.
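The Slutsky equation can be checked numerically for a specific case. The sketch below is an illustration under assumptions, not part of the original derivation: it assumes the utility function u(x, y) = xy, whose UMP demand is x = I/(2px) and whose EMP demand is x = sqrt(u · py / px); the helper names are hypothetical, the evaluation point (px = 3, py = 1, I = 100) is arbitrary, and derivatives are approximated by central finite differences.

```python
import math


def x_ump(p_x, income):
    """UMP demand for u = x*y (assumed utility): x = I / (2 * p_x)."""
    return income / (2 * p_x)


def x_emp(p_x, p_y, utility):
    """EMP demand for u = x*y (assumed utility): x = sqrt(u * p_y / p_x)."""
    return math.sqrt(utility * p_y / p_x)


def derivative(func, point, step=1e-6):
    """Central finite-difference approximation of func'(point)."""
    return (func(point + step) - func(point - step)) / (2 * step)


# Arbitrary evaluation point (assumption).
p_x, p_y, income = 3.0, 1.0, 100.0
# Utility reached at the UMP optimum: u = x * y with y = I / (2 * p_y).
utility = x_ump(p_x, income) * (income / (2 * p_y))

total_effect = derivative(lambda p: x_ump(p, income), p_x)
substitution_effect = derivative(lambda p: x_emp(p, p_y, utility), p_x)
income_effect = -derivative(lambda m: x_ump(p_x, m), income) * x_ump(p_x, income)

print(round(total_effect, 3))
print(round(substitution_effect + income_effect, 3))
```

Both printed numbers agree, and each of the two effects is negative, so they reinforce each other as expected for a normal good.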
Let us briefly explain why the substitution effect is measured by \( \partial x^E / \partial p_x \). Upon a decrease in \(p_x\), the budget line pivots upward, but the EMP requires that the individual still reaches the same utility target u. Graphically, the consumer must then return to the same indifference curve she reached before the price change, but with a flatter budget line (as the price ratio \( p_x / p_y \) decreased). As a consequence, she moves to a bundle located on the same indifference curve, but to the southeast of the initial consumption bundle, thus leading her to purchase more units of good x. Hence, a decrease in \(p_x\) increases her purchases of x, implying that \( \partial x^E / \partial p_x < 0 \) (i.e., \(p_x\) and purchases of x move in opposite directions). Importantly, this result does not rely on goods being normal or inferior but rather applies to all types of goods.

12. Recall the chain rule when dealing with composite functions. If we have a differentiable function \( y = f(x) \) where \( x = g(z) \) is another differentiable function, then the derivative of y with respect to z is equal to the derivative of y with respect to x, times the derivative of x with respect to z. More compactly, \( \frac{dy}{dz} = \frac{df(x)}{dx}\frac{dx(z)}{dz} = f'(x)\,g'(z) \). Intuitively, a marginal increase in z produces an increase in x (as captured by \( g'(z) \)) and, because x increases y, we also experience an increase in y (as measured by \( f'(x) \) in the last expression).

The sign of the income effect, however, depends on whether goods are normal (a larger income results in increased purchases of x, implying that \( \partial x^U / \partial I > 0 \)) or inferior (a larger income results in decreased purchases of x, implying that \( \partial x^U / \partial I < 0 \)). Specifically, when goods are normal, \( \partial x^U / \partial I > 0 \), we obtain

\[ \underbrace{\frac{\partial x^U}{\partial p_x}}_{\text{TE is } -} = \underbrace{\frac{\partial x^E}{\partial p_x}}_{\text{SE is } -} \; \underbrace{-\; \overbrace{\frac{\partial x^U}{\partial I}}^{+}\, x^U}_{\text{IE is } -} \]

entailing that income and substitution effects are both negative, thus reinforcing each other. In contrast, when goods are inferior, \( \partial x^U / \partial I < 0 \), we have

\[ \underbrace{\frac{\partial x^U}{\partial p_x}}_{\text{TE is ?}} = \underbrace{\frac{\partial x^E}{\partial p_x}}_{\text{SE is } -} \; \underbrace{-\; \overbrace{\frac{\partial x^U}{\partial I}}^{-}\, x^U}_{\text{IE is } +} \]

which implies that, while the substitution effect is negative (recall that it is always negative), the income effect is now positive, and thus the sign of the total effect is ambiguous. If the substitution effect dominates the income effect (see footnote 13), the total effect is negative (as with inferior goods that are not Giffen); whereas when the income effect dominates, the total effect becomes positive (as with Giffen goods).

13. This occurs if the absolute value of the substitution effect, \( \partial x^E / \partial p_x \), is greater than the absolute value of the income effect, \( \frac{\partial x^U}{\partial I} x^U \).

Example 4.10: Applying the Slutsky equation to the Cobb-Douglas case

Consider the Cobb-Douglas utility function from example 4.1. After solving the UMP, we found that the demand for good x was \( x^U(p_x, p_y, I) = \frac{I}{2p_x} \). In that situation, \( \frac{\partial x^U}{\partial p_x} = -\frac{I}{2(p_x)^2} \), whereas \( \frac{\partial x^U}{\partial I} = \frac{1}{2p_x} \). Applying the Slutsky equation, we obtain

\[ \underbrace{-\frac{I}{2(p_x)^2}}_{\text{TE}} = \underbrace{\frac{\partial x^E}{\partial p_x}}_{\text{SE}} - \underbrace{\frac{1}{2p_x}\,\frac{I}{2p_x}}_{\text{IE}}, \]

thus implying that the substitution effect is \( \frac{\partial x^E}{\partial p_x} = -\frac{I}{4(p_x)^2} \). For instance, if \(p_x\) = $3 and I = $100, the substitution effect becomes \( \frac{\partial x^E}{\partial p_x} = -\frac{25}{9} \), the income effect is also \( -\frac{1}{2p_x}\frac{I}{2p_x} = -\frac{25}{9} \), and thus the two effects reinforce each other, ultimately producing a total effect of \( -\frac{I}{2(p_x)^2} = -\frac{50}{9} \approx -5.56 \) units. Intuitively, a marginal increase in the price of good x decreases the quantity demanded by 5.56 units, where half of this decrease can be attributed to the substitution effect alone (change in price ratio). The remaining half is explained by the smaller purchasing power that the individual experiences when facing a more expensive good (income effect).

Using Elasticities to Represent the Slutsky Equation

We can also represent the Slutsky equation in a more compact way by using elasticities. First, let us multiply the left and right sides by \( \frac{p_x}{x^U} \) to obtain

\[ \frac{\partial x^U}{\partial p_x}\frac{p_x}{x^U} = \frac{\partial x^E}{\partial p_x}\frac{p_x}{x^U} - \frac{\partial x^U}{\partial I}\, x^U \frac{p_x}{x^U}. \tag{4.4} \]

We now multiply the second term on the right side of equation (4.4) by \( \frac{I}{I} = 1 \), and note that \( x^U\big(p_x, p_y, e(p_x, p_y, u)\big) = x^E(p_x, p_y, u) \) in the first term on the right side. That is, equation (4.4) becomes

\[ \frac{\partial x^U}{\partial p_x}\frac{p_x}{x^U} = \frac{\partial x^E}{\partial p_x}\frac{p_x}{x^E} - \frac{\partial x^U}{\partial I}\, x^U \frac{p_x}{x^U}\frac{I}{I}, \tag{4.5} \]

where the left side coincides with our definition of price elasticity, \( \varepsilon_{x,p_x} = \frac{\partial x^U}{\partial p_x}\frac{p_x}{x^U} \); and the first term on the right side is \( \varepsilon^E_{x,p_x} = \frac{\partial x^E}{\partial p_x}\frac{p_x}{x^E} \), where \(x^E\) denotes the demand function we found after solving the EMP. Furthermore, the second term on the right side can be rearranged as follows:

\[ \frac{\partial x^U}{\partial I}\, x^U \frac{p_x}{x^U}\frac{I}{I} = \underbrace{\frac{\partial x^U}{\partial I}\frac{I}{x^U}}_{\varepsilon_{x,I}} \; \underbrace{\frac{p_x x^U}{I}}_{\theta_x}, \]

where \( \varepsilon_{x,I} = \frac{\partial x(p_x, p_y, I)}{\partial I}\frac{I}{x(p_x, p_y, I)} \) represents the income elasticity of demand, and \( \theta_x = \frac{p_x x^U}{I} \) denotes the budget share that the individual spends on good x. As a consequence, equation (4.5) can be rewritten as

\[ \varepsilon_{x,p_x} = \varepsilon^E_{x,p_x} - \theta_x\, \varepsilon_{x,I}. \tag{4.6} \]

From an applied perspective, this expression can be more attractive than the Slutsky equation, because we can often find estimates for elasticities \( \varepsilon_{x,p_x} \) and \( \varepsilon_{x,I} \), and for budget share \( \theta_x \), thus allowing us to infer elasticity \( \varepsilon^E_{x,p_x} \). To illustrate this point, consider two extreme examples. First, if you analyze the demand for garlic, you will likely find that the budget share of most consumers is negligible (i.e., \( \theta_x \approx 0 \)), implying that equation (4.6) reduces to \( \varepsilon_{x,p_x} \approx \varepsilon^E_{x,p_x} \). In other words, the income effect is close to zero, and hence, the substitution effect coincides with the total effect. Second, consider a good such as housing, with a much larger budget share (e.g., \( \theta_x = 0.3 \)). If we have estimates of its price elasticity being \( \varepsilon_{x,p_x} = -0.6 \), and its income elasticity being \( \varepsilon_{x,I} = 1.3 \), then we can find \( \varepsilon^E_{x,p_x} \) by using equation (4.6) as follows:

\[ -0.6 = \varepsilon^E_{x,p_x} - (0.3 \times 1.3), \]

which, solving for \( \varepsilon^E_{x,p_x} \), yields \( \varepsilon^E_{x,p_x} = -0.21 \).
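The housing calculation can be reproduced with a one-line rearrangement of equation (4.6). The snippet below is a minimal sketch; the function name `compensated_price_elasticity` is a hypothetical label introduced only for this illustration.

```python
def compensated_price_elasticity(price_elasticity, income_elasticity, budget_share):
    """Rearrange equation (4.6): eps_E = eps + theta * eps_I."""
    return price_elasticity + budget_share * income_elasticity

# Housing figures quoted in the text: eps = -0.6, eps_I = 1.3, theta = 0.3
housing = compensated_price_elasticity(-0.6, 1.3, 0.3)
print(round(housing, 2))  # -0.21
```

Setting `budget_share` to zero returns the uncompensated elasticity unchanged, which mirrors the garlic case discussed above.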
Intuitively, a 1 percent increase in the price of housing reduces demand by 0.6 percent if wealth is left unaffected. However, if the consumer receives additional wealth to guarantee that she can still reach the same utility level as before the price change, her demand for housing would be reduced by only 0.21 percent.

Exercises

1. Deriving Functions–I [A]. Consider an individual with a Cobb-Douglas utility function \( u(x_1, x_2) = x_1 x_2 \), facing an income I = 100 and prices \(p_1\) and \(p_2\) for goods 1 and 2, respectively.
(a) Find the demand function for each good.
(b) Assume that the price of both goods increases by 10 percent. Find the new demand functions for each good.
(c) Find the price-consumption curve of each good. Interpret.
(d) Find the Engel curve of each good. Interpret.

2. Calculating Effects–I [A]. Consider the scenario in exercise 1, but now assume specific prices \(p_1\) = $10 and \(p_2\) = $5. In this context, only the price of good 1 decreases to \(p_1\) = $4. Answer the following questions:
(a) Find the demand for each good, both before and after the price change. This represents the "total effect" of the price change.
(b) Identify which part of the total effect originates from the substitution and the income effects.

3. Quasilinear Utility–I [B]. Repeat your analysis in exercise 1, but assuming a quasilinear utility function \( u(x_1, x_2) = \ln x_1 + x_2 \).
(a) Find the demand function for each good.
(b) Assume that the price of both goods increases by 10 percent. Find the new demand functions for each good.
(c) Find the price-consumption curve of each good. Interpret.
(d) Find the Engel curve of each good. Interpret.

4. Calculating Effects–II [A]. Consider the situation in exercise 3, but now assume specific prices \(p_1\) = $10 and \(p_2\) = $5. In this context, only the price of good 1 decreases to \(p_1\) = $4. Answer the following questions:
(a) Find the demand for each good, both before and after the price change.
This represents the total effect of the price change.
(b) Identify which part of the total effect originates from the substitution and the income effects.

5. Decomposition Bundles [B]. Consider an individual with a utility function \( u(x, y) = x^2 y \), facing prices \(p_x\) = $2 and \(p_y\) = $4.
(a) Assuming that his income is I = $800, find the optimal consumption of goods x and y that maximizes his utility. That is, solve his UMP.
(b) Consider now that the price of good y decreases from \(p_y\) = $4 to \(p_y\) = $3. Find this consumer's new optimal consumption bundle. Then, identify the total effect of the price change, and decompose it into the substitution and income effects.
(c) Considering that the price of good y remains at \(p_y\) = $4, assume that the consumer seeks to reach the same utility level as in part (a). Find the optimal consumption of goods x and y that minimizes his expenditure. That is, solve his EMP.
(d) As in part (b), assume that the price of good y decreases from \(p_y\) = $4 to \(p_y\) = $3. Find this consumer's new optimal consumption bundle. Comparing your results from parts (c) and (d), argue that the total effect that we find when using the compensated demand (the result of the EMP) measures the substitution effect alone. Interpret.

6. Perfect Substitutes [B]. Peter's preferences for tea and coffee are given by \( u(x, y) = 2x + y \), where x denotes the units of tea and y the units of coffee. His income is $500, and the initial prices are \(p_x\) = $36 and \(p_y\) = $22.
(a) Find the utility-maximizing pair of tea and coffee. [Hint: Peter regards tea and coffee as perfectly substitutable, so you should anticipate that he consumes only one of the two goods.]
(b) Assume now that the price of tea increases, to \(p_x\) = $40. Is his consumption of tea and coffee affected by the price change? Your answer defines the total effect of the price change. Decompose it into substitution and income effects.
(c) What if the price of tea further increases to \(p_x\) = $83?
What is the total effect of the price change? What are the substitution and income effects?

7. Income Effects [A]. Peter informs us that his demand for housing decreases when his income decreases. Can we infer from that information that, after an increase in the price of housing, Peter's demand will decrease?

8. Quasilinear Utility–II [B]. Chelsea's utility function is \( u(x, y) = 3x + 4y^{1/2} \), her income is I = $220, and \(p_y\) = $1. The price of good x decreases from \(p_x\) = $3 to \(p_x\) = $2. Using the steps in example 4.9, find the substitution and income effects from this price change.

9. Linear Demand [A]. Suppose that the demand for cookies (good x) is \( x = 250 - 3p_x \), where \(p_x\) is the price of cookies.
(a) Calculate the price elasticity of demand.
(b) For what prices is the demand for cookies elastic?
(c) For what prices is the demand for cookies inelastic?

10. Point Elasticity [A]. Consider the market for football tickets. It faces the following supply and demand functions:

\[ q^S = -2 + 2p, \qquad q^D = 8 - 3p + 2I + p_B, \]

where p is the price of football tickets, I is average income in units of $10,000, and \(p_B\) is the price of basketball tickets.
(a) Let I = 4 and \(p_B\) = 2. Calculate the equilibrium price and quantity.
(b) Calculate the price elasticity of demand, income elasticity, and cross-price elasticity at the equilibrium price and quantity.

11. Income Elasticity–I [B]. Suppose that the demand for beef (good x) can be expressed as \( x = \frac{2I - I^2}{p_x} \), where I is the consumer's income, measured in units of $100,000.
(a) Calculate the income elasticity for beef.
(b) Provide an interpretation for the income elasticity for beef. For what values of I is beef a normal good?

12. Engel Curve [C]. Suppose that the demand for good x is \( x = 9 - \frac{(I - 3)^2}{p_x} \).
(a) Calculate the income elasticity for good x.
(b) Derive and plot the Engel curve for good x.
(c) For what income levels is good x normal? Label this range on your plot in part (b).

13. Perfect Complements [B]. Consider an individual with utility function \( u(x, y) = \min\{2x, 3y\} \), facing an income I = 250 and prices \(p_x\) and \(p_y\) for goods x and y, respectively.
(a) Find the demand function for each good.
(b) Calculate the price elasticity of demand and income elasticities for both goods. Interpret.
(c) Find the price-consumption curve of each good. Interpret.

14. Calculating Effects–III [A]. Consider the scenario in exercise 13, but now assume that the initial prices are \(p_x\) = $10 and \(p_y\) = $8. In this context, only the price of good y increases, to \(p_y\) = $12. Answer the following questions:
(a) Find the demand for each good, both before and after the price change. This represents the "total effect" of the price change.
(b) Identify which part of the total effect originates from the substitution and the income effects.

15. Deriving Functions–II [A]. Consider an individual with the Cobb-Douglas utility function \( u(x, y) = x^{0.4} y^{0.6} \), facing an income I and prices \(p_x\) and \(p_y\) for goods x and y, respectively.
(a) Find the demand function for each good.
(b) Find the price-consumption curve of each good. Interpret.
(c) Find the Engel curve of each good. Interpret.

16. Calculating Effects–IV [A]. Consider the situation in exercise 15, but now assume specific prices \(p_x\) = $3 and \(p_y\) = $2 and income of I = $100. In this context, only the price of good x decreases, to \(p_x\) = $2. Answer the following questions:
(a) Find the demand for each good, both before and after the price change. This represents the total effect of the price change.
(b) Find the decomposition bundle.
(c) Identify which part of the total effect originates from the substitution and the income effects.

17. Income Elasticity–II [A]. Suppose that the demand for good x can be expressed as \( x = \frac{1}{2 I p_x} \), where \(p_x\) is the price of good x and I is the consumer's income.
(a) Calculate and interpret the income elasticity for good x.
(b) Derive and plot the Engel curve for good x.

18. Magnitudes of SE and IE [A]. After calculating the effects of a price increase for good x, you find that the substitution effect in this situation is equal to SE = −8.
(a) Suppose that the income effect is equal to IE = −4. Calculate the total effect. How does the consumer regard good x with respect to their income?
(b) Suppose instead that the income effect is equal to IE = 3. Calculate the total effect. How does the consumer regard good x with respect to their income?
(c) Suppose now that the income effect is equal to IE = 12. Calculate the total effect. How does the consumer regard good x with respect to their income?

19. Giffen Good? [B]. Suppose that you work in a hardware store in a community that is expecting a major hurricane in the next few days. To ration your plywood, you start to increase its price, but you find that with each price increase, more people seem to purchase your plywood. You begin to suspect that your plywood may be a Giffen good.
(a) Provide an argument why plywood is a Giffen good under these circumstances.
(b) Provide an alternative explanation for the increase in plywood sales.

20. Cross-Price Elasticity [B]. Suppose that the demand for good x is \( x = \frac{I \sqrt{p_y}}{2 p_x} \), where \(p_x\) is the price of good x, \(p_y\) is the price of good y, and I is the consumer's income.
(a) Calculate and interpret the cross-price elasticity of good x with respect to good y.
(b) Suppose instead that the demand for good x is \( x = \frac{I}{2 p_x \sqrt{p_y}} \). Calculate and interpret the cross-price elasticity of good x with respect to good y.

5 Measuring Welfare Changes

5.1 Introduction

In this chapter, we evaluate the welfare gain that individuals enjoy when facing cheaper prices and, similarly, the welfare loss they suffer when facing more expensive goods. Interestingly, this can occur when demand changes, but also when a new sales tax is enacted that affects the selling price.
Our analysis helps explain consumer welfare losses that are a consequence of more expensive goods or more stringent taxes. Here, we discuss three measures of welfare change: (1) consumer surplus, (2) compensating variation, and (3) equivalent variation. We evaluate them in applied settings, and discuss contexts under which all three measures produce the same welfare change (i.e., the same number). This coincidence, however, does not necessarily happen in all situations, so we also examine applications where each welfare measure yields a different welfare change.

5.2 Consumer Surplus

As discussed in previous chapters, the demand curve identifies how many units of a good x an individual is willing to purchase at price \(p_x\) and income I. As such, the demand function can be interpreted as representing the maximum number of units that the individual consumes at each price \(p_x\) (see footnote 1). Alternatively, it can be understood as measuring, for a given number of units of good x, how many dollars the individual is willing to pay for these units. In short, the demand function represents the maximum willingness-to-pay for the good (see footnote 2). Hence, if we compare this maximum willingness-to-pay against the price that the consumer actually pays for the good, we find a measure of the utility gain that she makes when buying the good.

1. Graphically, for a given horizontal line corresponding to a price \(p_x\), the crossing point with the demand curve measures the maximum number of units she is willing to buy at that price \(p_x\).

2. Graphically, for a given vertical line corresponding to x units, the height of the demand function measures her willingness-to-pay.

Consumer surplus (CS): The area below the demand curve and above the price that consumers pay for the good.

We now present examples of how to find the CS in two situations: one with a linear demand, and another with a nonlinear demand.
Example 5.1: Finding CS with linear demand

Figure 5.1 depicts a demand curve \( p(q) = 10 - 2q \), and a market price of p = $4. The area below the demand curve and above the current price of p = $4 measures the CS, which is given by triangle A, with area

\[ CS = \frac{1}{2}(10 - 4)\,3 = 9, \]

where the height of the triangle is 10 − 4 = 6, as depicted on the vertical axis, whereas its base is given by the output level that solves \( 4 = 10 - 2q \), yielding q = 3 units. (Graphically, this is the output level for which the demand function reaches a height of exactly p = $4.) In addition, if the price were to fall to p = $3, output would increase to the level that solves \( 3 = 10 - 2q \), that is to say, q = 3.5 units. As a consequence, CS increases by the size of areas B and C in the graph.

[Figure 5.1: CS with linear demand. Vertical axis: price p; horizontal axis: quantity q. Demand curve p(q) = 10 − 2q with vertical intercept $10; area A above p = $4, areas B and C between p = $4 and p = $3, with quantities 3 and 3.5 marked.]

We then represent the increase in CS as

\[ \Delta CS = B + C = (4 - 3)\,3 + \frac{1}{2}(4 - 3)(3.5 - 3) = 3 + 0.25 = 3.25. \]

Therefore, the increase in CS is 3.25, which produces a new CS of 9 + 3.25 = 12.25.

Self-assessment 5.1: Repeat the analysis in example 5.1, but considering now a demand function \( p(q) = 11 - \frac{1}{3}q \). What is the change in CS when the price of good x decreases from $4 to $3?

Example 5.2: Finding CS with nonlinear demand

Consider the nonlinear demand in example 4.1, \( x = \frac{I}{2p_x} \), arising from a Cobb-Douglas utility function. If the consumer faces a price of \(p_x\) = $4 and an income level of I = $100, she purchases \( x = \frac{100}{2 \times 4} = 12.5 \) units. If the price decreases to \(p_x\) = $3, she increases her purchases to \( x = \frac{100}{2 \times 3} \approx 16.67 \) units, as depicted in figure 5.2. In this case, however, to find the gain in CS, we must use the integral of demand function \( x = \frac{100}{2p_x} \) between prices \(p_x\) = $4 and \(p_x\) = $3 because the demand function is not linear (i.e., it is not a straight line). In particular, the increase in consumer surplus is

\[ \Delta CS = \int_3^4 \frac{100}{2p_x}\,dp_x = 50 \int_3^4 \frac{1}{p_x}\,dp_x = 50\,[\ln p_x]_3^4 = 50\,[\ln 4 - \ln 3] \approx 14.38. \]
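The two CS calculations above can be cross-checked by integrating demand over the price interval numerically. The following is a minimal sketch using a midpoint rule; the helper name `cs_change` and the number of integration steps are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def cs_change(demand_fn, p_low, p_high, steps=100_000):
    """Approximate the integral of demand between two prices (midpoint rule)."""
    width = (p_high - p_low) / steps
    total = sum(demand_fn(p_low + (i + 0.5) * width) for i in range(steps))
    return total * width

# Example 5.1: inverse demand p = 10 - 2q, so quantity demanded is q = (10 - p) / 2
linear_gain = cs_change(lambda p: (10 - p) / 2, 3.0, 4.0)

# Example 5.2: demand x = I / (2 px) with I = 100
nonlinear_gain = cs_change(lambda p: 100 / (2 * p), 3.0, 4.0)
closed_form = 50 * math.log(4 / 3)

print(round(linear_gain, 2))                               # 3.25
print(round(nonlinear_gain, 2), round(closed_form, 2))     # 14.38 14.38
```

Integrating over prices (rather than quantities) is what makes the same helper work for both the linear and the nonlinear demand.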
What would happen if we tried to approximate the change in CS using the rectangle and triangle below the demand curve (as if it were linear)? In that case, we would find that the approximated increase in CS is

\[ \Delta CS(\text{approx.}) = \underbrace{[12.5 \times (4 - 3)]}_{\text{Area of rectangle B}} + \underbrace{\frac{1}{2}(16.67 - 12.5) \times (4 - 3)}_{\text{Area of triangle C}} \approx 14.58, \]

thus implying an overestimation of the true change in CS, because 14.58 > 14.38.

[Figure 5.2: Change in CS with nonlinear demand. Vertical axis: price p; horizontal axis: quantity q. Demand curve p(q), with the area ΔCS between prices 4 and 3.]

Self-assessment 5.2: Repeat the analysis in example 5.2, but assume a demand function \( x = \frac{3I}{2\sqrt{p_x}} \). Still assuming that the price of good x decreases from $4 to $3, what is the change in CS, \( \Delta CS \)?

5.3 Compensating Variation

In this section, we present the compensating variation (CV) as an alternative measure of welfare change. For simplicity, we consider that the price of good y is normalized to $1, which means that we divide all prices by \(p_y\). For instance, if prices are \(p_x\) = $4 and \(p_y\) = $3, we divide all prices by \(p_y\) = $3 to obtain the normalized prices \( p_x = \$\frac{4}{3} \) and \( p_y = \$\frac{3}{3} = \$1 \). An advantage of this normalization is that the vertical intercept of the budget line, \( y = \frac{I}{p_y} \), is now \( y = \frac{I}{p_y} = \frac{I}{1} = I \). We can hence interpret the vertical intercept of the budget line as the consumer's income, and make income comparisons by just looking at the height of this intercept. Without further ado, we present the CV, which measures the welfare change that an individual experiences from a price change.

Compensating variation (CV): How much money one needs to take away from (give to) a consumer after a price decrease (increase) such that she is as well off as before the price change.

[Figure 5.3: Finding the CV. Vertical axis: good y; horizontal axis: good x. Budget lines BL1, BLB, and BL2; bundles A, B, and C; indifference curves u1 and u2; vertical intercepts I and IB, with CV measured as the distance between them.]

Intuitively, the consumer is better off after a price decrease, as her budget line pivots upward, allowing her to achieve a higher utility level.
The CV then asks: How much do we need to reduce the consumer's income to make her as well off as she was before the price change? Graphically, we shift her new budget line downward in a parallel fashion, thus exhibiting the final price ratio, until the shifted budget line becomes tangent to the initial indifference curve that the consumer reached before the price change. This implies that she obtains the same utility level as before the decrease in prices. The opposite argument applies if prices increase, where the consumer now reaches a lower utility level than before the price change. Hence, the CV measures how much money we need to provide to the consumer to compensate her for the price increase. Graphically, that means shifting her budget line upward so she can reach the same indifference curve (i.e., utility level) as before the price change.

Regardless of whether we analyze a price decrease or increase, however, the CV focuses on final prices, and evaluates how much money we need to take away from the consumer in the case of a price decrease (give to the consumer in the case of a price increase) to guarantee that she can reach the same utility level as before the price change.

Figure 5.3 depicts the CV. At the initial prices, the consumer faces budget line \(BL_1\), purchases an optimal consumption bundle A, and reaches a utility level \(u_1\). As mentioned in the previous discussion about normalizing prices, the vertical intercept of \(BL_1\) measures the consumer's income I. When the price of good x decreases, her new budget line is \(BL_2\), but her income remains I (i.e., \(BL_1\) and \(BL_2\) have the same vertical intercept). At the new prices, the consumer chooses bundle C.
However, the CV asks: "How much money would we need to take from the consumer's income I to make her new budget line \(BL_2\) tangent to her initial indifference curve \(u_1\)?" To answer this question, we make a parallel shift of \(BL_2\) downward until we find another budget line, \(BL_B\), tangent to \(u_1\), where the consumer purchases bundle B (see footnote 3). The vertical intercept of budget line \(BL_B\), \(I_B\) in the graph, is the income that the consumer would need to purchase bundle B. Hence, the difference \( CV = I - I_B \) measures the CV, namely, the amount of money we need to subtract from the consumer's initial income I to make her as well off as before the price change.

Example 5.3: Finding the CV of a price decrease

Consider a consumer with the Cobb-Douglas utility function \( u(x, y) = xy \), an income of I = $100, and a normalized price of good y at \(p_y\) = $1. We first seek to find demand functions for goods x and y, which will help us in the analysis of the bundles at the initial and final prices. From the tangency condition, \( \frac{MU_x}{MU_y} = \frac{p_x}{p_y} \), we obtain \( \frac{y}{x} = \frac{p_x}{p_y} \), or after rearranging, \( y = p_x x \) because \(p_y\) = $1. Inserting this result into the budget line \( p_x x + p_y y = I \), which in this context becomes \( p_x x + y = 100 \) because \(p_y\) = $1, we find

\[ p_x x + \underbrace{p_x x}_{\text{because } y = p_x x} = 100 \;\Rightarrow\; 2 p_x x = 100. \]

Solving for good x, we obtain the demand for this good, \( x = \frac{100}{2p_x} = \frac{50}{p_x} \). We can then insert this result into the expression obtained from the tangency condition, \( y = p_x x \), to find the demand for good y (i.e., \( y = p_x \frac{50}{p_x} = 50 \) units). Let us consider now that the price of good x decreases from \(p_x\) = $3 to \(p_x\) = $2, while the price of good y remains fixed at \(p_y\) = $1.

1. Finding the initial bundle A. At the initial price \(p_x\) = $3, the demand for good x simplifies to \( x_A = \frac{50}{3} \approx 16.67 \) units.

2. Finding the final bundle C. At the final price \(p_x\) = $2, the demand for good x increases to \( x_C = \frac{50}{2} = 25 \) units.

3. Finding the decomposition bundle B.
At the decomposition bundle, we must ensure the following occurs:

a. The consumer must reach the same utility level as with the initial bundle A. Because we have found that bundle A = (16.67, 50), this bundle yields a utility level of \( u_A = \frac{50}{3}(50) = \frac{2500}{3} \approx 833.33 \). Therefore, the amount of goods x and y consumed at the decomposition bundle B, \( (x_B, y_B) \), must also yield a utility level of 833.33, which we can mathematically express as follows:

\[ (x_B)(y_B) = 833.33. \]

b. The consumer's indifference curve must be tangent to the budget line (i.e., \( \frac{MU_x}{MU_y} = \frac{p_x}{p_y} \)), which in this context entails \( y = p_x x \). Because \(p_x\) = 2, the tangency condition can be written as \( y = 2x \). Substituting this condition in the above equation, \( (x_B)(y_B) = 833.33 \), we find

\[ u_B = (x_B)(y_B) = (x_B)\underbrace{(2x_B)}_{y_B = 2x_B} = 833.33 \]

or, after rearranging, \( 2(x_B)^2 = 833.33 \), which further simplifies to \( (x_B)^2 = 416.67 \). Applying square roots on both sides, we find that \( x_B \approx 20.41 \). We can now insert this result into the tangency condition, \( y = 2x \), to find the amount of good y that the consumer has at bundle B, obtaining \( y_B = 2 \times 20.41 = 40.82 \) units.

3. This is, of course, the decomposition bundle B that we found in our analysis of the substitution and income effects in chapter 4. In that analysis, we also shifted the final budget line downward until the consumer reached the same utility level as before the price change.

4. Evaluating the CV. The CV is given by \( CV = I - I_B \), where I = $100 is the consumer's income and \(I_B\) represents the income that the individual needs to purchase the decomposition bundle B = (20.41, 40.82) found in point 3 at the final prices (i.e., \(p_x\) = $2 and \(p_y\) = $1). Specifically,

\[ I_B = (\$2 \times 20.41) + (\$1 \times 40.82) = \$81.64. \]

Thus, the CV is

\[ CV = I - I_B = \$100 - \$81.64 = \$18.36. \]

Expressed in words, if, after experiencing the price decrease, we reduce the consumer's income by $18.36, her utility level coincides with that before the price decrease.
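The four steps of example 5.3 can be condensed into a short script. This is a sketch under the example's own assumptions (utility u(x, y) = xy, so half of income is spent on each good); the function names are hypothetical. Because the script carries full precision rather than two-decimal intermediate values, it returns a CV of about 18.35 instead of the 18.36 obtained above with rounded inputs.

```python
import math

def demands(px, py, income):
    """UMP demands for u(x, y) = x*y: half of income is spent on each good."""
    return income / (2 * px), income / (2 * py)

def compensating_variation(px_old, px_new, py, income):
    # Steps 1 and 3a: utility reached at the initial bundle A
    x_a, y_a = demands(px_old, py, income)
    u_a = x_a * y_a
    # Step 3b: tangency y = (px/py) x at the final prices, combined with x*y = u_a
    x_b = math.sqrt(u_a * py / px_new)
    y_b = (px_new / py) * x_b
    # Step 4: income needed to buy bundle B at the final prices
    income_b = px_new * x_b + py * y_b
    return income - income_b, (x_b, y_b)

cv, bundle_b = compensating_variation(px_old=3.0, px_new=2.0, py=1.0, income=100.0)
print(round(cv, 2), [round(v, 2) for v in bundle_b])  # 18.35 [20.41, 40.82]
```

The same function can be reused for self-assessment-style variations by changing the prices and income passed to it.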
Self-assessment 5.3: Repeat the analysis in example 5.3, assuming the same utility function and \(p_y\) = $1, but consider that income is I = $125 and the price of good x decreases from \(p_x\) = $2 to \(p_x\) = $1.

5.4 Equivalent Variation

The equivalent variation (EV) focuses on the day before the price change, as opposed to the CV, which focuses on the day after the price change. Similar to the CV, the EV is defined as follows.

Equivalent variation (EV): How much money one needs to give to (take away from) a consumer before a price decrease (increase) such that she is as well off as after the price change.

Using a similar interpretation as for the CV, note that a price decrease will make the consumer better off. Hence, the EV asks: How much money do we need to offer the consumer today (before she enjoys the price decrease) to make her as well off as after the price decrease? Graphically, a price decrease pivots the consumer's budget line outward, leading her to reach a higher utility level. If we provide the consumer with more income today, her initial budget line shifts outward until it becomes tangent to the indifference curve of her final utility level. The opposite argument applies if she experiences a price increase, which pivots her initial budget line inward, driving her to achieve a lower utility level. In this case, the EV would measure how much money we need to take away from a consumer today (before she suffers the price increase) to make her as badly off as she will be once she suffers such an increase in prices.

Figure 5.4 depicts the EV when the consumer enjoys a price decrease. First, the individual faces a budget line \(BL_1\) and purchases bundle A, reaching utility level \(u_1\). Second, the price of good x decreases, pivoting the individual's budget line from \(BL_1\) to \(BL_2\), which leads her to purchase bundle C and reach a higher utility level \(u_2\).
The EV focuses on the "before-the-price-change" scenario, and asks how much money we need to give to the consumer (on top of her initial income I) to make her as well off as she will be once she enjoys such a price decrease. Graphically, we then need to make a parallel shift of her initial budget line \(BL_1\) outward until it becomes tangent to her final indifference curve, and thus reaches utility level \(u_2\). As depicted in the figure, this occurs at bundle E, which entails a budget line \(BL_E\). Because the vertical intercept of the budget lines indicates the individual's income along all points of that line, \(I_E\) reflects the total income that the consumer needs to reach utility level \(u_2\) at the initial prices (see footnote 4). Hence, the additional income that we need to give the consumer to reach utility level \(u_2\) is \( EV = I_E - I \), as measured by the height between points \(I_E\) and I on the vertical axis of the figure.

4. We say "at the initial prices" because budget lines \(BL_1\) and \(BL_E\) are parallel, thus indicating that they both face the initial price ratio.

[Figure 5.4: Finding the EV. Vertical axis: good y; horizontal axis: good x. Budget lines BL1, BLE, and BL2; bundles A, E, and C; indifference curves u1 and u2; vertical intercepts IE and I, with EV measured as the distance between them.]

Example 5.4: Finding the EV of a price decrease

Following the scenario in example 4.8, consider a consumer with the Cobb-Douglas utility function \( u(x, y) = xy \), income I = $100, and a price for good y of \(p_y\) = $1. The price of good x decreases from \(p_x\) = $3 to \(p_x\) = $2. From example 4.8, we know that the initial bundle is \( A = \left(\frac{50}{3}, 50\right) \), the final bundle is C = (25, 50), and the decomposition bundle is B = (20.4, 40.8). The EV of this price decrease is given by \( EV = I_E - I \), where I = $100 denotes the individual's income and \(I_E\) represents the income she needs to purchase bundle E. But where is bundle E? To find this bundle, recall the conditions we discussed in figure 5.4:

1. Bundle E must reach the same utility level as the final bundle C.
Because C = (25, 50), its utility level is \( u_C = 25 \times 50 = 1{,}250 \), implying that bundle \( E = (x_E, y_E) \) must also yield this utility level; that is, \( x_E y_E = 1{,}250 \).

2. Bundle E must be a tangency point; that is, the tangency condition \( \frac{MU_x}{MU_y} = \frac{p_x}{p_y} \) must hold, which in this case entails \( \frac{y}{x} = \frac{3}{1} \) (recall that the slopes of \(BL_1\) and \(BL_E\) coincide, as depicted in figure 5.4). This tangency condition simplifies to \( y = 3x \). Plugging this result into \( x_E y_E = 1{,}250 \), we obtain

\[ x_E (3x_E) = 1{,}250, \]

which collapses to \( 3(x_E)^2 = 1{,}250 \), or \( (x_E)^2 = \frac{1{,}250}{3} \approx 416.67 \). Applying square roots on both sides, we find \( x_E = \sqrt{416.67} \approx 20.41 \) units. We can then use the tangency condition, \( y = 3x \), to find the amount of good y that the individual consumes in bundle E (i.e., \( y_E = 3 \times 20.41 \approx 61.24 \) units).

Therefore, the income that the individual spends to purchase bundle E = (20.41, 61.24) at the initial prices (\(p_x\) = $3 and \(p_y\) = $1) is

\[ I_E = (\$3 \times 20.41) + (\$1 \times 61.24) \approx \$122.47, \]

implying that the EV is

\[ EV = I_E - I = \$122.47 - \$100 = \$22.47. \]

Expressed in words, if, before enjoying the price decrease, we increase the consumer's income by $22.47, we help her reach the same utility level that she will enjoy after the price decrease.

Self-assessment 5.4: Repeat the analysis in example 5.4, assuming the same utility function and \(p_y\) = $1, but consider that income is I = $125, and the price of good x decreases from \(p_x\) = $2 to \(p_x\) = $1.

5.5 Measuring Welfare Changes with No Income Effects

The previous discussion considered three approaches to measure the welfare change that consumers experience after a price change: (i) the change in CS, (ii) the CV, and (iii) the EV. While these welfare measures generally differ (as illustrated in examples 5.2–5.4 for the Cobb-Douglas utility function), they produce exactly the same number if income effects are absent. As we know from chapter 4, income effects are zero when, for instance, the consumer has quasilinear preferences.
This type of preference, as example 5.5 shows, produces the triple coincidence \( \Delta CS = CV = EV \). Appendix A at the end of chapter 4 demonstrated that income effects are negligible when the budget share that the consumer spends on the good we analyze (relative to her entire income) is negligible and/or when the income elasticity of the good is small. In these cases, we can conclude that the three measures of welfare change, \( \Delta CS \), CV, and EV, will approximately coincide.

Example 5.5: CS, CV, and EV with a quasilinear utility function

Consider a consumer with a quasilinear utility function \( u(x, y) = 2\sqrt{x} + y \), an income level of I = $100, and a price of good y at \(p_y\) = $1. We start by finding the demand function for goods x and y. In this context, the tangency condition \( \frac{MU_x}{MU_y} = \frac{p_x}{p_y} \) becomes

\[ \frac{1/\sqrt{x}}{1} = \frac{p_x}{1}, \]

which simplifies to \( x = \frac{1}{p_x^2} \), providing us with the demand function for good x (see footnote 5). We can now find the demand function for good y. The budget line \( p_x x + p_y y = I \) becomes \( p_x x + y = 100 \) in this example. Inserting the previous result \( x = \frac{1}{p_x^2} \) into the budget line, we obtain

\[ p_x \underbrace{\frac{1}{p_x^2}}_{\text{Demand for } x} + y = 100, \]

which simplifies to \( \frac{1}{p_x} + y = 100 \), ultimately yielding the demand function for good y, \( y = 100 - \frac{1}{p_x} \). Consider now that the price of good x decreases from $4 to $3, and let us find the increase in consumer welfare measured through the three tools learned in this chapter: the CS, CV, and EV discussed next.

Finding the CS. To obtain the welfare change by using the CS, we simply need to integrate the demand curve of good x between $3 and $4 as follows:

\[ \Delta CS = \int_3^4 \frac{1}{p_x^2}\,dp_x = \left[-\frac{1}{p_x}\right]_3^4 = -\frac{1}{4} - \left(-\frac{1}{3}\right) = \frac{1}{12} \approx 0.08. \]

Finding the CV. We now find the change in consumer welfare measured through the \( CV = I - I_B \). Let's start by finding the income that the consumer needs to purchase bundle B, \(I_B\). To do that, we first obtain bundles A, C, and B, as follows:

1. Finding the initial bundle A.
At the initial price \(p_x = \$4\), the demands for goods x and y simplify to \(x_A = \frac{1}{4^2} = \frac{1}{16}\) and \(y_A = 100 - \frac{1}{4} = \frac{399}{4}\), and thus bundle A is \(A = \left(\frac{1}{16}, \frac{399}{4}\right)\).

⁵ As discussed in chapter 3, when dealing with quasilinear utility functions, the tangency condition provides the expression of the demand function for good x without the need to insert the results into the budget line. The budget line only plays a role in determining the demand for good y, which is given by the income left after purchasing good x.

2. Finding the final bundle C. At the final price \(p_x' = \$3\), the demands for goods x and y change to \(x_C = \frac{1}{3^2} = \frac{1}{9}\) and \(y_C = 100 - \frac{1}{3} = \frac{299}{3}\), implying that \(C = \left(\frac{1}{9}, \frac{299}{3}\right)\).

3. Finding the decomposition bundle B. At the decomposition bundle, the following must occur:

a. The consumer must reach the same utility level as with the initial bundle A. Because we found that bundle A is \(A = \left(\frac{1}{16}, \frac{399}{4}\right)\), this bundle yields a utility level of

\[ u_A = 2\sqrt{\frac{1}{16}} + \frac{399}{4} = 100.25. \]

Therefore, the decomposition bundle B, \((x_B, y_B)\), must also yield a utility level of 100.25, which mathematically can be expressed as follows:

\[ u_B = 2\sqrt{x_B} + y_B = 100.25. \tag{5.1} \]

b. The consumer's indifference curve must be tangent to the budget line at the final prices, \(\frac{MU_x}{MU_y} = \frac{p_x'}{p_y}\), which in this example means

\[ \frac{1/\sqrt{x_B}}{1} = \frac{3}{1}, \]

or \(\frac{1}{\sqrt{x_B}} = 3\). Squaring both sides, we obtain \(\frac{1}{x_B} = 9\), or \(x_B = \frac{1}{9} \approx 0.11\). Substituting this result in equation (5.1), \(2\sqrt{x_B} + y_B = 100.25\), gives us

\[ 2\sqrt{\frac{1}{9}} + y_B = 100.25, \]

which simplifies to \(\frac{2}{3} + y_B = 100.25\), ultimately yielding \(y_B \approx 99.58\) units. Therefore, the income that the consumer needs to purchase the decomposition bundle B = (0.11, 99.58), carrying the unrounded values \(x_B = \frac{1}{9}\) and \(y_B \approx 99.5833\), is

\[ I_B = 3\left(\tfrac{1}{9}\right) + 1(99.5833) \approx \$99.92. \]

4. Evaluating the CV. The CV is then given by

\[ CV = I - I_B = 100 - 99.9167 \approx 0.08, \]

which coincides with the CS we found previously because the consumer exhibits a quasilinear utility function.

Finding the EV.
We now find the change in consumer welfare measured through \(EV = I_E - I\). We start by finding the income that the consumer needs to purchase bundle E, \(I_E\).

1. Bundle E must reach the same utility level as the final bundle C. In our search for the CV, we already found that bundle \(C = \left(\frac{1}{9}, \frac{299}{3}\right)\), which yields a utility level of

\[ u_C = 2\sqrt{\frac{1}{9}} + \frac{299}{3} = 100.33. \]

Therefore, bundle \(E = (x_E, y_E)\) must also yield a utility level of 100.33, which mathematically can be written as

\[ u_E = 2\sqrt{x_E} + y_E = 100.33. \tag{5.2} \]

2. The consumer's indifference curve must be tangent to the budget line at the initial prices, \(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\), which in this example means

\[ \frac{1/\sqrt{x_E}}{1} = \frac{4}{1}, \]

or \(\frac{1}{\sqrt{x_E}} = 4\). Squaring both sides, and rearranging, we obtain \(x_E = \frac{1}{16} = 0.0625\). Thus, substituting this result in equation (5.2), \(2\sqrt{x_E} + y_E = 100.33\), gives us

\[ 2\sqrt{\frac{1}{16}} + y_E = 100.33, \]

which simplifies to \(\frac{2}{4} + y_E = 100.33\), ultimately yielding \(y_E = 99.83\) units. Thus, the income that the consumer needs to purchase bundle E = (0.0625, 99.83) is

\[ I_E = 4(0.0625) + 1(99.83) = 100.08. \]

3. Evaluating the EV. The EV is then given by

\[ EV = I_E - I = 100.08 - 100 = 0.08. \]

This result coincides with those we found for the CS and CV previously because the consumer exhibits a quasilinear utility function.

Self-assessment 5.5 Repeat the analysis in example 5.5, assuming the same utility function and \(p_y = \$1\), but consider that income is I = $125, and that the price of good x decreases from \(p_x = \$2\) to \(p_x' = \$1\).

Appendix. An Alternative Representation of the Compensating and Equivalent Variations

We next present an alternative approach to measuring the CV and EV, which uses the expenditure function, found in chapter 3, where we study the consumer's expenditure minimization problem (EMP).

A.1 Compensating Variation

Consider an individual facing prices \(p_x\) and \(p_y\), and seeking to reach a utility target of u in her EMP.
From the discussion in appendix B in chapter 3, we know that she would set the tangency condition \(MRS_{x,y} = \frac{p_x}{p_y}\), and then insert the result into her constraint of reaching utility target u (i.e., \(u(x, y) = u\)). Following this procedure, the individual obtains a demand for good x of \(x^E(p_x, p_y, u)\), and a demand for good y of \(y^E(p_x, p_y, u)\), where superscript E denotes that we obtained this expression after solving the EMP. Intuitively, these demands help the individual minimize her expenditure while reaching utility target u. We can then find the cost of buying these demands, as follows:

\[ e(p_x, p_y, u) = p_x\, x^E(p_x, p_y, u) + p_y\, y^E(p_x, p_y, u), \tag{5.3} \]

which we refer to as the "expenditure function," because it represents the minimal expenditure that the individual needs to incur to reach utility level u at current prices.

We can then repeat the process when the price of good x decreases from \(p_x\) to \(p_x'\), and the individual can thus reach a higher utility \(u'\), where \(u' > u\). In this setting, we would obtain demands \(x^E(p_x', p_y, u')\) and \(y^E(p_x', p_y, u')\), and an expenditure function of

\[ e(p_x', p_y, u') = p_x'\, x^E(p_x', p_y, u') + p_y\, y^E(p_x', p_y, u'). \]

We now repeat our analysis, decreasing prices from \(p_x\) to \(p_x'\), but we require that the individual reaches the same utility level as before the price change, u. In this context, we would find demands \(x^E(p_x', p_y, u)\) and \(y^E(p_x', p_y, u)\), and an expenditure function of

\[ e(p_x', p_y, u) = p_x'\, x^E(p_x', p_y, u) + p_y\, y^E(p_x', p_y, u). \]

We can now use the monetary amounts found in \(e(p_x', p_y, u')\) and \(e(p_x', p_y, u)\) to express the CV. In particular, recall that the CV takes an "after-the-price-change" perspective, thus implying that we must focus on the final price, \(p_x'\). In addition, recall that the CV measures the amount of money the consumer is willing to give up after the price decrease (after her utility level improves from u to \(u'\)) to be just as well off as before the price decrease (where she only reached utility level u).
Formally, the CV can then be written as

\[ CV = e(p_x', p_y, u') - e(p_x', p_y, u). \]

While the equation given here is a convenient expression of the CV, we can alternatively write it using \(x^E(p_x, p_y, u)\) alone. In particular, note that the individual's expenditure must satisfy \(e(p_x, p_y, u) = I\) when the utility target u coincides with the maximal utility that the individual reaches when solving her UMP. Similarly, \(e(p_x', p_y, u') = I\), which allows us to rewrite the CV as

\[ CV = I - e(p_x', p_y, u), \]

or, using \(e(p_x, p_y, u) = I\), we can express the CV as

\[ CV = e(p_x, p_y, u) - e(p_x', p_y, u). \]

The CV then represents the decrease in the consumer's minimal expenditure of reaching utility target u when prices decrease from \(p_x\) to \(p_x'\). Hence, the CV can be rewritten as the change in \(e(p_x, p_y, u)\), obtained by integrating its derivative with respect to \(p_x\) over the price decrease from \(p_x\) to \(p_x'\), as follows:

\[ CV = \int_{p_x'}^{p_x} \frac{\partial e(p_x, p_y, u)}{\partial p_x}\, dp_x. \]

Lastly, to simplify this expression, recall that from the previous description of \(e(p_x, p_y, u)\) in equation (5.3), we know that its derivative with respect to \(p_x\) is \(\frac{\partial e(p_x, p_y, u)}{\partial p_x} = x^E(p_x, p_y, u)\), which reduces the CV to

\[ CV = \int_{p_x'}^{p_x} x^E(p_x, p_y, u)\, dp_x. \]

Graphically, the CV then becomes the area below the demand curve for good x that we found from the EMP, \(x^E(p_x, p_y, u)\), between prices \(p_x'\) and \(p_x\).

Example 5.6: An alternative representation of CV

Consider an individual with a Cobb-Douglas utility function \(u(x, y) = xy\). We can solve the EMP to show that the demand for good x is \(x^E(p_x, p_y, u) = \sqrt{u\,\frac{p_y}{p_x}}\). (You can take this opportunity to practice with the EMP, showing that you can find the same demand function.)⁶ Consider that

⁶ Recall that to solve the EMP, we need to satisfy two conditions. The first is the tangency condition \(\frac{MU_x}{MU_y} = \frac{p_x}{p_y}\), which reduces to \(\frac{y}{x} = \frac{p_x}{p_y}\) in this example, where \(u(x, y) = xy\). Solving for good y yields \(y = x\,\frac{p_x}{p_y}\).
Second, from the utility target condition, we know that the consumer must reach a utility level u so that \(xy = u\). Inserting the expression obtained from the tangency condition, \(y = x\,\frac{p_x}{p_y}\), yields \(x \left(x\,\frac{p_x}{p_y}\right) = u\), which simplifies to \(x^2 = u\,\frac{p_y}{p_x}\). Taking the square root of both sides, we obtain the demand for good x, \(x^E(p_x, p_y, u) = \sqrt{u\,\frac{p_y}{p_x}}\). Intuitively, the consumer's demand for good x increases in the utility that she seeks to reach, u (as she needs more units of x to reach this utility), and in the price of good y (i.e., as this good becomes more expensive, the consumer demands more units of good x because it became cheaper in relative terms), but decreases in the price of good x.

the price of good x decreases from \(p_x = \$3\) to \(p_x' = \$2\), the price of good y is held constant at \(p_y = \$1\), and the consumer seeks to reach a utility target of \(u = xy = \frac{50}{3} \times 50 = 833.33\). This is the utility level that the consumer reaches at bundle \(A = \left(\frac{50}{3}, 50\right)\); see example 5.4 for more details. In this case, the demand function we found from the EMP, \(x^E(p_x, p_y, u) = \sqrt{u\,\frac{p_y}{p_x}}\), simplifies to \(\sqrt{833.33\,\frac{1}{p_x}} = 28.87\,\frac{1}{\sqrt{p_x}}\). Therefore, the CV becomes the integral of the demand function \(28.87\,\frac{1}{\sqrt{p_x}}\) between prices \(p_x' = \$2\) and \(p_x = \$3\); that is,

\[ CV = \int_{p_x'}^{p_x} x^E(p_x, p_y, u)\, dp_x = \int_{2}^{3} 28.87\,\frac{1}{\sqrt{p_x}}\, dp_x = 28.87 \left[ 2\sqrt{p_x} \right]_{2}^{3} = 28.87 \times 2\left(\sqrt{3} - \sqrt{2}\right) \approx \$18.35. \]

Due to approximations while solving, there is a small difference ($0.01) between the CV in this example and that in example 5.3.

Self-assessment 5.6 Repeat the analysis in example 5.6, assuming the same utility function and \(p_y = \$1\), but consider that the price of good x decreases from \(p_x = \$2\) to \(p_x' = \$0.5\).

A.2 Equivalent Variation

We can follow a similar approach to write the EV using the expenditure function. In particular, recall that the EV takes a "before-the-price-change" perspective, thus implying that we must focus on the initial price, \(p_x\).
Hence, the EV can be expressed as

\[ EV = e(p_x, p_y, u') - e(p_x, p_y, u), \]

which measures the amount of money that the consumer needs to receive before the price decrease (when her utility level is still u) to be just as well off as after the price decrease (when she reaches a higher utility level, \(u'\)).

We can also follow a similar approach as in the CV to obtain yet one more expression for the EV. First, note that the consumer's minimal expenditure \(e(p_x, p_y, u)\) satisfies \(e(p_x, p_y, u) = I\), which helps us to rewrite the above EV as

\[ EV = e(p_x, p_y, u') - I; \]

and because \(e(p_x', p_y, u') = I\) holds as well, we can express the EV as

\[ EV = e(p_x, p_y, u') - e(p_x', p_y, u'). \]

Intuitively, the EV measures the change in the consumer's minimal expenditure of reaching utility level \(u'\) when the price of good x decreases from \(p_x\) to \(p_x'\). Therefore, the EV can be rewritten as the change in \(e(p_x, p_y, u')\), obtained by integrating its derivative with respect to \(p_x\) over the price decrease from \(p_x\) to \(p_x'\), as follows:

\[ EV = \int_{p_x'}^{p_x} \frac{\partial e(p_x, p_y, u')}{\partial p_x}\, dp_x. \]

Finally, to simplify this expression, recall that the derivative of \(e(p_x, p_y, u')\) with respect to \(p_x\) is \(\frac{\partial e(p_x, p_y, u')}{\partial p_x} = x^E(p_x, p_y, u')\), which helps us reduce the EV to

\[ EV = \int_{p_x'}^{p_x} x^E(p_x, p_y, u')\, dp_x. \]

Graphically, the EV is the area below the demand curve for good x that we found from the EMP, \(x^E(p_x, p_y, u')\), between prices \(p_x'\) and \(p_x\). Comparing this expression with that of the CV, they are symmetric except for the fact that the EV is evaluated at utility level \(u'\) whereas the CV is evaluated at utility level u.

Example 5.7: An alternative representation of EV

Following example 5.6, consider an individual with demand for good x as \(x^E(p_x, p_y, u) = \sqrt{u\,\frac{p_y}{p_x}}\). As in that example, consider that the price of good x decreases from \(p_x = \$3\) to \(p_x' = \$2\).
The utility level that the consumer reaches at final bundle C (after the price change) is \(u' = 1{,}250\) (revisit example 5.4 for more details). Inserting \(u' = 1{,}250\) and price \(p_y = \$1\) into the demand function yields \(\sqrt{1{,}250\,\frac{1}{p_x}} = \frac{25\sqrt{2}}{\sqrt{p_x}}\). Therefore, the EV becomes

\[ EV = \int_{p_x'}^{p_x} x^E(p_x, p_y, u')\, dp_x = \int_{2}^{3} \frac{25\sqrt{2}}{\sqrt{p_x}}\, dp_x = 25\sqrt{2} \left[ 2\sqrt{p_x} \right]_{2}^{3} = 50\sqrt{2}\left(\sqrt{3} - \sqrt{2}\right) \approx \$22.47, \tag{5.4} \]

which coincides (up to $0.04, owing to rounding) with the answer obtained in example 5.4 using an alternative approach.

Self-assessment 5.7 Repeat the analysis in example 5.7, assuming the same utility function and \(p_y = \$1\), but consider that the price of good x decreases from \(p_x = \$2\) to \(p_x' = \$0.5\).

Exercises

1. Changes in CS.\(^{A}\) Patricia wants to measure the change in CS when the market price for doughnuts increases to $14.5 (box of a dozen doughnuts). She needs your help! Consider that the demand for doughnuts is \(p(q) = 15 - \frac{1}{2}q\), and the supply is \(p = 7q\).

2. CS and a tax.\(^{A}\) We can also use CS to measure the impact of taxes and subsidies. Assume that the inverse demand for a pack of cigarettes is \(p(q) = 25 - 5q\).
(a) Find the CS when the market price is p = $4.00.
(b) Find the change in CS if a $0.50 tax is added to the price of a pack of cigarettes.

3. CS and changing prices.\(^{A}\) Every spring, John goes to a local co-op to buy seeds to plant in his field. He has been keeping track of prices of seeds and the number of tons of seeds that he buys each year and estimates his demand curve to be \(p(q) = 300 - 10q\). Last year, John paid $150 per ton of seeds. This year, he noticed that the price went down to $100. Unfortunately, John didn't take any economics courses in college, so he doesn't know how to quantify his welfare improvement. Help John find his CS from this price decrease.

4. CS and nonlinear demand.\(^{B}\) Assume that a consumer has a demand for good x of \(x = \frac{50}{2\sqrt{p_x}}\). What is the change in CS if the price of x increases from \(p_x\) to \(p_x'\)? For simplicity, you can assume that \(p_x = \$1\).

5.
CS and nonlinear demand-I.\(^{A}\) Jean has a Cobb-Douglas utility function that yields a demand for jeans of \(j = \frac{I}{2p_j}\). Jean has an income of $100, and jeans have a price of $25.
(a) What is the change in CS if the price of jeans increases from $25 to $30?
(b) What is the change in CS if the price of jeans decreases from $25 to $20?

6. CS and nonlinear demand-II.\(^{A}\) A different variation of the Cobb-Douglas utility function (\(u = x^{0.2} y^{0.8}\)) will yield a demand for x of \(x = \frac{0.2I}{p_x}\).
(a) If I = 100 and \(p_x\) changes from $1 to $2, what is the change in CS?
(b) If I = 100 and \(p_x\) changes from $5 to $6, what is the change in CS? How does this differ from (a)?
(c) If I = 200 and \(p_x\) changes from $1 to $2, what is the change in CS? How does this differ from (a)?

7. Calculating CV.\(^{B}\) Redo the analysis from example 5.3, but now assume that \(p_y\) changes from \(p_y = \$1\) to \(p_y' = \$2\), while \(p_x = \$1\).

8. CV with different income.\(^{B}\) Redo the analysis from example 5.3, but now assume that I = $200. How does the increase in income affect CV?

9. CV with Cobb-Douglas utility function.\(^{B}\) Chris has a demand for books (b) and other goods (y) that follows the Cobb-Douglas utility function \(u(b, y) = y\sqrt{b}\), and an income of I = $50. Find Chris's CV if the price of books decreases from \(p_b = \$2\) to \(p_b' = \$1\).

10. CV and EV.\(^{A}\) In words, describe the difference between the CV and EV.

11. CV with general price change.\(^{C}\) Let's investigate the impact of a generic change in price. Consider a consumer with the Cobb-Douglas utility \(u(x, y) = xy\), an income of I = 100, and a normalized price of good y at \(p_y = \$1\). What is the CV of a change in the price of x from \(p_x\) to \(p_x'\)? For simplicity, you can assume that \(p_x = 1\). Dividing both prices by \(p_x\), we can more compactly express the initial price as \(\frac{p_x}{p_x} = \$1\), and the final price as \(\frac{p_x'}{p_x} = p\). Intuitively, when p > 1, we have that \(p_x' > p_x\), so good x becomes more expensive, and when 0 < p < 1, we have that \(p_x' < p_x\) and good x is cheaper.

12.
EV with Cobb-Douglas demand–I.\(^{B}\) Again, consider Chris's demand for books (b) and other goods (y) that follows the Cobb-Douglas utility function \(u(b, y) = y\sqrt{b}\), and an income of I = $50. Find Chris's EV if the price of books decreases from \(p_b = \$2\) to \(p_b' = \$1\).

13. EV with Cobb-Douglas demand–II.\(^{B}\) Consider a consumer with the Cobb-Douglas utility function \(u(x, y) = x\sqrt{y}\), with income I = $100, and the price of good y normalized at \(p_y = \$1\). Calculate the EV of the change in price of good x from \(p_x = \$5\) to \(p_x' = \$10\).

14. EV with general price change.\(^{C}\) Repeat the analysis from example 5.4, but assume a generic increase in the price of good x from \(p_x\) to \(p_x'\). For simplicity, you can assume that \(p_x = 1\). Dividing both prices by \(p_x\), we can more compactly express the initial price as \(\frac{p_x}{p_x} = \$1\), and the final price as \(\frac{p_x'}{p_x} = p\). Intuitively, when p > 1, we have that \(p_x' > p_x\), so good x becomes more expensive, and when 0 < p < 1, we have that \(p_x' < p_x\) and good x is cheaper.

15. EV with different income.\(^{A}\) Redo the analysis from example 5.4, but now assume that I = $200. How does the increase in income affect EV?

16. CV and EV with quasilinear utility.\(^{B}\) Samantha often consumes two goods during exam times at school to relax, chocolate (c) and music (m). Her utility from consuming these two goods is represented by the following quasilinear utility function, \(u(c, m) = c + 2m^{1/3}\). Her income level during exam week is I = $120, and the price of a bar of chocolate is \(p_c = \$4\). Identify the CV and EV when the price for downloading music increases from \(p_m = \$2\) to \(p_m' = \$3\).

17. CS, CV, and EV with no income effects–I.\(^{B}\) Repeat the analysis from example 5.5, but now assume a generic increase in the price of good x, from \(p_x\) to \(p_x'\). For simplicity, you can assume that \(p_x = 1\). Dividing both prices by \(p_x\), we can more compactly express the initial price as \(\frac{p_x}{p_x} = \$1\), and the final price as \(\frac{p_x'}{p_x} = p\).
Intuitively, when p > 1, we have that \(p_x' > p_x\), so good x becomes more expensive, and when 0 < p < 1, we have that \(p_x' < p_x\) and good x is cheaper.

18. CS, CV, and EV with no income effects–II.\(^{C}\) Explain the intuition behind section 5.5 of this chapter. That is, why do CS, CV, and EV coincide when there is no income effect?

19. Alternative representation of CV.\(^{B}\) Consider a consumer with utility \(u(x, y) = x^{0.75} y^{0.25}\).
(a) Find the demand for goods x and y by solving the consumer's EMP.
(b) Calculate the CV for a price increase from \(p_x = \$1\) to \(p_x' = \$2\), where u = 10 and \(p_y = \$1\).
(c) Calculate the CV of the price change for good x, but use the demand function of good y to see how the consumer's welfare in her purchases of good y is affected by a more expensive good x.

20. Alternative representation of CV: quasilinear utility.\(^{B}\) Consider a consumer with the quasilinear utility function \(u(x, y) = 2x^{1/3} + y\). The demand for good x from the EMP yields \(x^E(p_x, p_y, u) = \left(\frac{2p_y}{3p_x}\right)^{3/2}\), as solved for in exercise 16 (where goods m and c are relabeled as x = m and y = c).
(a) Find demand for good y from the consumer's EMP.
(b) Calculate the CV for a price increase from \(p_x = \$5\) to \(p_x' = \$10\), where u = 30 and \(p_y = \$1\).
(c) Calculate the CV of the price change for good x, but use the demand function of good y to see how the consumer's welfare in her purchases of good y is affected by a more expensive good x.

21. Alternative representation of EV.\(^{B}\) Consider a consumer with a utility function \(u(x, y) = x^{0.75} y^{0.25}\).
(a) Find the demand for goods x and y by solving the consumer's EMP.
(b) Calculate the EV for a price increase from \(p_x = \$1\) to \(p_x' = \$2\), where the new utility is \(u' = 5\) and \(p_y = \$1\).
(c) Calculate the EV of the price change in good x, but use the demand function of good y to see how the consumer's welfare in her purchases of good y is affected by a more expensive good x.

22.
Alternative representation of EV: quasilinear utility.\(^{B}\) Consider a consumer with a quasilinear utility \(u(x, y) = 2x^{1/3} + y\). The demand function from the EMP yields \(x^E(p_x, p_y, u) = \left(\frac{6}{p_x}\right)^{3/2}\), as solved for in exercise 16 (where we relabeled goods m and c as x = m and y = c).
(a) Find demand for good y from the consumer's EMP.
(b) Calculate the EV for a price increase from \(p_x = \$5\) to \(p_x' = \$10\), where \(u' = 20\) and \(p_y = \$1\).
(c) Calculate the EV of the price change in good x, but use the demand function of good y to see how the consumer's welfare in her purchases of good y is affected by a more expensive good x at the new utility, \(u' = 20\).

6 Choice under Uncertainty

6.1 Introduction

In this chapter, we analyze situations where individuals or firms make choices under uncertainty, such as playing roulette in a casino or buying company stocks, where each outcome is not certain but has a probability associated with it. Another example is weather predictions, which nowadays are reported with a probability associated with rain, cloud cover, or sunny days.

We start the chapter by describing what we mean by a lottery: an uncertain event with associated probabilities. We then explain how to find a lottery's expected value, its variance, its standard deviation, and the expected utility that an individual obtains from participating in a lottery. While the expected value and variance of a lottery are both objective measures that we all agree on, its expected utility can be different depending on each individual's degree of risk aversion.

We define risk aversion in section 6.6, along with risk loving and risk neutrality. To start thinking about risk aversion, consider two job offers, both of them entailing the same expected dollar amount, $50,000. However, offer A gives you $50,000 with certainty (no risk), while offer B involves risk (e.g., it promises $60,000 with probability 0.8 and $10,000 with probability 0.2). Which one would you choose?
As we discuss in section 6.6, a risk-averse individual may prefer offer A, a risk lover may prefer offer B, and a risk-neutral person would be indifferent between the two offers.

We then discuss different measures of risk: (1) the risk premium that a risk-averse individual is willing to pay to avoid risk (i.e., to obtain a certain amount rather than participating in a lottery); (2) the certainty equivalent of a lottery; and (3) the Arrow-Pratt coefficient of absolute risk aversion. We finish the chapter by presenting other approaches to decision making under uncertainty from the behavioral economics literature, such as the certainty effect, prospect theory, and weighted utility.

[Figure 6.1. Probability lottery portrayed as a histogram. Vertical axis: probability p; bars for outcomes A (10%), B (60%), and C (30%).]

6.2 Lotteries

Lottery. An uncertain event with N potential outcomes, where each outcome i occurs with an associated probability \(p_i \in [0, 1]\), and the sum of these probabilities satisfies \(p_1 + p_2 + \dots + p_N = 1\).

The act of flipping a coin, for instance, can be regarded as a lottery, with two potential outcomes (Heads or Tails), each being equally likely (with probability 1/2). Similarly, weather conditions tomorrow can be understood as a lottery, where each outcome would be a different weather condition associated with a specified probability. Other common examples are stock returns, the outcome of a race, the score of a soccer match, and the probability of lightning striking you while you read this.

Therefore, lotteries can be understood as probability distributions over outcomes, such as the one depicted in figure 6.1, where outcome A occurs with probability 10 percent, B with probability 60 percent, and C with the remaining probability 30 percent. These probabilities can be understood as the frequency with which we observe a certain outcome, such as A, occurring.
For instance, if A refers to good-quality cars, probability 10 percent indicates the proportion of good-quality cars in a certain region.

6.3 Expected Value

In this section, we describe how to summarize the payoff that a lottery is expected to deliver (e.g., the return of a stock at the New York Stock Exchange) by computing its expected value.

Expected value (EV). The average payoff of a lottery, where each payoff is weighted by its associated probability.

The EV, therefore, computes the average payoff of a lottery by multiplying each possible payoff by its associated probability of occurring and summing the results. As a consequence, the EV assigns a larger weight to those payoffs that are relatively more likely to occur, and a smaller weight to those that are less likely. Example 6.1 puts this definition to work.

Example 6.1: Finding the EV of a lottery

Consider the following probability distribution: outcome A ($90) occurs with probability 10 percent, outcome B ($20) with probability 60 percent, and outcome C ($60) with probability 30 percent. The EV of the lottery is given by the weighted average

\[ EV = (0.1 \times \$90) + (0.6 \times \$20) + (0.3 \times \$60) = 9 + 12 + 18 = \$39. \]

As discussed previously, the EV assigns the largest weight to the most likely outcome B (its associated probability is 0.6), a smaller weight to outcome C (its probability is 0.3), and the smallest weight to the most unlikely outcome A (because its probability is only 0.1).

Self-assessment 6.1 Consider the lottery in example 6.1, but assume now that outcome A provides you with a payoff of $800, while outcome C only gives you $12. How is the EV of the lottery affected? Interpret.

6.4 Variance

While the EV informs about the expected payoff of a lottery, it does not provide us with a measure of how risky the lottery is. We can find lotteries yielding the same EV as that in example 6.1 (EV = $39), yet far less risky than that lottery.
For instance, a lottery with two equally likely outcomes a ($30) and b ($48) also generates an EV of

\[ EV = (0.5 \times \$30) + (0.5 \times \$48) = \$39. \]

Intuitively, while the lottery in example 6.1 has a large payoff variability (with payoffs ranging from $20 to $90), the lottery we just presented fluctuates close to its EV of $39 ($9 down in outcome a, or $9 up in outcome b). One measure of the riskiness of a lottery is its variance, which we define next.

Variance (Var). The average squared deviation of a lottery from its EV, weighting each squared deviation by the associated probability of that outcome.

You can think about variance sequentially, as follows:

1. For each possible outcome in the lottery, x, we compute how far away this outcome is relative to the EV (i.e., \(x - EV\)). This difference can be positive, if payoff x satisfies x > EV; negative, if x < EV; or zero, if the outcome's payoff coincides with the EV.
2. Square this payoff difference, \((x - EV)^2\), so that no difference is negative (whether payoffs lie above or below the EV).
3. Lastly, multiply this squared deviation \((x - EV)^2\) by the probability of the outcome, as in the EV calculation. This helps us weight each outcome with its associated likelihood of occurring.
4. If we repeat these three steps for all possible outcomes, and sum them up, we obtain the variance.

Therefore, the variance measures the dispersion of a data set relative to its mean (i.e., it increases as some payoffs become further away from the EV of the lottery). For instance, a volatile stock has a high variance. The variance also increases as outcomes with a large squared deviation become more likely (i.e., their probability weight increases). We next provide a numerical example to illustrate how to find the variance of a lottery.

Example 6.2: Finding the variance of a lottery

Let us first calculate the variance of the (risky) lottery in example 6.1:

\[ Var_{Risky} = 0.1 \times (\$90 - \$39)^2 + 0.6 \times (\$20 - \$39)^2 + 0.3 \times (\$60 - \$39)^2 = \$609. \]
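The four steps listed above translate almost line by line into code. The following is a minimal sketch in Python, offered only as a way to check the arithmetic; the names `risky`, `safe`, `expected_value`, and `variance` are our own labels and do not appear elsewhere in the chapter.

```python
# Each lottery is a list of (payoff, probability) pairs taken from the text.
risky = [(90, 0.1), (20, 0.6), (60, 0.3)]  # lottery in example 6.1
safe = [(30, 0.5), (48, 0.5)]              # two-outcome lottery of section 6.4


def expected_value(lottery):
    """Probability-weighted average payoff."""
    return sum(prob * payoff for payoff, prob in lottery)


def variance(lottery):
    """Steps 1-4: deviation from the EV, squared, weighted, then summed."""
    ev = expected_value(lottery)
    return sum(prob * (payoff - ev) ** 2 for payoff, prob in lottery)


for name, lottery in [("risky", risky), ("safe", safe)]:
    # Round only for display, to hide floating-point noise.
    print(name, round(expected_value(lottery), 2), round(variance(lottery), 2))
```

Both lotteries share the same EV, so any difference in the second number printed is due to dispersion alone.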
Intuitively, while the squared deviation of outcome A, \((\$90 - \$39)^2\), is large, its probability weight is the lowest (0.1), helping reduce the variance due to outcome A. In contrast, the squared deviation of outcome B is the smallest, \((\$20 - \$39)^2\), as $20 is the payoff closest to the EV of the lottery. Next, we can practice finding the variance of the relatively safe lottery presented at the beginning of this section:

\[ Var_{Safe} = 0.5 \times (\$30 - \$39)^2 + 0.5 \times (\$48 - \$39)^2 = \$81, \]

which is, of course, much smaller than that of the previous lottery because the squared deviations are low.

Self-assessment 6.2 Consider the risky lottery in example 6.2. If outcome A yields $800 rather than $90, how is the variance of the lottery affected? Interpret.

While variance helps us measure the volatility of a data set, it cannot be interpreted as a dollar amount, as payoff deviations from the mean have been squared. The standard deviation, defined next, helps us understand the dispersion of a data set in dollars or, more generally, in the original units of our payoffs.

Standard deviation (SD). The square root of the variance, or \(SD = \sqrt{Var}\).

For the variances found in example 6.2, we have that \(SD = \sqrt{609} = \$24.68\) for the riskier lottery, and only \(SD = \sqrt{81} = \$9\) for the less risky lottery. Needless to say, if lottery 1 has a larger variance than lottery 2, then it must also have a larger standard deviation because SD is increasing in Var.

6.5 Expected Utility

While previous sections analyze how to evaluate the expected monetary value of a lottery, and its riskiness, we do not yet have a tool to determine which specific lottery a decision maker selects when facing several available lotteries. To understand the value that she assigns to each lottery, we must first measure the expected utility she obtains from each lottery.

Expected utility (EU). The average utility of a lottery, weighting each utility with the associated probability of that outcome.
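In symbols, the definitions of EV and EU can be written side by side. The notation below is an assumption made here for illustration: \(x_i\) denotes the payoff in outcome i, \(p_i\) its probability (as in the definition of a lottery in section 6.2), and \(u(\cdot)\) the individual's utility function.

```latex
\[
EV = \sum_{i=1}^{N} p_i \, x_i ,
\qquad
EU = \sum_{i=1}^{N} p_i \, u(x_i) .
\]
```

The two sums use the same probability weights; they differ only in whether each payoff enters directly or through the utility function.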
We find the utility that the individual obtains from the payoff in one outcome, multiply this utility by the probability of that outcome occurring, and then repeat the process for all other outcomes. As a consequence, the definition of EU is similar to that of EV, as both approaches weight payoffs according to their probability, assigning a larger weight to more likely outcomes. However, EU plugs each payoff into the individual's utility function to better assess how important that payoff is for her, while EV considers only payoffs, without evaluating their utility for the individual.¹ Example 6.3 illustrates this definition.

Example 6.3: Finding the EU of a lottery

Consider an individual with utility function \(u(I) = \sqrt{I}\), where \(I \ge 0\) denotes the income that the individual receives in each outcome. Let us first calculate the EU of the lottery in example 6.1:

\[ EU_{Risky} = 0.1 \times \sqrt{\$90} + 0.6 \times \sqrt{\$20} + 0.3 \times \sqrt{\$60} = 5.96, \]

while that of the second (less risky) lottery is

\[ EU_{Safe} = 0.5 \times \sqrt{\$30} + 0.5 \times \sqrt{\$48} = 6.20. \]

This result indicates that the individual obtains a higher EU from the second lottery. While both lotteries generate the same EV, the safer lottery yields a higher EU for this individual.

Self-assessment 6.3 Consider the scenario in example 6.3, but assume that the individual's utility function changes to \(u(I) = I^{1/3}\). What is his EU from the risky lottery? What about from the safe lottery?

6.6 Risk Attitudes

6.6.1 Risk Aversion

Figure 6.2 depicts the EU from the less risky lottery. We can understand the construction of this figure sequentially as follows:

1. We plot the utility function \(u(I) = \sqrt{I}\), which is increasing and concave in income.² Intuitively, more income increases the individual's utility, but at a decreasing rate:

¹ A utility function with the EU form is also referred to as a "von Neumann-Morgenstern EU function."

² For an increasing utility function u(I), we say that it is concave if it increases at a decreasing rate.
Mathematically, this means that its second-order derivative with respect to income I is negative or zero, \(u''(I) \le 0\), but never positive.

[Figure 6.2. EV and EU from a lottery: risk averse. Axes: income I (horizontal), utility u(I) (vertical).]

additional amounts of income are more beneficial when she has only $1 than when she has $1 million!

2. We place payoff $30 on the horizontal axis (recall that the individual obtains this payoff when outcome a occurs).
3. We extend a vertical line from this point until we hit the utility function, at a height of \(\sqrt{\$30} \cong 5.47\) at point A.
4. We then repeat steps 2–3 for the other outcome in this lottery, $48, first placing it on the horizontal axis and then extending a vertical line that hits the utility function at \(\sqrt{\$48} \cong 6.93\) at point B.
5. Finally, we connect points A and B with a line and, because the lottery assigns the same probability to both outcomes a and b (they are equally likely), we find the midpoint of the line (see point C). The height of this point represents the EU of the lottery which, as described in example 6.3, is EU = 6.20. Intuitively, this height represents the utility that the individual obtains when she plays the lottery and thus faces some uncertainty about which outcome will arise.

Note that if we face a more volatile lottery (with higher variance), the line connecting points A and B becomes longer. For instance, if payoff $30 decreases to $16 in this lottery, while keeping all other elements of the lottery unchanged, point A's height would be only \(u(16) = \sqrt{16} = 4\).

In contrast, if we seek to depict the utility of the EV of the lottery, we only need to place the payoff corresponding to the EV, $39, on the horizontal axis, and then extend a vertical line until it hits the utility function, at a height of \(\sqrt{\$39} \cong 6.24\), as illustrated at point D. Intuitively, the utility of the EV, or more compactly \(u(EV) = \sqrt{EV}\) in this
Intuitively, the utility of the EV (or, more compactly, \(u(EV) = \sqrt{EV}\) in this example) represents the utility that the individual obtains if she receives the EV with certainty, without having to face the risk of playing the lottery. As figure 6.2 indicates, point D lies above point C, thus indicating that \(u(EV) > EU\). In short, this says that the individual is “risk averse” because she prefers to receive the EV of the lottery with certainty, where she obtains \(u(EV)\), rather than having to face the risk of playing the lottery, which yields EU.³ Intuitively, the reduction in utility that she suffers from the downside of the lottery (6.24 − 5.47 = 0.77) is larger than the increase in utility from the upside of the lottery (6.93 − 6.24 = 0.69).

For this type of individual, we can anticipate that, if facing two lotteries with the same EV (such as the risky and safe lotteries in example 6.3), she will always prefer the safer lottery. In other words, if two lotteries have the same EV, a risk-averse individual prefers the lottery with the lowest variance because it yields a higher EU.

Concave utility. Risk aversion arises every time an individual’s utility function is concave, such as \(u(I) = \sqrt{I}\), depicted in figure 6.2. Utility functions with the form \(u(I) = a + bI^{\gamma}\) are concave if constants \(a\) and \(b\) are positive and exponent \(\gamma\) in the individual’s income satisfies \(\gamma \in (0, 1)\). In the previous example, \(a = 0\), \(b = 1\), and \(\gamma = 1/2\), but other utility functions like \(u(I) = 5 + 4I^{1/3}\) or \(u(I) = 2 + 8I^{2/5}\) would also yield increasing and concave utility functions.⁴

6.6.2 Risk Loving

Not all individuals are risk averse. Instead, some individuals are “risk lovers” because they enjoy facing situations where risk is involved, as example 6.4 illustrates.

Example 6.4: Finding the EU of a lottery under risk-loving preferences
Consider an individual with utility function \(u(I) = I^2\).
Let us now find the EU of the two lotteries considered in example 6.3, but now evaluated at this utility function:
\[ EU_{Risky} = 0.1 \times 90^2 + 0.6 \times 20^2 + 0.3 \times 60^2 = 2{,}130 \]
and
\[ EU_{Safe} = 0.5 \times 30^2 + 0.5 \times 48^2 = 1{,}602. \]
This indicates that the individual obtains a higher EU from the first (risky) lottery than from the second (safe) lottery.

[Figure 6.3: EV and EU from a lottery, risk lover. Horizontal axis: income \(I\), marking $30, EV = $39, and $48. Vertical axis: \(u(I)\), marking 900 (point A), \(u(EV) = 1{,}521\) (point D), EU = 1,602 (point C), and 2,304 (point B).]

³ This result is a direct application of Jensen’s inequality. This inequality states that if \(f(x)\) is a strictly concave function and \(x\) is a real-valued random variable that is not constant, such as money in utility function \(u(I)\) in this discussion, then \(f(E[x]) > E[f(x)]\).
⁴ As an exercise, note that by differentiating the utility function \(u(I) = 5 + 4I^{1/3}\) with respect to income \(I\), we obtain \(u'(I) = 4 \cdot \frac{1}{3} I^{\frac{1}{3}-1} = \frac{4}{3} I^{-\frac{2}{3}}\), which is positive for all income levels \(I > 0\), indicating that utility increases in income. Differentiating \(u'(I)\), we find \(u''(I) = \frac{4}{3}\left(-\frac{2}{3}\right) I^{-\frac{2}{3}-1} = -\frac{8}{9} I^{-\frac{5}{3}}\), which is negative for all income levels \(I > 0\), thus reflecting that utility increases, but at a decreasing rate (i.e., it is concave in income).

Self-assessment 6.4
Consider the scenario in example 6.4, but assume that the individual’s utility function is \(u(I) = 5I^3\). What is her EU from the risky lottery? What about from the safe lottery? Which lottery yields the highest EU? Interpret.

Figure 6.3 plots utility function \(u(I) = I^2\). An immediate difference with \(u(I) = \sqrt{I}\) depicted in figure 6.2 is that \(u(I) = I^2\) is convex (i.e., it increases in income at an increasing rate).⁵ Intuitively, this says that the individual enjoys additional income more when she owns $1 million than when she owns only $1.
We can then follow an approach similar to that of the previous section to depict the EU of the safe lottery: first, we place the payoffs that can arise from the lottery on the horizontal axis ($30 and $48); second, we extend a vertical line upward until we hit the utility function (at a height of 900 for point A and 2,304 for point B); third, we connect points A and B with a straight line; and, finally, we find the midpoint of this straight line at point C, which represents the EU of the lottery, where EU = 1,602. In contrast, the utility of the EV is found by simply extending a vertical line upward from EV = $39 until we hit the utility function at point D; that is, \(u(39) = 39^2 = 1{,}521\). As expected, in this case, point C lies above point D, indicating that \(u(EV) < EU\). Intuitively, this says that the individual is a “risk lover” because she prefers to play the lottery and face risk (obtaining EU) to receiving the EV of the lottery with certainty, where she obtains \(u(EV)\).

⁵ For an increasing utility function \(u(I)\), we say that it is convex if it increases at an increasing rate. Mathematically, this means that its second-order derivative with respect to income \(I\) is positive or zero, \(u''(I) \geq 0\), but never negative.

Convex utility. Risk-loving attitudes emerge when an individual’s utility function is convex, such as \(u(I) = I^2\), depicted in figure 6.3. (Recall that we assume positive or zero income levels throughout the chapter, \(I \geq 0\).) Generally, utility functions with the form \(u(I) = a + bI^{\gamma}\) are convex if constants \(a\) and \(b\) are positive, while exponent \(\gamma\) now satisfies \(\gamma > 1\). In example 6.4, parameters \(a\) and \(b\) took values \(a = 0\) and \(b = 1\), while exponent \(\gamma\) was \(\gamma = 2\); but other utility functions like \(u(I) = 5 + 7I^3\) or \(u(I) = 8 + 2I^5\) are also convex.⁶

6.6.3 Risk Neutrality

Finally, some individuals may not be risk averse or risk loving, but instead are “risk neutral.” Example 6.5 calculates the EU again, but now under risk-neutral preferences.
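
The EU calculations for the three risk attitudes can be placed side by side with a short script. The following is only a minimal sketch: the names `RISKY`, `SAFE`, `expected_value`, and `expected_utility` are hypothetical helpers introduced here for illustration, and the lotteries are taken from the probabilities and payoffs used in example 6.3.

```python
import math

# Lotteries written as (probability, payoff) pairs, as used in example 6.3.
# The names RISKY and SAFE are illustrative, not taken from the text.
RISKY = [(0.1, 90), (0.6, 20), (0.3, 60)]
SAFE = [(0.5, 30), (0.5, 48)]


def expected_value(lottery):
    """Probability-weighted average of the payoffs."""
    return sum(p * x for p, x in lottery)


def expected_utility(lottery, u):
    """Probability-weighted average of the utility of each payoff."""
    return sum(p * u(x) for p, x in lottery)


# One utility function per risk attitude discussed in section 6.6.
UTILITIES = {
    "risk averse,  u(I) = sqrt(I)": math.sqrt,
    "risk loving,  u(I) = I^2": lambda income: income ** 2,
    "risk neutral, u(I) = I": lambda income: income,
}

print(f"EV risky = {expected_value(RISKY):.2f}, EV safe = {expected_value(SAFE):.2f}")
for label, u in UTILITIES.items():
    eu_risky = expected_utility(RISKY, u)
    eu_safe = expected_utility(SAFE, u)
    print(f"{label}: EU risky = {eu_risky:.2f}, EU safe = {eu_safe:.2f}")
```

Both lotteries have the same EV, so any difference in the printed EU values comes only from the curvature of the utility function: the concave function ranks the safe lottery higher, the convex function ranks the risky lottery higher, and the linear function ranks them equally.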
Example 6.5: Finding the EU of a lottery under risk-neutral preferences
Consider an individual with utility function \(u(I) = I\). The EUs from the risky and safe lotteries for this individual are
\[ EU_{Risky} = (0.1 \times 90) + (0.6 \times 20) + (0.3 \times 60) = 39 \]
and
\[ EU_{Safe} = (0.5 \times 30) + (0.5 \times 48) = 39, \]
meaning that the individual experiences the same EU from the risky and safe lotteries.

⁶ As an exercise, note that differentiating utility function \(u(I) = 5 + 7I^3\) with respect to income \(I\), we obtain \(u'(I) = 7 \times 3I^{3-1} = 21I^2\), which is positive, implying that utility increases in income. Differentiating \(u'(I)\), we find \(u''(I) = 2 \times 21I^{2-1} = 42I\), which is also positive, thus indicating that utility increases at an increasing rate (i.e., it is convex in income).

[Figure 6.4: EV and EU from a lottery, risk neutral. Horizontal axis: income \(I\), marking $30, EV = $39, and $48. Vertical axis: \(u(I)\), marking 30 (point A), EU = \(u(EV)\) = 39 (points C and D), and 48 (point B).]

Self-assessment 6.5
Consider the scenario in example 6.5, but assume that the individual’s utility function is \(u(I) = 2 + 5I\). What is her EU from the risky lottery? What about from the safe lottery? Which lottery yields the highest EU? Interpret.

Figure 6.4 plots utility function \(u(I) = I\), which is linear in income (i.e., a straight line). That is, it increases in income, but at a constant rate (in this case, 1) rather than at a decreasing rate (as with concave utility functions) or at an increasing rate (as with convex utility functions). This figure follows the same approach to depict the EU of the safe lottery as in previous sections. We can immediately see that the height of point C, which represents the EU of the lottery, coincides with that of point D, which identifies the utility of the EV of the lottery. As a consequence, \(u(EV) = EU\). This result indicates that the individual is “risk neutral” because she obtains the same utility from receiving the EV of the lottery with certainty, which yields \(u(EV)\), and from playing the lottery, where she obtains the EU.

Linear utility.
Risk neutrality arises when an individual’s utility function is linear, thus exhibiting the form \(u(I) = a + bI\), where \(a\) and \(b\) are positive constants. In example 6.5, \(a = 0\) and \(b = 1\); but other utility functions like \(u(I) = 3 + 8I\) are also linear.⁷

⁷ Generally, the linear utility function \(u(I) = a + bI\) is increasing in income, because \(u'(I) = b\) is positive, and it increases at a constant rate, given that \(u''(I) = 0\). We can confirm this property in this example, \(u(I) = 3 + 8I\), because \(u'(I) = 8\) is positive and \(u''(I) = 0\), as required.

6.7 Measuring Risk

The discussion has established that an individual with a concave utility function is risk averse. A natural question is: how averse? Or, more generally, how can we measure risk? In this section, we seek to measure the amount of money that a risk-averse individual is willing to pay to avoid the risk of playing the lottery.

6.7.1 Risk Premium

Risk premium (RP). The amount of money that we need to subtract from the EV in order to make the decision maker indifferent between playing the lottery and accepting the EV from the lottery. That is, the RP solves \(u(EV - RP) = EU\).

To understand the RP, think about the following scenario. Assume that you are the risk-averse individual of example 6.3, and we approach you with the relatively safe lottery. As we know, the EU from playing that lottery is EU = 6.2. If, instead, we offer you the EV of the lottery with certainty, $39, your utility is larger because \(u(EV) = \sqrt{39} = 6.24\), and you would prefer the EV paid with certainty. Knowing that, we cut the EV that we offer you by $1. Would you still prefer EV − $1 to playing the lottery? You would say yes if \(u(EV - 1) > EU\). What if we cut the EV by $2? You would still accept it if \(u(EV - 2) > EU\). The RP then measures how much we need to cut the EV offered to you with certainty to make you indifferent between accepting the (diminished) EV and playing the lottery, that is, \(u(EV - RP) = EU\). Example 6.6 finds the exact RP for our safe lottery.
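
The thought experiment of cutting the EV step by step can be sketched in a few lines. This is an illustrative sketch only: the function name `prefers_certain_amount` and the particular cut sizes are assumptions made here, not part of the text, and the utility function is the \(u(I) = \sqrt{I}\) of example 6.3.

```python
import math


def u(income):
    # Utility function of the risk-averse individual in example 6.3.
    return math.sqrt(income)


EV = 0.5 * 30 + 0.5 * 48          # expected value of the safe lottery ($39)
EU = 0.5 * u(30) + 0.5 * u(48)    # expected utility of the safe lottery


def prefers_certain_amount(cut):
    """True if receiving EV - cut with certainty beats playing the lottery."""
    return u(EV - cut) > EU


# Illustrative cuts (in dollars); any grid of values could be used instead.
for cut in (0.00, 0.25, 0.50, 0.75, 1.00):
    choice = "certain amount" if prefers_certain_amount(cut) else "lottery"
    print(f"cut = ${cut:.2f}: u(EV - cut) = {u(EV - cut):.4f} vs EU = {EU:.4f} -> {choice}")
```

Under these assumptions the individual switches from the certain amount to the lottery somewhere between a cut of $0.50 and a cut of $0.75, which brackets the RP: the cut at which she is exactly indifferent.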
Example 6.6: Finding the RP of a lottery
Considering the safe lottery of example 6.3, and recalling that EV = $39 and EU = 6.2, the RP solves \(u(39 - RP) = 6.2\), or \(\sqrt{39 - RP} = 6.2\). Squaring both sides yields \(39 - RP = 6.2^2\), and solving for RP, we obtain RP = $0.56. Intuitively, we would need to cut the EV of the lottery by $0.56 for the individual to be indifferent between playing the lottery and receiving that (diminished) EV with certainty. If we cut the EV = $39 by more than $0.56, the individual would prefer playing the lottery rather than the (highly discounted) EV.

[Figure 6.5: Finding the RP and CE of a lottery. Horizontal axis: income \(I\), marking $30, CE, EV = $39, and $48, with RP shown as the distance between CE and EV. Vertical axis: \(u(I)\), marking 5.47 (point A), EU = 6.20 (point C), \(u(EV) = 6.24\) (point D), and 6.93 (point B).]

Self-assessment 6.6
Consider the scenario in example 6.6, but assume now that EV = $42 and EU = 6. Find the RP and interpret your result.

Figure 6.5 illustrates the RP. Starting from the EV, the RP decreases the certain amount that we offer the individual. Graphically, we shift the EV leftward until its utility (height of point D) decreases enough to coincide with the EU of the lottery (height of point C). The diminished EV, after subtracting the RP, EV − RP, thus makes the individual indifferent between receiving that amount with certainty and playing the lottery. This diminished EV is also known as the “certainty equivalent,” as we define next.

6.7.2 Certainty Equivalent

Certainty equivalent (CE). The amount of money that, if given to the individual with certainty, makes her indifferent between receiving such a certain amount and playing the lottery. That is, CE = EV − RP.

In example 6.6, CE = EV − RP = 39 − 0.56 = $38.44. Therefore, if we offer $38.44 to the risk-averse individual, she would be indifferent between receiving this amount and playing the lottery.
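
The RP and CE can also be obtained directly by inverting the utility function, since the CE solves \(u(CE) = EU\). The sketch below assumes a hypothetical helper `risk_measures` that takes the utility function and its inverse as arguments; neither name comes from the text.

```python
import math

# Safe lottery from example 6.3 as (probability, payoff) pairs.
SAFE = [(0.5, 30), (0.5, 48)]


def risk_measures(lottery, u, u_inverse):
    """Return (EV, EU, CE, RP), where CE solves u(CE) = EU and RP = EV - CE."""
    ev = sum(p * x for p, x in lottery)
    eu = sum(p * u(x) for p, x in lottery)
    ce = u_inverse(eu)          # income that yields utility EU with certainty
    return ev, eu, ce, ev - ce


# For u(I) = sqrt(I), the inverse is squaring the utility level.
ev, eu, ce, rp = risk_measures(SAFE, math.sqrt, lambda utility: utility ** 2)
print(f"EV = {ev:.2f}, EU = {eu:.4f}, CE = {ce:.2f}, RP = {rp:.2f}")
```

Because this sketch keeps the EU unrounded (about 6.2027 rather than 6.2), it returns a CE of roughly $38.47 and an RP of roughly $0.53, slightly different from the $38.44 and $0.56 obtained above with the rounded EU.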
Example 6.7: Measuring RP and CE with other risk attitudes
Consider the risk-loving individual from example 6.4. Because EV = $39 and EU = 1,602, the RP solves \(u(39 - RP) = 1{,}602\), which in this case entails \((39 - RP)^2 = 1{,}602\). Taking the square root of both sides of the equality yields \(39 - RP = \sqrt{1{,}602}\), or \(39 - RP = 40.02\). Solving for RP, we obtain RP = −$1.02, which is negative! This indicates that, for the individual to be indifferent between playing the lottery and receiving a monetary amount with certainty, we would need to offer her more than the EV (rather than less, as in example 6.6). She loves risk, so she would actually need to be compensated to stop playing the lottery. Therefore, the CE becomes CE = EV − RP = $40.02. Because RP < 0, the CE augments the EV to induce the individual to stop playing the lottery. As a consequence, CE > EV when the individual is a risk lover.

Following a similar approach with the risk-neutral individual from example 6.5, we obtain that the RP solves \(u(39 - RP) = 39\), which in this case entails \(39 - RP = 39\), ultimately yielding RP = $0. Intuitively, the individual is indifferent between receiving the EV with certainty and playing the lottery, so we do not have to decrease the EV (as with risk-averse individuals) or increase it (as with risk lovers). As a result, the CE becomes CE = EV − RP = EV.

Self-assessment 6.7
Consider an individual with the utility function \(u(I) = I^2\). We find that EV = $42 and EU = 1,822. What is her RP from the lottery? What is her CE? Compare the CE and RP and interpret your result.

6.7.3 Arrow-Pratt Coefficient of Absolute Risk Aversion

From figures 6.2–6.4, you probably noticed that risk aversion requires the utility function to be concave, meaning that it is increasing in the individual’s income but at a decreasing rate.
The Arrow-Pratt coefficient of absolute risk aversion (AP for short) uses the concavity of the utility function to measure risk aversion, as described next.

Arrow-Pratt coefficient of absolute risk aversion (AP). This coefficient is given by
\[ AP \equiv -\frac{u''}{u'}, \]
where the denominator, \(u'\), represents the first-order derivative of the individual’s utility function \(u(I)\) with respect to income \(I\), while the numerator, \(u''\), denotes the second-order derivative.

Recall that the denominator \(u'\) is always positive because the individual’s utility increases when her income increases. The numerator, however, can be (1) negative when the individual’s utility function is concave, \(u'' < 0\) (entailing a positive AP coefficient given the negative sign in the AP definition); (2) positive when her utility function is convex, \(u'' > 0\) (which yields a negative AP); or (3) zero when her utility function is linear, \(u'' = 0\) (providing a zero AP coefficient). Example 6.8 illustrates how to find the AP for the risk-averse individual we considered in previous examples.

Example 6.8: Finding the AP coefficient
Consider the risk-averse individual from example 6.3 with utility function \(u(I) = \sqrt{I}\). The first-order derivative is \(u' = \frac{1}{2} I^{-\frac{1}{2}}\), and the second-order derivative is then
\[ u'' = \frac{1}{2}\left(-\frac{1}{2}\right) I^{-\frac{1}{2}-1} = -\frac{1}{4} I^{-\frac{3}{2}}, \]
yielding an AP coefficient of
\[ AP = -\frac{u''}{u'} = -\frac{-\frac{1}{4} I^{-\frac{3}{2}}}{\frac{1}{2} I^{-\frac{1}{2}}} = \frac{1}{2} I^{-\frac{3}{2}+\frac{1}{2}} = \frac{1}{2} I^{-1} = \frac{1}{2I}, \]
which is positive, thus indicating a positive risk aversion.

In contrast, the risk-loving individual from example 6.4, with utility function \(u(I) = I^2\), has a first-order derivative of \(u' = 2I\) and a second-order derivative of \(u'' = 2\). Therefore, her AP coefficient is
\[ AP = -\frac{u''}{u'} = -\frac{2}{2I} = -\frac{1}{I}, \]
which reflects a negative risk aversion (because she is a risk lover). As an exercise, you can check that a risk-neutral individual, such as that of example 6.5, with a utility function of the form \(u(I) = a + bI\), has an AP coefficient of zero.⁸

⁸
Indeed, the first-order derivative is \(u' = b\), whereas the second-order derivative is \(u'' = 0\), which yields \(AP = -\frac{0}{b} = 0\).

Table 6.1 Summary of risk aversion measures.

                                   Risk Averse        Risk Lover         Risk Neutral
Utility function                   Concave            Convex             Linear
u(EV) vs. EU                       u(EV) > EU         u(EV) < EU         u(EV) = EU
Risk Premium, RP                   +                  −                  0
Certainty Equivalent, CE           CE < EV            CE > EV            CE = EV
Arrow-Pratt coefficient, AP        AP > 0             AP < 0             AP = 0
Exponent γ in u(I) = a + bI^γ      Between 0 and 1    Larger than 1      1

Self-assessment 6.8
Consider an individual with utility function \(u(I) = 2I^{1/3}\). Using the same steps as in example 6.8, find her AP coefficient. Interpret.

Table 6.1 summarizes some of the results for a risk-averse, risk-loving, and risk-neutral individual. When the individual is risk averse, in the first column: (1) her utility function \(u(I)\) is concave in income; (2) her utility from the EV of the lottery is larger than the EU from playing the lottery, \(u(EV) > EU\); (3) she would pay a positive amount to receive the EV of the lottery with certainty rather than playing the lottery; (4) the CE she needs to receive to avoid playing the lottery is lower than the EV of the lottery; and (5) the Arrow-Pratt coefficient of risk aversion is positive, AP > 0. Finally, if her utility function has the form \(u(I) = a + bI^{\gamma}\), exponent \(\gamma\) must be a number between 0 and 1.

6.8 A Look at Behavioral Economics: Nonexpected Utility

The EU measure is tractable and intuitive, inducing many researchers to test it experimentally in recent decades. When we say that a theory has been “experimentally tested,” we mean that researchers set up an experiment where several individuals (often college students) are asked to sit at computer terminals and then presented with relatively simple lotteries to choose among.
To help every participant think hard about her best choice, experiments provide monetary incentives, such as informing participants that they can take home $1 of every $5 they earn in the experiment.

What were the main findings of these experiments? Participants sometimes behaved differently from what EU would have predicted, leading researchers in the field of behavioral economics to propose alternative theories of decision-making under uncertainty that seek to account for these experimental anomalies. We present some of them next.⁹

⁹ For more references, see the book by Daniel Kahneman and Amos Tversky (2000) or the more accessible one published by Kahneman (2013).

Example 6.9: The certainty effect
Kahneman and Tversky (1979) asked experimental participants to consider what decisions they would make in the following two choices:

1. Choice 1:
(a) Lottery A: Receive $3,000 with certainty.
(b) Lottery B: Receive $4,000 with probability 0.8 and receive $0 with probability 0.2.
2. Choice 2:
(c) Lottery C: Receive $3,000 with probability 0.25 and receive $0 with probability 0.75.
(d) Lottery D: Receive $4,000 with probability 0.20 and receive $0 with probability 0.80.

Kahneman and Tversky (1979) found that most participants prefer lottery A over B in Choice 1 and lottery D over C in Choice 2. However, these preferences are inconsistent with EU theory, as we show next. In particular, an individual who evaluates lotteries according to their EU prefers lottery A over B in Choice 1 if and only if
\[ u(3{,}000) > 0.8\,u(4{,}000) + 0.2\,u(0) \]
because in lottery A, she receives $3,000 with certainty (see the left side of the inequality), while in lottery B, she receives $4,000 with probability 0.8 and $0 otherwise (see the right side).
Individuals also expressed a preference for lottery D over C in Choice 2, which means
\[ 0.2\,u(4{,}000) + 0.8\,u(0) > 0.25\,u(3{,}000) + 0.75\,u(0), \]
where the left side of the inequality represents the EU of lottery D, while the right side measures the EU of lottery C. Dividing both sides of the inequality by 0.25 and rearranging yields
\[ 0.8\,u(4{,}000) + 0.2\,u(0) > u(3{,}000), \]
but this inequality is exactly the opposite of the inequality for Choice 1. This result is problematic because we did not assume any risk attitude for the individual (she could be risk averse, risk loving, or risk neutral). In other words, we cannot rationalize these choices using EU, regardless of the utility function of this individual, \(u(\cdot)\). The alternative theories of decision-making under uncertainty that we present next, however, can help explain these preferences for lotteries.

6.8.1 Weighted Utility

An individual with weighted utility (WU) assigns to each payoff \(x\) in the lottery a weight \(g(x)\), which may differ from the weight that she assigns to payoff \(y\); that is to say, \(g(x) \neq g(y)\). To illustrate this assumption, consider a lottery between two payoffs \(x\) and \(y\), with probabilities \(p\) and \(1 - p\), respectively. According to EU theory (section 6.5), the EU of this lottery is
\[ EU = p\,u(x) + (1 - p)\,u(y), \]
where \(u(x)\) denotes the utility that the individual enjoys from payoff \(x\), while \(u(y)\) represents her utility from payoff \(y\). Examples of these utilities include \(u(x) = x^{1/2}\) and \(u(y) = y^{1/2}\), as explored in previous examples in this chapter. Intuitively, \(p\) operates as the probability weight on payoff \(x\), whereas \(1 - p\) is the probability weight on payoff \(y\). WU only changes these probability weights, \(p\) and \(1 - p\), as follows:
\[ WU = \underbrace{\frac{g(x)p}{g(x)p + g(y)(1 - p)}}_{\text{Prob. weight on payoff } x} u(x) + \underbrace{\frac{g(y)(1 - p)}{g(x)p + g(y)(1 - p)}}_{\text{Prob. weight on payoff } y} u(y). \]
In the special case in which the individual assigns the same weight to both payoffs, \(g(x) = g(y)\), this WU simplifies to¹⁰
\[ WU = p\,u(x) + (1 - p)\,u(y) = EU. \]
Therefore, WU theory in this case is equivalent to the EU theory presented in this chapter. In contrast, when payoff weights do not coincide, \(g(x) \neq g(y)\), the two approaches yield different results. Intuitively, when the individual assigns a greater weight to the upside outcome of the lottery, \(g(y) > g(x)\) where payoffs satisfy \(y > x\), the WU assigns a larger importance to the upside outcome, ultimately yielding a WU that exceeds the EU. In this context, the individual is more willing to participate in the lottery when she evaluates it according to the lottery’s WU than according to its EU.

¹⁰ Indeed, when \(g(x) = g(y)\), we first obtain
\[ WU = \frac{g(x)p}{g(x)p + g(x)(1 - p)} u(x) + \frac{g(x)(1 - p)}{g(x)p + g(x)(1 - p)} u(y), \]
which simplifies to \(\frac{g(x)p}{g(x)} u(x) + \frac{g(x)(1 - p)}{g(x)} u(y)\), and then ultimately reduces to \(p\,u(x) + (1 - p)\,u(y)\).

Example 6.10: Weighted utility
Consider the safe lottery in example 6.3, which yields payoffs \(x = \$30\) and \(y = \$48\), both occurring with probability 1/2. The individual’s utility function was \(u(x) = \sqrt{x}\), implying that the safe lottery generated EU = 6.20. If, instead, the individual evaluates this lottery according to WU and \(g(x) = 2\), while \(g(y) = 3\), her WU becomes
\[ WU = \frac{2 \cdot \frac{1}{2}}{2 \cdot \frac{1}{2} + 3 \cdot \frac{1}{2}} \sqrt{30} + \frac{3 \cdot \frac{1}{2}}{2 \cdot \frac{1}{2} + 3 \cdot \frac{1}{2}} \sqrt{48} = \frac{2}{5} \times 5.47 + \frac{3}{5} \times 6.92 = 6.34, \]
which is larger than the EU. Intuitively, because the individual assigns a larger weight to the upside outcome of the lottery (payoff \(y\)), she finds the lottery more attractive when evaluating it according to WU than according to EU.

WU can help explain the preferences for lotteries described in example 6.9.

Example 6.11: Using WU to explain the certainty effect
Consider again the individual in example 6.10 and check whether her preferences can explain the certainty effect presented in example 6.9. Lottery A is preferred to B in Choice 1 if and only if
\[ \sqrt{3{,}000} > \frac{2 \times 0.2}{(3 \times 0.8) + (2 \times 0.2)} \sqrt{0} + \frac{3 \times 0.8}{(3 \times 0.8) + (2 \times 0.2)} \sqrt{4{,}000}, \]
which simplifies to 54.77 > 54.21.
In addition, lottery D is preferred to C in Choice 2 if and only if
\[ \frac{2 \times 0.2}{(3 \times 0.8) + (2 \times 0.2)} \sqrt{0} + \frac{3 \times 0.8}{(3 \times 0.8) + (2 \times 0.2)} \sqrt{4{,}000} > \frac{3 \times 0.75}{(3 \times 0.75) + (2 \times 0.25)} \sqrt{3{,}000} + \frac{2 \times 0.75}{(3 \times 0.75) + (2 \times 0.25)} \sqrt{0}, \]
which collapses to 54.21 > 44.81. Therefore, the experimental observations in Kahneman and Tversky (1979) can be explained by WU theory.

6.8.2 Prospect Theory

Tversky and Kahneman (1986) proposed that the value that an individual obtains from a lottery can be different from the EU. In particular, considering the same lottery as in the previous section (two payoffs \(x\) and \(y\), with probabilities \(p\) and \(1 - p\), respectively), the value of the lottery is
\[ V = w(p)\,v(x, x_0) + w(1 - p)\,v(y, x_0). \]

[Figure 6.6: Utility function in prospect theory. Horizontal axis: payoff \(x\); vertical axis: \(v(x, x_0)\). The curve is concave in gains (payoffs above \(x_0\)), convex in losses (payoffs below \(x_0\)), and has a kink at the reference point \(x = x_0\).]

This value of the lottery differs from the EU in three dimensions:

• Probability weights. Like WU in the previous section, probability is also weighted with the probability weighting function \(w(p)\), rather than considering \(p\) directly. When \(w(p) > p\), we say that the individual overestimates the likelihood of outcome \(x\), and when \(w(p) < p\), she underestimates it. If \(w(p) = p\), the individual is not overestimating or underestimating the probability of outcome \(x\), thus assigning the same probability weights as when she uses EU to evaluate the lottery.¹¹
• The use of reference points. Every payoff \(x\) is evaluated against a reference point \(x_0\), such as the status quo, so the individual’s utility from payoff \(x\) is \(v(x, x_0)\), and that of payoff \(y\) is \(v(y, x_0)\). Utility \(v(x, x_0)\) is increasing in \(x\) and, importantly, it is concave for all payoffs that lie above the reference point, \(x > x_0\), indicating that the individual is risk averse toward gains (relative to the reference point). Figure 6.6 illustrates this utility function with a solid line.
However, \(v(x, x_0)\) is convex for all payoffs that lie below the reference point (\(x < x_0\)), suggesting that the individual is risk loving toward losses, as depicted by the dashed line of the utility function.
• Loss aversion. Utility \(v(x, x_0)\) has a kink at reference point \(x_0\), rather than a smooth transition. Intuitively, this means that the individual suffers a larger disutility when she loses $1 relative to the reference point \(x_0\) than the utility she gains when she receives the same dollar starting from \(x_0\), which is often referred to as the individual exhibiting loss aversion.

¹¹ A similar argument applies to outcome \(y\), with weighted probability \(w(1 - p)\), which can overestimate the likelihood of outcome \(y\) if \(w(1 - p) > 1 - p\), or underestimate it if \(w(1 - p) < 1 - p\). In addition, the probability weighting function assumes that overestimation or underestimation does not happen with outcomes occurring with certainty; that is, when \(p = 1\), we find that \(w(1) = 1\) and, similarly, when \(p = 0\), we find that \(w(0) = 0\).

Example 6.12: Prospect theory
As an example of probability weighting function \(w(p)\), consider
\[ w(p) = \frac{p^{1/2}}{p^{1/2} + (1 - p)^{1/2}}. \]
Figure 6.7 depicts \(w(p)\) on the vertical axis and \(p\) on the horizontal axis, and includes the 45-degree line where \(w(p) = p\). For relatively low probabilities (on the left side of the figure), \(w(p)\) lies above the 45-degree line, so \(w(p) > p\), indicating that the individual overestimates outcomes that occur with low probability. In contrast, for relatively high probabilities (on the right side of the figure), \(w(p)\) lies below the 45-degree line, implying that \(w(p) < p\). In this case, the individual underestimates outcomes that happen with high probability.

Regarding utility function \(v(x, x_0)\), we could consider a reference point \(x_0 = \$0\), for simplicity, so that \(v(x, 0) = x^{1/2}\) for all \(x \geq 0\), and \(v(x, 0) = -3(-x)^{1/4}\) for all \(x < 0\).
Intuitively, the individual has a concave utility function \(x^{1/2}\) for all positive payoffs, and a convex utility function \(-3(-x)^{1/4}\) for all negative payoffs, producing a graph similar to that shown in figure 6.6, but with the kink happening at the origin, \(x = 0\). Exponent 1/2 in \(x^{1/2}\) captures her concavity in gains, exponent 1/4 in \(-3(-x)^{1/4}\) measures her convexity in losses, and −3 in \(-3(-x)^{1/4}\) represents her loss aversion.

[Figure 6.7: Probability weighting function. Horizontal axis: \(p\), from 0 to 1, with \(p = 1/2\) marked; vertical axis: \(w(p)\), from 0 to 1. The curve \(w(p)\) lies above the 45-degree line \(w(p) = p\) for low probabilities (overestimation, because \(w(p) > p\)) and below it for high probabilities (underestimation, because \(w(p) < p\)).]

To understand this point, note that if the utility for losses was \(-(-x)^{1/2}\), gains and losses would produce the same effect on the consumer’s utility, leading to no kink in the utility function in figure 6.6.

We finish this chapter by showing that prospect theory can help explain the certainty effect discussed in example 6.9.

Example 6.13: Using prospect theory to explain the certainty effect
Let us consider again the preference for lotteries described in example 6.9. In particular, we assume the probability weighting function \(w(p)\) in example 6.12 and the utility function \(v(x, x_0)\) from that example, where reference point \(x_0 = \$0\). First, Kahneman and Tversky (1979) found that individuals typically prefer lottery A over B, entailing
\[ 3{,}000^{1/2} > \frac{0.8^{1/2}}{0.8^{1/2} + (1 - 0.8)^{1/2}}\, 4{,}000^{1/2} + \left(1 - \frac{0.8^{1/2}}{0.8^{1/2} + (1 - 0.8)^{1/2}}\right) 0^{1/2}, \]
which simplifies to 54.77 > 42.16. Similarly, individuals prefer lottery D over C; that is,
\[ \frac{0.2^{1/2}}{0.2^{1/2} + (1 - 0.2)^{1/2}}\, 4{,}000^{1/2} + \left(1 - \frac{0.2^{1/2}}{0.2^{1/2} + (1 - 0.2)^{1/2}}\right) 0^{1/2} > \frac{0.25^{1/2}}{0.25^{1/2} + (1 - 0.25)^{1/2}}\, 3{,}000^{1/2} + \left(1 - \frac{0.25^{1/2}}{0.25^{1/2} + (1 - 0.25)^{1/2}}\right) 0^{1/2}, \]
which collapses to 21.08 > 20.05. Therefore, individuals preferring lottery A over B in Choice 1, but lottery D over C in Choice 2, is consistent with prospect theory.
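
The comparisons in example 6.13 can be checked numerically. The following is a minimal sketch under the assumptions of example 6.12 (weighting function \(w(p)\) and reference point \(x_0 = \$0\)); the function names `w`, `v`, `prospect_value`, and `expected_utility` and the dictionary `LOTTERIES` are hypothetical labels chosen here for illustration.

```python
import math


def w(p):
    """Probability weighting function from example 6.12."""
    return math.sqrt(p) / (math.sqrt(p) + math.sqrt(1 - p))


def v(x):
    """Utility relative to the reference point x0 = 0 (example 6.12)."""
    if x >= 0:
        return math.sqrt(x)          # concave in gains
    return -3 * (-x) ** 0.25         # convex in losses, scaled by loss aversion


def prospect_value(lottery):
    """V = sum of w(probability) * v(payoff) over the lottery's outcomes."""
    return sum(w(p) * v(x) for p, x in lottery)


def expected_utility(lottery):
    """Same utility v, but with unweighted probabilities, for comparison."""
    return sum(p * v(x) for p, x in lottery)


# Lotteries from example 6.9 as (probability, payoff) pairs.
LOTTERIES = {
    "A": [(1.0, 3000)],
    "B": [(0.8, 4000), (0.2, 0)],
    "C": [(0.25, 3000), (0.75, 0)],
    "D": [(0.20, 4000), (0.80, 0)],
}

for name, lottery in LOTTERIES.items():
    print(f"Lottery {name}: V = {prospect_value(lottery):.2f}, EU = {expected_utility(lottery):.2f}")
```

With these assumptions, the prospect-theory values rank A above B and D above C, as in example 6.13, whereas the unweighted EU computed with the same utility ranks A above B but C above D, which illustrates why EU alone cannot reproduce both observed choices.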
For other examples of experiments testing behavior in contexts with uncertainty, see Tversky and Kahneman (1992), and for a readable introduction to the general topic, see Kahneman and Tversky (2000).

Exercises

1. Expected utility.ᴬ Scientists are evaluating the impact of climate change on the production of apples in the Yakima region in Washington State. After analyzing the data on temperature during the last fifty years, they have identified three cases:
(a) low impact, which can occur with probability 5 percent.
(b) medium impact, with probability 45 percent.
(c) high impact, with probability 50 percent.
A low-impact scenario implies profits for the agriculture industry of \(\pi = \$85\) million, the medium-impact scenario yields profits of \(\pi = \$5\) million, and the high-impact scenario implies negative profits of \(\pi = -\$900\) million. Consider a farmer who is risk averse and whose utility function is concave and equal to
\[ u(\pi) = 10 + 3 \times \pi^{1/3}. \]
Calculate the EU for this farmer and discuss whether she should support measures to deal with climate change.

2. EV and variance–I.ᴬ You are looking at two firms as an investment opportunity:
• For the first firm, you know that with probability 0.7, your investment will mature to a profit of $60 million, and with probability 0.3, your investment will mature to a loss of $40 million.
• For the second firm, you know that with probability 0.9, your investment will mature to a profit of $40 million, and with probability 0.1, your investment will mature to a loss of $10 million.
(a) Calculate the EV of each investment.
(b) Calculate the variance of each investment.
(c) If you had the opportunity to invest in only one of these firms, which would you pick, and why?

3. EV and variance–II.ᴮ You are looking at two firms as an investment opportunity.
• For the first firm, you know that with probability 0.7, your investment will mature to a profit of $45 million, and with probability 0.3, your investment will mature to a loss of $30 million.
• For the second firm, you know that with probability 0.8, your investment will mature to a profit of $30 million, and with probability 0.2, your investment will mature to a loss of $7.5 million.
(a) Calculate the EV of each investment.
(b) Calculate the variance of each investment.
(c) If you had the opportunity to invest in only one of these firms, which would you pick, and why?

4. Expected utility–I.ᴬ Consider the situation in exercise 2, but suppose now that your utility function is
\[ u(\pi) = \sqrt{50 + \pi}, \]
where \(\pi\) is the profit from your investments.
(a) Calculate the EU of each investment.
(b) Based on your utility level, if you had the opportunity to invest in only one of these firms, which would you pick, and why?

5. Expected utility–II.ᴬ Consider the situation in exercise 2, but suppose now that your utility function is
\[ u(\pi) = (50 + \pi)^2, \]
where \(\pi\) is the profit from your investments.
(a) Calculate the EU of each investment.
(b) Based on your utility level, if you had the opportunity to invest in only one of these firms, which would you pick, and why?

6. Risk attitudes.ᴬ After looking at the details of a lottery, you calculate that your EU from that lottery is EU = 150.
(a) After performing some additional calculations, you find that the utility that you would obtain if you instead received your EV is \(u(EV) = 175\). What is your attitude toward risk?
(b) What if, instead, you calculated the utility of your EV as \(u(EV) = 140\)? What is your attitude toward risk?
(c) What if, instead, you calculated the utility of your EV as \(u(EV) = 150\)? What is your attitude toward risk?

7. Risk aversion.ᴮ Suppose that you took part in a lottery that had a chance to increase, decrease, or have no effect on your level of income.
With probability 0.5, your income remains at its original level, $500. With probability 0.2, your income increases to $700, and with probability 0.3, your income decreases to $400. Your utility function is \(u(I) = I^{0.7}\), where \(I\) denotes your income level.
(a) Using only the utility function, show that your risk preferences are risk averse.
(b) Calculate both your EU and the utility equivalent of the EV of your income.
(c) Using the results from part (b), show that your risk preferences are risk averse.
(d) Suppose now that you had the option to either accept this lottery, or walk away with your initial $500. Should you accept the lottery? Why or why not?

8. Risk premium–I (A). Consider the situation in exercise 7.
(a) Calculate your CE.
(b) Calculate and interpret your risk premium. Is it consistent with risk aversion?

9. Risk loving (B). Suppose that you took part in a lottery that had a chance to increase, decrease, or have no effect on your level of income. With probability 0.3, your income remains at its original level, $200. With probability 0.2, your income increases to $300, and with probability 0.5, your income decreases to $0. Your utility function is \(u(I) = I^{2.5}\), where \(I\) denotes your income level.
(a) Using only the utility function, show that your risk preferences are risk loving.
(b) Calculate both your EU and the utility equivalent of the EV of your income.
(c) Using the results from part (b), show that your risk preferences are risk loving.
(d) Suppose now that you had the option to either accept this lottery, or walk away with your initial $200. Should you accept the lottery? Why or why not?

10. Risk premium–II (A). Consider the situation in exercise 9.
(a) Calculate your CE.
(b) Calculate and interpret your risk premium. Is it consistent with risk loving?

11. Risk neutrality (B). Suppose that you took part in a lottery that had a chance to increase, decrease, or have no effect on your level of income.
With probability 0.4, your income remains at its original level, $400. With probability 0.4, your income increases to $800, and with probability 0.2, your income decreases to $200. Your utility function is \(u(I) = 125 + 3I\), where \(I\) denotes your income level.
(a) Using only the utility function, show that your risk preferences are risk neutral.
(b) Calculate both your EU and the utility equivalent of the EV of your income.
(c) Using the results from part (b), show that your risk preferences are risk neutral.
(d) Suppose now that you had the option to either accept this lottery, or walk away with your initial $400. Should you accept the lottery? Why or why not?

12. Risk premium–III (A). Consider the situation in exercise 11.
(a) Calculate your CE.
(b) Calculate and interpret your risk premium. Is it consistent with risk neutrality?

13. Contracting an illness (B). Consider a situation in which you face a risk. You currently have $100,000 available for consumption, and with a 90 percent probability, you would suffer no illness. You have a 9 percent chance, however, of contracting a case of influenza, leading to the loss of $10,000 in consumption. In addition, there is a 1 percent chance that this is a severe illness, leading to the loss of $50,000 in consumption. Your utility from consumption is \(U(C) = C^{0.4}\), where \(C\) is your consumption level.
(a) What is your attitude toward risk? How do you know this?
(b) Suppose that you could purchase insurance against influenza. What is your CE?
(c) What is the maximum premium that you are willing to pay for insurance against influenza?
(d) What is your risk premium? How does this compare with your risk premium if you were risk neutral?

14. Purchasing full insurance (B). Suppose that Adam has an initial wealth of $100 and has the utility function \(u(I) = \sqrt{I}\), where \(I > 0\) denotes his income. Assume that he faces a 10 percent chance of suffering a car accident where he would lose $25.
He considers purchasing insurance to protect against his potential loss. He can buy \(a\) units of insurance for $0.10 per unit, and each unit purchased pays $1 if the accident occurs.
(a) What is Adam’s EU from buying \(a\) units of insurance?
(b) How many units of insurance, \(a\), does Adam purchase?

15. Not purchasing full insurance (B). Consider Adam’s situation in exercise 14, except now each unit of insurance costs $0.11.
(a) What is Adam’s expected utility from buying \(a\) units of insurance?
(b) How many units of insurance, \(a\), does Adam purchase in this scenario?

16. Arrow-Pratt coefficient–I (B). Consider an individual with the utility function \(u(I) = \log(I)\), where the log function refers to the natural logarithm and \(I\) is an individual’s income level.
(a) Calculate the Arrow-Pratt coefficient of risk aversion.
(b) Based on your results from part (a), what is this individual’s attitude toward risk?

17. Arrow-Pratt coefficient–II (B). Consider an individual with the utility function \(u(I) = \exp(I)\), where exp denotes the exponential function and \(I\) is an individual’s income level.
(a) Calculate the Arrow-Pratt coefficient of risk aversion.
(b) Based on your results from part (a), what is this individual’s attitude toward risk?

18. Weighted utility (B). Repeat exercise 13, but suppose that now you place additional weight on the chance of contracting a severe illness. Let \(x\) denote the outcome where you do not contract influenza, \(y\) denote the outcome where you contract a standard case of influenza, and \(z\) denote the outcome where you contract a severe case of influenza. Suppose that \(g(x) = 3\), \(g(y) = 3\), and \(g(z) = 4\). How does your risk premium change as you weigh the worst outcome more heavily?

19. Prospect theory–I (B). Suppose that your initial wealth is \(W\) = $50 and your utility function is \(u(W) = \sqrt{W}\). While out on a walk one day, you notice a $5 bill on the ground and pick it up.
(a) By how much does your utility increase?
(b) Suppose now that while walking home with your $55, you are stopped by a police officer for jaywalking and fined $5. By how much does your utility decrease? How does your utility now compare with your original utility?
(c) Repeat part (a), but suppose now that your utility function is \(u(W, W_0) = \sqrt{W - W_0}\) when you increase your wealth, where \(W_0\) represents your wealth before the event occurs (your reference point), and \(u(W, W_0) = -(W - W_0)^2\) when you decrease your wealth.
(d) Repeat part (b) with the utility function in part (c).
(e) Compare the results of parts (b) and (d). Under which situation are you worse off? Why?

20. Prospect theory–II (C). You are considering whether to invest in your brother’s business. He informs you that if you invest $10,000, you have a 75 percent chance of doubling your money after a year. When you ask him about the other 25 percent chance, he mumbles something about losing all your money.
(a) What is your EV for this investment?
(b) Suppose that you are risk averse, with a utility function of \(U(I) = \sqrt{I}\), where \(I\) represents the value of your investment. What is your EU of this investment?
(c) You remain skeptical of your brother’s abilities, so instead you decide to utilize a value function to calculate your most likely outcome. From prospect theory, you decide to weight the probability \(p = 0.75\) that your brother is successful by using the weighting function
\[ w(p) = \frac{p^{1/3}}{p^{1/3} + (1 - p)^{2/3}}. \]
In addition, you decide to use your initial investment as a reference point, and calculate your utility as \(U(I, 10{,}000) = \sqrt{I + 10{,}000}\) for a successful investment, and \(U(I, 10{,}000) = -(I - 10{,}000)^2\) for an unsuccessful investment. What is your value of this investment?
(d) Compare the results from parts (b) and (c). Are you more likely to invest in your brother’s business while using prospect theory? Explain.

21.
Gambling (A). Suppose that your grandfather enjoys spending all his free time (and money) playing the slot machines at his local casino. When you confront him, he explains to you that he is risk loving, and he can’t give up the thrill of taking a gamble. With what you know about risk premiums, how could you persuade your grandfather to curtail his gambling habits?

22. Exam pressure (A). Suppose that you arrive at your final exam in this class to find that the professor has reduced it to a single question. The question states that each student, in turn, will approach him and roll a fair 20-sided die. With a roll of 1, the student receives 0 points on the final, but on a roll of any other number, the student receives 100 points on the final.
(a) What is your expected score on the final?
(b) Suppose now that your professor offered to sell you some grade insurance. You could offer him some of your final exam points in exchange for avoiding the risk. What percentage of your points would you offer him?
- Note 1: There is no wrong answer to this question.
- Note 2: Your professor would never accept an offer less than 5 of your points, but he might not be fair, so don’t offer him too little!
(c) What does your offer in part (b) reveal about your attitude toward risk?

7 Production Functions

7.1 Introduction

After describing consumer decisions in previous chapters, this chapter and chapter 8 focus on firm decisions, such as how many units of output to produce and how many inputs to invest in. “Inputs” are factors of production that the firm can transform into units of output, such as labor (secretaries, chief executive officers, gardeners, or software engineers); capital (buildings, desks, computers, and software packages); and land. We start by measuring the average product that the firm obtains per unit of input, the additional product that the firm gains when adding 1 more unit of input (the marginal product), and the relationship between the two.
As in consumer theory, where we used indifference curves to illustrate combinations of two goods (x and y) that yield the same utility level for the consumer, we now seek to depict combinations of labor and capital that produce the same amount of output. We refer to these labor-capital combinations as the firm’s “isoquant.” Following our approach in consumer theory, we measure the slope of the firm’s isoquant, because it helps us understand the firm’s ability to substitute one unit of labor for one unit of capital while maintaining its output level unchanged. We then examine various production functions, which exhibit mathematical properties similar to those of the utility functions previously explored in consumer theory (chapter 2), such as the Cobb-Douglas production function, the linear production function, and the fixed-proportions production function.

We conclude the chapter with two applications. First, we measure returns to scale in the production process. If all inputs increase by the same proportion, returns to scale evaluate how much the firm’s output increases. Second, we test for technological progress. As the term indicates, this progress allows the firm to increase its output while using the same amount of inputs.

7.2 Production Function

In this section, we discuss how to represent the production of a firm as a function of its inputs, such as labor, capital, and land, and generally, any other element that the firm can transform into units of output.

Production function. A function representing how a certain amount of inputs is transformed into an amount of output \(q\).

For example, \(q = f(K, L)\) is a production function, as it describes how specific amounts of labor \(L\) and capital \(K\) are transformed into an amount of output \(q\). We next elaborate on examples of common production functions.
Example 7.1: Examples of production functions

The Cobb-Douglas function \(q = AK^{\alpha}L^{\beta}\) is relatively common, where parameter \(A\) is positive and parameters \(\alpha\) and \(\beta\) satisfy \(\alpha, \beta \in (0, 1)\). Let these parameters take the values \(A = 3\) and \(\alpha = \beta = 1/2\), and consider that the firm uses \(K = 4\) machines and \(L = 9\) workers. In this case, the maximum output that the firm can generate, given its Cobb-Douglas technology, is \(q = 3 \times 4^{1/2} \times 9^{1/2} = 18\) units. If, instead, we observe that the firm produces only 14 units of output using the previous combination of inputs (\(K = 4\) and \(L = 9\)), this indicates that the firm is not efficiently managing its available inputs, because it is not reaching the maximum possible output (\(q = 18\) units), given its current technology and input usage. We can then measure a firm’s efficiency as the ratio of the observed output that the firm produces to the potential output identified by the production function. In this example, efficiency would be \(14/18 \approx 0.78\). Alternatively, the firm would have an inefficiency level of \(1 - 0.78 = 0.22\).

Section 7.7, later in this chapter, elaborates on other types of production functions, but we briefly list them here. They include: (1) production function \(q = aK + bL\), where \(a\) and \(b\) are both nonnegative and capital and labor enter linearly; (2) production function \(q = A \min\{aK, bL\}\), where parameters \(A\), \(a\), and \(b\) are all positive and capital and labor must be used in a certain proportion; and (3) production function \(q = AK^{\alpha} + bL\), where parameters \(A\), \(\alpha\), and \(b\) are all positive, and where one input (in this case, labor) enters linearly and the other input enters nonlinearly.¹ The mathematical representation of these production functions is analogous to that of utility functions in consumer theory (chapter 2). Indeed, we have only changed labels: good x is now units of capital \(K\), whereas good y is now units of labor \(L\).
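The efficiency calculation in example 7.1 can be reproduced with a few lines of code. The following is a minimal sketch in Python; the function name `cobb_douglas` and the variable names are illustrative assumptions, while the parameter values (A = 3, α = β = 1/2, K = 4, L = 9, observed output of 14) come from the example.

```python
def cobb_douglas(K, L, A=3.0, alpha=0.5, beta=0.5):
    """Maximum output q = A * K**alpha * L**beta for given inputs."""
    return A * K**alpha * L**beta

# Potential (maximum) output with K = 4 machines and L = 9 workers.
potential = cobb_douglas(K=4, L=9)

# Observed output reported in the example.
observed = 14

# Efficiency is the ratio of observed to potential output.
efficiency = observed / potential
inefficiency = 1 - efficiency

print(f"potential output: {potential:.2f}")
print(f"efficiency:       {efficiency:.2f}")
print(f"inefficiency:     {inefficiency:.2f}")
```

Changing `observed` or the input levels shows how the efficiency ratio moves as the firm gets closer to, or farther from, the output that its technology allows.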
Self-assessment 7.1 Consider the firm discussed in example 7.1, but assume that its production function is \(q = 5K^{1/3}L^{2/3}\). What is the largest amount of output \(q\) that the firm can produce using \(L = 9\) and \(K = 4\) inputs? What if the production function changes to \(q = 7K + 4L\)? What if it changes to \(q = 5 \min\{2K, 3L\}\)? What if it changes to \(q = 4K^{1/2} + 3L\)?

1. Recall from chapter 2 that, when we say “a function is linear in a good” (an input, in this chapter), we just mean that the derivative of the function with respect to that input yields a constant (a number). That is, the derivative is no longer a function of the units of labor \(L\) or capital \(K\) that the firm uses. If, instead, we say “a production function is nonlinear in an input” (or, alternatively, “the input enters nonlinearly”), the derivative of the function with respect to that input yields an expression that still contains \(L\), \(K\), or both.

7.3 Marginal and Average Product

Here, we define two common measures to evaluate the productivity of a firm’s inputs: average product and marginal product.

Average product. The total units of output per unit of input. Hence, the average product of labor is \(AP_L = \frac{q}{L}\), whereas that of capital is \(AP_K = \frac{q}{K}\).

As an example, if a firm produces 100 units of output, and hires \(L = 4\) workers, its average product per worker is \(AP_L = \frac{100}{4} = 25\) units. In other words, every worker produces on average 25 units of output. This is frequently referred to in the media as “labor productivity,” in articles that report the growth of productivity over time. For instance, you might read that “U.S. labor productivity grew by only 1 percent last year,” indicating that the average production per worker increased by 1 percent. When labor productivity in a country grows, the ratio \(AP_L = \frac{q}{L}\) must go up, and this can occur for a number of reasons: (1) total output increases, while the number of employed workers remains constant, as a result of better technology and education; (2) total output remains unaffected, but workers are fired after firms automate the production process (replacing workers with machines, such as robots); and (3) total output increases and more workers are hired, but the former grows faster than the latter, thus increasing the ratio \(\frac{q}{L}\).

Figure 7.1 Production function \(q = 100\sqrt{L}\).

Figure 7.1 depicts the production function \(q = 100\sqrt{L}\), with units of labor \(L\) on the horizontal axis and total output on the vertical axis. (For simplicity, the figure includes a production function with only one input.) At point A, with \(L_A = 4\) workers, the total product becomes \(q_A = 100\sqrt{4} = 200\) units, entailing that the average product of this firm is given by \(\frac{q_A}{L_A} = \frac{200}{4} = 50\) units. Graphically, the average product at A coincides with the slope of the ray connecting the origin to point A. Similarly, at point B, where \(L_B = 16\) workers, the total product becomes \(q_B = 100\sqrt{16} = 400\) units, implying that the average product is \(\frac{q_B}{L_B} = \frac{400}{16} = 25\) units. Graphically, the ray connecting the origin to point B now becomes flatter than that connecting the origin to A, which should come as no surprise because the average product was cut in half.

Example 7.2: Finding average product

Consider a production function \(q = 5L^{1/2} + 3L - 6\). The average product of labor is found by dividing total output by units of labor, \(L\), as follows:
\[ AP_L = \frac{q}{L} = \frac{5L^{1/2} + 3L - 6}{L} = 5L^{1/2 - 1} + 3 - \frac{6}{L} = \frac{5}{L^{1/2}} + 3 - \frac{6}{L}. \]
We can now analyze the \(AP_L\) expression. First, as \(L\) increases, \(AP_L\) increases if the derivative
\[ \frac{\partial AP_L}{\partial L} = -\frac{5}{2L^{3/2}} + \frac{6}{L^2} \]
is positive, which occurs when \(\frac{6}{L^2} > \frac{5}{2L^{3/2}}\). After rearranging, this condition simplifies to \(L^{3/2 - 2} > \frac{5}{12}\), and to \(\frac{12}{5} > L^{1/2}\). Squaring both sides, we find \(L < \frac{144}{25} = 5.76\) workers. In other words, \(AP_L\) increases in \(L\) for all \(L < 5.76\) workers, but it decreases in \(L\) beyond that point.
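The sign analysis in example 7.2 can be checked numerically. The following minimal sketch, in Python, evaluates the average product on a grid of labor values and reports where it is largest. The grid (1 to 20 workers in steps of 0.01) and the function name `average_product` are illustrative assumptions, not part of the example.

```python
def average_product(L):
    """AP_L = q / L for the production function q = 5*L**0.5 + 3*L - 6."""
    return (5 * L**0.5 + 3 * L - 6) / L

# Illustrative grid of labor values: 1.00, 1.01, ..., 20.00.
grid = [step / 100 for step in range(100, 2001)]

# Labor level on the grid at which the average product is largest.
best_L = max(grid, key=average_product)

print(f"AP_L is largest near L = {best_L:.2f}")
print(f"AP_L at that point is about {average_product(best_L):.4f}")
```

The grid search should land at L = 5.76, in line with the threshold 144/25 found analytically, and values of L on either side of it yield a lower average product.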
Therefore, \(AP_L\) reaches its maximum when \(\frac{\partial AP_L}{\partial L} = 0\), which occurs at \(L = 5.76\) workers.

Self-assessment 7.2 Consider the firm in example 7.2, but assume now that its production function changes to \(q = 7L^{1/3} + 4L - 2\). Find the average product, \(AP_L\), and the labor at which \(AP_L\) reaches its maximum.

We next analyze how the total output of the firm increases as it utilizes an additional unit of input.

Marginal product. The rate at which total output increases as the firm uses an additional unit of either input. The marginal product of labor is \(MP_L = \frac{\Delta q}{\Delta L}\) when labor is discrete, or \(\frac{\partial q}{\partial L}\) when it is continuous; whereas the marginal product of capital is \(MP_K = \frac{\Delta q}{\Delta K}\) when capital is discrete, or \(\frac{\partial q}{\partial K}\) when it is continuous.

Hence, we can find the marginal product of an input by differentiating the production function \(q = f(L, K)\) with respect to that input. Because the derivative of a function at a point coincides with the slope of its tangent line at that point, we can graphically interpret the marginal product of an input (e.g., labor) as the slope of the function when we marginally increase the amount of that input. Figure 7.2a depicts the production function \(q = 100\sqrt{L}\) again. The marginal product of labor (e.g., hiring more workers) is measured by the derivative of this function with respect to \(L\); that is,
\[ MP_L = \frac{\partial q}{\partial L} = 100 \cdot \frac{1}{2} L^{1/2 - 1} = \frac{50}{L^{1/2}}, \]
because \(L^{1/2 - 1} = L^{-1/2}\), and \(L^{-1/2}\) can also be expressed as \(\frac{1}{L^{1/2}}\). Figure 7.2b depicts marginal product \(MP_L = \frac{50}{L^{1/2}}\) as a function of the number of workers that the firm hires, \(L\). For instance, at point A, where the firm hires \(L_A = 4\) workers, the marginal product of hiring more workers becomes \(MP_L = \frac{50}{4^{1/2}} = \frac{50}{2} = 25\) units. That is, when the firm hires only 4 workers, it can increase its total output by 25 units by hiring an additional worker. In contrast, at point B, where the firm already employs \(L_B = 16\) workers, \(MP_L\) becomes \(MP_L = \frac{50}{16^{1/2}} = \frac{50}{4} = 12.5\) units. Intuitively, the additional output that each extra worker brings is positive, but it decreases as the firm hires more and more workers. In short, \(MP_L\) is diminishing. This could happen, for instance, if there are “too many cooks in the kitchen.”

Figure 7.2 (a) Depicting \(MP_L\) as the slope of the production function. (b) Representing \(MP_L\) directly.

Example 7.3: Finding marginal product

Consider the same production function as in example 7.2, \(q = 5L^{1/2} + 3L - 6\). Next, we find its marginal product of labor, showing that it is decreasing in \(L\), as in our above discussion. First, let us find \(MP_L\) by differentiating the production function with respect to \(L\):
\[ MP_L = \frac{\partial q}{\partial L} = 5 \cdot \frac{1}{2} L^{1/2 - 1} + 3 = \frac{5}{2L^{1/2}} + 3. \]
To check if \(MP_L\) decreases in \(L\), we can calculate its derivative as follows:
\[ \frac{\partial MP_L}{\partial L} = -\frac{5}{4L^{3/2}}, \]
which is negative because \(L > 0\). Hence, as \(L\) increases, the marginal product of labor, \(MP_L\), decreases. Essentially, additional workers bring more production to the firm, but at a decreasing rate.

Self-assessment 7.3 Consider the firm in example 7.3, but assume that the firm’s production function changes to \(q = 7L^{1/3} + 4L - 2\). Find the marginal product, \(MP_L\), and check if it increases or decreases in labor.

7.4 Relationship between \(AP_L\) and \(MP_L\)

The average and marginal products exhibit some interesting relationships:
1. When the \(AP_L\) curve is increasing, \(MP_L\) lies above \(AP_L\);
2. When the \(AP_L\) curve is decreasing, \(MP_L\) lies below \(AP_L\); and
3. When the \(AP_L\) curve is flat (at its highest point), the \(MP_L\) curve crosses \(AP_L\).

To understand these relationships, let us consider grades in a class rather than output. Assume that you take a midterm exam in one of your classes, and, a few days later, the instructor shows up with a stack of graded exams, letting you know that your average grade in the class will go up as a consequence of the midterm. Great news! What does that say about your performance on the midterm exam?
Of course, it is saying that your grade on the midterm exam must be better than your previous average (a result you would know even without taking intermediate micro!). In other words, for your average grade in the class to increase, it must be the case that the grade in your midterm exam is higher than your previous average, or alternatively, that the marginal effect of the last grade is higher than your average. This is analogous to what happens with output: for the average product of labor to increase in \(L\) (as the firm hires one more worker), it must be that the newly hired worker produces more output than previous workers did on average. In short, \(MP_L > AP_L\), as in the midterm grade example.

The opposite applies when the midterm exam decreases your average grade in the class. In that case, the midterm grade must be lower than your previous average. In the context of output, the product of the newly hired worker is lower than that of previous workers on average, or \(MP_L < AP_L\).

Lastly, if the instructor of the class informs you that your grade is unaffected by your midterm score, it means that your score exactly coincides with your previous average. In the context of the firm, the newly hired worker is as productive as previous workers are on average.

An immediate consequence of this argument is that the \(MP_L\) curve crosses the \(AP_L\) curve at the maximum point (the peak) of the \(AP_L\) curve. We can easily show this result for any production function \(q = f(L)\). First, note that the average product per worker is \(AP_L = \frac{q}{L} = \frac{f(L)}{L}\). To find the number of workers, \(L\), at which \(AP_L\) reaches its maximum, we differentiate \(AP_L\) with respect to \(L\) and set our result equal to zero, as follows:
\[ \frac{\partial AP_L}{\partial L} = \frac{f'(L)L - 1 \cdot f(L)}{L^2} = 0, \]
where, because \(AP_L = \frac{f(L)}{L}\), we find the derivative using the quotient rule.² Note that, for compactness, we use \(f'(L)\) to denote the derivative of total output \(f(L)\) with respect to \(L\).
As this derivative is the marginal product, \(MP_L = \frac{\partial q}{\partial L}\), we can replace \(f'(L)\) with \(MP_L\), as follows:
\[ \frac{MP_L \cdot L - f(L)}{L^2} = \frac{MP_L}{L} - \frac{f(L)}{L^2} = 0. \]
Multiplying both sides by \(L\), we obtain
\[ MP_L - \frac{f(L)}{L} = 0. \]
Finally, note that the second term is the average product per worker, \(AP_L = \frac{f(L)}{L}\), which allows us to write this expression more compactly as \(MP_L = AP_L\). This equation tells us that, at the maximum of the \(AP_L\) curve, the \(MP_L\) curve crosses the \(AP_L\) curve. Figure 7.3 illustrates this discussion: when the \(AP_L\) curve is increasing, \(MP_L\) lies above \(AP_L\); when the \(AP_L\) curve is decreasing, \(MP_L\) lies below \(AP_L\); and when the \(AP_L\) curve is flat (at its peak), the heights of \(MP_L\) and \(AP_L\) coincide.

2. Recall the quotient rule. For a function \(f(x) = \frac{g(x)}{h(x)}\), where both the numerator, \(g(x)\), and the denominator, \(h(x)\), are functions of \(x\), the quotient rule says that the derivative of \(f(x)\) is \(f'(x) = \frac{g'(x)h(x) - g(x)h'(x)}{(h(x))^2}\). In our scenario, the quotient is given by \(AP_L = \frac{f(L)}{L}\), so the numerator \(f(L)\) plays the role of \(g(x)\) in the quotient rule, while \(L\) plays the role of \(h(x)\).

Figure 7.3 The \(AP_L\) and \(MP_L\) curves.

Example 7.4: Relationship between \(AP_L\) and \(MP_L\)

Consider the production function analyzed in examples 7.2 and 7.3. As shown in example 7.2, \(AP_L = \frac{5}{L^{1/2}} + 3 - \frac{6}{L}\) reaches its maximum at \(L = \frac{144}{25} = 5.76\), where its height becomes
\[ AP_L = \frac{5}{(5.76)^{1/2}} + 3 - \frac{6}{5.76} \approx 4.04. \]
If we evaluate the \(MP_L = \frac{5}{2L^{1/2}} + 3\) curve found in example 7.3 at exactly the same \(L = 5.76\), we obtain that the height of the \(MP_L\) curve is
\[ MP_L = \frac{5}{2(5.76)^{1/2}} + 3 \approx 4.04, \]
thus confirming that the \(MP_L\) curve crosses the \(AP_L\) curve at its maximum point.

Self-assessment 7.4 Consider the firm in self-assessments 7.2 and 7.3. Using the same steps as in example 7.4, find the point at which \(MP_L\) crosses the \(AP_L\) curve.

7.5 Isoquants

In this section, we examine a firm’s ability to substitute one input for another while maintaining the same level of output.
For instance, a firm may consider acquiring a packaging machine that does the job of three packaging workers. To evaluate that ability to substitute between inputs, we first present a definition of how to measure input combinations that yield the same output level.

Isoquant curve. It represents combinations of labor and capital that yield the same amount of output.

The isoquant is, then, analogous to the indifference curve in consumer theory. As discussed in chapter 2, the indifference curve of a consumer represents combinations of goods x and y for which she obtains the same utility level. In the context of production, an isoquant similarly reflects combinations of labor and capital that generate the same total output.³ Figure 7.4 depicts an example of an isoquant, where at point A, the firm uses an input combination that is relatively intense in capital (i.e., many units of capital and few of labor), whereas at point B, the firm uses a labor-intense input combination. Yet, at both points, the firm produces the same total output (100 units). At point C, however, the firm reaches a higher total output (200 units).

Figure 7.4 Isoquants: nonprofitable input combinations.

Figure 7.4 continues the isoquant curve upward and rightward (in the dark shaded areas), to illustrate that these shaded areas are regarded as unprofitable for the firm, and thus are never chosen by a rational manager. To see this, note that at points \(A_1\) and \(A_2\), the firm produces the same units of output (100 units) because they both lie on the same isoquant. However, point \(A_2\) uses more units of labor than point \(A_1\) (note that \(A_2\) is to the right side of \(A_1\)), and the same amount of capital (both points have the same height). As a consequence, the input combination at \(A_2\) must be more expensive to purchase than that at \(A_1\), and yet it produces the same total output!
No rational firm would then choose an input combination such as \(A_2\), an argument that applies to the upward-bending portion of the isoquant in the shaded area on the right. Similarly, points \(A_1\) and \(A_3\) generate the same units of output because they lie on the same isoquant, but \(A_3\) requires more capital than \(A_1\), thus implying that it is more expensive to purchase. As a result, the firm would never choose an input combination such as \(A_3\), an argument that extends to all points on the backward-bending portion of the isoquant in the shaded area at the top of the graph.

3. Graphically, if a production function \(q = f(L, K)\) is represented by a “production mountain” in three dimensions (3D), a horizontal slice of the production mountain at a specific height (at a particular output level) entails combinations of labor and capital that yield the same output (i.e., reach the same height on the mountain). The 3D figure of the production function could look like the figure discussed for utility functions in chapter 2 (figure 2.3a), while its slice at a given height would resemble the indifference curve in figure 2.3b.

The equation of the isoquant is found using the same approach followed for indifference curves in consumer theory (section 2.6 in chapter 2). As we illustrate in example 7.5, to find the isoquant curve for a specific output level, such as \(q = 100\) units, we only need to solve for the variable on the vertical axis (often, capital \(K\)).

Example 7.5: Finding isoquant curves for a Cobb-Douglas production function

Consider a Cobb-Douglas production function \(q = 5L^{1/2}K^{1/2}\), and let us find the isoquant corresponding to output level \(q = 100\) units. Inserting this output level into the production function (left side), we obtain \(100 = 5L^{1/2}K^{1/2}\), or \(20 = L^{1/2}K^{1/2}\). To solve for capital \(K\), we first square both sides, which yields \(20^2 = LK\), or \(400 = LK\). Lastly, we can solve for \(K\) to obtain the isoquant \(K = \frac{400}{L}\).
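The isoquant derived in example 7.5 can be verified by plugging points on it back into the production function. The following is a minimal sketch in Python; the function names and the particular labor levels tried are illustrative assumptions, while the production function and the output level q = 100 come from the example.

```python
def output(L, K):
    """Cobb-Douglas production function q = 5 * L**0.5 * K**0.5."""
    return 5 * L**0.5 * K**0.5

def isoquant_K(L, q=100.0):
    """Capital needed to produce q with L workers: K = (q / 5)**2 / L."""
    return (q / 5) ** 2 / L

# Every (L, K) pair on the isoquant should yield the same output, q = 100.
for L in (4, 10, 25, 100):
    K = isoquant_K(L)
    print(f"L = {L:>3}, K = {K:>6.2f}, q = {output(L, K):.2f}")
```

Passing a different output level, for instance `isoquant_K(L, q=200.0)`, traces a different isoquant of the same production function.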
Graphically, this isoquant is a curve that approaches the vertical axis when \(L\) is close to zero, but never crosses that axis; and that approaches the horizontal axis when \(L\) is large, without ever crossing that axis. (As an exercise, consider a linear production function \(q = 5L + 7K\), and find the isoquant corresponding to the output level \(q = 100\). You should obtain a straight line originating at \(K = \frac{100}{7}\), crossing the horizontal axis at \(L = \frac{100}{5}\), with a slope of \(-\frac{5}{7}\).)

Self-assessment 7.5 Consider the firm in example 7.5, but assume that it seeks to produce \(q = 200\) units. What is the firm’s isoquant now? What if its production function changes to \(q = L^{1/3}K^{2/3}\)? Interpret your results.

7.6 Marginal Rate of Technical Substitution

We next find the slope of the isoquant. If the firm were to add one extra worker, it would increase its total output. Then, we ask: How many units of capital must the firm give up to maintain its output level unaffected after hiring an extra worker? The slope of the isoquant answers this question.

Marginal rate of technical substitution (MRTS). After increasing the quantity of labor by 1 unit, the MRTS measures the amount by which capital must be reduced so that output remains constant.

Figure 7.5 Diminishing MRTS. (Points A, B, and C lie on the isoquant \(q = 100\) units, at \(L = 4\), \(K = 80\); \(L = 5\), \(K = 60\); and \(L = 6\), \(K = 50\), respectively.)

Figure 7.5 illustrates the MRTS for the same isoquant depicted in figure 7.4. At point A, the firm uses a large amount of capital and few workers to produce \(q = 100\) units of output. If it were to hire 1 more worker (moving rightward from \(L = 4\) to \(L = 5\)), the firm would need to reduce its capital usage significantly (moving downward from \(K = 80\) to \(K = 60\)) to maintain its current output level, thus moving along the isoquant from the initial point A to B.
Importantly, we must move along the isoquant because we seek to keep output unchanged.⁴ Let us repeat this process at point B, where the firm employs more workers but fewer units of capital than at A (\(L = 5\) and \(K = 60\)). If the firm hired 1 more worker, it would be willing to give up only a few units of capital to keep its output level unaffected. Intuitively, when capital is abundant and labor scarce (as in point A), the firm is willing to give up many units of capital to hire one more worker. However, as capital becomes more scarce (at point B), the firm is less willing to replace it with workers. (For completeness, appendix A, at the end of the chapter, shows that the slope of the isoquant is measured by the ratio of marginal products.)

4. Intuitively, a high MRTS indicates that, at point A, the firm hired so few workers that hiring one worker would increase output significantly, thus requiring a large capital reduction to keep output unchanged.

Example 7.6: Finding the MRTS of a Cobb-Douglas production function

Consider a firm with production function \(q = 8L^{1/2}K^{1/2}\). Its marginal product of labor is \(MP_L = 8 \cdot \frac{1}{2} L^{1/2 - 1}K^{1/2} = 4L^{-1/2}K^{1/2}\), and that for capital is \(MP_K = 8 \cdot \frac{1}{2} L^{1/2}K^{1/2 - 1} = 4L^{1/2}K^{-1/2}\). Hence, the MRTS in this scenario is
\[ MRTS = \frac{MP_L}{MP_K} = \frac{4L^{-1/2}K^{1/2}}{4L^{1/2}K^{-1/2}} = \frac{K}{L}, \]
which is decreasing in the units of labor (as \(L\) shows up only in the denominator).⁵ Graphically, this result entails that the slope of the isoquant (the MRTS) falls as we move rightward toward more units of \(L\). That is, the isoquant becomes flatter as we move rightward or, alternatively, the firm’s isoquant is bowed in toward the origin, as in figure 7.5.

Self-assessment 7.6 Consider a firm with production function \(q = 4L^{1/3}K^{2/3}\). Using the same steps as in example 7.6, find this firm’s MRTS. Interpret your results.
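The result MRTS = K/L in example 7.6 can be cross-checked without calculus by approximating each marginal product with a small finite difference. The following is a minimal sketch in Python; the step size `h`, the function names, and the input combinations tried are illustrative assumptions, while the production function comes from the example.

```python
def q(L, K):
    """Production function q = 8 * L**0.5 * K**0.5 from example 7.6."""
    return 8 * L**0.5 * K**0.5

def mrts(L, K, h=1e-6):
    """Approximate MRTS = MP_L / MP_K using central finite differences."""
    mp_l = (q(L + h, K) - q(L - h, K)) / (2 * h)  # marginal product of labor
    mp_k = (q(L, K + h) - q(L, K - h)) / (2 * h)  # marginal product of capital
    return mp_l / mp_k

# Holding capital fixed at K = 16, the MRTS should fall as labor increases,
# and each value should be close to the analytical ratio K / L.
for L in (2, 4, 8, 16):
    print(f"L = {L:>2}, K = 16, MRTS ~ {mrts(L, 16):.4f}, K/L = {16 / L:.4f}")
```

Because the numerical ratio tracks K/L at every point tried, the sketch also illustrates the diminishing MRTS: doubling labor while holding capital fixed halves the rate at which capital can be given up.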
Example 7.7: Finding the MRTS of a linear production function
Consider a firm with a linear production function $q = aL + bK$, where $a, b > 0$. Its marginal product of labor is $MP_L = a$, while that for capital is $MP_K = b$. Hence, the MRTS in this scenario is
$$MRTS = \frac{MP_L}{MP_K} = \frac{a}{b}.$$
In this context, the MRTS is not a function of the units of labor or capital that the firm uses. Indeed, the MRTS is just a constant (a number $\frac{a}{b}$). For instance, if parameters $a$ and $b$ took the values $a = 6$ and $b = 3$, then the MRTS would become $\frac{6}{3} = 2$, implying that the slope of the isoquant would be $-2$ at all its points. Graphically, the isoquant would then be a straight line (because its slope is constant).[6]

Self-assessment 7.7 Consider a firm with production function $q = 5L + 2K$. Using the same steps as in example 7.7, find this firm's MRTS. Assume now that its production function changes to $q = 5L + 4K$. What is the firm's MRTS now? How does it compare to the initial MRTS? Interpret your results in terms of capital productivity.

5. Formally, the derivative of the MRTS with respect to the units of labor, $L$, is $\frac{\partial MRTS}{\partial L} = -\frac{K}{L^2}$, which is negative for any number of units of labor, $L$, and capital, $K$, that the firm uses.
6. As described in the previous discussion of isoquants, to find the isoquant of production function $q = aL + bK$, we can solve for $K$ to find $K = \frac{q}{b} - \frac{a}{b}L$, where $\frac{q}{b}$ represents the vertical intercept of the isoquant on the $K$-axis, while $-\frac{a}{b}$ is the slope of the isoquant, thus confirming this result.

Figure 7.6 Linear production function. (Straight-line isoquant with $L$ on the horizontal axis and $K$ on the vertical axis; vertical intercept $\frac{q}{b}$, horizontal intercept $\frac{q}{a}$, and slope $-\frac{a}{b}$.)

7.7 Special Types of Production Functions

7.7.1 Linear Production Function

As described in previous sections, the linear production function takes the form $q = aL + bK$, where both $a$ and $b$ are positive parameters (numbers), such as in $q = 7L + 5K$. The isoquant of this production function is a straight line.
To see this, solve for $K$ in $q = aL + bK$, to obtain
$$K = \frac{q}{b} - \frac{a}{b}L,$$
where $\frac{q}{b}$ is the vertical intercept of the isoquant, while $\frac{a}{b}$ denotes its negative slope, as depicted in figure 7.6. For instance, consider a firm with production function $q = 7L + 5K$, and let us depict the isoquant of $q = 100$ units of output: the vertical intercept is $K = \frac{q}{b} = \frac{100}{5} = 20$,[7] whereas the negative slope is $\frac{a}{b} = \frac{7}{5}$. The slope is constant along all points of the isoquant because $\frac{a}{b}$ is not a function of $L$ or $K$. As a consequence, the MRTS is also constant given that the latter represents the slope of the isoquant. Intuitively, the firm can substitute units of capital and labor at the same rate, regardless of the number of each input that it employs. Hence, linear production functions can help us represent firms capable of easily substituting between inputs, such as two types of fuel (oil and natural gas), or two types of computers; that is, relative input usage does not change the firm's ability to substitute one input for another.

7. Recall that, to find the horizontal intercept of this isoquant, we just need to set capital equal to zero ($K = 0$), and solve for $L$. That is, $q = aL + b \cdot 0$, which yields the horizontal intercept $L = \frac{q}{a}$. For instance, in the production function $q = 7L + 5K$, if we depict the isoquant for $q = 100$ units of output, the horizontal intercept is $L = \frac{q}{a} = \frac{100}{7} \approx 14.29$ workers.

Self-assessment 7.8 Consider a firm with production function $q = 5L + 2K$. If the firm seeks to produce $q = 230$ units, find the isoquant, its vertical and horizontal intercepts, and its slope.

7.7.2 Fixed-Proportions Production Function

This type of production function is the polar opposite of the linear production function described previously. In this case, the firm cannot substitute between inputs and still maintain the same output level. Instead, the firm must use inputs in a fixed proportion to increase output.
As described earlier in this chapter, this production function takes the form $q = A\min\{aL, bK\}$, where parameters $A$, $a$, and $b$ are all positive. An example of this production function would be $q = \min\{2L, 3K\}$. Importantly, an increase in one input without a proportional increase in the other input will not result in an increase in production.

Figure 7.7 Fixed-proportions production function. (L-shaped isoquant with $L$ on the horizontal axis and $K$ on the vertical axis; the kink lies on the ray $K = \frac{a}{b}L$, where $aL = bK$; the vertical segment is at $L = \frac{q}{Aa}$ and the horizontal segment at $K = \frac{q}{Ab}$.)

Figure 7.7 illustrates that, depending on the amounts of labor and capital used, the firm faces one of the following three cases:
• If $\min\{aL, bK\} = aL$, which occurs because $aL < bK$, the output level becomes $q = AaL$. Solving for $L$ in $q = AaL$, we obtain that $L = \frac{q}{Aa}$. Graphically, this is a vertical line at a labor level of $L = \frac{q}{Aa}$ (note that it is a straight vertical line because ratio $\frac{q}{Aa}$ is not a function of $K$). In the example $q = \min\{2L, 3K\}$, if the firm produces $q = 100$ units, the vertical segment of the isoquant occurs when $2L < 3K$ or, solving for $K$, $\frac{2}{3}L < K$, where the vertical line lies at $L = \frac{q}{Aa} = \frac{100}{2} = 50$ workers.
• If $\min\{aL, bK\} = bK$, which occurs because $aL > bK$, the output level becomes $q = AbK$. Solving for $K$ in $q = AbK$, we obtain that $K = \frac{q}{Ab}$. Graphically, this is a horizontal line at a capital level of $K = \frac{q}{Ab}$ (note that it is a straight horizontal line since ratio $\frac{q}{Ab}$ is not a function of $L$). For instance, in the example where $q = \min\{2L, 3K\}$, if the firm seeks to produce $q = 100$ units, the horizontal segment of the isoquant occurs when $2L > 3K$ or, solving for $L$, $L > \frac{3}{2}K$, where the horizontal line lies at $K = \frac{q}{Ab} = \frac{100}{3} \approx 33.3$ units of capital.
• If $aL = bK$, then $\min\{aL, bK\}$ is either $aL$ or $bK$, since both are the same number. In this case, the output level becomes $q = AaL = AbK$ because $aL = bK$. This occurs at the kink of the isoquant, where $aL = bK$, which, solving for $K$, yields a kink at $K = \frac{a}{b}L$. In the example $q = \min\{2L, 3K\}$, the kink happens at $K = \frac{2}{3}L$.
Graphically, this result means that the kinks of all isoquants are crossed by a ray from the origin with slope $\frac{2}{3}$.

Note that the MRTS of the fixed-proportions production function is not well defined, because we can find infinitely many slopes for the isoquant at its kink. We can nonetheless say that the slope of the isoquant is infinite in its vertical segment, and zero in its horizontal segment. This type of production function is common in firms that cannot easily substitute across inputs without altering their total output, such as firms in the chemical or pharmaceutical industry or firms with a highly automated production process. In the first type of firm, a switch from one chemical to another might alter the final product; and in the second type of firm, a reduction in the number of workers cannot be easily compensated for by an increase in machines without serious adjustments to the production process.

Self-assessment 7.9 Consider a firm with production function $q = 5\min\{3L, K\}$. If the firm seeks to produce $q = 200$ units of output, find and depict the isoquant.

7.7.3 Cobb-Douglas Production Function

This production function takes the mathematical form $q = AL^{\alpha}K^{\beta}$, where the parameters $A$, $\alpha$, and $\beta$ are all positive. For instance, if $A = 1$ and $\alpha = \beta = 1/2$, the production function becomes $q = L^{1/2}K^{1/2}$, which we frequently encountered throughout this chapter. In this scenario, the isoquant is found by squaring both sides of $q = L^{1/2}K^{1/2}$, which yields $q^2 = LK$, and solving for $K$ (the input on the vertical axis), we obtain $K = \frac{q^2}{L}$, as described in example 7.5. In addition, the slope of the isoquant (MRTS) becomes
$$MRTS = \frac{MP_L}{MP_K} = \frac{\frac{1}{2}L^{-1/2}K^{1/2}}{\frac{1}{2}L^{1/2}K^{-1/2}} = \frac{K^{1/2+1/2}}{L^{1/2+1/2}} = \frac{K}{L},$$
as discussed in example 7.6.

Self-assessment 7.10 Consider a firm with production function $q = 5L^{1/3}K^{2/3}$. Find the firm's MRTS. Then, assuming that the firm seeks to produce $q = 220$ units of output, find and depict the isoquant.
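As a rough numerical companion to sections 7.7.2 and 7.7.3, the Python sketch below locates the kink of a fixed-proportions isoquant and a point on a Cobb-Douglas isoquant. The helper names are assumptions made for illustration. The formulas $L = \frac{q}{Aa}$, $K = \frac{q}{Ab}$, and $K = \frac{q^2}{L}$ come from the discussion above; the last one is generalized here to $K = \left(\frac{q}{AL^{\alpha}}\right)^{1/\beta}$, which is obtained by solving $q = AL^{\alpha}K^{\beta}$ for $K$.

```python
def fixed_proportions_output(L, K, A=1.0, a=1.0, b=1.0):
    """Output of the fixed-proportions function q = A * min{aL, bK}."""
    return A * min(a * L, b * K)


def fixed_proportions_kink(q, A=1.0, a=1.0, b=1.0):
    """Kink of the L-shaped isoquant for output q: (L, K) = (q/(Aa), q/(Ab))."""
    return q / (A * a), q / (A * b)


def cobb_douglas_isoquant_K(q, L, A=1.0, alpha=0.5, beta=0.5):
    """Capital needed to produce q with L workers when q = A * L^alpha * K^beta."""
    return (q / (A * L**alpha)) ** (1.0 / beta)


if __name__ == "__main__":
    # Fixed proportions, q = min{2L, 3K}, isoquant for q = 100 (example from section 7.7.2).
    L_kink, K_kink = fixed_proportions_kink(100, A=1, a=2, b=3)
    print(f"kink at L={L_kink:.2f}, K={K_kink:.2f}")
    print("output at the kink:", fixed_proportions_output(L_kink, K_kink, A=1, a=2, b=3))

    # Cobb-Douglas, q = L^(1/2) K^(1/2); the output level q = 10 is a hypothetical choice.
    for L in (2, 4, 10, 25):
        print(f"L={L}: K on the q=10 isoquant = {cobb_douglas_isoquant_K(10, L):.2f}")
```

Adding labor beyond the kink of the fixed-proportions isoquant leaves output unchanged, whereas along the Cobb-Douglas isoquant every extra worker allows some reduction in capital.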
7.7.4 Constant Elasticity of Substitution Production Function

Lastly, we present a production function that exhibits a constant elasticity of substitution (CES), $\sigma$. (For more details about this elasticity, see appendix B at the end of this chapter.) This function has the following form:
$$q = \left(aL^{\frac{\sigma-1}{\sigma}} + bK^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}},$$
where the term $\sigma$ in the exponents represents the elasticity of substitution. An interesting property of this production function is that it embodies all previously discussed production functions as special cases. In particular, when the elasticity of substitution is $\sigma = +\infty$, the production function coincides with the linear production function, where the firm can easily substitute between inputs; when $\sigma = 0$, the production function converges to the fixed-proportions production function; and when $\sigma = 1$, the production function coincides with the Cobb-Douglas production function.[8]

8. Proving this result is not straightforward. For a step-by-step proof, see Muñoz-Garcia (2017), chapter 2, pages 49–51.

7.8 Returns to Scale

In this section, we analyze how a firm's output responds to a common increase in all of its inputs, which, for compactness, we refer to as "returns to scale." Intuitively, the firm increases its scale because it expands the number of all inputs being hired, but the question is whether its output level increases more or less than proportionally to the increase in input usage.

Returns to scale Consider that all inputs are increased by a common factor, $\lambda > 1$. Hence, $L$ is increased to $\lambda L$, and $K$ is increased to $\lambda K$. If the firm's output increases by a factor $\lambda^{a}$ such that:
• $\lambda^{a} > \lambda$ (which occurs when $a > 1$), we say that the firm exhibits increasing returns to scale.
• $\lambda^{a} < \lambda$ (which happens when $a < 1$), we say that the firm exhibits decreasing returns to scale.
• $\lambda^{a} = \lambda$ (which occurs when $a = 1$), we say that the firm exhibits constant returns to scale.
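The definition above can be checked numerically by scaling both inputs by $\lambda$ and comparing the resulting output with $\lambda$ times the original output. The Python sketch below does this for one input bundle; the function names, the tolerance, the bundle $(L, K) = (4, 9)$, and the specific exponents are illustrative assumptions. A check at a single bundle and a single $\lambda$ is only suggestive, although for production functions in which $\lambda$ factors out completely (such as the Cobb-Douglas and linear forms of section 7.7) the answer does not depend on the bundle chosen.

```python
import math


def scale_response(f, L, K, lam=2.0):
    """Factor by which output grows when both inputs are multiplied by lam."""
    return f(lam * L, lam * K) / f(L, K)


def classify_returns_to_scale(f, L, K, lam=2.0, rel_tol=1e-9):
    """Compare the output response with lam, following the definition in the text."""
    ratio = scale_response(f, L, K, lam)
    if math.isclose(ratio, lam, rel_tol=rel_tol):
        return "constant"
    return "increasing" if ratio > lam else "decreasing"


def implied_exponent(f, L, K, lam=2.0):
    """Exponent a such that output grows by lam**a at this bundle."""
    return math.log(scale_response(f, L, K, lam)) / math.log(lam)


if __name__ == "__main__":
    # Hypothetical technologies built from the functional forms of section 7.7.
    technologies = {
        "Cobb-Douglas, exponents 2/3 and 1/2": lambda L, K: L ** (2 / 3) * K ** (1 / 2),
        "Cobb-Douglas, exponents 1/3 and 1/2": lambda L, K: L ** (1 / 3) * K ** (1 / 2),
        "Linear, q = 7L + 5K": lambda L, K: 7 * L + 5 * K,
    }
    for name, f in technologies.items():
        kind = classify_returns_to_scale(f, L=4, K=9)
        a = implied_exponent(f, L=4, K=9)
        print(f"{name}: {kind} returns to scale (a = {a:.3f})")
```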
As an illustration, consider a firm doubling the units of all inputs ($\lambda = 2$). If output increases more than proportionally (i.e., output more than doubles), the firm's production function exhibits increasing returns to scale. If output increases less than proportionally (it falls short of doubling), we say that the firm's production function exhibits decreasing returns to scale. Finally, if output increases proportionally (exactly doubling), the firm's production function satisfies constant returns to scale. Example 7.8 illustrates how we can test for the existence of any of these three types of returns. First, we increase all inputs by the same factor $\lambda$ (multiplying all inputs by $\lambda$), and then simplify our results to obtain how much total output increases.

Example 7.8: Testing for returns to scale
Consider a Cobb-Douglas production function $q = AL^{\alpha}K^{\beta}$. If we increase all inputs by $\lambda$, labor becomes $\lambda L$, while capital is $\lambda K$, implying that total output is now
$$A(\lambda L)^{\alpha}(\lambda K)^{\beta} = A\lambda^{\alpha}L^{\alpha}\lambda^{\beta}K^{\beta},$$
which simplifies to
$$\lambda^{\alpha+\beta}\underbrace{AL^{\alpha}K^{\beta}}_{q} = \lambda^{\alpha+\beta}q,$$
where we factored $\lambda$ out on the left side, and used the definition $q = AL^{\alpha}K^{\beta}$. (As described previously, our goal when rearranging the previous expression was to factor the production function $q = AL^{\alpha}K^{\beta}$ out, so we could have $q$ and an additional term, in this case $\lambda^{\alpha+\beta}$, which will help us identify whether increasing, constant, or decreasing returns to scale are present.)

Hence, output increased by $\lambda^{\alpha+\beta}$, giving rise to three possible cases:
• If its exponent satisfies $\alpha + \beta > 1$, increasing returns to scale exist. This is the case, for instance, if $\alpha = \frac{2}{3}$ and $\beta = \frac{1}{2}$.
• If the exponent satisfies $\alpha + \beta < 1$, decreasing returns to scale are present; for example, if $\alpha = \frac{1}{3}$ and $\beta = \frac{1}{2}$.
• Finally, if the exponent is exactly equal to 1 ($\alpha + \beta = 1$), then constant returns to scale exist. This happens if, for example, $\alpha = \frac{1}{2}$ and $\beta = \frac{1}{2}$.

Consider now a linear production function $q = aL + bK$.
If we increase all inputs by $\lambda$, we find
$$a(\lambda L) + b(\lambda K) = \lambda\underbrace{(aL + bK)}_{q} = \lambda q,$$
implying that output increased proportionally to inputs, and thus the firm's production process exhibits constant returns to scale.

Lastly, consider a fixed-proportions production function, $q = A\min\{aL, bK\}$. If we increase all inputs by a common factor $\lambda$, we obtain
$$A\min\{a\lambda L, b\lambda K\} = \lambda\underbrace{A\min\{aL, bK\}}_{q} = \lambda q,$$
which also means that output responds proportionally to a given increase in inputs. That is, this production function exhibits constant returns to scale.

Self-assessment 7.11 Consider a firm with the production function $q = 5L^{1/3}K^{2/3}$. Use the same steps as in example 7.8 to find if this production function exhibits increasing, decreasing, or constant returns to scale. What if the firm's production function changes to $q = 7L + 8K$? What if it changes to $q = 3\min\{L, 2K\}$?

7.9 Technological Progress

In previous sections, we assumed that the firm's production function was unaffected by technological progress; however, firms often benefit from technological advances that allow them to produce a larger output using the same amount of inputs.[9] Mathematically, for a given pair of inputs $L$ and $K$, the firm can produce $q_1$ units of output before the technological progress occurred, but it can produce $q_2$ units of output afterward (where $q_2 > q_1$). For instance, if the firm's production function is $q = A_1L^{\alpha}K^{\beta}$ before the technological change, it could be $q = A_2L^{\alpha}K^{\beta}$ afterward, where $A_2 > A_1$; $q = A_1L^{\alpha+\gamma}K^{\beta}$, where the exponent on labor increased by $\gamma > 0$; or $q = A_1L^{\alpha}K^{\beta+\gamma}$, where now the exponent on capital increased by $\gamma > 0$.

9. Alternatively, technological progress lets firms produce the same number of units of output while using fewer units of inputs.

Example 7.9: Testing for technological progress
Consider a firm with a production function that changes from $q = L^{\alpha}K^{\beta}$ to $q = L^{2\alpha}K^{\beta}$.
The firm benefits from technological progress because $L^{\alpha}K^{\beta} < L^{2\alpha}K^{\beta}$, which, after dividing both sides by $K^{\beta}$ and raising them to the power $1/\alpha$, simplifies to $L < L^2$, a condition that holds true for any number of workers $L > 1$. Therefore, for a given pair of inputs $L$ and $K$, the firm can produce more units of output, confirming that technological progress exists.

Self-assessment 7.12 Consider a firm with the production function $q = 7L + 2K$, which changes to $q = 7L + 5K$. Is the firm experiencing technological progress?

7.9.1 Types of Technological Progress

Technological progress can be one of three types: labor enhancing, capital enhancing, or neutral, as we discuss next.

Labor-enhancing technological progress. We say that technological progress is labor enhancing if it increases the marginal product of labor more than that of capital (i.e., $\Delta MP_L > \Delta MP_K$). This can occur if higher education allows workers to be more productive, while capital does not increase its marginal product as quickly. As a result, $MRTS = \frac{MP_L}{MP_K}$ increases because its numerator grows faster than its denominator. Graphically, this implies that the firm's isoquant becomes steeper after the technological progress, thus indicating that the firm is willing to give up more units of capital to hire one more worker. This type of technological progress is also referred to as "capital saving," because the firm can get rid of machines and hire more labor, which became more productive in relative terms.

Capital-enhancing technological progress. In contrast, capital-enhancing technological progress means that capital increases its marginal product faster than labor does, $\Delta MP_L < \Delta MP_K$. This can happen if new robots or software programs make each unit of capital more productive, while labor does not increase its marginal product as much. In this case, $MRTS = \frac{MP_L}{MP_K}$ decreases, as its numerator grows more slowly than its denominator. Graphically, this implies that the firm's isoquant becomes flatter after the technological progress occurred, thus suggesting that the firm is willing to give up fewer units of the more productive capital to hire one more worker. This type of progress is also labeled as "labor saving," because the firm can fire some workers and purchase more capital, which became relatively more productive.

Neutral technological progress. Finally, technological progress is regarded as neutral if the marginal products of labor and capital increase by the same amount (i.e., $\Delta MP_L = \Delta MP_K$). In this scenario, $MRTS = \frac{MP_L}{MP_K}$ is unaffected, because its numerator grows by the same amount as its denominator. Graphically, this implies that the firm's isoquant does not change its slope after the technological progress occurred, implying that the firm is willing to give up the same number of units of capital as before to hire one more worker.

Example 7.10: Identifying the type of technological progress
Consider the firm in example 7.9. Before the technological change, $MRTS_{Pre} = \frac{\alpha}{\beta}\frac{K}{L}$, while afterward it becomes $MRTS_{Post} = \frac{2\alpha}{\beta}\frac{K}{L}$. Comparing them, we find that MRTS increased after the technological progress: $MRTS_{Pre} < MRTS_{Post}$, indicating that the technological progress is labor enhancing. This is also known as capital-saving progress because it provides incentives to the firm to replace capital by hiring more workers.

Self-assessment 7.13 Consider the firm in self-assessment 7.12. Find the MRTS before and after the technological change. Is this change labor saving, capital saving, or neutral?

Appendix A. MRTS as the Ratio of Marginal Products

In previous sections of this chapter, we discussed that the slope of the isoquant is the ratio of marginal products, $\frac{MP_L}{MP_K}$, which we labeled as the firm's MRTS. This appendix formally shows that, indeed, the slope of the firm's isoquant is measured by $\frac{MP_L}{MP_K}$. We follow an analogous proof to the one in chapter 2, where we showed that $MRS = \frac{MU_x}{MU_y}$ represents the slope of a consumer's indifference curve (see appendix A in that chapter).

Consider a firm with production function $q = f(L, K)$, using $L$ units of labor and $K$ units of capital to produce $q$ units of output. To evaluate the slope of the firm's isoquant, we simultaneously increase labor (for instance, by 1 unit) and decrease capital.[10] Hence, we totally differentiate the production function $q = f(L, K)$ with respect to $L$ and $K$ to obtain
$$dq = \frac{\partial f(L, K)}{\partial L}dL + \frac{\partial f(L, K)}{\partial K}dK.$$
Note that $\frac{\partial f(L, K)}{\partial L} = MP_L$ represents the marginal product of additional units of labor and, similarly, $\frac{\partial f(L, K)}{\partial K} = MP_K$ reflects the marginal product of additional units of capital. Hence, this expression becomes
$$dq = MP_L\,dL + MP_K\,dK.$$
Now, recall that, because we are moving along different points of the firm's isoquant, the output level is the same across all these points. Therefore, output does not change, entailing that $dq = 0$. Plugging this result into the left side of this expression, we obtain
$$0 = MP_L\,dL + MP_K\,dK$$
and, rearranging, $MP_K\,dK = -MP_L\,dL$. Lastly, because we are interested in finding the slope of the isoquant, we solve for $-\frac{dK}{dL}$. This reflects the rate at which the firm needs to decrease $K$ if $L$ increases by 1 unit.[11] Solving for $-\frac{dK}{dL}$ yields
$$-\frac{dK}{dL} = \frac{MP_L}{MP_K}.$$
Hence, the slope of the firm's isoquant, $-\frac{dK}{dL}$, coincides with the ratio of marginal products, $\frac{MP_L}{MP_K}$, which we refer to as the MRTS.

Appendix B. Elasticity of Substitution

A common measure of how easy it is for a firm to substitute labor for capital is the "elasticity of substitution," presented as follows:
$$\sigma = \frac{\%\Delta\frac{K}{L}}{\%\Delta MRTS} = \frac{\dfrac{\left(\frac{K}{L}\right)_B - \left(\frac{K}{L}\right)_A}{\left(\frac{K}{L}\right)_A}}{\dfrac{MRTS_B - MRTS_A}{MRTS_A}}.$$

10. We could also consider the opposite change in inputs, where labor decreases and capital increases.
Generally, we only need to change the amount of labor and capital simultaneously.
11. Starting from $MP_K\,dK = -MP_L\,dL$, we divide both sides by $-dL$, which produces $MP_K\left(-\frac{dK}{dL}\right) = MP_L$, and then divide both sides by $MP_K$ to obtain $-\frac{dK}{dL} = \frac{MP_L}{MP_K}$.

Figure 7.8 Finding MRTS at two points to obtain $\sigma$. (Isoquant with $L$ on the horizontal axis and $K$ on the vertical axis; point A at $L_A = 4$, $K_A = 20$; point B at $L_B = 10$, $K_B = 8$.)

The elasticity of substitution tells us that, if the MRTS increases by 1 percent, the capital-labor ratio that the firm uses, $\frac{K}{L}$, increases by $\sigma$ percent. Figure 7.8 depicts a firm's isoquant to illustrate this elasticity. Starting at point A, the firm uses a capital-labor ratio $\frac{K_A}{L_A} = \frac{20}{4} = 5$, and the isoquant has a slope of $MRTS_A = 6$. At point B, however, the capital-labor ratio decreases to $\frac{K_B}{L_B} = \frac{8}{10} = 0.8$ (because the firm uses less capital and more labor), and the isoquant has a flatter slope of $MRTS_B = 2$. Hence, the elasticity of substitution is
$$\sigma = \frac{\dfrac{\left(\frac{K}{L}\right)_B - \left(\frac{K}{L}\right)_A}{\left(\frac{K}{L}\right)_A}}{\dfrac{MRTS_B - MRTS_A}{MRTS_A}} = \frac{\frac{0.8 - 5}{5}}{\frac{2 - 6}{6}} = \frac{-0.84}{-\frac{2}{3}} = 1.26,$$
which means that if the MRTS decreases by 2/3 (about 67 percent), then the capital-labor ratio $K/L$ decreases more than proportionally, by 84 percent.

We can better understand the elasticity of substitution by examining the extreme cases in which the firm can very easily substitute inputs, and the case in which the firm cannot, as we discuss next.

Linear production function. If the firm has a linear production function $q = aL + bK$, its isoquants are straight lines, as described in section 7.7.1. In this scenario, the MRTS is constant along all the points of the isoquant (recall that the MRTS is equal to $\frac{a}{b}$), implying that the denominator of the elasticity of substitution formula, $\sigma$, is zero (i.e., there is no change in the MRTS). Therefore, regardless of the percentage change in the capital-labor ratio in the numerator of $\sigma$, the elasticity of substitution is infinite because
$$\sigma = \frac{\dfrac{\left(\frac{K}{L}\right)_B - \left(\frac{K}{L}\right)_A}{\left(\frac{K}{L}\right)_A}}{0} = +\infty.$$
Intuitively, the firm can substitute labor for capital very easily without altering its output level.

Fixed-proportions production function. Let us now consider the opposite case, where the firm faces a production function $q = A\min\{aL, bK\}$. In this setting, the MRTS changes drastically as we move rightward, from the vertical to the horizontal segment of the L-shaped isoquant as shown previously in figure 7.7. Therefore, there is a large change in the denominator of $\sigma$, producing a ratio in the formula of $\sigma$ that converges to zero. That is,
$$\sigma = \frac{\dfrac{\left(\frac{K}{L}\right)_B - \left(\frac{K}{L}\right)_A}{\left(\frac{K}{L}\right)_A}}{+\infty} = 0.$$
In this case, the firm cannot easily substitute units of labor for capital without affecting its output level.

Cobb-Douglas production function. Next, we show that the Cobb-Douglas production function $q = AL^{\alpha}K^{\beta}$ has an elasticity of substitution exactly equal to 1. First, we rewrite the definition of the elasticity of substitution, as follows:
$$\sigma = \frac{\%\Delta\frac{K}{L}}{\%\Delta MRTS} = \frac{\dfrac{\Delta\frac{K}{L}}{\frac{K}{L}}}{\dfrac{\Delta MRTS}{MRTS}} = \frac{\Delta\frac{K}{L}}{\Delta MRTS}\cdot\frac{MRTS}{\frac{K}{L}}.$$
Second, we find each of the four terms (two for the first ratio, and two for the second). Let us start by finding the MRTS in the Cobb-Douglas production function:
$$MRTS = \frac{MP_L}{MP_K} = \frac{\alpha AL^{\alpha-1}K^{\beta}}{\beta AL^{\alpha}K^{\beta-1}} = \frac{\alpha}{\beta}\frac{K}{L}.$$
Rearranging the expression of the MRTS we just found, $MRTS = \frac{\alpha}{\beta}\frac{K}{L}$, yields the capital-labor ratio $\frac{K}{L}$, as follows:
$$MRTS\,\frac{\beta}{\alpha} = \frac{K}{L}.$$
Applying increments on both sides, we obtain
$$\Delta MRTS\,\frac{\beta}{\alpha} = \Delta\frac{K}{L},$$
or, rearranging,
$$\frac{\Delta\frac{K}{L}}{\Delta MRTS} = \frac{\beta}{\alpha}. \tag{7.1}$$
From the expression of the MRTS, $MRTS = \frac{\alpha}{\beta}\frac{K}{L}$, we also know that
$$\frac{MRTS}{\frac{K}{L}} = \frac{\alpha}{\beta}. \tag{7.2}$$
Hence, inserting equations (7.1) and (7.2) into the definition of the elasticity of substitution, we obtain
$$\sigma = \underbrace{\frac{\Delta\frac{K}{L}}{\Delta MRTS}}_{\text{From (7.1)}}\cdot\underbrace{\frac{MRTS}{\frac{K}{L}}}_{\text{From (7.2)}} = \frac{\beta}{\alpha}\cdot\frac{\alpha}{\beta} = 1.$$
Therefore, the Cobb-Douglas production function $q = AL^{\alpha}K^{\beta}$ has an elasticity of substitution $\sigma = 1$, regardless of the specific value of parameters $A$, $\alpha$, and $\beta$.

CES production function.
Lastly, we find the elasticity of substitution for the production function presented in section 7.7.4, reproduced here:
$$q = \left(aL^{\frac{\sigma-1}{\sigma}} + bK^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}}.$$
To find its elasticity of substitution, we first obtain its MRTS, as follows:
$$MRTS = \frac{MP_L}{MP_K} = \frac{aL^{-\frac{1}{\sigma}}}{bK^{-\frac{1}{\sigma}}} = \frac{a}{b}\left(\frac{K}{L}\right)^{\frac{1}{\sigma}}.$$
Applying logs on both sides, we find
$$\ln(MRTS) = \ln\frac{a}{b} + \frac{1}{\sigma}\ln\frac{K}{L},$$
which, after rearranging, simplifies to
$$\frac{1}{\sigma}\ln\frac{K}{L} = \ln(MRTS) - \ln\frac{a}{b}.$$
Multiplying both sides by $\sigma$ yields
$$\ln\frac{K}{L} = \sigma\ln(MRTS) - \sigma\ln\frac{a}{b}.$$
Therefore, the elasticity of substitution between labor and capital is the derivative of this expression with respect to $\ln(MRTS)$; that is,
$$\frac{\partial\ln\frac{K}{L}}{\partial\ln(MRTS)} = \sigma,$$
coinciding with term $\sigma$ in the exponent of the CES production function.

Exercises

1. Properties of production functions (B). A firm uses only one input, labor ($L$), to produce output with production function $q(L) = 3L^2 + 0.5L - 0.6L^3$.
(a) Total product. For which values of $L$ does the total product curve $q(L)$ increase or decrease? For which values is it concave or convex in labor?
(b) Marginal product. For which values of $L$ does the marginal product curve $\frac{\partial q(L)}{\partial L}$ increase or decrease? For which values is it concave or convex in labor?
(c) Average product. For which values of $L$ does the average product curve $\frac{q(L)}{L}$ increase or decrease? For which values is it concave or convex in labor?
(d) Find the value of $L$ where the marginal product curve crosses the total product curve, and where it crosses the average product curve.
2. MP and AP curves (A). Sarah is looking into producing her homemade dog treats on a larger scale and is contemplating two different kitchen sizes ($K$). Her production of dog treats follows $q = 200KL + K^2L^3$. What are the marginal and average product curves for labor when $K = 5$? What happens to the marginal and average product curves when her kitchen doubles to $K = 10$?
3. Where the MP and AP curves cross (A). A firm has the production function $q = 50L - 2L^2 - 10$.
At what level of labor do the marginal product and average product curves cross?
4. Isoquants (B). Graph the isoquants for the following production functions for an output $q = 100$ units:
(a) $q = 10K + 5L$.
(b) $q = K^{0.75}L^{0.25}$.
(c) $q = 5\min\{10K, 5L\}$.
5. MP and changing capital (B). A firm produces stickers, $s$, using capital and labor in its production, with the production function $s = 10K + 20L^2 - K^3L^3$. Does the marginal product of labor increase or decrease as capital increases?
6. Cobb-Douglas and MRTS (A). Consider the Cobb-Douglas production function $q(K, L) = 5L^{0.75}K^{0.25}$. Find the marginal products of labor and capital, and the MRTS.
7. Linear production and MRTS (A). Jack produces water bottles and wants to know more about his production function. He finds that it follows the function $q(K, L) = 20K + 30L + 10KL$. Find the marginal products of labor and capital, as well as the MRTS.
8. Nonlinear production and MRTS (B). Jack (from exercise 7) is at the end of the lease in his current factory and is considering moving his production to a new space. With this new space comes a new production function: $q(K, L) = 0.5L + L^{0.6}K^{0.4}$. Find the marginal products of labor and capital, as well as the MRTS.
9. Returns to scale–I (A). Do the following production functions exhibit increasing, decreasing, or constant returns to scale?
(a) $q = 5K^{0.7}L^{0.3}$.
(b) $q = K^{0.5}L^{0.6}$.
(c) $q = 2K + 4L$.
10. Returns to scale–II (C). Do the following production functions exhibit increasing, decreasing, or constant returns to scale?
(a) $q = 2K + 4L + 5$.
(b) $q = 3K^{0.5}L^{0.5} - 2$.
(c) $q = 3K^{0.75}L^{0.25} + \sqrt{L}$.
11. Decreasing marginal returns (A). A bakery that specializes in cupcakes has the production function for cupcakes of $c = 10L - 0.5L^2$ in their current space. What is the firm's marginal product? At what amount of labor does the firm's output begin to decrease (i.e., when does there start to be "too many cooks in the kitchen")?
12.
Technological change (A). A firm has a production function that changes from $q = 7L + 2K$ to $q = 10L + 5K$. Is the firm experiencing technological progress? Find the MRTS before and after the technological change. Is this change labor saving, capital saving, or neutral?
13. Choosing production (A). Eric is a manager of a firm that produces playing cards. He is investing in a new technology and has two options with the resulting production functions: (a) $q = 10L^{0.5}K^{0.5}$, and (b) $q = 10(L^{0.5} + K^{0.5})$. If the firm has 100 units of capital, when would the firm prefer technology (a) over technology (b)?
14. Technology change with a Cobb-Douglas production function (A). Julie's Candy Factory produces candy bars with a production process that follows a Cobb-Douglas production function. She invests in a new technology that changes the production function from $q = L^{0.25}K^{0.5}$ to $q = L^{0.25}K^{0.75}$. Is Julie experiencing technological progress, or was her investment a waste of time and money? Find the MRTS before and after the technological change. Is this change labor saving, capital saving, or neutral?
15. CES production function and marginal products (B). Find the marginal products of labor and capital for the CES production function $q = \left(aL^{\frac{\sigma-1}{\sigma}} + bK^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}}$.
16. CES production and returns to scale (B). Find the returns to scale for the CES production function $q = \left(aL^{\frac{\sigma-1}{\sigma}} + bK^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}}$.
17. Choosing factors of production (B). Ashley is a producer of water purifiers for use in remote African villages and relies heavily on donations and volunteers in her production of water purifiers. In 1 hour, a worker can make 5 purifiers with 10 units of capital, or 2 workers can make 10 purifiers with 15 units of capital. As of Thursday evening, Ashley has 20 committed volunteers (for 5 hours on Saturday) and 100 units of capital. Ashley has 24 hours to round up more volunteers, more capital, or both. What should she do?
18.
Three inputs to production (C). Many firms have more than two inputs to their production functions. An example of this might be a microchip manufacturer with the production function $q = A^{0.58}L^{0.19}K^{0.23}$, where $A$ is the materials and energy, and $L$ and $K$ are labor and capital, respectively.
(a) What type of returns to scale does this production function exhibit?
(b) Find the marginal products of materials $MP_A$, labor $MP_L$, and capital $MP_K$.
(c) Find the MRTS between capital and material $MRTS_{A,K}$, and between capital and labor $MRTS_{L,K}$. Explain what the difference means.
19. Comparing MRTS (C). Tony and Chris are studying for exams to be certified public accountants (CPAs). Tony prefers reading laws and regulations ($R$), while Chris prefers practicing audits ($A$). Tony's score follows the function $S_T = 10A^{0.65}R^{0.35}$, while Chris's score follows $S_C = 10A^{0.4}R^{0.6}$.
(a) At what point will their MRTS between practicing audits and reading regulations be equal?
(b) Tony and Chris went to college together and know that each of them is studying for the CPA exam, and they plan to study together. Their score functions change by adding a variable representing the time they spend studying together, $T$, such that $S_T = 10A^{0.65}R^{0.35} + T^{0.5}$ and $S_C = 10A^{0.4}R^{0.6} + T^{0.5}$. Find the MRTS between the time spent studying together and their preferred method of studying. Can we tell who is willing to give up more of his preferred method of studying to study together? If so, who?
20. Comparing Cobb-Douglas production functions (C). Is it possible for two firms with different Cobb-Douglas production functions, $q_1 = AL^{\alpha}K^{\beta}$ and $q_2 = L^{\alpha}K^{\beta}$, to have the same marginal products at particular levels of capital and labor? What about each firm's MRTS?

8 Cost Minimization

8.1 Introduction

Isoquant curves analyzed in chapter 7 help us describe input combinations for which the firm reaches a specific output level.
However, isoquants did not allow us to find which exact input combination the firm chooses to minimize its costs. In this chapter, we combine isoquants (input combinations that help the firm reach a given output level) with isocost lines (input combinations entailing the same cost) to identify the optimal amounts of labor and capital that the firm hires.

The chapter begins by examining the isocost line and then combining it with isoquants to find the cost-minimizing amount of labor and capital that the firm acquires. For compactness, we refer to this input pair as "input demand." We then investigate whether a firm's input demand (e.g., the number of software developers that the firm hires) is decreasing in that input's price (e.g., the salary of software developers) but increasing in the price of other inputs (e.g., salaries of other types of employees, and the cost of capital). We also evaluate the firm's cost at the optimal units of labor and capital to obtain its "cost function." Finally, we decompose the firm's cost function into various types of costs (such as sunk, nonsunk, fixed, and variable), and analyze how to measure average and marginal costs. We conclude the chapter with a discussion of how costs vary when the firm increases in scale and when it adds more product lines.

8.2 Isocost Lines

In this section, we describe how to represent input combinations that result in the same total cost for the firm.

Isocost line The set of input combinations (i.e., pairs of labor and capital amounts) that yield the same total cost for the firm. That is, the combinations of $L$ and $K$ for which
$$TC = wL + rK,$$
where $w > 0$ denotes the price of every unit of labor (wage per hour), $r > 0$ represents the cost of each unit of capital (interest rate), and $TC$ is a given total cost that the firm incurs.

Figure 8.1a Isocost line. (Straight line $K = \frac{TC}{r} - \frac{w}{r}L$, with $L$ on the horizontal axis and $K$ on the vertical axis; vertical intercept $\frac{TC}{r}$, horizontal intercept $\frac{TC}{w}$, and slope $-\frac{w}{r}$.)
Figure 8.1a depicts the isocost line \(TC = wL + rK\) with units of L on the horizontal axis and units of K on the vertical axis. As usual, we solve for the variable on the vertical axis (K), to obtain
\[ K = \frac{TC}{r} - \frac{w}{r}L. \]
Hence, \(\frac{TC}{r}\) represents the vertical intercept of the isocost line, whereas \(\frac{w}{r}\) denotes its negative slope.¹ The firm faces a linear isocost line like that in figure 8.1a, regardless of its production function \(q = f(L, K)\), because the isocost line is just an accounting sum of costs (i.e., the costs that the firm incurs when hiring L workers plus those from acquiring K units of capital).

1. To find the horizontal intercept, recall that we only need to set the variable on the vertical axis (K) equal to zero in the equation of the isocost to obtain \(TC = wL + r \times 0\). Solving for L, we find a horizontal intercept of \(L = \frac{TC}{w}\).

An increase in the total cost that the firm incurs, TC, increases both the vertical intercept of the isocost line, \(\frac{TC}{r}\), and its horizontal intercept, \(\frac{TC}{w}\), without altering its slope, \(-\frac{w}{r}\), thus producing a parallel upward shift in the isocost line (moving northeast). Intuitively, as the firm can incur a larger cost, it can choose among higher input combinations.

If wages increase, the vertical intercept of the isocost, \(\frac{TC}{r}\), is unaffected, but its horizontal intercept, \(\frac{TC}{w}\), decreases, thus making the isocost steeper.² In other words, the firm can afford to hire fewer workers as their wages increase. If the interest rate r increases, the vertical intercept of the isocost, \(\frac{TC}{r}\), decreases, flattening the isocost as a result. Therefore, the firm can afford fewer units of capital as its price increases.

2. Alternatively, you can see this result by noticing that an increase in w decreases the ratio \(\frac{TC}{w}\), thus shifting the horizontal intercept of the isocost line leftward, without changing the vertical intercept \(\frac{TC}{r}\).
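The shifts described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original text: the function name `isocost` and the numbers (TC = 120, w = 6, r = 4) are hypothetical and chosen only so that the intercepts are round.

```python
from fractions import Fraction


def isocost(total_cost, wage, rental_rate):
    """Describe the isocost TC = wL + rK with K on the vertical axis.

    Returns (vertical intercept TC/r, horizontal intercept TC/w, slope -w/r).
    """
    vertical = Fraction(total_cost, rental_rate)
    horizontal = Fraction(total_cost, wage)
    slope = -Fraction(wage, rental_rate)
    return vertical, horizontal, slope


# Hypothetical baseline: TC = 120, w = 6, r = 4.
base = isocost(120, 6, 4)
# A larger total cost: both intercepts grow, the slope is unchanged (parallel shift).
larger_budget = isocost(240, 6, 4)
# A higher wage: same vertical intercept, smaller horizontal intercept, steeper line.
higher_wage = isocost(120, 12, 4)
# A higher price of capital: smaller vertical intercept, flatter line.
higher_rate = isocost(120, 6, 8)

for label, line in [("base", base), ("larger TC", larger_budget),
                    ("higher w", higher_wage), ("higher r", higher_rate)]:
    print(label, [str(x) for x in line])
```

Exact fractions are used instead of floats so that "unchanged" intercepts and slopes compare as exactly equal.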
Example 8.1: A particular isocost

Consider a firm facing a wage of w = $10, a price for capital of r = $15, and incurring a total cost of TC = $200. Its isocost line would be \(200 = 10L + 15K\). Solving for K yields
\[ K = \frac{200}{15} - \frac{10}{15}L, \]
as depicted in figure 8.1b. The first term in the equation of the isocost line, \(\frac{200}{15} \approx 13.3\), is the vertical intercept; the coefficient in the second term, \(\frac{10}{15} = \frac{2}{3}\), represents its negative slope; and \(\frac{200}{10} = 20\) is its horizontal intercept.

Figure 8.1b An example of an isocost line. (Line \(K = \frac{200}{15} - \frac{10}{15}L\); vertical intercept \(\frac{200}{15} \approx 13.3\), horizontal intercept \(\frac{200}{10} = 20\), slope \(-\frac{2}{3}\).)

Self-assessment 8.1 Consider the firm in example 8.1, but assume that wages double to w = $20. Find the firm's isocost in this scenario, and interpret how it changed relative to that in figure 8.1b.

8.3 Cost-Minimization Problem

In this section, we combine the isoquant learned in chapter 7 with the isocost discussed in section 8.2 to determine how many units of labor and capital the firm optimally hires. Figure 8.2 depicts an isoquant where the firm produces 100 units of output, with a set of isocost lines superimposed, each one associated with a different total cost, TC.

Figure 8.2 Cost-minimization problem (CMP). (Isoquant q = 100 units with isocost lines for total costs TC and TC'; points A, B, C, and D; L on the horizontal axis and K on the vertical axis.)

The cost-minimization problem (CMP) can be represented as follows:
\[ \min_{L,\,K} \; TC = wL + rK \quad \text{subject to} \quad 100 = f(L, K). \]
Intuitively, this problem asks the firm: Choose the input combination that minimizes your total cost TC, reaching an output level of q = 100 units. As illustrated in figure 8.2, the CMP entails pushing the isocost line inward, because isocost lines closer to the origin are associated with lower total costs, while still reaching the isoquant where the firm produces q = 100 units.³ Points like B or C in figure 8.2 cannot be cost minimizing because, while the firm reaches the isoquant of q = 100 units, it does so at a cost that could still be reduced by moving to points closer to A.
At point A, the firm minimizes its total cost, TC, and reaches the isoquant q = 100. Cheaper input combinations, such as that at point D, do not reach the target isoquant, q = 100, and thus do not solve this CMP. As a result, combinations of labor and capital minimizing the firm's cost require that the firm's isoquant is tangent to its isocost, which implies that the slopes of the isoquant (\(MRTS = \frac{MP_L}{MP_K}\)) and isocost (\(\frac{w}{r}\)) coincide; that is,
\[ \frac{MP_L}{MP_K} = \frac{w}{r}. \]

3. Recall that a decrease in TC shifts the isocost inward and in a parallel fashion, because TC is in the numerator of both the vertical and horizontal intercepts of the isocost line.

We can cross-multiply this expression to rewrite it as
\[ \frac{MP_L}{w} = \frac{MP_K}{r}. \]
Therefore, when minimizing its total cost, the firm rearranges its inputs until the point where the marginal product per dollar spent on additional units of labor (e.g., hiring one more worker), \(\frac{MP_L}{w}\), coincides with the marginal product per dollar spent on additional units of capital, \(\frac{MP_K}{r}\). Informally, the bang for the buck must be the same across all inputs.⁴

If this condition does not hold, the firm still has incentives to readjust its input combination. For instance, if \(\frac{MP_L}{w} > \frac{MP_K}{r}\), the firm could decrease its total costs (and still reach the target production level q = 100) by acquiring fewer units of capital, and using the savings to hire more workers, given that they provide a larger marginal product per dollar to the firm. This is what happens at point B, where the isoquant is steeper than the isocost, entailing that \(\frac{MP_L}{MP_K} > \frac{w}{r}\), or rearranging, \(\frac{MP_L}{w} > \frac{MP_K}{r}\), ultimately providing the firm with incentives to move southeast, towards point A, where it uses fewer units of capital and more units of labor.⁵ (For completeness, the appendix at the end of the chapter shows that solving the firm's CMP yields the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\) or, in its bang for the buck version, \(\frac{MP_L}{w} = \frac{MP_K}{r}\).)
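The tangency argument can also be seen numerically by walking along an isoquant and recording the cost of each input pair. The sketch below is only an illustration under assumed values that do not come from the text: a technology \(q = L^{1/2}K^{1/2}\), an output target of 10 units, and prices w = 1 and r = 4. It uses a coarse grid search rather than the analytical method, so it finds the best point on the grid only.

```python
# Hypothetical setting (not from the text): q = L**0.5 * K**0.5, target q = 10.
W, R, Q_TARGET = 1.0, 4.0, 10.0


def marginal_products(L, K):
    """Marginal products of labor and capital for q = L**0.5 * K**0.5."""
    mp_l = 0.5 * (K / L) ** 0.5
    mp_k = 0.5 * (L / K) ** 0.5
    return mp_l, mp_k


def capital_on_isoquant(L):
    """Capital needed to reach Q_TARGET when hiring L workers: K = q**2 / L."""
    return Q_TARGET ** 2 / L


def total_cost(L, K):
    return W * L + R * K


# Walk along the isoquant on an integer grid of L and keep the cheapest pair.
candidates = [(L, capital_on_isoquant(L)) for L in range(1, 101)]
best_L, best_K = min(candidates, key=lambda pair: total_cost(*pair))

# Bang for the buck at the cheapest pair: the two ratios coincide.
mp_l, mp_k = marginal_products(best_L, best_K)
print(best_L, best_K, total_cost(best_L, best_K), mp_l / W, mp_k / R)

# At a capital-heavy pair on the same isoquant (like point B), labor yields
# more output per dollar, so the firm gains by moving toward more labor.
b_mp_l, b_mp_k = marginal_products(10, capital_on_isoquant(10))
print(b_mp_l / W, b_mp_k / R)
```

In this hypothetical case the grid happens to contain the exact optimum; with other numbers the search would only approximate it, which is why the analytical tangency condition remains the method of choice.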
Similarly to consumer theory, we present a step-by-step procedure to solve the firm's CMP in tool 8.1, and after that, illustrate it with two numerical examples.⁶

4. This result is analogous to what we found in consumer theory, where the individual rearranged her purchases of goods x and y until the point where the marginal utility per dollar spent on an additional unit of good x coincided with that of good y.

5. The opposite argument applies to point C, where the isoquant is flatter than the isocost, implying that \(\frac{MP_L}{MP_K} < \frac{w}{r}\) or, after rearranging, \(\frac{MP_L}{w} < \frac{MP_K}{r}\). In this case, the firm has incentives to hire fewer workers and acquire more units of capital (moving northwest in the figure towards point A).

6. As with tool 3.1 in chapter 3, which solves the utility maximization problem (UMP) for the consumer, tool 8.1 applies when no input has a negative marginal product. If it does, the firm would hire zero units of that input, and use all its resources to hire units of the other input. For instance, if \(MP_K < 0\), the firm hires K = 0 units of capital and \(L = \frac{TC}{w}\) workers.

Tool 8.1. Procedure to solve the CMP:
1. Set the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\). Cross-multiply and simplify.
2. If the expression found from the tangency condition:
   a. Contains both unknowns L and K, then solve for K, and insert the resulting expression into the firm's output target \(q = f(L, K)\).
   b. Contains only one unknown (input L, input K, but not both), then solve for that unknown. Afterwards, insert your result into the firm's output target \(q = f(L, K)\).
   c. Contains no input L or K, then compare \(\frac{MP_L}{w}\) against \(\frac{MP_K}{r}\). If \(\frac{MP_L}{w} > \frac{MP_K}{r}\), then set K = 0 in the firm's output target, and solve for L. If, instead, \(\frac{MP_L}{w} < \frac{MP_K}{r}\), then set L = 0 in the firm's output target, and solve for K.
3. If in step 2 you find that one of the inputs is negative (e.g., L = −2), then set the amount of that input equal to zero on the firm's output target (e.g., \(q = a \times 0 + bK\)), and solve for the remaining input.
4. If you haven't found the values for all the unknowns L and K yet, then use the tangency condition from step 1 to find the remaining unknown.

Example 8.2: Cost minimization with Cobb-Douglas production functions

Consider a firm with the Cobb-Douglas production function \(q = L^{1/2} K^{1/2}\), seeking to reach an output level of q = 100 units, and facing input prices w = $40 and r = $10.

Step 1. We set the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\), which yields
\[ \frac{\frac{1}{2} L^{-1/2} K^{1/2}}{\frac{1}{2} L^{1/2} K^{-1/2}} = \frac{40}{10}, \]
or rearranging, \(\frac{K}{L} = 4\), which, solving for K, simplifies to \(K = 4L\). Because this result contains both inputs, K and L, we now move on to step 2a.

Step 2a. Inserting the result from step 1, \(K = 4L\), into the output target of the firm, q = 100, that is, \(100 = L^{1/2} K^{1/2}\), we obtain
\[ 100 = L^{1/2} (4L)^{1/2}, \]
where \(K = 4L\) from the tangency condition. Rearranging this, we obtain \(100 = 4^{1/2} L\) or, solving for L,
\[ L = \frac{100}{4^{1/2}} = \frac{100}{2} = 50 \text{ workers}. \]
Because the firm hires a positive number of workers, we can now move on to step 4. (Recall that we need to go on to step 3 only if we find a negative amount of either input.)

Step 4. Lastly, we can plug this result into the tangency condition \(K = 4L\), to find that \(K = 4 \times 50 = 200\) units of capital.

Summary. The cost-minimizing input combination is then (L, K) = (50, 200). Intuitively, the firm uses more capital than labor because labor is four times as expensive as capital, while their marginal productivities are symmetric.

Self-assessment 8.2 Repeat the analysis in example 8.2, but assume that wages decrease to w = $20. How are the results in example 8.2 affected?
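Steps 1, 2a, and 4 of tool 8.1 can be written out for a Cobb-Douglas technology \(q = L^{a} K^{b}\). The sketch below is an illustration, not the book's own code: the function name and the generic exponents a and b are assumptions, and it covers only the interior case in which both inputs come out positive. For this technology \(\frac{MP_L}{MP_K} = \frac{a}{b}\frac{K}{L}\), so the tangency condition gives \(K = \frac{b\,w}{a\,r} L\).

```python
def cobb_douglas_cmp(q, w, r, a=0.5, b=0.5):
    """Cost-minimizing (L, K) for q = L**a * K**b, assuming a, b, w, r, q > 0.

    Follows steps 1, 2a, and 4 of tool 8.1 (interior solution only).
    """
    # Step 1: tangency (a/b) * (K/L) = w/r, so K = k_per_l * L.
    k_per_l = (b * w) / (a * r)
    # Step 2a: insert K = k_per_l * L into q = L**a * K**b and solve for L:
    # q = k_per_l**b * L**(a + b).
    labor = (q / k_per_l ** b) ** (1.0 / (a + b))
    # Step 4: recover K from the tangency condition.
    capital = k_per_l * labor
    return labor, capital


# Numbers from example 8.2: q = 100, w = $40, r = $10, exponents 1/2 and 1/2.
L, K = cobb_douglas_cmp(q=100, w=40, r=10)
print(L, K, 40 * L + 10 * K)
```

With the values of example 8.2, the function reproduces (L, K) = (50, 200), at a total cost of $4,000.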
Example 8.3: Cost minimization with linear production functions

Consider the scenario of example 8.2, but assume that the firm's production function is linear, \(q = 2L + 8K\).

Step 1. We set the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\), yielding \(\frac{2}{8} = \frac{40}{10}\), which cannot hold because each side represents a different number! Our result from the tangency condition contains neither input, K or L, so we move on to step 2c.

Step 2c. We obtained that \(\frac{2}{8} < \frac{40}{10}\), which entails that \(\frac{MP_L}{MP_K} < \frac{w}{r}\) or, after cross-multiplying, \(\frac{MP_L}{w} < \frac{MP_K}{r}\). As a consequence, the firm increases its purchases of capital as much as possible, leading to a corner solution where the firm only purchases capital but no labor (L = 0).

Step 4. Inserting L = 0 into the output target of the firm, \(100 = 2L + 8K\) (recall that the firm seeks to produce q = 100 units of output), yields \(100 = (2 \times 0) + 8K\). Solving for K, we obtain \(K = \frac{100}{8} = 12.5\) units of capital.

Summary. The cost-minimizing input combination is (L, K) = (0, 12.5).

Self-assessment 8.3 Repeat the analysis in example 8.3, but assuming that wages decrease to w = $20. How are the results in example 8.3 affected?

8.4 Input Demands

Examples 8.2–8.3 identified a specific number of units of labor and capital being employed by the firm to minimize its costs, namely (L, K) = (50, 200) in example 8.2 and (L, K) = (0, 12.5) in example 8.3. We can now use that analysis in a more general setting, where input prices (w and r) are not specific numbers, and similarly, where the output q that the firm seeks to reach is not a concrete number of units. For illustration purposes, we reproduce example 8.2 in example 8.4, where the firm faced a Cobb-Douglas production function \(q = L^{1/2} K^{1/2}\), without assigning specific values to input prices w and r, or to output level q.

Example 8.4: Finding input demands with a Cobb-Douglas production function

Consider the firm in example 8.2 again. We follow a similar procedure as in that example.

Step 1. We set the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\), which yields
\[ \frac{\frac{1}{2} L^{-1/2} K^{1/2}}{\frac{1}{2} L^{1/2} K^{-1/2}} = \frac{w}{r} \]
or, rearranging, \(\frac{K}{L} = \frac{w}{r}\), which solving for K simplifies to \(K = \frac{w}{r} L\). Because this result contains both inputs, K and L, we now move on to step 2a.

Step 2a. We now insert the result from step 1, \(K = \frac{w}{r} L\), into the output target of the firm, \(q = L^{1/2} K^{1/2}\), to obtain
\[ q = L^{1/2} \left( \frac{w}{r} L \right)^{1/2}, \]
where the term in parentheses is K. Rearranging, we obtain \(q = \left( \frac{w}{r} \right)^{1/2} L\) and, solving for L, we find the firm's labor demand:
\[ L = \frac{q}{\left( \frac{w}{r} \right)^{1/2}} = \frac{q \sqrt{r}}{\sqrt{w}}. \]

Step 4. Finally, we plug labor demand \(L = \frac{q \sqrt{r}}{\sqrt{w}}\) into the tangency condition, \(K = \frac{w}{r} L\), to find that capital demand is
\[ K = \frac{w}{r} \cdot \frac{q \sqrt{r}}{\sqrt{w}} = \frac{q \sqrt{w}}{\sqrt{r}}. \]

If we evaluate these input demands at the parameter values we considered in example 8.2, q = 100 units, w = $40, and r = $10, we obtain the same results as in that example; that is, \(L = \frac{100 \sqrt{10}}{\sqrt{40}} = 50\) workers, and \(K = \frac{100 \sqrt{40}}{\sqrt{10}} = 200\) units of capital.

We can now do comparative statics of the labor and capital demands in example 8.4. Starting with labor demand \(L = \frac{q \sqrt{r}}{\sqrt{w}}\), we find that it is increasing in the number of units that the firm seeks to produce, q; decreasing in the salary that the firm pays to workers, w; and increasing in the price of capital, r. In other words, as the firm seeks to produce more units, it needs to hire more workers; as it faces higher salaries, it responds by hiring fewer workers; and as capital becomes more expensive, labor becomes relatively more attractive, and thus the firm responds by hiring more workers. Similar results apply to the demand for capital, \(K = \frac{q \sqrt{w}}{\sqrt{r}}\), which is also increasing in the number of units the firm produces, q; decreasing in the price of capital, r; but increasing in the price of labor, w.

Self-assessment 8.4 Repeat the analysis in example 8.4, but assume that the firm's production function changes to \(q = 4L^{1/3} K^{1/2}\). How are the results in example 8.4 affected? Interpret the results.
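The input demands of example 8.4 are easy to evaluate and to check for the comparative statics just described. A minimal sketch follows; the function names are illustrative choices, and the perturbed values (w = 50, r = 20, q = 120) are arbitrary numbers picked only to see the direction of each change.

```python
from math import isclose, sqrt


def labor_demand(q, w, r):
    """Labor demand from example 8.4: L = q * sqrt(r) / sqrt(w)."""
    return q * sqrt(r) / sqrt(w)


def capital_demand(q, w, r):
    """Capital demand from example 8.4: K = q * sqrt(w) / sqrt(r)."""
    return q * sqrt(w) / sqrt(r)


# Baseline of example 8.2: q = 100, w = $40, r = $10.
L0 = labor_demand(100, 40, 10)
K0 = capital_demand(100, 40, 10)
print(L0, K0)

# Direction of each response (arbitrary perturbations, one parameter at a time).
print("higher w:", labor_demand(100, 50, 10), capital_demand(100, 50, 10))
print("higher r:", labor_demand(100, 40, 20), capital_demand(100, 40, 20))
print("higher q:", labor_demand(120, 40, 10), capital_demand(120, 40, 10))
```

Each printed pair can be compared against the baseline: the own-price change lowers that input's demand, the other input's price change raises it, and a larger output target raises both.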
Example 8.5: Finding input demands with a linear production function

Consider the scenario of example 8.3, where the production function was \(q = 2L + 8K\).

Step 1. We set the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\), which yields \(\frac{2}{8} = \frac{w}{r}\). Because this result does not contain either input, K or L, we now move on to step 2c.

Step 2c. Comparing the marginal product per dollar across inputs, we obtain that
\[ \frac{MP_L}{w} < \frac{MP_K}{r} \quad \text{if} \quad \frac{2}{w} < \frac{8}{r}, \]
simplifying to \(\frac{1}{4} < \frac{w}{r}\), which induces the firm to hire no workers (L = 0). Otherwise, the marginal product per dollar spent on labor is higher than that on capital, entailing that the firm hires no capital (K = 0).

Step 4. If \(\frac{1}{4} < \frac{w}{r}\), the firm hires no workers (L = 0), which we can insert into the output target \(q = 2L + 8K\) (which is now evaluated at a generic output level q), yielding \(q = (2 \times 0) + 8K\). Solving for K, we obtain a demand for capital of \(K = \frac{q}{8}\), which is increasing in output. If, instead, \(\frac{1}{4} > \frac{w}{r}\), the firm hires no capital, K = 0, and its demand for labor is found by inserting K = 0 into the output target, which yields \(q = 2L + (8 \times 0)\), or \(L = \frac{q}{2}\), which is also increasing in output.⁷

In terms of comparative statics, note that labor and capital are increasing in the output q that the firm seeks to produce, as in example 8.4. However, an increase in salary w does not generally affect the firm's demand for labor and capital. There is, however, one scenario in which a higher salary w produces a change in input demands. When \(\frac{1}{4} > \frac{w}{r}\), the firm produces using labor alone, \((L, K) = \left( \frac{q}{2}, 0 \right)\); but if w increases enough to yield \(\frac{1}{4} < \frac{w}{r}\), the firm changes its input usage completely to \((L, K) = \left( 0, \frac{q}{8} \right)\), thus using capital alone.

Self-assessment 8.5 Repeat the analysis in example 8.5, but assume that the firm's production function changes to \(q = 3L + 8K\). How are the results in example 8.5 affected? Interpret.

7. When the marginal product per dollar coincides across inputs, \(\frac{1}{4} = \frac{w}{r}\), the isocost and isoquant overlap, which gives rise to a continuum of solutions (i.e., all points along the isoquant are optimal). That is, any point satisfying the equation of the isoquant, \(q = 2L + 8K\), is optimal.

Figure 8.3 (a) Labor demand. (b) Capital demand. (Panel (a) plots \(L = \frac{316.2}{\sqrt{w}}\) against w; panel (b) plots \(K = \frac{632.4}{\sqrt{r}}\) against r.)

8.4.1 Input Demand: Responses

Response to changes in its own price. The demand for an input decreases as its price increases, implying that the input demand has a negative slope. In example 8.4, for instance, the demand for labor decreases in w and the demand for capital decreases in r. Assuming q = 100 units and r = $10, the demand for labor in that example, \(L = \frac{q \sqrt{r}}{\sqrt{w}}\), becomes \(L = \frac{100 \sqrt{10}}{\sqrt{w}} \approx \frac{316.2}{\sqrt{w}}\), which is clearly decreasing in w, as depicted in figure 8.3a. Similarly, assuming q = 100 units and w = $40, the demand for capital \(K = \frac{q \sqrt{w}}{\sqrt{r}}\) simplifies to \(K = \frac{100 \sqrt{40}}{\sqrt{r}} \approx \frac{632.4}{\sqrt{r}}\), which is also decreasing in its own price r, as plotted in figure 8.3b.

The sensitivity of input demand to variations in its price is often measured using price elasticity. For the case of labor demand, its price elasticity is
\[ \varepsilon_{L,w} = \frac{\% \Delta L}{\% \Delta w} = \frac{\Delta L / L}{\Delta w / w} = \frac{\Delta L}{\Delta w} \frac{w}{L}, \]
or, if the change in salary w is infinitely small, \(\varepsilon_{L,w} = \frac{\partial L}{\partial w} \frac{w}{L}\), where \(\frac{\partial L}{\partial w}\) represents the slope of the labor demand curve. Intuitively, if salary w increases by 1 percent, the number of workers the firm hires changes by \(\varepsilon_{L,w}\) percent (a reduction, because this elasticity is negative). A similar expression applies to the elasticity of capital with respect to its price r, \(\varepsilon_{K,r} = \frac{\partial K}{\partial r} \frac{r}{K}\).

To better understand this elasticity, let us consider some of the common production functions examined in previous sections of the chapter. In the case of the fixed-proportion production function, input demand becomes vertical, as the firm does not change its input combination when input prices change.
Hence, labor demand does not respond to its price, \(\frac{\partial L}{\partial w} = 0\), yielding a zero labor elasticity, \(\varepsilon_{L,w} = 0\). (The demand curve is vertical when the input's price is measured on the vertical axis.) In contrast, when the firm faces a linear production function, its input demand is flat.⁸ In this case, a minor increase in the input's price leads the firm to stop using it altogether, so labor demand is infinitely elastic, \(\varepsilon_{L,w} = -\infty\). (Similar arguments apply to the demand for capital and its elasticity.)

Response to changes in the price of the other input. The demand for an input may increase as we increase the price of the other input, thus shifting upward. For instance, as labor becomes more expensive (higher wages), capital becomes more attractive. In example 8.4, for instance, the demand for capital \(K = \frac{q \sqrt{w}}{\sqrt{r}}\) increases in salaries, w; and, similarly, the demand for labor \(L = \frac{q \sqrt{r}}{\sqrt{w}}\) increases in the price of capital, r. Graphically, the demand function for labor (capital) in figure 8.3a (figure 8.3b) would shift outwards as the other input, capital (labor), becomes more expensive.

Response to changes in output. When the firm increases its demand for an input as it seeks to produce more units of output q, we say that such input is normal, whereas when its demand decreases in q, we say that the input is inferior. For aggregate categories of inputs, such as labor or capital, we rarely see firms using fewer of them as they seek to produce more output. However, if we disaggregate an input into more specific categories, we can quickly recognize some inputs as decreasing in output (inferior). For instance, consider a firm with different types of labor, such as chief executive officers, midlevel managers, sellers, accountants, secretaries, information technology personnel, and janitors. While the firm may initially hire more workers in all categories as it increases its output, it may be possible for the firm to sign software contracts when its output is large enough, and as a result fire some accountants or secretaries, who would become inferior inputs.
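For the Cobb-Douglas demands of example 8.4, the elasticity formula \(\varepsilon_{L,w} = \frac{\partial L}{\partial w} \frac{w}{L}\) can be approximated numerically. The sketch below is an illustration only: the helper `point_elasticity` is a hypothetical name, and it replaces the derivative with a central finite difference, so its output is an approximation. Because \(L = q\sqrt{r}\,w^{-1/2}\), the exact own-price elasticity is −1/2 at every wage, and the cross-price elasticity with respect to r is +1/2.

```python
from math import sqrt


def point_elasticity(demand, price, step=1e-4):
    """Approximate (d demand / d price) * price / demand with a central difference."""
    slope = (demand(price + step) - demand(price - step)) / (2 * step)
    return slope * price / demand(price)


# Labor demand of figure 8.3a: q = 100 and r = $10, as a function of w.
def labor_given_wage(w):
    return 100 * sqrt(10) / sqrt(w)


# Same labor demand with q = 100 and w = $40, now as a function of r.
def labor_given_rate(r):
    return 100 * sqrt(r) / sqrt(40)


own_price = point_elasticity(labor_given_wage, 40.0)
cross_price = point_elasticity(labor_given_rate, 10.0)
print(round(own_price, 6), round(cross_price, 6))
```

A 1 percent wage increase therefore lowers labor demand by roughly half a percent in this example, while a 1 percent increase in the price of capital raises it by roughly half a percent.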
8.5 Cost Functions

Total cost. The expenditures that a firm incurs when hiring the optimal amounts of labor and capital identified by its labor and capital demand; that is, \(TC = wL^* + rK^*\).

For illustration purposes, we identify the total cost that emerges from the labor and capital demands found in example 8.4 for a Cobb-Douglas production function.

8. Recall that a flat demand curve for an input entails that a minor change in its price leads the firm to stop using this input, and switch to use only the other input. Intuitively, because both inputs can be easily substituted without altering the number of units produced, q, the firm uses the input with the highest marginal product per dollar: only labor if \(\frac{MP_L}{w} > \frac{MP_K}{r}\), or only capital otherwise.

Example 8.6: Finding total cost in the Cobb-Douglas case

Recall that the labor demand found in example 8.4 was \(L = \frac{q \sqrt{r}}{\sqrt{w}}\), while the demand for capital was \(K = \frac{q \sqrt{w}}{\sqrt{r}}\). Hence, the total cost is
\[ TC = w \frac{q \sqrt{r}}{\sqrt{w}} + r \frac{q \sqrt{w}}{\sqrt{r}} = q w^{1/2} r^{1/2} + q r^{1/2} w^{1/2} = 2q \sqrt{rw}, \]
which increases as the firm produces more units of output, q, and as inputs become more expensive (higher r and/or w). If input prices take values w = $40 and r = $10, total cost simplifies to \(TC = 2q \sqrt{10 \times 40} = 40q\), which is a straight line with positive slope of 40. Considering the same output target as in example 8.4, q = 100 units, this total cost becomes TC = $4,000.

Self-assessment 8.6 Consider the labor and capital demands found in self-assessment 8.4. Find the total cost function TC, and then evaluate it at input prices w = $40 and r = $10. Compare it against TC = 40q we found in example 8.6. Interpret.

Example 8.7: Finding total costs in the linear production case

The input demands found in example 8.5 were L = 0 and \(K = \frac{q}{8}\) when input prices satisfy \(\frac{1}{4} < \frac{w}{r}\) (i.e., when \(r < 4w\)), but become \(L = \frac{q}{2}\) and K = 0 when input prices satisfy \(r > 4w\). Hence, when \(r < 4w\), the total cost is
\[ TC = w \times 0 + r \frac{q}{8} = r \frac{q}{8}, \]
which increases in the output that the firm seeks to produce, q, and in the price of capital, r, but it is independent of the price of labor w because the firm does not use labor in this case. If, instead, input prices satisfy \(r > 4w\), the firm only uses labor, and its total cost becomes
\[ TC = w \frac{q}{2} + r \times 0 = w \frac{q}{2}, \]
thus increasing in output q, and in wages w, but unaffected by the price of capital, r. Lastly, note that if w increases enough, the condition on input prices \(r > 4w\) can revert to \(r < 4w\), leading the firm to one of the cases studied in example 8.5 where it uses only capital, but no labor (i.e., L = 0 and \(K = \frac{q}{8}\)).

Self-assessment 8.7 Consider the labor and capital demands found in self-assessment 8.5. Find the total cost function TC in this scenario, and compare it against \(TC = r \frac{q}{8}\), found in example 8.7. Interpret.

8.6 Types of Costs

Explicit versus implicit costs. "Explicit costs" are those involving a direct outlay, and are thus included in balance sheets. "Implicit costs," however, do not necessarily involve direct outlays, but they reflect the opportunity cost of an input. Therefore, implicit costs consider the best alternative use of the input that the firm forgoes when dedicating that input to its production process. A common example of opportunity cost is that of studying for an undergraduate degree: your monetary outlays (either in cash or in debt) are the explicit cost of your education, whereas the forgone salary that you could earn in the years you dedicate to your degree is the opportunity cost (or implicit cost) of your degree.

Another highly cited example is that of Kaiser Aluminum. This firm initially signed a long-term electricity contract at a price of $23/MWh, which guaranteed Kaiser this purchasing price for several years. In 2001, a few months after the contract was signed, however, the price of electricity skyrocketed to $1,000/MWh. While the explicit cost of using a megawatt-hour of electricity was still $23, Kaiser's implicit cost (the opportunity cost of using electricity in aluminum production rather than selling it) was $1,000. Kaiser understood this difference and shut down its smelters for a few days to sell the electricity on the open market, which the contract allowed Kaiser to do.

Sunk versus nonsunk costs. Sunk costs cannot be recovered, even if the firm chooses to shut down its operations. For instance, if the firm rents the building where it operates, and the lease contract prohibits the firm from subletting the building to another party, rental payments can be considered as unrecoverable, and thus sunk.⁹ If, instead, the lease does not prohibit the firm from subletting the building, then the manager could sublet it, and recover a significant portion of the rental cost, thus making it a nonsunk cost. The cost of most raw materials and inputs is also nonsunk, because the firm can sell them back to its providers if it were to shut down its operations (recovering a portion of the cost).

9. Another example of sunk cost is that of specific investments, such as developing new tools and machines to be used in the production process, which only benefits the firm that develops them. If the firm were to shut down its operations, it would face many difficulties in finding other companies interested in these (completely new) tools, as they could be useless for firms in other industries and only of little use for other firms in the same industry.

Long-run versus short-run costs. In the long run, the firm is assumed to have enough time to vary the amount of all inputs as much as necessary. In the short run, however, the amount of at least one input is considered to be fixed, as its amount cannot be easily changed in a matter of only days or weeks. Most applications assume that in the short run capital is fixed, as varying the building size (or some machines) usually requires several weeks or months. Nonetheless, some industries might find that, in contrast, capital is relatively easy to vary, but labor is fixed in the short run. A typical example is faculty positions at universities (or top programmers in the technology industry). Acquiring a new computer or software (both being forms of capital) can be done in a matter of hours, whereas hiring a professor with long experience in a specific field would require the posting of ads, interviewing candidates at professional meetings and conferences, inviting a small group of potential candidates to visit the hiring institution, an offer to the selected candidate, and perhaps a subsequent negotiation about the details of the contract, a process that can take four to five months, if not longer.

Because in the short run the firm can vary the amount of fewer inputs, it has less freedom to minimize its costs, which ultimately entails that short-run costs are higher (or equal, but never lower) than long-run costs. For illustration purposes, example 8.8 revisits the firm in examples 8.4 and 8.5, holding capital fixed in the short run.

Example 8.8: Comparing long- and short-run costs

Consider a firm with Cobb-Douglas production function \(q = L^{1/2} K^{1/2}\), but assume that capital cannot be varied in the short run, being fixed at K = 150 units. To find the cost-minimizing units of labor (i.e., the firm's demand for labor in the short run), we only need to insert K = 150 into the firm's production function, \(q = L^{1/2} 150^{1/2}\), and solve for L. Squaring both sides yields \(q^2 = 150L\), and solving for L, we obtain the short-run demand for labor:
\[ L = \frac{q^2}{150}, \]
which increases in the output that the firm seeks to produce, q. In this context, the short-run total cost becomes
\[ STC = wL^* + rK = w \frac{q^2}{150} + r \times 150. \]
Considering the same input prices as in example 8.4, w = $40 and r = $10, this short-run total cost simplifies to \(STC = \$1{,}500 + \frac{4}{15} q^2\), which lies above the long-run total cost found in example 8.6, TC = 40q, as depicted in figure 8.4.¹⁰ If we assume an output target of q = 150 units, this short-run total cost simplifies to STC = $7,500, which is higher than the long-run total cost from example 8.6 evaluated at the same output, \(TC = 40 \times 150 = \$6{,}000\), as illustrated in figure 8.4.

Figure 8.4 Long-run versus short-run total costs. (STC(q) and TC(q) plotted against q; STC starts at $1,500 and is tangent to TC at q = 75.)

When the firm seeks to produce q = 75 units of output, the long- and short-run costs coincide; see the point of tangency on the graph where TC(75) = STC(75). Intuitively, this result indicates that to produce this volume of output, the firm has a capital demand of \(K = \frac{q \sqrt{w}}{\sqrt{r}} = \frac{75 \sqrt{40}}{\sqrt{10}} = 150\) units, which coincides with the fixed amount of capital that is given in the short run, K = 150. In other words, if in the short run the firm could choose the amount of capital it uses, it would choose exactly K = 150. For all other output levels, \(q \neq 75\) units, the fixed amount of capital K = 150 yields a larger cost than in the long run, STC(q) > TC(q), as depicted in the figure.

Self-assessment 8.8 Repeat the analysis in example 8.8, but assume that capital is fixed at K = 50 units. Find the firm's demand for labor, and its short-run cost function STC. Assume the same input prices as in example 8.8, w = $40 and r = $10, and compare STC against the firm's long-run total cost TC = 40q.

10. Graphically, the short-run cost originates at $1,500, which coincides with the fixed cost that the firm needs to incur even if it produces zero units of output. (Indeed, when q = 0, the firm is "stuck" with K = 150 units of capital that it cannot vary. Because the price of each unit is r = $10, its cost from capital alone is 150 × 10 = $1,500.) The short-run cost then increases at a rate of \(\frac{\partial STC}{\partial q} = \frac{8q}{15}\), which is itself increasing in q, thus exhibiting a convex shape.

Cheat sheet of short-run costs. As a trick to identify whether a particular cost is variable or fixed, and whether it is sunk or nonsunk, you can ask the following questions:
1. Does the cost increase when the firm increases its production?
   (a) If the answer is yes, then the cost is variable.
   (b) If the answer is no, the cost is fixed.
2. Does the firm incur a positive cost if it were to shut down its operations (setting output equal to zero)?
   (a) If the answer is yes, the cost is sunk because the firm cannot recover the cost.
   (b) If the answer is no, the cost is nonsunk because the firm can recover such a cost.

For examples of a variable and nonsunk cost, think about labor and raw materials; for examples of fixed but nonsunk costs, think about the heating that can be avoided if the firm shuts down its operations and turns off the thermostat; and for examples of fixed and sunk costs, think about the rental cost of a property that cannot be sublet.

8.7 Average and Marginal Cost

Average cost (AC). The total cost that the firm incurs per unit of output; that is, \(AC = \frac{TC}{q}\).

For instance, if the firm incurs $1,000 in total costs to produce 20 computer monitors, its average cost is \(\frac{1{,}000}{20} = \$50\) per monitor.

Marginal cost (MC). The rate at which total costs increase as the firm produces 1 more unit; that is, \(MC = \frac{\partial TC}{\partial q}\).

Graphically, MC measures the slope of the total cost curve: when TC increases, its slope must be positive, implying that MC is also positive (i.e., additional units of output increase the firm's total cost), while when TC decreases, the opposite argument applies.

The AC and MC curves exhibit a similar relationship to the one described for average and marginal product, AP and MP, where the MP curve crossed the AP curve at its maximum (see section 7.4 in chapter 7).
In the current scenario, the MC curve crosses the AC curve at its minimum. The intuition, based on average and marginal scores in a test, is the same as that for average and marginal products, so we leave that as an exercise for you.

Example 8.9: Finding average and marginal cost

Consider the firm with the Cobb-Douglas production function analyzed in example 8.4, where total cost is TC = 40q. This firm's average cost is then \(AC = \frac{40q}{q} = 40\), and its marginal cost is \(MC = \frac{\partial (40q)}{\partial q} = 40\). Hence, the average and marginal cost curves are both constant and coincide, being graphically depicted in figure 8.5 by a horizontal line at a height of $40. (This is a common feature of firms with a Cobb-Douglas production function \(q = AL^{\alpha} K^{\beta}\) that exhibits constant returns to scale, \(\alpha + \beta = 1\), which yields constant average and marginal cost curves.)

In the case of the linear production function from example 8.7, we found that the total cost is \(TC = r \frac{q}{8}\) when input prices satisfy \(\frac{1}{4} < \frac{w}{r}\), that is, \(r < 4w\) (i.e., labor is expensive relative to capital), but changes to \(TC = w \frac{q}{2}\) when input prices satisfy \(r > 4w\) (labor is relatively cheap). In this context, average cost becomes \(AC = \frac{r \frac{q}{8}}{q} = \frac{r}{8}\) when \(r < 4w\), but \(AC = \frac{w \frac{q}{2}}{q} = \frac{w}{2}\) when \(r > 4w\). Similarly, marginal cost is \(MC = \frac{\partial \left( r \frac{q}{8} \right)}{\partial q} = \frac{r}{8}\) when \(r < 4w\), but \(MC = \frac{\partial \left( w \frac{q}{2} \right)}{\partial q} = \frac{w}{2}\) when \(r > 4w\). Hence, both AC and MC are constant in output, q, and coincide for a given pair of input prices, w and r. Graphically, AC and MC would overlap, both being depicted by a horizontal, flat line.

Figure 8.5 AC and MC with a Cobb-Douglas production function. (Horizontal line AC(q) = MC(q) = $40 plotted against q.)

Self-assessment 8.9 Assume a firm with total cost \(TC = 30q^2\). Find its average and marginal cost functions, and depict them against q.

8.7.1 Output Elasticity to Total Cost

As discussed previously, the marginal cost \(MC = \frac{\partial TC}{\partial q}\) measures how much total cost increases if the firm increases its output by 1 unit. While this measure is useful, it is not unit-free.
To illustrate this point, consider a firm producing computer monitors in the US, and another firm producing cars in Germany. The MC from the first firm would be in dollars/monitor, whereas that of the second firm would be in euros/car, which are not easily comparable. As in previous chapters, we can apply the definition of elasticity in this context to obtain a unit-free measure of how total cost changes in output (output elasticity to total cost), as follows:

\[ \varepsilon_{TC,q} = \frac{\%\Delta TC}{\%\Delta q} = \frac{\Delta TC / TC}{\Delta q / q} = \frac{\Delta TC}{\Delta q}\,\frac{q}{TC}, \]

or \(\varepsilon_{TC,q} = \frac{\partial TC}{\partial q}\frac{q}{TC}\) when the change in output q is small. In addition, because \(MC = \frac{\partial TC}{\partial q}\), we can rewrite this elasticity as \(\varepsilon_{TC,q} = MC\,\frac{q}{TC}\). Lastly, as the average cost is \(AC = \frac{TC}{q}\), its inverse is \(\frac{1}{AC} = \frac{q}{TC}\), which allows us to express this elasticity as \(\varepsilon_{TC,q} = MC\,\frac{1}{AC}\), or more compactly, as

\[ \varepsilon_{TC,q} = \frac{MC}{AC}, \]

which is a function of MC and AC alone. When MC > AC, elasticity \(\varepsilon_{TC,q} = \frac{MC}{AC}\) satisfies \(\varepsilon_{TC,q} > 1\), indicating that total costs increase more than proportionally to a 1 percent increase in output. In contrast, when MC < AC, elasticity \(\varepsilon_{TC,q} = \frac{MC}{AC}\) must be \(\varepsilon_{TC,q} < 1\), thus implying that total costs increase less than proportionally to a 1 percent increase in output. Lastly, if MC = AC, elasticity equals 1, \(\varepsilon_{TC,q} = 1\), which reflects that total costs respond proportionally to a 1 percent increase in output. This is the case with the total cost described in example 8.9, where MC = AC = 40, thus yielding an elasticity of \(\varepsilon_{TC,q} = 1\) (see footnote 11).

Example 8.10: Output elasticity in the Cobb-Douglas case Consider again the firm with the Cobb-Douglas production function from example 8.6, where total cost was found to be TC = 40q. The output elasticity in this case is

\[ \varepsilon_{TC,q} = \frac{\partial TC}{\partial q}\,\frac{q}{TC} = 40\,\frac{q}{40q} = 1. \]

That is, if the firm increases its output by 1 percent, it will see its total costs also increase by 1 percent.
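The shortcut \(\varepsilon_{TC,q} = MC/AC\) can be checked numerically. The following is a minimal sketch rather than part of the original exposition: the function names are hypothetical, MC is approximated with a finite difference, and the quadratic cost function is an assumed illustration placed next to the TC = 40q of the example.

```python
def marginal_cost(tc, q, h=1e-6):
    """Approximate MC = dTC/dq with a central finite difference."""
    return (tc(q + h) - tc(q - h)) / (2 * h)


def average_cost(tc, q):
    """AC = TC / q, defined for q > 0."""
    return tc(q) / q


def output_elasticity(tc, q):
    """Output elasticity to total cost, computed as MC / AC."""
    return marginal_cost(tc, q) / average_cost(tc, q)


def linear_tc(q):
    # Total cost from the Cobb-Douglas example: TC = 40q.
    return 40 * q


def quadratic_tc(q):
    # Assumed cost function, for illustration only: TC = 10 + 2q + q^2.
    return 10 + 2 * q + q ** 2


for q in (1.0, 5.0):
    # Print output level, elasticity under linear TC, elasticity under quadratic TC.
    print(q,
          round(output_elasticity(linear_tc, q), 4),
          round(output_elasticity(quadratic_tc, q), 4))
```

With the linear cost the ratio equals 1 at every output level. With the assumed quadratic cost it is below 1 at q = 1 (where MC = 4 and AC = 13) and above 1 at q = 5 (where MC = 12 and AC = 9), matching the three cases described above.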
If the firm has a linear production function, such as in example 8.7, total cost was \(TC = r\frac{q}{8}\) when input prices satisfy 4r < w, but changes to \(TC = w\frac{q}{2}\) when input prices satisfy 4r > w. Therefore, when 4r < w, output elasticity becomes

\[ \varepsilon_{TC,q} = \frac{\partial TC}{\partial q}\,\frac{q}{TC} = \frac{r}{8}\,\frac{q}{r\frac{q}{8}} = 1, \]

which is constant in output. Essentially, if the firm seeks to produce 1 percent more units of output, its total costs also increase by 1 percent regardless of its scale, as expected given that MC and AC coincide for this cost function. (A similar result occurs if input prices satisfy 4r > w, where \(TC = w\frac{q}{2}\), which we leave for the reader as an exercise.)

Footnote 11. Empirically, the output elasticity to total cost is often lower than 1 (\(\varepsilon_{TC,q} < 1\)), with industries such as producers of computer units and accessories exhibiting elasticities around 0.6 to 0.7, while utilities (e.g., water and gas distribution) exhibit elasticities around 0.99. Expressed in words, a 1 percent increase in output generates almost a 1 percent increase in total costs for utilities but a less-than-proportional increase in total costs for computer producers.

Self-assessment 8.10 Consider a firm with total cost \(TC = 5 + 30q^2\). Find its output elasticity, and interpret your results.

8.8 Economies of Scale, Scope, and Experience

8.8.1 Economies of Scale

Economies of scale A firm experiences economies of scale when its average cost, AC, decreases in output q.

This property often arises from specialization. When a new firm starts its operations, a worker might be doing a variety of tasks, such as producing goods, selling them to customers, preparing balance sheets, and even cleaning up the store. As the firm expands its output, more workers are hired and more specific tasks can be assigned to each worker, allowing each one to learn how to do her task in more effective ways.
An additional reason for economies of scale to exist is the presence of large capital investments, which are spread over large output levels. For instance, if an automaker requires an expensive robot to produce cars, but costs from labor and materials are relatively low, the average cost of the first car will be extremely high, but it will decrease for subsequent cars. Firms may also experience an increase in their average cost when they increase output, as we describe next.

Diseconomies of scale A firm suffers from diseconomies of scale when its average cost, AC, increases in output q.

For example, this can arise from managerial diseconomies. To understand this case, consider a firm that produces a new technology, which increased its output substantially in the last few years, and is still managed by its initial founder. Seeking to serve markets in other countries, the founder realizes that the firm will need to hire more managers. These managers, however, might not know the details of the firm’s product as closely as the current manager, potentially making mistakes (at least while they learn the product details) and, ultimately, increasing the firm’s operating cost.

Example 8.11: Testing for economies of scale Consider a firm with total cost \(TC = a + bq + cq^2\), where a, b, c ≥ 0 (see footnote 12). Let us first find the average cost, as follows:

\[ AC = \frac{TC}{q} = \frac{a + bq + cq^2}{q} = \frac{a}{q} + b + cq. \]

Figure 8.6 depicts this average cost. This expression reaches its minimum at the point where its derivative with respect to q is zero (i.e., \(\frac{\partial AC}{\partial q} = 0\)), or

\[ \frac{\partial AC}{\partial q} = -\frac{a}{q^2} + c = 0, \]

which, after solving for q, yields \(q = \left(\frac{a}{c}\right)^{1/2}\). Hence, for all output levels \(q < \left(\frac{a}{c}\right)^{1/2}\), the AC curve is decreasing, thus exhibiting economies of scale (see the left side of figure 8.6), whereas for all output levels \(q > \left(\frac{a}{c}\right)^{1/2}\), the AC curve increases in output, and the firm suffers from diseconomies of scale (right side).

Figure 8.6 Testing for economies of scale. [The curve \(AC(q) = \frac{a}{q} + b + cq\) plotted against q, reaching its minimum value \(2(ac)^{1/2} + b\) at \(q = \left(\frac{a}{c}\right)^{1/2}\).]

Footnote 12. Note that this allows for different types of costs: (1) if b = c = 0, total cost simplifies to TC = a, which is constant in the units of output that the firm produces, q; (2) if c = 0, then total cost reduces to TC = a + bq, thus being linear in output; and (3) if b = 0, total cost becomes \(TC = a + cq^2\), thus being a standard convex expression that originates at a height of a, and increases in q at an increasing rate.

As a confirmation, we next show that the minimum of the AC curve we just found, \(q = \left(\frac{a}{c}\right)^{1/2}\), could alternatively be found by using the property that the MC and AC curves cross each other at the minimum of the AC curve. First, we obtain the marginal cost, \(MC = \frac{\partial TC}{\partial q} = b + 2cq\). Second, the MC and AC curves cross each other where MC = AC, which implies

\[ \frac{a}{q} + b + cq = b + 2cq. \]

Rearranging, we obtain \(\frac{a}{q} = cq\) and, solving for output q, we have that \(q = \left(\frac{a}{c}\right)^{1/2}\).

For instance, if the firm’s total cost function is \(TC = 10 + 2q + q^2\), implying that a = 10, b = 2, and c = 1, the AC curve becomes \(\frac{10}{q} + 2 + q\), which reaches its minimum at \(q = \left(\frac{10}{1}\right)^{1/2} \approx 3.16\) units of output. For all q < 3.16, the firm’s AC curve decreases in output, while for all q > 3.16, it increases in output.

Self-assessment 8.11 Consider a firm with total cost \(TC = 5 + 2q + q^3\). Find the average cost curve AC(q) and its minimum point. Interpret your results in terms of economies of scale.

8.8.2 Economies of Scope

Economies of scope The situation where a firm incurs a lower total cost producing two different products than the total cost that two firms would incur producing each good separately; that is,

\[ TC(q_1, q_2) < TC(q_1, 0) + TC(0, q_2). \]
(8.1)

This inequality says that the total cost from producing both \(q_1\) units of good 1 and \(q_2\) units of good 2, \(TC(q_1, q_2)\), is lower than the cost of producing \(q_1\) alone by one firm and \(q_2\) alone by another firm, \(TC(q_1, 0) + TC(0, q_2)\). Because the total cost of producing none of the goods is often zero (see footnote 13), we can write \(TC(0, 0) = 0\), which allows us to rewrite equation (8.1) as

\[ TC(q_1, q_2) < TC(q_1, 0) + TC(0, q_2) - \underbrace{TC(0, 0)}_{\text{zero}} \]

or, after rearranging, as follows:

\[ TC(q_1, q_2) - TC(q_1, 0) < TC(0, q_2) - TC(0, 0). \qquad (8.2) \]

Intuitively, the right side of equation (8.2) represents the additional cost (increase in total costs) that the firm experiences when it starts producing good 2. The left side reflects the increase in total cost that the firm experiences if, when producing good 1, it starts to produce good 2 as well. In summary, equation (8.2) describes that the increase in cost from starting to produce one good alone is larger than the additional costs of adding one more good to the firm’s product line.

Common examples of economies of scope are television channels in a satellite network. Once the network launches a service offering 80 channels, the additional cost of offering 1 more channel is relatively low (see footnote 14). Example 8.12 studies economies of scope in a cola company.

Example 8.12: Economies of scope Consider a soda company producing two types of cola, with and without sugar (e.g., regular and Diet Coke). When the firm only produces regular cola (good 1), its total cost function is \(TC(q_1, 0) = 3q_1 + 10\), where 10 indicates a fixed cost. When the firm only produces diet cola (good 2), its total cost function is \(TC(0, q_2) = 4q_2 + 10\). However, when the firm simultaneously produces both types of colas, its total cost function becomes

\[ TC(q_1, q_2) = (3 - \alpha)q_1 + (4 - \alpha)q_2 + (10 + \beta), \]

where parameter α > 0 indicates the cost savings effect that producing related products has on the unit cost of both regular and diet cola.
Parameter β > 0, instead, represents the increase in fixed costs that the firm experiences when producing two types of colas rather than one. Therefore, the firm exhibits economies of scope if

\[ (3 - \alpha)q_1 + (4 - \alpha)q_2 + (10 + \beta) < [3q_1 + 10] + [4q_2 + 10], \]

which simplifies to \(\beta < 10 + \alpha(q_1 + q_2)\). This condition says that the firm benefits from economies of scope if the increase in fixed costs that it experiences when producing both goods (as captured by parameter β) is lower than the savings from producing both goods: the fixed cost of 10 that is now incurred only once, plus the unit-cost savings \(\alpha(q_1 + q_2)\) (as measured by parameter α).

Footnote 13. Recall from the discussion about types of costs in section 8.6 that the total cost of producing zero units of output is zero when fixed costs are nonsunk. If, instead, fixed costs are sunk, the firm cannot recover them, even if it shuts down its operations.

Footnote 14. Another recurrent example of economies of scope is that of soda varieties. If a soda company offering four or five types of soda chooses to offer one more variety (e.g., cherry cola), its additional costs from doing so are relatively low, and definitely lower than those of a new soda firm planning to offer its first soda product.

Self-assessment 8.12 Consider the scenario in example 8.12. Does the soda company benefit from economies of scope when α = 2, β = 3, and \(q_1 = q_2 = 4\) units?

8.8.3 Economies of Experience

Economies of experience The average variable cost decreases during the firm’s production history.

Intuitively, economies of experience often emerge because workers in all tasks learn from previous periods to avoid product defects, or because managers arrange workstations to improve worker productivity or achieve higher material yield. In particular, economies of experience are commonly expressed as follows:

\[ AVC(E) = \frac{A}{E^{\varepsilon}}, \]

where A > 0 denotes the AVC from the first unit (see footnote 15), \(E = q_{t-1} + q_{t-2} + \dots\) measures experience from production in previous periods, and ε represents experience elasticity where ε ∈ (0, 1).
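To make the expression \(AVC(E) = A/E^{\varepsilon}\) concrete, the following minimal sketch evaluates it for an assumed production history. The function name `avc`, the parameter values A = 50 and ε = 0.5, and the output history are all hypothetical and chosen only for illustration.

```python
def avc(experience, first_unit_avc, elasticity):
    """Average variable cost AVC(E) = A / E**eps."""
    if experience <= 0:
        raise ValueError("experience E must be positive")
    return first_unit_avc / experience ** elasticity


# Hypothetical values, chosen only for illustration.
A = 50.0                  # AVC of the first unit
EPS = 0.5                 # experience elasticity, inside (0, 1)
past_output = [7, 5, 4]   # q_{t-1}, q_{t-2}, q_{t-3}

E = sum(past_output)      # cumulative experience, E = 16

print(avc(1, A, EPS))     # with E = 1, AVC equals A
print(avc(E, A, EPS))     # AVC after 16 units of experience
```

Under these assumed numbers, the average variable cost is 50 after the first unit and falls to 12.5 once cumulative output reaches 16 units, so accumulated experience lowers the cost of each additional unit.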
To see why ε measures the experience elasticity, note that such elasticity is

\[ \varepsilon_{AVC,E} = \frac{\%\Delta AVC}{\%\Delta E} = \frac{\Delta AVC / AVC}{\Delta E / E} = \frac{\Delta AVC}{\Delta E}\,\frac{E}{AVC}, \]

or \(\varepsilon_{AVC,E} = \frac{\partial AVC}{\partial E}\frac{E}{AVC}\) when the change in E is small. Because \(AVC(E) = \frac{A}{E^{\varepsilon}}\), its derivative with respect to E is \(\frac{\partial AVC}{\partial E} = -A\varepsilon E^{-(1+\varepsilon)}\), which entails that experience elasticity becomes

\[ \varepsilon_{AVC,E} = \frac{\partial AVC}{\partial E}\,\frac{E}{AVC} = -A\varepsilon E^{-(1+\varepsilon)}\,\frac{E}{A/E^{\varepsilon}} = -\varepsilon. \]

Essentially, a 1 percent increase in the firm’s production experience, E, decreases its average variable costs by ε percent.

Example 8.13: Slope of the experience curve An arguably more compact method of analyzing the responsiveness of a firm’s average variable costs (AVC) to its production experience (E) focuses on the slope of the experience curve, as follows:

\[ \text{Slope of experience curve} = \frac{AVC(2E)}{AVC(E)} = \frac{A/(2E)^{\varepsilon}}{A/E^{\varepsilon}} = \frac{E^{\varepsilon}}{2^{\varepsilon}E^{\varepsilon}} = \frac{1}{2^{\varepsilon}}. \]

This slope measures the fraction of its previous level at which the average variable cost stands after cumulative output (E) doubles. Because the experience elasticity parameter ε satisfies ε ∈ (0, 1), an increase in the experience elasticity lowers this ratio, that is, it entails a larger drop in average variable cost each time experience doubles (see footnote 16).

Footnote 15. Indeed, if the firm produced only one unit during its history, E = 1, the average variable cost simplifies to \(AVC = \frac{A}{1^{\varepsilon}} = A\).

Footnote 16. For a numerical example, consider an elasticity of ε = 0.15, which yields a slope of \(\frac{1}{2^{0.15}} = 0.9\). If, instead, ε = 0.5, the slope decreases to \(\frac{1}{2^{0.5}} = 0.7\).

Self-assessment 8.13 Consider a firm with \(AVC(E) = \frac{10}{E^{1/2}}\). Find the experience elasticity of the firm, and the slope of its experience curve. Interpret.

Remark: Economies of scale and economies of experience are often confused, but as the previous discussion highlights, each applies to a different phenomenon. Mature industries, such as cement or aluminum, often benefit from economies of scale, as increasing their output allows them to further reduce their average costs. However, they rarely benefit from economies of experience because their products and technology are relatively well known. Economies of experience are more common in new products and start-ups, which see their average variable costs decrease after learning from their own mistakes and experience (see footnote 17).

Footnote 17. An example of an industry where economies of experience are present but economies of scale are rare is that of handmade pianos or handmade watches. In these activities, experience helps the firm avoid defects and arrange its workstations to improve worker productivity. However, output scale would not significantly decrease average costs, given that approximately the same number of working hours and materials are needed in each handmade piece.

Appendix. Cost-Minimization Problem: A Lagrangian Analysis

In previous sections, we graphically showed that, at the input pair that minimizes the firm’s costs, the isoquant and isocost must be tangent to each other, entailing \(\frac{MP_L}{MP_K} = \frac{w}{r}\). This appendix formally shows the origin of this result by solving the cost-minimization problem. Let us start by writing down the firm’s problem:

\[ \min_{L \ge 0,\; K \ge 0} \; TC = wL + rK \quad \text{subject to } q = f(L, K). \]

Intuitively, this problem says that the firm seeks to minimize its total cost TC = wL + rK, while reaching an output level q (e.g., q = 100 units). This is a constrained minimization problem, in which the constraint is given by the output target, q = f(L, K), that the firm seeks to reach. This problem has the following Lagrangian function:

\[ \mathcal{L} = wL + rK + \lambda\,[q - f(L, K)], \]

where λ denotes the Lagrange multiplier associated with the constraint. To solve this problem, we next need to differentiate with respect to the variables that the firm can alter (units of L and K) and with respect to the Lagrange multiplier λ. First, differentiating with respect to L, we obtain

\[ w - \lambda\,\frac{\partial f(L, K)}{\partial L} = 0, \quad \text{or} \quad \frac{w}{MP_L} = \lambda, \]

because \(MP_L = \frac{\partial f(L, K)}{\partial L}\) represents the marginal product of labor. Differentiating with respect to K, we similarly find

\[ r - \lambda\,\frac{\partial f(L, K)}{\partial K} = 0, \quad \text{or} \quad \frac{r}{MP_K} = \lambda, \]

given that \(MP_K = \frac{\partial f(L, K)}{\partial K}\) denotes the marginal product of capital. Lastly, we differentiate with respect to the Lagrange multiplier λ, obtaining

\[ q - f(L, K) = 0, \quad \text{or} \quad q = f(L, K), \]

which coincides with the constraint (the firm must reach a production level of q units of output). Because the first two results are both equal to λ, we can set them equal to each other to obtain \(\frac{w}{MP_L} = \frac{r}{MP_K}\) or, after cross-multiplying,

\[ \frac{MP_L}{w} = \frac{MP_K}{r}. \]

When minimizing cost, the firm adjusts its inputs until it gets the same bang for the buck across all inputs, as described in previous sections of this chapter. We can also rewrite this result as \(\frac{MP_L}{MP_K} = \frac{w}{r}\), which says that the firm hires inputs until the point at which the isoquant is tangent to the isocost (i.e., the slope of the isoquant, \(\frac{MP_L}{MP_K}\), coincides with that of the isocost, \(\frac{w}{r}\)).

Exercises

1. Cost minimization for Cobb-Douglas [B]. Consider a firm with the Cobb-Douglas production function \(f(K, L) = 4K^{1/2}L^{1/3}\), where K denotes units of capital and L represents units of labor. Assume that the firm faces input prices of r = $10 per unit of capital, and w = $7 per unit of labor (see footnote 18).
(a) Solve the firm’s cost-minimization problem, to obtain the combination of inputs (labor and capital) that minimizes the firm’s cost of producing a given amount of output, q.
(b) Use your results from part (a) to find the firm’s cost function. This is its long-run total cost, as all inputs can be altered.
(c) Find the firm’s marginal cost function, and its average cost function. Interpret.
(d) Assume now that the amount of capital is held fixed at K = 3 units. Solve the firm’s cost-minimization problem again to find the amount of labor that minimizes the firm’s cost.
(e) Use your results from part (d) to find the firm’s short-run cost function (because in the short run, the firm can alter the amount of labor, but without changing the units of capital).
2.
CMP for linear production [B]. Repeat the analysis in the previous exercise, but assume now that the firm faces a production function f(K, L) = 4K + L, thus treating capital and labor as substitutes in the production process.
3. Properties of a cost function [B]. A firm has the following cost function: 9 1 1 TC(q) = 2q3 − q2 + q + , 3 2 10 where q denotes units of output. Intuitively, the first three terms on the right side capture the firm’s variable cost, because they depend on the output the firm produces, whereas the last term represents its fixed cost, as it is not a function of output q.
(a) Total cost. For which output q does the total cost curve TC(q) increase or decrease? For which values is it concave or convex in output?
(b) Marginal cost. For which output q does the marginal cost curve \(\frac{\partial TC(q)}{\partial q}\) increase or decrease? For which values is it concave or convex in output?
(c) Average cost. For which output q does the average cost curve \(AC(q) = \frac{TC(q)}{q}\) increase or decrease? For which values is it concave or convex in output?
(d) Average variable cost. For which output q does the average variable cost curve AVC(q) increase or decrease? For which values is it concave or convex in output?
(e) Find the value of q where the marginal cost curve crosses the total cost curve, where it crosses the average cost curve, and where it crosses the average variable cost curve.
4. Properties of factor demand and cost functions [B]. Most managers must submit detailed reports about the firm’s performance. However, the manager we consider in this exercise is rather sloppy, because he often has typos in his reports! For each of the following functions, argue what (if anything) looks wrong and why.
3 (a) Capital demand of K(q, w, r) = 18 wq r .
(b) Labor demand of \(L(q, w, r) = 37\,\frac{rp^2}{qw}\).
(c) Cost function \(C(q, w, r) = q^{3/7}w^{1/3}r^2\). [Hint: Check for homogeneity in input prices.]

Footnote 18. The firm sells every unit of output at a price p > 0.

5.
CMP for Cobb-Douglas production–I [C]. Consider a Cobb-Douglas production function \(q = AL^{\alpha}K^{\beta}\), where A, α, β > 0. Assume that input prices are w for labor and r for capital.
(a) Find labor demand L(q, w, r).
(b) Find capital demand K(q, w, r).
(c) Find total cost TC(q).
(d) Find average cost and marginal cost, AC(q) and MC(q). Show under which conditions on α and β these costs are constant in q and coincide with each other.
6. Elasticity with labor demand [B]. Using the labor demand you found for the Cobb-Douglas production function in Exercise 8.5, do the following:
(a) Find the elasticity of labor demand with respect to wage, \(\varepsilon_{L,w}\).
(b) Find the elasticity of labor demand with respect to the rental rate of capital, \(\varepsilon_{L,r}\).
(c) Interpret the elasticities found in parts (a) and (b).
7. Explicit and implicit costs [A]. Calculate the explicit and implicit costs of finishing your education. (Hint: Your university’s financial aid page should have estimates on tuition and cost of living, but the opportunity cost of your degree may be harder to estimate.)
8. Finding an isocost line [A]. Katie, a recent college graduate, is looking to start a new enterprise producing and selling T-shirts for local businesses. An hour of labor costs $15, and the rental rate on capital is $10. If she limits herself to only spending $200 on her first order (remember, she is a recent college grad), how much labor and capital can she hire? In other words, find and graph the isocost line.
9. CMP with Cobb-Douglas production–II [B]. Jared is a manager for a local coffee roaster. He hires labor at a rate of $20 per hour, and capital costs $15 per hour. His production function of pounds of coffee beans follows \(q = 2K^{0.25}L^{0.75}\).
(a) If Jared wants to produce 10 pounds of coffee beans per hour, how much labor and capital should he employ?
(b) What if the price of capital increases to $20?
10.
Choosing between substitutes in production [B]. Lydia can use low-skilled or high-skilled workers to help run her accounting department. The low-skilled workers, which we denote as \(L_l\), can review 10 accounts per hour, and the high-skilled workers, denoted as \(L_h\), can review 15 accounts per hour.
(a) What is Lydia’s production function for number of accounts reviewed per hour?
(b) If low-skilled workers cost $15 per hour and high-skilled workers cost $25 per hour, what is her optimal use of the two types of labor?
(c) At what prices of labor is Lydia indifferent between hiring each type of labor?
11. Cobb-Douglas production and input demand [B]. Let’s revisit our local coffee roaster Jared, with production function \(q = 2K^{0.25}L^{0.75}\). Find Jared’s input demand for labor and capital without assuming specific values for the price of labor and capital, w and r, respectively.
12. Fixed-proportion input and total cost [B]. Suppose that a sandwich-maker has a production function for sandwiches that is q = min{2B, M}, where B is slices of bread (which cost b per slice) and M is slices of meat (which cost m per slice). What is the firm’s total, average, and marginal cost?
13. Short-run Cobb-Douglas costs [B]. Jenny produces first-aid kits using labor and capital with the production function \(q = 6L^{0.8}K^{0.2}\), where the wage is $5 and rental rate is $3.
(a) Find her total cost function TC(q) if her capital is fixed at 50 units (this is her short-run cost curve).
(b) If her capital is fixed at 50 units, what is the total cost of 10, 25, 50, and 100 first-aid kits? Graph this short-run total cost curve.
(c) Graph the short-run average and marginal cost curves (at K = 50). At what point do these curves cross?
14. The cost function [A]. Explain why we need more than just the input prices to derive a cost function.
15. Substitutes in production [B]. Hard red winter wheat is planted in the fall in order to be harvested in the spring.
Suppose that wheat production uses acres of land A and labor L in its production as follows: q = αA + Lβ , where q is in thousands of bushels. Calculate the total cost function for wheat.
16. Finding MC, AC, and output elasticity [B]. A publisher for textbooks has a total cost of \(TC(q) = 25{,}000 - 50q + 15q^2\).
(a) Find the publisher’s marginal cost, average cost, average variable cost, and average fixed cost.
(b) Find the value of q where the marginal cost curve crosses the average cost curve and the average variable cost curve.
(c) Find the output elasticity \(\varepsilon_{TC,q}\).
17. Economies of scale–I [A]. Suppose that a firm has the cost function \(TC(q) = 5q^3(wr)^{0.5}\). What is the marginal and average cost? Does this firm exhibit economies of scale?
18. Economies of scale–II [B]. A manufacturer of computer monitors has the following total cost function, \(TC(q) = 10{,}000 - 25q + 2q^2\). Characterize this firm’s economies of scale.
19. Economies of scope [A]. A frozen pizza manufacturer separately produces pepperoni pizzas (\(p_p\)) at a total cost of \(TC(p_p) = 100 + p_p\), and sausage pizzas (\(p_s\)) at a total cost of \(TC(p_s) = 100 + 1.5p_s\). Workers have told management there might be some cost savings if the pizzas were produced simultaneously. One worker estimates the total cost of joint production (if the firm produces both types of pizzas in the same plant) as \(TC(p_p, p_s) = p_p + (1.5 - \alpha)p_s + (100 + \beta)\).
(a) When would the firm prefer to produce both types of pizzas in the same plant?
(b) If α = 0.5 and the firm wants to produce 150 pepperoni pizzas and 70 sausage pizzas, at what level of β does the manufacturer benefit from economies of scope?
20. Experience elasticity–I [A]. Suppose that a firm has economies of experience of \(AVC(E) = \frac{A}{E^{\varepsilon}}\), and an experience elasticity ε = 1/3.
(a) Find the slope of the firm’s experience curve.
(b) Suppose that the average cost of the first unit is A = 100. What is the firm’s average variable cost if it produced 10 units in its history?
What about 20 units? 100 units?
21. Experience elasticity–II [A]. Professor Smith has been teaching for a very long time, and therefore, he has graded many term papers. He has estimated the average time in minutes that he takes to grade a paper is equal to AT(Y ) = 2P ln Y , where Y is the number of years he has been teaching and P is how long the paper is in pages. Find Professor Smith’s experience elasticity and the slope of his experience curve.
22. PMP and CMP for general Cobb-Douglas [C]. Consider a firm with the Cobb-Douglas production function \(q(K, L) = K^{\alpha}L^{\beta}\), where the exponents satisfy α, β > 0.
(a) Profit maximization problem (PMP). Let us first focus on the firm’s PMP, for a given output price p. Write the firm’s PMP, differentiate with respect to capital and labor, and show that the firm’s input demands are

\[ K(p, w, r) = \left(\frac{p\alpha}{r}\right)^{\frac{1-\beta}{1-\alpha-\beta}} \left(\frac{p\beta}{w}\right)^{\frac{\beta}{1-\alpha-\beta}} \quad \text{and} \quad L(p, w, r) = \left(\frac{p\beta}{w}\right)^{\frac{1-\alpha}{1-\alpha-\beta}} \left(\frac{p\alpha}{r}\right)^{\frac{\alpha}{1-\alpha-\beta}}. \]

(b) Is labor demand L(p, w, r) increasing in the output price p? Is it increasing in input prices w and r? [Hint: Your answer depends on whether the firm exhibits increasing, decreasing, or constant returns to scale.]
(c) Use your results from part (a) to obtain the firm’s supply function, q(p, w, r), showing that it takes the following form:

\[ q(p, w, r) = \left(\frac{p\alpha}{r}\right)^{\frac{\alpha}{1-\alpha-\beta}} \left(\frac{p\beta}{w}\right)^{\frac{\beta}{1-\alpha-\beta}}. \]

(d) Is supply function q(p, w, r) increasing in the output price p? Is it increasing in input prices w and r? [Hint: Your answer depends on whether the firm exhibits increasing, decreasing, or constant returns to scale.]
(e) Cost-minimization problem (CMP). Let us now focus on the firm’s CMP. Use the tangency condition \(\frac{MP_L}{MP_K} = \frac{w}{r}\) to find the firm’s compensated input demands, K(q, w, r) and L(q, w, r), showing that they take the form

\[ K = q^{\frac{1}{\alpha+\beta}} \left(\frac{\alpha w}{\beta r}\right)^{\frac{\beta}{\alpha+\beta}} \quad \text{and} \quad L = q^{\frac{1}{\alpha+\beta}} \left(\frac{\beta r}{\alpha w}\right)^{\frac{\alpha}{\alpha+\beta}}. \]

(f) Is the compensated labor demand L(q, w, r) increasing in the output level that the firm seeks to reach q? Is it increasing in input prices w and r?
(g) Use your results from part (e) to obtain the firm’s cost function \(C(q, w, r) = wL^* + rK^*\), showing that it is

\[ C(q, w, r) = q^{\frac{1}{\alpha+\beta}}\, r^{\frac{\alpha}{\alpha+\beta}}\, w^{\frac{\beta}{\alpha+\beta}}\, \theta, \quad \text{where } \theta = \left(\frac{\beta}{\alpha}\right)^{\frac{\alpha}{\alpha+\beta}} + \left(\frac{\alpha}{\beta}\right)^{\frac{\beta}{\alpha+\beta}}. \]

(h) Is the cost function C(q, w, r) increasing in the output level that the firm seeks to reach q? Is it increasing in input prices w and r?
(i) Find the firm’s average cost function. Is it increasing in the output level that the firm seeks to reach q? [Hint: Use \(T = r^{\frac{\alpha}{\alpha+\beta}} w^{\frac{\beta}{\alpha+\beta}} \theta\) to gather all the elements of cost function C(q, w, r) that are not a function of output q.]

9 Partial and General Equilibrium

9.1 Introduction

In this chapter, we start to combine our findings from previous chapters about consumer theory, where we learned how to find an individual’s demand function, and production theory, where we identified the firm’s cost function. In this and subsequent chapters, we place consumers and producers in different markets to better understand their behavior.

From our previous discussion of the firm’s cost function, you may remember that it reflects the minimal costs that the firm incurs to produce a given output level q. The cost function, however, did not tell us how many units of output the firm produces to maximize its profits. In this chapter, we explore the firm’s problem in order to understand its incentives and the firm’s optimal production (supply). For simplicity, this chapter considers a perfectly competitive market, where the firm takes output prices as given. In future chapters, we relax this assumption by analyzing industries with fewer firms, such as a monopoly (only one firm) or an oligopoly (a few firms), where prices are affected by the units that each firm brings to the market.

We start the chapter by describing the main features that differentiate a perfectly competitive market from other types of markets, and we then describe its two main ingredients: consumers’ aggregate demand for a good and firms’ aggregate supply of this good.
We then analyze how to find the equilibrium output and price in this market. This type of equilibrium is often referred to as “partial equilibrium” because it focuses on a specific good, as opposed to “general equilibrium,” which considers equilibrium outputs and prices for several goods simultaneously. We also discuss general equilibrium in this chapter, comparing equilibrium outcomes with those maximizing social welfare (i.e., socially optimal outcomes).

Finally, we analyze in which cases the equilibrium that naturally emerges in a perfectly competitive market is socially optimal (the First Welfare Theorem). In this scenario, a social planner cannot increase overall welfare by rearranging the way in which consumers purchase goods or firms use inputs to produce output. We also explore the opposite relationship, that is, in which cases a socially optimal outcome can be reached when consumers and firms freely interact in a perfectly competitive market (the Second Welfare Theorem). With this issue, we seek to identify redistribution schemes (such as an income tax) that a government agency can offer before consumers make their purchasing decisions and firms make their production decisions. Allowing agents to solve their individual decision problems afterward can yield an equilibrium outcome that is socially optimal.

9.2 Features of Perfectly Competitive Markets

Perfectly competitive markets satisfy the following properties:
• Fragmented: There are many small firms, each with a negligible market share. As a consequence, an increase in the production of any individual firm does not alter market prices.
• Undifferentiated products: Consumers regard the products of all firms in the industry as identical (i.e., undifferentiated).
• Perfect pricing information: Consumers can easily compare different sellers’ prices, at no cost to themselves.
• Free entry and exit: In the long run, firms have the ability to enter the market if positive economic profits can be earned, or to exit the industry if they incur losses.

Examples of these types of markets include agricultural products, such as common varieties of wheat and rice. They indeed have several producers, each of them accounting for a relatively small market share; the product is homogeneous when we focus on the market of a specific variety; consumers can easily compare prices; and producers have access to relatively similar technologies if the seeds of the variety have been available and well known for decades, along with fertilizers and harvesting machinery.

9.3 Profit Maximization Problem

These features ultimately entail that all firms in the industry are price-takers (they take the market price p as given) because individual production decisions do not alter market prices. There is free entry and exit because technology and inputs are available to all firms. Therefore, every firm’s profit-maximization problem (PMP) is

\[ \max_{q} \; \pi = TR(q) - TC(q) = \underbrace{pq}_{TR(q)} - TC(q). \qquad (9.1) \]

The firm chooses its output level q to maximize its profit π, which is equal to the difference between the firm’s total revenue, TR(q) = pq, and its total cost, TC(q). The total cost is, of course, the expression found in chapter 8 after solving the firm’s cost-minimization problem (CMP). Hence, the PMP can be understood as a three-step procedure that unfolds as follows:
1. We find input demands for L and K that minimize the firm’s cost subject to reaching a generic production level q (i.e., the input combination that solves the CMP described in chapter 8).
2. We insert input demands into the firm’s costs to obtain its total cost TC(q) = wL + rK, which remains a function of the output q.
3. We insert the total cost TC(q) found in step 2 into the firm’s profit in equation (9.1).
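The three steps above can be sketched in code. This is an illustrative sketch under stated assumptions, not the book’s own procedure: it assumes the Cobb-Douglas production function \(q = L^{1/2}K^{1/2}\), for which the tangency condition gives input demands \(L = q\,(r/w)^{1/2}\) and \(K = q\,(w/r)^{1/2}\), and it uses hypothetical input prices w = r = 20, chosen so that total cost reduces to TC = 40q. The function names are also hypothetical.

```python
import math


def input_demands(q, w, r):
    """Step 1: cost-minimizing inputs for q = L**0.5 * K**0.5."""
    labor = q * math.sqrt(r / w)
    capital = q * math.sqrt(w / r)
    return labor, capital


def total_cost(q, w, r):
    """Step 2: insert the input demands into TC = w*L + r*K."""
    labor, capital = input_demands(q, w, r)
    return w * labor + r * capital


def profit(q, p, w, r):
    """Step 3: insert TC(q) into the profit pi = p*q - TC(q)."""
    return p * q - total_cost(q, w, r)


# Hypothetical prices, for illustration only.
W, R, P = 20.0, 20.0, 50.0
print(input_demands(10, W, R))   # labor and capital needed for q = 10
print(total_cost(10, W, R))      # total cost of q = 10
print(profit(10, P, W, R))       # profit at q = 10 when p = 50
```

Under these assumed prices, producing q = 10 requires 10 units of each input, costs 400 in total (that is, 40 per unit), and yields a profit of 100 when the output price is 50. Once steps 1 to 3 are wrapped in this way, profit depends on output q alone for given prices.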
The profit function \(\pi = pq - TC(q)\) in equation (9.1) already embodies steps 1 to 3. Hence, we only need to differentiate the profit function with respect to output \(q\), as follows:

\[ p - \frac{\partial TC}{\partial q} = 0, \quad \text{or} \quad p = MC(q), \]

where \(MC(q) = \frac{\partial TC}{\partial q}\) denotes the marginal cost of output. Intuitively, this result says that, to maximize its profits, the firm increases its output \(q\) until the point where the price from selling an additional unit coincides with the additional cost that the firm incurs to produce that extra unit. This result should come as no surprise: if, instead, the firm chooses a volume of output \(q\) for which \(p > MC(q)\), it could still increase its profits by producing more units, given that the price that the firm receives per unit exceeds the marginal cost of producing that unit; and if the firm chooses \(q\) where \(p < MC(q)\), it could increase its profits by producing fewer units because the price it receives per unit is less than the marginal cost of producing it.

We can also check that \(p = MC(q)\) is a condition for the firm to maximize (rather than minimize) its profits. In particular, this is done by finding the second-order condition, which is obtained by differentiating \(p - MC(q) = 0\) with respect to output \(q\) again, and checking that the result is negative. Indeed, differentiating \(p - MC(q)\) with respect to \(q\) yields

\[ 0 - \frac{\partial MC}{\partial q}, \]

which is negative (or zero) so long as \(\frac{\partial MC}{\partial q} \geq 0\). Essentially, if the firm’s marginal costs are increasing (or constant) in output, condition \(p = MC(q)\) guarantees that the firm is maximizing its profits (see footnote 1).

Footnote 1. Formally, we say that condition \(p = MC(q)\) is not only a necessary condition (obtained from the first differentiation of the firm’s profits with respect to \(q\)), but also a sufficient condition (because the second differentiation with respect to \(q\) produced an expression that is negative). In other words, second-order conditions check that the second derivative of the firm’s profit function is negative in its output \(q\), which graphically indicates that profits are concave in output.

Example 9.1: PMP in the Cobb-Douglas case

Consider the firm of example 8.6 in chapter 8, with Cobb-Douglas production function \(q = L^{1/2}K^{1/2}\). As found in example 8.6, its total cost function is \(TC(q) = 40q\). Inserting this total cost into the firm’s PMP, we obtain

\[ \max_{q}\ \pi = pq - 40q. \]

Differentiating with respect to output \(q\) yields \(p - 40 = 0\), or \(p = \$40\). This result indicates that, at a price of \(p \geq \$40\), the firm produces as much as possible. If, instead, the price is below that threshold (\(p < \$40\)), the firm finds it optimal not to supply any units whatsoever, producing \(q = 0\) units. Figure 9.1a depicts this supply curve.

Figure 9.1 (a) Supply curve with a linear cost function (flat at \(p = \$40\)). (b) Supply curve with a convex cost function (\(p = 80q\), or \(q = \frac{1}{80}p\)).

Assume now that the firm’s cost function was \(TC(q) = 40q^2\), which is convex in output \(q\) (see footnote 2). In this scenario, condition \(p = MC(q)\) becomes \(p = 80q\). Solving for output \(q\), we can find the firm’s supply, \(q = \frac{p}{80}\). Because \(\frac{p}{80}\) is increasing in price \(p\), the firm supplies more units as the price increases. Figure 9.1b plots this supply curve, which originates at zero, and grows in \(p\).

Self-assessment 9.1 Repeat the analysis in example 9.1, but assuming that the firm’s total cost function is now \(TC(q) = 5 + 40q^2\). Find and depict its supply curve.

9.4 Supply Curves

9.4.1 Individual Firm Supply

In this section, we use the result from the firm’s PMP, \(p = MC(q)\), to obtain the firm’s supply curve. One can immediately guess that we only need to plot the firm’s marginal cost curve \(MC(q)\), as in figure 9.2.
In particular, for each price \(p\) on the vertical axis, we can move rightward along the dotted lines until we reach the \(MC(q)\) curve, and then read the corresponding output on the horizontal axis, as illustrated by the arrows on the graph. Intuitively, this mapping from the vertical to the horizontal axis says, for each price \(p\), how many units the firm produces to maximize its profits (see footnote 3).

Figure 9.2 Supply curve and MC(q). (The figure marks prices of $5 and $7, and outputs of 10 and 12 units.)

In the long run, the amounts of all inputs can be varied or, in other words, there are no fixed costs. Hence, the average cost curve \(AC(q)\) only includes variable costs (because both labor and capital can be altered). Figure 9.3 superimposes the firm’s \(AC(q)\) curve on top of the \(MC(q)\) curve depicted in figure 9.2.

This analysis of the firm’s supply curve assumes that the firm would supply units of output even when the market price \(p\) falls below the firm’s average cost, \(AC(q)\). Doing so, however, would result in losses, as depicted in figure 9.3. Hence, this production strategy would never be chosen by the firm. Informally, it prefers to shut down its operations rather than making a loss on every unit. As a consequence, the firm’s supply curve is given by the relationship found above, \(p = MC(q)\), but only on the segment of the \(MC(q)\) curve that lies above the \(AC(q)\) curve.

Figure 9.3 Average and marginal costs. (The supply curve is the segment of MC(q) above \(p = \min AC(q)\).)

Footnote 2. Indeed, the marginal cost is \(MC(q) = \frac{\partial TC(q)}{\partial q} = 80q\), and its derivative is \(\frac{\partial MC(q)}{\partial q} = 80\), which is positive for all output levels \(q\). Graphically, the total cost increases in \(q\), and at an increasing rate.

Footnote 3. Mathematically, this mapping from the vertical to the horizontal axis is just the inverse of \(MC(q)\). That is, if the firm’s profit-maximizing condition is \(p = MC(q)\), this mapping would be equivalent to solving for \(q\) in \(p = MC(q)\), as in example 9.1.

Prices above the \(AC(q)\) curve help the firm make a positive profit margin per unit, implying that the firm prefers staying active (producing a positive output level) to shutting down. For prices below the \(AC(q)\) curve, the firm prefers to shut down, as indicated by the vertical spike along the vertical axis where \(q = 0\). Recall that in this section, we are considering a long-run approach where the firm can alter the units of all inputs. In contrast, section 9.5 examines a short-run scenario where the firm cannot alter all inputs (some inputs are fixed), and shows that the firm can stay active so long as price \(p\) covers its average variable cost \(AVC(q)\).

Example 9.2: Finding the long-run supply curve

Consider a firm with total cost curve given by \(TC(q) = -5q + 2q^2\). We can find its MC curve by differentiating \(TC(q)\) with respect to output \(q\), obtaining \(MC(q) = \frac{\partial TC(q)}{\partial q} = -5 + 4q\). Figure 9.4 depicts this marginal cost curve, which originates at \(-5\) (in the negative quadrant) and increases in \(q\) at a rate of 4. Setting \(p = MC(q)\), we obtain \(p = -5 + 4q\) which, solving for output \(q\), yields

\[ q = \frac{p + 5}{4} = \frac{p}{4} + \frac{5}{4}. \]

This curve, however, is not necessarily the firm’s supply curve. To find that curve, we first need to find the firm’s average cost \(AC(q) = \frac{TC(q)}{q} = -5 + 2q\), and compare it with the \(MC(q)\) found previously (see footnote 4). To obtain the point where the \(MC(q)\) and \(AC(q)\) curves cross, which constitutes the firm’s “shutdown price,” we set the two curves equal to each other:

\[ -5 + 4q = -5 + 2q, \]

which simplifies to \(q = 0\). At this output level, the firm’s marginal cost is \(MC(0) = -5 + (4 \times 0) = -5\). Hence, for any positive market price \(p\), the firm produces a positive output level, according to \(q(p) = \frac{p}{4} + \frac{5}{4}\); but shuts down, producing zero output \(q(p) = 0\), when \(p = \$0\). We can summarize our results with the following supply function:

\[ q(p) = \begin{cases} \dfrac{p}{4} + \dfrac{5}{4} & \text{if } p > 0 \\ 0 & \text{otherwise,} \end{cases} \]

as depicted in the thick segment of figure 9.4.

Figure 9.4 Finding a supply curve. (Curves shown: \(MC(q) = -5 + 4q\) and \(AC(q) = -5 + 2q\); marked outputs: \(q = 2.5\) units and \(q = 3.75\) units.)

Footnote 4. The \(AC(q)\) curve just obtained originates at a height of \(-5\) when \(q = 0\), and increases at a rate of 2, crossing the horizontal axis when \(-5 + 2q = 0\), or \(q = \frac{5}{2} = 2.5\) units of output.

For instance, if the market price is \(p = \$10\), the firm supplies \(q = \frac{10}{4} + \frac{5}{4} = \frac{15}{4} = 3.75\) units, as depicted in figure 9.4, whereas if the price the firm faces is \(p = \$16\), the firm increases its production to \(q = \frac{16}{4} + \frac{5}{4} = \frac{21}{4} = 5.25\) units.

Self-assessment 9.2 Repeat the analysis in example 9.2, but assume that the firm’s TC curve is \(TC(q) = -5q + 8q^2\). Find \(MC(q)\), \(AC(q)\), and the firm’s supply curve.

9.4.2 Market Supply

After finding the individual supply curve of each firm, we can easily aggregate them in order to obtain the market (or aggregate) supply. This can be found by horizontally summing across all individual supplies in the industry. Example 9.3 examines market supply when the number of firms in the industry, \(N\), is given (i.e., no entry or exit occurs).

Example 9.3: Finding market supply

Consider \(N\) firms, each with the individual supply curve we found in example 9.2:

\[ q(p) = \begin{cases} \dfrac{p}{4} + \dfrac{5}{4} & \text{if } p > 0 \\ 0 & \text{otherwise.} \end{cases} \]

The market supply is then \(N \times q(p)\); that is,

\[ N \times q(p) = \begin{cases} N\left(\dfrac{p}{4} + \dfrac{5}{4}\right) & \text{if } p > 0 \\ 0 & \text{otherwise.} \end{cases} \]

For instance, at a price \(p = \$10\), every firm supplies \(q = \frac{10}{4} + \frac{5}{4} = 3.75\) units, which entails an aggregate supply of \(N \times 3.75\) units of output. (If, for example, there are \(N = 200\) firms in the industry, aggregate supply becomes \(200 \times 3.75 = 750\) units.)

Self-assessment 9.3 Consider the supply curve found in self-assessment 9.2. Find the market supply curve, and evaluate it at \(N = 100\) firms to obtain the aggregate supply.

9.5 Short-Run Supply Curve

The analysis thus far assumes that the amount of all inputs could be altered (the long-run approach).
In the short run, however, the amount of at least one input is considered fixed, such as capital. In this section, we analyze how the firm’s supply curve is affected if the manager operates in a short-run scenario. For tractability, consider a TC function

\[ TC(q) = \underbrace{a}_{FC} + \underbrace{bq + cq^2}_{VC(q)}, \]

where parameter \(a \geq 0\) captures the part of total costs that is unaffected by changes in output (fixed cost, FC). In contrast, the last two terms, \(bq + cq^2\), depend on \(q\) and thus measure the firm’s variable costs, VC. In this scenario, the average cost becomes

\[ AC(q) = \frac{TC(q)}{q} = \underbrace{\frac{a}{q}}_{AFC(q)} + \underbrace{b + cq}_{AVC(q)}, \]

where \(\frac{a}{q}\) is the average fixed cost \(AFC(q)\), because \(a\) represents the fixed cost, whereas \(b + cq\) denotes the average variable cost \(AVC(q)\), because \(bq + cq^2\) reflects the variable cost. Because \(AC(q) = AFC(q) + AVC(q)\), the difference between \(AC(q)\) and \(AVC(q)\) is, of course, the average fixed cost (AFC). In other words, \(AC(q)\) lies above \(AVC(q)\), as depicted in figure 9.5.

Figure 9.5 AC and MC curves. (Curves shown: \(MC(q)\), \(AC(q)\), and \(AVC(q)\), with \(p = \min AC(q)\) and \(p = \min AVC(q)\) marked.)

For generality, we can allow a share of the fixed cost, \(a\), to be sunk (unrecoverable) and the rest to be non-sunk (recoverable). In particular, let \(a = a_S + a_{NS}\), where \(a_S\) denotes the sunk fixed cost, while \(a_{NS}\) represents the non-sunk fixed cost. In this context, the firm’s average fixed cost is

\[ AFC(q) = \frac{a}{q} = \frac{a_S + a_{NS}}{q}. \]

Figure 9.6 depicts the average non-sunk cost, \(ANSC(q) = \frac{a_{NS}}{q} + b + cq\). This curve lies below the \(AC(q)\) curve, because \(AC(q) = \frac{a}{q} + b + cq\) and \(a \geq a_{NS}\), but above the \(AVC(q)\) curve, \(AVC(q) = b + cq\).

Figure 9.6 AC functions when a share of fixed costs is sunk. (Curves shown: \(MC(q)\), \(AC(q)\), \(ANSC(q)\), and \(AVC(q)\), with \(p = \min AC(q)\) and \(p = \min AVC(q)\) marked.)

The question remains: What is the firm’s supply curve in a short-run context?
To answer this question, we need to recognize that, by altering its output decision (including, if necessary, shutting down its operations), the firm can avoid its variable costs and its non-sunk costs, as the latter are recoverable; but it cannot recover its sunk costs. The firm will then produce positive amounts so long as the market price exceeds its ANSC because its non-sunk costs aggregate all those cost categories that can be avoided or recovered. Hence, the shutdown price in a short-run scenario lies at exactly the minimum of the ANSC or, alternatively, at the point where \(ANSC(q)\) crosses \(MC(q)\).

Example 9.4: Finding the short-run supply curve

Consider a firm with the same TC function as in example 9.2, but with $10 in fixed costs (i.e., \(TC(q) = 10 - 5q + 2q^2\)), and assume that this fixed cost is evenly split into sunk costs, $5, and non-sunk costs, $5. In that example, we showed that \(MC(q) = -5 + 4q\). We can now find the expression of the non-sunk costs, \(NSC(q) = 5 - 5q + 2q^2\), which implies that ANSC is

\[ ANSC(q) = \frac{NSC(q)}{q} = \frac{5}{q} - 5 + 2q. \]

We can then set \(MC(q)\) and \(ANSC(q)\) equal to each other, in order to find their crossing point; that is,

\[ -5 + 4q = \frac{5}{q} - 5 + 2q, \]

which simplifies to \(2q = \frac{5}{q}\). Solving for output \(q\), we obtain \(q = \sqrt{2.5} \approx 1.58\) units. Inserting this output level into the \(MC(q)\) curve, we find the shutdown price in this short-run scenario, \(p = -5 + (4 \times 1.58) = \$1.32\). In summary, the firm’s short-run supply curve is

\[ q(p) = \begin{cases} \dfrac{p}{4} + \dfrac{5}{4} & \text{if } p > \$1.32 \\ 0 & \text{otherwise.} \end{cases} \]

Comparing the short-run supply we just found against the long-run supply identified in example 9.2, we can see that they are very similar, as the firm still produces along its MC curve; but now the firm needs a higher price to start producing positive amounts, $1.32, than in the long-run scenario, $0. Intuitively, the firm did not face any fixed costs in the long run, as all input usage could be recovered.
In the short run, some costs are fixed and, more importantly, sunk (unrecoverable), inducing the firm to start producing only when it faces a sufficiently high price.

Self-assessment 9.4 Repeat the analysis in example 9.4, but assuming that the firm’s fixed costs are distributed differently: only $2 are sunk while $8 are non-sunk. Find the firm’s short-run supply curve and compare your results against those in example 9.4.

9.6 Market Equilibrium

9.6.1 Short-Run Equilibrium

In the short run, one can assume that the number of firms in the industry, \(N\), is given. That is, the time span is sufficiently short to prevent firms from entering or exiting the industry (e.g., a day or a week). In that scenario, we can use the market demand (aggregate demand after summing all individual demands, as described in chapter 3) and the market supply (after summing all individual supplies, as found in section 9.4) to analyze the equilibrium output and price in that market.

Example 9.5: Finding short-run equilibrium output and price

Consider a market demand \(q^D(p) = 100 - 2p\), and the aggregate supply curve of example 9.3. These cross each other when \(q^D(p) = q^S(p)\), or

\[ 100 - 2p = N\left(\frac{p}{4} + \frac{5}{4}\right), \]

which simplifies to \(8p + N(5 + p) = 400\). Solving for price \(p\), we find an equilibrium price of

\[ p = \frac{5(80 - N)}{8 + N}, \]

which is decreasing in the number of firms \(N\) (see footnote 5). In addition, this price crosses the shutdown price of \(p = \$1.32\) at \(N = 61.62\) firms (see footnote 6). Hence, when there are at most 61 firms in the industry (\(N \leq 61.62\)), the equilibrium price is \(p = \frac{5(80 - N)}{8 + N}\), which entails an aggregate output of \(q = 100 - 2\,\frac{5(80 - N)}{8 + N} = \frac{110N}{8 + N}\). (For instance, when \(N = 10\) firms compete in the industry, equilibrium price is \(p = \frac{5(80 - 10)}{8 + 10} = \frac{350}{18} \cong \$19.44\), while aggregate output becomes \(q = \frac{110 \times 10}{8 + 10} = \frac{1{,}100}{18} \cong 61.1\) units.)
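The closed-form price in example 9.5 can be cross-checked numerically. The snippet below is a minimal sketch, not part of the original example: the function names are illustrative, and it simply evaluates the expressions derived above and confirms that demand and supply coincide at the resulting price.

```python
def equilibrium_price(n_firms):
    """Price solving 100 - 2p = N*(p/4 + 5/4), i.e., p = 5(80 - N)/(8 + N)."""
    return 5 * (80 - n_firms) / (8 + n_firms)


def market_demand(p):
    """Market demand of example 9.5: q_D(p) = 100 - 2p."""
    return 100 - 2 * p


def market_supply(p, n_firms):
    """Market supply of example 9.3: N * (p/4 + 5/4) for p > 0, zero otherwise."""
    return n_firms * (p / 4 + 5 / 4) if p > 0 else 0.0


def firms_at_price(p):
    """Number of firms N at which the equilibrium price equals p.

    Obtained by solving 5(80 - N) = p(8 + N) for N.
    """
    return (400 - 8 * p) / (5 + p)


p_star = equilibrium_price(10)
print(round(p_star, 2))                      # 19.44
print(round(market_demand(p_star), 1))       # 61.1
print(round(market_supply(p_star, 10), 1))   # 61.1
print(round(firms_at_price(1.32), 2))        # 61.62
```

The last line reproduces the number of firms at which the equilibrium price reaches the $1.32 shutdown price.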
In contrast, when the number of firms exceeds 61.62 (i.e., \(N \geq 62\)), such as when \(N = 90\), every firm sets its individual production at zero, \(q = 0\), implying that aggregate output is also zero in equilibrium.

Footnote 5. As an exercise, check that if we differentiate price \(p = \frac{5(80 - N)}{8 + N}\) with respect to \(N\), we obtain \(\frac{\partial p}{\partial N} = -\frac{440}{(8 + N)^2}\), which is clearly negative. Expressed in words, this says that the equilibrium price decreases as more firms enter the industry.

Footnote 6. To see this point, set the price found previously, \(p = \frac{5(80 - N)}{8 + N}\), equal to the shutdown price of $1.32, so that \(\frac{5(80 - N)}{8 + N} = 1.32\). Rearranging, we obtain \(5(80 - N) = 1.32(8 + N)\). Solving for \(N\), we find \(N = 61.62\) firms.

Self-assessment 9.5 Repeat the analysis in example 9.5, but assume that the demand function is now \(q^D(p) = 350 - 2p\). How do equilibrium price and quantity change relative to those we found in example 9.5?

9.6.2 Long-Run Equilibrium

In a perfectly competitive market, firm entry can occur if potential entrants can make more profits in this market than in other industries; and firm exit can happen if incumbent firms make losses (or lower profits than in other easily accessible industries). The market reaches an equilibrium (i.e., a stable situation) when no more firms have an incentive to enter or exit the industry. For that to occur, it must be that firms make no economic profits (i.e., they make the same profits as in other competitive markets). As a consequence, two conditions must hold: (1) profits for every firm are zero, which entails that \(p = \min AC(q)\); and (2) aggregate demand and supply cross each other, \(q^D(p) = q^S(p)\). We apply these two conditions to example 9.6.

Example 9.6: Finding long-run equilibrium output and price

Consider the same market demand as in example 9.5, \(q^D(p) = 100 - 2p\), and an AC curve \(AC(q) = \frac{10}{q} - 5 + 2q\). Because in the long-run equilibrium, the production of every firm, \(q\), must satisfy \(p = MC(q) = AC(q)\), we must have that \(MC(q) = AC(q)\).
Setting \(MC(q)\) equal to \(AC(q)\), we find that

\[ -5 + 4q = \frac{10}{q} - 5 + 2q. \]

Rearranging, we obtain \(2q = \frac{10}{q}\), or \(q^2 = 5\). Taking the square root of both sides yields an individual output of \(q = \sqrt{5} \approx 2.24\) units. This is the output level where curve \(MC(q)\) crosses \(AC(q)\) at its minimum. In short, all firms produce an output of \(q = 2.24\) units, at an equilibrium price of \(p = MC(\sqrt{5}) = -5 + (4 \times 2.236) \approx \$3.94\), which is the shutdown price in this scenario.

We now only need to use the second condition (aggregate demand equal to aggregate supply) to obtain the last unknown: the number of firms operating in the industry in equilibrium, \(N^*\). To find \(N^*\), we set aggregate demand equal to aggregate supply, as follows:

\[ 100 - 2p = N\left(\frac{p}{4} + \frac{5}{4}\right). \]

Because we already found the equilibrium price, \(p = \$3.94\), we can insert it into this expression to obtain

\[ 100 - (2 \times 3.94) = N\left(\frac{3.94}{4} + \frac{5}{4}\right), \]

which, solving for \(N\), yields an equilibrium number of firms of \(N^* = \frac{92.12}{2.235} \approx 41.2\) (i.e., 41 firms are active in the industry, as no fractional firms can enter).

Interestingly, as demand increases, the equilibrium number of firms \(N^*\) grows as well. For instance, if demand increases from that in the current example, \(q^D(p) = 100 - 2p\), to \(q^D(p) = 4{,}000 - 2p\), the equilibrium number of firms grows to \(N^* = 1{,}786\) firms. Essentially, because all firms produce the same output, \(q = 2.24\) units, an increase in demand attracts more firms to the industry (see footnote 7).

Self-assessment 9.6 Repeat the analysis in example 9.6, but assume that the demand function is now \(q^D(p) = 350 - 2p\). How do equilibrium price, quantity, and number of firms in the industry change relative to those we found in example 9.6?

9.7 Producer Surplus

As described in chapter 5, the consumer surplus represents the difference between the consumer’s maximum willingness-to-pay for an object and the price that she actually pays, \(p\).
Graphically, the consumer surplus was given by the area below the demand curve (as that captures the consumer’s maximum willingness-to-pay for each unit) and above market price \(p\). A similar argument applies to the analysis of producer surplus, as described next.

Producer surplus The difference between the price that the producer receives for its product, \(p\), and its marginal cost from producing that unit.

Footnote 7. In the short run, where the number of firms is fixed, an increase in demand would lead to a short-lived increase in prices above the shutdown price of $3.94. Graphically, the supply curve would be unaffected, but the demand curve would shift rightward, thus producing a crossing point of demand and supply to the northeast of the initial crossing point. Because the initial equilibrium price coincides with the shutdown price, the new (higher) price entails \(p > \min AC(q)\), thus allowing every firm to earn economic profits. As information about prices is common knowledge, however, firms in other markets would be attracted to this industry in the long run, ultimately pushing equilibrium prices down to \(p = \min AC(q)\), with zero economic profit.

Graphically, the producer surplus (PS) is given by the region below the prevailing market price \(p\) and above the firm’s supply curve because the latter is found by setting \(p = MC(q)\) and solving for \(q\). Importantly, recall that the firm’s marginal cost comes from the derivative of the total cost, \(MC(q) = \frac{\partial TC(q)}{\partial q}\); and, in turn, the total cost \(TC(q)\) was the result of minimizing the firm’s costs (i.e., choosing input combinations that minimize costs). As a consequence, PS measures the profit margin that the firm makes by comparing the price that it receives from each unit against the minimal cost that the firm incurs when producing 1 extra unit, as captured by \(MC(q)\).

Example 9.7: Finding producer surplus

Consider the supply function found in example 9.6, \(p = -5 + 4q\), or \(q = \frac{p}{4} + \frac{5}{4}\), as depicted in figure 9.7.
In that scenario, we found the shutdown price to be \(p_{ShutDown} = \$3.94\). Let us evaluate PS when market price is \(p = \$15\). Because this supply curve is linear, we can find PS by calculating the area of rectangle A plus triangle B in figure 9.7, as follows:

\[ PS = \underbrace{(p - p_{ShutDown})\, q_{ShutDown}}_{\text{Area A}} + \underbrace{\frac{1}{2}\,(p - p_{ShutDown})(q - q_{ShutDown})}_{\text{Area B}}, \]

where \(p = \$15\) denotes the price we consider, \(p_{ShutDown} = \$3.94\) expresses the shutdown price found in example 9.6, \(q_{ShutDown} = 2.24\) units is the output every firm produces at the shutdown price, and \(q\) denotes the units sold at price \(p = \$15\), which are \(q = \frac{15}{4} + \frac{5}{4} = \frac{20}{4} = 5\) units. Using this information, PS in this example becomes

\[ PS = \underbrace{(15 - 3.94) \times 2.24}_{\text{Area A}} + \underbrace{\frac{1}{2}\,(15 - 3.94)(5 - 2.24)}_{\text{Area B}} = 24.77 + 15.26, \]

which simplifies to \(PS = \$40.03\).

Figure 9.7 Producer surplus, PS = A + B. (Curves shown: \(MC(q) = -5 + 4q\) and the supply curve; marked values: \(p = \$15\), \(\min AC = \$3.94\), \(q = 2.24\) units, and 5 units.)

Self-assessment 9.7 Repeat the analysis in example 9.7, but assume that equilibrium price is \(p = \$13\).

9.8 General Equilibrium

In this section, we extend this analysis to markets with more than one good. In particular, we seek to find equilibrium prices for which the demand and supply for every good are compatible with one another. For presentation purposes, we consider markets with two goods, 1 and 2, but the analysis extends to larger markets.

In this scenario, an endowment \(e \equiv (e_1^A, e_2^A; e_1^B, e_2^B)\) denotes the amount of goods 1 and 2 that consumers A and B enjoy when they do not trade. Consumer A’s endowment of good 1 is then \(e_1^A\), while that of good 2 is \(e_2^A\). Similarly, consumer B’s endowment of good 1 is \(e_1^B\), and that of good 2 is \(e_2^B\). For instance, an endowment could be \(e = (4, 1; 2, 3)\), indicating that consumer A initially has 4 units of good 1 and only 1 unit of good 2. Consumer B, however, has 2 units of good 1 and 3 of good 2. Figure 9.8 depicts this endowment \(e\) using the so-called Edgeworth box.
Graphically, the origin of consumer A is in the southwest corner, so units of good 1 are illustrated on the horizontal axis and units of good 2 on the vertical axis. The origin of consumer B is in the northeast corner of the graph. To understand that consumer B is endowed with 2 units of good 1 and 3 units of good 2, you may want to rotate the book 180 degrees. The length of the horizontal axis represents the total endowment of good 1, \(e_1^A + e_1^B\) (which is equal to 6 units in this example), while the length of the vertical axis measures the total endowment of good 2, \(e_2^A + e_2^B\) (4 units in the ongoing example).

Figure 9.8 Example of an endowment. (Edgeworth box with the origin for consumer A in the southwest corner and the origin for consumer B in the northeast corner; endowment \(e\) gives consumer A 4 units of good 1 and 1 unit of good 2, and consumer B 2 units of good 1 and 3 units of good 2.)

Using a similar notation, we denote an allocation as \(x \equiv (x_1^A, x_2^A; x_1^B, x_2^B)\), which lists the amount of goods 1 and 2 that consumers A and B enjoy. Allocation \(x\) can differ from the initial endowment \(e\) if individuals trade among themselves. For instance, an allocation \(x = (2, 3; 4, 1)\) indicates that consumer A enjoys 2 units of good 1 and 3 of good 2, while consumer B has 4 units of good 1 and only 1 unit of good 2. As an exercise, you can depict this allocation in figure 9.8. In the subsequent discussion, we focus on feasible allocations, as defined next.

Feasible allocation An allocation \(x\) is feasible if \(x \leq e\).

Condition \(x \leq e\) says that the aggregate amount that all individuals consume, \(x\), does not exceed the aggregate amount they initially owned, \(e\). In the context of two consumers and two goods, this condition can be expressed as \(x_1^A + x_1^B \leq e_1^A + e_1^B\) for good 1, and \(x_2^A + x_2^B \leq e_2^A + e_2^B\) for good 2.
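The feasibility condition for two consumers and two goods can be checked mechanically. The sketch below is only an illustration: the tuple layout (A's good 1, A's good 2, B's good 1, B's good 2) mirrors the notation above, and the function name is hypothetical.

```python
def is_feasible(allocation, endowment):
    """Check good by good that total consumption does not exceed total endowment.

    Both arguments are tuples ordered as
    (A's good 1, A's good 2, B's good 1, B's good 2).
    """
    x1_a, x2_a, x1_b, x2_b = allocation
    e1_a, e2_a, e1_b, e2_b = endowment
    good_1_ok = x1_a + x1_b <= e1_a + e1_b
    good_2_ok = x2_a + x2_b <= e2_a + e2_b
    return good_1_ok and good_2_ok


endowment = (4, 1, 2, 3)      # e = (4, 1; 2, 3), as in figure 9.8
allocation = (2, 3, 4, 1)     # x = (2, 3; 4, 1), the allocation discussed above

print(is_feasible(allocation, endowment))      # True: 6 <= 6 and 4 <= 4
print(is_feasible((5, 3, 4, 1), endowment))    # False: good 1 would require 9 > 6 units
```

The second call shows an allocation that would lie outside the Edgeworth box, since it requires more of good 1 than the two consumers jointly own.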
In figure 9.8, feasibility says that the allocation that consumers A and B enjoy cannot lie outside the box because, as discussed previously, the total endowment of good 1, \(e_1^A + e_1^B\), is measured by the length of the horizontal axis in the Edgeworth box, while the total endowment of good 2, \(e_2^A + e_2^B\), is given by the length of the vertical axis.

We next examine the equilibrium allocations that emerge when individuals trade between themselves, then define what we mean by efficient allocations in this context, and finally compare efficient and equilibrium allocations.

9.8.1 Equilibrium Prices

Equilibrium price A price vector \((p_1, p_2)\) is in equilibrium if it clears the markets for both good 1 and good 2. In other words, the demand for every good \(k \in \{1, 2\}\) from all individuals in the economy (which we refer to as “aggregate demand”) coincides with the supply of that good in the economy (“aggregate supply”).

We can understand this definition by considering what would happen otherwise: If aggregate demand exceeds aggregate supply, agents could charge more for the items they sell, implying that the initial price was not in equilibrium. Likewise, if aggregate supply exceeds aggregate demand, buyers won’t be willing to pay so much for the product, forcing suppliers to reduce prices. When aggregate demand and supply coincide, suppliers and buyers have no incentive to increase or decrease prices.

How can we put this definition to use? We can do this by finding the demand of every consumer for every good \(k\) because the sum of these expressions constitutes the aggregate demand for good \(k\). Aggregate supply similarly can be found by summing across the individual supply of every firm. For simplicity, we first consider an economy without production, where individuals exchange the endowments they do not plan to consume (i.e., a “barter economy”).
In that scenario, supply of good \(k\) is just given by the total amount of good \(k\) in the endowments of consumers A and B. The demand from consumer A is obtained from solving her utility maximization problem (UMP), which yields tangency condition \(MRS_{1,2}^A = \frac{p_1}{p_2}\); see chapter 3 for details. Similarly, the demand from consumer B is found by solving her tangency condition \(MRS_{1,2}^B = \frac{p_1}{p_2}\). Hence, the demands from these two individuals are compatible if the price ratio \(\frac{p_1}{p_2}\) satisfies

\[ MRS_{1,2}^A = MRS_{1,2}^B = \frac{p_1}{p_2}. \]

Intuitively, this condition says that the market is in equilibrium when the indifference curves of consumers A and B are tangent to one another. That is, their slopes, captured by the marginal rate of substitution (MRS), coincide. Example 9.8 illustrates how to use this condition to find equilibrium prices that clear all markets.

Example 9.8: Finding an equilibrium allocation and price

Consider two consumers with the Cobb-Douglas utility function \(u^i(x_1^i, x_2^i) = x_1^i x_2^i\) for every consumer \(i\), and endowments \((e_1^A, e_2^A) = (100, 350)\) for consumer A and \((e_1^B, e_2^B) = (100, 50)\) for consumer B. Figure 9.9 illustrates this initial endowment, with a total of 200 units of good 1 and 400 units of good 2, which explains why the vertical axis in this example is longer than the horizontal axis. Intuitively, consumers exhibit symmetric preferences for goods 1 and 2 but start with asymmetric endowments of good 2 because consumer A owns 350 units, while B only owns 50 units. In this scenario, it is straightforward to find consumer A’s demand.

Figure 9.9 The endowment in example 9.8. (Edgeworth box in which endowment \(e\) gives consumer A 100 units of good 1 and 350 units of good 2, and consumer B 100 units of good 1 and 50 units of good 2.)

Consumer A. Using her tangency condition, \(MRS_{1,2}^A = \frac{p_1}{p_2}\) yields \(\frac{x_2^A}{x_1^A} = \frac{p_1}{p_2}\) or, after rearranging, \(p_2 x_2^A = p_1 x_1^A\).
Inserting this result into consumer A’s budget constraint, \(p_1 x_1^A + p_2 x_2^A = p_1 100 + p_2 350\), we obtain

\[ p_1 x_1^A + p_1 x_1^A = p_1 100 + p_2 350, \]

which simplifies into consumer A’s demand for good 1:

\[ x_1^A = 50 + 175\,\frac{p_2}{p_1}. \]

Plugging this expression back into the tangency condition \(p_2 x_2^A = p_1 x_1^A\), we obtain

\[ p_1 \underbrace{\left(50 + 175\,\frac{p_2}{p_1}\right)}_{x_1^A} = p_2 x_2^A, \]

and, solving for \(x_2^A\), yields consumer A’s demand for good 2, \(x_2^A = 175 + 50\,\frac{p_1}{p_2}\).

Consumer B. We can follow a similar approach to find consumer B’s demand for goods 1 and 2, by using her tangency condition \(MRS_{1,2}^B = \frac{p_1}{p_2}\), which yields \(\frac{x_2^B}{x_1^B} = \frac{p_1}{p_2}\) or, after rearranging, \(p_2 x_2^B = p_1 x_1^B\). We leave these calculations for the reader as an exercise. Following the same steps as for consumer A, you should obtain that consumer B’s demands are

\[ x_1^B = 50 + 25\,\frac{p_2}{p_1} \quad \text{and} \quad x_2^B = 25 + 50\,\frac{p_1}{p_2}. \]

Lastly, we only need to find equilibrium prices. Inserting the demands for good 1 from consumers A and B into the feasibility condition, \(x_1^A + x_1^B = 100 + 100\), we obtain

\[ \underbrace{50 + 175\,\frac{p_2}{p_1}}_{x_1^A} + \underbrace{50 + 25\,\frac{p_2}{p_1}}_{x_1^B} = 200, \]

which simplifies to \(100 + 200\,\frac{p_2}{p_1} = 200\). Solving for \(\frac{p_2}{p_1}\), we find an equilibrium price ratio of

\[ \frac{p_2}{p_1} = \frac{1}{2}. \]

Plugging this price ratio into these demands yields an equilibrium allocation of \(x_1^A = 137.5\) and \(x_2^A = 275\) units for consumer A, and \(x_1^B = 62.5\) and \(x_2^B = 125\) units for consumer B. Figure 9.10 superimposes this equilibrium allocation onto figure 9.9. Relative to the initial endowment, consumer A gives up \(350 - 275 = 75\) units of good 2 to gain \(137.5 - 100 = 37.5\) units of good 1; and consumer B gains 75 units of good 2 and gives up 37.5 units of good 1.

Figure 9.10 The equilibrium allocation in example 9.8. (Edgeworth box showing endowment \(e\), the equilibrium allocation \(x^*\) with 137.5 and 275 units for consumer A and 62.5 and 125 units for consumer B, and the indifference curves \(IC_A\) and \(IC_B\) tangent at \(x^*\).)

Finally, we can show that both consumers are made better off by trading.
Consumer A’s utility with her initial endowment is \(350 \times 100 = 35{,}000\), but at the equilibrium allocation (\(x_1^A = 137.5\), \(x_2^A = 275\)) her utility increases to \(275 \times 137.5 = 37{,}812.5\). A similar argument applies to consumer B, who sees her utility increase from \(50 \times 100 = 5{,}000\) with her endowment to \(62.5 \times 125 = 7{,}812.5\).

Self-assessment 9.8 Repeat the analysis in example 9.8, but assume that the endowment of consumer B changes to \((e_1^B, e_2^B) = (50, 100)\). How are equilibrium allocations and price affected by this endowment change?

9.8.2 Efficient Allocations

In this section, we examine whether equilibrium allocations are efficient. This is an interesting question because, if the allocation that emerges in equilibrium when individuals exchange goods is efficient (in the sense we define here), no government intervention is needed. We consider two assumptions in the rest of this chapter: (1) every consumer’s utility function is strictly increasing in the goods she enjoys, being unaffected by the amount of goods the other individual consumes; and (2) markets for goods 1 and 2 exist with prices \(p_1\) and \(p_2\), which all consumers take as given.

Efficient allocation A feasible allocation \(x\) is efficient if we cannot find another feasible allocation \(y\) that strictly increases the utility of at least one individual without reducing the utility of any other individual.

In other words, we cannot rearrange the bundles that each individual consumes to make at least one of them strictly better off than she is with \(x\), without making another individual worse off. The appendix in this chapter shows that efficiency entails that the indifference curves of consumers A and B must be tangent to one another, thus having the same slope. Because the slope of an indifference curve is measured with the MRS, we can say that an efficient allocation \(x\) requires

\[ MRS_{1,2}^A = MRS_{1,2}^B. \]

9.8.3 Equilibrium versus Efficiency

From the previous section, we have learned that an equilibrium allocation occurs when the indifference curves of consumers A and B are tangent to one another and have slopes equal to the price ratio (i.e., \(MRS_{1,2}^A = MRS_{1,2}^B = \frac{p_1}{p_2}\)). As a consequence, an equilibrium allocation also satisfies the efficiency condition \(MRS_{1,2}^A = MRS_{1,2}^B\). In other words, equilibrium allocations are efficient. This result is often known as the “First Welfare Theorem” and, given its importance, we include it next.

First Welfare Theorem Every equilibrium allocation is efficient.

Therefore, the equilibrium allocation that emerges when individuals are allowed to trade among themselves cannot be improved upon by a benevolent social planner (e.g., a government official) who reassigns goods between consumers. That is, the planner will not be able to increase the utility of at least one individual without decreasing the utility of another individual.

Importantly, this result cannot be interpreted as saying that markets are always efficient. Instead, it means that markets are efficient so long as assumptions (1) and (2) hold, but may be inefficient when these assumptions are violated. Examples where assumption (1) does not hold include consumers caring about the amount of goods that other individuals enjoy (exhibiting envy or guilt). Similarly, instances where assumption (2) is not satisfied include those in which the market for one of the goods does not exist (such as for bads like pollution), or those in which the market does exist but consumers have market power, and thus fail to take prices as given. We explore scenarios where agents sustain market power in chapters 10, 11, and 14, and contexts where markets may fail to exist in chapter 16.

Example 9.9: Finding efficient allocations

Consider the consumers of example 9.8. The tangency condition \(MRS_{1,2}^A = MRS_{1,2}^B\) yields \(\frac{x_2^A}{x_1^A} = \frac{x_2^B}{x_1^B}\) or, after rearranging, \(x_2^A x_1^B = x_2^B x_1^A\).
The feasibility requirement for good 1 says that \(x_1^A + x_1^B = 100 + 100\), or \(x_1^B = 200 - x_1^A\). Similarly, the feasibility requirement for good 2 says that \(x_2^A + x_2^B = 350 + 50\), or \(x_2^B = 400 - x_2^A\). Inserting these feasibility equations into the tangency condition, \(x_2^A x_1^B = x_2^B x_1^A\), yields

\[ x_2^A \underbrace{\left(200 - x_1^A\right)}_{x_1^B} = \underbrace{\left(400 - x_2^A\right)}_{x_2^B} x_1^A, \]

which simplifies to \(x_2^A = 2 x_1^A\). Essentially, for an allocation to be efficient, consumer A must enjoy twice as many units of good 2 as of good 1. Figure 9.11 illustrates this line, which starts at the origin of consumer A and grows with a slope of 2. Consumer B must then enjoy the remaining \(x_1^B = 200 - x_1^A\) units of good 1 and \(x_2^B = 400 - x_2^A\) units of good 2.

Figure 9.11 Efficient allocations. (Edgeworth box showing the line of efficient allocations from the origin for consumer A.)

Self-assessment 9.9 Repeat the analysis in example 9.9, but assume that the endowment of individual B changes to \((e_1^B, e_2^B) = (50, 100)\). Find the set of efficient allocations. Is it affected by the change in individual B’s endowment?

Example 9.10: Testing the First Welfare Theorem Is the equilibrium allocation found in example 9.8 efficient? For it to be efficient, the condition that we found in example 9.9, \(x_2^A = 2 x_1^A\), must hold. It is indeed satisfied because in example 9.8, we found that \(x_1^A = 137.5\) and \(x_2^A = 275\) for consumer A, where \(x_2^A = 2 \times 137.5 = 275\) units. Figure 9.12 superimposes the efficiency condition (from figure 9.11) and the equilibrium allocation (from figure 9.10), showing that, indeed, the equilibrium allocation lies on the line of efficient allocations.

Figure 9.12 Efficient and equilibrium allocations. (Edgeworth box showing the equilibrium allocation \(x^*\), the indifference curves \(IC^A\) and \(IC^B\), and the line of efficient allocations.)

Self-assessment 9.10 Consider the equilibrium allocation you found in self-assessment 9.8. Is it efficient?
Hint: It must satisfy the efficiency condition you found in self-assessment 9.9.

The First Welfare Theorem, informally, says that if we let market forces work, a social planner (even if she is perfectly informed about all individuals’ preferences and their endowments) won’t be able to improve welfare. In other words, the theorem provides an argument against market intervention. (In future chapters, we explore under which conditions market failures exist and this theorem does not hold, and thus government intervention might be necessary.)

A natural question at this point is whether the converse relationship of that in the First Welfare Theorem also holds. That is, can every efficient allocation emerge as an equilibrium outcome?

Second Welfare Theorem. Consider an efficient allocation \(x\), and a redistribution of the initial endowment, from \(e\) to \(\hat{e}\), which satisfies \(p \cdot \hat{e}^i = p \cdot x^i\) for every individual \(i \in \{A, B\}\). Then, every efficient allocation can be supported as an equilibrium allocation given the new endowment \(\hat{e}\).

To better understand this theorem, consider a scenario where society prefers, among all efficient allocations, a specific allocation \(x\). The theorem says that this allocation can emerge in equilibrium if we redistribute the initial endowments and then let the market work by allowing individuals to trade among themselves. In other words, for every efficient allocation we seek to implement, we can find the appropriate redistribution of endowments that will ultimately move the equilibrium toward that efficient allocation.⁸ One way to redistribute endowments across consumers is by taxing some consumers and distributing tax collection among other consumers as a subsidy. We explore this type of redistribution scheme in example 9.11.

Example 9.11: Testing the Second Welfare Theorem The efficient allocations found in example 9.9 satisfy \(x_2^A = 2 x_1^A\), where \(x_1^A \in [0, 200]\).
One specific allocation satisfying this condition is \(x_1^A = 100\) and \(x_2^A = 200\) units for consumer A, which leaves \(x_1^B = 200 - x_1^A = 200 - 100 = 100\) units of good 1 and \(x_2^B = 400 - x_2^A = 400 - 200 = 200\) units of good 2 for consumer B. Which redistribution of the initial endowment can lead to such an allocation emerging in equilibrium? From the equilibrium allocation in example 9.8, we know that

\[ MRS_{1,2}^A = \frac{x_2^A}{x_1^A} = \frac{p_1}{p_2} \implies p_1 x_1^A = p_2 x_2^A \tag{9.2} \]

from consumer A, and

\[ MRS_{1,2}^B = \frac{x_2^B}{x_1^B} = \frac{p_1}{p_2} \implies p_1 x_1^B = p_2 x_2^B \tag{9.3} \]

from consumer B.

8. Like the First Welfare Theorem, this theorem holds if assumptions (1) and (2) are satisfied, but may not hold otherwise. In other words, we may not be able to implement an efficient allocation via redistribution if consumers care about the amount other individuals enjoy (violating assumption 1), if the market for some good does not exist (violating assumption 2), or if consumers do not take prices as given (violating assumption 2).

We next show that this efficient allocation can be implemented if the social planner (e.g., government) sets lump-sum transfers \(t_A\) and \(t_B\) that enter each consumer’s budget constraint as additional income, so a positive transfer is a subsidy and a negative transfer is a tax (a “negative subsidy”).⁹ As the calculations show, the planner must tax individual A (\(t_A < 0\)), with the amount collected going to individual B as a subsidy (\(t_B > 0\)). Intuitively, at the equilibrium prices, consumer A’s endowment is worth more than the bundle we seek to assign to her, while the opposite holds for consumer B.

Consumer A. We can express consumer A’s budget constraint in terms of the transfer \(t_A\) she receives, as follows:¹⁰

\[ p_1 x_1^A + p_2 x_2^A = \underbrace{p_1 e_1^A + p_2 e_2^A}_{\text{Value of initial endowment}} + \underbrace{t_A}_{\text{Tax/Subsidy}}. \]

After substituting equation (9.2) on the left side and her initial endowment \((e_1^A, e_2^A) = (100, 350)\) on the right side, we find \(2 p_1 x_1^A = 100 p_1 + 350 p_2 + t_A\). Solving for \(x_1^A\), we obtain

\[ x_1^A = 50 + 175 \frac{p_2}{p_1} + \frac{t_A}{2 p_1}. \]

We now take \(x_1^A\) for consumer A and do the following: (1) insert the specific efficient allocation that we seek to implement, \((x_1^A, x_2^A, x_1^B, x_2^B) = (100, 200, 100, 200)\); (2) insert the equilibrium price ratio \(\frac{p_2}{p_1} = \frac{1}{2}\) (which is not affected relative to example 9.8);¹¹ and (3) normalize the price of good 2, so that \(p_2 = \$1\) and \(p_1 = \$2\) in equilibrium. Doing these three steps, we obtain

\[ 100 = 50 + 175 \times \frac{1}{2} + \frac{t_A}{2 \times 2}, \tag{9.4} \]

which, solving for \(t_A\), yields \(t_A = -\$150\) (a tax of $150 on consumer A). As a check, at prices \(p_1 = \$2\) and \(p_2 = \$1\), her endowment is worth (2 × 100) + 350 = $550, whereas the bundle (100, 200) costs (2 × 100) + 200 = $400. We would obtain the same result if, in this analysis, we start by solving for \(x_2^A\), yielding \(x_2^A = 50 \frac{p_1}{p_2} + 175 + \frac{t_A}{2 p_2}\). After applying steps (1) to (3), this expression collapses to \(200 = (50 \times 2) + 175 + \frac{t_A}{2 \times 1}\), which also yields \(t_A = -\$150\) for consumer A.

9. This means that transfers are subject to \(t_A + t_B = 0\), which is often referred to as being “revenue neutral.” Intuitively, tax collected cannot be used in a part of the economy not explicitly considered in our model.

10. The expression below does not assume that consumer A is subject to a tax or subsidy. We just write \(t_A\), and then we will find whether our result yields a subsidy (if \(t_A > 0\)) or a tax (if \(t_A < 0\), i.e., a negative subsidy).

11. Intuitively, the utility functions of each individual did not change, nor did the total endowment of each good. Relative to example 9.8, we are only taxing one individual, and then giving the tax collected to the other individual.

Consumer B. We can apply a similar argument to consumer B, so we can express her budget constraint as a function of the transfer \(t_B\) she faces, as follows:¹²

\[ p_1 x_1^B + p_2 x_2^B = \underbrace{p_1 e_1^B + p_2 e_2^B}_{\text{Value of endowment}} + \underbrace{t_B}_{\text{Tax/Subsidy}}. \]

After substituting equation (9.3) on the left side and her initial endowment \((e_1^B, e_2^B) = (100, 50)\) on the right side, we find \(2 p_1 x_1^B = 100 p_1 + 50 p_2 + t_B\). Solving for \(x_1^B\), we obtain

\[ x_1^B = 50 + 25 \frac{p_2}{p_1} + \frac{t_B}{2 p_1}. \]

After applying steps (1) to (3), \(x_1^B\) becomes

\[ 100 = 50 + 25 \times \frac{1}{2} + \frac{t_B}{2 \times 2}, \]

which, after solving for \(t_B\), yields \(t_B = \$150\) (a subsidy of $150 to consumer B).¹³ Finally, we confirm that the subsidy provided to consumer B, \(t_B = \$150\), coincides with the tax imposed on consumer A, \(t_A = -\$150\), so the redistribution scheme is revenue neutral.

12. A similar argument as for consumer A applies here. The expression here does not assume that consumer B is subject to a tax or subsidy. We will find out shortly whether our result yields a subsidy (if \(t_B > 0\)) or a tax (if \(t_B < 0\)).

13. A similar result arises if, in this analysis, we start by solving for good 2, \(x_2^B\), yielding \(x_2^B = 50 \frac{p_1}{p_2} + 25 + \frac{t_B}{2 p_2}\). After applying steps (1) to (3), this expression simplifies to \(200 = (50 \times 2) + 25 + \frac{t_B}{2}\), which also yields \(t_B = \$150\) for consumer B.

9.8.4 Adding Production to the Economy

Previous sections of this chapter considered exchange economies, in the sense that supply of goods comes from the initial endowments, but no production occurs (as in a barter economy). When we allow for firms, this analysis still holds, but a few new elements emerge, as we discuss next.

Equilibrium allocations. In equilibrium allocations, we still need every consumer \(i\) to solve her UMP (where we found the tangency condition \(MRS_{1,2}^i = \frac{p_1}{p_2}\)), but we also require every firm to solve its PMP (which yields an additional tangency condition \(MRT_{1,2}^j = \frac{p_1}{p_2}\)). \(MRT_{1,2}^j\) denotes the marginal rate of transformation of input \(j\) (such as labor), which is defined as

\[ MRT_{1,2}^j = \frac{MP_1^j}{MP_2^j}, \]

where \(MP_1^j\) is the marginal product of input \(j\) in the production of good 1 (i.e., how much the output of good 1 increases as the firm uses 1 more unit of input \(j\)) and, similarly, \(MP_2^j\) represents the marginal product of input \(j\) in the production of good 2.
Intuitively, the tangency condition \(MRT_{1,2}^j = \frac{p_1}{p_2}\) says that the firm rearranges the use of every input \(j\) between the production of goods 1 and 2 until their relative productivity, \(\frac{MP_1^j}{MP_2^j}\), coincides with these goods’ price ratio, \(\frac{p_1}{p_2}\).

In summary, an equilibrium allocation with production requires \(MRS_{1,2}^i = \frac{p_1}{p_2}\) and \(MRT_{1,2}^j = \frac{p_1}{p_2}\), which we can compactly express as

\[ MRS_{1,2}^i = MRT_{1,2}^j = \frac{p_1}{p_2}, \]

holding for every individual \(i\) and every input \(j\).

Efficient allocations. Regarding efficiency, the previous definition still applies; that is, an allocation is efficient if we cannot find another feasible allocation that makes at least one consumer strictly better off and no consumers worse off. Mathematically, efficiency with production requires \(MRS_{1,2}^i = MRT_{1,2}^j\), thus entailing that the rate at which consumers are willing to trade goods coincides with the rate at which firms are capable of transforming one good into another in their production process. Furthermore, the First and Second Welfare Theorems also hold in economies with production under relatively general conditions.

9.9 A Look at Behavioral Economics: Market Experiments

Several controlled experiments have tested the sharp results discussed in this chapter, namely, that an equilibrium price ratio helps clear markets. Experimenters often construct a “double auction,” in which every buyer is informed of her value for the object, while every seller is informed that her reservation price reflects the cost of producing the good. Every seller is then asked to announce a price for the good, and simultaneously, every buyer announces the price that she is willing to pay for the good. The experimenter then aggregates the willingness-to-pay from all buyers to depict an approximated demand curve and, similarly, aggregates the prices from all sellers to construct an approximated supply curve.
Finding the point where demand and supply curves cross each other, the experimenter determines the market price and quantity, and compares the observed results against the theoretical prediction. Surprisingly, experimental results converge relatively fast to the theoretical prediction. As Smith (1991) put it, “I am still recovering from the shock of the experimental results. The outcome was unbelievably consistent with competitive price theory.” For a more detailed introduction to market experiments, see Just (2013) and Angner (2016) and references therein.

Appendix. Efficient Allocations and Marginal Rate of Substitution

In this appendix, we show that an efficient allocation in an economy with two consumers and two goods must satisfy \(MRS_{1,2}^A = MRS_{1,2}^B\).

We start by recalling that, for an allocation \(x\) to be efficient, it must solve

\[ \max_{x} \; u^A(x) \]

subject to

\[ u^B(x) \ge \bar{u}^B, \]
\[ x_1^A + x_1^B \le e_1^A + e_1^B \quad \text{(Feasibility of good 1)}, \]
\[ x_2^A + x_2^B \le e_2^A + e_2^B \quad \text{(Feasibility of good 2)}. \]

Essentially, an allocation is efficient if it maximizes consumer A’s utility without reducing the utility of consumer B below a certain cutoff level \(\bar{u}^B\), while satisfying the feasibility condition for each good (which says that, in aggregate, individuals do not consume more units of each good than those in the initial endowments). The Lagrangian function associated with this maximization problem is

\[ \mathcal{L} = u^A(x) + \lambda \left[ u^B(x) - \bar{u}^B \right] + \mu_1 \left[ e_1^A + e_1^B - x_1^A - x_1^B \right] + \mu_2 \left[ e_2^A + e_2^B - x_2^A - x_2^B \right], \]

where \(\lambda\) denotes the Lagrange multiplier associated with the first constraint (the utility of consumer B cannot decrease below \(\bar{u}^B\)); \(\mu_1\) represents the Lagrange multiplier associated with the second constraint (i.e., the feasibility constraint for good 1); and \(\mu_2\) is the Lagrange multiplier associated with the third constraint (i.e., the feasibility constraint for good 2).
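The maximization problem above can be illustrated numerically before working through its first-order conditions. The sketch below is only an illustration under stated assumptions: both consumers have utility \(u = x_1 x_2\) and the total endowments are 200 units of good 1 and 400 units of good 2, as in example 9.8; the cutoff \(\bar{u}^B = 100 \times 200\) and the function name `best_bundle_for_a` are our own choices. Because utilities are strictly increasing, all three constraints are taken to bind.

```python
# Total endowments of goods 1 and 2 (from example 9.8).
TOTAL_1, TOTAL_2 = 200.0, 400.0

def best_bundle_for_a(u_b_min, step=0.5):
    """Grid search for the bundle of consumer A that maximizes
    u_A = x1 * x2 while leaving consumer B with utility u_b_min."""
    best = None
    steps = int(TOTAL_1 / step)
    for k in range(1, steps):
        x1_a = k * step
        x1_b = TOTAL_1 - x1_a          # feasibility of good 1 binds
        x2_b = u_b_min / x1_b          # B's utility constraint binds
        x2_a = TOTAL_2 - x2_b          # feasibility of good 2 binds
        if x2_a <= 0:
            continue  # infeasible: B would need more of good 2 than exists
        u_a = x1_a * x2_a
        if best is None or u_a > best[0]:
            best = (u_a, x1_a, x2_a)
    return best

u_a, x1_a, x2_a = best_bundle_for_a(u_b_min=100.0 * 200.0)

# For u = x1 * x2, the MRS is x2 / x1 for each consumer.
mrs_a = x2_a / x1_a
mrs_b = (TOTAL_2 - x2_a) / (TOTAL_1 - x1_a)
print(x1_a, x2_a, mrs_a, mrs_b)
```

At the maximizer found by the search, the two marginal rates of substitution coincide, which is the condition this appendix derives in general.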
Differentiating with respect to the units of goods 1 and 2 for consumer A, \(x_1^A\) and \(x_2^A\), we obtain

\[ \frac{\partial u^A(x)}{\partial x_1^A} - \mu_1 = 0 \quad \text{and} \quad \frac{\partial u^A(x)}{\partial x_2^A} - \mu_2 = 0 \tag{9.5} \]

in the case of interior solutions. Similarly, differentiating with respect to the units of goods 1 and 2 for consumer B, \(x_1^B\) and \(x_2^B\), we find

\[ \lambda \frac{\partial u^B(x)}{\partial x_1^B} - \mu_1 = 0 \quad \text{and} \quad \lambda \frac{\partial u^B(x)}{\partial x_2^B} - \mu_2 = 0. \tag{9.6} \]

Dividing the two equations that make up (9.5) yields

\[ \frac{\partial u^A(x) / \partial x_1^A}{\partial u^A(x) / \partial x_2^A} = \frac{\mu_1}{\mu_2} \tag{9.7} \]

and dividing the two equations in (9.6), we obtain

\[ \frac{\partial u^B(x) / \partial x_1^B}{\partial u^B(x) / \partial x_2^B} = \frac{\mu_1}{\mu_2}, \tag{9.8} \]

because \(\lambda\) cancels out. As equations (9.7) and (9.8) are both equal to \(\frac{\mu_1}{\mu_2}\), we can set them equal to each other, yielding

\[ \frac{\partial u^A(x) / \partial x_1^A}{\partial u^A(x) / \partial x_2^A} = \frac{\partial u^B(x) / \partial x_1^B}{\partial u^B(x) / \partial x_2^B}, \quad \text{or} \quad MRS_{1,2}^A = MRS_{1,2}^B. \]

Therefore, in efficient allocations, the MRS of consumers A and B must coincide.

Exercises

1. Identifying perfectly competitive markets.\(^A\) Are the following markets an example of perfect competition? If not, explain.
(a) The soybean market
(b) The market for a cable television provider
(c) The market for a popular item on the Internet
(d) The market for professional basketball players
(e) The new car market

2. Short-run equilibrium.\(^B\) Consider a perfectly competitive market with aggregate demand given by \(q^D(p) = 10 - p\). Assume that only two firms operate in this industry. The cost function of firm 1 is \(C_1(q_1) = 3q_1^2 - 7q_1\), whereas that of firm 2 is \(C_2(q_2) = 4q_2^2\).
(a) Find the supply function of each firm.
(b) If no more firms can enter the industry, find the aggregate supply. Then, identify the equilibrium price and output.

3. Long-run equilibrium and subsidies.\(^B\) Consider a perfectly competitive market with aggregate demand given by \(Q(p) = 330 - p\). All firms face the same cost function \(C(q_i) = 2q_i^2 - 4q_i + 20\), where \(q_i\) denotes the output of firm \(i\).
(a) Assuming that there is free entry and exit in the industry, find the long-run equilibrium: number of firms operating, equilibrium price, and output.
(b) The government considers two policies to induce the entry of more firms: (1) a subsidy of \(s > 0\) per unit of output sold; or (2) a subsidy of \(c > 0\) per unit of output consumed. Compare both policy tools. Which one is more effective at increasing the number of firms in the market (assume \(s = c\))?

4. Shutdown price.\(^A\) Consider a firm that washes cars and has a cost function

\[ C(q) = \frac{1}{2} q^2 + 7q + 10, \]

where \(q\) denotes the number of cars washed. Intuitively, the first two terms capture the firm’s variable cost, because they depend on the output the firm produces, whereas the last term represents its fixed cost, because it is not a function of output \(q\).
(a) Find the firm’s marginal cost curve, its average cost curve, its average variable cost curve, and its average fixed cost curve.
(b) Assume that the firm operates in a perfectly competitive industry, taking a price of \(p = \$18\) as given. Which output level does the firm choose in this scenario?
(c) What if the price of output decreases to \(p = \$11\)?
(d) For which price will the firm choose to shut down its operations?

5. Perfectly competitive equilibrium–II.\(^A\) Consider the perfectly competitive wheat market with aggregate demand given by \(Q(p) = 25 - 2p\), where \(Q\) is in thousands of pounds of wheat and \(p\) is in thousands of dollars.
(a) If the marginal cost of wheat is $2,000, what is the market equilibrium price and quantity sold?
(b) One of the more certain things in policy is the so-called farm bill, which subsidizes many agricultural goods. How would a subsidy of $1 per pound of wheat affect the market equilibrium?

6. Supply functions and equilibrium.\(^B\) Sarah and Linda are owners of competing bakeries that operate under perfect competition with aggregate demand \(Q(p) = 500 - 10p\).
Sarah’s bakery produces cakes with costs \(C_s(q_s) = 5 - 10q_s + 3q_s^2\) and Linda’s cost is \(C_l(q_l) = 10 + 2q_l^2\).
(a) Find each firm’s supply function, and, given that no other firms are entering the industry, find the aggregate supply.
(b) Identify the equilibrium price and quantity.
(c) The local chamber of commerce held a bake-off between the two that gave the winner (Sarah) free rent for the next year (assume that her entire fixed cost is her rent). Does this prize affect the equilibrium price and quantity of Sarah’s cakes?
(d) Suppose that the contest also gave a bump to the overall demand for cakes, changing aggregate demand to \(Q(p) = 600 - 10p\). What is the new equilibrium price and quantity?

7. Shutdown price and short-run supply.\(^A\) Ben is the owner of an on-demand, pre-made food service. His total cost function is \(TC(q) = 50 - 6q + 2q^2\), where his entire fixed cost is sunk (his fixed costs are the kitchen space he leases yearly). Find Ben’s shutdown price and short-run supply curve.

8. U-shaped MC.\(^B\) Consider a firm with MC curve of \(MC(q) = (q - 3)^2 + 1\) that faces a price of $6. This MC curve crosses the price at two points. Which is the profit-maximizing quantity? Can you generalize this result to any U-shaped MC curve that crosses the demand curve in two spots?

9. Finding aggregate supply and equilibrium.\(^A\) The market for gasoline currently has \(N\) firms, each of which faces the cost function \(C(q) = 5 + 0.75q^2\), and current market demand is \(Q(p) = 500 - 0.1p\).
(a) Find the aggregate supply curve.
(b) What is the equilibrium price, quantity, and number of firms that will operate in the gasoline market in the long run?

10. Crop choice and long-run equilibrium.\(^B\) Farmers make a choice each growing season about what crop to plant in each field. Explain the reasons why a farmer might choose to plant one crop over another (such as soybeans rather than cotton).
How does this choice affect the equilibrium price and quantity for each crop (the crop they plant versus the one they don’t)? Are these markets ever in a long-run equilibrium?

11. Calculating producer surplus.\(^A\) Consider the market for dog treats, which has the aggregate supply of

\[ q^S(p) = \begin{cases} \frac{1}{2} p + 5 & \text{if } p > \$5.50 \\ 0 & \text{otherwise.} \end{cases} \]

Aggregate demand in this market is \(q^D(p) = 120 - 6p\).
(a) Find the market equilibrium price and quantity.
(b) Identify the producer surplus, \(PS\).
(c) A recent report found that dog treats were making dogs overweight, and regulators propose a tax of $2 per unit to decrease the purchase of dog treats. Will the tax have the intended consequences? Find the new producer surplus, \(PS'\).

12. Impacts of a tax on PS and CS.\(^A\) Many cities have banned the use of plastic grocery bags, while some have implemented a tax on their use. Discuss the implications of these policies for producer and consumer surpluses in the grocery market in the case of a ban, and then in the case of a tax.

13. Profit versus producer surplus.\(^B\) Is the following statement true or false? Profit is the same as producer surplus. Explain your answer.

14. General linear producer surplus and consumer surplus.\(^A\) Consider a market with aggregate supply \(Q^S(P) = d + cP\), and aggregate demand \(Q^D(P) = a - bP\), where \(a > d\).
(a) Find the equilibrium price and quantity.
(b) Find the producer and consumer surpluses.
(c) Why must it be the case that \(a > d\)?

15. General equilibrium–I.\(^B\) Consider two consumers with utility functions \(u^i(x_1^i, x_2^i) = \ln x_1^i + x_2^i\), and endowments \((e_1^A, e_2^A) = (50, 100)\) for consumer A and \((e_1^B, e_2^B) = (25, 125)\) for consumer B. What is the equilibrium allocation and price?

16. General equilibrium–II.\(^B\) Two roommates, Eric and Chris, have been buying their own groceries; currently Eric has 25 slices of bread (\(b^E\)) and 8 eggs (\(e^E\)), while Chris has 12 slices of bread (\(b^C\)) and 16 eggs (\(e^C\)).
Eric’s utility function is \(u^E(b^E, e^E) = \ln e^E + b^E\), while Chris’s utility is \(u^C(b^C, e^C) = e^C b^C\). What is the equilibrium allocation and price?

17. Efficiency in general equilibrium.\(^B\) Consider two neighbors that trade food from their gardens (\(f\)) and groceries (\(g\)). Neighbor A has utility \(u^A(f^A, g^A) = \ln f^A + 2 \ln g^A\) and neighbor B has utility \(u^B(f^B, g^B) = 2 \ln f^B + \ln g^B\). What is the equilibrium allocation and price if each neighbor has 40 units of food and 25 units of groceries? Is this equilibrium efficient?

18. General equilibrium–III.\(^A\) Consider two consumers with utility functions \(u^i(x_1^i, x_2^i) = x_1^i x_2^i\), and endowments \((e_1^A, e_2^A) = (500, 100)\) for consumer A and \((e_1^B, e_2^B) = (100, 350)\) for consumer B. What is the equilibrium allocation and price?

19. Second Welfare Theorem.\(^C\) Consider the two consumers in exercise 18. Propose a second allocation (i.e., not the equilibrium that you found) that satisfies the Second Welfare Theorem. How could this allocation be implemented by a social planner?

20. Gains from trade.\(^A\) If we have two identical consumers (with the same utility function), can they gain from trade?

21. First Welfare Theorem with external effects.\(^C\) Consider two individuals with utility functions \(u^A = x^A y^A\) and \(u^B = x^B y^B - 0.5 x^A\), and endowments \(e^A = (e_x^A, e_y^A) = (15, 5)\) and \(e^B = (e_x^B, e_y^B) = (10, 15)\).
(a) Is individual A’s utility affected by individual B’s consumption? Is individual B’s utility affected by A’s consumption? Interpret.
(b) Find the equilibrium allocation.
(c) Show that the equilibrium allocation is not socially efficient. (Hint: Refer to the appendix in this chapter.)
(d) Does the First Welfare Theorem hold? Interpret your results.

22. First Welfare Theorem with restricted trade.\(^C\) Consider two individuals with utility functions \(u^A = x^A y^A\) and \(u^B = x^B y^B\), and endowments \(e^A = (e_x^A, e_y^A) = (20, 5)\) and \(e^B = (e_x^B, e_y^B) = (5, 15)\).
(a) Find the equilibrium allocation when only good x can be traded.
(b) Find the efficient allocation.
(c) Show that the equilibrium and socially efficient allocations do not coincide. Does the First Welfare Theorem hold? Interpret your results.

10 Monopoly

10.1 Introduction

Chapter 9 analyzed equilibrium output and price in perfectly competitive markets, where a large number of firms (each with a small market share) compete selling the same product. As we discussed, firms’ intense competition leads them to undercut each other’s price until it coincides with their common marginal cost of production. As a consequence, firms earn no profits in equilibrium. In this chapter, we examine the opposite type of market structure, where only one firm operates in an industry, thus having the ability to set output and price without competing with other firms.

We start by discussing the barriers that prevent firms from entering a monopolized market, allowing the monopolist to charge high prices without the threat of entry. We then examine the monopolist’s profit-maximization problem (PMP), and how it differs from that of a firm in a competitive market. We also discuss the Lerner index of market power, which essentially measures a firm’s ability to set a price markup over marginal cost.

In the last few sections of the chapter, we apply our analysis of monopoly to study three extensions. First, we consider multiplant monopolies, where a single firm produces in different plants, each with potentially distinct costs. Second, we analyze the welfare that results from a monopoly, and how it is less than that in a perfectly competitive industry. Finally, we examine markets with a single buyer and several sellers (monopsonies), showing that our mathematical approach to monopoly extends to this type of market as well.

10.2 Why Do Monopolies Exist?

In this chapter, we show that monopolies set prices above those in a perfectly competitive industry, thus reducing consumer welfare.
Before starting our analysis, a natural question is, “Why do monopolies exist in the first place if they are bad for society?” The following discussion highlights some of the barriers to entry that protect a firm from competitors joining its industry.

Structural barriers. Incumbent firms may have a cost advantage (e.g., superior technology) or a demand advantage (a large group of loyal customers). These advantages cause potential entrants to find it relatively unattractive to join the industry. A common example of cost advantage is water and natural gas distribution companies, which incur massive fixed costs to start their operation, but then face a relatively low cost of servicing each additional customer. In this scenario, average cost (i.e., cost per unit of output) is decreasing in output \(q\), implying that the average cost of a single firm producing \(q\) units is lower than the aggregate average cost of two firms that together produce \(q\) units; that is, \(AC(q) < AC(q_1) + AC(q_2)\), where \(q_1 + q_2 = q\).¹ This type of industry is often referred to as a “natural monopoly” because it is natural to find only one firm in this market, given that it benefits from decreasing average costs (economies of scale, as discussed in section 8.8 of chapter 8).

A common example of demand advantage is that of online sellers, such as Amazon or eBay. Many customers visit their websites as their first option when they plan to buy a new item. Even large firms, such as WalMart, have struggled to increase their online sales due to the tendency of most online buyers to use Amazon by default.²

Legal barriers. In some countries, both developed and underdeveloped, monopolies are sometimes legally protected. An extreme example of this would be a country not allowing new telephone companies to operate, as was the case in several European countries until the 1980s. A less extreme case, still observed in most countries nowadays, is protection from patents.
A firm can, after years of research and development, file for the patent of a product or process that it discovered. Such a patent prohibits other firms or individuals from using the patented product (or process). A common example is that of a pharmaceutical firm patenting a new drug it discovered, which allows the firm to sell the drug as a monopolist (i.e., no other firm can sell the same drug) for twenty years. You might have heard of some prescription drugs with patents sold at astronomical prices, such as Glybera, with an annual cost of $1.2 million (yes, million!); Soliris, with an annual cost of $440,000; and Elaprase, $375,000, to name but a few. Even a 5 percent copay looks scary!³

1. For instance, if \(TC(q) = 100 + 2q\), then its average cost is \(AC(q) = \frac{100}{q} + 2\), which is decreasing in \(q\). In this scenario, the average cost of producing \(q = 10\) units by a single firm would be \(AC(10) = \$12\), whereas the aggregate average cost of two firms producing 5 units each is \(AC(5) + AC(5) = 22 + 22 = \$44\). A similar argument applies to firms with total cost (TC) function of the form \(TC(q) = a + bq\), where \(a, b > 0\), yielding \(AC(q) = \frac{a}{q} + b\).

2. Amazon net sales in 2017 were around $178 billion, while WalMart’s online sales were only $11 billion, a generous sum, but small in comparison.

3. Patents, and the monopoly profits that the firm can obtain, can provide incentives for firms to invest in research and development. However, many experts point out that the length of such patents (e.g., twenty years in the case of drugs) is probably excessive. Some economists have even proposed the complete elimination of patents, because the firm discovering a new product would still make some monopoly profits from its discovery while other firms spend time understanding the details of the product before they can copy it to produce it in large volumes.

Strategic barriers. Even in the absence of legal or structural barriers, an incumbent firm can take actions to deter entry, such as starting a price war against every newcomer. By doing so, the incumbent builds a reputation of being a tough competitor, thus deterring potential entrants from joining the industry in the future. This was the case, for instance, with the price war between United Airlines and Frontier Airlines in the Billings, Montana to Denver, Colorado route in 1994.⁴ After a year of sustaining the war, Frontier Airlines (the smaller company of the two) withdrew from the route, leaving United as the only airline offering the flight between these two cities. As you might suspect, United increased its prices for this route immediately after.

10.3 The Monopolist’s Profit Maximization Problem

Some preliminaries. To better understand a monopoly, let us recall the polar opposite: perfectly competitive markets, which we examined in chapter 9. In perfectly competitive industries, the market share of every firm is so small that each individual production decision has no effect on market prices. For instance, a firm’s decision to produce 100 more computers does not significantly affect aggregate supply (which can be hundreds of millions of units every year). Because market price is a function of aggregate supply, market price is, therefore, unaffected by a negligible change in aggregate supply.⁵ Informally, the increased production of firm \(i\) is like adding a drop of water to the sea.

In contrast, in a monopolized industry, a single firm decides the output level, which implies that individual and aggregate outputs coincide in this scenario (i.e., \(q = Q\)), as there are no other firms in the market. As a result, a change in output \(q\) does affect market prices, as measured by the inverse demand function \(p(q)\), which decreases in output \(q\).
A common example is the linear inverse demand p(q) = a − bq, where a, b > 0. Graphically, this inverse demand function originates at a and decreases at a rate of b, reaching the horizontal axis at \( \frac{a}{b} \). Intuitively, when the monopolist sells few units (i.e., low values of q), consumers are willing to pay a relatively high price for the scarce good, but as the firm offers more units (larger values of q), consumers are willing to pay less for the relatively abundant good.

Writing the monopolist's problem. We can express the monopolist's profit maximization problem (PMP) as follows:

\[ \max_{q}\ \pi = TR(q) - TC(q) = p(q)\,q - TC(q). \tag{10.1} \]

Essentially, the problem asks the monopolist to choose its output q to maximize its profits π, as measured by the difference between total revenue and total cost. This problem is analogous to that presented in chapter 9 for firms operating in a perfectly competitive industry, with only one difference: the price was assumed to be a constant p in that scenario (unaffected by any firm's output decision, given its negligible market share), whereas now it is a function of the monopolist's output q, so we write its price as p(q). This will lead to different results, as we examine next.

4. This price war led both firms to reduce their prices by half! What a bargain!

5. Formally, we say that the individual production of firm i is represented by \( q_i \), which implies that, in an industry with N firms, aggregate supply is the sum of individual supplies across all N firms (i.e., \( Q = \sum_{i=1}^{N} q_i \)). In a market where the number of firms, N, is sufficiently large, an increase in the supply of one firm i, \( q_i \), does not significantly affect the aggregate supply Q, as the latter includes output from many other firms.

Solving the monopolist's problem. Differentiating the profit in equation (10.1) with respect to the monopolist's output q, we obtain

\[ p(q) + \frac{\partial p(q)}{\partial q}\,q - \frac{\partial TC(q)}{\partial q} = 0, \]

or, rearranging,

\[ \underbrace{p(q) + \frac{\partial p(q)}{\partial q}\,q}_{\text{Marginal revenue, } MR(q)} = \underbrace{\frac{\partial TC(q)}{\partial q}}_{\text{Marginal cost, } MC(q)}. \]

Therefore, to maximize its profits, the monopolist increases its output q until the marginal revenue obtained from selling an additional unit coincides with the marginal cost (i.e., extra cost) of producing that unit. If, instead, MR(q) > MC(q), the monopolist would still have incentives to increase output q because its revenue increases more than its cost. The opposite argument applies if MR(q) < MC(q), where the monopolist would have incentives to decrease its output q.

10.3.1 A Closer Look at Marginal Revenue

From these results, marginal revenue is given by

\[ MR(q) = \underbrace{p(q)}_{\text{Positive effect}} + \underbrace{\frac{\partial p(q)}{\partial q}\,q}_{\text{Negative effect}}. \]

To understand this expression, consider that the monopolist increases its output by 1 unit. This additional unit produces two effects on the firm's revenue (one positive and one negative), as indicated in the expression for MR(q). We next discuss each of these effects:

• Positive effect on MR(q). If the firm sells 1 more unit, it earns a price p(q) from that unit, as captured by the first (positive) term in MR(q), reflecting that the firm's revenue increases.

• Negative effect on MR(q). When offering 1 more unit, however, the firm needs to decrease the price of all units sold, as captured by the second term in MR(q), which is negative because \( \frac{\partial p(q)}{\partial q} < 0 \).[6] Intuitively, the second effect emerges because the market becomes flooded, thus forcing the monopolist to sell the new unit at a lower price. Because the firm charges the same price for all its units (i.e., uniform price), the price reduction on the new unit, as measured by \( \frac{\partial p(q)}{\partial q} \), must be applied to all units, q, ultimately reducing the monopolist's revenue by \( \frac{\partial p(q)}{\partial q}\,q \).

6. That is, demand decreases in q. For example, consider the inverse demand p(q) = a − bq, where \( \frac{\partial p(q)}{\partial q} = -b < 0 \).
In summary, increasing output entails a positive and a negative effect on the firm's additional revenue, whose total effect must exactly offset the additional cost that producing 1 more unit generates for the monopolist; that is, MR(q) = MC(q).

Example 10.1: Positive and negative effects of selling more units

Consider a monopoly facing an inverse demand function p(q) = 10 − 3q. If the firm were to marginally increase its output, its marginal revenue becomes

\[ MR(q) = (10 - 3q) + (-3)\,q = 10 - 6q, \]

where p(q) = 10 − 3q and \( \frac{\partial p(q)}{\partial q} = -3 \). If the firm sells q = 2 units, the price is p(2) = 10 − (3 × 2) = $4, and its total revenue is TR(2) = p(2) × 2 = (10 − 3 × 2) × 2 = $8. Evaluating the marginal revenue at q = 2 units yields

\[ MR(2) = p(2) + (-3) \times 2 = 4 - 6 = -\$2. \]

Intuitively, MR(2) = 4 − 6 means that the monopolist's total revenue experiences a positive effect of $4, because the firm now sells 1 more unit at a price of $4; but it also experiences a negative effect of $6, because selling 1 more unit entails applying a price discount of $3 on each of the 2 previous units. Overall, these two effects generate a total (net) decrease in its revenue of $2.

Example 10.2: Finding marginal revenue with linear demand

Consider a monopoly facing an inverse demand p(q) = a − bq. In this scenario, marginal revenue is

\[ MR(q) = p(q) + \frac{\partial p(q)}{\partial q}\,q = (a - bq) + (-b)\,q = a - 2bq. \]

Figure 10.1 depicts marginal revenue, MR(q) = a − 2bq, which originates at a price of a, and decreases in output q at a rate of 2b.[7] As suggested previously, when the monopolist sells few units, its decision to sell 1 more unit brings a large additional revenue (i.e., high MR(q) on the left side of figure 10.1), because the monopolist applies a price discount to only a few units. However, when it sells a large volume, selling 1 more unit brings a small increase in its revenue (i.e., low MR(q) on the right side of figure 10.1), because the monopolist is forced to apply price discounts to many units. As a result, the marginal revenue is decreasing in sales or, in other words, the additional revenue that the monopolist earns from selling further units is decreasing.

7. Indeed, evaluating the marginal revenue MR(q) = a − 2bq at an output level of zero (q = 0) yields MR(0) = a. In addition, the derivative of MR(q) = a − 2bq with respect to output q is −2b.

[Figure 10.1: Marginal revenue curve with linear demand. Axes: price p (vertical), output q (horizontal). Curves: demand p(q) = a − bq, with intercepts a and a/b; marginal revenue MR(q) = a − 2bq, with slope −2b and intercepts a and a/(2b).]

Self-assessment 10.1 Consider a monopolist facing an inverse demand p(q) = 10 − 4q. Find the marginal revenue curve, its vertical intercept, horizontal intercept, and slope.

Figure 10.1 illustrates two interesting features of the monopolist's marginal revenue curve: (1) MR(q) lies below the inverse demand curve p(q); and (2) MR(q) and p(q) originate at the same point on the vertical axis. These properties hold not only for the linear demand function of example 10.2, but also for all downward-sloping demand curves, as we show next:

• MR(q) lies below the demand curve. For the marginal revenue curve to lie below the inverse demand curve p(q), we need that \( MR(q) \leq p(q) \). That is,

\[ p(q) + \frac{\partial p(q)}{\partial q}\,q \leq p(q), \]

which simplifies to \( \frac{\partial p(q)}{\partial q}\,q \leq 0 \). Because the demand curve p(q) decreases in output q, \( \frac{\partial p(q)}{\partial q} \leq 0 \), condition \( \frac{\partial p(q)}{\partial q}\,q \leq 0 \) must hold, implying that \( MR(q) \leq p(q) \), as required.

• MR(q) and the demand curve originate at the same height. The demand curve evaluated at q = 0 is p(0), whereas the marginal revenue curve evaluated at q = 0 is \( p(0) + \frac{\partial p(q)}{\partial q} \times 0 = p(0) \) as well. Graphically, both curves originate at p(0). For instance, if the monopolist's demand curve is p(q) = a − bq, both marginal revenue and demand originate at p(0) = a, where q = 0.
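These two properties can also be checked numerically. The following is a minimal Python sketch, not part of the original text: the helper name `marginal_revenue` and the step size `h` are illustrative assumptions. It approximates MR(q) as the derivative of total revenue p(q)q with a central difference and compares it against the demand curve for the inverse demand p(q) = 10 − 3q of example 10.1.

```python
def marginal_revenue(inverse_demand, q, h=1e-6):
    """Approximate MR(q) = d[p(q) * q]/dq with a central difference.

    `inverse_demand` is any function p(q); `h` is an assumed small step.
    """
    def total_revenue(x):
        return inverse_demand(x) * x

    return (total_revenue(q + h) - total_revenue(q - h)) / (2 * h)


def linear_demand(q):
    # Inverse demand from example 10.1: p(q) = 10 - 3q.
    return 10 - 3 * q


if __name__ == "__main__":
    # At q = 0 both curves start at the same height; for q > 0, MR lies below p.
    for q in [0.0, 0.5, 1.0, 2.0, 3.0]:
        mr = marginal_revenue(linear_demand, q)
        print(f"q = {q:.1f}: p(q) = {linear_demand(q):6.2f}, MR(q) = {mr:6.2f}")
```

Because the derivative is taken numerically, the same helper can be applied to any downward-sloping inverse demand, not only the linear one.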
10.3.2 Solving the Monopolist's Problem

After our detour explaining the monopolist's marginal revenue and its properties, we can now return to this firm's PMP in equation (10.1), which yields MR(q) = MC(q).

Example 10.3: Finding monopoly output with linear demand

Consider again the monopoly of example 10.2, and assume a total cost of TC(q) = cq, where c > 0. The monopolist maximizes its profits by solving

\[ \max_{q}\ \pi = TR(q) - TC(q) = \underbrace{(a - bq)\,q}_{TR} - \underbrace{cq}_{TC}. \]

Differentiating with respect to output q yields a − 2bq − c = 0, or, rearranging,

\[ \underbrace{a - 2bq}_{MR(q)} = \underbrace{c}_{MC(q)}. \]

Figure 10.2 separately depicts the marginal revenue and marginal cost found here, as a function of q. As discussed in the previous section, MR(q) is decreasing in q, whereas MC(q) = c is constant in this example, and thus is depicted by a horizontal line.

[Figure 10.2: Output and price in a monopoly with linear demand. Axes: price p (vertical), output q (horizontal). Curves: demand p(q) = a − bq, marginal revenue MR(q) = a − 2bq, and marginal cost MC(q) = c. Marked values: output (a − c)/(2b) and price (a + c)/2.]

Rearranging this result, a − 2bq = c, we find a − c = 2bq. Solving for output q, we obtain the profit-maximizing output for the monopolist:

\[ q^M = \frac{a - c}{2b}. \]

We can now find the monopoly price by inserting this monopoly output into the inverse demand, as follows:

\[ p(q^M) = a - b\,q^M = a - b\,\frac{a - c}{2b} = \frac{2ab - b(a - c)}{2b} = \frac{a + c}{2}, \]

and monopoly profits are

\[ \pi^M = p(q^M)\,q^M - c\,q^M = \frac{a + c}{2}\,\frac{a - c}{2b} - c\,\frac{a - c}{2b} = \left( \frac{a + c}{2} - c \right) \frac{a - c}{2b} = \frac{(a - c)^2}{4b}. \]

Lastly, we can evaluate the consumer surplus under this monopoly, as follows:

\[ CS^M = \frac{1}{2} \underbrace{\left( a - \frac{a + c}{2} \right)}_{\text{height}} \underbrace{\frac{a - c}{2b}}_{\text{base}} = \frac{1}{2}\,\frac{2a - a - c}{2}\,\frac{a - c}{2b} = \frac{(a - c)^2}{8b}. \]

For instance, if the inverse demand function is p(q) = 10 − q (i.e., parameters a and b take values a = 10 and b = 1), and TC(q) = 4q, entailing c = 4, then output, price, profits, and consumer surplus under monopoly become \( q^M = \frac{10 - 4}{2} = 3 \) units, \( p^M = \frac{10 + 4}{2} = \$7 \), \( \pi^M = \frac{(10 - 4)^2}{4} = \frac{36}{4} = \$9 \), and \( CS^M = \frac{(a - c)^2}{8b} = \frac{(10 - 4)^2}{8} = \frac{36}{8} = \$4.5 \).
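The closed-form results of example 10.3 can be collected in a few lines of code. The sketch below is illustrative only: the function name `linear_monopoly` is an assumption, and it presumes the linear inverse demand p(q) = a − bq with constant marginal cost c and a > c, as in the example.

```python
def linear_monopoly(a, b, c):
    """Monopoly outcome for p(q) = a - b*q and TC(q) = c*q, assuming a > c.

    Returns (output, price, profit, consumer_surplus).
    """
    q_m = (a - c) / (2 * b)       # from MR(q) = MC(q): a - 2bq = c
    p_m = a - b * q_m             # price read off the inverse demand
    profit = (p_m - c) * q_m      # equals (a - c)^2 / (4b)
    cs = 0.5 * (a - p_m) * q_m    # triangle below demand, above price
    return q_m, p_m, profit, cs


if __name__ == "__main__":
    # Parameter values used at the end of example 10.3.
    q_m, p_m, profit, cs = linear_monopoly(a=10, b=1, c=4)
    print(f"q = {q_m}, p = {p_m}, profit = {profit}, CS = {cs}")
```

With a = 10, b = 1, and c = 4, the function returns the values found in the example: 3 units, a price of $7, profits of $9, and a consumer surplus of $4.5.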
Self-assessment 10.2 Repeat the analysis in example 10.3, but assume a TC function \( TC(q) = cq + \alpha q^2 \). Find the monopolist's output, price, profits, and consumer surplus. (Hint: Marginal cost is now increasing in output, rather than being flat.)

10.4 Common Misunderstandings of Monopoly Markets

In this section, we discuss three common misunderstandings related to monopoly markets: (1) while the monopolist does not face competition, it does not have any incentive to set infinitely high prices; (2) as opposed to firms operating in perfectly competitive industries, the monopolist does not have a supply curve; and (3) the monopolist chooses its output level in the elastic portion of the demand curve.

There are no infinitely high prices. While the monopolist is the only firm in its industry, it faces a demand curve p(q), such as p(q) = a − bq in example 10.3. As a consequence, while setting higher prices might be attractive, it surely would lead to fewer sales. Hence, the monopolist must balance the increase in total revenue brought by a higher price per unit against the fewer sales that such higher prices entail. As discussed in the previous section, this trade-off implies that the monopolist does not set too high a price, and certainly not an infinitely high price because that would imply no sales at all. In example 10.3, for instance, any price above p = $a (e.g., $10 if a = 10) entails no sales for the monopolist.

The monopolist does not have a supply curve. A common misunderstanding is to consider that the optimal output, where MR(q) = MC(q), constitutes the monopolist's supply curve. In perfectly competitive markets, we found that every firm observes the given market price, and responds by offering the output that satisfies p = MC(q). As a consequence, we obtained a supply function q(p) which, for every price p, indicated how many units the firm supplies to maximize its profits.
With a monopoly, however, this does not occur because the monopolist determines output and price simultaneously. In other words, when the monopolist chooses to produce \( q^M = 3 \) units (as in example 10.3), it simultaneously determines the market price of \( p^M = 10 - 3 = \$7 \); the firm cannot choose different output levels for a given market price of \( p^M = \$7 \). Graphically, when the monopolist chooses to produce a specific output level \( q^M \), it extends a dotted line from the horizontal axis (representing quantities) up to the demand curve, which determines the price at which every unit will be sold.

The monopolist produces in the elastic portion of the demand curve. In previous chapters, we learned that goods with few (or no) close substitutes tend to have a relatively inelastic demand curve. Monopolies often produce goods with no close substitutes; otherwise, consumers would not need to purchase such an expensive product. Many students reading about monopolies for the first time then conclude that the monopolist must be producing in the inelastic portion of the demand curve. This line of reasoning is, however, incorrect. To see this, consider again the formula of price elasticity of demand:

\[ \varepsilon_{q,p} = \frac{\%\Delta q}{\%\Delta p}. \]

If the monopolist were producing in the inelastic portion of the demand curve, the elasticity would satisfy \( |\varepsilon_{q,p}| < 1 \). Essentially, an increase in price by 1 percent would entail a reduction in sales of less than 1 percent (i.e., a less-than-proportional decrease in q). However, if that were the case, the monopolist would have clear incentives to increase its price, as sales would not be greatly affected. In other words, the monopolist does not set a price in the inelastic portion of the demand curve, as that would not be profit maximizing.
If, instead, the monopolist produces in the elastic segment of the demand curve, \( |\varepsilon_{q,p}| > 1 \), an increase in its price p by 1 percent entails a reduction in sales of more than 1 percent, thus leaving the firm with no incentive to further adjust its price. Example 10.4 evaluates the price elasticity of demand at the profit-maximizing output \( q^M \) found in example 10.3, confirming that \( |\varepsilon_{q,p}| > 1 \).

Example 10.4: Price elasticity of output \( q^M \) under a linear demand

Consider the monopolist of example 10.3, where the inverse demand function was given by p(q) = 10 − q. We found that the profit-maximizing output was \( q^M = 3 \) units, entailing an optimal price of \( p^M = \$7 \). In this scenario, we seek to find the price elasticity, as follows:

\[ \varepsilon_{q,p} = \frac{\%\Delta q}{\%\Delta p} = \frac{\Delta q / q}{\Delta p / p} = \frac{\Delta q}{\Delta p}\,\frac{p}{q}, \]

or, if the change in price is small, \( \varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\,\frac{p}{q} \). Before finding the first term, \( \frac{\partial q(p)}{\partial p} \), we need to obtain the direct demand function q(p) from the inverse demand function p(q) = 10 − q. Solving for q, we find q(p) = 10 − p. Hence, we obtain that \( \frac{\partial q(p)}{\partial p} = -1 \), ultimately yielding a price elasticity of

\[ \varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\,\frac{p^M}{q^M} = -1 \times \frac{7}{3} \approx -2.33, \]

because \( q^M = 3 \) units and \( p^M = \$7 \). Intuitively, if the monopolist increases its price by 1 percent, its sales decrease by 2.33 percent. Therefore, \( |\varepsilon_{q,p}| = 2.33 > 1 \), which illustrates the previous discussion: the monopolist sets a price \( p^M \) that lies in the elastic portion of its demand curve.

Lastly, note that this result also applies to the more general linear inverse demand function p(q) = a − bq considered in example 10.2. In particular, we first solve for q in p(q) to obtain the direct demand \( q(p) = \frac{a}{b} - \frac{1}{b}\,p \). Hence, \( \frac{\partial q(p)}{\partial p} = -\frac{1}{b} \), yielding a price elasticity of

\[ \varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\,\frac{p^M}{q^M} = -\frac{1}{b}\,\frac{\frac{a + c}{2}}{\frac{a - c}{2b}} = -\frac{1}{b}\,\frac{a + c}{2}\,\frac{2b}{a - c} = -\frac{a + c}{a - c}, \]

where \( p^M = \frac{a + c}{2} \) and \( q^M = \frac{a - c}{2b} \), as described in example 10.3. Hence, we found that \( \varepsilon_{q,p} = -\frac{a + c}{a - c} \), where the ratio \( \frac{a + c}{a - c} \) is larger than 1 given that a + c > a − c.
As a consequence, the monopolist sets its profit-maximizing price \( p^M = \frac{a + c}{2} \) in the elastic segment of the demand curve.[8]

8. For instance, if the parameters a and c take the values 10 and 3, respectively, we obtain a price elasticity of \( \varepsilon_{q,p} = -\frac{a + c}{a - c} = -\frac{10 + 3}{10 - 3} = -\frac{13}{7} \approx -1.86 \).

Self-assessment 10.3 Consider again the monopolist in self-assessment 10.2, but assume that the monopolist faces a TC function \( TC(q) = 4q^2 \). Use your findings from self-assessment 10.2 to evaluate the monopolist's price elasticity at \( q^M \).

10.5 The Lerner Index and Inverse Elasticity Pricing Rule

While the monopolist produces in the elastic portion of the demand curve, it can charge a larger margin when facing a relatively inelastic demand curve (e.g., when consumers have no close substitutes for the product) than when facing a relatively elastic demand curve (when close substitutes exist). To show the relationship between the margin, measured by the difference p − MC(q), and the price elasticity, \( \varepsilon_{q,p} \), let us start by reproducing the profit-maximizing condition for the monopolist found previously, MR(q) = MC(q) or, alternatively,

\[ p(q) + \frac{\partial p(q)}{\partial q}\,q = MC(q). \]

The marginal revenue MR(q) can be rearranged as \( MR(q) = p \left( 1 + \frac{\partial p(q)}{\partial q}\,\frac{q}{p} \right) \), where we factor price p out. We can further rewrite the marginal revenue as

\[ MR(q) = p \left( 1 + \frac{1}{\frac{\partial q(p)}{\partial p}\,\frac{p}{q}} \right) = p \left( 1 + \frac{1}{\varepsilon_{q,p}} \right), \]

because the price elasticity of demand is given by \( \varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\,\frac{p}{q} \). We can now use this expression of MR(q) in the monopolist's profit-maximizing condition MR(q) = MC(q), as follows:

\[ p \left( 1 + \frac{1}{\varepsilon_{q,p}} \right) = MC(q). \]

Rearranging, we obtain \( p + p\,\frac{1}{\varepsilon_{q,p}} = MC(q) \), or \( p - MC(q) = -p\,\frac{1}{\varepsilon_{q,p}} \). Dividing both sides by p yields

\[ \frac{p - MC(q)}{p} = -\frac{1}{\varepsilon_{q,p}}. \]

This is the "Lerner index," which says that a monopolist's ability to set a price above marginal cost, \( \frac{p - MC(q)}{p} \), is inversely related to the price elasticity of demand. (This index is also known as the "markup index" because it measures the price markup over marginal cost.)
Intuitively, as demand becomes relatively elastic (i.e., a more negative number \( \varepsilon_{q,p} \), such as −4), the ratio on the right side of the Lerner index, \( -\frac{1}{\varepsilon_{q,p}} \), decreases (e.g., if \( \varepsilon_{q,p} = -4 \), we obtain \( -\frac{1}{\varepsilon_{q,p}} = -\frac{1}{-4} = 0.25 \)), which implies that the left side must also decrease. Therefore, the price markup over marginal cost decreases, for instance, to only 25 percent of the price when price elasticity is \( \varepsilon_{q,p} = -4 \). If, in contrast, demand is less elastic, such as \( \varepsilon_{q,p} = -2 \), we find that \( -\frac{1}{\varepsilon_{q,p}} = -\frac{1}{-2} = 0.5 \), thus yielding a higher price markup of 50 percent.

Example 10.5: Lerner index with a linear demand

Consider again the linear demand of example 10.3, where p(q) = 10 − q. After solving for q, we obtain the direct demand q(p) = 10 − p, which yields an elasticity of

\[ \varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\,\frac{p}{q} = -1 \times \frac{p}{10 - p}. \]

In this scenario, marginal costs were MC(q) = 4. Hence, the Lerner index, \( \frac{p - MC(q)}{p} = -\frac{1}{\varepsilon_{q,p}} \), becomes

\[ \frac{p - 4}{p} = -\frac{1}{-1 \times \frac{p}{10 - p}}. \]

After rearranging, we obtain

\[ \frac{p - 4}{p} = \frac{10 - p}{p}, \]

which simplifies to p − 4 = 10 − p or, after solving for price p, p = $7. This result, of course, coincides with that in example 10.3.[9]

9. Generally, if we do not have precise information about marginal costs, the Lerner index becomes \( \frac{p - MC(q)}{p} = \frac{10 - p}{p} \), which reduces to p − MC(q) = 10 − p or \( p = \frac{10 + MC(q)}{2} \). Expressed in words, the monopolist sets a price by adding $10 to its marginal cost and dividing the result by two.

Self-assessment 10.4 Consider a monopolist facing an inverse demand p(q) = 10 − 4q. Following the same steps as in example 10.5, use the Lerner index to find the monopolist's profit-maximizing price.

Example 10.6: Lerner index with constant elasticity demand

Consider now a monopolist facing demand curve \( q(p) = 5p^{-\varepsilon} \).
This demand function is referred to as "constant elasticity" because the price elasticity is exactly equal to the exponent of the demand curve, \( -\varepsilon \), regardless of the price and quantity at which we evaluate the price elasticity.[10] Let us now apply the Lerner index to this demand function, assuming a marginal cost of MC(q) = $4:

\[ \frac{p - 4}{p} = -\frac{1}{\varepsilon_{q,p}}. \]

For instance, if the demand curve is \( q(p) = 5p^{-2} \) (i.e., \( \varepsilon = 2 \), so the price elasticity is \( \varepsilon_{q,p} = -2 \)), this expression becomes

\[ \frac{p - 4}{p} = -\frac{1}{-2}, \]

which simplifies to 2p − 8 = p, or p = $8. As an exercise, note that if the demand function changes to \( q(p) = 5p^{-5} \), the monopolist's price decreases to \( p = \frac{20}{4} = \$5 \). Intuitively, as demand becomes more elastic, price decreases.

10. Indeed, for any demand function of the form \( q(p) = Ap^{-\varepsilon} \), where A > 0 and \( \varepsilon > 0 \), we obtain a price elasticity of \( \varepsilon_{q,p} = \frac{\partial q(p)}{\partial p}\,\frac{p}{q} = -\varepsilon A p^{-\varepsilon - 1}\,\frac{p}{A p^{-\varepsilon}} \), which simplifies to \( \varepsilon_{q,p} = -\varepsilon\,\frac{A p^{-\varepsilon}}{A p^{-\varepsilon}} = -\varepsilon \). As a consequence, price elasticity \( \varepsilon_{q,p} \) is a constant, thus being independent of q and p.

Self-assessment 10.5 Consider a monopolist facing the demand curve \( q(p) = 10p^{-\varepsilon} \). Following the same steps as in example 10.6, use the Lerner index to find the monopolist's profit-maximizing price.

Inverse elasticity pricing rule (IEPR). We can use the Lerner index and solve for price p in order to find an expression of the monopolist's profit-maximizing price as a function of its marginal cost MC(q) and the price elasticity of demand \( \varepsilon_{q,p} \), as follows:

\[ p = \frac{MC(q)}{1 + \frac{1}{\varepsilon_{q,p}}}, \]

which is known as the "inverse elasticity pricing rule (IEPR)."[11] For instance, if the monopolist faces a marginal cost of $4 and a price elasticity of \( \varepsilon_{q,p} = -2 \), the IEPR provides an optimal price of \( p = \frac{4}{1 + \frac{1}{-2}} = \frac{4}{\frac{1}{2}} = \$8 \).

Self-assessment 10.6 Consider a monopolist facing a marginal cost of $3 and a price elasticity of \( \varepsilon_{q,p} = -1.5 \). Use the IEPR to find the monopolist's profit-maximizing price.
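The IEPR and the Lerner index are simple enough to evaluate in code. The following Python sketch is an illustration rather than part of the original text: the function names `iepr_price` and `lerner_index` are assumptions, marginal cost is taken as a constant, and the elasticity is entered as a negative number below −1, in line with the result that the monopolist prices in the elastic portion of demand.

```python
def iepr_price(marginal_cost, elasticity):
    """Price from the inverse elasticity pricing rule, p = MC / (1 + 1/elasticity).

    `elasticity` is the (negative) price elasticity of demand; the rule only
    yields a positive, finite price when demand is elastic (elasticity < -1).
    """
    if elasticity >= -1:
        raise ValueError("the IEPR requires elastic demand (elasticity < -1)")
    return marginal_cost / (1 + 1 / elasticity)


def lerner_index(price, marginal_cost):
    """Markup of price over marginal cost, as a fraction of the price."""
    return (price - marginal_cost) / price


if __name__ == "__main__":
    # Values used in the text: MC = $4 with elasticities of -2 and -5.
    for elasticity in (-2, -5):
        p = iepr_price(4, elasticity)
        print(f"elasticity = {elasticity}: p = {p:.2f}, "
              f"Lerner index = {lerner_index(p, 4):.2f}")
```

For a marginal cost of $4, the sketch reproduces the prices of $8 and $5 found in example 10.6, and the Lerner index it reports equals −1/elasticity in each case.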
10.6 Multiplant Monopoly

Our previous analysis considered a monopoly producing in a single plant (factory); but what if the firm has plants at different locations (e.g., countries), each with distinct costs? This could occur if, for instance, wages differ across countries, even if the monopolist uses the same technology and management everywhere. Would the monopolist produce all its output in the plant with the lowest marginal cost? Not necessarily, as we illustrate next.

For simplicity, we consider only two plants, 1 and 2, where \( q_1 \) denotes the output produced in plant 1, \( q_2 \) that in plant 2, and \( Q = q_1 + q_2 \) represents the total output across all plants. Our analysis can be extended to monopolies with more than two plants. In this context, the monopolist maximizes the joint profits from both plants, as follows:

\[ \max_{q_1, q_2}\ \pi = \pi_1 + \pi_2 = \underbrace{TR_1(q_1, q_2) - TC_1(q_1)}_{\pi_1} + \underbrace{TR_2(q_1, q_2) - TC_2(q_2)}_{\pi_2}, \]

where \( TR_1(q_1, q_2) = p(q_1, q_2) \times q_1 \) denotes the total revenue from selling \( q_1 \) units; \( TR_2(q_1, q_2) = p(q_1, q_2) \times q_2 \) represents the total revenue from selling \( q_2 \) units; \( TC_1(q_1) \) measures the total cost of producing \( q_1 \) units; and \( TC_2(q_2) \) is the total cost of producing \( q_2 \) units. While price \( p(q_1, q_2) \) is affected by the units produced in each plant (e.g., \( p(q_1, q_2) = 300 - q_1 - q_2 \)), the total cost in each plant depends only on the units produced in that plant; that is, \( TC_1(q_1) \) is unaffected by \( q_2 \).

11. To see that the IEPR originates from the Lerner index, rearrange the index as follows: \( p - MC(q) = -\frac{p}{\varepsilon_{q,p}} \), or \( p + \frac{p}{\varepsilon_{q,p}} = MC(q) \). Factoring price p out on the left side, we obtain \( p \left( 1 + \frac{1}{\varepsilon_{q,p}} \right) = MC(q) \). Finally, solving for p, we find the expression of the IEPR, \( p = \frac{MC(q)}{1 + \frac{1}{\varepsilon_{q,p}}} \).

We can alternatively express this maximization problem as

\[ \max_{q_1, q_2}\ \pi = \left[ p(q_1, q_2) \times q_1 - TC_1(q_1) \right] + \left[ p(q_1, q_2) \times q_2 - TC_2(q_2) \right] = p(q_1, q_2) \times (q_1 + q_2) - TC_1(q_1) - TC_2(q_2). \]
Differentiating with respect to \( q_1 \) yields

\[ \underbrace{p(q_1, q_2) + \frac{\partial p(q_1, q_2)}{\partial q_1}\,(q_1 + q_2)}_{MR_1} = \underbrace{\frac{\partial TC_1(q_1)}{\partial q_1}}_{MC_1}, \]

where the left side captures the marginal revenue that the multiplant monopolist earns after increasing the production of plant 1 by 1 unit, whereas the right side indicates the marginal cost of this additional production. Differentiating with respect to \( q_2 \), we obtain a similar expression:

\[ \underbrace{p(q_1, q_2) + \frac{\partial p(q_1, q_2)}{\partial q_2}\,(q_1 + q_2)}_{MR_2} = \underbrace{\frac{\partial TC_2(q_2)}{\partial q_2}}_{MC_2}. \]

In the special case \( \frac{\partial p(q_1, q_2)}{\partial q_1} = \frac{\partial p(q_1, q_2)}{\partial q_2} \), marginal revenues from each plant coincide (\( MR_1 = MR_2 = MR \)), implying that the multiplant monopoly maximizes its joint profits at the point where \( MR = MC_1 = MC_2 \). Intuitively, this occurs when prices are affected to the same extent when either plant increases its production, such as with inverse demand function \( p(q_1, q_2) = 300 - q_1 - q_2 \). In this scenario, the multiplant monopoly only needs to equate marginal costs across plants; otherwise, the manager still has the incentive to shift production from the plant with the highest marginal cost to that with the lowest marginal cost. However, when \( \frac{\partial p(q_1, q_2)}{\partial q_1} \neq \frac{\partial p(q_1, q_2)}{\partial q_2} \), marginal revenues from each plant do not coincide, which may occur if the inverse demand function is \( p(q_1, q_2) = 300 - q_1 - 0.5q_2 \). In this context, the multiplant monopoly maximizes joint profits when both first-order conditions, \( MR_1 = MC_1 \) and \( MR_2 = MC_2 \), hold.

Example 10.7: Multiplant monopoly

Consider a monopolist facing inverse demand function p(Q) = 100 − Q, where Q denotes aggregate output. In addition, assume that the monopolist operates two plants, one in the US with total cost \( TC_1(q_1) = 5 + 12q_1 + 6(q_1)^2 \), and another in Chile, with total cost \( TC_2(q_2) = 2 + 18q_2 + 3(q_2)^2 \). The monopolist maximizes the joint profits from both plants as follows:

\[ \max_{q_1 \geq 0,\ q_2 \geq 0}\ \pi = \pi_1 + \pi_2 = \underbrace{(100 - q_1 - q_2)\,q_1 - TC_1(q_1)}_{\pi_1} + \underbrace{(100 - q_1 - q_2)\,q_2 - TC_2(q_2)}_{\pi_2}. \]
Therefore, the monopolist chooses its output in the US plant, \( q_1 \), and in the Chilean plant, \( q_2 \), to maximize its total profits \( \pi = \pi_1 + \pi_2 \), where the latter are given by the sum of revenues across both plants minus the total costs in each plant. Differentiating with respect to output \( q_1 \), we obtain

\[ 100 - 2q_1 - q_2 - 12 - 12q_1 - q_2 = 0, \]

which simplifies to

\[ 88 - 14q_1 - 2q_2 = 0, \]

or, after solving for \( q_1 \), \( q_1 = \frac{44 - q_2}{7} \). Similarly, differentiating these total profits with respect to output \( q_2 \) yields

\[ 100 - q_1 - 2q_2 - 18 - 6q_2 - q_1 = 0, \]

which collapses to

\[ 82 - 2q_1 - 8q_2 = 0, \]

and, after solving for \( q_2 \), entails \( q_2 = \frac{41 - q_1}{4} \). Inserting this result into \( q_1 = \frac{44 - q_2}{7} \), we obtain \( q_1 = \frac{44 - \frac{41 - q_1}{4}}{7} \), which simplifies to \( 7q_1 = \frac{135 + q_1}{4} \), yielding an optimal production in the US plant of \( q_1 = 5 \) units. Therefore, the optimal production in the Chilean plant is \( q_2 = \frac{41 - 5}{4} = 9 \) units, entailing an aggregate output of \( Q = q_1 + q_2 = 5 + 9 = 14 \) units. In summary, the multiplant monopoly produces a share of \( \frac{q_1}{Q} = \frac{5}{14} \approx 0.36 \) in the US plant, and the remaining \( \frac{q_2}{Q} = \frac{9}{14} \approx 0.64 \) in the Chilean plant.[12]

12. As an exercise, note that if both plants were symmetrical in costs (e.g., both faced a total cost of \( TC(q_i) = 5 + 12q_i + 6(q_i)^2 \) for every plant i), the optimal output levels would coincide across all plants. In particular, the differentiation with respect to output \( q_i \) would yield \( 88 - 14q_i - 2q_j = 0 \) for every plant \( i \neq j \). Simultaneously solving for \( q_i \) and \( q_j \) yields \( q_i = q_j = 5.5 \) units.

Self-assessment 10.7 Consider the multiplant monopolist in example 10.7, but assume that the inverse demand function changes to \( p(Q) = 300 - \frac{1}{2}Q \). Follow the steps in example 10.7 to find the optimal output in each plant. How are these results affected by the demand increase?
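The two first-order conditions of example 10.7 form a linear system in the two outputs, which can be solved mechanically. The Python sketch below is illustrative and rests on stated assumptions: the function name `two_plant_outputs` is hypothetical, demand is linear, p(Q) = a − bQ, each plant has a quadratic cost with linear coefficient c_i and quadratic coefficient d_i (fixed costs drop out of the first-order conditions), and the solution is interior (both outputs positive).

```python
def two_plant_outputs(a, b, c1, d1, c2, d2):
    """Interior solution for p(Q) = a - b*Q and TC_i(q_i) = F_i + c_i*q_i + d_i*q_i**2.

    The first-order conditions MR = MC_i are linear in (q1, q2):
        (2b + 2*d1) * q1 + 2b * q2          = a - c1
        2b * q1          + (2b + 2*d2) * q2 = a - c2
    and are solved here with Cramer's rule.
    """
    a11, a12, r1 = 2 * b + 2 * d1, 2 * b, a - c1
    a21, a22, r2 = 2 * b, 2 * b + 2 * d2, a - c2
    det = a11 * a22 - a12 * a21
    q1 = (r1 * a22 - a12 * r2) / det
    q2 = (a11 * r2 - a21 * r1) / det
    return q1, q2


if __name__ == "__main__":
    # Example 10.7: p(Q) = 100 - Q, TC1 = 5 + 12 q1 + 6 q1^2, TC2 = 2 + 18 q2 + 3 q2^2.
    q1, q2 = two_plant_outputs(a=100, b=1, c1=12, d1=6, c2=18, d2=3)
    mr = 100 - 2 * (q1 + q2)   # common marginal revenue at total output Q
    mc1 = 12 + 12 * q1         # marginal cost in the US plant
    mc2 = 18 + 6 * q2          # marginal cost in the Chilean plant
    print(f"q1 = {q1}, q2 = {q2}, MR = {mr}, MC1 = {mc1}, MC2 = {mc2}")
```

For the parameters of example 10.7, the sketch returns 5 units in the US plant and 9 units in the Chilean plant, and at those outputs marginal revenue and both marginal costs coincide, as the condition MR = MC1 = MC2 requires.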
Cartels. Our analysis of how the multiplant monopolist determines its aggregate production Q, and how it distributes such production among its plants (how much is produced in plant 1, and how much in plant 2), is analogous to a "cartel" problem. A cartel is a group of firms coordinating their production decisions to increase their joint profits, such as the Organization of the Petroleum Exporting Countries (OPEC), the diamond cartel, and the lysine cartel (as portrayed in the movie The Informant, featuring Matt Damon). Therefore, the cartel is, as a group, equivalent to a monopolist with different plants, where each plant is one of the firms participating in the cartel. For instance, in the OPEC cartel, some countries have a lower marginal cost of production (i.e., a lower cost of extracting an additional barrel of oil), such as Saudi Arabia, while others have a higher marginal cost, such as Angola or Venezuela. As a consequence, they coordinate their total production and distribute it among the cartel participants. We return to the analysis of cartels in chapter 14, where we examine imperfectly competitive markets, and firms' incentives to collude to further increase their profits.

10.7 Welfare Analysis under Monopoly

As suggested in previous sections, output is lower under monopoly than under perfect competition, entailing a higher price. As figure 10.3 illustrates, this implies that consumer surplus is much smaller than under perfect competition because customers pay more per unit and buy fewer units. In contrast, profits are larger. A natural question is whether the firm's profit gain offsets the consumers' loss, giving rise to an overall increase in social welfare. As the graph depicts, the firm's profit gain does not compensate for the loss in consumer surplus, ultimately yielding a net loss in social welfare. In particular, consumer surplus decreases from A + B + C to A, entailing a loss of B + C, as summarized in table 10.1.
In contrast, profits increase from D + E + F to D + F + B, implying a net gain of B − E. As a consequence, a part of consumer welfare (region B) is transferred to the monopolist in the form of larger profits. However, a portion of total welfare under a perfectly competitive market is not transferred to another agent, but lost (see the region C + E). This net loss in social welfare, C + E, is often referred to as the "deadweight loss" of monopoly, which is explored in example 10.8.

[Figure 10.3: Welfare changes from a monopolistic market. Axes: price p (vertical), output q (horizontal). Curves: demand p(q), marginal revenue MR(q), and marginal cost MC(q). Marked prices pM and pPC, outputs qM and qPC, and areas A through F.]

Table 10.1 Welfare changes from monopoly.

                   Perfect Competition      Monopoly         Difference
Consumer Surplus   A + B + C                A                −B − C
Profits            D + E + F                D + F + B        B − E
Welfare            A + B + C + D + E + F    A + D + F + B    −C − E

Example 10.8: Finding the deadweight loss of a monopoly

Consider the monopolist in example 10.3, where p(q) = 10 − q and MC(q) = 4. In that scenario, we found that monopoly output was \( q^M = 3 \) units, entailing a monopoly price of \( p^M = \$7 \), which generates a consumer surplus of \( CS^M = \frac{1}{2}(10 - 7) \times 3 = \$4.50 \) and profits of \( \pi^M = (7 \times 3) - (4 \times 3) = \$9 \), for a total welfare of \( W^M = CS^M + \pi^M = 4.50 + 9 = \$13.50 \).

Under perfect competition, output is found at the point where demand crosses supply (the marginal cost (MC) curve), 10 − q = 4, yielding \( q^{PC} = 6 \) units, and a price of \( p^{PC} = \$4 \). Figure 10.4 depicts \( q^{PC} \) and \( p^{PC} \) under perfect competition, comparing them against \( q^M \) and \( p^M \) under monopoly. Therefore, consumer surplus is \( CS^{PC} = \frac{1}{2}(10 - 4) \times 6 = \$18 \), and profits are \( \pi^{PC} = (4 \times 6) - (4 \times 6) = \$0 \), which generates a total welfare of \( W^{PC} = CS^{PC} + \pi^{PC} = 18 + 0 = \$18 \).

[Figure 10.4: Monopoly versus perfect competition: an example. Axes: price p (vertical), output q (horizontal). Curves: demand p(q) = 10 − q, marginal revenue MR(q) = 10 − 2q, and marginal cost MC(q) = 4. Marked values: pM = $7, pPC = $4, qM = 3, qPC = 6, and the deadweight loss triangle DWL.]

The difference between the welfare levels across market structures, \( W^{PC} - W^M = 18 - 13.50 = \$4.50 \), represents the deadweight loss of monopoly.
Alternatively, we could find such deadweight loss by measuring the area of triangle DWL in figure 10.4, as follows:¹³
\[
DWL = \frac{1}{2}\,\underbrace{\left[p^M - MC(q^M)\right]}_{\text{height}}\,\underbrace{\left(q^{PC} - q^M\right)}_{\text{base}} = \frac{1}{2}(\$7 - \$4)(6 - 3) = \frac{9}{2} = \$4.5.
\]
Intuitively, society loses $4.5 from having a monopoly, rather than a perfectly competitive industry. As discussed previously, this is a net loss, not simply a welfare transfer from consumers to the monopolist.

Self-assessment 10.8 Consider the monopolist in example 10.8, but assume now that its total cost is \(TC(q) = 4q^2\). Repeat the steps in example 10.8 to find the consumer surplus, profits, and welfare under monopoly, and then under perfect competition, and ultimately, find the welfare difference measuring the deadweight loss from monopoly.

13. Note that \(MC(q^M) = \$4\) because \(MC(q) = 4\) is a flat horizontal line, as depicted in figure 10.4.

10.8 Advertising in Monopoly

When investing in advertising campaigns, the monopolist must balance the additional demand that advertising entails and its associated costs. In other words, the monopolist faces a trade-off: advertising increases demand, but it is costly. To find the profit-maximizing amount of advertising, \(A\), let us write the monopolist’s problem as follows:
\[
\max_{A}\ \pi = TR - TC - A,
\]
where the last term, \(A\), denotes the cost of advertising. Because total revenue is \(TR = p \times q\), and total cost is \(TC(q)\), we can rewrite this problem as follows:
\[
\max_{A}\ \pi = (p \times q) - TC(q) - A = \left[p \times q(p, A)\right] - TC\left[q(p, A)\right] - A,
\]
where \(q = q(p, A)\) represents the demand function (sales), which decreases in price \(p\), but increases in the amount of advertising \(A\). We can now differentiate with respect to the amount of advertising \(A\), to obtain¹⁴
\[
p\,\frac{\partial q(p, A)}{\partial A} - \underbrace{\frac{\partial TC}{\partial q}}_{MC}\,\frac{\partial q(p, A)}{\partial A} - 1 = 0,
\]
or, rearranging,
\[
(p - MC)\,\frac{\partial q(p, A)}{\partial A} = 1. \tag{10.2}
\]
To express this result more compactly, let us define the advertising elasticity of demand, \(\varepsilon_{q,A}\), as follows:
\[
\varepsilon_{q,A} = \frac{\%\ \text{increase in } q}{\%\ \text{increase in } A} = \frac{\Delta q / q}{\Delta A / A} = \frac{\Delta q}{\Delta A}\,\frac{A}{q}.
\]
In the case of a small change in \(A\), \(\varepsilon_{q,A}\) can be rewritten as \(\varepsilon_{q,A} = \frac{\partial q(p,A)}{\partial A}\,\frac{A}{q}\). After rearranging this expression, we find
\[
\varepsilon_{q,A}\,\frac{q}{A} = \frac{\partial q(p, A)}{\partial A}.
\]

14. In the first term of the monopolist’s profit, \(p \times q(p, A)\), advertising affects only the second component, \(q(p, A)\), so the derivative of \(p \times q(p, A)\) is \(p\,\frac{\partial q(p,A)}{\partial A}\). In the second component of profits, \(TC(q(p, A))\), differentiating with respect to \(A\) requires the application of the chain rule, yielding \(\frac{\partial TC}{\partial q}\,\frac{\partial q(p,A)}{\partial A}\), because \(TC(q)\) is a function of \(q\), and \(q\) is a function of \(A\). Finally, the third term in the monopolist’s profits, \(A\), is linear in advertising, producing a derivative of 1.

Therefore, we can rewrite equation (10.2) as
\[
(p - MC)\,\varepsilon_{q,A}\,\frac{q}{A} = 1.
\]
Dividing both sides by \(\varepsilon_{q,A}\) and rearranging yields \(p - MC = \frac{1}{\varepsilon_{q,A}}\,\frac{A}{q}\). In addition, dividing both sides by \(p\), we find
\[
\frac{p - MC}{p} = \frac{1}{\varepsilon_{q,A}}\,\frac{A}{pq}. \tag{10.3}
\]
From the IEPR, we know that \(\frac{p - MC}{p} = -\frac{1}{\varepsilon_{q,p}}\). Hence, the left side of equation (10.3) becomes
\[
-\frac{1}{\varepsilon_{q,p}} = \frac{1}{\varepsilon_{q,A}}\,\frac{A}{pq}.
\]
And rearranging, we get
\[
-\frac{\varepsilon_{q,A}}{\varepsilon_{q,p}} = \frac{A}{pq}.
\]
The right side represents the advertising-to-sales ratio. Therefore, for two markets with the same price elasticity of demand, \(\varepsilon_{q,p}\), the advertising-to-sales ratio \(\frac{A}{pq}\) must be larger in the market where demand is more sensitive to advertising (higher \(\varepsilon_{q,A}\)).

Example 10.9: Finding the monopolist’s optimal advertising ratio

Consider a monopolist with a price elasticity of demand of \(\varepsilon_{q,p} = -1.5\) and an advertising elasticity of \(\varepsilon_{q,A} = 0.1\). In this scenario, the advertising-to-sales ratio should be \(\frac{A}{pq} = -\frac{\varepsilon_{q,A}}{\varepsilon_{q,p}}\), which entails
\[
-\frac{\varepsilon_{q,A}}{\varepsilon_{q,p}} = -\frac{0.1}{-1.5} = 0.067.
\]
Therefore, advertising should account for 6.7 percent of this monopolist’s total revenue.

Self-assessment 10.9 Consider the monopolist in example 10.9, but assume now that advertising elasticity increases to \(\varepsilon_{q,A} = 0.3\). Find the advertising-to-sales ratio \(\frac{A}{pq}\), and compare it to that in example 10.9. Interpret your results.
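The advertising rule derived in this section reduces to a one-line computation. The sketch below reproduces the figure in example 10.9; the function name `advertising_to_sales_ratio` and the input checks are illustrative assumptions rather than anything defined in the text.

```python
def advertising_to_sales_ratio(eps_q_a, eps_q_p):
    """Return A/(p*q) = -eps_q_a / eps_q_p.

    eps_q_a: advertising elasticity of demand (assumed positive).
    eps_q_p: price elasticity of demand (assumed negative).
    """
    if eps_q_p >= 0:
        raise ValueError("price elasticity of demand is expected to be negative")
    if eps_q_a < 0:
        raise ValueError("advertising elasticity is expected to be non-negative")
    return -eps_q_a / eps_q_p


# Example 10.9: eps_q_p = -1.5 and eps_q_a = 0.1.
ratio = advertising_to_sales_ratio(eps_q_a=0.1, eps_q_p=-1.5)
print(f"Advertising-to-sales ratio: {ratio:.3f}")
```

Holding the price elasticity fixed and raising `eps_q_a` raises the ratio, which matches the comparison across markets made in the text.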
10.9 Monopsony

A “monopsony” can be understood as a direct application of monopoly where, rather than a single seller offering its good to several buyers, there is now only one buyer in the market and several sellers. Examples of monopsonies are often found in small labor markets, such as mining jobs in a small town with only one mine (employer) and many workers willing to work (employees), or Walmart in a small town where it is the main employer.¹⁵ In this scenario, the buyer (employer) will be able to pay less for each hour of labor (lower wages) than if it had to compete against other employers to attract employees, as in a perfectly competitive market.

To show this result, and to explicitly find how the monopsony chooses its output and price (wage), consider a firm (e.g., a coal mine) with production function \(q = f(L)\) which, as in chapter 7, increases in the number of workers hired, \(f'(L) > 0\), but at a decreasing rate, \(f''(L) < 0\). The profits of the coal mine are then given by
\[
\pi = TR - TC = pq - w(L)L.
\]
Intuitively, the firm extracts \(q\) units of coal, each sold at a price \(p\) in the international market, yielding a total revenue of \(TR = pq\). (For simplicity, we assume that such a price is given, implying that the international market for coal is perfectly competitive, so a larger/smaller production by the mine does not alter the international price \(p\).) Regarding total costs, the firm hires \(L\) workers, paying each of them a wage of \(w(L)\). Importantly, such a wage \(w(L)\) is a function of \(L\), specifically increasing in the number of workers hired \(L\). Intuitively, as the firm hires more workers, labor becomes more scarce, and a more generous wage must be offered to attract new workers; that is, \(w(L)\) is increasing, \(w'(L) > 0\).

We are now ready to write the monopsonist’s PMP. Because the production function is given by \(q = f(L)\), we can rewrite the profit as \(\pi = pf(L) - w(L)L\), which is only a function of \(L\), the number of workers hired by the mine.
Hence, the monopsonist solves
\[
\max_{L \geq 0}\ \pi = pf(L) - w(L)L.
\]
Intuitively, this problem says: “Choose the number of workers you plan to hire, \(L\), so as to maximize your profits.”

15. Other examples include technology companies, such as Cisco and Oracle, which have recently started contracts that prohibit their employees from working for a competing firm for a period of time after leaving their current job (such as months or years). This contract requirement essentially keeps a worker from using competing offers as a negotiation tool with her current employer.

Differentiating with respect to \(L\), we obtain¹⁶
\[
pf'(L) - \left[w(L) + w'(L)L\right] = 0,
\]
or, rearranging,
\[
\underbrace{pf'(L)}_{MRP_L} = \underbrace{w(L) + w'(L)L}_{ME_L}.
\]
Let us intuitively express this condition. The left side represents that, after hiring 1 more worker (increase in \(L\)), the firm produces \(f'(L)\) more units of output (e.g., coal). Because these additional units are sold at a price \(p\), the left side, \(pf'(L)\), measures the marginal revenue product of hiring 1 more worker, \(MRP_L\). In other words, it denotes the market value of the additional output that the firm can produce when hiring 1 more worker.

In contrast, the right side measures the increase in cost \(w(L)L\) that the firm experiences when hiring 1 more worker. On the one hand, this worker must be paid \(w(L)\), as represented by the first term in \(ME_L\). On the other hand, the additional worker is only attracted to the job if the firm offers her a higher salary because labor becomes scarcer as the firm hires each additional worker. Such a wage increase, \(w'(L)\), must be passed on to all existing workers, which entails a cost increase of \(w'(L)L\) for the firm.
Overall, the firm’s total expenditure on labor increases by \(ME_L = w(L) + w'(L)L\).¹⁷ In summary, the monopsonist optimality condition says that \(MRP_L = ME_L\), implying that the monopsonist hires workers until the point where the additional market value of the output produced by the new worker, \(MRP_L\), coincides with the additional cost that the firm incurs when hiring such a worker, \(ME_L\).

16. The second term in the firm’s profit, \(w(L)L\), is a product with both elements depending on \(L\). Therefore, when we differentiate \(w(L)L\) with respect to \(L\), we apply the product rule, obtaining \(w(L)\frac{\partial L}{\partial L} + \frac{\partial w(L)}{\partial L}L = w(L) + \frac{\partial w(L)}{\partial L}L\), because \(\frac{\partial L}{\partial L} = 1\). Using \(w'(L) = \frac{\partial w(L)}{\partial L}\) to denote the derivative of \(w(L)\) with respect to \(L\), we can express this derivative more compactly as \(w(L) + w'(L)L\).

17. Note that this increase in costs is analogous to the increase in total revenue that the monopolist experiences if it chooses to sell 1 more unit of output. The extra unit is sold at a price \(p(q)\), but that unit is sold only if the monopolist lowers its price by \(p'(q)\), where the price discount must be applied to all units, entailing a loss in revenue of \(p'(q)q\).

Example 10.10: Finding optimal L in monopsony

Consider a coal company that is the only employer in a small town. The mine has production function \(q = 100 \times \ln L\) and faces an international perfectly competitive price of coal, given by \(p = \$8\). In addition, assume that the supply curve for labor is \(w(L) = 3 + \frac{1}{2}L\). In this scenario, the marginal revenue product of labor is
\[
MRP_L = \underbrace{p}_{8}\,\underbrace{f'(L)}_{100 \times \frac{1}{L}} = 8 \times 100\,\frac{1}{L} = \frac{800}{L}.
\]

Figure 10.5 Labor and wages under monopsony.

Figure 10.5 depicts the \(MRP_L = \frac{800}{L}\) curve, which decreases in labor \(L\), becoming flatter as \(L\) increases.¹⁸ In addition, we can find the marginal expenditure that the firm incurs from hiring 1 more worker, \(ME_L\), as follows:
\[
ME_L = w(L) + w'(L)L = \left(3 + \frac{1}{2}L\right) + \frac{1}{2}L = 3 + L,
\]
which, as depicted in figure 10.5, originates at $3, the same vertical intercept as the labor supply curve \(w(L)\). However, \(ME_L\) is twice as steep as \(w(L)\) (i.e., the slope of \(ME_L\) is 1, while that of \(w(L)\) is only 1/2). Setting \(MRP_L = ME_L\), we find their crossing point, as follows:
\[
\frac{800}{L} = 3 + L,
\]
which, expanding, yields \(800 = 3L + L^2\), or \(L^2 + 3L - 800 = 0\). Solving for \(L\) in this equation, we find two roots,¹⁹ \(L = -29.82\) and \(L = 26.82\). Because the firm cannot hire a negative number of workers, we discard the negative root and, in integer amounts, find that \(L^M = 26\) workers is optimal. At \(L^M = 26\), wages become \(w(26) = 3 + 26 \times \frac{1}{2} = \$16\).

18. Indeed, note that the first-order derivative of \(MRP_L\) with respect to \(L\), \(\frac{\partial MRP_L}{\partial L} = -\frac{800}{L^2}\), is negative for all positive values of \(L\); and its second-order derivative, \(\frac{\partial^2 MRP_L}{\partial L^2} = \frac{1600}{L^3}\), is positive for all positive values of \(L\).

Under a perfectly competitive labor market, the number of workers is determined by the point where \(MRP_L\) crosses the labor supply curve \(w(L)\), \(MRP_L = w(L)\), which in this example implies
\[
\frac{800}{L} = 3 + \frac{1}{2}L;
\]
after expanding, this expression yields \(800 = 3L + \frac{1}{2}L^2\). Solving for \(L\), we also obtain two roots, \(L = -43.11\) and \(L = 37.11\), with the latter being the optimal number of workers hired under perfect competition in the labor market, \(L^{PC} = 37\) (in integer amounts, as depicted in figure 10.5). In this context, wages become \(w(37) = 3 + \frac{1}{2}37 = \$21.5\).
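The two hiring conditions in example 10.10 can be solved numerically as a check. This is a minimal sketch under the example’s assumptions (\(MRP_L = 800/L\) and \(w(L) = 3 + \frac{1}{2}L\)); the helper names `positive_root` and `wage` are hypothetical, and rounding down to whole workers follows the integer treatment used in the example.

```python
import math


def positive_root(a, b, c):
    """Larger root of a*x**2 + b*x + c = 0 via the quadratic formula.

    With a > 0 and c < 0 (as in both equations below) this root is positive.
    """
    discriminant = b * b - 4 * a * c
    return (-b + math.sqrt(discriminant)) / (2 * a)


def wage(workers):
    """Labor supply curve from example 10.10: w(L) = 3 + L/2."""
    return 3 + 0.5 * workers


# Monopsony: 800/L = 3 + L  ->  L^2 + 3L - 800 = 0
l_monopsony = positive_root(1, 3, -800)
# Competitive labor market: 800/L = 3 + L/2  ->  L^2 + 6L - 1600 = 0
l_competitive = positive_root(1, 6, -1600)

# Assumption: round down to whole workers, as the example does.
l_m = math.floor(l_monopsony)
l_pc = math.floor(l_competitive)

print(f"Monopsony:   L = {l_monopsony:.2f} -> {l_m} workers, wage = {wage(l_m)}")
print(f"Competition: L = {l_competitive:.2f} -> {l_pc} workers, wage = {wage(l_pc)}")
```

Evaluating the wage at the unrounded roots instead would give slightly higher wages in both cases, so the rounding convention matters when comparing against the figures in the example.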
Hence, when the labor market is competitive, workers receive a higher wage and more workers are willing to work at that wage; whereas under a monopsony, the single employer takes advantage of her purchasing power by offering a lower wage, which entails that fewer workers are willing to work.

Self-assessment 10.10 Consider the coal company in example 10.10, but assume now that the price of coal \(p\) increases from $8 to $10. Use the same steps as in example 10.10 to find the number of workers hired under monopsony, under perfect competition, and also find the corresponding salaries.

19. Expression \(L^2 + 3L - 800 = 0\) is a quadratic equation, which has the general form \(ax^2 + bx + c = 0\). We can use the quadratic formula to find the two roots for \(x\), as follows: \(x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\). In this context, the quadratic formula entails \(L = \frac{-3 \pm \sqrt{3^2 + (4 \times 1 \times 800)}}{2 \times 1}\), which simplifies to \(\frac{-3 \pm \sqrt{3{,}209}}{2} = \frac{-3 \pm 56.64}{2}\), which in turn produces two roots, \(L = \frac{-3 - 56.64}{2} = -29.82\) and \(L = \frac{-3 + 56.64}{2} = 26.82\).

Exercises

1. Monopoly equilibrium-linear costs.^B Consider a drug company holding the patent of a new drug for a rare disease (monopoly rights). The firm faces inverse demand function \(p(q) = 100 - 0.1q\), and a cost function \(C(q) = 4q\).
(a) Find the monopolist’s profit-maximizing output, its price, and its profits.
(b) Assume now that the government wants the monopolist to produce the competitive equilibrium output (i.e., where demand crosses the MC function). Find the competitive equilibrium output in this context.
(c) Find the subsidy per unit of output that the government needs to offer the monopolist to induce the latter to produce the competitive equilibrium output you identified in part (b).
(d) What is the total cost that the government incurs with the subsidy? How are profits affected by the subsidy (i.e., the change in profits from parts a to c)?
2.
Monopoly equilibrium-convex costs.^B Consider a drug company holding the patent of a new drug for a rare disease (monopoly rights). The firm faces demand function \(p(q) = 100 - 3q\), and a cost function \(C(q) = 5q^2\).
(a) Find the monopolist’s profit-maximizing output, its price, and its profits. Find the deadweight loss from the monopoly.
(b) Assume that the government seeks to collect $100 by imposing a tax on the monopolist. (For simplicity, let us assume that the tax is revenue neutral, so upon collecting it from the firm, it is transferred to consumers.) Consider that the government sets a tax \(t\) per unit of output. Find the optimal tax that helps the government collect $100. Then identify the resulting equilibrium output, price, profits, and deadweight loss.
(c) Consider now that the government sets a lump-sum tax \(T\) on profits. Find the optimal tax that helps the government collect $100. Then identify the resulting equilibrium output, price, profits, and deadweight loss.
3. Maximizing revenue versus profit.^A Consider a monopolist facing linear inverse demand function \(p(q) = 20 - 2q\), and constant marginal cost \(MC(q) = 1\).
(a) Assume that the monopolist seeks to maximize total revenue rather than profits. Which output does the monopolist choose? What are the equilibrium price and profits?
(b) Assume now that the monopolist seeks to maximize profits. Show that its optimal output decreases relative to that maximizing total revenue in part (a), that price increases, and that profits increase.
4. Taxing a monopoly.^B A local cable provider faces demand \(q(p) = 100p^{-2}\), and cost function \(C(q) = q^{3/2}\). Assume that this provider is the only firm offering cable in this town.
(a) Find the equilibrium price, quantity, and profit for the cable provider.
(b) Find the equilibrium price, quantity, and profit if the monopolist were to produce at the perfectly competitive equilibrium.
(c) Can the local regulator impose a lump-sum tax on the cable provider to induce it to produce at the competitive equilibrium? Why or why not? If so, find the value of the tax \(T\).
(d) Can the local regulator impose a per-unit tax on the cable provider to induce it to produce at the competitive equilibrium? Why or why not? If so, find the value of the per-unit tax \(t\).
5. Regulating a natural monopoly.^B Duchess Energy, an electric utility company, provides electricity to Spartanburg. The demand for electricity is \(p(q) = 10 - 0.1q\), and this company’s costs are \(C(q) = 1 + 0.5q\).
(a) Does Duchess Energy exhibit the properties to be a “natural monopoly”?
(b) Find the unregulated monopolist’s profit-maximizing price, output, and profit.
(c) The Spartanburg city government passes a law that requires utility and other electricity providers to practice MC pricing (i.e., \(p(q^R) = MC(q^R)\)). What is the regulated monopolist’s output, price, and profit?
(d) What is the lump-sum subsidy that the regulator must provide the electric utility company to practice MC pricing without operating at a loss?
(e) Compute the consumer surplus from the pricing strategies in parts (a) and (b).
(f) Discuss the pros and cons of MC pricing in natural monopolies.
6. Advantages to a monopoly.^A We posed the question in section 10.2, “Why do monopolies exist in the first place if they are bad for society?” After reading the chapter, we know that a monopoly results in lower welfare than in a perfectly competitive industry, but are there social benefits to a monopoly?
7. Two parts of marginal revenue.^A There are two parts to a monopolist’s marginal revenue function. Identify the two parts in each of the following demand functions:
(a) \(p(q) = 25 - 1.5q\).
(b) \(p(q) = \frac{12}{q} + 50\).
(c) \(p(q) = e^{-q}\).
8. Monopoly and changing condition.^B Some small towns may have only one restaurant, making it a monopoly in that town. Consider Rosie’s Diner, in a small mountain town.
Rosie’s inverse demand is \(p(q) = 20 - 0.4q\), where \(q\) represents meals per week, and costs are \(C(q) = 5q\).
(a) Find Rosie’s profit-maximizing price, quantity, and profits.
(b) The road into the town has become considerably harder to traverse since a recent mudslide, and Rosie’s suppliers have increased their delivery price. This has increased costs to \(C(q) = 8q + 10\). How do her equilibrium prices, quantity, and profits change?
(c) After the mudslide, fewer visitors have been hiking the trails around town, which has decreased demand to \(p(q) = 15 - 4q\). Does Rosie stay in business?
9. Factors of a high monopoly price.^B A monopolist does not charge an infinitely high price, but certain market conditions can lead to a situation where prices may seem infinitely high. Discuss what the “perfect storm” of market conditions might be.
10. Monopolies produce on the elastic part of demand.^B Show that a monopolist facing inverse demand \(p(q) = q^{-2} + 50\) with constant marginal cost \(MC = 5\) will produce on the elastic segment of the demand curve.
11. Monopoly with general linear demand.^C Consider a monopolist with general inverse demand \(p(q) = a - bq\) and constant marginal cost \(c\). How do the monopolist’s optimal quantity, price, profit, and consumer surplus change as each of the parameters \(a\), \(b\), and \(c\) increases?
12. Using the IEPR.^A One advantage of using the Lerner index is that it uses elasticity, which can be easily estimated. Use the IEPR to solve for the optimal price for the following situations:
(a) \(\varepsilon_{q,p} = -2\), \(MC = \$2\).
(b) \(\varepsilon_{q,p} = -3\), \(MC = \$2\).
(c) \(\varepsilon_{q,p} = -4\), \(MC = \$2\).
(d) \(\varepsilon_{q,p} = -2\), \(MC = \$3\).
(e) \(\varepsilon_{q,p} = -2\), \(MC = \$4\).
(f) How does the optimal price change as demand becomes more elastic? How does the optimal price change as marginal cost increases?
13. Multiplant monopoly–I.^B Consider a firm that holds a patent on technology that makes the production of concrete less harmful to the environment (resulting in a monopoly on this technology).
The firm has two plants: one domestic (\(D\)) and one located in Australia (\(A\)). Demand for their technology is \(p(Q) = 250 - 10Q\), where \(Q = q_D + q_A\) is aggregate output. The domestic plant has total cost \(TC_D(q_D) = 5 + 10q_D + 4(q_D)^2\), and the Australian plant has total cost \(TC_A(q_A) = 15 + 4q_A + 5(q_A)^2\). Find the optimal output at each plant, and the price it will charge.
14. Multiplant monopoly–II.^B Consider a monopolist that is considering outsourcing some of its production to a plant overseas. The firm currently faces demand \(p(Q) = 75 - 0.5Q\), and its sole factory has total cost \(TC(q_1) = 10 + 2q_1 + (q_1)^2\). If it invests in the overseas plant, it estimates that the plant will have total cost \(TC(q_2) = 5 + 25q_2 + 5(q_2)^2\). In that case, the monopolist can produce in either plant or both plants. Should the firm invest in the new plant?
15. Multiproduct monopoly.^C Consider a pharmaceutical company with a patent on two different prescription drugs, granting them a monopoly in each market. Both drugs (\(x_1\) and \(x_2\)) are made in similar ways, with total cost of \(TC(x_1, x_2) = 50 + 2(x_1 + x_2) + 0.5(x_1 + x_2)^2\). Drug \(x_1\) has demand \(p_1(x_1) = 500 - x_1\), and drug \(x_2\) has demand \(p_2(x_2) = 1{,}000 - x_2\). Find the monopoly output, price, and profit for each drug.
16. Multiperiod monopoly.^C Many new inventions rely on crowdfunding campaigns to finance the development of the resulting products. Many of these products also benefit from network externalities: the more units sold today, the more valuable the product will be to consumers tomorrow, as more consumers are involved. Consider such a product, which would result in a monopoly over two periods: (1) the crowdfunding period with demand \(p(q) = 100 - 2q\); and (2) the post-crowdfunding period, where demand increases to \(p(q) = 150 - 2q\) if the firm sells at least 20 units and remains unchanged if it does not sell 20 units. Assume that the firm has marginal cost \(MC = \$40\).
(a) No network effects.
Consider that the monopolist ignores network effects, assuming that its demand function is \(p(q) = 100 - 2q\) in both periods. Find the equilibrium output and prices in both periods.
(b) Network effects, second period. Assume now that the monopolist recognizes the presence of network effects. Starting at the second period, what are its profit-maximizing output and price when the firm sold fewer than 20 units in the first period? What if the firm sold more than 20 units in the first period?
(c) Network effects, first period. Still with the situation set out in part (b), let us move on to the first period. Find the monopolist’s output and price in this period. (Hint: The firm anticipates its second-period profits if it sells more than 20 units today, and if it doesn’t. The monopolist then chooses the first-period output that yields the largest overall profit.)
17. Advertising-to-sales ratio.^A Consider a monopolist with a price elasticity of demand of \(\varepsilon_{q,p} = -2.5\) and an advertising elasticity of \(\varepsilon_{q,A} = 0.5\). What is the advertising-to-sales ratio? Comment on how price elasticity of demand affects the advertising-to-sales ratio.
18. Optimal advertising.^B Annie’s Apples company is the only local producer of caramel apples in Appleville, making her a monopoly. Her demand for caramel apples is \(p(q, A) = 100 - q + \sqrt{A}\), where \(q\) is the number of apples and \(A\) is advertising expenditure. If she has a constant marginal cost of $2 per caramel apple, what is Annie’s equilibrium number of caramel apples sold and advertising expenditure?
19. Identify a monopsony.^A Outside of employers in small towns, describe an example of a real-life monopsony. Be specific about what the good traded is and who the buyers and sellers are.
20. Monopsony–one input.^B Consider a family business that produces shoes with a son (Edgar) eager to join the workforce.
The family business is the only potential employer for Edgar, as he will be kicked out of the family if he works at a different store (and Edgar would rather not work at all than get kicked out of the family). The business sells shoes for $100 a pair, with production function \(q = 10 \ln L\). The supply curve for the number of hours that Edgar will work is \(w(L) = 5 + L\). How many hours will Edgar work, and what will his wage be?
21. Monopsony–two inputs.^B In many rural towns, there may be only one employer. An example of this may be a large, corporation-owned farm. This farm recently bought out many smaller farms in the area, and there is a large surplus of both high- and low-skilled labor (\(L_h\) and \(L_l\), respectively). The production function for the farm is \(q = 10 \ln L_h + 4 \ln L_l\). The supply curves for labor are \(w_h(L_h) = 5 + 4L_h\) and \(w_l(L_l) = 2 + 2L_l\), and the farm’s output sells for $10 per unit. How much of each type of labor will the farm hire, and at what wages? How much output will the farm sell? What is the farm’s total profit?

11 Price Discrimination and Bundling

11.1 Introduction

This chapter analyzes firms’ strategies to increase profits by charging different prices to different types of consumers, such as those with distinct demands for the good, or those purchasing it at different time periods or locations, or in different quantities. In particular, we discuss three forms of price discrimination. Under first-degree (or perfect) price discrimination, the firm has access to enough information about consumer demand that it can, essentially, charge a “personalized price” to each consumer, which coincides with her maximum willingness-to-pay (WTP) for the good. In second-degree price discrimination, the firm offers quantity discounts to customers. Lastly, in third-degree price discrimination, the firm charges different prices to groups of consumers with different characteristics and needs, such as students and nonstudents at the movies.
While all forms of price discrimination allow the seller to increase profits, first-degree price discrimination increases its profits by the largest amount because consumers retain no consumer surplus, transferring it entirely to the firm. This form of price discrimination is difficult to implement, however, as it requires extremely detailed information about each consumer’s WTP for the good, whereas second- and third-degree price discrimination do not assume such rich access to information.

We finish the chapter by analyzing bundling strategies. You have probably encountered this before when purchasing a computer, facing a price for the monitor, another price for the central processing unit (CPU), and another for the whole computer (bundling the monitor and CPU).¹ In most cases, prices are set so that purchasing the bundle sounds like a better deal than buying all the parts separately. We analyze when the seller finds it profitable to offer bundles, the optimal price that the seller should charge for the bundle, and the optimal price for each item sold separately.

1. Actually, if you purchase the CPU alone, you are being offered a bundle as well, because the package often includes the keyboard and mouse. Other common examples of bundle pricing are tickets to water and amusement parks, where you can purchase (1) a ticket giving you access to the park without access to rides (you can then buy each ride separately inside the park) or (2) a ticket allowing you unlimited access to the park and rides everywhere.

Figure 11.1 Room for larger profits in monopoly.

11.2 Price Discrimination

Monopolists, as well as firms with market power, can earn large profits. A natural question, however, is whether firms can do even better. As we discuss in this chapter, the answer is “Yes.” To understand this point, consider the monopolist’s decision again, as shown in figure 11.1.
By choosing an output level where its marginal revenue coincides with its marginal cost, \(MR = MC\), it charges a unique price \(p^M\) to all its customers, giving up two business opportunities:
• Price \(p^M\) attracts buyers who would have been willing to pay a higher price, as depicted in the segment of the demand curve to the left of \(p^M\), where \(p > p^M\). Hence, the monopolist would like to charge a higher price to these customers in order to earn a larger profit margin from them.
• Price \(p^M\) does not attract buyers who are not willing to pay \(p^M\), but are willing to pay more than the cost that the monopolist incurs to produce the good. That is, the monopolist could charge a price \(p\) in the segment of the demand curve below \(p^M\) and above \(MC(q^M)\). This means that the monopolist can make an additional profit \(p - MC(q^M)\) per unit by selling its product to these customers.

These points highlight that the monopolist could increase its profits if it could charge different prices to specific customers (i.e., if the monopolist could “price discriminate”). In this section, we explore three types of price discrimination: first-degree, where the monopolist sets a different price for each customer that coincides with her willingness-to-pay for the good; second-degree (or “nonlinear pricing”), where the monopolist offers a quantity discount to buyers purchasing large amounts of product; and third-degree, where the monopolist charges different prices to different groups of customers, each having a different demand curve.

Before we start analyzing each type of price discrimination, it is important that we understand under which conditions the monopolist can price-discriminate:

Conditions for price discrimination.
• No arbitrage.
The good cannot be resold from one customer to another (i.e., no arbitrage can occur); otherwise, individuals with a low WTP would purchase the good at a low price and resell it to individuals with a high WTP (but charge them less than the monopolist would).
• Information about WTP. The monopolist needs some information about customers’ WTP for its good. While firms rarely observe extremely detailed information about such WTP for each potential customer, they at least gather information for various groups of customers.

11.2.1 First-Degree Price Discrimination

In this scenario, the monopolist charges to every consumer \(i\) a price that coincides with her maximum willingness-to-pay (i.e., a personalized price). For instance, if the monopolist faces an inverse demand \(p(q) = a - bq\), it charges price \(p = a\) to the individual with the highest WTP, then a price \(p = a - \$0.01\) to the individual with the second-highest WTP, and similarly for all subsequent buyers. The monopolist stops this pricing strategy when \(p = MC(q)\) because customers with WTP below \(MC(q)\) would entail a per-unit loss. As a consequence, the firm extracts all the surplus from every consumer, generating a total profit that coincides with the area below the demand curve and above its marginal cost (MC) function \(MC(q)\), as depicted in figure 11.2. In addition, the output that the monopolist produces under first-degree price discrimination, \(q^{FD}\) (where the superscript \(FD\) denotes first-degree price discrimination), coincides with that under a perfectly competitive market, \(q^{PC}\), because at \(q^{PC}\), the demand curve crosses the firm’s marginal cost, \(p(q) = MC(q)\).

Example 11.1: First-degree price discrimination

Consider a monopoly facing an inverse demand curve \(p(q) = a - bq\), where \(a, b > 0\); and a total cost (TC) function \(TC(q) = cq\), where \(c > 0\).

Uniform pricing. If the monopolist sets a uniform price for all its customers (as described in chapter 10), it would produce the output at which \(MR(q) = MC(q)\); that is, \(a - 2bq = c\).
After solving for output \(q^M\), we find \(q^M = \frac{a-c}{2b}\), which entails a monopoly price of
\[
p^M = a - b\,\frac{a-c}{2b} = \frac{a+c}{2},
\]
with profits of \(\pi^M = \frac{(a-c)^2}{4b}\). (These results coincide with those in example 10.3 in chapter 10.)

Figure 11.2 First-degree price discrimination.

First-degree price discrimination. If, instead, the monopolist practices first-degree price discrimination, it produces an output level where the demand curve crosses the marginal cost (i.e., \(a - bq = c\)) or, after solving for \(q\), we find \(q^{FD} = \frac{a-c}{b}\). As figure 11.2 depicts, the monopolist’s profits coincide with the area of the triangle below the demand curve \(p(q) = a - bq\), and above the marginal cost \(c\); that is,
\[
\pi^{FD} = \frac{1}{2}\,\underbrace{(a - c)}_{\text{Height}}\,\underbrace{\left(\frac{a-c}{b} - 0\right)}_{\text{Base}} = \frac{(a-c)^2}{2b},
\]
which exceeds those under a uniform (unique) price, \(\pi^M = \frac{(a-c)^2}{4b}\), because \(\frac{(a-c)^2}{2b} > \frac{(a-c)^2}{4b}\) simplifies to \(\frac{1}{2b} > \frac{1}{4b}\), or \(4b > 2b\). For instance, if the monopolist faces a demand function \(p(q) = 10 - q\) (i.e., \(a = 10\) and \(b = 1\)), and \(c = 2\), the profit from setting a uniform price is \(\pi^M = \frac{(a-c)^2}{4b} = \frac{(10-2)^2}{4} = \$16\), but it doubles with first-degree price discrimination because \(\pi^{FD} = \frac{(a-c)^2}{2b} = \frac{(10-2)^2}{2} = \$32\).

Self-assessment 11.1 Consider the monopolist in example 11.1, but assume that the inverse demand changes to \(p(q) = 16 - q\) and the marginal cost \(c = 3\). Follow the steps in example 11.1 to find the monopolist’s profit if it sets a uniform price, \(\pi^M\), and if it practices first-degree price discrimination, \(\pi^{FD}\).

First-degree price discrimination is ideal for the monopolist, of course, as it extracts all possible surplus from consumers. However, the monopolist needs a massive amount of information to practice this type of price discrimination; namely, it needs to know the maximum willingness-to-pay of every buyer, making this type of practice relatively uncommon, at least in its purest form.
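The profit comparison in example 11.1 can be verified with a short computation. The sketch below assumes the example’s setting (inverse demand \(p(q) = a - bq\) and \(TC(q) = cq\)); the function names are illustrative, and the numerical check, which adds up the margin \(p(q) - c\) over small slices of output, is an added cross-check of the triangle area rather than a step taken in the text.

```python
def uniform_price_profit(a, b, c):
    """Monopoly profit under a single price: (a - c)^2 / (4b)."""
    return (a - c) ** 2 / (4 * b)


def first_degree_profit(a, b, c):
    """Profit under first-degree price discrimination: (a - c)^2 / (2b)."""
    return (a - c) ** 2 / (2 * b)


def first_degree_profit_numeric(a, b, c, steps=1000):
    """Approximate the area below demand and above MC with a midpoint sum.

    Each slice of width dq is sold at (roughly) its own willingness-to-pay.
    """
    q_fd = (a - c) / b
    dq = q_fd / steps
    total = 0.0
    for i in range(steps):
        q_mid = (i + 0.5) * dq
        total += (a - b * q_mid - c) * dq
    return total


a, b, c = 10, 1, 2
print("Uniform price profit:      ", uniform_price_profit(a, b, c))
print("First-degree profit:       ", first_degree_profit(a, b, c))
print("First-degree (numeric sum):", first_degree_profit_numeric(a, b, c))
```

Because both closed forms share the factor \((a-c)^2/b\), the first-degree profit is twice the uniform-price profit for any admissible \(a\), \(b\), and \(c\) in this linear setting, not only for the numbers used here.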
One of the closest examples of this pricing strategy is the Free Application for Federal Student Aid (FAFSA) forms that students applying for federal financial aid must submit to the college or university they attend. This form includes relatively detailed information about the student’s income, as well as her family’s, which is highly correlated with their willingness-to-pay for education. The student’s institution can use this information to better assess how much the student (and her family) are willing to pay in tuition.²

11.2.2 Second-Degree Price Discrimination

With second-degree price discrimination, the monopolist offers a quantity discount to individuals willing to purchase several units, such as discounts for buying in bulk. That is, the monopolist charges at least two prices: one for each of the first \(q_1\) units, and another for each unit beyond \(q_1\). For instance, the monopolist sets a price \(p_1 = \$4\) for the first 3 units, and a lower price \(p_2 = \$2\) for all units thereafter.³ This is a common pricing strategy in utilities, such as electricity and water, and in mass transit systems, where one may benefit from a discount after purchasing a large number of units.

As you probably noticed from this discussion, this type of price discrimination gives rise to three unknowns that the firm needs to determine:
• Where should the monopolist set the boundary \(q_1\), so customers can start benefiting from a quantity discount?
• Which price should the monopolist set for each unit in the first block, \(p_1\)?
• Which price should it set for each unit in the second block, \(p_2\)?

2. Besides detailed information about WTP, the second condition for price discrimination to be successful (namely, no possibility of arbitrage) also holds in this scenario: degrees are nominative, so the student cannot resell her education to another student.

3.
Hence, the monopolist charges the same price for all the units in the first block (e.g., before reaching 3 units), but a lower price for all the units in the second block (e.g., beyond 3 units). This explains why this pricing strategy is also known as “block pricing.”

To find these three unknowns, we only need to solve the following monopolist’s problem:

\[ \max_{q_1, q_2}\ \underbrace{p_1 q_1}_{TR_1} + \underbrace{p_2 (q_2 - q_1)}_{TR_2} - TC(q_2), \]

where \(TR_1 = p_1 q_1\) denotes the total revenue from the \(q_1\) units in the first block (i.e., units from q = 0 to q = \(q_1\)), and \(TR_2 = p_2 (q_2 - q_1)\) represents the total revenue from the units in the second block (i.e., those from \(q_1\) to \(q_2\)). Total cost \(TC(q_2)\) is evaluated at \(q_2\) units of output because the firm produces a total of \(q_2\) units. Intuitively, this problem asks the monopolist: Choose the number of units in the first block, \(q_1\), and in the second block, \(q_2 - q_1\), to maximize the profits from both blocks. Example 11.2 illustrates this pricing strategy.

Example 11.2: Second-degree price discrimination

Consider a monopolist facing the inverse demand function p(q) = 10 − q. The firm’s total cost (TC) function is TC(q) = cq, where c > 0. We first need to write down the monopolist’s profit maximization problem (PMP) as follows:

\[ \max_{q_1, q_2}\ \pi = \underbrace{\overbrace{(10 - q_1)}^{p_1}\, q_1}_{TR_1} + \underbrace{\overbrace{(10 - q_2)}^{p_2}\,(q_2 - q_1)}_{TR_2} - \underbrace{c\,q_2}_{TC(q_2)}. \]

Differentiating with respect to \(q_1\), we obtain⁴

\[ \frac{\partial \pi}{\partial q_1} = 10 - 2q_1 - (10 - q_2) = 0, \]

which simplifies to \(-2q_1 + q_2 = 0\) or, after solving for \(q_1\), we find that \(q_1 = \frac{q_2}{2}\). Differentiating now with respect to \(q_2\), we obtain

\[ \frac{\partial \pi}{\partial q_2} = 10 - 2q_2 + q_1 - c = 0, \]

which leads to \(q_2 = \frac{10 + q_1 - c}{2}\). Inserting \(q_1 = \frac{q_2}{2}\) into \(q_2 = \frac{10 + q_1 - c}{2}\), we find

\[ q_2 = \frac{10 + \frac{q_2}{2} - c}{2}, \]

or \(3q_2 + 2c = 20\), which, solving for \(q_2\), gives \(q_2 = \frac{2(10-c)}{3}\). Inserting this result into \(q_1 = \frac{q_2}{2}\), we find \(q_1 = \frac{10-c}{3}\). The first block is then \(q_1 = \frac{10-c}{3}\) units, while the second block is

\[ q_2 - q_1 = \frac{2(10-c)}{3} - \frac{10-c}{3} = \frac{10-c}{3} \text{ units.} \]

We can now find the optimal prices for each block by plugging these output levels into the inverse demand function as follows:

\[ p(q_1) = 10 - \frac{10-c}{3} = \frac{20+c}{3}, \quad \text{and} \quad p(q_2) = 10 - \frac{2(10-c)}{3} = \frac{2(5+c)}{3}. \]

4. Note that, to facilitate your differentiation, you can first expand the firm’s profit, obtaining \(\pi = 10q_2 + q_1 q_2 - (q_1)^2 - (q_2)^2 - cq_2\).

Numerical example. For instance, if the marginal cost is c = $4, the monopolist sells \(q_1 = \frac{10-4}{3} = 2\) units in the first block at a price of \(p_1 = \frac{20+4}{3}\) = $8 per unit. In addition, this firm sells \(q_2 = \frac{2(10-4)}{3} = 4\) units in total, implying \(q_2 - q_1 = 4 - 2 = 2\) units in the second block, each of them at a price of \(p_2 = \frac{2(5+4)}{3}\) = $6 per unit, thus offering a price discount once the buyer purchases more than 2 units. These prices and output levels generate profits of

π = (8 × 2) + (6 × 2) − (4 × 4) = $12.

If, instead, the monopolist charged a uniform price to all its customers (i.e., not practicing price discrimination), its output \(q^M\) would solve 10 − 2q = 4, or \(q^M\) = 3 units, at a monopoly price of \(p^M\) = 10 − 3 = $7, yielding a profit of only \(\pi^M\) = (7 × 3) − (4 × 3) = $9. As expected, the monopolist increases its profits by price discriminating. Exercise 13 at the end of this chapter shows that profits can be further increased if the monopolist offers three blocks, rather than two.

Self-assessment 11.2 Consider the monopolist in example 11.2, but assume that the inverse demand curve changes to p(q) = 16 − q, and its marginal cost is c = 4. Follow the steps in example 11.2 to find the units that the monopolist sells in each block, its corresponding prices, and the overall profits from doing so. Also, find the profit that the monopolist obtains from setting a uniform price for all customers, \(\pi^M\).

Non-linear pricing. Uniform pricing is also known as “linear pricing” because the monopolist charges the same price per unit, regardless of how many units the buyer purchases.
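As a numerical check on example 11.2, the following Python sketch evaluates the closed-form solution and compares it with a brute-force search over a coarse grid. It is an illustration only: the helper names (`profit`, `closed_form`, `grid_search`) and the grid step are assumptions made here, the marginal cost is taken to be an integer, and demand is fixed at p(q) = 10 − q as in the example.

```python
from fractions import Fraction


def profit(q1, q2, c):
    """Profit from two blocks with inverse demand p(q) = 10 - q and TC(q) = c*q."""
    return (10 - q1) * q1 + (10 - q2) * (q2 - q1) - c * q2


def closed_form(c):
    """Solution of the two first-order conditions in example 11.2."""
    q2 = Fraction(2 * (10 - c), 3)     # total units sold
    q1 = q2 / 2                        # units in the first block
    return q1, q2, 10 - q1, 10 - q2    # (q1, q2, p1, p2)


def grid_search(c, step=Fraction(1, 4)):
    """Brute-force check: best (q1, q2) on a coarse grid with q1 <= q2 <= 10."""
    points = [i * step for i in range(int(10 / step) + 1)]
    pairs = [(q1, q2) for q1 in points for q2 in points if q1 <= q2]
    return max(pairs, key=lambda pair: profit(pair[0], pair[1], c))


q1, q2, p1, p2 = closed_form(4)
print(q1, q2, p1, p2, profit(q1, q2, 4))   # two-block pricing with c = 4
print(grid_search(4))                      # grid maximizer, for comparison
print(profit(3, 3, 4))                     # uniform pricing is the case q1 = q2
```

Setting q1 = q2 collapses the two blocks into a single price, so the same `profit` function also returns the uniform-price benchmark used in the example.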
In contrast, second-degree price discrimination is known as “non-linear pricing” because the price per unit is not constant in output. As example 11.2 illustrates, the monopolist sets a relatively high price of $8 for the first block of units, but it offers a price discount ($6) for all subsequent units. Hence, if the monopolist offers at least one price discount, the price per unit is nonconstant (i.e., non-linear).

11.2.3 Third-Degree Price Discrimination

In third-degree price discrimination, the monopolist charges different prices to customers with different demand curves. This entails that the monopolist, upon observing a potential customer, can easily identify which group she belongs to. Mathematically, the monopolist treats each group of customers as a separate monopoly, because they cannot resell the good to customers in another group (i.e., the no-arbitrage condition holds). As a consequence, the monopolist starts by finding the marginal revenue curve for each demand function, and then it sets each of them equal to the firm’s marginal cost. Example 11.3 illustrates this pricing strategy.

Example 11.3: Third-degree price discrimination

Consider a small town with only one movie theater. As a monopolist, the movie theater faces two groups of customers, students and non-students, which it can easily distinguish by checking whether they have a student ID. Students have a lower willingness-to-pay for movies, captured by inverse demand \(p_1(q) = 10 - q\), whereas non-students have a higher willingness-to-pay, measured by \(p_2(q) = 25 - q\). The marginal cost of a ticket is the same for both types of customers, MC = $3. In this scenario, the monopolist seeks to maximize its profits from both groups, \(\pi_1 + \pi_2\), as follows:

\[ \max_{q_1, q_2}\ \pi = \pi_1 + \pi_2 = \underbrace{(10 - q_1)q_1 - 3q_1}_{\pi_1} + \underbrace{(25 - q_2)q_2 - 3q_2}_{\pi_2}. \]

Differentiating with respect to \(q_1\), we obtain \(10 - 2q_1 = 3\) which, solving for \(q_1\), yields \(q_1\) = 3.5 tickets.
Differentiating now with respect to \(q_2\), we find \(25 - 2q_2 = 3\), yielding \(q_2\) = 11 tickets.

Alternatively, this problem can be solved by noticing that profits from students, \(\pi_1\), only depend on the number of tickets sold to this group, \(q_1\); and, similarly, profits from non-students, \(\pi_2\), only depend on the number of tickets sold to this group, \(q_2\). We can then write this maximization problem as two separate problems:

\[ \max_{q_1}\ \pi_1 = (10 - q_1)q_1 - 3q_1 \quad \text{(Students), and} \]

\[ \max_{q_2}\ \pi_2 = (25 - q_2)q_2 - 3q_2 \quad \text{(Non-students).} \]

That is, the firm treats each group of customers as a separate monopoly, applying the monopoly rule \(MR_1 = MC\) to students and \(MR_2 = MC\) to non-students. In particular, to maximize profits from students, the monopolist sets \(MR_1 = MC\), or \(10 - 2q_1 = 3\), which yields an output level of \(q_1\) = 3.5 units, selling each of them at a price of \(p_1\) = 10 − 3.5 = $6.50. Similarly, to maximize profits from the non-student group, the monopolist sets \(MR_2 = MC\), or \(25 - 2q_2 = 3\), which yields an output level of \(q_2\) = 11 tickets, each sold at a price of \(p_2\) = 25 − 11 = $14. As a result, profits from both groups become:

\[ \pi_1 + \pi_2 = [(6.5 \times 3.5) - (3 \times 3.5)] + [(14 \times 11) - (3 \times 11)] = 12.25 + 121, \]

implying that total profits are π = $133.25.

Self-assessment 11.3 Consider the monopolist in example 11.3, but assume that students’ inverse demand changes to p(q) = 16 − q. Follow the steps in example 11.3 to find the monopolist’s sales to each market segment, the corresponding prices, and the profits. Compare your results against those in example 11.3.

Screening. In example 11.3, students pay much less than non-students at the movies, reflecting their different demands ($6.50 versus $14). Customers might, however, try to pose as part of the low-demand group to buy an item at a lower price. What can the monopolist do to avoid such a strategy? The firm can rely on screening devices, such as student IDs, to sort customers. That is, the firm cannot directly observe the customer’s demand for the good and, if asked, the customer would have an incentive to lie to buy the good at a cheaper price. The firm can, however, use screening to infer the customer’s unobserved demand. As a consequence, screening must satisfy two key properties to work: (1) it must be perfectly observable for the firm, such as a customer’s age, student status, or residence; and (2) it must be strongly correlated with the customer’s WTP for the good. In the example of the movie theater, a student ID can be observed by an employee at the ticket counter, and it is negatively correlated with the customer’s WTP (because students’ budgets are often more constrained than those of working adults).⁵

11.3 Bundling

Common examples of bundling are found in electronics, where you can buy a desktop computer as a whole (with a monitor, CPU, keyboard, and mouse), or buy each of its components separately. Similarly, in water parks, you can purchase an entry ticket with access to all rides, or an entry ticket without access to rides (so you pay for each ride individually). As a consequence, we can consider three forms of bundling:

• No bundling, where the firm does not bundle any good, allowing the buyer to purchase each item individually (e.g., each part of the computer is sold separately).
• Pure bundling, where the firm allows the buyer to purchase either the bundle (e.g., the whole computer) or no good at all.
• Mixed bundling, where the firm sets prices for each individual item and for the bundle, allowing the buyer to choose whether to purchase individual items or the bundle.

Example 11.4 illustrates that the monopolist can increase its profits by offering pure bundling, so long as the customer’s demand for the different items is negatively correlated.

Example 11.4: Bundling

Consider a monopolist selling computers. Table 11.1 reports the WTP for the CPU alone, the monitor alone, or the bundle, for each customer.

5.
Common screening methods used by airlines (or travel websites such as Orbitz or Kayak) include the number of days in advance that the customer books her ticket because business travelers with a higher WTP often book their tickets just a few days in advance. Other methods include whether she stays at her destination over Saturday night (business travelers rarely do), the Internet Protocol (IP) address of the computer, tablet, or smartphone on which that search was done, and whether it was a repeated search from the same device. Another screening method you might have encountered is the number of days since the release of a new book or electronic gadget, where firms charge a higher price during the first days after the release because customers with a high WTP rush to purchase the item, but drop its price in a matter of days to target the general public, who are willing to wait a few weeks before purchasing the item. (Firms often learn about cheaper ways to produce goods as time progresses; however, a significant decrease in the item’s price a month after its release cannot be explained by lower costs of production, but price discrimination could be the reason.)

Table 11.1 WTP for the CPU, the monitor, and the bundle for a computer purchase.

                CPU      Monitor   Both Items (Computer)
Consumer 1      500      100β      500 + 100β
Consumer 2      500α     100       500α + 100
Average cost    400      80        480

Starting with the first column, which describes WTP for the CPU, consumer 1 has a WTP of $500, while consumer 2’s WTP is a share of that, 500α, where α ∈ (0, 1). Similarly, the second column indicates that consumer 2’s WTP for the monitor is $100, whereas that of consumer 1 is lower, 100β, where β ∈ (0, 1).
Therefore, consumer 1 has the higher WTP for the CPU, but the lower for the monitor; in contrast, consumer 2 has the higher WTP for the monitor but the lower for the CPU.⁶ The last column sums, for every consumer, the WTP across all items, in order to find her total WTP for the bundle. For simplicity, assume that consumer 1’s WTP is larger than that of consumer 2 (500 + 100β > 500α + 100).⁷ The last row represents the average cost (i.e., cost per unit) that the firm incurs. We next separately analyze the profits from not practicing bundling and from pure bundling, examining which pricing strategy generates the higher profit.

No bundling. In this case, the firm sells the CPU at either $500 or $500α. If the firm sells the CPU at the lower of these two prices, $500α, it entices both types of consumers to buy the CPU, earning profits of (2 × 500α) − (2 × 400) = 1,000α − 800. If the firm, instead, chooses to set the price equal to consumer 1’s WTP for the CPU, $500, its profits are only 500 − 400 = 100. As a consequence, the firm will choose to entice both consumers only if 1,000α − 800 > 100 or, solving for α, if α > 0.9. Intuitively, the firm entices both types of consumers when consumer 2’s WTP for the CPU is relatively close to that of consumer 1 (i.e., parameter α is close to 1). Otherwise, selling to the buyer with the higher WTP (consumer 1) is more attractive.

6. Note that the negative correlation in WTP holds because α, β ∈ (0, 1). If β > 1, however, then consumer 1 would have the higher WTP for the CPU and the monitor, while consumer 2 would exhibit the lower WTP for both items. In other words, WTP would now be positively correlated. The end-of-chapter exercises explore this possibility, showing that the seller no longer has incentives to offer pure bundling.

7. If we solve for parameter α in this inequality, we obtain that consumer 1’s WTP for both items is larger than that of consumer 2 if \(\alpha < \frac{4+\beta}{5}\).
That is, consumer 2 cannot have a WTP for the CPU close to that of consumer 1; otherwise, her WTP for the sum of both items would be larger.

A similar argument applies to the pricing of the monitor. The firm can choose to set the monitor’s price at the lowest WTP, $100β, and entice both customers, generating a profit of (2 × 100β) − (2 × 80) = 200β − 160. Alternatively, the firm can choose to price at consumer 2’s WTP, $100, attracting only this consumer to buy the monitor. This would give the firm a profit of 100 − 80 = $20. As a result, the firm would choose to sell to both consumers only if 200β − 160 > 20 or, after solving for β, if \(\beta > \frac{180}{200} = 0.9\). A similar intuition applies to the monitor: the firm entices both types of customers, so long as consumer 1’s WTP is sufficiently close to that of consumer 2 (i.e., parameter β is close to 1). Otherwise, selling to the buyer with the higher WTP (consumer 2) is more attractive.

Bundling. With pure bundling, the firm sets a single price for the combination of CPU and monitor (i.e., the whole computer). As in the previous discussion about the individual items, the firm has two pricing options. First, it can set a price equal to consumer 1’s WTP, 500 + 100β, and only entice her, which generates a profit of (500 + 100β) − 480 = 20 + 100β. Instead, the firm can set a price equal to consumer 2’s WTP (the lower WTP for the computer), 500α + 100, inducing both consumers to purchase the computer, yielding a profit of 2 × (500α + 100) − (2 × 480) = 1,000α − 760. Therefore, the firm entices both consumers if 1,000α − 760 > 20 + 100β or, after solving for parameter α, if α > 0.78 + 0.1β. Figure 11.3 depicts line α = 0.78 + 0.1β, which originates at 0.78 and increases in β at a rate of 0.1, along with the two cutoffs found in the previous discussion (namely, a horizontal line α = 0.9 and a vertical line β = 0.9).
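The profit expressions derived so far in example 11.4 can be collected in a short Python sketch, which makes it easy to evaluate any pair (α, β). This is an illustration under stated assumptions: the function names are hypothetical, each function simply takes the larger of the two pricing options described in the text, and the inputs are assumed to satisfy α, β ∈ (0, 1) together with 500 + 100β > 500α + 100.

```python
from fractions import Fraction


def cpu_profit(alpha):
    """CPU alone: price 500*alpha (both buy) or 500 (only consumer 1 buys)."""
    return max(1000 * alpha - 800, 100)


def monitor_profit(beta):
    """Monitor alone: price 100*beta (both buy) or 100 (only consumer 2 buys)."""
    return max(200 * beta - 160, 20)


def bundle_profit(alpha, beta):
    """Pure bundle: price 500*alpha + 100 (both buy) or 500 + 100*beta (only consumer 1)."""
    return max(1000 * alpha - 760, 20 + 100 * beta)


def prefers_pure_bundling(alpha, beta):
    """True when the bundle earns more than selling the two items separately."""
    return bundle_profit(alpha, beta) > cpu_profit(alpha) + monitor_profit(beta)


# Exact fractions avoid floating-point noise near the cutoffs.
high = Fraction(95, 100)
print(prefers_pure_bundling(high, high))                          # True
print(prefers_pure_bundling(Fraction(8, 10), Fraction(5, 10)))    # False
```

The first call uses α = β = 0.95, where both cutoffs α > 0.9 and β > 0.9 are met; the second uses α = 0.8 and β = 0.5, where each item and the bundle are sold to a single consumer.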
Figure 11.3 illustrates the six regions that the previous discussion generated, which we describe separately next. (For compactness, let \(\bar{\alpha}\) denote 0.78 + 0.1β, so we can write \(\alpha > \bar{\alpha}\).)

[Figure 11.3: Bundling incentives as functions of α and β. Axes: β (horizontal) and α (vertical), each running up to 1. The horizontal line α = 0.9, the vertical line β = 0.9, and the line α = 0.78 + 0.1β (with vertical intercept 0.78) delimit regions I through VI.]

Region I. If α > 0.9 and β > 0.9, condition \(\alpha > \bar{\alpha}\) holds.⁸ In this scenario, the firm prefers to sell the CPU, the monitor, and the bundle to both customers. It prefers to sell the bundle rather than the separated items (practicing no bundling) because

\[ \underbrace{1{,}000\alpha - 760}_{\text{Profits from the bundle}} > \underbrace{(1{,}000\alpha - 800)}_{\text{Profits from the CPU}} + \underbrace{(200\beta - 160)}_{\text{Profits from the monitor}} \]

simplifies to −760 > 200β − 960, or β < 1, which holds by assumption (negatively correlated demands).

8. To see this point, note that \(\alpha = \bar{\alpha}\) reaches its highest point at β = 1, where \(\bar{\alpha}\) = 0.78 + 0.1 = 0.88, lying below the horizontal line α = 0.9. Then, condition \(\alpha > \bar{\alpha}\) holds for all values of β if α > 0.9.

Region II. If α > 0.9 but β < 0.9, condition \(\alpha > \bar{\alpha}\) still holds. However, the firm now sells the CPU and the bundle to both customers, and the monitor to customer 2 alone. In this context, the firm offers bundling given that

\[ \underbrace{1{,}000\alpha - 760}_{\text{Profits from the bundle}} > \underbrace{(1{,}000\alpha - 800)}_{\text{Profits from the CPU}} + \underbrace{20}_{\text{Profits from the monitor}} \]

collapses to 780 > 760.

Region III. If α < 0.9, β > 0.9, and \(\alpha > \bar{\alpha}\), the firm sells the monitor and bundle to both customers, but the CPU to customer 1 alone. Therefore, the firm offers bundling because

\[ \underbrace{1{,}000\alpha - 760}_{\text{Profits from the bundle}} > \underbrace{100}_{\text{Profits from the CPU}} + \underbrace{(200\beta - 160)}_{\text{Profits from the monitor}}, \]

which yields 1,000α > 700 + 200β, or α > 0.7 + 0.2β. Figure 11.4 depicts the line α = 0.7 + 0.2β on the regions identified in figure 11.3.
This dashed line originates at 0.7 and reaches a height of 0.9 when β = 1, and it crosses cutoff \(\bar{\alpha}\) at β = 0.8,⁹ thus dividing region III into two areas: one where α > 0.7 + 0.2β holds (above the dashed line at the top of region III) and the firm prefers to bundle; and another where this condition is violated (at the bottom of region III), and the firm sells each item separately.

[Figure 11.4: Bundling incentives in region III. Axes: β (horizontal) and α (vertical). The figure shows the lines α = 0.9, α = 0.78 + 0.1β (intercept 0.78), and the dashed line α = 0.7 + 0.2β (intercept 0.7), along with the vertical lines β = 0.8 and β = 0.9.]

9. To understand this, set the equations of both lines equal to each other (0.78 + 0.1β = 0.7 + 0.2β). Rearranging, we obtain 0.08 = 0.1β, which, solving for β, yields β = 0.8. You can also find the height that both lines reach at their crossing point by inserting β = 0.8 into the equation of either line (i.e., α = 0.78 + (0.1 × 0.8) = 0.86), thus lying below the horizontal line depicting α = 0.9 in figure 11.4.

Region IV. If α < 0.9, β < 0.9, and \(\alpha > \bar{\alpha}\), the firm sells the bundle to both customers, the CPU to customer 1 alone and the monitor to customer 2 alone. It offers bundling in this region as well, given that

\[ \underbrace{1{,}000\alpha - 760}_{\text{Profits from the bundle}} > \underbrace{100}_{\text{Profits from the CPU}} + \underbrace{20}_{\text{Profits from the monitor}}, \]

which yields 1,000α > 880, or α > 0.88. Because condition \(\alpha > \bar{\alpha}\) is satisfied in this region, and cutoff \(\bar{\alpha}\) reaches its highest point at 0.88, condition α > 0.88 holds for all points in region IV.

Region V. If α < 0.9, β > 0.9, and \(\alpha < \bar{\alpha}\), the firm sells the monitor to both customers, the CPU to customer 1 alone and the bundle to customer 1 alone. In this scenario, the firm does not offer bundling because

\[ \underbrace{20 + 100\beta}_{\text{Profits from the bundle}} < \underbrace{100}_{\text{Profits from the CPU}} + \underbrace{(200\beta - 160)}_{\text{Profits from the monitor}}, \]

which simplifies to 80 < 100β, or 0.8 < β. This condition on β holds because β > 0.9 is satisfied by all the points in region V. Therefore, the firm does not offer bundling in this region.

Region VI. If α < 0.9, β < 0.9, and \(\alpha < \bar{\alpha}\), the firm sells the CPU to customer 1 alone, the monitor to customer 2 alone, and the bundle to customer 1 alone. In this context, offering bundling is unprofitable, given that

\[ \underbrace{20 + 100\beta}_{\text{Profits from the bundle}} < \underbrace{100}_{\text{Profits from the CPU}} + \underbrace{20}_{\text{Profits from the monitor}}, \]

which collapses to 100β < 100, or β < 1, which holds by assumption (negatively correlated demands).

In summary, the firm finds bundling profitable in regions I, II, and IV, which can be defined by condition \(\alpha > \bar{\alpha}\) in figure 11.3, and in the top area of region III, defined by α > 0.7 + 0.2β. Otherwise, the firm sells each item separately. Intuitively, conditions \(\alpha > \bar{\alpha}\) and α > 0.7 + 0.2β indicate that the WTPs of customers 1 and 2 for the CPU are relatively similar (indeed, they are identical when α = 1). In contrast, when \(\alpha \leq \bar{\alpha}\), their demands are so different that the firm prefers to sell each item separately (no bundling).

Self-assessment 11.4 Consider the bundling table in example 11.4. Assume now that the average cost of the CPU decreases to $300. Follow the steps in example 11.4 to find under which conditions the firm chooses to sell different items to each consumer. Compare your results against those in example 11.4.

Exercises

1. Price discrimination with different demands.B Consider a monopolist selling to two markets. Every market faces a different demand function, which the monopolist can observe, and the monopolist can charge different prices in each market, thus practicing third-degree price discrimination. For simplicity, assume that marginal cost is c > 0 in both markets.
(a) Linear demand. Consider that the inverse demand in each market i is given by \(p_i(q_i) = a_i - b_i q_i\), where i = {1, 2}. Find the profit-maximizing price and quantity in each market. Under which conditions on the parameters (\(a_i\), \(b_i\), and c) does the monopolist charge the same price in both markets?
(b) Constant elasticity of substitution (CES) demand. Consider now that the direct demand in each market i is given by \(q_i(p_i) = A_i p_i^{-b_i}\), where i = {1, 2}. (Recall that the exponent \(-b_i\) indicates the elasticity of substitution, which is just a number, and thus is constant in \(q_i\).) Find the profit-maximizing price and quantity in each market. Under which conditions on parameters (\(A_i\), \(b_i\), and c) does the monopolist charge the same price in both markets?

2. Comparing discrimination profits.B Consider a cooperative of wheat producers in Washington State, who sell their products to two types of customers: households, with demand q1 = 100 − 3p1 ; and firms, with demand q2 = 100 − 12 p2 . The cooperative operates as a monopolist in the area, and its cost function is C(q) = 1,300 + 4q, where q = q1 + q2 denotes aggregate output. There is no possibility of arbitrage between the two groups.
(a) Third-degree price discrimination. Set up the PMP for the cooperative. Find the optimal output levels and prices for each group of customers.
(b) First-degree price discrimination. Assume now that the cooperative can practice first-degree price discrimination. Find the optimal output levels and prices for each group of customers.
(c) Compare the total profits of the cooperative when practicing each type of price discrimination.

3. Implementing price discrimination.A Describe how the following firms could implement price discrimination. Be specific about the degree, which markets/consumers are charged higher or lower prices, and what barriers they may face.
(a) Restaurants
(b) Airlines
(c) Cable providers
(d) Wheat growers

4. Third-degree price discrimination.A A local car wash faces two types of customers: older, more traditional customers, who like their cars to sparkle all the time, with demand \(q_1 = 75 - 2p_1\); and the younger generation, who do not care as much about having a clean car, with demand \(q_2 = 25 - 4p_2\).
Each car costs $2 to wash, and the fixed costs of the car wash are $50. Find the price and number of cars washed for each group if the car wash were to price-discriminate. What is its total profit?

5. When to price-discriminate.B Consider a monopolist selling to two markets, each with a different demand: (high) \(p_H = a_H - b_H q_H\), and (low) \(p_L = a_L - b_L q_L\), where \(a_H > a_L\) and \(b_H \leq b_L\), so that demand is greater in the high market for any price. The firm practices third-degree price discrimination and has a constant marginal cost MC = c > 0. Is there a level of cost where the monopolist chooses not to sell to the low market?

6. Third-degree price discrimination with elasticity.A A microchip manufacturer produces microchips used in cell phones and sells them in two countries: the United States (US) and Japan (J). The price elasticity in the United States is \(\varepsilon_{US} = -1.5\); and in Japan, it is \(\varepsilon_J = -2.5\). If the monopolist practices third-degree price discrimination, the marginal cost of producing the chips is $50, and the cost of shipping the chips to Japan is $5 per chip, what price does it set in each country?

7. Willingness to price discriminate.B An airline has been collecting data to estimate demand for flights between Greenville, SC and Seattle, for which it would be the only provider. It has estimated this demand to be p = 1,000 − 2q. The total cost (in dollars) of this flight is TC(q) = 50,000 + 20q.
(a) Uniform pricing. If the airline cannot discriminate, what price does it charge, how many tickets does it sell, and what is its profit?
(b) First-degree price discrimination. If the airline can do first-degree price discrimination (based on information it receives through its partners during online booking), how many tickets does it sell, and what is its profit?
(c) Information acquisition.
If the airline has to pay for the information on prices through its partner in order to practice first-degree price discrimination, how much is it willing to pay for that service?

8. Ineffective price discrimination.B A local movie theater has been worried about customers posing as students to purchase movie tickets at a lower price. To combat this, the owner of the theater is interested in purchasing a student ID scanner from the local university, at a cost of $75. The movie theater faces inverse demand of \(p_S = 25 - 0.1q_S\) for students, and inverse demand \(p_O = 30 - 0.1q_O\) for all other customers. The theater faces a marginal cost of $2 per customer.
(a) If the theater price-discriminates, what does it charge each group, and what is its total profit?
(b) If every non-student can pass as a student and get in at student prices, how many non-students will go to the theater? (Hint: Plug the student price into the non-student demand.) What is the theater’s profit?
(c) What is the theater’s profit if it purchases and uses the student ID scanner? Should it purchase the scanner?

9. Deciding when to withdraw from a market.B Clarke’s Crisp Croissants has a monopoly on the local market for breakfast pastries, which it makes at zero marginal cost. Demand in the local market is \(q_L = 10 - p_L\). The firm also sells croissants in a neighboring town with demand \(q_N = 5 - p_N\), where transportation costs are zero.
(a) Uniform price. If Clarke chooses to set a uniform price (i.e., the same price in all markets), what is the profit-maximizing price, quantity in each market, and total profit?
(b) Third-degree price discrimination. If Clarke employs third-degree price discrimination, what price does it set in each market? How much does the firm sell in each market? What is Clarke’s total profit? Is it profitable for it to price-discriminate?
(c) Demand change.
If the neighboring town’s demand falls to \(q_N = 2.5 - p_N\), should the monopolist set a uniform price that ignores the neighboring market?

10. Quantity discounts.B Phil’s Paper Supply is a monopoly in a small Midwestern town; it sells paper and faces the inverse demand function p(q) = 25 − 0.01q and has total cost of \(TC(q) = 10 - 5q + 2q^2\).
(a) Uniform pricing. If Phil charges a single price for his paper, what does he charge, how many units does he sell, and what is his profit?
(b) Offering quantity discounts. Phil wants to reward his major customers by offering a quantity discount. How much does Phil charge for the first \(q_1\) units of paper, and how many units does he sell? How much does Phil charge for the next block of paper \(q_2 - q_1\), and how many units does he sell in that block?
(c) Calculate Phil’s profit if he decides to offer the quantity discount found in part (b). Should he implement this pricing strategy?

11. Second-degree price discrimination.A Some local water companies offer a discount to customers who use large quantities of water. Consider a local water utility that faces the inverse demand p(q) = 100 − 10q. If the water utility has a marginal cost of $0.5 per unit of water, find the price and quantity the water utility sells if it practices price discrimination with two blocks, \(q_1\) and \(q_2 - q_1\).

12. Second-degree price discrimination with more general demand.B Consider a monopolist facing the inverse demand function p(q) = a − bq, with total cost of TC(q) = cq, where a > c > 0.
(a) Write down the firm’s profit maximization problem if it were to practice second-degree price discrimination to two blocks, \(q_1\) and \(q_2 - q_1\).
(b) Find the price that it would charge in each block, and how many units are in each block.
(c) Show that the price that it charges in the first block \(q_1\) is greater than the price in the second block \(q_2 - q_1\).

13.
Second-degree price discrimination with three blocks.C Consider the demand function from example 11.2, p(q) = 10 − q, and marginal cost c = 4. Consider now that the monopolist wants to add a third block of discounts, \(q_3 - q_2\). Set up and solve the monopolist’s problem in this case. Compare your answer to the results found in example 11.2, with two blocks.

14. Price discrimination and advertising.C Annie’s Apples is a monopoly selling caramel apples with demand \(p(q, A) = 100 - q + \sqrt{A}\), where q is the number of apples and A is the advertising expenditure. Annie’s marginal cost is $2 per apple.
(a) If Annie wishes to offer a quantity discount, how much does she charge for each of the two blocks of consumers? How much does Annie spend on advertising?
(b) If advertising is banned by law, so A = 0, what is the firm’s optimal block pricing strategy? How are your results in part (a) affected? Interpret.

15. Bundling prices.A A fast food restaurant faces two types of consumers and is deciding on a bundling strategy. The table here reports each consumer’s WTP for hamburgers, french fries, and the bundle of both items, as well as the average cost of each good.

                Hamburger   French Fries   Both items (bundle)
Consumer 1      $3          $4             $7
Consumer 2      $5          $2.5           $7.5
Average cost    $2          $2             $4

(a) What prices should the restaurant set for each food item if it sells the items separately? What is its profit in each scenario?
(b) What prices should the restaurant set if it sells the bundle “meal” of a hamburger and french fries?
(c) How do your answers change if the restaurant’s cost of beef increases so that the marginal cost of hamburgers increases to $4?

16. Bundling–I.B Consider the scenario in example 11.4, but assume that individuals’ WTP for both goods are positively correlated (i.e., parameter β = 1.4).
For simplicity, assume that α = 0.92, as in the following table:

                CPU      Monitor   Both Items (computer)
Consumer 1      $500     $140      $640
Consumer 2      $460     $100      $560
Average cost    $400     $80       $480

Show that the firm has no incentive to bundle in this case of positively correlated demands.

17. Bundling–II.B TV-Net, a local cable TV and internet provider, is deciding on a bundling strategy. This table reports the WTP for TV alone, internet alone, and the bundle for each customer, as well as the average cost.

                TV       Internet   Both Items (bundle)
Consumer 1      60       50β        60 + 50β
Consumer 2      60α      50         60α + 50
Average cost    10       20         30

where α, β ∈ (0, 1). For simplicity, assume that customer 1 has the highest WTP for the bundle of both goods. Repeat the analysis from example 11.4 to show when the firm should prefer to bundle, sell the items separately, or choose a mixed-bundling strategy.

18. Why bundle?A Many TV providers offer cable TV, internet, and telephone services, which are often bundled in different ways. Explain the intuition behind offering each separately, bundling two out of three services, and bundling all three services together.

19. Bundling–III.B TV-Net is deciding on a bundling strategy. This table reports the WTP for TV alone, internet alone, and the bundle for each customer, as well as the average cost.

                TV       Internet   Both items (bundle)
Consumer 1      60       40         100
Consumer 2      40       50         90
Consumer 3      25       60         85
Average cost    10       20         30

(a) If TV-Net only faces consumers 1 and 2, does it prefer to bundle, sell the items separately, or choose a mixed-bundling strategy?
(b) If TV-Net faces all three consumers, does it prefer to bundle, sell the items separately, or choose a mixed-bundling strategy?

20. Two-part tariff.B A two-part tariff is another price-discrimination method where the producer of a good is able to capture the entire consumer surplus.
An example of this might be an amusement park that charges a fee for entry (the tariff), and then charges the customer for each ride (by buying tickets). Let’s investigate how a firm sets the optimal two-part tariff by assuming that we have 100 consumers, each with demand for rides of p = 9 − q, and the costs of running the amusement park are C(q) = 100 + q.
(a) Uniform pricing. If the firm acts as a monopoly, setting a single price, what is its profit-maximizing price, quantity of rides (per person and aggregate), and profit?
(b) Marginal cost pricing. If the firm sets its price per ride equal to marginal cost, what is the number of rides it will sell (per person and aggregate) and consumer surplus?
(c) Two-part tariff. If the amusement park uses a two-part tariff, setting its entrance fee equal to consumer surplus while charging a price per ride equal to its marginal cost, what is its total profit?

12 Simultaneous-Move Games

12.1 Introduction

This chapter is the first one to analyze game theory tools in economics, a discussion that we expand upon in chapter 13. Many of the tools we present are then applied to analyze imperfectly competitive markets (chapter 14), contract theory (chapter 16), and externalities and public goods (chapter 17). We start by describing the contexts in which economists and other social scientists refer to a situation where agents interact as a “game,” and how to represent these scenarios graphically using either matrices or game trees. The remainder of the chapter focuses on how to predict players’ behavior in various games, which we do by deploying various “solution concepts” that help us identify equilibrium strategies where, intuitively, no player has any incentive to change her strategy.
We start with a simple solution concept known as “strategic dominance.” Rather than seeking to find which strategy provides the highest payoff to a given player, dominance looks at which strategies a rational player would never use because other strategies give her a strictly higher payoff, regardless of what her opponents do. Essentially, a dominated strategy provides a player with an unambiguously lower payoff than other strategies and, as a result, we delete it from her set of available strategies. Deleting all strategies that players regard as dominated is a straightforward tool for some games, which can help us provide relatively precise equilibrium predictions about how players behave.

However, the application of strategic dominance may not delete many strategies (or may have no effect at all) in some games. In these cases, we need to rely on solution concepts that help us predict players’ behavior more precisely. The concept of a player’s “best response” can help us in this regard, as it identifies which strategy (or strategies) provides a player with the highest possible payoff against each of the strategies that her opponents select. The Nash equilibrium (NE), defined later in this chapter, then uses the notion of best response by searching for a scenario (a profile of strategies, one for each player) where every player plays a best response to her opponents’ strategies (i.e., mutual best response). We discuss standard games in economics, such as the Prisoner’s Dilemma game, the Battle of the Sexes game, and Coordination and Anticoordination games, and how they can apply to related disciplines like business, finance, or political science.

For several games (most of those covered in Intermediate Microeconomics courses), the notion of the NE helps us predict more precisely how players behave in equilibrium.
Yet, the NE does not offer a precise equilibrium prediction for some games if we assume that players choose a specific strategy with 100 percent probability. In these games, we show that allowing players to randomize across some (or all) of their available strategies allows a precise NE prediction. The NE in which players randomize their strategies is known as a “mixed-strategy Nash equilibrium.” We illustrate its application with an example of the penalty kicks round in the 1999 Women’s World Cup final between the US and China. We then describe how to depict the best response of each player in this type of game.

12.2 What Is a Game?

In economics and business, we often refer to a “game” every time we consider scenarios in which one agent’s actions affect other agents’ well-being. For instance, when a firm increases its output, it may lower market prices, which in turn decreases the profits of other firms in the same industry. Similarly, when a country sets a higher tariff on imports, it may decrease the volume of imports, at the expense of another country’s exports and welfare. As you probably noticed, day-to-day life is packed with strategic contexts in which our actions affect the well-being of other agents (either individuals, firms, or governments) and, as a consequence, most contexts can be modeled as games.

We next describe the main ingredients of a game. Whether analyzing firm competition, donations to a charity, or government subsidies, all strategic scenarios include the following elements:
• Players. The set of individuals, firms, or countries that interact with one another. When we examine competition between two sellers, we say that there are only two players (i.e., two firms), whereas when analyzing the incentives to donate to a charity we may have more than a million players (e.g., individuals receiving a phone call to contribute to a charity).1
• Strategy.
A complete plan describing which actions a player chooses in each possible situation (contingency). A strategy can be informally understood as an instruction manual: a player opens the manual, looks for the page describing the contingency she is facing in the game (e.g., the actions other players chose, and what stage of the game the player is at), and reads the action that the instruction manual tells her to choose in such a situation.
• Payoffs. A game must also list the payoff that every player obtains under each possible strategy path. For example, if player 1 chooses A and players 2 and 3 choose B, the vector of payoffs is ($5, $8, $7), where the first component of the triplet lists the payoff going to player 1, $5, while $8 accrues to player 2, and $7 to player 3.

1. Needless to say, we do not consider “games with one player,” as there is no one else to be affected by the player’s actions.

                       Player 2
                       Left        Right
Player 1    Up         −4, −4      0, −7
            Down       −7, 0       −1, −1

Matrix 12.1 Example of a two-player game.

[Figure 12.1 Example of a game tree. Player 1 (first mover) chooses between A and B. Choosing B ends the game with payoffs (0, 10). After A, player 2 (second mover) chooses a, with payoffs (4, 4), or b, with payoffs (−2, −2). In each pair, the first number is the payoff for player 1 and the second is the payoff for player 2.]

Throughout the analysis, we assume that all players are rational. In a strategic scenario, this requires that every player knows the rules of the game (i.e., who the players are, what their available strategies are in each contingency, and their resulting payoffs in each case). In addition, it requires that every player knows that every player knows the rules of the game, and every player knows that every player knows… ad infinitum. To better understand this assumption, consider that Ana and Felix are about to play checkers. “Rationality,” in this context, means that they both know the rules of the game, that Ana (Felix) knows that Felix (Ana) knows the rules of the game, that Felix (Ana) knows that Ana (Felix) knows that Felix (Ana) knows the rules of the game, ad infinitum.
In short, this assumption is often referred to as “common knowledge of rationality,” and informally guarantees that every player can put herself in the shoes of her opponent at any stage of the game to anticipate her moves.

Two graphical approaches. We will encounter two approaches to graphically represent games: matrices and trees, as matrix 12.1 and figure 12.1 illustrate. In the case of matrices, player 1 is typically located on the left side of the matrix, as she chooses rows (and she is often referred to as the “row player”), whereas player 2 is placed at the top of the matrix because she selects columns (and hence is called the “column player”). In matrix 12.1, for instance, if player 1 chooses Up while player 2 picks Right, their payoff becomes (0, −7), indicating that player 1’s payoff is zero, while player 2’s is −7. Matrices are often used to represent games in which players choose their actions simultaneously.

In the case of game trees, such as that shown in figure 12.1, players act sequentially, with player 1 acting first (e.g., the leader) and player 2 responding to player 1’s action as a follower. In figure 12.1, player 1 can choose between A and B. If player 1 chooses B, the game is over and payoffs are distributed, whereas if she chooses A, player 2 is called on to move (responding with either action a or b). For instance, if player 1 chooses A and player 2 responds with b, we say that (A, b) is the “strategy profile” (i.e., the list of player 1’s and player 2’s strategies, or how the game was played).2

Now that we know how to describe a strategic scenario between players (a game), we turn to the main question of the chapter: How do we predict the way in which a game will be played? In other words, how can we forecast players’ behavior in a competitive context? In that regard, we seek to identify scenarios in which no player has any incentive to alter her strategy choice, given the strategy of her opponents.
In short, these scenarios are called “equilibria” because players have no incentive to deviate from their strategy choices.

12.3 Strategic Dominance

In this section, we analyze the first solution concept: strategic dominance. We define the types of dominance next, and then we apply them to a standard game.

Strict dominance. Player i finds that strategy \(s_i\) strictly dominates another strategy \(s_i'\) if choosing \(s_i\) provides her with a strictly higher payoff than selecting \(s_i'\), regardless of her rivals’ strategies.

When strategy \(s_i\) strictly dominates every other strategy \(s_i'\) available to player i, we say that \(s_i\) is a “strictly dominant strategy.” As a consequence, a player wants to choose a strictly dominant strategy because it provides her with an unambiguously higher payoff than any other available strategy (i.e., regardless of the strategy her opponents select). In other words, one specific strategy (her strictly dominant strategy) yields a higher payoff, regardless of the beliefs that she holds about her rivals’ choices.3 In contrast, in this definition, we say that strategy \(s_i'\) is “strictly dominated” by strategy \(s_i\). Intuitively, a strictly dominated strategy gives player i a strictly lower payoff, regardless of her rivals’ choice. We then expect a rational player to never choose such a strategy. Tool 12.1 provides a step-by-step road map on finding dominant strategies, while example 12.1 puts the tool to work.

2. If, when player 2 responds with b, player 1 is again called on to move in the third stage of the game, choosing between two new actions C and D, then an example of a strategy profile would be (AC, b), where player 1 chooses A, player 2 responds with b, and player 1 ultimately responds with C in the last stage of the game. Note that, to illustrate that actions A and C are selected by player 1 (although one is in stage 1 and the other is in stage 3), we list both of them together in the first element of her strategy pair.

Tool 12.1.
How to find a strictly dominant strategy:
1. Focus on the row player by fixing your attention on one strategy of the column player (i.e., one specific column).
(a) Cover with your hand all columns you aren’t considering.
(b) Find the highest payoff for the row player by comparing, across rows, the first component of every pair.
(c) For future reference, underline this payoff.
2. Repeat step 1, but now fix your attention on a different column.
3. If, after repeating step 1 enough times, you find that the highest payoff for the row player always occurs at the same row (your underlined payoffs are all on the same row), this row becomes her dominant strategy. Otherwise, she does not have a dominant strategy.
4. For the column player, the method is analogous, but now fix your attention on one strategy of the row player (one specific row), covering with your hand all other rows you aren’t considering, and comparing the payoffs of the column player (second component of every pair) across columns.

Example 12.1: Finding strictly dominant strategies. Matrix 12.2a considers two firms simultaneously and independently choosing a technology, either A or B for firm 1, and a or b for firm 2. (All payoffs in these examples are in the millions of dollars.) We can easily show that technology A is strictly dominant for firm 1 because it yields a higher payoff than B, both when firm 2 chooses a in the left column (because 5 > 3) and when it selects b in the right column (given that 2 > 1).4

3. For a more formal (and shorter!) definition, we say that, in a game with two players i and j, player i finds strategy \(s_i\) to strictly dominate another strategy \(s_i'\) if \(u_i(s_i, s_j) > u_i(s_i', s_j)\) for every strategy \(s_j\) of her rival. That is, \(s_i\) yields a higher payoff than \(s_i'\) regardless of the strategy her rival (player j) picks.

4.
When comparing the payoffs of the row player (firm 1 in this example), we focus on the first number in every pair, such as 2 in the cell corresponding to (A, b) and 1 in the cell corresponding to (B, b).

                      Firm 2
                      Tech a     Tech b
Firm 1    Tech A      5, 5       2, 0
          Tech B      3, 2       1, 1

Matrix 12.2a Technology choice game–I.

A similar argument applies to technology a, which is strictly dominant for firm 2 because it provides this firm with a higher payoff than b, both when firm 1 chooses technology A (in the top row, where 5 > 0) and when it selects B (in the bottom row, where 2 > 1).5 As a result, we can expect firm 1 to choose A and firm 2 to select a in matrix 12.2a, yielding (A, a) as the equilibrium of this game.

The definition of strict dominance does not allow for ties in the payoffs that player i earns. The next term, “weak dominance,” allows ties to occur.

Weak dominance. Player i finds that strategy \(s_i\) weakly dominates another strategy \(s_i'\) if choosing \(s_i\) provides her with a strictly higher payoff than selecting \(s_i'\) for at least one of her rivals’ strategies, but provides the same payoff as \(s_i'\) for the remaining strategies of her rivals.

Therefore, a weakly dominant strategy never yields a lower payoff than other available strategies, and yields a strictly higher payoff against at least one strategy of the player’s rivals.6 In matrix 12.2b, firm 1 finds that technology A weakly dominates B because A yields a higher payoff than B against a (when firm 2 chooses the left column, where 5 > 3), but provides firm 1 with exactly the same payoff as B, $2, against b (when firm 2 selects the right column). A similar argument applies to firm 2, which finds that technology a weakly dominates b. Indeed, technology a yields a higher payoff than b when firm 1 selects A (5 > 0, on the top row of the matrix), but generates the same payoff as b, $1, when firm 1 chooses B at the bottom row.

5.
When comparing the payoffs of the column player (firm 2 in this example), we focus on the second number in every pair, such as 2 in the cell corresponding to (B, a) and 1 in the cell corresponding to (B, b).

6. More formally, a player i finds strategy \(s_i\) to weakly dominate another strategy \(s_i'\) if \(u_i(s_i, s_j) \geq u_i(s_i', s_j)\) for every strategy \(s_j\) of her rival, holding strictly for at least one strategy \(s_j\). Note that the last part of the definition (“… holding strictly for at least one strategy \(s_j\)”) is required to avoid a complete tie, where player i earns the same payoff when choosing strategy \(s_i\) and \(s_i'\) against every strategy of her opponent, \(s_j\).

                      Firm 2
                      Tech a     Tech b
Firm 1    Tech A      5, 5       2, 0
          Tech B      3, 1       2, 1

Matrix 12.2b Technology choice game–II.

Self-assessment 12.1 Consider matrix 12.2a again, but assume that the payoff when firms choose technology (B, b), in the lower-right cell of the matrix, is (3, 3), indicating that both firms receive a payoff of $3, rather than the payoff of $1 obtained in matrix 12.2a. Follow the steps in example 12.1 to find if either firm has a dominant strategy. Interpret.

In matrices with more than two rows and/or columns, finding which strategies are strictly dominated can prove particularly helpful. From the previous discussion, we know that a rational player should not use a strictly dominated strategy because it yields a lower payoff than other available strategies, regardless of the strategy her rivals pick. As a consequence, we can delete from a matrix those strategies (rows or columns) that are strictly dominated for one player because she would not choose them in any case.7 Once we have deleted these dominated strategies for one player, we can move on to another player, and delete the strategies she considers strictly dominated, and subsequently move on to another player; this process is known as Iterative Deletion of Strictly Dominated Strategies (IDSDS).
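The comparisons behind tool 12.1 and the two dominance definitions can also be checked mechanically. The following Python sketch is only an illustration under our own assumptions: the dictionary layout and the helper names (`row_payoffs`, `column_payoffs`, `dominates`) are hypothetical and not part of the text's method, and the payoffs are those of matrices 12.2a and 12.2b. The weak test follows the formal definition in footnote 6.

```python
# Payoff tables keyed by (row strategy, column strategy); each value is
# (firm 1's payoff, firm 2's payoff), copied from matrices 12.2a and 12.2b.
matrix_12_2a = {
    ("A", "a"): (5, 5), ("A", "b"): (2, 0),
    ("B", "a"): (3, 2), ("B", "b"): (1, 1),
}
matrix_12_2b = {
    ("A", "a"): (5, 5), ("A", "b"): (2, 0),
    ("B", "a"): (3, 1), ("B", "b"): (2, 1),
}


def row_payoffs(game, row):
    """Firm 1's payoffs from playing `row`, one entry per column of firm 2."""
    columns = list(dict.fromkeys(c for _, c in game))
    return [game[(row, c)][0] for c in columns]


def column_payoffs(game, column):
    """Firm 2's payoffs from playing `column`, one entry per row of firm 1."""
    rows = list(dict.fromkeys(r for r, _ in game))
    return [game[(r, column)][1] for r in rows]


def dominates(u, v, strict=True):
    """Does payoff list u dominate v (same rival strategies, same order)?"""
    if strict:
        return all(x > y for x, y in zip(u, v))
    # Weak dominance: never lower, and strictly higher at least once.
    return (all(x >= y for x, y in zip(u, v))
            and any(x > y for x, y in zip(u, v)))


for name, game in [("12.2a", matrix_12_2a), ("12.2b", matrix_12_2b)]:
    a, b = row_payoffs(game, "A"), row_payoffs(game, "B")
    print(f"Matrix {name}: A strictly dominates B: {dominates(a, b)};",
          f"A weakly dominates B: {dominates(a, b, strict=False)}")
```

With these payoffs, the strict test succeeds for A over B only in matrix 12.2a, while the weak test succeeds in both matrices, which matches the discussion of the tie at $2 in matrix 12.2b.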
Once we cannot find any more strictly dominated strategies for either player, we are left with the equilibrium prediction according to IDSDS. We formally say that those strategy profiles (i.e., cells) survive IDSDS. In matrix 12.2a, this solution concept yields a precise equilibrium prediction: we can delete technology B for firm 1 because it is strictly dominated by A, and b for firm 2 because it is strictly dominated by a. Once we delete the bottom row corresponding to technology B and the right column associated with technology b, we are left with a unique cell surviving the application of IDSDS, corresponding to strategy profile (A, a), which predicts that firm 1 chooses A, while firm 2 selects a.

While IDSDS offers precise equilibrium predictions in some games, it provides imprecise predictions in other cases, yielding multiple equilibria (i.e., several cells surviving IDSDS). Example 12.2 illustrates this possibility.

7. More formally, we say that player i does not choose a strategy she finds to be strictly dominated, regardless of the beliefs she sustains about the strategy that her rivals will pick. In plain English, player i could say: “I don’t care which strategy my rivals pick; choosing a strictly dominated strategy makes me worse off!”

Example 12.2: When IDSDS does not provide a unique equilibrium. Consider matrix 12.3 representing the pricing decision of two firms. Each firm simultaneously chooses whether to set high, medium, or low prices. In this context, let us apply IDSDS, starting with firm 1. High is strictly dominated by Low because it yields a lower payoff than that from Low, regardless of the price chosen by firm 2 (i.e., independent of the column that firm 2 selects).8

                      Firm 2
                      High      Medium     Low
Firm 1    High        2, 3      1, 4       3, 2
          Medium      5, 1      2, 3       1, 2
          Low         3, 7      4, 6       5, 4

Matrix 12.3 When IDSDS yields more than one equilibrium–I.

After deleting the strictly dominated strategy High from firm 1’s rows in matrix 12.3, we are left with the reduced matrix 12.4, which has only two rows.

                      Firm 2
                      High      Medium     Low
Firm 1    Medium      5, 1      2, 3       1, 2
          Low         3, 7      4, 6       5, 4

Matrix 12.4 When IDSDS yields more than one equilibrium–II.

We can put ourselves in the shoes of firm 2, to see if we can find a strictly dominated strategy for it. Specifically, Low is strictly dominated by Medium because Low yields a strictly lower payoff than Medium, regardless of the row that firm 1 selects.9 After deleting the Low column from firm 2’s strategies, we are left with a further reduced matrix (see matrix 12.5).

8. To understand this, note that when firm 2 chooses High (in the left column of matrix 12.3), firm 1 obtains a higher payoff with Low ($3) than with High ($2). Similarly, when firm 2 selects Medium (in the center column), firm 1 receives a higher payoff from playing Low ($4) than from High ($1), and the same holds when firm 2 chooses Low (in the right column), where firm 1’s payoff is $5 from Low, and only $3 from High. Therefore, we can claim that High is strictly dominated by Low.

9. Indeed, when firm 1 chooses Medium (in the top row of matrix 12.4), firm 2 obtains a higher payoff by selecting Medium (3) than from choosing Low (2). Similarly, when firm 1 selects Low (in the bottom row), firm 2 receives a higher payoff from playing Medium (6) than from Low (4). Hence, we can claim that firm 2 finds Low to be strictly dominated by Medium.

                      Firm 2
                      High      Medium
Firm 1    Medium      5, 1      2, 3
          Low         3, 7      4, 6

Matrix 12.5 When IDSDS yields more than one equilibrium–III.

We can now move again to analyze firm 1. At this point, however, we cannot identify any more strictly dominated strategies for this firm because there is no strategy (no row in matrix 12.5) yielding a lower payoff, regardless of the column that firm 2 plays.
Indeed, firm 1 prefers Medium to Low if firm 2 chooses High (in the left column) because 5 > 3; but it prefers Low to Medium if firm 2 chooses Medium (in the right column) given that 4 > 2. A similar argument applies to firm 2 because there is no strategy (column) yielding a lower payoff, regardless of the row that firm 1 selects. Therefore, the remaining four cells in matrix 12.5, (Medium, High), (Medium, Medium), (Low, High), and (Low, Medium), constitute our most precise equilibrium prediction after applying IDSDS. This imprecision is one of the disadvantages of IDSDS, as well as a motivation to consider other solution concepts to predict equilibrium behavior in games, as we discuss in the next section. The NE solution concept will help us provide predictions about how players behave in equilibrium that are at least as precise, and often more precise.

Self-assessment 12.2 Consider matrix 12.3 again, but assume that the payoff that firms obtain from (High, Medium) is (3, 4) rather than (1, 4) in the top row of the matrix. Which strategy profiles survive IDSDS? Compare your results against those in example 12.2.

The application of IDSDS in example 12.2 left us with several equilibria. IDSDS nonetheless helped us delete one strategy for each player because that strategy is strictly dominated. That is, another strategy provides the player with a strictly higher payoff, regardless of her opponent’s strategy. In some games, such as that in example 12.3, IDSDS does not even allow us to delete a strategy for any player. In those cases, we say that IDSDS “doesn’t have a bite” because IDSDS does not help us reduce the set of strategies that a rational player would choose in equilibrium.

Example 12.3: When IDSDS does not have a bite. Matrix 12.6 represents the Matching Pennies game, which you may have played in your childhood. Players 1 and 2 each hold a penny in one hand, but don’t show it to each other. Both players must
Both players must 306 Chapter 12 Player 1 Player 2 Heads Tails Heads 1, −1 −1, 1 Tails −1, 1 1, −1 Matrix 12.6 Matching Pennies game. then simultaneously show their coins, with the following payoffs: If both players show Heads or they both show Tails, player 1 gets player 2’s penny, for a gain of 1, while player 2 loses his penny, for a loss of 1. This is illustrated in the matrix by the cells along the main diagonal, with payoffs (1, −1). However, if the players show different sides of their coins, player 1 must give her penny to player 2, so player 2’s payoff is 1, while player 1’s is −1, with payoffs (−1, 1). To see that IDSDS has no bite, consider one player at a time. Player 1 does not find any of her strategies strictly dominated: she prefers Heads when player 2 chooses Heads (left column), but Tails when player 2 chooses Tails (right column). In short, player 1 seeks to select the same strategy as player 2 because by doing so, she wins a penny. As a consequence, we cannot find a strategy that player 1 does not use regardless of player 2’s strategy (i.e., regardless of the column that player 2 chooses). A similar argument applies to player 2. He prefers to choose the opposite strategy as player 1 (i.e., choosing Tails when player 1 selects Heads (top row), but Tails when she chooses Heads (bottom row)). In summary, no player has strictly dominated strategies, implying that we cannot delete any row or column from the matrix. Therefore, the application of IDSDS left us with the original matrix! In these cases, we say that IDSDS has “no bite.” Self-assessment 12.3 Consider matrix 12.6 again, but assume that the payoff from both players choosing the same action (i.e., their payoff from (Heads, Heads) or their payoff from (Tails, Tails)), is (0, 0) rather than (1, −1). Which strategy profiles survive IDSDS? Compare your results against those in example 12.3. 
12.4 Nash Equilibrium

From examples 12.1–12.3, we learned that applying IDSDS helps us delete all but one cell from the matrix in some games (and thus predict a unique equilibrium in the game). For other games, however, IDSDS deleted only a few strategies for each player, leaving several surviving cells, thus providing a relatively imprecise equilibrium prediction (e.g., four strategy profiles could emerge as equilibria of the game). And for some games, such as Matching Pennies, applying IDSDS did not delete any strategy for any player, so we say that IDSDS does not have a bite.

In this section, we examine a different solution concept which has “more bite” than IDSDS, and thus offers either the same or more precise equilibrium predictions (i.e., fewer strategy profiles can emerge as equilibria of the game). This solution concept, known as the “Nash equilibrium” after Nash (1950), builds upon the notion that every player finds her best response to each of her rivals’ strategies, and hence we start with the definition of “best response.”10

Best response. Player i regards strategy \(s_i\) as a best response to her rival’s strategy \(s_j\) if \(s_i\) yields a weakly higher payoff than any other available strategy \(s_i'\) against \(s_j\).

Tool 12.2. How to find best responses in matrix games:
1. Focus on the row player by fixing your attention on one strategy of the column player (i.e., one specific column).
(a) Cover with your hand all columns that you are not considering.
(b) Find the highest payoff for the row player by comparing the first component of every pair.
(c) For future reference, underline this payoff. This is the row player’s best response to the column that you considered from the column player.
2. Repeat step 1, but now fix your attention on a different column.
3.
For the column player, the method is analogous, but now direct your attention to one strategy of the row player (i.e., one specific row), cover with your hand all other rows you are not considering, and compare the payoffs of the column player (i.e., second component of every pair).

We next use the concept of best response to define a NE as a scenario in which every player chooses her best strategy, given the strategies chosen by her rivals. In such a scenario, no player has unilateral incentives to deviate from her equilibrium strategy. Indeed, because she is choosing a best response to her rivals’ strategies, deviating would only lower her payoff (or leave it unaffected).

10. Formally, in a game with two players, strategy \(s_i\) is a best response against player j’s strategy \(s_j\) if and only if \(u_i(s_i, s_j) \geq u_i(s_i', s_j)\) for every strategy \(s_i'\) that is different from \(s_i\). That is, there is no other strategy \(s_i'\) that provides player i with a strictly higher payoff than \(s_i\) against her opponent’s strategy \(s_j\).

Nash equilibrium (NE). A strategy profile \((s_i^*, s_j^*)\) is a NE if every player chooses a best response to her rivals’ strategies.

In other words, a strategy profile is a NE if it is a mutual best response: the strategy that player i chooses is a best response to that selected by player j, and vice versa. As a result, no player has incentives to deviate because doing so would either lower her payoff, or keep it unchanged.11 Tool 12.3 describes how to find NEs. Example 12.4 illustrates how to find best responses and then use these responses in our search of NEs.

Tool 12.3. How to find Nash equilibria:
1. Find the best responses of all players (see tool 12.2 for details).
2. Identify which cell or cells in the matrix have all payoffs underlined, meaning that all players have a best response payoff. These cells are the NEs of the game.
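Tools 12.2 and 12.3 translate almost line by line into code. The sketch below is an illustration under our own assumptions (the names `best_responses` and `nash_equilibria` and the dictionary layout are hypothetical); it uses the payoffs of matrix 12.2a. "Underlining" a payoff corresponds to adding a strategy to a best-response set, and a cell is a NE when both of its strategies belong to the relevant sets.

```python
# Matrix 12.2a: keys are (firm 1's technology, firm 2's technology),
# values are (firm 1's payoff, firm 2's payoff).
technology_game = {
    ("A", "a"): (5, 5), ("A", "b"): (2, 0),
    ("B", "a"): (3, 2), ("B", "b"): (1, 1),
}


def best_responses(game):
    """Tool 12.2: best-response sets of the row and the column player."""
    rows = list(dict.fromkeys(r for r, _ in game))
    cols = list(dict.fromkeys(c for _, c in game))
    br_row = {}  # column chosen by the rival -> set of best-responding rows
    for c in cols:
        top = max(game[(r, c)][0] for r in rows)
        br_row[c] = {r for r in rows if game[(r, c)][0] == top}
    br_col = {}  # row chosen by the rival -> set of best-responding columns
    for r in rows:
        top = max(game[(r, c)][1] for c in cols)
        br_col[r] = {c for c in cols if game[(r, c)][1] == top}
    return br_row, br_col


def nash_equilibria(game):
    """Tool 12.3: cells in which both payoffs would be underlined."""
    br_row, br_col = best_responses(game)
    return [(r, c) for (r, c) in game if r in br_row[c] and c in br_col[r]]


br_firm1, br_firm2 = best_responses(technology_game)
print("Firm 1's best responses:", br_firm1)
print("Firm 2's best responses:", br_firm2)
print("Pure-strategy NE:", nash_equilibria(technology_game))
```

Storing best responses as sets, rather than single strategies, is a deliberate choice: it lets a player be indifferent between several responses, so ties are handled without any special case.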
Example 12.4: Finding best responses and NEs. Using matrix 12.2a from example 12.1 again, let us first identify the best responses of each firm.

Firm 1’s best responses. When firm 2 chooses a in the left column, firm 1’s best response is A (on the top row) because it yields a higher payoff than B (bottom row). To see this, a common visual guide many students use is to cover with one hand (or a piece of paper) the column that firm 2 is not choosing (b in this case, on the right side of the matrix), leaving strategy a uncovered. Once you focus on the column corresponding to a, it is obvious that firm 1’s best response is A, on the top row, because 5 > 3. Following a similar approach, when firm 2 chooses b in the right column, firm 1’s best response is … (cover with your hand the unchosen column a on the left side of the matrix!) technology A, given that 2 > 1. Summarizing, firm 1’s best responses are \(BR_1(a) = A\) when firm 2 chooses a and \(BR_1(b) = A\) when firm 2 selects b.12

Firm 2’s best responses. We can now follow the same approach to figure out the best responses of firm 2 to each strategy chosen by firm 1. Let us first analyze the case in which firm 1 selects technology A (in the top row). To focus your attention on the strategy that firm 1 selects, cover with your hand the bottom row (corresponding with the strategy firm 1 does not select). We can easily see that in this context firm 2’s best response is \(BR_2(A) = a\) because 5 > 0. Similarly, when firm 1 chooses B (bottom row), we cover the top row with a hand and find that firm 2’s best response is \(BR_2(B) = a\) because 2 > 1. Like firm 1, firm 2 chooses a, regardless of the strategy chosen by its rival (firm 1). Therefore, strategy profile (A, a) constitutes a mutual best response, and thus the NE of the game. In particular, firm 1 does not have an incentive to deviate from A when its rival chooses a, nor does firm 2 have the incentive to deviate from a when its rival chooses A.

11. Another way to describe a NE is by focusing on every player i’s beliefs about how her rivals will behave. In particular, player i’s beliefs assign a probability to each of her opponents’ strategies. Using that approach, we can say that a NE is a system of beliefs (that is, a list of beliefs for each player) and a list of actions that satisfy two properties: (1) every player uses a best response to her beliefs about how her rivals behave; and (2) the beliefs that players sustain are, in equilibrium, correct. For simplicity, however, we focus on the definition given here.

12. In other words, firm 1 responds with technology A regardless of firm 2’s choice. Indeed, as shown in example 12.1, firm 1 finds strategy A to strictly dominate B; that is, strategy A yields a higher payoff than B independent of the strategy chosen by firm 2 (i.e., regardless of the column firm 2 chooses).

                      Firm 2
                      Tech a     Tech b
Firm 1    Tech A      5, 5       2, 0
          Tech B      3, 2       1, 1

Matrix 12.7 Finding best responses and NEs in Technology game–I.

Faster tool: underlining BR payoffs. A common tool used to rapidly find NEs in games is to underline best response payoffs (i.e., the payoff that a player obtains from playing her best response to each of her opponents’ strategies). In the game we analyze here, matrix 12.7 underlines the payoff that firm 1 obtains from choosing A against a ($5 in the left column) and b ($2 in the right column), and the payoff that firm 2 accrues from selecting a against A ($5 in the top row) and B ($2 in the bottom row). Once we are done underlining best response payoffs (see matrix 12.7), the cells where the payoffs from all players are underlined must constitute a NE of the game because players are playing best responses against their rivals’ strategies (mutual best responses).
In this matrix, the NE solution concept provides the same equilibrium prediction as IDSDS did in example 12.1, (A, a).

We next examine matrix 12.1b from example 12.1, showing that NE yields more precise predictions than IDSDS. Matrix 12.8 reproduces matrix 12.1b for easier reference. It is easy to identify that firm 1’s best responses are BR1(a) = A when firm 2 chooses a (in the left column), and BR1(b) = {A, B} when firm 2 selects b (in the right column), where the latter indicates that firm 1 is indifferent between responding with A or B when firm 2 chooses b, as both yield a payoff of 2. Similarly, firm 2’s best responses are BR2(A) = a when firm 1 chooses A (in the top row), and BR2(B) = {a, b} when firm 1 selects B (in the bottom row), as both yield a payoff of 1. We can follow this approach of underlining best response payoffs, obtaining matrix 12.8.

                        Firm 2
                   Tech a      Tech b
Firm 1   Tech A    _5_, _5_    _2_, 0
         Tech B    3, _1_      _2_, _1_

Matrix 12.8 Finding best responses and NE in Technology game–II (underlined best response payoffs are shown as _x_).

Two strategy profiles have the payoffs from all players underlined, (A, a) and (B, b), which constitute the two NEs of the game. Recall that the application of IDSDS to this game had no bite, as we could not delete any strategy as being strictly dominated for either firm 1 or 2. As a consequence, we were left with all four cells (four strategy profiles) as the most precise equilibrium prediction according to IDSDS. We have now shown that the NE solution concept yields two NEs, thus providing a more precise prediction than IDSDS.

Self-assessment 12.4 Consider matrix 12.8 again, but assume that the payoff players obtain from choosing technology (B, b), in the lower right side of the matrix, is (3, 3). Intuitively, coordinating on the superior technology (A, a) is still preferable, yielding a payoff (5, 5), but the payoff difference to (B, b) is now smaller than in matrix 12.8. Find the NE of the game, and compare your results against those in example 12.4. Interpret.
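The underlining tool of example 12.4 can also be checked mechanically. The following Python sketch is an illustration only and not part of the original text: the function name `pure_nash_equilibria`, the helper `labelled`, and the list-of-tuples encoding of the matrices are assumptions made for this example. It marks a payoff as "underlined" when it is the highest that player can obtain against the rival's strategy, and reports the cells where both payoffs are underlined.

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """Return the (row, column) cells where both payoffs are best-response payoffs.

    payoffs[r][c] is a tuple (row player's payoff, column player's payoff).
    """
    rows, cols = len(payoffs), len(payoffs[0])
    # Highest payoff the row player can obtain against each column ...
    best_row = [max(payoffs[r][c][0] for r in range(rows)) for c in range(cols)]
    # ... and the highest payoff the column player can obtain against each row.
    best_col = [max(payoffs[r][c][1] for c in range(cols)) for r in range(rows)]
    return [
        (r, c)
        for r, c in product(range(rows), range(cols))
        if payoffs[r][c][0] == best_row[c] and payoffs[r][c][1] == best_col[r]
    ]

# Matrices 12.7 and 12.8 (firm 1 chooses the row, firm 2 the column).
matrix_12_7 = [[(5, 5), (2, 0)], [(3, 2), (1, 1)]]
matrix_12_8 = [[(5, 5), (2, 0)], [(3, 1), (2, 1)]]
firm1, firm2 = ["A", "B"], ["a", "b"]

def labelled(payoffs):
    """Translate cell indices into the strategy labels used in the text."""
    return [(firm1[r], firm2[c]) for r, c in pure_nash_equilibria(payoffs)]

print(labelled(matrix_12_7))  # [('A', 'a')]
print(labelled(matrix_12_8))  # [('A', 'a'), ('B', 'b')]
```

Ties are handled by the equality test: in matrix 12.8, both of firm 1's payoffs of 2 against b count as best-response payoffs, which is why (B, b) appears as a second NE.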
12.5 Common Games

In this section, we apply the NE solution concept to four common games in economics and other social sciences: the Prisoner’s Dilemma game, the Battle of the Sexes game, the Coordination game, and the Anticoordination game.

Example 12.5: Prisoner’s Dilemma game

Consider the following scenario. Two people have been arrested by the police, and they are placed in different cells so they cannot communicate with each other (cell phones were left in custody too!). The police have only minor evidence against them, which would lead to a minor sentence (a year in jail). However, the police suspect that these two individuals committed a specific crime, and separately offer each of them the following deal:

If you confess to the crime and your partner doesn’t, we will let you go home, while your partner will serve 10 years in jail. If instead you don’t confess but your partner does, she will go home and you will serve 10 years in jail. If both you and your partner confess, both of you will serve 5 years in jail. Finally, if neither of you confesses, both of you will only serve one year in jail.

Matrix 12.9a describes this game, where all amounts are negative to represent that years in jail generate a disutility. As in the previous discussion, we first identify best responses for each player.

                               Player 2
                         Confess     Not confess
Player 1   Confess       −5, −5      0, −10
           Not confess   −10, 0      −1, −1

Matrix 12.9a The Prisoner’s Dilemma game.

Player 1’s best responses. For player 1, we first fix player 2’s strategy at Confess (left column), yielding BR1(C) = C because −5 > −10. Similarly, fixing player 2’s strategy at Not confess (right column) yields BR1(NC) = C because 0 > −1. Therefore, player 1 responds with Confess regardless of player 2’s strategy (both when player 2 confesses and when she does not).

Player 2’s best responses.
Player 2’s best responses are symmetric because her payoffs are symmetric to those of player 1, but we include the analysis here as further practice. We first fix player 1’s strategy at Confess (top row), obtaining BR2(C) = C because −5 > −10; second, we fix player 1’s strategy at Not confess (bottom row), yielding BR2(NC) = C because 0 > −1. Overall, both players choose Confess regardless of their opponent’s strategy, thus indicating that Confess is a strictly dominant strategy for both players. Underlining best response payoffs, we obtain matrix 12.9b.

                               Player 2
                         Confess        Not confess
Player 1   Confess       _−5_, _−5_     _0_, −10
           Not confess   −10, _0_       −1, −1

Matrix 12.9b The Prisoner’s Dilemma game – Underlining best response payoffs (underlined payoffs are shown as _x_).

As a result, (Confess, Confess) is the unique NE of the game, because in that strategy profile, both players choose mutual best responses.

Self-assessment 12.5 Consider the Prisoner’s Dilemma game from matrix 12.9a again. However, let us now assume that, when a player confesses while her partner does not, police do not offer any deal to the confessing player. As a consequence, payoff (−10, 0) becomes (−10, −1); and similarly, payoff (0, −10) becomes (−1, −10). All other payoffs are unaffected. Find the NE of the game, and compare your results against those in example 12.5. Interpret.

The NE in the Prisoner’s Dilemma game is rather somber: every player, seeking to maximize her own payoff, confesses, which entails that they both serve 5 years in jail. If, instead, players could coordinate their actions and not confess, they would only serve 1 year in jail. However, the individual incentives of each prisoner lead her to confess, both when her opponent confesses (as her sentence is reduced from 10 to 5 years) and when her opponent does not confess (as her sentence is reduced from 1 year to zero).
This game, hence, illustrates strategic scenarios in which there is tension between the individual incentives of each player and the collective interests of the group. In contexts that can be modeled like a Prisoner’s Dilemma game, players’ equilibrium behavior does not result in the socially optimal outcome (e.g., in example 12.5, the group’s payoff is maximized when no player confesses, and every player only serves 1 year in jail). Scenarios in which similar conflicts arise between individual and social incentives are common in economics, such as price wars between firms (both firms would be better off by setting high prices, but each firm has individual incentives to lower its own price to capture a larger market share); tariff wars between countries (where both countries would be better off by setting low tariffs, but each country has individual incentives to raise its own tariff to protect its domestic industry); or the use of negative campaigning in politics (where all candidates would be better off by not spending money on negative campaigning, but each candidate has incentives to spend some money on it to win the election). Example 12.6: Battle of the Sexes game Consider the following scenario: Ana and Felix are incommunicado in separate areas of the city. In the morning, they talked about where to go after work, the football game or the opera, but they never agreed on where to go. Every individual must simultaneously and independently choose whether to attend the football game (F) or the opera (O). As illustrated in matrix 12.10a, Felix prefers to attend the football game if Ana is also there, followed by attending the opera with Ana, followed by being at the football game without her, and finally by attending the opera without her. 
Ana’s payoffs are symmetric: her most preferred event is being at the opera with Felix, followed by being at the football game with him, followed by being at the opera without him, and finally by being at the football game without him.[13]

                          Ana
                   Football    Opera
Felix   Football   5, 4        3, 3
        Opera      2, 2        4, 5

Matrix 12.10a The Battle of the Sexes game.

Felix’s best responses. Following the previous approach to identifying best responses, we can find that Felix’s best responses are BRFelix(F) = F when Ana goes to the football game because 5 > 2 (see left column), and BRFelix(O) = O when Ana goes to the opera because 4 > 3 (see right column). Intuitively, Felix seeks to attend the same event as Ana does.

Ana’s best responses. Similarly, Ana’s best responses are BRAna(F) = F when Felix goes to the football game because 4 > 3 (see top row), and BRAna(O) = O when Felix goes to the opera because 5 > 2 (see bottom row). Hence, they both prefer being together to being separated, but each has a more preferred event. Matrix 12.10a becomes matrix 12.10b (with the best response payoffs underlined).

                          Ana
                   Football     Opera
Felix   Football   _5_, _4_     3, 3
        Opera      2, 2         _4_, _5_

Matrix 12.10b The Battle of the Sexes game–underlining best response payoffs (underlined payoffs are shown as _x_).

Therefore, we found two cells in which both players’ payoffs are underlined. These two cells constitute the two NEs in this game: (Football, Football) and (Opera, Opera).

[13] Alternatively, we can understand matrix 12.10a by looking at each strategy profile. When both players attend the football game, Felix’s payoff is 5, while Ana’s is 4. When they both attend the opera, these payoffs are switched: now Ana receives a payoff of 5 and Felix a payoff of 4. When players miscoordinate, however, their payoffs are lower: both receive a payoff of 3 when Felix goes to the football game while Ana is at the opera (each at their preferred event), but each receives a payoff of 2 when Felix goes to the opera while Ana is at the football game (each goes to their least preferred event).

Self-assessment 12.6 Consider again the Battle of the Sexes game from matrix 12.10a. However, let us now assume that Felix started to appreciate opera, changing the payoffs in the second row of matrix 12.10a to (3, 2) and (5, 5); that is, only Felix’s payoffs from going to the opera changed. Find the NE of the game, and compare your results against those in example 12.6. Interpret.

Example 12.7: Coordination game

Consider the game in matrix 12.11a, illustrating a “bank run” between depositors 1 and 2, with payoffs in thousands of dollars. News reports suggest that the bank where depositors 1 and 2 have their savings accounts could be in trouble, and each depositor must decide simultaneously and independently whether to withdraw all the money in her account or wait. If both wait, they both maintain their funds ($150); if both withdraw, the bank can offer cash for only a portion of their savings ($50); and if one depositor withdraws while the other waits, the bank can provide the former with funds for most of her savings ($100), while the waiting depositor is left with no money.

Depositor 1’s best responses. In this scenario, depositor 1’s best responses are BR1(W) = W when depositor 2 withdraws because 50 > 0 (in the left column), and BR1(NW) = NW when depositor 2 does not withdraw because 150 > 100 (in the right column). Similar to the Battle of the Sexes game, depositor 1 chooses the same strategy as her opponent.

Depositor 2’s best responses.
Because depositors’ payoffs are symmetric, depositor 2’s best responses are also BR2(W) = W when depositor 1 withdraws because 50 > 0 (in the top row), and BR2(NW) = NW when depositor 1 does not withdraw, given that 150 > 100 (in the bottom row). Matrix 12.11a becomes matrix 12.11b (with the best response payoffs underlined). As a result, the two NEs in this game are (Withdraw, Withdraw) and (Not withdraw, Not withdraw). Because players seek to coordinate by choosing the same strategy, either both withdrawing or both not doing so, this type of game is commonly known as a “Coordination game.”

                                    Depositor 2
                             Withdraw     Not withdraw
Depositor 1   Withdraw       50, 50       100, 0
              Not withdraw   0, 100       150, 150

Matrix 12.11a Coordination game.

                                    Depositor 2
                             Withdraw       Not withdraw
Depositor 1   Withdraw       _50_, _50_     100, 0
              Not withdraw   0, 100         _150_, _150_

Matrix 12.11b The Coordination game–underlining best response payoffs (underlined payoffs are shown as _x_).

Self-assessment 12.7 Consider again the Coordination game from matrix 12.11a. Let us now assume that the payoff that depositors obtain when they both withdraw their funds is lower: only (10, 10). Check if the two NEs found in example 12.7 still emerge in this scenario.

Example 12.8: Anticoordination game

Matrix 12.12a presents a game with strategic incentives opposite to those of the Coordination game in example 12.7. In particular, the matrix illustrates the Game of Chicken, as seen in movies like Rebel without a Cause and Footloose,[14] where two teenagers in cars drive toward each other (or toward a cliff). If both swerve, they avoid the accident, but both are regarded as “chicken” by their friends, yielding a negative payoff of −1 to each player; if only one player swerves, he is declared the chicken, obtaining a negative payoff of −10, while his friend (who stayed) is the top dog, getting a payoff of 10; finally, if both players stay, they crash in a serious car accident, yielding a payoff of −20 for both of them (they almost die!).
                          Player 2
                    Swerve       Stay
Player 1   Swerve   −1, −1       −10, 10
           Stay     10, −10      −20, −20

Matrix 12.12a Anticoordination game.

As usual, let us start by finding best responses in this setting.

Player 1’s best responses. Player 1 has BR1(Swerve) = Stay when player 2 Swerves (in the left column), because 10 > −1; and BR1(Stay) = Swerve when player 2 Stays (in the right column), because −10 > −20. Intuitively, player 1 chooses the opposite strategy of his opponent: when his opponent Swerves, player 1 becomes the top dog by Staying, whereas when his opponent Stays, player 1 avoids the accident by Swerving.

Player 2’s best responses. Because players’ payoffs are symmetric, player 2’s best responses are also BR2(Swerve) = Stay when player 1 Swerves (in the top row); and BR2(Stay) = Swerve when player 1 Stays (in the bottom row). Underlining best response payoffs, matrix 12.12a becomes matrix 12.12b.

                          Player 2
                    Swerve           Stay
Player 1   Swerve   −1, −1           _−10_, _10_
           Stay     _10_, _−10_      −20, −20

Matrix 12.12b Anticoordination game–underlining best response payoffs (underlined payoffs are shown as _x_).

As a result, the two NEs in this game are (Swerve, Stay) and (Stay, Swerve). Since every player seeks to anticoordinate by choosing the opposite strategy of her opponent, this type of game is known as an “Anticoordination game.”

[14] In the new version of Footloose, released in 2011, two teenagers drive school buses until one of them gives up and a similar strategic scenario to the one we consider here arises.

Self-assessment 12.8 Consider again the Anticoordination game from matrix 12.12a, but assume that all payoffs are doubled. Show that the two NEs found in example 12.8 still emerge in this scenario.

For more examples of games and details on how to predict equilibrium behavior, see Harrington (2014) and Muñoz-Garcia and Toro-Gonzalez (2019).
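The best responses reported in examples 12.5 through 12.8 can be reproduced with a short script. This is a sketch under stated assumptions rather than anything from the original text: the function names `best_responses` and `pure_ne`, the dictionary `games`, and the encoding (row player first in each payoff pair, strategies indexed from the top row and left column) are choices made for this illustration.

```python
def best_responses(payoffs, player):
    """List, for each rival strategy, the set of the player's best responses.

    player 0 chooses rows (first payoff); player 1 chooses columns (second payoff).
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    result = []
    if player == 0:
        for c in range(n_cols):  # fix the rival's column
            column = [payoffs[r][c][0] for r in range(n_rows)]
            result.append({r for r in range(n_rows) if column[r] == max(column)})
    else:
        for r in range(n_rows):  # fix the rival's row
            row = [payoffs[r][c][1] for c in range(n_cols)]
            result.append({c for c in range(n_cols) if row[c] == max(row)})
    return result

def pure_ne(payoffs):
    """Cells where each player's strategy is a best response to the other's."""
    br_row, br_col = best_responses(payoffs, 0), best_responses(payoffs, 1)
    return [
        (r, c)
        for r in range(len(payoffs))
        for c in range(len(payoffs[0]))
        if r in br_row[c] and c in br_col[r]
    ]

# Strategy labels (shared by both players) and payoff matrices from the text.
games = {
    "Prisoner's Dilemma": (["Confess", "Not confess"],
                           [[(-5, -5), (0, -10)], [(-10, 0), (-1, -1)]]),
    "Battle of the Sexes": (["Football", "Opera"],
                            [[(5, 4), (3, 3)], [(2, 2), (4, 5)]]),
    "Coordination": (["Withdraw", "Not withdraw"],
                     [[(50, 50), (100, 0)], [(0, 100), (150, 150)]]),
    "Anticoordination": (["Swerve", "Stay"],
                         [[(-1, -1), (-10, 10)], [(10, -10), (-20, -20)]]),
}

results = {
    name: [(labels[r], labels[c]) for r, c in pure_ne(payoffs)]
    for name, (labels, payoffs) in games.items()
}
for name, equilibria in results.items():
    print(name, equilibria)
```

In the Prisoner's Dilemma, `best_responses` returns the same single strategy against every rival choice, which is how a strictly dominant strategy shows up in this representation; in the other three games the best response changes with the rival's strategy, giving two psNEs each.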
12.6 Mixed-Strategy Nash Equilibrium

All games we have analyzed thus far in this chapter have had at least one NE (one NE in the Prisoner’s Dilemma game and two NEs in the Battle of the Sexes game, the Coordination game, and the Anticoordination game). A natural question at this point is whether all games have a NE. The answer is yes, under relatively general conditions. However, some games may not have a NE if we restrict players to choose a specific strategy 100 percent of the time, rather than allowing them to randomize across some of their available strategies. Example 12.9 illustrates such a possibility with penalty kicks in soccer, and then we examine how to find a NE when we allow players to randomize.

Example 12.9: Penalty kicks in soccer

Consider matrix 12.13a, representing a penalty kick in soccer. If the kicker aims left (right) and the goalie dives to the left (right, respectively), the kicker does not score, and both players’ payoffs are zero.[15] However, when the kicker aims left (right) and the goalie dives in the opposite direction (right and left, respectively), the kicker scores, yielding a negative payoff of −5 to the goalie and a positive payoff of 8 for the kicker.[16]

                             Kicker
                      Aim Left    Aim Right
Goalie   Dive Left    0, 0        −5, 8
         Dive Right   −5, 8       0, 0

Matrix 12.13a Anticoordination game.

                             Kicker
                      Aim Left     Aim Right
Goalie   Dive Left    _0_, 0       −5, _8_
         Dive Right   −5, _8_      _0_, 0

Matrix 12.13b Anticoordination game–underlining best response payoffs (underlined payoffs are shown as _x_).

No pure strategy NE. Finding best responses for the goalie, we obtain that BRG(L) = L when the kicker aims left (in the left column) because 0 > −5, and BRG(R) = R when the kicker aims right (in the right column) because 0 > −5. Intuitively, the goalie tries to move in the same direction as the kicker, aiming to prevent the latter from scoring.
In contrast, the kicker’s best responses are BRK(L) = R when the goalie dives left (in the top row) because 8 > 0, and BRK(R) = L when the goalie dives right (in the bottom row) because 8 > 0. Intuitively, the kicker seeks to aim to the opposite location of the goalie to score a goal. Hence, underlining best response payoffs, we obtain matrix 12.13b.

In summary, there is no cell where the payoffs for all players have been underlined, indicating that there is no mutual best response. As a consequence, there is no NE when we restrict players to use a specific strategy (either left or right) with 100 percent probability. If, instead, we allow players to randomize (e.g., playing left with some probability, such as 1/3, and right with the remaining probability of 2/3), we can find the NE of the game. Because players in such a scenario mix their strategies, this type of NE is known as a “mixed-strategy NE,” whereas those in which players use a specific strategy with 100 percent probability are referred to as a “pure-strategy NE.”

                                              Kicker
                                   Prob. q        Prob. 1 − q
                                   Aim Left       Aim Right
Goalie   Prob. p       Dive Left   0, 0           −5, 8
         Prob. 1 − p   Dive Right  −5, 8          0, 0

Matrix 12.13c Anticoordination game–including probabilities.

[15] For simplicity, we say that the goalie dives to the left, meaning to the same left as the kicker (rather than to the goalie’s left). A similar argument applies when the goalie dives to the right. Otherwise, it would be a bit harder to keep track of the fact that the goalie’s left corresponds to the kicker’s right, and vice versa.

[16] Similar payoffs would still produce our desired result of no NE when players are restricted to using a specific strategy with 100 percent probability. You can make small changes on the payoffs, and then find the best responses of each player again to confirm this point.

Allowing for randomization. Let us next consider that the goalie dives left, with probability p, and right, with the remaining probability 1 − p.
(For easier reference, matrix 12.13c includes these probabilities next to the corresponding row for the goalie.) If p = 1, the goalie would be diving left with 100 percent probability. Similarly, if p = 0, she dives right with 100 percent probability, whereas when p satisfies 0 < p < 1, the goalie randomizes her diving decision. Graphically, this randomization can be understood as choosing the top row of the matrix with probability p and the bottom row with probability 1 − p. Following a similar approach, let the kicker assign a probability q to aiming left (in the left column) and the remaining probability 1 − q to aiming right (in the right column). Matrix 12.13c also includes these probabilities on the top of each column for the kicker.

Goalie (row player). Once we have assigned probabilities to each row and column, we can make an important point about mixed strategies: if the goalie does not select a particular action with 100 percent probability, it must be that she is indifferent between her two options: dive left and dive right. That is, her expected utility from both options must coincide. To represent this conclusion mathematically, let’s first find the goalie’s expected utility from diving left:

$$EU_{Goalie}(Left) = \underbrace{q \cdot 0}_{\text{kicker aims left}} + \underbrace{(1 - q)(-5)}_{\text{kicker aims right}}.$$

This expression can be understood as follows: when the goalie dives left (in the top row), she does not know whether the kicker will aim left or right. If the kicker aims left, which occurs with probability q, she does not score, yielding a payoff of 0 for the goalie (see the first zero in the left column of the top row). If, instead, the kicker aims right (which happens with probability 1 − q), she scores, which produces a payoff of −5 for the goalie (see the right column in the top row). Simplifying this expression, we obtain that the goalie’s expected payoff from diving left is

$$EU_{Goalie}(Left) = -5 + 5q.$$
Similarly, the goalie’s expected utility from diving right is

$$EU_{Goalie}(Right) = \underbrace{q(-5)}_{\text{kicker aims left}} + \underbrace{(1 - q) \cdot 0}_{\text{kicker aims right}} = -5q.$$

In this case, the goalie dives right (in the bottom row of the matrix), and she either obtains a payoff of −5, which occurs when the kicker aims left and thus scores, or a payoff of 0, which happens when the kicker aims right and does not score.

As discussed previously, if the goalie is not playing a pure strategy (i.e., either choosing to dive left or right 100 percent of the time), she must be indifferent between diving left and right. We can express this indifference as follows:

$$EU_{Goalie}(Left) = EU_{Goalie}(Right).$$

Using these results, this expression is equivalent to

$$-5 + 5q = -5q,$$

which simplifies to 10q = 5, and solving for q yields q = 5/10 = 1/2. Therefore, the goalie is indifferent between diving left and right when the kicker aims left with 50 percent probability (because we found that q = 1/2).

Kicker (column player). Following a similar approach for the kicker, we first find her expected utility from aiming left:

$$EU_{Kicker}(Left) = \underbrace{p \cdot 0}_{\text{goalie dives left}} + \underbrace{(1 - p) \cdot 8}_{\text{goalie dives right}}.$$

In terms of the matrix, we fix our attention on the left column because the kicker aims left. Recall that the kicker’s payoff is uncertain because she does not know if the goalie will dive left, preventing the kicker from scoring (which yields a payoff of 0 for the kicker, in the top row of the matrix), or if the goalie will dive right, which entails a score and a payoff of 8 (in the bottom row of the matrix). Simplifying this expected utility, we find

$$EU_{Kicker}(Left) = 8 - 8p.$$

Turning now to the case in which the kicker aims right, we obtain an expected utility of

$$EU_{Kicker}(Right) = \underbrace{p \cdot 8}_{\text{goalie dives left}} + \underbrace{(1 - p) \cdot 0}_{\text{goalie dives right}} = 8p,$$

which entails the opposite payoffs from before because the kicker scores only when the goalie dives left, which occurs with probability p (the probability with which the goalie plays the top row).
As discussed previously, if the kicker randomizes, it must be that she is indifferent between aiming left and right, or more compactly,

$$EU_{Kicker}(Left) = EU_{Kicker}(Right),$$

which, using these results, entails

$$8 - 8p = 8p,$$

which simplifies to 8 = 16p. Solving for p, we obtain p = 8/16 = 1/2. Therefore, the kicker is indifferent between aiming left and right when the goalie dives left with 50 percent probability (because we found that p = 1/2).

We can then summarize that the only NE of this game has both players randomizing between right and left with 50 percent probability. That is, the mixed-strategy NE (msNE) is p = q = 1/2. Note that players do not need to randomize with the same probability. They only did it in this situation because payoffs are symmetric in matrix 12.13b.

Self-assessment 12.9 Consider again the penalty-kicks scenario from matrix 12.13a. Let us now assume, however, that the payoff that players obtain when the kicker scores a goal is (−2, 30) rather than (−5, 8). Intuitively, the kicker is really happy about winning the game, while the goalie is just a bit unhappy. All other payoffs are unaffected. Show that this game does not have a pure-strategy NE (psNE) either, and find the msNE. Compare the mixing probabilities that you find against p = q = 1/2 in example 12.9. Interpret your results.

Do all games have an msNE? Not necessarily. The Prisoner’s Dilemma game, for instance, has a psNE in which all players choose to confess. However, because players find confessing to be a strictly dominant strategy, they have no incentive to randomize their decision. In other games, such as the Battle of the Sexes game or the Coordination game, players do not have a strictly dominant strategy. In these cases, we found two psNEs; as an exercise, check that each game also has one msNE when we allow players to randomize.
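The two indifference conditions of example 12.9 can be solved in closed form for any 2x2 game. The sketch below is an illustration and not part of the original text: the function name `mixed_ne_2x2` and the payoff encoding are assumptions, and the function only looks for an equilibrium in which both players strictly mix (0 < p < 1 and 0 < q < 1), returning `None` otherwise. Here p is the probability on the top row and q the probability on the left column, as in matrix 12.13c.

```python
from fractions import Fraction

def mixed_ne_2x2(payoffs):
    """Solve both indifference conditions of a 2x2 game.

    payoffs[r][c] = (row player's payoff, column player's payoff).
    Returns (p, q) if both lie strictly between 0 and 1, else None.
    """
    (a11, b11), (a12, b12) = payoffs[0]
    (a21, b21), (a22, b22) = payoffs[1]
    # Row player indifferent between rows:
    #   q*a11 + (1-q)*a12 = q*a21 + (1-q)*a22  ->  solve for q
    denom_q = a11 - a12 - a21 + a22
    # Column player indifferent between columns:
    #   p*b11 + (1-p)*b21 = p*b12 + (1-p)*b22  ->  solve for p
    denom_p = b11 - b12 - b21 + b22
    if denom_q == 0 or denom_p == 0:
        return None  # an indifference condition has no unique solution
    q = Fraction(a22 - a12, denom_q)
    p = Fraction(b22 - b21, denom_p)
    if 0 < p < 1 and 0 < q < 1:
        return p, q
    return None

# Penalty kicks (matrix 12.13a): goalie chooses the row, kicker the column.
penalty_kicks = [[(0, 0), (-5, 8)], [(-5, 8), (0, 0)]]
# Prisoner's Dilemma (matrix 12.9a), where Confess is strictly dominant.
prisoners_dilemma = [[(-5, -5), (0, -10)], [(-10, 0), (-1, -1)]]

print(mixed_ne_2x2(penalty_kicks))      # (Fraction(1, 2), Fraction(1, 2))
print(mixed_ne_2x2(prisoners_dilemma))  # None
```

For the penalty kicks game, the first condition reduces to −5 + 5q = −5q and the second to 8 − 8p = 8p, matching the derivation above. In the Prisoner's Dilemma the solved probabilities fall outside the unit interval, consistent with the observation that players with a strictly dominant strategy have no incentive to randomize.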
Lastly, note that the penalty kicks example illustrated that all games must have at least one NE, either in pure or mixed strategies (i.e., either a psNE or an msNE).[17]

12.6.1 Graphical Representation of Best Responses

In this section, we describe how to graphically represent the best response of each player and its interpretation. For presentation purposes, consider the goalie and kicker in example 12.9. We separately examine the goalie and kicker’s best responses next.

Goalie. From the previous analysis, the goalie chooses to dive left if her expected utility from diving left is higher than from diving right; that is, EUGoalie(Left) > EUGoalie(Right), which can be expressed as −5 + 5q > −5q, ultimately simplifying to q > 1/2. Intuitively, when the probability that the kicker aims left (as measured by q) is sufficiently high (in this case, q > 1/2), the goalie responds by diving left, so she increases her chances of blocking the ball. Mathematically, this means that, for all q > 1/2, the goalie chooses to dive left (i.e., p = 1). In contrast, for all q < 1/2, the goalie responds by diving right (i.e., p = 0). Figure 12.2a depicts this best response function.[18]

Kicker. A similar argument applies to the kicker. From our analysis in example 12.9, we know that she aims left if EUKicker(Left) > EUKicker(Right), which entails 8 − 8p > 8p, or, after simplifying, p < 1/2. Intuitively, when the goalie is likely diving right (as captured by p < 1/2), the kicker aims left, increasing her chances of scoring. Mathematically, we can write this result by saying that, for all p < 1/2, the kicker aims left (i.e., q = 1), while for all p > 1/2, the kicker aims right (i.e., q = 0). Figure 12.2b illustrates the kicker’s best response function.[19]

[Figure 12.2a Goalie’s best responses. Horizontal axis: q; vertical axis: p. For all q > 1/2, the goalie chooses p = 1 (dive left); for all q < 1/2, the goalie chooses p = 0 (dive right).]

[Figure 12.2b Kicker’s best responses. Horizontal axis: q; vertical axis: p. For all p > 1/2, the kicker chooses q = 0 (aim right); for all p < 1/2, the kicker chooses q = 1 (aim left).]

[17] This result holds so long as a game is not extremely “strange,” in the sense that players’ payoffs are discontinuous at several points. All games considered in this book, even the most complicated you may envision, have at least one NE.

[18] Graphically, condition “for all q > 1/2” means that we are on the right side of figure 12.2a. For these points, the goalie’s best response says that p = 1 at the top horizontal line of the graph. Similarly, condition “for all q < 1/2” indicates that we look at the left side of the graph. For these points, the goalie’s best response of setting p = 0 is at the bottom horizontal line on the left of the graph (the overlapping part of the horizontal axis).

[19] Graphically, condition “for all p < 1/2” means that we are on the bottom half of figure 12.2b. For these points, the kicker’s best response says that q = 1 at the vertical line on the right side of the graph. Similarly, condition “for all p > 1/2” indicates that we look at the top half of the graph. For these points, the kicker’s best response is to set q = 0, at the vertical line on the left of the graph (on top of the vertical axis).

Putting together goalie’s and kicker’s responses. Figure 12.3 superimposes the goalie’s and the kicker’s best response functions, which we can do because we used the same axes in figures 12.2a and 12.2b. The players’ best responses only cross each other at one point in the graph, where p = q = 1/2, as predicted in example 12.9. Graphically, the fact that both players’ best responses cross means that both are using their best responses or, in other words, that the strategy profile is a mutual best response, as required by the definition of NE.
For this example, the crossing point p = q = 1/2 is the only NE of the game, an msNE, which has both players mixing. If the game had one or more psNEs, the best response functions should cross at some point on the vertices of the unit square of figure 12.3. Generally, if the game you analyze has more than one NE, the best responses you depict should cross at more than one point; namely, one point in the (p, q)-quadrant of figure 12.3 for each psNE that you find and, similarly, one for each msNE that you obtain.

[Figure 12.3 Both players’ best responses. Horizontal axis: q; vertical axis: p. The goalie’s and the kicker’s best responses cross at the msNE, where p = 1/2 and q = 1/2.]

Self-assessment 12.10 Consider again the Anticoordination game in matrix 12.12a. While we found two psNEs in that game, we can still find one msNE. Repeat the analysis in example 12.9 to find the msNE of the Anticoordination game, and depict the best responses for each player. Show that the best responses cross at three points: (1) at (p, q) = (0, 1) at the bottom-right corner of the graph, which corresponds to the psNE (Stay, Swerve); (2) at (p, q) = (1, 0) at the top-left corner of the graph, corresponding to the psNE (Swerve, Stay); and (3) at an interior point where both p and q are strictly between 0 and 1, illustrating the msNE of the game.

Exercises

1. Strict dominance [A]. Apply IDSDS to the game shown here:

                     Player 2
                L       C       R
Player 1   U    1, 1    3, 2    5, 3
           M    2, 3    4, 4    6, 5
           D    3, 5    5, 6    7, 5

2. IDSDS and deletion order [A]. Consider the game in exercise 12.1.
(a) Start deleting strictly dominated strategies for player 1. What is the equilibrium that you find after applying IDSDS?
(b) Start deleting strictly dominated strategies for player 2. What is the equilibrium you find after applying IDSDS?
(c) Compare the results of parts (a) and (b). What do they imply about the ordering of IDSDS?

3.
Strict dominance–some bite [B]. Apply IDSDS to the game shown here:

                         Player 2
                W       X       Y       Z
Player 1   A    1, 1    3, 2    5, 3    3, 3
           B    2, 3    4, 4    6, 5    1, 3
           C    3, 5    5, 6    7, 5    4, 3
           D    2, 6    6, 4    4, 2    3, 2

4. Strict dominance–no bite [B]. Apply IDSDS to the game shown here:

                  Player 2
                L       R
Player 1   U    1, 6    5, 7
           D    4, 1    5, 1

5. Prisoner’s Dilemma [B]. In the Prisoner’s Dilemma, why did each player elect to confess when each player would be better off had they both remained silent? Do we see similar behavior in the real world? Explain.

6. Mixed dominance [C]. Consider the following normal-form game:

                  Player 2
                L       R
Player 1   U    2, 1    2, 3
           M    1, 3    4, 2
           D    4, 1    1, 4

(a) Apply IDSDS to this game.
(b) Suppose that player 1 chooses to randomize between selecting M and D with probability of 0.5 each. Show that player 1’s expected payoff from this randomization is strictly higher than that of strategy U.

7. Weak dominance [A]. Consider the game presented in exercise 12.4.
(a) Identify any weakly dominated strategies.
(b) Does IDWDS provide a unique solution to this game?

8. Weak dominance–deletion order matters [B]. Consider the following payoff matrix:

                  Player 2
                L       R
Player 1   U    1, 1    1, 4
           D    3, 2    1, 2

Let us show that IDWDS does not necessarily provide the same equilibrium result, regardless of which player we start with.
(a) Start deleting weakly dominated strategies for player 1, and then player 2, and so on. Find the equilibrium prediction after applying IDWDS.
(b) Start deleting weakly dominated strategies for player 2, and then player 1, and so on. Find the equilibrium prediction after applying IDWDS.
(c) Do your equilibrium predictions in parts (a) and (b) coincide?

9. Weak dominance–III [B]. In several game shows across the US and UK, two players work together to build up a cash prize of size M for the end of the show. At the end of the show, each player must simultaneously choose whether to “Split” or to “Steal” the cash prize.
• If both players choose “Split,” each player leaves the show with half the cash prize, M/2.
• If one player chooses “Split” while the other player chooses “Steal,” the player who chooses “Split” receives none of the cash prize, while the player who chooses “Steal” receives the whole cash prize.
• If both players choose “Steal,” they both receive none of the cash prize.
(a) Depict the normal-form representation of this game.
(b) Identify any strictly or weakly dominated strategies in this game.
(c) What is your equilibrium prediction after applying IDSDS?
(d) What is your equilibrium prediction after applying IDWDS?

10. Nash equilibrium in a 2x2 matrix.^A Find all psNEs in the game in the following payoff matrix:

                    Player 2
                    L        R
    Player 1   U    6, 4     −2, 1
               D    5, 2     2, 3

11. Nash equilibrium in a 3x3 matrix.^B Find all psNEs in the normal-form representation of the game shown here:

                    Player 2
                    L        C          R
    Player 1   U    5, 0     1, 9       5, 8
               M    0, 4     −9, −4     8, 0
               D    1, 6     8, 9       1, 6

12. Nash equilibrium: several equilibria.^B Find all psNEs in the game shown here:

                    Player 2
                    L        R
    Player 1   U    6, 4     2, 3
               D    6, 4     2, 3

13. Nash equilibrium in the Split–Steal game.^B Find all psNEs in the game presented in exercise 12.9. Is there ever a reason for a player to choose “Split”?

14. Three-player games.^B Find all psNEs in the normal-form representation of the game shown here. Player 3 acts as the matrix player.

    Player 3: A
                    Player 2
                    L            R
    Player 1   U    4, 2, 6      4, 6, 9
               D    3, 10, 6     5, 5, 3

    Player 3: B
                    Player 2
                    L            R
    Player 1   U    3, 5, 7      5, 1, 6
               D    7, 3, 10     8, 6, 6

15. Rock, Paper, Scissors–I.^A Suppose that you and a friend engaged in the classic game of rock, paper, scissors. In this game, both players simultaneously choose among “Rock,” “Paper,” and “Scissors,” where “Paper” defeats “Rock,” “Scissors” defeats “Paper,” and “Rock” defeats “Scissors.” Suppose that whoever wins this game receives $1 from the loser.
(a) Depict the normal-form representation of this game.
(b) Describe each player’s best response.
(c) Is there a psNE of this game? If so, what is it? If not, why not?

16. Competition.^B Suppose that two firms are determining how to price their products against one another. They each simultaneously choose whether to price high or low with the following results:
• If they both price high, each firm brings in a profit level of B.
• If one firm prices high while the other prices low, the high pricing firm receives a profit level of D, while the low pricing firm receives a profit level of A.
• If they both price low, each firm brings in a profit level of C.
Assume that A > B > C > D.
(a) Depict the normal-form representation of this game.
(b) Describe each player’s best response.
(c) Is there a psNE of this game? If so, what is it? If not, why not?

17. Charitable contributions.^B Suppose that two wealthy donors are considering making a charitable contribution to a public project. This project is costly, and it only reaps benefits if both donors contribute. To contribute (C), a donor must pay a cost of $1, and the project is worth $3 to each donor if both donors contribute, and 0 otherwise. Not contributing (NC) costs nothing.
(a) Depict the normal-form representation of this game.
(b) Describe each donor’s best response.
(c) Is there a psNE of this game? If so, what is it? If not, why not?

18. Finding msNE–Battle of the Sexes.^B Find the msNE in the Battle of the Sexes game described in this chapter.

19. Finding msNE–Coordination game.^B Find the msNE in the Coordination game described in this chapter.

20. Rock, Paper, Scissors–II.^B Find the msNE in the Rock, Paper, Scissors game given in exercise 12.15.

21. Penalty kicks.^A Consider the penalty kicks scenario in example 12.9. Suppose that the goalie received information with certainty that the kicker was going to aim right for this shot.
(a) Should he continue to randomly dive? Why or why not?
(b) Suppose now that the kicker knew that the goalie had this information.
What should the kicker do?

22. Scaling payoffs.^B Consider the results of exercise 12.10. Suppose now that all the payoffs for both players were doubled. Would that change the results for either the pure or mixed-strategy NE? Explain why or why not.

13 Sequential and Repeated Games

13.1 Introduction

Chapter 12 analyzed strategic scenarios in which all players act at the same time (simultaneous-move games). In many contexts, however, one player has the ability to select her action first (i.e., she is the first mover), while other agents respond to her moves. For instance, an industry leader may choose the price of its product (or its production level) before other firms get to choose theirs. We refer to these scenarios as “sequential-move games” or “sequential games.”

While we can predict how players behave by deploying the notion of Nash equilibrium (NE) learned in the previous chapter, we show that this solution concept provides us with too many equilibria when the game we analyze is sequential, and that some NEs can be based on incredible beliefs about players’ future moves. Given these problems, we present a more common tool used to understand players’ equilibrium behavior in sequential games, “subgame perfect equilibrium (SPE)” or “rollback equilibrium,” which essentially starts by analyzing the optimal action by the last mover in the game. Once we know how the last mover will behave, we move to the previous-to-last mover, who anticipates how the last mover will behave. Intuitively, the previous-to-last mover puts herself in the shoes of the last mover, understands her motives, and then forecasts how the latter will behave. Understanding this optimal response by the last mover, the previous-to-last mover can maximize her payoff fully by anticipating how the game ensues after each of her own moves. We can then repeat a similar process, moving one more step closer to the first mover, again and again until we finally reach her.
In the second part of the chapter, we apply SPE to repeated games (i.e., scenarios in which players interact with one another several times). For presentation purposes, we return to the standard Prisoner’s Dilemma game discussed in the previous chapter, where we know that players do not cooperate when playing the game only once. We then ask whether repeating the game twice can help players cooperate with one another. Our answer, however, is No.

To understand this result, consider the twice-repeated Prisoner’s Dilemma game. In round 2, players can anticipate that they will both defect. Importantly, this behavior would be unaffected by their previous behavior during the first round of play (i.e., players will defect in round 2 regardless of whether they both cooperated or defected in round 1). Expecting this behavior, the game that players face in round 1 becomes independent from that in round 2, leading players to behave as they do in the unrepeated version of the game explored in chapter 12. A similar argument applies when we repeat the game three times or, more generally, a finite number of times, such as 5 or 200 times, because players can anticipate their mutual defection in the last round of play, regardless of previous history, and roll back from that until their first round of interaction.

You may wonder: “Wow, this is grim news because we don’t seem to achieve cooperation even if we repeat the game hundreds of times!” Well, we have some good news for you: If we repeat the game an infinite number of times, cooperation can be supported as the SPE of the game if players care enough about their future payoffs. We discuss how to identify the conditions under which this type of cooperation emerges in equilibrium, and then we provide several examples that illustrate how to approach similar problems.
13.2 Game Trees

The games analyzed so far in this book have assumed that players chose their strategies simultaneously or, alternatively, that the time difference between one player’s choices and her opponent’s is small enough to be modeled as if players acted at the same time.^1 In some real-world scenarios, however, players may act sequentially, with one player choosing her strategy first and another player responding with his strategy choice days or even months later.

1. This principle may apply to games such as the Rock-Paper-Scissors game, or penalty kicks between a kicker and a goalie. Alternatively, simultaneous-move games can be used to model scenarios where two players act sequentially, but the follower does not observe the leader’s actions before choosing her own. For instance, an industry leader may choose its technology first, while the firm acting as the follower does not observe the leader’s decision before selecting its own technology.

The game tree in figure 13.1a provides an example, where a potential entrant first chooses whether to enter an industry in which an incumbent firm operates as a monopolist. (Because the entrant is the first mover, its choice is labeled at the “root” of the tree on the left side of the game.) If it does not enter, the game is over, yielding a payoff of zero for the entrant, but a monopoly payoff of 10 for the incumbent. If the entrant enters, however, the incumbent is called on to move, as indicated by the node labeled “Incumbent” at the top of the game tree. At this point, the incumbent chooses whether to accommodate the entry, which entails a payoff of 4 for each firm, or start a price war, which generates a payoff of −2 (i.e., loss) for both firms.

[Figure 13.1a. Entry game. The potential entrant chooses In or Out; Out yields (0, 10). After In, the incumbent chooses Accommodate Entry, yielding (4, 4), or Price War, yielding (−2, −2). In each pair, the first payoff is for the entrant (first mover) and the second for the incumbent (second mover).]

Figure 13.1b offers an example of a game where one of the players (firm 2) does not observe the moves of its opponent in previous stages. In particular, firm 1 chooses to invest or not in the first period. If it does not invest, the game is over, but if it invests, firm 1 chooses whether to set a high or a low price. Firm 2, without observing whether firm 1 chose a high or a low price (but knowing that it invested), responds with a high price (H′) or a low price (L′). The dotted line connecting firm 2’s two nodes in the upper part of the tree is referred to as an “information set” because this player does not know at which node it gets to play. Firm 2’s available actions at the information set have the same labels. Indeed, because firm 2 cannot condition its response on firm 1’s price, it must be that firm 2 responds with a high (H′) or a low price (L′).^2

[Figure 13.1b. Sequential-move game with an information set. Firm 1 chooses Invest or Does not invest; Does not invest yields (2, 6). After Invest, firm 1 chooses High price or Low price, and firm 2 responds with H′ or L′ at its information set. Payoffs: (High price, H′) = (4, 4), (High price, L′) = (1, 6), (Low price, H′) = (4, 1), (Low price, L′) = (2, 2).]

2. If, instead, the labels of firm 2’s actions were HH and LH in the top node (after firm 1 chooses a high price) and HL and LL in the bottom node (after firm 1 selects a low price), firm 2 would know at which node it is called on to move just by looking at its available actions (either HH and LH, or HL and LL), implying that firm 2 would not be uninformed about which action firm 1 chose in previous stages.

13.3 Why Don’t We Just Find the Nash Equilibrium of the Game Tree?

A natural question at this point is: “Well, we learned how to find equilibria in simultaneous-move games by using the NE solution concept. Why not apply NE to sequential-move games?” NE can indeed help us in identifying equilibrium behavior in a game tree that depicts players’ sequential moves but, as we next illustrate, the NE provides us with several equilibria. Most important, some of them may be illogical in a context where players act sequentially.
Example 13.1: Applying NE to the Entry game

Let us consider the Entry game in figure 13.1a again. To find the NEs in this game tree, we first need to represent the game in its matrix form (see matrix 13.1). The potential entrant has only two available strategies, In and Out, and thus the matrix has two columns. Similarly, the incumbent has two strategies at its disposal: Accommodate or Price war. (All payoffs are in millions of dollars.) We can now underline the best response payoffs, as we did in the previous chapter, to label the NEs of the game.

                                Potential entrant
                                In           Out
    Incumbent   Accommodate     4, 4         10, 0
                Price war       −2, −2       10, 0

Matrix 13.1 The Entry game in matrix form.

Incumbent’s best responses. For the incumbent, we find that its best response to In (in the left column of matrix 13.1) is BR_inc(In) = Acc because 4 > −2, while its best response to Out is BR_inc(Out) = {Acc, War} because both yield a profit of 10. To illustrate our results, matrix 13.2 reproduces matrix 13.1, with the best response payoffs underlined.

                                Potential entrant
                                In           Out
    Incumbent   Accommodate     _4_, _4_     _10_, 0
                Price war       −2, −2       _10_, _0_

Matrix 13.2 Finding NEs in the Entry game (underlined payoffs are shown between underscores).

The entrant’s best responses. For the potential entrant (the column player), we find that its best responses to each of the incumbent’s strategies (i.e., for each row of the matrix) are BR_ent(Acc) = In, because 4 > 0, and BR_ent(War) = Out, because 0 > −2. Intuitively, if the entrant believes that the incumbent will choose to accommodate after its entry, then it should enter, obtaining a profit of 4, rather than staying out, with a payoff of 0. In contrast, if the entrant believes that the incumbent will respond with a price war to its entry, it should remain outside the industry; otherwise, its profit from entering and facing a price war is negative!

Overall, this analysis found two cells in which we underlined both players’ payoffs as being best responses.
That is, we found two strategy profiles where players choose mutual best responses to each other’s strategies (two NEs of the game): (Acc, In) and (War, Out). As discussed previously, in the first NE, entry occurs and the incumbent follows by accommodating, whereas in the second NE, the entrant does not enter because it anticipates a price war.

Do you notice something fishy about the second NE, something that doesn’t sound right? You should! In the strategy profile (War, Out), the incumbent is located in the upper part of the game tree. In this position, the incumbent must take entry as given and, taking that into account, choose the response that maximizes its profit. In particular, its payoff from accommodating entry (4) is larger than that from starting a price war (−2). In other words, once the entrant is in, the best option that the incumbent has is to accommodate entry. As a result, the entrant’s belief that a price war will ensue upon entry is not sequentially rational, in the sense that the entrant should put itself in the incumbent’s shoes to better anticipate its subsequent moves if the entrant were to join the industry. Alternatively, the incumbent’s threat to start a price war upon entry in strategy profile (War, Out) is noncredible, because the entrant can anticipate that, upon entry, the incumbent prefers to avoid a price war.

Self-assessment 13.1 Repeat the analysis in example 13.1, but assume that when the potential entrant joins the industry and the incumbent responds by accommodating, both firms earn a payoff of only 1. Are the results in example 13.1 affected? Interpret.

In the following section, we present a new solution concept, subgame-perfect equilibrium (SPE), which identifies only those NEs that are sequentially rational (i.e., those that are not based on incredible beliefs).
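The underlining procedure of example 13.1 can also be checked mechanically. The following Python lines are a minimal sketch rather than part of the original text: the names `payoffs` and `pure_nash` are hypothetical, and payoffs are listed as (incumbent, entrant), following matrix 13.1.

```python
# Payoffs (incumbent, entrant) for each cell of matrix 13.1.
payoffs = {
    ("Acc", "In"): (4, 4),
    ("Acc", "Out"): (10, 0),
    ("War", "In"): (-2, -2),
    ("War", "Out"): (10, 0),
}
rows = ["Acc", "War"]   # incumbent's strategies (row player)
cols = ["In", "Out"]    # entrant's strategies (column player)


def pure_nash(payoffs, rows, cols):
    """Return the cells in which both payoffs would be underlined."""
    equilibria = []
    for r in rows:
        for c in cols:
            u_row, u_col = payoffs[(r, c)]
            # Row player: no other row does strictly better in column c.
            row_best = all(u_row >= payoffs[(r2, c)][0] for r2 in rows)
            # Column player: no other column does strictly better in row r.
            col_best = all(u_col >= payoffs[(r, c2)][1] for c2 in cols)
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria


entry_nes = pure_nash(payoffs, rows, cols)
print(entry_nes)  # [('Acc', 'In'), ('War', 'Out')]
```

The sketch recovers both NEs, including (War, Out). Nothing in a best-response check on the matrix form detects that the price-war threat is noncredible, which is why a different solution concept is needed.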
13.4 Subgame-Perfect Equilibrium

To predict how players behave in these sequential contexts, we apply the solution concept of backward induction, illustrated in tool 13.1.

Tool 13.1. Applying backward induction
1. Go to the farthest right side of the game tree (where the game ends), and focus on the last mover.
2. Find the strategy that yields the highest payoff for the last mover.
3. Shade the branch that you found to yield the highest payoff for the last mover.
4. Go to the next-to-last mover and, following the response of the last mover that you found in step 3, find the strategy that maximizes her payoff.
5. Shade the branch that you found to yield the highest payoff for the next-to-last mover.
6. Repeat steps 4–5 for the player acting before the previous-to-the-last mover, and then for each player acting before her, until you reach the first mover at the root of the game.

Example 13.2: Backward induction in the Entry game

To apply backward induction, we first focus on the last mover, the incumbent. Comparing its payoff from accommodating entry, 4, and starting a price war, −2, we find that its best response to entry is to accommodate. Following the steps in tool 13.1, we shade the branch corresponding to Accommodate in figure 13.2.

[Figure 13.2. Applying backward induction in the Entry game. Same tree as figure 13.1a, with payoffs (4, 4) after Accommodate, (−2, −2) after Price War, and (0, 10) after Out.]

We now move to the player acting before the incumbent, which in this example is the first mover. The entrant can anticipate the incumbent’s subsequent choices if it were to enter. As a consequence, the entrant can expect that, if it chooses to enter, the incumbent will respond with accommodation because such a strategy yields a higher payoff for the incumbent than the price war.
Graphically, the entrant can understand that, if it enters, the game will proceed through the shaded branch of accommodation, ultimately yielding a payoff of 4 from entering. If, instead, the entrant stays out, its payoff is only 0 and, therefore, the optimal strategy for the entrant is to enter. We then say that the SPE after applying backward induction is {Enter, Accommodate}, which indicates that the first mover (entrant) chooses to enter, and the second mover (incumbent) responds by accommodating, entailing equilibrium payoffs of (4, 4).

Self-assessment 13.2 Repeat the analysis in example 13.2, but assume that when the potential entrant joins the industry and the incumbent responds by accommodating, both firms earn a payoff of only 1. Are the results in example 13.2 affected? Interpret.

The equilibrium that results from applying backward induction is also referred to as “rollback equilibrium” because the backward induction procedure looks like rolling back the game tree from its branches on the right side to the game’s root on the left side; or SPE, because backward induction helps us find the equilibrium strategy of each player when she is called on to move at any point along the tree.

13.4.1 Subgame Perfect Equilibrium in More Involved Games

In this section, we explore how to apply backward induction, and thus find SPEs, in games where at least one player faces an information set, meaning that she does not observe the moves from a previous player before she is called on to move. To apply backward induction in this scenario, we first need to define what we mean by a “subgame” in a game tree.

Subgame. A portion of the game tree that can be circled without breaking any information set.

Intuitively, the circle of a subgame indicates a portion of the game in which a player (e.g., the second mover) is called on to move, and it takes into account the information that this player observes at that specific point of the game tree.
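When a tree has no information sets, every decision node starts a subgame, so solving the subgames from the end of the tree back to its root is exactly the procedure in tool 13.1. The Python sketch below illustrates this for the Entry game. It is only an illustration under stated assumptions: the nested-tuple encoding of the tree and the names `entry_game` and `backward_induction` are hypothetical, and payoffs are ordered (entrant, incumbent), as in figure 13.1a.

```python
# A decision node is (player_index, {action: child});
# a terminal node is a tuple of payoffs (entrant, incumbent).
entry_game = (0, {                      # player 0: potential entrant
    "In": (1, {                         # player 1: incumbent
        "Accommodate": (4, 4),
        "Price War": (-2, -2),
    }),
    "Out": (0, 10),
})


def backward_induction(node, label="root", plan=None):
    """Roll back the tree; return (payoffs, plan).

    plan maps the label of each decision node (one per subgame here)
    to the action chosen there. Ties keep the first action listed.
    """
    if plan is None:
        plan = {}
    if not isinstance(node[1], dict):   # terminal node: payoffs only
        return node, plan
    player, branches = node
    best_action, best_payoffs = None, None
    for action, child in branches.items():
        child_payoffs, _ = backward_induction(child, f"{label}/{action}", plan)
        if best_payoffs is None or child_payoffs[player] > best_payoffs[player]:
            best_action, best_payoffs = action, child_payoffs
    plan[label] = best_action           # "shade" the optimal branch
    return best_payoffs, plan


spe_payoffs, plan = backward_induction(entry_game)
print(spe_payoffs)  # (4, 4)
print(plan)         # {'root/In': 'Accommodate', 'root': 'In'}
```

The dictionary `plan` has one entry per decision node, and the incumbent's node is solved before the root, mirroring steps 1 through 6 of tool 13.1. This encoding cannot represent information sets, which need the matrix-based treatment of a subgame.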
For instance, in the Entry game of figure 13.2, there are only two subgames: one in which we circle the part of the game tree starting at the incumbent’s node and ending at the end of the game, and another in which we circle the game as a whole. Other games with more involved trees may have more or fewer subgames, as example 13.3 illustrates.

Example 13.3: Applying backward induction in more involved game trees

Consider the game tree in figure 13.3, where firm 1 acts as the first mover, choosing either Up or Down. If firm 1 selects Down, the game is over, with firm 1 obtaining a payoff of 2, while firm 2 earns a payoff of 6. However, if firm 1 chooses Up, this firm gets to play again, choosing between A and B. Firm 2 is then asked to respond, but without seeing whether firm 1 chose A or B. Firm 2’s uncertainty is graphically represented by the dotted line connecting the end of the branches that it doesn’t distinguish, A and B. This dotted line is formally known as an “information set” for firm 2, because this firm doesn’t know which of these two actions was chosen by firm 1.^3

[Figure 13.3. A more involved game tree. Firm 1 chooses Up or Down; Down yields (2, 6). After Up, firm 1 chooses A or B, and firm 2 responds with X or Y at its information set. Payoffs: (A, X) = (3, 4), (A, Y) = (1, 4), (B, X) = (2, 1), (B, Y) = (2, 0).]

3. Firm 2 has the same available strategies when firm 1 chooses A and when it chooses B, i.e., firm 2 must select either X or Y in both cases. If, instead, firm 2 had to choose between X and Y when firm 1 chooses A, but between a different pair of actions, X′ and Y′, when firm 1 chooses B, firm 2 would be able to infer which action firm 1 selected by just observing its own available actions.

Before applying backward induction to this game, a usual trick is to find all the subgames (i.e., circling the portions of the tree that do not break any information set). Starting from the last mover (firm 2), the smallest subgame that we can circle is one initiated after firm 1 chooses Up, which is labeled “Subgame 1” in figure 13.4a. If we now move to the lower part of the game tree, note that we cannot circle any part of the tree without breaking firm 2’s information set. Circles that break firm 2’s information set are included in figure 13.4b as a reference. As a result, the only subgame that we can identify in this tree, besides subgame 1, is the game as a whole.

[Figure 13.4. (a) Proper subgames: subgame 1, initiated after firm 1 chooses Up, and the game as a whole. (b) Not proper subgames: circles that break firm 2’s information set.]

You may be wondering: “These are nice circles, but why should we care about the subgames in a game tree?” The answer is simple: we can next apply backward induction by just focusing on the two subgames we found.

Subgame 1. Let us start by analyzing subgame 1. In this subgame, firm 2 does not observe which action firm 1 chose (either A or B).^4 Therefore, subgame 1 can be represented using matrix 13.3, with firm 1 in rows and firm 2 in columns.

                  Firm 2
                  X        Y
    Firm 1   A    3, 4     1, 4
             B    2, 1     2, 0

Matrix 13.3 Representing subgame 1 in matrix form.

4. Even if firm 2 acts a few hours (or days) after firm 1 chooses between A and B, firm 2 cannot condition its response (i.e., whether to respond with X or Y) on the specific action selected by firm 1. In this sense, firm 2 acts as if it were selecting its action at the same time as firm 1 chose its own, making the analysis of subgame 1 analogous to a simultaneous-move game.

We can now find the NE of subgame 1 by underlining best response payoffs, as discussed in tool 12.3 and section 12.4 of the previous chapter. Matrix 13.4 reproduces matrix 13.3, but it includes underlined best response payoffs.^5 As discussed in chapter 12, the cell in which both firms’ payoffs are underlined constitutes the NE of subgame 1, (A, X), with corresponding payoffs (3, 4).

                  Firm 2
                  X            Y
    Firm 1   A    _3_, _4_     1, _4_
             B    2, _1_       _2_, 0

Matrix 13.4 Finding the NE of subgame 1 (underlined payoffs are shown between underscores).

5. This is a good moment to practice best responses. Recall that the underlined payoffs in matrix 13.4 illustrate that firm 1’s best response to firm 2 selecting X (in the left column) is A because 3 > 2, and to firm 2 choosing Y (in the right column) is B, given that 2 > 1. Similarly, firm 2’s best response to firm 1 selecting A (in the top row) is {X, Y} because firm 2 receives a payoff of 4 in both X and Y, and its best response to firm 1 choosing B (in the bottom row) is X because 1 > 0.

The game as a whole. We can now study the game as a whole. Firm 1 must choose between Up and Down, anticipating that if it chooses Up, subgame 1 will start. From our previous analysis, firm 1 can anticipate equilibrium behavior in subsequent stages of the game; that is, the NE of subgame 1 is (A, X), while the payoffs are (3, 4). Firm 1 can then simplify its decision problem to the tree depicted in figure 13.5, where we insert the equilibrium payoffs from subgame 1, (3, 4), if firm 1 were to select Up.

[Figure 13.5. The reduced game tree from figure 13.4a. Firm 1 chooses Up, yielding (3, 4) from the NE (A, X) of subgame 1, or Down, yielding (2, 6).]

Therefore, firm 1 only needs to conduct the following payoff comparison: if it chooses Down, the game is over and its payoff is 2, whereas if it chooses Up, subgame 1 is initiated, obtaining a payoff of 3. Because 3 > 2, firm 1 prefers to choose Up rather than Down, as illustrated by the thick arrow on the branch corresponding to Up. Summarizing, after applying backward induction, the SPE of this game is (Up, (A, X)), which yields an equilibrium payoff of 3 for firm 1 and 4 for firm 2.
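The two-step logic of example 13.3 (find the NE of subgame 1, then solve the reduced tree) can be sketched in Python. This is an illustrative sketch only: the names `subgame1`, `down_payoffs`, and `pure_nash` are hypothetical, payoffs are ordered (firm 1, firm 2), and the code relies on subgame 1 having a single psNE, as it does in this example.

```python
# Subgame 1 in matrix form, payoffs (firm 1, firm 2), as in matrix 13.3.
subgame1 = {
    ("A", "X"): (3, 4), ("A", "Y"): (1, 4),
    ("B", "X"): (2, 1), ("B", "Y"): (2, 0),
}
down_payoffs = (2, 6)   # payoffs if firm 1 chooses Down at the root


def pure_nash(payoffs):
    """Cells where each firm's payoff is a best response to the other's action."""
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    return [
        (r, c)
        for r in rows
        for c in cols
        if all(payoffs[(r, c)][0] >= payoffs[(r2, c)][0] for r2 in rows)
        and all(payoffs[(r, c)][1] >= payoffs[(r, c2)][1] for c2 in cols)
    ]


# Step 1: solve subgame 1 as a simultaneous-move game.
subgame_nes = pure_nash(subgame1)
ne = subgame_nes[0]                     # unique psNE in this example
up_payoffs = subgame1[ne]

# Step 2: reduced tree. Firm 1 compares its payoff from Up and from Down.
first_move = "Up" if up_payoffs[0] > down_payoffs[0] else "Down"
spe = (first_move, ne)
spe_payoffs = up_payoffs if first_move == "Up" else down_payoffs

print(subgame_nes)       # [('A', 'X')]
print(spe, spe_payoffs)  # ('Up', ('A', 'X')) (3, 4)
```

If a subgame had several NEs, the second step would have to be repeated for each of them, since each NE of the subgame implies a different reduced tree.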
Self-assessment 13.3 Repeat the analysis in example 13.3, but assume that when firm 1 chooses Up and A, and firm 2 responds with X, their payoff becomes (5, 5) rather than (3, 4). Find the equilibrium of the game tree, and compare your result against that in example 13.3.

13.5 Repeated Games

Previous sections of this chapter have analyzed games where players interact only once. These games are also known as “one-shot games” or “unrepeated games,” and they can help us model strategic scenarios in which players do not anticipate interacting again, such as two randomly picked individuals in a large city, like Seattle, who may not encounter each other again. In other settings, however, agents interact several times, as in a small town like Anatone, Washington, and so they face the same game repeatedly.

Repeated games are common in real life, such as Treasury bill auctions (some of which are organized monthly or weekly), price competition between the same group of firms operating in a given industry, and production decisions of countries participating in the Organization of the Petroleum Exporting Countries (OPEC) cartel. In these three examples, the set of players is unchanged from one week to the next (or is mostly unaffected), and the game they play is also unchanged (firms can choose among the same set of prices, and sustain similar technologies as in previous editions of the game).

An interesting feature of repeated games is that players’ interaction in a repeated scenario can help us rationalize cooperation in contexts where such cooperation could not be sustained if players interact only once. Consider the following Prisoner’s Dilemma game. As discussed in chapter 12 (section 12.5), the only NE of the game is (Confess, Confess), where both players confess, obtaining a payoff of −4 (that is, 4 years in jail).
As we highlighted, this outcome is inefficient, as players could be better off if they coordinated their actions; namely, if they both choose not to confess, each player’s payoff increases to −1 (serving only 1 year in jail). In this section, we explore whether such a cooperative outcome can be sustained when the game is repeated (i.e., when players interact many times, playing the game reproduced in matrix 13.5 in each round).

                              Player 2
                              Confess      Not confess
    Player 1   Confess        −4, −4       0, −7
               Not confess    −7, 0        −1, −1

Matrix 13.5 Another iteration of the Prisoner’s Dilemma game.

13.5.1 Finite Repetitions

Let us first consider that the game is repeated T periods, where T is a finite number (e.g., 2 times, or 500 times, but not an infinite number of times). In this scenario, every player chooses her action at stage t ∈ {1, 2, …, T}, and an outcome emerges for stage t, which is perfectly observed by both players; and then stage t + 1 starts, whereby every player chooses her action at that stage. In a nutshell, this is a sequential-move game because every player, when considering her move at stage t + 1, perfectly observes the past history of play by both players from stage 1 until t. Given this history, every player responds with her choice at stage t + 1. Fortunately, we know how to solve sequential-move games! As described in section 13.4, we can use backward induction to solve for the SPE of the game as follows:

Period T. Starting from the last round of play at t = T, we see that every player’s strictly dominant strategy is Confess (C), thus providing us with (C,C) as the NE of the last-stage game.

Period T − 1. In the next-to-last stage, t = T − 1, every player can anticipate that (C,C) will ensue if the game proceeds until stage t = T, and that both players will be choosing C regardless of the outcome in stage T − 1. As a consequence, every player finds C a strictly dominant strategy once more.
Therefore, strategy profile (C,C) is, again, the NE of the stage game (in this case, for stage T − 1).

Period T − 2. A similar argument applies if we move one step up, to stage T − 2, where both players anticipate that (C,C) will be the equilibrium outcome in both subsequent stages T − 1 and T, and choose C at the current stage, thus yielding (C,C) as the NE outcome in this stage as well.

Continuing with this argument, we find that (C,C) is the NE of every stage t, from the beginning of the game, at t = 1, to the last stage, t = T. Therefore, the SPE of the game has every player choosing C at every round, regardless of the outcomes in previous rounds. Intuitively, the existence of a terminal period makes every individual anticipate that both players will defect during that period, and because the last stage outcome is unaffected by previous moves, players in prior stages find no benefit from not confessing. In the next section, we explore whether such an unfortunate result can be avoided by allowing the game to be repeated an infinite number of times.

13.5.2 Infinite Repetitions

Consider now an infinitely repeated Prisoner’s Dilemma game. You might wonder: “How are we going to have the prisoners playing forever? Tie them to their chairs at the police station?” This is operationalized by assuming that, at any given moment, players continue to play the game one more round with some probability p. Even if this probability is close to 1, the probability that players interact for a large number of rounds drops very rapidly.^6 However, the probability of reaching any given round is still positive, so the interaction has no predetermined last round.

6. For instance, if the probability of interacting with one another is p = 0.9, the probability that players interact for 10 rounds is $0.9^{10} \approx 0.35$, and the probability that they continue playing for 100 rounds is extremely small ($0.9^{100} \approx 0.00003$). Nonetheless, this probability is always positive.

As we know from previous sections, when the game is played once or a finite number of times, the only equilibrium prediction is (C,C) in every single round of play. How can we sustain cooperation if the game is played an infinite number of times? By the use of the so-called Grim-Trigger Strategy (GTS). A standard GTS works as follows:

1. In the first period of interaction, t = 1, every player starts by cooperating (playing Not confess, NC, in the Prisoner’s Dilemma game).
2. In all subsequent periods, t > 1,
(a) Every player continues to cooperate, so long as she observes that all players cooperated in all past periods.
(b) If, instead, she observes some past cheating in any previous round (deviating from this GTS), then she plays C thereafter.

To show that the GTS can be sustained as an SPE of the infinitely repeated game, we need to show that every player finds the GTS optimal at every time period t (i.e., at any period in which she wonders whether to continue with the implicit cooperative agreement that the GTS entails, both t = 1 and any t > 1). In addition, she must find the GTS optimal after any previous history of play, which in our case implies that it is optimal both (1) after no history of cheating, and (2) after some cheating episode. Let us separately analyze cases (1) and (2) next.

Example 13.4: Sustaining cooperation with a Grim-Trigger Strategy

Case (1). No cheating history. If no previous cheating occurs, the previously described GTS dictates that every player keeps cooperating in the next period, which yields a payoff of −1 for every player. Therefore, by sticking to the GTS, every player obtains the following stream of discounted payoffs:

$$-1 + \delta(-1) + \delta^2(-1) + \dots,$$

where δ ∈ (0, 1) represents her discount factor. Intuitively, δ indicates how much the individual cares about future payoffs.
When δ → 1, she assigns the same weight to future as to present payoffs; on the other hand, when δ → 0, she assigns no importance to future payoffs.7 Alternatively, a high discount factor δ can be interpreted as being that the individual cares similarly about current and future payoffs (she is patient), while a low discount factor can be understood as that she cares only about current payoffs, essentially ignoring future payoffs (she is impatient). Factoring out the −1 payoff yields −1 + δ(−1) + δ 2 (−1) + … = −1(1 + δ + δ 2 + …), 7. For simplicity, we assume that all players have the same discount factor. The results will not qualitatively change if we allow different discount factors for each player, and one of the end-of-chapter exercises asks you to revisit the infinitely repeated Prisoner’s Dilemma game, considering a discount factor δ1 for the row player and δ2 for the column player. Sequential and Repeated Games 343 1 which ultimately reduces to −1 1−δ because the term in parentheses, 1 + δ + δ 2 + …, 1 8 . If, instead, the is an infinite geometric progression that can be simplified to 1−δ player cheats today (playing C while her opponent plays NC), her payoff is 0.9 However, her defection is detected by the other players, who respond with C thereafter (recall that this is the punishment prescribed in step 2b of the GTS), which yields a payoff of −4 thereafter. As a result, her stream of discounted payoffs from cheating becomes + δ(−4) + δ 2 (−4) + …, 0 She cheats Punishment thereafter which simplifies to −4(δ + δ 2 + δ 3 + …) = −4δ(1 + δ + δ 2 + …) = −4 δ . 1−δ Therefore, after a history of no previous cheating, every player chooses to cooperate, 1 δ , rather than defect, receiving −4 1−δ , if obtaining −1 1−δ −1 1 δ −4 . 1−δ 1−δ A common trick to simplify this inequality is to multiply both sides by the denominator (1 − δ), which yields −1 −4δ, ultimately reducing the expression to 1 δ . 4 Case (2). Some cheating history. 
If some (or all) of the players cheat in a previous period t − 1, then the GTS prescribes that every player should play C thereafter, yielding a stream of discounted payoffs: −4 + δ(−4) + δ 2 (−4) + … = −4(1 + δ + δ 2 + …) 1 . = −4 1−δ 8. Indeed, recall that the infinite sum 1 + δ + δ 2 + … can be expressed as δ 0 + δ 1 + δ 2 + … or, more compactly, ∞ 1 . We make extensive use of this compact as δ t . This is an infinite geometric series that can be written as 1−δ t=0 expression in this chapter. 9. Note that, to check if the GTS is optimal for every player, we must maintain all other players selecting the GTS while she is the only player deviating. That is, we test for unilateral deviations. In this context, this means that all 344 Chapter 13 If, instead, a player deviates from such a punishment scheme (playing NC while her opponent chooses C), her stream of discounted payoffs becomes + δ(−4) + δ 2 (−4) + …. −7 She deviates Punishment thereafter Intuitively, her payoff is extremely low when she plays NC while her opponent confesses, −7, but then her cheating triggers an infinite punishment by all players, as prescribed by the GTS, yielding a payoff of −4 thereafter.10 This stream of payoffs reduces to −7 + (−4)(δ + δ 2 + δ 3 + …) = −7 − 4δ(1 + δ + δ 2 + …) δ = −7 − 4 . 1−δ Comparing these results, we can say that, upon observing a defection to C, every player prefers to stick to the GTS rather than deviating if −4 δ 1 −7 − 4 , 1−δ 1−δ which simplifies to δ 1, thus holding for all values of δ. This result is not surprising: if your opponent will play C thereafter, you don’t have any incentive to unilaterally deviate toward NC, even for one period. If you do, your payoff during the period or periods that you deviate will be lower than if you didn’t, and when you start playing C again, your payoff will be the same as that derived from playing C during all periods. Summary. 
Overall, we found that the only condition we require for cooperation to be sustained as an equilibrium of this infinitely repeated game (i.e., for the GTS to be SPE of the game) was found in Case 1 (namely δ 14 ). Therefore, this condition states that players cooperate every single round of the game, so long as they assign a sufficiently high weight to future payoffs. Figure 13.6 illustrates the trade-off that every player faces when, upon observing that no player defected in previous rounds, she must choose whether to continue cooperating or to other players keep cooperating (choosing NC, as prescribed by the GTS because there was no previous cheating), while the player that we consider here deviates to C. 10. The GTS then triggers an infinite punishment by all the players if any deviated from cooperating in any previous period. In this case, a deviation to C was detected in a prior period, and one of the players, rather than responding with C thereafter (as prescribed by the GTS), foolishly selects NC while her opponent chooses C, yielding outcome (NC,C) during that period. In all subsequent rounds, players observe that at least one player previously selected C, which triggers the infinite punishment again. Sequential and Repeated Games 345 Payoff Instantaneous gain from cheating 0 Payoff from cooperating –1 Future payoff loss from cheating –4 t t+1 t+2 ... Time Periods Figure 13.6 Incentives to cheat in repeated games. cheat, as analyzed in case 1. If a player cooperates, her payoff remains −1 in all subsequent periods. While this payoff looks good, there is another attractive option out there: if she cheats today, her payoff increases from −1 to 0 today. However, her defection is thereafter punished by her opponent, yielding a payoff drop from 0 to −4 in all subsequent periods. Graphically, the instantaneous gain from cheating today is represented by the left square, whereas the future loss from cheating is illustrated with the right rectangle. 
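The comparisons in example 13.4 can also be checked numerically. The following is a minimal sketch in Python, not part of the original text: the function and constant names are hypothetical, the stage payoffs (−1, 0, −4, −7) are those used in the example, and exact fractions are used so that the threshold δ = 1/4 is not blurred by rounding.

```python
from fractions import Fraction

# Stage payoffs taken from example 13.4 (names are illustrative assumptions).
COOPERATE = -1    # both players choose NC
TEMPTATION = 0    # a player chooses C while her opponent chooses NC
PUNISHMENT = -4   # both players choose C
SUCKER = -7       # a player chooses NC while her opponent chooses C


def stream_cooperate(delta):
    """Discounted payoff from cooperating forever: -1 / (1 - delta)."""
    return COOPERATE / (1 - delta)


def stream_cheat(delta):
    """Cheat once (payoff 0), then (C,C) forever: -4 * delta / (1 - delta)."""
    return TEMPTATION + delta * PUNISHMENT / (1 - delta)


def stream_punish(delta):
    """Play C forever after a defection: -4 / (1 - delta)."""
    return PUNISHMENT / (1 - delta)


def stream_deviate_from_punishment(delta):
    """Play NC once during the punishment (payoff -7), then (C,C) forever."""
    return SUCKER + delta * PUNISHMENT / (1 - delta)


def critical_delta():
    """Smallest delta satisfying case (1): (T - R) / (T - P)."""
    return Fraction(TEMPTATION - COOPERATE, TEMPTATION - PUNISHMENT)


def gts_is_spe(delta):
    """True when neither case (1) nor case (2) offers a profitable deviation."""
    case1 = stream_cooperate(delta) >= stream_cheat(delta)
    case2 = stream_punish(delta) >= stream_deviate_from_punishment(delta)
    return case1 and case2


if __name__ == "__main__":
    for delta in (Fraction(1, 10), Fraction(1, 4), Fraction(1, 2), Fraction(9, 10)):
        print(delta, stream_cooperate(delta), stream_cheat(delta), gts_is_spe(delta))
```

At δ = 1/4 the two streams in case (1) coincide, which is why the condition holds with equality at that value; for smaller δ the one-period gain from cheating outweighs the discounted punishment.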
Figure 13.6 also helps us understand in which contexts cooperation can more easily occur. For instance, if the instantaneous gain from cheating decreases, the incentives to cheat also decrease. This may occur when the payoff from cheating only increases from −1 to −0.5 (the shallow square on the graph), or because cheating is immediately detected rather than requiring several periods to be detected by other players (the narrow square).

Lastly, it is important to note that we can design variations of the Grim-Trigger Strategy that still help us sustain cooperation in the infinitely repeated game. A common variation is to consider a temporary reversion to the NE of the unrepeated game, (C,C), rather than the permanent reversion assumed previously. For instance, the GTS could prescribe that, upon cheating, every player chooses C during N rounds (e.g., 3 periods) but returns to cooperation once the punishment has been inflicted (i.e., after (C,C) has been played for N rounds). One of the end-of-chapter exercises asks you to revisit the Prisoner’s Dilemma game, finding under which conditions you can sustain cooperation under a GTS that temporarily punishes defections for only 3 periods. As you probably suspect, cooperation can be sustained only under more restrictive conditions on the discount factor δ (i.e., a higher minimal δ) when players punish defections temporarily than when they punish them permanently. Graphically, the future payoff loss from cheating (depicted in the right rectangle in figure 13.6) is narrower because the punishment phase lasts only 3 periods. Intuitively, a temporary punishment following a deviation becomes less threatening than a permanent punishment, thus making defection more attractive.

Self-assessment 13.4 Repeat the analysis in example 13.4 but use the following game: Suppose that two firms could choose to Cooperate or Compete with one another. If both firms choose Cooperate, they both receive a payoff of 5. If one firm chooses Cooperate but the other firm chooses Compete, the firm that chooses Cooperate receives a payoff of 0, while the firm that chooses Compete receives a payoff of 7. Lastly, if both firms choose Compete, they both receive a payoff of 1. For which minimal discount factor δ do firms choose Cooperate? What if, when one firm chooses Cooperate and the other firm chooses Compete, the firm that chooses Compete receives a payoff of 10 instead? Interpret.

13.6 A Look at Behavioral Economics: Cooperation in the Experimental Lab?

As suggested in chapter 12, the Prisoner’s Dilemma game clearly illustrates the tension between individual and group incentives often seen in real life. As a consequence, this game has been repeatedly tested in experimental labs over several decades, in its unrepeated and repeated versions, along with many variations in the experimental designs. Individuals participating in the experiment (often college students) are asked to sit at computer terminals where they are informed about the rules of the game, are allowed to ask questions, and can even practice in a trial run of the game before they start playing it on their computers.

In finitely repeated games (such as those repeated two or four times), experiments found that in the last round of interactions, players behave as if they were in an unrepeated (one-shot) game, but in the first rounds, players are more likely to cooperate. This behavior contradicts the theoretical prediction discussed in section 13.5.1, where players defect in all rounds of interaction when playing the finitely repeated Prisoner’s Dilemma game.

What about the infinitely repeated version of the game?
Because an infinitely repeated game cannot actually be played, individuals participating in the experiment were informed that they would play one more round of the game with some probability (e.g., p = 80 percent).¹¹ Overall, the literature found that, in this scenario, players are more likely to cooperate when there is a higher probability that they will interact in future rounds (e.g., p increases from 80 percent to 90 percent). This result is consistent with our previous findings of cooperation being easier to sustain when players care more about the future (i.e., higher probability p would play the same role as a higher discount factor δ in supporting cooperation).

However, when players interact during many rounds, they start defecting more frequently. Anticipating that they may not interact in the future (because the probability that they will encounter each other again declines rapidly), they try to reap the gains from a unilateral defection in one of the last rounds of play. For more details and references, see Duffy and Ochs (2009) and Dal Bó and Fréchette (2011).

11. As discussed in section 13.5.2, the probability that players interact in a future round declines rapidly. For instance, the probability that they interact for 10 periods is $0.8^{10} \approx 0.1$, while that of interacting for 50 periods decreases to $0.8^{50} \approx 0.00001$.

Exercises

1. Backward induction–I.A Find the SPE of the extensive-form game depicted in figure 13.7. Player 1’s payoff is the top number, whereas player 2’s payoff is the bottom number.

[Figure 13.7: Backward induction–I. Extensive-form game tree.]

2. Backward induction–II.A Find the SPE of the extensive-form game depicted in figure 13.8. Player 1’s payoff is the top number, player 2’s payoff is the middle number, and player 3’s payoff is the bottom number.

[Figure 13.8: Backward induction–II. Extensive-form game tree.]

3. Backward induction–III.B Find the SPE of the extensive-form game depicted in figure 13.9. Player 1’s payoff is the top number, whereas player 2’s payoff is the bottom number.

[Figure 13.9: Backward induction–III. Extensive-form game tree.]

4. Backward induction–IV.B Find the SPE of the extensive-form game depicted in figure 13.10. Player 1’s payoff is the top number, whereas player 2’s payoff is the bottom number.

[Figure 13.10: Backward induction–IV. Extensive-form game tree.]

5. Backward induction–V.B Find the SPE of the extensive-form game in figure 13.11. Player 1’s payoff is the top number, whereas player 2’s payoff is the bottom number.

[Figure 13.11: Backward induction–V. Extensive-form game tree.]

6. Prisoner’s Dilemma.A Consider the Prisoner’s Dilemma game in example 12.5 from chapter 12.
(a) Suppose now that player 1 chooses whether to stay silent or confess first, then player 2 observes player 1’s choice and responds by staying silent or confessing. Find the SPE of this game.
(b) Suppose now that player 2 chooses whether to stay silent or confess first, then player 1 observes player 2’s choice and responds by staying silent or confessing. Find the SPE of this game.
(c) Compare the results from parts (a) and (b), and the results from the simultaneous-move version of this game. If they are similar, explain why. If they are different, provide an explanation for any differences.

7. Battle of the Sexes game.A Consider the Battle of the Sexes game in example 12.6 from chapter 12.
(a) Suppose now that Felix chooses where he goes first, and then Ana observes Felix’s choice and decides where to go afterward. Find the SPE of this game.
(b) Suppose now that Ana chooses where she goes first, and then Felix observes Ana’s choice and decides where to go afterward. Find the SPE of this game.
(c) Compare the results from parts (a) and (b), and the results from the simultaneous-move version of this game. If they are similar, explain why. If they are different, provide an explanation for any differences.

8. Coordination game.A Consider the Coordination game in example 12.7 from chapter 12.
(a) Suppose now that depositor 1 chooses whether to withdraw or not withdraw first, and then depositor 2 observes depositor 1’s choice and decides whether to withdraw or not. Find the SPE of this game.
(b) Suppose now that depositor 2 chooses whether to withdraw or not withdraw first, and then depositor 1 observes depositor 2’s choice and decides whether to withdraw or not. Find the SPE of this game.
(c) Compare the results from parts (a) and (b), and the results from the simultaneous-move version of this game. If they are similar, explain why. If they are different, provide an explanation for any differences.

9. Anticoordination game.A Consider the Anticoordination game in example 12.8 from chapter 12.
(a) Suppose now that player 1 chooses whether to swerve or stay first, and then player 2 observes player 1’s choice and responds by staying or swerving. Find the SPE of this game.
(b) Suppose now that player 2 chooses whether to swerve or stay first, and then player 1 observes player 2’s choice and responds by staying or swerving. Find the SPE of this game.
(c) Compare the results from parts (a) and (b), and the results from the simultaneous-move version of this game. If they are similar, explain why. If they are different, provide an explanation for any differences.

10. Penalty kicks.B Consider the penalty kicks scenario in example 12.9 from chapter 12.
(a) Suppose now that the goalie chooses whether to dive left or dive right first, and then the kicker observes the goalie’s choice and responds by aiming left or right. Find the SPE of this game.
(b) Suppose now that the kicker chooses whether to aim left or aim right first, and then the goalie observes the kicker’s choice and responds by diving left or right. Find the SPE of this game.
(c) Compare the results from parts (a) and (b), and the results from the simultaneous-move version of this game. If they are similar, explain why. If they are different, provide an explanation for any differences.

11. Complementary pricing.B Consider a situation where two firms selling complementary goods are simultaneously deciding whether to price high or low. While not direct competitors in their respective markets, they know that if one firm prices high while the other prices low, that is not beneficial to either firm. The normal-form representation is presented here, where the payoffs (in dollars) denote the profits for each firm (firm 1’s profit is listed first):

                      Firm 2
                   High      Low
Firm 1    High    90, 75    35, 40
          Low     45, 30    60, 60

(a) Find all pure-strategy Nash equilibria of this game. Do the firms know how they should price?
(b) Suppose now that firm 1 sets its price first, and then firm 2 observes the price and responds. Depict the extensive form of this game and find the SPE.
(c) Suppose now that firm 2 sets its price first, and then firm 1 observes the price and responds. Depict the extensive form of this game and find the SPE.
(d) Compare the results from parts (a), (b), and (c). If they are similar, explain why. If they are different, provide an explanation for any differences.

12. Holdout game.B Consider a situation where two individuals (players 1 and 2) have neglected to purchase their tickets to the latest blockbuster movie in advance.
They both arrive at the movie theater at exactly the same time, and once they reach the front of the line, there is only one ticket left. The cashier decides that whoever waits out the other player will receive the ticket. Receiving the ticket is worth $100 to either player.¹² The game proceeds as follows:
• Player 1 chooses whether to remain in the line or leave. If she leaves the line, player 2 receives the ticket. If she remains in the line, both players pay a cost of $10 (they both also neglected to use the bathroom beforehand, and it’s getting uncomfortable), and then player 2 gets to act.
• Player 2 responds by choosing whether to remain in the line or leave. If she leaves the line, player 1 receives the ticket. If she remains in the line, both players pay a cost of $10, and then the game repeats.
• After both players have had two opportunities to remain or leave, the game ends, and a player is chosen randomly to receive the ticket (each with 50 percent probability). This corresponds to an expected payoff of $10 for each player.
(a) Depict the extensive form of this game.
(b) Find the SPE of this game.
(c) Suppose that player 2 knew that the cashier was friends with player 1, and that after two rounds of play, he would just choose to sell the ticket to his friend. Find the SPE of this game.

12. This game, which is also known as the “War of Attrition,” helps us understand firm exits in declining industries that exhibit enough demand to support only one firm. Each firm decides whether to stay or exit, suffering costs (e.g., losses) in each period. If only one firm remains, then demand is sufficient for this firm to earn an economic profit.

13. Centipede game.C Consider a game where two players take turns deciding whether to take a pile of money. At the start of the game, there are two piles of money, one “large” pile, with $5, and one “small” pile, with $1.
Player 1 gets to choose whether to take the large pile or leave both piles alone.
• If player 1 takes the large pile, he receives $5 as his payoff, and player 2 receives the small pile, $1 as his payoff, and the game ends.
• If player 1 leaves both piles alone, both the large and the small piles double ($10 and $2, respectively).
• Player 2 then decides whether to take the large pile or leave both piles alone.
• Every time a player leaves both piles alone, both piles double in size.
• Players take turns deciding whether to take the large pile or leave the piles alone until each player has had three opportunities to take the large pile (six rounds total).
• If, at the end of the game, neither player has taken the large pile, player 1 is awarded the large pile (which will contain $320 at this point) and player 2 is awarded the small pile (which will contain $64).
(a) Depict the extensive form of this game.
(b) Find the SPE of this game.
(c) Suppose that, if at the end of the game no player ever takes the large pile of money, both players are awarded an equal share of both the piles of money, rather than player 1 being awarded the large pile as in previous parts of the exercise. Find the SPE of this game.

14. Seven bean game.C Consider a situation where there are seven beans sitting on a table. Two players take turns taking either one or two beans off of the table, with player 1 going first. Whoever takes the last bean off the table loses the game and must pay $1 to the winner.
(a) If both players act optimally, who wins the game? What is the winning player’s optimal strategy? (Hint: Try this game with fewer beans on the table first, and work your way up to seven beans.)
(b) Suppose instead that there are nine beans on the table. If both players act optimally, who wins the game? What is the winning player’s optimal strategy?

15. Repeated dilemmas.B Repeat the infinitely repeated Prisoner’s Dilemma game of this chapter, but assume different discount factors for each player, δ1 for the row player and δ2 for the column player.

16. Collusion.B Consider a situation where two identical firms are simultaneously deciding whether to price high or price low. If both firms price high, they both receive half the market’s profits, $\frac{\pi}{2}$. If one firm prices high while the other firm prices low, the low-pricing firm receives all the market’s profits, $\pi$, while the high-pricing firm receives 0 in profits. If both firms price low, they both receive 0 in profits.
(a) Depict the normal-form representation of this simultaneous move game. Find all pure strategy Nash equilibria.
(b) Suppose now that the firms decided to collude to charge a high price. For what minimal discount factor δ do the firms cooperate by charging a high price?

17. Alternative triggers.B Consider the following normal-form game (player 1’s payoff is listed first):

                         Player 2
                  H           M          L
Player 1    H   100, 100    30, 125    20, 80
            M   125, 30     60, 60     30, 50
            L   80, 20      50, 30     40, 40

(a) Find all pure strategy Nash equilibria of this game.
(b) Suppose that each player implements the following GTS: “Choose H in the first round. In every other round, if both players chose H in all previous rounds, choose H again. Otherwise, choose M forever after.” For what minimal discount factor δ do the players cooperate by choosing H?
(c) Suppose that each player implements the following GTS: “Choose H in the first round. In every other round, if both players chose H in all previous rounds, choose H again. Otherwise, choose L forever after.” For what minimal discount factor δ do the players cooperate by choosing H?
(d) Compare the results from parts (b) and (c). If they are similar, explain why. If they are different, provide an explanation for any differences.

18. Carrot and stick.B Consider the following normal-form game (player 1’s payoff is listed first):

                         Player 2
                  H           M          L
Player 1    H   100, 100    30, 125    20, 80
            M   125, 30     80, 80     30, 50
            L   80, 20      50, 30     40, 40

(a) Find all pure strategy Nash equilibria of this game.
(b) Suppose this game is played twice. Consider the following strategy: “Choose H in the first round. If both players chose H in the first round, choose M. Otherwise, choose L.” For what minimal discount factor δ do the players cooperate by choosing H?

19. Temporary punishments.C Repeat the infinitely repeated Prisoner’s Dilemma game of example 13.4, finding under which conditions you can sustain cooperation under a GTS that temporarily punishes defections for only two periods. Compare your results with those in the chapter (where the GTS has a permanent reversion to NE).

20. Imperfect monitoring.C Repeat example 13.4, but assume imperfect monitoring. Specifically, if player j cooperates, the probability that player i will observe j defecting is zero; but if player j defects, the probability that i will observe j defecting is p. When p = 1, the game exhibits perfect monitoring, but when p < 1, monitoring is imperfect.

21. Punishment size.B Consider the situation in exercise 13.19. While it is harder to sustain cooperation when the punishment is temporary (i.e., higher δ required), it might be more beneficial to have only small punishments. Why?

22. Rock, Paper, Scissors–I.A Consider the Rock, Paper, Scissors game from exercise 12.15 in chapter 12. Suppose instead that the game is played sequentially. Would you rather move first or second? Why?

23. Rock, Paper, Scissors–II.B Consider the results of exercises 12.15 and 13.22. A friend of yours approaches you and states that he has found the best strategy for Rock, Paper, Scissors. He claims that he should always pick a random choice for the first round of the game.
If he wins, he should stay with his choice; but if he loses, rotate to whatever item beat him previously (e.g., if he lost while playing “Paper,” he switches to “Rock”). How would you respond to your friend? Can you describe a strategy that could beat him almost all the time?

14 Imperfect Competition

14.1 Introduction

In this chapter, we return to our analysis of market structure, which we initiated with two extreme types of industries (perfect competition, where infinitely many firms operate, as described in chapter 9; and monopolies, where only one firm operates, as discussed in chapter 10). In particular, we seek to understand output and pricing decisions in less extreme industries, where a small number of firms operate. We refer to such industries as “oligopolies.” Examples of these types of markets are (1) smartphones, where Samsung, Apple, and Huawei capture almost 50 percent of the industry; (2) airlines competing on the same route, where we rarely see more than three or four carriers; or (3) light bulbs, where Philips Lighting, Osram, and Acuity Brands capture a large market share. Because every firm’s decision affects its rivals’ profits, we can deploy our tools from game theory (discussed in chapters 12 and 13) to characterize how firms will behave when they compete with each other in this setting.

We start by describing how to measure market power with a single index and how to interpret it in terms of market concentration. We then examine firms’ interactions in models where every firm simultaneously chooses its strategy (either its output or its price). We also apply the tools from repeated games to investigate firms’ output choices when they interact with one another a finite or infinite number of times. In particular, we identify under which conditions firms can cooperate with each other (i.e., when they can form a cartel and sustain it through time). Understanding firms’ incentives to collude in cartels has important policy implications for antitrust authorities, as they can better design regulations that will reduce collusion incentives and promote competition.

We then move on to scenarios in which firms compete sequentially (e.g., an industry leader choosing its output first, and the follower responding with its own output decision). The last section of the chapter examines firms selling products that customers do not regard as identical (i.e., products are differentiated in some dimension such as taste, e.g., Macs and PCs, iPhones and Samsung smartphones).¹ In this context, we show that firms’ competition is attenuated because, intuitively, customers do not fully switch their purchasing decisions if one of the firms slightly undercuts its rivals’ price, given their relative preference for one brand over another.

1. Generally, we distinguish whether products are differentiated horizontally or vertically. The former indicates that some consumers prefer one of the goods because its features are closer to their ideal, whereas the latter represents markets where all customers regard one good as superior in terms of quality. This chapter does not cover vertical differentiation; for a detailed presentation, see section 5.3.1 in Belleflamme and Peitz (2015).

Table 14.1 summarizes the different industries encountered in previous chapters, perfect competition (chapter 9) and monopoly (chapter 10), and the market structures we study in this chapter (oligopoly), describing how they differ in terms of (1) the number of firms in the industry, (2) the type of good they sell, (3) whether firms are price-takers or not, and (4) the presence of entry barriers.

Table 14.1 Summary of market structures.

Industry              Number of firms   Type of good                     Price-takers?   Entry barriers?
Perfect competition   Many              Homogeneous                      Yes             No
Monopoly              One               No close substitutes             No              Yes
Oligopoly             Some              Homogeneous or heterogeneous     No              Yes

14.2 Measuring Market Power

A common measure of market power is the number of firms in an industry, N ≥ 1. Most people agree that a market with N = 3 is probably less competitive than one with N = 1,000. However, such a measure is relatively vague, as it does not inform us about the market share that each firm sustains. For instance, this measure would evaluate two industries, A and B, with the same number of firms (e.g., N = 3) as being equivalent. A closer look, however, could reveal that in industry A, one of the firms enjoys a 98 percent market share, while the other two firms only have 1 percent each; on the other hand, in industry B, market share is evenly distributed across firms (each firm holds 33.3 percent of the market share). To avoid this problem, the Herfindahl-Hirschman index (HHI) accounts for both the number of firms and their market shares.

Herfindahl-Hirschman index (HHI) of market concentration. This index is given by

$$HHI = (s_1)^2 + (s_2)^2 + \dots + (s_N)^2,$$

where $s_1$ represents the market share of firm 1 (in percentage), $s_2$ is that of firm 2, and similarly for all remaining N firms in the industry.

To understand the HHI, it is useful to consider extreme market structures. In a monopoly, a single firm captures the entire market share, implying that $s_1 = 100$ percent, which produces an HHI of $HHI = (100)^2 = 10{,}000$. Similarly, in a duopoly with two firms evenly sharing the market, the HHI decreases to $HHI = (50)^2 + (50)^2 = 5{,}000$. In an oligopoly with 1,000 firms, each capturing $\frac{1}{1{,}000}$ of the market share, the HHI further decreases to

$$HHI = \left(\frac{1}{1{,}000}\right)^2 + \left(\frac{1}{1{,}000}\right)^2 + \dots + \left(\frac{1}{1{,}000}\right)^2 = 1{,}000\left(\frac{1}{1{,}000}\right)^2 = \frac{1}{1{,}000} = 0.001$$

when shares are written as fractions of the market (equivalently, an HHI of 10 when each share is written as 0.1 percent, the scale used in the two previous examples).

Generally, in an industry with N ≥ 1 firms, all of them evenly sharing the market, and thus entailing a market share of $s_i = \frac{1}{N}$ for every firm i, the HHI is given by

$$HHI = \left(\frac{1}{N}\right)^2 + \left(\frac{1}{N}\right)^2 + \dots + \left(\frac{1}{N}\right)^2 = N\left(\frac{1}{N}\right)^2 = \frac{1}{N}$$

(or $\frac{10{,}000}{N}$ on the percentage scale), which converges to zero when the number of firms, N, is sufficiently large.
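The HHI calculations above can be reproduced with a few lines of code. The sketch below is an illustration rather than part of the original text: the function name `hhi` and the tolerance used to check that shares add up to 100 are assumptions, and shares are entered in percent, as in the monopoly and duopoly examples.

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman index: sum of squared market shares (in percent)."""
    total = sum(shares_percent)
    # Assumed sanity check: the shares of all firms should add up to 100 percent.
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"market shares add up to {total}, not 100")
    return sum(s ** 2 for s in shares_percent)


if __name__ == "__main__":
    print(hhi([100]))            # monopoly
    print(hhi([50, 50]))         # duopoly with equal shares
    print(hhi([98, 1, 1]))       # industry A: one dominant firm
    print(hhi([100 / 3] * 3))    # industry B: three equal firms
    print(hhi([0.1] * 1000))     # 1,000 equal firms, shares in percent
```

Industries A and B both have N = 3 firms, yet their indexes differ sharply (9,606 against roughly 3,333), which is the distinction that a simple firm count misses. With shares in percent, N equal-sized firms yield an index of 10,000/N.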
In summary, a high HHI arises in highly concentrated industries, which can occur because a single firm captures all market share (as in the monopoly example given previously) or because a few firms hold most market power, despite several firms being present. In contrast, a low HHI emerges when market power is more evenly distributed. As a consequence, the HHI ranges from 10,000 to 0.

As a reference, the US light bulb market, with around fifty-seven firms, has an HHI of 2,757. This value indicates that some of these firms enjoy a large market share. In contrast, glass container manufacturing, despite having only twenty-two firms, exhibits a lower HHI of 2,582. This value suggests that market shares are more evenly split among firms (i.e., the market is less concentrated).[2]

2. For information about other industries, visit the concentration ratios posted on the US Census website at http://www.census.gov/epcd/www.concentration.html.

14.3 Models of Imperfect Competition

Consider a market with $N \geq 2$ firms, all of them selling a relatively homogeneous product (i.e., a good with close attributes).[3] In this scenario, we will consider three models of firm competition: (1) the Cournot model of simultaneous quantity competition; (2) the Bertrand model of simultaneous price competition; and (3) the Stackelberg model of sequential quantity competition.[4] In future sections, we explore how the results in these models are affected when firms sell differentiated products.

3. A common example is that of brands of unflavored mineral water, where fewer than eight brands compete in most US stores; or that of salt, where Morton Salt, Cargill, and IMC are the main players, all of which sell an extremely similar product. In both markets, firms seek to differentiate their products by adding flavors or touting their health properties (which are often difficult for the buyer to confirm). As we discuss later in this chapter, firms' profits increase when their products are regarded as differentiated from those of their rivals.

14.3.1 Cournot Model: Simultaneous Quantity Competition

Let us consider an industry with $N = 2$ firms selling a homogeneous product. (The appendix at the end of this chapter analyzes how the results are affected when we allow for $N \geq 2$ firms.) In this model, every firm independently and simultaneously chooses its profit-maximizing output ($q_1$ for firm 1 and $q_2$ for firm 2). The market price is then determined by inserting output levels $q_1$ and $q_2$ into the inverse demand function $p(q_1, q_2)$. For simplicity, assume that this function is linear, $p(q_1, q_2) = a - b(q_1 + q_2)$, where $a, b > 0$ are positive constants.[5] Firm 1's total cost (TC) function is $TC_1(q_1) = cq_1$, where $c > 0$ is a positive parameter representing firm 1's marginal cost of production. Firm 2's TC function is symmetric, $TC_2(q_2) = cq_2$. (Example 14.2 considers how our results change when firms are asymmetric in their costs, that is, when one firm is more efficient than the other in producing the good.)

Firm 1. Let us start by considering firm 1's profit maximization problem (PMP). In particular, the firm chooses its output $q_1$ to solve

$$\max_{q_1}\ \pi_1 = TR_1 - TC_1 = \underbrace{p(q_1, q_2)q_1}_{TR_1} - \underbrace{cq_1}_{TC_1} = [a - b(q_1 + q_2)]q_1 - cq_1,$$

where $TR_1 = p(q_1, q_2)q_1$ denotes total revenue (price times units sold) and $TC_1 = cq_1$ is its total cost. To maximize its profits, firm 1 differentiates this expression with respect to its output level, $q_1$, to obtain

$$\frac{\partial \pi_1}{\partial q_1} = a - 2bq_1 - bq_2 - c = 0.$$

Rearranging this, we get $a - c - bq_2 = 2bq_1$, and solving for $q_1$ yields

$$q_1(q_2) = \frac{a-c}{2b} - \frac{1}{2}q_2, \qquad (BRF_1)$$

which is referred to as firm 1's "best response function." This function describes the profit-maximizing output that firm 1 chooses as a response to each of the output levels that firm 2 selects.
For instance, if $a = 10$, $b = 1$, and $c = 2$, firm 1's best response function becomes $q_1(q_2) = \frac{10-2}{2} - \frac{1}{2}q_2 = 4 - \frac{1}{2}q_2$. If firm 2 produces $q_2 = 3$ units, firm 1 responds with $q_1(3) = 4 - \frac{1}{2}(3) = 2.5$ units.

4. One of the exercises at the end of the chapter examines the Stackelberg model of sequential price, rather than quantity, competition.

5. Example 14.1 considers $p(q_1, q_2) = 12 - q_1 - q_2$. In that scenario, if firm 1 produces $q_1 = 5$ units, and firm 2 produces $q_2 = 4$ units, the price becomes $p(5, 4) = 12 - 5 - 4 = \$3$.

[Figure 14.1: Firm 1's best response function, $q_1(q_2) = \frac{a-c}{2b} - \frac{1}{2}q_2$. Vertical axis: $q_1$, with intercept $\frac{a-c}{2b}$; horizontal axis: $q_2$, with intercept $\frac{a-c}{b}$.]

Figure 14.1 depicts firm 1's best response function, which originates at a height of $\frac{a-c}{2b}$ units on the vertical axis when firm 2 does not produce at all, but decreases with a slope of $-1/2$ for every unit of firm 2's output. In addition, when $q_2 = \frac{a-c}{b}$, firm 1 optimally responds with $q_1\left(\frac{a-c}{b}\right) = \frac{a-c}{2b} - \frac{1}{2}\frac{a-c}{b} = 0$ units. This outcome extends to all output levels $q_2 \geq \frac{a-c}{b}$. Intuitively, as firm 2 increases its output $q_2$, firm 1 is left with a smaller residual demand to serve (i.e., fewer customers). When firm 2's output is massive, exceeding $\frac{a-c}{b}$, firm 1's profit-maximizing decision is to shut down, producing $q_1 = 0$, rather than selling units of $q_1$ at a loss. This is illustrated in figure 14.1 by the flat segment of firm 1's best response function, which overlaps the horizontal axis, where firm 1's output is zero ($q_1 = 0$) for all $q_2 \geq \frac{a-c}{b}$.

Firm 2. A similar argument applies to firm 2, which solves

$$\max_{q_2}\ \pi_2 = TR_2 - TC_2 = \underbrace{p(q_1, q_2)q_2}_{TR_2} - \underbrace{cq_2}_{TC_2} = [a - b(q_1 + q_2)]q_2 - cq_2.$$

Differentiating with respect to $q_2$, we find

$$\frac{\partial \pi_2}{\partial q_2} = a - bq_1 - 2bq_2 - c = 0.$$

Rearranging this, we get $a - c - bq_1 = 2bq_2$. Solving for $q_2$ yields firm 2's best response function, $BRF_2$, as follows:

$$q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2}q_1, \qquad (BRF_2)$$

which is symmetric to that of firm 1 (i.e., only the subscripts changed) because both companies face the same demand and costs.

[Figure 14.2: Firm 2's best response function, $q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2}q_1$. Vertical axis: $q_1$, with intercept $\frac{a-c}{b}$; horizontal axis: $q_2$, with intercept $\frac{a-c}{2b}$.]

Figure 14.2 depicts this best response function. Like firm 1's best response function, firm 2's function originates at $q_2 = \frac{a-c}{2b}$ units when firm 1 is inactive, but it decreases at a rate of 1/2 as firm 1 increases its production. A common graphical trick used to plot firm 2's best response function is to use the same axis orientation as that used to depict firm 1's best response function.[6]

6. That is, you can rotate the page counterclockwise 90 degrees and plot $q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2}q_1$, starting with a vertical intercept at $q_2 = \frac{a-c}{2b}$ units and a horizontal intercept at $q_1 = \frac{a-c}{b}$ units, where $q_2 = 0$. As in figure 14.1, firm 2's best response function in figure 14.2 depicts $q_2 = 0$ for all $q_1 \geq \frac{a-c}{b}$ by including a segment that overlaps the vertical axis (the top-left side of figure 14.2).

Superimposing firm 1's and firm 2's best response functions onto the same graph, we obtain their crossing point, as depicted in figure 14.3. At this crossing point, each firm is choosing an output level that is a best response to the output of its rival (i.e., firms are selecting mutual best responses). From chapter 12, we know that a mutual best response is the Nash equilibrium (NE) of a game.

[Figure 14.3: Cournot equilibrium. Vertical axis: $q_1$; horizontal axis: $q_2$. The two best response functions cross on the 45-degree line ($q_1 = q_2$) at the point $\left(\frac{a-c}{3b}, \frac{a-c}{3b}\right)$; each function has intercepts $\frac{a-c}{2b}$ and $\frac{a-c}{b}$.]

To find the point where the best response functions cross each other, we can insert firm 2's best response function into that of firm 1, which yields

$$q_1 = \frac{a-c}{2b} - \frac{1}{2}\underbrace{\left(\frac{a-c}{2b} - \frac{1}{2}q_1\right)}_{q_2},$$

which depends on output $q_1$ alone. Rearranging this, we obtain

$$\frac{3}{4}q_1 = \frac{a-c}{4b}.$$

After solving for $q_1$, we find firm 1's equilibrium output $q_1^* = \frac{a-c}{3b}$. Inserting this output level in firm 2's best response function yields

$$q_2(q_1^*) = \frac{a-c}{2b} - \frac{1}{2}\left(\frac{a-c}{3b}\right) = \frac{3(a-c) - (a-c)}{6b} = \frac{a-c}{3b}.$$

Because firms face the same demand function and the same cost function, they produce the same output level in equilibrium. This output pair $(q_1^*, q_2^*) = \left(\frac{a-c}{3b}, \frac{a-c}{3b}\right)$ is the NE of the Cournot game, and figure 14.3 depicts it at the point where the best response functions of the firms cross each other.

Alternative approach. A more straightforward approach to solve for the equilibrium output is to invoke symmetry. Indeed, because firms are symmetric in their revenues and costs, we can claim that there must be a symmetric equilibrium where both firms produce the same amount, $q_1^* = q_2^* = q^*$. Inserting this property into either firm's best response function simplifies it to the following equation, which no longer includes subscripts:

$$q^* = \frac{a-c}{2b} - \frac{1}{2}q^*,$$

or $\frac{3}{2}q^* = \frac{a-c}{2b}$. Solving for $q^*$, we obtain the equilibrium output for every firm in this Cournot model, $q^* = \frac{a-c}{3b}$, thus coinciding with the output level found previously (where we inserted firm 2's best response function into firm 1's).

After finding equilibrium output, we can turn our attention to the equilibrium price, which we obtain by evaluating the inverse demand function $p(q_1, q_2) = a - b(q_1 + q_2)$ at $q_1^* = q_2^* = \frac{a-c}{3b}$, as follows:

$$p^*\left(\frac{a-c}{3b}, \frac{a-c}{3b}\right) = a - b\left(\frac{a-c}{3b} + \frac{a-c}{3b}\right) = a - \frac{2(a-c)}{3} = \frac{a+2c}{3}.$$

Finally, equilibrium profits for every firm $i = \{1, 2\}$ are

$$\pi_i^* = p^*q_i^* - cq_i^* = \frac{a+2c}{3}\cdot\frac{a-c}{3b} - c\,\frac{a-c}{3b} = \frac{(a+2c)(a-c)}{9b} - \frac{3c(a-c)}{9b} = \frac{a^2 - 2ac + c^2}{9b},$$

or, more compactly, $\pi_i^* = \frac{(a-c)^2}{9b}$ because $(a-c)^2 = a^2 - 2ac + c^2$. This can be alternatively expressed as $\pi_i^* = b(q^*)^2$, which reduces to $(q^*)^2$ when $b = 1$.

Example 14.1: Cournot model with symmetric costs

Consider a duopoly with inverse demand function $p(q_1, q_2) = 12 - q_1 - q_2$, where every firm $i = \{1, 2\}$ faces a symmetric cost function $TC_i(q_i) = 4q_i$.

Firm 1's best response function. In this scenario, firm 1 chooses its output level $q_1$ to solve

$$\max_{q_1}\ \pi_1 = (12 - q_1 - q_2)q_1 - 4q_1.$$

To maximize its profits, firm 1 differentiates this expression with respect to its output level, $q_1$, to obtain

$$\frac{\partial \pi_1}{\partial q_1} = 12 - 2q_1 - q_2 - 4 = 0.$$

Rearranging this, we get $8 - q_2 = 2q_1$, and solving for $q_1$ yields

$$q_1(q_2) = 4 - \frac{1}{2}q_2, \qquad (BRF_1)$$

which is firm 1's best response function, originating at 4 units and decreasing with a slope of $-1/2$ as firm 2 increases its production.

Firm 2's best response function. A similar argument applies to firm 2, which chooses $q_2$ to solve

$$\max_{q_2}\ \pi_2 = (12 - q_1 - q_2)q_2 - 4q_2.$$

Differentiating with respect to $q_2$, we find

$$\frac{\partial \pi_2}{\partial q_2} = 12 - q_1 - 2q_2 - 4 = 0.$$

Rearranging this, we find $8 - q_1 = 2q_2$. Solving for $q_2$ yields firm 2's best response function, as follows:

$$q_2(q_1) = 4 - \frac{1}{2}q_1,$$

which is symmetric to that of firm 1 (only the subscripts change).

Finding equilibrium output. To solve for the equilibrium output levels of each firm, we can invoke symmetry because firms are symmetric, and we just confirmed that their best response functions are symmetric. In other words, there must be a symmetric equilibrium where both firms produce the same amount, $q_1^* = q_2^* = q^*$. Inserting this property into either firm's best response function simplifies it to

$$q^* = 4 - \frac{1}{2}q^*,$$

or $\frac{3}{2}q^* = 4$. Solving for $q^*$, we obtain the equilibrium output for every firm in this Cournot model, $q^* = \frac{8}{3}$. As a consequence, equilibrium price is

$$p^*\left(\frac{8}{3}, \frac{8}{3}\right) = 12 - q^* - q^* = 12 - \frac{8}{3} - \frac{8}{3} = \frac{20}{3} \cong \$6.67,$$

ultimately producing equilibrium profits of

$$\pi_i^* = p^*q^* - cq^* = \frac{20}{3}\cdot\frac{8}{3} - 4\cdot\frac{8}{3} = \frac{160}{9} - \frac{96}{9} = \frac{64}{9}$$

for every firm $i = \{1, 2\}$.

Self-assessment 14.1 Repeat the analysis in example 14.1, but assume that firms face an inverse demand function $p(q_1, q_2) = 5 - \frac{1}{3}(q_1 + q_2)$ (i.e., $a = 5$ and $b = \frac{1}{3}$). Find firm 1's best response function, its vertical and horizontal intercepts, and its slope.
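The best response functions and the Cournot equilibrium derived in this section can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, exact fractions are used to avoid rounding, and the cost parameters `c1` and `c2` are allowed to differ (an assumption that goes slightly beyond the symmetric case). It also assumes an interior solution in which both firms produce positive output.

```python
from fractions import Fraction


def best_response(a, b, c_own, q_rival):
    """Profit-maximizing output against q_rival when p = a - b(q1 + q2).

    Implements q_i = (a - c_i)/(2b) - q_j/2, truncated at zero because a
    firm shuts down rather than sell at a loss.
    """
    a, b, c_own, q_rival = (Fraction(x) for x in (a, b, c_own, q_rival))
    return max(Fraction(0), (a - c_own) / (2 * b) - q_rival / 2)


def cournot_equilibrium(a, b, c1, c2):
    """Solve both best response functions simultaneously (interior case).

    Inserting firm 2's best response into firm 1's gives
    (3/4) q1 = (a - 2*c1 + c2) / (4b), so q1 = (a - 2*c1 + c2) / (3b).
    Returns (q1, q2, price, profit1, profit2).
    """
    a, b, c1, c2 = (Fraction(x) for x in (a, b, c1, c2))
    q1 = (a - 2 * c1 + c2) / (3 * b)
    q2 = best_response(a, b, c2, q1)
    price = a - b * (q1 + q2)
    return q1, q2, price, (price - c1) * q1, (price - c2) * q2


# Parameters of example 14.1: a = 12, b = 1, and c = 4 for both firms.
q1, q2, price, profit1, profit2 = cournot_equilibrium(12, 1, 4, 4)
```

With the parameters of example 14.1, the sketch returns an output of 8/3 for each firm, a price of 20/3, and profits of 64/9, matching the values derived by hand.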
Self-assessment 14.2 Repeat the analysis in example 14.1, but assume that the inverse demand function changes to $p(q_1, q_2) = 20 - q_1 - q_2$. Find each firm's best response function, the Cournot equilibrium output, and the corresponding equilibrium price and profits.

Example 14.2 considers an industry with two firms facing different production costs.

Example 14.2: Cournot model with asymmetric costs

Consider two firms competing à la Cournot, facing the same inverse demand function as in example 14.1, $p(q_1, q_2) = 12 - q_1 - q_2$, but different cost functions: $TC_1(q_1) = 4q_1$ for firm 1, and $TC_2(q_2) = 3q_2$ for firm 2. (Note that the marginal cost of firm 2 is less than that of firm 1, and hence we can expect its equilibrium output to be larger. We confirm this suspicion in the discussion that follows.)

Firm 1's best response. We first find firm 1's BRF by solving its PMP:

$$\max_{q_1}\ \pi_1 = (12 - q_1 - q_2)q_1 - 4q_1.$$

This problem coincides with the one discussed in example 14.1, and thus it yields the same best response function, $q_1(q_2) = 4 - \frac{1}{2}q_2$.

Firm 2's best response. In contrast, firm 2's PMP is now

$$\max_{q_2}\ \pi_2 = (12 - q_1 - q_2)q_2 - 3q_2.$$

Differentiating with respect to its output $q_2$ yields

$$\frac{\partial \pi_2}{\partial q_2} = 12 - q_1 - 2q_2 - 3 = 0.$$

Rearranging this, we find that $9 - q_1 = 2q_2$. Solving for $q_2$ yields firm 2's best response function, as follows:

$$q_2(q_1) = \frac{9}{2} - \frac{1}{2}q_1.$$

This function has the same slope as that in example 14.1, $-1/2$ (where we assumed that its marginal cost was 4), but it originates at 9/2 rather than at 4. This indicates that, for every output of firm 1, firm 2's output is now larger because its marginal cost is 3 rather than 4.

Finding equilibrium output. At this point, we cannot invoke symmetry in equilibrium output levels because firms face different production costs. As a result, we need to simultaneously solve for $q_1$ and $q_2$ in $BRF_1$ and $BRF_2$ by, for instance, inserting $BRF_2$ into $BRF_1$, as follows:

$$q_1 = 4 - \frac{1}{2}\underbrace{\left(\frac{9}{2} - \frac{1}{2}q_1\right)}_{q_2}.$$

Rearranging this, we find $q_1 = 4 - \frac{9}{4} + \frac{1}{4}q_1$, or $\frac{3}{4}q_1 = \frac{7}{4}$. Solving for $q_1$ yields an equilibrium output of $q_1^* = \frac{7}{3} \cong 2.33$ units. Inserting this output into firm 2's best response function, we find its equilibrium output:

$$q_2^* = \frac{9}{2} - \frac{1}{2}\underbrace{\left(\frac{7}{3}\right)}_{q_1^*} = \frac{10}{3} \cong 3.33 \text{ units},$$

where $q_2^* > q_1^*$ because firm 2's marginal cost is lower than that of firm 1.

As an exercise, you can check that in this scenario, equilibrium price is $p^* = \frac{19}{3}$, and equilibrium profits are $\pi_1^* = \frac{49}{9}$ for firm 1 and $\pi_2^* = \frac{100}{9}$ for firm 2. Comparing equilibrium profits under asymmetric costs (this example) and under symmetric costs (example 14.1), we find that the firm benefiting from a cost advantage (firm 2) earns a larger profit, while the firm suffering from a cost disadvantage (firm 1) earns a smaller profit.

Self-assessment 14.3 Repeat the analysis in example 14.2, but assume that firm 2's cost function changes to $TC_2(q_2) = q_2$, thus emphasizing the cost advantage of firm 2 relative to firm 1. Compare your results against those in example 14.2.

14.3.2 Bertrand Model: Simultaneous Price Competition

Consider now that firms compete in prices rather than in quantities. Will our equilibrium results from the Cournot model be affected? The Bertrand model of price competition that we explore in this section answers this question with a robust "Yes."

Let us start by clarifying the setup of the game: two symmetric firms produce a homogeneous good and face a common marginal cost, $c > 0$. They simultaneously and independently set prices for their products, $p_1$ and $p_2$. If firm 1 charges the lowest price (i.e., $p_1$ satisfies $p_1 < p_2$), firm 1 captures all the demand, while firm 2 captures none: $x_1(p_1, p_2, I) > 0$ and $x_2(p_1, p_2, I) = 0$, where $x_1(p_1, p_2, I)$ denotes the demand function found in chapter 3, and $I > 0$ represents income level. Similarly, if firm 2 sets the lowest price, $p_1 > p_2$, the roles are switched, as it is now firm 2 that captures all demand.
Lastly, if prices coincide, $p_1 = p_2$, both firms equally share market demand; that is, $\frac{1}{2}x_1(p_1, p_2, I) > 0$ for firm 1, and similarly, $\frac{1}{2}x_2(p_1, p_2, I) > 0$ for firm 2.

The Bertrand model of price competition claims that, in equilibrium, both firms set the same price, and this common price coincides with their marginal cost: $p_1 = p_2 = c$. Next, let us show this result by systematically going over all possible price pairs $(p_1, p_2)$ that are different from $(p_1, p_2) = (c, c)$ (i.e., where both firms' price coincides with their common cost, $c$). We will demonstrate that these price profiles cannot be equilibria of the Bertrand model of price competition. What do we mean by that? We only need to show that any price different from the marginal cost $c$ is "unstable" in the sense that at least one firm has an incentive to deviate to a different price. For presentation purposes, we will first examine asymmetric price pairs, $p_1 \neq p_2$, and then analyze symmetric price pairs, where $p_1 = p_2$.

1. Asymmetric price profiles.

(a) Consider a price profile $p_1 > p_2 > c$, as depicted in figure 14.4. In this scenario, firm 2 charges the lower price, thus capturing the entire market and making a positive margin per unit because $p_2 > c$. This price profile, however, cannot be stable because firm 1 has incentives to deviate by undercutting firm 2's price, charging $p_1 = p_2 - \varepsilon$, where $\varepsilon > 0$ indicates a small reduction below firm 2's price.[7] Hence, price profile $p_1 > p_2 > c$ cannot be an equilibrium because we found at least one profitable deviation.[8]

[Figure 14.4: Profitable deviation when $p_1 > p_2 > c$. Price line marking $c$, $p_2 - \varepsilon$, $p_2$, and $p_1$; firm 1's profitable deviation is to $p_2 - \varepsilon$.]

(b) Consider now a price profile $p_1 > p_2 = c$. As depicted in figure 14.5, firm 2 sets the lowest price (and so captures all sales), but in this case, it makes no profit per unit. Firm 1 would not have the incentive to undercut firm 2's price, as that would entail charging a price below marginal cost, thus incurring a loss per unit. Firm 2, instead, would have an incentive to deviate by increasing its price from $p_2 = c$ to slightly below its rival's price, $p_2 = p_1 - \varepsilon$, where $\varepsilon > 0$ is a small number (e.g., 1 cent), and make a higher profit. Because we found a profitable deviation, we can claim that price profile $p_1 > p_2 = c$ cannot be an equilibrium either.[9]

[Figure 14.5: Profitable deviation when $p_1 > p_2 = c$. Price line marking $p_2 = c$, $p_1 - \varepsilon$, and $p_1$; firm 2's profitable deviation is to $p_1 - \varepsilon$.]

In summary, points 1(a) and 1(b) considered all possible asymmetric price profiles. For all of them, we showed that at least one firm has an incentive to deviate, entailing that the price profile considered in each case cannot be an equilibrium. As a consequence, if an equilibrium exists, it must be symmetric in the sense that both firms charge the same price, $p_1 = p_2 = p$. We examine this possibility next.

2. Symmetric price profiles.

(a) Consider a price profile where both firms charge the same price, but such a common price is larger than the marginal cost of production, $p_1 = p_2 > c$, as depicted in figure 14.6. In this case, both firms evenly share the market because their prices are the same. Every firm $i$ now has the incentive to deviate by undercutting its rival's price $p$ by a small amount, $\varepsilon$, so that $p_i = p - \varepsilon$.[10] Hence, price profile $p_1 = p_2 > c$ cannot be an equilibrium either.

[Figure 14.6: Profitable deviation when $p_1 = p_2 > c$. Price line marking $c$, $p_i - \varepsilon$, and $p_1 = p_2$; firm $i$'s profitable deviation is to $p_i - \varepsilon$.]

(b) Finally, consider the price profile $p_1 = p_2 = c$. Here, prices coincide, thus leading firms to evenly share the market. In addition, these prices leave no positive margin per unit because $p_i = c$ for every firm $i$. While profits are zero in this price profile, no firm can strictly increase its payoff by unilaterally deviating: setting a lower price would attract all customers, but at a loss per unit, and setting a higher price would reduce the deviating firm's sales to zero, as its price is now higher than that of its rival.

Summarizing, we can claim that both firms setting a price equal to the common marginal cost, $p_i = c$, is a Nash equilibrium of the Bertrand model of price competition because no firm can strictly increase its profit by unilaterally deviating from such a price. This discussion considers, for simplicity, two firms. Nonetheless, a similar argument can be extended to scenarios with more than two firms, where $p_i = c$ remains an equilibrium of the game for every firm $i$.[11]

7. If, for instance, $p_2 = \$10$, firm 1 could undercut firm 2's price by 1 cent, by charging $p_1 = \$9.99$, thus making $\varepsilon = 0.01$. A similar argument applies if $\varepsilon$ could be smaller than 1 cent.

8. A similar argument applies if we switch the identities of the firms by considering the price profile $p_2 > p_1 > c$, whereby firm 1 captures the whole market, and firm 2 would now have incentives to undercut firm 1's price by a small amount.

9. A similar argument applies if we switch the identities of the two firms by considering the pricing profile $p_2 > p_1 = c$, where now firm 1 captures the market but would have the incentive to increase its price until $p_1 = p_2 - \varepsilon$.

10. Firm $i$'s price decrease, from $p$ to $p - \varepsilon$, exerts two effects on its profits. On the one hand, it increases its sales from half the market to all the market. On the other hand, it reduces its margin per unit from $p - c$ to $(p - \varepsilon) - c$. However, the first (positive) effect dominates the second (negative) effect, yielding an overall increase in profits, when the firm undercuts its rival's price $p$ by a small amount (i.e., when $\varepsilon$ is a small number).

Example 14.3: Bertrand model

Consider again the inverse demand function in example 14.1, $p(q_1, q_2) = 12 - q_1 - q_2$.
Because $Q \equiv q_1 + q_2$ denotes the aggregate output in the industry, we can express the inverse demand function as $p(Q) = 12 - Q$. According to the Bertrand model of price competition, all firms in the industry lower their prices until $p = c$. In this context, because $p(Q) = 12 - Q$, equilibrium condition $p = c$ entails $12 - Q = c$, which, solving for aggregate output $Q$, yields $Q = 12 - c$. For instance, if the marginal cost is $c = 4$, as in examples 14.1 and 14.2, aggregate output becomes $Q = 12 - 4 = 8$ units, each of which is sold at a price of \$4.

Self-assessment 14.4 Repeat the analysis in example 14.3, but assume that firms face an inverse demand function $p(q_1, q_2) = 20 - q_1 - q_2$. How are the results in example 14.3 affected?

11. In oligopolies with more than two firms, however, other NEs can exist where two firms set their price equal to the common marginal cost $c$, while all other firms set their prices above $c$ (i.e., $p_i = p_j = c$ and $p_k > c$ for every firm $i \neq j \neq k$).

Reconciling the Cournot and Bertrand models. Why are the results in the Cournot model of quantity competition and the Bertrand model of price competition so dramatically different? Recall that in the Cournot model, firms set a price above marginal cost, thus making positive profits, whereas in the Bertrand model, firms set $p = c$, earning no economic profits. The underlying assumption driving the difference in their equilibrium predictions is, essentially, the absence of capacity constraints in the Bertrand model of price competition: if a firm charges 1 cent less than its rival, it captures all market demand, regardless of its size. This assumption might be reasonable for certain goods (such as online movie streaming), but relatively difficult to justify for others (such as smartphones or smartwatches) with a world demand that cannot be served by a single firm.
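The deviation argument behind the Bertrand result can also be checked by brute force. The following sketch is an illustration under stated assumptions, not the chapter's method: it uses the demand $Q = 12 - p$ and marginal cost $c = 4$ from example 14.3, restricts prices to a one-cent grid, and relies on hypothetical helper names (`profit`, `profitable_deviation`).

```python
from fractions import Fraction

C = Fraction(4)                                   # common marginal cost, c = 4
GRID = [Fraction(k, 100) for k in range(1201)]    # prices from 0.00 to 12.00


def profit(p_own, p_rival):
    """Profit of a firm charging p_own while its rival charges p_rival."""
    demand = max(Fraction(0), 12 - p_own)         # market demand Q = 12 - p
    if p_own < p_rival:
        share = Fraction(1)                       # lowest price takes all sales
    elif p_own == p_rival:
        share = Fraction(1, 2)                    # equal prices split the market
    else:
        share = Fraction(0)                       # higher price sells nothing
    return (p_own - C) * share * demand


def profitable_deviation(p1, p2):
    """True if either firm can strictly raise its profit unilaterally."""
    gain1 = max(profit(p, p2) for p in GRID) > profit(p1, p2)
    gain2 = max(profit(p, p1) for p in GRID) > profit(p2, p1)
    return gain1 or gain2


cases = {
    "p1 > p2 > c": profitable_deviation(10, 9),
    "p1 > p2 = c": profitable_deviation(9, 4),
    "p1 = p2 > c": profitable_deviation(8, 8),
    "p1 = p2 = c": profitable_deviation(4, 4),
}
```

Only the last profile, where both prices equal marginal cost, admits no profitable deviation. One caveat of the discrete grid is that both firms pricing one cent above cost is also stable, so the sketch checks the four profiles discussed in the text rather than searching for every equilibrium.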
14.3.3 Cartels and Collusion

Our previous results indicate that firms competing in quantities earn profits below those under monopoly, and this result is even more pronounced when firms compete in prices (à la Bertrand). What if, rather than competing against each other, firms were to coordinate their production decisions? In this section, we analyze how collusion can help firms increase their profits, and under which conditions such cooperation can be expected to hold.[12]

Cartels seek to coordinate production decisions to raise prices and profits for cartel participants. A famous example of a cartel is the Organization of the Petroleum Exporting Countries (OPEC), which limits the oil extraction of each participating country in order to increase market prices. Other famous examples include cartels in lysine, vitamin B2, vitamin C, steel, rayon fiber, diamonds, and heating pipes.[13] As the next example illustrates, in a cartel firms seek to maximize their joint rather than their individual profits, making our analysis analogous to that under multiplant monopolies (discussed in section 10.6 of chapter 10).

12. We consider here quantity competition, while one of the end-of-chapter exercises examines collusion under price competition.

13. See Levenstein and Suslow (2006) and Harrington (2006) for more details on these cartels.

Example 14.4: Collusion when firms compete in quantities

Consider the industry in example 14.1, where $p(q_1, q_2) = 12 - q_1 - q_2$ and $TC_i(q_i) = 4q_i$ for every firm $i$. If firms join a cartel, they choose the output of firm 1, $q_1$, and that of firm 2, $q_2$, to maximize their joint profits, $\pi = \pi_1 + \pi_2$, as follows:

$$\max_{q_1, q_2}\ \pi = \pi_1 + \pi_2 = \underbrace{(12 - q_1 - q_2)q_1 - 4q_1}_{\pi_1} + \underbrace{(12 - q_1 - q_2)q_2 - 4q_2}_{\pi_2}.$$

This expression looks scary (we agree on that), so let us try to simplify it a bit. First, note that $(12 - q_1 - q_2)$ shows up twice and can be factored out, and so does the unit cost, 4, which yields

$$\max_{q_1, q_2}\ (12 - q_1 - q_2)(q_1 + q_2) - 4(q_1 + q_2).$$

Because $12 - q_1 - q_2 = 12 - (q_1 + q_2)$, we obtain

$$\max_{q_1, q_2}\ [12 - (q_1 + q_2)](q_1 + q_2) - 4(q_1 + q_2).$$

Finally, because $Q = q_1 + q_2$ denotes aggregate output, we can rewrite the cartel's profit maximization problem more compactly as

$$\max_{Q}\ [12 - Q]Q - 4Q.$$

The cartel just needs to choose the aggregate amount of output $Q$ (the total production of the whole cartel) to maximize profits $[12 - Q]Q - 4Q$, as if it were a single firm (i.e., a monopolist). Differentiating with respect to $Q$, we find

$$12 - 2Q - 4 = 0,$$

which, after solving for $Q$, yields $Q^* = \frac{8}{2} = 4$ units. Because firms are symmetric, each produces half of $Q^* = 4$ units (i.e., 2 units per firm). In contrast, under Cournot competition (as found in example 14.1), every firm produces $q = \frac{8}{3} \cong 2.67$ units. Therefore, under the cartel, every firm limits its own production to increase market price and profits. We can confirm this result by finding that the cartel price is

$$p(2, 2) = 12 - 2 - 2 = \$8,$$

which is higher than under Cournot competition (\$6.67). Similarly, the cartel profits for every firm $i$ are

$$\pi_i^* = (12 - q_1 - q_2)q_i - 4q_i = (12 - 2 - 2)2 - (4 \times 2) = \$8,$$

while under Cournot competition, profits were only $\pi_i = \frac{64}{9} \cong \$7.11$.

Self-assessment 14.5 Repeat the analysis in example 14.4, but assume that firms face an inverse demand function $p(q_1, q_2) = 20 - q_1 - q_2$. How are the results in example 14.4 affected?

Example 14.4 indicates that firms have incentives to coordinate their production decisions, reducing their individual output to increase market prices and ultimately profits. Why are cartel profits larger than under Cournot competition? Under Cournot, when every firm increases its individual output, it considers the effect that such additional production has on its own profits, but it ignores the effect that a larger output has on its rival's profits. Indeed, a larger output lowers market prices, ultimately reducing its rival's profits.
Under the cartel agreement, in contrast, firms take into account each other's profits, as the cartel maximizes joint (rather than individual) profits. As a consequence, firms produce less under the cartel, both at the individual and aggregate levels, elevating market prices and ultimately increasing their profits. In short, by sharing profits, the cartel helps every firm internalize the negative effect that an increase in its own output produces on its rival's profits.

While the previous discussion ranks profits in cartels and Cournot, it does not identify under which conditions collusion can be sustained over time. Importantly, every firm has an incentive to cheat on the cartel agreement (that is, to produce more than its quota, 2 units per firm in example 14.4) while its rival sticks to the agreement. From chapter 13, we know that, if firms interact only once, cooperation cannot be sustained in equilibrium, nor can it be supported if firms interact a limited number of times. However, if firms interact infinitely (or there is a probability that both firms will still be in the industry tomorrow), then cooperation can be sustained. We evaluate under which conditions this occurs in example 14.5.

Example 14.5: Sustaining cooperation within the cartel

Assume that firms play an infinitely repeated Cournot game, and they seek to coordinate their production decisions through the following Grim-Trigger Strategy (GTS), similar to that discussed in chapter 13 for the Prisoner's Dilemma game:

1. In the first period of interaction, $t = 1$, every firm starts cooperating (producing 2 units).
2. In all subsequent periods, $t > 1$:
(a) Every firm continues cooperating, so long as all firms cooperated in all previous periods.
(b) If, instead, a firm observes some past cheating (deviating from this GTS), then it produces the Cournot output $q^* = \frac{8}{3}$ thereafter.
As shown in chapter 13, we only need to check if every firm has incentives to deviate from the GTS: (1) after observing a history of cooperation; and (2) after observing that some firm or firms cheated. We focus here on testing (1), while you can explore option (2) as an exercise.[14]

14. As in chapter 13, you should find that, upon observing some player or players cheating, every player has an incentive to implement the punishment in the GTS (choosing its Cournot output thereafter) rather than producing any other output level. Importantly, this result should hold for all values of the firm's discount factor $\delta$.

Cooperation. If firm $i$ continues cooperating (i.e., producing the cartel output of 2 units), it obtains the cartel profit of \$8. Therefore, its stream of discounted payoffs from cooperating becomes

$$8 + \delta 8 + \delta^2 8 + \dots = 8(1 + \delta + \delta^2 + \dots) = \frac{8}{1 - \delta},$$

where $\delta$ denotes the discount factor weighting future payoffs.

Best deviation. If, instead, firm $i$ deviates from producing 2 units while its rival sticks to the cartel agreement, its profits could increase. But what is firm $i$'s best deviation? To find this, we need to evaluate its profits when its rival produces the cartel output of 2 units, $q_j = 2$, obtaining

$$(12 - q_i - 2)q_i - 4q_i = (10 - q_i)q_i - 4q_i.$$

Differentiating with respect to $q_i$, we obtain $10 - 2q_i - 4 = 0$, which, solving for $q_i$, yields $q_i = 3$ units. Inserting this "best deviation" into firm $i$'s profits, we obtain deviation profits of

$$\pi^{Dev} = (10 - 3)3 - (4 \times 3) = \$9,$$

which are indeed larger than the cartel profit of \$8. Therefore, if firm $i$ deviates, its stream of discounted payoffs becomes

$$\underbrace{9}_{\text{Deviation}} + \underbrace{\delta\frac{64}{9} + \delta^2\frac{64}{9} + \dots}_{\text{Punishment}} = 9 + \frac{64}{9}(\delta + \delta^2 + \dots) = 9 + \frac{64}{9}\delta(1 + \delta + \dots) = 9 + \frac{64}{9}\cdot\frac{\delta}{1 - \delta}.$$

Intuitively, the deviating firm increases its profits from \$8 to \$9 for one period (i.e., instantaneous gain from deviation), but its defection is detected by its cartel partner, which triggers an infinite punishment in which both firms produce the Cournot output, yielding a Cournot profit of $\frac{64}{9}$ thereafter.

Comparing profits. As a consequence, every firm $i$ prefers to cooperate, so long as

$$\frac{8}{1 - \delta} \geq 9 + \frac{64}{9}\cdot\frac{\delta}{1 - \delta}.$$

Multiplying both sides by $(1 - \delta)$, we obtain $8 \geq 9(1 - \delta) + \frac{64}{9}\delta$, which simplifies to $\frac{17}{9}\delta \geq 1$. Solving for discount factor $\delta$, we find that the cartel output can be sustained with this GTS if $\delta \geq \frac{9}{17} \cong 0.53$ (i.e., when firms assign sufficient importance to their future profits). If, in contrast, $\delta < 0.53$, the cartel agreement cannot be sustained over time because firms would have incentives to cheat during every period. In this case, the Cournot outcome emerges in equilibrium in every period.

Self-assessment 14.6 Repeat the analysis in example 14.5, but assume that firms detect a deviation only after two periods, so a deviating firm earns a profit of \$9 during two periods before the punishment starts. This means that cheating is still detected with certainty, but with a lag of two periods rather than immediately. Find the minimal discount factor $\delta$ supporting cooperation in this scenario, and show that cooperation is more difficult to sustain than in example 14.5.

14.4 Stackelberg Model: Sequential Quantity Competition

Let us now modify the Cournot model of simultaneous quantity competition by considering that, while firms still compete in quantities, they do so sequentially. Specifically, the time structure of the game is the following:

1. Firm 1 chooses its output $q_1$.
2. Firm 2 observes $q_1$ and responds with its own output, $q_2$.

This timing may be due to industry or legal reasons that provide firm 1 with an advantage. For instance, firm 1 was the first to develop a new product or technology, allowing it to choose its output before firm 2. Because this is a sequential-move game, with firm 1 acting as the leader and firm 2 as the follower, we can solve it by applying backward induction, the game-theoretic tool discussed in chapter 13.
Firm 2 (follower). We first need to focus on the last mover (firm 2), and analyze its profit-maximizing output for every possible output that firm 1 produces. Firm 2 takes the leader's output \(q_1\) as given, because it is already chosen by the time firm 2 gets to move. Mathematically, firm 2 treats \(q_1\) as a parameter when maximizing its profits, as follows:

\[ \max_{q_2}\ \left[a - b(q_1 + q_2)\right] q_2 - c q_2. \]

Differentiating with respect to firm 2's output, \(q_2\), we obtain \(a - bq_1 - 2bq_2 - c = 0\); and solving for \(q_2\) yields

\[ q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1. \tag{BRF\(_2\)} \]

This expression is similar to firm 2's best response function in the Cournot model of example 14.1. Indeed, in that setting, firm 2 chose its profit-maximizing output for every \(q_1\) chosen by its rival, firm 1. A similar intuition applies now, except for the fact that firm 2 observes firm 1's output before choosing its own, whereas in the Cournot model, firm 2 chooses its output level simultaneously with that of firm 1. Nonetheless, in both scenarios firm 2 treats firm 1's output \(q_1\) as given, either because firm 2 cannot alter it (in the Cournot model) or because \(q_1\) is already produced (in the Stackelberg model that we analyze).

Firm 1 (leader). The leader chooses its output \(q_1\) to maximize its profits, as follows:

\[ \max_{q_1}\ \left[a - b(q_1 + q_2)\right] q_1 - c q_1. \]

However, firm 1 can anticipate that firm 2 will optimally respond with its best response function \(q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1\), as this maximizes the follower's profits. Intuitively, the leader can put itself in the shoes of the follower, expecting the latter to respond with the output level \(q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1\). Inserting this best response function into the leader's PMP yields

\[ \max_{q_1}\ \Big[a - b\big(q_1 + \underbrace{q_2(q_1)}_{\text{BRF}_2}\big)\Big] q_1 - c q_1 = \Big[a - b\Big(q_1 + \underbrace{\frac{a-c}{2b} - \frac{1}{2} q_1}_{q_2(q_1)\ \text{from BRF}_2}\Big)\Big] q_1 - c q_1, \]

or, after simplifying,\(^{15}\)

\[ \max_{q_1}\ \frac{1}{2}(a + c - bq_1)\, q_1 - c q_1. \]

15. Note that the term \(a - b\left(q_1 + \frac{a-c}{2b} - \frac{1}{2} q_1\right)\) simplifies to \(a - b\,\frac{2bq_1 + a - c - bq_1}{2b}\), which further reduces to \(\frac{2a - 2bq_1 - a + c + bq_1}{2}\), ultimately yielding \(\frac{1}{2}(a + c - bq_1)\).

Note that the leader's problem became a function of its output level, \(q_1\), alone. Differentiating with respect to \(q_1\), we obtain

\[ \frac{1}{2}(a - c - 2bq_1) = 0. \]

Further, solving for \(q_1\) yields the profit-maximizing output for the leader, \(q_1^* = \frac{a-c}{2b}\). Hence, if the leader chooses \(q_1^* = \frac{a-c}{2b}\) in equilibrium, we can find the follower's equilibrium output by inserting \(q_1^* = \frac{a-c}{2b}\) into the follower's best response function as follows:

\[ q_2\Big(\underbrace{\frac{a-c}{2b}}_{q_1^*}\Big) = \frac{a-c}{2b} - \frac{1}{2}\,\frac{a-c}{2b} = \frac{2(a-c)}{4b} - \frac{a-c}{4b} = \frac{a-c}{4b}, \]

which is exactly half of the leader's output, \(q_2^* = \frac{1}{2} q_1^*\). However, we describe the subgame perfect equilibrium (SPE) of the game more generally as

\[ q_1^* = \frac{a-c}{2b} \quad \text{and} \quad q_2(q_1) = \frac{a-c}{2b} - \frac{1}{2} q_1, \]

because the follower's best response function allows firm 2 to optimally respond to the leader's output level, both in equilibrium, \(q_1^* = \frac{a-c}{2b}\), and off the equilibrium, \(q_1 \neq \frac{a-c}{2b}\). If, instead, we said that the follower chooses \(q_2^* = \frac{a-c}{4b}\) in the SPE of the game, we would provide no information about how the follower responds if the leader "made a mistake" by deviating from its equilibrium output \(q_1^*\).

Interestingly, the leader produces more in the Stackelberg model than in Cournot, because \(\frac{a-c}{2b} > \frac{a-c}{3b}\), whereas the follower produces less, given that \(\frac{a-c}{4b} < \frac{a-c}{3b}\). The leader anticipates that the follower will reduce its own output after observing a larger output from the leader, and thus increases \(q_1\) to gain larger profits. In this context, equilibrium price is

\[ p^* = a - b\left(\frac{a-c}{2b} + \frac{a-c}{4b}\right) = a - \frac{2(a-c)}{4} - \frac{a-c}{4} = \frac{a+3c}{4}. \]

Equilibrium profits for the leader are

\[ \pi_1^* = \left(\frac{a+3c}{4} - c\right)\frac{a-c}{2b} = \frac{(a-c)^2}{8b}, \]

and for the follower, they are

\[ \pi_2^* = \left(\frac{a+3c}{4} - c\right)\frac{a-c}{4b} = \frac{(a-c)^2}{16b}, \]

that is, exactly half of the leader's profits, \(\pi_2^* = \frac{1}{2}\pi_1^*\).
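The closed-form expressions above can be evaluated with exact arithmetic to see how the Stackelberg outcome compares with Cournot. The following sketch is illustrative only: the function names are hypothetical, the parameter values (a = 12, b = 1, c = 4) are assumed for the demonstration, and the Cournot benchmark uses the individual output \((a-c)/(3b)\) quoted in the comparison above.

```python
from fractions import Fraction


def stackelberg(a, b, c):
    """Stackelberg outcome for p = a - b(q1 + q2) and marginal cost c (integer inputs)."""
    q1 = Fraction(a - c, 2 * b)            # leader's output, (a - c)/(2b)
    q2 = Fraction(a - c, 2 * b) - q1 / 2   # follower's best response evaluated at q1
    price = a - b * (q1 + q2)
    return {"q1": q1, "q2": q2, "price": price,
            "profit1": (price - c) * q1, "profit2": (price - c) * q2}


def cournot(a, b, c):
    """Symmetric Cournot duopoly benchmark: each firm produces (a - c)/(3b)."""
    q = Fraction(a - c, 3 * b)
    price = a - 2 * b * q
    return {"q": q, "price": price, "profit": (price - c) * q}


s = stackelberg(12, 1, 4)
n = cournot(12, 1, 4)
print(s["q1"], s["q2"], s["price"], s["profit1"], s["profit2"])  # 4 2 6 8 4
print(n["q"], n["price"], n["profit"])                           # 8/3 20/3 64/9
```

With these assumed values, the leader's profit (8) exceeds the Cournot profit (64/9, roughly 7.11), which in turn exceeds the follower's profit (4), in line with the output ranking discussed above.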
As an exercise, you can easily check that the leader's profits are higher in Stackelberg than in Cournot, whereas the follower's profits are lower.

Example 14.6: Stackelberg model Consider the same inverse demand function as in example 14.1, \(p(q_1, q_2) = 12 - q_1 - q_2\), and marginal cost \(c = 4\). Inserting the follower's best response function found in example 14.1, \(q_2(q_1) = 4 - \frac{1}{2} q_1\), into the leader's PMP yields

\[ \max_{q_1}\ \Big[12 - \Big(q_1 + \underbrace{4 - \frac{1}{2} q_1}_{q_2(q_1)}\Big)\Big] q_1 - 4 q_1. \]

Simplifying this, we obtain\(^{16}\)

\[ \max_{q_1}\ \frac{1}{2}(16 - q_1)\, q_1 - 4 q_1. \]

Differentiating with respect to \(q_1\), we find \(8 - q_1 - 4 = 0\). Solving for \(q_1\), we find the profit-maximizing output for the leader, \(q_1^* = 4\) units. Inserting this output into the follower's best response function, we obtain \(q_2^* = 4 - \frac{1}{2}(4) = 2\) units. In this scenario, equilibrium price is \(p^* = 12 - 4 - 2 = \$6\), and equilibrium profits become \(\pi_1^* = (6 \times 4) - (4 \times 4) = \$8\) for firm 1 and \(\pi_2^* = (6 \times 2) - (4 \times 2) = \$4\) for firm 2.

16. Note that the term \(\left[12 - \left(q_1 + 4 - \frac{1}{2} q_1\right)\right] q_1\) simplifies to \(\left(12 - q_1 - 4 + \frac{1}{2} q_1\right) q_1\), which further reduces to \(\left(8 - \frac{1}{2} q_1\right) q_1\). Factoring 1/2 out, we can alternatively write this expression as \(\frac{1}{2}(16 - q_1)\, q_1\).

Self-assessment 14.7 Repeat the analysis in example 14.6 but assume that firms face an inverse demand function \(p(q_1, q_2) = 20 - q_1 - q_2\). How are the results in example 14.6 affected?

14.5 Product Differentiation

In previous sections, we considered that firms sell undifferentiated products (i.e., homogeneous goods). While this might occur in some markets, such as specific agricultural products and cereals, most goods are differentiated from those of their rivals, such as Coke and Pepsi in the soda industry, Dell and Lenovo in the computer industry, and iPhone and Samsung Galaxy in the smartphone market. To understand firm competition in these industries, and to predict their output and pricing decisions, we will rely on a similar approach as in the Cournot model of section 14.3.1, but with a twist, because we need to account for differentiation between products.
Demand with product differentiation. Consider a scenario with two firms, A and B, with the following inverse demand functions:

\[ p_A(q_A, q_B) = a - bq_A - dq_B \quad \text{and} \quad p_B(q_A, q_B) = a - bq_B - dq_A, \]

where \(b, d \geq 0\) and \(b \geq d\). We next interpret these demand functions. Because they are symmetric, let us focus on one of the demand functions, such as that for good A. An increase in either \(q_A\) or \(q_B\) reduces the price of good A, \(p_A\), but the effect of firm A's output \(q_A\) is larger than the effect of firm B's output when \(b > d\). Intuitively, the price of a particular brand is more sensitive to changes in its own output than to changes in its rival's output. We refer to this assumption by saying that "own-price effects" dominate "cross-price effects."

Furthermore, note that when \(d = 0\), the inverse demand function for good A collapses to \(p_A(q_A, q_B) = a - bq_A\) (and similarly for the demand of good B), thus indicating that every firm's price is unaffected by its rival's output, as in two separate monopoly markets, one for good A and another for B. In contrast, if parameter \(d\) increases until it coincides with \(b\), \(d = b\), the inverse demand function for good A becomes

\[ p_A(q_A, q_B) = a - bq_A - bq_B = a - b(q_A + q_B), \]

reflecting that price \(p_A\) is symmetrically affected by an increase in either \(q_A\) or \(q_B\), as in the Cournot model with homogeneous goods.

Best responses with product differentiation. As in previous sections, we assume that every firm \(i \in \{A, B\}\) faces a cost function \(TC(q_i) = cq_i\), where \(c > 0\) indicates its marginal cost of production. We are now ready to represent the PMP of firm A as follows:

\[ \max_{q_A}\ \left[a - bq_A - dq_B\right] q_A - c q_A. \]

Differentiating with respect to \(q_A\), we obtain \(a - c - 2bq_A - dq_B = 0\). Rearranging this yields \(a - c - dq_B = 2bq_A\). Solving for \(q_A\), we find firm A's best response function

\[ q_A(q_B) = \frac{a-c}{2b} - \frac{d}{2b}\, q_B. \]

[Figure 14.7 Best response function and product differentiation. Vertical axis: \(q_A\); horizontal axis: \(q_B\). Both lines start at \(\frac{a-c}{2b}\) on the vertical axis. BRF\(_A\) when \(d = b\) has slope \(-\frac{1}{2}\) and reaches the horizontal axis at \(\frac{a-c}{b}\); BRF\(_A\) when \(d < b\) is flatter and reaches it at \(\frac{a-c}{d}\).]

Figure 14.7 depicts this best response function. As with best response functions found throughout the chapter, firm A's optimal output is \(\frac{a-c}{2b}\) when its rival, firm B, produces zero units (\(q_B = 0\)), but it decreases at a rate \(\frac{d}{2b}\) for every unit of output of its rival. If firm B produces more than \(\frac{a-c}{d}\) units, firm A chooses to optimally respond with zero output (\(q_A = 0\)).\(^{17}\)

17. Recall that, in order to obtain the point where the best response function crosses the horizontal axis, we only need to set it equal to zero, \(\frac{a-c}{2b} - \frac{d}{2b} q_B = 0\), and solve for \(q_B\). Rearranging, we find \(\frac{a-c}{2b} = \frac{d}{2b} q_B\), or \(q_B = \frac{a-c}{d}\).

This intuition about parameters \(b\) and \(d\) in the demand function extends to the best response function as well. In particular, if \(d = 0\), the best response function reduces to \(q_A = \frac{a-c}{2b}\), which is independent of \(q_B\), as expected, because firm A's demand is unaffected by firm B's sales, effectively transforming firm A into a monopolist. In contrast, if \(d = b\), the best response function collapses to \(q_A(q_B) = \frac{a-c}{2b} - \frac{1}{2} q_B\), as in the standard Cournot model of homogeneous products. Graphically, when \(d = b\), the best response function has a slope of \(-1/2\), whereas when \(d < b\), this slope, \(-\frac{d}{2b}\), becomes smaller than 1/2 in absolute value, thus producing a pivoting effect on the best response function: it becomes flatter. Intuitively, firms' competition is ameliorated, because every firm \(i\) is induced to reduce its output by a smaller amount, in response to an increase in its rival's output, when products are differentiated (\(b > d\)) than when they are homogeneous (\(d = b\)).

As an exercise, you can find firm B's best response function, which is symmetric to that of firm A; that is, \(q_B(q_A) = \frac{a-c}{2b} - \frac{d}{2b} q_A\). We can then invoke symmetry in equilibrium output, \(q_i^* = q_j^* = q\), which yields

\[ q = \frac{a-c}{2b} - \frac{d}{2b}\, q. \]

Rearranging this, we find \(\frac{(2b+d)q}{2b} = \frac{a-c}{2b}\). Solving for \(q\), we obtain the equilibrium output:

\[ q^* = \frac{a-c}{2b+d}. \]

When products are completely differentiated (\(d = 0\)), this output becomes \(\frac{a-c}{2b}\), as in monopoly, whereas when products are homogeneous (\(d = b\)), this output simplifies to \(\frac{a-c}{2b+b} = \frac{a-c}{3b}\), as in the Cournot model described in section 14.3.1. Equilibrium price is then given by

\[ p_i^* = a - bq_i^* - dq_j^* = a - b\,\underbrace{\frac{a-c}{2b+d}}_{q_i^*} - d\,\underbrace{\frac{a-c}{2b+d}}_{q_j^*} = \frac{ab + c(b+d)}{2b+d}, \]

whereas equilibrium profits for every firm \(i\) are

\[ \pi_i^* = (p^* - c)\, q^* = \left(\frac{ab + c(b+d)}{2b+d} - c\right)\frac{a-c}{2b+d} = \frac{(a-c)^2\, b}{(2b+d)^2}. \]

As suggested previously, when products are completely differentiated (\(d = 0\)), this profit becomes \(\frac{(a-c)^2 b}{(2b+0)^2} = \frac{(a-c)^2}{4b}\), as in monopoly, whereas when products are homogeneous (\(d = b\)), this profit collapses to \(\frac{(a-c)^2 b}{(2b+b)^2} = \frac{(a-c)^2}{9b}\), as in the Cournot model.

Example 14.7: Output competition with product differentiation Consider two firms, A and B, facing the demand curves

\[ p_A(q_A, q_B) = 100 - 5q_A - 2q_B \quad \text{and} \quad p_B(q_A, q_B) = 100 - 5q_B - 2q_A. \]

In this context, parameters are \(a = 100\), \(b = 5\), and \(d = 2\), which indicates that own-price effects are larger than cross-price effects (i.e., \(b > d\)). In addition, assume that both firms have a symmetric marginal cost of \(c = 3\). Inserting these parameters in the previous equilibrium results, we obtain that equilibrium output is \(q^* = \frac{100-3}{(2 \times 5)+2} = \frac{97}{12} \approx 8.08\) units. The equilibrium price is then

\[ p_i^* = \frac{(100 \times 5) + 3(5+2)}{(2 \times 5)+2} = \frac{521}{12} \approx \$43.42, \]

and profits become \(\pi_i^* = \frac{(100-3)^2\, 5}{[(2 \times 5)+2]^2} \approx \$326.7\).

Self-assessment 14.8 Repeat the analysis in example 14.7, but assume that firms experience a higher marginal production cost of \(c = 5\) (rather than \(c = 3\)). How are the results in example 14.7 affected?

Exercise 22 at the end of the chapter examines how our results are affected when firms, still selling differentiated products, compete on prices (à la Bertrand) rather than on quantities. For a more detailed presentation of imperfectly competitive markets, see Cabral (2017).

Appendix.
Cournot Model with N Firms

How do the results in the Cournot model of simultaneous quantity competition change if, rather than \(N = 2\) firms, we consider more firms? To answer this question, let us first write the inverse demand function in this scenario, \(p(Q) = a - bQ\), where \(Q\) denotes the aggregate output by all firms. Alternatively, we can express \(Q = q_i + Q_{-i}\), which decomposes aggregate output into two components: \(q_i\), the output that firm \(i\) produces, and \(Q_{-i}\), which represents the production of all firms other than firm \(i\); that is,

\[ Q_{-i} = q_1 + q_2 + \dots + q_{i-1} + q_{i+1} + \dots + q_N, \]

where you can see that term \(q_i\) is not included in the sum. For instance, if there are \(N = 4\) firms in the market, and firm \(i\) is the second firm, then \(i = 2\) and \(Q_{-2} = q_1 + q_3 + q_4\). (Note that \(Q_{-i}\) sums across the output of \(N - 1\) firms, because firm \(i\) is not included in the sum.) The representation of aggregate output as \(Q = q_i + Q_{-i}\) allows us to rewrite the inverse demand function as

\[ p(q_i, Q_{-i}) = a - b\underbrace{(q_i + Q_{-i})}_{Q}. \]

If all \(N\) firms face the same marginal cost \(c\), where \(c\) satisfies \(a > c > 0\), every firm \(i\) solves the following PMP:

\[ \max_{q_i}\ \left[a - b(q_i + Q_{-i})\right] q_i - c q_i. \tag{PMP\(_i\)} \]

Differentiating with respect to firm \(i\)'s output \(q_i\), we obtain \(a - 2bq_i - bQ_{-i} - c = 0\). Rearranging this, we find \(a - c - bQ_{-i} = 2bq_i\). Solving for \(q_i\) yields firm \(i\)'s best response function:

\[ q_i(Q_{-i}) = \frac{a-c}{2b} - \frac{1}{2} Q_{-i}. \tag{BRF\(_i\)} \]

Firm \(i\)'s best response function informs us about this firm's profit-maximizing output \(q_i\) as a function of the sum of its rivals' output, \(Q_{-i}\). The best response function is analogous to the one that we previously found for the Cournot model with only two firms: it originates at \(\frac{a-c}{2b}\) and decreases in \(Q_{-i}\) at a rate of 1/2. In addition, this function captures the Cournot model with two firms as a special case. Indeed, if we consider only two firms \(i\) and \(j\), then firm \(i\) has a single rival (firm \(j\)), and thus the total output of firm \(i\)'s rivals is \(Q_{-i} = q_j\). In that scenario, the BRF\(_i\) collapses to that given in section 14.3.1.

Let us now continue with the case of \(N \geq 2\) firms. Because all firms are symmetric, they all solve a problem similar to PMP\(_i\), obtaining best response functions like that in BRF\(_i\), a result that holds for every firm \(i\). We can hence invoke symmetry in equilibrium output, which means that \(q_1 = q_2 = \dots = q_N = q\); that is, every firm produces the same amount in equilibrium. (We just dropped the subscripts in the individual output levels, which greatly simplifies our next calculations!) Therefore, aggregate output is \(Q = Nq\), and the sum of firm \(i\)'s rivals' output is \(Q_{-i} = (N-1)q\). Inserting this result in the BRF\(_i\), we find

\[ q = \frac{a-c}{2b} - \frac{1}{2}\underbrace{(N-1)q}_{Q_{-i}}, \]

which depends on output \(q\) alone; all other elements are parameters (and treated as given by the firm). Rearranging this yields

\[ \frac{2q + (N-1)q}{2} = \frac{a-c}{2b}, \]

which further simplifies to \(q\,[2 + (N-1)] = \frac{a-c}{b}\). Solving for \(q\), we obtain the equilibrium output in a Cournot model with \(N \geq 2\) firms:

\[ q^* = \frac{1}{N+1}\,\frac{a-c}{b}. \]

This individual output level decreases with the number of firms operating in the market, \(N\). Intuitively, as more firms compete, the individual production of each firm decreases.\(^{18}\)

18. To confirm this result, differentiate equilibrium output \(q^*\) with respect to \(N\), finding \(\frac{\partial q^*}{\partial N} = -\frac{a-c}{(N+1)^2\, b}\), which is negative, given that \(a > c\) by assumption, implying that \(q^*\) decreases with the number of firms, \(N\). As a numerical example, consider \(a = 100\), \(b = 1\), and \(c = 10\). Then, this individual output simplifies to \(q^* = \frac{1}{N+1}\,\frac{100-10}{1} = \frac{90}{N+1}\), which is \(\frac{90}{2+1} = 30\) units in the case of \(N = 2\) firms, \(\frac{90}{3+1} = 22.5\) in the case of \(N = 3\) firms, \(\frac{90}{4+1} = 18\) in the case of \(N = 4\) firms, and so on for a larger number of firms.

The aggregate output in this scenario becomes

\[ Q^* = Nq^* = \frac{N}{N+1}\,\frac{a-c}{b}, \]
which increases as more firms enter the industry.\(^{19}\) The equilibrium price is

\[ p(Q^*) = a - bQ^* = a - b\,\underbrace{\frac{N}{N+1}\,\frac{a-c}{b}}_{Q^*} = \frac{a + Nc}{N+1}, \]

which is decreasing with the number of firms, \(N\).\(^{20}\)

19. To see this point, differentiate \(Q^*\) with respect to the number of firms \(N\), to obtain \(\frac{\partial Q^*}{\partial N} = -\frac{(a-c)N}{b(N+1)^2} + \frac{a-c}{b(N+1)}\), which simplifies to \(\frac{a-c}{b(N+1)^2}\). This expression is positive because \(a > c\) by assumption, implying that aggregate output, \(Q^*\), increases with the number of firms, \(N\).

20. Differentiating the equilibrium price \(p(Q^*)\) with respect to the number of firms \(N\), we find \(\frac{\partial p(Q^*)}{\partial N} = \frac{c}{N+1} - \frac{a + Nc}{(N+1)^2}\), which collapses to \(-\frac{a-c}{(N+1)^2}\). Because \(a > c\) by assumption, the derivative is unambiguously negative, implying that the equilibrium price decreases with the number of firms, \(N\).

Interestingly, the results in this model encompass the results in previous chapters as special cases. To see this, let us start by considering an industry with only one firm (a monopoly), entailing \(N = 1\).

Monopoly. If we insert \(N = 1\) into our equilibrium output \(q^*\), we obtain

\[ q^* = \frac{1}{1+1}\,\frac{a-c}{b} = \frac{a-c}{2b}. \]

Aggregate output is, of course, \(Q^* = Nq^* = \frac{a-c}{2b}\), and equilibrium price becomes \(p^* = \frac{a+c}{1+1} = \frac{a+c}{2}\). Needless to say, these three results coincide with the profit-maximizing output and price found in monopoly (see chapter 10).

Duopoly. Let us now consider an oligopoly with two firms (i.e., a duopoly). Inserting \(N = 2\) into the previous results, we obtain that individual output is

\[ q^* = \frac{1}{2+1}\,\frac{a-c}{b} = \frac{a-c}{3b}, \]

aggregate output is \(Q^* = Nq^* = 2q^* = 2\,\frac{a-c}{3b}\), and equilibrium price becomes \(p^* = \frac{a+2c}{2+1} = \frac{a+2c}{3}\), which also coincides with the profit-maximizing output and price found in duopoly (see section 14.3.1 of this chapter).

Perfect competition. Lastly, consider an industry with a large number of firms, \(N \to +\infty\), as in perfectly competitive markets where each firm represents a negligible share of the industry. We start by finding the limit of the individual output found previously, when \(N \to +\infty\),

\[ \lim_{N \to +\infty} q^* = \lim_{N \to +\infty} \frac{1}{N+1}\,\frac{a-c}{b} = 0; \]

and aggregate output is given by

\[ \lim_{N \to +\infty} Q^* = \lim_{N \to +\infty} \frac{N}{N+1}\,\frac{a-c}{b} = \frac{a-c}{b}, \]

whereas equilibrium price becomes

\[ \lim_{N \to +\infty} p^* = \lim_{N \to +\infty} \frac{a + Nc}{N+1} = c. \]

As suspected, this coincides with the profit-maximizing output and price obtained in perfectly competitive markets (see chapter 9).

Self-assessment 14.9 Repeat the analysis in this subsection, but assume that firms face an inverse demand function \(p(Q) = 10 - 2Q\), and all \(N\) firms have a marginal cost of \(c = 3\). Evaluate your results under monopoly, duopoly, and perfect competition.

Exercises

1. Herfindahl-Hirschman index.\(^A\) Calculate the HHI in the following markets, where three firms operate under different levels of market share:
(a) Each firm has an equal share of the market (i.e., 33.3 percent).
(b) One firm captures 50 percent of the market, while the other two each have 25 percent.
(c) One firm captures 80 percent of the market, while the other two each have 10 percent.
(d) Two firms each have 45 percent of the market, while the other firm has 10 percent.
(e) How do these different market shares (in parts a–d) affect the HHI?

2. Cournot competition between two breweries.\(^B\) Two breweries across the street from each other sell slightly differentiated beers. Clay's Brews (subscript C) has demand \(p_C = 10 - q_C - 0.5q_J\) and total cost \(TC_C(q_C) = 10 + q_C\), while John's Barley Sodas (subscript J) has demand \(p_J = 14 - q_J - 0.5q_C\) and total cost \(TC_J(q_J) = 12 + 1.1q_J\).
(a) Find Clay's and John's best response functions.
(b) How much beer will Clay and John sell, and what price will each one set?

3. Properties of the best response function.\(^B\) Consider the best response function of firm 1 in the Cournot model of quantity competition:

\[ q_1(q_2) = \frac{a-c}{2b} - \frac{1}{2} q_2. \]

Let us do some comparative statics, to understand how this expression changes as we increase one parameter at a time.
(a) How is the best response function \(q_1(q_2)\) affected by a marginal increase in the vertical intercept of the inverse demand function, \(a\)? Interpret.
(b) How is the best response function \(q_1(q_2)\) affected by a marginal increase in the slope of the inverse demand function, \(b\)? Interpret.
(c) How is the best response function \(q_1(q_2)\) affected by a marginal increase in the firm's marginal production cost, \(c\)? Interpret.

4. Properties of the Cournot equilibrium.\(^B\) Consider the equilibrium in the Cournot model of quantity competition, where every firm produces \(q^* = \frac{a-c}{3b}\). Let us do some comparative statics in order to understand how this expression changes as we increase one parameter at a time.
(a) How is the equilibrium output \(q^*\) affected by a marginal increase in the vertical intercept of the inverse demand function, \(a\)? Interpret.
(b) How is the equilibrium output \(q^*\) affected by a marginal increase in the slope of the inverse demand function, \(b\)? Interpret.
(c) How is the equilibrium output \(q^*\) affected by a marginal increase in the firm's marginal production cost, \(c\)? Interpret.

5. Symmetric Cournot.\(^A\) Two medical supply companies, Hearts Beat (H) and Lungs Breathe (L), are the only firms that supply stethoscopes to medical professionals, and they compete à la Cournot. Inverse market demand is \(p = 50 - 2(q_H + q_L)\), and each firm has the same total cost of producing stethoscopes, \(TC(q_i) = 5q_i\).
(a) Write the PMP for Hearts Beat and Lungs Breathe.
(b) Find each firm's best response function.
(c) Find the equilibrium quantity that each firm will produce, and the market price.

6. Cournot with asymmetric marginal costs.\(^B\) Consider the Cournot duopoly in section 14.3.1. Assume that firm 1 faces marginal cost \(c_1\), while firm 2's is \(c_2\), where \(c_1 < c_2\) (so firm 1 enjoys a cost advantage relative to firm 2) and \(a > c_2\).
(a) Find the best response function of firm 1 and of firm 2. Compare them.
(b) Insert firm 2's best response function into that of firm 1, to find the output that each firm produces in the NE of the Cournot game of quantity competition. Which firm produces a larger output?
(c) Find equilibrium price and equilibrium profits for each firm. Which firm earns a larger profit?
(d) Assume that both firms now become cost symmetric, so that \(c_1 = c_2 = c\). Evaluate your results from parts (b) and (c) at \(c_1 = c_2 = c\), showing that you obtain the same results as in section 14.3.1.

7. Cournot with asymmetric fixed costs.\(^B\) Consider the Cournot duopoly in section 14.3.1. Assume that firm 1 faces a TC function \(TC_1(q_1) = F_1 + cq_1\), where \(F_1 > 0\) denotes its fixed cost and \(c > 0\) represents its marginal cost. Firm 2's TC function is \(TC_2(q_2) = F_2 + cq_2\), where \(F_2 > 0\) denotes its fixed cost and satisfies \(F_2 > F_1\), and \(c > 0\) is the same marginal cost as firm 1. Consider that firms still face a linear inverse demand function \(p(Q) = a - bQ\), where parameter \(a\) satisfies \(a > c\) and \(b > 0\). The scenario is therefore analogous to the Cournot duopoly of section 14.3.1, except for the fact that both firms now face fixed costs of production.
(a) Find the best response function of each firm, as well as the equilibrium output.
(b) How are the equilibrium results affected? Interpret.

8. Cournot with asymmetric marginal costs.\(^A\) Two firms, Melissa's Meals (M) and Stephanie's Sustenance (S), compete à la Cournot over the service of meal delivery via bicycle. Melissa has extensive knowledge of bike maintenance and keeps her bike fleet in tip-top shape so that her total costs are \(TC_M(q_M) = 1 + q_M\), while Stephanie is not as good at maintenance and upkeep, so her total costs are higher, \(TC_S(q_S) = 2q_S\). Inverse market demand is \(p = 12 - 0.3(q_M + q_S)\).
(a) Write down the PMP for Melissa and Stephanie.
(b) Find each firm's best response function.
(c) Find the equilibrium quantity that each firm will produce, as well as the market price.
(d) Will each firm produce an identical amount? Why or why not?
(e) Find equilibrium profits for each firm and compare them.

9. Cournot with three firms.\(^C\) Consider a market with three firms producing a homogeneous good and facing a linear demand function \(p(Q) = 1 - Q\), where \(Q \equiv q_1 + q_2 + q_3\) denotes aggregate output. All firms face a constant marginal cost of production given by \(c\), where \(1 > c > 0\).
(a) Set up firm 1's PMP, differentiate with respect to its output \(q_1\), and obtain this firm's best response function. [Hint: It should be a function of firm 2's and 3's output, \(q_2\) and \(q_3\).]
(b) Repeat the process for firms 2 and 3, to obtain their best response functions. [Hint: You should find that all firms have symmetric best response functions.]
(c) Interpret firm 1's best response function: if firm 2 were to marginally increase its output, does firm 1 increase or decrease its own output? Either way, by how much?
(d) Using the three best response functions for these firms, find the point where they cross. The triplet \((q_1^*, q_2^*, q_3^*)\) characterizes the NE of this Cournot game.
(e) Is the equilibrium output that you found in part (d) increasing or decreasing in marginal cost \(c\)?
(f) Find the price that emerges in equilibrium, along with the profits that every firm earns.

10. Investigating the Bertrand equilibrium.\(^A\) Consider the Bertrand model of simultaneous price competition in section 14.3.2.
(a) Assume that firms' common marginal cost \(c\) increases to \(c'\), where \(c' > c\). How are the results in that section affected?
(b) Our presentation assumed two firms competing in prices. Repeat the analysis in that section, assuming \(N > 2\) firms. How are the main findings in that section affected by the number of firms?

11. Finitely repeated Grim Trigger Strategy.\(^B\) Let us repeat example 14.5, but without considering an infinitely repeated game.
(a) Assume that firms interact for \(T = 2\) periods. Can the GTS in example 14.5 be sustained as an SPE of the game?
(b) Assume that firms interact for \(T > 2\) periods. Can the GTS in example 14.5 be sustained as an SPE of the game?

12. Grim Trigger Strategy and Bertrand.\(^B\) Consider our analysis of collusive behavior between two firms that competed on the basis of quantities (section 14.3.3). Assume now that firms compete on the basis of prices (à la Bertrand). For which discount factor \(\delta\) can collusion be sustained? Compare your results with those in section 14.3.3.

13. Collusion with delayed detection.\(^B\) Consider self-assessment 14.6, allowing firms to deviate from the collusive outcome without being detected during three periods. This means that cheating is still detected with certainty, but with a lag of three periods rather than immediately. If a firm can deviate and earn a profit of $9 during three periods before the punishment starts, what is the minimal discount factor supporting cooperation? Compare your results with those under immediate detection.

14. Collusion in a one-shot game.\(^A\) Let us now repeat example 14.5, but in a one-shot version, where every firm chooses to produce either the cartel output, \(q^{Cartel}\), or the Cournot competition output, \(q^{Cournot}\).
(a) Find each firm's profit when both choose \(q^{Cartel}\), when both choose \(q^{Cournot}\), and when only one chooses \(q^{Cartel}\).
(b) Show that every firm finds \(q^{Cournot}\) a strictly dominant strategy, so cooperation cannot be supported if the game is unrepeated.
(c) Let us now generalize these findings by allowing firms to choose any output level, rather than restricting them to select either \(q^{Cartel}\) or \(q^{Cournot}\). Show that, if firm \(j\) chooses \(q^{Cartel}\), firm \(i\)'s best response is not to choose \(q^{Cartel}\), making the cartel output unsustainable in the unrepeated (one-shot) version of the game.

15. Colluding barbecue.\(^B\) Mike (M) and Jeff (J) each own a barbecue place in a Southern town.
Market demand for barbecue is \(p = 15 - 0.75Q\), where \(Q = q_M + q_J\). Mike's costs are \(TC(q_M) = 10 + 1.5q_M\), and Jeff's are \(TC(q_J) = 5 + 3q_J\).
(a) Assuming that Mike and Jeff compete in quantities (à la Cournot), find their best response functions.
(b) Find the equilibrium price and quantity for Mike and Jeff.
(c) If Mike and Jeff form a cartel, how much barbecue product will they sell, and at what price?

16. Colluding gas stations.\(^C\) Two gasoline stations are situated across the street from each other and are in fierce competition. They face market demand of \(p = 10 - 0.05Q\), where \(Q = q_1 + q_2\) denotes aggregate output, and each has total cost \(TC(q_i) = 10 + 0.5q_i\), where \(i \in \{1, 2\}\) denotes the firm.
(a) If firms compete on the basis of quantities, find each firm's best response function.
(b) Find equilibrium output for each firm, price, and profits.
(c) If the firms collude, what equilibrium price and quantity will each firm offer? What will their profits be?
(d) Suppose that the firms play an infinitely repeated game, and they seek to coordinate their production decisions through the GTS considered in example 14.5. What discount factor supports continued collusion?

17. Properties of the Stackelberg equilibrium.\(^A\) Consider the equilibrium output in the Stackelberg game discussed earlier in the chapter, \(q_1^* = \frac{a-c}{2b}\) for the leader and \(q_2^* = \frac{a-c}{4b}\) for the follower. Let us do some comparative statics in order to understand how these expressions change as we increase one parameter at a time.
(a) How are equilibrium outputs \(q_1^*\) and \(q_2^*\) affected by a marginal increase in the vertical intercept of the inverse demand function, \(a\)? Interpret.
(b) How are equilibrium outputs \(q_1^*\) and \(q_2^*\) affected by a marginal increase in the slope of the inverse demand function, \(b\)? Interpret.
(c) How are equilibrium outputs \(q_1^*\) and \(q_2^*\) affected by a marginal increase in the firm's marginal production cost, \(c\)? Interpret.

18. Stackelberg with two and three firms.\(^C\) Consider a market where two firms produce a homogeneous good, and face a linear demand function \(p(Q) = 1 - Q\), where \(Q \equiv q_1 + q_2\) denotes aggregate output. All firms face a constant marginal cost of production given by \(c\), where \(1 > c > 0\). Firm 1 is the industry leader, choosing its output \(q_1\) in the first stage; firm 2 is the follower, who observes the choice of \(q_1\) from the leader and responds with its own output level \(q_2\) in the second stage of the game.
(a) Find the follower's best response function. Interpret.
(b) Set up the leader's PMP. [Hint: You will need to insert the follower's best response function in the leader's profits.]
(c) Find the leader's optimal output \(q_1^*\). Which output does the follower respond with?
(d) Allowing for three firms. Assume now that a third firm enters the industry. The time structure remains unaffected: first, firm 1 chooses output \(q_1\); observing \(q_1\), firm 2 responds with its output \(q_2\); and, observing both \(q_1\) and \(q_2\), firm 3 responds by choosing its output \(q_3\). Follow the same process as in the two-firm version of the game to find the output levels that firms 1–3 choose in the equilibrium of the Stackelberg game.

19. Cournot versus Stackelberg.\(^B\) Consider two neighboring wineries in fierce competition over the production of their specialty wine (where their grapes come from the same vineyard, so we assume that their wines are regarded as identical by customers). One winery is owned by Jill (J), and the other by Ray (R). Each winery produces its wine the same way and has a symmetric TC function \(TC_i(q_i) = 3 + 0.5q_i\). Inverse market demand for wine is \(p = 50 - 2(q_J + q_R)\).
(a) Cournot competition. Write down the PMP for each firm if they compete on the basis of quantities.
(b) If the firms compete à la Cournot, what is each winery's equilibrium output and price?
(c) Stackelberg competition.
If Jill were able to get her wine to market first (and become a Stackelberg leader), how would each winery's output and price change?

20. Comparing monopoly and Stackelberg.\(^B\) Patents give pharmaceutical companies monopoly rights over new drugs. After the patent expires, generic versions of these drugs hit the market. Consider such a market, where demand for a new drug is \(p = 500 - 5q\) and the company that created it (i.e., leader) has total cost of \(TC = 25 - 2q + 0.5q^2\).
(a) If the leader has monopoly rights for the product, what will the equilibrium price, quantity, and profit be for this drug?
(b) After the monopoly rights end and a generic version of the drug is released, what will happen to the market equilibrium? (For simplicity, assume that only one other competitor releases the generic drug, has the same costs as the leader, and acts as a follower.)
(c) Compare the equilibrium price, quantity, and profit for the leader in the two market settings.

21. Product differentiation.\(^A\) Two companies sell cell phone cases and compete over quantity. Each firm has a slightly different case, but the two companies, 1 and 2, face symmetric demand as follows:

\[ p_i = 25 - 2q_i - q_j, \]

where \(i \in \{1, 2\}\) and \(j \neq i\). This inverse demand indicates that every firm \(i\) is more significantly affected by its own sales, \(q_i\), than by its rival's sales, \(q_j\). Each firm has total cost \(TC_i = q_i + 0.5q_i^2\). Assuming that the firms compete over quantity, find the equilibrium output, price, and profit.

22. Bertrand and product differentiation.\(^B\) Consider a similar scenario to that in section 14.5, where two firms, A and B, offer a differentiated good but now compete over prices. The firms' demand functions are

\[ q_A(p_A, p_B) = 1 - p_A + \frac{1}{2} p_B \quad \text{and} \quad q_B(p_A, p_B) = 1 - p_B + \frac{1}{2} p_A. \]

Intuitively, every firm's sales are more sensitive to its own price than to its rival's price. For compactness, we refer to this property by saying that own-price effects dominate cross-price effects.
Assume that each firm faces the same constant marginal cost, \(c = \frac{1}{8}\).
(a) Find each firm's pricing best response function. Interpret its slope.
(b) What are each firm's equilibrium price and output? Do the firms practice marginal cost pricing?

23. Stackelberg prices with homogeneous goods.\(^A\) Consider the Bertrand model in section 14.3.2, except now firm 1 can set prices first, while firm 2 is the follower. Show that in this Stackelberg version of the Bertrand game, the equilibrium set of prices is \((p_1, p_2) = (c, c)\).

24. Stackelberg prices with heterogeneous goods.\(^B\) Consider a similar scenario to that in section 14.5, where two firms, A and B, offer a differentiated good but now compete over prices. The firms' demand functions are

\[ q_A(p_A, p_B) = 1 - p_A + \frac{1}{2} p_B \quad \text{and} \quad q_B(p_A, p_B) = 1 - p_B + \frac{1}{2} p_A. \]

As in previous exercises with heterogeneous goods, these demand functions indicate that every firm's sales are more sensitive to its own price than to its rival's price. Stated more compactly, own-price effects dominate cross-price effects. Assume that each firm faces the same constant marginal cost, \(c = \frac{1}{8}\).
(a) Second stage (follower). If firm A is a Stackelberg leader and can set prices first, what is firm B's best response function?
(b) First stage (leader). Set up firm A's PMP and solve for the price that it offers.
(c) What price will firm B offer?

25. Reconciling Bertrand and Cournot through capacity.\(^C\) A common criticism of the Bertrand model of price competition is that firms face no capacity constraints. In particular, if firm 1 sets the lowest price in the market, it attracts all customers and can serve them regardless of how large demand is. In this exercise, we add a previous stage to the standard Bertrand model of price competition where firms choose a capacity level. Consider a market with two firms. In the first stage, each firm \(i\) chooses a production capacity \(\bar{q}_i\) at a cost of \(c = \frac{1}{4}\) per unit of capacity, where \(0 \leq \bar{q}_i \leq 1\).
In the second stage, the firms observe each other's capacity and respond by competing over prices. Once capacity $\bar{q}_i$ is decided, the firms can produce up to that capacity with zero marginal cost. Each firm faces a demand of $p = 1 - Q$ and chooses prices simultaneously in the second stage, and sales are distributed as in the Bertrand model of price competition.
(a) Second stage. Begin in the second stage. Show that both firms set a common price $p_1 = p_2 = p^* = 1 - \bar{q}_1 - \bar{q}_2$ in the second stage.
(b) First stage. In the first stage, every firm i simultaneously and independently chooses its capacity $\bar{q}_i$. How much capacity does each firm invest in?
(c) How do your results compare to the standard Cournot model, with two firms competing on the basis of quantities, facing the inverse demand function $p(Q) = 1 - Q$, and marginal cost $c = \frac{1}{4}$?

26. Collusion in a Cournot model with N firms.$^{C}$ Consider a market with N firms producing a homogeneous good and facing a linear demand function $p(Q) = 1 - Q$, where $Q = q_1 + \dots + q_N$ denotes aggregate output. All firms face a constant marginal cost of production given by c, where $1 > c > 0$.
(a) If all N firms compete on the basis of quantities, what is firm i's equilibrium output and profit?
(b) What is the equilibrium output and profit for each firm i if N firms were to collude? What is the discounted stream of profit from colluding if firms collude in an infinitely repeated game?
(c) Consider a GTS where every firm starts cooperating in the first period (producing the cartel output) and keeps doing so if all firms cooperated in past periods. Otherwise, every firm produces the Cournot output level thereafter. Assuming a previous history of cooperation, what is firm i's discounted stream of profits from cooperating? What is its discounted stream of profits if it deviates?
(d) What is the discounted stream of profits for a deviating firm if all firms play the GTS described in part (c)?
(e) Assuming firms collude and play this GTS, what is the lowest common discount rate $\delta$ that sustains collusion?
(f) How does the discount rate found in part (e) change as the number of firms N increases? Interpret your results.

15 Games of Incomplete Information and Auctions

15.1 Introduction

Previous chapters of this book considered economic situations in which agents (individuals, firms, or countries) strategically choose their actions simultaneously or sequentially. We learned how to predict equilibrium behavior with two simple, yet powerful, tools: the Nash equilibrium (NE) solution concept, with the help of best responses; and the subgame perfect equilibrium (SPE) concept by applying backward induction. While we studied different types of games, we always assumed that players were perfectly informed about each other's characteristics. This meant that every player could perfectly predict her opponent's payoff in every contingency. In other words, we only considered games of complete information.

However, many strategic settings in real life involve elements of incomplete information, such as the following:
• Firms can observe their own production costs, but they do not perfectly observe their rivals' costs. In this context, firms may have estimates about rivals' costs, but do not accurately observe them.
• An incumbent firm, with decades of experience in an industry, may have reliable information about market demand, while a new entrant in the industry has limited information about demand.
• Bidders in an auction know how much they are willing to pay for the object being sold (e.g., a painting), but usually cannot observe other bidders' private valuations for the object.
In these scenarios, players need to choose their best strategy by comparing payoffs but, given their limited information, this payoff comparison may be in expectation: finding the expected payoff that the player receives from each of her strategies, and then choosing the strategy that yields the highest expected payoff. As we examine in this chapter, this approach to selecting optimal strategies is analogous to the NE solution concept explored in chapter 12, but now it is extended to games of incomplete information (i.e., settings where players do not observe all information about their opponents).

We start the chapter by defining this solution concept, and then applying it to the Cournot model of quantity competition, where we now assume that firms do not observe each other's costs. The rest of the chapter is devoted to the application of this solution concept to auctions, seeking to predict the optimal bidding strategy that bidders use in an auction. We also look at various auction formats, such as first-price auctions (FPAs), second-price auctions (SPAs), and common-value auctions.

15.2 Extending Nash Equilibria to Games of Incomplete Information

Before we extend the NE solution concept to incomplete information games, let us clarify a couple of points about notation. First, a player's "type" will be used to represent her private information. In the example of two firms privately observing their costs, every firm i's type is its production cost (e.g., high $c_H$ or low $c_L$, where $c_H > c_L \geq 0$). Similarly, in the auction example, a bidder's type denotes her valuation for the object being sold, $v > 0$ (i.e., a positive dollar amount). Second, we will express the strategies of player i as a function of her type.
Continuing with the previous example about two privately informed firms, a production strategy specifies how many units firm i produces as a function of its cost, a number that is potentially lower when the firm experiences a high cost $c_H$ than a low cost $c_L$. In the auction setting, a bidding strategy specifies how much player i bids as a function of her valuation of the object, v, which we write as $b_i(v)$.

We are now ready to extend the NE solution concept to incomplete information games. First, we need to extend the notion of a player's best response to allow for incomplete information, as we do next.

Best response. Player i regards strategy $s_i$ as a "best response" to her rival's strategy $s_j$ if $s_i$ yields a weakly higher expected payoff than any other available strategy $s_i' \neq s_i$ against $s_j$.

This definition is identical to that of best responses in complete information games (chapter 12), except for the fact that we are now considering expected payoffs, rather than payoffs that occur with certainty. To understand this definition, consider again the example of the two firms discussed previously. Firm i observes its own production cost, such as $c_H$, but does not observe that of its rival. We then say that a production strategy $q_i(c_H)$ is its best response to its rival j's output level if $q_i(c_H)$ yields a higher expected profit than any other output level different from $q_i(c_H)$.¹

1. This assumes, of course, that firm i's type (its cost $c_H$) is given because that is something that the firm cannot change.

The definition given here says something more: firm i must have an optimal production strategy for each of its possible types (e.g., costs); that is, a profit-maximizing output when its costs are high, $q_i(c_H)$, and one when its costs are low, $q_i(c_L)$. As a result, a best response in this context can be understood as a list specifying this player's optimal strategy for each of her privately observed types.
Using this extended version of best response, we can define a Bayesian Nash equilibrium (BNE) as follows:

Bayesian Nash Equilibrium (BNE). A strategy profile $(s_i^*, s_j^*)$ is a Bayesian Nash equilibrium if every player chooses, for each of her types, a best response (evaluated in expectation) given her rivals' strategies.

Like the definition of best response, the BNE definition is analogous to that of NE, except for the fact that players choose best responses by comparing expected payoffs rather than certain payoffs. Intuitively, players select mutual best responses to each other's strategies, where best responses are now "lists," as discussed previously, specifying which strategy a player chooses for each of her possible types.

We understand that these definitions, while maintaining a common theme with those presented in chapter 12 for complete information games, can look a bit intimidating. Without further ado, we apply this definition to the two-firms example, which should illustrate how to approach strategic scenarios where players interact under incomplete information.

Example 15.1: Cournot competition, with asymmetric information about costs

Consider a duopoly game where two firms compete on the basis of quantities and face the inverse demand function $p = 1 - q_1 - q_2$. Assume that firm 1 is an incumbent that operated in the industry for decades, with marginal cost $MC_1 = 0$, which every firm can accurately estimate. Firm 2 privately observes its marginal costs, which can be low, $MC_2 = 0$, or high, $MC_2 = 1/4$. Because firm 2 is a newcomer (e.g., a company from a different industry or from a foreign country), firm 1 cannot accurately observe firm 2's costs, but after some research (e.g., hiring a consulting company), it assigns equal probability to firm 2 having low and high costs. We now seek to find the BNE of this duopoly game, specifying how much every firm produces. We can start by focusing on the informed player (firm 2).

Firm 2's best response.
When firm 2 has low costs ($MC_2 = 0$), it chooses its production level $q_2^L$ (where superscript L indicates that the firm has low costs) to maximize its profits as follows:
$$\max_{q_2^L \geq 0} \; \pi_2^L = (1 - q_1 - q_2^L)\, q_2^L.$$
Differentiating with respect to $q_2^L$ yields $1 - q_1 - 2q_2^L = 0$, and solving for $q_2^L$, we obtain firm 2's best response function when experiencing low costs:
$$q_2^L(q_1) = \frac{1}{2} - \frac{1}{2} q_1. \qquad (BRF_2^L(q_1))$$
On the other hand, when firm 2 has high costs ($MC_2 = \frac{1}{4}$), its profit maximization problem (PMP) is
$$\max_{q_2^H \geq 0} \; \pi_2^H = (1 - q_1 - q_2^H)\, q_2^H - \frac{1}{4} q_2^H.$$
Differentiating with respect to $q_2^H$ yields
$$1 - q_1 - 2q_2^H - \frac{1}{4} = 0,$$
and solving for $q_2^H$, we find firm 2's best response function when experiencing high costs:
$$q_2^H(q_1) = \frac{3}{8} - \frac{1}{2} q_1. \qquad (BRF_2^H(q_1))$$
Comparing the best response function under low and high costs, we can see that for a given output level of firm 1, $q_1$, firm 2 responds by producing more units when its own costs are low than when they are high because $q_2^L(q_1) > q_2^H(q_1)$ for every value of $q_1$.² Graphically, $q_2^L(q_1)$ and $q_2^H(q_1)$ are parallel to each other, but $q_2^L(q_1)$ originates at $\frac{1}{2}$, while $q_2^H(q_1)$ originates at a lower height, $\frac{3}{8} = 0.375$.

2. As an example, consider that $q_1 = \frac{1}{2}$ units. Then firm 2 produces $q_2^L(\frac{1}{2}) = \frac{1}{2} - \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$ units when its costs are low, but only $q_2^H(\frac{1}{2}) = \frac{3}{8} - \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}$ units when its costs are high.

Firm 1. Let us now analyze firm 1 (the uninformed player in this game). This firm seeks to maximize its expected profits because it does not observe whether firm 2 has low or high costs. Then firm 1 solves the following problem:
$$\max_{q_1 \geq 0} \; \pi_1 = \underbrace{\frac{1}{2}(1 - q_1 - q_2^L)\, q_1}_{\text{if firm 2 has low costs}} + \underbrace{\frac{1}{2}(1 - q_1 - q_2^H)\, q_1}_{\text{if firm 2 has high costs}} = \left(1 - q_1 - \frac{q_2^L}{2} - \frac{q_2^H}{2}\right) q_1.$$
Differentiating with respect to $q_1$ yields
$$1 - 2q_1 - \frac{q_2^L}{2} - \frac{q_2^H}{2} = 0.$$
Solving for $q_1$, we obtain firm 1's best response function:
$$q_1(q_2^L, q_2^H) = \frac{1}{2} - \frac{1}{4} q_2^L - \frac{1}{4} q_2^H, \qquad (BRF_1(q_2^L, q_2^H))$$
which is a function of both firm 2's output when having low costs, $q_2^L$, and when having high costs, $q_2^H$.

We then found three best response functions, which we can solve to obtain the three unknown output levels $q_1$, $q_2^L$, and $q_2^H$. Inserting the best response functions for firm 2, $q_2^L(q_1)$ and $q_2^H(q_1)$, into the expression of firm 1's best response function, $q_1(q_2^L, q_2^H)$, yields
$$q_1 = \frac{1}{2} - \frac{1}{4}\underbrace{\left(\frac{1}{2} - \frac{1}{2} q_1\right)}_{q_2^L(q_1)} - \frac{1}{4}\underbrace{\left(\frac{3}{8} - \frac{1}{2} q_1\right)}_{q_2^H(q_1)},$$
which simplifies to $q_1 = \frac{9 + 8q_1}{32}$. Solving for output $q_1$, we obtain $q_1 = \frac{3}{8}$. We can now plug this result into firm 2's best response function, first when having low costs,
$$q_2^L\left(\frac{3}{8}\right) = \frac{1}{2} - \frac{1}{2} \cdot \frac{3}{8} = \frac{5}{16},$$
and then when having high costs,
$$q_2^H\left(\frac{3}{8}\right) = \frac{3}{8} - \frac{1}{2} \cdot \frac{3}{8} = \frac{3}{16}.$$
Therefore, the BNE of this duopoly game with incomplete information prescribes production levels $(q_1, q_2^L, q_2^H) = \left(\frac{3}{8}, \frac{5}{16}, \frac{3}{16}\right)$.

Self-assessment 15.1 Repeat the analysis in example 15.1, but assuming that firm 1's marginal cost changes to $MC_1 = \$\frac{1}{2}$. Firm 2's costs are still either low, $MC_2 = \$0$, or high, $MC_2 = \$\frac{1}{4}$. How are the results in example 15.1 affected? Interpret.

15.3 Auctions

Auctions have always been a large part of the economic landscape, with some auctions reported as early as in Babylon, around 500 BCE, and during the Roman Empire, in 193 CE.³ Auctions with precise sets of rules emerged in 1595, when the Oxford English Dictionary first included the term; and auction houses like Sotheby's and Christie's were founded as early as 1744 and 1766, respectively. Commonly used auctions nowadays are often online, with popular websites such as eBay, with \$9 billion in total revenue in 2017 and thousands of employees worldwide, which attracted the entry of competitors into the online auction industry, such as QuiBids. Auctions have also been used by governments throughout history.
In addition to auctioning off treasury bonds, in the last decade, governments started to sell airwaves (3G and 4G technology). For instance, the British 3G telecom licenses generated \$34 billion (about 2 percent of the country's gross domestic product at the time) in what British economists called "the biggest auction ever."⁴ In the rest of the chapter, we study the common ingredients in most auction formats (understanding them as an allocation mechanism). Then, we analyze optimal bidding behavior in first-price auctions (FPAs), second-price auctions (SPAs), common-value auctions, and the so-called winner's curse.

3. In particular, the Praetorian Guard, after killing Pertinax, the emperor, announced that the highest bidder could claim the empire. Didius Julianus was the winner, becoming the emperor for two short months, after which he was beheaded.
4. Several game theorists played an important role in designing and testing the auction format before its final implementation. In fact, the design of 3G auctions created a great controversy in most European countries during the 1990s because similar countries collected enormously different revenue amounts from the sale.

15.3.1 Auctions as Allocation Mechanisms

Consider N bidders who seek to acquire a certain object, where each bidder i has a valuation $v_i$ for the object, and assume that there is one seller. Note that we can design many different rules for the auction, following the same auction formats commonly observed in real-life scenarios. For instance, we could use any of the following:
1. First-price auction (FPA), where the winner is the bidder submitting the highest bid, and she must pay the highest bid (which in this case is hers).
2. Second-price auction (SPA), where the winner is the bidder submitting the highest bid, but in this case, she must pay the second-highest bid.
3. Third-price auction, where the winner is still the bidder submitting the highest bid, but now she must pay the third-highest bid.
4. All-pay auction, where the winner is still the bidder submitting the highest bid, but in this case, every single bidder must pay the price she submitted.

These auction formats have several features in common, implying that all auctions can be interpreted as allocation mechanisms with two main ingredients:
1. An allocation rule, specifying "who gets the object." The allocation rule for most auctions determines that the object is allocated to the bidder submitting the highest bid. This was, in fact, the allocation rule for all four auction formats considered here. However, we could assign the object by using a lottery, where the probability of winning the object is the ratio of one's own bid to the sum of all bidders' bids (i.e., for bidder 1, $\text{prob(win)} = \frac{b_1}{b_1 + b_2 + \dots + b_N}$), an allocation rule often used in certain Chinese auctions.
2. A payment rule, which describes "how much each bidder pays." For instance, the payment rule in the FPA determines that the individual submitting the highest bid pays her own bid, while everybody else pays zero. In contrast, the payment rule in the SPA specifies that the individual submitting the highest bid (the winner) pays the second-highest bid, while everybody else pays zero. Finally, the payment rule in the all-pay auction determines that every individual must pay the bid that she submitted.⁵

For ease of exposition, we first present SPAs and then move to FPAs. Our presentation seeks to avoid most technicalities. For a more advanced introduction to auction theory, see the books by Krishna (2002), Milgrom (2004), Menezes and Monteiro (2004) and Klemperer (2004).

15.4 Second-Price Auctions

In the SPA class of auctions, bidding your own valuation (i.e., $b_i(v_i) = v_i$) is a weakly dominant strategy for all players.
That is, regardless of the valuation you assign to the object, and independent of your opponents' valuations, submitting a bid equal to your valuation, $b_i(v_i) = v_i$, yields an expected profit equal to or higher than that of submitting any other bid, $b_i(v_i) \neq v_i$. To show this bidding strategy is an equilibrium outcome of the second-price auction, let's first examine bidder i's expected payoff from submitting a bid that coincides with her own valuation $v_i$ (which we refer to as the "First case"), and then compare it with what she would obtain by deviating to bids below her valuation for the object, $b_i(v_i) < v_i$ (denoted as "Second case"), or above her valuation, $b_i(v_i) > v_i$ ("Third case").

5. This auction format is used by the internet seller QuiBids.com. For instance, if you participate in the sale of a new iPad, and you submit a low bid of \$25, but some other bidder wins by submitting a higher bid, you will still see your \$25 withdrawn from your QuiBids account.

First case: If the bidder submits her own valuation, $b_i(v_i) = v_i$, then either of the following situations can arise:
1a) If the highest competing bid lies below her bid, $h_i < b_i$, where $h_i = \max_{j \neq i}\{b_j\}$,⁶ then bidder i wins the auction. In this case, she obtains a net payoff of $v_i - h_i$ because in an SPA, the winning bidder does not pay the bid she submitted, but rather the second-highest bid, $h_i$, and in this case, $b_i > h_i$.
1b) If, instead, the highest competing bid lies above her bid, $h_i > b_i$, then she loses the auction, earning zero payoff.

The remaining possibility is that her bid coincides with the highest competing bid (i.e., $b_i = h_i$), and thus a tie occurs. Ties are normally solved in auctions by randomly assigning the object to the bidders who submitted the highest bids (e.g., if bidders 3 and 7 are tied in the highest bid, the auctioneer can flip a coin to determine if bidder 3 or 7 will receive the object).
As a consequence, bidder i's payoff becomes $v_i - h_i$, but with only $\frac{1}{2}$ probability (i.e., her expected payoff is $\frac{1}{2}(v_i - h_i)$).⁷ However, because $v_i = h_i$ in this case, the bidder earns a zero expected payoff.

6. Intuitively, expression $h_i = \max_{j \neq i}\{b_j\}$ just finds the highest bid among all bidders different from bidder i, $j \neq i$. Alternatively, $h_i$ can be written more explicitly as $h_i = \max\{b_1, b_2, \dots, b_{i-1}, b_{i+1}, \dots, b_N\}$, where we find the highest bid among all N bidders except for bidder i (note that we wrote everyone's bid but i's, $b_i$).
7. More generally, if $K \geq 2$ bidders are tied submitting the highest bid, the auctioneer randomly assigns the object to any of them, implying that each of these bidders earns an expected payoff of $\frac{1}{K}(v_i - h_i)$.

Second case: Let us now compare these equilibrium payoffs with those that bidder i could obtain by deviating toward a bid that shades her valuation (i.e., $b_i < v_i$). In this case, we can also see three cases emerging (see figure 15.1), depending on the ranking between bidder i's bid, $b_i$, and the highest competing bid, $h_i$:
2a) If the highest competing bid $h_i$ lies below her bid (i.e., $h_i < b_i$), then she still wins the auction, obtaining the same net payoff as when she does not shade her bid, $v_i - h_i$.
2b) If the highest competing bid $h_i$ lies between $b_i$ and $v_i$ (see case 2b in figure 15.1), bidder i loses, making zero payoff. Had she submitted a bid equal to her valuation for the object, she would have won the auction, earning a payoff of $v_i - h_i > 0$.
2c) If the highest competing bid $h_i$ is higher than $v_i$ (see case 2c), bidder i loses the auction, thus yielding the same outcome as when she submits a bid, $b_i = v_i$.

Figure 15.1 Cases when bidder i shades his bid, $b_i < v_i$. (Case 2a: $h_i < b_i < v_i$; case 2b: $b_i < h_i < v_i$; case 2c: $b_i < v_i < h_i$.)

Hence, we just showed that when bidder i shades her bid, $b_i < v_i$ in cases 2a–2c, she obtains the same or lower payoff as when she submits a bid that coincides with her valuation for the object ($b_i = v_i$). Therefore, she does not have incentives to shade her bid because her payoff would not improve from doing so.

Third case: Let us finally examine bidder i's equilibrium payoff from submitting a bid above her own valuation (i.e., $b_i > v_i$). Three cases also arise (see figure 15.2):
3a) If the highest competing bid $h_i$ lies below bidder i's valuation, $v_i$, she still wins, earning a payoff of $v_i - h_i$, which coincides with that when she submits her valuation, $b_i = v_i$.
3b) If the highest competing bid $h_i$ lies between $v_i$ and $b_i$ (see case 3b in figure 15.2), bidder i wins the object but earns a negative payoff because $v_i - h_i < 0$. Had she instead submitted a bid $b_i = v_i$, she would have lost the object, earning zero payoff.
3c) If the highest competing bid $h_i$ lies above $b_i$ (see case 3c in figure 15.2), bidder i loses the auction, earning zero payoff. Had she instead submitted a bid $b_i = v_i$, she would also have lost the auction, earning the same zero payoff.

Figure 15.2 Cases when bidder i bids above his value, $b_i > v_i$. (Case 3a: $h_i < v_i < b_i$; case 3b: $v_i < h_i < b_i$; case 3c: $v_i < b_i < h_i$.)

Hence, bidder i's payoff from submitting a bid above her valuation either coincides with her payoff from submitting her own value for the object, or becomes strictly lower, thus eliminating her incentive to deviate from her equilibrium bid of $b_i(v_i) = v_i$. In other words, there is no bidding strategy that provides a strictly higher payoff than $b_i(v_i) = v_i$ in the SPA, and all players bid their own valuation, without shading their bids; in the next section we see that this result differs from the optimal bidding function in FPA, where players shade their bids unless $N \to \infty$.
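The case-by-case argument for the SPA can also be checked by brute force. The following is only a minimal sketch, not part of the text's proof: the function name `spa_payoff`, the valuation of 60 cents, and the 5-cent grid of bids are all hypothetical choices, and ties are assumed to involve exactly two bidders and to be settled by a fair coin flip.

```python
def spa_payoff(bid, value, h):
    """Bidder's expected payoff in a second-price auction.

    h is the highest competing bid; a tie is assumed to involve exactly
    two bidders and to be resolved by a fair coin flip.
    """
    if bid > h:                    # win and pay the second-highest bid, h
        return value - h
    if bid == h:                   # tie: win with probability 1/2
        return 0.5 * (value - h)
    return 0.0                     # lose and pay nothing


value = 60                         # hypothetical valuation, in cents
grid = range(0, 101, 5)            # candidate bids and competing bids, in cents

# Truthful bidding is weakly dominant on this grid if no alternative bid
# ever earns strictly more, whatever the highest competing bid turns out to be.
truthful_is_weakly_dominant = all(
    spa_payoff(value, value, h) >= spa_payoff(b, value, h)
    for b in grid
    for h in grid
)
print(truthful_is_weakly_dominant)
```

The check compares the truthful bid with every alternative bid for every possible highest competing bid, mirroring cases 2a–2c and 3a–3c; it says nothing about how valuations are distributed, which matches the claim that the result does not depend on the distribution.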
Remark: The equilibrium bidding strategy in the SPA is, first, unaffected by the number of bidders who participate in the auction, N, or their risk-aversion preferences. In particular, our discussion considered the presence of N bidders, and an increase in their number does not emphasize or ameliorate the incentives that every bidder has to submit a bid that coincides with her own valuation, $b_i(v_i) = v_i$. Second, these results remain when bidders evaluate their net payoff (e.g., $v_i - h_i$), according to a concave utility function, such as $u(x) = x^{\alpha}$, exhibiting risk aversion. Specifically, for a given value of the highest competing bid, $h_i$, bidder i's expected payoff from submitting a bid, $b_i(v_i) = v_i$, would still be weakly larger than when deviating to a bidding strategy above, $b_i(v_i) > v_i$, or below, $b_i(v_i) < v_i$, her true valuation for the object. Finally, our results are also unaffected by how valuations for the object are distributed (e.g., following a uniform, normal, or exponential distribution), as these arguments did not rely on the specific distribution of valuations.

Self-assessment 15.2 Consider an SPA with N = 25 bidders. If your valuation for the object is $v_i = \$14$, what is your optimal bidding strategy? What if your valuation for the object increases to $v_i = \$17$? What if the number of bidders increases to N = 120? Interpret.

15.5 First-Price Auctions

15.5.1 Privately Observed Valuations

Before analyzing equilibrium bidding strategies in first-price auctions, note that auctions are strategic scenarios where players must choose their strategies (i.e., how much to bid) in an incomplete information context. In particular, every bidder knows her own valuation for the object, $v_i$, but does not observe other bidders' valuations, such as $v_j$. That is, bidder i is "in the dark" about her opponents' valuations. Despite not observing j's valuation, bidder i knows the probability distribution behind bidder j's valuation.
For instance, $v_j$ can be relatively high (e.g., $v_j = \$10$, with probability 0.4), or low (e.g., $v_j = \$5$, with probability 0.6). More generally, bidder j's valuation, $v_j$, is distributed according to a cumulative distribution function $F(v) = \text{prob}(v_j < v)$, intuitively representing that the probability that $v_j$ lies below a certain cutoff v is exactly $F(v)$. For simplicity, we normally assume that every bidder's valuation for the object is drawn from a uniform distribution function between 0 and 1 (i.e., $v_j \sim U[0, 1]$), while the appendix in this chapter extends our analysis to other cumulative distribution functions $F(v_i)$.⁸

8. Note that this assumption does not imply that bidder j cannot assign a valuation $v_j$ larger than 1 to the object. Instead, her valuation $v_j$, which lies on interval $[0, \bar{v}]$, can be divided by $\bar{v}$, which normalizes the interval to $[0, 1]$.

Figure 15.3 Uniform probability distribution. (Horizontal axis: $v_j$; vertical axis: $F(v)$, where $F(v) = v$ is the 45-degree line.)

Figure 15.3 illustrates this uniform distribution, where the horizontal axis depicts $v_j$ and the vertical axis measures the cumulated probability $F(v)$. For instance, if bidder i's valuation is v, then all points on the left side of v on the horizontal axis represent that $v_j < v$, entailing that bidder j's valuation is lower than that of bidder i. The mapping of these points on the vertical axis gives us the probability $\text{prob}(v_j < v) = F(v)$ which, in the case of a uniform distribution, is $F(v) = v$. Similarly, the valuations to the right side of v describe points where $v_j > v$, and thus bidder j's valuation is higher than that of bidder i. Mapping these points on the vertical axis, we obtain the probability $\text{prob}(v_j > v) = 1 - F(v)$ which, under a uniform distribution, implies $1 - F(v) = 1 - v$.

15.5.2 Equilibrium Bidding in First-Price Auctions

Let's start analyzing equilibrium bidding behavior in the first-price auction.
First, note that submitting a bid above one's valuation, $b_i > v_i$, is a dominated strategy. To see this, note that the bidder would obtain a negative payoff if she wins, and zero payoff if she loses. Specifically, her expected utility from participating in the auction is
$$EU_i(b_i | v_i) = \text{prob(win)} \times \underbrace{(v_i - b_i)}_{\text{Negative if } v_i < b_i} + \; \text{prob(lose)} \times 0,$$
which becomes negative when the bidder submits a bid above her valuation, $v_i < b_i$, regardless of her probability of winning. Note that in this expected utility, we specify that, upon winning, bidder i receives a net payoff of $v_i - b_i$; that is, the difference between her true valuation for the object and the bid she submits (which ultimately constitutes the price she pays for the good if she were to win).⁹ Similarly, submitting a bid $b_i$ that exactly coincides with one's valuation, $b_i = v_i$, also constitutes a dominated strategy because even if the bidder happens to win, her expected utility would be zero; that is,
$$EU_i(b_i | v_i) = \text{prob(win)} \times \underbrace{(v_i - b_i)}_{\text{Zero if } v_i = b_i}.$$
Therefore, the equilibrium bidding strategy in an FPA must imply a bid below one's valuation, $b_i < v_i$. That is, bidders must practice what is usually referred to as "bid shading." In particular, if bidder i's valuation is $v_i$, her bid must be a share of her true valuation (i.e., $b_i(v_i) = a \cdot v_i$, where $a \in (0, 1)$). Figure 15.4 illustrates bid shading because bidding strategies lie below the 45-degree line (where $v_i = b_i$).

Figure 15.4 "Bid shading" in a FPA. (Horizontal axis: $v_i$; vertical axis: bid, $b_i$; the bidding function $b_i(v_i) = a \cdot v_i$ lies below the 45-degree line.)

A natural question at this point is: how intense must bid-shading be in the first-price auction?
Or, alternatively, what is the precise value of the bid shading parameter a? To answer such a question, we must first describe bidder i's expected utility from submitting a given bid x, when her valuation for the object is $v_i$,
$$EU_i(x | v_i) = \text{prob(win)} \times (v_i - x) + \text{prob(lose)} \times 0.$$

9. Upon losing, bidders do not obtain any object and, in this type of auction, they do not have to pay any monetary amount, thus implying a zero payoff.

Before continuing our analysis, we still must precisely characterize the probability of winning in the expression, prob(win). Specifically, upon submitting a bid $b_i = x$, bidder j can anticipate that bidder i's valuation is $\frac{x}{a}$, by just inverting the bidding function $b_i(v_i) = x = a \times v_i$ (i.e., solving for $v_i$ in $x = a \times v_i$ yields $v_i = \frac{x}{a}$). This inference is illustrated in figure 15.5, where bid x on the vertical axis is mapped into the bidding function $a \times v_i$, which corresponds to a valuation of $\frac{x}{a}$ on the horizontal axis. Intuitively, for a bid x, bidder j can use the symmetric bidding function $a \times v_i$ to "recover" bidder i's valuation, $\frac{x}{a}$, that generated a bid of \$x.

Figure 15.5 Recovering bidder i's valuation. (Horizontal axis: $v_i$; vertical axis: bid, $b_i$; a bid $b_i = x$ on the bidding function $b_i(v_i) = a \cdot v_i$ maps into valuation $\frac{x}{a}$.)

Hence, the probability of winning is given by $\text{prob}(b_i \geq b_j)$ and, according to the vertical axis in figure 15.5, $\text{prob}(b_i > b_j) = \text{prob}(x > b_j)$. If, rather than describing probability $\text{prob}(x > b_j)$ from the point of view of bids (see shaded portion of the vertical axis in figure 15.6), we characterize it from the point of view of valuations (in the shaded segment of the horizontal axis), we obtain that $\text{prob}(b_i > b_j) = \text{prob}(\frac{x}{a} > v_j)$. Indeed, the shaded set of valuations on the horizontal axis illustrates valuations of bidder j, $v_j$, for which her bid lies below player i's bid, x.
In contrast, valuations $v_j$ satisfying $v_j > \frac{x}{a}$ entail that player j's bids would exceed x, implying that bidder j wins the auction. Hence, if the probability that bidder i wins the object is given by $\text{prob}(\frac{x}{a} > v_j)$ and valuations are uniformly distributed, we have that $\text{prob}(\frac{x}{a} > v_j) = \frac{x}{a}$.¹⁰

10. Recall that if random variable y is distributed according to a uniform distribution function $U[0, 1]$, the probability that the value of y lies below a certain cutoff c is exactly c (i.e., $\text{prob}(y < c) = F(c) = c$).

Figure 15.6 Probability of winning in a FPA. (Horizontal axis: $v_j$; vertical axis: bid, $b_j$; bidder i wins when $v_j < \frac{x}{a}$, so that $b_i > b_j$, and loses when $v_j > \frac{x}{a}$, so that $b_i < b_j$.)

We can now plug this probability of winning into bidder i's expected utility from submitting a bid of x in the FPA, as follows:
$$EU_i(x | v_i) = \frac{x}{a}(v_i - x) = \frac{v_i x - x^2}{a}.$$
Taking first-order conditions with respect to bidder i's bid, x, we obtain $\frac{v_i - 2x}{a} = 0$ which, solving for x, yields bidder i's optimal bidding function:
$$x(v_i) = \frac{1}{2} v_i.$$
Intuitively, this bidding function informs bidder i how much to bid, as a function of her privately observed valuation for the object, $v_i$. For instance, when $v_i = \$0.75$, her optimal bid becomes $\frac{1}{2} \times 0.75 = \$0.375$. This bidding function implies that, when competing against another bidder j, and with only N = 2 players participating in the FPA, bidder i shades her bid in half, as figure 15.7 illustrates.

Self-assessment 15.3 Consider a first-price auction with N = 2 bidders. If your valuation for the object is $v_i = \$14$, what is your optimal bidding strategy? What if your valuation increases to $v_i = \$17$? Interpret.
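As a numerical sanity check on the two-bidder result, the sketch below searches a grid of bids for the one that maximizes expected utility, using the text's example valuation of 0.75. It is an illustration under stated assumptions rather than the book's method: the rival is assumed to bid $a \cdot v_j$ with $a = \frac{1}{2}$ and $v_j \sim U[0, 1]$, the probability of winning is capped at 1, and the function name and grid step are hypothetical.

```python
def expected_utility(x, v, a=0.5):
    """Expected utility of bidding x with valuation v in a two-bidder FPA.

    Assumes the rival bids a * v_j with v_j uniform on [0, 1], so bidder i
    wins with probability x / a (capped at 1, since it is a probability).
    """
    p_win = min(x / a, 1.0)
    return p_win * (v - x)


v = 0.75                                    # valuation used in the text's example
grid = [k / 1000 for k in range(1001)]      # candidate bids 0.000, 0.001, ..., 1.000

best_bid = max(grid, key=lambda x: expected_utility(x, v))
print(best_bid)                             # prints 0.375, i.e., v / 2
```

The grid maximizer coincides with the first-order condition $x = \frac{1}{2} v_i$, so the rival's assumed shading parameter $a = \frac{1}{2}$ is consistent with bidder i's own best response.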
[Figure 15.7: Optimal bidding function in an FPA with N = 2 bidders. Horizontal axis: valuation $v_i$; vertical axis: bid $b_i$. The plot shows the 45-degree line and the bidding function $b_i(v_i) = \frac{v_i}{2}$.]

15.5.3 Extending the First-Price Auction to N Bidders

Our results are easily extended to an FPA with $N$ bidders. In particular, the probability of bidder $i$ winning the auction when submitting a bid of $\$x$ is

$$\text{prob(win)} = \text{prob}\left(\frac{x}{a} > v_1\right) \cdot \ldots \cdot \text{prob}\left(\frac{x}{a} > v_{i-1}\right) \cdot \text{prob}\left(\frac{x}{a} > v_{i+1}\right) \cdot \ldots \cdot \text{prob}\left(\frac{x}{a} > v_N\right) = \frac{x}{a} \cdot \ldots \cdot \frac{x}{a} \cdot \frac{x}{a} \cdot \ldots \cdot \frac{x}{a} = \left(\frac{x}{a}\right)^{N-1},$$

where we evaluate the probability that the valuation of all other $N - 1$ bidders, $v_1, v_2, \ldots, v_{i-1}, v_{i+1}, \ldots, v_N$ (except for bidder $i$), lies below the valuation $v_i = \frac{x}{a}$, which generates a bid of exactly $\$x$. Hence, bidder $i$’s expected utility from submitting $x$ becomes

$$EU_i(x|v_i) = \underbrace{\left(\frac{x}{a}\right)^{N-1}}_{\text{prob(win)}} (v_i - x).$$

To facilitate the differentiation with respect to bid $x$, the bidder’s expected utility can be rewritten as follows: $EU_i(x|v_i) = \frac{1}{a^{N-1}}\left[x^{N-1} v_i - x^{N-1} x\right]$, which entails $\frac{1}{a^{N-1}}\left[x^{N-1} v_i - x^{N}\right]$. Taking first-order conditions with respect to her bid, $x$, we obtain

$$\frac{1}{a^{N-1}}\left[(N-1) x^{N-2} v_i - N x^{N-1}\right] = 0,$$

which is zero when the term in brackets is nil, $(N-1) x^{N-2} v_i - N x^{N-1} = 0$. Rearranging this term, we obtain $\frac{x^{N-1}}{x^{N-2}} = \frac{N-1}{N} v_i$. Recall that the left side, $\frac{x^{N-1}}{x^{N-2}}$, can be rewritten as $x^{(N-1)-(N-2)} = x$, which helps us further simplify our results, finding that bidder $i$’s optimal bidding function is

$$x(v_i) = \frac{N-1}{N} v_i.$$

[Figure 15.8: Optimal bidding function in an FPA increases in N. Horizontal axis: valuation $v_i$; vertical axis: bid $b_i$. The plot shows the bidding functions $\frac{v}{2}$ (where $N = 2$), $\frac{2v}{3}$ (where $N = 3$), $\frac{3v}{4}$ (where $N = 4$), and $v$ (where $N \to \infty$).]

Figure 15.8 depicts the bidding function for the case of $N = 2$, $N = 3$, and $N = 4$ bidders, showing that bid shading is ameliorated when more bidders participate in the auction (i.e., bidding functions approach the 45-degree line).
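The $N$-bidder result can also be verified numerically. The sketch below is a minimal illustration, not part of the original analysis: the function names, the grid size, and the valuation $0.75$ are assumptions. It compares the closed-form bid $\frac{N-1}{N} v_i$ with a grid-search best response to rivals who use the linear strategy $a \cdot v_j$ with $a = \frac{N-1}{N}$.

```python
def equilibrium_bid(v_i, n_bidders):
    """Closed-form bid from the text: x(v_i) = (N - 1) / N * v_i."""
    return (n_bidders - 1) / n_bidders * v_i

def expected_utility(x, v_i, n_bidders, a):
    """(x / a)^(N - 1) * (v_i - x), with each win probability capped at 1."""
    return min(x / a, 1.0) ** (n_bidders - 1) * (v_i - x)

def best_response(v_i, n_bidders, grid=10_000):
    """Grid-search best response when every rival bids a * v_j with a = (N - 1) / N."""
    a = (n_bidders - 1) / n_bidders
    candidates = [v_i * k / grid for k in range(grid + 1)]
    return max(candidates, key=lambda x: expected_utility(x, v_i, n_bidders, a))

if __name__ == "__main__":
    # Illustrative valuation and bidder counts (assumptions)
    for n in (2, 3, 4, 25):
        print(n, equilibrium_bid(0.75, n), best_response(0.75, n))
```

For each value of $N$, the grid-search maximizer should agree with the closed-form bid up to the grid resolution, and both rise toward the valuation as $N$ grows.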
Indeed, for $N = 2$, the optimal bidding function is $\frac{2-1}{2} v_i = \frac{1}{2} v_i$, but it increases to $\frac{3-1}{3} v_i = \frac{2}{3} v_i$ when $N = 3$ bidders compete for the object, to $\frac{4-1}{4} v_i = \frac{3}{4} v_i$ when $N = 4$ players participate in the auction, etc. For an extremely large number of bidders (e.g., $N = 2{,}000$), bidder $i$’s optimal bidding function becomes $b_i(v_i) = \frac{1{,}999}{2{,}000} v_i \approx v_i$ and, hence, bidder $i$’s bid almost coincides with her valuation for the object, describing a bidding function that approaches the 45-degree line. Intuitively, if bidder $i$ seeks to win the object, she can shade her bid when few bidders are competing for the good. However, when several players are competing in the auction, the probability that some of them have a high valuation for the object (and thus submit a high bid) increases. That is, competition gets tougher as more bidders participate, and every bidder responds by increasing her bid, which ultimately reduces her incentives to practice bid shading.

Self-assessment 15.4 Consider a first-price auction with $N = 25$ bidders. If your valuation for the object is $v_i = \$14$, what is your optimal bidding strategy? What if the number of bidders increases to $N = 120$? Interpret.

15.5.4 First-Price Auctions with Risk-Averse Bidders

Let us next analyze how our equilibrium results would be affected if bidders are risk averse, that is, if their utility function is concave in income, $x$ (e.g., $u(x) = x^{\alpha}$, where $0 < \alpha \leq 1$ denotes bidder $i$’s risk-aversion parameter). In particular, when $\alpha = 1$, she is risk neutral, while when $\alpha$ decreases, she becomes more risk averse.$^{11}$

11. This utility function is increasing in income because its first-order derivative, $u'(x) = \alpha x^{\alpha-1} = \frac{\alpha}{x^{1-\alpha}}$, is positive. In addition, it is concave in income because its second-order derivative, $u''(x) = \alpha(\alpha - 1)x^{\alpha-2} = -\frac{\alpha(1-\alpha)}{x^{2-\alpha}}$, is negative, given that $\alpha$ satisfies $0 < \alpha \leq 1$. However, when $\alpha = 1$, the utility function becomes linear in income because $u'(x)$ simplifies to $u'(x) = \frac{\alpha}{x^{1-1}} = \frac{\alpha}{1} = \alpha$, and $u''(x)$ collapses to zero. In contrast, when $\alpha < 1$, the function is concave in income. A typical example of this utility function explored in chapter 6 was $u(x) = x^{1/2} = \sqrt{x}$.

Two bidders. First, note that the probability of winning is unaffected because for a symmetric bidding function $b_i(v_i) = a \cdot v_i$ for every bidder $i$, where $a \in (0, 1)$, the probability that bidder $i$ wins the auction against another bidder $j$ is

$$\text{prob}(b_i > b_j) = \text{prob}(x > b_j) = \text{prob}\left(\frac{x}{a} > v_j\right) = \frac{x}{a}.$$

Therefore, bidder $i$’s expected utility from participating in this auction is

$$EU_i(x|v_i) = \frac{x}{a} \times (v_i - x)^{\alpha},$$

where, relative to the case of risk-neutral bidders analyzed previously, the only difference arises in the evaluation of the net payoff from winning, $v_i - x$, which is now evaluated as $(v_i - x)^{\alpha}$. Taking first-order conditions with respect to her bid, $x$, we have

$$\frac{1}{a}(v_i - x)^{\alpha} - \frac{x}{a}\alpha(v_i - x)^{\alpha-1} = 0,$$

and solving for $x$, we find the optimal bidding function,

$$x(v_i) = \frac{1}{1+\alpha} v_i.$$

This result embeds the risk-neutral case analyzed previously as a special case. Specifically, when $\alpha = 1$, bidder $i$’s optimal bidding function becomes $x(v_i) = \frac{v_i}{2}$. However, when her risk aversion increases (i.e., $\alpha$ decreases), bidder $i$’s optimal bidding function increases. Specifically, $\frac{\partial x(v_i)}{\partial \alpha} = -\frac{v_i}{(1+\alpha)^2}$, which is negative for all parameter values. In the extreme case in which $\alpha$ decreases to $\alpha \to 0$, the optimal bidding function becomes $x(v_i) = v_i$, and players do not practice bid shading. Figure 15.9 illustrates the increase in players’ bidding function, starting from $\frac{v_i}{2}$ when bidders are risk neutral, $\alpha = 1$, and approaching the 45-degree line (no bid shading) as players become more risk averse.

[Figure 15.9: Optimal bidding function in an FPA with risk-averse bidders. Horizontal axis: valuation $v_i$; vertical axis: bid $b_i$. The plot shows the bidding functions $v$ (where $\alpha = 0$), $\frac{v}{1.2}$ (where $\alpha = 0.2$), $\frac{v}{1.5}$ (where $\alpha = 0.5$), and $\frac{v}{2}$ (where $\alpha = 1$).]

Intuitively, a risk-averse bidder submits more aggressive bids than a risk-neutral bidder, to minimize the probability of losing the auction. In particular, consider that bidder $i$ reduces her bid from $b_i$ to $b_i - \varepsilon$. If she wins the auction, she obtains an additional profit of $\varepsilon$ because she has to pay a lower price for the object. However, by lowering her bid, she increases the probability of losing the auction. Importantly, for a risk-averse bidder, the positive effect of getting the object at a cheaper price is offset by the negative effect of increasing the probability of losing the auction. This result connects with our discussion of risk aversion in section 6.6.1 of chapter 6, where we said that, for a risk-averse individual, the disutility she suffers from the downside of a lottery is larger than the utility she experiences from the upside of a lottery. Overall, the risk-averse bidder does not have incentives to reduce her bid, but rather to increase it, relative to a risk-neutral bidder.

$N \geq 2$ bidders. These results can be easily extended to scenarios with more bidders. From section 15.5.3, we know that the probability that bidder $i$ wins the auction is

$$\text{prob(win)} = \text{prob}\left(\frac{x}{a} > v_1\right) \cdot \ldots \cdot \text{prob}\left(\frac{x}{a} > v_{i-1}\right) \cdot \text{prob}\left(\frac{x}{a} > v_{i+1}\right) \cdot \ldots \cdot \text{prob}\left(\frac{x}{a} > v_N\right) = \frac{x}{a} \cdot \ldots \cdot \frac{x}{a} \cdot \frac{x}{a} \cdot \ldots \cdot \frac{x}{a} = \left(\frac{x}{a}\right)^{N-1}.$$

Therefore, bidder $i$’s expected utility from participating in this auction is

$$EU_i(x|v_i) = \left(\frac{x}{a}\right)^{N-1} \times (v_i - x)^{\alpha}.$$

Differentiating with respect to bidder $i$’s bid, $x$, we obtain

$$(N-1)\left(\frac{x}{a}\right)^{N-2}\frac{1}{a}(v_i - x)^{\alpha} - \left(\frac{x}{a}\right)^{N-1}\alpha(v_i - x)^{\alpha-1} = 0,$$

which simplifies to

$$\frac{1}{a}\left(\frac{x}{a}\right)^{N-2}(v_i - x)^{\alpha-1}\left[(N-1)v_i - (N-1+\alpha)x\right] = 0.$$

Solving for $x$, we obtain the equilibrium bidding function

$$x(v_i) = \frac{N-1}{N-1+\alpha} v_i.$$

When $N = 2$ bidders compete for the object, this bidding function becomes $x(v_i) = \frac{2-1}{2-1+\alpha} v_i = \frac{1}{1+\alpha} v_i$, thus coinciding with the expression found in the two-bidder case. However, when $N = 3$ bidders compete, the bidding function increases to $x(v_i) = \frac{3-1}{3-1+\alpha} v_i = \frac{2}{2+\alpha} v_i$, where $\frac{2}{2+\alpha} > \frac{1}{1+\alpha}$. More generally, we can show that the bidding function $x(v_i)$ increases in $N$ because the derivative $\frac{\partial x(v_i)}{\partial N} = \frac{\alpha v_i}{(N-1+\alpha)^2}$ is positive, indicating that, as $N$ increases, bidders become more aggressive.

Self-assessment 15.5 Consider an FPA with $N = 2$ bidders. If your valuation for the object is $v_i = \$14$ and your utility function is $u(x) = x^{1/3}$, what is your optimal bidding strategy? What if your utility function is $u(x) = x^{1/10}$? Interpret.

15.6 Efficiency in Auctions

Auctions, and generally allocation mechanisms, are characterized as efficient if the bidder (or agent) with the highest valuation for the object is indeed the person receiving the object. Intuitively, if this property does not hold, the outcome of the auction (i.e., the allocation of the object) would open the door to negotiations and arbitrage between the winning bidder (who, despite obtaining the object, may not be the player who assigned the highest value to it) and bidders with higher valuations who would like to buy the object from her. In other words, the auction’s outcome would still allow negotiations that are beneficial for all parties involved, thus suggesting that the initial allocation was not efficient.

According to this criterion, both the FPA and the SPA are efficient because the bidder with the highest valuation submits the highest bid, and the object is ultimately assigned to the player who submits the highest bid. Other auction formats, such as the Chinese (or lottery) auction described in section 15.3, are not necessarily efficient because they may assign the object to an individual who does not have the highest valuation for the object. In particular, recall that the probability of winning the object in this auction is a ratio of the bid that you submit relative to the sum of all players’ bids.
Hence, a bidder with a low valuation for the object, and who submits the lowest bid (e.g., \$1), can still win the auction. Alternatively, the person who assigns the highest value to the object, submitting the highest bid, might not end up receiving the object. Therefore, for an auction to satisfy efficiency, bids must be increasing in a player’s valuation, and the probability of winning the auction must be 1 (100 percent) if a bidder submits the highest bid.

15.7 Common-Value Auctions

The auction formats considered so far assume that each bidder privately observes her own valuation for the object. Her valuation was drawn from a distribution function (e.g., a uniform distribution), implying that two bidders are unlikely to assign the same value to the object for sale. However, in some auctions, such as the government sale of oil leases, bidders (oil companies) might assign the same monetary value to the object, that is, a common value (e.g., the profits they would obtain from exploiting the oil reservoir). Bidders are, nonetheless, unable to precisely observe the value of the object. In the oil lease example, firms cannot accurately observe the exact volume of oil in the reservoir, or how difficult it will be to extract, but they can accumulate different estimates from their own engineers, or from other consulting companies, that inform them about the potential profits to be made from the oil lease. Such estimates are nonetheless imprecise, and they allow the firm to assign a value to the object (profits from the oil lease) only within a relatively narrow range, such as $v \in \{10, 11, \ldots, 20\}$, in millions of dollars.
In other words, the value in profits that all firms assign to the oil lease is common, which explains why we refer to this type of auction as “common-value auctions.” The estimate $e_i$ that each firm $i$ receives about this common value is potentially different, however, with some firms receiving an upward-biased estimate, $e_i > v$, and others receiving a downward-biased estimate, $e_i < v$. In this type of auction, shading your bid, which at first glance could be regarded as a conservative strategy, can lead you to win the auction, but at a loss!

To understand this point, consider two bidders A and B, each receiving an estimate $e_A$ and $e_B$, where $e_A > v > e_B$. Bidder A’s estimate is then biased upward relative to the true value of the oil lease, $v$, while bidder B’s estimate is biased downward. (A similar argument applies if we start from the opposite ranking of estimates.) If every bidder submits a bid that shades her estimate by \$1 million, we would have $b_A = e_A - 1$ and $b_B = e_B - 1$, where $b_A > b_B$. Therefore, bidder A submits a more aggressive bid than B does ($b_A > b_B$) because the former received a higher estimate than the latter ($e_A > e_B$). Bidder A is then the winner of the auction, but her payoff could be negative! This occurs if her margin after paying bid $b_A$,

$$v - b_A = v - \underbrace{(e_A - 1)}_{b_A},$$

is negative, which, solving for $e_A$, yields $v + 1 < e_A$. Intuitively, if bidder A’s estimate, $e_A$, is more than \$1 million above the true value of the oil lease, shading her bid by \$1 million can still lead this bidder to win the auction while paying too much for the object. This is the so-called winner’s curse in common-value auctions: winning the auction means that the winning bidder probably received an overestimated signal of the true value of the object for sale, as bidder A did in this example.
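The winner’s curse described above can be illustrated with a small simulation. The sketch below rests on assumptions that are not in the text: estimates equal the true value plus noise drawn uniformly from $[-\text{noise}, \text{noise}]$, every bidder shades her estimate by the same amount, and the function name, parameter values, and number of trials are all hypothetical.

```python
import random

def simulate_winner_payoff(true_value, n_bidders, noise, shade, trials=20_000, seed=0):
    """Average winner payoff and share of auctions won at a loss.

    Assumption (not from the text): each estimate is e_i = v + u_i, with
    u_i ~ U[-noise, noise], and every bidder submits the bid e_i - shade.
    """
    rng = random.Random(seed)
    total_payoff, losses = 0.0, 0
    for _ in range(trials):
        estimates = [true_value + rng.uniform(-noise, noise) for _ in range(n_bidders)]
        winning_bid = max(estimates) - shade  # the highest estimate yields the highest bid
        payoff = true_value - winning_bid     # the winner pays her own bid
        total_payoff += payoff
        if payoff < 0:
            losses += 1
    return total_payoff / trials, losses / trials

if __name__ == "__main__":
    # Values in millions of dollars; all numbers are illustrative assumptions
    print(simulate_winner_payoff(15.0, n_bidders=5, noise=3.0, shade=1.0))
    print(simulate_winner_payoff(15.0, n_bidders=5, noise=3.0, shade=3.0))
```

Under these assumptions, shading by 1 leaves the winner with a negative payoff on average, because the winning estimate is typically the most optimistic one. Shading by the full width of the noise (3) avoids losses, in line with the recommendation that bidders in common-value auctions shade their bids significantly.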
Therefore, to avoid the winner’s curse, participants in common-value auctions must significantly shade their bid to account for the possibility that the estimates they receive are above the true value of the object.$^{12}$

Despite the straightforward intuition behind this result, the winner’s curse has been empirically observed in several scenarios.

The winner’s curse in the classroom. A common example is that of subjects in an experimental lab, where they are asked to submit bids in a common-value auction where a jar of nickels is being sold. Consider that your instructor shows up in class with a glass jar full of nickels. The monetary value that you assign to the jar (value of the coins inside the jar) coincides with that of your classmates, but none of you can accurately estimate the number of nickels in the jar because you can only gather some imprecise information about its true value by looking at it for a few seconds. In these experiments, it is usual to find that the winner ends up submitting a bid above the jar’s true value (i.e., the winner’s curse emerges).$^{13}$

15.8 A Look at Behavioral Economics: Experiments with Auctions

Several controlled experiments have been developed to test whether individuals bid according to the optimal bidding function discussed in this chapter. Generally, an experimental session starts by randomly distributing to every individual her valuation for the object prior to each auction period, where their valuations are typically drawn from a uniform distribution. In each period, the bidder submitting the highest bid earns a profit equal to her value minus the auction price (either her own bid in an FPA, or the second-highest bid in an SPA), while other bidders earn zero profit.
Overall, most studies indicate that individuals tend to bid more aggressively than what would be expected according to the bidding function $b_i(v_i)$ found in previous sections of this chapter, although this bid difference is partly reduced when we consider their risk aversion. However, the comparative statics remain: individuals tend to bid more aggressively when competing against more bidders, when their valuation of the object is higher, and when they are more risk averse. For an excellent review of the literature, see Kagel and Levin (2014).

12. It can be formally shown that, in the case of $N = 2$ bidders who receive independent signals, the optimal bidding function is $b_i(e_i) = \frac{1}{2} e_i$, where $e_i$ denotes the signal that bidder $i$ receives. Intuitively, every bidder needs to “shade her signal” by submitting a bid of exactly half of the signal she received. More generally, for $N$ bidders, bidder $i$’s optimal bid becomes $b_i(e_i) = \frac{(N+2)(N-1)}{2N^2} e_i$. For more details, see Harrington (2014), pp. 400–02.

13. For some experimental evidence on the winner’s curse see, for instance, Thaler (1988).

Appendix. First-Price Auctions in More General Settings

Section 15.5 analyzes equilibrium bidding in first-price auctions assuming that every bidder $i$’s valuation is drawn from a uniform distribution; that is, $v_i \sim U[0, 1]$. In this appendix, we extend our analysis to allow for valuations to be drawn from more general cumulative distribution functions, $F(v_i)$, with positive density in all its support; that is, $f(v_i) > 0$ for all $v_i \in [0, 1]$. In the case of uniformly distributed valuations, $F(v_i) = v_i$ and $f(v_i) = 1$, but now we consider a general function $F(v_i)$.

Writing expected utility. We can then write bidder $i$’s expected utility maximization problem (UMP) as follows:

$$\max_{b_i \geq 0} \ \text{prob(win)}(v_i - b_i),$$

which denotes the probability of winning the object times bidder $i$’s net payoff from winning, $v_i - b_i$, because she values the object at $v_i$ and pays her bid $b_i$ for it.
At this point, we need to write the probability of winning, prob(win), as a function of bidder $i$’s bid, $b_i$. Bidder $i$ wins the auction when her bid exceeds that of bidder $j$, $b_j < b_i$, which is equivalent to her valuation exceeding that of bidder $j$, $v_j < v_i$, because all bidders use the same increasing bidding function. We can express this probability as $\text{prob}(v_j < v_i) = F(v_i)$. Therefore, when bidder $i$ faces $N - 1$ rivals, her probability of winning the auction is the probability that her valuation exceeds that of all other $N - 1$ bidders. Because valuations are independently distributed, we can write this probability as the following product:

$$\text{prob}(v_j < v_i) \times \text{prob}(v_k < v_i) \times \ldots \times \text{prob}(v_l < v_i) = \underbrace{F(v_i) \times F(v_i) \times \ldots \times F(v_i)}_{N-1 \text{ times}} = F(v_i)^{N-1},$$

where bidders $j \neq k \neq l$ represent $i$’s rivals. As a result, we can express the expected UMP as follows:

$$\max_{b_i \geq 0} \ F(v_i)^{N-1}(v_i - b_i).$$

Using the symmetric bidding function $b_i(\cdot)$, we can write $b_i(v_i) = x_i$, where $x_i \in \mathbb{R}_+$ represents bidder $i$’s bid when her valuation is $v_i$, as in section 15.5. Applying the inverse $b_i^{-1}(\cdot)$ on both sides yields $v_i = b_i^{-1}(x_i)$, which helps us rewrite the probability of winning, $F(v_i)^{N-1}$, as $F(b_i^{-1}(x_i))^{N-1}$, so this problem becomes

$$\max_{x_i \geq 0} \ F(b_i^{-1}(x_i))^{N-1}(v_i - x_i),$$

where we expressed the bid as $x_i$ in the last term because $b_i(v_i) = x_i$.

Finding equilibrium bids. Now that the probability of winning is written as a function of bidder $i$’s bid, $x_i$, we are ready to differentiate with respect to $x_i$ to find the equilibrium bidding function that players use in the FPA. Differentiating with respect to $x_i$ yields

$$-F(b_i^{-1}(x_i))^{N-1} + (N-1)F(b_i^{-1}(x_i))^{N-2} f\left(b_i^{-1}(x_i)\right) \frac{\partial b_i^{-1}(x_i)}{\partial x_i}(v_i - x_i) = 0.$$

Because $b_i^{-1}(x_i) = v_i$ and $\frac{\partial b_i^{-1}(x_i)}{\partial x_i} = \frac{1}{b'\left(b_i^{-1}(x_i)\right)}$, this expression simplifies to

$$-F(v_i)^{N-1} + (N-1)F(v_i)^{N-2} f(v_i) \frac{1}{b'(v_i)}(v_i - x_i) = 0.$$
Rearranging this further, we obtain

$$(N-1)F(v_i)^{N-2} f(v_i) v_i - (N-1)F(v_i)^{N-2} f(v_i) x_i = F(v_i)^{N-1} b'(v_i)$$

or

$$F(v_i)^{N-1} b'(v_i) + (N-1)F(v_i)^{N-2} f(v_i) x_i = (N-1)F(v_i)^{N-2} f(v_i) v_i.$$

Because $x_i = b_i(v_i)$, the left side is $\frac{\partial \left[F(v_i)^{N-1} b_i(v_i)\right]}{\partial v_i}$, which helps us write this expression as

$$\frac{\partial \left[F(v_i)^{N-1} b_i(v_i)\right]}{\partial v_i} = (N-1)F(v_i)^{N-2} f(v_i) v_i.$$

Integrating both sides yields

$$F(v_i)^{N-1} b_i(v_i) = \int_0^{v_i} (N-1)F(v_i)^{N-2} f(v_i) v_i \, dv_i.$$

Applying integration by parts on the right side,$^{14}$ we find

$$\int_0^{v_i} (N-1)F(v_i)^{N-2} f(v_i) v_i \, dv_i = F(v_i)^{N-1} v_i - \int_0^{v_i} F(v_i)^{N-1} \, dv_i$$

so we can write our above first-order condition as

$$F(v_i)^{N-1} b_i(v_i) = F(v_i)^{N-1} v_i - \int_0^{v_i} F(v_i)^{N-1} \, dv_i.$$

14. Recall that, when integrating by parts, we consider two functions, $g(x)$ and $h(x)$, such that $(gh)' = g'h + gh'$. Integrating both sides yields $g(x)h(x) = \int g'(x)h(x)\,dx + \int g(x)h'(x)\,dx$. Reordering this expression, we find $\int g'(x)h(x)\,dx = g(x)h(x) - \int g(x)h'(x)\,dx$. At this point, we can apply integration by parts in our auction setting by defining $g'(x) \equiv (N-1)F(v_i)^{N-2} f(v_i)$ and $h(x) \equiv v_i$, so we obtain the result given in the text.

We can now solve for the equilibrium bidding function $b_i(v_i)$ that we seek to find. Dividing both sides by $F(v_i)^{N-1}$ yields

$$b_i(v_i) = v_i - \underbrace{\frac{\int_0^{v_i} F(v_i)^{N-1} \, dv_i}{F(v_i)^{N-1}}}_{\text{bid shading}}.$$

Intuitively, bidder $i$ submits a bid equal to her valuation for the object, $v_i$, minus an amount captured by the second term in this expression, which is referred to as her “bid shading.” We can then claim that the bidding function $b_i(v_i)$ constitutes the BNE of the FPA when bidders’ valuations are distributed according to $F(v_i)$.

Uniformly distributed valuations. Consider, for instance, when individual valuations are uniformly distributed, $F(v_i) = v_i$. In this scenario, we obtain $F(v_i)^{N-1} = v_i^{N-1}$ and $\int_0^{v_i} F(v_i)^{N-1} \, dv_i = \frac{1}{N} v_i^{N}$, producing a bidding function of

$$b_i(v_i) = v_i - \frac{\frac{1}{N} v_i^{N}}{v_i^{N-1}} = v_i - \frac{v_i^{N}}{N v_i^{N-1}} = v_i \frac{N-1}{N},$$

which coincides with the result in section 15.5.3.
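The general bidding function $b_i(v_i) = v_i - \frac{\int_0^{v_i} F(s)^{N-1} ds}{F(v_i)^{N-1}}$ can be evaluated numerically for any distribution. The sketch below is a minimal illustration under stated assumptions: it uses a midpoint-rule approximation of the integral, and the function names, the number of integration steps, and the non-uniform example $F(v) = v^2$ are hypothetical choices that do not appear in the text.

```python
def fpa_bid(v_i, n_bidders, cdf, steps=10_000):
    """Approximate b(v) = v - (integral_0^v F(s)^(N-1) ds) / F(v)^(N-1) with the midpoint rule."""
    if v_i <= 0:
        return 0.0
    h = v_i / steps
    # Midpoint rule: evaluate F(s)^(N-1) at the center of each subinterval of [0, v_i]
    integral = sum(cdf((k + 0.5) * h) ** (n_bidders - 1) for k in range(steps)) * h
    return v_i - integral / cdf(v_i) ** (n_bidders - 1)

def uniform_cdf(v):
    """F(v) = v on [0, 1], the case analyzed in the text."""
    return v

def squared_cdf(v):
    """Hypothetical non-uniform example (assumption): F(v) = v^2 on [0, 1]."""
    return v * v

if __name__ == "__main__":
    print(fpa_bid(0.75, 2, uniform_cdf))  # uniform case, N = 2
    print(fpa_bid(0.75, 3, uniform_cdf))  # uniform case, N = 3
    print(fpa_bid(0.75, 2, squared_cdf))  # hypothetical F(v) = v^2, N = 2
```

In the uniform case, the numerical bids should match $\frac{N-1}{N} v_i$. For the hypothetical distribution $F(v) = v^2$ with $N = 2$, the formula gives $b_i(v_i) = \frac{2}{3} v_i$, so bid shading is lighter than in the uniform case because high rival valuations are more likely.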
When only two bidders compete for the object, $N = 2$, this bidding function simplifies to $b_i(v_i) = \frac{v_i}{2}$, which coincides with the result in section 15.5.2. When $N = 3$, equilibrium bids increase to $b_i(v_i) = \frac{2v_i}{3}$, and a similar result occurs when $N = 4$ bidders compete for the object, where $b_i(v_i) = \frac{3v_i}{4}$.$^{15}$ Informally, as more bidders participate in the auction, every bidder $i$ submits more aggressive bids because she faces a higher probability that another bidder $j$ has a higher valuation for the object than she has.

15. More generally, the derivative of bidding function $b_i(v_i) = v_i \frac{N-1}{N}$ with respect to the number of bidders $N$ yields $\frac{\partial b_i(v_i)}{\partial N} = \frac{1}{N^2} v_i$, which is clearly positive.

Exercises

1. Pareto uncertainty–I.$^{A}$ Two firms are considering the adoption of a new technology that would be mutually beneficial if they both chose to implement it. In the case where only one firm adopted the technology, however, the results could be unpredictable. Firm 1, however, has insider information about whether the technology is useful (with a payoff of 6) or useless (with a payoff of 0) if firm 2 does not adopt the new technology. Firm 2 does not have this information, but it knows that the technology is useful with probability 0.5, and useless otherwise. The payoffs for both firms are depicted in the following normal form games:

Useless technology
                      Firm 2
                   New      Old
Firm 1    New     8, 8     0, 0
          Old     0, 0     4, 4

Useful technology
                      Firm 2
                   New      Old
Firm 1    New     8, 8     6, 0
          Old     0, 6     4, 4

(a) Find the best responses of the privately informed player, firm 1, which are type-dependent.
(b) Find the best response of the uninformed player, firm 2.
(c) Identify the BNE of the game and interpret your results.

2. Pareto uncertainty–II.$^{B}$ Consider the situation in exercise 15.1, but now assume that the probability that the technology is useless on its own is $p$, where $p$ takes some value between 0 and 1.
(a) Find the best responses of the privately informed player, firm 1, which are type-dependent.
(b) Find the best response of the uninformed player, firm 2.
(c) Identify the BNE of the game and interpret your results.

3. Stackelberg leader facing uncertain costs.$^{A}$ Consider the situation in example 15.1, but suppose that firm 1 acts as a Stackelberg leader. Find the BNE of this duopoly game.

4. All firms facing uncertain costs.$^{B}$ Consider the situation in example 15.1, but now firm 2 cannot observe firm 1’s costs. Firm 1 has low costs, $MC_1 = 0$, with probability 0.5, and high costs, $MC_1 = \frac{1}{4}$, with probability 0.5. Firm 1 is able to observe its own costs. Find the BNE of this duopoly game.

5. Uncertain demand: one uninformed firm.$^{B}$ Consider a duopoly game where two firms compete on the basis of quantities and face inverse demand function $p = a - q_1 - q_2$. Assume that firm 1 is an incumbent and understands that the size of the market is $a = 100$. Firm 2, the entrant, is unable to accurately observe the size of the market and instead knows that it is low, $a = 80$, with probability 0.5, and high, $a = 100$, with probability 0.5. Assume that marginal costs of production for both firms are 0. Find the BNE of this duopoly game.

6. Uncertain demand: two uninformed firms.$^{A}$ Consider the situation in exercise 15.5, but now neither firm is privately informed about the size of the market. Both firms know that the market size is either low, $a = 80$, with probability 0.5, or high, $a = 100$, with probability 0.5. Find the BNE of this duopoly game.

7. Entry deterrence.$^{B}$ Consider the Entry game presented in example 13.1 in chapter 13, but suppose that the incumbent had private information about whether she was crazy or not. A noncrazy incumbent has a game tree exactly as depicted in example 13.1, but a crazy incumbent loves to engage in price wars, and receives a payoff of 6 from doing so. Suppose that the entrant was aware that the probability of an incumbent being crazy is $p = 0.1$.
(a) Find the BNE of this game. Does the entrant still enter this market?
(b) For what value of $p$ is the entrant indifferent between entering or staying out of this market?

8. First-price auction–I.$^{A}$ Consider an auction with two participants, each of them with the following (privately observed) valuation of the object for sale: Person A (\$50), Person B (\$60).
(a) If the seller organizes an FPA, who will be the winner? What will be her winning bid? What price will she pay for the object?
(b) Suppose now that Person A was able to observe Person B’s private valuation prior to the auction. Would Person A change her bid? If so, how? If not, why not?

9. First-price auction–II.$^{A}$ Consider the situation in exercise 15.8, but suppose that Person A’s valuation is only \$25.
(a) If the seller organizes an FPA, who will be the winner? What will be her winning bid? What price will she pay for the object?
(b) Suppose that Person A were able to observe Person B’s private valuation prior to the auction. Would Person A change her bid? If so, how? If not, why not?

10. First-price auction–III.$^{B}$ Consider an FPA with $N = 10$ bidders.
(a) If your valuation for the object is $v_i = \$200$, what is your optimal bidding strategy?
(b) Suppose that you received information that the valuations of all bidders range from a low of \$150 to a high of \$210. Assuming that no other bidder has this information, would your bidding strategy change? If so, how?

11. Risk aversion–I.$^{B}$ Consider the situation in exercise 15.8, but suppose that both bidders have utility function $u(x) = x^{0.4}$.
(a) If the seller organizes an FPA, who will be the winner? What will be her winning bid? What price will she pay for the object?
(b) Suppose that Person A were able to observe Person B’s private valuation prior to the auction. Would Person A change her bid? If so, how? If not, why not?
(c) For what degree of risk aversion ($\alpha$) would Person A not want to change her bid in part (b)?

12. Risk aversion–II.$^{A}$ After losing an auction to a sole rival bidder for a bid of $b_j = \$100$, you later learn that her valuation for the object was $v_j = \$250$. Based on this information, what is bidder $j$’s attitude toward risk?

13. Second-price auction–I.$^{A}$ Consider the situation in exercise 15.8.
(a) If the seller organizes a second-price auction, who will be the winner? What will be her winning bid? What price will she pay for the object?
(b) Suppose that Person A were able to observe Person B’s private valuation prior to the auction. Would Person A change her bid? If so, how? If not, why not?

14. Second-price auction–II.$^{B}$ Consider an auction with five participants, each of them with the following (privately observed) valuations of the object for sale: Person A (\$10), Person B (\$6), Person C (\$45), Person D (\$81), and Person E (\$62).
(a) If the seller organizes an SPA, who will be the winner? What will be her winning bid? What price will she pay for the object?
(b) Suppose that bidders can observe each other’s valuations, but the seller cannot. The seller, however, only knows that bidders’ valuations are in the range $\{\$0, \$1, \ldots, \$90\}$. If players participate in an SPA, who will be the winner? What is her winning bid?

15. Lottery allocation–I.$^{B}$ Consider a situation where a public all-pay auction takes place for an item, with its allocation determined by a lottery. Each bidder is able to observe all the bids of all the other bidders as they make their own. The probability that bidder $i$ wins the auction is $p = \frac{b_i}{b_i + B_{-i}}$, where $B_{-i}$ denotes the total bids made by all other bidders. Suppose that bidder $i$ has a valuation of $v_i = \$9$ for this item, and she knows that bids totaling $B_{-i} = \$4$ have already been submitted. Find the optimal bidding strategy for bidder $i$, $b_i$, taking into consideration that she must pay her bid regardless of whether she wins the auction.

16. Lottery allocation–II.$^{B}$ Consider the scenario in exercise 15.15.
Answer the following questions:
(a) If the seller organizes an FPA, what is bidder $i$’s equilibrium bid? Who wins the object?
(b) If the seller organizes an SPA, what is bidder $i$’s equilibrium bid? Who wins the object?

17. All-pay auction.$^{C}$ Consider the following all-pay auction, with two bidders privately observing their valuations for the object. Valuations are uniformly distributed, $v_i \sim U[0, 1]$. The player submitting the highest bid wins the object, but all players must pay the bid they submitted. Find the optimal bidding strategy, taking into account that it takes the form $b_i(v_i) = m \times v_i^2$, where $m$ denotes a positive constant.

18. Third-price auction.$^{B}$ Consider a third-price auction, where the winner is the bidder who submits the highest bid, but she only pays the third-highest bid. Assume that you compete against two other bidders, whose valuations you are unable to observe, and that your valuation for the object is \$10. Show that bidding above your valuation (with a bid of, for instance, \$15) can be a best response to the other bidders’ bids, while submitting a bid that coincides with your valuation (\$10) might not be a best response to their bids.

19. Efficiency with risk aversion.$^{B}$ Consider a situation where bidders with heterogeneous attitudes toward risk compete in an FPA. Provide an example of how these bidders can lead to an inefficient allocation of the object.

20. Bid shading.$^{A}$ On your way back from an SPA, an inexperienced colleague of yours informs you that she had been advised by a veteran competitor that she should shade her bid by bidding only 90 percent of her valuation for the object. Did your colleague act optimally? Why would her competitor give her this advice?

21. Comparing auctions.$^{A}$ Compare and contrast the similarities and differences between an FPA where all bidders can observe everyone’s private valuations and an SPA.
16 Contract Theory

16.1 Introduction

This chapter covers another scenario where inefficiencies may exist: contracting under asymmetric information. In these contexts, one agent has different information from another agent, such as when the employee observes how much effort she exerts on a task while the employer does not or, in the other direction, when the employer knows more about an industry’s characteristics than a candidate who seeks to work for it. As we show in this chapter, asymmetric information leads to lower aggregate payoffs than when all individuals are perfectly informed. In other words, total surplus is suboptimal due to asymmetric information.

Specifically, two common problems arise when agents interact under asymmetric information. First, “moral hazard” problems may exist when one of the parties (e.g., the employer) cannot observe the actions of the other party (e.g., the employee). These scenarios are, therefore, also known as “hidden action” because one party does not get to observe the action chosen by the other party. Intuitively, the firm manager’s lack of information about a worker’s effort on the job could lead the worker to slack off if she knew that her job security, salary, or chances of promotion were unaffected. Of course, firms understand the asymmetric information situation in which they operate, anticipate the moral hazard problem that flat contracts may generate, and respond by designing contracts in which a worker’s salary, promotion, or job security increases based on her performance.$^{1}$

1. While performance may be an imperfect predictor of the worker’s effort, a firm can accumulate several observations about her performance after weeks on the job, and can even compare her performance relative to that of other workers, which ultimately helps the firm infer more accurately the worker’s effort from the observed performance.

As we discuss in this chapter, firms can design “high-powered” contracts to ameliorate moral hazard problems. In these contracts, the worker receives a relatively low salary when her performance is low (e.g., when a worker doing manual labor in a factory produces few units, or when a marketing executive secures few sales), but earns more money (e.g., a bonus) when her performance exceeds a certain level. While the contract differs from one in which both employer and employee are symmetrically informed, the worker now has the incentive to exert more effort than under a flat contract paying her the same salary regardless of her performance. As a result, expected profits are higher than under flat contracts.

The second type of problem that often emerges in scenarios of asymmetric information is “adverse selection.” In this case, the uninformed party observes the actions of the other party, but she does not observe a piece of private information, such as a manager not observing a job candidate’s productivity while interviewing her for the job. This explains why these scenarios are also known as “hidden information,” to emphasize that one party has access to a piece of information (e.g., productivity) that the other party does not.$^{2}$

We then explore two common scenarios in which adverse selection problems arise, and how firms can design contracts which, despite being imperfect, help ameliorate this problem and increase expected profits. The first example we consider is the used-cars market. In this market, the seller is privately informed about a car’s quality (high-quality cars are also known as “peaches,” and low-quality cars as “lemons”), while the buyer can obtain only a rough estimate of it during a test drive or a short visual inspection.
If the buyer were as informed about the car’s quality as the seller, all types of cars (peaches and lemons) would be sold at a market-clearing price (a high price for peaches, given their high quality; and a lower price for lemons, given their low quality). When the buyer is uninformed, however, we demonstrate that high-quality cars go unsold. Intuitively, the buyer forms an expectation about the average car quality and purchases a car only if the seller asks a price below that reference point. Anticipating this relatively low willingness-to-pay from the buyer, the seller does not have incentives to offer high-quality cars, as she would need to ask a high price, which would not be accepted by the buyer. Therefore, good cars are not traded, an outcome driven entirely by the asymmetric information between seller and buyer.

The second scenario in which adverse selection problems are common is the labor market, where a worker can observe her cost of exerting effort on the job, while the employer cannot. Because the employer seeks to induce the worker to exert effort, these contexts are often referred to as “principal-agent” models. We examine the salary that the firm offers and the effort that the worker exerts in response in a symmetric information scenario, comparing them against those arising under asymmetric information. We then analyze how the firm can design contracts inducing workers who experience a high (low) cost from effort to exert little (great) effort on the job for a low wage (high wage), and how the contract meant for one type of worker cannot be profitably chosen by her counterpart.

2. The firm manager has some information about the share of high- and low-productivity workers in town, or in that profession, allowing her to find an “expected productivity” of candidates. However, the exact productivity of a candidate is still better known by the candidate herself than by the firm manager, maintaining the hidden information structure.
In a scenario where a firm hires a worker, adverse selection problems are often referred to as “precontractual” because the firm does not observe the worker’s type (such as her productivity on the job) before hiring her. Moral hazard, however, is “postcontractual” because the firm cannot observe the worker’s effort after hiring her, assuming that the firm knows her productivity level, which eliminates adverse selection problems. In real life, firms often face both problems: adverse selection when interviewing candidates for a position, followed by moral hazard issues after hiring them.³ A similar argument applies to insurance markets, where companies offering health plans observe neither an individual’s health status (or her genetic predisposition to develop certain conditions) before she purchases insurance, nor her level of care after she acquires a specific health plan, such as her diet and exercise routine.

Although precontractual problems (adverse selection) arise first, before the firm hires a worker, their presentation is more involved than that of postcontractual problems (moral hazard) and, for this reason, we start the chapter by examining moral hazard.

16.2 Moral Hazard

Moral hazard (or hidden action): A scenario in which an agent cannot observe the actions taken by other agents.

Moral hazard problems arise, of course, in health insurance plans because companies offering these plans cannot observe the actions that their clients take to maintain good health (e.g., exercising, eating a good diet, and avoiding risky behaviors). Moral hazard problems also abound in other insurance markets because an insurance company does not observe how careful the insured party is (e.g., the driver’s carefulness is a hidden action for the company) but designs insurance policies to give incentives to its clients so that they are as careful as possible.
For example, Progressive’s “Snapshot” device monitors the insured individual’s driving behavior, such as sudden braking, while other firms offer “pay-per-drive” insurance, thus providing discounts for low mileage.

Consider the following scenario: you are hired by a small firm, which pays you $400 a week to work 6 hours a day. If the contract does not specify a target outcome of your effort (units of output being produced per week), how much effort will you exert? We know that you are a responsible worker, but the firm does not monitor your work, and the contract sets a flat weekly pay (i.e., it is a flat contract, thus providing no incentives to increase effort). You may then slack off a little, or at least not work as much as you could every minute of the day. (Workers may exert effort even in this scenario because of nonmonetary incentives, such as career concerns and future promotions. For simplicity, we abstract from these concerns in the next discussion.)

3. For a more detailed presentation of contractual problems, see Macho-Stadler and Perez-Castrillo (2001) and Campbell (2018), both at the undergraduate level. For more technical presentations, see Laffont and Martimort (2002) and Bolton and Dewatripont (2004).

16.2.1 Contracts When Effort Is Observable

Anticipating the incentive problems of a flat contract, the firm could specify a salary connected to the effort you exert. The problem with this contract, however, is that effort is relatively difficult to measure from the employer’s point of view. A firm manager may see how many hours you put in at the workplace, but she cannot easily observe how focused you were at a task or which distractions affected your concentration during the day. Alternatively, she could write a contract specifying that your pay will increase based on the output you produce (e.g., $2 per unit of output without defects). Sound good?
Well, if the relationship between effort and units of output were not affected by any randomness, then the manager would be able to infer effort from output. For instance, if every hour of effort yields 4 units of output, the observation of 16 units of output must imply that the worker exerted 4 hours of effort. In that scenario, the observation of output would be equivalent to observing effort, and the contract would provide workers with the right incentives.

What’s the problem with this argument? Simply put, life is messy; effort does not simply materialize into a constant amount of output. While exerting more effort often implies producing more output, random shocks affect our performance (focus, being infected with the COVID-19 virus, sleep patterns, distractions from co-workers). Even if we put in the exact same number of working hours on 2 or more consecutive days (and try to concentrate on our jobs fully), we may obtain different results each day. This randomness between effort and output emerges both in manual jobs and intellectual tasks and cannot be ignored by managers at the time of drafting contract details.

Example 16.1 examines optimal contracts when effort is observable. We then move on to optimal contracts in contexts where the relationship between effort and output is random.

Example 16.1: Finding optimal contracts when effort is observable

Consider a worker with utility function \(u(w) = \sqrt{w}\), where \(w \geq 0\) denotes her salary. The worker experiences disutility from exerting effort, \(e\), measured by \(g(e) = e\), and her reservation utility is \(\bar{u} \geq 0\), which captures the utility that she would obtain in an alternative job (or earning an unemployment benefit). For simplicity, assume that this reservation utility is zero (\(\bar{u} = 0\)), and that there are two effort levels the worker can exert, \(e_H = 5\) and \(e_L = 0\).
As reported in the top row of table 16.1, when the worker exerts a high effort, the firm’s sales are $0 with probability 0.1, $100 with probability 0.3, and $400 with probability 0.6. As reported in the bottom row, when the worker exerts low effort, the firm’s sales are $0 with probability 0.6, $100 with probability 0.3, and $400 with probability 0.1.

Table 16.1 Probability of sales for each effort level.

              $0 in sales   $100 in sales   $400 in sales
High effort       0.1            0.3             0.6
Low effort        0.6            0.3             0.1

Intuitively, low effort makes it more likely that $0 sales occur, while high effort increases the probability of $400 in sales. As a consequence, the expected sales when the worker exerts high effort become

\[ (0.1 \times 0) + (0.3 \times 100) + (0.6 \times 400) = \$270, \]

while when she exerts low effort, expected sales are only

\[ (0.6 \times 0) + (0.3 \times 100) + (0.1 \times 400) = \$70. \]

How can the firm induce high or low effort from the worker? The worker accepts the high-effort contract if \(u(w_H) - g(e_H) \geq \bar{u}\), which in this context implies \(\sqrt{w_H} - 5 \geq 0\), or, after rearranging, \(\sqrt{w_H} \geq 5\). Squaring both sides, we obtain \(w_H \geq (5)^2 = \$25\). Because the firm seeks to pay the lowest possible salary, it will reduce \(w_H\) until condition \(w_H \geq \$25\) holds with equality, thus paying \(w_H = \$25\).

Operating similarly for low effort, the worker accepts the low-effort contract if \(u(w_L) - g(e_L) \geq \bar{u}\), which in this scenario implies \(\sqrt{w_L} - 0 \geq 0\), or, after simplifying, \(\sqrt{w_L} \geq 0\). Squaring both sides yields \(w_L \geq \$0\), which entails a salary of \(w_L = \$0\).

Lastly, we can compare the firm’s expected profits (measuring expected sales less salary) as follows:

With high effort, $270 − $25 = $245
With low effort, $70 − $0 = $70.

Therefore, the firm offers a contract \((w_H, w_L) = (\$25, \$0)\), inducing the agent to exert a high effort level.

Self-assessment 16.1 Repeat the analysis in example 16.1, but assume that the worker’s reservation utility increases to \(\bar{u} = 1/2\).
This could happen if, for instance, unemployment benefits become more generous. Find the optimal salaries \(w_H\) and \(w_L\) in this scenario, and compare them against those in example 16.1. Interpret.

Role of risk aversion. In example 16.1, we assume that the worker is risk averse (i.e., her utility function \(u(w) = \sqrt{w}\) is concave) while the firm is risk neutral (i.e., the profit function is linear in money). We showed that the principal offers a contract that pays a relatively generous salary when the worker exerts high effort (which the firm seeks to induce) but a lower payoff if she exerts low effort. If the worker is less risk averse (e.g., her utility function changes to \(u(w) = w^{9/10}\), thus being close to linear), wages become less generous. In contrast, if the worker is more risk averse (e.g., her utility function changes to \(u(w) = w^{1/10}\), thus becoming more concave), she needs more generous compensation. A similar argument would apply if the firm manager were not risk neutral, but risk averse. The end-of-chapter exercises ask you to confirm these results by altering the utility and profit functions in example 16.1.

16.2.2 Contracts When Effort Is Unobservable

When the firm cannot observe the effort that the worker exerts, it needs to provide her with incentives to exert the amount of effort that maximizes profits. (A similar argument applies if the relationship between effort and output is not random, as discussed previously.) In order to understand the optimal wage in this context, let us consider again the previous scenario where effort was observable. Under observable effort, inducing \(e_H\) gives rise to two effects: on the one hand, it increases expected profits (because higher outcomes are more likely to occur when the worker exerts \(e_H\) than \(e_L\)) but on the other hand, effort \(e_H\) is more expensive to induce than \(e_L\), as the former requires a more generous salary.
For simplicity, let us assume that the positive effect outweighs the negative effect, entailing that the firm seeks to induce a high effort \(e_H\) from the worker.

How are these effects modified when we introduce unobservability of effort? The positive effect from effort level \(e_H\) (larger expected profits) is unaffected. However, the wage that the firm must pay to induce \(e_H\) is more generous when effort is unobservable (requiring the worker to voluntarily choose high rather than low effort) than when it is observable. In summary, while the expected benefits from \(e_H\) are unaffected, its expected costs go up when effort is not observable, thus restricting the cases for which the firm continues to induce this high effort.

Example 16.2: Finding optimal contracts when effort is unobservable

Consider the firm and worker in example 16.1, but now let us allow for effort to be unobservable for the firm. In this context, assume that when the worker exerts high effort, the firm observes high output with probability 0.6, but low output otherwise, with probability 0.4. In contrast, if the worker exerts low effort, the firm observes high (low) output with probability 0.1 (0.9, respectively). Intuitively, high output is more likely to originate from high than low effort, but both efforts have a probability of yielding high or low output. Table 16.2 summarizes these probabilities.⁴

Table 16.2 Probability of high and low outputs for each effort level.

              High output   Low output
High effort       0.6           0.4
Low effort        0.1           0.9

4. Strictly, the entries in table 16.2 are “conditional probabilities” because high or low output level is affected by the effort that the worker selects. Specifically, a high output becomes more likely when the worker exerts high effort rather than low.

Because in this discussion, we assumed that the firm prefers to induce a high effort level, the firm’s problem in this context becomes

\[ \max_{w_H, w_L} \; \$270 - \underbrace{[0.6 w_H + 0.4 w_L]}_{\text{Expected labor cost}} \]

subject to

\[ \underbrace{0.6\sqrt{w_H} + 0.4\sqrt{w_L}}_{\text{Expected utility from high effort}} - 5 \geq 0 \qquad \text{(PC)} \]

\[ \underbrace{0.6\sqrt{w_H} + 0.4\sqrt{w_L}}_{\text{Expected utility from high effort}} - 5 \geq \underbrace{0.1\sqrt{w_H} + 0.9\sqrt{w_L}}_{\text{Expected utility from low effort}} - 0, \qquad \text{(IC)} \]

where $270 denotes the firm’s expected sales from high effort, while the term in brackets represents the firm’s expected labor cost (either paying \(w_H\) when the observed output is high or \(w_L\) when it is low). The first constraint of the firm’s problem simply states that the worker prefers to exert high effort (obtaining an expected utility of \(0.6\sqrt{w_H} + 0.4\sqrt{w_L}\), but suffering an effort cost of 5) rather than rejecting the firm’s contract altogether (receiving a payoff of 0). That is, the worker prefers to participate in the contract, which explains why this constraint is often referred to as the agent’s “participation constraint,” or PC.

The second constraint, however, indicates that the worker prefers to exert high rather than low effort. In other words, the contract provides her with sufficient incentive to exert high effort, which is known as the “incentive constraint,” or IC. In this context, IC holds with equality. Intuitively, if IC did not hold with equality, the firm could still reduce the salary offered to the worker when high (low) output is observed, \(w_H\) (\(w_L\), respectively), increasing its profits as a result. Because IC holds with equality, we obtain

\[ 0.6\sqrt{w_H} + 0.4\sqrt{w_L} - 5 = 0.1\sqrt{w_H} + 0.9\sqrt{w_L}, \]

or after rearranging, \(0.5\sqrt{w_H} - 0.5\sqrt{w_L} = 5\).
Solving for \(w_H\) in IC, we find \(w_H = \left(\sqrt{w_L} + 10\right)^2\). We can then plug in this result everywhere we had \(w_H\) in the previous maximization problem, simplifying it to

\[ \max_{w_L} \; \$270 - \Big[0.6\underbrace{\left(\sqrt{w_L} + 10\right)^2}_{w_H} + 0.4 w_L\Big], \]

subject to

\[ 0.6\left(\sqrt{w_L} + 10\right) + 0.4\sqrt{w_L} - 5 \geq 0. \qquad \text{(PC)} \]

Note that the firm now only has one choice variable below the max operator (salary \(w_L\)) rather than two (salaries \(w_H\) and \(w_L\)) because we plugged in \(w_H = \left(\sqrt{w_L} + 10\right)^2\), making the maximization problem a function of \(w_L\) alone.

Finally, a common approach to the problem at this point is to ignore the PC condition, treating the program as an unconstrained maximization problem (as if we didn’t have a constraint!). Once we are done solving this unconstrained problem, we will need to check that our results satisfy the PC condition. This trick indeed simplifies our calculations significantly because we can operate as if PC were absent, differentiating the firm’s objective function with respect to \(w_L\), which yields

\[
\begin{aligned}
\frac{\partial \pi}{\partial w_L} &= -\left[\frac{0.6}{\sqrt{w_L}}\left(\sqrt{w_L} + 10\right) + 0.4\right] \\
&= -0.6 - \frac{6}{\sqrt{w_L}} - 0.4 \\
&= -1 - \frac{6}{\sqrt{w_L}}.
\end{aligned}
\]

This expression is clearly negative for all salaries \(w_L\). Therefore, the firm reduces this salary as much as possible, to \(w_L^* = 0\). We can conclude that the firm pays \(w_L^* = \$0\) after observing low output, and

\[ w_H^* = \left(\sqrt{w_L^*} + 10\right)^2 = \left(\sqrt{0} + 10\right)^2 = \$100 \]

after observing high output. Relative to the case where effort is observable, the firm still pays a salary \(w_L = \$0\) after observing low output. However, the salary after high output, \(w_H\), increases from $25 to $100 when the agent’s effort is unobservable. We further check that the PC is slack (i.e., it holds with strict inequality) because

\[ 0.6\left(\sqrt{0} + 10\right) + 0.4\sqrt{0} - 5 = 1 > 0. \]

When effort is observable, PC holds with equality, leaving the worker indifferent between accepting and rejecting the contract (i.e., zero expected payoff). In contrast, when effort is unobservable, her expected payoff is larger (a utility of 1, in this example).
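
The arithmetic in examples 16.1 and 16.2 can be double-checked numerically. The following Python sketch is only an illustration and not part of the original analysis: the function names, the dictionary layout, and the grid of candidate values for the low-output salary are our own assumptions. It uses the binding IC to express the high-output salary as a function of the low-output salary, and then searches the grid for the cheapest contract that also satisfies PC.

```python
from math import sqrt

# Primitives taken from examples 16.1 and 16.2.
SALES = [0, 100, 400]                                   # sales levels (table 16.1)
P_SALES = {"high": [0.1, 0.3, 0.6], "low": [0.6, 0.3, 0.1]}
P_HIGH_OUTPUT = {"high": 0.6, "low": 0.1}               # Pr(high output | effort), table 16.2
EFFORT_COST = {"high": 5.0, "low": 0.0}                 # g(e) = e
RESERVATION_UTILITY = 0.0


def expected_sales(effort):
    """Expected sales for a given effort level."""
    return sum(p * s for p, s in zip(P_SALES[effort], SALES))


def observable_wage(effort):
    """Observable effort: PC binds, sqrt(w) - g(e) = u_bar, so w = (u_bar + g(e))^2."""
    return (RESERVATION_UTILITY + EFFORT_COST[effort]) ** 2


def expected_utility(w_high, w_low, effort):
    """Worker's expected utility when paid w_high (w_low) after high (low) output."""
    p = P_HIGH_OUTPUT[effort]
    return p * sqrt(w_high) + (1 - p) * sqrt(w_low) - EFFORT_COST[effort]


def wage_high_from_ic(w_low):
    """Binding IC: (0.6 - 0.1) * (sqrt(w_high) - sqrt(w_low)) = 5 - 0."""
    gap = (EFFORT_COST["high"] - EFFORT_COST["low"]) / (
        P_HIGH_OUTPUT["high"] - P_HIGH_OUTPUT["low"]
    )
    return (sqrt(w_low) + gap) ** 2


def unobservable_contract(grid):
    """Cheapest (w_high, w_low) on the grid with IC binding and PC satisfied."""
    best = None
    for w_low in grid:
        w_high = wage_high_from_ic(w_low)
        if expected_utility(w_high, w_low, "high") < RESERVATION_UTILITY:
            continue  # PC violated: the worker would reject this contract
        p = P_HIGH_OUTPUT["high"]
        cost = p * w_high + (1 - p) * w_low  # expected labor cost
        if best is None or cost < best[0]:
            best = (cost, w_high, w_low)
    return best[1], best[2]


grid = [0.5 * i for i in range(201)]  # hypothetical grid: w_L from $0 to $100
w_high, w_low = unobservable_contract(grid)
rent = expected_utility(w_high, w_low, "high") - RESERVATION_UTILITY

print("Expected sales (high, low):", expected_sales("high"), expected_sales("low"))
print("Observable effort wages:   ", observable_wage("high"), observable_wage("low"))
print("Unobservable effort wages: ", round(w_high, 6), round(w_low, 6))
print("Information rent (utility):", round(rent, 6))
```

Changing `RESERVATION_UTILITY` to 0.5 gives one way to explore self-assessments 16.1 and 16.2, although the grid search is a rough check rather than a substitute for the algebra.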
Our last result, that the worker’s utility is larger when the firm cannot observe her effort than when the firm can observe it, is often described by saying that the informed agent in this relationship enjoys an “information rent.”

Information rent: A utility gain that an agent enjoys when moving from a symmetric to an asymmetric information context.

Similar information rents will emerge in other contractual relationships analyzed in this chapter, in which the worker’s payoff is higher when she benefits from an information advantage relative to the firm (asymmetric information) than when both parties are symmetrically informed.

Self-assessment 16.2 Repeat the analysis in example 16.2, but assume that the worker’s reservation utility increases to \(\bar{u} = 1/2\). Find the optimal salaries \(w_H^*\) and \(w_L^*\) in this scenario, the worker’s information rent, and compare them against your findings in self-assessment 16.1 (the complete information version of this scenario). Interpret.

16.2.3 Preventing Moral Hazard

Given the inefficiencies that emerge under moral hazard, firms seek to observe the worker’s effort. For instance, the firm manager may monitor the worker’s effort. This is costly for the firm, however, even if it only measures the worker’s effort for a few minutes every month. For monitoring to be effective: (1) workers must know that monitoring may occur, as otherwise, they would behave as if the manager could never observe their effort levels; and (2) they must not know when their effort will be monitored, as otherwise, they would strategically work hard only when their effort is monitored. Is our department chair looking over our shoulder while we write this?
16.3 Adverse Selection

In this section, we continue with our analysis of asymmetric information but rather than considering scenarios where an agent does not observe the actions taken by another agent (hidden action), we now examine contexts where she cannot observe some private information of the other agent (hidden information). Examples include a buyer not observing a used car’s quality (which is only observed by the seller) or a manager not observing a job applicant’s ability. As we show, lack of information could lead the uninformed party to make a wrong decision (e.g., select a low-quality car, or a low-ability job candidate). This explains why hidden information models are also known as “adverse selection.” Examples in insurance markets abound, with insurance companies not being able to observe the risk of an insured party (an individual’s health or her driving ability).

16.3.1 Market for Lemons

In previous chapters, we assume that if buyers are interested in purchasing a good and sellers find it profitable to sell it, then a market will exist where parties exchange the good at a mutually agreeable price. In this section, however, we show that markets might fail to exist if buyers and sellers have access to different amounts of information. Because a market would exist when agents are symmetrically informed, we can say that information asymmetries can lead to market imperfections.

Following Akerlof (1970), consider a used-cars market, where quality is denoted by \(q\). Quality is a random variable whose realization is observed by the seller, but not by the buyer. For simplicity, quality \(q\) is uniformly distributed between 0 and \(\frac{3}{2}\). A car of quality \(q\) is valued as such by the buyer, and at the discounted value \(\frac{q}{3/2} = \frac{2}{3}q\) by the seller. The buyer, therefore, assigns to the car a larger valuation than the seller, and they could find prices between \(\frac{2}{3}q\) and \(q\) for which the trade makes both parties better off.
For instance, if they agree on a price \(p = \frac{3}{4}q\), the seller makes a profit of \(\pi = p - \frac{2}{3}q = \frac{3}{4}q - \frac{2}{3}q = \frac{1}{12}q\), whereas the buyer obtains a utility of \(u = q - p = q - \frac{3}{4}q = \frac{1}{4}q\).

16.3.2 Market for Lemons—Symmetric Information

When both seller and buyer observe the car’s quality \(q\), the seller only needs to charge a price \(p\) that maximizes her profits, \(p - \frac{2}{3}q\), subject to guaranteeing that the price is accepted by the buyer (i.e., \(q - p \geq 0\) or \(q \geq p\)). Formally, the seller sets \(p\) to solve

\[ \max_{p} \; p - \frac{2}{3}q \]

\[ \text{subject to } q - p \geq 0. \qquad \text{(PC)} \]

The buyer’s participation constraint (PC) must hold with equality (i.e., \(q - p = 0\) or \(q = p\)). Otherwise, the seller could charge a higher price, which would still be accepted by the buyer. Inserting \(q = p\) into the seller’s profit (in this maximization problem) simplifies it to

\[ \max_{p} \; p - \frac{2}{3}p = \frac{1}{3}p. \]

Differentiating with respect to \(p\), we obtain \(\frac{1}{3}\). Because this result is always positive, we have a corner solution where the seller increases price \(p\) as much as possible, that is to say, \(p = q\). (Higher prices would not satisfy the buyer’s PC, leading her not to purchase the car.) Therefore, the seller charges a price equal to the car’s quality \(q\), which in this scenario the buyer can perfectly observe. Importantly, all car types are traded: from those with \(q\) close to zero (poor quality, or “lemons”) to those with \(q\) close to \(\frac{3}{2}\) (good quality, or “peaches”). In summary, when both parties observe the car’s quality, no market failures arise.

Self-assessment 16.3 Repeat the analysis in subsection 16.3.2, but assume that the car quality is uniformly distributed between 0 and 2 (rather than between 0 and \(\frac{3}{2}\)). This means that the seller now values a car of quality \(q\) as \(\frac{q}{2} = \frac{1}{2}q\), rather than at \(\frac{q}{3/2} = \frac{2}{3}q\). How are the results in subsection 16.3.2 affected?
16.3.3 Market for Lemons—Asymmetric Information

How would these results change if the buyer could not observe the car’s true quality?⁵ In this context, the buyer will accept a price \(p\) if she receives a positive expected utility, \(E[q] - p \geq 0\). Term \(E[q]\) indicates the car’s expected quality, which we can find as follows:

\[ E[q] = \frac{\frac{3}{2} + 0}{2} = \frac{3}{4}, \]

because quality \(q\) is uniformly distributed between 0 and \(\frac{3}{2}\). The buyer then accepts a price \(p\) if \(\frac{3}{4} - p \geq 0\), or \(p \leq \frac{3}{4}\).

5. We have all been in that position as buyers of a used item. For instance, when we purchased our first car (a used car, of course), the seller said something along the lines of, “This is an excellent car, an old lady owned it for a few years and took great care of it.” Well, by the number of miles on the car, our “old lady” must have driven across many states every weekend! We admit that the car ended up being in great condition (a peach!), so we still trust that seller.

In this scenario, the seller’s problem now becomes

\[ \max_{p} \; p - \frac{2}{3}q \]

\[ \text{subject to } p \leq \frac{3}{4}. \qquad \text{(PC)} \]

By the same argument, the seller can raise the price \(p\) until the PC holds with equality, \(p = \frac{3}{4}\). But we have then solved this problem: the seller sets the highest acceptable price to the buyer, as any higher price yields a negative expected utility for the buyer. This price, however, leads the seller to offer only cars with quality \(q\) that satisfies \(p - \frac{2}{3}q = \frac{3}{4} - \frac{2}{3}q \geq 0\) (i.e., positive profits). Rearranging, this condition entails \(\frac{3}{4} \geq \frac{2}{3}q\), or solving for quality \(q\), we obtain

\[ \frac{3/4}{2/3} = \frac{9}{8} \geq q. \]

In other words, offering cars with qualities above \(\frac{9}{8}\) is unprofitable for the seller. This is problematic because, as we described in the symmetric information scenario, the seller and buyer could make a positive margin from exchanging cars of all quality levels if they could both observe the quality.
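
The cutoff quality derived above can be reproduced with exact fractions. The short Python sketch below is only an illustration under the assumptions of this subsection (quality uniform between 0 and 3/2, and a seller who values quality q at two-thirds of q); the function and variable names are ours, and the final line, the share of the quality range that the seller is willing to offer, is our own derived quantity rather than a result stated in the text.

```python
from fractions import Fraction

Q_MAX = Fraction(3, 2)         # quality is uniform on [0, 3/2]
SELLER_SHARE = Fraction(2, 3)  # seller values a car of quality q at (2/3) q


def expected_quality(q_max):
    """Mean of a uniform distribution on [0, q_max]."""
    return (q_max + 0) / 2


def cutoff_quality(price, seller_share):
    """Highest quality the seller offers: price - seller_share * q >= 0."""
    return price / seller_share


price = expected_quality(Q_MAX)               # buyer's PC binds: p = E[q]
cutoff = cutoff_quality(price, SELLER_SHARE)  # cars with q above this go unsold
share_offered = min(cutoff, Q_MAX) / Q_MAX    # fraction of the quality range offered

print(price, cutoff, share_offered)  # prints 3/4 9/8 3/4
```

Setting `Q_MAX = Fraction(2)` and `SELLER_SHARE = Fraction(1, 2)` is one way to experiment with the setting of self-assessment 16.4.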
As a result, the buyer’s inability to observe the car’s quality leads to the nonexistence of the market for good cars (“peaches”), whereas only bad cars (“lemons”) exist in this market. Informally, bad cars pushed out good cars from the market.

Self-assessment 16.4 Repeat the analysis in subsection 16.3.3, but assume that the car quality is uniformly distributed between 0 and 2 (rather than between 0 and \(\frac{3}{2}\)). This means that the seller now values a car of quality \(q\) as \(\frac{q}{2} = \frac{1}{2}q\), rather than at \(\frac{q}{3/2} = \frac{2}{3}q\). How are the results in subsection 16.3.3 affected?

Lemons in other markets. Similar problems emerge in the labor market where buyers of job services (firms) have access to less information than sellers of labor (job applicants). In this context, a worker privately observes her productivity, \(\theta\), but firms do not. Following this argument, firms would only offer a wage equal to the worker’s expected productivity, \(w = E[\theta]\). However, this salary attracts only workers whose productivity lies below such a salary (i.e., \(\theta \leq E[\theta]\)), but it does not attract those with relatively high productivity (i.e., \(\theta > E[\theta]\)). In short, asymmetric information prevents the existence of a market of high-skilled workers, leaving only low-skilled workers employed.

Overcoming the lemon problem. Sellers often try to overcome this market failure by offering warranties for their items (used cars). If sellers offer warranties when selling a high-quality car (peach) but not a low-quality car (lemon), the observation of whether a car comes with a warranty signals its true quality to the buyer, who can now operate as in the symmetric information context. In that scenario, markets for both lemons and peaches exist. More recent tools include CARFAX, which provides information about the car’s reported accidents, miles accumulated by every owner, large repairs, and factory recalls of defective parts, as if the car’s quality were certified by a third party.
In addition, some manufacturers offer certified preowned vehicles to signal good quality.

16.3.4 Principal-Agent Model

Consider now a scenario between a principal (firm) and an agent (worker). The principal’s profits are given by \(\pi(e) = \log(e) - w\), thus increasing in the effort \(e\) that the worker exerts, but at a decreasing rate because the log function is concave. In addition, profits decrease in the wage \(w\) that the principal pays to the agent. The agent’s utility is \(u(w, e) = w - \theta e^2\), which increases in the salary \(w\) that she receives from the firm. The second term in this utility function, \(\theta e^2\), can be understood as the worker’s “cost of effort,” which is increasing and convex in the effort she exerts, \(e\). In addition, the worker’s cost of effort increases in parameter \(\theta\). For instance, her innate productivity might be lower, and thus the same amount of effort generates a larger disutility when \(\theta\) increases.⁶ For simplicity, consider that parameter \(\theta\) is either high or low, denoted as \(\theta_H\) and \(\theta_L\), where \(\theta_H > \theta_L\).

6. The marginal cost of effort, \(2\theta e\), is also larger for workers with a high parameter \(\theta\).

16.3.5 Principal-Agent Model—Symmetric Information

When the firm observes parameter \(\theta\), it knows the cost \(\theta e^2\) that the worker incurs from exerting effort. In that scenario, the firm finds a wage \(w\) and an effort level \(e\) that maximize its profits, \(\log(e) - w\), subject to guaranteeing that the worker accepts the contract (i.e., \(w - \theta e^2 \geq 0\)). Formally, the firm sets a salary \(w\) and an effort level \(e\) to solve

\[ \max_{w, e} \; \log(e) - w \]

\[ \text{subject to } w - \theta e^2 \geq 0. \qquad \text{(PC)} \]

As in the lemon problem, where the seller increased the price she charges as much as possible, the firm now seeks to decrease wages as much as possible, while still guaranteeing workers’ acceptance. That is, the PC \(w - \theta e^2 \geq 0\) holds with equality, \(w - \theta e^2 = 0\), which yields \(w = \theta e^2\). Inserting this result into the firm’s profits transforms this problem as follows:⁷

\[ \max_{e} \; \log(e) - \theta e^2. \]
7. Because the firm’s profits now do not depend on wage \(w\), we delete it from the list of variables the firm can choose from (the max operator only includes \(e\) below it, rather than \(w\) and \(e\)).

Differentiating with respect to \(e\), we obtain \(\frac{1}{e} - 2\theta e = 0\). Solving for \(e\), we find \(\frac{1}{e} = 2\theta e\), or \(\frac{1}{2\theta} = e^2\). Applying the square root on both sides yields the optimal effort level

\[ e^{SI} = \left(\frac{1}{2\theta}\right)^{\frac{1}{2}}, \]

where superscript SI denotes “symmetric information.” Because \(\theta_H > \theta_L\), efforts satisfy

\[ e_H^{SI} = \left(\frac{1}{2\theta_H}\right)^{\frac{1}{2}} < \left(\frac{1}{2\theta_L}\right)^{\frac{1}{2}} = e_L^{SI}, \]

implying that the high-cost worker exerts a lower effort than the low-cost worker.

Lastly, we can find the optimal wages in this context using \(w = \theta e^2\). When the firm observes \(\theta_H\), it offers a wage of

\[ w_H^{SI} = \theta_H \left(e_H^{SI}\right)^2 = \theta_H \times \left[\left(\frac{1}{2\theta_H}\right)^{\frac{1}{2}}\right]^2 = \theta_H \frac{1}{2\theta_H} = \$\frac{1}{2}. \]

Similarly, when it observes \(\theta_L\), the firm offers a wage of

\[ w_L^{SI} = \theta_L \left(e_L^{SI}\right)^2 = \theta_L \times \left[\left(\frac{1}{2\theta_L}\right)^{\frac{1}{2}}\right]^2 = \theta_L \frac{1}{2\theta_L} = \$\frac{1}{2}. \]

Both types of worker receive the same wage because it is more expensive for the firm to induce effort from the high-type (because effort is more costly for this worker), implying that the firm induces less effort from her. In other words, the firm pays this type of worker the same salary as if she were a low-type worker, but this salary induces her to exert a lower effort level than the low-type worker.

Example 16.3: Principal-agent problem under symmetric information

Consider that \(\theta_H = 2\) and \(\theta_L = 1\). Using the previous results, we find that, when the firm observes the worker’s type to be \(\theta_H = 2\), it requires an effort level of

\[ e_H^{SI} = \left(\frac{1}{2\theta_H}\right)^{\frac{1}{2}} = \left(\frac{1}{2 \times 2}\right)^{\frac{1}{2}} = \frac{1}{2} \]

from the worker. In contrast, when the firm observes \(\theta_L = 1\), it requires an effort of

\[ e_L^{SI} = \left(\frac{1}{2\theta_L}\right)^{\frac{1}{2}} = \left(\frac{1}{2 \times 1}\right)^{\frac{1}{2}} = \frac{1}{\sqrt{2}}. \]

Intuitively, the firm demands more effort from the worker with the lowest cost of effort, \(e_L^{SI} > e_H^{SI}\). As shown previously, however, the firm pays the same salary to both types of workers, \(w_L^{SI} = w_H^{SI} = \$\frac{1}{2}\).
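
The closed-form results in example 16.3 are easy to verify numerically. The Python sketch below is a minimal illustration, assuming the functional forms used in this subsection (profits of log(e) minus the wage, and a binding PC so that the wage equals the cost of effort); the function names and the grid of candidate effort levels are hypothetical choices of ours. It compares the closed-form effort with a brute-force search over the grid.

```python
from math import log, sqrt


def effort_si(theta):
    """Closed-form effort under symmetric information: e = (1 / (2 * theta))^(1/2)."""
    return sqrt(1 / (2 * theta))


def wage_si(theta):
    """Binding PC: w = theta * e^2, evaluated at the optimal effort."""
    return theta * effort_si(theta) ** 2


def grid_argmax(theta, steps=3000, e_max=3.0):
    """Effort on a grid that maximizes log(e) - theta * e^2 (numerical cross-check)."""
    candidates = [e_max * i / steps for i in range(1, steps + 1)]
    return max(candidates, key=lambda e: log(e) - theta * e ** 2)


for label, theta in (("high cost (theta = 2)", 2.0), ("low cost (theta = 1)", 1.0)):
    print(
        label,
        "effort:", round(effort_si(theta), 4),
        "wage:", round(wage_si(theta), 4),
        "grid check:", round(grid_argmax(theta), 4),
    )
```

The grid search only approximates the maximizer to the resolution of the grid, so small differences from the closed-form effort for the low-cost worker are expected.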
Self-assessment 16.5 Repeat the analysis in subsection 16.3.5, but assume that the worker’s reservation utility is $2 rather than zero (see the right side of her PC). Find the optimal efforts, \(e_L^{SI}\) and \(e_H^{SI}\), and the optimal salaries, \(w_L^{SI}\) and \(w_H^{SI}\). How are the results in subsection 16.3.5 affected?

16.3.6 Principal-Agent Model—Asymmetric Information

The firm now does not observe the worker’s type \(\theta\), but it knows that a proportion \(\gamma\) of workers are \(\theta_H\), while the remaining share, \(1 - \gamma\), are \(\theta_L\). In that context, the firm maximizes its expected profits (because it does not know whether it deals with a high or a low type of worker). As in the symmetric information scenario, workers must be willing to work for the firm (PC), which implies that the contract generates a positive utility, both for the high and the low type of worker. Unlike in the symmetric information case, we must require that each type of worker prefers to choose the contract meant for her, rather than that of the other type of worker. That is, the contract meant for the high type must be profitable only for the high type, and the contract meant for the low type must be profitable only for her. We next discuss how to mathematically represent the firm’s profit maximization problem (PMP) so it takes these points into account.

Writing the firm’s problem under asymmetric information. Mathematically, we write the firm’s expected PMP as follows:

\[ \max_{w_H, e_H, w_L, e_L} \; \gamma \underbrace{[\log(e_H) - w_H]}_{\text{If worker is high type}} + (1 - \gamma) \underbrace{[\log(e_L) - w_L]}_{\text{If worker is low type}}, \]

subject to

\[
\begin{aligned}
w_H - \theta_H e_H^2 &\geq 0 && (\text{PC}_H) \\
w_L - \theta_L e_L^2 &\geq 0 && (\text{PC}_L) \\
w_H - \theta_H e_H^2 &\geq w_L - \theta_H e_L^2 && (\text{IC}_H) \\
w_L - \theta_L e_L^2 &\geq w_H - \theta_L e_H^2. && (\text{IC}_L)
\end{aligned}
\]

Intuitively, the firm offers a “menu” of two contracts, each of which specifies a wage and an effort level: one contract is meant for the high-type worker, \((w_H, e_H)\), and another is meant for the low-type worker, \((w_L, e_L)\).
The firm obtains profits of \(\log(e_H) - w_H\) when the worker is a high type, but \(\log(e_L) - w_L\) when she is a low type, so the firm maximizes its expected profits. In addition, the four constraints in this problem specify that (1) the high-type worker finds her contract acceptable, and thus prefers to participate, as indicated by \(PC_H\); (2) the low-type worker finds her contract acceptable, as reflected by \(PC_L\); (3) the high-type worker has incentives to choose the contract meant for her rather than that of the low type, as indicated by \(IC_H\); and (4) the low-type worker prefers the contract meant for her rather than that written for the high-type worker, as captured by \(IC_L\). In short, the PC constraints guarantee the voluntary participation of all types of workers, while the IC constraints ensure self-selection (i.e., each type of worker selecting the contract meant for her).

Simplifying this problem. In this context, it is straightforward to show that \(PC_H\) and \(IC_L\) hold with equality, which yields \(w_H - \theta_H e_H^2 = 0\) and \(w_L - \theta_L e_L^2 = w_H - \theta_L e_H^2\), respectively. (See the appendix at the end of the chapter for a step-by-step proof.) This is a common feature worth remembering: the PC of the least efficient agent (\(PC_H\) in this context, as she experiences the highest cost of effort) and the IC of the most efficient agent (\(IC_L\) in our case, as the low-type worker exhibits the lowest cost of effort) both hold with equality.

From \(PC_H\) binding, we find that \(w_H = \theta_H e_H^2\). We can now insert the binding \(PC_H\) into the binding \(IC_L\) to obtain

\[ w_L - \theta_L e_L^2 = \underbrace{\theta_H e_H^2}_{w_H = \theta_H e_H^2 \text{ from } PC_H} - \; \theta_L e_H^2, \]

which, rearranging, yields a salary of \(w_L = \theta_H e_H^2 + \theta_L \left(e_L^2 - e_H^2\right)\). We can now insert these results for \(w_H\) and \(w_L\) into the maximization problem, which simplifies to (see footnote 8)

\[ \max_{e_L, e_H} \; \gamma \Big[\log(e_H) - \underbrace{\theta_H e_H^2}_{w_H}\Big] + (1 - \gamma) \Big[\log(e_L) - \underbrace{\left(\theta_H e_H^2 + \theta_L \left(e_L^2 - e_H^2\right)\right)}_{w_L}\Big]. \]

8. Note that the firm's profits now do not contain wages because we already found them in the previous discussion, limiting the list of choice variables to effort levels \(e_H\) and \(e_L\) alone. In addition, we do not include the constraints \(PC_L\) and \(IC_H\). A common trick in problems with several constraints is to ignore some of them (such as these two constraints) and solve the problem as if it did not have them. At the end of the problem, however, we must check that our solutions satisfy all constraints, including those we ignored in the process.

Solving the simplified problem. Differentiating with respect to \(e_L\), we obtain

\[ (1 - \gamma) \left[\frac{1}{e_L} - 2\theta_L e_L\right] = 0, \qquad (FOC_{e_L}) \]

which, after rearranging and solving for effort \(e_L\), yields

\[ e_L^{AI} = \left(\frac{1}{2\theta_L}\right)^{1/2}, \]

where the superscript AI indicates that we are in an "asymmetric information" context. Differentiating with respect to \(e_H\), we find

\[ \gamma \left[\frac{1}{e_H} - 2\theta_H e_H\right] - 2 e_H (1 - \gamma)(\theta_H - \theta_L) = 0, \qquad (FOC_{e_H}) \]

which, after rearranging and solving for effort \(e_H\), yields

\[ e_H^{AI} = \left(\frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]}\right)^{1/2}. \]

We can now find the wage for the low-type worker. Using the expression of the binding \(IC_L\), \(w_L = \theta_H e_H^2 + \theta_L \left(e_L^2 - e_H^2\right)\), we obtain

\[
\begin{aligned}
w_L^{AI} &= \theta_H \frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]} + \theta_L \left[\frac{1}{2\theta_L} - \frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]}\right] \\
&= (\theta_H - \theta_L) \frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]} + \theta_L \frac{1}{2\theta_L} \\
&= \frac{(1 + \gamma)\theta_H - \theta_L}{2[\theta_H - (1 - \gamma)\theta_L]}.
\end{aligned}
\]

Analogously, the wage to the high-type worker is found using the binding \(PC_H\), \(w_H = \theta_H e_H^2\), yielding

\[ w_H^{AI} = \theta_H \left(e_H^{AI}\right)^2 = \theta_H \frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]}. \]

Example 16.4: Principal-agent problem under asymmetric information

Let us continue with example 16.3, where \(\theta_L = 1\) and \(\theta_H = 2\), and assume that the probability (or proportion) of high-cost workers in the pool of workers is \(\gamma = \frac{1}{3}\). We can then evaluate the optimal effort levels under asymmetric information as follows:

\[ e_L^{AI} = \left(\frac{1}{2\theta_L}\right)^{1/2} = \left(\frac{1}{2 \times 1}\right)^{1/2} = \frac{1}{\sqrt{2}}, \]

\[ e_H^{AI} = \left(\frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]}\right)^{1/2} = \left(\frac{\frac{1}{3}}{2\left[2 - \left(1 - \frac{1}{3}\right) \times 1\right]}\right)^{1/2} = \frac{1}{\sqrt{8}}, \]

while optimal wages are

\[ w_L^{AI} = \frac{(1 + \gamma)\theta_H - \theta_L}{2[\theta_H - (1 - \gamma)\theta_L]} = \frac{\left(1 + \frac{1}{3}\right) \times 2 - 1}{2\left[2 - \left(1 - \frac{1}{3}\right) \times 1\right]} = \$\frac{5}{8}, \]

\[ w_H^{AI} = \theta_H \frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]} = 2 \times \frac{\frac{1}{3}}{2\left[2 - \left(1 - \frac{1}{3}\right) \times 1\right]} = \$\frac{1}{4}. \]

Self-assessment 16.6 Repeat the analysis in example 16.4, but assume that the proportion of high-cost workers increases to \(\gamma = \frac{1}{2}\). How are the results in example 16.4 affected?

16.3.7 Principal-Agent Model: Comparing Information Settings

The results found previously showed that the introduction of asymmetric information entails no changes in effort for the worker with low cost of effort because \(e_L^{SI} = e_L^{AI} = \frac{1}{\sqrt{2}}\) in examples 16.3 and 16.4. This is often known as the "no distortion at the top" result, which predicts that the most efficient agent suffers no distortion in her effort (or output) across information scenarios. However, these results find that the high-cost worker exerts less effort when the firm is uninformed about her type than when it is informed (see footnote 9); that is,

\[ e_H^{AI} = \frac{1}{\sqrt{8}} < \frac{1}{2} = e_H^{SI}. \]

9. More generally, it is fairly straightforward to check that \(e_H^{AI} < e_H^{SI}\), because \(\left(\frac{\gamma}{2[\theta_H - (1 - \gamma)\theta_L]}\right)^{1/2} < \left(\frac{1}{2\theta_H}\right)^{1/2}\) translates into \(\gamma\theta_H < \theta_H - (1 - \gamma)\theta_L\), leading to \((1 - \gamma)\theta_L < (1 - \gamma)\theta_H\), or \(\theta_L < \theta_H\); a condition that holds true by assumption.

Salaries are, in contrast, higher for the low-cost worker (the efficient worker) under asymmetric than symmetric information (see footnote 10); that is,

\[ w_L^{AI} = \$\frac{5}{8} > \$\frac{1}{2} = w_L^{SI}, \]

but lower for the high-cost (inefficient) worker (see footnote 11); that is,

\[ w_H^{AI} = \$\frac{1}{4} < \$\frac{1}{2} = w_H^{SI}. \]

As a consequence, the efficient worker earns a positive information rent under asymmetric information because she receives a larger wage by exerting the same level of effort. Intuitively, for the efficient worker to voluntarily choose the contract meant for her, rather than that of the inefficient type, the firm must offer a positive rent. In contrast, the inefficient worker is left with zero utility (no information rent) in both the symmetric and asymmetric information contexts.
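The comparison between information settings can also be checked numerically. The following Python sketch is an illustration under stated assumptions, not part of the original text: the function names are hypothetical, and the formulas are the closed-form solutions from subsections 16.3.5 and 16.3.6 (binding \(PC_H\) and binding \(IC_L\)), evaluated at the parameter values of example 16.4.

```python
import math


def symmetric_contract(theta):
    """(effort, wage) when the firm observes theta; the PC binds at w = theta*e^2."""
    effort = math.sqrt(1 / (2 * theta))
    return effort, theta * effort ** 2


def asymmetric_menu(theta_H, theta_L, gamma):
    """(e_H, w_H, e_L, w_L) when the firm does not observe the worker's type."""
    denom = 2 * (theta_H - (1 - gamma) * theta_L)
    e_L = math.sqrt(1 / (2 * theta_L))            # no distortion for the low-cost type
    e_H = math.sqrt(gamma / denom)                # distorted downward
    w_H = theta_H * e_H ** 2                      # binding PC_H
    w_L = w_H + theta_L * (e_L ** 2 - e_H ** 2)   # binding IC_L
    return e_H, w_H, e_L, w_L


theta_H, theta_L, gamma = 2, 1, 1 / 3             # values from example 16.4
e_H_si, w_H_si = symmetric_contract(theta_H)
e_L_si, w_L_si = symmetric_contract(theta_L)
e_H_ai, w_H_ai, e_L_ai, w_L_ai = asymmetric_menu(theta_H, theta_L, gamma)

# Utility of each type from her own contract under asymmetric information.
rent_L = w_L_ai - theta_L * e_L_ai ** 2
rent_H = w_H_ai - theta_H * e_H_ai ** 2

print(f"e_H: SI = {e_H_si:.4f}, AI = {e_H_ai:.4f}")
print(f"e_L: SI = {e_L_si:.4f}, AI = {e_L_ai:.4f}")
print(f"w_H: SI = {w_H_si:.4f}, AI = {w_H_ai:.4f}")
print(f"w_L: SI = {w_L_si:.4f}, AI = {w_L_ai:.4f}")
print(f"rent of low-cost type = {rent_L:.4f}, rent of high-cost type = {rent_H:.4f}")
```

With these parameters, the low-cost worker's effort is unchanged across settings, the high-cost worker's effort falls from 1/2 to \(1/\sqrt{8}\), and the low-cost worker keeps a rent of 1/8 while the high-cost worker's utility stays at zero (up to floating-point rounding).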
In the context of example 16.4, the efficient worker's utility under asymmetric information is

\[ u_L^{AI} = w_L^{AI} - \theta_L \left(e_L^{AI}\right)^2 = \frac{5}{8} - 1 \times \frac{1}{2} = \frac{1}{8}, \]

whereas under symmetric information, her utility was

\[ u_L^{SI} = w_L^{SI} - \theta_L \left(e_L^{SI}\right)^2 = \frac{1}{2} - 1 \times \frac{1}{2} = 0, \]

implying that under asymmetric information, she earns information rent (i.e., a higher utility than under symmetric information). Regarding the inefficient worker, we find that her utility under asymmetric information is

\[ u_H^{AI} = w_H^{AI} - \theta_H \left(e_H^{AI}\right)^2 = \frac{1}{4} - 2 \times \frac{1}{8} = 0, \]

which coincides with her utility under symmetric information because

\[ u_H^{SI} = w_H^{SI} - \theta_H \left(e_H^{SI}\right)^2 = \frac{1}{2} - 2 \times \frac{1}{4} = 0, \]

illustrating that this type of worker does not earn a higher utility under asymmetric than symmetric information.

10. More generally, we can check that the wage to the low-cost worker is higher under asymmetric than symmetric information, \(w_L^{AI} > w_L^{SI}\). From these results (before plugging in specific numbers for our parameter values), we need to show that \(w_L^{AI} = \frac{(1 + \gamma)\theta_H - \theta_L}{2[\theta_H - (1 - \gamma)\theta_L]} > \frac{1}{2} = w_L^{SI}\). Simplifying this expression, we find that \((1 + \gamma)\theta_H - \theta_L > \theta_H - (1 - \gamma)\theta_L\), leading to \(\theta_H > \theta_L\), which is true by assumption.

11. One can check that, generally, the salary to the high-cost worker is lower under asymmetric than symmetric information, \(w_H^{AI} < w_H^{SI}\). From these results (before inserting specific numbers for our parameters), we need to show that \(w_H^{AI} = \frac{\gamma\theta_H}{2[\theta_H - (1 - \gamma)\theta_L]} < \frac{1}{2} = w_H^{SI}\). Simplifying this inequality, we obtain \(\gamma\theta_H < \theta_H - (1 - \gamma)\theta_L\), which leads to \(\theta_L < \theta_H\), a condition that holds true by assumption.

16.3.8 Preventing Adverse Selection

Given the inefficiencies from adverse selection, a natural question is what agents can do to ameliorate these problems. We next present three common approaches to prevent adverse selection, all of which help the uninformed agent to become better informed:

• Screening.
In insurance markets, a typical tool that firms use to reduce asymmetric information problems is by identifying groups of individuals with a higher (lower) risk and charging them a higher (lower) premium, but a low (high, respectively) deductible in case of an accident.12 Formally, we say that companies offer a menu (or list) of contracts—one meant for individuals with high risk and another for individuals with low risk—and each type of individual has incentives to select the contract meant for her. Intuitively, if you are young and healthy, you may choose a health insurance with a low monthly premium but a high deductible because you do not expect many doctor visits. If, instead, you are old or have a serious medical condition, you may prefer a health plan with a relatively high premium, but low deductibles. In summary, insurance companies design menus of contracts that work as “screening devices” to identify which individuals are more or less risky because individuals themselves have incentives to select the contract they prefer and, by doing so, reveal their riskiness to the company. • Signaling. Another common tool to prevent adverse selection problems is the informed party (worker) doing something costly, such as earning a graduate degree, to signal her type to the uninformed party (firm). In the principal-agent model, however, this signaling from the worker to the firm can occur only if the worker has the ability to send a signal before the firm offers the contract. Consider that the worker has an undergraduate degree and has only two available actions at this point: earn a master’s degree in her field or not. For simplicity, we assume that this degree does not change the worker’s productivity in the firm that is considering hiring her, which helps us focus on the role of education as a signaling device rather than as a productivity-enhancing tool. 
Assume, in addition, that the efficient worker (that with parameter \(\theta_L\)) suffers a cost of $100 from earning this master's degree, while the inefficient worker (that with \(\theta_H\)) suffers a cost of $300. The cost difference reflects that the efficient worker can finish her coursework faster, which reduces her tuition costs as well as other opportunity costs of time while completing the degree. While the firm is uninformed about the worker's efficiency, observing a master's degree signals that the worker is more likely to be efficient because it is more costly for the inefficient than the efficient worker to earn the degree. Education can then work as a signal that the informed agent uses to convey information to the uninformed firm. As discussed in section 16.3.3, a similar argument applies to the used-cars market, where a seller of high-quality cars may offer a 3-year warranty that the seller of low-quality cars cannot profitably match. Unfortunately, spending time and resources on acquiring a master's degree is costly for the worker, and it does not increase the worker's productivity. In other words, education helps convey information, but it only does that! Costly signaling then, while effective, gives rise to its own inefficiencies relative to the complete information scenario analyzed in section 16.3.5.

• Legal rules. Finally, most countries provide buyers with rights that can help ameliorate adverse selection problems, such as laws requiring the seller to replace the object if it breaks down during a certain period after the purchase. These laws are often known as "implied warranties" because they do not need to be included in the purchasing contract.

12. For instance, life insurance companies go through underwriting, which evaluates whether to give an applicant a policy and determines her premium.

Appendix.
Showing That \(PC_H\) and \(IC_L\) Hold with Equality

In this short appendix, we show that two of the constraints in the profit-maximization problem that the firm solves in contexts of adverse selection (hidden information), \(PC_H\) and \(IC_L\), hold with equality; that is, they bind. In contrast, the other two constraints in the firm's problem, \(PC_L\) and \(IC_H\), hold with strict inequality; that is, they are slack.

• \(PC_L\) is slack. The incentive compatibility condition of the low-cost worker, \(IC_L\), is \(w_L - \theta_L e_L^2 \geq w_H - \theta_L e_H^2\). Therefore, because \(\theta_H > \theta_L\) by definition, we have

\[ w_L - \theta_L e_L^2 \geq w_H - \theta_L e_H^2 \underbrace{>}_{\text{By } \theta_H > \theta_L} w_H - \theta_H e_H^2 \geq 0, \]

where the last inequality is \(PC_H\). Combining the first and last elements of the inequality yields \(w_L - \theta_L e_L^2 > 0\). This is the left side of \(PC_L\). In other words, we just showed that \(PC_L\) holds with strict inequality (\(>\)) rather than with a weak inequality (\(\geq\)) (i.e., \(PC_L\) is slack).

• \(IC_L\) binds. The incentive compatibility condition of the low-cost worker, \(IC_L\), must hold with equality (i.e., it must bind). Otherwise, the principal could reduce the wage that it offers to the low-cost worker, still inducing her to take the contract meant for her rather than the one meant for the high-cost worker. Therefore, \(IC_L\) holds with equality, implying that \(w_L - \theta_L e_L^2 = w_H - \theta_L e_H^2\), which, rearranging, yields \(w_L = w_H + \theta_L \left(e_L^2 - e_H^2\right)\).

• \(IC_H\) is slack. The incentive compatibility condition of the high-cost worker, \(IC_H\), says that \(w_H - \theta_H e_H^2 \geq w_L - \theta_H e_L^2\). Using the binding \(IC_L\), \(w_L = w_H + \theta_L \left(e_L^2 - e_H^2\right)\), this expression can be rewritten as

\[ w_H - \theta_H e_H^2 \geq \underbrace{w_H + \theta_L \left(e_L^2 - e_H^2\right)}_{w_L \text{ from } IC_L} - \; \theta_H e_L^2. \]

Canceling out \(w_H\) on both sides of the inequality, and rearranging, we obtain

\[ \theta_H \left(e_L^2 - e_H^2\right) - \theta_L \left(e_L^2 - e_H^2\right) \geq 0, \]

or, more compactly,

\[ (\theta_H - \theta_L) \left(e_L^2 - e_H^2\right) \geq 0. \]

The left side is strictly positive because \(\theta_H > \theta_L\) by assumption, provided that the firm induces a higher effort from the more efficient, low-cost worker than from the less efficient, high-cost worker (i.e., \(e_L > e_H\)).
Our analysis of the principal-agent problem under asymmetric information must then find that \(e_L > e_H\) in equilibrium (as we did in section 16.3.6 and example 16.4). Otherwise, \(IC_H\) would not necessarily hold strictly. Therefore, the original expression for \(IC_H\) satisfies \(w_H - \theta_H e_H^2 > w_L - \theta_H e_L^2\), a condition that holds with strict inequality. Intuitively, the high-cost agent has no incentive to take the contract meant for the low-cost agent because doing so would entail a loss.

• \(PC_H\) binds. The participation constraint of the high-cost worker, \(PC_H\), holds with equality (i.e., it must bind). Otherwise, the firm can still reduce the wage offered to the high-cost worker, while inducing her to take the contract meant for her.

Exercises

1. Moral hazard.^A Give two real-world examples where moral hazard problems exist. In all examples, identify the individuals/firms involved, their order of play, and the available actions.

2. Risk aversion–I.^A Consider the situation in example 16.1.
(a) Suppose that the worker is risk neutral (i.e., \(u(w) = w\)). Find the optimal salaries, \(w_H\) and \(w_L\), in this scenario, and compare them against those in example 16.1.
(b) Suppose that the worker is risk loving (i.e., \(u(w) = w^2\)). Find the optimal salaries, \(w_H\) and \(w_L\), in this scenario, and compare them against those in example 16.1.

3. Risk aversion–II.^A Consider the situation in example 16.2.
(a) Suppose now that the worker is risk neutral (i.e., \(u(w) = w\)). Find the optimal salaries, \(w_H\) and \(w_L\), in this scenario, and compare them against those in example 16.2.
(b) Suppose now that the worker is risk loving, with utility function \(u(w) = w^2\). Find the optimal salaries, \(w_H\) and \(w_L\), in this scenario, and compare them against those in example 16.2.

4. Risk loving–I.^B Amelia has been hired by Boeing to develop the first electric passenger airplane. Amelia's utility function is \(u(w) = \left(\frac{w}{2}\right)^2\), where \(w\) denotes wage.
She needs to balance her professional work with her personal life; hence, she experiences a disutility from exerting effort at work, \(e\), measured by \(g(e) = \sqrt{e}\), and her reservation utility is \(\bar{u} = 0\). Amelia can choose between two effort levels: high, \(e_H = 16\), or low, \(e_L = 4\). If she spends a lot of energy trying to develop Boeing's project, the probability that she is successful (unsuccessful) is 0.7 (0.3, respectively), which generates a profit of $750 million ($100 million, respectively). However, if Amelia's effort is low, the project is successful (unsuccessful) with a probability of 0.2 (0.8, respectively). It is evident that low effort implies that an electric passenger airplane is less likely to be developed by Boeing. Identify the type of contract that Boeing should offer to Amelia to induce her to exert high effort.

5. Household chores–I.^B Ana and Felix have decided that Felix will wash dishes every time Ana prepares a meal. However, Ana cannot observe how careful Felix is when washing them. She knows that when Felix is very meticulous (careless), the probability of having spotless dishes is 0.95 (0.2) and having them dirty is 0.05 (0.8, respectively). Felix's utility is measured by the time he spends watching his favorite TV show each week (i.e., \(u(t) = \sqrt{t}\)), which depends on his success at washing dishes. If he is very careful when washing dishes, his effort level is represented as \(e_H = 1.8\), and if he is careless, it is \(e_L = 0\). Ana's expected utility when Felix achieves spotless dishes is 20. Identify the contract that Ana needs to offer Felix in terms of time spent watching his show. (Assume that letting Felix watch his favorite TV show is costly for Ana because there is only one TV in the house, and she does not like this particular TV show.)

6. Household chores–II.^B Consider the situation in exercise 16.5.
Suppose that Felix has purchased a new dishwasher, at a utility cost of \(K = 1\), that increased the probability of having spotless dishes under careless effort to 0.4.
(a) Identify the contract that Ana needs to offer Felix in terms of TV time.
(b) Is Felix better off after purchasing this dishwasher? Why or why not?
(c) For what values of \(K\), the cost of the dishwasher, would Felix be better off after purchasing the dishwasher?

7. Basketball–I.^A Consider a situation where a star basketball player is in negotiation for a new contract. The player knows that if he exerts high (low) effort in a game, the probability that his team wins the championship is 0.7 (0.3), which is worth $1 million to the ownership of the team. The basketball player's utility function can be expressed as \(u(w) = w^{0.4}\), and he incurs a utility cost of 150 when exerting high effort, and 50 when exerting low effort. Assume that his reservation utility is zero.
(a) Identify the type of contract that the ownership should offer the star basketball player to induce him to exert a high effort level.
(b) Find the team's expected profits.

8. Basketball–II.^B Consider the situation in exercise 16.7, but suppose that as the star player gets older, the utility cost he incurs to exert high effort increases to 250.
(a) Identify the type of contract that the ownership should offer the star basketball player to induce him to exert a high effort level.
(b) Should the ownership even offer this contract to the star player? Why?

9. Insurance market–I.^B Consider a situation where an insurance firm wants to incentivize its policyholder to exert effort in prevention of risky behaviors. Suppose that when a policyholder exerts high effort (\(e_H = 10\)), she has a probability of 0.1 of experiencing an adverse event (e.g., an accident), which costs the insurance firm $10,000. A policyholder that exerts low effort (\(e_L = 0\)) has a probability of 0.2 of the same event happening.
The utility function for a policyholder is \(d - e^2\), where \(d\) represents the size of any policy discounts the insurance company offers her (assume that her reservation utility is equal to \(\bar{u} = 0\)). If the effort level of policyholders is observable by the insurance firm, identify the type of contract that the insurance firm should offer to induce the policyholder to exert a high effort level.

10. Insurance market–II.^B Consider the situation in exercise 16.9, but suppose that the effort level of the policyholder is no longer observable to the insurance firm. Identify the type of contract that the insurance firm should offer the policyholder to induce her to exert a high effort level.

11. Risk-averse lemons–I.^B Consider the market for lemons presented in subsection 16.3.2. Suppose that while both the buyer and seller can observe the car's quality, the buyer is now risk averse, so her utility from purchasing the car is now \(\sqrt{q} - p\).
(a) What price does the seller charge to the buyer in this situation?
(b) Is this sale profitable for the seller? What is the range of qualities that he would be willing to sell to a risk-averse buyer? (Assume that the seller's valuation does not change.)

12. Risky lemons.^C Consider the market for lemons presented in subsection 16.3.2. Suppose that while both the buyer and seller can observe the car's quality, the buyer's utility from purchasing the car is now \(q^{\alpha} - p\), where \(\alpha \in [0, 1]\) measures risk aversion (e.g., when \(\alpha = 1\), the buyer is risk neutral, whereas when \(\alpha = \frac{1}{2}\), the buyer is risk averse).
(a) As a function of \(\alpha\), what price does the seller charge to the buyer in this situation?
(b) Is this sale profitable for the seller? What is the range of qualities that he would be willing to sell to a risk-averse buyer? (Assume that the seller's valuation does not change.)
(c) What happens to the results of parts (a) and (b) as \(\alpha\) increases?

13.
Risk-averse lemons–II.^B Consider the market for lemons presented in subsection 16.3.3, where the buyer cannot observe car quality. Suppose that the buyer is now risk averse, so her utility from purchasing the car is now \(\sqrt{q} - p\).
(a) What price does the seller charge to the buyer in this situation?
(b) Is this sale profitable for the seller? What is the range of qualities that he would be willing to sell to a risk-averse buyer? (Assume that the seller's valuation does not change.)

14. Used car market.^B The used car market in Lemonville has two types of cars: high-quality cars (H) and low-quality cars (L). A proportion \(r\) of the total used cars are H type. Denote the type of car by \(K\). So, for a typical car, \(K = H\) with probability \(r\), and \(K = L\) with probability \(1 - r\). The valuations of a \(K\)-type car to the seller and buyer are \(K_s\) and \(K_b\), respectively. That is, the seller is willing to sell a \(K\)-type car at a price greater than or equal to \(K_s\), and the buyer is willing to buy it at a price less than or equal to \(K_b\). We assume that \(H_b > H_s > L_b > L_s > 0\). For simplicity, assume that \(H_b = 4\), \(H_s = 3\), \(L_b = 2\), and \(L_s = 1\). The number of used cars for sale is limited, but the demand for used cars is competitive (i.e., there is an infinite number of potential buyers).
(a) Symmetric information. Suppose that both sellers and buyers observe the type of each individual car in the market. Explain what will happen in the market.
(b) Asymmetric information. Now we assume asymmetric information: the seller observes the type of the cars he sells, but the buyers do not. We assume risk-neutral buyers, so they will buy if the price is less than the expected valuation of the car. Suppose that there is no effective way of transmitting the information on the type of the cars to the buyers, so there will be only one price in the used car market. Explain what will happen.
In particular, find an expression for the cutoff value of \(r\) such that only L cars will be traded if \(r\) lies below that cutoff. What happens if \(r\) lies above it?

15. Risk neutrality–II.^B Repeat exercise 16.14, but assume a risk-averse buyer, with utility function \(u(q) = \sqrt{q}\), where \(q > 0\) denotes his income.
(a) Symmetric information. Suppose that both the sellers and the buyers know the type of each individual car in the market. Explain what will happen in the market.
(b) Asymmetric information. Now we assume asymmetric information: the seller observes the type of the cars he sells, but the buyers do not. We assume risk-neutral buyers, so they will buy if the price is less than the expected valuation of the car. Suppose that there is no effective way of transmitting the information on the type of the cars to the buyers, so there will be only one price in the used car market. Explain what will happen. In particular, find an expression for the cutoff value of \(r\) such that only L cars will be traded if \(r\) lies below that cutoff. What happens if \(r\) lies above it?

16. Risk neutrality–I.^A Repeat the analysis in example 16.3, but assume that the worker is risk neutral (i.e., their payoff from the contract is \(w - \theta e\)). How are equilibrium results affected? [Hint: Repeat all the steps in subsection 16.3.5 to find the equilibrium effort levels and salaries, and then evaluate your findings at the parameter values considered in example 16.3.]

17. Risk neutrality–II.^B Repeat the analysis in example 16.4, but assume that the worker is risk neutral (i.e., their payoff from the contract is \(w - \theta e\)). How are equilibrium results affected? [Hint: Repeat all the steps in subsection 16.3.6 to find the equilibrium effort levels and salaries, and then evaluate your findings at the parameter values considered in example 16.4.]

18. Optimal contracts.^B Repeat the analysis in example 16.4, but assume that the worker's reservation utility is 2, rather than zero. How are the equilibrium results affected?
[Hint: Repeat all the steps in subsection 16.3.6 to find the equilibrium effort levels and salaries, and then evaluate your findings at the parameter values considered in example 16.4.]

19. Training.^A Consider the results of example 16.4. Suppose that the high-cost worker could pay for some training to lower her cost of effort to \(\theta_L\). How much would she be willing to pay to achieve this?

20. Information premiums.^B Consider the results of subsection 16.3.6.
(a) What happens to the equilibrium effort and wage levels as \(\theta_H\) increases?
(b) What happens to the equilibrium effort and wage levels as \(\theta_L\) increases?
(c) Intuitively, why does the price of one type of worker's effort (\(\theta_H\) or \(\theta_L\)) affect the wages or effort of the other type of worker?

21. Adverse selection.^A Give two real-world examples where adverse selection problems exist. In all examples, identify the individuals/firms involved, their order of play, and the available actions.

17 Externalities and Public Goods

17.1 Introduction

This chapter examines other market imperfections: externalities and public goods. Externalities arise when the actions of one agent (an individual, a firm, or a country) affect the welfare of another agent. If the agent creating the externality ignores the effects that her actions impose on other individuals, market mechanisms will not allocate resources efficiently. A common example of a negative externality is pollution; a polluting firm may ignore the effect that its emissions have on other firms or society, and thus produce a large amount of pollution. Specifically, the firm generates more pollution than would be socially optimal, as identified in this chapter. In such contexts, an explicit coordination among the affected parties may be recommended, and if this does not work, market intervention may be justified.
A similar argument applies to the case of public goods, which are goods and services from which all individuals can benefit, and for which excluding noncontributors is either unfeasible or extremely costly. A typical example of this type of good is national defense. Once defense is provided, all of us can enjoy it, whether or not we paid our taxes. Providing it to one more individual does not alter its cost, and excluding noncontributors is unfeasible. Individuals understanding these features of public goods may have incentives to free-ride because, at the end of the day, they cannot be easily excluded from enjoying the good. We discuss these incentives, as well as the potential of public policy to correct them, next.

17.2 Externalities

Externalities. The effect that the action of an agent has on the welfare of another agent, beyond the effects transmitted by changes in prices.

Examples of negative externalities abound: a firm's pollution of a river that is being used by fishing farms downstream (production externality); a driver entering a highway at peak hours, which increases the driving time for all drivers (consumption externality); or a roommate streaming online, which slows down the internet speed of other roommates in her network (consumption externality). However, the decrease in market prices that occurs after one firm brings more units for sale cannot be interpreted as an externality. To understand this point, note that if firm 1 produces more units, the market price will decrease, hurting the profits of other companies selling the same product. This effect is, however, transmitted via prices since a market for the good exists. In the example about pollution affecting downstream fishing farms, however, markets for pollution do not exist, and a similar argument applies to other examples of negative externalities.
We can similarly find examples of positive externalities, such as an individual choosing to vaccinate, thus helping other individuals around him be better protected against that illness (consumption externality), or unpatented research and development (R&D) completed by a university or a firm, which can be used for free by other firms or research centers to rapidly improve their own production processes and inventions (production externality) (see footnote 1). We next study the amount of externality generated under no regulation, and then compare it against the optimal amount of externality (e.g., pollutant emissions) for society.

1. This was the case for laser technology which, within a few years of its invention, found initially unsuspected applications, such as barcode reading in checkout counters at supermarkets, stores, and warehouses.

17.2.1 Unregulated Equilibrium

In the case of a negative externality, like a factory polluting a river, the polluter ignores the effect that its actions have on other individuals, such as poor air quality for citizens in the nearby area and larger costs for filtering water by a fishing farm downstream. If left unregulated, this polluting firm would produce a large amount of pollution, which is not necessarily optimal. In particular, the firm maximizes profits as example 17.1 illustrates.

Example 17.1: Unregulated equilibrium

Consider a monopolist facing inverse demand function \(p(q) = 10 - q\), and total cost \(TC(q) = 2q\). The firm maximizes its profits as follows:

\[ \max_{q} \; (10 - q)q - 2q. \]

Differentiating with respect to \(q\) yields \(10 - 2q - 2 = 0\), which simplifies to \(8 = 2q\). Solving for \(q\), we obtain an output of \(q^U = 4\) units, where superscript U denotes "unregulated equilibrium" (see footnote 2).

2. Because we assume an industry with a single firm, this approach coincides, of course, with our approach to find profit-maximizing output under monopoly discussed in chapter 10. A similar approach would apply if the industry was, instead, oligopolistic, where we would follow the tools learned in chapter 14. We consider this scenario in one of the end-of-chapter exercises.

Assuming that each unit of output generates \(\alpha \geq 0\) units of pollution, the total amount of pollution that this firm generates when left unregulated is \(4\alpha\). Figure 17.1 graphically represents the firm's problem, which increases output \(q\) until marginal profits are zero; that is,

\[ \frac{\partial \pi}{\partial q} = 10 - 2q - 2 = 8 - 2q = 0. \]

[Figure 17.1: Pollution in the unregulated equilibrium. Vertical axis: $; horizontal axis: \(q\). The marginal profit line, \(8 - 2q\), starts at $8 and crosses the horizontal axis at the unregulated equilibrium, \(q^U = 4\) units.]

The curve representing marginal profits, \(8 - 2q\), originates at a height of 8 and decreases in \(q\), crossing the horizontal axis at exactly \(q^U = 4\) units (see footnote 3). Intuitively, the firm has no more profit opportunities beyond that level of output: producing more than 4 units yields negative marginal profits (i.e., profits decrease), whereas producing fewer than 4 units means that the firm could still increase output and further increase its profits.

3. Recall that to find the crossing point with the horizontal axis, we only need to set this equation equal to zero, \(8 - 2q = 0\), and then solve for \(q\) to obtain \(8 = 2q\), or \(q = \frac{8}{2} = 4\) units.

Self-assessment 17.1 Repeat the analysis in example 17.1, but assume now the inverse demand function changes to \(p(q) = 14 - q\). Compare your results with those in example 17.1.

17.2.2 Social Optimum

How can we evaluate whether the unregulated amount of pollution is socially excessive or not? We first examine how much pollution would be generated by a social planner who considers both the firm's profits and the externality that pollution imposes on other individuals and firms. Example 17.2 describes this calculation.
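The unregulated equilibrium of example 17.1 can be verified with a short Python sketch. This is an illustration, not part of the original text: the function names and the parameters `a` and `c` (demand intercept and marginal cost) are hypothetical generalizations of the example's \(p(q) = 10 - q\) and \(TC(q) = 2q\), and the emission rate `alpha = 0.25` is only an illustrative value.

```python
def profit(q, a=10.0, c=2.0):
    """Monopolist's profit with inverse demand p(q) = a - q and cost TC(q) = c*q."""
    return (a - q) * q - c * q


def unregulated_output(a=10.0, c=2.0):
    """Output solving the first-order condition a - 2q - c = 0."""
    return (a - c) / 2


q_u = unregulated_output()

# Independent check: search a grid of outputs for the highest profit.
grid = [i / 100 for i in range(0, 1001)]   # 0.00, 0.01, ..., 10.00
q_grid = max(grid, key=profit)

alpha = 0.25                               # illustrative emissions per unit of output
print(f"q_U from the FOC = {q_u}, from the grid search = {q_grid}")
print(f"profit at q_U = {profit(q_u)}, pollution = {alpha * q_u}")
```

The first-order condition and the grid search agree on \(q^U = 4\) units, with profits of 16 and total pollution of \(4\alpha\).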
Example 17.2: Finding the social optimum
Continuing the scenario in example 17.1, assume that emissions \(e \geq 0\) generate an external cost of \(EC = 3e^2\), which is increasing and convex in emissions. This implies that emissions are damaging for individuals in the vicinity of the polluting factory, and at an increasing rate; that is, the first ton of carbon dioxide (CO2) might just create fog in the area, while the 10,000th ton creates serious health problems. Because emissions are defined as \(e = \alpha q\), the external cost can be rewritten as \(EC = 3(\alpha q)^2\). For example, if every unit of output generates \(\alpha = \frac{1}{4}\) units of emissions, total emissions are \(e = \frac{1}{4}q\), implying that external costs become \(EC = 3\left(\frac{1}{4}q\right)^2 = \frac{3}{16}q^2\).

The social planner cares about society as a whole, thus considering firm profits net of external costs by solving the following problem:

\[ \max_{q}\ \underbrace{[(10 - q)q - 2q]}_{\text{Profits}} - \underbrace{3(\alpha q)^2}_{\text{External cost}}. \]

This, essentially, incorporates the external cost of pollution, \(EC = 3(\alpha q)^2\), into the firm’s profit-maximization problem discussed in example 17.1. Differentiating with respect to \(q\) yields \((10 - 2q - 2) - 6\alpha q = 0\), which simplifies to \(8 = q(2 + 6\alpha)\). Solving for output \(q\), we obtain that the social optimum is

\[ q^{SO} = \frac{8}{2 + 6\alpha}, \]

which is decreasing in the rate of emissions per unit of output, \(\alpha\). If every unit of output generates 1 unit of emissions, \(\alpha = 1\), the social optimum is only \(q^{SO} = \frac{8}{2 + 6} = 1\) unit. In contrast, when output does not generate any emissions, \(\alpha = 0\), the social optimum increases to \(q^{SO} = \frac{8}{2 + 0} = 4\) units, thus coinciding with the unregulated equilibrium. Intuitively, external cost \(EC = 3(\alpha q)^2\) is zero when \(\alpha = 0\), and as a consequence, the social planner’s maximization problem coincides with that of the unregulated firm in example 17.1.

Using a similar approach as in figure 17.1, figure 17.2 depicts the social planner’s problem.
For comparison purposes, it splits this problem into two parts: marginal profit \(\frac{\partial \pi}{\partial q}\) and marginal damage \(\frac{\partial EC}{\partial q}\), where \(\frac{\partial \pi}{\partial q} = 8 - 2q\) as shown in figure 17.1, whereas \(\frac{\partial EC}{\partial q} = 6\alpha q\) is a straight line starting from the origin and growing at a rate of \(6\alpha\). Leaving the firm unregulated yields the output level where the marginal profit curve \(\frac{\partial \pi}{\partial q} = 8 - 2q\) crosses the horizontal axis, at \(q^U = 4\). The social optimum, in contrast, lies where marginal profit crosses marginal damage, at \(q^{SO} = \frac{8}{2 + 6\alpha}\). Increasing output beyond \(q^{SO}\) would generate more external costs than profits and would thus be inefficient, whereas decreasing output from \(q^{SO}\) would not be efficient either, because a larger output would increase profits more significantly than external costs.

[Figure 17.2: Socially optimal pollution. Horizontal axis: output \(q\); vertical axis: dollars. The marginal profit line, \(8 - 2q\), starts at $8 and reaches zero at \(q = 4\); the marginal damage line, \(6\alpha q\), starts at the origin; the two lines cross at \(q^{SO} = \frac{8}{2 + 6\alpha}\).]

Finally, note that the regulator does not necessarily recommend the prohibition of the externality-generating activity. Indeed, the socially optimal output \(q^{SO} = \frac{8}{2 + 6\alpha}\) decreases in \(\alpha\), but it does not become zero for any value of \(\alpha\). Even in the case in which \(\alpha = 100\) (i.e., every ton of output generates 100 tons of emissions, a rather unlikely scenario), the socially optimal output becomes \(\frac{8}{2 + (6 \times 100)} \approx 0.013\) units.

Self-assessment 17.2 Repeat the analysis in example 17.2, but assume the inverse demand function changes to \(p(q) = 14 - q\). Show that the socially optimal output \(q^{SO}\) is larger than in example 17.2. Interpret.

While example 17.2 presents a scenario in which pollution is never banned, in other industries the social planner might recommend such a prohibition, as example 17.3 illustrates.
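As a quick numerical check on examples 17.1 and 17.2, the comparison in figure 17.2 can be reproduced with a short Python sketch. It only locates where the marginal profit line crosses zero (the unregulated output) and where it crosses a marginal damage line through the origin (the social optimum). The function names and the `md_slope` argument are illustrative assumptions rather than notation from the text; the slope of the marginal damage line is passed in directly (the text reports it as 6α).

```python
from fractions import Fraction


def unregulated_output(a, c):
    """Output where marginal profit a - c - 2q equals zero.

    Assumes inverse demand p(q) = a - q and total cost c*q, as in example 17.1.
    """
    return Fraction(a - c) / 2


def planner_output(a, c, md_slope):
    """Output where marginal profit a - c - 2q equals marginal damage md_slope*q.

    md_slope is the slope of a marginal damage line through the origin
    (hypothetical parameter name; the text reports this slope as 6*alpha).
    """
    return Fraction(a - c) / (2 + Fraction(md_slope))


q_u = unregulated_output(10, 2)          # 8 / 2
q_so_alpha_1 = planner_output(10, 2, 6)  # alpha = 1, slope 6: 8 / (2 + 6)
q_so_alpha_0 = planner_output(10, 2, 0)  # alpha = 0, no damage: 8 / 2

print(q_u, q_so_alpha_1, q_so_alpha_0)   # prints: 4 1 4
```

Exact fractions are used so that non-integer slopes do not introduce rounding; a steeper marginal damage line always moves the crossing point to the left.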
Example 17.3: Prohibiting pollution
Consider example 17.2, but assume an external cost of \(EC = 3e^2 + 7e\), which is also increasing and convex in emissions \(e\), but yields a higher marginal damage than the external cost in example 17.2.4 The social planner’s problem is analogous to that in the previous example:

\[ \max_{q}\ \underbrace{[(10 - q)q - 2q]}_{\text{Profits}} - \underbrace{\left[3(\alpha q)^2 + 7\alpha q\right]}_{\text{External cost}}. \]

Differentiating with respect to \(q\) yields \((10 - 2q - 2) - (6\alpha q + 7\alpha) = 0\), which simplifies to \(8 - 7\alpha = q(2 + 6\alpha)\). Solving for output \(q\), we obtain that the social optimum is

\[ q^{SO} = \frac{8 - 7\alpha}{2 + 6\alpha}. \]

It is straightforward to check that, like the socially optimal output of example 17.2, this output is also decreasing in the rate at which output transforms into emissions, \(\alpha\), because its derivative is

\[ \frac{\partial q^{SO}}{\partial \alpha} = \frac{-7(2 + 6\alpha) - 6(8 - 7\alpha)}{(2 + 6\alpha)^2} = -\frac{62}{4(1 + 3\alpha)^2} = -\frac{31}{2(1 + 3\alpha)^2}, \]

which is negative for all values of \(\alpha\). In contrast to \(q^{SO}\) in example 17.2, the output found here can become negative if \(\alpha\) is large enough. In particular, \(\frac{8 - 7\alpha}{2 + 6\alpha} < 0\) so long as \(8 - 7\alpha < 0\), or \(\alpha > \frac{8}{7}\). Intuitively, if every unit of output generates slightly more than 1 unit of emissions, the socially optimal output should be reduced to zero, thus banning the pollution-generating activity.

Self-assessment 17.3 Repeat the analysis in example 17.3, but assume that the external cost decreases to \(EC = 3e^2 + 5e\). Find under which values of parameter \(\alpha\) the pollution-generating activity should be banned. Interpret.

4. To see this point, find the marginal damage in example 17.3, \(6\alpha q + 7\alpha\), and compare it with that in example 17.2, \(6\alpha q\). The marginal damage in example 17.3 originates at \(7\alpha\), while that in example 17.2 originated at zero. In addition, the marginal damages are parallel to each other because differentiating with respect to \(q\) again yields \(6\alpha\) for both functions. Therefore, \(6\alpha q + 7\alpha\) is parallel to \(6\alpha q\) but lies above it for all values of \(q\).
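The ban condition in example 17.3 depends only on the two vertical intercepts: marginal profit starts at 8, while marginal damage starts at 7α. The following Python sketch checks that comparison for a few values of α. The function and parameter names are hypothetical, chosen here for illustration, and the sketch treats an output of exactly zero (at α = 8/7) as a ban.

```python
from fractions import Fraction


def ban_threshold(mp_intercept, linear_damage_coeff):
    """Smallest alpha at which marginal damage at q = 0
    (linear_damage_coeff * alpha) reaches marginal profit at q = 0."""
    return Fraction(mp_intercept) / Fraction(linear_damage_coeff)


def should_ban(alpha, mp_intercept=8, linear_damage_coeff=7):
    """True when the first unit of output causes at least as much damage as profit.

    Defaults follow example 17.3: marginal profit 8 - 2q and a linear
    damage term 7e with e = alpha * q.
    """
    return Fraction(linear_damage_coeff) * Fraction(alpha) >= mp_intercept


threshold = ban_threshold(8, 7)
print(threshold)                   # prints: 8/7
print(should_ban(1))               # prints: False
print(should_ban(Fraction(6, 5)))  # prints: True
```

Because the check uses intercepts only, it does not depend on the slope of either line: once marginal damage exceeds marginal profit at the very first unit, it does so at every larger output as well.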
17.3 Restoring the Social Optimum

A natural question is: how do you induce agents to internalize the externalities that their actions impose on other individuals, rather than ignoring them completely in the unregulated equilibrium? Two approaches are often suggested: letting the parties bargain, and market intervention through government policy.

17.3.1 Bargaining between the Affected Parties

The following theorem, based on Coase (1960), identifies scenarios where bargaining can be an effective tool to address externality problems.

Coase theorem The agents producing the externality and those affected by the externality can negotiate, generating a socially optimal amount of externality, if the following conditions hold: (1) all parties are perfectly informed about each other’s benefits and costs; (2) the negotiation and transaction costs are zero; (3) the amount of the externality is observable by a third party; and (4) their agreement is enforceable. This result holds both when the property rights of the resource are assigned to the agent generating the externality (the polluter) and when they are assigned to the agent affected by the externality (the victim).

To understand this bargaining possibility (known as the “Coase theorem”), let us examine it in the context of an example. An upstream firm pollutes a river, affecting a fishing farm that is located downstream. As water becomes more polluted, the fishing farm needs to spend more resources in filtering water for its operations, thus giving rise to a production externality because pollution affects the costs of the fishing farm. For presentation purposes, we first analyze the case in which the property rights over the river are assigned to the fishing farm.

Fishing farm. If the fishing farm owns the river, it would initially be completely clean; that is, the externality-generating activity would be \(q = 0\). Is this outcome efficient?
No, because the polluting firm could pay the fishing farm for an increase in the externality-generating activity, from \(q = 0\) to \(q^{SO}\). As depicted in figure 17.2, output levels between \(q = 0\) and \(q^{SO}\) generate more profits for the polluting firm than the external costs that it imposes on the fishing farm. Formally, the marginal profit curve \(\frac{\partial \pi}{\partial q}\) lies above the marginal damage curve \(\frac{\partial EC}{\partial q}\), thus indicating that, to increase pollution, the polluting firm would be willing to pay more than what the fishing farm needs as compensation. Beyond \(q^{SO}\), the polluting firm would still obtain additional profits from further increases in pollution, but they are now smaller than the additional compensation that the fishing farm needs in order to accept such an increase in pollution. As a result, negotiating parties would reach an agreement at exactly \(q^{SO}\) units.5

Polluting firm. If, instead, the polluting firm owns the river, it would initially be completely dirty, as this firm would choose output level \(q^U\). We again ask ourselves whether this outcome is efficient. And, again, our answer is no. In this case, the fishing farm could pay the polluting firm for a decrease in the externality-generating activity, from \(q^U\) to \(q^{SO}\). As depicted in figure 17.2, output levels between \(q^{SO}\) and \(q^U = 4\) generate a larger external cost for the fishing farm than additional profits for the polluting firm. Formally, the marginal damage curve \(\frac{\partial EC}{\partial q}\) lies above the marginal profit curve \(\frac{\partial \pi}{\partial q}\), thus indicating that, to decrease pollution, the fishing farm would be willing to pay more than what the polluting firm needs as compensation. Reducing pollution below \(q^{SO}\), the fishing farm would obtain additional reductions in external costs, but they are now smaller than the additional compensation that the polluting firm needs to further decrease pollution. As a result, negotiating parties would reach an agreement at exactly \(q^{SO}\) units.
Hence, socially optimal output \(q^{SO}\) emerges as the outcome of the bargaining process between the agents, both when the polluting firm and when the fishing farm owns the river. This is great news: agents alone, without government intervention, could reach the socially optimal outcome, independent of how property rights are assigned.

5. In addition, the polluting company will not have incentives to break the agreement (by polluting more than \(q^{SO}\)) because pollution is perfectly observable by a court of law, and the contract with the fishing farm is, by assumption, enforceable.

In which cases can we expect the Coase theorem to hold, then? Essentially, this theorem holds when its main assumptions are satisfied:

• Zero negotiation costs. Negotiation costs tend to increase as more agents generate the externality and more agents are affected by it. We can then expect negotiation costs to be low when only a few agents are involved (e.g., a single polluting firm and a single firm being affected by the externality, as in the fishing farm example), but otherwise, these costs can be large.
• Well-defined property rights. In addition, we need property rights to be well defined, thus allowing both parties to know who should be compensated for an increase or decrease of the externality.
• Perfect information. Agents must be well informed about the benefits and costs that the other party experiences from the externality. This is a rather restrictive assumption. In the fishing farm example, for instance, it requires that the farm observe how beneficial each ton of pollution is for the polluting firm, so the fishing farm can assess how much to offer to reduce pollution.
• Observable pollution and enforceable contracts. In addition, the theorem requires that the amount of pollution must be observable by a third party, such as a court of law, and the contract must be enforceable in case one of the parties breaks it.
When any of these conditions does not hold, we can generally expect that the negotiation does not generate an efficient amount of the externality; in these cases, government intervention might be required. We examine these cases next.

17.3.2 Government Intervention

Public policy seeking to correct externalities often takes two forms: a quota, which sets an upper limit on the amount of the externality that agents can generate (e.g., maximum tons of CO2 that firms can emit per year, or maximum amount of fish that a fishing company can appropriate); or emission fees, which increase the cost that the firm faces per unit of the output generating externalities (e.g., the firm pays $7 per ton of cement being produced, as this production generates emissions).6 We analyze the design of each policy tool next.

Emission quotas If the regulator seeks to induce a socially optimal output \(q^{SO}\) from the polluting firm, she can simply set an emission quota of exactly \(q^{SO}\). When the firm emits less than \(q^{SO}\), no fines are imposed, whereas when the firm emits more, a hefty fine is levied. In the case of example 17.2, for instance, the regulator only needs to set the emission quota at \(q^{SO} = \frac{8}{2 + 6\alpha}\), where \(\alpha\) denotes the rate at which every unit of output transforms into emissions. For example, if \(\alpha = 1/3\), the emission quota would be \(q^{SO} = \frac{8}{2 + 6 \cdot \frac{1}{3}} = 2\) tons of CO2.

Emission fees In this scenario, if the regulator seeks to induce a socially optimal output \(q^{SO}\), she only needs to set an emission fee \(t\) that induces the firm to produce exactly \(q^{SO}\). How can she calculate the exact amount of fee \(t\) that achieves this goal? By anticipating the firm’s production behavior, the regulator knows how the firm reacts to the emission fee (which increases its unit cost by \(t\)). Example 17.4 illustrates the design of emission fees when the regulator faces the polluting monopolist from example 17.2.
Example 17.4: Finding optimal emission fees
From examples 17.1 and 17.2, a polluting monopolist faces a linear demand \(p(q) = 10 - q\) and marginal costs of 2. Consider that the regulator seeks to induce a socially optimal output of \(q^{SO} = 2\) tons of CO2. She then faces a two-stage game: in the first stage, the regulator sets emission fee \(t\); and, in the second stage, observing this fee, the polluting firm responds by choosing its output \(q\). We next solve this sequential-move game by applying backward induction, so we start by analyzing the second stage.

Second stage. If the regulator sets a fee \(t\) on every unit of output, the monopolist’s profit-maximization problem becomes

\[ \max_{q}\ (10 - q)q - (2 + t)q, \]

where the firm’s total cost increases from \(2q\) under no regulation to \((2 + t)q\) under regulation. Differentiating with respect to \(q\), we obtain \(10 - 2q - (2 + t) = 0\). Solving for \(q\), we find that the monopolist’s output is

\[ q(t) = \frac{8 - t}{2}. \]

When fees are absent (\(t = 0\)), output reduces to \(q(0) = 4\) units, as in the unregulated scenario analyzed in example 17.1. However, when the firm is subject to a positive fee \(t > 0\), its output decreases in the severity of the fee.

First stage. The regulator sets the emission fee in the first stage, while the firm responds to that fee in the second stage. The regulator can then put herself in the shoes of the monopolist, anticipating the output that maximizes the firm’s profits, \(q(t) = \frac{8 - t}{2}\), and set it equal to the socially optimal output \(q^{SO} = 2\) tons of CO2 that she seeks to induce. That is,

\[ \frac{8 - t}{2} = 2. \]

Rearranging this equation, we find \(8 - t = 4\) and, solving for emission fee \(t\), we obtain \(t = \$4\). To confirm that this fee induces the firm to produce the socially optimal output (2 tons), we can insert the fee \(t = \$4\) into the firm’s output function, \(q(t) = \frac{8 - t}{2}\), to obtain \(q(\$4) = \frac{8 - 4}{2} = 2\) units. Therefore, by setting a fee of $4 per unit of output, the regulator increases the monopolist’s costs, which ultimately induces the firm to voluntarily produce the socially optimal output.7

6. Examples include emission fees imposed by the Environmental Protection Agency (EPA) on coal-fired power plants in 2008, estimated to increase the cost of every megawatt-hour generated in plants using pulverized coal by around $33 because of its CO2 emissions and another $0.67 because of its sulphur dioxide (SO2) and nitrogen oxide (NOx) emissions.
7. In the case of a positive externality, such as vaccinations, clean air, or education, the socially optimal output \(q^{SO}\) is actually larger than the output that firms choose in the unregulated equilibrium, \(q^{SO} > q^U\). In such a context, the optimal emission fee that the regulator finds becomes negative, thus indicating that she needs to provide a negative tax (i.e., a subsidy) per unit of output to induce firms to increase their production toward the optimal output level \(q^{SO}\). In the scenario of example 17.4, consider, for instance, that \(q^{SO} = 20\), and solve for \(t\) to find the optimal subsidy per unit of output.

Self-assessment 17.4 Consider your results in self-assessment question 17.2. Following the steps in example 17.4, find the emission fee \(t\) that induces firms to produce the socially optimal output \(q^{SO}\).

For a more detailed analysis of externality problems, and how to correct them using various instruments, see Kolstad (2010). For a more technical presentation, see Phaneuf and Requate (2016).

17.4 Public Goods

The term “public goods” refers to goods and services that are nonrival (their consumption by one individual does not reduce the amount of the good available to other individuals) and nonexcludable (preventing an individual from enjoying the good is extremely expensive or impossible).
A common example is national defense, because my consumption does not reduce your consumption, and if you were to not pay your taxes tomorrow, it would be essentially impossible for the government to prevent you from enjoying national defense, even if you didn’t help in its funding.8 Another common example is clean air, because that also satisfies the two features of nonrivalry (your consumption of clean air does not reduce my own) and nonexcludability (how can you be prevented from enjoying clean air?).9 In contrast, goods that satisfy neither property are “private goods,” such as an apple, because its consumption is rival (if you eat it, I cannot enjoy the same apple) and excludable (if you don’t pay for an apple, you cannot eat it).

You might be wondering: “What if only one of the two features holds?” Table 17.1 illustrates the taxonomy of cases that emerge when combining these two features, with rivalry in rows and excludability in columns. Four cases arise:

• Public goods (nonrival and nonexcludable).
• Private goods (rival and excludable).
• Club goods (nonrival and excludable).
• Common-pool resources (rival but nonexcludable).

A “club good,” such as a gym, is nonrival because the good can be enjoyed by several members without affecting each other’s utility, unless the gym becomes too crowded.

8. Well, the government could deport you so you don’t get to enjoy national defense, but this is not a penalty for tax evasion. At least not yet!
9. Many other examples abound, such as public fireworks, official statistics, and publicly available inventions through unpatented R&D.

Table 17.1 Taxonomy of public goods.
| | Excludable | Nonexcludable |
| Rival | Private goods (Example: apples) | Common-pool resources (Example: fishing grounds) |
| Nonrival | Club goods (Example: gyms) | Public goods (Example: national defense) |

In addition, it is excludable since the gym owners can easily prevent nonmembers from entering the center by requiring users to show a membership card.10 In contrast, common-pool resources (such as forests, aquifers, hunting grounds, and fishing grounds) are rival because the exploitation of the resource by one agent reduces the stock available for other agents (e.g., if a fisherman catches 1 more ton of fish, other fishermen in the area may need to incur higher costs to catch the same amount of fish). Nonexcludable goods, such as public goods and common-pool resources, result in agents exhibiting free-riding behavior, in which consumers do not pay for the goods because they expect that others will pay.11 Example 17.5 illustrates free-riding in a familiar situation.

10. A more recent example of club goods is satellite TV, or pay-TV channels, because their consumption is nonrival (if you watch my favorite TV series, my consumption is not reduced), but it is excludable because you cannot watch a specific TV channel if you did not pay for it. Generally, most types of copyrighted works, such as books, movies, and software, are club goods because they all satisfy nonrivalry and excludability.
11. Common examples of free-riding are Public Broadcasting Service (PBS), with 100 million viewers and only 4 million contributors, and National Public Radio (NPR), with 22 million listeners and only 3 million contributors. Another example is individual effort in a team project for which all the participants receive the same grade. Remember the last course you took that included a team project: did you free-ride off your teammates’ effort? Or were your teammates free-riding off of you?

Example 17.5: Free-riding of public goods
Consider two roommates cleaning their apartment on a Saturday. Every roommate \(i\) simultaneously and independently chooses the number of hours she spends cleaning, \(h_i\), where \(h_i \in [0, 24]\) because she cannot spend more than 24 hours a day, and her utility from cleaning is given by

\[ u_i(h_i, h_j) = \underbrace{(24 - h_i)}_{\text{Leisure}} + \underbrace{\beta h_i (h_i + h_j)}_{\text{Cleaner apartment}}. \]

The first term here indicates the number of hours she enjoys in leisure, \(24 - h_i\) (i.e., the hours she does not spend cleaning). For instance, if she spends \(h_i = 2\) hours cleaning the apartment, she has \(24 - 2 = 22\) hours left in the day for leisure (or at least activities other than cleaning). The second term, instead, reflects the benefit that she obtains from living in a cleaner apartment, which increases in both the hours she dedicates to cleaning, \(h_i\), and the hours her roommate spends cleaning, \(h_j\). This benefit is increasing in parameter \(\beta > 0\).

In particular, roommate \(i\) chooses the hours she spends cleaning, \(h_i\), to maximize her utility \(u_i(h_i, h_j)\). Differentiating with respect to \(h_i\), we obtain \(-1 + 2\beta h_i + \beta h_j = 0\). Rearranging this expression yields \(2\beta h_i = 1 - \beta h_j\), and solving for \(h_i\), we find \(h_i = \frac{1 - \beta h_j}{2\beta}\), which we can express as

\[ h_i = \frac{1}{2\beta} - \frac{1}{2} h_j. \]

[Figure 17.3: Individual \(i\)’s best response function. Horizontal axis: \(h_j\); vertical axis: \(h_i\). The line has vertical intercept \(\frac{1}{2\beta}\) and horizontal intercept \(\frac{1}{\beta}\).]

Because this expression determines the optimal cleaning time for individual \(i\), \(h_i\), as a function of individual \(j\)’s cleaning time, \(h_j\), it can be understood as a “best response function,” similar to those we found in the oligopoly markets of chapter 14. Specifically, when individual \(j\) spends no time cleaning, \(h_j = 0\), individual \(i\) responds with \(h_i = \frac{1}{2\beta}\), as depicted in the vertical intercept of figure 17.3; and when individual \(j\) increases her cleaning time, individual \(i\) responds by decreasing her own because \(i\) can free-ride off \(j\)’s cleaning time (which explains the negative slope of the best response function in figure 17.3).
When individual \(j\) increases her cleaning time until \(h_j = \frac{1}{\beta}\) (or beyond that), individual \(i\) responds by spending no time cleaning, as illustrated in the horizontal intercept of figure 17.3.12

12. To illustrate this point, we only need to set the best response function equal to zero, \(\frac{1}{2\beta} - \frac{1}{2} h_j = 0\), and solve for \(h_j\), obtaining \(h_j = \frac{1}{\beta}\).

A symmetric best response function applies to individual \(j\), \(h_j(h_i)\). Invoking symmetry, we can obtain the symmetric equilibrium cleaning time, where \(h_i^* = h_j^* = h^*\). Inserting \(h^*\) into \(h_i(h_j)\) yields \(h^* = \frac{1 - \beta h^*}{2\beta}\), or \(2\beta h^* = 1 - \beta h^*\). Solving for \(h^*\) entails

\[ h^* = \frac{1}{3\beta}. \]

Therefore, every roommate spends \(\frac{1}{3\beta}\) hours cleaning. For instance, if \(\beta = 1/10\), every individual would spend \(\frac{1}{3(1/10)} \approx 3.3\) hours cleaning on Saturday.

Self-assessment 17.5 Repeat the analysis in example 17.5, but assume that every roommate has only 12 hours (rather than 24), so the benefit of leisure in her utility function decreases to \(12 - h_i\). How are the results in example 17.5 affected?

How can free-riding be prevented? A common policy tool is to require all individuals to pay for the provision of the good via taxes rather than voluntary contributions. This tool is often criticized, as it requires both users and nonusers of a public good (e.g., a highway) to pay for it. A less extreme tool is to require users to pay a certain amount every time they use the good. In the case of highways, for instance, drivers must pay tolls every time they access a road, which essentially transforms the nature of the good from nonexcludable (freeway) to excludable (controlled-access highway). Besides helping fund the highway, tolls are also used to alleviate traffic congestion, as they vary significantly according to the time of day the driver enters the highway, reaching their highest (lowest) dollar amount during peak (valley) hours, when the most (least) traffic congestion occurs.
Examples include highways in California, Chile, Brazil, and Singapore, where each driver installs a transponder on her car’s windshield, adding funds to it online. Drivers do not slow down when passing through the transponder’s reader (usually a large arc at the entry point of the highway), which makes traveling through the toll area more convenient. At several points close to the highway, drivers are informed of the toll price for that time of day so that they can decide whether to access the highway or not.13

13. For a more detailed presentation of public goods and policy, see Hindriks and Myles (2013).

17.4.1 A Look at Behavioral Economics—Public-Good Experiments

Several researchers have studied public-good games in controlled experiments in many countries. In a typical experiment, every individual is asked to sit at a computer terminal and is presented with a game in which she can independently choose how many dollars (or tokens) to contribute to a private account (which only she can enjoy) or to a public account (which provides benefits to all individuals in the group, thus capturing the nonrival property). Overall, these experiments found that individuals tend to make relatively high donations to the public good, but these contributions can decrease rapidly as individuals interact during several rounds. However, average contributions increase as the benefit from the public good increases.14

17.5 Common-Pool Resources

In this section, we investigate equilibrium and socially optimal appropriation in a common-pool resource like a fishing ground, a forest, or an aquifer. Assume that \(N\) individuals have access to the resource. Every unit of appropriation (e.g., 1 ton of fish) is sold in the international market which, for simplicity, is assumed to be perfectly competitive.
Intuitively, every fisherman’s appropriation (e.g., 20 tons of cod) represents a small share of industry catches, and thus does not affect market prices for this variety of fish. As a result, every firm takes the market price \(p\) as given, which we normalize to \(p = \$1\) to facilitate this analysis. In addition, every firm faces the following cost function:

\[ C(q_i, Q_{-i}) = \frac{q_i (q_i + Q_{-i})}{S}, \qquad (17.1) \]

where \(Q_{-i} = \sum_{j \neq i} q_j\) represents the sum of all appropriations by individuals other than \(i\). For instance, when only two fishermen exploit the resource (fishermen 1 and 2), the cost function in equation (17.1) simplifies to \(C(q_1, q_2) = \frac{q_1 (q_1 + q_2)}{S}\) for fisherman 1 (so that \(Q_{-1} = q_2\)), and similarly \(C(q_2, q_1) = \frac{q_2 (q_2 + q_1)}{S}\) for fisherman 2 (so that \(Q_{-2} = q_1\)).15 In addition, \(S > 0\) denotes the stock of the resource. Intuitively, a more abundant resource (higher \(S\)) decreases fisherman \(i\)’s cost because fish is easier to catch. Importantly, this cost function is increasing in fisherman \(i\)’s own appropriation, \(q_i\), and in his rivals’ appropriations, \(Q_{-i}\).16 Intuitively, the fishing ground becomes more depleted as other firms appropriate fish, making it more difficult for fisherman \(i\) to catch fish.

14. For a recent survey of these experiments, see Vesterlund (2014).
15. In the case of three fishermen, \(Q_{-i}\) becomes \(Q_{-1} = q_2 + q_3\) for fisherman 1, \(Q_{-2} = q_1 + q_3\) for fisherman 2, and \(Q_{-3} = q_1 + q_2\) for fisherman 3. In addition, note that the market can still be perfectly competitive if several other fishermen, located in other fishing grounds but appropriating the same type of fish, sell their catches in the international market.
Therefore, every fisherman chooses his appropriation level \(q_i\) to maximize his profits as follows:

\[ \max_{q_i}\ \pi_i = q_i - \frac{q_i (q_i + Q_{-i})}{S}, \]

where the first term represents the fisherman’s revenue from additional units of appropriation (recall that, for simplicity, the price of every unit was normalized to \(p = \$1\)), and the second term indicates the total cost that the fisherman incurs when appropriating \(q_i\) units of fish, while his rivals appropriate \(Q_{-i}\) units.

17.5.1 Finding Equilibrium Appropriation

Differentiating with respect to \(q_i\) in the above maximization problem for fisherman \(i\), we obtain

\[ \underbrace{1}_{MR} - \underbrace{\frac{2q_i + Q_{-i}}{S}}_{MC} = 0. \]

Intuitively, the first term captures the marginal revenue from catching additional units of fish, \(MR\), whereas the second term indicates the marginal cost that the firm experiences from these additional catches, \(MC\). That is, the fisherman increases his appropriation until the marginal revenue and cost exactly offset each other. Rearranging this expression yields \(S = 2q_i + Q_{-i}\), and solving for \(q_i\), we find

\[ q_i(Q_{-i}) = \frac{S}{2} - \frac{1}{2} Q_{-i}. \qquad (BRF_i) \]

Intuitively, \(q_i(Q_{-i})\) represents fisherman \(i\)’s best response function because it describes how many units to appropriate, \(q_i\), as a response to how many units his rivals appropriate, \(Q_{-i}\). In particular, he appropriates half the available stock (\(\frac{S}{2}\)) when his rivals do not appropriate any units (\(Q_{-i} = 0\)), but his appropriation decreases as his rivals appropriate positive amounts, \(Q_{-i} > 0\), as depicted in figure 17.4.17

16. To confirm this result mathematically, note that the cost function \(C(q_i, Q_{-i})\) satisfies \(\frac{\partial C(q_i, Q_{-i})}{\partial q_i} = \frac{2q_i + Q_{-i}}{S}\), which is positive for all appropriation levels; and \(\frac{\partial C(q_i, Q_{-i})}{\partial Q_{-i}} = \frac{q_i}{S}\), which is also positive for all appropriation levels.
17. Recall that, in order to find the horizontal intercept of the best response function shown in figure 17.4, we only need to set it equal to zero, \(\frac{S}{2} - \frac{1}{2} Q_{-i} = 0\), rearrange, \(\frac{S}{2} = \frac{1}{2} Q_{-i}\), and solve for \(Q_{-i}\) to obtain \(Q_{-i} = S\). Intuitively, this point represents that if fisherman \(i\)’s rivals appropriate all the available stock, \(Q_{-i} = S\), then fisherman \(i\) responds by not appropriating anything (\(q_i = 0\)).

[Figure 17.4: Fisherman \(i\)’s best response function. Horizontal axis: \(Q_{-i}\); vertical axis: \(q_i\). The line \(BRF_i\) has vertical intercept \(\frac{S}{2}\) and horizontal intercept \(S\).]

Firms are symmetric in this scenario because they face the same price for each unit of fish ($1) and the same cost function. Therefore, the best response function of any other firm \(j\) (where \(j \neq i\)) is symmetric to the best response function discussed previously (i.e., \(q_j(Q_{-j}) = \frac{S}{2} - \frac{1}{2} Q_{-j}\)), so we only change the subscript. In a symmetric equilibrium, each fisherman appropriates the same amount of fish, implying that \(q_1^* = q_2^* = \ldots = q_N^* = q^*\), which helps us ignore the subscripts because all the firms’ catches coincide. Therefore, \(Q_{-i}^*\) becomes \(Q_{-i}^* = \sum_{j \neq i} q^* = (N - 1)q^*\), given that we sum over all \(N - 1\) fishermen other than \(i\). Inserting this result in the best response function yields

\[ q^* = \frac{S}{2} - \frac{1}{2}(N - 1)q^*, \qquad (17.2) \]

which is now a function of \(q^*\) alone (recall that the stock \(S\) and the number of fishermen \(N\) are both parameters, as opposed to \(q^*\), which is the only variable we seek to solve for). Rearranging equation (17.2) yields \(\frac{2q^* + (N - 1)q^*}{2} = \frac{S}{2}\), or \((N + 1)q^* = S\), which, solving for \(q^*\), entails an equilibrium appropriation of

\[ q^* = \frac{S}{N + 1}. \]

For instance, if the stock is \(S = 100\) tons of fish and \(N = 9\) fishermen, equilibrium appropriation becomes \(q^* = \frac{100}{9 + 1} = 10\) tons. Generally, the equilibrium appropriation \(q^*\) increases in the stock of the resource, \(S\), but decreases in the number of firms competing for the resource, \(N\).

Self-assessment 17.6 Repeat the analysis in subsection 17.5.1, but assume \(N = 12\) fishermen and a stock of \(S = 230\) tons of fish. What if the number of fishermen increases to \(N = 14\)? What if, still with \(N = 12\) fishermen, the stock of fish increases to \(S = 250\) tons? Interpret.
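The equilibrium in subsection 17.5.1 can be verified numerically for the text’s illustration (\(S = 100\), \(N = 9\)). The Python sketch below checks two things: that \(q^* = S/(N+1)\) is a fixed point of the best response function, and that no unilateral deviation on a half-ton grid yields a higher profit. The function names and the grid step are assumptions made for illustration only.

```python
from fractions import Fraction


def best_response(S, Q_others):
    """Fisherman i's best response q_i = S/2 - Q_others/2, truncated at zero."""
    return max(Fraction(0), (Fraction(S) - Fraction(Q_others)) / 2)


def symmetric_equilibrium(S, N):
    """Symmetric equilibrium appropriation q* = S / (N + 1)."""
    return Fraction(S) / (N + 1)


def profit(q_i, Q_others, S):
    """Profit with price normalized to 1: q_i - q_i * (q_i + Q_others) / S."""
    q_i, Q_others = Fraction(q_i), Fraction(Q_others)
    return q_i - q_i * (q_i + Q_others) / Fraction(S)


S, N = 100, 9
q_star = symmetric_equilibrium(S, N)
Q_rivals = (N - 1) * q_star

# q* should be a fixed point of the best response function.
is_fixed_point = best_response(S, Q_rivals) == q_star

# No unilateral deviation on a half-ton grid should earn a higher profit.
grid = [Fraction(k, 2) for k in range(0, 2 * S + 1)]
no_profitable_deviation = all(
    profit(q, Q_rivals, S) <= profit(q_star, Q_rivals, S) for q in grid
)

print(q_star, is_fixed_point, no_profitable_deviation)  # prints: 10 True True
```

The grid search is only a sanity check; the first-order condition in the text already pins down the best response exactly.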
17.5.2 Common-Pool Resources: Joint Profit Maximization

A natural question at this point is whether equilibrium appropriation is excessive or, in other words, whether fishermen can increase their profits if they coordinate their catches. As we show next, the answer is yes. For simplicity, we focus on the case of two fishermen, \(N = 2\), but a similar argument applies to common-pool resources with more fishermen.

When fishermen 1 and 2 coordinate their catches, they maximize their joint profits as follows:

\[ \max_{q_1, q_2} \; \pi_1 + \pi_2 = \underbrace{q_1 - \frac{q_1(q_1 + q_2)}{S}}_{\pi_1} + \underbrace{q_2 - \frac{q_2(q_2 + q_1)}{S}}_{\pi_2}, \]

which simplifies to

\[ \max_{q_1, q_2} \; (q_1 + q_2) - \frac{(q_1 + q_2)^2}{S}. \]

Differentiating with respect to \(q_1\), we find

\[ 1 - \frac{2(q_1 + q_2)}{S} = 0, \tag{17.3} \]

and the same result occurs after differentiating with respect to \(q_2\). Intuitively, the first term represents the marginal revenue from additional catches, while the second term captures fisherman \(i\)’s marginal cost, \(\frac{2(q_1 + q_2)}{S}\). Relative to the previous individual decision problem, increasing catches now carries a higher marginal cost, \(\frac{2(q_1 + q_2)}{S}\) instead of \(\frac{2q_1 + q_2}{S}\), because every fisherman takes into account not only the increase in his own costs, but also the increase in his rival’s costs. In short, every fisherman now internalizes the cost externality that his appropriation generates on other fishermen, as a larger \(q_i\) increases the cost of fisherman \(j\).$^{18}$

18. Unlike in the collusive behavior analyzed in chapter 14, where firms’ decision to reduce their output produces an increase in market prices, fishermen’s coordination does not affect the market prices because we consider that such prices are given on the international market. If, instead, fishermen had market power, they would have stronger incentives to coordinate their output decisions because, besides internalizing the cost externality, they could increase market prices.

Rearranging equation (17.3), we obtain \(S = 2(q_1 + q_2)\), and solving for \(q_1\) we find

\[ q_1(q_2) = \frac{S}{2} - q_2. \tag{17.4} \]

As depicted in figure 17.5, this line originates at the same height as fisherman \(i\)’s best response function in figure 17.4, \(\frac{S}{2}\), but decreases in his rival’s appropriation faster than that in figure 17.4, thus lying below it. This indicates that, for a given amount of appropriation from firm 2, \(q_2\), firm 1 chooses to appropriate fewer units when firms coordinate their exploitation of the resource (jointly maximizing profits) than when every firm independently selects its own appropriation.$^{19}$

Figure 17.5 Equilibrium versus joint-profit maximization in the commons. (Vertical axis: \(q_i\); horizontal axis: \(Q_{-i}\). Both lines start at the vertical intercept \(\frac{S}{2}\): the best response function \(q_i(Q_{-i}) = \frac{S}{2} - \frac{1}{2}Q_{-i}\) has horizontal intercept \(S\), while the joint-profit line \(q_i(Q_{-i}) = \frac{S}{2} - Q_{-i}\) has horizontal intercept \(\frac{S}{2}\).)

19. To find the horizontal intercept of expression \(q_1(q_2) = \frac{S}{2} - q_2\), we only need to set it equal to zero, \(0 = \frac{S}{2} - q_2\), and solve for \(q_2\) to obtain \(q_2 = \frac{S}{2}\), as illustrated in the horizontal intercept of figure 17.5. Intuitively, this point represents that, if fisherman 2 appropriates half of the available stock, \(q_2 = \frac{S}{2}\), fisherman 1 responds by not appropriating anything (\(q_1 = 0\)).

To confirm this finding, let us simultaneously solve for appropriation levels \(q_1\) and \(q_2\) in equation (17.4), \(q_1(q_2) = \frac{S}{2} - q_2\) for fisherman 1 and \(q_2(q_1) = \frac{S}{2} - q_1\) for fisherman 2. However, these equations perfectly overlap each other, indicating that a continuum of optimal pairs \((q_1, q_2)\) solves the joint-profit maximization problem, graphically illustrated by all points along the line \(q_1(q_2) = \frac{S}{2} - q_2\). Because firms are symmetric, the literature often considers that, among all optimal pairs, a natural choice is that in which both firms appropriate the same amount (\(q_1^{JP} = q_2^{JP} = q^{JP}\), where the superscript \(JP\) indicates “joint profit” maximization).

Inserting \(q_1^{JP} = q_2^{JP} = q^{JP}\) into equation (17.4) gives \(q^{JP} = \frac{S}{2} - q^{JP}\), and solving for \(q^{JP}\) entails \(q^{JP} = \frac{S}{4}\). Comparing this result against the equilibrium appropriation when agents independently choose their appropriation levels (evaluated for the case of \(N = 2\) fishermen), \(q^* = \frac{S}{2+1} = \frac{S}{3}\), yields \(q^* > q^{JP}\) because

\[ \frac{S}{3} > \frac{S}{4}. \]

This result says that agents exploit the resource less intensively when they coordinate their appropriation decisions (and thus internalize the cost externalities their appropriation generates) than when they do not coordinate their exploitation.

Self-assessment 17.7 Repeat the analysis in subsection 17.5.2, but assume \(N = 12\) fishermen and a stock of \(S = 230\) tons of fish. What if the number of fishermen increases to \(N = 14\)? What if, still with \(N = 12\) fishermen, the stock of fish increases to \(S = 250\) tons? Interpret.

Exercises

1. Regulated duopoly.$^{B}$ Redo example 17.1 (unregulated equilibrium), but with two firms. Then redo example 17.4 to find the optimal emission fee in a duopoly. Compare this result with that in the monopoly found in example 17.4.

2. Finding the social optimum when considering consumer surplus.$^{A}$ Redo example 17.2 (finding the social optimum), but now consider that social welfare includes consumer surplus as well. Welfare in example 17.2 only includes profits and external cost.

3. Positive externalities and social optimum.$^{A}$ Redo example 17.2, but assume positive externalities, where the external benefit function is \(EB = 5(\alpha q)^2 + 3\), where \(\alpha \in [0, \frac{1}{5})\). Find the unregulated equilibrium and social optimum.

4. Setting quotas while considering consumer surplus.$^{B}$ Redo example 17.3 (prohibiting pollution), but now consider that social welfare includes consumer surplus too. Welfare in example 17.3 only includes profits and external cost. Discuss the presence of two market imperfections (monopoly versus externalities).

5.
Optimal emission fee.$^{B}$ Redo example 17.4, but with the environmental damage function in example 17.3. What is the lowest emission fee that achieves this objective?

6. Optimal subsidy with positive externality.$^{A}$ Redo example 17.4, but assuming positive externalities, where the external benefit function is \(EB = 5(\alpha q)^2 + 3\) for \(\alpha \in [0, \frac{1}{5})\). Which subsidy per unit of output induces the monopolist to choose the socially optimal output?

7. Public goods and free-riding.$^{B}$ Consider two roommates, 1 and 2, who simultaneously choose the number of hours that they spend cleaning their apartment. In particular, assume that roommate \(i\)’s utility function when he spends \(h_i\) hours cleaning and roommate \(j\) spends \(h_j\) hours cleaning is

1/3 ui (hi , hj ) = (24 − hi ) + βhi (hi + hj ) .

As in example 17.5, the first term represents the utility that roommate \(i\) enjoys from the hours he spends not cleaning the apartment, because the day has 24 hours; and the second term measures the utility that he enjoys from a cleaner apartment, which depends on both \(h_i\) and \(h_j\), and is increasing in parameter \(\beta > 0\).
(a) Suppose that the two roommates choose their hours of cleaning independently. What is the optimal number of hours of cleaning in this context?
(b) Assume now that the two roommates can coordinate their actions, choosing \(h_i\) and \(h_j\) to maximize their joint utility \(u_i(h_i, h_j) + u_j(h_j, h_i)\). What is the optimal number of hours of cleaning in this context?
(c) Compare your results in parts (a) and (b). Interpret your comparison in terms of free-riding incentives.

8. Common-pool pasture.$^{B}$ Consider a small village that grazes sheep on an adjacent plot of land. Sheep produce wool that depends on the number of other sheep grazing in the field, such that the wool per sheep is \(w = 100 - 2S\), where \(S = s_1 + s_2\) is the number of sheep in the field. Assume the wool can be sold at $1 per unit and sheep can be bought at $4 each.
Villagers will buy a sheep so long as it is profitable.
(a) How many sheep will the village graze on their field?
(b) How many sheep should the village graze on their field? (Hint: what is the maximum profit the village can earn?)

9. Common-pool resource–I.$^{B}$ Consider a common-pool resource (e.g., a lake) operated by a single firm during two periods, appropriating \(x\) units in the first period and \(q\) units in the second period. In particular, assume that its first-period cost function is \(\frac{x^2}{3}\), while its second-period cost function is \(\frac{q^2}{3 - (1-\beta)x}\). Intuitively, parameter \(\beta\) denotes the regeneration rate of the resource. That is, if regeneration is complete, \(\beta = 1\), first- and second-period costs coincide; but if regeneration is null, \(\beta = 0\), second-period costs become \(\frac{q^2}{3 - x}\), and thus every unit of first-period appropriation \(x\) increases the firm’s second-period costs. For simplicity, assume that every unit of output is sold at a price of $1 on the international market.
(a) Find the profit-maximizing second-period appropriation.
(b) Using your result from part (a), find the profit-maximizing first-period appropriation.
(c) How are your results affected by a larger regeneration rate, \(\beta\)?

10. Common-pool resource–II.$^{C}$ Consider the scenario in exercise 9, but assume now that entry occurs in the second period, and that the second-period cost function for both incumbent and entrant becomes \(\frac{(q_i + q_j)\,q_i}{3 - (1-\beta)x_{inc}}\), where \(i \in \{inc, ent\}\).
(a) Find the profit-maximizing second-period appropriation for each firm, \(q_i\) and \(q_j\).
(b) Using your result from part (a), find the profit-maximizing first-period appropriation for the incumbent, \(x_i\).
(c) Compare the incumbent’s first-period appropriation under entry with that under no entry. Interpret your results.

11.
Common-pool resource–III.$^{B}$ Consider the common-pool resource problem in section 17.5, but with a new cost function \(C(q_i, Q_{-i}) = \frac{2q_i(q_i + Q_{-i})}{S}\), where \(q_i\) denotes firm \(i\)’s appropriation and \(Q_{-i}\) represents the sum of the appropriation from all of firm \(i\)’s rivals. Assume there are \(N\) individuals with access to the fish, which they can sell on the international market at $1 per unit, a price that every individual takes as given.
(a) Find the equilibrium appropriation of fish.
(b) What is the equilibrium appropriation of fish when there are \(N = 10\) fishermen and a stock of \(S = 100\) tons of fish?
(c) Find the appropriation of fish if the fishermen were to coordinate their catches.
(d) How much would each of the 10 now-coordinating fishermen catch if there were 100 tons of fish?

12. Pollution and optimal policy.$^{B}$ Black Smoke eatery is the only restaurant in a small town. It faces inverse demand of \(p = 25 - 0.05q\) and has costs \(TC(q) = 3 + 4q\). Unfortunately, the eatery produces a lot of unsightly black smoke at the same rate as output (so pollution is equal to \(q\)).
(a) Find the unregulated equilibrium.
(b) Assume that the external cost of Black Smoke’s pollution is \(EC = 2q\). Find the social optimum.
(c) If the regulator is to seek the socially optimal output, what pollution quota would she set?
(d) If the regulator is to seek the socially optimal output, what emission fee would she set?

13. When does the Coase theorem apply?$^{A}$ Can the following situations be effectively addressed under the Coase theorem? Discuss why or why not.
(a) Air pollution
(b) A homeowner playing loud music (negatively affecting his neighbors) within a homeowners’ association (HOA)
(c) Light pollution in a town with a powerful telescope (that needs surrounding darkness to be effective)
(d) Use of an irrigation ditch between two ranches

14.
Dealing with negative externalities.$^{B}$ Two neighbors in a rural community were fed up with the town’s landfill policies and decided to purchase land together to use as their own landfill. However, the two neighbors did not anticipate the consequences of their purchase and quickly found that their new landfill smelled. Each neighbor has 10 bags of trash. Dumping on their own land is cheap, but the two neighbors have to endure an increased bad smell; however, dumping at the town’s landfill incurs a cost of $3 per bag. Neighbor 1 lives downwind of the new landfill and endures the brunt of the smell. Her utility is

\[ u_1(b_1, b_2) = -3(10 - b_1) - (b_1 + b_2)^2 - (b_1 + b_2), \]

while the upwind neighbor’s utility is

\[ u_2(b_1, b_2) = -3(10 - b_2) - (b_1 + b_2)^2, \]

where \(b_i\) is the number of bags each neighbor dumps at her own landfill. Note that the utilities are negative, as both actions are positive numbers (i.e., \(b_1, b_2 > 0\)).
(a) How much will each neighbor dump at her new landfill?
(b) If the neighbors were to coordinate, how much would they dump at their new landfill?

15. Coase theorem in action.$^{A}$ Consider Jordan and Hannah, new neighbors in a nice neighborhood. Each runs a home business, but with different needs. Jordan runs a woodworking business that makes a lot of noise and creates a lot of sawdust, so his garage door has to stay open during the workday. Hannah runs a yoga studio, which needs a quiet environment to be successful. If Jordan runs his shop, he can make $500, while Hannah will have no customers and make $0. If Jordan does not run his shop, he makes $0, while Hannah makes $600.
(a) Assuming that Jordan has the right to operate his shop, can Hannah induce him to shut down his shop so that she can make a profit? What is the total profit?
(b) Hannah found out that Jordan can install a dust collection system in his shop for \(X\) dollars, which would allow Jordan to close his garage door and lower the noise enough for her to run her studio. What is the largest amount \(X\) that Hannah would be willing to pay for the collection system? What is the total profit?
(c) Assume that there is an HOA contract (for which Hannah is the HOA president) that does not allow Jordan to make noise. How much would Jordan offer Hannah not to enforce the agreement and allow him to operate with the door open? What would be the total profit?
(d) How much would Jordan be willing to pay for a dust collection system to allow him to operate under the HOA rules? What would be the total profit if he were to invest in the system?

16. Pollution regulation and perfect competition.$^{B}$ Consider a perfectly competitive industry that faces demand of \(p = 10 - Q\), and each firm faces a constant marginal cost of \(c = 2\). The external cost of the pollution is \(EC = 3(\alpha q)^2\).
(a) Find the unregulated equilibrium quantity \(Q^*\).
(b) Find the socially optimal quantity.
(c) Find the emission fee that would induce the socially optimal quantity.

17. Multiple polluters.$^{B}$ Two polluting utility companies offer power at a regulated price of $3 per unit but have different cost functions. The first company produces a cheaper but more polluting energy at cost \(TC_d = 2 - q_d + 0.5q_d^2\), with emissions \(e_d = 2q_d\). The second company produces a less polluting energy, but at a higher cost, \(TC_c = 4 - q_c + q_c^2\), with emissions \(e_c = q_c\).
(a) Find the amount of energy and emissions that each firm will produce if left unregulated.
(b) If the external cost of pollution is \(EC = \frac{1}{2}(e_d + e_c)^2\) (the regulator cannot directly measure each firm’s emissions, but can measure total emissions), find the socially optimal amount of output from each firm.
(c) Is it possible to find a single emissions fee \(t\) that would induce the market to produce at the social optimum?

18.
Reducing emissions.$^{B}$ A coal company produces electricity with total cost \(TC = 4q\) and emissions \(e = 2q\). The coal company, being large, is a monopoly in its region and faces demand of \(p = 20 - q\).
(a) If the external cost of emissions is \(EC = 2e\), what emission fee would induce the social optimum?
(b) Now assume that the company can invest in a new technology that reduces its emissions to \(e = 2(q - \alpha)\), at a cost of \(2\alpha^2\). Intuitively, if \(\alpha = 0\), the firm does not invest in the technology, and its emissions are the same as in part (a), while a larger investment in the technology increases \(\alpha\), which decreases emissions and the fee paid by the firm. Assuming the fee you found in part (a), how much will the company invest in \(\alpha\)?

19. Common-pool refrigerator.$^{A}$ Each year, college students find themselves living with new roommates whose lifestyle habits differ from their own. A potentially frustrating habit is the use of the common refrigerator. Discuss the use, potential problems, and solution to the use of a common refrigerator among a group of 2 or more roommates.

20. Spillover effects.$^{C}$ Firms within an industry can experience “spillover effects” of investing in new technologies. Many competing industries have firms concentrated in one region or city, where the workers at competing firms may interact with each other, leading to the exchange of ideas and this positive externality. Consider two firms that face inverse demand of \(p = 10 - q_1 - q_2\). Each firm faces total costs of \(TC = (4 - x_i - 0.25x_j)q_i + 0.5x_i^2\), where the original marginal cost of production ($4) can be reduced by investing in \(x_i\) and also decreases by a portion of its rival’s investment, at a rate of \(0.25x_j\). The cost of investing in the cost-reducing technology is \(0.5x_i^2\). Assume that, in the first stage, every firm \(i\) invests \(x_i\) dollars in R&D.
In the second stage, every firm observes the R&D investment \((x_i, x_j)\), and thus the cost function of each firm, and firms compete in quantities (à la Cournot).
(a) If the firms act competitively, how much will each firm produce, and how much of the cost-reducing technology will they invest in? (Hint: Because this is a sequential-move game where firms are perfectly informed, you need to find the subgame perfect equilibrium (SPE) of the game, operating by backward induction. You should then start analyzing firms’ output choices in the second stage of the game, for any pair of \(x_i\) and \(x_j\); and then, anticipating the profits that firms make in the second stage, find their equilibrium investment in R&D in the first stage.)
(b) Discuss how the knowledge spillover affects investment in the new technology. In other words, how does firm 2’s investment \(x_2\) affect firm 1’s decisions on output and investment, compared with the case in which the firms act cooperatively?

References

Akerlof, George A. (1970) “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism,” Quarterly Journal of Economics 84(3): 488–500. Angner, Erik (2016) A Course in Behavioral Economics, 2nd ed. Red Globe Press. Belleflamme, Paul and Martin Peitz (2015) Industrial Organization: Markets and Strategies, 2nd ed. Cambridge University Press. Besanko, David and Ronald Braeutigam (2013) Microeconomics, 5th ed. Wiley Publishers. Bolton, Gary E. and Axel Ockenfels (2000) “ERC: A Theory of Equity, Reciprocity, and Competition,” American Economic Review 90(1): 166–93. Bolton, Patrick and Mathias Dewatripont (2004) Contract Theory. MIT Press. Cabral, Luis (2017) Introduction to Industrial Organization, 2nd ed. MIT Press. Camerer, Colin F. (2003) Behavioral Game Theory: Experiments in Strategic Interaction (The Roundtable Series in Behavioral Economics). Princeton University Press. Campbell, Donald E. (2018) Incentives: Motivation and the Economics of Information, 2nd ed. Cambridge University Press. Coase, Ronald H.
(1960) “The Problem of Social Cost,” Journal of Law and Economics 3(1): 1–44. Dal Bó, Pedro and Guillaume R. Fréchette (2011) “The Evolution of Cooperation in Infinitely Repeated Games: Experimental Evidence,” American Economic Review 101(1): 411–29. Duffy, John and Jack Ochs (2009) “Cooperative Behavior and the Frequency of Social Interaction,” Games and Economic Behavior 66(2): 785–812. Fehr, Ernst and Klaus M. Schmidt (1999) “A Theory of Fairness, Competition, and Cooperation,” Quarterly Journal of Economics 114(3): 817–68. Goolsbee, Austan, Steven Levitt, and Chad Syverson (2015) Microeconomics, 2nd ed. Worth Publishers. Harrington, Joseph (2006) “How Do Cartels Operate?” Foundations and Trends in Microeconomics 2(1): 1–105. Harrington, Joseph (2014) Games, Strategies, and Decision Making, 2nd ed. Worth Publishers. Hindriks, Jean and Gareth D. Myles (2013) Intermediate Public Economics, 2nd ed. MIT Press. Jensen, Robert T. and Nolan H. Miller (2007) “Giffen Behavior: Theory and Evidence,” NBER Working Paper No. 13243. Just, David R. (2013) Introduction to Behavioral Economics. Wiley Publishers. Kagel, John H. and Dan Levin (2014) “Auctions: A Survey of Experimental Research,” Working paper, The Ohio State University. Kahneman, Daniel (2013) Thinking, Fast and Slow. Farrar, Straus and Giroux. Kahneman, Daniel and Amos Tversky (1979) “Prospect Theory: An Analysis of Decision under Risk,” Econometrica 47(2): 263–92. Kahneman, Daniel and Amos Tversky (2000) Choice, Values and Frames. Cambridge University Press. Klemperer, Paul (2004) Auctions: Theory and Practice (Toulouse Lectures in Economics). Princeton University Press. Kolstad, Charles D. (2010) Environmental Economics. Oxford University Press. Krishna, Vijay (2002) Auction Theory. Academic Press. Laffont, Jean-Jacques and David Martimort (2002) The Theory of Incentives: The Principal-Agent Model. Princeton University Press. Levenstein, Margaret C. and Valerie Y.
Suslow (2006) “What Determines Cartel Success?” Journal of Economic Literature, 44(1): 43–95. Macho-Stadler, Ines and David Perez-Castrillo (2001) An Introduction to the Economics of Information: Incentives and Contracts. Oxford University Press. McKenzie, David (2002) “Are Tortillas a Giffen Good in Mexico?,” Economics Bulletin vol. 15(1): 1–7. Menezes, Flavio M. and Paulo K. Monteiro (2004) An Introduction to Auction Theory. Oxford University Press. Milgrom, Paul (2004) Putting Auction Theory to Work. Cambridge University Press. Muñoz-Garcia, Felix (2017) Practice Exercises for Advanced Microeconomic Theory. MIT Press. Muñoz-Garcia, Felix and Daniel Toro-Gonzalez (2019) Strategy and Game Theory: Practice Exercises with Answers, 2nd ed. Springer. Nash, John F., Jr. (1950) “Equilibrium Points in N-Person Games,” Proceedings of the National Academy of Science 36(1): 48–49. Perloff, Jeffrey M. (2016) Microeconomics: Theory and Applications with Calculus, 4th ed. Pearson Publishers. Phaneuf, Daniel J. and Till Requate (2016) A Course in Environmental Economics: Theory, Policy, and Practice. Cambridge University Press. Smith, Vernon L. (1991) “Rational Choice: The Contrast between Economics and Psychology,” Journal of Political Economy 99(4): 877–97. Thaler, Richard H. (1988) “Anomalies: The Winner’s Curse,” Journal of Economic Perspectives 2(1): 191–202. Tversky, Amos and Daniel Kahneman (1986) “Rational Choice and the Framing of Decisions,” Journal of Business 59(4): 251–78. Tversky, Amos and Daniel Kahneman (1992) “Advances in Prospect Theory: Cumulative Representation of Uncertainty,” Journal of Risk and Uncertainty 5(4): 297–323. Varian, Hal (2014) Intermediate Microeconomics: A Modern Approach, 9th ed. W. W. Norton & Company. Vesterlund, Lise (2014) “Charitable Giving: A Review of Experiments on Voluntary Giving to Public Goods,” in Handbook of Experimental Economics, vol. 2, ed. C. R. Plott and V. L. Smith. Princeton University Press. 
Index Adverse selection problems, 420–421, 428 in market for lemons, 428–431 preventing, 438–439 principal-agent model of, 431–438 profit maximization problem in context of, 439–440 Advertising, by monopolies, 266–267 Airline industry, 249 Akerlof, George A., 428 Allocation rule, 397 All-pay auctions, 397 Amazon (firm), 248, 248n2 Anticoordination game, 315–316, 323 Arbitrage, 279 Arrow-Pratt coefficient of absolute risk aversion, 140–142 Auctions, 396–397 as allocation mechanism, 396–397 common-value auctions, 410–411 double auctions, 240–241 efficiency in, 409–410 experiments with, 411–412 first-price auctions, 400–409, 412–414 second-price auctions, 397–400 Average costs, 198–199 economies of scale in, 201–203 Average product, 157–161 relationship between marginal product and, 161–163 Backward induction, 334–339 Bads, 8 Bang for the buck, 50, 64 Bargaining between parties, 451–453 Battle of the Sexes game, 312–314 Bayesian Nash equilibrium, 393 Behavioral economics, 36–37, 142–143 on auctions, 411–412 on cooperation in games, 346–347 market experiments in, 240–241 prospect theory in, 145–148 public-goods experiments, 459 weighted utility in, 144–145 Bertrand model of imperfect competition, 365–369 Best response functions, 358–361 for games of incomplete information, 392–393 with incomplete information, 393–394 with product differentiation, 377–379 Bid shading, 402–403, 414 Bliss points, 12–14 Block pricing (second-degree price discrimination), 281n3 Bolton, Gary E., 37 Bolton and Ockenfels social preferences utility function, 37 Budget constraints, 45–49 Budget lines, 46 kinked, 60–63 Bundles, 7 budget restraints for, 45–49 consumer choices for, 45 income effects on, 75 preferences for, 8–14 Bundling, 277–278, 286–291 Buyers. 
See Customers Capital as fixed short-run cost, 196 as input, 155 in isocost lines, 183–185 technological progress in, 174–175 Cardinality, 14 Cartels, 263, 355, 369–373 Certainty effect, 147 prospect theory on, 148 Certainty equivalent, 139–140 Cheating, in games, 343–344 Chinese auctions, 397, 409 Cisco (firm), 268n Club goods, 455–456 Coase, Ronald H., 451 Coase theorem, 451–452 Cobb, Charles, 32n 472 Index Cobb-Douglas production functions, 156–157 in cost-minimization problem, 188 elasticity of substitution in, 178–179 finding input demands with, 190 marginal rate of technical substitution, 165, 166–167 output elasticity in, 200–201 in profit maximization problem, 216–217 Slutsky equation applied to, 100–101 for total costs, 193–194 for utility maximization problems, 53 Cobb-Douglas utility functions, 32–33 for expenditure minimization problems, 66–67 for finding income effect and substitution effect, 91–92 income elasticity in, 78 increasing income in, 77 Stone-Geary utility function and, 35 Collusion, in imperfect competition, 369–373 Common-pool resources, 455, 456, 459–464 Common-value auctions, 410–411 Comparative statics, 2 Compensated demand, 66 Compensating variation, 110–113 alternative representation of, 120–122 measuring, with quasilinear utility function, 117–118 Competition cartels and collusion in, 369–373 imperfect, 355–356 imperfect, models of, 357–369 market power in, 356–357 perfectly competitive markets, 214, 249 Competitive markets, 214 Completeness, 8–9 Concave utility, 134 Constant elasticity of substitution production function, 171 Constrained maximization problems, 64 Consumer choice, 45 budget constraints in, 45–49 kinked budget lines in, 60–63 revealed preferences in, 57–60 utility maximization problem in, 49–56 Consumers, 1. 
See also Customers Consumer surplus, 107–110 measuring, with quasilinear utility function, 117 Consumer theory, 2–4, 7 marginal rate of substitution in, 25–28 marginal utility in, 18–19 preferences for bundles in, 8–14 utility functions in, 14–17 Consumption in income-consumption curve, 79–80 in price-consumption curves, 85–87 Contract theory, 419–421 adverse selection problems in, 428–439 moral hazards in, 421–428 Convex utility, 136 Cooperation among cartel members, 371–372 in games, 346–347 Coordination game, 314–315 Cost advantages, 248 Cost functions, 193–195 Cost minimization, 183 average and marginal costs in, 198–201 cost functions in, 193–195 diseconomies of scale, 205–206 economies of scale, scope, and experience in, 201–206 input demands in, 189–193 isocost lines, 183–185 Lagrange analysis for, 206–207 problem in, 185–189 types of costs, 195–198 Cost-minimization problem, 185–189 Lagrange analysis for, 206–207 Costs of advertising, 266–267 average and marginal, 198–201 types of, 195–198 Coupons, 62–63 Cournot model of imperfect competition, 358–365 Bertrand model reconciled with, 369 for cartels, 370, 371 with incomplete information, 393–394 with N firms, 380–383 Stackelberg model and, 373, 376 Cross-price elasticity, 84 Customers adverse selection problems faced by, 428–431 first-degree price discrimination on, 279–281 legal rights of, 439 preferences for bundles among, 8–14 screening of, 438 second-degree price discrimination on, 281–284 third-degree price discrimination on, 284–286 of used cars, 420 willingness-to-pay of, 277 Deadweight loss, 263–265 Demand derivative of, 76–77 income effects and, 87–88 price changes and, 82–87 Demand advantages, 248 Demand curves consumer surplus and, 107–110 in monopoly markets, 255–256 Derivative of demand, 76–77, 83 Diodes Julianus, 396n3 Diseconomies of scale, 202–203 Double auctions, 240–241 Douglas, Paul, 32n Index 473 Duopolies, 357 Cournot model of, 362–363 Cournot model with, 382 eBay (firm), 248, 396 
Economies adding production to, 239–240 of experience, 205–206 of scale, 201–203, 248 of scope, 203–205 Efficiency in auctions, 409–410 equilibrium versus, 234–239 Efficient allocations, 233–235, 240 marginal rate of substitution and, 241–242 Elasticity constant elasticity of substitution production function, 171 of demand, in monopoly markets, 255–256 of income, 77–79, 97–98 of outputs, 199–201 of price, 83–84 of price, in monopoly markets, 256–257 Slutsky equation to represent, 101–102 of substitution, 176–179 Emissions, government interventions to reduce, 453–455 Employees adverse selection problems involving, 420–421 moral hazard problems involving, 421–428 performance of, 419–421 principal-agent model for contracts with, 431–438 Engel curve, 80–81 Environmental damage, 464 Equilibrium, 213–214 for common-pool resources, 460–461 general equilibrium, 228–240 long-run equilibrium, 225–226 short-run equilibrium, 224–225 Equilibrium allocations, 239–240 Equilibrium price, 230–233 Equivalent variation, 114–116 alternative representation of, 122–124 measuring, with quasilinear utility function, 119 Exchange economies, 239 Expected utility, 131–132 weighted utility and, 144 Expected value, 128–129 variance and, 130 Expenditure function, 120–124 Expenditure minimization problems, 65–68 utility maximization problems and, 68–70 Experience, economies of, 205–206 Experimental tests. 
See also Behavioral economics of auctions, 411–412 of cooperation in games, 346–347 of economics, 142–143 on public-goods, 459 Explicit costs, 195 Externalities, 445–446 social optimum for, 448–455 with unregulated equilibrium, 446–447 Feasible allocations, 229 Fehr, Ernst, 36, 38n Fehr-Schmidt social preferences utility function, 36–37 Financial aid to students, 281 Finite repetitions of games, 340–341 Firms, 1 cartels, 263 competition among, 355 market power of, 356–357 monopolies, 247–249 moral hazards in contracts with employees, 421–428 principal-agent model for contracts with employees, 431–438 production functions for, 155–157 production theory for, 4 profit maximization problem for, 214–217 supply curves for, 217–220 First-degree price discrimination, 277, 279–281 First-price auctions, 396 equilibrium bidding in, 401–404 in more general settings, 412–414 with N bidders, 405–406 with privately observed valuations, 400–401 with risk-averse bidders, 407–408 First Welfare Theorem, 213, 234–237 Fishing farms, 451–452 Fixed costs, 196–198 Fixed proportions production function, 169–170 elasticity of substitution in, 178 Free Application for Federal Student Aid (FAFSA) forms, 281 Free-riding, 456–458 Frontier Airlines, 249 Game of Chicken, 315–316 Game theory, 5–6 applied to common games, 310–316 on auctions, 396–414 behavioral economics on cooperation in, 346–347 games defined for, 298–300 for games with incomplete information, 391–396 game trees for, 330–331 mixed-strategy Nash equilibrium for, 316–321 Nash equilibrium in, 306–310, 332–333 for repeated games, 340–346 for sequential games, 329–330 simultaneous-move games, 297–298 strategic dominance in, 300–306 subgame perfect equilibrium in, 334–339 Game trees, 330–331 Nash equilibrium for, 332–333 474 Index Geary, Roy C., 35n General equilibrium, 213–214, 228–229 Giffen goods, 83, 89–91 Goods in bundles, 7, 8 derivative of demand for, 76–77 Engel curve for, 80–81 inferior, 82, 97–98 public goods, 455–459 
Government interventions, 453–455 Grim-Trigger Strategy, 341–345, 371–372 Health insurance moral hazards in, 421 screening of customers for, 438 Herfindahl-Hirschmann index, 356–357 Hicks, John Richard, 66n11 Hicksian demand (compensated demand), 66n11 Hidden actions. See Moral hazards Hidden information problems, 420 Imperfect competition, 355–356 Bertrand model of, 365–369 cartels and collusion in, 369–373 Cournot model of, 358–365 market power in, 356–357 models of, 357–358 product differentiation in, 377–380 Stackelberg model for, 373–376 Implicit costs, 195 Implied warranties, 439 Incentive constraints, 426, 439–440 Income in budget constraints, 45–49 effects of changes in, 75–82 and substitution effects, 87–88 Income-consumption curve, 79–80 Income effects, 87–88 alternative representation of, 98–101 in labor market, 94–97 substitution effects and, 88–94 welfare changes and, 116–119 Income elasticity, 77–79, 97–98 Slutsky equation to represent, 101–102 Indifference curves, 20–24 isoquants and, 164 marginal rate of substitution and, 25–26 Inferior goods, 76, 82, 89–91, 97–98 Infinite repetitions of games, 341–346 experimental studies of, 346–347 Information rent, 427 Input demands, 183, 189–193 Inputs, 155 in fixed proportions production function, 169–170 in isocost lines, 183–185 isoquants for substitutions of, 163–165 production functions for, 156 returns to scale of, 171–173 Insurance markets moral hazards in, 421 screening of customers in, 438 Inverse demand function, 249, 251 Inverse elasticity pricing rule, 260 Isocost lines, 183–185 in cost-minimization problem, 185–189 Isoquants, 155, 163–165, 183 in cost-minimization problem, 185–189 of fixed proportions production function, 170–171 marginal rate of technical substitution, 165–167 Iterative Deletion of Strictly Dominated Strategies, 303–307 Jensen’s inequality, 134n3 Kahneman, Daniel, 143, 145, 148 Kaiser Aluminum (firm), 195 Kinked budget lines, 60 coupons in, 62–63 for quantity discounts, 60–62 
Labor as fixed short-run cost, 196 in isocost lines, 183–185 productivity of, 157 technological progress in, 174 Labor market adverse selection problems in, 420–421 income and substitution effects on, 94–97 lemons in, 430–431 Labor-saving technological progress, 175 Lagrange multipliers for cost-minimization problem, 206–207 for utility maximization problem, 64–65 Leisure, 94–97 Lemons (undesirable used cars), 420 market for, 428–431 Leontieff, Wassily, 30n Leontieff utility function, 30n Lerner index (markup index), 257–260 Linear inverse demand, 249 Linear pricing (uniform pricing), 284 Linear production functions, 168–169 in cost-minimization problem, 189 elasticity of substitution in, 177–178 for finding input demands, 191 marginal rate of technical substitution, 167–168 for total costs, 194–195 Linear utility function, 137–138 Long-run costs, 196–197 Long-run equilibrium, 225–226 Long-run supply curves, 219–220 Loss aversion, 146–147 Index 475 Lotteries, 127, 128 expected utility of, 131–132 expected values in, 128–129 experimental research on, 142–148 inefficiency of, 409–410 prospect theory on, 145–148 risk premiums of, 138–139 variance in, 129–131 Marginal costs, 198–199 Marginal products, 157–161 marginal rate of substitution as ratio of, 175–176 relationship between average product and, 161–163 technological progress increasing, 174 Marginal rate of substitution, 25–28 efficient allocations and, 241–242 in equilibrium prices, 230 finding, 37–38 of fixed proportions production function, 170 as ratio of marginal products, 175–176 of technical substitution, 165–167 Marginal revenues for monopolies, 250–253 in third-degree price discrimination, 284 Marginal utility, 18–19 Market equilibrium long-run equilibrium, 225–226 short-run equilibrium, 224–225 Market power, 356–357 Markets, 4–5 experiments involving, 240–241 failures of, 6 general equilibrium of, 228–240 imperfect competition in, 355–356 for monopolies, 255–257 perfectly competitive, 214, 249 Market 
supply curves, 220–221 Markup index (Lerner index), 258 Marshall, Alfred, 66n12 Marshallian demand (uncompensated demand), 66n12 Microeconomics, 1–2 Mixed-strategy Nash equilibrium, 298 Monopolies, 4–6, 247–249 advertising by, 266–267 bundling by, 286–291 Cournot model with, 382 first-degree price discrimination by, 278–281 Herfindahl-Hirschman index for, 357 Lerner index and inverse elasticity pricing rule for, 257–260 markets for, 255–257 monopsonies and, 268–271 multiplant, 260–263 profit maximization problem for, 249–255 second-degree price discrimination by, 281–284 third-degree price discrimination by, 284–286 welfare analysis under, 263–265 Monopsony, 268–271 Monotonicity, 10–12, 16 Moral hazards, 419, 421–422 preventing, 428 when effort is observable, 422–424 when effort is unobservable, 424–427 Nash, John F., Jr., 307 Nash equilibrium, 297, 298, 306–310 for games of incomplete information, 392–396 for game trees, 332–333 for mixed strategies, 316–321 for sequential games, 329 National defense, 455 Natural monopolies, 248 Negative taxes, 238 Nonexpected utility, 142 Nonlinear pricing, 284 Nonsatiation, 12, 16 Normal goods, 76 Ockenfels, Axel, 37 Oil leases, auctions for, 410 Oligopolies, 355 Herfindahl-Hirschman index for, 357 One-shot games (unrepeated games), 340 Opportunity costs, 195 Oracle (firm), 268n Ordinality, 14 Organization of the Petroleum Exporting Countries (OPEC), 263, 340, 369 Outputs competition among, with product differentiation, 379–380 economies of scale in, 201–203 elasticity of, 199–201 in long-run equilibrium, 225–226 of monopolies, 249, 253–255 in profit maximization problem, 215 Partial equilibrium, 213–214 Participation constraints, 426, 429, 439–440 Patents, 248, 248n3 Payment rule, 397 Payoffs, 299 Peaches (desirable used cars), 420, 430, 431 Perfect complements utility function, 30–31 Perfectly competitive markets, 214, 249 Cournot model with, 382–383 Perfect substitutes utility function, 29–30 Performance, of employees, 
419–421 Pharmaceutical industry, 248 Players (in games), 298 Pollution, 445–447 bargaining to reduce, 451–453 government interventions to reduce, 453–455 social optimum for, 448–450 Positive externalities, 446, 454n Postcontractual problems, 421 Precontractual problems, 421 Price-consumption curves, 85–87 Price discrimination, 277–279 first-degree, 279–281 second-degree, 281–284 third-degree, 284–286 Prices Bertrand model of simultaneous price competition, 365–369 in budget constraints, 48–49 changes in, 82 compensating variation changes in, 110–113 consumer surplus and changes in, 107–110 derivative of demand and, 83 equilibrium price, 230–233 equivalent variation and changes in, 114–116 inverse elasticity pricing rule, 260 in long-run equilibrium, 225–226 in monopolies’ profit maximization problem, 249–250 in monopoly markets, 255–257 in perfectly competitive markets, 214 price-consumption curves, 85–87 price-elasticity of demand and, 83–84 responses to changes in, 192–193 Price wars, 248–249 Principal-agent model, 431 with asymmetric information, 433–436 comparing information settings for, 436–438 with symmetric information, 431–433 Prisoner’s Dilemma game, 310–312, 329–330 experimental study of, 346–347 infinite repetitions of, 341–346 mixed strategy for, 320–321 repetitions of, 340 Private goods, 455 Probabilities, 128. 
See also Uncertainty Probability weights, 146 Producer surpluses, 226–228 Product differentiation, 377–380 Production, adding to economy, 239–240 Production functions, 155–157 Cobb-Douglas, 170–171 constant elasticity of substitution, 171 fixed proportions, 169–170 isoquants, 163–165 linear, 168–169 marginal and average product, 157–161 marginal rate of technical substitution, 165–167 relationship between average product and marginal product, 161–163 returns to scale, 171–173 for technological progress, 173–175 Production theory, 4 Productivity average product and marginal product measurements of, 157–161 of employees, 420 Profit maximization problem, 214–217 in adverse selection context, 439–440 Cournot model of, 358 for monopolies, 249–255 Prospect theory, 145–148 Public goods, 445, 455–459 common-pool resources, 459–464 Quantity competition collusion in, 369–370 with N firms, 380–383 sequential, Stackelberg model of, 373–376 simultaneous, Cournot model of, 358–365 Quantity discounts, 60–62 in second-degree price discrimination, 281–284 Quasilinear utility function, 34–35 for expenditure minimization problems, 67–68 for finding income effect and substitution effect, 93–94 measuring welfare changes with, 117–119 QuiBids.com (firm), 397n Quota, 371, 453 Rationality, 2, 299 Reference points, 146 Regulators, 1 Repeated games, 340 with finite repetitions, 340–341 with infinite repetitions, 341–346 Research and development (R&D), 446 Returns to scale production functions, 171–173 Risk aversion, 127, 132–134 Arrow-Pratt coefficient of, 140–142 in auctions, 407–409 Risk loving, 134–136 Risk neutrality, 136–138 Risk premium, 138–139 measurement of, 140 Risks attitudes toward, 132–138 behavioral economics of, 142–148 measurement of, 127, 138–142 screening of customers to avoid, 438 variance in measurement of, 129–131 Rollback equilibrium (subgame perfect equilibrium), 329, 334–339 Satiation, 12–14 Scale, economies of, 201–203 Schmidt, Klaus M., 36, 38n Scope, economies of, 
203–205 Screening of customers, 285–286, 286n of insurance risks, 438 Second-degree price discrimination, 277, 281–284 Second-order conditions, 215 Second-price auctions, 397–400 Second Welfare Theorem, 213, 237–239 Sequential games, 329–330 Stackelberg model for, 373–376 Short-run costs, 196–198 Short-run equilibrium, 224–225 Short-run supply curves, 221–223 Shutdown price, 219 Signaling, 438–439 Simultaneous-move games, 297–298 Slutsky equation, 100–101 elasticities to represent, 101–102 Smith, Vernon L., 241 Soccer, 317–320 mixed strategy for, 321–323 Social optimum, 448–450 for common-pool resources, 459–464 restoring, 451–455 Social preferences, 36–37 Social welfare, 263 Stackelberg model of sequential quantity competition, 373–376 Standard deviation, 131 Stone, Richard, 35n Stone-Geary utility function, 35–36 Strategic dominance, 297, 300–306 Strategies, 5 for common games, 310–316 defined, 298 mixed-strategy Nash equilibrium, 316–321 Nash equilibrium to find, 306–310 Strict dominance, 300–306 Strict monotonicity, 10–12, 16 Subgame perfect equilibrium (rollback equilibrium), 329, 334–339 for sequential quantity competition, 375 Subgames, 335 Subsidy, 237–239 Substitution effects, 75, 87–88 alternative representation of, 98–101 constant elasticity of substitution production function, 171 elasticity of, 176–179 income effects and, 88–94 in labor market, 94–97 marginal rate of technical substitution, 165–167 Sunk costs, 195–196 Supply curves, 217–220 market supply curves, 220–221 in monopoly markets, 255 short-run supply curves, 221–223 Surpluses, 226–228 Taxes, 107, 124 negative taxes, 238 to prevent free-riding, 458 Technological progress, production functions for, 173–175 Third-degree price discrimination, 277, 284–286 Third-price auctions, 397 Tolls, 458 Transitivity, 9 Tversky, Amos, 143, 145, 148 Uncertainty, 127 behavioral economics of, 142–148 expected utility in, 131–132 expected values in, 128–129 in lotteries, 128 prospect theory on, 145–148 
variance in, 129–131 Uncompensated demand, 66 Uniform pricing (linear pricing), 284 United Airlines, 249 Unregulated equilibrium, 446–447 Unrepeated games (one-shot games), 340 Unsunk costs, 195–196 Used-cars market, 420 market for lemons in, 428–431 Utility elasticity, 33 Utility functions, 14–17 Cobb-Douglas, 32–33 diminishing marginal rate of substitution in, 26 Fehr-Schmidt social preferences function, 36–37 indifference curves, 20–24 for marginal utility, 18–19 for perfect complements, 30–31 for perfect substitutes, 29–30 quasilinear, 34–35 Stone-Geary, 35–36 Utility maximization problems, 49–55 expenditure minimization problems and, 68–70 in extreme scenarios, 55–56 income effects in, 75 Lagrange multiplier to solve, 64–65 Variable costs, 196–198 Variance, 129–131 Von Neumann-Morgenstern EU function, 132n WalMart (firm), 248, 248n2 Walras, Leon, 66n12 Walrasian demand, 66n12 Warranties, 431 implied warranties, 439 Weak Axiom of Revealed Preference (WARP), 57–60, 263 Weak dominance, 302 Weather forecasting, 127, 128 Weighted utility, 144–145 Welfare analysis externalities affecting, 445–446 under monopolies, 263–265 Welfare changes, 107 compensating variation measurement of, 110–113, 120–122 consumer surplus measurement of, 107–110 equivalent variation measurement of, 114–116, 122–124 with no income change, measurement of, 116–119 Willingness-to-pay (WTP), 277, 279 Winner’s curse, 411 Workers. See Employees