CHAPTER 9 UNCERTAINTY

F Petri Microeconomics for the critical mind
highly provisional and incomplete
This chapter, like ch. 4 on consumer theory and ch. 5 on production, has a Part I which
presents microeconomic notions that do not necessarily need the neoclassical theory of value and
distribution for their validity (and are used in applications essentially in partial-equilibrium
analyses), and a Part II that illustrates how uncertainty is taken account of in general equilibrium
theory, and sketches the possibility of a different approach.
In Part I the central notion is risk aversion. People try to protect themselves against the
possibility of unfavourable events. Giving precision to this intuitive idea has absorbed considerable
energies of economists, mathematicians and philosophers. It has required formalizing the idea of
uncertainty, and the idea of preferences among choices whose outcomes are uncertain to some
degree. I will initially follow the prevalent approach, that assumes that the possible uncertain
outcomes of each choice can be listed exhaustively and that the likelihood of each of the outcomes
is taken as given by the decision maker and measured by a probability of occurrence that satisfies
the usual axioms of probability theory. I will illustrate the most commonly assumed type of utility
function for uncertain outcomes: Von Neumann-Morgenstern expected utility, with decreasing
marginal utility of income. I will use risk aversion, defined on the basis of that utility function, to
explain insurance, risk sharing, diversification, portfolio selection. I will end Part I with some
observations on subjective expected utility, and on current research directions that question the use
of VNM utility functions.
Part II illustrates how uncertainty is dealt with in modern general equilibrium theory. It
covers the notion of Arrow-Debreu general equilibrium with contingent commodities, its
equivalence with a Radner equilibrium, and the issue of incomplete markets. These are briefly
contrasted with the treatment of uncertainty in long-period analyses, both classical and neoclassical.
EXPECTED UTILITY
A prospect is a list of outcomes, one for each possible state of the world. An individual may
have to choose among different actions which entail different prospects. For example, to insure or
not to insure against a house fire; to bet 100 dollars at roulette on black, or on a single number; to
go on a dangerous adventure trip or not; to choose among possible moves at poker.
It is possible to discuss consumer choice among prospects without using probability,
postulating simply a preference ordering among them; I will say something about this later. For now I
consider choice among different prospects that consist of different bundles of goods, associated with
different known probabilities of occurrence: this is risk, and not uncertainty, in the terminology of
Frank Knight.
In order to analyze risk taking, the predominant assumption of recent decades has been that
preferences over risky events can be described through utility functions with a very specific
property, the expected utility form. This is as follows:
Suppose that one must choose among lotteries or gambles that give different known
probabilities of obtaining different outcomes. Outcomes can be vectors of consumption goods, or
sums of money, or happenings (e.g. that someone gets married; or an accident), anything really. The
outcomes can always be reinterpreted as events (sets of states of the world, having some relevant
characteristic in common and differing only in elements irrelevant for the preferences of the
chooser). Example of an event: the set of all possible states of the world in which, when I gamble at
roulette tonight at 10:00 pm, the ball stops on number 22; the relevant common element is the roulette
number; many authors would describe the outcome simply as number 22, but it is always possible
to reinterpret it as an event. The word outcome, however, makes clear that we are considering events
that come out of a choice, or that decide whether a gamble was successful.
Let u(x) be the utility of outcome x for certain;
let px be the probability of event x;
let there be n possible events (a finite number);
a lottery or gamble L over n events (or outcomes or prizes) assigns a probability of
occurrence to each of them, the sum of their probabilities being 1. It can be represented in different
ways: one is as a double vector, that lists first all the prizes x1,...,xn and then, in the same order, their
probabilities of occurrence p1, p2, ... , pn, that sum to 1. If the possible events are known and ordered
in an unambiguous way, the vector of probabilities alone is sufficient to represent a lottery. Another
representation, adopted e.g. by Varian, and often useful to avoid ambiguities, is
p1◦x1p2◦x2...pn◦xn .
Among the outcomes of a lottery there can be other lotteries, e.g. one can have, with A, B, C
three outcomes or events:
L = p1◦Ap2(p3◦Bp4◦C) where p1+p2=1, p3+p4=1.
A lottery with lotteries among the prizes is called compound, otherwise simple. The reduced
lottery corresponding to compound lottery L is defined as the simple lottery L' stating the final
probabilities of occurrence of each event:
L' = p1◦A ⊕ p2p3◦B ⊕ p2p4◦C.
If it had been A=C, the reduced lottery would have been L" = (p1+p2p4)◦A ⊕ p2p3◦B. Adopting
the vector representation: if you have n possible ordered events and k simple lotteries over them,
L1=(p11, ..., pn1), L2=(p12, ..., pn2), ... , Lk=(p1k,...,pnk), where pik denotes the probability that lottery Lk
assigns to event i, and if a compound lottery CL has these simple lotteries as outcomes with
probabilities (q1,...,qk), the reduced lottery LR(CL,L1,...,Lk) with the n
events as prizes is the vector of probabilities (s1,...,sn) where si = pi1q1 + pi2q2 + ... + pikqk.
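As a purely illustrative check of this reduction formula, here is a small computational sketch (Python; the function name and the numbers are made up for the example and are not part of the theory):

    # Sketch: each simple lottery is a vector of probabilities over the same
    # ordered list of n events; q[k] is the probability that the compound
    # lottery CL assigns to the k-th simple lottery.
    def reduced_lottery(q, simple_lotteries):
        n = len(simple_lotteries[0])
        return [sum(q[k] * simple_lotteries[k][i] for k in range(len(q)))
                for i in range(n)]

    # Two events (A, B); L1 = 1◦A, L2 = 0.3◦A (+) 0.7◦B; CL gives each with probability 1/2.
    print(reduced_lottery([0.5, 0.5], [[1.0, 0.0], [0.3, 0.7]]))   # [0.65, 0.35]
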
In order to minimize ambiguities I use the Varian representation of lotteries. Other treatises
prefer to indicate the lottery pA◦A ⊕ pB◦B ⊕ ... ⊕ pN◦N as pAA+pBB+...+pNN. The justification for
this representation is the following. Remember that, once the outcomes are ordered in an
unambiguous way from first to last, a simple lottery can be specified as simply the vector of
probabilities of the outcomes; now, an outcome can be considered (and we will consider it) identical
to the certainty of that outcome, which in turn can be considered equivalent to a ‘degenerate’ lottery
that assigns probability 1 to that outcome and probability zero to all other possible outcomes[1]. All
outcomes can be considered degenerate lotteries, and simple lotteries can then be seen as in fact
compound lotteries. Consider a simple lottery pA◦A ⊕ pB◦B ⊕ pC◦C, where pA+pB+pC=1, which
implies that A, B and C are the only possible events considered; let the outcomes be listed in the
order A,B,C; the representation of this lottery as the vector of probabilities of the outcomes is (pA,
pB, pC). Now redefine the symbols A, B, C to stand for the vectors that represent the three outcomes
as degenerate lotteries: A≡(1,0,0), B≡(0,1,0), C≡(0,0,1); then (pA, pB, pC) = pAA+pBB+pCC.
Assume that the chooser assigns a definite utility level to each event and to each lottery; if
the xi’s represent consumption vectors or incomes, assume that the utility level of each xi would be
the same if it had occurred in a different state of the world.
Definition: the chooser has utility (over lotteries) of the expected utility form or of Von
Neumann-Morgenstern (VNM) form if the utility of a lottery that assigns probability p1 to event x1,
probability p2 to event x2, ..., probability pn to event xn, can be represented as
[9.1] u(x1,...,xn; p1,p2,...,pn) = p1u(x1)+p2u(x2)+...+pnu(xn).
A utility function (over lotteries) of expected utility form is called expected utility, or Von
Neumann-Morgenstern utility from the name of the originators of the notion, or also Bernoulli
utility function. The expected utility of a lottery is the sum of the expected values (in the
probabilistic/statistical sense) of the utilities of the different events, and a lottery is preferred to
another one if and only if its expected utility is greater.
[1] The equivalence of an outcome, of the certainty of that outcome, and of its degenerate lottery is
sometimes listed among the axioms of the theory (cf. axiom L1 in Varian 1993); other authors, e.g. Owen,
appear to consider the equivalence a definition, and therefore not needing an explicit axiom.
Sometimes it is useful to distinguish the
function u(x1,...,xn; p1,p2,...,pn) from the functions u(xi) by indicating the first one with a different
symbol, e.g. U(·); but in the present case, where all the u(xi)’s are the same function, the use of a
different symbol can be misleading in that, if event xi is a lottery, u(xi) has exactly the same
functional form as U(·). Still, sometimes it is useful to have a name for u(xi) different from the
name of U(·): common terminologies are to call u(xi) the felicity function or the basic utility
function, and to call U(·) the VNM utility or overall utility function.
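As a numerical illustration of the expected utility form [9.1], here is a minimal sketch (Python; the logarithmic felicity function and the numbers are only illustrative assumptions, not part of the theory):

    import math

    def expected_utility(probs, outcomes, u=math.log):
        # U(L) = p1*u(x1) + ... + pn*u(xn), the expected utility form [9.1]
        return sum(p * u(x) for p, x in zip(probs, outcomes))

    # A lottery paying income 100 or 400 with equal probabilities:
    print(expected_utility([0.5, 0.5], [100.0, 400.0]))   # about 5.298
    print(math.log(250.0))                                # about 5.521: the sure mean yields higher utility
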
Of course if the Von Neumann-Morgenstern (VNM) utility function correctly represents
preferences, any increasing monotonic transformation of it correctly represents the same
preferences, but it is convenient to accept only transformations that maintain the expected utility
form (thus maintaining its convenient additive, i.e. strongly separable, form), and therefore only
affine transformations from u(ℒ) to v(ℒ)=au(ℒ)+b with a, b scalars and a>0.
Preferences admit representation via a VNM utility function when they satisfy certain
axioms. The axioms one starts from can be different, and pedagogical proofs can be considerably
simplified by assuming as axioms some propositions that in fact might be derived from the other
axioms. The minimal number of important axioms (that is, besides those establishing that
preferences over lotteries are complete[2], reflexive, and transitive; that a consumer considers the
degenerate lottery that assigns probability 1 to one outcome as the same as getting that outcome for
certain; that the consumer doesn’t care about the order in which a lottery is described), is three.
They are:
1. Equivalence axiom
2. Continuity axiom
3. Independence axiom.
Equivalence (of reduced lotteries) axiom: the chooser considers a compound lottery and its
corresponding reduced lottery as the same lottery (or at least she is indifferent between them)[3].
It must be noted that empirical evidence is not always in accord with this axiom; sometimes
people choose differently depending on how a choice among lotteries is presented. We will neglect
this fact, which seems of very limited relevance for normal economic choices.
An implication of this axiom is that one can always replace a simple lottery over more than
two outcomes with a compound lottery over two outcomes, some of them being in turn lotteries.
Thus the lottery pA◦A ⊕ pB◦B ⊕ (1–pA–pB)◦C is equivalent to the lottery
(pA+pB)◦[ (pA/(pA+pB))◦A ⊕ (pB/(pA+pB))◦B ] ⊕ (1–pA–pB)◦C.
Continuity axiom: The preference relation ≿ on the space of simple lotteries ℒ is
continuous, that is, for any three lotteries A, B, C ∈ ℒ the sets {p∈[0,1]: p◦A ⊕ (1–p)◦B ≿ C} and
{p∈[0,1]: C ≿ p◦A ⊕ (1–p)◦B} are closed[4].
[2] We assume complete comparability, i.e. we exclude noncomparability – a rather strong assumption,
stronger than for choices under certainty, since one may be facing rather complex compound lotteries.
[3] This appears as assumption L3, p. 173, in Varian 1993, and as 7.2.2 in Owen 1995. It may be
interesting for the student to compare different proofs, and to understand whether the method of proof relies
on a very different approach, or is the same behind apparent differences: the latter is the case for the
differences between the proof here and the one in Varian 1993. This axiom implies that lotteries by
themselves are not objects of preferences, only the events that are their (final) prizes are.
[4] This appears as axiom U1, p. 174, in Varian 1993.
This axiom states that if a lottery p◦A ⊕ (1–p)◦B is strictly preferred to an outcome (which
can be a lottery), a sufficiently small change in p will not invert the preference order. An example of
practical relevance that illustrates its concrete meaning is the following: if a trip with zero
probability of a serious accident (e.g. death) is strictly preferred to no trip, the same trip with a
positive but sufficiently small probability of a serious accident is still preferred to no trip.
Essentially, the axiom rules out lexicographic preferences. It appears acceptable in the generality of
cases.
The third axiom is more contentious; it is justified as follows: consider a lottery
p◦A ⊕ (1–p)◦B; whether this lottery is preferred to another lottery should not be affected by replacing A with a
C such that the consumer is indifferent between A and C, because the consumer should find that
p◦A ⊕ (1–p)◦B ~ p◦C ⊕ (1–p)◦B: this is because, differently from the case in usual consumer theory,
relationships of substitutability or complementarity between A and B, and between C and B, are
irrelevant here since the consumer is not going to consume A and B or C and B: A and B are
alternative possibilities, and the same for C and B[5].
Independence axiom: The preference relation ≿ on the space of simple lotteries ℒ satisfies
independence, that is, for any lotteries A, B, C ∈ ℒ and for any p ∈ (0,1) it is
A ≿ B if and only if p◦A ⊕ (1–p)◦C ≿ p◦B ⊕ (1–p)◦C.
This is sometimes called the substitution axiom. It can be shown (cf. Appendix) that from
this axiom one can derive the strong-preference and the indifference versions of the same axiom;
these are sometimes for simplicity directly assumed as axioms in place of the original axiom:
[9.2] A ≻ B if and only if p◦A ⊕ (1–p)◦C ≻ p◦B ⊕ (1–p)◦C.
[9.3] A ~ B if and only if p◦A ⊕ (1–p)◦C ~ p◦B ⊕ (1–p)◦C[6].
The Independence axiom and its implications [9.2] and [9.3] assume that if we have two
lotteries L and M with L ≻ M then for any positive probability p and any third lottery C it is
p◦L ⊕ (1–p)◦C ≻ p◦M ⊕ (1–p)◦C, and the same preservation of ranking holds if L ~ M. Replacing the
third lottery with any other one, even with L or M, does not alter this preservation of ranking.
From these axioms one derives:
Monotonicity-in-Probabilities (M-in-P) Lemma: if A ≻ B and p’ > p with p, p’ ∈ (0,1), then
(i) A ≻ p◦A ⊕ (1–p)◦B,
(ii) p◦A ⊕ (1–p)◦B ≻ B,
(iii) p’◦A ⊕ (1–p’)◦B ≻ p◦A ⊕ (1–p)◦B, [7]
and conversely if p’>p with p, p’ ∈ (0,1) and A ≻ p◦A ⊕ (1–p)◦B or p’◦A ⊕ (1–p’)◦B ≻ p◦A ⊕ (1–p)◦B,
then A ≻ B.
[5] However, preferences (being ex ante relative to the resolution of uncertainty) might be influenced
by some psychological connection between outcomes; so it cannot be excluded, it seems, that one may prefer
to be given the possibility of getting A or B, rather than C or B, in spite of the fact that, if one had to choose
between A for certain and C for certain, one would be indifferent.
[6] [9.3] appears in a slightly different form (because referred to a best lottery b and a worst lottery w,
whose existence is assumed by Varian but not here) as U2, p. 174, in Varian 1993.
[7] This result (referred to lotteries with the best and the worst outcome as prizes) is assumed for
simplicity by Varian 1993 as an axiom, U4 p. 174; Varian says that it can be derived from the earlier axioms
he has listed, but does not give the proof, which would not have been easy because his axioms include neither
the Independence axiom as stated here (i.e. with the weak inequality sign) nor our [9.2]; Varian only lists our
[9.3].
Proof. The trick is to use B or A in place of C in [9.2]. Inequality (i) derives from the fact that A ≻ B
and [9.2] imply A = (1–p)◦A ⊕ p◦A ≻ (1–p)◦B ⊕ p◦A for all p ∈ (0,1). Inequality (ii) analogously derives from
p◦A ⊕ (1–p)◦B ≻ p◦B ⊕ (1–p)◦B = B. Now define π = (p’–p)/(1–p) ∈ (0,1), admissible since p<p’<1; then
p’◦A ⊕ (1–p’)◦B = π◦A ⊕ (1–π)◦(p◦A ⊕ (1–p)◦B); now let L stand for the compound lottery p◦A ⊕ (1–p)◦B which
by (i) we know to be worse than A; then in (ii) we can replace B with L, and p with π, obtaining π◦A ⊕ (1–π)◦L ≻ L,
that is, p’◦A ⊕ (1–p’)◦B ≻ p◦A ⊕ (1–p)◦B, which proves (iii). Conversely, A = p◦A ⊕ (1–p)◦A ≻ p◦A ⊕ (1–p)◦B
implies A ≻ B by [9.2], and p’◦A ⊕ (1–p’)◦B ≻ p◦A ⊕ (1–p)◦B can be re-written π◦A ⊕ (1–π)◦(p◦A ⊕ (1–p)◦B)
≻ p◦A ⊕ (1–p)◦B with π defined as above, or π◦A ⊕ (1–π)◦L ≻ L, which implies A ≻ L and hence A ≻ B. ■
It is furthermore possible to derive from the above axioms the following result, sometimes
directly assumed as an axiom, usually with the name Archimedean axiom:
Archimedean ‘axiom’: Let A, B, C be outcomes such that A ≻ C ≻ B. Then there exists some
p* ∈ (0,1) such that p*◦A ⊕ (1–p*)◦B ~ C[8].
(I write ‘axiom’ in inverted commas because here it is not an axiom, it is derived from other
axioms). It is in the proof of this result that the Continuity axiom is relevant.
Proof. The two sets {p∈[0,1]: p◦A ⊕ (1–p)◦B ≿ C} and {p∈[0,1]: C ≿ p◦A ⊕ (1–p)◦B} are closed and
nonempty (each one contains at least 0 or 1), and every point in [0,1] belongs to at least one of the two sets
because of completeness of the preference order. Since the unit interval is connected[9], there must be some p
belonging to both sets, and at that p the lottery p◦A ⊕ (1–p)◦B must be equipreferred[10] to C since the weak
inequality holds both ways. This result only needs weak inequalities, but if the inequalities are strict, A ≻ C ≻
B, then we can add that the common p cannot be 0 or 1 because then the lottery p◦A ⊕ (1–p)◦B would be
equivalent to A for certain or to B for certain, which – given the assumption that A ≻ C ≻ B – would render the
equipreference between the lottery and C impossible. ■
We prove now:
Theorem. Uniqueness of the equipreference probability. The p* in the Archimedean
‘axiom’ is unique[11].
Proof. For any number s in [0,1] different from p*, if p*>s then, by result (iii) in the M-in-P Lemma,
C ~ p*◦A ⊕ (1–p*)◦B ≻ s◦A ⊕ (1–s)◦B; and if s>p* then s◦A ⊕ (1–s)◦B ≻ C. ■
I also present the proof of this result by Owen (Theory of Games, 1995, p. 153) that does not assume
result (iii) of the M-in-P Lemma and actually proves the latter lemma in a different way. Take any number s
in [0,1] different from p*. We want to show that it cannot be s◦A ⊕ (1–s)◦B ~ C. We already know that it
cannot be s=0 or s=1 (see the end of the proof of the Archimedean 'axiom'), so take s ∈ (0,1). Assume s<p*.
[8] This corresponds to result (1) p. 175 in Varian 1993.
[9] A set is connected if it is possible to connect any point of the set to any other point of the set with a
continuous curve consisting entirely of points of the set. Each point of a continuous curve is a point of
accumulation along the curve from either direction; hence if a continuous curve in a connected set S goes
from a point of a subset F of S to a point not in F but in another subset H of S, with F and H both closed and
connected and such that F∪H=S, then any point of the curve not in F is in H, and there must be a point of the
curve which is a frontier point of F and also an accumulation point for H, so by the definition of closed set it
belongs to H and therefore to both subsets.
[10] Some economists (e.g. Varian) would write ‘indifferent to C’, using the adjective ‘indifferent’ in
sentences like “x is indifferent to y” to mean “the consumer is indifferent between x and y”. This ugly
deformation of English is avoided by other authors who write in the same sense “x is equivalent to y”; I
prefer to use in the same sense ‘equipreferred’, that clarifies that one is talking about preferences.
[11] This corresponds to result (2) p. 175 in Varian 1993.
Then 0 < p*–s < 1–s and therefore, since (by the Equivalence axiom) we can write
B = [(p*–s)/(1–s)]◦B ⊕ [(1–p*)/(1–s)]◦B and by assumption A ≻ B, it follows (by [9.2] with C=B) that
[(p*–s)/(1–s)]◦A ⊕ [(1–p*)/(1–s)]◦B ≻ B. (This proves (ii) in the M-in-P Lemma.) Then by [9.2]
s◦A ⊕ (1–s)◦{[(p*–s)/(1–s)]◦A ⊕ [(1–p*)/(1–s)]◦B} ≻ s◦A ⊕ (1–s)◦B. But the reduced lottery corresponding to the
compound lottery s◦A ⊕ (1–s)◦{[(p*–s)/(1–s)]◦A ⊕ [(1–p*)/(1–s)]◦B} is p*◦A ⊕ (1–p*)◦B, hence:
C ~ p*◦A ⊕ (1–p*)◦B ≻ s◦A ⊕ (1–s)◦B. (This proves (iii) in the M-in-P Lemma.)
The case s>p* works in the same way with the inequalities reversed (and proves (i) in the M-in-P Lemma). ■
Now I build a utility function for preferences satisfying these axioms and show that it has
the expected utility form, following Owen (1995).
Theorem. Existence of expected utility. Under the axioms listed so far there exists a
function u that maps the set of all outcomes into the real numbers, such that for any two outcomes A
and B and any p ∈ [0,1]:
[9.4] u(A) > u(B) if and only if A ≻ B
[9.5] u(p◦A ⊕ (1–p)◦B) = pu(A) + (1–p)u(B).
This function is unique up to an affine transformation, i.e. if there exists a second function v that
satisfies 9.4 and 9.5 for the same preferences, then there exist real numbers α>0 and β such that for
all outcomes A
[9.6] v(A) = αu(A) + β .
Proof (partial). The complete proof is long and I shall only give parts of it, sufficient to point out its
basic principle, which is most simply explained by assuming that there exist two outcomes E1 and E0 such[12]
that E1 ≻ E0 and initially restricting ourselves to outcomes A (which can be lotteries) such that E1 ≻ A ≻ E0.
Then by the Archimedean axiom there exists a probability s ∈ (0,1) such that s◦E1 ⊕ (1–s)◦E0 ~ A. This
probability s is chosen as the numerical value of u(A) relative to the reference outcomes E1 and E0. If there
exists a best outcome and a worst outcome, it is possible to choose them as reference outcomes, and then all
outcomes will either satisfy E1 ≻ A ≻ E0, or E1~A in which case we put u(A)=u(E1)=1, or E0~A in which case
we put u(A)=u(E0)=0. One can then prove that u thus defined satisfies the conditions of the theorem (see
below). But one need not choose the best and worst outcomes (even when they exist) as reference outcomes,
it is possible to choose any couple of outcomes such that E1 ≻ E0; then an outcome A can be preferred to E1 or
can be worse than E0, but one can still assign values to u(A) connected with probabilities determined by the
Archimedean axiom, and such that 9.4, 9.5 and 9.6 are satisfied, in the way shown below. This makes it
possible to treat cases in which there is no best or no worst outcome. (The case in which there are no two
outcomes E1 and E0 such that E1 ≻ E0 is uninteresting, it is the case when for all outcomes A and B it is A~B,
then we can simply assign u(A)=0 for all events, the conditions of the theorem are satisfied.)
The rules to assign u(A) for all the five possible cases are as follows:
(a) A ≻ E1. Then there exists q ∈ (0,1) such that q◦A ⊕ (1–q)◦E0 ~ E1. We define u(A)=1/q, which is >1.
(b) A~E1. We define u(A)=u(E1)=1.
(c) E1 ≻ A ≻ E0. Then there exists s ∈ (0,1) such that s◦E1 ⊕ (1–s)◦E0 ~ A. We define u(A)=s.
(d) A~E0. We define u(A)=0.
(e) E0 ≻ A. Then there exists t ∈ (0,1) such that t◦A ⊕ (1–t)◦E1 ~ E0. We define u(A) = (t–1)/t, which is
negative.
[12] These symbols E1, E0 should not be confused with the expectation operator.
Note that if we are given u(A) we know the case to which A belongs. The proof that u as defined in
cases (a) to (e) for two outcomes A and B satisfies conditions 9.4 and 9.5 is quite lengthy, depending on the
case to which each of the two outcomes belongs: given the irrelevance of the order in which outcomes are
listed, there are 15 possible combinations. Only one combination will be examined here, (c,c). The other
combinations except (e,e) are examined in the Appendix. Case (e,e) is left for the reader to examine as an
Exercise.
Assume then the case (c,c) and that u(A)=sA, u(B)=sB. If sA=sB then A ~ sA◦E1 ⊕ (1–sA)◦E0 ~ B, so A~B.
If sA>sB then sA◦E1 ⊕ (1–sA)◦E0 ≻ sB◦E1 ⊕ (1–sB)◦E0 and therefore A ≻ B. Analogously if A ≻ B it must be sA>sB,
where sA◦E1 ⊕ (1–sA)◦E0 ~ A and sB◦E1 ⊕ (1–sB)◦E0 ~ B, otherwise it could not be
sA◦E1 ⊕ (1–sA)◦E0 ≻ sB◦E1 ⊕ (1–sB)◦E0. Therefore u satisfies 9.4.
To prove 9.5, consider any p ∈ (0,1). By the Equivalence Axiom and the definitions of sA, sB:
p◦A ⊕ (1–p)◦B ~
~ p◦[sA◦E1 ⊕ (1–sA)◦E0] ⊕ (1–p)◦[sB◦E1 ⊕ (1–sB)◦E0] ~
~ (psA+(1–p)sB)◦E1 ⊕ [p(1–sA)+(1–p)(1–sB)]◦E0 .
Hence the utility u of the lottery p◦A ⊕ (1–p)◦B is psA+(1–p)sB; since sA=u(A) and sB=u(B):
u(p◦A ⊕ (1–p)◦B) = pu(A) + (1–p)u(B)
and 9.5 is satisfied.
There remains to prove that u is unique up to an affine transformation with a>0. Let v be any other
function satisfying 9.4 and 9.5. Since E1 ≻ E0 it must be v(E1)>v(E0). Define β=v(E0), and α=v(E1)–v(E0) (that
satisfies α>0). Consider an outcome A of case (c), that is, such that E1 ≻ A ≻ E0. Let u(A)=s where A ~
s◦E1 ⊕ (1–s)◦E0. Therefore v(·) must assign the same number to A and to s◦E1 ⊕ (1–s)◦E0; which by the
expected utility form implies:
v(A) = v(s◦E1 ⊕ (1–s)◦E0) =
= sv(E1) + (1–s)v(E0) =
= s(α+v(E0)) + (1–s)v(E0) =
= s(α+β) + (1–s)β = sα + β = αu(A) + β.
Hence v(·) satisfies 9.6.
With similar reasonings it can be shown that v(·) satisfies 9.6 also for A falling in the other cases (a),
(b), (d), (e). For example in case (a) we define α and β in the same way, and from E1 ~ q◦A ⊕ (1–q)◦E0,
u(A)=1/q, v(E1) = v(q◦A ⊕ (1–q)◦E0) = qv(A)+(1–q)v(E0) we deduce
α = v(E1)–v(E0) = qv(A)–qv(E0) = qv(A)–qβ,
v(A) = (α+qβ)/q = α(1/q) + β = αu(A) + β. ■
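The uniqueness result can be illustrated numerically. The following sketch (Python; the log felicity and the two lotteries are arbitrary examples) checks that expected utilities computed with an affine transform v = αu + β rank lotteries exactly as those computed with u, while a monotonic but non-affine transform of the felicity (here u cubed) can reverse the ranking, i.e. it does not preserve the expected utility form:

    import math

    def EU(probs, outcomes, f):
        return sum(p * f(x) for p, x in zip(probs, outcomes))

    u = math.log
    v = lambda x: 3.0 * u(x) + 7.0      # affine transform of u (alpha = 3 > 0, beta = 7)
    w = lambda x: u(x) ** 3             # increasing but non-affine transform of u

    risky = ([0.5, 0.5], [4.0, 9.0])    # illustrative income lotteries
    safe  = ([1.0], [6.05])

    for f in (u, v, w):
        print(EU(*risky, f) > EU(*safe, f))
    # u: False, v: False (same ranking as u), w: True (ranking reversed)
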
The above considerations extend straightforwardly to cases in which the outcomes are more
than two but finite in number. If the probability of outcome n is pn, with n=1,..., N, then the
expected utility of this lottery is
Σn=1,…,N pn u(xn).
We will not discuss the technical complications connected with continuous probability
distributions. It can be shown that under essentially the same axioms, if a lottery consists of a
continuous probability distribution defined on a continuum of outcomes x, then the expected utility
of this lottery is ∫ u(x)p(x)dx. We will not have occasion to deal with continuous probability
distributions in this text.
It is often said that the expected utility form implies that the utility function is cardinal,
rather than simply ordinal; in other words, that it makes sense to say that the amount by which the
utilities of two lotteries differ is greater, or smaller, than the amount by which the utilities of other
two lotteries differ (just like one can say that the difference in temperature between two days is
greater, or smaller, than the difference in temperature between two other days), a statement
generally considered to make no sense for the usual ordinal utilities of consumer choice. However,
it should not be forgotten that the Cobb-Douglas utility function, if written in logarithmic form,
u(x1,x2)=α ln x1 + (1–α) ln x2, is as ‘cardinal’ as an expected utility function in terms of a single
scalar, say income; indeed it is often assumed, when considering expected utility of lotteries whose
outcomes are incomes, that the basic utility of income is logarithmic, in which case if one faces a
lottery that, depending on the result (say) of a random extraction, assigns a quantity of income x1
with probability α and a quantity of income x2 with probability 1– α, then the expected utility
function perfectly coincides with the Cobb-Douglas just written. Just like for the Cobb-Douglas, for
expected utility too some monotonic transformations (e.g. raising it to the power of 3) would cause
one to lose the additive separable form, and one does not adopt them because less convenient. What
the axioms behind the expected utility form imply is only a strongly separable utility function,
linear in the probabilities, but the form of the basic utility function u(·) remains generally nonlinear. An analogous additive separability holds also for the quasilinear utility function used to
formalize the assumption of constant marginal utility of money.
Why this utility function? Its existence relies on axioms which are not, at least at first sight,
wildly implausible. For example, the fundamental Independence Axiom essentially says this:
suppose you face a lottery p◦A ⊕ (1–p)◦B; since you are going to get A or B, relationships of
complementarity or ‘anticomplementarity’ (having A makes B less agreeable) between the two
prizes, which have an important role in usual consumer theory, are not going to be relevant, so if
there is another outcome C that you find perfectly equipreferable to A, it stands to reason that it
makes no difference to you if C replaces A in the lottery – indeed it might be argued that if it does
make a difference to you then you are not truly indifferent between A and C. And for the same
reason if C is preferable to A, then replacing A with C should make the new lottery preferable to the
old one. However, there is experimental evidence and introspective evidence suggesting that often
people do not respect the axioms behind the expected utility function. Read the excellent chapter on
cognitive limitations and consumer behaviour in Robert Frank's intermediate textbook
Microeconomics and Behaviour (it is ch. 8 of the 6th edition).
So let us briefly point out that one can go some way without assuming such a specific form
of the utility function.
Preferences over goods and lotteries of goods can be represented through a utility function
as long as one makes the same assumptions as for sure goods, i.e. completeness, reflexivity,
transitivity, and continuity.
I show now that one does not need expected utility to define risk aversion.
If offered (for free) a bet that gives a 50% chance of winning 1000 euros and a 50% chance
of losing 1000 euros most people will refuse the bet. It must mean that the utility of the lottery
1/2◦(initial wealth + 1000) ⊕ 1/2◦(initial wealth – 1000) is less than the utility of initial wealth, i.e. is
less than the utility of obtaining the expected value of the lottery for certain.
This empirical fact is not always verified, for example people buy lottery tickets or play
roulette, in spite of knowing that the gambles associated with them are not fair, that is, that the
expected value of the variations in wealth associated with the gambles is negative[13]; but this
aversion to accept fair bets is verified in many cases of economic interest, and is the basis for the
explanation of phenomena like insurance or portfolio diversification. Therefore we proceed to
define it.
[13] In many state lotteries the amount of money given back to lottery ticket buyers as prizes is a third
or less of the money paid by the buyers.
RISK AVERSION
Let us formalize uncertainty as ignorance as to which one, of an exhaustive and perfectly
known list of possible future states of the world, will come about. Uncertainty can be about
contemporaneous facts, e.g. about whether there is oil under a certain portion of earth surface, but
its resolution will be always associated with some future time and state of the world (e.g. states
distinguished by which information will come out on whether there is oil or not in that place).
States of the world, or simply states for brevity, are distinguished according to the variables
of interest. If several variables are needed to distinguish states, the total number of states increases
rapidly. For example if, in order to forecast the total output of tomatoes in a nation in a certain year,
states of the world are distinguished according to 10 possible climatic states (that describe rainfall,
temperature etc.) in each of 3 regions, one distinguishes 10³ = 1000 states.
Decisions on how to act, e.g. how to spend one’s money, may depend on the likelihood one
assigns to future states. The decision by a small town mayor to buy a snowplough may depend on
how likely and frequent she expects heavy snowfalls to be.
One can distinguish commodities according to the state with which they are associated. A
standard example is an umbrella when it rains and when it doesn’t; another one is an ice cream
when it is hot and when it is cold; or, a vacation when one is healthy, and when one is ill.
Sometimes it is possible to stipulate contracts for delivery of a good if and only if a certain
state occurs. E.g. if one pays in advance for tow-away assistance insurance, the service will be
provided if and only if the car breaks down. Insurance contracts are all basically of this type: one
pays for delivery of a service or of an amount of money conditional on a certain event having
occurred. A good associated with a specified state of the world (or a specified event, by which we
mean a set of states of the world) that may or may not occur is called a state-indexed commodity or
also a contingent commodity.
Sometimes it is possible to buy a contingent commodity, that is, to buy a promise of
delivery of a commodity conditional on the realization of the state to which the commodity is
associated. But whether this is possible or not, state-indexed commodities are a useful way to
formalize decisions under uncertainty.
Thus assume that one is concerned about the amount of a certain good (e.g. income) that
will be available depending on the state of the world. Assume there is a finite number S of possible
states of the world, in each one of which the amount xs of income can take many values[14]. The
basic assumption is that the chooser has a preference ordering over the vectors x =(x1,...,xs,...,xS) of
amounts of the good available in the several alternative states. If e.g. there are only two states, if the
good is divisible, and if its amounts are nonnegative, the set of vectors on which preferences are
defined is the set of points in R2 . This simple case will often allow us to reach a sufficient grasp of
many issues.
These vectors x =(x1,...,xs,...,xS) are formally totally analogous to usual consumption
vectors[15]. So we can suppose that preferences over these vectors are complete, reflexive,
transitive, continuous and monotonic, in the sense of ch. 4. Then there will be downward-sloping
indifference curves; these may indicate for example that, if the outcome is an amount of a single
good (e.g. income) the chooser is indifferent between the vector (10 units of the good if state 1
occurs, 10 units if state 2 occurs) and the vector (12 units if state 1 occurs, 9 units if state 2 occurs).
Note that to define these preferences and indifference curves we do not need probabilities.
However, if probabilities of occurrence of the states are known, and if preferences can be
represented via a utility function which is state-independent (that is, assigns the same utility to a
certain level of income whatever the state in which that income is obtained) and has the expected-utility form, then we can say something interesting about the slope of the indifference curves when
– restricting ourselves now to the one-good (wealth), two-states case – they cross the 45° line, cf.
Fig. 9.1. The 45° line is called the certainty line because its points represent amounts of wealth that
coincide across states, will be available independently of which state occurs, and therefore are certain.
[14] This means that income is not one of the elements that distinguish states. Alternatively, one can
distinguish states according also to income, and then group them into events that collect states sharing
aspects that do not include income: with this convention, the number S will refer to events and not to states.
[15] It should be obvious that this remains true if the single good xs is replaced by a vector of goods.
certain. The utility of a point w=(w1,w2) on the 45° line (w1=w2=w) is therefore U(w)= pu(w1)+(1–
p)u(w2)=u(w), independent of the probabilities p, 1–p of occurrence of the two states. Now consider
displacements along the indifference curve that passes through w, that take one to (w+x1, w+x2).
Let us determine the slope of the indifference curve i.e. the derivative x2'=dx2/dx1 at the certainty
line.
Fig. 9.1 [figure: income in state 1 on the horizontal axis and income in state 2 on the vertical axis; the 45° certainty line through the origin; a point x off the line, the line through x of slope –p/(1–p) identifying its expected value E(x) on the certainty line, the indifference curve through x identifying its certainty equivalent C(x), and a point y representing a fair bet.]
There are two possible states of the world in each one of which the amount of income can
take any of a continuum of values; the Figure shows how, given the probability of the two
states and a vector (a lottery) x, one can derive the lottery’s expected value E(x) and,
from the individual’s indifference curve through x, its certainty equivalent C(x). It also
shows that if the individual starts on the certainty line then with strictly convex
indifference curves he will not accept fair bets (e.g. if he starts at C(x) he will not accept a
bet corresponding to point y).
Differentiate with respect to x1, at x1=x2=0, both sides of
pu(w+x1) + (1–p)u(w+x2(x1)) = u(w).
We obtain p ∂u(w+x1)/∂x1 + (1–p) ∂u(w+x2(x1))/∂x2 · dx2/dx1 = pu' + (1–p)u'x2' = 0.
(The left-hand side can be re-written as shown, with the two u' equal, because they are
determined for the same initial value of u, and expected utility implicitly assumes that the consumer
does not care about the state by itself, she cares only about the income she gets.) This implies
x2'(0) = –p/(1–p).
This is true at all points on the certainty line.
Thus in this case the absolute slope of the indifference curve at the certainty line measures
the ratio between the probabilities of the two states. Tracing a line with that slope, all points (w+x1,
w+x2) along the line have the same expected value because x2/x1= –p/(1–p) implies (1–p)x2= –px1
i.e. E(x)=0. If now we intend by x the vector indicating a point in the state plane, and not the vector
of displacements from a point on the certainty line, we can find the expected value E(x) of each
vector x by tracing a line through x with slope –p/(1–p) and finding its intersection with the
certainty line. On the other hand we can find the certainty equivalent C(x) of any vector x as the
certain amount of the good that yields the same utility as x, that is, as the point where the
indifference curve through x crosses the certainty line.
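A quick numerical check of this result may help (a sketch in Python; the choice of w=100, p=0.3 and of a logarithmic felicity is arbitrary):

    import math

    p, w = 0.3, 100.0
    u = math.log                         # any increasing, state-independent felicity would do
    U = lambda x1, x2: p * u(w + x1) + (1 - p) * u(w + x2)

    # numerical slope dx2/dx1 of the indifference curve at the certainty line (x1=x2=0):
    h = 1e-6
    dU_dx1 = (U(h, 0.0) - U(-h, 0.0)) / (2 * h)
    dU_dx2 = (U(0.0, h) - U(0.0, -h)) / (2 * h)
    print(-dU_dx1 / dU_dx2, -p / (1 - p))    # both about -0.4286
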
We can then state: if indifference curves are strictly convex, then the expected value of any
vector not on the certainty line is greater than the certainty equivalent of that vector[16]. This form
[16] Note that the analysis is not restricted to a single good (e.g. income). Interpret y as a vector of
vectors yω of consumption goods in different states of the world ω, where yjω is the amount of consumption
good j in state of the world ω. Suppose there is a given probability vector p that assigns probability pω to
each state of the world ω, ∑ωpω=1; so the couple (y,p) is a lottery over vectors of consumption goods. The
sure prospect y* corresponding to (y,p) is a vector of consumption goods, the same in all states of the world,
with components yj*=∑ωpωyjω, i.e. the amount of each consumption good is the expected value of that
consumption good. We can define the risk premium rate as the number ρ such that the sure prospect (1–ρ)y*
has the same utility as the lottery, u((1–ρ)y*)=u(y,p). Note that this definition of the risk premium rate does
not require that the utility function over lotteries has the VNM form.
of indifference curves seems to be a good way to explain the situations in which choosers reject an
actuarially fair bet, which is the name given to the payment of an amount of money M to buy the
right to the result of a lottery of prizes whose expected value is M. For example you may bet $10
dollars that the toss of a fair die will come out with a 2, in which case (and only in that case) you
win $60; the probability of a 2 being 1/6, the expected value of $60 is $10; the bet is fair. Suppose
there are two states of the world distinguished by the outcome of a roulette spin (e.g. number 10
comes out, or it doesn’t); the initial situation of the consumer is a quantity x of money for certain,
i.e. the consumer is on the certainty line at some point x=(x1=x, x2=x); the consumer can bet a
certain amount of money on number 10 coming out (state 1) and, if it does, she gets an amount of
money whose expected value equals the bet. A fair bet displaces the consumer to a vector y with the
same expected value as x, and therefore below the indifference curve through x if indifference
curves are strictly convex[17]. This remains true with more than two states and with the vector x
containing more than two commodities. We then define:
Risk aversion. An individual is risk averse (i.e her preference relation is said to exhibit risk
aversion) if indifference curves (or, in the case of many states and goods, indifference
hypersurfaces) relative to state-indexed goods are strictly convex. The definition is local or global
depending on whether the convexity of indifference curves holds locally or globally.
This definition of risk aversion does not need the existence of well-defined probabilities for
the states, nor of state-independent utility, only of indifference curves (or surfaces). If probabilities
exist and utility has the expected utility form then the definition implies E(x)>C(x) for x not on the
certainty line. The converse is not true, one might have indifference curves shaped as in Fig. 9.1bis,
these imply E(x)>C(x) in spite of being convex only in a neighbourhood of the certainty line.
If one is indifferent between a lottery and obtaining its expected value for sure, one is said to
be risk neutral. If one prefers a lottery to its expected value, one is said to be risk loving. If
probabilities are not defined and one only has the map of indifference curves, risk neutrality means
straight indifference curves, and risk love means concave indifference curves (again, locally or
globally).
[17] Cf. y in fig. 9.1. Then does risk aversion not apply to roulette gamblers? It may well apply, and yet
be compensated by the pleasure itself of gambling (the adrenaline, the decor of the casino, etc.). We are
implicitly assuming that it is only the results of the lotteries that affect preferences, but this is often untrue.
Another possibility is that the gambler believes she knows a way of beating the house.
Fig. 9.1bis [figure: income in state 1 and income in state 2; indifference curves that are strictly convex only in a neighbourhood of the 45° certainty line; a point x off the line and its certainty equivalent C(x) on the line.]
Now let us discuss risk aversion relative to income levels, assuming utility has the expected
utility form.
Assume a single good, income, perfectly divisible. Assume a utility of income u(x); the
utility of any income lottery ℒ = p◦x1 ⊕ (1–p)◦x2 is u(ℒ)=pu(x1)+(1–p)u(x2). Risk aversion is defined
as: obtaining for sure the expected value of the lottery is preferred to the lottery:
u(E(ℒ))>u(ℒ) for x1≠x2.
This must mean that u(x) is (strictly) concave, that is, the marginal utility of income is
decreasing[18]; thus when income is the payoff, risk aversion and decreasing marginal utility of
income are equivalent ways to characterize the situation.
Fig. 9.2 [figure: wealth on the horizontal axis, utility on the vertical axis; a strictly concave curve u(x) and a more concave dotted curve u^(x) with u^(x*)=u(x*); points A and B on u(x) at wealth levels x1 and x2, the chord joining them, and on the horizontal axis the points x2, C(ℒ), x*, x1; above x* the height of the curve is u(x*)=u(E(ℒ)) and the height of the chord is u(ℒ). Since x* = p(x*+a)+(1–p)(x*–b) = x* + pa – (1–p)b, it is pa – (1–p)b = 0, i.e. b/a = p/(1–p).]
Given a lottery with prizes corresponding to points A and B, its mean x* depends on (and therefore
reveals graphically) the probability p, and once determined it allows describing the prizes as x*+a and x*-b.
The certainty equivalent C(ℒ) is the quantity of wealth yielding the same utility as ℒ .
This has the following interesting interpretation. Consider a consumer who faces a lottery ℒ
yielding wealth x1 with probability p and wealth x2 with probability 1-p. Represent utility as a
concave function of wealth x, and indicate as A and B the points on this function corresponding to
x1 and x2. The expected value E(ℒ)= x* = px1+(1-p)x2 is a point on the abscissa between x1 and x2,
whose position depends on p, moving toward x1 as p increases. (If we indicate x1=x*+a, x2=x*-b,
[18] The meaning of ‘decreasing marginal utility of income’ may appear obscure to people accustomed
to the usual utility function of consumer theory only defined up to an increasing monotonic transformation,
so that marginal utilities can be decreasing or increasing depending on the transformation one applies. But
we are dealing here with preferences representable through an expected utility function, which was shown
earlier to be unique up to an affine transformation. Now if v(x)=αu(x)+β, the sign of the second derivative is
not affected by the affine transformation; thus, to say that u’(x) is decreasing in x means that all
transformations of u(x) that maintain the expected utility form will have the same property. However, since
other transformations, that still describe the same preferences but without maintaining the expected utility
form, may not have this property, one can still ask for the concrete meaning of the restriction. This is that if
one faces a lottery ℒ with different levels of income as prizes, then u(E(ℒ))>u(C(ℒ)): it is only relative to the
straight line connecting two different income–utility points (and representing therefore different expected-income–expected-utility points depending on probabilities) in Fig. 9.2 that the concavity or not of u(x) makes
a difference. If utility depends on a vector of consumption goods, and their prices are given, the utility of
income can be taken to be the indirect utility function (uncertainty about prices can then be treated through
different prices distinguishing different states of the world), and a different meaning of decreasing marginal
utility of income can be stipulated as a decreasing (absolute) rate of substitution between income and some
assigned bundle of goods as income increases; but this has no necessary connection with a decreasing
marginal utility of income in the expected utility function.
we can also interpret the lottery as resulting from initial wealth w=x* and acceptance of a fair
gamble offering a win a with probability p, or a loss b with probability 1-p.) The expected utility of
the lottery is u(ℒ)= pu(x1)+(1-p)u(x2) and is determined graphically by the point corresponding to
x* on the segment joining points A and B. On the contrary the utility of wealth x* for sure is
determined, graphically, by the point on the curve u(x) corresponding to x*; u(x*) = u(E(ℒ)) is the
utility of the expected value of the lottery, and it is greater than u(ℒ) since u(x) is strictly concave,
cf. Fig. 9.2 where p=1/2. The certainty equivalent of ℒ, C(ℒ), is the amount of sure wealth yielding
the same utility as ℒ. The difference x*–C(ℒ) is called the risk avoidance price, or risk premium, or
willingness to pay to avoid risk.
We can reinterpret the situation to explain the existence of insurance, applying it to a case of
insurance against, say, house fire. Let x1 be income if no fire happens (i.e. initial income), and x2 be
income if a fire happens (causing a loss of income x1-x2); the probability of the fire is known to be
1–p. Then
x*=px1+(1-p)x2
is the expected value of the gamble the consumer faces without insurance[19], while C(ℒ) is the
certainty equivalent of the gamble, i.e. the sure amount of income that the consumer finds
equipreferred to the gamble, u(C(ℒ))=u(ℒ). As Figure 9.2 shows, owing to risk aversion it is
C(ℒ)<x*, so if someone offers the consumer an insurance contract that assures the consumer a
certain income x^ greater than C(ℒ) in exchange for the payment of (x1 – x^) as ‘premium’ or
insurance price, the consumer accepts. The maximum difference between x* and x^ the consumer is
ready to accept is x*–C(ℒ), which is accordingly called the risk premium[20] or risk avoidance price.
The risk premium is the maximum amount a risk-averse individual would be ready to pay to avoid a
lottery whose expected value were her initial sure wealth, or to avoid the additional riskiness
relative to obtaining for sure the expected value of a lottery.
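A small numerical sketch of these magnitudes (Python; the logarithmic felicity and the incomes 100 and 36, with fire probability 0.1, are purely illustrative assumptions):

    import math

    p, x1, x2 = 0.9, 100.0, 36.0      # prob. of no fire, income without fire, income after a fire
    u, u_inv = math.log, math.exp     # concave felicity and its inverse

    EU = p * u(x1) + (1 - p) * u(x2)  # utility of the uninsured gamble
    x_star = p * x1 + (1 - p) * x2    # expected income E(L) = 93.6
    CE = u_inv(EU)                    # certainty equivalent C(L), about 90.3
    print(x_star, CE, x_star - CE)    # risk premium E(L) - C(L), about 3.3
    # any sure income x^ above CE is accepted in exchange for paying x1 - x^ to the insurer
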
Suppose now for simplicity that all consumers and all fire insurance contracts are identical.
The insurance company, by insuring a high number of consumers with independent risks, owing to
the law of large numbers can be certain to pay damages to a percentage 1–p of insured customers on
average, so it has little reason to be risk averse; if it had no expenses it would be capable of
guaranteeing its customers a certain income x* by asking for x1–x* as premium; by offering only
x^<x* (and asking for x1–x^ as premium) the insurance company obtains a difference between
earnings and payments with which it can cover administrative expenses and earn a rate of profit on
its capital. Thus one can say, as a first approximation, that it is risk aversion that makes the
existence of insurance companies possible. If the company were a monopolist, it would offer an x^
for sure only infinitesimally greater than C(ℒ); with free entry and competition, x^ will tend to be
the highest one allowing insurance companies to cover administrative expenses and to earn the
normal rate of return on their capital.
Figure 9.2 also illustrates the effect of a mean-preserving spread of the alternative
outcomes: assume that, with p constant, the distance of the possible outcomes of the lottery from its
unchanged mean value increases, that is, a and b increase while x* remains unchanged. This means
that points A and B move respectively to the right and to the left on the u(x) curve, hence the line
connecting them shifts downwards (cf. the dotted straight line in Fig. 9.2), so u(ℒ) and C(ℒ)
decrease: the risk premium increases. It is not difficult to prove that the same behaviour of u(ℒ) and
C(ℒ) is caused by a mean-preserving spread that consists of replacing an outcome, say x*–b, with two
[19] One can also see x* as initial wealth x1 minus the expected value of the damage x1–x2, i.e.
x* = x1 – (1–p)(x1–x2). In an example below the expected value of the damage is indicated as d.
[20] Unfortunately the term ‘premium’ is used in commercial language both to indicate what the
customer pays for the insurance coverage, and in the locution ‘risk premium’ with the meaning explained in
the text (which is why other terms are sometimes preferred for this second usage, such as risk avoidance
price). The term ‘risk premium’ is also used with a still different meaning in other contexts.
outcomes, one greater and one smaller, such that their joint probability is equal to the probability of
the replaced outcome and their total expected value is equal to the replaced outcome[21]: because of
risk aversion the sub-lottery with the two new outcomes as prizes has lower utility than the replaced
outcome, so the new overall lottery has lower expected utility because it has, in place of x*–b, the
certainty equivalent of the sub-lottery that replaces it (Exercise: show it graphically).
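The two kinds of mean-preserving spread just described can be checked with a short sketch (Python; the square-root felicity and the numbers are illustrative only):

    import math

    u = math.sqrt                                        # an illustrative strictly concave felicity
    EU = lambda lottery: sum(p * u(x) for p, x in lottery)

    base   = [(0.5, 64.0), (0.5, 100.0)]                 # prizes x*-b and x*+a, mean 82
    spread = [(0.5, 44.0), (0.5, 120.0)]                 # a and b increased, mean still 82
    split  = [(0.25, 54.0), (0.25, 74.0), (0.5, 100.0)]  # x*-b replaced by a fair sub-lottery, mean still 82

    print(EU(base), EU(spread), EU(split))               # 9.0, then about 8.79 and 8.99: both spreads lower expected utility
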
In Fig. 9.2 an alternative u^(x) curve (the dotted curve) has also been drawn, more concave
than the first one, but such that u^(x*)=u(x*) so as to favour comparison. The greater concavity
causes the A-B line to be lower than with u(x), so the risk premium is greater. One then says that
u^(x) shows greater risk aversion than u(x); a more precise measure will be indicated shortly.
In Figure 9.2 the assumption that utility has the expected utility form is what allows one to
find u(ℒ) on the point of the AB segment vertically above x*. The intuition for the effect of a mean-preserving spread can also be obtained from Fig. 9.1, where a movement away from the certainty
line from any vector x to another vector x’ with the same expected value means a passage to a lower
indifference curve. The risk premium too can be derived from this graphical representation. Under
our assumption of state-independent utility, it can be measured indifferently on the state-1 axis or
on the state-2 axis, and it is indicated by the difference between the amount of, say, x1
corresponding to E(x) and the amount corresponding to C(x).
Is risk aversion a universal phenomenon? Clearly not, because it does not explain the
acceptance of gambles known to be unfair, like buying tickets of lotteries known to give back in
prizes much less than – even less than a third of – the total revenue from the sale of tickets[22], nor
does it explain the fact that people play gambles known to be unfair, like casino roulette (one of the
less unfair gambles, but still unfair, paying only 36/37 in Europe, and 36/38 in USA, on average, of
what is gambled). Decisions suggesting risk preference can be explained as due to the excitement of
gambling (i.e. one derives utility not only from the prizes but also from the activity itself of
gambling); as due to the attraction of the possibility of becoming very rich, for which one is ready
to give up small sums without worrying much about the fairness of the bet[23]; as due to
desperation, if the prize is the only way to get out of a situation which is already a disaster so that
losing the bet is not going to make things much worse (this last case explains gambles with a
negative expected value by heavily indebted individuals or by stock market traders who try to
recuperate the loss of a speculation turned sour). The same individual can be risk averse or not
depending on the gamble, e.g. most purchasers of lottery tickets also buy insurance policies.
The St. Petersburg Paradox
Still, in most cases risk aversion seems to hold. For example, decreasing marginal utility of
income appears to be the explanation of the famous St. Petersburg Paradox.
The following gamble is proposed: a fair coin is flipped until heads appears; if heads
appears at the n-th flip, a prize of 2^n dollars is paid. How much would you be ready to pay for the
right to this uncertain prize? Consider the following table which illustrates the probability of the
prizes and their expected values:
[21] For example in the case of Fig. 9.2 outcome x*–b might be replaced by the lottery with prizes x*–b+ε
and x*–b–ε, both with probability 1/2. Exercise: show that the mean of the overall lottery is not affected by
such a replacement.
[22] Exercise: prove formally that if a lottery with a single prize pays back to the winner the total
amount earned by selling tickets then the lottery is actuarially fair, and therefore a risk averse individual
should not buy a ticket.
[23] There is reason to suspect that people tend not to calculate expected values when the gamble
involves a very small loss versus a very large (although highly improbable) gain; most people have only the
vaguest idea of the probability of winning the big prizes of important lotteries. The anticipation of a regret if
one completely gives up the possibility, however improbable, of becoming very rich is probably an important
reason why one buys lottery tickets.
flip                                          1     2     3     4      5      ...   n
prize if heads first comes out at this flip   2     4     8     16     32     ...   2^n
probability of heads first coming out then    1/2   1/4   1/8   1/16   1/32   ...   1/2^n
expected value of prize                       1     1     1     1      1      ...   1
The expected value of the gamble is the sum of the expected values of the possible prizes,
hence it is 1 + 1 + 1 + ... = +∞. If one were risk neutral one should be ready to pay any sum, even an
enormous one, for the right to a single play of this gamble! Experimental evidence shows on the
contrary that on average people are ready to pay between 3 and 4 dollars. Bernoulli, one of the
founders of probability theory, proposed to explain this paradox by assuming expected utility with
the basic utility of money m given by ln m. Let mn be the prize if heads comes out at the n-th flip,
and pn the probability of this prize. Then the expected utility of the gamble is
u(ℒ) = Σn=1,…,∞ pn u(mn) = Σn=1,…,∞ (1/2^n) ln 2^n = (Σn=1,…,∞ n/2^n) ln 2 = 2 ln 2 = ln 2² = ln 4. [24]
This implies C(ℒ)=4, which may explain the empirical evidence as due to risk aversion being a
bit greater than if the utility of money or wealth were the natural logarithm of wealth. Also, there is
a constraint on what one can pay for such a gamble, deriving from the maximum wealth one can
dispose of; and also probably there is a minimum subsistence wealth one is not ready to risk. (This
minimum subsistence may be the level that explains why, when one’s initial expected wealth level
is below it, one becomes risk loving.)
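Bernoulli's computation is easy to reproduce (a sketch in Python; truncating the sum at 60 flips is only a numerical convenience, the neglected tail is negligible):

    import math

    # the expected money value of the gamble diverges, but expected log-utility converges to ln 4
    EU = sum((1 / 2**n) * math.log(2**n) for n in range(1, 60))
    print(EU, math.log(4.0))     # both about 1.3863
    print(math.exp(EU))          # certainty equivalent C(L), about 4 dollars
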
Actuarially fair insurance
Imagine an individual facing the possibility of a loss of money owing to some event, and
considering whether to insure against this loss. Let us indicate (with the same symbols as Varian):
W the given initial wealth,
L the loss of wealth, or damage, (a single value, for simplicity) caused by the event
occurring, measured in money,
p the probability of L occurring, independent of the individual’s actions[25],
q the ‘insurance coverage’ i.e. the amount of the conditional commodity ‘one dollar the
insurance company will pay in case L occurs’,
π the given premium to be paid per dollar of coverage.
If the individual insures for an insurance coverage q, he pays πq for sure and obtains wealth
W–πq if L does not occur, W–L–πq+q if L occurs. The individual chooses q so as to maximize
utility (1–p)u(W–πq)+pu(W–L–πq+q). The first-order condition is
–(1–p)πu’(W–πq)+p(1–π)u’(W–L–πq+q) = 0 ,
that is,
u' (W - D - q  q)

(1  p)


.
u' (W - q)
(1   )
p
[24] The result requires finding the value to which the series Σ n·x^n converges, and then setting x=1/2 in it. Consider the geometric series 1+x+x²+...+x^n+... = 1/(1–x) for 0<x<1; differentiate both sides to obtain 1+2x+3x²+4x³+...+n·x^(n–1)+... = 1/(1–x)²; rewrite the left-hand side as (1+x+x²+...+x^n+...)+(x+2x²+3x³+...+n·x^n+...) = 1/(1–x) + Σ n·x^n, hence Σ n·x^n = 1/(1–x)² – 1/(1–x) = x/(1–x)²; this becomes 2 if x=1/2.
[25] In many cases the probability of the damage depends on the individual’s actions: e.g. a house fire, or falling ill and in need of medical assistance, can depend on how careless one is. This gives rise to the problem of moral hazard, which for the moment we do not want to introduce.
Now suppose the insurance company offers an actuarially fair contract, that is, p=π, for
example because competition with entry reduces expected pure profit to zero and the insurance
company – we may assume for simplicity – has no costs other than the payment of coverages, in
which case expected profit on each insurance contract is p(πq–q)+(1–p)πq = (π–p)q, implying p=π
if expected profit is zero. An actuarially fair gamble or bet is, as already indicated, one that offers a
win whose expected value equals what one pays for the gamble; if the customer pays πq and
receives q with probability p=π, the expected value of receiving q is pq=πq. If p=π then the
right-hand side of the above first-order condition equals 1, implying equality of the marginal
utilities on the left-hand side; since u is strictly concave, the marginal utility must then be evaluated
at the same point in the numerator and in the denominator, i.e. (W–L–πq+q)=(W–πq), i.e. q=L. The
individual chooses complete insurance: he pays πq and is left with W–πq, and in case L happens he
receives q=L and finds himself again with W–πq. So he obtains W–πq=W–pL for sure, and since the
insurance contract is fair this equals the expected value p(W–L)+(1–p)W of the no-insurance
lottery, corresponding to income x* in Fig. 9.2.
[Fig. 9.3: two-state diagram with axes x1 (income in state 1) and x2 (income in state 2), the 45° certainty line, the lottery x, the points E(x) and C(x), and the segment αβ.]
The Figure reproduces Fig. 9.1 with the indication of the risk premium, represented by
segment αβ; it also shows that a more convex indifference curve through x (the dotted
one) would cause the risk premium to be greater because it causes the certainty equivalent
to be smaller.
A clear intuition for this result can be obtained from Fig. 9.1 or 9.3. Let x represent the
lottery payoffs without insurance, with state 2 indicating income in case the damage occurs (x1=W,
x2=W–L) so now it is x2 that has probability p and the constant-expected-value lines have slope –
(1–p)/p; the expected value of this lottery is p(W–L)+(1–p)W. If the customer buys q dollars of
coverage, paying πq for it, the point representing the lottery moves to x’=(W–πq, W–πq–L+q), that
is, as q increases x’ moves North-West along a straight line of slope –(1–π)/π. If insurance is
actuarially fair, p=π, then this ‘insurance line’ coincides with the constant-expected-value line, so
for any q the expected value of the lottery with insurance is the same as without insurance:
p(W–L–πq+q)+(1–p)(W–πq) = p(W–L)+(1–p)W+(p–π)q = p(W–L)+(1–p)W. Then the customer
finds it optimal to reach the point E(x) on the certainty line, which owing to the convexity of
indifference curves is the point with the highest utility on the constant-expected-value line. In other
words, by insuring completely the individual reaches the utility of the expected value of the lottery
for sure, and therefore avoids the loss in utility associated with risk aversion.
Example. (Here Fig. 9.2 is more appropriate.) A family has wealth m whose utility is given
by u(m)=m^1/2, and it has a Von Neumann-Morgenstern expected utility function over choices under
uncertainty. The family's initial wealth is m*=100 and it estimates there is a 50% probability of a
damage of 64, which would reduce wealth to 36.
a) Determine the expected value E(m) of the family's wealth.
b) Determine the expected utility of the 'lottery' describing the two possible wealths of the
family if no insurance is purchased.
c) Determine the maximum price P=d+R (where d is the expected value of the damage
reimbursement) the family is ready to pay for complete insurance, and the associated risk avoidance
price R i.e. benefit from elimination of risk.
Solution. Expected value of wealth E(m)=100/2 + 36/2 = 68 (this is the x* in Fig. 9.2).
Utility of 'lottery' L=[100, 36; p=1/2] is u(ℒ)=10/2 + 6/2 = 8. Certainty equivalent of the 'lottery' is
C(ℒ)=64, the certain wealth with utility equal to 8. Maximum P payable for complete insurance is
the one that reduces initial wealth 100 to the level yielding the same utility as without insurance i.e.
u=8, therefore P=36 that reduces wealth to C(ℒ)=64. R=P–d where d=expected value of the
reimbursement i.e. of the damage, that is, d=64/2=32; hence R=4, also obtainable as R=E(m)–C(ℒ)
because E(m)=m*–d, C(ℒ)=m*–P.
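The same numbers can be reproduced with a few lines of Python (a minimal sketch; the variable names are mine):

```python
from math import sqrt

p, m_star, damage = 0.5, 100.0, 64.0      # probability of damage, initial wealth, damage
u = sqrt                                   # basic utility u(m) = m^(1/2)

expected_wealth  = (1 - p) * m_star + p * (m_star - damage)          # E(m) = 68
expected_utility = (1 - p) * u(m_star) + p * u(m_star - damage)      # u(L) = 8
certainty_equiv  = expected_utility ** 2                             # C(L) = 64 (inverse of sqrt)
max_price_P      = m_star - certainty_equiv                          # P = 36
d                = p * damage                                        # expected reimbursement = 32
risk_avoidance_R = max_price_P - d                                   # R = 4 = E(m) - C(L)

print(expected_wealth, expected_utility, certainty_equiv, max_price_P, risk_avoidance_R)
```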
Unfair insurance
In order to cover administrative costs and, possibly, some risk aversion of the owners,
insurance companies do not offer actuarially fair insurance contracts. We can understand the effect
in terms of the two-states diagram (Fig. 9.4). Now π>p, so the ‘insurance line’ is less steep than the
constant-expected-value line. If the consumer can choose the coverage q at the fixed premium π,
and if the absolute slope of the ‘insurance line’ is greater than the absolute slope –dx2/dx1 of the
indifference curve at x, she maximizes utility at the point y of tangency between the ‘insurance line’
and an indifference curve, so she insures only
partially[26].
[Fig. 9.4: two-state diagram with axes x1 (income in state 1, no loss) and x2 (income in state 2, with loss), the 45° certainty line, the no-insurance lottery x, the points C(x) and E(x), the constant-expected-value line of slope –(1–p)/p, the ‘insurance line’ of slope –(1–π)/π, and the tangency point y.]
The analysis of the same case through the graphical representation of Fig. 9.2 is as follows.
With fair insurance (π=p) we have seen that the consumer insures completely, that is, buys a
coverage q=L, pays πq, and is left with x*=W–πq=W–πL for sure, which is the expected value of
the original lottery. To understand the effect of π>p let us first re-examine the case π=p, but now
with incomplete insurance. For q<L, if p=π the expected value of the lottery ℒ’ that yields
W–L–πq+q with probability p and W–πq with probability 1–p remains equal to x*=W–pL, and
u(ℒ’) can be found vertically above x* on the straight line connecting the points A’ and B’
corresponding to x=W–πq and x=W–L–πq+q on the curve u(x), cf. Fig. 9.5; u(ℒ’) rises with q but
remains less than u(x*) as long as q<L, which is why it is best for the consumer to go all the way to
q=L, which causes A’ and B’ to coincide.
If on the contrary π>p, the expected value of the same lottery ℒ’ equals W–pL–(π–p)q<x*,
so it decreases as q increases; there are therefore two effects on u(ℒ’) as q increases from zero, a
positive effect deriving from the A’-B’ line moving upwards, and a negative one deriving from the
[26] If the insurance company is risk neutral and can decide both q and P, what is the choice that maximizes its profit?
leftward movement of the point E(ℒ’) on the abscissa, which (going up from it vertically) gives us
the point on the A’-B’ line that indicates u(ℒ’). The first-order condition tells us that now it must be
u’(B’)/u’(A’)>1 for an optimum, which means that, at the optimum, B' is to the left of A' on the
utility curve: evidently as q increases the negative effect becomes stronger than the positive one
before A’ and B’ come to coincide, that is, before q=L. Therefore the optimum q for the consumer
is less than L, which means the consumer does not insure completely.
[Fig. 9.5: graph of the utility curve u(x), with the points A, A’, B, B’ on it, the values u(W), u(ℒ’), u(ℒ), and the abscissa points W–L, x*, W. The figure shows u(ℒ’) when q<L but p=π, so that the expected value of ℒ’ remains equal to x*.]
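A numerical illustration of the result that with π>p the consumer insures only partially may help (a sketch in Python; the numbers W=100, L=64, p=0.25, π=0.30 and the square-root utility are my own choices, not taken from the text):

```python
import numpy as np

W, L, p, pi = 100.0, 64.0, 0.25, 0.30     # unfair insurance: pi > p
u = np.sqrt                                # a strictly concave basic utility

def expected_utility(q):
    return (1 - p) * u(W - pi * q) + p * u(W - L - pi * q + q)

q_grid = np.linspace(0.0, L, 100001)
q_star = q_grid[np.argmax(expected_utility(q_grid))]
print(q_star)                                            # about 27.8: strictly between 0 and L
print(expected_utility(q_star) > expected_utility(L))    # True: full insurance gives lower utility
```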
Measuring risk aversion
Fig. 9.3 shows that the risk premium is the greater, the greater the convexity of indifference
curves; Fig. 9.2 shows that the risk premium is the greater, the greater the concavity of the expected
utility u(x). In applications, the approach illustrated in Fig. 9.2 is the more frequent one; then a
measure of the concavity of u(x) can be useful; it must measure how fast the slope of u(x) decreases
if x increases. The most widely used measure is
u" ( x)
,

u ' ( x)
the absolute value of the ratio between second and first derivative of u(x). This is called the ArrowPratt measure of absolute risk aversion. When applied to the expected utility of wealth, it indicates
how fast its slope decreases as wealth increases: the second derivative alone does not suffice,
because it is altered by affine transformations of the expected utility function, which we know do
not alter preferences and maintain the expected utility form; division by u' makes the measure
independent of such transformations. Its appropriateness as a measure of risk aversion can be shown
in two ways. The dotted utility function in Fig. 9.2 has the same slope as the other one at x* and is
more concave, thus its Arrow-Pratt measure is greater, and indeed it causes a greater risk avoidance
price (a smaller certainty equivalent) for the same lottery, indicating a greater loss of utility due to
risk aversion. The Arrow-Pratt measure also indicates how convex the indifference curves of Fig.
9.1 are at the certainty line. Consider the indifference curve through the initial, sure wealth w of a
consumer who is offered gambles consisting of a win x1 with probability p, and a loss x2 (a negative
number) with probability 1-p; if the consumer accepts one such gamble, her wealth becomes a
lottery yielding w+x1 with probability p and w+x2 with probability 1–p. The set of gambles the consumer will accept is the set of points on
or above the indifference curve through w; this is called her acceptance set. The acceptance set will
be the smaller, the more convex the indifference curve through w, indicating that fewer gambles
will be accepted. We can then say that, at least locally (that is, for small gambles from a given
initial sure wealth), risk aversion is the greater, the more convex the indifference curves are at the
certainty line. We have already seen that at the certainty line all indifference curves have slope
x2'(x1)= –p/(1–p) at x1=0. Let us differentiate again with respect to x1, at x1=x2=0, both sides of
the equality from which this result was derived:
u ( w  x1 )
u ( w  x2 ( x1 )) dx2
 (1  p)
 pu '(1  p )u ' x2 ' (0)  0 .
x1
x2
dx1
We obtain again ∂²u/∂x1² = ∂²u/∂x2² = u"(w) because both are calculated at the same point, hence:
p·u"(w) + (1–p)·[u"(w)·(x2'(0))² + u'(w)·x2"(0)] = 0
from which, using the fact that x2'(x1) = –p/(1–p) at x1=0, one obtains
x2"(0) = [p/(1–p)²] · [–u"(w)/u'(w)] .
Thus the convexity of the indifference curve is the greater, the greater is the Arrow-Pratt
measure of risk aversion; a consumer with a smaller Arrow-Pratt measure will accept all the
gambles accepted by a consumer with a greater Arrow-Pratt measure, plus some more, indicating
less risk aversion.
A property of this measure is that for small bets it is approximately proportional to the risk
premium or risk avoidance price. Let us prove it.
Let the payoffs of a fair bet on wealth be a random variable h, with E(h)=0. Indicate the risk
premium as R and the initial (sure) wealth as W, with the possible values of h small relative to W.
By the definitions of certainty equivalent and of risk premium it is
E[u(W+h)] = u(W–R).
On the left-hand side we have the expected utility of the random income W+h; on the right-hand
side we have the utility of the certainty equivalent W–R. Let us expand both sides in Taylor’s
series. For the right-hand side one may stop at the second term since R is a fixed amount:
u(W–R) = u(W) – Ru’(W) + higher-order terms.
For the left-hand side it is better to include the third term to allow for the variability of h:
E[u(W+h)] = E[u(W) + h·u’(W) + (h²/2)·u”(W) + higher-order terms] =
= u(W) + E(h)·u’(W) + ½·E(h²)·u”(W) + higher-order terms.
Remembering E(h)=0, noting that E(h²) is the variance of h, and neglecting higher-order terms, we
obtain ½·E(h²)·u”(W) ≈ –R·u’(W), i.e.
(*)   R ≈ – [u”(W)/u’(W)] · Var(h)/2 .
The risk premium is approximately equal to the Arrow-Pratt measure of absolute risk
aversion multiplied by half the variance of h.
The above was calculated assuming that the initial wealth was precisely the expected value
of the lottery (W+h). But the reasoning still holds when that is not the case, by replacing W with the
expected value x* of the random variable ‘wealth’. The Arrow-Pratt measure must then be
calculated at x*.
We can illustrate with a numerical example that will give us some feeling for the magnitude
of the risk premium, at the same time checking the goodness of the approximation. Take the kind of
situation considered earlier that may induce one to buy insurance against a possible house fire
causing a damage L. Assume initial wealth is W*=625; u(W)=W1/2; L=225 with a probability 10%
of happening. The lottery the uninsured individual faces is ℒ=.1◦400.9◦625; the expected value of
wealth it implies is x*=E(W)=602.5. The expected utility of the lottery is u(ℒ) = .1 · 4001/2 + .9 ·
6251/2 = 24.5. The certainty equivalent is C(ℒ) = [u(ℒ)]2 = 600.25. Therefore the risk premium is
R=E(W)–C(ℒ)=2.25 (did you expect it to be so small?). Now let us see what measure of the risk
premium is obtained from the (*) approximation. With this utility function, u”(x)/u’(x)= –1/(2x);
the Arrow-Pratt measure must be calculated at x* and therefore is 1/(2x*) = 1/1205. The variance is
E((x–x*)²) = .9·(625–602.5)² + .1·(602.5–400)² = 4556.25.
Half this variance times the Arrow-Pratt measure yields 1.89. The true risk premium, 2.25, is 19%
greater than the (*) approximation 1.89. The order of magnitude is not widely off the mark but the
percentage of error is rather big, which shows that the (*) approximation is good only for random
variables with very small deviations from the mean.
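The computations of this example are easy to reproduce (a Python sketch using the chapter's numbers):

```python
from math import sqrt

W0, damage, p = 625.0, 225.0, 0.10
u = sqrt

x_star   = (1 - p) * W0 + p * (W0 - damage)               # expected wealth, 602.5
eu       = (1 - p) * u(W0) + p * u(W0 - damage)           # expected utility, 24.5
C        = eu ** 2                                        # certainty equivalent, 600.25
R_exact  = x_star - C                                     # true risk premium, 2.25

arrow_pratt = 1 / (2 * x_star)                            # -u''(x)/u'(x) at x* for u = sqrt
variance    = (1 - p) * (W0 - x_star) ** 2 + p * (W0 - damage - x_star) ** 2   # 4556.25
R_approx    = arrow_pratt * variance / 2                  # the (*) approximation, about 1.89

print(R_exact, R_approx)
```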
Absolute risk aversion. A question of practical interest is whether or not increases in wealth induce
a greater readiness to accept bets. This is connected with the behaviour of the risk premium,
because if one starts with a sure wealth W and is offered a bet h that turns W into the random
variable W+h, the individual will accept the bet if
E[u(W+h)]>u(W),
which is equivalent to (indicating the lottery with the random variable):
C(W+h)>W
i.e. to
E(W+h)–R(W,h)>W.
If the bet remains unchanged in payoffs and in probabilities as W changes, it is
E(W+h)=W+E(h)
so the bet is accepted if
E(h)>R(W,h).
One speaks of decreasing absolute risk aversion if as wealth increases one comes to accept
bets that at a lower wealth level one would not accept, that is, if for any given h the risk premium R
is a decreasing function of W. It seems intuitive that especially when comparing lower with higher
levels of income and considering a bet that, even though with a positive expected value, at the lower
income level would risk reducing the individual’s income below subsistence, the individual will be
more disposed to accept the bet if rich. Indeed there is ample evidence supporting decreasing
absolute risk aversion (except, of course, in the cases of desperation mentioned earlier). (Buying
lottery tickets with improbable big prizes may appear less attractive to very rich people than to poor
people because of the smaller impact of a win on their life condition; but in this case there isn’t risk
aversion to start with, at least for the poor people.)
In some analyses the assumption is made of constant absolute risk aversion, abbreviated as
CARA, that is, that for a given bet represented by the random variable h with expected value E(h)
the risk avoidance price R that renders the consumer indifferent between the random variable
(W+h) and the sure sum W+E(h)–R does not change when W changes. One can also express this
assumption as an assumption of absence of wealth effects on R (this is further explained below). It
may be useful to know that the utility functions exhibiting CARA must be of the form
u(x) = – c·e^(–αx)
where c, α are positive constants, and x is the random wealth or income. This function is negative,
increasing, concave and tending asymptotically to zero. With it one obtains –u”(x)/u’(x) = α, so α is
the Arrow-Pratt measure of absolute risk aversion. If one sets c=1/α one obtains u’(x)=e^(–αx), which
may make analytical derivations easier.
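A quick numerical check that CARA indeed means a risk avoidance price independent of wealth (a sketch; the bet and the value of α are arbitrary illustrative choices):

```python
from math import exp, log

alpha = 0.05
u    = lambda x: -exp(-alpha * x) / alpha        # CARA utility with c = 1/alpha
uinv = lambda v: -log(-alpha * v) / alpha        # inverse of u, to recover certainty equivalents

def risk_avoidance_price(W, gain=10.0, loss=-10.0, p=0.5):
    # a fair bet h (E(h) = 0), so R = E(W + h) - C(W + h) = W - C(W + h)
    eu = p * u(W + gain) + (1 - p) * u(W + loss)
    return W - uinv(eu)

print([round(risk_avoidance_price(W), 6) for W in (50, 100, 500, 1000)])   # all identical
```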
Relative risk aversion
Suppose you are interested in knowing whether an increase in wealth makes an individual
more or less averse to accepting a bet whose payoffs are specified as proportional to wealth. For
example, suppose at wealth W the individual rejects a bet that yields .1W with probability 55%, and
–.1W with probability 45%: will he still reject this bet if W increases tenfold?
If at wealth W the bet is a random variable h, we have seen that the bet is accepted if
C(W+h)>W. What is of interest now is whether C(W+h) increases in the same proportion as W if h
increases in that same proportion; describing the increase of W and of h by multiplying them by an
increasing scalar b, the question is how C(bW+bh) behaves relative to bW. Since C(bW+bh) =
E(bW+bh)–R(bW,bh) = bW+bE(h)–R(bW,bh), the question turns on whether R changes in the
same proportion as W and h, or not. So what is relevant is whether the relative risk premium R/W
changes or not as W and h increase in the same proportion.
Let us then note that the approximation (*) can be re-written
u" (W ) Var (h)
.
u ' (W ) 2W
If X and Y are two random variables with Y=bX, then Var(Y)=b²Var(X). So if the sure initial
wealth W increases and becomes bW while the random variable h (the bet) becomes bh, the
variance σ² of wealth if the bet is accepted becomes b²σ². Therefore if we rewrite R/W as
(**)   R/W ≈ – W·[u”(W)/u’(W)] · Var(W)/(2W²) ,
the second fraction on the right-hand side is invariant to proportional changes of sure wealth and of
the bet payoffs (to see it, replace W with bW). So the behaviour of R/W as W and h increase is, at
least locally, indicated by the behaviour of the Arrow-Pratt measure multiplied by W. For this
reason, –W·u”(W)/u’(W) is called the Arrow-Pratt measure of relative risk aversion, and its increase or
decrease as W increases indicates whether the relative risk premium is an increasing or decreasing
function of W. Note that the often plausible assumption of decreasing absolute risk aversion is
compatible with any behaviour of relative risk aversion, and it is much less clear what kind of
behaviour one should expect for the latter; some people argue that a constant relative risk aversion
is not too implausible an assumption.
Constant relative risk aversion (CRRA) obtains if
u(x) = x^(1–ρ)/(1–ρ) with 0<ρ, and u(x) = ln x if ρ=1.
The constant ρ is the measure of relative risk aversion. The CRRA function implies, as you
can easily check, decreasing absolute risk aversion, because the latter is ρ/x.
Note that if you have a lottery ℒ yielding x with probability p and y with probability 1–p, and the utility function is CRRA, then
u(ℒ) = p·x^(1–ρ)/(1–ρ) + (1–p)·y^(1–ρ)/(1–ρ) = [1/(1–ρ)]·[p·x^(1–ρ) + (1–p)·y^(1–ρ)] ,
whose expression in square brackets is a CES utility function raised to the power (1–ρ), whose
constant elasticity of substitution is 1/ρ. A similar expression is reached, with rates of discount in
place of probabilities, if in an intertemporal neoclassical growth model one assumes that the total
utility of infinitely lived households, or of generations, is a sum of per-period ‘felicity functions’ of
CRRA form; then one obtains that the elasticity of substitution between consumption in two periods
is constant, which simplifies things. But analytical simplicity is not sufficient to make an
assumption economically reasonable; however, I must leave to the macroeconomics lecturer the
task of giving a more convincing motivation for this choice, if possible.
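To fix the meaning of constant relative risk aversion, the following sketch (the numbers are mine) checks that with a CRRA utility the risk premium remains the same fraction of wealth when wealth and a proportional bet are scaled together:

```python
rho = 2.0
u    = lambda x: x ** (1 - rho) / (1 - rho)        # CRRA utility, rho != 1
uinv = lambda v: ((1 - rho) * v) ** (1 / (1 - rho))

def relative_risk_premium(W, scale):
    # fair bet of +/- 20% of wealth; wealth and bet are both multiplied by `scale`
    Wb = W * scale
    eu = 0.5 * u(1.2 * Wb) + 0.5 * u(0.8 * Wb)
    return (Wb - uinv(eu)) / Wb                    # R / W

print([round(relative_risk_premium(100.0, b), 6) for b in (1, 2, 5, 10)])   # identical values
```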
Exercise: prove that in the graph of Fig. 9.1 or 9.3 with constant absolute risk aversion indifference
curves have the same slope along straight lines parallel to the certainty line, while with constant relative risk
aversion indifference curves have the same slope along rays from the origin.
??Fig. 9.6: indifference curves with CARA and with CRRA (from Cowell)
AN APPLICATION OF THE RISK PREMIUM APPROXIMATION: EFFICIENT RISK POOLING.
I take from Milgrom and Roberts 1992 an example of application of approximation (*).
The example rests on the assumption of absence of wealth effects, the assumption that, for choices
without uncertainty, is expressed by quasi-linear utility.
Assume the following three conditions hold:
(i) for each couple of options α and β that the individual can have access to, with α strictly
preferred to β, there exists an amount of money m sufficient to compensate the individual for
getting β instead of α (that is, getting β and m is equipreferred to getting α);
(ii) given α, β, m as in (i), the preference order between α and β and the compensating
amount of money m are independent of the wealth of the individual; thus the individual is also
indifferent between getting β, or getting α and paying m; and she is also indifferent between getting
α plus an amount of money M, or getting β plus m plus M;
(iii) the individual’s wealth is greater than the reduction m in it that compensates for the
passage from β to α.
These conditions are obviously restrictive, and therefore not legitimate in many cases[27].
But when they can be assumed as sufficiently close to reality, then the utility function of the
individual who, relative to the choice set of interest, has no wealth effect takes a quasilinear form,
linear in wealth. We assume monotonicity. Let x be the money wealth of the individual, and let y be
a vector of variables relevantly influencing her decisions relative to that choice set; then u(x,y) can
be represented as x+v(y).
To prove it, note that the summation indicates that v(y) too is a quantity of money,
interpretable as the m such that the individual is indifferent between obtaining a total wealth x+m,
and obtaining wealth x and the vector y. Accordingly v(y) can be called the money equivalent of y.
(If y is a ‘bad’, which reduces utility relative to having only x, then v(y) is negative.) Because of (i)
the money equivalent exists, therefore u(x,y) = u(x+v(y)), which by monotonicity is an increasing
function of wealth and therefore represents the same preference order as x+v(y).
Because of monotonicity u(x,y)>u(x,y’) if and only if u(x+v(y))>u(x+v(y’)), or equivalently
if and only if x+v(y)>x+v(y’). Because of (ii) v(y) is independent of x; therefore v(y)–v(y’) is the m,
independent of x, that compensates for obtaining y’ instead of y. Absence of wealth effects requires
that the initial wealth of the individual be greater than the greatest of the possible differences v(y)–
v(y’) for y, y’ in the relevant choice set.
Call x+v(y) the equivalent wealth. When an economic problem involves the welfare of
several individuals, if wealth effects are absent the equivalent wealths of different individuals can
be added to obtain the total equivalent wealth, an amount of money.
Pareto efficiency requires that total equivalent wealth be maximized: if it is not maximized
in an initial situation, then it can be increased and then each individual can be made better off by
dividing equally the increase of total equivalent wealth among all individuals, so the initial situation
is not Pareto efficient; while when total equivalent wealth is maximized, no further increase of the
welfare of some individual is achievable without reducing the welfare of some other individual,
because the equivalent wealth of some individual would have to be reduced, so the situation is
Pareto efficient. Milgrom and Roberts call this result the principle of value maximization. It can also
be expressed as follows: In an uncertain context, an allocation is efficient if and only if it maximizes
the sum total of the certainty equivalents of the wealths of the individuals involved. Application of
this principle does not determine how much equivalent wealth will go to each individual, because
redistributions of wealth that leave total equivalent wealth at its maximum level do not disturb
Pareto efficiency. There is room, therefore, for side payments among the involved individuals as
incentives to behaviours conducive to value maximization.
For our example we must consider uncertainty. One of the possible interpretations of y is as
the vector of possible values assumed in different states of the world by a random variable y that
measures deviations of wealth from an expected value x, causing x+y to be itself a random variable.
Then u(x,y) is the utility of the random variable x+y, that is, the utility of the certainty equivalent
C(ℒ) of the lottery defined by this random variable; in the absence of wealth effects the
[27] One implicit assumption is that the capacity of money to give utility is given, which implies that money prices are given, at least the money prices in markets other than the one under study (cf. in ch. 4 the analysis of quasilinear utility and of consumer surplus). The analysis is necessarily a partial-equilibrium one.
representability of utility through equivalent wealth means that utility can be measured as equal to
the certainty equivalent; we have seen that this certainty equivalent is x–R, where x is the expected
value of wealth and R is the risk premium; it can also be expressed as x+v(y) where v(y) is the
negative of the risk premium. Under absence of wealth effects the risk premium associated with the
random variable y is independent of x; in other words, there is constant absolute risk aversion.
Indeed if W is sure wealth and h a random variable, a risk-averse consumer prefers W+E(h) for sure
to the random variable (W+h) so we can treat the first as α and the second as β in the definition of
absence of wealth effects; the risk avoidance price R is the compensating amount of money m that
makes the consumer indifferent between getting α minus m, or β; absence of wealth effects means
that R does not change with W, which is the definition of CARA.
In an uncertainty context, the principle of value maximization says that an allocation is
efficient if and only if it maximizes the sum of the certainty equivalents of the random wealths of
the individuals involved.
I come now to the example. Risk pooling occurs when individuals who face different risky
income prospects agree to share the total realized income from all their prospects. One way is as
follows. Suppose the incomes of two individuals A and B are two independent random variables YA
and YB, with means yA and yB and variances Var(YA), Var (YB). Indicate with ρA, ρB the respective
coefficients of (constant) absolute risk aversion. The principle of value maximization implies that
we must allocate risk so as to maximize the certainty equivalent of total (random) wealth, that is,
expected value of total wealth minus total risk premium. Without risk reallocation the sum of the
risk premiums, using approximation (*), is
½ ρAVar(YA) + ½ ρBVar(YB).
Suppose the two individuals agree to share their incomes, so that individual A will receive
αYA+βYB+γ , with 0≤α≤1, 0≤β≤1, and γ a side payment (positive, negative or zero) whose role and
possible size will become clear below; while B will receive (1–α)YA+(1–β)YB–γ. This is a feasible
contract, in that what is distributed in total is YA+YB.
Now the total risk premium is
½ ρAVar(αYA+βYB+γ) + ½ ρBVar((1–α)YA+(1–β)YB–γ).
The expected value of total wealth is yA+yB, and this minus the total risk premium is the
total certainty equivalent that must be maximized; therefore we must minimize the total risk
premium. Minimization must be with respect to α, β and γ; but γ does not influence the variances
nor, as a consequence, the minimum, it is only relevant as possibly necessary in order to satisfy the
participation constraints that require, in order for the contract to be acceptable to both individuals,
that the expected utility to each individual of what the contract gives them be not inferior to what
they can get without risk pooling; we will understand its role later.
Neglecting γ, the total risk premium can be rewritten as[28]:
½ ρA·Var(αYA+βYB) + ½ ρB·Var((1–α)YA+(1–β)YB) =
= ½ ρA[α²Var(YA)+β²Var(YB)+2αβCov(YA,YB)] +
+ ½ ρB[(1–α)²Var(YA)+(1–β)²Var(YB)+2(1–α)(1–β)Cov(YA,YB)] =
= ½ [Var(YA)·(ρAα²+ρB(1–α)²) + Var(YB)·(ρAβ²+ρB(1–β)²)] .
The first-order minimization conditions are that the partial derivatives of this expression
relative to α and to β be set equal to zero, which produces:
α/(1–α) = β/(1–β) = ρB/ρA , that is, α = β = ρB/(ρA+ρB) .
If for example ρA<ρB indicating that A is less risk averse than B, then α>1/2 and the same
for β, so A must bear a greater portion of the risk. The closer A is to risk neutral, the closer α and β
are to 1, which means that A should bear nearly all the variability of income, while B should get an
[28] Remember that if Y=a+bX then Var(Y)=b²Var(X), that Var(X+Y)=Var(X)+Var(Y)+2Cov(X,Y), and that YA and YB are independent so Cov(YA,YB)=0.
income consisting nearly entirely of the sure side payment –γ (in this case γ would be negative),
thus bearing nearly no risk.
We see here one reason for the necessity of the side payment γ: without it an optimal α (and
β) close to 1 or to zero would cause one of the two individuals to get almost the entire joint income.
Another reason is to compensate for differences between yA and yB. However, satisfaction of the
principle of value maximization only determines the total certainty equivalent, not how it gets
distributed between A and B. As normal in an interaction between only two individuals, there will
be some indeterminacy here. The participation constraints impose that each individual must obtain
from the contract not less than without the contract, that is,
uA(αYA+βYB+γ) ≥ uA(YA) ≈ yA – ½ ρA·Var(YA)
uB((1–α)YA+(1–β)YB–γ) ≥ uB(YB) ≈ yB – ½ ρB·Var(YB)
but these constraints will generally determine an interval of feasible values that γ can take, inside
which the value of γ is left indeterminate.
One way to determine γ is to assume that it is determined so as to leave the expected value
of the two incomes unchanged: since α=β at the optimum, if αyA+αyB+γ=yA, that is, γ=(1–α)yA–αyB,
then (1–α)yA+(1–α)yB–γ=yB. Now each individual has the same mean income as before the contract, but less risk.
The reduction of risk can be seen clearly with the help of a new notion. Define risk
tolerance as the reciprocal of the Arrow-Pratt measure of risk aversion, 1/ρ if ρ is the Arrow-Pratt
coefficient. Then the share of risk assigned to A, α, is the ratio of A’s risk tolerance to total risk
tolerance (the sum of the individual risk tolerances): for example if ρA=2 and ρB=4, then
α = (1/2)/[(1/2)+(1/4)] = 2/3. And the total risk premium is
½ · [1/((1/ρA)+(1/ρB))] · Var(YA+YB) ,
as if all risk were borne by
an individual with risk tolerance equal to the sum of the risk tolerances of A and B. Thus if n
individuals with similar independent random incomes and similar risk aversion sign an income-pooling
contract like the one illustrated above (the side payments γ can now be assumed absent,
since equal sharing of total income is the natural contract), they come to bear less risk. Assume n
individuals having the same independent random income Y and the same risk aversion coefficient
ρ; the total risk premium expressed through risk tolerance is
½ · [1/(n·(1/ρ))] · Var(sum of the n incomes) = ½ · (ρ/n) · n·Var(Y) (because the covariances are zero) = ½ · ρ · Var(Y) .
The total risk premium is the same as the risk premium R of a single individual on her own; this
means that the total certainty equivalent is the sum of the certainty equivalents of the isolated
individuals, plus (n–1)R; an equal sharing of this total certainty equivalent yields therefore for each
individual a certainty equivalent increase, relative to being on their own, equal to R(n–1)/n.
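The pooling arithmetic can be checked with a short sketch (the variances are arbitrary illustrative values; ρA=2 and ρB=4 as in the text):

```python
rho_A, rho_B = 2.0, 4.0          # CARA coefficients of A and B
var_A, var_B = 1.0, 1.5          # variances of the independent incomes Y_A, Y_B (arbitrary)

alpha = rho_B / (rho_A + rho_B)  # optimal share = A's risk tolerance / total risk tolerance
print(alpha)                     # 2/3

def total_premium(a):            # total risk premium with shares alpha = beta = a
    weight = rho_A * a**2 + rho_B * (1 - a)**2
    return 0.5 * weight * (var_A + var_B)

no_pooling = 0.5 * rho_A * var_A + 0.5 * rho_B * var_B
as_if_one  = 0.5 * (var_A + var_B) / (1 / rho_A + 1 / rho_B)
print(no_pooling, total_premium(alpha), as_if_one)   # pooling lowers the total premium;
                                                     # the last two numbers coincide
```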
Diversification
It is common to hear that risk can be reduced by diversification. Let us give a very simple
and limited explanation of why it is so.
If xi, xj are two random variables, their covariance is defined as
σij = E[(xi–μi)(xj–μj)] = ∫∫ (xi–μi)(xj–μj)·f(xi,xj) dxi dxj (the double integral extending from –∞ to +∞ in both variables),
where μi, μj are the means of xi and xj, and f(xi,xj) is the joint density function, which yields the probability P(a≤xi≤b and c≤xj≤d) as the integral of f(xi,xj) over a≤xi≤b, c≤xj≤d. Two properties of covariance are:
Cov(X,Y) = E(X·Y) – E(X)·E(Y)
Var(aX+bY) = a²Var(X)+b²Var(Y)+2abCov(X,Y).
Now consider a random variable obtained as a mixture of xi and xj, z=αxi+(1–α)xj, with
0≤α≤1; z could represent the returns from investing a sum partly in asset xi and partly in asset xj,
that is, from diversification. We obtain
μz=αμi+(1–α)μj
Var(z) = α²Var(xi)+(1–α)²Var(xj)+2α(1–α)Cov(xi,xj).
We will study later the optimal choice when the expected value of the returns from xi and xj
is not the same. For now let us consider the case μi=μj and Var(xi)=Var(xj), then the random
variable z has mean μz=μi, and variance Var(z) = [α²+(1–α)²]Var(xi)+2α(1–α)Cov(xi,xj).
For 0<α<1 it is α²+(1–α)²<1, hence Var(z) is smaller than Var(xi), and is minimized for
α=1/2 where it becomes half the variance of xi or xj; so if Cov(xi,xj) is zero, or positive but small,
the variance and hence the risk premium decrease with diversification; if the covariance is negative,
the reduction of riskiness is even greater.
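A minimal numerical check of the variance-reduction effect (the variances and covariances are arbitrary illustrative numbers):

```python
def mix_variance(alpha, var_i, var_j, cov_ij):
    # Var(alpha*x_i + (1-alpha)*x_j)
    return alpha**2 * var_i + (1 - alpha)**2 * var_j + 2 * alpha * (1 - alpha) * cov_ij

var_i = var_j = 100.0
for cov in (0.0, 30.0, -30.0):
    print(cov, mix_variance(1.0, var_i, var_j, cov), mix_variance(0.5, var_i, var_j, cov))
# holding only x_i always gives variance 100; the 50/50 mix gives 50 + cov/2,
# i.e. 50, 65 and 35 for the three covariances above
```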
PORTFOLIO SELECTION
Now let us study the optimal portfolio choice of a risk-averse individual faced with the
possibility of investing income into assets with different expected returns and different riskiness.
This should allow some understanding of the behaviour of investors in stock markets.
Assume a two-period time horizon, and suppose there are two assets: a safe one, which
next period will return one euro per euro invested now (hence a zero rate of return: e.g. money),
and a risky one, with a gross return of k euros per euro invested, which is a random variable with an
expected value greater than 1, i.e. a positive expected rate of return. A given wealth W must be
allocated between the two assets. Let a and b be the amounts of wealth invested respectively in the
risky and in the safe asset, a+b=W. Assume no short sales are possible, so 0≤a≤W (short sales will
be explained in a little while). Assume also strict risk aversion. The individual’s wealth next period
is a random variable w=ak+b=W+a(k–1)=W+aR where R is the random rate of return on the risky
asset, with distribution function F(R) and mean E(R)>0. If the individual has a vNM expected
utility function with basic utility function u(·), her utility maximization problem is
max_a ∫ u(W+aR) dF(R)   s.t. 0≤a≤W.
Let a* be the solution; we prove that it must be a*>0. The first-order Kuhn-Tucker
condition suffices for an optimum because the objective function is concave. The derivative of the
Lagrangian function must equal zero, where λ1 is the multiplier of constraint a≥0 and –λ2 of
constraint a≤W:
∫ u’(W+aR)·R dF(R) + λ1 – λ2 = 0.
The complementary slackness conditions are λ1a=0, λ2(W–a)=0. So if 0<a<W then λ2=λ1=0;
if a=0 then λ2=0, and the condition requires ∫ u’(W)·R dF(R) = –λ1 ≤ 0; but this cannot hold,
because when a=0 it is ∫ u’(W)·R dF(R) = u’(W)·∫ R dF(R) > 0, since ∫ R dF(R) = E(R) > 0; this proves
that the solution requires a*>0. This shows that as long as a risky asset has actuarially favourable
returns (and no asset has even more favourable returns), the optimal portfolio composition will
include at least a small amount of it.
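A numerical illustration of the result that a*>0 whenever E(R)>0 (a sketch; the two-state return, the log utility and the grid search are my own choices):

```python
import numpy as np

W = 100.0
u = np.log                                     # a strictly concave basic utility
returns = np.array([0.20, -0.15])              # two-state net return, E(R) = 0.025 > 0
probs   = np.array([0.5, 0.5])

def expected_utility(a):                       # a = amount of wealth put into the risky asset
    return u(W + np.outer(np.atleast_1d(a), returns)) @ probs

a_grid = np.linspace(0.0, W, 100001)
a_star = a_grid[np.argmax(expected_utility(a_grid))]
print(a_star)                                  # about 83.3: strictly positive (and here interior)
```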
The earlier analysis of actuarially unfair insurance conforms to this result, because one can
treat the situation corresponding to complete insurance as the safe asset, relative to which the
‘lottery’ without insurance (point x in fig. 9.4) is a risky asset with a greater expected value; and we
saw that the optimal decision is not to insure fully, i.e. to accept some risk.
We can now ask about the effect on a* of changes in W and of changes in E(R).
It can be shown that da*/dW is positive if the absolute risk aversion coefficient is decreasing
in wealth i.e. if at a higher W one accepts bets that at a lower level of W one rejects. This is rather
obvious and I skip the proof. More interesting is what happens to a* as E(R) rises. Let us follow
Varian 1993 in keeping things simple by representing the increase of the random rates of
return through the assumption that the random variable R becomes R(1+h), with h≥0 a scalar: an
increase of h from zero represents a proportional rise of all possible returns. The first-order
condition if the optimum is internal can be written
(*) E(u’[W+a*(h)(1+h)R]·(1+h)R) = 0.
Apart from the use of the expectation operator, if h=0 equation (*) coincides with the
already found first-order condition. If a*=W, that is if no amount is held of the safe asset, the rise of
h from zero changes nothing: under our assumption of no short sales it remains optimal for the
individual to have a*=W. If a<W, since for each level of h we can treat h as a constant the second
term (1+h) in (*) can be taken out of E(·) and neglected because of the equality to zero, so (*)
becomes
(**)   E(u’[W+a*(h)(1+h)R]·R) = 0.
This equality will be satisfied if and only if a*(h)=a*(0)/(1+h). This means that at higher
levels of h the individual decreases the amount held of the risky asset in exactly the same proportion
as the rise of gross returns, so the individual restores exactly the same pattern of returns as before
the increase of h.
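The proportional-offset result can be verified with the same kind of sketch (again with my own illustrative numbers; the safe asset has zero net return):

```python
import numpy as np

W = 100.0
u = np.log
base_returns = np.array([0.20, -0.15])         # two-state net return of the risky asset, E > 0
probs = np.array([0.5, 0.5])

def a_star(h):
    R = base_returns * (1 + h)                 # proportional rise of all possible returns
    grid = np.linspace(0.0, W, 200001)
    eu = u(W + np.outer(grid, R)) @ probs
    return grid[np.argmax(eu)]

a0 = a_star(0.0)
for h in (0.1, 0.25, 0.5):
    print(h, a_star(h), a0 / (1 + h))          # the last two numbers coincide (up to grid precision)
```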
But this conclusion is strictly dependent on the assumption that the safe asset has zero
expected net return (like money). To see why, let us use a graphical representation like in Fig. 9.1,
i.e. let us assume only two possible rates of net return of the risky asset per euro invested in it, R1
and R2. Let the origin O=(0,0) represent initial wealth, and let us measure on the axes the variation
in wealth (the total net return) if one invests W in an asset or in a portfolio; the safe asset is on the
certainty line and has rate of return R0=0, therefore the return from the investment of W entirely in
the safe asset (a=0) is indicated by the origin, because it causes no variation of wealth. A risky asset
demanded up to a*<W requires then that if R1>0, it is R2<0, otherwise a=W would be the best
choice for the individual because spending the whole of W on the risky asset would get the
individual a lottery like y in Fig. 9.7, preferred to no variation as well as to any mixture of the safe
and the risky asset. And, if short sales were allowed, the individual would like to sell short an
indefinite amount of the risky asset. This is the moment to explain what short sales are, and how
they can be represented graphically.
To sell an asset short means to sign a contract at time t where one promises to deliver the
asset at time t+1 in spite of the fact that at time t one does not own the asset: one will have to buy it
at time t+1 from somebody else just before delivering it to the person to whom it was promised. If
the asset yields its return at time t+1, the short-seller will have to buy it at a value inclusive of its
return because the short purchaser has a right to the return. An asset can be sold short against
immediate payment, or against payment at the time of delivery: the latter is the form generally used
for speculation, but there isn’t a great difference between the two forms since an advance payment
can be transformed into an income at time t+1 through purchase of a safe asset, and a promise of
payment at t+1 can be used as guarantee for a loan at time t. Speculation through short sales is
based on believing that the value of the asset at t+1 will be less than the value at which the short
purchaser is ready to pay it on delivery; since financial transactions can be completed in the space
of seconds, the individual who short sells the asset for payment of a price P at t+1 counts on buying
the asset just a few seconds before at a lesser price P’, pocketing the difference P–P’. Of course the
short purchaser accepts because she entertains the opposite expectation that at time t+1 the market
price P’ of the asset will be greater than the short-sale price P, so she will be able, by re-selling the
asset immediately, to pocket the difference P’-P.
Let us consider Fig. 9.7. It shows on the axes the net returns from the portfolio in the two
states; the downward-sloping broken lines are equal-expected-value lines whose slope indicates the
probabilities assigned to states 1 and 2. Point x indicates the net returns if all W is spent on a risky
asset and the latter is such that 0<a*<W; the net returns are x1=W·R1, x2=W·R2, with R1>0, R2<0;
the points on the O-x segment are the portfolios corresponding to different values of a; the optimal
portfolio is the one that generates tangency with an indifference curve, it is represented by net
returns z* in the Figure. It is conceivable that the tangency be on the prolongation of the O-x
segment to the right of x, in which case if short sales are not allowed the optimal portfolio consists
of only the risky asset, while with the possibility of short sales the individual will sell the safe asset
short, and will use the proceeds to spend more than W on the risky asset so as to reach the point of
tangency; reaching a point like x’ will mean a negative holding of the safe asset[29]. Point y
represents the total net returns (with a=W) of a risky asset delivering a positive net return in both
states: if it were available the individual would spend the whole of W on it, and with a possibility of
unlimited short sales the individual would demand an infinite amount of it. But in the latter case the
given price of the risky asset becomes implausible: the excess demand for the asset would cause its
price to rise, decreasing the expected rates of return. And anyway, even if not owing to the actions
of a single individual, the price of financial assets must be considered determined by the market; it
can only be taken as given as a first step, useful to determine individual demands, toward
determining the prices of assets.
[Fig. 9.7: two-state diagram with axes x1 (income variation in state 1) and x2 (income variation in state 2), the 45° certainty line through the origin O, the constant-expected-value lines, the risky asset x, its prolongation x’, the asset y (positive net return in both states), and the optimal portfolio z* on the O–x segment.]
The same Figure allows us to understand why, in this case of the safe asset having zero net
return, a rise of the return on the risky asset induces the individual not to alter the pattern of
returns, i.e. to remain at z*: the rise of h moves x to some x’ along the same O-x line but further
away from the origin: the optimal choice, the point of tangency with the indifference curve, remains
z*. This is obtained by reducing a*: if a* had not decreased, the portfolio returns would have been
represented by a point to the right of z*.
If the safe asset yields a positive rate of return R0, the optimal portfolio will no longer
coincide with the original portfolio z* when h increases, nor will it be the case that
a*(h)=a*(0)/(1+h). A graphical analysis is again sufficient. Assume the net returns from investing
the whole of W in the safe asset are indicated by point s on the certainty line in Fig. 9.8, with
s1=s2=W·R0. Now the condition 0<a*<W, which requires that the net returns from the risky asset x
be to the South-East of s, does not prevent both rates of return R1 and R2 from being positive, and
this is the case shown in Fig. 9.8. Let z* indicate again the initial optimal portfolio; now let h rise,
causing x to move outwards to x’; this causes the line representing the achievable portfolios (i.e. the
[29] If we indicate with α the share of W going to buy the risky asset, and with 1–α the share going to the safe asset, reaching x’ requires α>1, so 1–α is negative. Exercise: show graphically how to build a riskless portfolio from two risky assets, using short sales if necessary.
line through s and x) to rotate counterclockwise around s. If it were a*(h)=a*(0)/(1+h) the returns
from holding the risky asset, a*(0)R1 and a*(0)R2, would not change, therefore the new portfolio
would be found (by the parallelogram rule) on the s-x’ line by going up from z* with a 45° slope.
This is point c in Fig. 9.8, and it will generally not be the point of tangency between the s-x’ line
and an indifference curve, the tangency will generally be to the right of this point; e.g. with CARA
– the case shown – the slope of the indifference curve at c is the same as at z* and the tangency
must therefore be to the right of c, with CRRA the tangency will be even farther to the right. Thus
apart from highly unlikely cases a*(h)R1(1+h) rises with h; but whether a* itself rises or decreases
cannot be ascertained without further information on the utility function; all we can say is that the
lower the net return on the safe asset, the more likely it is that a* decreases, because the closer we
are to the case with zero net return on the safe asset.
[Fig. 9.8: two-state diagram with the 45° certainty line, the point s (whole of W in the safe asset, s1=s2=W·R0), the points x and x’ (whole of W in the risky asset, before and after the rise of h, with abscissae WR1 and WR1(1+h)), the initial optimal portfolio z*, the point c, and the abscissa value a*(0)R1.]
Many assets
In order to extend the analysis of portfolio choice to many assets 0, 1,...,i,...,n it is
convenient to focus on the gross returns ki, random variables except for the safe asset 0, which we
assume to have a certain gross return k0. (Again I essentially follow Varian 1992.) We assume again
a two-period framework: the assets are purchased in the first period and deliver their returns in the
second period. We assume a given wealth W to be spent on the portfolio. The prices of assets are
given; it is convenient now to indicate with αi the share of W spent on purchasing asset i. Wealth in
the second period is then a random variable w:
w = W · Σ_{i=0}^{n} αi·ki .
The shares αi must sum to 1; let us write this budget constraint as α0 + Σ_{i=1}^{n} αi = 1. Substituting it
into the expression for w we obtain:
w = α0·k0·W + W·Σ_{i=1}^{n} αi·ki = W·k0·(1 – Σ_{i=1}^{n} αi) + W·Σ_{i=1}^{n} αi·ki = W·k0 + W·Σ_{i=1}^{n} αi·(ki – k0).
In this way we have embodied the budget constraint into the objective function, so we obtain
an unconstrained maximization problem:
max over α1,…,αn of   E[ u( W·k0 + W·Σ_{i=1}^{n} αi·(ki – k0) ) ] .
The objective function can be indicated for brevity as E(u(w)). The first-order conditions are
(†)
∂E(·)/∂αi = E(u’(w)·(ki–k0)) = 0, i = 1,..., n.
The equality may not be obtainable without short sales, but these can be admitted, at least
within limits (obviously enormous short sales will not be accepted by the buyers because the risk of
default of the seller would become excessive).
These rather opaque first-order conditions become more informative if we re-write them as
E(u’(w)ki) = k0E(u’(w)) and we apply the covariance identity Cov(X,Y)=E(XY)–E(X)E(Y),
rewritten as E(XY) = Cov (X,Y) + E(X)E(Y), to the left-hand side of this equation to obtain
Cov(u’(w),ki) + E(ki)E(u’(w)) = k0E(u’(w)), which can be re-arranged to yield:
(***)   E(ki) = [k0·E(u’(w)) – Cov(u’(w), ki)] / E(u’(w)) = k0 + [–Cov(u’(w), ki)] / E(u’(w)) .
Expression (***) tells us that if a risky asset is held in a portfolio, it must yield the safe
return plus an addition, called again ‘risk premium’ (not to be confused with the ‘risk premium’ of
the insurance problem and of the Arrow-Pratt approximation, which will be called here risk
avoidance price), which depends on the covariance between the marginal utility of wealth and the
gross return of the asset. This risk premium is positive if the covariance is negative, i.e. if the asset
return is positively correlated with wealth (and hence negatively correlated with the decreasing
marginal utility of wealth) and therefore to insert it into the portfolio increases riskiness relative to
investing only in the safe asset; it is negative if the asset return is negatively correlated with wealth,
and to insert it into the portfolio decreases the variability of wealth and therefore decreases the
riskiness of the portfolio, making the investor willing to accept a lower rate of return than on the
safe asset in exchange for the risk reduction.
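Condition (***) can be illustrated numerically for a single risky asset (a sketch; the two states, the gross returns and the log utility, for which u’(w)=1/w, are all my own illustrative choices):

```python
import numpy as np

W, k0 = 100.0, 1.02                            # wealth and safe gross return
probs = np.array([0.5, 0.5])
k1 = np.array([1.30, 0.80])                    # risky gross return, E(k1) = 1.05 > k0
u_prime = lambda w: 1.0 / w                    # marginal utility for u = log

def foc(alpha):                                # first-order condition (†) for the risky asset
    w = W * (k0 + alpha * (k1 - k0))
    return probs @ (u_prime(w) * (k1 - k0))

grid = np.linspace(0.0, 1.0, 20001)
alpha_star = grid[np.abs([foc(a) for a in grid]).argmin()]

w = W * (k0 + alpha_star * (k1 - k0))
E_uprime = probs @ u_prime(w)
cov = probs @ (u_prime(w) * k1) - E_uprime * (probs @ k1)
print(alpha_star)                              # interior optimum, about 0.50
print(probs @ k1, k0 - cov / E_uprime)         # the two sides of (***) coincide at the optimum
```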
What if the risk premiums do not satisfy this condition? The basic idea is that in such a case
there will be excess demand or excess supply of the asset, and the price of the asset (which we are
taking as given) will change, until its supply becomes allocated to portfolios. The ultimate purpose
of the analysis, in other words, is to determine the prices of assets. Each ki is the random gross
return per unit of money spent on asset i, so it changes if the price of the asset changes. For example
consider a share with random future dividends: the dividends are per share, but in order to obtain
them one pays a price that depends on the price of the share. So one of the things that equation
(***) tells us is that in order for an asset to be accepted in a portfolio the price of the asset must
adjust until it falls in a range that makes it possible to satisfy that condition.
However, the condition itself may appear rather surprising, because no explicit mention of
the riskiness of the non-safe assets appears in equation (***). One might have expected some
measure of the riskiness of each asset to appear in the condition, e.g. the risk avoidance price
associated with the lottery represented by the asset’s returns. Intuition suggests that risk aversion
should imply that one will invest in a riskier asset only if the expected value of the asset’s returns is
sufficiently higher than for the less risky investments as to compensate for the greater risk. Is this
intuition proved mistaken by equation (***) ?
No, and we can understand what is going on by considering the case with only one risky
asset (asset 1) besides the safe one (asset zero). Then we are back to the problem solved by the first-order condition (**) for the case of a safe asset with zero net return; however, now we assume
R0>0 and we express the choice in terms of the share α of wealth going to the risky asset, so the
first-order condition is (†) applied to asset 1, which, remembering that k1-1=R1, can be written
E(u’(w)(R1-R0)) = 0.
Rewriting it as E(u’(w)·R1) = R0E(u’(w)) and applying the covariance identity in the same way as
before, we obtain
(‡)   E(R1) = [R0·E(u’(w)) – Cov(u’(w), R1)] / E(u’(w)) = R0 + [–Cov(u’(w), R1)] / E(u’(w)) = R0 + ‘percentage risk premium’.
The ‘percentage risk premium’ is positive, because R1 is necessarily correlated positively
with w, it is what makes w a random variable! The negative Cov(u’(w),R1) is the way the
dependence of the randomness (and hence riskiness) of w on the randomness of R1 reveals itself. So
the ‘percentage risk premium’ must be positive to compensate for the randomness of the returns of a
mixed portfolio that includes the risky asset besides the safe one. Note also that the greater the share
of W allocated to the risky asset, the greater the variability of the random variable u’(w), and
therefore the greater the absolute value of the covariance: if E(R1) is given, expression (‡) also
determines the share of the risky asset in the portfolio.
Now we can understand better what is going on also with many assets. A risky asset which,
when added to a portfolio, increases the variability of total returns and hence of w will be accepted
in the portfolio only if it has a higher expected rate of return than the portfolio without it. The
‘variability’ of w is a vague expression, and in more advanced texts the reader can find measures of
riskiness such as several types of ‘stochastic dominance’, but here we limit ourselves to using
variance as the measure of ‘variability’ and hence of riskiness[30]. Every time an asset’s returns are
correlated with the general movement of the ensemble of returns to the point that its inclusion in the
portfolio raises the variance of the total return, this increases the riskiness of the portfolio, and this
makes the inclusion of the asset in the portfolio convenient only if it raises the average portfolio
expected return.
Thus assume a safe asset (with net returns s if the whole of W were allocated to it) and two
risky assets A and B, and suppose initially that only asset A (with net returns xA if the whole of W
were allocated to it) is included in the portfolio together with the safe asset. This results in a certain
riskiness of the portfolio, which implies a proportional risk premium on the risky asset which we
indicate as PA=E(RA)–R0. Now the investor considers the possibility of including the second risky
asset in the portfolio. Assume only two states of the world, hence two possible returns for each
asset, and consider Fig. 9.9, where as usual the broken lines are constant-expected-value lines
whose slope indicates the probabilities of the two states. Suppose the second risky asset has net
returns xB that are on the line s-xA but further away from the certainty line. The optimal portfolio
remains the one yielding average returns z*, which means that A and B can co-exist in the portfolio.
Since B has a higher expected return than A, this must mean that the proportional risk premium PB
is greater than PA. The reason is that B’s returns have greater variance than A’s, so its inclusion in
the portfolio raises its riskiness; this shows up in the greater covariance of B’s returns with w. Now
suppose there is a third risky asset C with returns indicated by y, again on the s-xA line but on the
opposite side of the certainty line and therefore with an expected value of returns lower than for the
safe asset. Again its inclusion in the portfolio is possible and does not alter the optimal average
returns that remain at z*, which must mean that the proportional risk premium on C is negative; the
reason is that C is negatively correlated with A or B and therefore its inclusion in the portfolio
decreases the portfolio’s variance and hence riskiness.
Fig. 9.9 makes it evident that, in the two-states case, more than two assets can co-exist in a
portfolio only if their rates of return are aligned, like s, xA, xB and y. But with three states, a
minimum of three assets will be generally necessary in order to reach an optimal portfolio, and with
very numerous or infinite states it is possible that all available assets be indispensable in order to
reach maximum utility.
[30] There is no universally accepted definition of riskiness; the definition is implicitly supplied by how one measures it. One approach is to define the riskiness of a lottery by the risk avoidance price, or better (in order to be able to compare lotteries with different sizes of prizes) by the ratio of the risk avoidance price to the expected value of the lottery. Then which one of two lotteries is riskier depends on the utility function one uses.
[Fig. 9.9: two-state diagram with the 45° certainty line, constant-expected-value lines, the safe asset s, the risky assets xA and xB and the asset y (all on the same line through s), and the optimal average returns z*.]
More specific assumptions about the utility function of the financial investor, for example
that the investor only cares about the mean and the variance of returns (as assumed in the Capital
Asset Pricing Model, CAPM), make it possible to reach more specific results; the covariance between an asset’s return and the portfolio return remains an important influence on the asset’s risk premium because it reveals the extent to which including the asset in the portfolio affects the mean and the variance of the portfolio’s return[31].
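To make the role of this covariance concrete, here is a minimal simulation (illustrative numbers, not the CAPM derivation itself): the marginal effect of raising an asset's weight on the variance of the portfolio return is proportional to the covariance between that asset's return and the portfolio return.

import numpy as np

rng = np.random.default_rng(0)

# Simulated returns of three hypothetical assets (means and volatilities are assumptions).
n = 100_000
R = np.column_stack([
    0.04 + 0.02 * rng.standard_normal(n),
    0.08 + 0.10 * rng.standard_normal(n),
    0.06 + 0.06 * rng.standard_normal(n),
])
w = np.array([0.2, 0.5, 0.3])        # assumed portfolio weights
Rp = R @ w                           # portfolio return in each draw

# The gradient of the portfolio variance w' Sigma w with respect to the weights is
# 2 Sigma w, whose i-th component is twice Cov(R_i, Rp): the covariance with the
# portfolio measures the asset's marginal contribution to portfolio riskiness.
Sigma = np.cov(R, rowvar=False)
grad = 2 * Sigma @ w
covs = np.array([np.cov(R[:, i], Rp)[0, 1] for i in range(3)])
print(np.allclose(grad, 2 * covs))   # True

This is why, in mean-variance models, an asset's risk premium is governed by its covariance with the portfolio rather than by its own variance.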
CONSUMPTION AND SAVING WITH UNCERTAINTY.
No formalization is required for the little to be said here on these topics. On aggregate
consumption, the main effect of the presence of uncertainty as to future income is the following.
Risk aversion implies that if future income becomes more uncertain without a change in its mean, that is, if its distribution undergoes a mean-preserving spread, then the expected utility of future income decreases, and therefore in the choice between consumption today and consumption in the future the consumer will behave as if future income had decreased. According
to a majority of plausible formalizations of the intertemporal utility function the effect will be a
reduction of current consumption, that is, an increase of savings in the current period (in accordance
with intuition but, precisely because of this, without a great increase of our knowledge).
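A tiny sketch of the mean-preserving-spread argument (log utility and the income numbers are illustrative assumptions, not taken from the text): expected utility falls and the certainty equivalent of future income drops below its unchanged mean, so the consumer behaves as if future income had decreased.

import numpy as np

u = np.log                                 # a concave (risk-averse) utility, chosen for illustration

p = np.array([0.5, 0.5])                   # two equally likely states (assumed)
y_certain = np.array([100.0, 100.0])       # certain future income
y_spread = np.array([130.0, 70.0])         # mean-preserving spread: same mean 100

Eu_certain = p @ u(y_certain)
Eu_spread = p @ u(y_spread)
print(Eu_certain, Eu_spread)               # expected utility falls with the spread

# Certainty equivalent of the risky future income: the sure amount giving the same
# expected utility; it lies below the unchanged mean of 100.
print(np.exp(Eu_spread))                   # about 95.4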
If it is the return on saving that is uncertain, then, again because of risk aversion, an
increased uncertainty about this rate of return, i.e. a mean-preserving spread of its distribution, acts
like a decrease of its mean; the effect on saving depends on whether intertemporal preferences are
such that, in the absence of uncertainty, a lower rate of return on savings would induce a rise or a
decrease of savings: as we know from ch. 4, both cases are perfectly possible (neoclassical
economists generally assume the second case, but only because it makes the stability of equilibrium
on the market for loanable funds more likely, not because of any convincing empirical evidence).
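Both directions can be exhibited in a simple two-period formalization (a hedged sketch under assumed CRRA preferences, not a result stated in the text): with constant relative risk aversion gamma, a mean-preserving spread of the gross return on saving raises saving if gamma > 1, leaves it unchanged with log utility (gamma = 1), and lowers it if gamma < 1.

import numpy as np

def saving(gamma, beta, y, R, probs):
    """Optimal saving s in: max u(y - s) + beta * E[u(s * R)], with CRRA utility of parameter gamma.
    The first-order condition gives s / (y - s) = (beta * E[R**(1 - gamma)]) ** (1 / gamma)."""
    k = (beta * probs @ R ** (1.0 - gamma)) ** (1.0 / gamma)
    return y * k / (1.0 + k)

probs = np.array([0.5, 0.5])            # two equally likely states (assumed)
R_certain = np.array([1.05, 1.05])      # sure gross return
R_risky = np.array([1.25, 0.85])        # mean-preserving spread, same mean 1.05

for gamma in (0.5, 1.0, 3.0):
    s_c = saving(gamma, 0.96, 100.0, R_certain, probs)
    s_r = saving(gamma, 0.96, 100.0, R_risky, probs)
    print(f"gamma = {gamma}: saving {s_c:.2f} -> {s_r:.2f}")
# gamma < 1: saving falls with return risk; gamma = 1 (log): unchanged; gamma > 1: saving rises.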
[31] I leave to texts on finance the task of a more detailed introduction to portfolio choice. I warn that the capacity of some of the models presented in such texts to describe how stock markets really function can be doubted. For example, the CAPM implies that all investors should choose the same composition of the portfolio of risky assets, in blatant contradiction with reality. I suggest reading inside stories written by financial market operators to get a realistic picture of how those markets operate.
FIRM BEHAVIOUR UNDER UNCERTAINTY
Malinvaud: profits do not disappear, they are needed to compensate for risk
tbc
Firms face two types of uncertainty: technological uncertainty about how much output will
be produced from the utilization (or from the intended utilization, if supply of some resources is
uncertain) of given amounts of resources; and market uncertainty, about prices (of outputs, but
sometimes also of inputs if the production process cannot be stopped once started, and the prices of
the raw materials or labour to be utilized half-way are uncertain when the production process starts),
and/or about demand if the firm is not a perfectly competitive one that can sell as much as it wants
at the independently determined market price.
Let us first consider the influence of price uncertainty upon a price-taking firm. Here I
follow Gravelle-Rees closely. Assume a firm that produces a quantity q of output which costs a
certain total cost c(q) and will be sold at the uncertain price pr (where r stands for ‘random
variable’). Let us push aside the problem discussed in Ch. 8 of who decides the objective of the firm
if there are several owners with different expectations and different preferences (and risk aversion)
by assuming a single owner with expected utility function u(yr), where yr is the owner’s income
which is the sum of the profit of the firm prq–c(q) and of other certain sources of income M. Income
yr is a random variable because profit is a random variable. The owner chooses q to maximize
U = Eu(yr) = Eu(prq–c(q)+M).
The first-order condition, assuming q>0, is
Eu’(yr)dyr/dq = Eu’(prq–c(q)+M)[pr–c’(q)] = 0.
The second-order condition is
Eu”(prq–c(q)+M)[pr–c’(q)]² – Eu’(prq–c(q)+M)c”(q) < 0.
If there were no uncertainty the maximand would be simply profit, pq–c(q), the first-order
condition would be p–c’(q)=0, and, assuming this to be satisfied, the second-order condition would
be –c”(q)<0. Maximization of u(pq–c(q)+M) would yield the same conditions, as the reader can
check. Let us assume U-shaped average cost curves so these conditions can be satisfied. The
presence of uncertainty changes things, first, because the decision about how much to produce must
be taken before the price is known and therefore the firm may find that it did not produce the ex-post profit-maximizing quantity; second and more importantly, because of the presence of risk
aversion. If the firm owner is risk neutral, maximizing Eu(yr) is equivalent to maximizing the
expected value of yr, that is, with p* the mean price, to maximizing
E(prq–c(q)+M) = p*q–c(q)+M.
The first-order condition is p*–c’(q)=0: the risk-neutral firm behaves as if it faced a certain price equal to the expected value of the price. The similarity extends to the second-order condition: since with risk neutrality u(yr) is linear in yr, u”=0 because u’ is a constant, so in the second-order condition the first term disappears and the second term implies –c”<0, i.e. c”>0: marginal cost must be increasing at the profit-maximizing output q*(p*).
Risk aversion introduces some difference. Let us re-write the first-order condition utilizing
the covariance rule:
Eu’(yr)[pr–c’(q)] = Eu’(yr)·E[pr–c’] + Cov(u’(yr), pr–c’) = Eu’(yr)·(p*–c’) + Cov(u’(yr), pr) = 0.
The covariance is negative, because a higher pr raises yr and under risk aversion u”<0, so u’(yr) is smaller when pr is higher. Therefore Eu’(yr)·(p*–c’) must be positive; since u’>0 it is Eu’(yr)>0, so it must be p*–c’>0, i.e. c’(q) < p*.
This means that at the optimum of the risk-averse firm marginal cost must be below p*, the marginal cost that would be optimal for the risk-neutral firm. Since marginal cost is increasing at q*(p*), the output that would be chosen by the risk-neutral firm, the output of the risk-averse firm must be smaller. Thus we have shown that the risk-averse firm chooses a smaller output than the risk-neutral firm.
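The same conclusion can be checked numerically (a sketch only: the price distribution, the quadratic cost function, the CARA utility and the other income M are illustrative assumptions): maximizing expected utility over a grid of outputs gives a smaller output for the risk-averse owner, with marginal cost below p* at the optimum.

import numpy as np

probs = np.array([0.5, 0.5])
prices = np.array([14.0, 6.0])                 # mean price p* = 10
M = 50.0                                       # certain other income

def c(q):                                      # total cost, so c'(q) = 2 + 0.4 q
    return 2.0 * q + 0.2 * q ** 2

def optimal_q(u):
    """Maximize E[u(p q - c(q) + M)] over a fine grid of outputs."""
    q = np.linspace(0.0, 40.0, 40001)
    y = np.outer(prices, q) - c(q) + M         # income in each state, for each q
    EU = probs @ u(y)
    return q[np.argmax(EU)]

q_neutral = optimal_q(lambda y: y)                    # risk neutral: p* = c'(q), so q = 20
q_averse = optimal_q(lambda y: -np.exp(-0.05 * y))    # CARA risk aversion
print(q_neutral, q_averse)                            # q_averse is well below q_neutral
print("c'(q_averse) =", 2.0 + 0.4 * q_averse, "< p* = 10")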
This is one instance of the general rule that a concave utility of income induces the choice of
a smaller expected income than with risk neutrality: the decreasing marginal utility of income
causes less weight to be given to the possibility of incomes above the mean than to the possibility of
incomes below the mean; hence a precautionary attitude.
The same general rule will appear in investment decisions: the more uncertain the future
output price or, in imperfect-competition market forms, the more uncertain the future demand, the
more cautious generally will be the investment decision of a risk-averse firm, which will prefer to
make less profit than a risk-neutral firm by building a smaller productive capacity than under risk neutrality, in order to reduce the risk of incurring losses if not bankruptcy. Thus ceteris paribus
increases of uncertainty must be expected to reduce aggregate investment. A precise formalization,
which will depend on the specific assumptions, appears unable to add much to this general intuition,
so I skip it.