A Risk Measure For Income Streams Georg Ch. Pflug1 and Andrzej Ruszczyński2 Abstract A new measure of risk is introduced for a sequence of random incomes adapted to some filtration. This measure is formulated as the optimal net present value of a stream of adaptively planned commitments for consumption. The new measure is calculated by solving a stochastic dynamic linear optimization problem which, for finite filtrations, reduces to a deterministic linear program. We analyze properties of the new measure by exploiting the convexity and duality structure of the stochastic dynamic linear problem. The measure depends on the full distribution of the income process (not only on its marginal distributions) as well as on the filtration, which is interpreted as the available information about the future. The features of the new approach are illustrated by numerical examples. 1 Department of Statistics and Decision Support Systems, Universitaetsstrasse 5, University of Vienna, 1090 Wien-Vienna, Austria; e-mail: georg.pflug@univie.ac.at 2 Department of Management Science and Information Systems and RUTCOR, Rutgers University, 94 Rockefeller Rd, Piscataway, NJ 08854, U.S.A.; e-mail rusz@rutcor.rutgers.edu 1 1 Motivation Since the seminal work of Markowitz it is well understood that consequences of economic activity with uncertain success must be judged in two different and well distinguished dimensions. The mean refers to the average result among a set of possible scenarios, while the risk dimension describes the possible variation of the results under varying scenarios. In the Markowitz model the risk is measured by the variance of the outcome (cf. [8, 9]). In the mean–risk setting the decision maker is faced with a two-objective situation: he/she wants to maximize the mean return and to minimize the risk at the same time. As for all multi-objective situations, there is in general no uniquely defined best decision, which is optimal in both dimensions and one has to seek for compromises. The set of solutions which are Pareto-efficient in the sense of these two objectives is called the mean–risk efficient frontier. In some models for optimal decision making the two dimensions are often mixed by introducing a nondecreasing concave utility function. Risk aversion, i.e. the degree of taking the risk dimension into account, can be modeled by the negative curvature of the utility function. However, it is highly desirable to clearly separate the two dimensions and to make the compromising strategy as transparent as possible, and the efficient frontier approach provides such a transparency. In the first step, the efficient frontier is calculated for a given decision problem and the non-dominated decisions are identified. In the second step, the compromise decision may be chosen among the efficient candidates. There is a vast literature on one-period decision models using several notions of measuring risk (see, e.g., [1, 7, 13, 14, 19]). In the multiperiod situation, however, most proposals focus on the risk contained in the terminal wealth (see [3, 11, 12] and the references therein). The purpose of this paper is to propose a risk measure for multiperiod models which incorporates the risk contained in intermediate incomes. Suppose that I1 , . . . , IT is a stream of random incomes. A simple but inappropriate way of defining the multiperiod risk would be to look at the marginal variables separately and fabricate a combined risk measure as a combination of the univariate risk measures. The distinction can be made clear by advocating an example which goes back to Philippe Artzner. Suppose that a coin is thrown three times. In situation 1, a reward of 1 is paid if the coin shows more heads than tails. In situation 2, the same reward is paid if the last throw shows head. Do the two situations reflect the same risk for the decision maker? If the whole experiment is done in a few seconds, one is inclined to say ‘yes’. But suppose that the throw of the coin happens just once a year. Then in situation 1 the decision maker knows her income one year ahead, which is a clear advantage over situation 2. Thus situation 1 should turn out to be less risky than 2, although their income variables have identical marginals. We shall return to this example in section 7. To valuate the entire income stream and not just the terminal wealth appears to be appropriate in many models. For instance, pension funds promise a income streams to their clients. Since the rights emerging from a pension fund membership are not bequeathable, clients are not interested in the terminal wealth at some future moments of time, at which they may not be able to consume it. At least in Europe, pension funds are 2 only administrators and not owners of the funds. They are primarily interested in high management fees, which come from a large number of customers. Customers can only be attracted if a good income stream can be guaranteed. Thus it is in the own interest of a pension fund to keep an eye on the customer’s income stream process (see, e.g., [10, 17]). Besides that, the income stream risk must also be considered in other cases, where the primal investment is just made for the purpose of getting the income at later periods. Real options, or loans to companies are good examples here. The paper is organized as follows. After an introductory section about one-period measures, we introduce our concept of a multiperiod measure in section 3. Its properties are analyzed in section 4, and section 5 contains explicit linear programming models for the case of finitely many scenarios. In section 6 we consider mean–risk models for our measures and compare them to models based on the terminal wealth distribution. Illustrative examples are contained in section 7. All calculations can be done by standard linear programming packages. 2 The one-period case Let I be a random income variable defined on some probability space (Ω, F̃, P ). The risk contained in I is caused by the lack of information about its exact value. A variable, but predictable value of I is riskless. If a natural catastrophe, e.g. a flood, were completely predictable, there would be no risk and no company would insure against it. If a decision maker were clairvoyant, he/she would face no risk since he/she would see the future in a deterministic way and would be able to adapt to it. For us, normal humans, some but not all information about the future may be available. The amount of information available may be expressed in terms of some σ-algebra F ⊆ F̃. The extreme cases are the clairvoyant (F = F̃) and the totally uninformed (F = F0 = {Ω, ∅}). The ultimate goal of engaging in risky enterprises with uncertain income opportunities is consumption. Consumption, however, can only be realized after deciding about the amount one wants to commit for this purpose (to buy a house, a car etc.). Suppose that the decision maker decides to commit an amount a. In this case, he/she risks not achieving this decided target, since I may be less than a. However, he/she may insure against the shortfall event, i.e. the event that I < a. Insurance comes at the price of E(q[I − a]− ), for q > 1.3 The costs for insurance decrease the possible consumption. If, on the other hand, some surplus is left after consumption, this surplus is discounted by a factor d < 1, since saving does not provide the same satisfaction as the consumption committed for. The Expected Net Present Value (ENPV) of the consumption and savings is therefore E(a + d[I − a]+ − q[I − a]− ). A rational decision maker maximizes the ENPV with respect to the available information 3 We use the notation [x]+ = max(x, 0) and [x]− = max(−x, 0). 3 F; i.e., his/her utility functional is UF (I) = max{E(a + d[I − a]+ − q[I − a]− ) : a is F measurable}. (1) It is evident that F1 ⊆ F2 implies that UF1 (I) ≤ UF2 (I), i.e., more information provides more utility. Since I is F̃-measurable and a + d[v − a]+ − q[v − a]− = v − (1 − d)[v − a]+ − (q − 1)[v − a]− is concave in a, one sees that UF̃ (I) = E(I) and UF (I) ≤ E(I) for any other σ-subalgebra F of F̃. The risk R contained in the random variable I and the information F is defined as the difference between the maximal utility (the utility of the clairvoyant) and the actual utility. RF (I) = E(I) − UF (I). (2) The just defined utility and risk measures are closely related to the notion of the conditional-value-at risk (CV@R). Recall that the CV@R is defined as CV@Rβ (I) = max{a − 1 E([I − a]− ) : a ∈ R} β (see Rockefellar and Uryasev [18]). We have the following representation (see [15]): Z 1 β −1 G (p) dp CV@Rβ (I) = β 0 µ ¶ G(G−1 (β)) − β −1 = E(I|I ≤ G (β)) − G−1 (β), β where G(u) = P{I ≤ u} and G−1 (p) = inf{u : G(u) ≥ p}. Clearly, CV@Rβ (I) ≤ CV@R1 (I) = E(I). If I is a constant, then CV@Rβ (I) = E(I) = I. Lemma 1 Let I|F be the conditional distribution of I given the σ-algebra F. Then UF (I) = dE(I) + (1 − d)E[CV@Rβ (I|F)] and RF (I) = (1 − d)[E(I) − E[CV@Rβ (I|F)]], where β = (1 − d)/(q − d). 4 (3) Proof. Suppose first that the σ-algebra is trivial, i.e. F = F0 = {Ω, ∅}. Since the function appearing under the expectation in (1)satisfies Ua (v) = a + d[v − a]+ − q[v − a]− = a + d(v − a) + d[v − a]− − q[v − a]− = dv + (1 − d)a − (q − d)[v − a]− . (4) we can transform (1) as follows: n ³ ´ o q−d UF0 (I) = dE(I) + (1 − d) max E a + [I − a]− : a ∈ R 1−d = dE(I) + (1 − d)CV@Rβ (I)], with β = (1−d)/(q−d). Repeating now the same argument for the conditional distribution of I given F and taking the expectation afterwards, one gets the general formulas UF (I) = E[dE(I|F) + (1 − d)CV@Rβ (I|F)] = dE(I) + (1 − d)E[CV@Rβ (I|F)] and RF (I) = (1 − d)[E(I) − E[CV@Rβ (I|F)]]. 2 If I is F measurable, then conditionally on F, I is a constant and therefore by (3), UF (I) = E(I) and RF (I) = 0. Example. Suppose that I follows a lognormal distribution and that the available information is contained in a random variable J, which is correlated with I. To be more precise, assume that I = exp(X1 ), J = exp(X2 ), where µ ¶ µµ ¶ µ ¶¶ X1 µ1 σ12 σ1 σ2 ρ ∼N , X2 µ2 σ1 σ2 ρ σ22 The CV@Rβ (Y ) of a Lognormal(µ, σ 2 ) distributed Y is CV@Rβ (Y ) = 1 µ+σ2 /2 e Φ(Φ−1 (β) − σ), β as can be seen by calculation. Let FJ be the σ-algebra generated by J. Using the 2 conditional distribution of I given J and the formula E(I) = eµ1 +σ1 /2 one gets that p 2 2 UFJ (I) = deµ1 +σ1 /2 + (q − d)eµ1 +σ1 /2 Φ(Φ−1 (β) − σ1 1 − ρ2 ) and 2 2 RFJ (I) = (1 − d)eµ1 +σ1 /2 − (q − d)eµ1 +σ1 /2 Φ(Φ−1 (β) − σ1 p 1 − ρ2 ). If the correlation ρ = 1, then RFJ = 0 and if ρ = 0, then RFJ = RF0 . 2 Notice that v 7→ Ua (v) is a concave, nondecreasing utility function for every fixed a. Recall the following ordering relations for random variables. 5 Definition 1 Let I (1) and I (2) two random income variables. • We say that first order stochastic dominance (I (1) ≺F SD I (2) ) holds, if E[U (I (1) ] ≤ E[U (I (2) )] for all nondecreasing functions U for which these expected values are finite. • We say that second order stochastic dominance (I (1) ≺SSD I (2) ) holds, if E[U (I (1) ] ≤ E[U (I (2) )] for all nondecreasing and concave, integrable functions U for which these expectations are finite. • We say that concave dominance (I (1) ≺CC I (2) ) holds, if E[U (I (1) ] ≤ E[U (I (2) )] for all concave functions U for which these expectations are finite. Since all Ua are nondecreasing and concave by (4), it follows that UF0 (I) = max{E(Ua (I) : a ∈ R} is monotonic w.r.t. second order stochastic dominance ≺SSD and a fortiori with first order stochastic dominance ≺F SD and concave dominance ≺CC . By a similar argument, RF0 is antitonic w.r.t. ≺CC . More generally, if I (1) and I (2) are defined on the same probability space, and all the conditional distributions satisfy (I (1) |F) ≺SSD (I (2) |F), then UF (I (1) ) ≤ UF (I (2) ). Similarly, if (I (1) |F) ≺CC (I (2) |F), then RF (I (1) ) ≥ RF (I (2) ). It is necessary to require the ordering of all conditional distributions. Example 1 Let the probability space have three points, ω1 , ω2 , ω3 , each having probability 1/3. Let I (1) (ω1 ) = 1.01, I (1) (ω2 ) = 1.015, I (1) (ω3 ) = 1.03; I (2) (ω1 ) = 1.01501, I (2) (ω2 ) = 1.0301, I (2) (ω3 ) = 1.0101. Choose q = 1.2, d = 0.93 and F = {{ω1 , ω2 }, ω3 }. Then UF (I (1) ) = 1.0175 > UF (I (2) ) = 1.0174, but I (1) ≺SSD I (2) . Notice that UF is translation-equivariant, i.e. for all constant b UF (I + b) = UF (I) + b. (5) This follows directly from the definition. In contrast, RF (I) is translation-invariant, i.e. for all constant b RF (I + b) = RF (I). Since Uλa (λv) = λUa (v), UF and RF are (positively) homogeneous, i.e. UF (λI) = λUF (I) RF (λI) = λRF (I). 6 (6) U is concave and R is convex in the following sense: If I1 and I2 are two income variables (possibly dependent), then UF (pI1 + (1 − p)I2 ) ≥ pUF (I1 ) + (1 − p)UF (I2 ) (7) RF (pI1 + (1 − p)I2 ) ≤ pRF c(I1 ) + (1 − p)RF (I2 ). (8) and To prove (7) suppose that UF (I1 ) = E(Ua1 (I1 )) and UF (I2 ) = E(Ua2 (I2 )). Then, using (4), we obtain UF (pI1 + (1 − p)I2 ) = max E(Ua (pI1 + (1 − p)I2 ) a ≥ E(Upa1 +(1−p)a2 (pI1 + (1 − p)I2 ) ≥ EpUa1 (I1 ) + (1 − p)Ua2 (I2 )) = pUF (I1 ) + (1 − p)UF (I2 ). Inequality (8) is easily deduced from that. If we compound I1 and I2 with probability p, i.e., ( I1 with probability p, I= I2 with probability 1 − p, then E(Ua (I)) = pE(Ua (I1 )) + (1 − p)E(Ua (I2 )) and thus UF (I) ≤ pUF (I1 ) + (1 − p)UF (I2 ). Artzner, Delbaen, Eber and Heath [2] have introduced the notion of a coherent risk measure as a measure being translation-equivariant (they call it translation-invariant), positive homogeneos, convex in the sense of (8) and monotonic w.r.t. pointwise ordering (see also [5]). Thus −UF is a coherent risk measure in the sense of [2], but RF is not since it is translation invariant in the sense of (6). 3 Risk of multiperiod income streams Suppose now that I1 , I2 , . . . , IT is a stream of random incomes which arrive at times 1, 2, . . . , T . We denote by (Ω, F, P) the probability space on which these random variables are defined. Together with that, a filtration {Ft }, t = 1, . . . , T , is defined, so that It is Ft measurable for each t = 1, . . . , T . The σ-subfield Ft represents the information available at time t. We take the convention that F0 = {∅, Ω}. Analogously to the static case, let at be the amount to be consumed at time t. The decision about at must be made at time t − 1, so at must be Ft−1 -measurable. The consumption of one unit at time t gives a Net Present Value (NPV) of ct ≥ 0. The shortfall costs are qt ≥ 0. The expected shortfall costs are immediately subtracted from the consumption before period t (this can be interpreted as an insurance cost). Any surplus occuring in period t 7 increases the income of the next period. The final surplus is discounted by a factor d ≥ 0. We make the following assumptions about the sequences {ct }, {qt } and the constant d: ct ≤ qt , t = 1, . . . , T, ct+1 ≤ ct , t = 1, . . . , T − 1, d ≤ cT . (9) Let Kt be the (random) surplus carried from period t to period t + 1. We have K0 = 0 and Kt = [Kt−1 + It − at ]+ , t = 1, . . . , T. (10) The shortfall Mt at period t is given by Mt = [Kt−1 + It − at ]− . (11) Our objective is to maximize the expected consumption minus the expected shortfall costs. This can be written as the following optimization problem: U(I1 , I2 , . . . , IT ) = max E T hX i (ct at − qt Mt ) + dKT (12) t=1 s.t. at is Ft−1 -measurable for t = 1, . . . , T . (13) Similarly to the static case, we introduce the dynamic risk measure of the sequence {It } as R(I1 , . . . , IT ) = U(EI1 , . . . , EIT ) − U (I1 , . . . , IT ). (14) We shall prove in the next section that it is always non-negative, and that it possesses most of the properties of the risk measure in the static case. In order to analyze problem (12)–(13) we shall formalize it as a stochastic control problem. We denote by Xt the space of Ft -measurable random variables having a finite expected value: Xt = L1 (Ω, Ft , P). We also use the notation Et (·) for E(·|Ft ). Problem (12)–(13) can be now stated as follows. Find random variables at ∈ Xt−1 , Mt ∈ Xt , and Kt ∈ Xt , t = 1, . . . , T , that constitute the solution of the problem T hX i max E (ct at − qt Mt ) + dKT (15) t=1 s.t. Kt = Kt−1 + It − at + Mt , t = 1, . . . , T, Kt ≥ 0, Mt ≥ 0, t = 1, . . . , T, (16) (17) where K0 = 0 and the constraints (16)–(17) are understood in the ‘almost sure’ sense. We can view (15)–(17) as a linear programming problem in abstract spaces. Let us introduce Lagrange multipliers λt ∈ L∞ (Ω, Ft , P) associated with the constraints (16), 8 t = 1, . . . , T . The lagrangian takes on the form T X L(a, M, K, λ) = E (ct at − qt Mt ) + dEKT −E t=1 T X λt (Kt − Kt−1 − It + at − Mt ). (18) (19) t=1 The dual functional is defined as D(λ) = sup L(K, a, M, λ), (a,M,K)∈X0 where X0 = {(a, M, K) : at ∈ Xt−1 , Mt ∈ Xt , Mt ≥ 0, Kt ∈ Xt , Kt ≥ 0, t = 1, . . . , T }. (20) We have L(a, M, K, λ) = E +E =E +E T T X X (ct − λt )at + E (λt − qt )Mt + E (d − λT )KT t=1 T −1 X t=1 (λt+1 − λt )Kt + E t=1 T X T X λt It t=1 T X (ct − Et−1 λt )at + E (λt − qt )Mt + E (d − λT )KT t=1 T −1 X t=1 T X t=1 t=1 (Et λt+1 − λt )Kt + E λt It , where we have manipulated (by conditioning) the coefficients in front of at , Mt and Kt to obtain elements of the corresponding dual spaces L∞ (Ω, Ft−1 , P), L∞ (Ω, Ft , P), and L∞ (Ω, Ft , P). It follows that D(λ) < +∞ if and only if the following conditions are satisfied: Et−1 λt λt λT λt = ct , t = 1, . . . , T, ≤ qt , t = 1, . . . , T, ≥ d, ≥ Et λt+1 , t = 1, . . . , T − 1, (21) (22) (23) (24) and the dual problem is to find min E T X t=1 9 λt I t (25) subject to (21)–(24). It is worth noting that the multiplier process, {λt }, is a supermartingale. Kuhn–Tucker optimality conditions and duality relations hold for our model (15)–(17), similarly to the finite-dimensional case. Theorem 1 The processes ât , M̂t , and K̂t , t = 1, . . . , T , constitute an optimal solution of (15)–(17) if and only if there exists multipliers λ̂t ∈ L∞ (Ω, Ft , P), t = 1, . . . , T , such that conditions (21)–(24) are satisfied together with the complementary slackness conditions (understood in the ‘almost sure’ sense): M̂t (qt − λ̂t ) = 0, t = 1, . . . , T, K̂T (λ̂T − d) = 0, K̂t (λ̂t − Et λ̂t+1 ) = 0, (26) (27) t = 1, . . . , T − 1. (28) Proof. Consider the affine operator G = (G1 , . . . , GT ) involved in (16): Gt (a, M, K) = Kt − Kt−1 − It + at − Mt . t = 1, . . . , T. We treat it as an operator from the space on which (a, M, K) are defined (the product of the corresponding L1 spaces) to X = X1 × · · · × XT . Since the image of the set (20) under G contains a neighborhood of 0 in X , our result follows from [4, Thm. 4, §1.1]. 2 Theorem 2 Suppose that conditions (9) hold. Then for every sequence I1 , . . . , IT such that E |It | < +∞, t = 1, . . . , T , the optimal values of problems (15)–(17) and (21)–(25) are finite and equal. Proof. A feasible solution to the primal problem (15)–(17) is given by at = E(It ), with the other variables determined by (10)–(11). The objective value at this point provides a lower bound for the optimal value of the dual problem. The feasible set of the dual problem, given by (21)–(24), is convex, closed and bounded in L∞ (Ω, F1 , P) × · · · × L∞ (Ω, FT , P). Hence, it is weakly∗ compact (Alaoglu theorem, see [6, Thm. 6, p. 179]). Therefore the dual problem has an optimal solution, λ̂. Then every solution (a, M, K) of the conditions (26)–(28) which satisfies equation (16) is, by Theorem 1, an optimal solution of the primal problem. Such a solution exists, because we can determine K and M from (26)–(28), and then choose a (which is not constrained) to ensure (16). 2 It is clear that the optimal Lagrange multipliers λ̂t (ω) can be interpreted as the (random) costs of a unit of a credit at time t and scenario ω. With such costs neither is it profitable to borrow nor to lend at each time t. 4 Properties of the dynamic risk measure Directly from the definition, the functionals U and R are homogeneous. U is monotonic (2) (1) in the following sense: If two income processes {It } and {It } are defined on the same 10 (1) (2) probability space (Ω, (Ft ), P) with the same filtration Ft and if It ≤ It a.s. for all t, (1) (1) (2) (2) then U(I1 , . . . , IT ) ≤ U(I1 , . . . , IT ). More generally, if all conditional distributions (1) (2) (1) (1) (2) (2) satisfy It |Ft−1 ≺SSD It |Ft−1 for t = 1, . . . , T , then U(I1 , . . . , IT ) ≤ U (I1 , . . . , IT ). Finally, U is translation equivariant in the following sense: U(I1 + b1 , . . . , IT + bT ) = U(I1 , . . . , IT ) + c1 b1 + c2 b2 + . . . cT bT , where b1 , . . . , bT are constants. We shall also show in this section that U is concave, so it makes sense to call −U coherent in the sense of [2]. Let us start from the following observation. Lemma 2 Suppose that conditions (9) hold and that each It it Ft−1 -measurable and integrable, t = 1, . . . , T . Then U(I1 , . . . , It ) = T X ct E(It ). t=1 Proof. The solution at = It , Mt = 0, Kt = 0, t = 1, . . . , T, is feasible for the primal problem (15)–(17), while the solution λt = ct , t = 1, . . . , T, (29) is PTfeasible for the dual problem (21)–(25). Since both have the same objective values, t=1 ct E(It ), by virtue of Theorem 2 they are optimal for their problems. 2 As a conclusion from this result we obtain a basic property of our risk measure. Theorem 3 Suppose that conditions (9) hold. Then for every sequence I1 , . . . , IT such that E |It | < +∞, t = 1, . . . , T , the risk measure (14) is finite and non-negative. Proof. Under conditions (9) the deterministic solution (29) is feasible for (21)–(25). Since a feasible solution for a dual problem always provides an upper bound for the primal problem, for every sequence I1 , . . . , IT such that E |It | < +∞, t = 1, . . . , T , we have T X U(I) ≤ D(c) = ct E(It ) = U(E(I)), t=1 where the last equality follows from Lemma 2. 2 Theorem 4 Let Bt , t = 1, . . . , T , be σ-subalgebras such that Bt−1 ⊆ Bt ⊆ Ft for t = 1, . . . , T . Then for every sequence I, . . . , IT with E|It | < ∞ we have R(E(I1 |B1 ), . . . , E(IT |BT )) ≤ R(I1 , . . . , It ). 11 (30) Proof. By Theorem 2 both U(I1 , . . . , It ) and U(E(I1 |B1 ), . . . , E(IT |BT )) are finite. Let µ̂t , t = 1, . . . , T , be the optimal solution of the dual problem (21)–(25) with the income stream E(It |Bt ), t = 1, . . . , T . Then the multipliers µ̄t = E(µt |Bt ), t = 1, . . . , T , are also optimal solutions of this problem. Indeed, the feasibility follows from Et µ̄t = Et E(µt |Bt ) = Et µt = ct , t = 1, . . . , T, and the optimality is guaranteed by E T X µt E{It |Bt } = E t=1 T X µ̄t E{It |Bt }. t=1 The multipliers µ̄t are also feasible for (21)–(25) with the income stream It , t = 1, . . . , T . Therefore, T T X X U(I1 , . . . , It ) ≤ E µ̄t It = E µ̄t E(It |Bt ). t=1 t=1 Combining the last two relations and using (14) we obtain the required result. 2 A simple interpretation of Theorem 4 is that the additional information, represented by Bt , reduces risk. In particular, if each It becomes known at the preceding period, there is no risk at all, as we have shown it in Lemma 2. Also, combining two income streams cannot increase risk. Theorem 5 Let I = (I1 , . . . , IT ) and J = (J1 , . . . , JT ) be two streams of integrable incomes. Then for every p ∈ (0, 1) R(pI + (1 − p)J) ≤ pR(I) + (1 − p)R(J), that is, the functional R(·) is convex. Proof. The result follows from Theorem 2. Let us denote by Λ the set of multipliers defined by (21)–(24). We have h U(pI + (1 − p)J) = min pE λ∈Λ ≥ p min E λ∈Λ T X t=1 T X λt It + (1 − p)E T X i λ t Jt t=1 λt It + (1 − p) min E t=1 λ∈Λ T X λ t Jt t=1 = pU(I) + (1 − p)U(J). Since Lemma 2 implies that U(pEI + (1 − p)EJ) = pU(EI) + (1 − p)U(EJ), our result follows. 2 12 5 Finite filtrations Let us consider in more detail the case when the filtration F = (F1 , . . . , FT ) is finite. This filtration generates partitions of the probability space Ω, which may be represented by a rooted tree of height T . Each node of the tree at layer t represents an atom of the σ-algebra Ft . Subtrees represent subpartitions. Suppose that the nodes of this tree are numbered {0, 1, 2, . . . N }, with 0 being the root. Let N = {1, 2, . . . , N } be the node set (not including the root). We assume that there are N0 − 1 nonterminal nodes in N and that T = {N0 , . . . , N } is the set of terminal nodes (leaves). If n is a node in N , then n− denotes its predecessor and t(n) denotes its time stage (its distance from the root). The nodes of the tree are marked by the probabilities of the corresponding elements of the partitions. Evidently, such a tree represents the filtered probability space (Ω, (Ft )t=1,...,T , P ). An income stream I = {It }, which is adapted to the filtration Ft , assigns values In to each node n ∈ N . We call such a valuated tree an income stream tree. The commitment decisions are made at the nonterminal nodes (including the root), i.e. a is a vector of length N0 with components a0 , . . . , aN0 −1 . The calculation of the dynamic utility functional U turns out to be a linear program defined on income stream trees. It reads X X X max pn ct(n) an− − pn qt(n) Mn + pn dKn a,M,K n∈N n∈N n∈T s.t. Kn + an− − Mn = In , t(n) = 1, n ∈ N , Kn − Kn− + an− − Mn = In , t(n) > 1, n ∈ N , Mn ≥ 0, Kn ≥ 0 n ∈ N . (31) This linear program has N0 + 2N variables and N equality constraints. Its optimal value is U(I). The risk is defined as X R(I) = ct(n) pn In − U(I). n∈N Let (zn ) be the vector of dual variables of (31). We introduce the notation n+ for the set of all successors of the node n ∈ N \T . Setting zn = pn λn , we obtain the following 13 form of the dual problem: min λ X pn λn In n∈N 1 X pm λm , n ∈ N \T , pn m∈n+ 1 X cn = pm λm , n ∈ N \T , pn m∈n+ s.t. λn ≥ (32) λ n ≤ qn , n ∈ N , λn ≥ d, n ∈ T . It is the discrete counterpart of the general dual problem (21)–(25). As before, the dual process {λn } is a submartingale. 6 Mean–risk models To illustrate the way in which our dynamic risk measure can be employed, let us now consider the following situation. We have n random income streams Itj , t = 1, . . . , T , j = 1, . . . , n. Our objective is to create a portfolio, I(x) = n X xj I j j=1 P with xj ≥ 0, nj=1 xj ≤ C. With no loss of generality we may assume that C = 1. The portfolio income stream can be characterized by two measures: the mean represented by the expected net present value, µ(x) = T X ct E(It (x)) = t=1 n X xj j=1 T X ct E(Itj ), t=1 and our risk measure rd (x) = R(I1 (x), . . . , It (x)). The mean–risk optimization model can be formulated as follows min rd (x) s.t µ(x) = m, n X xj = 1, j=1 x ≥ 0. 14 (33) In the above problem the parameter m is varied over all values for which a solution exists. Using Lemma 2 we can write this problem in a more explicit form: T ³ ³X n hX ´ ´ i min E ct xj Itj − at + qt Mt − dKT t=1 j=1 n X xj Itj − at + Mt , s.t. Kt = Kt−1 + t = 1, . . . , T, j=1 n X j=1 n X xj T X (34) ct E(Itj ) = m, t=1 xj ≤ 1, j=1 xj ≥ 0, j = 1, . . . , n, Kt ≥ 0, Mt ≥ 0, t = 1, . . . , T. By varying m over possible values of the mean we can reconstruct the mean–risk efficient frontier of our model. Another possibility is to base the mean–risk model on the distribution of the net present value of I(x), given by N (x) = n X xj T X ct Itj . t=1 j=1 The mean is the same as before, while the risk can be measured by the static measure based on (1): rs (x) = E(N (x)) − max{E(a + d[N (x) − a]+ − q[N (x) − a]− )}. a The explicit linear programming formulation of model (33) (with the rik measure rd replaced by rs ) reads: min E n hX j=1 s.t. n X xj xj T X T X t=1 n X T X j=1 n X i − a + qM − dK t=1 j=1 xj ct Itj ct Itj − a + M − K = 0, (35) ct E(Itj ) = m, t=1 xj ≤ 1, j=1 xj ≥ 0, j = 1, . . . , n, K ≥ 0, M ≥ 0. The disadvantage of the latter approach is that it concentrates exclusively on the net present value, ignoring the temporal variations. 15 : »» 0» X» © * X X X z ©© 1 : »» HH j 0» X» XXz X 1 : »» » X» X © * X © z X @ @ R 0 ©© HH : »» HH j 0» X» XXX z 1 ¡ µ ¡ 0 © HH ¡ ¡ @ @ : »» 0» X» © * X X X z ©© 1 : »» HH j 0» X» XXX z 1 : »» » X» X © * X © X z @ @ 0 ©© R HH : »» HH j 0» X» XXX z 1 1 µ ¡ ¡ ¡ ¡ @ @ 0 0 0 0 0 0 © HH 0 0 0 0 1 0 Figure 1: Income stream trees for Example 2. Left: Tree 1; Right: Tree 2. 7 Examples Example 2 This example has already been mentioned in the Introduction. Suppose a fair coin is thrown three times. Consider two situations: Situation 1: The income is 1 at the final stage, if more heads than tails were counted. Situation 2: The income is 1 at the final stage, if the last throw shows heads. The corresponding income stream trees are shown in Figure 1, where an upmove means heads and a downmove represents tails. Evidently, the two cases leads to exactly the same marginal income distributions at each stage. On the other hand, Situation 1 is more predictable and should lead to a smaller risk. We have solved the linear program (31) with the specification c = [1, 0.95, (0.95)2 ], q = [1.2, 1.2 · 0.95, 1.2 · (0.95)2 ], d = (0.93)2 , and we have obtained the following results: U(I (1) ) = 0.4419, U(I (2) ) = 0.4325, U(E(I (1) )) = U(E(I (2) )) = 0.5 · (0.95)2 = 0.4512 and therefore R(I (1) ) = 0.0093 < R(I (2) ) = 0.0188. This analysis shows that Situation 2 is riskier than Situation 1, indeed. It is also interesting to look at the dual variables λ(1) and λ(2) given by (32). They generate dual submartingale processes which live on the same tree as the income processes. This is illustrated in Figure 2. 16 ¡ µ ¡ .950 : .8649 »» .9025 » X» © * X X z .9401 X ©© © HH ¡ ¡ @ @ @ @ R 1.05 ©© HH : .8649 »» HH j .9975 » X» XXX z .9401 * ©© .9025 : .8649 »» » X» XXX z .9401 ¡ µ ¡ ¡ ¡ @ @ @ @ R 1.05 : .8649 »» »» .9739 X © * X X X z 1.083 ©© © HH : .8649 »» HH j .9261 » X» XXX z .9401 * ©© .950 : .9120 »» HH j .9975 » X» XXX z 1.083 .9975 ©© HH : .8649 »» » X» XXX z 1.083 : .8649 »» HH j .9025 » X» XXX z .9401 Figure 2: The dual submartingale process for Example 2. Left: Tree 1; Right: Tree 2. 1 : »» » X» X © * X z X @ ©© @ R 0© HH : HH »»» j 0» XXX z X 1 ¡ @ : »»» 1» XXX © * X z ©© 1© HH ¡ µ : »» HH ¡ j 1» X» X X ¡ z X @ 0 © H H ¡ µ : »» HH ¡ j 0» X» X X ¡ X z 1 0 0 : »»» »X 1X © * X X z ©© ¡ @ : »» » X» X © * X z X @ ©© @ R 0© HH : HH »»» j 0» XXX z X @ 0 1 1 0 1 1 0 0 1 1 1 0 0 Figure 3: Income stream trees in Example 3. Left: Tree 3; Right: Tree 4. Example 3 We modify Example 2 in such a way that a positive income may also occur at stages 1 and 2, but the marginals remain identical. Consider the income trees shown in Figure 3. Assuming that all arc probabilities are 0.5 one gets the result U(I (3) ) = 1.3919, R(I (3) )= 0.0344 U(I (4) ) = 1.3775, R(I (4) )= 0.0487 Since the predictability occurs earlier in Tree 3, its risk is smaller, as expected. We may illustrate here that hiding some information leads to larger risk. Suppose that the outcome of throw 2 is not revealed. In this case Tree 3 changes to Tree 3a shown in Figure 4. The utility and risk for Tree 3a are U(I (3a) ) = 1.3825, R(I (3a) ) = 0.0438. 17 ¡ µ ¡ 1 1/2 ¡ ¡ @ @ @ R @ 1/2 0 1/4 * ©© © - 1© HH Hj 3/4 H 1 3/4 * ©© © - 0© HH Hj 1/4 H 1 0 0 Figure 4: Tree 3a. ¡ µ ¡ 1 © H ¡ ¡ @ @ @ @ R 0© H : »» 1» X» © * X XX © z © HH : »» H j 2» X» XXz X : »» 0» X» X © * X © X z © HH : »» H j 1» X» XXz X : 2 »» »» -1 X © * X XX © z -1 2 © © H HH ¡ µ : -3 »» ¡ H j 1» X» X X ¡ X z 0 ¡ @ : 2 »» »» @ 1X X © * X © X z 1 @ © R 1© @ H HH : 1 »» H j 0» X» XXX z 0 1 3 0 1 0 1 0 1 Figure 5: Income stream trees for Example 4. Left: Stream 1; Right: Stream 2. As expected, the risk of Tree 3a is larger than the risk of Tree 3. Example 4 Our next example concerns portfolio models for income streams. Let us consider two streams as shown on Figure 5. The third income stream is a shifted random walk: each upmove corresponds to the income of 3 and each downmove to the income of 0. The mean and the dynamic risk associated with these three income streams are as follows µ(I 1 ) = 2.13, R(I 1 ) = 0.034 µ(I 2 ) = 2.46, R(I 2 ) = 0.065 µ(I 3 ) = 4.27, R(I 3 ) = 0.101. For these three assets we have formulated and solved the two mean–risk portfolio problems: the model with the dynamic risk measure (34) and the model based on the net present value (35). 18 6 Discounted Cumulative Flow 5 4 3 2 1 0 0 1 2 3 Period Figure 6: Performance of the portfolio generated by the dynamic mean–risk model. We assumed the mean value m = 3. In the dynamic model we have obtained the following portfolio xd = (0.6, 0, 0.4), while the static model (based on the net present value) yields xs = (0, 0.69, 0.23). As expected, the static model uses inflow 2, which has a better mean and a lower risk in terms of the total cumulative flow, while the dynamic model uses inflow 1, which is more predictable. The corresponding cumulative flow trajectories for all eight scenarios are illustrated in Figures 7 and 7. The cumulative flow for income {It } is defined recursively by W (0) = 0, Wt = Wt−1 + ct It , t = 1, 2, 3. Our results confirm our expectations: the dynamic risk measure, when used in a mean–risk optimization model, prefers solutions with more predictable income streams. The images of the dynamic mean–risk frontier and the static frontier in the space correponding to ur dynamic measure is presented in Figure 8. We see that for low values of risk, there is a significant difference between the two approaches. 19 6 Discounted Cumulative Flow 5 4 3 2 1 0 0 1 2 3 Period Figure 7: Performance of the portfolio generated by the static mean–risk model. 4.5 4 3.5 3 2.5 2 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 0.11 Figure 8: The dynamic mean–risk frontier and the image of the static mean–risk frontier. 20 References [1] F. Andersson, H. Mausser, D. Rosen and S. Uryasev, Credit risk optimization with Conditional Value-at-Risk criterion, Mathematical Programming 89 (2001), pp. 273– 291. [2] P. Artzner, F. Delbaen, J.-M. Eber and D. Heath, Coherent measures of risk, Mathematical Finance 9 (1999), pp. 203–228. [3] D. Carino and W. T. Ziemba, Formulation of the Russell–Yasuda Kasai financial planning model, Operations Research, 46 (1998), pp. 433–449. [4] Ioffe A.D., Tihomirov V.D., Theory of Extremal Problems, North–Holland, Amsterdam, 1979. [5] H. Föllmer and A. Schied, Convex measures of risk and trading constraints, Finance and Stochastics (to appear). [6] Kantorovich L.V., Akilov G.P., Functional Analysis, Pergamon Press, Oxford, 1982. [7] H. Konno and H. Yamazaki, Mean–absolute deviation portfolio optimization model and its application to Tokyo stock market, Management Science 37 (1991), pp. 519– 531. [8] H. M. Markowitz, Portfolio selection, Journal of Finance 7 (1952), pp. 77–91. [9] H. M. Markowitz, Mean–Variance Analysis in Portfolio Choice and Capital Markets, Blackwell, Oxford, 1987. [10] J. M. Mulvey and A. E. Thorlacius, The Towers Perrin global capital market scenario generation system, in: Worldwide Asset and Liability Modeling, J.M. Mulvey and W.T. Ziemba (Eds.), Cambridge University Press 1998, pp. 286–314. [11] J.M. Mulvey, Introduction to financial optimization, Mathematical Programming 89 (2001), pp. 205–216. [12] J. M. Mulvey and W. T. Ziemba, Asset and liability allocation in a global environment, In: Finance, R.A. Jarrow, V. Maksimović, W.T. Ziemba (Eds.), North–Holland 1995, pp. 435–463. [13] W. Ogryczak and A. Ruszczyński, From stochastic dominance to mean–risk models: semideviations as risk measures, European Journal of Operational Research, 116 (1999), pp. 33–50. [14] W. Ogryczak and A. Ruszczyński, On consistency of stochastic dominance and mean– semideviation models, Mathematical Programming 89 (2001), pp. 217–232. 21 [15] W. Ogryczak and A. Ruszczyński, Dual stochastic dominance and related mean–risk models, Rutcor Research Report 10-2001, Rutgers University, Piscataway 2001 (to appear in SIAM Journal on Optimization). [16] G. Ch. Pflug, Some remarks on the Value-at-Risk and the Conditional Value-atRisk. In: Probabilistic Constrained Optimization - Methodology and Applications, S. Uryasev (Ed.), Kluwer Academic Publishers, ISBN 0-7923-6644-1 pp. 272–281, 2000. [17] G. Ch. Pflug and A. Swietanowski, Dynamic asset allocation under uncertainty for pension fund management, Control and Cybernetics, 28 (1999), pp. 755–777. [18] R. T. Rockafellar and S. Uryasev, Optimization of Conditional Value-at-Risk, Journal of Risk 2 (2000), pp. 21–41. [19] W. F. Sharpe, A linear programming approximation for the general portfolio analysis problem, Journal of Financial and Quantitative Analysis, 6 (1971), pp. 1263–1275. 22