Stochastic Programming For Business Applications
Alan Brown
abrown@labyrinth.net.au

Profit Objective
Business managers seek to maximize profit. Past profit is deterministic. Future profit is based on stochastic values.

Examples
Linear programming
- Resource Allocation (Brown, 1969)
- Tennis/Warfare (Barnett, 2004)
Quadratic programming
- Portfolio Selection (Markowitz, 1952)
Integer programming
- Open Pit Mining (Froyland, 2004)

Linear Programming
The LP formulation of the resource allocation problem:
\[
\begin{aligned}
\text{maximise}\quad & P = \sum_i b_i x_i - \sum_j c_j y_j \\
\text{subject to}\quad & y = A x && \text{(resource requirements)} \\
& \sum_j c_j y_j \le m && \text{(budget constraint)} \\
& 0 \le x_i \le 1, \quad y_j \ge 0
\end{aligned}
\]
where $x_i$ is a possible activity, $y_j$ is the required units of resource, $b_i$ is the present value of benefit from the $i$th activity, $c_j$ is the present value of cost per unit for the $j$th resource, $A$ is a matrix for the bill of materials, and $P$ is the planned profit.

Planning
In the LP problem, the revenue items are expected values. These values are deterministic, which corresponds to the past. Our aim is to plan for the future. The problem must be reformulated for this purpose.

Stochastic Programming
The SP formulation of the resource allocation problem, using risk adjusted values:
\[
\begin{aligned}
\text{maximise}\quad & a(P) = a\Big(\sum_i b_i x_i - \sum_j c_j y_j\Big) \\
\text{subject to}\quad & y = A x && \text{(resource requirements)} \\
& -a\Big(-\sum_j c_j y_j\Big) \le m && \text{(budget constraint)} \\
& 0 \le x_i \le 1, \quad y_j \ge 0
\end{aligned}
\]
where $x_i$ is a possible activity, $y_j$ is the required units of resource, $b_i$ is the present value of benefit from the $i$th activity, $c_j$ is the present value of cost per unit for the $j$th resource, $A$ is a matrix for the bill of materials, and $a(\cdot)$ is the risk adjusted value of the item.

In this example, the stochastic values are mainly confined to the coefficients of the objective function. However, a risk adjusted value of the total expenses, $-a(-\sum_j c_j y_j)$, is used in the budget constraint. The problem is no longer an LP if the risk adjustment function is non-linear.
The risk adjusted values of the separate components of revenue are additive if further conditions are imposed.

Exponential Utility
Assume the decision makers of the firm are risk averse, with utility function
\[ u(x) = R \left( 1 - \exp(-x/R) \right) \]
where $R$ = risk capital.
[Figure: plot of the utility function u(x) against x]
The maximum value of the exponential utility is $R$. "A bird in the hand is worth two in the bush".

If $Y$ is a random variable then its utility is
\[ u(Y) = R \left( 1 - E[\exp(-Y/R)] \right) \]
where $R$ is the risk capital of the firm, and $E[\cdot]$ is the expectation, or mean.

Additivity theorem
Theorem. When exponential utility is used, risk adjusted values are additive provided the variables are independent.

Moment Generating Function
The moment generating function of the random variable $Y$, with auxiliary parameter $t$, is
\[ M_Y(t) = E[\exp(Yt)] = 1 + m_1 t + m_2 \frac{t^2}{2!} + m_3 \frac{t^3}{3!} + m_4 \frac{t^4}{4!} + \dots \]
where $m_1, m_2, m_3, m_4, \dots$ are the moments of $Y$ about the origin.
The exponential utility of the random variable $Y$ is
\[ u(Y) = R \left( 1 - M_Y(-1/R) \right) \]
where the m.g.f. has auxiliary parameter $-1/R$. Exponential utilities are not additive.

Risk adjusted value
If $x$ is deterministic then its utility is
\[ u(x) = R \left( 1 - \exp(-x/R) \right) \]
This is a monotonic increasing function of $x$, so it has a unique inverse:
\[ u^{-1}(u(x)) = -R \log\left( 1 - u(x)/R \right) = x \]
The risk adjusted value of a random variable $Y$ is given by
\[ a(Y) = u^{-1}(u(Y)) = -R \log\left( 1 - u(Y)/R \right) = -R \log\left( E[\exp(-Y/R)] \right) \]

Risk adjusted value inequality
If $a(Y) = -R \log( E[\exp(-Y/R)] )$ then
\[ a(Y) \le E[Y] \]
Proof: Use Jensen's inequality for convex functions (refer Feller, Vol 2, p.151).

Cumulant Generating Function
The cumulant generating function of the random variable $Y$, with auxiliary parameter $t$, is
\[ K_Y(t) = \log(M_Y(t)) = \log\left( E[\exp(Yt)] \right) = k_1 t + k_2 \frac{t^2}{2!} + k_3 \frac{t^3}{3!} + k_4 \frac{t^4}{4!} + \dots \]
where $k_1, k_2, k_3, k_4, \dots$ are the cumulants of $Y$.
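The inequality a(Y) ≤ E[Y] can be checked numerically for a simple case. The following is a minimal sketch, assuming a discrete random variable with a finite list of outcomes; the function name `risk_adjusted_value`, the two-point gamble, and the value of R are illustrative assumptions, not taken from the slides.

```python
import math

def risk_adjusted_value(outcomes, probs, R):
    """a(Y) = -R * log(E[exp(-Y/R)]) for a discrete random variable Y."""
    expectation = sum(p * math.exp(-y / R) for y, p in zip(outcomes, probs))
    return -R * math.log(expectation)

# Hypothetical gamble: gain 2 or lose 1 with equal probability.
outcomes = [2.0, -1.0]
probs = [0.5, 0.5]
R = 5.0  # hypothetical risk capital

mean = sum(p * y for y, p in zip(outcomes, probs))
a = risk_adjusted_value(outcomes, probs, R)

# A deterministic amount is its own risk adjusted value.
a_certain = risk_adjusted_value([3.0], [1.0], R)

# With very large risk capital the adjustment becomes negligible.
a_large_R = risk_adjusted_value(outcomes, probs, 1.0e6)

print(f"E[Y] = {mean:.4f}, a(Y) = {a:.4f}")
```

For a risk averse firm the risk adjusted value falls below the mean, and the gap narrows as the risk capital grows relative to the size of the gamble.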
The risk adjusted value of a random variable $Y$ is given by
\[ a(Y) = -R \log\left( E[\exp(-Y/R)] \right) = -R\, K_Y(-1/R) \]

Independence
Definition. Random variables are independent if, and only if, their joint distribution function factorises into separate components:
\[ f(x, y) = g(x)\, h(y) \]
In practice, independence means that statistical calculations involving multiple integrals (summations for discrete variables) can be calculated as repeated single integrals (summations).

Lemma
If $X$ and $Y$ are independent random variables, then the moment generating function of their sum is the product of their moment generating functions.
Proof: If $Z = X + Y$ then
\[
\begin{aligned}
M_Z(t) &= E[\exp(Zt)] \\
&= E[\exp(Xt + Yt)] \\
&= E[\exp(Xt) \cdot \exp(Yt)] \\
&= E[\exp(Xt)] \cdot E[\exp(Yt)] && \text{using independence of } X \text{ and } Y \\
&= M_X(t) \cdot M_Y(t)
\end{aligned}
\]

Corollary to lemma
If $X$ and $Y$ are independent random variables, then the cumulant generating function of their sum is the sum of their cumulant generating functions.
Proof: Take logarithms in the statement of the Lemma to get
\[ K_Z(t) = K_X(t) + K_Y(t) \]

Proof of the theorem
If $Z = X + Y$ where $X$ and $Y$ are independent random variables then
\[
\begin{aligned}
a(Z) &= -R\, K_Z(-1/R) && \text{using exponential utility} \\
&= -R\, K_X(-1/R) - R\, K_Y(-1/R) && \text{using corollary} \\
&= a(X) + a(Y) && \text{using exponential utility}
\end{aligned}
\]

Risk adjusted profit
Process:
1. Determine the risk capital of the firm.
2. Determine the cumulants of the individual revenue items.
3. Check the risk adjusted value of each item.
4. Add the cumulants of the individual items to obtain the cumulants of the profit.
5. Adjust the second cumulant of the profit to allow for correlations between items.
6. Evaluate the risk adjusted profit by finding the inverse of its cumulant generating function.
resall69.xls

Risk capital
The balance sheet of the firm shows the shareholders' capital. This may be used as a default value of the risk capital when no further information is provided. Sometimes a suitable fraction of this amount is specified.
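The six-step process for the risk adjusted profit can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the contents of resall69.xls: the two revenue items, their cumulants, the correlation, and the risk capital are all hypothetical, and the risk adjusted value is evaluated from the power series of the cumulant generating function truncated at the cumulants supplied.

```python
import math

def risk_adjusted_value(cumulants, R):
    """Truncated series for a = -R * K(-1/R), given cumulants [k1, k2, ...]."""
    t = -1.0 / R
    cgf = sum(k * t ** n / math.factorial(n)
              for n, k in enumerate(cumulants, start=1))
    return -R * cgf

def add_cumulants(items):
    """Cumulants of a sum of independent items: add them order by order."""
    return [sum(ks) for ks in zip(*items)]

# Step 1: risk capital of the firm (hypothetical figure).
R = 100.0

# Step 2: cumulants [k1, k2, k3, k4] of each revenue item (hypothetical).
# Both items are treated as normal, so k3 = k4 = 0; the expense has a negative mean.
income = [30.0, 16.0, 0.0, 0.0]
expense = [-20.0, 9.0, 0.0, 0.0]

# Step 3: check the risk adjusted value of each item.
a_income = risk_adjusted_value(income, R)
a_expense = risk_adjusted_value(expense, R)

# Step 4: add the cumulants to obtain the cumulants of the profit.
profit = add_cumulants([income, expense])
a_independent = risk_adjusted_value(profit, R)  # 10 - 25/200 = 9.875

# Step 5: a correlation rho between the items adds 2*rho*sd1*sd2 to k2.
rho = 0.5  # hypothetical correlation
profit_corr = list(profit)
profit_corr[1] += 2.0 * rho * math.sqrt(income[1]) * math.sqrt(expense[1])

# Step 6: evaluate the risk adjusted profit.
a_profit = risk_adjusted_value(profit_corr, R)

print(f"independent: {a_independent:.3f}, with correlation: {a_profit:.3f}")
```

With independent items the risk adjusted values add exactly, as the theorem states; a positive correlation increases the second cumulant and so lowers the risk adjusted profit.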
Entering statistical data
Future income and expenditure can only be estimated. Nothing is certain, even for revenue items covered by contracts. The magnitude of the uncertainty may vary from item to item. We require a way to capture the shape of the statistical distribution. A common starting point is to estimate the mean and standard deviation of each item.

Let $m$ = mean of the random variable, and $\sigma$ = standard deviation.
If $m \ne 0$, put $c = \sigma / m$ = coefficient of variation. Then
\[
\begin{aligned}
k_1 &= m && \text{(mean)} \\
k_2 &= \sigma^2 = c^2 m^2 && \text{(variance)} \\
k_3 &= \gamma_1 c^3 m^3 && \text{where } \gamma_1 = \text{skewness} \\
k_4 &= \gamma_2 c^4 m^4 && \text{where } \gamma_2 = \text{kurtosis}
\end{aligned}
\]
Enter for each item: $m$, $c$, $\gamma_1$, $\gamma_2$ as required.
[Perhaps $\gamma_1$ and $\gamma_2$ can be set to 0 if $\sigma / R < 0.1$]

Standard distributions

distribution     coefficient of variation   skewness    kurtosis
deterministic    0                          0           0
normal           c                          0           0
Poisson          1/√n                       1/√n        1/n
gamma            1/√α                       2/√α        6/α
log-normal       c                          c(3+c^2)    c^2(16+15c^2+6c^4+c^6)

Special cases

distribution          coefficient of variation   skewness   kurtosis
deterministic         0                          0          0
normal                1                          0          0
Poisson               1/5                        1/5        1/25
exponential (gamma)   1                          2          6
log-normal            1                          4          38

resall69.xls

Entering statistical data
The coefficient of variation, skewness, and kurtosis are non-dimensional. They do not depend on the scale. They make it easier to communicate ideas about the shapes of distributions. The use of measures which do not depend on scale is well known in the art of modelling, e.g. flow tank and wind tunnel experiments.

Weaknesses
The cumulant generating functions of the revenue items are not fully represented by their first four cumulants. Neither is the cumulant generating function of the planned profit, and we are inverting a finite rather than an infinite sum. This truncation error can be avoided when all revenue items are statistically independent.

The non-linear nature of the risk adjusted profit may lead to fractional activities occurring in the solution of the mathematical program unless special precautions are taken.
There is an implicit assumption that the risk adjusted value of any fractional activity is meaningful. This assumption is not required for an IP.

If the random variables are dependent, the covariance cannot be ignored. Introducing covariance requires an additional matrix of correlation data. Truncation errors occur when the cumulant generating function for the profit is inverted.

Catastrophe risks
Sometimes cumulants of the distribution of a random variable may be infinite. Such distributions are associated with extreme events. If one cumulant is infinite for any individual revenue item, then so is the corresponding cumulant in the total, and the risk adjusted profit will not exist. Thus catastrophe risks are not covered by the risk capital.

Central limit theorem
All cumulants of the risk adjusted profit are scaled by powers of the risk capital R. In general, R is larger than any individual revenue item. In this case the cumulants of the risk adjusted values decrease rapidly as their order increases, and the distribution of the total risk adjusted profit may be close to Normal.

If the risk capital is extremely large, the coefficient of variation of the risk adjusted profit is close to zero, and the distribution of the risk adjusted profit is close to deterministic. In these circumstances the process of risk adjustment adds very little to our knowledge.

When the risk capital decreases to the same order of magnitude as an individual revenue item, the risk adjusted value of this item may become negative. This indicates that there is insufficient capital to cover this risk if adverse conditions occur in the future. Such an indication is very useful to the business planner.

The most interesting cases occur when the risk capital is greater than all the individual revenue items, but not by too much.
Then the process of risk adjustment takes into account the variety of statistical distributions of the various revenue items that may occur in practice, and allows in a sensible way for their interaction. In these cases, attention should be given to covariances that might exist, especially between the larger revenue items.

Order of preference
The order of preference obtained from the SP can be different to the order obtained from the LP.
resall69.xls

Activity   LP      SP
x1         1.000   0.756
x2         0.000   0.000
x3         0.000   0.000
x4         0.138   1.000
x5         0.000   0.000

An order of preference for the individual activities can be obtained by a gradual relaxation of the binding constraints.

Post processing an LP (or IP)
Given the planned profit, $P^*$, as the solution of an LP, its risk adjusted value is
\[ a(P^*) = -R \log\left( E[\exp(-P^*/R)] \right) \]
This can be calculated using the cumulants of the individual revenue items appearing in the solution. This process preserves the properties of the solution to the LP. It may not give the maximum risk adjusted value, but it will provide a lower bound. It may turn out that this lower bound is negative!

Post processing
Example from tennis/warfare: T. Barnett (2004)
If resources are available, plan to apply extra effort when
\[ E[\text{cost}] < g \cdot I(c, d) \cdot E[\text{reward}] \]
where
$g$ = gain in probability of winning the point with effort
$I(c, d)$ = importance of the point at score $(c, d)$
A conservative player uses risk adjusted values. His criterion changes to
\[ -a(-\text{cost}) < g \cdot I(c, d) \cdot a(\text{reward}) \]
This new criterion is satisfied less often.

Portfolio Selection
Problem A: Find the portfolio mix that maximises the return for a given risk.
Problem B: Find the portfolio mix that minimises the risk for a given return.
Quadratic Programming
The QP formulation of the investment problem is:
\[
\begin{aligned}
\text{maximise}\quad & m = \sum_i p_i \mu_i && \text{(mean return of mix)} \\
\text{subject to}\quad & v = \sum_i \sum_j p_i p_j \rho_{ij} \sigma_i \sigma_j && \text{(variance of mix as measure of risk)} \\
& \sum_i p_i = 1 && \text{(constraint on proportions)} \\
& p_i \ge 0 && \text{(no short selling)}
\end{aligned}
\]
where $p_i$ is the proportion of the $i$th component in the mix, $\mu_i$ is the mean return for the $i$th component, $\sigma_i$ is the standard deviation of return for the $i$th component, and $\rho_{ij}$ is the correlation of the returns for the $i$th and $j$th components.
H. M. Markowitz, Journal of Finance, March 1952.

Historical data
Annual forces of return 1983-2003

sector   mean    coeff of var   skew     kurt
S/E      13.3%   1.348          0.212    -0.491
I        10.5%   1.898          0.423    1.412
PT       12.8%   0.848          0.693    1.337
P        8.1%    1.370          -1.206   1.989
F/G      11.4%   0.635          -0.272   0.692
C        9.2%    0.455          0.546    -1.259

correl   S/E      I        PT       P        F/G      C
S/E      1.000    0.741    0.592    -0.120   0.359    0.334
I        0.741    1.000    0.215    -0.010   0.218    0.345
PT       0.592    0.215    1.000    -0.135   0.585    0.159
P        -0.120   -0.010   -0.135   1.000    -0.123   0.486
F/G      0.359    0.218    0.585    -0.123   1.000    0.555
C        0.334    0.345    0.159    0.486    0.555    1.000

markowitz.xls

Risk frontier
[Figure: efficient frontier, M = f(V), plotted as M against V]
The curve of the risk frontier is a piecewise parabola. Where does the investor sit?

Planning
In this QP problem, the returns are deterministic, which corresponds to the past. Our aim is to plan for the future, when the returns are stochastic. The QP problem must be reformulated for this purpose.

Quadratic Programming revised
The revised QP formulation of the investment problem is:
\[
\begin{aligned}
\text{maximise}\quad & a(m) = a\Big( \sum_i p_i \mu_i \Big) && \text{(mean return, adjusted for risk)} \\
\text{subject to}\quad & v = \sum_i \sum_j p_i p_j \rho_{ij} \sigma_i \sigma_j && \text{(variance of mix as measure of risk)} \\
& \sum_i p_i = 1 && \text{(constraint on proportions)} \\
& p_i \ge 0 && \text{(no short selling)}
\end{aligned}
\]
where $p_i$ is the proportion of the $i$th component in the mix, $\mu_i$ is the mean return for the $i$th component, $\sigma_i$ is the standard deviation of return for the $i$th component, $\rho_{ij}$ is the correlation of the returns for the $i$th and $j$th components, and $a(\cdot)$ is the risk adjusted value of a random variable.

Risk adjusted frontier
[Figure: risk adjusted frontier, a(m) = f(v) - g(v), plotted as a(M) against V]
The curve of the risk adjusted frontier has a unique maximum. The investor sits at this maximum.

CGF and risk adjusted value
The risk adjusted value of a random variable $Y$ is given by
\[
\begin{aligned}
a(Y) &= -R \log\left( E[\exp(-Y/R)] \right) \\
&= -R\, K_Y(-1/R) \\
&= k_1 - \frac{k_2}{2R} + \frac{k_3}{6R^2} - \frac{k_4}{24R^3} + \dots
\end{aligned}
\]
A useful approximation is
\[ a(Y) \approx k_1 - \frac{k_2}{2R} \]

Risk capital
The constraint $\sum_i p_i = 1$ can be re-scaled as $\sum_i p_i R = R$.
Thus, for this problem, the risk capital, $R$, of the investor is $R = 1$.

CGF and risk adjusted value
When $R = 1$, the risk adjusted value of the mean return is given by
\[ a(m) = k_1 - \frac{k_2}{2} + \frac{k_3}{6} - \frac{k_4}{24} + \dots \]
A useful approximation is
\[ a(m) \approx m - \frac{v}{2} \]

CGF of return on the mix
The CGF for the return on the portfolio mix can be calculated by adding covariance terms when calculating the second cumulant. This variance/covariance matrix is already specified in the problem. The higher cumulants vanish when the joint distribution of the individual returns is multivariate Normal. However, assuming that this joint distribution occurs can lead to poor decision making.

Central limit theorem
When the adjustment for covariance is made we are forced to use a truncated representation of the cumulant generating function. However, we have not made any adjustments to the third or fourth cumulants. The errors that arise are usually small in practice.

Solution to Investor's Problem

S/E   I    PT    P    F/G   C    m        v        a(m)
27%   0%   73%   0%   0%    0%   12.90%   0.0131   12.26%

Is this answer reasonable?

Blast from the past
In 1987, investors ignored the risks associated with higher returns in the Share sector. Some investors sought refuge in the Property sector, which showed low historical variance over the previous 15 years. The Property sector crashed in 1990.
The historical data for this sector now shows a high kurtosis. Do not rely only on past data. Forecast the future.
markowitz2.xls

Inspiration
WITH BHP BILLITON: Together with AMSI, Centre researchers plan to undertake a study of various mathematical aspects of Project Evaluation. Project Evaluation is in its infancy, but attempts to quantify risk and uncertainty at various stages in the lifetime of a major project, from inception to full commitment.
from www.complex.org.au

Changing reporting levels
Board <- Strategy
CEO
manager   manager <- Tactics

The Boodarie challenge