Ito's Lemma

X(t) is a stochastic process if its value evolves stochastically over time. Start with discrete-time processes, whose values can change only at discrete points in time. The change in X has some probability distribution:

\[ X(t+1) - X(t) \sim f[\mu, \sigma, \ldots, (\text{other parameters})] \]

E.g., if X(t) = ln(stock price at t) and \( f(\mu, \sigma) \) is the normal density with mean \( \mu \) and variance \( \sigma^2 \), then the future stock price is log-normally distributed.

Markov process: a stochastic process for which the probability density depends only on the current value X(t) and not on any earlier value X(t - s). Given the current price, there is no additional information in an examination of past prices.

Start with this process (random walk with drift):

\[ X(t+1) = X(t) + \mu + \sigma \varepsilon_t, \qquad \varepsilon_t \sim (0, 1) \]

[The notation \( (0, 1) \) means mean 0 and variance 1, i.e., a standardized random variable, but not necessarily normal.] \( \varepsilon_t \) is independent of \( \varepsilon_s \) for \( t \neq s \).

This specification implies that

\[ X(t+2) - X(t) = 2\mu + \sigma \varepsilon_t + \sigma \varepsilon_{t+1} \]

and that

\[ X(t+2) - X(t) \sim (2\mu,\ 2\sigma^2), \]

and in general:

\[ X(t+n) - X(t) \sim (n\mu,\ n\sigma^2) \]

Implications:
"drift" is proportional to n
variance is proportional to n; so standard deviation is proportional to \( \sqrt{n} \)

Digression: An interesting implication: what is the probability that the return on the stock index (e.g., the S&P 500) will be positive in a given period?

Using historical (post-war) data, \( \mu_{\text{annual}} \approx 12\% \) and \( \sigma_{\text{annual}} \approx 16\% \).

Convert to daily: n = 1/250. Therefore,

\[ \mu_{\text{daily}} = 12\%/250 = 0.048\%, \qquad \sigma_{\text{daily}} = 16\%/\sqrt{250} = 1.01\% \]

On an annual basis, the odds are about 23% that the market will fall if returns are normally distributed. For the market to fall, the return must be below the mean by 12%, which is 3/4 of a standard deviation: \( N(-12\%/16\%) = N(-0.75) = 0.23 \). The actual ratio, 1946 - 2001, is 13/56 = 0.23!!

However, on a daily basis, the mean return is insignificant compared to \( \sigma \): \( N(-0.048\%/1.01\%) = 0.48 \). So the odds are essentially 50/50 that the market will rise or fall on a given day. Essentially a random walk.

A related paradox: Suppose your portfolio is $1 million.
Put it in a safe (real) annuity, and you can earn about $35,000 (real) annually, clearly not enough to retire on (at least in Boston). Put it in the market index, and the daily standard deviation of your portfolio will be about .01 x $1,000,000 = $10,000, which dominates your salary. [E.g., even with a salary of $250,000, you earn only $1,000 a day, relatively insignificant compared to your daily volatility.] So why bother working? How do we resolve these two conclusions?

Given this analysis, we can write

\[ X(t+n) - X(t) = n\mu + \sqrt{n}\,\sigma\,\varepsilon \qquad [\text{where } \varepsilon \sim (0,1)] \]

Notice that variance = \( n\sigma^2 \).

Now let's move in the "opposite" direction. Hold the period fixed and divide it into n subperiods, each of length \( 1/n = \Delta t = h \).

[Timeline: the interval from t to t+1 is split at t, t+h, t+2h, t+3h, ..., t+nh = t+1; the innovation in each subperiod is \( \sigma\sqrt{h}\,\varepsilon_i \).]

Within any subperiod, the random portion of the increment to X (the "innovation") is \( \sigma\sqrt{h}\,\varepsilon_i \), with variance \( \sigma^2 h \). Therefore, the change in X across any subperiod is:

\[ X(t + ih) - X[t + (i-1)h] = \mu h + \sigma\sqrt{h}\,\varepsilon_i \]

Notice that the mean and variance of X(t+1) - X(t) are unaffected by the number of subperiods:

\[ X(t+1) - X(t) = \sum_{i=1}^{n} \left( \mu h + \sigma\sqrt{h}\,\varepsilon_i \right) = \mu + \sigma\sqrt{h} \sum_{i=1}^{n} \varepsilon_i \]

Mean = \( \mu \)
Variance = \( \sigma^2 h \sum_{i=1}^{n} \operatorname{var}(\varepsilon_i) = \sigma^2 h n = \sigma^2 \)

Notice also that the innovation to X(t+1) - X(t) is the sum of many random variables, each scaled by \( \sqrt{h} = 1/\sqrt{n} \). This suggests that even if we do not assume the \( \varepsilon_i \) are each normally distributed, it still may be the case that the scaled sum \( \sigma\sqrt{h} \sum \varepsilon_i = \sigma \sum \varepsilon_i / \sqrt{n} \) is normally distributed, \( N(0, \sigma^2) \). This would follow from a central limit theorem (CLT).

The CLT requires some "regularity" conditions on the \( \varepsilon_i \), however. These in effect require that the sum of the \( \varepsilon_i \) is not dominated by one (or a small number) of the individual \( \varepsilon_i \) as n gets large. Rules out:
(a) fat-tailed distributions
(b) "jumps" in stock prices - this means stock prices are continuous.

Economic content of these assumptions: over small \( \Delta t \), you are "very" sure that the possible innovation to X also is small.
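The CLT argument above can be illustrated numerically. The sketch below is not part of the original notes: it assumes uniform innovations on \( [-\sqrt{3}, \sqrt{3}] \) (mean 0, variance 1, clearly not normal), borrows \( \sigma = 0.16 \) and n = 250 from the digression, and uses the hypothetical helper names `scaled_sum` and `simulate`. It checks that the scaled sum has mean near 0, variance near \( \sigma^2 \), and roughly the one-sigma coverage of a normal distribution.

```python
import math
import random
import statistics


def scaled_sum(n, sigma, rng):
    """Return sigma * sqrt(h) * sum(eps_i) over n subperiods of length h = 1/n.

    Assumption for illustration: eps_i is uniform on [-sqrt(3), sqrt(3)],
    which has mean 0 and variance 1 but is not normal.
    """
    h = 1.0 / n
    a = math.sqrt(3.0)
    total = sum(rng.uniform(-a, a) for _ in range(n))
    return sigma * math.sqrt(h) * total


def simulate(n=250, sigma=0.16, trials=4000, seed=0):
    """Draw the one-period innovation many times and summarize it."""
    rng = random.Random(seed)
    draws = [scaled_sum(n, sigma, rng) for _ in range(trials)]
    mean = statistics.fmean(draws)
    var = statistics.pvariance(draws)
    # Share of draws within one sigma of zero; about 0.68 under normality.
    within = sum(abs(x) <= sigma for x in draws) / trials
    return mean, var, within


mean, var, within = simulate()
print(f"mean = {mean:.4f}, variance = {var:.4f}, sigma^2 = {0.16 ** 2:.4f}")
print(f"share within one sigma = {within:.3f}")
```

Changing n leaves the variance estimate near \( \sigma^2 \), which is the sense in which the mean and variance of X(t+1) - X(t) do not depend on the number of subperiods.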
Conclusion: if stock prices are continuous, the CLT implies that we can act as though the \( \varepsilon_i \) are normally distributed, even without assuming normality from the outset. Therefore, assuming continuity is equivalent to assuming that \( \varepsilon \sim N(0, 1) \).

Now take the limit as \( h \to dt \):

\[ X(t + ih) - X[t + (i-1)h] = \mu h + \sigma\sqrt{h}\,\varepsilon_i \quad \longrightarrow \quad dX = \mu\,dt + \sigma\,dz; \qquad dz \equiv \varepsilon\sqrt{dt} \]

dz is called a Wiener process or pure Brownian motion, with properties:
• independent increments
• normally distributed, \( dz \sim N(0, dt) \)

While I have written the mean and std. deviation as constants, \( \mu \) and \( \sigma \), we can be more general. An Ito process allows \( \mu \) and \( \sigma \) to depend on X and t:

\[ dX = \mu(X, t)\,dt + \sigma(X, t)\,dz \]

This process is still Markov: transition probabilities depend only on the current values of X and t.

Special cases:

(1) Arithmetic Brownian Motion
\( \mu \) and \( \sigma \) constant. Then increments to X are independent and identically distributed random variables, each normally distributed. The total increment is the sum of these iid normal increments. Hence, \( X_t - X_0 \sim N(\mu t, \sigma^2 t) \).

(2) Geometric Brownian Motion
\( \mu(X, t) = \mu X \) and \( \sigma(X, t) = \sigma X \)

\[ dX = \mu X\,dt + \sigma X\,dz \quad \Longrightarrow \quad \frac{dX}{X} = \mu\,dt + \sigma\,dz \]

Percentage increments to X have constant mean and variance. Implies \( X_t \) is log-normal [i.e., \( \ln(X_t/X_0) \) is normal].

(3) Ornstein-Uhlenbeck process
\( \mu(X, t) = k(X^* - X) \) and \( \sigma(X, t) = \sigma \)
This specification implies mean reversion to a "long-run" value of \( X^* \).

(4) Sometimes interest rates are modeled as

\[ dX = k(X^* - X)\,dt + \sigma X^{\gamma}\,dz; \qquad 0 < \gamma < 1 \]

which rules out negative values if \( X^* > 0 \).

Notice that while X is continuous, it is nowhere differentiable:

Continuity:

\[ \lim_{h \to 0} E\{[X(t+h) - X(t)]^2\} = \lim E[(\mu h + \sigma\sqrt{h}\,\varepsilon)^2] = \lim E[\mu^2 h^2 + \sigma^2 h \varepsilon^2 + 2\mu\sigma h^{3/2}\varepsilon] = \lim (\mu^2 h^2 + \sigma^2 h) = 0 \]

so X(t+h) converges to X(t) in mean square (MSE convergence).

Differentiability: If \( [X(t+h) - X(t)]/h = \Delta X/\Delta t \) converges in MSE to some value X'(t), this would be its time derivative. We require \( E\left[\left(\frac{X(t+h) - X(t)}{h}\right)^2\right] \) to converge for the derivative to exist.
But

\[ \lim_{h \to 0} E\left[\left(\frac{X(t+h) - X(t)}{h}\right)^2\right] = \lim_{h \to 0} \left(\mu^2 + \frac{\sigma^2}{h}\right) = \infty \]

Intuition: think about the stock market example:

\[ \frac{\Delta S}{\Delta t} = \frac{\mu h + \sigma\sqrt{h}\,\varepsilon}{h} = \mu + \frac{\sigma\varepsilon}{\sqrt{h}} \]

Example: The typical daily fluctuation is a very large rate of fluctuation: for example, a one standard deviation swing in return (e.g., 1% per day) is equivalent to a swing in the annualized % rate of change of 1% ÷ (1/250) = 250%.

Per hour: a one standard deviation swing is \( 1\%/\sqrt{7} \). [7 trading hours per day]
Annualize: \( (1\%/\sqrt{7}) \div [1/(250 \times 7)] = 250\% \times \sqrt{7} \)

As \( \Delta t \to 0 \), the rate of change \( \to \infty \).

If we were to graph the instant-by-instant stock price, it would be continuous, but infinitely vibrating with slope = \( \pm\infty \). Jagged (nowhere smooth or differentiable) but continuous.

[Figure: sample path of \( S_t \) against time, jagged but continuous.]

Notice: E(dS/dt) does not exist, but E(dS)/dt does exist!

What about functions of X? For example, if X = ln S or \( S = e^X \) and \( dX = \mu\,dt + \sigma\,dz \), then what can we say about dS? [Or, for a more interesting example, if we know the process for dS, what about rules for the evolution of the call option value, c(S, t)?]

** Usual calculus will lead you astray!! **

To see why, let's consider an example. Note that the usual rule says that

\[ S = e^X \quad \Longrightarrow \quad dS = e^X\,dX \]

To make this example easy, suppose \( \mu = 0 \). Therefore, \( dX = \sigma\,dz \), so \( dX \sim N(0, \sigma^2 dt) \), symmetric around 0.

[Figure: S = exp(X) plotted against X, with the points \( X_0 - \Delta X \), \( X_0 \), \( X_0 + \Delta X \) on the horizontal axis and \( \exp(X_0 - \Delta X) \), \( \exp(X_0) \), \( \exp(X_0 + \Delta X) \) on the vertical axis.]

Notice that although the disturbance around \( X_0 \) is symmetric (\( \pm\Delta X \)), the disturbance around \( e^{X_0} \) is not. Convexity of the function implies gain > loss, so there is upward drift in the stock price even though the rate of return is symmetric around zero.

\[ E(S) = E(e^X) > e^{E(X)} = e^{X_0} = S_0 \]

In general, \( E[f(X)] \neq f[E(X)] \). This is "Jensen's Inequality."

Therefore, while usual calculus states that \( dS = e^X dX \), we've just shown that

\[ E(dS) \neq e^X E(dX) = 0 \]

In this case E(dS) > 0. So usual calculus must be wrong for functions of stochastic processes.
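The Jensen's inequality argument in the zero-drift example above can be checked with a short simulation. This is a sketch rather than part of the original notes: the values \( \sigma = 0.2 \) and dt = 1, the seed, and the function name `mean_exp_increment` are illustrative assumptions. It also relies on the standard lognormal mean, \( E[e^{dX}] = e^{\sigma^2 dt/2} \) when \( dX \sim N(0, \sigma^2 dt) \), as the benchmark.

```python
import math
import random


def mean_exp_increment(sigma=0.2, dt=1.0, trials=200_000, seed=1):
    """Average of S/S0 = exp(dX) when dX ~ N(0, sigma^2 dt), i.e. zero drift in X."""
    rng = random.Random(seed)
    sd = sigma * math.sqrt(dt)
    return sum(math.exp(rng.gauss(0.0, sd)) for _ in range(trials)) / trials


sigma, dt = 0.2, 1.0  # illustrative values, not taken from the notes
simulated = mean_exp_increment(sigma, dt)
exact = math.exp(0.5 * sigma ** 2 * dt)  # lognormal mean of exp(dX)
naive = math.exp(0.0)                    # exp(E[dX]), the "usual calculus" answer

print(f"simulated E[S/S0]   = {simulated:.4f}")
print(f"exp(sigma^2 dt / 2) = {exact:.4f}")
print(f"exp(E[dX])          = {naive:.4f}")
```

The simulated mean sits above 1 even though dX is symmetric around zero, which is the upward drift that the ordinary chain rule misses.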
Must consider the effect of the curvature of the function.

[Figure: S plotted against X, with the tangent line at \( X_0 \) (slope = \( f'(X_0) \)); the gap between the curve and the tangent is the error from curvature.]

The obvious tool is a Taylor series expansion, which tells us how to correct for curvature:

\[ f(X_0 + \Delta X) = f(X_0) + f'(X_0)\,\Delta X + \tfrac{1}{2} f''(X_0)\,(\Delta X)^2 + \tfrac{1}{6} f'''(X_0)\,(\Delta X)^3 + \ldots \]

f' = initial slope
f'' adjusts for change in slope (convex: f'' > 0; concave: f'' < 0)
f''' adjusts for change in the rate of change in slope, etc.

How far do we need to go in the Taylor expansion? The answer is given by Ito's lemma.

In a loose sense, Ito's lemma tells us that when we take a Taylor series, we can use the following "multiplication rules:"

(i) \( (dt)^2 = 0 \) [In general, \( (dt)^n = 0 \) if n > 1]
(ii) \( (dt)^n\,dz = 0 \) for any n > 0. [Any (zero-mean) stochastic term of higher order than dt can be ignored.]
(iii) \( (dz)^2 = dt \)

The first two rules are familiar arguments from usual calculus - these terms are of a smaller order of magnitude than other terms. The last rule is surprising: the square of a normally distributed random variable is nonstochastic!

Intuition:

\[ dz = \varepsilon\sqrt{dt} \quad \Longrightarrow \quad (dz)^2 = \varepsilon^2\,dt \]

\[ E(dz^2) = 1 \cdot dt = dt \]

\[ \operatorname{Var}(dz^2) = E[\{dz^2 - E(dz^2)\}^2] = E[(\varepsilon^2 dt - dt)^2] = E(\varepsilon^4 dt^2 - 2\varepsilon^2 dt^2 + dt^2) = dt^2\,E(\varepsilon^4 - 2\varepsilon^2 + 1) \]

So \( \operatorname{var}(dz^2) \) is proportional to \( dt^2 \), which is negligible. (For a standard normal \( \varepsilon \), \( E(\varepsilon^4) = 3 \), so \( \operatorname{var}(dz^2) = 2\,dt^2 \).) So \( (dz)^2 \) is not literally non-stochastic, but as \( h \to dt \), the uncertainty approaches zero "very" rapidly. [Notice the importance of the normal distribution: need very thin tails for \( E(\varepsilon^4) \) to be finite.]

Notice that Ito's lemma implies that if

\[ dX = \mu(X,t)\,dt + \sigma(X,t)\,dz \]

then

\[ (dX)^2 = \mu^2(X,t)\,dt^2 + 2\mu(X,t)\,\sigma(X,t)\,dt\,dz + \sigma^2(X,t)\,dz^2 = 0 + 0 + \sigma^2(X,t)\,dt \]

which is nonstochastic. When you square an Ito process, the only term that survives is the variance.

Now consider a function of X, y = f(X, t). E.g., y may be the value of a call option and X the stock price on which the call is written.
Taylor series:

\[ f(X + dX, t + dt) = f(X, t) + f_X\,dX + f_t\,dt + \tfrac{1}{2} f_{XX}\,dX^2 + \tfrac{1}{2} f_{tt}\,dt^2 + f_{Xt}\,dX\,dt \]

The last two terms are zero by the multiplication rules, so

\[ df = f_X(X, t)\,dX + f_t(X, t)\,dt + \tfrac{1}{2} f_{XX}(X, t)\,dX^2 \]
\[ \phantom{df} = f_X[\mu(X, t)\,dt + \sigma(X, t)\,dz] + f_t\,dt + \tfrac{1}{2} f_{XX}\,\sigma^2(X, t)\,dt \]
\[ \phantom{df} = [f_X\,\mu(X, t) + f_t + \tfrac{1}{2} f_{XX}\,\sigma^2(X, t)]\,dt + f_X\,\sigma(X, t)\,dz \]

This is called Ito's lemma.

** Notice that dy and dX are (locally) perfectly correlated. Their stochastic terms are both known numbers times dz, the same random variable.

[Figure: y = f(X) plotted against X, with the tangent line at \( X_0 \) (slope = \( f'(X_0) \)).]

Since \( \Delta X \) is small for small \( \Delta t \), we may treat the relationship as locally linear, with correlation approximately 1.

What does this analysis imply about "portfolios" comprised of y and X? That they can be made riskless. More on this next week.

This graph also explains why the "extra" or second derivative term enters the drift for y: it is a Jensen's inequality term that depends on both convexity and volatility.

Example 1: Log-normality and geometric Brownian motion

X = ln(S) or \( S = e^X = f(X) \)

If \( dX = \mu\,dt + \sigma\,dz \) with \( \mu, \sigma \) constant [i.e., ln(S) has arithmetic Brownian motion, equivalent to assuming \( \ln(S_T) \) is normally distributed], then

\[ dS = f_X\,\mu\,dt + f_t\,dt + \tfrac{1}{2} f_{XX}\,\sigma^2\,dt + f_X\,\sigma\,dz \]
\[ \phantom{dS} = (e^X \mu + 0 + \tfrac{1}{2} e^X \sigma^2)\,dt + e^X \sigma\,dz \]
\[ \phantom{dS} = e^X (\mu + \tfrac{1}{2}\sigma^2)\,dt + e^X \sigma\,dz \]

where the \( \tfrac{1}{2}\sigma^2 \) term is the correction for Jensen's inequality.

But \( S = e^X \). Therefore,

\[ dS/S = (\mu + \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dz \]

Define \( \alpha = \mu + \tfrac{1}{2}\sigma^2 \). We will commonly write:

\[ dS/S = \alpha\,dt + \sigma\,dz \]

S has G.B.M., which we now know is equivalent to: \( S_T/S_0 \) is log-normally distributed.

Example 2: Futures price dynamics

Suppose the expected rate of return on the stock = \( \alpha \), but the stock pays a continuous proportional dividend at rate \( \delta \).
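Then

\[ dS = (\alpha - \delta)\,S\,dt + \sigma S\,dz \qquad [\text{Notice that } \sigma^2(S, t) = \sigma^2 S^2] \]

Spot-futures parity for futures maturing at time T implies that

\[ F(S_t, t) = S_t\,e^{(r - \delta)(T - t)} \]

Ito's lemma implies

\[ dF = F_S\,dS + F_t\,dt + \tfrac{1}{2} F_{SS}\,(dS)^2 = F_S(\alpha - \delta)S\,dt + F_t\,dt + \tfrac{1}{2} F_{SS}\,\sigma^2 S^2\,dt + F_S\,\sigma S\,dz \]

But \( F_S S = F \), \( F_t = -(r - \delta)F \) and \( F_{SS} = 0 \), so

\[ dF = (\alpha - \delta)F\,dt - (r - \delta)F\,dt + 0 + \sigma F\,dz \]
\[ dF/F = (\alpha - r)\,dt + \sigma\,dz \]

Issues:
1) why is E(dF/F) less than the expected rate of return on the stock by r dt?
2) when will the futures price be an unbiased predictor of \( E(S_T) \)?

Example 3: Valuation

Consider a security paying a continuous dividend stream, i.e., a dividend in period dt of X dt (equivalently, a rate of dividends X), forever. Suppose

\[ dX/X = \alpha\,dt + \sigma\,dz \]

What is the value of the security? Assume dz is "non-systematic" risk. Call f(X) the value of the security. [Notice that the value function does not depend on t. Why not?]

\[ df = f_X\,\alpha X\,dt + f_t\,dt + \tfrac{1}{2} f_{XX}\,X^2\sigma^2\,dt + f_X\,\sigma X\,dz \]

With \( f_t = 0 \), the equilibrium condition is that expected income equals a fair return:

\[ \text{expected income} = [(f_X\,\alpha X + \tfrac{1}{2} f_{XX}\,X^2\sigma^2) + X]\,dt = r f\,dt \]

The term in parentheses is the expected capital gain and X is the dividend.

Can we solve this differential equation for f( )? The boundary condition is f(0) = 0.

Hypothesize a solution of the form f(X) = kX. Then \( f_X X = f \) and \( f_{XX} = 0 \). Now plug into the differential equation:

\[ \alpha f + 0 + X = r f \quad \Longrightarrow \quad \alpha k X + X = r k X \quad \Longrightarrow \quad k = \frac{1}{r - \alpha} \]

\[ f(X) = \frac{X}{r - \alpha} = \text{growing perpetuity value (Gordon growth model), assuming } r > \alpha \]

Multivariate Ito's lemma

Need one more multiplication rule. If \( dz_1 \) and \( dz_2 \) are two Wiener processes, with correlation of \( \rho \), then

\[ dz_1\,dz_2 = \rho\,dt \]

Notice that \( dz_1\,dz_2 = \varepsilon_1\sqrt{dt}\;\varepsilon_2\sqrt{dt} = \varepsilon_1\varepsilon_2\,dt \), and \( E(\varepsilon_1\varepsilon_2) = \operatorname{cov}(\varepsilon_1, \varepsilon_2) = \rho_{12} = \rho \).

Now let \( y = f(X_1, X_2, t) \), where

\[ dX_1 = \mu_1(X_1, X_2, t)\,dt + \sigma_1(X_1, X_2, t)\,dz_1 \]
\[ dX_2 = \mu_2(X_1, X_2, t)\,dt + \sigma_2(X_1, X_2, t)\,dz_2 \]

Then

\[ dy = f_1\,dX_1 + f_2\,dX_2 + f_3\,dt + \tfrac{1}{2} f_{11}\,(dX_1)^2 + \tfrac{1}{2} f_{22}\,(dX_2)^2 + f_{12}\,dX_1\,dX_2 \]
\[ \phantom{dy} = [f_1\mu_1 + f_2\mu_2 + f_3 + \tfrac{1}{2} f_{11}\sigma_1^2 + \tfrac{1}{2} f_{22}\sigma_2^2 + f_{12}\,\rho\,\sigma_1\sigma_2]\,dt + f_1\sigma_1\,dz_1 + f_2\sigma_2\,dz_2 \]

where \( \mu_i \) and \( \sigma_i \) are evaluated at \( (X_1, X_2, t) \). The generalization to n sources of risk is obvious. This has an interesting relation to the APT.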
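The multiplication rules \( (dz)^2 = dt \) and \( dz_1\,dz_2 = \rho\,dt \) can be checked numerically: summed over many small steps spanning a horizon T, the squared increments should add up to about T and the cross products to about \( \rho T \). The sketch below is illustrative only; the values T = 1, \( \rho = 0.5 \), the step count, and the function name `product_sums` are assumptions, not part of the original notes.

```python
import math
import random


def product_sums(T=1.0, n=100_000, rho=0.5, seed=2):
    """Sum (dz1)^2 and dz1*dz2 over n steps of length dt = T/n."""
    rng = random.Random(seed)
    dt = T / n
    root_dt = math.sqrt(dt)
    sum_sq = 0.0
    sum_cross = 0.0
    for _ in range(n):
        e1 = rng.gauss(0.0, 1.0)
        # Build e2 with correlation rho to e1 from a second, independent draw.
        e2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        dz1 = e1 * root_dt
        dz2 = e2 * root_dt
        sum_sq += dz1 * dz1
        sum_cross += dz1 * dz2
    return sum_sq, sum_cross


sum_sq, sum_cross = product_sums()
print(f"sum of (dz1)^2  = {sum_sq:.4f}  (rule says T = 1.0)")
print(f"sum of dz1*dz2  = {sum_cross:.4f}  (rule says rho*T = 0.5)")
```

Each individual \( (dz)^2 \) is still random, but its variance is of order \( dt^2 \), so the sums settle very close to their expected values as the step size shrinks.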
In continuous time, if \( X_1 \) and \( X_2 \) are continuous, then (local) uncertainty in y is linear and additive in the sources of uncertainty, even if the original functional form is not additive. Ito's lemma provides a "factor" loading equation.

Example: suppose the production function is \( Q = L^a K^b \) and L and K both follow GBM, with \( \rho \) the correlation between \( dz_L \) and \( dz_K \):

\[ dL/L = \alpha_L\,dt + \sigma_L\,dz_L \quad \Longrightarrow \quad dL = \alpha_L L\,dt + \sigma_L L\,dz_L \]
\[ dK/K = \alpha_K\,dt + \sigma_K\,dz_K \quad \Longrightarrow \quad dK = \alpha_K K\,dt + \sigma_K K\,dz_K \]

Notice that \( Q_L = \partial Q/\partial L = a L^{a-1} K^b \) and \( Q_L L = a L^a K^b = aQ \). Similarly:

\[ Q_K K = bQ, \qquad Q_{KK} K^2 = b(b-1)Q, \qquad Q_{LL} L^2 = a(a-1)Q, \qquad Q_{LK} L K = abQ \]

Then

\[ dQ = Q_L\,dL + Q_K\,dK + \tfrac{1}{2} Q_{LL}\,(dL)^2 + \tfrac{1}{2} Q_{KK}\,(dK)^2 + Q_{KL}\,dK\,dL \]
\[ \phantom{dQ} = Q_L(\alpha_L L\,dt + \sigma_L L\,dz_L) + Q_K(\alpha_K K\,dt + \sigma_K K\,dz_K) + \tfrac{1}{2} Q_{KK}(\sigma_K^2 K^2\,dt) + \tfrac{1}{2} Q_{LL}(\sigma_L^2 L^2\,dt) + Q_{KL}(\rho\,\sigma_K \sigma_L K L\,dt) \]
\[ \phantom{dQ} = dt\,\{aQ\alpha_L + bQ\alpha_K + \tfrac{1}{2} Q\,a(a-1)\sigma_L^2 + \tfrac{1}{2} Q\,b(b-1)\sigma_K^2 + Q\,ab\,\rho\,\sigma_L\sigma_K\} + aQ\sigma_L\,dz_L + bQ\sigma_K\,dz_K \]

\[ dQ/Q = [a\alpha_L + b\alpha_K + \tfrac{1}{2} a(a-1)\sigma_L^2 + \tfrac{1}{2} b(b-1)\sigma_K^2 + ab\,\rho\,\sigma_L\sigma_K]\,dt + a\sigma_L\,dz_L + b\sigma_K\,dz_K \]

So the elasticities a, b are also the "factor loadings."

Notice that the constant elasticity specification, \( f(L, K) = L^a K^b \), preserves Geometric Brownian Motion.

In a single-variable example, e.g., \( Q = L^a \), we would get

\[ dQ/Q = (a\alpha_L + \tfrac{1}{2} a(a-1)\sigma_L^2)\,dt + a\sigma_L\,dz_L \]

and the second derivative term in the drift is positive if a > 1, i.e., if the function is convex, and negative if 0 < a < 1, i.e., if the function is concave.

Extreme market timing would shift fully between the market index and Treasury bills.

[Figure: performance of a perfectly successful timer as a function of market performance; portfolio return on the vertical axis, market index return on the horizontal axis, with \( r_f \) marked on both axes.]

Notice: There is no meaning to "the beta" of the fund. Beta = 0 or 1, depending on the market forecast.

Can you guess the results of this strategy, if executed with perfect success?

Strategy                   1926    1999       Compound growth (annual)
Bills only                 $1      $12.6      3.5%
Market index only          $1      $973       9.9%
Perfect timer (annual)     $1      $41,022    15.7%
Perfect timer (monthly)    $1      $5.74B     36.0%
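The compound growth rates in the market timing table can be reproduced from the 1926 and 1999 values. The sketch below assumes a 73-year horizon (1926 to 1999), which is an inference rather than something the notes state; the helper name `annual_growth` and the dictionary keys are hypothetical. The terminal values are the rounded figures from the table.

```python
def annual_growth(terminal_value, years, initial_value=1.0):
    """Compound annual growth rate that turns initial_value into terminal_value."""
    return (terminal_value / initial_value) ** (1.0 / years) - 1.0


# Assumption: the horizon is 1999 - 1926 = 73 years.
YEARS = 1999 - 1926

# Value in 1999 of $1 invested in 1926, as reported in the table.
terminal_values = {
    "Bills only": 12.6,
    "Market index only": 973.0,
    "Perfect timer (annual)": 41_022.0,
    "Perfect timer (monthly)": 5.74e9,
}

rates = {name: annual_growth(value, YEARS) for name, value in terminal_values.items()}

for name, rate in rates.items():
    print(f"{name:<26} {100 * rate:5.1f}% per year")
```

Under the 73-year assumption the computed rates line up with the percentages in the table, up to rounding of the reported terminal values.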