Ito's Lemma - Sorin Solomon

Ito’s Lemma
Ito's Lemma, page 1
X(t) is a stochastic process if its value evolves stochastically over time. Start with
discrete-time processes, whose values can change only at discrete points in time. The change
in X has some probability distribution:
\[ X(t+1) - X(t) \sim f[\mu, \sigma, \ldots, \text{(other parameters)}] \]
e.g., if X(t) = ln(stock price at t) and \( f(\mu, \sigma) \) = normal density \( (\mu, \sigma^2) \),
then the future stock price is log-normally distributed.
Markov process: a stochastic process whose probability density depends only on the current
value X(t) and not on any earlier value X(t – s). Given the current price, examining past
prices provides no additional information.
Start with this process (random walk with drift):
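\[ X(t+1) = X(t) + \mu + \sigma \varepsilon_t \]
where
\( \varepsilon_t \sim (0, 1) \) [the notation \( (\mu, \sigma^2) \) here means \( \mu = 0 \), \( \sigma^2 = 1 \), i.e., a standardized
random variable, but not necessarily normal]
\( \varepsilon_t \) is independent of \( \varepsilon_s \) for \( t \ne s \).
This specification implies that
\[ X(t+2) - X(t) = 2\mu + \sigma\varepsilon_t + \sigma\varepsilon_{t+1} \]
and that:
\[ X(t+2) - X(t) \sim (2\mu, 2\sigma^2), \]
and in general:
\[ X(t+n) - X(t) \sim (n\mu, n\sigma^2) \]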
Implications:
“drift” is proportional to n
variance is proportional to n; so standard deviation is proportional to \( \sqrt{n} \)
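The scaling of drift and variance with n can be checked by simulation. The sketch below is only an illustration: the uniform shock distribution, the parameter values, and the helper name `simulate_increment` are assumptions, chosen so that the shocks are standardized but not normal.

```python
import random
import statistics

def simulate_increment(n, mu, sigma, rng):
    """Return X(t+n) - X(t) for a random walk with drift.

    Shocks are uniform on [-sqrt(3), sqrt(3)]: mean 0, variance 1,
    but deliberately not normal (an assumption made for illustration).
    """
    half_width = 3 ** 0.5
    return sum(mu + sigma * rng.uniform(-half_width, half_width) for _ in range(n))

rng = random.Random(0)
mu, sigma, n = 0.5, 2.0, 16          # hypothetical values
draws = [simulate_increment(n, mu, sigma, rng) for _ in range(20000)]
sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)
print(sample_mean, n * mu)           # drift scales with n
print(sample_var, n * sigma ** 2)    # variance scales with n
```

The sample moments should land close to n times the one-period mean and variance, so the standard deviation grows only with the square root of n.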
Digression:
Interesting implication: what is the probability that the return on the stock index (e.g., the
S&P 500) will be positive in a given period?
Using historical (post-war) data,
annual \( \mu \approx 12\% \)
annual \( \sigma \approx 16\% \)
Convert to daily: n = 1/250. Therefore,
\( \mu_{daily} = 12\%/250 = 0.048\% \)
\( \sigma_{daily} = 16\%/\sqrt{250} = 1.01\% \)
On an annual basis, the odds are about 23% that the market will fall if returns are normally
distributed. For the market to fall, the return must be below the mean by 12%, which is ¾ of a
standard deviation: \( N(-12\%/16\%) = N(-0.75) = 0.23 \).
Actual ratio, 1946 – 2001, is 13/56 = 0.23!!
However, on a daily basis, the mean return is insignificant compared to \( \sigma \):
\( N(-0.048\%/1.01\%) = 0.48 \).
So the odds are essentially 50/50 that the market will rise or fall. Essentially a random walk.
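The two probabilities in this digression can be reproduced with the standard normal CDF. This is a minimal sketch using the 12% and 16% figures quoted in the text; the function name `prob_negative` is an illustrative choice.

```python
from statistics import NormalDist

mu_annual, sigma_annual = 0.12, 0.16   # figures quoted in the text
days = 250

mu_daily = mu_annual / days                # mean scales with time
sigma_daily = sigma_annual / days ** 0.5   # s.d. scales with sqrt(time)

def prob_negative(mu, sigma):
    """P(return < 0) when the return is N(mu, sigma^2)."""
    return NormalDist().cdf(-mu / sigma)

p_annual = prob_negative(mu_annual, sigma_annual)
p_daily = prob_negative(mu_daily, sigma_daily)
print(round(p_annual, 2), round(p_daily, 2))   # 0.23 0.48
```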
A related paradox:
Suppose your portfolio is $1 million. Put it in a safe (real) annuity, and you can earn about
$35,000 (real) annually, clearly not enough to retire on (at least in Boston).
Put it in the market index, and the daily standard deviation of your portfolio will be about 0.01 ×
$1,000,000 = $10,000, which dominates your salary. [E.g., even with a salary of $250,000, you
earn only $1,000 a day, relatively insignificant compared to your daily volatility.] So why
bother working?
How do we resolve these two conclusions?
Given this analysis, we can write
\[ X(t+n) - X(t) = n\mu + \sigma\sqrt{n}\,\varepsilon \quad [\text{where } \varepsilon \sim (0,1)] \]
Notice that variance = \( n\sigma^2 \)
Now let’s move in the “opposite” direction. Hold the period fixed and divide it into n subperiods,
each of length \( 1/n = \Delta t = h \).
[Timeline: the period from t to t+1 is split into subperiods at t, t+h, t+2h, t+3h, ..., t+nh = t+1;
the innovation in each subperiod is \( \sigma\sqrt{h}\,\varepsilon \).]
Within any period, the random portion of the increment to X (the “innovation”) is \( \sigma\sqrt{h}\,\varepsilon_i \),
with variance \( \sigma^2 h \). Therefore, the change in X across any subperiod is:
\[ X(t + ih) - X[t + (i-1)h] = \mu h + \sigma\sqrt{h}\,\varepsilon_i \]
Notice that the mean and variance of X(t+1) – X(t) are unaffected by the number of subperiods:
\[ X(t+1) - X(t) = \sum_{i=1}^{n} (\mu h + \sigma\sqrt{h}\,\varepsilon_i) = \mu + \sigma\sqrt{h} \sum_{i=1}^{n} \varepsilon_i \]
Mean = \( \mu \)
Variance = \( \sigma^2 h \times \operatorname{var}(\sum_i \varepsilon_i) = \sigma^2 h \times n = \sigma^2 \)
Notice also that the innovation to X(t+1) – X(t) is the sum of many random variables, each
scaled by \( \sqrt{h} = 1/\sqrt{n} \).
This suggests that even if we do not assume the \( \varepsilon_i \) are each normally distributed, it still may be
the case that the scaled sum \( \sigma\sqrt{h} \sum \varepsilon_i = \sigma \sum \varepsilon_i / \sqrt{n} \) is normally distributed, \( N(0, \sigma^2) \). This
would follow from a central limit theorem (CLT).
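A quick numerical illustration of this CLT argument, under assumed inputs: the shocks are uniform (hence non-normal), and the values of sigma and n are hypothetical. The scaled sum keeps variance close to sigma squared, and roughly 68% of draws fall within one standard deviation, as a normal distribution would imply.

```python
import random
import statistics

def scaled_sum(n, sigma, rng):
    """sigma * sqrt(h) * sum(eps_i) with h = 1/n and uniform (non-normal) eps_i."""
    half_width = 3 ** 0.5              # uniform on [-sqrt(3), sqrt(3)] has variance 1
    total = sum(rng.uniform(-half_width, half_width) for _ in range(n))
    return sigma * total / n ** 0.5

rng = random.Random(1)
sigma, n = 0.2, 50                     # hypothetical values
innovations = [scaled_sum(n, sigma, rng) for _ in range(20000)]
variance = statistics.pvariance(innovations)
within_one_sd = sum(abs(x) < sigma for x in innovations) / len(innovations)
print(variance, sigma ** 2)            # variance stays near sigma^2
print(within_one_sd)                   # near 0.683 if approximately normal
```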
The CLT requires some “regularity” conditions on the \( \varepsilon_i \), however. These in effect require that
the sum of the \( \varepsilon_i \) is not dominated by one (or a small number) of the individual \( \varepsilon_i \) as n gets large.
Rules out:
(a) fat–tailed distributions
(b) “jumps” in stock prices – this means stock prices are continuous.
Economic content of these assumptions: over small \( \Delta t \), you are “very” sure that the possible
innovation to X also is small. Conclusion: if stock prices are continuous, the CLT implies that
we can act as though the \( \varepsilon_i \) are normally distributed, even without assuming normality from the
outset.
Therefore, assuming continuity is equivalent to assuming that \( \varepsilon \sim N(0, 1) \).
Now take the limit as \( h \to dt \):
\[ X(t + ih) - X[t + (i-1)h] = \mu h + \sigma\sqrt{h}\,\varepsilon_i \]
becomes
\[ dX = \mu\,dt + \sigma\,dz; \qquad dz \equiv \varepsilon\sqrt{dt} \]
dz is called a Wiener process or pure Brownian motion, with properties:
• independent increments
• normally distributed, \( dz \sim N(0, dt) \)
While I have written the mean and std. deviation as constants, \( \mu \) and \( \sigma \), we can be more general.
An Ito process allows \( \mu \) and \( \sigma \) to depend on X and t:
\[ dX = \mu(X, t)\,dt + \sigma(X, t)\,dz \]
This process is still Markov: transition probabilities depend only on current values of X and t.
Special cases:
(1) Arithmetic Brownian Motion
\( \mu \) and \( \sigma \) constant. Then increments to X are independent and identically distributed
random variables, each normally distributed. Total increment is the sum of these iid
normal increments. Hence, \( X_t - X_0 \sim N(\mu t, \sigma^2 t) \).
(2) Geometric Brownian Motion
\( \mu(X, t) = \mu X \) and \( \sigma(X, t) = \sigma X \), so that
\[ dX = \mu X\,dt + \sigma X\,dz \]
\[ dX/X = \mu\,dt + \sigma\,dz \]
Percentage increments to X have constant mean and variance. Implies \( X_t \) is log-normal
[i.e., \( \ln(X_t/X_0) \) is normal].
(3) Ornstein–Uhlenbeck process
\( \mu(X, t) = k(X^* - X) \) and \( \sigma(X, t) = \sigma \)
This specification implies mean reversion to a “long–run” value of X*.
(4) Sometimes interest rates are modeled as
\[ dX = k(X^* - X)\,dt + \sigma X^{\gamma}\,dz; \qquad 0 < \gamma < 1 \]
which rules out negative values if X* > 0.
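Any of these special cases can be simulated by stepping the Ito process forward on a small time grid. The sketch below uses a simple Euler discretization, which is a technique brought in here for illustration and not something the notes prescribe; the function name `euler_path` and all parameter values are assumptions.

```python
import random

def euler_path(x0, drift, diffusion, dt, steps, rng):
    """Simulate dX = drift(X, t) dt + diffusion(X, t) dz on a time grid."""
    x, t, path = x0, 0.0, [x0]
    for _ in range(steps):
        dz = rng.gauss(0.0, dt ** 0.5)      # dz ~ N(0, dt)
        x += drift(x, t) * dt + diffusion(x, t) * dz
        t += dt
        path.append(x)
    return path

k, x_star, sigma = 2.0, 1.0, 0.3            # hypothetical parameter values
rng = random.Random(2)

# Ornstein-Uhlenbeck: drift pulls X toward x_star, diffusion is constant.
ou_path = euler_path(0.0, lambda x, t: k * (x_star - x), lambda x, t: sigma, 0.01, 1000, rng)
# With the noise switched off, the path reverts deterministically to x_star.
ou_no_noise = euler_path(0.0, lambda x, t: k * (x_star - x), lambda x, t: 0.0, 0.01, 1000, rng)
print(ou_path[-1], ou_no_noise[-1])
```

Swapping in `lambda x, t: mu * x` and `lambda x, t: sigma * x` for the two coefficient functions would give an approximate geometric Brownian motion path in the same way.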
Notice that while X is continuous, it is nowhere differentiable:
Continuity:
\[ \lim_{h \to 0} E\{[X(t+h) - X(t)]^2\} = \lim E[(\mu h + \sigma\sqrt{h}\,\varepsilon)^2] \]
\[ = \lim E[\mu^2 h^2 + \sigma^2 h\,\varepsilon^2 + 2\mu\sigma h^{3/2}\varepsilon] = \lim (\mu^2 h^2 + \sigma^2 h) = 0 \;\Rightarrow\; \text{MSE convergence} \]
Differentiability:
If \( [X(t+h) - X(t)]/h = \Delta X/\Delta t \) converges in MSE to some value X'(t), this would be its
time derivative.
We require \( E\{([X(t+h) - X(t)]/h)^2\} \) to converge for the derivative to exist.
But \( \lim E\{([X(t+h) - X(t)]/h)^2\} = \lim (\mu^2 + \sigma^2/h) = \infty \)
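The contrast between the two limits is easy to tabulate. This sketch evaluates the two second moments from the expressions above for shrinking h; the parameter values are hypothetical.

```python
def second_moments(mu, sigma, h):
    """Return E[(dX)^2] and E[(dX/h)^2] for dX = mu*h + sigma*sqrt(h)*eps, eps ~ (0, 1)."""
    increment_sq = mu ** 2 * h ** 2 + sigma ** 2 * h
    slope_sq = increment_sq / h ** 2       # equals mu^2 + sigma^2 / h
    return increment_sq, slope_sq

mu, sigma = 0.12, 0.16                     # hypothetical values
for h in (1e-1, 1e-3, 1e-5):
    print(h, second_moments(mu, sigma, h))
```

The first number shrinks toward zero (continuity) while the second grows without bound (no derivative).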
Intuition: think about the stock market example:
\[ \Delta S/\Delta t = (\mu h + \sigma\sqrt{h}\,\varepsilon)/h \]
Example: The typical daily fluctuation is a very large rate of fluctuation: for example, a one
standard deviation swing in return (e.g., 1% per day) is equivalent to a swing in the annualized
% rate of change of
\[ 1\% / (1/250) = 250\%. \]
Per hour: a one standard deviation swing is \( 1\%/\sqrt{7} \). [7 trading hours per day]
Annualize:
\[ (1\%/\sqrt{7}) / [1/(250 \times 7)] = 250\% \times \sqrt{7} \]
As \( \Delta t \to 0 \), rate of change \( \to \infty \)
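The annualization arithmetic in this example can be written as a small helper. It assumes, as the example does, 250 trading days per year and a standard deviation that scales with the square root of the period length; the function name is illustrative.

```python
def annualized_swing(daily_sd, periods_per_day, days_per_year=250):
    """One-s.d. move over a sub-day period, expressed as an annual rate of change."""
    period_sd = daily_sd / periods_per_day ** 0.5          # s.d. scales with sqrt(time)
    period_length_in_years = 1 / (days_per_year * periods_per_day)
    return period_sd / period_length_in_years

daily = annualized_swing(0.01, 1)      # about 2.5, i.e. 250% per year
hourly = annualized_swing(0.01, 7)     # about 2.5 * sqrt(7)
print(daily, hourly)
```

Shortening the period further (more periods per day) makes the annualized rate grow with the square root of the number of periods.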
If we were to graph the instant-by-instant stock price, it would be continuous, but infinitely
vibrating with slope = \( \pm\infty \). Jagged (nowhere smooth or differentiable) but continuous.
[Figure: stock price \( S_t \) plotted against time.]
Notice: E(dS/dt) does not exist, but E(dS)/dt does exist!
What about functions of X?
For example, if
\( X = \ln S \) or \( S = e^X \) and
\( dX = \mu\,dt + \sigma\,dz \),
then what can we say about dS? [or for a more interesting example, if we know process for dS,
what about rules for the evolution of the call option value, c(S, t)? ]
** Usual calculus will lead you astray!!**
To see why, let’s consider an example. Note that the usual rule says that \( S = e^X \Rightarrow dS = e^X dX \)
To make this example easy, suppose \( \mu = 0 \).
Therefore, \( dX = \sigma\,dz \Rightarrow dX \sim N(0, \sigma^2 dt) \), symmetric around 0.
[Figure: S = exp(X) plotted against X, marking \( \exp(X_0 - \Delta X) \), \( \exp(X_0) \) and \( \exp(X_0 + \Delta X) \)
at \( X_0 - \Delta X \), \( X_0 \) and \( X_0 + \Delta X \).]
Notice that although the disturbance around \( X_0 \) is symmetric (\( \pm \Delta X \)), the disturbance around \( e^{X_0} \) is not.
Convexity of the function implies gain > loss, so there is upward drift in the stock price even
though the rate of return is symmetric around zero.
\[ E(S) = E(e^X) > e^{E(X)} = e^{X_0} = S_0 \]
In general, \( E[f(X)] \ne f[E(X)] \)
This is “Jensen’s Inequality”
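A two-point version of the picture makes the inequality concrete. In this sketch X moves up or down by the same amount with equal probability; the starting price of 100 and the shock size are hypothetical.

```python
import math

def two_point_mean(f, x0, dx):
    """E[f(X)] when X equals x0 - dx or x0 + dx with equal probability."""
    return 0.5 * (f(x0 - dx) + f(x0 + dx))

x0, dx = math.log(100.0), 0.2          # hypothetical starting log price and shock size
expected_price = two_point_mean(math.exp, x0, dx)   # E[exp(X)]
price_at_mean = math.exp(x0)                        # exp(E[X]) = S0
print(expected_price, price_at_mean)
```

The expected price exceeds the starting price even though the shock to the log price is symmetric, because the gain from the up move is larger than the loss from the down move.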
Therefore, while usual calculus states that
\[ dS = e^X dX \]
we’ve just shown that
\[ E(dS) \ne e^X E(dX) = 0 \]
In this case E(dS) > 0
So usual calculus must be wrong for functions of stochastic processes.
Must consider effect of the curvature of the function
[Figure: S plotted against X with the tangent line at \( X_0 \) (slope = \( f'(X_0) \)); the gap between
the curve and the tangent is the error from curvature.]
Obvious tool is a Taylor series expansion, which tells us how to correct for curvature:
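\[ f(X_0 + \Delta X) = f(X_0) + f'(X_0)\,\Delta X + \tfrac{1}{2} f''(X_0)\,(\Delta X)^2 + \tfrac{1}{6} f'''(X_0)\,(\Delta X)^3 + \ldots \]
\( f' \) = initial slope
\( f'' \) adjusts for change in slope (convex \( \Rightarrow f'' > 0 \); concave \( \Rightarrow f'' < 0 \))
\( f''' \) adjusts for change in rate of change in slope, etc.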
How far do we need to go in Taylor expansion? Answer is given by Ito’s lemma.
In a loose sense, Ito’s lemma tells us that when we take a Taylor series, we can use the
following “multiplication rules:”
(i) \( (dt)^2 = 0 \) [In general, \( (dt)^n = 0 \) if n > 1]
(ii) \( (dt)^n \times dz = 0 \) for any n > 0. [Any (zero-mean) stochastic term of higher order than
dt can be ignored.]
(iii) \( (dz)^2 = dt \)
The first two rules are familiar arguments from usual calculus – these terms are of a smaller
order of magnitude than other terms.
The last rule is surprising: the square of a normally distributed random variable is nonstochastic!
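Intuition:
\[ dz = \varepsilon\sqrt{dt} \]
\[ (dz)^2 = \varepsilon^2 dt \]
\[ E(dz^2) = 1 \times dt = dt \]
\[ \operatorname{Var}(dz^2) = E[\{dz^2 - E(dz^2)\}^2] \]
\[ = E[(\varepsilon^2 dt - dt)^2] \]
\[ = E(\varepsilon^4 dt^2 - 2\varepsilon^2 dt^2 + dt^2) \]
\[ = dt^2 \times E(\varepsilon^4 - 2\varepsilon^2 + 1) \]
So \( \operatorname{var}(dz^2) \) is proportional to \( dt^2 \), which is negligible. So \( (dz)^2 \) is not literally non-stochastic,
but as \( h \to dt \), the uncertainty approaches zero “very” rapidly. [Notice importance of normal
distribution: need very thin tails for \( E(\varepsilon^4) \) to be finite.]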
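One way to see the third rule at work is to add up the squared increments over a fixed period as the grid gets finer. For normal shocks each squared increment has mean dt and variance 2 dt² (since E(ε⁴) = 3), so the sum over n = 1/dt subperiods has mean 1 and variance 2/n. The sketch below is a simulation under those assumptions; the seed and grid sizes are arbitrary.

```python
import random

def sum_of_squared_increments(n, rng):
    """Sum of (dz)^2 over n subperiods of length dt = 1/n, with dz ~ N(0, dt)."""
    dt = 1.0 / n
    return sum(rng.gauss(0.0, dt ** 0.5) ** 2 for _ in range(n))

rng = random.Random(3)
for n in (10, 1000, 100000):
    print(n, sum_of_squared_increments(n, rng))

fine_grid_total = sum_of_squared_increments(100000, rng)
```

On the finest grid the total should sit very close to 1, the length of the period, which is the sense in which the squared increments behave like dt.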
Notice that Ito’s lemma implies that if
\[ dX = \mu(X,t)\,dt + \sigma(X,t)\,dz \]
then
\[ (dX)^2 = \mu^2(X,t)\,dt^2 + 2\mu(X,t)\,dt\;\sigma(X,t)\,dz + \sigma^2(X,t)\,dz^2 = 0 + 0 + \sigma^2(X,t)\,dt \]
which is nonstochastic. When you square an Ito process, the only term that survives is the
variance.
Now consider a function of X, y = f(X,t ).
e.g., y may be the value of a call option and X the stock price on which the call is written.
Taylor series:
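\[ f(X + dX, t + dt) = f(X, t) + f_X\,dX + f_t\,dt + \tfrac{1}{2} f_{XX}\,dX^2 + \tfrac{1}{2} f_{tt}\,dt^2 + f_{Xt}\,dX\,dt \]
The \( dt^2 \) and \( dX\,dt \) terms are zero by the multiplication rules, so
\[ df = f_X(X, t)\,dX + f_t(X, t)\,dt + \tfrac{1}{2} f_{XX}(X, t)\,dX^2 + 0 + 0 \]
\[ = f_X[\mu(X, t)\,dt + \sigma(X, t)\,dz] + f_t\,dt + \tfrac{1}{2} f_{XX}\,\sigma^2(X, t)\,dt \]
\[ = [f_X\,\mu(X, t) + f_t + \tfrac{1}{2} f_{XX}\,\sigma^2(X, t)]\,dt + f_X\,\sigma(X, t)\,dz \]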
This is called Ito’s lemma.
** Notice that dy and dX are (locally) perfectly correlated. Their stochastic terms are both
known numbers times dz, the same random variable.
[Figure: y = f(X) plotted against X with the tangent line at \( X_0 \) (slope = \( f'(X_0) \)).]
Since \( \Delta X \) is small for small \( \Delta t \), we may treat the relationship as locally linear, with
correlation approaching 1.
What does this analysis imply about “portfolios” comprised of y and X? That they can be
made riskless. More on this next week.
This graph also explains why the “extra” or second derivative term enters the drift for y: it is
a Jensen’s inequality term that depends on both convexity and volatility.
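The lemma amounts to a recipe for turning the drift and volatility of X into the drift and volatility of y. A minimal sketch of that recipe follows; the function name `ito_coefficients` and the test case y = X² are illustrative choices, not taken from the notes.

```python
def ito_coefficients(mu, sigma, f_x, f_t, f_xx):
    """Drift and diffusion of y = f(X, t) when dX = mu dt + sigma dz.

    All arguments are numbers evaluated at the current (X, t).
    """
    drift = f_x * mu + f_t + 0.5 * f_xx * sigma ** 2
    diffusion = f_x * sigma
    return drift, diffusion

# y = X^2 with X a pure Wiener process (mu = 0, sigma = 1), evaluated at X = 3:
x = 3.0
drift, diffusion = ito_coefficients(mu=0.0, sigma=1.0, f_x=2 * x, f_t=0.0, f_xx=2.0)
print(drift, diffusion)   # 1.0 6.0, i.e. d(X^2) = 1 dt + 2X dz
```

The drift of 1 comes entirely from the second-derivative term: X itself has no drift, yet X² drifts upward because the function is convex.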
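Example 1: Log-normality and geometric Brownian motion
\( X = \ln(S) \) or \( S = e^X = f(X) \)
If
\( dX = \mu\,dt + \sigma\,dz \)
with \( \mu \), \( \sigma \) constant
[i.e., ln(S) has arithmetic Brownian motion, equivalent to assuming \( \ln(S_T) \) is normally dist.]
then
\[ dS = f_X\,\mu\,dt + f_t\,dt + \tfrac{1}{2} f_{XX}\,\sigma^2\,dt + f_X\,\sigma\,dz \]
\[ = (e^X \mu + 0 + \tfrac{1}{2} e^X \sigma^2)\,dt + e^X \sigma\,dz \]
\[ = e^X(\mu + \tfrac{1}{2}\sigma^2)\,dt + e^X \sigma\,dz \]
where the \( \tfrac{1}{2}\sigma^2 \) term is the correction for Jensen's inequality.
But \( S = e^X \). Therefore, \( dS/S = (\mu + \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dz \)
Define \( \alpha = \mu + \tfrac{1}{2}\sigma^2 \)
We will commonly write: \( dS/S = \alpha\,dt + \sigma\,dz \)
S has G.B.M., which we now know is equivalent to: \( S_T/S_0 \) is log–normally distributed.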
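The Jensen correction in this example can be checked without simulation: if ln(S_T/S_0) is normal with mean μT and variance σ²T, the expected growth factor E[S_T/S_0] should grow at rate μ + ½σ². The sketch below integrates exp(x) against the normal density numerically; the trapezoid rule, grid size, and parameter values are assumptions made for illustration.

```python
import math

def expected_exp(mean, variance, grid=20001, width=10.0):
    """E[exp(X)] for X ~ N(mean, variance), by trapezoid integration over +/- width s.d."""
    sd = math.sqrt(variance)
    lo, hi = mean - width * sd, mean + width * sd
    step = (hi - lo) / (grid - 1)
    total = 0.0
    for i in range(grid):
        x = lo + i * step
        density = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        weight = 0.5 if i in (0, grid - 1) else 1.0   # trapezoid end-point weights
        total += weight * math.exp(x) * density * step
    return total

mu, sigma, T = 0.08, 0.20, 1.0                    # hypothetical values
growth = expected_exp(mu * T, sigma ** 2 * T)     # E[S_T / S_0]
print(math.log(growth) / T, mu + 0.5 * sigma ** 2)
```

The two printed numbers should agree closely: the expected price grows faster than the expected log price by half the variance rate.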
Example 2: Futures price dynamics
Suppose the expected rate of return on the stock = \( \alpha \), but the stock pays a continuous proportional
dividend at rate \( \delta \). Then
\[ dS = (\alpha - \delta)\,S\,dt + \sigma S\,dz \]
[Notice that \( \sigma^2(S, t) = \sigma^2 S^2 \)]
Spot-futures parity for futures maturing at time T implies that
\[ F(S_t, t) = S_t\,e^{(r - \delta)(T - t)} \]
Ito's lemma implies
\[ dF = F_S\,dS + F_t\,dt + \tfrac{1}{2} F_{SS}\,(dS)^2 \]
\[ = F_S(\alpha - \delta)S\,dt + F_t\,dt + \tfrac{1}{2} F_{SS}\,\sigma^2 S^2\,dt + F_S\,\sigma S\,dz \]
But \( F_S \times S = F \) and \( F_t = -(r - \delta)F \) and \( F_{SS} = 0 \), so
\[ dF = (\alpha - \delta)F\,dt - (r - \delta)F\,dt + 0 + \sigma F\,dz \]
\[ dF/F = (\alpha - r)\,dt + \sigma\,dz \]
Issues: 1) why is E(dF/F) less than the expected rate of return on the stock by r dt?
2) when will the futures price be an unbiased predictor of \( E(S_t) \)?
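The futures drift can be verified numerically by differentiating the parity formula with finite differences and plugging the results into Ito's lemma. This is a sketch under assumed parameter values; the symbols `alpha` and `delta` stand for the expected stock return and the dividend yield, and the step size is an arbitrary choice.

```python
import math

def futures_price(s, t, r, delta, maturity):
    """Spot-futures parity: F = S * exp((r - delta) * (T - t))."""
    return s * math.exp((r - delta) * (maturity - t))

alpha, delta, r, sigma = 0.10, 0.02, 0.04, 0.25   # hypothetical values
s, t, maturity, step = 100.0, 0.0, 1.0, 1e-3

def F(s_, t_):
    return futures_price(s_, t_, r, delta, maturity)

# Central finite differences for F_S, F_t and F_SS.
f = F(s, t)
f_s = (F(s + step, t) - F(s - step, t)) / (2 * step)
f_t = (F(s, t + step) - F(s, t - step)) / (2 * step)
f_ss = (F(s + step, t) - 2 * f + F(s - step, t)) / step ** 2

# Drift of dF from Ito's lemma, divided by F to get a rate.
drift_rate = (f_s * (alpha - delta) * s + f_t + 0.5 * f_ss * sigma ** 2 * s ** 2) / f
print(drift_rate, alpha - r)    # the two should agree closely
```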
Example 3: Valuation
Consider a security paying a continuous dividend stream, i.e., a dividend in period dt of Xdt
(equivalently, a rate of dividends X), forever.
Suppose \( dX/X = \mu\,dt + \sigma\,dz \)
What is the value of the security? Assume dz is “non–systematic” risk.
Call f(X) the value of the security. [Notice that the value function does not depend on t. Why not?]
\[ df = f_X\,\mu X\,dt + f_t\,dt + \tfrac{1}{2} f_{XX}\,X^2\sigma^2\,dt + f_X\,\sigma X\,dz \]
\[ \text{expected income} = [(f_X\,\mu X + \tfrac{1}{2} f_{XX}\,X^2\sigma^2) + X]\,dt = r f\,dt \]
[The term in parentheses is the capital gain, X is the dividend, and rf dt is the fair return:
this is the equilibrium condition.]
Can we solve this differential equation for f( )? Boundary condition is f(0) = 0
Hypothesize a solution of form f(X) = kX
Then: \( f_X X = f \)
\( f_{XX} = 0 \)
Now plug into the differential equation:
\[ \mu f + 0 + X = r f \]
\[ \mu k X + X = r k X \;\Rightarrow\; k = 1/(r - \mu) \]
\[ f(X) = \frac{X}{r - \mu} = \text{growing perpetuity value (Gordon growth model)} \]
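The guessed solution can be checked by substituting it back into the equilibrium condition. The sketch below does that for f(X) = kX; the parameter values are hypothetical and must satisfy r > μ for the value to be finite.

```python
def valuation_residual(k, x, mu, sigma, r):
    """Left side minus right side of mu*X*f' + 0.5*sigma^2*X^2*f'' + X = r*f for f = k*X."""
    f, f_x, f_xx = k * x, k, 0.0
    return mu * x * f_x + 0.5 * sigma ** 2 * x ** 2 * f_xx + x - r * f

mu, sigma, r = 0.03, 0.20, 0.08     # hypothetical values with r > mu
k = 1 / (r - mu)
residual = valuation_residual(k, 5.0, mu, sigma, r)
print(k, residual)                  # k is about 20; residual is about 0
```

Note that the volatility drops out: because f is linear in X, the second-derivative term is zero and the value depends only on the growth rate and the discount rate.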
Multivariate Ito’s lemma
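Need one more multiplication rule. If \( dz_1 \) and \( dz_2 \) are two Wiener processes, with correlation
of \( \rho \), then
\[ dz_1 \times dz_2 = \rho\,dt \]
Notice that \( dz_1 \times dz_2 = \varepsilon_1\sqrt{dt} \times \varepsilon_2\sqrt{dt} = \varepsilon_1\varepsilon_2\,dt \).
\( E(\varepsilon_1\varepsilon_2) = \operatorname{cov}(\varepsilon_1, \varepsilon_2) = \rho \), since each \( \varepsilon \) has unit standard deviation.
Now let \( y = f(X_1, X_2, t) \)
where
\[ dX_1 = \mu_1(X_1, X_2, t)\,dt + \sigma_1(X_1, X_2, t)\,dz_1 \]
\[ dX_2 = \mu_2(X_1, X_2, t)\,dt + \sigma_2(X_1, X_2, t)\,dz_2 \]
Then
\[ dy = f_1\,dX_1 + f_2\,dX_2 + f_3\,dt + \tfrac{1}{2} f_{11}(dX_1)^2 + \tfrac{1}{2} f_{22}(dX_2)^2 + f_{12}\,dX_1\,dX_2 \]
\[ = [f_1\mu_1 + f_2\mu_2 + f_3 + \tfrac{1}{2} f_{11}\sigma_1^2(X_1, X_2, t) + \tfrac{1}{2} f_{22}\sigma_2^2(X_1, X_2, t) + f_{12}\,\rho\,\sigma_1\sigma_2]\,dt \]
\[ \quad + f_1\sigma_1\,dz_1 + f_2\sigma_2\,dz_2 \]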
The generalization to n sources of risk is obvious.
This has interesting relation to the APT. In continuous time, if X1 and X2 are continuous, then
(local) uncertainty in y is linear and additive in sources of uncertainty, even if original
functional form is not additive. Ito’s lemma provides a “factor” loading equation.
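Example: suppose the production function is \( Q = L^a K^b \) and L and K both follow GBM:
\[ dL/L = \mu_L\,dt + \sigma_L\,dz_L \;\Rightarrow\; dL = \mu_L L\,dt + \sigma_L L\,dz_L \]
\[ dK/K = \mu_K\,dt + \sigma_K\,dz_K \;\Rightarrow\; dK = \mu_K K\,dt + \sigma_K K\,dz_K \]
Notice that
\( Q_L = \partial Q/\partial L = a L^{a-1} K^b \) and
\( Q_L L = a L^a K^b = aQ \)
Similarly:
\( Q_K K = bQ \)
\( Q_{KK} K^2 = b(b-1)Q \)
\( Q_{LL} L^2 = a(a-1)Q \)
\( Q_{LK} LK = abQ \)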
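\[ dQ = Q_L\,dL + Q_K\,dK + \tfrac{1}{2} Q_{LL}(dL)^2 + \tfrac{1}{2} Q_{KK}(dK)^2 + Q_{KL}\,dK\,dL \]
\[ = Q_L(\mu_L L\,dt + \sigma_L L\,dz_L) + Q_K(\mu_K K\,dt + \sigma_K K\,dz_K) + \tfrac{1}{2} Q_{KK}(\sigma_K^2 K^2\,dt) \]
\[ \quad + \tfrac{1}{2} Q_{LL}(\sigma_L^2 L^2\,dt) + Q_{KL}(\rho\,KL\,\sigma_K\sigma_L\,dt) \]
\[ = dt\,\{aQ\mu_L + bQ\mu_K + \tfrac{1}{2} Q\,a(a-1)\sigma_L^2 + \tfrac{1}{2} Q\,b(b-1)\sigma_K^2 + Q\,ab\,\rho\,\sigma_L\sigma_K\} \]
\[ \quad + aQ\sigma_L\,dz_L + bQ\sigma_K\,dz_K \]
\[ dQ/Q = [a\mu_L + b\mu_K + \tfrac{1}{2} a(a-1)\sigma_L^2 + \tfrac{1}{2} b(b-1)\sigma_K^2 + ab\,\rho\,\sigma_L\sigma_K]\,dt \]
\[ \quad + a\sigma_L\,dz_L + b\sigma_K\,dz_K \]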
So the elasticities a, b are also the “factor loadings.”
Notice that the constant elasticity specification, \( f(L, K) = L^a K^b \), preserves Geometric Brownian
Motion.
In a single-variable example, e.g., \( Q = L^a \), we would get
\[ dQ/Q = (a\mu_L + \tfrac{1}{2} a(a-1)\sigma_L^2)\,dt + a\sigma_L\,dz_L \]
and the second derivative term in the drift is positive if a > 1, i.e., if the function is convex, and
negative if a < 1, i.e., if the function is concave.
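The production-function result translates directly into a small function returning the drift of dQ/Q and the two factor loadings. All numerical inputs below are hypothetical, and `rho` denotes the correlation between the two Wiener processes.

```python
def cobb_douglas_dynamics(a, b, mu_l, mu_k, sigma_l, sigma_k, rho):
    """Drift and factor loadings of dQ/Q for Q = L^a K^b with L and K following GBM."""
    drift = (a * mu_l + b * mu_k
             + 0.5 * a * (a - 1) * sigma_l ** 2
             + 0.5 * b * (b - 1) * sigma_k ** 2
             + a * b * rho * sigma_l * sigma_k)
    loadings = (a * sigma_l, b * sigma_k)   # coefficients on dz_L and dz_K
    return drift, loadings

drift, loadings = cobb_douglas_dynamics(0.7, 0.3, 0.02, 0.04, 0.10, 0.20, 0.5)
# Single-factor case Q = L^a with no drift in L: only the curvature term remains,
# and it changes sign at a = 1.
convex, _ = cobb_douglas_dynamics(2.0, 0.0, 0.0, 0.0, 0.10, 0.0, 0.0)
concave, _ = cobb_douglas_dynamics(0.5, 0.0, 0.0, 0.0, 0.10, 0.0, 0.0)
print(drift, loadings, convex, concave)
```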
Extreme market timing would shift fully between the market index and
Treasury bills.
Performance of a perfectly successful timer as a function of market performance
[Figure: portfolio return (vertical axis) plotted against market index return (horizontal axis),
with rf marked on both axes.]
Notice: There is no meaning to "the beta" of the fund. Beta = 0 or 1, depending on the
market forecast.
Can you guess the results of this strategy, if executed with perfect success?
                            1926    1999       Compound growth (annual)
Bills only:                 $1      $12.6      3.5%
Market index only:          $1      $973       9.9%
Perfect timer (annual):     $1      $41,022    15.7%
Perfect timer (monthly):    $1      $5.74B     36.0%
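The compound growth rates follow from the terminal values by taking the appropriate root. The sketch below assumes a 73-year horizon (1926 to 1999), which is an inference from the column headings and not stated explicitly in the notes.

```python
def compound_annual_growth(terminal_value, years, initial_value=1.0):
    """Annual growth rate that turns initial_value into terminal_value over `years` years."""
    return (terminal_value / initial_value) ** (1 / years) - 1

years = 1999 - 1926      # assumed horizon
terminal_values = {
    "Bills only": 12.6,
    "Market index only": 973.0,
    "Perfect timer (annual)": 41022.0,
    "Perfect timer (monthly)": 5.74e9,
}
rates = {name: compound_annual_growth(v, years) for name, v in terminal_values.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

Under that assumption the computed rates round to the percentages in the table.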