Applications of bootstrap method on finance

advertisement
Applications of bootstrap
method to finance
Chin-Ping King
Population distribution function F
empirical distribution function(EDF) Fn
F
(x1, x2,…, xn)
where x= (x1, x2,…, xn)
Fn
(x1*, x2*,…, xn*)
where x*= (x1*, x2*,…, xn*)
Probability of elements of population which occur in x : P
Probability of elements of EDF which occur in x*: Pn
Pn ~ Bi(P, (p*(1-p)/n))
By Weak Law of Large Number and Central Limit Theorem
(n)1/2(Fn - F)
d
N(0, p*(1-p))
Estimation of standard deviation and bias
Estimator of θ : θ’
θ’ =s(x)
Standard deviation:
se={∑nj=1 [θ’j- θ’(.)]2/(n-1)}
se(.)= ∑nj=1 θ’ j/n
Bias:
bias=E[θ’]- θ
Root mean square error of an estimator θ’ for θ:
E[(θ’- θ)2]= se2*{1+(1/2)*(bias/se)} 2
Nonparametric bootstrap
The bootstrap algorithm for estimating standard errors(or bias)
1. Select B independent bootstrap samples x*1, x*2 , …, x*B, each consisting of n data drawn
with replacement from x . Total possible number of distinct bootstrap samples is
C(2n-1,n) .
2. Evaluate the bootstrap replication corresponding to each bootstrap sample
θ’*(b)=s(x*b )
b=1,2,…,B
3. Estimate the standard error (or bias) by the sample standard deviation (or bias) of the B
replications:
se’B = se={∑Bb=1 [θ ’*(b)- θ’*(.)]2/(B-1)}
θ’* (.)= ∑Bb=1 θ ’*(b) /B
Bias’B=θ’*(.)- θ’
A Schematic diagram of the nonparametric
bootstrap
Unknown
Population
Distribution
F
Observed Random
Sample
x= (x1, x2,…, xn)
θ’ =s(x)
Statistic of interest
Empirical
Distribution
Fn
Bootstrap
Sample
x*= (x1*, x2*,…, xn*)
θ’*(b)=s(x*b )
Bootstrap Replication
Parametric bootstrap
Function form of population probability distribution F has been known, but
parameters in population probability distribution F are not known
Parametric estimate of population probability distribution : Fpar
We draw B samples of size n from the parametric estimate of estimate of the
population probability distribution Fpar:
Fpar
x*= (x1*, x2*,…, xn*)
Error in bootstrap estimates
mi= the ith moment of the bootstrap distribution of θ’
Var(se’B) = Var(m21/2 ) + E[(m2(△+2))/4B]
△= m4/m22-3, the kurtosis of the bootstrap distribution of θ’
Var(m21/2 ):sample variation , it approaches zero as the sample size n
approaches infinity
E[(m2(△+2))/4B]:resampling variation, it approaches zero as B approaches
infinity
Confidence intervals based on bootstrap percentiles
(1-α) Percentile interval:
[θ’%low , θ’%up ]= [θ’*(α/2)B , θ’*(1-(α/2))B ]
θ’*(α/2)B : 100*(α/2)th empirical percentile, or B *(α/2)th value in the ordered
list of the B replications of θ’*
θ’*(1-(α/2))B : 100*(1-(α/2))th empirical percentile, or B *(1-(α/2))th value in the
ordered list of the B replications of θ’*
Percentile interval lemma
Suppose the transformation ψ’=t(θ’) perfectly normalize the distribution of θ’:
ψ’ ~ N(ψ, c2)
For some standard deviation c. Then the percentile interval based on θ’
equals [t-1(ψ’-z(1-(α/2))*c), t-1(ψ’-z(α/2)*c)]
Example:
θ’ =exp(x)
ψ’=t(θ’)=logθ’
x ~ N(0,1)
Coverage performance
Results of 300 confidence interval realizations for θ’ =exp(x)
Method
Standard normal
Interval
Bootstrap percentile
Interval
miss left: left endpoint >1
Miss right: right endpoint <1
% miss left
% miss right
1.2
8.8
4.8
5.2
Transformation-respecting property
The percentile interval for any (monotone) parameter transformation ψ’=t(θ’)
is simply the percentile interval for θ’ mapped by t(θ’) :
[ψ’%low , ψ’%up ]= [t(θ ’%low) , t(θ ’%up) ]
Better bootstrap confidence intervals
(1-α) BCa interval:
[θ ’low , θ ’up ]= [θ’*(α1) , θ’*(α2) ]
α1 and α2 are obtained by standard normal cumulative distribution function
of some correction formulas for bootstrap replications.
BCa interval is transformation respecting.
Accuracy of bootstrap confidence interval
For (1- α )coverage, approximate confidence interval points θ ’low and θ ’up are
called first order accurate if:
Pr(θ ≦ θ ’low )= (α/2 )+ O(n-1/2)
Pr(θ ≧ θ ’up)= (α /2)+ O(n-1/2 )
And second order accurate if
Pr(θ ≦ θ ’low )= (α/2 )+ O(n-1)
Pr(θ ≧ θ ’up)= (α/2 )+ O(n-1)
Percentile interval : first order accurate.
BCa interval : second order accurate.
Calibration of confidence interval points
1.
Generate B bootstrap samples x*1, x*2 , …, x*B. For each sample b=1,2,…,B:
1a) Compute a λ-level confidence interval point θ’*λ (b) for a grid of
values of λ. Where θ’*λ (b) can be θ’*(b)-z1-λ *se ’*(b) .
2. For each λ compute p’ (λ)=#{θ’ ≦ θ’*λ (b) }/B.
3. Find the value of λ satisfying p’ (λ)= α/2
Calibration of percentile interval and BCa interval
Once calibration of percentile interval: second order accurate
Pr(θ ≦ θ ’low )= (α/2 )+ O(n-1)
Pr(θ ≧ θ ’up)= (α/2 )+ O(n-1)
Once calibration of BCa interval: third order accurate
Pr(θ ≦ θ ’low )= (α/2 )+ O(n-3/2)
Pr(θ ≧ θ ’up)= (α /2)+ O(n-3/2 )
Computation of the bootstrap test statistics
1.
Draw B samples of size n with replacement from x.
2.
Evaluate ϕ(.) on each sample,
ϕ(x*b) where ϕ(.) is test statistics
b=1,2,…,B
3. Approximate P-value by
P-value=#{ϕ(x*b) ≧ ϕobs}/B or
P-value=#{ϕ(x*b) ≦ ϕobs}/B
Where ϕobs= ϕ(x) the observed value of test statistics
Asymptotic refinement
Asymptotically normal test statistics ϕ
ϕ
d
N(0,σ2)
ϕ ~ Gn(u,F)
Gn(u,F): exact cumulative distribution
Gn(u,F)=Pr(|ϕ| ≦u|F)
Gn(u,F)
φ(u) as n approaches infinity (assume σ=1)
φ(u): standard normal cumulative distribution
An asymptotic test is based on φ(u)
φ(u)- Gn(u,F)=O(n-1)
G*n(u): bootstrap cumulative distribution
A bootstrap test is based on G*n(u)
G*n(u)-Gn(u,F)= O(n-3/2)
Reality test for data snooping
Forecasting model: lk
Benchmark model: l0
dk= lk- l0
H0 :maxk=1,2,…,nE(dk) ≦0
Data: 1000 daily closing stock prices of UMC
Benchmark model: random walk with drift
lnPt = a + lnPt-1 + εt
Forecasting model :
lnPt = a + ΔlnPt-1 + εt
where ΔlnPt = lnPt – lnPt-1
V=(1/B) ∑Bb=1 d1(b)
Critical value
Quantile of bootstrap distribution
for V
0.000808
The difference is significant, so reject H0
Forecasting model beat random walk model
Statistics V
-1.9874504*
Inference when a nuisance parameter is not identified
the null hypothesis
Threshold Autoregressive (TAR)model:
α10 + α11 yt -1+ ε1t
yt -1 ≦η
α20 + α21 yt -1+ ε2t
yt -1 > η
yt =
η : threshold value
H0: time series is linear
H1: time series is TAR process
Data: monthly data of U.S. dollar/Sweden krona exchange rate from January
1974 to December 1998
U.S. dollar/Sweden krona
Bootstrap P-value
0.0200
Reject H0
U.S. dollar/Sweden krona exchange rates follow TAR process
Bootstrap percentile confidence interval
1.0
0.8
0.6
0.4
0.2
0.0
91:01 91:04 91:07 91:10 92:01 92:04 92:07 92:10
SDN
L05
L10
U90
U95
Thanks for listening
Download