Applications of bootstrap method to finance Chin-Ping King Population distribution function F empirical distribution function(EDF) Fn F (x1, x2,…, xn) where x= (x1, x2,…, xn) Fn (x1*, x2*,…, xn*) where x*= (x1*, x2*,…, xn*) Probability of elements of population which occur in x : P Probability of elements of EDF which occur in x*: Pn Pn ~ Bi(P, (p*(1-p)/n)) By Weak Law of Large Number and Central Limit Theorem (n)1/2(Fn - F) d N(0, p*(1-p)) Estimation of standard deviation and bias Estimator of θ : θ’ θ’ =s(x) Standard deviation: se={∑nj=1 [θ’j- θ’(.)]2/(n-1)} se(.)= ∑nj=1 θ’ j/n Bias: bias=E[θ’]- θ Root mean square error of an estimator θ’ for θ: E[(θ’- θ)2]= se2*{1+(1/2)*(bias/se)} 2 Nonparametric bootstrap The bootstrap algorithm for estimating standard errors(or bias) 1. Select B independent bootstrap samples x*1, x*2 , …, x*B, each consisting of n data drawn with replacement from x . Total possible number of distinct bootstrap samples is C(2n-1,n) . 2. Evaluate the bootstrap replication corresponding to each bootstrap sample θ’*(b)=s(x*b ) b=1,2,…,B 3. Estimate the standard error (or bias) by the sample standard deviation (or bias) of the B replications: se’B = se={∑Bb=1 [θ ’*(b)- θ’*(.)]2/(B-1)} θ’* (.)= ∑Bb=1 θ ’*(b) /B Bias’B=θ’*(.)- θ’ A Schematic diagram of the nonparametric bootstrap Unknown Population Distribution F Observed Random Sample x= (x1, x2,…, xn) θ’ =s(x) Statistic of interest Empirical Distribution Fn Bootstrap Sample x*= (x1*, x2*,…, xn*) θ’*(b)=s(x*b ) Bootstrap Replication Parametric bootstrap Function form of population probability distribution F has been known, but parameters in population probability distribution F are not known Parametric estimate of population probability distribution : Fpar We draw B samples of size n from the parametric estimate of estimate of the population probability distribution Fpar: Fpar x*= (x1*, x2*,…, xn*) Error in bootstrap estimates mi= the ith moment of the bootstrap distribution of θ’ Var(se’B) = Var(m21/2 ) + E[(m2(△+2))/4B] △= m4/m22-3, the kurtosis of the bootstrap distribution of θ’ Var(m21/2 ):sample variation , it approaches zero as the sample size n approaches infinity E[(m2(△+2))/4B]:resampling variation, it approaches zero as B approaches infinity Confidence intervals based on bootstrap percentiles (1-α) Percentile interval: [θ’%low , θ’%up ]= [θ’*(α/2)B , θ’*(1-(α/2))B ] θ’*(α/2)B : 100*(α/2)th empirical percentile, or B *(α/2)th value in the ordered list of the B replications of θ’* θ’*(1-(α/2))B : 100*(1-(α/2))th empirical percentile, or B *(1-(α/2))th value in the ordered list of the B replications of θ’* Percentile interval lemma Suppose the transformation ψ’=t(θ’) perfectly normalize the distribution of θ’: ψ’ ~ N(ψ, c2) For some standard deviation c. Then the percentile interval based on θ’ equals [t-1(ψ’-z(1-(α/2))*c), t-1(ψ’-z(α/2)*c)] Example: θ’ =exp(x) ψ’=t(θ’)=logθ’ x ~ N(0,1) Coverage performance Results of 300 confidence interval realizations for θ’ =exp(x) Method Standard normal Interval Bootstrap percentile Interval miss left: left endpoint >1 Miss right: right endpoint <1 % miss left % miss right 1.2 8.8 4.8 5.2 Transformation-respecting property The percentile interval for any (monotone) parameter transformation ψ’=t(θ’) is simply the percentile interval for θ’ mapped by t(θ’) : [ψ’%low , ψ’%up ]= [t(θ ’%low) , t(θ ’%up) ] Better bootstrap confidence intervals (1-α) BCa interval: [θ ’low , θ ’up ]= [θ’*(α1) , θ’*(α2) ] α1 and α2 are obtained by standard normal cumulative distribution function of some correction formulas for bootstrap replications. BCa interval is transformation respecting. Accuracy of bootstrap confidence interval For (1- α )coverage, approximate confidence interval points θ ’low and θ ’up are called first order accurate if: Pr(θ ≦ θ ’low )= (α/2 )+ O(n-1/2) Pr(θ ≧ θ ’up)= (α /2)+ O(n-1/2 ) And second order accurate if Pr(θ ≦ θ ’low )= (α/2 )+ O(n-1) Pr(θ ≧ θ ’up)= (α/2 )+ O(n-1) Percentile interval : first order accurate. BCa interval : second order accurate. Calibration of confidence interval points 1. Generate B bootstrap samples x*1, x*2 , …, x*B. For each sample b=1,2,…,B: 1a) Compute a λ-level confidence interval point θ’*λ (b) for a grid of values of λ. Where θ’*λ (b) can be θ’*(b)-z1-λ *se ’*(b) . 2. For each λ compute p’ (λ)=#{θ’ ≦ θ’*λ (b) }/B. 3. Find the value of λ satisfying p’ (λ)= α/2 Calibration of percentile interval and BCa interval Once calibration of percentile interval: second order accurate Pr(θ ≦ θ ’low )= (α/2 )+ O(n-1) Pr(θ ≧ θ ’up)= (α/2 )+ O(n-1) Once calibration of BCa interval: third order accurate Pr(θ ≦ θ ’low )= (α/2 )+ O(n-3/2) Pr(θ ≧ θ ’up)= (α /2)+ O(n-3/2 ) Computation of the bootstrap test statistics 1. Draw B samples of size n with replacement from x. 2. Evaluate ϕ(.) on each sample, ϕ(x*b) where ϕ(.) is test statistics b=1,2,…,B 3. Approximate P-value by P-value=#{ϕ(x*b) ≧ ϕobs}/B or P-value=#{ϕ(x*b) ≦ ϕobs}/B Where ϕobs= ϕ(x) the observed value of test statistics Asymptotic refinement Asymptotically normal test statistics ϕ ϕ d N(0,σ2) ϕ ~ Gn(u,F) Gn(u,F): exact cumulative distribution Gn(u,F)=Pr(|ϕ| ≦u|F) Gn(u,F) φ(u) as n approaches infinity (assume σ=1) φ(u): standard normal cumulative distribution An asymptotic test is based on φ(u) φ(u)- Gn(u,F)=O(n-1) G*n(u): bootstrap cumulative distribution A bootstrap test is based on G*n(u) G*n(u)-Gn(u,F)= O(n-3/2) Reality test for data snooping Forecasting model: lk Benchmark model: l0 dk= lk- l0 H0 :maxk=1,2,…,nE(dk) ≦0 Data: 1000 daily closing stock prices of UMC Benchmark model: random walk with drift lnPt = a + lnPt-1 + εt Forecasting model : lnPt = a + ΔlnPt-1 + εt where ΔlnPt = lnPt – lnPt-1 V=(1/B) ∑Bb=1 d1(b) Critical value Quantile of bootstrap distribution for V 0.000808 The difference is significant, so reject H0 Forecasting model beat random walk model Statistics V -1.9874504* Inference when a nuisance parameter is not identified the null hypothesis Threshold Autoregressive (TAR)model: α10 + α11 yt -1+ ε1t yt -1 ≦η α20 + α21 yt -1+ ε2t yt -1 > η yt = η : threshold value H0: time series is linear H1: time series is TAR process Data: monthly data of U.S. dollar/Sweden krona exchange rate from January 1974 to December 1998 U.S. dollar/Sweden krona Bootstrap P-value 0.0200 Reject H0 U.S. dollar/Sweden krona exchange rates follow TAR process Bootstrap percentile confidence interval 1.0 0.8 0.6 0.4 0.2 0.0 91:01 91:04 91:07 91:10 92:01 92:04 92:07 92:10 SDN L05 L10 U90 U95 Thanks for listening