Week 10: VaR and GARCH model

Estimation of VaR with Pareto tail
• Disadvantages of the parametric and nonparametric methods:
a. Parametric method: the assumption of a normal distribution often does not hold.
b. Nonparametric method: estimation is usually feasible only for large α, not for small α. One therefore resorts to a nonparametric estimate at a large α and uses it to estimate VaR at a small α.

Extreme Value Theory (EVT)
• Assume $\{r_t\}$ is i.i.d. with distribution $F(x)$. Then the CDF of the minimum $r_{(1)}$, denoted by $F_{n,1}(x)$, is given by $F_{n,1}(x) = 1 - [1 - F(x)]^n$. In practice $F(x)$ is unknown, and hence $F_{n,1}(x)$ is unknown.
• EVT is concerned with finding two sequences $\{a_n\}$ and $\{b_n\}$ such that $(r_{(1)} - a_n)/b_n$ converges to a non-degenerate distribution as $n \to \infty$. Under the independence assumption, the limit distribution is
$F_*(x) = 1 - \exp[-(1 + kx)^{1/k}]$ when $k \neq 0$, and $F_*(x) = 1 - \exp[-\exp(x)]$ when $k = 0$.

Three types of the above EVT distribution
a. Type I: $k = 0$, the Gumbel family with CDF $F_*(x) = 1 - \exp[-\exp(x)]$, $x \in \mathbb{R}$.
b. Type II: $k < 0$, the Fréchet family with CDF $F_*(x) = 1 - \exp[-(1 + kx)^{1/k}]$ if $x < -1/k$, and $F_*(x) = 1$ otherwise.
c. Type III: $k > 0$, the Weibull family with CDF $F_*(x) = 1 - \exp[-(1 + kx)^{1/k}]$ if $x > -1/k$, and $F_*(x) = 0$ otherwise.

Pareto tails
• In risk management we are mainly interested in the Fréchet family, which includes the stable and Student-t distributions.
• When $\{r_t\}$ is i.i.d. with tail distribution $P(|r_1| > x) = x^{-\beta} L(x)$, i.e., $r_t$ has Pareto tails, the (suitably normalized) sequence $\{X_t\}$ converges to a stable distribution with tail index β. In practice the tail index β is unknown.
• To evaluate VaR for a small α, we first estimate the tail index β, then apply the nonparametric method to obtain VaR for a large $\alpha_0$, and finally use $\mathrm{VaR}(\alpha_0)$ to estimate $\mathrm{VaR}(\alpha)$. (how?) This is the so-called semi-parametric method.

How to estimate the tail index β?
• MLE: applicable when the whole distribution of $r_t$ is known.
• Linear regression: suppose $\{r_t\}$ has a Pareto left tail, i.e., for $x > 0$, $P(r_1 < -x) = x^{-\beta} L(x)$. Then $\log(k/n) \approx \log P(r_1 < r_{(k)}) = -\beta \log(-r_{(k)}) + \log L(-r_{(k)})$, so β can be estimated by regressing $\log(k/n)$ on $\log(-r_{(k)})$ over the tail observations.
• Hill estimator: based on the MLE. When $P(r_1 \le x) = 1 - (x/c)^{-\beta}$ for $x > c$, the MLE satisfies $n/\beta = \log(r_1/c) + \log(r_2/c) + \dots + \log(r_n/c)$. When only the data in the tail are used to compute the tail index, $\hat\beta = n(c) / \sum_{r_i > c} \log(r_i/c)$, where $n(c)$ is the number of observations exceeding $c$. Letting $H = 1/\beta$, we get $\hat H = \frac{1}{n(c)} \sum_{r_i > c} \log(r_i/c)$.

The properties of Hill estimator
[formula missing in the source]

VaR for a derivative
• Suppose that instead of a stock, one owns a derivative whose value depends on the stock. One can estimate a VaR for this derivative by
VaR for derivative $= L \times$ VaR for asset,
where $L = \Delta \cdot p_{t-1}^{\text{asset}} / p_{t-1}^{\text{option}}$ and $\Delta = \partial C(S, T, t, K, \sigma, r) / \partial S$.

Volatility modeling
• Volatility is important in options trading. For example, for a European call option the well-known Black-Scholes formula states that the price is $C(S_0) = S_0 \Phi(d_1) - K e^{-rT} \Phi(d_2)$, where $d_1 = (\sigma\sqrt{T})^{-1} [\log(S_0/K) + (r + \sigma^2/2) T]$ and $d_2 = (\sigma\sqrt{T})^{-1} [\log(S_0/K) + (r - \sigma^2/2) T] = d_1 - \sigma\sqrt{T}$.
• In VaR, let $R_t$ be the daily asset log-return and $S_t$ the daily closing price, so that $R_{t+1} = \log(S_{t+1}/S_t)$. If the return is normally distributed with mean zero, one can write it as $R_{t+1} = \sigma_{t+1} z_{t+1}$, with $z_{t+1}$ i.i.d. $N(0, 1)$.
• The variance, as measured by squared returns, exhibits strong autocorrelation: if the recent period was one of high variance, tomorrow's variance tends to be high as well. The easiest way to capture this phenomenon is the RiskMetrics model $\sigma_{t+1}^2 = \lambda \sigma_t^2 + (1 - \lambda) R_t^2$, $0 < \lambda < 1$.

The advantages of RiskMetrics
• It is consistent with observed returns that recent returns matter more for tomorrow's variance than distant returns.
• It is simple: the model contains only one parameter.
• Relatively little data needs to be stored to calculate tomorrow's variance.

Shortcoming of RiskMetrics
• It ignores the fact that the long-run average variance tends to be relatively stable over time.
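The RiskMetrics recursion discussed above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes the usual exponentially weighted form $\sigma_{t+1}^2 = \lambda \sigma_t^2 + (1 - \lambda) R_t^2$, the conventional daily value λ = 0.94, and a zero-mean normal return for the VaR step. The function names and the choice of starting variance are hypothetical.

```python
import math
from statistics import NormalDist


def riskmetrics_variance(returns, lam=0.94, init_var=None):
    """Return the one-step-ahead variance forecasts sigma^2_{t+1}, one per return.

    Assumption: the recursion is started from the sample mean of squared
    returns unless init_var is supplied.
    """
    if init_var is None:
        init_var = sum(r * r for r in returns) / len(returns)
    var = init_var
    forecasts = []
    for r in returns:
        # Tomorrow's variance: weighted average of today's variance
        # and today's squared return.
        var = lam * var + (1.0 - lam) * r * r
        forecasts.append(var)
    return forecasts


def normal_var(sigma2, alpha=0.01):
    """VaR reported as a positive loss, assuming R ~ N(0, sigma2)."""
    return -math.sqrt(sigma2) * NormalDist().inv_cdf(alpha)


if __name__ == "__main__":
    daily_returns = [0.01, -0.02, 0.015]  # made-up numbers for illustration
    sigma2_path = riskmetrics_variance(daily_returns, init_var=0.0001)
    print(sigma2_path[-1], normal_var(sigma2_path[-1], alpha=0.01))
```

Only the previous variance and the latest return are needed at each step, which is the storage advantage listed above. The recursion has no constant term, so nothing pulls the forecast back toward a long-run level.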
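GARCH model
• ARCH(1) model: $a_t = \sigma_t \varepsilon_t$, $\sigma_t^2 = \alpha_0 + \alpha_1 a_{t-1}^2$, where $\alpha_0 > 0$, $\alpha_1 \ge 0$ and $\{\varepsilon_t\}$ is i.i.d. with mean 0 and variance 1.

The unconditional kurtosis of ARCH(1)
• Suppose the innovations are normal. Then $E(a_t^4 \mid F_{t-1}) = 3 [E(a_t^2 \mid F_{t-1})]^2 = 3 (\alpha_0 + \alpha_1 a_{t-1}^2)^2$. It follows that, provided $3\alpha_1^2 < 1$,
$E a_t^4 = \dfrac{3 \alpha_0^2 (1 + \alpha_1)}{(1 - \alpha_1)(1 - 3\alpha_1^2)}$ and $\dfrac{E a_t^4}{(E a_t^2)^2} = \dfrac{3 (1 - \alpha_1^2)}{1 - 3\alpha_1^2} > 3$.
This shows that the tail of the distribution of $a_t$ is heavier than that of a normal distribution.

How to build an ARCH model
• Build an econometric model (e.g., an ARMA model) for the return series to remove any linear dependence in the data, and use the residual series of the model to test for ARCH effects.
• Specify the ARCH order and perform estimation: $\mathrm{AIC}(p) = \log(\hat\sigma_p^2) + 2p/n$.
• Model checking: Ljung-Box statistics, $Q = n(n+2) \sum_{k=1}^{h} \hat\rho_k^2 / (n - k)$.

GARCH(1, 1) model
$R_{t+1} = \sigma_{t+1} z_{t+1}$, $\sigma_{t+1}^2 = \omega + \alpha R_t^2 + \beta \sigma_t^2$.
Note that the long-run (unconditional) variance is $\sigma^2 = \omega / (1 - \alpha - \beta)$.

The forecast of the variance of the K-day cumulative return
If the returns have zero autocorrelation, then the variance of the K-day return is
$\sigma_{t+1:t+K}^2 = \sum_{k=1}^{K} E_t[\sigma_{t+k}^2]$.
For the RiskMetrics model, it is just $K \sigma_{t+1}^2$. But for a GARCH model, we have
$\sigma_{t+1:t+K}^2 = K \sigma^2 + \sum_{k=1}^{K} (\alpha + \beta)^{k-1} (\sigma_{t+1}^2 - \sigma^2)$.
If the returns have zero autocorrelation and $\sigma_{t+1} < \sigma$, then the GARCH variance forecast is larger than the RiskMetrics forecast.

GARCH(p, q)
• A process $\{R_t\}$ is called a GARCH(p, q) model if $R_t = \sigma_t \varepsilon_t$, where
$\sigma_t^2 = \omega + \sum_{j=1}^{p} \alpha_j R_{t-j}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2$
and $\{\varepsilon_t\}$ is i.i.d. with mean 0 and variance 1.

Between GARCH and ARMA model
• Let $e_t = R_t^2 - \sigma_t^2$. Then $R_t^2$ follows an ARMA model. This can be seen from
$R_t^2 = \omega + \sum_{i=1}^{\max\{p, q\}} (\alpha_i + \beta_i) R_{t-i}^2 + e_t - \sum_{j=1}^{q} \beta_j e_{t-j}$,
with $\alpha_i = 0$ for $i > p$ and $\beta_i = 0$ for $i > q$. This also explains why simple GARCH models, such as GARCH(1, 1), may provide a parsimonious representation of a complex autodependence structure of $R_t^2$.

Some properties
• Theorem A: The necessary and sufficient condition for a GARCH(p, q) to be a unique strictly stationary process with finite variance is $\sum_{j=1}^{p} \alpha_j + \sum_{j=1}^{q} \beta_j < 1$. Further, $E R_t = 0$, $\mathrm{Cov}(R_t, R_{t-k}) = 0$ for $k > 0$, and $\mathrm{Var}(R_t) = \omega / (1 - \sum_{j=1}^{p} \alpha_j - \sum_{j=1}^{q} \beta_j)$. In addition, if $[E(\varepsilon_t^4)]^{1/2} \sum_{j=1}^{p} \alpha_j / (1 - \sum_{j=1}^{q} \beta_j) < 1$, then $R_t$ has a finite fourth moment.
• Theorem B: Under $\sum_{j=1}^{p} \alpha_j + \sum_{j=1}^{q} \beta_j < 1$, $\{R_t^2\}$ is a causal and invertible ARMA(max{p, q}, q) process and exhibits heavier tails than those of $\varepsilon_t$ in the sense of kurtosis.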
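The comparison between the GARCH(1, 1) and RiskMetrics K-day variance forecasts can be checked numerically. The sketch below assumes the standard GARCH(1, 1) forecast $E_t[\sigma_{t+k}^2] = \sigma^2 + (\alpha + \beta)^{k-1} (\sigma_{t+1}^2 - \sigma^2)$ with $\sigma^2 = \omega / (1 - \alpha - \beta)$, and zero autocorrelation in returns so that daily variances add up. The function names and parameter values are hypothetical.

```python
def garch11_long_run_variance(omega, alpha, beta):
    """Unconditional variance omega / (1 - alpha - beta); needs alpha + beta < 1."""
    persistence = alpha + beta
    if persistence >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite variance")
    return omega / (1.0 - persistence)


def garch11_kday_variance(sigma2_next, omega, alpha, beta, K):
    """Variance forecast of the K-day cumulative return under GARCH(1, 1)."""
    sigma2 = garch11_long_run_variance(omega, alpha, beta)
    persistence = alpha + beta
    total = 0.0
    for k in range(1, K + 1):
        # Each daily forecast reverts geometrically toward the long-run level.
        total += sigma2 + persistence ** (k - 1) * (sigma2_next - sigma2)
    return total


def riskmetrics_kday_variance(sigma2_next, K):
    """RiskMetrics has no mean reversion, so every day gets sigma2_next."""
    return K * sigma2_next


if __name__ == "__main__":
    omega, alpha, beta = 0.00001, 0.10, 0.80  # illustrative values only
    sigma2_next = 0.00005                     # below the long-run variance
    print(garch11_kday_variance(sigma2_next, omega, alpha, beta, K=10))
    print(riskmetrics_kday_variance(sigma2_next, K=10))
```

When tomorrow's variance is below the long-run level, every term in the GARCH sum beyond the first exceeds $\sigma_{t+1}^2$, so the GARCH forecast is the larger one. The ordering reverses when $\sigma_{t+1} > \sigma$.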
Some related models
• GARCH-M model: when the conditional standard deviation enters as a regressor, the model is called a GARCH-in-mean (GARCH-M) model, i.e., $Y_t = a X_t + b \sigma_t + a_t$, where $a_t$ follows a GARCH model. For example, when $Y$ is a return, it may depend on the variability: higher variability leads to higher returns.
• The leverage effect model: a negative return increases variance by more than a positive return of the same magnitude.
Model A: let $I_t = 1$ if day t's return is negative and $I_t = 0$ otherwise, and define
[formula missing in the source]
Model B: E-GARCH model
[formula missing in the source]

Weekend effect
It is well known that days following a weekend or a holiday have higher variance than the average day. We can try the following model:
$\sigma_{t+1}^2 = \omega + \beta \sigma_t^2 + \alpha \sigma_t^2 Z_t^2 + \gamma IT_{t+1}$,
where $IT_{t+1}$ takes the value 1 if day t+1 is, for example, a Monday.

More general EGARCH
[formula missing in the source]

IGARCH model
• A GARCH(p, q) process is called an IGARCH process if $\sum_{j=1}^{p} \alpha_j + \sum_{j=1}^{q} \beta_j = 1$.

How to estimate the parameters in a GARCH model
• MLE: [formula missing in the source]
• Quasi-Maximum Likelihood Estimation (QMLE): [formula missing in the source]

Whittle's estimator
By Theorem B, $R_t^2$ can be written as $R_t^2 = c_0 + \sum_j c_j R_{t-j}^2 + e_t$, where $c_j > 0$. If $\mathrm{Var}(e_t)$ is finite, then the spectral density of the process $\{R_t^2\}$ is
$g(\omega) = \dfrac{\mathrm{Var}(e_t)}{2\pi} \left| 1 - \sum_j c_j \exp(i j \omega) \right|^{-2}$.
Whittle's estimator is obtained by minimizing $\sum_{j=1}^{T-1} I_T(\omega_j) / g(\omega_j)$, where $I_T(\cdot)$ is the periodogram of $\{R_t^2\}$ and $\omega_j = 2 j \pi / T$.
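As an illustration of likelihood-based estimation, the sketch below evaluates the Gaussian (quasi-)log-likelihood of a GARCH(1, 1) model and minimizes it over a coarse grid. Everything here is an assumption made for illustration rather than the method of the slides: the recursion $\sigma_{t+1}^2 = \omega + \alpha R_t^2 + \beta \sigma_t^2$, the start-up variance, the variance-targeting choice $\omega = s^2 (1 - \alpha - \beta)$ with $s^2$ the sample mean of squared returns, and the function names. In practice a numerical optimizer would replace the grid search.

```python
import math


def garch11_neg_loglik(returns, omega, alpha, beta, init_var=None):
    """Negative Gaussian log-likelihood of zero-mean returns under GARCH(1, 1).

    Returns math.inf for parameters outside the finite-variance region.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return math.inf
    if init_var is None:
        init_var = sum(r * r for r in returns) / len(returns)
    if init_var <= 0:
        raise ValueError("initial variance must be positive")
    var = init_var
    nll = 0.0
    for r in returns:
        # Contribution of one observation with conditional variance `var`.
        nll += 0.5 * (math.log(2.0 * math.pi) + math.log(var) + r * r / var)
        # Update the conditional variance for the next observation.
        var = omega + alpha * r * r + beta * var
    return nll


def fit_garch11_grid(returns, step=0.05):
    """Coarse grid search over (alpha, beta) with variance targeting for omega.

    Returns a tuple (neg_loglik, omega, alpha, beta).
    """
    s2 = sum(r * r for r in returns) / len(returns)
    n = int(round(1.0 / step))
    best = (math.inf, None, None, None)
    for i in range(n):
        for j in range(n - i):  # keeps alpha + beta <= 1 - step
            alpha, beta = i * step, j * step
            omega = s2 * (1.0 - alpha - beta)
            nll = garch11_neg_loglik(returns, omega, alpha, beta, init_var=s2)
            if nll < best[0]:
                best = (nll, omega, alpha, beta)
    return best


if __name__ == "__main__":
    sample = [0.5, -1.2, 0.3, 2.0, -0.1, 0.4, -0.8, 1.5]  # made-up data
    print(fit_garch11_grid(sample))
```

If the innovations are not actually normal, maximizing this same objective is what the QMLE bullet refers to: the Gaussian likelihood is used as a criterion even though it is not the true likelihood.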