arXiv:1602.07599v2 [q-fin.RM] 12 Jun 2016

Backtesting Lambda Value at Risk
Jacopo Corbetta
CERMICS, École des Ponts, UPE, Champs sur Marne, France.
and
Ilaria Peri
Department of Finance, Business School, University of Greenwich, London, England
June 14, 2016
Abstract
A new risk measure, the Lambda value at risk (ΛVaR), has recently been proposed as a generalization of the value at risk (VaR). The ΛVaR appears attractive for its potential ability to solve several problems of VaR. The aim of this paper is to provide the first study on the backtesting of ΛVaR. We propose three nonparametric tests which exploit different features. Two of these tests directly assess the correctness of the level of coverage predicted by the model. One of them is bilateral and provides an asymptotic result. The third test assesses the accuracy of a ΛVaR that depends on the choice of the P&L distribution. Finally, we perform a backtesting exercise that confirms the higher performance of ΛVaR, especially when the tail behaviour of the distribution is considered.
Keywords: hypothesis test, estimation risk, risk management
1. Introduction
Risk measurement and its backtesting are matters of primary concern to the financial industry. The value at risk (VaR) measure has become best practice. Despite its popularity, after the recent financial crisis VaR has been extensively criticized by academics and risk managers. Among these criticisms, we recall the inability to capture tail risk and the lack of reactivity to market fluctuations. Thus, the suggestion of the Basel Committee, in the consultative document Fundamental review of the trading book (2013), is to consider alternative risk measures that can overcome VaR's weaknesses.
A new risk measure, the Lambda Value at Risk (ΛVaR), has been introduced from a theoretical point of view by Frittelli et al. (2014). The ΛVaR is a generalization of the VaR at confidence level $\lambda$. Specifically, the ΛVaR considers a function $\Lambda$ instead of a constant confidence level $\lambda$, where $\Lambda$ is a function of the losses. Formally, given a monotone and right-continuous function $\Lambda : \mathbb{R} \to (0, 1)$, the ΛVaR of the asset return $X$ is a map that associates to its cumulative distribution function $F(x) = P(X \le x)$ the number:
$$\Lambda VaR = -\inf\{x \in \mathbb{R} \mid F(x) > \Lambda(x)\}. \tag{1}$$
This new risk measure appears attractive for its potential ability to solve several problems of VaR. First of all, it seems flexible enough to discriminate the risk among return distributions with different tail behaviors, by assigning more risk to heavy-tailed return distributions and less in the opposite case. In addition, ΛVaR may allow a rapid change of the confidence level when market conditions change.
Recently, Hitaj et al. (2015) proposed a methodology for computing ΛVaR. In this study, a first attempt at backtesting was also performed and compared with VaR. Their proposal is based on the hypothesis testing framework of Kupiec (1995). Here, the accuracy of the ΛVaR model is evaluated by considering the following null hypothesis: the relative frequency of exceptions over the backtesting time window does not surpass the maximum of the $\Lambda$ function. However, the actual level of coverage provided by the ΛVaR model is not constant over time and, thus, this method fails to evaluate the ΛVaR performance properly.
The objective of this paper is to propose the first theoretical framework for the backtesting of the ΛVaR. We propose three backtesting methodologies which exploit different features. The first two tests aim to evaluate whether the ΛVaR provides an accurate level of coverage. Here, we check whether the probability that a violation occurs ex post actually coincides with the one predicted by the model. Both tests are based on test statistics whose distribution is obtained by applying results of probability theory. The first test is unilateral and provides more precise results for the usual backtesting time window (i.e. 250 observations). The second test is bilateral and provides an asymptotic result. Thus, the second test is more suitable for larger samples of observations. Compared with the hypothesis test proposed in Hitaj et al. (2015), we consider a null hypothesis which better evaluates the ΛVaR performance and, thus, the advantages introduced by its flexibility.
We propose a third test that is inspired by the approach used by Acerbi and Szekely (2014) for Expected Shortfall backtesting. This test focuses on another aspect: it evaluates whether the correct coverage of risk derives from the fact that the model has been estimated with the correct distribution of the returns. Here, the null and alternative hypotheses change. We propose a test statistic whose distribution is obtained by simulation.
Hence, the first two tests do not directly question whether the model has been estimated using the correct distribution function of the asset returns, but verify whether the $\Lambda$ function has been correctly computed and allows for an actual coverage of the risk. On the other hand, the third test takes $\Lambda$ as correct and questions the impact of the estimation of the P&L distribution on the coverage capacity of ΛVaR.
Finally, we conduct an empirical analysis based on the backtesting of the ΛVaR, calibrated using the same dynamic benchmark approach proposed by Hitaj et al. (2015). The backtesting exercise is performed over six different time windows throughout the global financial crisis (2006-2011).
The paper is structured as follows: Section 2 introduces the backtesting models;
Section 3 describes and shows the results of the empirical analysis.
2. Model
2.1. Notations and definitions
Let us consider a probability space $(\Omega, (\mathcal{F}_t)_T, \mathbb{P}_t)$, where the sigma algebra $\mathcal{F}_t$ represents the information at time $t$. We assume that $X$ is the random variable of the returns of an asset, distributed according to a real (unknowable) distribution $F_t$, i.e. $F_t(x) := \mathbb{P}_t(X_t < x)$, and that it is forecast by a model predictive distribution $P_t$ conditional on previous information, i.e. $P_t(x) = \mathbb{P}_t(X_t \le x \mid \mathcal{F}_{t-1})$.
We can measure the risk of the asset return $X$ using the classical VaR, by attributing to $X$ at time $t$ the following value:
$$VaR_t = -\inf\{x \in \mathbb{R} \mid P_t(x) > \lambda\}. \tag{2}$$
The alternative risk measure proposed by Frittelli et al. (2014), the ΛVaR, attributes to $X$ at time $t$ the following value:
$$\Lambda VaR_t = -\inf\{x \in \mathbb{R} \mid P_t(x) > \Lambda_t(x)\}, \tag{3}$$
where $\Lambda_t$ is a monotone function that maps $x \in \mathbb{R}$ into $(\lambda_m, \lambda_M)$, with $\lambda_m > 0$ and $\lambda_M < 1$.
Hitaj et al. (2015) proposed a method to estimate the $\Lambda$ function that is called the dynamic benchmark approach. The $\Lambda$ function represents a proxy of the tails of the market's P&L distributions. This approach is called dynamic since $\Lambda$ is re-estimated at each time $t$ according to the information at $t - 1$. This feature allows the ΛVaR to be: (1) sensitive to tail risk, in that ΛVaR can discriminate the risk of assets with different tail behavior; (2) reactive to market fluctuations, in that the probability level given by the $\Lambda$ function changes according to the different reactions of the assets to market fluctuations.
The authors proposed six different models to estimate $\Lambda$, but we focus on the linear ΛVaR versions. These models are obtained by linear interpolation of $n$ points $(\pi_i, \lambda_i)$, with $i = 1, 2, \dots, n$, for any $\pi_1 \le x < \pi_n$, and by fixing a lower (upper) bound at $\Lambda(x) = \lambda_1$ for any $x \le \pi_1$ and an upper (lower) bound at $\Lambda(x) = \lambda_n$ for $x \ge \pi_n$ in the increasing (decreasing) case.
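The construction of a linear $\Lambda$ and the resulting ΛVaR in (3) can be illustrated with a minimal Python sketch. It is not the implementation of Hitaj et al. (2015): the function names `make_linear_lambda` and `lambda_var`, the use of an empirical CDF in place of $P_t$, the grid search for the infimum and the toy data are all assumptions made for illustration.

```python
from bisect import bisect_right


def make_linear_lambda(pis, lams):
    """Piecewise-linear Lambda through (pi_i, lambda_i), constant outside [pi_1, pi_n]."""
    def lam(x):
        if x <= pis[0]:
            return lams[0]
        if x >= pis[-1]:
            return lams[-1]
        i = bisect_right(pis, x) - 1          # pis[i] <= x < pis[i + 1]
        w = (x - pis[i]) / (pis[i + 1] - pis[i])
        return lams[i] + w * (lams[i + 1] - lams[i])
    return lam


def lambda_var(sample, lam, grid_size=2001):
    """Grid approximation of -inf{x : F(x) > Lambda(x)}, F = empirical CDF of `sample`."""
    xs = sorted(sample)
    n = len(xs)
    lo, hi = xs[0], xs[-1]
    step = (hi - lo) / (grid_size - 1)
    # Candidate points: every observation (where F jumps) plus a regular grid in between.
    grid = sorted(set(xs) | {lo + k * step for k in range(grid_size)})
    for x in grid:
        if bisect_right(xs, x) / n > lam(x):  # F(x) > Lambda(x)
            return -x
    return -hi  # not reached when Lambda < 1, since F(hi) = 1


# Toy data (illustrative only): 100 equally spaced returns from -5.0% to +4.9%.
returns = [k / 1000 for k in range(-50, 50)]
flat = make_linear_lambda([-0.05, -0.04], [0.05, 0.05])        # constant Lambda: plain 5% VaR
increasing = make_linear_lambda([-0.05, -0.04], [0.01, 0.05])  # increasing Lambda
var_5 = lambda_var(returns, flat)
lvar = lambda_var(returns, increasing)
```

With a constant $\Lambda$ the routine reduces to the empirical VaR, and with a $\Lambda$ bounded above by the same level the ΛVaR cannot be smaller than that VaR. For a decreasing $\Lambda$ the crossing may fall between two observations, which is why the sketch also scans a regular grid; the result is then only as precise as the grid.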
Hitaj et al. (2015) chose 4 points ($n = 4$). On the probability axis, they fixed the $\Lambda$ minimum $\lambda_m = 0.001$, the maximum $\lambda_M = 0.01$ and the other $\lambda_i$ values, with $i = 2, 3$, by an equipartition of the interval $(0, \lambda_M]$. On the losses axis, they fixed the 4 points $\pi_i$ on the basis of $n$ order statistics of the P&L distribution of some selected market benchmarks. Specifically, $\pi_1$ is equal to the minimum of all the benchmark returns: $\pi_1 = \min x_{t,j}$, where $x_{t,j}$ is the realized return of the $j$-th benchmark, with $j = 1, \dots, B$, where $B$ is the number of benchmarks, and $t = 1, \dots, T$, where $T$ is the time horizon (i.e. the number of days in the rolling window); $\pi_2$, $\pi_3$, and $\pi_4$ are equal to the maximum, mean, and minimum of the benchmark's $\lambda\%$-VaR, respectively.
2.2. Backtesting models
Let us denote by $x_t$ the realization of the asset return $X$ at time $t$. In order to perform the backtesting of a risk measure, we need to construct the sequence of random variables representing the violations, $\{I_t\}_{t=1}^{T}$, across $T$ days, as follows:
$$I_t = \begin{cases} 1 & \text{if } x_t < y_t \\ 0 & \text{otherwise} \end{cases}$$
where $y_t$ is the return forecast by the risk measure. The hit sequence is equal to 1 on day $t$ if the realized return on that day, $x_t$, is smaller than the value $y_t$ predicted by the risk measure at time $t - 1$ for day $t$, i.e. the return level corresponding to $\Lambda VaR_t$ or $VaR_t$. If $y_t$ is not exceeded (or violated), then the hit sequence returns a 0. We assume that the violations $I_t$ occur independently. We observe that $I_t$ is a random variable that follows a Bernoulli distribution, that is:
$$I_t \sim B(\lambda_t) \tag{4}$$
where $\lambda_t$ is the probability of having an exception at time $t$.
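As a minimal illustration of the hit sequence, the following Python sketch builds $\{I_t\}$ from realized returns and forecast thresholds. The function name `hit_sequence` and the numbers are hypothetical; `thresholds` stands for the return levels $y_t$ implied by the risk measure.

```python
def hit_sequence(realized, thresholds):
    """I_t = 1 if the realized return x_t falls below the forecast threshold y_t, else 0."""
    if len(realized) != len(thresholds):
        raise ValueError("realized and thresholds must have the same length")
    return [1 if x < y else 0 for x, y in zip(realized, thresholds)]


# Toy example (illustrative numbers only).
realized = [-0.012, 0.004, -0.031, 0.010, -0.020]
thresholds = [-0.025, -0.025, -0.028, -0.028, -0.020]
hits = hit_sequence(realized, thresholds)  # [0, 0, 1, 0, 0]
```

The inequality is strict, as in the definition above, so a return exactly equal to the threshold (the last day in the example) is not counted as a violation.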
The first test proposed for the backtesting of VaR is due to Kupiec (1995), where the author considers the following null and alternative hypotheses:
$$\begin{aligned} H_0^K &: \lambda \le (=)\, \lambda_0 \\ H_1^K &: \lambda > \lambda_0 \end{aligned} \tag{5}$$
Hitaj et al. (2015) proposed a backtesting method by adapting the classical Kupiec test for VaR to the ΛVaR. They consider the following null and alternative hypotheses:
$$\begin{aligned} H_0^K &: \lambda \le \max(\Lambda) \\ H_1^K &: \lambda > \max(\Lambda) \end{aligned} \tag{6}$$
In substance, the ΛVaR is accepted if the relative frequency of the $n$ exceptions over the time horizon $T$, $\lambda := n/T$, is less than or equal to the maximum of the $\Lambda$ function, $\max(\Lambda)$. This is a unilateral hypothesis test that can be conducted by using the same log-likelihood ratio and critical value as the VaR test. This approach allows for testing whether the objective of having less than 1% of violations has been reached; however, it does not allow the accuracy of the true ΛVaR to be tested properly.
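For reference, the standard proportion-of-failures likelihood ratio of Kupiec can be sketched as follows. The text does not spell out the statistic or the confidence level used by Hitaj et al. (2015), so the chi-square critical value at 95% and the one-sided decision rule in `kupiec_type_rejects` are assumptions made for this illustration, and all function names are hypothetical.

```python
from math import log

CHI2_1_95 = 3.841  # 95% quantile of the chi-square distribution with 1 degree of freedom


def _xlogy(x, y):
    """x * log(y) with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(y)


def kupiec_pof_lr(n_exceptions, horizon, p):
    """Kupiec proportion-of-failures likelihood ratio for n exceptions in `horizon` days."""
    n, T = n_exceptions, horizon
    rate = n / T
    log_l0 = _xlogy(T - n, 1.0 - p) + _xlogy(n, p)        # log-likelihood under H0
    log_l1 = _xlogy(T - n, 1.0 - rate) + _xlogy(n, rate)  # log-likelihood at the observed rate
    return -2.0 * (log_l0 - log_l1)


def kupiec_type_rejects(n_exceptions, horizon, max_lambda, critical=CHI2_1_95):
    """Assumed one-sided use: reject only when the observed rate exceeds max(Lambda)."""
    rate = n_exceptions / horizon
    return rate > max_lambda and kupiec_pof_lr(n_exceptions, horizon, max_lambda) > critical
```

Because the benchmark probability is the single constant $\max(\Lambda)$, the statistic depends on the data only through the total number of exceptions, which is the limitation discussed in the text.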
Indeed, if the ΛVaR model is correct, at time $t$ we should expect the hit sequence to take the value 1 with probability
$$\lambda_t^0 = P_t(X_t < -\Lambda VaR_t) \tag{7}$$
and 0 with probability $1 - \lambda_t^0$. In particular, if $X_t$ admits a density under $P_t$ and both the function $\Lambda$ and $P_t$ are continuous, then $\lambda_t^0 = \Lambda_t(-\Lambda VaR_t)$. In this case, the probability of a violation depends on the function $\Lambda_t$. From these considerations, it follows that the random variables $I_t$ of the violations are not identically distributed, which implies that the usual likelihood backtesting frameworks (POF by Kupiec (1995), TUFF by Christoffersen (2010), etc.) cannot be applied directly.
Hence, if the ΛVaR is correct, that is, if the model probability is correct, under $H_0$ we have that:
$$H_0 : \lambda_t = \lambda_t^0 \text{ for any } t \tag{8}$$
In the case of a bilateral test, the alternative hypothesis should be set as follows:
$$H_1 : \lambda_t \ne \lambda_t^0 \text{ for some } t \tag{9}$$
A unilateral test, instead, should be conducted by setting the following alternative hypothesis:
$$H_1 : \lambda_t > \lambda_t^0 \text{ for some } t \text{ and equal otherwise} \tag{10}$$
where $H_1$ is chosen to be only in the direction of risk under-estimation.
In order to test the accuracy of the ΛVaR model, we propose two hypothesis tests of unconditional coverage of ΛVaR. Using theoretical results of probability theory, we can evaluate with a sufficient level of precision whether the ΛVaR guarantees the level of coverage predicted by the parameter $\lambda_t^0$. In particular, the second test provides an asymptotic result, hence it gives the best results for large samples (i.e. a time horizon larger than 500). In this way, we are able to detect the correctness of the ΛVaR better than Hitaj et al. (2015). Notice that a rejection of their null hypothesis implies a rejection of ours.
We also propose a third test that is useful to check whether the ΛVaR allows for an appropriate coverage of risk by having been estimated with the correct distribution function, $P_t$. This test does not ask whether the level of coverage predicted by the model has been reached; thus, it does not question the correctness of the $\Lambda$ function. Hence, in this case, the null and alternative hypotheses are:
$$\begin{aligned} H_0 &: \Lambda VaR(F_t) = \Lambda VaR(P_t) \text{ for every } t \\ H_1 &: \Lambda VaR(F_t) > \Lambda VaR(P_t) \text{ for some } t \text{ and equal otherwise} \end{aligned} \tag{11}$$
Here, the correctness of the null hypothesis is evaluated by a simulation exercise.
2.2.1 Test 1: Test of coverage
We set the null and the alternative hypothesis as in (8) and (10), respectively. We construct this first test by defining the test statistic $Z_1$ as the number of violations over the time horizon $T$:
$$Z_1 := \sum_{t=1}^{T} I_t \tag{12}$$
The distribution of $Z_1$ is obtained by applying classical results of probability theory. Since the violations $I_t$ occur independently, and the sum of independent Bernoulli random variables with different means $\lambda_t$ follows a Poisson Binomial distribution, under $H_0$ we have:
$$Z_1 \sim \text{Poiss.Bin}(\{\lambda_t^0\}).$$
This test is in principle a bilateral test, with critical region $C = \{z_1 : z_1 < q_{Z_1}(\alpha/2)\} \cup \{z_1 : z_1 \ge q_{Z_1}(1 - \alpha/2)\}$. However, when $T$ corresponds to the usual time horizon (i.e. 250 days), the probability that $Z_1 < q_{Z_1}(\alpha/2)$ is null. In backtesting practice, we propose to treat this test as unilateral, where the critical region is given by:
$$C_{Z_1} = \{z_1 : z_1 \ge q_{Z_1}(1 - \alpha)\} = \{z_1 : P_{Z_1}(z_1) > 1 - \alpha\}$$
where $\alpha$ denotes the significance level of the test (i.e. the type I error) and $q_{Z_1}$ denotes the quantile of the distribution of $Z_1$ under $H_0$, i.e. $P_{Z_1}$.
In our empirical analysis we fix $\alpha = 10\%$ and we compare the result with VaR. For the VaR model, under $H_0$ we have:
$$Z_1 \sim B(T, \lambda_0).$$
This corresponds to the traffic light approach of the Basel Committee with two bands instead of three.
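Test 1 can be sketched in a few lines of Python. The Poisson Binomial distribution is computed here by sequential convolution of the Bernoulli terms, which is one possible way to obtain it and not necessarily the one used by the authors; the function names are hypothetical and the quantile is taken as the smallest count whose cumulative probability reaches the requested level.

```python
def poisson_binomial_pmf(probs):
    """pmf of a sum of independent Bernoulli(p_t) variables, built one term at a time."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # no violation on this day
            nxt[k + 1] += mass * p       # violation on this day
        pmf = nxt
    return pmf


def quantile(pmf, level):
    """Smallest k such that P(Z <= k) >= level."""
    cum = 0.0
    for k, mass in enumerate(pmf):
        cum += mass
        if cum >= level:
            return k
    return len(pmf) - 1


def coverage_test(hits, lam0, alpha=0.10):
    """One-sided Test 1: reject H0 when Z1 = sum(hits) >= q_{Z1}(1 - alpha)."""
    z1 = sum(hits)
    critical = quantile(poisson_binomial_pmf(lam0), 1.0 - alpha)
    return z1 >= critical, z1, critical


# VaR special case: constant probability 1% over T = 250 days (Binomial(250, 0.01)).
lam0 = [0.01] * 250
rejected_4, _, critical = coverage_test([1] * 4 + [0] * 246, lam0)
rejected_5, _, _ = coverage_test([1] * 5 + [0] * 245, lam0)
```

In the constant-probability case with $T = 250$, $\lambda_0 = 1\%$ and $\alpha = 10\%$, the critical value is 5, so four violations are accepted and five are rejected. For the ΛVaR, `lam0` would hold the day-by-day probabilities $\lambda_t^0$ instead of a constant.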
2.2.2 Test 2: Asymptotic test of coverage
We propose a second test that is founded on a result of probability theory known as the Lyapunov theorem. This theorem, which we recall below, applies the central limit theorem to random variables that are independent but not identically distributed (see Lyapunov (1954)).
Theorem 1 (Lyapunov) Suppose $X_1, X_2, \dots$ is a sequence of independent random variables, each with finite expected value $\mu_t$ and variance $\sigma_t^2$. Define
$$s_T^2 = \sum_{t=1}^{T} \sigma_t^2.$$
If for some $\delta > 0$ the “Lyapunov’s condition”
$$\lim_{T \to \infty} \frac{1}{s_T^{2+\delta}} \sum_{t=1}^{T} E\left[|X_t - \mu_t|^{2+\delta}\right] = 0$$
is satisfied, then the following convergence in distribution holds as $T$ goes to infinity:
$$\frac{1}{s_T} \sum_{t=1}^{T} (X_t - \mu_t) \xrightarrow{d} N(0, 1).$$
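In the following lemma we show that the “Lyapunov’s condition” is satisfied when $s_T^2 = \sum_{t=1}^{T} \lambda_t(1 - \lambda_t)$ and $\mu_t = \lambda_t$.
Lemma 2 If $\{I_t\}$ is a sequence of independent random variables distributed as a Bernoulli with parameters $\{\lambda_t\}_t$ and $\inf_t \lambda_t = \lambda_m > 0$, then
$$\lim_{T \to \infty} \frac{1}{s_T^{2+\delta}} \sum_{t=1}^{T} E\left[|I_t - \lambda_t|^{2+\delta}\right] = 0$$
with $s_T^2 = \sum_{t=1}^{T} \lambda_t(1 - \lambda_t)$.
Proof. We observe that:
$$E\left[|I_t - \lambda_t|^{2+\delta}\right] = (1 - \lambda_t)\lambda_t^{2+\delta} + \lambda_t(1 - \lambda_t)^{2+\delta} = \lambda_t(1 - \lambda_t)\left(\lambda_t^{1+\delta} + (1 - \lambda_t)^{1+\delta}\right) \le \lambda_t(1 - \lambda_t) \le \frac{1}{4}.$$
On the other hand we have
$$s_T^{2+\delta} = \left(\sum_{t=1}^{T} \lambda_t(1 - \lambda_t)\right)^{1+\frac{\delta}{2}} \ge \left(\sum_{t=1}^{T} \lambda_m(1 - \lambda_m)\right)^{1+\frac{\delta}{2}} = \left(T\lambda_m(1 - \lambda_m)\right)^{1+\frac{\delta}{2}}.$$
We can thus conclude that
$$\frac{\sum_{t=1}^{T} E\left[|I_t - \lambda_t|^{2+\delta}\right]}{s_T^{2+\delta}} \le \frac{T}{4\left(T\lambda_m(1 - \lambda_m)\right)^{1+\frac{\delta}{2}}} \to 0$$
as $T \to \infty$.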
We set the null and the alternative hypothesis as in (8) and (9), respectively. Thus, we can build the following test statistic, which under $H_0$ is defined as
$$Z_2 := \frac{\sum_{t=1}^{T} (I_t - \lambda_t^0)}{\sqrt{\sum_{t=1}^{T} \lambda_t^0 (1 - \lambda_t^0)}}$$
and is asymptotically distributed as a Standard Normal. Formally:
$$Z_2 \xrightarrow{d} N(0, 1).$$
This result follows from the application of Lemma 2 and the Lyapunov theorem.
We remark that this is a bilateral test. Thus, we reject the hypothesis $H_0$ if the realization $z_2$ of the test statistic lies in the following critical region:
$$C_{Z_2} := \left\{z_2 : z_2(x) < q_{Z_2}\left(\frac{\alpha}{2}\right)\right\} \cup \left\{z_2 : z_2(x) > q_{Z_2}\left(1 - \frac{\alpha}{2}\right)\right\}$$
where $\alpha$ is the significance level of the test, and $q_{Z_2}$ is the quantile function of the Standard Normal distribution $P_{Z_2}$.
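A minimal Python sketch of Test 2 follows. The function name `asymptotic_coverage_test` is hypothetical; the two-sided critical region is written as $|z_2| > q_{Z_2}(1 - \alpha/2)$, which is equivalent to the region above by the symmetry of the Standard Normal.

```python
from math import sqrt
from statistics import NormalDist


def asymptotic_coverage_test(hits, lam0, alpha=0.10):
    """Two-sided Test 2: Z2 = sum(I_t - lam0_t) / sqrt(sum(lam0_t * (1 - lam0_t)))."""
    numerator = sum(i - l for i, l in zip(hits, lam0))
    denominator = sqrt(sum(l * (1.0 - l) for l in lam0))
    z2 = numerator / denominator
    q = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 quantile of N(0, 1)
    return abs(z2) > q, z2


# Toy check with a constant 1% probability over 250 days (illustrative only).
lam0 = [0.01] * 250
rejected_5, z_5 = asymptotic_coverage_test([1] * 5 + [0] * 245, lam0)
rejected_6, z_6 = asymptotic_coverage_test([1] * 6 + [0] * 244, lam0)
```

The statistic only needs the hit sequence and the model probabilities $\lambda_t^0$, so it is cheap to compute, but its Normal approximation is only justified for large $T$.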
2.2.3 Test 3: Test of P&L correct estimation
The third test is inspired by Acerbi and Szekely (2014) and focuses on another aspect. Here, we do not merely test whether the probability $\lambda$ that a violation occurs is the one provided by the model, $\lambda^0$, since we take the $\Lambda$ function as correct. The objective of this test is to verify whether the ΛVaR guarantees the correct coverage of the risk because it has been estimated under the correct assumption on the distribution $P_t$ of the returns.
Hence, the accuracy of the model can be checked by setting the null and alternative hypotheses as in (11). However, under $H_0$ the distribution of $X_t$ should be equal to $P_t$; hence these hypotheses imply that a unilateral test can also be conducted by testing the correctness of the assumption on the asset return distribution, with the following hypotheses:
$$\begin{aligned} H_0' &: F_t = P_t \text{ for every } t \\ H_1' &: F_t > P_t \text{ for some } t \text{ and equal otherwise} \end{aligned} \tag{13}$$
The model must be rejected if it is computed under a distribution $P_t$ that under-estimates the correct distribution $F_t$. In the empirical exercise, we have chosen the hypotheses in (13) because the weaker hypotheses in (11) would not have been sufficient to simulate the test statistic and compute the p-value.
We define the $Z_3$ test statistic, which under $H_0$ is given by:
$$Z_3 := \frac{1}{T}\sum_{t=1}^{T} (\lambda_t^0 - I_t) = \frac{1}{T}\sum_{t=1}^{T} \lambda_t^0 - \frac{1}{T}\sum_{t=1}^{T} I_t \tag{14}$$
We observe that under $H_0$ we have $E[Z_3(X)] = 0$, while under $H_1$ we have $E[Z_3(X)] < 0$ for the ΛVaR. So the realized value $Z_3(x)$ is expected to be zero, and a negative value signals that the model estimation does not allow the risk to be covered.
Proposition 3 Under the test hypotheses $H_0'$ and $H_1'$ we have:
1. $E_{H_0'}[Z_3] = 0$
2. $E_{H_1'}[Z_3] < 0$.
Proof. It is enough to notice that under $H_0'$, $I_t \sim B(\lambda_t^0)$, so that $E_{H_0'}[I_t - \lambda_t^0] = 0$, which implies
$$E_{H_0'}[Z_3] = \frac{1}{T} \sum_{t=1}^{T} E_{H_0'}[\lambda_t^0 - I_t] = 0.$$
In a similar way, under $H_1'$, since $I_t \sim B(\lambda_t)$ with $\lambda_t > \lambda_t^0$, we obtain that $E_{H_1'}[Z_3] < 0$.
Notice that the violations $I_t$ depend on $X_t$; hence, under $H_0$ the distribution of $Z_3$ depends on the assumption made for the distribution $P_t$ of the asset returns. Therefore, in order to perform the test, it is necessary to simulate $M$ scenarios from the distribution $P_t$ of the returns at each time $t$, with $t = 1, \dots, T$. In this way, we obtain at time $T$ the distribution $P_{Z_3}$ of the test statistic under $H_0$. In order to construct the critical region we need to study the behavior of the $Z_3$ distribution when the distribution of the returns changes from $P$ to $F$. Let us compute $P_{Z_3}$:
$$\begin{aligned} P_{Z_3}(z) = P(Z_3 \le z) &= P\left(\frac{1}{T}\sum_{t=1}^{T}(\lambda_t^0 - I_t) \le z\right) \\ &= P\left(\sum_{t=1}^{T}(-I_t) \le zT - \sum_{t=1}^{T}\lambda_t^0\right) \\ &= P\left(\sum_{t=1}^{T} I_t \ge -zT + \sum_{t=1}^{T}\lambda_t^0\right) \end{aligned}$$
where $\sum_{t=1}^{T} I_t$ is distributed as a Poisson Binomial of parameter $\{\lambda_t\}$. We observe that $P_{Z_3}$ is an increasing function of $\{\lambda_t\}$ (i.e. the CDF of $Z_3$ shifts to the left when $\lambda$ increases). As a consequence, given a significance level $\alpha$, we reject when the p-value $p = P_{Z_3}(z)$ is smaller than $\alpha$.
In the empirical analysis we conduct $M = 10000$ simulations using the same assumptions on the returns' distributions as for the computation of the risk measures. We set the significance level of the test $\alpha$ at 10%.
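The simulation step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the names `z3_statistic` and `simulated_p_value` are hypothetical, each predictive distribution $P_t$ is represented by a sampler function, `thresholds` holds the return levels $-\Lambda VaR_t$, and the toy example uses the same Normal distribution on every day.

```python
import random
from statistics import NormalDist


def z3_statistic(hits, lam0):
    """Z3 = (1/T) * sum(lam0_t - I_t)."""
    return sum(l - i for l, i in zip(lam0, hits)) / len(hits)


def simulated_p_value(realized, thresholds, lam0, samplers, n_sims=10000, seed=0):
    """p = P_{Z3}(z_obs), estimated by drawing returns from the model distributions P_t."""
    rng = random.Random(seed)
    observed_hits = [1 if x < y else 0 for x, y in zip(realized, thresholds)]
    z_obs = z3_statistic(observed_hits, lam0)
    at_or_below = 0
    for _ in range(n_sims):
        # One scenario: a return for every day t, drawn from P_t, turned into violations.
        sim_hits = [1 if draw(rng) < y else 0 for draw, y in zip(samplers, thresholds)]
        if z3_statistic(sim_hits, lam0) <= z_obs:
            at_or_below += 1
    return at_or_below / n_sims, z_obs


# Toy setup (assumption): P_t = N(0, 1%) on each of T = 250 days, threshold at its 1% quantile.
T = 250
threshold = NormalDist(0.0, 0.01).inv_cdf(0.01)
samplers = [lambda rng: rng.gauss(0.0, 0.01)] * T
thresholds, lam0 = [threshold] * T, [0.01] * T

calm = [0.0] * T                              # no violations
stressed = [-0.05] * 15 + [0.0] * (T - 15)    # 15 violations
p_calm, _ = simulated_p_value(calm, thresholds, lam0, samplers, n_sims=500)
p_stressed, _ = simulated_p_value(stressed, thresholds, lam0, samplers, n_sims=500)
```

Following the rejection rule above, the model would be rejected at $\alpha = 10\%$ when the returned p-value is below 0.10. Storing one sampler (or one simulated sample) per day is what makes the test demanding in terms of stored information.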
This test makes it possible to verify how the choice of the P&L distribution function influences the level of risk coverage of the ΛVaR, which is not directly assessed by Test 1 and Test 2. Hence, the best use of Test 3 is to compare the results of ΛVaR models of the same kind but estimated with different assumptions on the P&L distribution (i.e. historical, Monte Carlo Normal, GARCH, etc.).
The limitation of this test is that it requires massive storage of information, since at time $T$ we need all the predictive distributions $P_t$ of the returns for $t = 1, \dots, T$.
3. Empirical analysis
In this section, we provide an empirical analysis of the backtesting methods for the ΛVaR that we defined in Section 2.2. We applied our tests to a slightly different version of the 1%-ΛVaR models proposed in Hitaj et al. (2015) and to the 1%-VaR model. We compare our backtesting results with the Kupiec-type test proposed in Hitaj et al. (2015) for the ΛVaR and with the classical Kupiec test for VaR.
We refer to the same dataset as in Hitaj et al. (2015), consisting of daily data for 12 stocks quoted in different countries over different time windows throughout the global financial crisis (specifically, from January 2005 to December 2011). These comprise the
stocks of Citigroup Inc. (C UN Equity) and Microsoft Corporation (MSFT UW Equity)
for the United States, Royal Bank of Scotland Group PLC (RBS LN Equity) and Unilever
PLC (ULVR LN Equity) for the United Kingdom, Volkswagen AG (VOW3 GY Equity)
and Deutsche Bank AG (DBK GY Equity) for Germany, Total SA (FP FP Equity) and
BNP Paribas SA (BNP FP Equity) for France, Banco Santander SA (SAN SQ Equity)
and Telefonica SA (TEF SQ Equity) for Spain, and Intesa Sanpaolo SPA (ISP IM Equity)
and Enel SPA (ENEL IM Equity) for Italy.
The computation of the risk measures is based on the assumption of historical and Normal distributions of the asset returns. In order to add robustness to the analysis, we also implement GARCH models with Student's t increments. The estimation of the parameters is based on 250 days of observations for the historical and Normal assumptions, while 500 days are considered for the GARCH model.
The backtesting exercise is conducted by comparing the realized ex-post daily P&L with the daily VaR and ΛVaR estimates of the 12 stocks over a time period of 1 year. In particular, we split the analysis into six different 2-year rolling windows (250 days for the risk measure computation and 1 year for the backtesting).
3.1. Results
3.1.1 The violations and Kupiec test
We first report the results on the violations and the Kupiec test for the VaR model, and on the Kupiec-type test adapted by Hitaj et al. (2015) for the ΛVaR model. We compute the average number of violations and the acceptance rate over all the assets and the different time horizons $T$. The results presented here are under the assumption of a historical distribution of the asset returns.
                       Average number of violations                 Kupiec test (acceptance rate)
Model                  2006   2007   2008   2009   2010   2011     2006   2007   2008   2009   2010   2011
VaR 1%                 3.42   5.33  11.58   0.75   3.08   6.83     100%    83%     0%   100%    92%    50%
ΛVaR 1% (decr)
  (VaR 5%)             2.25   3.67      7   0.67      2   4.25     100%    83%    42%   100%   100%    83%
  (VaR 1%)             2.17   2.33   5.75   0.67   1.58      4     100%    83%    67%   100%   100%    83%
  aggregate            2.21   3.00   6.38   0.67   1.79   4.13     100%    83%    54%   100%   100%    83%
ΛVaR 1% (incr)
  (VaR 5%)             1.17      1   3.92   0.42   0.92   2.75     100%   100%   100%   100%   100%   100%
  (VaR 1%)             1.17   1.08   3.92   0.42      1   2.75     100%   100%   100%   100%   100%   100%
  aggregate            1.17   1.04   3.92   0.42   0.96   2.75     100%   100%   100%   100%   100%   100%
Table 1. Time evolution of the average number of violations and the Kupiec test under the historical distribution assumption. The table shows the evolution over the global financial crisis of the average number of violations and the percentage of Kupiec acceptance, aggregated at the level of the 1% VaR, as well as the increasing and decreasing ΛVaR models.
As expected, and as already pointed out in Hitaj et al. (2015), the average number of violations of the 1% VaR is larger than that of the ΛVaR, in particular if we compare it with the increasing models. In fact, the 1% VaR model shows a drastic increase in the average number of violations, moving from 3.42 in 2006 to 11.58 in 2008. On the other hand, the increasing ΛVaR models register an average number of violations of around 1.17 during 2006 and keep it at around 3.92 in the 2008 crisis.
This result was expected, since the $\Lambda$ function has been built with $\max_x \Lambda_t(x) = 0.01$, which implies that the ΛVaR is always greater than or equal to the 1% VaR, so that a loss not covered by the former is also not covered by the latter. This implies that the ΛVaR always performs better than the 1% VaR under a unilateral Kupiec-type test, since this kind of test does not capture the variability of the $\Lambda$ function, which is the essential feature of the ΛVaR.
The number of infractions can be seen as an index of how fast the different models respond to external events. During 2009, both the VaR and ΛVaR models quickly incorporate the effects of the crisis, significantly decreasing the number of violations.
The trend in violations is the same under the other two distributional assumptions examined, as shown in the following table.
                       Gaussian                                      GARCH
Model                  2006   2007   2008   2009   2010   2011     2006   2007   2008   2009   2010   2011
VaR 1%                 4.58   7.08  14.92   1.75   4.17   9.42     3.17   6.83   8.25   0.33   0.75   4.33
ΛVaR 1% (decr)
  (VaR 5%)             4.42   6.75  14.25   1.58   3.75   9.17     3.08   5.83   7.33   0.33   0.42   4.25
  (VaR 1%)             4.25   5.83  13.08   1.42   3.42   8.58     2.75   4.75   6.42   0.25   0.33   3.92
  aggregate            4.33   6.29  13.67   1.50   3.58   8.88     2.92   5.29   6.88   0.29   0.38   4.08
ΛVaR 1% (incr)
  (VaR 5%)             3.33   4.75  10.83   0.92   2.75   6.67     1.25   2.67   3.58   0.00   0.17   1.42
  (VaR 1%)             3.33   5.08  11.67   1.17   3.00   7.00     1.25   2.83   3.50   0.00   0.33   1.42
  aggregate            3.33   4.92  11.25   1.04   2.88   6.83     1.25   2.75   3.54   0.00   0.25   1.42
Table 2. Time evolution of the average number of violations under the Gaussian and GARCH model. The table shows the evolution over the global financial crisis of the average number of violations aggregated at the level of the 1% VaR, as well as the increasing and decreasing ΛVaR models.
3.1.2 Test 1 and Test 2: comparison of the level of coverage among VaR and ΛVaR
In Tables 3 and 4 we show the results of the tests of coverage that we proposed in Section 2.2 for the ΛVaR model. The results presented here are under the assumption of a historical, Gaussian or GARCH distribution of the asset returns.
                      Historical                            Gaussian                              GARCH
Model                 2006  2007  2008  2009  2010  2011    2006  2007  2008  2009  2010  2011    2006  2007  2008  2009  2010  2011
VaR 1%                100%   58%    0%  100%   75%   25%     58%   33%    0%   92%   50%    8%     75%   50%   33%  100%  100%   67%
ΛVaR 1% (decr)
  (VaR 5%)            100%   75%    8%  100%   92%   67%     42%    8%    0%   83%   50%    8%     75%   50%   33%  100%  100%   67%
  (VaR 1%)             92%   83%   25%  100%  100%   67%     33%   25%    0%   92%   42%    8%     67%   67%   33%  100%  100%   67%
  aggregate            96%   79%   17%  100%   96%   67%     38%   17%    0%   88%   46%    8%     71%   58%   33%  100%  100%   67%
ΛVaR 1% (incr)
  (VaR 5%)             75%   83%    0%  100%   83%   17%      0%    0%    0%   42%   33%    8%     67%   58%   25%  100%   92%   58%
  (VaR 1%)             75%   83%    0%  100%   75%   17%      8%    8%    0%   42%   42%    8%     75%   50%   25%  100%   92%   58%
  aggregate            75%   83%    0%  100%   79%   17%      4%    4%    0%   42%   38%    8%     71%   54%   25%  100%   92%   58%
Table 3. Time evolution of Test 1 for the ΛVaR models under different assumptions on the P&L distribution. The table shows the evolution over the global financial crisis of the acceptance rates, aggregated at the level of the ΛVaR models ($\min_x \Lambda(x) = 0.5\%$) calculated using the historical, normal and GARCH assumption of the P&L distribution.
                      Historical                            Gaussian                              GARCH
Model                 2006  2007  2008  2009  2010  2011    2006  2007  2008  2009  2010  2011    2006  2007  2008  2009  2010  2011
VaR 1%                100%   75%    0%  100%   92%   42%     58%   42%    0%  100%   67%   25%     83%   58%   42%  100%  100%   67%
ΛVaR 1% (decr)
  (VaR 5%)            100%   83%   17%  100%  100%   75%     58%   42%    0%  100%   67%   17%     83%   58%   33%  100%  100%   67%
  (VaR 1%)            100%   83%   42%  100%  100%   83%     50%   50%    0%  100%   67%   17%     92%   75%   42%  100%  100%   75%
  aggregate           100%   83%   29%  100%  100%   79%     54%   46%    0%  100%   67%   17%     88%   67%   38%  100%  100%   71%
ΛVaR 1% (incr)
  (VaR 5%)            100%  100%   17%  100%   92%   42%     17%   25%    0%   92%   50%    8%     92%   75%   67%  100%  100%   83%
  (VaR 1%)            100%  100%   17%  100%   92%   42%     25%   33%    0%   83%   58%   25%     92%   67%   67%  100%   92%   83%
  aggregate           100%  100%   17%  100%   92%   42%     21%   29%    0%   88%   54%   17%     92%   71%   67%  100%   96%   83%
Table 4. Time evolution of Test 2 for the ΛVaR models under different assumptions on the P&L distribution. The table shows the evolution over the global financial crisis of the acceptance rates, aggregated at the level of the ΛVaR models ($\min_x \Lambda(x) = 0.5\%$) calculated using the historical, normal and GARCH assumption of the P&L distribution.
We first notice that the acceptance rate of the tests we propose is lower than that of the unilateral Kupiec-POF test in Hitaj et al. (2015). This is due to the particular construction of the Kupiec-POF test. By imposing a fixed parameter $\lambda_0$ equal to $\max(\Lambda)$, this test is not able to capture the daily variation of the coverage level $\lambda_t^0 = P_t(X_t < -\Lambda VaR_t)$ given by the ΛVaR model. For this reason, this test is useful to assess whether the ΛVaR model guarantees a maximum level of accepted coverage, but it cannot be used to check the accuracy of the real coverage level, $\lambda_t^0$, offered by the ΛVaR model.
On the other hand, the coverage tests that we propose are better able to evaluate whether the flexibility introduced by the $\Lambda$ function helps to detect adverse scenarios and to set aside a more adequate amount of capital.
For all the models, the asymptotic coverage test (Test 2) yields higher acceptance rates than the coverage test (Test 1). This is due to the fact that the coverage test provides more precise results with a smaller number of observations.
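The difference between the two tests can be made concrete in the simplest setting, which is only an illustration and not one of the ΛVaR models of the empirical exercise: a constant violation probability of 1% over $T = 250$ days with $\alpha = 10\%$. The Python sketch below computes, under these assumptions, the smallest number of violations that each test rejects.

```python
from math import comb, sqrt
from statistics import NormalDist

T, lam, alpha = 250, 0.01, 0.10

# Test 1 (exact, one-sided): smallest count k with P(Z1 <= k) >= 1 - alpha.
cdf, k = 0.0, 0
while True:
    cdf += comb(T, k) * lam**k * (1 - lam) ** (T - k)  # Binomial(T, lam) pmf at k
    if cdf >= 1 - alpha:
        break
    k += 1
exact_reject_from = k

# Test 2 (asymptotic, two-sided): smallest count whose Z2 exceeds the upper Normal quantile.
q = NormalDist().inv_cdf(1 - alpha / 2)
mean, sd = T * lam, sqrt(T * lam * (1 - lam))
asymptotic_reject_from = next(n for n in range(T + 1) if (n - mean) / sd > q)

# With zero violations Z2 is still above the lower quantile, so the lower tail never rejects.
lower_tail_reachable = (0 - mean) / sd < -q
```

Under these assumptions the exact test rejects from 5 violations and the asymptotic test from 6, so a year with exactly 5 violations is rejected by Test 1 and accepted by Test 2. This is consistent with the higher acceptance rates of Test 2 reported above.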
In general, the ΛVaR models prove more accurate than the 1% VaR, confirming the outcomes in Hitaj et al. (2015). This means that the greater flexibility of the ΛVaR contributes to better coverage. On the other hand, in our tests the decreasing ΛVaR models seem to be more accurate, in contrast with the results of the Kupiec test in Hitaj et al. (2015). Even though the number of infractions of the increasing ΛVaR models is the smallest, these models lose accuracy, especially during the crisis periods (2008, 2011). Our coverage tests point out an estimation issue in the ΛVaR models proposed by Hitaj et al. (2015). The choice of the $\Lambda$ minimum, $\min_x \Lambda(x)$, which seemed to be irrelevant in Hitaj et al. (2015), is decisive here, as we discuss below.
3.1.3 The choice of the Λ minimum
The results in Tables 3 and 4 have been computed by fixing $\min_x \Lambda(x) = 0.5\%$. In fact, we notice that, under our coverage tests, the increasing ΛVaR models computed as in Hitaj et al. (2015) show a higher rejection rate (see Table 5), while showing the smallest number of infractions.
                      Test 1: Coverage Test                 Test 2: Asymptotic Coverage Test
Model                 2006  2007  2008  2009  2010  2011    2006  2007  2008  2009  2010  2011
ΛVaR 1% (decr)
  (VaR 5%)            100%   75%    8%  100%   92%   67%    100%   83%   17%  100%  100%   75%
  (VaR 1%)             92%   83%   25%  100%  100%   67%    100%   83%   42%  100%  100%   83%
  aggregate            96%   79%   17%  100%   96%   67%    100%   83%   29%  100%  100%   79%
ΛVaR 1% (incr)
  (VaR 5%)              8%   17%    0%   58%   42%    8%     75%   83%    0%  100%   75%   17%
  (VaR 1%)              8%   17%    0%   58%   42%    8%     75%   83%    0%  100%   83%   25%
  aggregate             8%   17%    0%   58%   42%    8%     75%   83%    0%  100%   79%   21%
Table 5. Time evolution of Test 1 and Test 2 for the ΛVaR models with $\min_x \Lambda(x) = 0.1\%$ under the historical distribution assumption. The table shows the evolution over the global financial crisis of the acceptance rates, aggregated at the level of the ΛVaR models with $\min_x \Lambda(x) = 0.1\%$.
Thus, we have studied how the probability of infraction $\lambda_t$ evolves, and we have observed that in most cases it attains the minimal value 0.1%. This happens especially during crisis periods, when the cumulative distribution function of the assets shifts to the left and intersects the $\Lambda$ function at the minimum level. For this reason, we propose to compute the ΛVaR models by fixing the $\Lambda$ minimum at 0.5%, i.e. $\min_x \Lambda(x) = 0.005$ instead of the $\min_x \Lambda(x) = 0.001$ used by Hitaj et al. (2015). Although the authors did not specify a criterion for this choice, we consider it a relevant and critical issue. From our point of view, the $\Lambda$ minimum should provide the probability of losing more than the worst-case event (i.e. the benchmarks' minimum, $\pi_1 = \min x_{t,j}$) over the observations in the time window (i.e. 250 in our case). If we consider all the events equally probable, the $\Lambda$ minimum should be greater than $1/T$ over $T$ observations. Thus, in our models, we set $\min_x \Lambda(x) = 0.5\%$, since the probability of a single event over 250 past realizations is $1/250 = 0.4\%$.
22
Using these new estimates, we observed that the number of infractions does not change in
any of the periods under consideration, while the acceptance rate of the increasing ΛVaR
models increases drastically (see Tables 3 and 4), which validates our choice. Clearly,
this new setting does not affect the decreasing ΛVaR models. The choice of the Λ minimum
could be refined with a more precise evaluation of the probability of a worst-case event,
but this is beyond the scope of this paper.
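The role of the Λ minimum in a crisis window can be illustrated with a toy computation. The sketch below assumes the definition ΛVaR = -inf{x : F(x) > Λ(x)}, takes λ_t to be the value of Λ at the crossing point, and uses a piecewise-linear increasing Λ. The benchmark levels, the synthetic returns and the function names are hypothetical choices made only for this example.

```python
def make_increasing_lambda(lam_min, lam_max, pi_low, pi_high):
    """Non-decreasing Λ: lam_min below pi_low, lam_max above pi_high, linear between."""
    def lam(x):
        if x <= pi_low:
            return lam_min
        if x >= pi_high:
            return lam_max
        return lam_min + (lam_max - lam_min) * (x - pi_low) / (pi_high - pi_low)
    return lam


def historical_lambda_var(returns, lam):
    """Return (ΛVaR, λ_t) for the empirical CDF of `returns` and a non-decreasing Λ.

    With a step CDF and a non-decreasing Λ, inf{x : F(x) > Λ(x)} is the smallest
    ordered observation x_(k) such that k/n > Λ(x_(k)).
    """
    xs = sorted(returns)
    n = len(xs)
    for k, x in enumerate(xs, start=1):
        if k / n > lam(x):
            return -x, lam(x)
    raise ValueError("the empirical CDF never exceeds Λ on the sample")


T = 250
# Synthetic returns: evenly spread in [-2%, 2%] (calm) and shifted left by 5% (crisis).
calm = [-0.02 + 0.04 * i / (T - 1) for i in range(T)]
crisis = [r - 0.05 for r in calm]

# Hypothetical Λ with minimum 0.1%, maximum 1% and benchmarks at -5% and -1%.
lam = make_increasing_lambda(0.001, 0.01, pi_low=-0.05, pi_high=-0.01)
_, lam_calm = historical_lambda_var(calm, lam)
_, lam_crisis = historical_lambda_var(crisis, lam)
print("lambda_t in the calm window:  ", lam_calm)
print("lambda_t in the crisis window:", lam_crisis)
```

In the shifted window the worst observation lies below the lowest benchmark, so the empirical CDF (1/T = 0.4% at that point) already exceeds a Λ minimum of 0.1% and λ_t sits at the minimum.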
3.1.4 Test 3: comparison of ΛVaRs with different P&L estimations
As anticipated in Section 2.2, the best use of the third test is to compare the level
of coverage among different estimations of the ΛVaR. We computed the time evolution of
the acceptance rate aggregated at the level of the increasing and decreasing ΛVaR models.
We repeated the analysis under different assumptions on the P&L distribution: specifically,
historical, Monte Carlo Normal and GARCH simulations. The results are presented in
Table 6.
                    Historical                             Gaussian                               GARCH
                    2006  2007  2008  2009  2010  2011     2006  2007  2008  2009  2010  2011     2006  2007  2008  2009  2010  2011
VaR
  1%                 50%   33%    0%  100%   58%   25%      58%   33%    0%   92%   50%    8%      75%   58%   33%  100%  100%   67%
  aggregate          50%   33%    0%  100%   58%   25%      58%   33%    0%   92%   50%    8%      75%   58%   33%  100%  100%   67%
ΛVaR 1% (decr)
  (VaR 5%)           50%   33%    0%  100%   67%   17%      58%   42%    0%   92%   58%   17%      75%   58%   33%  100%  100%   67%
  (VaR 1%)           58%   50%    8%  100%   67%    8%      50%   33%    0%   92%   58%   25%      92%   67%   33%  100%  100%   75%
  aggregate          54%   42%    4%  100%   67%   13%      54%   38%    0%   92%   58%   21%      83%   63%   33%  100%  100%   71%
ΛVaR 1% (incr)
  (VaR 5%)            8%   17%    0%   58%   42%    0%      17%   17%    0%   92%   50%    8%      83%   67%   67%  100%  100%   83%
  (VaR 1%)            8%   17%    0%   58%   42%    8%      33%    8%    0%   83%   50%   17%      83%   58%   67%  100%   92%   83%
  aggregate           8%   17%    0%   58%   42%    4%      25%   13%    0%   88%   50%   13%      83%   63%   67%  100%   96%   83%
Table 6. Time evolution of Test 3 for the ΛVaR models under different assumptions on the
P&L distribution. The table shows the evolution over the global financial crisis of the acceptance rates,
aggregated at the level of the ΛVaR models (min_x Λ(x) = 0.5%) calculated using the historical, normal
and GARCH assumptions on the P&L distribution.
The results show that the GARCH assumption on the returns guarantees the highest
coverage.
Moreover, we notice that in this test the Historical estimator frequently underperforms
the Gaussian one, in contrast with the previous tests. This is because the Historical
estimator takes values only from a finite sample; in particular, if the realized return
x_t at time t is lower than all the returns of the previous year, on which the historical
estimator is built, then not only is the probability of obtaining it 0 but also
P_t(x_t) = 0. This problem does not occur if we assume a normal distribution of the returns.
Test 3 is based on the tail behaviour of the distribution and, in particular, compares
the realized number of violations with the number provided by the model. For this reason,
a distribution with thin tails (such as the historical one) will perform poorly on this
test. These observations explain why this test is sometimes more punitive in the historical
case than in the normal one. This preference for the normal distribution is, on the other
hand, completely reversed by the other tests, which favour the Historical distribution
because they rely (almost) only on the number of infractions and not on the full shape of
the distribution.
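The finite-sample effect described above can be reproduced with a small example. The sketch below compares an empirical CDF with a normal CDF fitted by moments; the return values and function names are hypothetical and serve only to illustrate the point.

```python
from statistics import NormalDist, mean, stdev


def empirical_cdf(sample, x):
    """Fraction of observations in `sample` that are less than or equal to x."""
    return sum(1 for s in sample if s <= x) / len(sample)


def fitted_normal_cdf(sample, x):
    """CDF at x of a normal distribution fitted to `sample` by mean and std."""
    return NormalDist(mean(sample), stdev(sample)).cdf(x)


# Hypothetical window of past returns (illustrative values only).
past_returns = [-0.021, -0.013, -0.008, -0.004, 0.000,
                0.003, 0.006, 0.011, 0.015, 0.019]
realized = -0.035  # a new return lower than every observation in the window

p_hist = empirical_cdf(past_returns, realized)
p_norm = fitted_normal_cdf(past_returns, realized)
print("historical P_t(x_t):", p_hist)
print("normal P_t(x_t) is positive:", p_norm > 0.0)
```

Because the empirical CDF assigns no mass below the smallest observation, any realized return beyond the sample range receives probability zero, whereas the fitted normal distribution always assigns it a strictly positive value.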
4. Conclusions
A new risk measure, the ΛVaR, has been recently introduced. No dedicated study of its
backtesting has been carried out in the literature. The issue is that, in the ΛVaR model,
the probability of a violation is not constant but depends on the function Λ. A first
backtesting proposal is provided by Hitaj et al. (2015). However, this methodology does
not take into account the actual predictive capacity of the ΛVaR as introduced by the
Λ function.
We propose three backtesting methodologies and assess the accuracy of the new risk
measure from different points of view. Test 1 and Test 2 evaluate whether the ΛVaR
provides an accurate level of coverage, namely the one predicted by the model. These
tests are better suited to comparing the performance of ΛVaR with that of VaR. In
particular, they assess the additional value introduced by the Λ function and whether Λ
has been correctly estimated, thus allowing a better coverage of the risk. Test 1 is
unilateral and provides more precise results with a small sample of observations
(e.g. 250). Test 2 is bilateral and provides an asymptotic result; for this reason, it is
preferable for larger samples of observations.
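The difference between a unilateral finite-sample test and a bilateral asymptotic one can be sketched as follows. This is a minimal illustration, not necessarily the exact statistics of Test 1 and Test 2: it assumes that, under the null hypothesis, the violation indicators are independent Bernoulli variables with time-varying probabilities λ_t. The function names and the input data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def poisson_binomial_pmf(probs):
    """PMF of the number of successes among independent Bernoulli(p_t) trials."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)      # no violation on this day
            nxt[k + 1] += mass * p          # violation on this day
        pmf = nxt
    return pmf


def one_sided_pvalue(violations, probs):
    """Exact P(N >= observed count); small values indicate too many violations."""
    pmf = poisson_binomial_pmf(probs)
    return sum(pmf[sum(violations):])


def two_sided_asymptotic_pvalue(violations, probs):
    """Two-sided p-value from a normal approximation of the centred violation count."""
    variance = sum(p * (1.0 - p) for p in probs)
    if variance == 0.0:
        raise ValueError("all violation probabilities are degenerate")
    z = sum(i - p for i, p in zip(violations, probs)) / sqrt(variance)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical inputs: 250 days, lambda_t alternating between 0.5% and 1%.
lambdas = [0.005 if t % 2 == 0 else 0.01 for t in range(250)]
hits = [0] * 250
for day in (10, 90, 200):   # three illustrative violation days
    hits[day] = 1

print("one-sided exact p-value:     ", one_sided_pvalue(hits, lambdas))
print("two-sided asymptotic p-value:", two_sided_asymptotic_pvalue(hits, lambdas))
```

The exact distribution is usable at any sample size, while the normal approximation becomes reliable only as the number of observations grows, which is consistent with the recommendation to prefer the asymptotic test on larger samples.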
Test 3 focuses on another aspect. Here, the correctness of the Λ function is not
questioned. This test evaluates whether the correct coverage of the risk derives from the
fact that ΛVaR has been estimated with the correct distribution of the returns. Hence, the
best use of Test 3 is to compare the results of the same kind of ΛVaR models estimated
under different assumptions on the P&L distribution (i.e. historical, Monte Carlo Normal
and GARCH). This test, being based on simulations, requires a massive storage of
information and may provide less accurate results when the distribution of the returns
has thin tails.
Finally, we conduct an empirical analysis. Both Test 1 and Test 2 show that the ΛVaR
models perform better than the 1% VaR, confirming the results in Hitaj et al. (2015).
Test 1 provides more precise results than Test 2, implying a higher rejection rate. Test 3
shows that the ΛVaR computed with the GARCH model of the returns has the highest
level of coverage.
Acknowledgements
This research benefited from the support of the “Chaire Risques Financiers”, Fondation
du Risque.
References
Acerbi, C., and Szekely, B. (2014), "Back-testing expected shortfall," Risk, 27(11).
Basel Committee on Banking Supervision (1996), "Supervisory Framework for the Use of
Backtesting in Conjunction with the Internal Models Approach to Market Risk Capital
Requirements," Bank for International Settlements.
Basel Committee on Banking Supervision (2013), "Fundamental review of the trading
book," Second consultative document, Bank for International Settlements.
Christoffersen, P. (2010), "Encyclopedia of Quantitative Finance - Backtesting," John
Wiley and Sons.
Frittelli, M., Maggis, M., and Peri, I. (2014), "Risk Measures on P(R) and Value at Risk with
Probability/Loss Function," Mathematical Finance, 24, 442-463.
Hitaj, A., Mateus, C., and Peri, I. (2015), "Lambda value at risk and regulatory capital:
a dynamic approach to tail risk," Working paper.
Kerkhof, J., and Melenberg, B. (2004), "Backtesting for Risk-Based Regulatory Capital,"
Journal of Banking & Finance, 28(8), 1845-1865.
Kupiec, P. (1995), "Techniques for Verifying the Accuracy of Risk Measurement Models,"
Journal of Derivatives, 3, 73-84.
Lyapunov, A. M. (1954), Collected works, Vol 1.