
BTP reference points

Links for referenced papers:
https://dl.acm.org/doi/pdf/10.1145/2661631
https://www.researchgate.net/publication/322488817_Conditional_Monte_Carlo_for_sums_with_applications_to_insurance_and_finance
https://www.tandfonline.com/doi/pdf/10.1080/10920277.2011.10597627?needAccess=true
Reinforcement learning is an area of machine learning concerned with taking suitable actions to maximize reward in a particular situation. It is employed by various software systems and machines to find the best possible behavior or path to take in a specific situation. Reinforcement learning differs from supervised learning in that supervised training data comes with an answer key, so the model is trained on the correct answers, whereas in reinforcement learning there is no answer key and the reinforcement agent decides what to do to perform the given task. In the absence of a training dataset, it is bound to learn from its own experience.
Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle. They are often used in physical and mathematical problems and are most useful when other approaches are difficult or impossible. Monte Carlo methods are mainly used in three problem classes [1]: optimization, numerical integration, and generating draws from a probability distribution.
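As a minimal illustration of repeated random sampling (a sketch only; the lognormal portfolio-loss model, its parameters, and the function name are assumptions made for this example, not taken from the papers linked above), the following Python snippet estimates a tail probability by crude Monte Carlo:

import numpy as np

rng = np.random.default_rng(0)

def tail_prob_crude_mc(threshold, n_samples=100_000):
    """Crude Monte Carlo estimate of P(L > threshold), where L is an
    assumed portfolio loss: the sum of 10 i.i.d. lognormal terms."""
    losses = rng.lognormal(mean=0.0, sigma=1.0, size=(n_samples, 10)).sum(axis=1)
    hits = losses > threshold
    estimate = hits.mean()
    # Standard error from the binomial variance of the indicator samples
    std_err = np.sqrt(estimate * (1.0 - estimate) / n_samples)
    return estimate, std_err

print(tail_prob_crude_mc(threshold=50.0))

The estimate converges at the usual n^(-1/2) Monte Carlo rate, which is why variance reduction techniques such as those reviewed below matter for rare-event (large-loss) probabilities.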
Abstract
Value-at-risk (VaR) and conditional value-at-risk (CVaR) are two
widely used risk measures of large losses and are employed in
the financial industry for risk management purposes. In
practice, loss distributions typically do not have closed-form
expressions, but they can often be simulated (i.e., random
observations of the loss distribution may be obtained by
running a computer program). Therefore, Monte Carlo methods
that design simulation experiments and utilize simulated
observations are often employed in estimation, sensitivity
analysis, and optimization of VaRs and CVaRs. In this article, we
review some of the recent developments in these methods,
provide a unified framework to understand them, and discuss
their applications in financial risk management.
INTRODUCTION
Risk is a fundamental attribute of financial activities. When
investors make financial
decisions, they consider not only potential returns but also
potential risks. There are
various kinds of risks in the financial industry. For instance, an
investment bank
may hold a portfolio of stocks for a period of time and the value
of the portfolio may
evolve at random during the period. Then, the bank faces the
market risk that the
value of the portfolio may fall below the initial value. Similarly,
a commercial bank
may hold a portfolio of loans lent to different obligors. Then,
the bank faces the credit
risk that some of the obligors may default. Because of the
importance and ubiquity of
financial risks, individual financial institutions often want to
identify and understand the risks in their activities, based on
which they can then control or manage the risks.
Furthermore, because of the interconnected nature of financial institutions, risks of one institution can easily spread to other institutions or even to the entire financial system, resulting in the so-called systemic risk. Such systemic risk may even affect the entire economic and social system. Therefore, consensus has been reached that regulations on financial systems and financial markets are necessary.
There have been numerous risk measures introduced and
employed in the financial industry. Value-at-risk (VaR) and conditional value-at-risk
(CVaR, also known as
expected shortfall or tail conditional expectation), which we
review in this article, are
among the most well-known and widely used ones, and they play dominant roles in practice.
For any α ∈ (0, 1), the α-VaR of a random loss L is the α quantile
of L, while the α-CVaR
is the average of all β-VaR for β ∈ (α, 1). As we are typically
interested in the risk of
large losses in practice, α is typically quite close to 1, for
example, α = 0.9, 0.95, 0.99.
As pointed out by Hong and Liu [2009], if we define the large
losses to be the losses in
the upper (1 − α)-tail of the loss distribution, then the α-VaR is
the lower bound of the
large losses and the α-CVaR is the mean of the large losses.
They provide information
on potential large losses that an investor may suffer.
Even though VaR has been widely adopted in financial practice, there is also criticism of its use as a risk measure. Artzner et al. [1999] defined four
axioms and called a risk
measure that satisfies these axioms a coherent risk measure.
One of these axioms is
the subadditivity axiom, which basically means that “a merger
does not create extra
risk.” They further showed that VaR does not always satisfy the
subadditivity axiom
and is therefore not a coherent risk measure. Rockafellar and
Uryasev [2002], on the
other hand, showed that CVaR satisfies all four axioms and is
therefore a coherent risk
measure (see also the study of Acerbi and Tasche [2002]). Kou
et al. [2013], however,
argued that the subadditivity axiom is not necessary and
suggested replacing it with
the comonotonic subadditivity axiom, which only requires
subadditivity to hold for
random variables moving in the same direction. They showed
that both VaR and CVaR
satisfy the comonotonic subadditivity axiom. However, they
argued that, compared to
CVaR, VaR is often more robust to the tail behavior of the loss
distribution, which
is in general difficult to characterize in practice, and is therefore
more suitable for
regulatory purposes.
2. ESTIMATION OF VAR AND CVAR
As a starting point, we define VaR and CVaR and explore their inherent connections. Let L be the random loss of interest and F(y) = Pr{L ≤ y} be the cumulative distribution function (CDF) of L. Then, the inverse CDF of L can be defined as F^(−1)(γ) = inf{y : F(y) ≥ γ}. Following the definitions of Trindade et al. [2007], for any α ∈ (0, 1), we define the α-VaR of L as

v_α = F^(−1)(α),

and define the α-CVaR of L as

c_α = (1/(1 − α)) ∫_α^1 v_β dβ.   (1)
Pflug [2000] showed that c_α is also the optimal value of the stochastic program

c_α = inf_{t ∈ ℝ} { t + (1/(1 − α)) E[L − t]^+ },   (2)

where [a]^+ = max{0, a}. Let T be the set of optimal solutions to the stochastic program defined in Equation (2). Then it can be shown that T = [v_α, u_α], where u_α = sup{t : F(t) ≤ α} (see, e.g., Rockafellar and Uryasev [2002] and Trindade et al. [2007]). In particular, note that v_α ∈ T. Therefore,

c_α = v_α + (1/(1 − α)) E[L − v_α]^+.   (3)

When L has a positive density in the neighborhood of v_α, then v_α = u_α. Therefore, the stochastic program defined in (2) has a unique solution, and

c_α = E[L | L ≥ v_α],   (4)

while the right-hand side of Equation (4) is also known as expected shortfall or tail conditional expectation. To be meaningful, we assume that c_α is finite for all discussions related to CVaR in this article.
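As a quick sanity check of Equations (3) and (4), consider the assumed illustrative case L ~ Exp(1) (this worked example is not from the reviewed paper):

\begin{align*}
F(y) &= 1 - e^{-y}, \qquad v_\alpha = F^{-1}(\alpha) = -\ln(1-\alpha),\\
\mathrm{E}[L - v_\alpha]^+ &= \int_{v_\alpha}^{\infty} \big(1 - F(y)\big)\,dy = e^{-v_\alpha} = 1-\alpha,\\
c_\alpha &= v_\alpha + \frac{1}{1-\alpha}\,\mathrm{E}[L - v_\alpha]^+ = -\ln(1-\alpha) + 1,
\end{align*}

which agrees with c_α = E[L | L ≥ v_α] = v_α + 1 obtained directly from the memoryless property; for α = 0.95 this gives v_α ≈ 3.00 and c_α ≈ 4.00.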
2.1. Crude Monte Carlo Estimation
Suppose that L_1, L_2, ..., L_n are n independent and identically distributed (i.i.d.) observations from the loss L. Then, the α-VaR of L can be estimated by

v̂_α^n = L_{⌈nα⌉:n},

where ⌈a⌉ denotes the smallest integer larger than or equal to a, and L_{i:n} is the ith order statistic from the n observations.
Trindade et al. [2007] suggested using the estimator

ĉ_α^n = inf_{t ∈ ℝ} { t + (1/(n(1 − α))) Σ_{i=1}^n [L_i − t]^+ }   (5)

to estimate the α-CVaR of L. Let

F_n(y) = (1/n) Σ_{i=1}^n 1{L_i ≤ y}

be the empirical CDF constructed from L_1, L_2, ..., L_n, where 1{·} is the indicator function. Then

ĉ_α^n = inf_{t ∈ ℝ} { t + (1/(1 − α)) E[L̃ − t]^+ },

where the CDF of L̃ is F_n. Since v̂_α^n = F_n^(−1)(α), by Equation (3) we have

ĉ_α^n = v̂_α^n + (1/(n(1 − α))) Σ_{i=1}^n [L_i − v̂_α^n]^+.   (6)

Therefore, we can apply Equation (6) to estimate c_α directly instead of solving the stochastic program in Equation (5).
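A minimal sketch of the crude Monte Carlo estimators v̂_α^n and ĉ_α^n of this subsection in Python (the Exp(1) loss used to generate the samples is only an illustrative assumption, chosen because its VaR and CVaR are known in closed form):

import numpy as np

def crude_mc_var_cvar(losses, alpha):
    """Crude Monte Carlo VaR/CVaR estimators from i.i.d. loss samples.

    losses : 1-D array of observations L_1, ..., L_n
    alpha  : level in (0, 1)
    """
    losses = np.sort(np.asarray(losses))
    n = len(losses)
    # VaR estimate: the ceil(n*alpha)-th order statistic (1-based index)
    var_hat = losses[int(np.ceil(n * alpha)) - 1]
    # CVaR estimate via Equation (6)
    cvar_hat = var_hat + np.maximum(losses - var_hat, 0.0).sum() / (n * (1 - alpha))
    return var_hat, cvar_hat

# Illustrative check with Exp(1) losses, where v_alpha = -ln(1 - alpha) and c_alpha = v_alpha + 1
rng = np.random.default_rng(1)
samples = rng.exponential(scale=1.0, size=200_000)
print(crude_mc_var_cvar(samples, alpha=0.95))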
Consistency and asymptotic normality of the estimators v̂_α^n and ĉ_α^n have been studied extensively in the literature (see, e.g., Serfling [1980] and Trindade et al. [2007]). Regarding the asymptotic properties, a result that is even sharper is the Bahadur representation [Bahadur 1966].
As a unified view, we present the asymptotic properties of v̂_α^n and ĉ_α^n using the Bahadur representations. To this end, we first make the following assumption.

ASSUMPTION 1. There exists an ε > 0 such that L has a positive and continuously differentiable density f(x) for any x ∈ (v_α − ε, v_α + ε).

Assumption 1 requires that L has a positive and differentiable density in a neighborhood of v_α. It implies that F(v_α) = α and c_α = E[L | L ≥ v_α].
Bahadur representations of v̂_α^n and ĉ_α^n are summarized in the following theorem, whose proof can be found in Sun and Hong [2010].

THEOREM 2.1. For a fixed α ∈ (0, 1), suppose that Assumption 1 is satisfied. Then

v̂_α^n = v_α + (1/f(v_α)) ( α − (1/n) Σ_{i=1}^n 1{L_i ≤ v_α} ) + A_n,  and

ĉ_α^n = c_α + ( (1/n) Σ_{i=1}^n [ v_α + (1/(1 − α)) (L_i − v_α)^+ ] − c_α ) + B_n,

where A_n = O_{a.s.}(n^(−3/4) (log n)^(3/4)), B_n = O_{a.s.}(n^(−1) log n), and the statement Y_n = O_{a.s.}(g(n)) means that Y_n/g(n) is bounded by a constant almost surely.
Consistency and asymptotic normality of v̂_α^n and ĉ_α^n follow straightforwardly from Theorem 2.1. Specifically, if Assumption 1 is satisfied, then v̂_α^n → v_α and ĉ_α^n → c_α with probability 1 (w.p.1) as n → ∞, and

√n (v̂_α^n − v_α) ⇒ (√(α(1 − α)) / f(v_α)) N(0, 1),  as n → ∞,   (7)

where “⇒” denotes “converge in distribution,” and N(0, 1) represents the standard normal random variable. If, in addition, E[(L − v_α)^2 1{L ≥ v_α}] < ∞, then

√n (ĉ_α^n − c_α) ⇒ σ_∞ · N(0, 1),  as n → ∞,   (8)

where

σ_∞^2 = lim_{n→∞} n Var(ĉ_α^n) = (1/(1 − α)^2) · Var([L − v_α]^+).
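A sketch of how the limit (8) can be turned into a confidence interval for CVaR (plugging the VaR estimate v̂_α^n into the variance term is a standard approximation, but an assumption here; the Exp(1) test case is illustrative):

import numpy as np
from scipy import stats

def cvar_confidence_interval(losses, alpha, level=0.95):
    """Normal-approximation confidence interval for CVaR based on the CLT in (8)."""
    losses = np.sort(np.asarray(losses))
    n = len(losses)
    var_hat = losses[int(np.ceil(n * alpha)) - 1]           # VaR estimate (order statistic)
    excess = np.maximum(losses - var_hat, 0.0)
    cvar_hat = var_hat + excess.sum() / (n * (1 - alpha))    # Equation (6)
    # Plug-in estimate of sigma_inf = sqrt(Var([L - v_alpha]^+)) / (1 - alpha)
    sigma_hat = np.std(excess, ddof=1) / (1.0 - alpha)
    z = stats.norm.ppf(0.5 + level / 2.0)
    half_width = z * sigma_hat / np.sqrt(n)
    return cvar_hat - half_width, cvar_hat + half_width

rng = np.random.default_rng(2)
print(cvar_confidence_interval(rng.exponential(size=100_000), alpha=0.99))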
2.2. Variance Reduction
In the simulation literature, there has been a significant amount
of work on the topic
of variance reduction for VaR estimation. For instance, Hsu and
Nelson [1990] and
Hesterberg and Nelson [1998] studied the use of control
variates. Avramidis and Wilson
[1998] employed correlation-induction techniques for variance
reduction in quantile
estimation. Glynn [1996] considered the use of importance
sampling (IS) and discussed
its asymptotic properties. The problem of estimating portfolio
VaR has been studied
in Glasserman et al. [2000] and Glasserman et al. [2002], where
IS and stratified
sampling are employed.
Among various variance reduction methods proposed in the
literature, IS is particularly attractive, given the rare-event features of many practical
problems. It has proven
to be a very effective variance reduction technique in this
context, and much work has
been done regarding this issue.
In what follows, we discuss a general IS method for estimating VaR and CVaR, with a focus on the asymptotic properties of the IS estimators. Specifically, suppose that L is simulated under another CDF G(·), where F is absolutely continuous with respect to G on [v_α − ε, ∞), with ε > 0 being a fixed constant, that is, F(dx) = 0 if G(dx) = 0 for any x ∈ [v_α − ε, ∞). We refer to G as the IS distribution and let l(x) = F(dx)/G(dx) denote the likelihood ratio function (also called the score function) associated with the change of measure. Note that for x ∈ [v_α − ε, ∞),

F(x) = E_F[1{L ≤ x}] = E_G[1{L ≤ x} l(L)],

where E_F and E_G denote taking expectations with respect to F and G, respectively. Then we may estimate F(x) by

F_{n,IS}(x) = (1/n) Σ_{i=1}^n 1{L_i ≤ x} l(L_i).
Then the IS estimators of v_α and c_α, denoted by v̂_α^{n,IS} and ĉ_α^{n,IS}, can be defined as follows:

v̂_α^{n,IS} = F_{n,IS}^(−1)(α) = inf{x : F_{n,IS}(x) ≥ α},  and

ĉ_α^{n,IS} = v̂_α^{n,IS} + (1/(n(1 − α))) Σ_{i=1}^n (L_i − v̂_α^{n,IS})^+ l(L_i).
Recently, Sun and Hong [2010] and Chu and Nakayama [2012] independently studied the Bahadur representations of the IS estimators. To present this result, we follow the framework of Sun and Hong [2010] and make a further assumption.

ASSUMPTION 2. There exist ε > 0 and C > 0 such that l(x) ≤ C for any x ∈ (v_α − ε, v_α + ε), and there exists p > 2 such that E_G[l^p(L) 1{L ≥ v_α}] < ∞.

Assumption 2 requires that the likelihood ratio is bounded from above in a neighborhood of v_α and has a finite p > 2 moment on the right tail of the loss.
The Bahadur representations of the IS estimators of v_α and c_α are summarized in the following theorem. Interested readers may refer to Sun and Hong [2010] for its proof.

THEOREM 2.2. For a fixed α ∈ (0, 1), suppose that Assumptions 1 and 2 are satisfied. Then

v̂_α^{n,IS} = v_α + (1/f(v_α)) ( α − (1/n) Σ_{i=1}^n 1{L_i ≤ v_α} l(L_i) ) + C_n,  and

ĉ_α^{n,IS} = c_α + ( (1/n) Σ_{i=1}^n [ v_α + (1/(1 − α)) (L_i − v_α)^+ l(L_i) ] − c_α ) + D_n,

where C_n = O_{a.s.}(max{n^(−1+2/p+δ), n^(−3/4+1/(2p)+δ)}) and D_n = O_{a.s.}(n^(−1+2/p+δ)) for any δ > 0.
Asymptotic normality of the estimators follows immediately from Theorem 2.2. In particular, under Assumptions 1 and 2,

√n (v̂_α^{n,IS} − v_α) ⇒ (√(Var_G[1{L ≥ v_α} l(L)]) / f(v_α)) N(0, 1),  as n → ∞.

If, in addition, E_G[(L − v_α)^2 l^2(L) 1{L ≥ v_α}] < ∞, then

√n (ĉ_α^{n,IS} − c_α) ⇒ (√(Var_G[(L − v_α)^+ l(L)]) / (1 − α)) N(0, 1),  as n → ∞.

If l(x) ≤ 1 for all x ≥ v_α, then it can be easily verified that Var_G[1{L ≥ v_α} l(L)] ≤ α(1 − α) and Var_G[(L − v_α)^+ l(L)] ≤ Var[(L − v_α)^+]. Then, compared to Equations (7) and (8), it can be seen that the asymptotic variances of the IS estimators are smaller than those of the estimators without IS, given that l(x) ≤ 1 for all x ≥ v_α. In practice, an effective IS distribution (with a density function g) often satisfies g(x) ≥ f(x) for x ≥ v_α. This provides a guideline for selecting an appropriate IS distribution during practical implementation.
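A sketch of the IS estimators v̂_α^{n,IS} and ĉ_α^{n,IS} for an assumed N(0,1) loss, using a mean-shifted normal as the IS distribution G (the loss model, the choice of shift, and the guard against the empirical IS CDF falling short of α are illustrative assumptions, not prescriptions from the paper):

import numpy as np

def is_var_cvar_normal(alpha, shift, n=100_000, seed=3):
    """IS estimators of VaR/CVaR for an assumed N(0,1) loss.

    Samples are drawn from G = N(shift, 1); the likelihood ratio is
    l(x) = f(x)/g(x) = exp(-shift*x + shift**2/2), which is <= 1 for x >= shift/2.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=shift, scale=1.0, size=n)
    lr = np.exp(-shift * x + 0.5 * shift**2)
    # Empirical IS CDF: F_{n,IS}(x) = (1/n) * sum of 1{L_i <= x} * l(L_i)
    order = np.argsort(x)
    x_sorted, lr_sorted = x[order], lr[order]
    cdf = np.cumsum(lr_sorted) / n
    # v_hat = inf{x : F_{n,IS}(x) >= alpha}; the clip guards the rare case cdf[-1] < alpha
    idx = min(int(np.searchsorted(cdf, alpha, side="left")), n - 1)
    var_hat = x_sorted[idx]
    # c_hat = v_hat + sum of (L_i - v_hat)^+ * l(L_i) / (n * (1 - alpha))
    cvar_hat = var_hat + np.sum(np.maximum(x - var_hat, 0.0) * lr) / (n * (1 - alpha))
    return var_hat, cvar_hat

# Shift the sampling mean part of the way into the tail
print(is_var_cvar_normal(alpha=0.99, shift=1.5))
# Exact values for comparison: v_0.99 = 2.326, c_0.99 = phi(2.326)/0.01 = 2.665

With shift = 1.5, l(x) ≤ 1 for all x ≥ 0.75 < v_0.99, so the variance ordering discussed above applies in the tail region of interest.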
Large Sample Behavior of the CTE and VaR Estimators under Importance Sampling
The α-level value at risk (VaR) and the α-level conditional tail expectation (CTE) of a continuous random variable X are defined as its α-level quantile (denoted by q_α) and its conditional expectation given the event {X ≥ q_α}, respectively. VaR is a popular risk measure in the banking sector, for both external and internal reporting purposes, while the CTE has recently become the risk measure of choice for insurance regulation in North America. Estimation of the CTE for company assets and liabilities is becoming an important actuarial exercise, and the size and complexity of these liabilities make inference procedures with good small-sample performance very desirable. A common situation is one in which the CTE of the portfolio loss is estimated using simulated values, and in such situations the use of variance reduction techniques such as importance sampling has proved to be fruitful. Construction of confidence intervals for the CTE relies on the availability of the asymptotic distribution of the normalized CTE estimator, and although such a result has been available to actuaries, it has so far been supported only by heuristics. The main goal of this paper is to provide an honest theorem establishing the convergence of the normalized CTE estimator under importance sampling to a normal distribution. In the process, we also provide a similar result for the VaR estimator under importance sampling, which improves upon an earlier result. Also, through examples we motivate the practical need for such theoretical results and include simulation studies to lend insight into the sample sizes at which these asymptotic results become meaningful.
1. INTRODUCTION (similar in structure to the Review of Literature section from the example BTP)
The α-level value at risk (VaR) and the α-level conditional tail expectation (CTE) of a continuous random variable X are defined as its α-level quantile (denoted by q_α) and its conditional expectation given the event {X ≥ q_α}, respectively. Although both of these risk measures are popular, it is noteworthy that the requirements on a risk measure for it to be coherent, as laid out in Artzner et al. (1999), are satisfied by the CTE but not by the VaR. VaR is popular in the banking sector and is used for both external and internal reporting purposes. On the insurance side, for variable annuities the adoption of the C-3 Phase II revision to the regulatory risk-based capital model in 2005, and the implementation of the analogous principles-based reserving methodology (AG VACARVM) in 2009 by the National Association of Insurance Commissioners (NAIC), have together made the CTE the key risk measure. Now, with the Life Reserves Work Group and the Life Capital Work Group (C3WG) of the American Academy of Actuaries working on an analogous reserve and capital methodology for life insurance products, and the possibility of principles-based reserves (PBRs) being made effective in 2014, the CTE is well poised to become the risk measure of choice for the whole of the life industry in the United States.
The above described changes in insurance regulation require estimation of
the CTE for company
assets and liabilities, and the size and complexity of these liabilities make
inference procedures with
good small sample performance very desirable. There are two common
situations requiring inference
procedures for the CTE. In the first, the distribution of the loss random
variable is unknown, and the
actuary has only a random sample from this unknown distribution at his or her
disposal. In the second,
the loss random variable is a known function of some economic variable(s)
with a known distribution
for the latter. The complexity in this situation arises from the huge
computational cost involved in
calculating the loss random variable as a function of the economic variable(s).
So although the distribution in theory can be ascertained with certainty, the computational
complexity of the task renders
it practically unknown, and once again the actuary has to make do with a
sample from the portfolio
loss distribution. Although both of these situations have much in common, it is
the availability of
variance reduction techniques such as importance sampling in the second
situation that makes them
different. We refer to Glasserman (2004) for a self-contained treatment of the
use of Monte Carlo
methods in finance, in particular the use of variance reduction techniques
such as importance sampling.
In response to this need for inference procedures for the CTE, and for better
understanding of their
performance, there has been a surge in the actuarial literature of papers
dealing with statistical inference of the CTE and related risk measures; see, for example, Jones and
Zitikis (2003), Manistre and
Hancock (2005), Kaiser and Brazauskas (2006), Kim and Hardy (2007),
Brazauskas et al. (2008), Ko
et al. (2009), Russo and Shyamalkumar (2010), Necir et al. (2010), and Ahn
and Shyamalkumar (2010).
Nevertheless, only Manistre and Hancock (2005) discuss the use of variance
reduction techniques for
estimation of the CTE, and this is the area of focus for our paper.
Our interest in establishing asymptotic convergence results for the empirical
CTE and quantile under
importance sampling arose mainly because we see importance sampling as
one of the potent practical
strategies to get not only better point estimators but also confidence intervals;
see, for example, Manistre and Hancock (2005) and Glasserman et al. (2000). The main contribution
of the paper is that
we establish asymptotic normality of the CTE and VaR estimators under
importance sampling. In the
case of VaR, as discussed later, our result improves upon an earlier result of
Glynn (1996), whereas
there is no published result for the case of the CTE. However, we note that
our results have been
suggested and supported by heuristics derived from the use of influence
functions in Manistre and
Hancock (2005), one of the earlier articles on the estimation of the CTE in the actuarial literature. Although a theoretical result justifying the use of a methodology is undoubtedly of interest, its practical value is amplified if it prevents the use of the methodology in cases where, against expectations, the methodology fails. Through the first two examples, which for expository ease deal with the case of ordinary sampling, we motivate the practical need for theoretical results establishing asymptotic normality of the CTE and VaR estimators under importance sampling.
The following nonpathological ordinary-sampling example shows that the existence of the influence function in the case of the CTE falls short of establishing convergence of the empirical CTE to normality, and also that the formula derived for the asymptotic variance through the use of the influence function could be misleading. The use of the influence function for VaR is not similarly prone to misuse because
the asymptotic variance formula for the VaR is proportional to the reciprocal of
the density evaluated
at the quantile, and known results for weak convergence of empirical quantiles
under ordinary sampling
(for example, see Reiss 1989) require only that the density evaluated at the
quantile be positive. We
refer the reader to Manistre and Hancock (2005), especially to sections 2 and
5 therein, for an introduction to influence functions.
3. Variance Reduction for the c.d.f.
CdMC (conditional Monte Carlo) always gives variance reduction over CrMC (crude Monte Carlo). But, as argued, the reduction needs to be substantial for the procedure to be worthwhile. Further, in many applications the right and/or left tail is of particular interest, so one may pay particular attention to the behaviour there.
Remark 3.1. That CdMC gives variance reduction in the tails can be seen intuitively by the following direct argument, without reference to Rao–Blackwellization. The CrMC and CdMC estimators of F̄_n(x) = P(S_n > x), where S_n = X_1 + ⋯ + X_n is a sum of i.i.d. terms with common CDF F and f_{n−1} denotes the density of S_{n−1}, are I(S_n > x) and F̄(x − S_{n−1}), respectively, with second moments

E[I(S_n > x)^2] = E[I(S_n > x)] = ∫_{−∞}^{∞} f_{n−1}(y) F̄(x − y) dy   (3.1)
= P(S_{n−1} > x) + ∫_0^x f_{n−1}(y) F̄(x − y) dy   (for X ≥ 0),   (3.2)

E[F̄(x − S_{n−1})^2] = ∫_{−∞}^{∞} f_{n−1}(y) F̄(x − y)^2 dy   (3.3)
= P(S_{n−1} > x) + ∫_0^x f_{n−1}(y) F̄(x − y)^2 dy   (for X ≥ 0).   (3.4)
In the right tail (say), these second moments can be interpreted as the tails of the r.v.'s S_{n−1} + X and S_{n−1} + X*, where X, X* are independent of S_{n−1} and have tails F̄ and F̄^2. Since F̄(x)^2 is of smaller order than F̄(x) in the right tail, the tail of S_{n−1} + X* should be of smaller order than that of S_{n−1} + X, implying the same ordering of the second moments. However, as n becomes large one also expects the tail of S_{n−1} to more and more dominate the tails of X, X*, so that the difference should be less and less marked. The analysis to follow will confirm these guesses.

A measure of performance which we consider is the ratio r_n(x) of the CdMC variance to the CrMC variance:

r_n(x) = Var[F(x − S_{n−1})] / (F_n(x) F̄_n(x)) = Var[F̄(x − S_{n−1})] / (F_n(x) F̄_n(x))   (3.5)

(note that the two alternative expressions reflect that the variance reduction is the same whether CdMC is performed for F itself or for the tail F̄).
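A sketch that estimates the ratio r_n(x) in (3.5) by simulation for the Pareto case used in Figure 3 below (the sample sizes are illustrative, and sample variances are used for both the numerator and the denominator rather than the exact Bernoulli variance):

import numpy as np

rng = np.random.default_rng(4)
a = 1.5  # Pareto (Lomax) tail index: Fbar(x) = (1 + x)^(-a)

def fbar(x):
    """Tail of a single term, Fbar(x) = P(X > x); equals 1 for x < 0."""
    return np.where(x < 0.0, 1.0, (1.0 + np.maximum(x, 0.0)) ** (-a))

def sample_x(size):
    """Inverse-transform sampling: X = U^(-1/a) - 1 has the tail above."""
    return rng.uniform(size=size) ** (-1.0 / a) - 1.0

def variance_ratio(n, x, reps=200_000):
    """Estimate r_n(x): CdMC variance over CrMC variance for P(S_n > x)."""
    s_nm1 = sample_x((reps, n - 1)).sum(axis=1)   # S_{n-1}
    x_last = sample_x(reps)                        # the term that CdMC integrates out
    crude = (s_nm1 + x_last > x).astype(float)     # CrMC estimator I(S_n > x)
    cond = fbar(x - s_nm1)                         # CdMC estimator Fbar(x - S_{n-1})
    return cond.var(ddof=1) / crude.var(ddof=1)

# Far in the right tail, Proposition 3.4 below predicts r_n -> 1 - 1/n for this heavy-tailed F
for n in (2, 5, 10):
    print(n, variance_ratio(n, x=30.0))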
To provide some initial insight, we examine in Figure 3 the ratio r_n(x_{n,z}) as a function of z, where x_{n,z} is the z-quantile of S_n. In Figure 3(a), the underlying F is Pareto with tail F̄(x) = 1/(1 + x)^(3/2), and in Figure 3(b) it is standard normal. Both figures consider the cases of a sum of n = 2, 5 or 10 terms and use R = 250,000 replications of the vector Y_1, ..., Y_{n−1} (variances are more difficult to estimate than means, therefore the high value of R). The dotted line for AK (the Asmussen–Kroese estimator, see Section 4) may be ignored for the moment. The argument z on the horizontal axis is in log10 scale, and x_{n,z} was taken as the exact value for the normal case and the CdMC estimate for the Pareto case.
For the Pareto case in Figure 3(a), it seems that the variance reduction is decreasing in both x and n,
yet in fact it is only substantial in the left tail. For the normal case, note that there should be
symmetry around x = 0, corresponding to z(x) = 1/2 with base-10 logarithm −0.30. This is confirmed
by the figure (though the feature is of course somewhat disguised by the logarithmic scale). In
contrast to the Pareto case, it seems that the variance reduction is very big in the right (and therefore
also left) tail but also that it decreases as n increases.
We proceed to a number of theoretical results supporting these empirical findings. They all use
formulas (3.3) and (3.4) for the second moments of the CdMC estimators. For the exponential
distribution, the calculations are particularly simple:
Example 3.2. Assume F̄(x) = e^(−x) and n = 2. Then P(X_1 + X_2 > x) = x e^(−x) + e^(−x), and the CdMC second moment (3.4) takes the form

F̄(x) + ∫_0^x e^(−y) e^(−2(x−y)) dy = e^(−x) + e^(−2x)(e^x − 1) = 2e^(−x) − e^(−2x),

and so for the right tail

r_2(x) = [2e^(−x) − e^(−2x) − (x e^(−x) + e^(−x))^2] / [(x e^(−x) + e^(−x))(1 − x e^(−x) − e^(−x))].

For x → ∞, this gives

r_2(x) = [2e^(−x) + o(e^(−x))] / [x e^(−x) + o(x e^(−x))] = (2/x)(1 + o(1)) → 0.

In the left tail x → 0, Taylor expansion gives, up to the third-order term,

2e^(−x) − e^(−2x) ≈ 1 − x^2 + x^3;   x e^(−x) + e^(−x) ≈ 1 − x^2/2 + x^3/3
[Figure 3. The ratio r_n(z) in (3.5), with F Pareto in panel (a) and Normal(0,1) in panel (b); curves shown for n = 2, 5, 10, for both the CdMC and AK estimators.]
and so

r_2(x) ≈ [(1 − x^2 + x^3) − (1 − x^2/2 + x^3/3)^2] / [(1 − x^2/2 + x^3/3)(x^2/2 − x^3/3)] ≈ (x^3/3)/(x^2/2) = 2x/3 → 0.  ◊
The relation r_n(x) → 0 in the left tail (i.e., as x → 0) in the exponential example is in fact essentially a consequence of the support being bounded to the left:

Proposition 3.3. Assume X > 0 and that the density f(x) satisfies f(x) ∼ c x^p as x → 0 for some c > 0 and some p > −1. Then r_n(x) ∼ d x^(p+1) as x → 0 for some 0 < d = d(n) < ∞.
The following result explains the right tail behaviour in the Pareto example and shows that this
extends to other standard heavy-tailed distributions like the lognormal or Weibull with decreasing
failure rate (for subexponential distributions, see, e.g. Embrechts et al., 1997):
Proposition 3.4 Assume X > 0 is subexponential. Then r n(x) → 1 − 1/n as x → ∞.
For light tails, Example 3.2 features a different behaviour in the right tail, namely r_n(x) → 0. Here is one more such light-tailed example:

Proposition 3.5. If X is standard normal, then r_n(x) → 0 as x → ∞. More precisely,

r_n(x) ∼ (1/x) √((2n − 1)/(nπ)) e^(−x^2/[2n(2n−1)]).

The proofs of Propositions 3.3–3.5 are in the Appendix.
To formulate a result of the type r_n(x) → 0 as x → ∞ for a sufficiently broad class of light-tailed F encounters the difficulty that the general results giving the asymptotics of P(S_n > x) as x → ∞ with n fixed are somewhat involved (the standard light-tailed asymptotics is for P(S_n > bn) as n → ∞ with b fixed, cf. e.g. Jensen, 1995). It is possible to obtain more general versions of Example 3.2 for close-to-exponential tails by using results of Cline (1986), and of Proposition 3.5 for thinner tails by invoking Balkema et al. (1993). However, the adaptation of Balkema et al. (1993) is rather technical and can be found in Asmussen et al. (2017).
One may note that the variance reduction is so moderate in the range of z considered in Figure 3(b) that CdMC may hardly be worthwhile for light tails except for possibly very small n. If variance reduction is a major concern, the obvious alternative is to use the standard IS algorithm which uses an exponential change of measure (ECM). The r.v.'s X_1, ..., X_n are here generated from the exponentially twisted distribution with density f_θ(x) = e^(θx) f(x) / E[e^(θX)], where θ should be chosen such that E_θ[S_n] = x. The estimator of P(S_n > x) is

e^(−θS_n) (E[e^(θX)])^n I(S_n > x),   (3.6)

see Asmussen & Glynn (2007: 167–169) for more detail. Further variance reduction would be obtained by applying CdMC to (3.6), as implemented in the following example.
Example 3.6. To illustrate the potential of the IS-ECM algorithm, we consider the sum of n = 10 r.v.'s which are γ(3,1), at the z = 0.95, 0.99 quantiles x_z. The exponentially twisted distribution is γ(3, 1 − θ), and E_θ[S_n] = x means 3/(1 − θ) = x/n, i.e., θ = 1 − 3/(x/n). With R = 100,000 replications, we obtained the values of r_n(x) at the z-quantiles for z = 0.95, 0.99 given in Table 1. It is seen that IS-ECM indeed performs much better than CdMC, but that CdMC is also moderately useful for providing some further variance reduction. ◊
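A sketch of the IS-ECM estimator (3.6) for this γ(3,1) example (computing the quantile x_z from the exact gamma distribution of S_n is an implementation convenience assumed here, not part of the example):

import numpy as np
from scipy import stats

def is_ecm_tail_prob(x, n=10, shape=3.0, reps=100_000, seed=5):
    """IS-ECM estimator (3.6) of P(S_n > x) for a sum of n i.i.d. gamma(shape, 1) terms.

    The exponentially twisted distribution is gamma(shape, rate 1 - theta),
    with theta chosen so that E_theta[S_n] = x.
    """
    rng = np.random.default_rng(seed)
    theta = 1.0 - shape / (x / n)                   # from shape / (1 - theta) = x / n
    samples = rng.gamma(shape, 1.0 / (1.0 - theta), size=(reps, n))  # scale = 1 / rate
    s_n = samples.sum(axis=1)
    # Likelihood ratio exp(-theta*S_n) * (E[e^{theta X}])^n, with E[e^{theta X}] = (1-theta)^(-shape)
    mgf = (1.0 - theta) ** (-shape)
    estimates = np.exp(-theta * s_n) * mgf**n * (s_n > x)
    return estimates.mean(), estimates.std(ddof=1) / np.sqrt(reps)

# z = 0.99 quantile of S_10, which is gamma(30, 1) exactly
x99 = stats.gamma.ppf(0.99, a=30)
print(is_ecm_tail_prob(x99), "exact:", stats.gamma.sf(x99, a=30))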
A further financially relevant implementation of the IS-ECM algorithm is in Asmussen et al. (2016)
for lognormal sums. It is unconventional because it deals with the left tail (which is light) rather than
the right tail (which is heavy) and because the ECM is not explicit but done in an approximately
efficient way. Another IS algorithm for the left lognormal sum tail is in Gulisashvili & Tankov
(2016), but the numerical evidence of Asmussen et al. (2016) makes its efficiency somewhat doubtful.
CONCLUSIONS AND FURTHER DISCUSSIONS
This article provides a unified view of the simulation of VaR, CVaR, and their sensitivities. It also gives a brief review of VaR and CVaR optimization. These topics are inherently related and are important components of financial risk management. We believe the methodologies and techniques covered in this article are very important for financial risk management practice.

However, the content of this article is far from sufficient for the practice of risk management. In this article, we have mainly focused on research for dealing with VaR and CVaR. We did not study in depth the properties of the VaR and CVaR risk measures. Every risk measure has its properties, advantages, and disadvantages. Understanding these properties is important and could be beneficial from a risk management perspective. For instance, one important feature of VaR optimization is that the model may result in a very skewed loss distribution, and consequently, the risk may hide in the tail of the distribution (see, e.g., Natarajan et al. [2008]). This issue is very important for risk management practice. Similarly, we think the CVaR optimization model may also bring important issues. For instance, Lim et al. [2011] showed that CVaR is fragile in portfolio optimization; that is, estimation errors in CVaR may affect optimization results, and thus decisions, significantly. Also, we did not include any empirical study on VaR and CVaR, which is very important. It would be of great value to analyze VaR/CVaR-based models and to study the pros and cons of these models in practice using the data and information available.
Another important theoretical question is the specification of the distributions of random variables in risk management models. In the context of this article, we have assumed that an input distribution is predetermined and is given to modelers. However, in practice, it is often difficult to specify the input distribution precisely. A considerable amount of research has been devoted to the issue of uncertainty in models of VaR/CVaR (see, e.g., El Ghaoui et al. [2003], Zymler et al. [2013], Hu and Hong [2012], Hu et al. [2013a], and many others). However, it is far from sufficient, and more study of input uncertainty is needed in the context of financial risk management. Modeling input uncertainty should incorporate the information available and should reflect practice.