Better approximations to cumulative normal functions
Graeme West
Financial Modelling Agency and
Programme in Advanced Mathematics of Finance,
School of Computational & Applied Mathematics,
University of the Witwatersrand, Private Bag 3, Wits 2050, South Africa
graeme@finmod.co.za
1. The Need for High Precision Cumulative Normal Functions
Espen Haug relates to me how his book (Haug 1998) received a rather scathing review on the Amazon website from one reader; the underlying reason for the complaint is in actual fact the inaccuracy of the cumulative normal approximation in the book, an inaccuracy which is in turn inherited by the bivariate cumulative approximation. As a consequence, option prices where the bivariate cumulative is used can be negative, under inputs which are not at all absurd!
It is important to remember that in most if not all approximations, the n-variate cumulative function will use the (n − 1)-variate. It makes sense a priori to have a high precision univariate cumulative normal, but it makes even more sense if we are going to use the bivariate cumulative normal, as (besides needing to be satisfactory in its own right) this will rely on the univariate that we have chosen. And if we use a trivariate cumulative, then we will require a high precision bivariate cumulative function.
2. Univariate Cumulative Normal
The Cumulative Standard Normal Integral is the function:

N(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-X^2/2} \, dX \qquad (1)
As is well known, a closed form solution does not exist for this integral,
so a numerical approximation needs to be implemented. Most common
is an approximation which involves an exponential and a fifth degree
polynomial, given in (Abramowitz & Stegun 1974), and repeated in (Hull
2002, §12.9) and (Haug 1998, Appendix A), for example. (Some other
shorter and less accurate approximations are available, but the above algorithm is hardly complicated.) This function is used by most option
exchanges for futures option pricing and margining, and hence may be
preferred to better methods, in order to maintain consistency with the
results from the exchange.
However, another option is the algorithm that first appears in (Hart 1968). This algorithm uses high degree rational functions to obtain the approximation, and is accurate to double precision throughout the real line.
We can compare the performance of the Excel NORMSDIST function,
the (Abramowitz & Stegun 1974) function (AS henceforth) and the Hart
function. The poor reputation of the NORMSDIST function is probably
unwarranted, with this and the AS function materially the same for
inputs in the range [−6, +6] , which of course is more than adequate for
all purposes. (Anybody working so far in the tails of a normal distribution in financial mathematics is probably working with the wrong distribution anyway, so whether or not the results that are returned there are
accurate is probably moot.) In Figure 1 we see the relative values of these
functions (‘f relative to g’ at a point x means that we are graphing the value (f(x) − g(x))/g(x)). Here we see the consistency of NORMSDIST until −6 (as all the functions involved are symmetric, we only plot for negative real numbers), after which its behaviour is rather mysterious.
A vb version of the Hart function is in Figure 2 (see footnote 1).
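Figure 2 itself is an image and is not reproduced in this text. As an indication of what the Hart (1968) approach looks like, the following is a sketch of a double precision cumulative normal in VBA, built in the same way: a rational function of the absolute argument multiplied by the Gaussian exponential, with a Mills-ratio continued fraction for the far tail. The coefficients are those commonly reproduced for Hart's algorithm 5666; the definitive version is the code on the author's web page (West 2004), against which any reimplementation should be checked.

Public Function Cumnorm(ByVal x As Double) As Double
    ' Sketch of a Hart-type double precision cumulative normal.
    Dim xAbs As Double, expo As Double, build As Double
    xAbs = Abs(x)
    If xAbs > 37 Then
        Cumnorm = 0                          ' tail probability is negligible in double precision
    Else
        expo = Exp(-xAbs * xAbs / 2)
        If xAbs < 7.07106781186547 Then
            ' rational approximation: a degree 6 over a degree 7 polynomial in xAbs
            build = 0.0352624965998911 * xAbs + 0.700383064443688
            build = build * xAbs + 6.37396220353165
            build = build * xAbs + 33.912866078383
            build = build * xAbs + 112.079291497871
            build = build * xAbs + 221.213596169931
            build = build * xAbs + 220.206867912376
            Cumnorm = expo * build
            build = 0.0883883476483184 * xAbs + 1.75566716318264
            build = build * xAbs + 16.064177579207
            build = build * xAbs + 86.7807322029461
            build = build * xAbs + 296.564248779674
            build = build * xAbs + 637.333633378831
            build = build * xAbs + 793.826512519948
            build = build * xAbs + 440.413735824752
            Cumnorm = Cumnorm / build
        Else
            ' continued fraction for the Mills ratio in the far tail
            build = xAbs + 0.65
            build = xAbs + 4 / build
            build = xAbs + 3 / build
            build = xAbs + 2 / build
            build = xAbs + 1 / build
            Cumnorm = expo / build / 2.506628274631
        End If
    End If
    If x > 0 Then Cumnorm = 1 - Cumnorm      ' the branches above compute the lower tail N(-|x|)
End Function

Note that at x = 0 the rational branch returns exactly 220.206867912376 / 440.413735824752 = 0.5, which is precisely the property exploited in §4.1 below.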
As pointed out in (Acklam 2004), having such a double precision
function has some rather pleasant spin-offs. For example, the Moro transform to find inverse cumulative normals is well known. Having the ability to generate normally distributed variables from a (quasi) random uniform sample is clearly important in work involving Monte Carlo experiments, and the Moro transformation is fast and accurate to about 10 decimal places.2 Given a function that can compute the normal cumulative
distribution function to double precision, the Moro approximation (and,
in fact, ANY initial approximation) of the inverse normal cumulative distribution function can be refined to full machine precision, by a fairly
straightforward application of Newton’s method. In fact, higher degree
methods such as Newton’s second order method (sometimes called the
Newton-Bailey method) or a third order method known as Halley’s
method will be the fastest, and are very amenable here, because the
Gaussian function is so easily differentiated over and over—see (Acklam
2002) and (Acklam 2004).
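As an illustration of the refinement step (not of the Moro or Acklam approximations themselves), the sketch below applies plain Newton iterations x ← x − (N(x) − p)/n(x) to a deliberately crude starting guess, assuming the double precision Cumnorm above and an input 0 < p < 1. In practice one would start from the Moro or Acklam value, in which case one or two steps (or a single Newton-Bailey or Halley step) suffice.

Public Function CumnormInverse(ByVal p As Double) As Double
    ' Newton refinement of the inverse cumulative normal, assuming 0 < p < 1.
    Dim x As Double, diff As Double, dens As Double, i As Integer
    x = 0                                         ' crude seed; use Moro or Acklam here in practice
    For i = 1 To 50
        diff = Cumnorm(x) - p
        dens = Exp(-x * x / 2) / 2.506628274631   ' normal density; 2.5066... = Sqr(2 * pi)
        If dens = 0 Then Exit For                 ' too far in the tails for a Newton step
        x = x - diff / dens                       ' Newton step
        If Abs(diff) < 0.000000000000001 Then Exit For
    Next i
    CumnormInverse = x
End Function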
3. Bivariate Cumulative Normal
The cumulative bivariate normal distribution is the function
N2 (x, y, ρ) =
1
2π 1 − ρ 2
x
y
exp
−∞
−∞
−(X2 − 2ρXY + Y 2 )
dY dX
2(1 − ρ 2 )
Again, approximations are required. The most common algorithm is that
of (Drezner 1978), which appears in both (Hull 2002, Appendix 12C) and
in (Haug 1998, Appendix A.2), for example.
A more accurate pair of approximations was developed in (Drezner &
Wesolowsky 1989). A first function is given on pg. 103 of that paper.
However, it is then acknowledged that for ρ near to ±1, there will be inaccuracies, and so another algorithm is provided on pg. 105. We will call
these algorithms DW1 and DW2 respectively.
The DW2 method is single precision. (Genz 2004) has provided a modification of that algorithm which is near double precision. Again, we
have a vb version of the FORTRAN code of Genz. Adaptation was needed
because the algorithm calculated the complementary probability i.e. the
probability that X ≥ x, Y ≥ y, given the correlation coefficient. The algorithm has been adapted to return the more usual (in mathematical
finance anyway!) probability that X ≤ x, Y ≤ y.
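As a sketch of that adaptation (BivarUpperTail below is a placeholder name for whichever implementation of the Genz complementary probability is in use), one can exploit the symmetry of the normal distribution: negating both variables leaves the correlation unchanged, so P(X ≤ x, Y ≤ y) is just the complementary probability evaluated at (−x, −y).

Public Function Cumnorm2FromComplement(ByVal x As Double, ByVal y As Double, _
                                       ByVal rho As Double) As Double
    ' P(X <= x, Y <= y) = P(-X >= -x, -Y >= -y), and (-X, -Y) has correlation rho too.
    Cumnorm2FromComplement = BivarUpperTail(-x, -y, rho)
End Function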
In (Ağca & Chance 2003) the speed of the various bivariate approximations is considered.
4. Option Pricing Disasters
4.1 Problems with the univariate
Figure 1: Relative values of the three univariate normal approximations.
The probably esoteric advantage mentioned in §2 of using the double precision algorithm, in order to ‘tone’ our cumulative normal inverse approximations, pales into insignificance when we analyse the example provided by (the critic of) Espen Haug.
The example that Espen tells me of is a partial-time-start barrier option (Haug 1998, §2.10.3), i.e. an up-and-out call type A with asset price S = 75, strike price X = 85, barrier H = 95, time t1 = 0.35 years, time to maturity t2 = 0.5 years, risk free rate 10%, cost of carry 5%, and volatility 4%, with continuous monitoring of the barrier. The option price returned by Espen’s software is −0.0393. For a volatility of 3%, it returns a value of −3860.5652, and as σ ↓ 0, so V ↓ −∞.
Why does this come about? We created code for this and the other option pricing formulae mentioned here, where in addition to the parameters of the option, one needs to specify which univariate and which bivariate approximation one is using (the bivariate itself calling the specified univariate when required). Thus, one can dig down into each factor of the option price and isolate the problem.

We fix a volatility of 3%. Using the code notation of Espen, the option price involves the calculation of the value N2(g4, −e4, ρ). We have g4 = 7.54255645241296, e4 = 12.7827258096518 and ρ = 0.25. Using the bivariate method of (Drezner 1978) and the AS and Hart univariate methods, one obtains values of 5.24808418944644E-10 and 0 respectively. (The execution of the Drezner function involves several recursive calls to itself, which include calls to the relevant univariate function.) While the difference may not appear material, this value is subsequently multiplied by (h/S)^{2µ} = 504727548721.962 in the option price!

In actual fact, the problem is the evaluation of N(0). While we all know that the answer is 0.5, the AS code doesn’t: it returns the value 0.5000000010279300. While of course this is correct to the claimed 6 decimal places, it is the root of the problem. If you add to the AS algorithm a clause which instructs the function to return 0.5 exactly, and exit, whenever the input is 0, the problem goes away! Of course, this is not exactly a very appealing solution: it should worry. But the Hart algorithm does return 0.5 to double precision.

A similar problem can be created for the Bjerksund and Stensland formula (Bjerksund & Stensland 2002). In both cases the problem can be resolved by using the corrected AS algorithm or the Hart algorithm.

The rule of thumb is that as soon as our option becomes exotic, and we need an n-variate cumulant, the cumulant functions of lower order should be double precision. But even that might not be enough, as we will see now.

Figure 2: A double precision univariate normal function. (All variables which are not declared here are set as private variables elsewhere.)

Figure 3: The bivariate cumulative normal function, ρ = 50%.

4.2 Problems with the bivariate: ρ = ±1

A problem that arises with the Drezner bivariate function is the failure to account for the case where ρ = ±1. This observation is important, because even though the initial correlation might not be equal to ±1, the algorithm of Drezner might make a function call where the correlation is indeed equal to ±1.
Let us consider the call on the minimum option pricing formula of (Stulz 1982) with σ1 = 40.00%, σ2 = 25.00%, ρ = −1.00%, S1 = S2 = X = 100, τ = 2, r = 8.00%. Once again the option price will include the calculation of the quantity N2(a, b, ρ), where a = −4.9065389333868E−17, b = 0.275771644662754 and ρ = −0.01. The Drezner algorithm now performs a very extensive recursive evaluation, which critically relies on which octant of three dimensional space (a, b, ρ) lies in. And here, the question of which octant a point on one of the xy, xz and yz planes is assigned to is crucial. Essentially, the problem is familiar: the code does not recognise that ‘in reality’ a = 0, so the above point is on the yz plane, and should be assigned to an octant where x ≥ 0, rather than an octant where x < 0. As a consequence the code now fails to execute, because a value of ρ = 1 is used. The Drezner algorithm always involves a normalisation of a and b by division by √(2(1 − ρ²)), and so a division by 0 occurs. If, in this particular case, we manually override a with a value of 0, the Drezner algorithm executes properly (because a function value with ρ = 1 never occurs), returning a value of 0.302786942980365.

Figure 4: The price of a call on the minimum, using the DW1 bivariate cumulant and the DW2 bivariate cumulant. The underlying univariate is the Hart function in both cases. ρ = −70%.

The general solution is not to try to manipulate the definition of the octants, but rather to build in traps for the limiting cases. Note that, in the sense of a limit,

N_2(x, y, 1) = N(\min(x, y)) \qquad (3)

N_2(x, y, -1) = \begin{cases} 0 & \text{if } y \le -x \\ N(x) + N(y) - 1 & \text{if } y > -x \end{cases} \qquad (4)
So, we can build in a test for whether |ρ| = 1, and if so, the algorithm executes as above and the function exits. It is not even sufficient in the Drezner algorithm to test directly whether |ρ| = 1, because to machine precision this can be false while, to the same precision, it is true that √(2(1 − ρ²)) = 0. So the latter needs to be the criterion for testing.
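A minimal sketch of such a trap, assuming a double precision univariate Cumnorm (such as the sketch in §2) and an existing bivariate routine (DreznerBivar below is a placeholder name for whichever implementation is being wrapped):

Public Function Cumnorm2WithTrap(ByVal x As Double, ByVal y As Double, _
                                 ByVal rho As Double) As Double
    ' Test Sqr(2 * (1 - rho * rho)) rather than Abs(rho) = 1, as discussed above.
    If Sqr(2 * (1 - rho * rho)) = 0 Then
        If rho > 0 Then
            ' equation (3): N2(x, y, 1) = N(min(x, y))
            If x < y Then Cumnorm2WithTrap = Cumnorm(x) Else Cumnorm2WithTrap = Cumnorm(y)
        Else
            ' equation (4): N2(x, y, -1) = 0 if y <= -x, else N(x) + N(y) - 1
            If y <= -x Then
                Cumnorm2WithTrap = 0
            Else
                Cumnorm2WithTrap = Cumnorm(x) + Cumnorm(y) - 1
            End If
        End If
    Else
        Cumnorm2WithTrap = DreznerBivar(x, y, rho)   ' not a limiting case: the usual algorithm
    End If
End Function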
4.3 Problems with the bivariate: negative option values

Taking the care mentioned in §4.1 turns out to be insufficient. The DW1 algorithm fails to avoid the problem of negative option values even when the univariate is double precision, and even when the correlation coefficient is quite far away from ±1. We can find inputs where the call on the minimum formula of (Stulz 1982) will return negative values even for ρ ≈ −70%. For example, with S1 = 85, S2 = 60, σ1 = 40.00%, σ2 = 25.00%, X = 100, ρ = −70.00%, r = 8.00% and τ = 2 years, using the DW1 algorithm (with an underlying Hart univariate) gives an option premium of −0.0038939. When using the DW2 algorithm the value improves to 0.0180211. The price using the Genz algorithm is 0.0180005. See Figure 4.

This valley of negative option values is exacerbated as ρ → −1; see Figure 5.

Figure 5: The price of a call on the minimum, using the DW1 bivariate cumulant. The underlying univariate is the Hart function. ρ = −97%.

4.4 Problems with the bivariate: underflow

Remarkably, I have stumbled on problems with the DW2 algorithm too. This algorithm is in Figure 6.

Figure 6: A visual basic version of the DW2 function, modified to return the usual rather than complementary probabilities.
An obvious coding failsafe strategy is
always to check that whenever division occurs
in any of your algorithms, that division cannot be by 0. An obvious example where this
immediately bears fruit is having option pricing formulae that correctly return the intrinsic value of the option at maturity, rather
than not executing. Recall that any formula of
Black-Scholes type will have one or more divisions by the square root of the annualised
term left to expiry in their execution.
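A minimal sketch of that failsafe, for the plain (non-dividend) Black-Scholes call and assuming the Cumnorm function above; the names and the zero-term convention here are illustrative only:

Public Function BlackScholesCall(ByVal S As Double, ByVal X As Double, ByVal r As Double, _
                                 ByVal sigma As Double, ByVal tau As Double) As Double
    ' Assumes sigma > 0.
    Dim d1 As Double, d2 As Double
    If tau <= 0 Then
        ' at (or past) expiry the formula would divide by Sqr(tau) = 0,
        ' so return the intrinsic value instead of failing to execute
        If S > X Then BlackScholesCall = S - X Else BlackScholesCall = 0
    Else
        d1 = (Log(S / X) + (r + sigma * sigma / 2) * tau) / (sigma * Sqr(tau))
        d2 = d1 - sigma * Sqr(tau)
        BlackScholesCall = S * Cumnorm(d1) - X * Exp(-r * tau) * Cumnorm(d2)
    End If
End Function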
This problem occurs in the DW2 algorithm, but in quite a subtle way. Examining Figure 6, we see that a quantity h7 is defined to be exp(−h3/2). Later on we have a loop which involves evaluation of the quantity exp(−h3/(1 + r2))/r2/h7. If h3 is large, then h7 might evaluate as 0 to the precision of the language we are employing, that is, it underflows (see footnote 3). In this case, the above fraction is of the form 0/0, and will not evaluate. However, this form suggests l'Hôpital's rule, and sure enough, the necessary calculations can be made, showing that this quantity is equal to 0, in the sense of a limit.
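The Figure 7 code is the author's resolution; as an indication of the idea, the sketch below (in the notation of Figure 6, taking h3 and r2 as inputs) combines the two exponentials analytically, so the troublesome term is computed as exp(−h3/(1 + r2) + h3/2)/r2 and is simply taken as 0 when the combined exponent is so negative that the exponential would underflow anyway:

Function UnderflowSafeTerm(ByVal h3 As Double, ByVal r2 As Double) As Double
    ' Equivalent to Exp(-h3 / (1 + r2)) / r2 / Exp(-h3 / 2), but without the 0/0 form.
    Dim expnt As Double
    expnt = -h3 / (1 + r2) + h3 / 2          ' = -h3 * (1 - r2) / (2 * (1 + r2))
    If expnt < -700 Then                     ' Exp would underflow to 0 in double precision
        UnderflowSafeTerm = 0
    Else
        UnderflowSafeTerm = Exp(expnt) / r2
    End If
End Function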
The closed form American option pricing model of (Bjerksund & Stensland 2002)
uses the bivariate normal, and use of the
DW2 algorithm without the above modification will fail if the dividend yield is nonzero but very small. For example, with a
strike of 100, term of half a year, risk free
rate of 10%, dividend yield of 0.10%, and
volatility 20% the option price will fail for
calls. At one point the bivariate function is
called with a set of parameters that is reminiscent of those that we saw in §4.1, and
this time the underflow explained above
occurs. Note that if the dividend yield were
zero, then it is known that it is sub-optimal
to exercise calls early, and the model of
(Bjerksund & Stensland 2002) correctly
diverts to the Black-Scholes formula.
So, this modified DW2 algorithm might
be the algorithm of choice. It is not as accurate as the Genz algorithm, but does not
have any material inaccuracies, and is certainly a lot more compact. The modified
algorithm is in Figure 7.
Figure 7: A visual basic version of the DW2 function, modified to return the usual rather than complementary probabilities, with the underflow problem resolved.
5. The Trivariate Cumulative Normal Function
The cumulative trivariate normal distribution is the function

N_3(x_1, x_2, x_3, \Sigma) = \frac{1}{(2\pi)^{3/2} \sqrt{|\Sigma|}} \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \int_{-\infty}^{x_3} \exp\left( -\tfrac{1}{2}\, X' \Sigma^{-1} X \right) dX_3 \, dX_2 \, dX_1 \qquad (5)

where Σ is the correlation matrix between the standardised (scaled) variables X1, X2, X3, and | · | denotes determinant. Denote by N3(x1, x2, x3, ρ21, ρ31, ρ32) the function N3(x1, x2, x3, Σ) where

\Sigma = \begin{pmatrix} 1 & \rho_{21} & \rho_{31} \\ \rho_{21} & 1 & \rho_{32} \\ \rho_{31} & \rho_{32} & 1 \end{pmatrix}.
Again, approximations are required. Code for the trivariate cumulative normal is not generally available. There are a few highly non-transparent publications, for example (Schervish 1984), but this code is
known to be faulty. We have used the algorithm in (Genz 2004). This has
required extensive modifications because the algorithms are implemented in Fortran, using language properties which are not readily translated. The function in (Genz 2004) returns the complementary probability; again, we have modified it to return the usual probability that Xi ≤ xi (i = 1, 2, 3) given a correlation matrix. Again, it is claimed that this algorithm is double precision; high accuracy (of our vb translation) has been
verified by testing against Niederreiter quasi-Monte Carlo integration
(using the Matlab algorithm qsimvn.m, also at the website of Genz).
In a forthcoming paper, we will look at applying this function to the
pricing of rainbow options on 3 assets, as in the work of (Johnson 1987).
FOOTNOTES & REFERENCES
Thanks to Axel Vogt for testing the accuracy of some of the function implementations, and to Espen Haug for suggesting this topic.
1. This and all of the other vba code mentioned here is available from the author's web page (West 2004).
2. In contrast to before, the reader is now warned against the use of the built-in Excel function NORMSINV, which is patently absurd. Of course, this inverse function should take arguments in the interval (0, 1) and map them to the real line. In fact, NORMSINV returns the value ±50000 for input values within 0.0000003 of 1 or 0 respectively. Given that values so close to 0 or 1 are on occasion provided by uniform random number generators, this approach is to be avoided.
3. This is dependent on the language used. But this example fails in C++, for example.
■ Abramowitz, M. & Stegun, I. A. (1974), Handbook of Mathematical Functions, With
Formulas, Graphs, and Mathematical Tables, Dover.
■ Acklam, P. J. (2002), ‘A small paper on Halley’s method’.
http://home.online.no/~pjacklam
■ Acklam, P. J. (2004), ‘An algorithm for computing the inverse normal cumulative distribution function’. http://home.online.no/~pjacklam
■ Ağca, Ş. & Chance, D. M. (2003), ‘Speed and accuracy comparison of bivariate normal
distribution approximations for option pricing’, Journal of Computational Finance 6(4),
61–96.
■ Bjerksund, P. & Stensland, G. (2002), ‘Closed form valuation of American options’.
http://www.nhh.no
■ Drezner, Z. (1978), ‘Computation of the bivariate normal integral’, Mathematics of
Computation 32, 277–279.
■ Drezner, Z. & Wesolowsky, G. (1989), ‘On the computation of the bivariate normal
integral’, Journal of Statist. Comput. Simul. 35, 101–107.
■ Genz, A. (2004), ‘Numerical computation of rectangular bivariate and trivariate normal
and t probabilities’, Statistics and Computing 14, 151–160.
http://www.sci.wsu.edu/math/faculty/genz/homepage
■ Hart, J. (1968), Computer Approximations, Wiley. Algorithm 5666 for the error function.
■ Haug, E. G. (1998), The complete guide to option pricing formulas, McGraw-Hill.
■ Hull, J. (2002), Options, Futures, and Other Derivatives, fifth edn, Prentice Hall.
■ Johnson, H. (1987), ‘Options on the maximum or the minimum of several assets’,
Journal of Financial and Quantitative Analysis 22, 277–283.
■ Schervish, M. (1984), ‘Multivariate normal probabilities with error bound’, Applied
Statistics 33, 81–87.
■ Stulz, R. M. (1982), ‘Options on the minimum or the maximum of two risky assets’,
Journal of Financial Economics XXXIII, No. 1, 161–185.
■ West, G. (2004). http://www.cam.wits.ac.za/mfinance/graeme