Better approximations to cumulative normal functions
Graeme West
Financial Modelling Agency and
Programme in Advanced Mathematics of Finance,
School of Computational & Applied Mathematics,
University of the Witwatersrand, Private Bag 3, Wits 2050, South Africa
graeme@finmod.co.za
1. The Need for High Precision Cumulative Normal Functions
Espen Haug relates to me how his book (Haug 1998) received a rather scathing review on the Amazon website from one reader; the underlying reason for the complaint is in actual fact the inaccuracy of the cumulative normal approximation in the book, an inaccuracy which is in turn inherited by the bivariate cumulative approximation. As a consequence, option prices where the bivariate cumulative is used can be negative, under inputs which are not at all absurd!
It is important to remember that in most if not all approximations, the n-variate cumulative function will use the (n − 1)-variate. It makes sense a priori to have a high precision univariate cumulative normal, but it makes even more sense if we are going to use the bivariate cumulative normal, as (besides needing to be satisfactory in its own right) this will rely on the univariate that we have chosen. And if we use a trivariate cumulative, then we will require a high precision bivariate cumulative function.
2. Univariate Cumulative Normal
The Cumulative Standard Normal Integral is the function:

N(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-X^2/2} \, dX \qquad (1)
As is well known, a closed form solution does not exist for this integral,
so a numerical approximation needs to be implemented. Most common
is an approximation which involves an exponential and a fifth degree
polynomial, given in (Abramowitz & Stegun 1974), and repeated in (Hull
2002, §12.9) and (Haug 1998, Appendix A), for example. (Some other
shorter and less accurate approximations are available, but the above algorithm is hardly complicated.) This function is used by most option
exchanges for futures option pricing and margining, and hence may be
preferred to better methods, in order to maintain consistency with the
results from the exchange.
However, another option is the algorithm that first appears in (Hart 1968). This algorithm uses high degree rational functions to obtain the approximation, and is accurate to double precision throughout the real line.
We can compare the performance of the Excel NORMSDIST function,
the (Abramowitz & Stegun 1974) function (AS henceforth) and the Hart
function. The poor reputation of the NORMSDIST function is probably
unwarranted, with this and the AS function materially the same for
inputs in the range [−6, +6] , which of course is more than adequate for
all purposes. (Anybody working so far in the tails of a normal distribution in financial mathematics is probably working with the wrong distribution anyway, so whether or not the results that are returned there are
accurate is probably moot.) In Figure 1 we see the relative values of these
functions (‘f relative to g’ at a point x means that we are graphing the value (f(x) − g(x))/g(x)). Here we see the consistency of NORMSDIST until −6 (as all the functions involved are symmetric, we only plot for negative real numbers), after which its behaviour is rather mysterious.
A vb version of the Hart function is in Figure 2 (see footnote 1).
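Figure 2 itself is an image and is not reproduced in this text. As an indication of what the Hart (1968) approach looks like, the following is a sketch of a double precision cumulative normal in VBA, built in the same way: a rational function of the absolute argument multiplied by the Gaussian exponential, with a Mills-ratio continued fraction for the far tail. The coefficients are those commonly reproduced for Hart's algorithm 5666; the definitive version is the code on the author's web page (West 2004), against which any reimplementation should be checked.

Public Function Cumnorm(ByVal x As Double) As Double
    ' Sketch of a Hart-type double precision cumulative normal.
    Dim xAbs As Double, expo As Double, build As Double
    xAbs = Abs(x)
    If xAbs > 37 Then
        Cumnorm = 0                          ' tail probability is negligible in double precision
    Else
        expo = Exp(-xAbs * xAbs / 2)
        If xAbs < 7.07106781186547 Then
            ' rational approximation: a degree 6 over a degree 7 polynomial in xAbs
            build = 0.0352624965998911 * xAbs + 0.700383064443688
            build = build * xAbs + 6.37396220353165
            build = build * xAbs + 33.912866078383
            build = build * xAbs + 112.079291497871
            build = build * xAbs + 221.213596169931
            build = build * xAbs + 220.206867912376
            Cumnorm = expo * build
            build = 0.0883883476483184 * xAbs + 1.75566716318264
            build = build * xAbs + 16.064177579207
            build = build * xAbs + 86.7807322029461
            build = build * xAbs + 296.564248779674
            build = build * xAbs + 637.333633378831
            build = build * xAbs + 793.826512519948
            build = build * xAbs + 440.413735824752
            Cumnorm = Cumnorm / build
        Else
            ' continued fraction for the Mills ratio in the far tail
            build = xAbs + 0.65
            build = xAbs + 4 / build
            build = xAbs + 3 / build
            build = xAbs + 2 / build
            build = xAbs + 1 / build
            Cumnorm = expo / build / 2.506628274631
        End If
    End If
    If x > 0 Then Cumnorm = 1 - Cumnorm      ' the branches above compute the lower tail N(-|x|)
End Function

Note that at x = 0 the rational branch returns exactly 220.206867912376 / 440.413735824752 = 0.5, which is precisely the property exploited in §4.1 below.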
As pointed out in (Acklam 2004), having such a double precision
function has some rather pleasant spin-offs. For example, the Moro transform to find inverse cumulative normals is well known. Having the ability to generate normally distributed variables from a (quasi) random uniform sample is clearly important in work involving Monte Carlo experiments, and the Moro transformation is fast and accurate to about 10 decimal places.2 Given a function that can compute the normal cumulative
distribution function to double precision, the Moro approximation (and,
in fact, ANY initial approximation) of the inverse normal cumulative distribution function can be refined to full machine precision, by a fairly
straightforward application of Newton’s method. In fact, higher degree
methods such as Newton’s second order method (sometimes called the
Newton-Bailey method) or a third order method known as Halley’s
method will be the fastest, and are very amenable here, because the
Gaussian function is so easily differentiated over and over—see (Acklam
2002) and (Acklam 2004).
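As an illustration of the refinement step (not of the Moro or Acklam approximations themselves), the sketch below applies plain Newton iterations x ← x − (N(x) − p)/n(x) to a deliberately crude starting guess, assuming the double precision Cumnorm above and an input 0 < p < 1. In practice one would start from the Moro or Acklam value, in which case one or two steps (or a single Newton-Bailey or Halley step) suffice.

Public Function CumnormInverse(ByVal p As Double) As Double
    ' Newton refinement of the inverse cumulative normal, assuming 0 < p < 1.
    Dim x As Double, diff As Double, dens As Double, i As Integer
    x = 0                                         ' crude seed; use Moro or Acklam here in practice
    For i = 1 To 50
        diff = Cumnorm(x) - p
        dens = Exp(-x * x / 2) / 2.506628274631   ' normal density; 2.5066... = Sqr(2 * pi)
        If dens = 0 Then Exit For                 ' too far in the tails for a Newton step
        x = x - diff / dens                       ' Newton step
        If Abs(diff) < 0.000000000000001 Then Exit For
    Next i
    CumnormInverse = x
End Function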
3. Bivariate Cumulative Normal
The cumulative bivariate normal distribution is the function
N2 (x, y, ρ) =
1
2π 1 − ρ 2
x
y
exp
−∞
−∞
−(X2 − 2ρXY + Y 2 )
dY dX
2(1 − ρ 2 )
Again, approximations are required. The most common algorithm is that
of (Drezner 1978), which appears in both (Hull 2002, Appendix 12C) and
in (Haug 1998, Appendix A.2), for example.
A more accurate pair of approximations was developed in (Drezner &
Wesolowsky 1989). A first function is given on pg. 103 of that paper.
However, it is then acknowledged that for ρ near to ±1, there will be inaccuracies, and so another algorithm is provided on pg. 105. We will call
these algorithms DW1 and DW2 respectively.
The DW2 method is single precision. (Genz 2004) has provided a modification of that algorithm which is near double precision. Again, we
have a vb version of the FORTRAN code of Genz. Adaptation was needed
because the algorithm calculated the complementary probability i.e. the
probability that X ≥ x, Y ≥ y, given the correlation coefficient. The algorithm has been adapted to return the more usual (in mathematical
finance anyway!) probability that X ≤ x, Y ≤ y.
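As a sketch of that adaptation (BivarUpperTail below is a placeholder name for whichever implementation of the Genz complementary probability is in use), one can exploit the symmetry of the normal distribution: negating both variables leaves the correlation unchanged, so P(X ≤ x, Y ≤ y) is just the complementary probability evaluated at (−x, −y).

Public Function Cumnorm2FromComplement(ByVal x As Double, ByVal y As Double, _
                                       ByVal rho As Double) As Double
    ' P(X <= x, Y <= y) = P(-X >= -x, -Y >= -y), and (-X, -Y) has correlation rho too.
    Cumnorm2FromComplement = BivarUpperTail(-x, -y, rho)
End Function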
In (Ağca & Chance 2003) the speed of the various bivariate approximations is considered.
4. Option Pricing Disasters
4.1 Problems with the univariate
Figure 1: Relative values of the three univariate normal approximations.
The probably esoteric advantage mentioned in §2 of using the double precision algorithm, in order to ‘tone’ our cumulative normal inverse approximations, pales into insignificance when we analyse the example provided by (the critic of) Espen Haug.
The example that Espen tells me of is a partial-time-start barrier option (Haug 1998, §2.10.3), i.e. an up-and-out call type A with asset price S = 75, strike price X = 85, barrier H = 95, time t1 = 0.35 years, time to maturity t2 = 0.5 years, risk free rate 10%, cost of carry 5%, and volatility 4%, with continuous monitoring of the barrier. The option price returned by Espen’s software is −0.0393. For a volatility of 3%, it returns a value of −3860.5652, and as σ ↓ 0, so V ↓ −∞.
Why does this come about? We created code for this and the other option pricing formulae mentioned here, where in addition to the parameters of the option, one needs to specify which univariate and which bivariate approximation one is using (the bivariate itself calling the specified univariate when required). Thus, one can dig down into each factor of the option price and isolate the problem.

We fix a volatility of 3%. Using the code notation of Espen, the option price involves the calculation of the value N2(g4, −e4, ρ). We have g4 = 7.54255645241296, e4 = 12.7827258096518 and ρ = 0.25. Using the bivariate method of (Drezner 1978) and the AS and Hart univariate methods, one obtains values of 5.24808418944644E-10 and 0 respectively. (The execution of the Drezner function involves several recursive calls to itself, which include calls to the relevant univariate function.) While the difference may not appear material, this value is subsequently multiplied by (h/S)^{2µ} = 504727548721.962 in the option price!

In actual fact, the problem is the evaluation of N(0). While we all know that the answer is 0.5, the AS code doesn’t: it returns the value 0.5000000010279300. While of course this is correct to the claimed 6 decimal places, it is the root of the problem. If you add to the AS algorithm a clause which instructs the function to return 0.5 exactly, and exit, whenever the input is 0, the problem goes away! Of course, this is not exactly a very appealing solution: it should worry. But the Hart algorithm does return 0.5 to double precision.

A similar problem can be created for the Bjerksund and Stensland formula (Bjerksund & Stensland 2002). In both cases the problem can be resolved by using the corrected AS algorithm or the Hart algorithm.

The rule of thumb is that as soon as our option becomes exotic, and we need an n-variate cumulant, the cumulant functions of lower order should be double precision. But even that might not be enough, as we will see now.

Figure 2: A double precision univariate normal function. (All variables which are not declared here are set as private variables elsewhere.)

Figure 3: The bivariate cumulative normal function, ρ = 50%.

4.2 Problems with the bivariate: ρ = ±1

A problem that arises with the Drezner bivariate function is the failure to account for the case where ρ = ±1. This observation is important, because even though the initial correlation might not be equal to ±1, the algorithm of Drezner might make a function call where the correlation is indeed equal to ±1.
Let us consider the call on the minimum option pricing formula of (Stulz 1982) with σ1 = 40.00%, σ2 = 25.00%, ρ = −1.00%, S1 = S2 = X = 100, τ = 2, r = 8.00%. Once again the option price will include the calculation of the quantity N2(a, b, ρ), where a = −4.9065389333868E−17, b = 0.275771644662754 and ρ = −0.01. The Drezner algorithm now performs a very extensive recursive evaluation, which critically relies on which octant of three dimensional space (a, b, ρ) lies in. And here, the question of which octant a point on one of the xy, xz and yz planes is assigned to is crucial. Essentially, the problem is familiar: the code does not recognise that ‘in reality’ a = 0, so the above point is on the yz plane, and should be assigned to an octant where x ≥ 0, rather than an octant where x < 0. As a consequence the code now fails to execute, because a value of ρ = 1 is used. The Drezner algorithm always involves a normalisation of a and b by division by √(2(1 − ρ²)), and so a division by 0 occurs. If, in this particular case, we manually override a with a value of 0, the Drezner algorithm executes properly (because a function value with ρ = 1 never occurs), returning a value of 0.302786942980365.

Figure 4: The price of a call on the minimum, using the DW1 bivariate cumulant and the DW2 bivariate cumulant. The underlying univariate is the Hart function in both cases. ρ = −70%.

The general solution is not to try to manipulate the definition of the octants, but rather to build in traps for the limiting cases. Note that, in the sense of a limit,

N_2(x, y, 1) = N(\min(x, y)) \qquad (3)

N_2(x, y, -1) = \begin{cases} 0 & \text{if } y \le -x \\ N(x) + N(y) - 1 & \text{if } y > -x \end{cases} \qquad (4)
So, we can build in a test for whether |ρ| = 1, and if so, the algorithm executes as above and the function exits. It is not even sufficient in the Drezner algorithm to test directly whether |ρ| = 1, because to machine precision this can be false while, to the same precision, it is true that √(2(1 − ρ²)) = 0. So the latter needs to be the criterion for testing.
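A minimal sketch of such a trap, assuming a double precision univariate Cumnorm (such as the sketch in §2) and an existing bivariate routine (DreznerBivar below is a placeholder name for whichever implementation is being wrapped):

Public Function Cumnorm2WithTrap(ByVal x As Double, ByVal y As Double, _
                                 ByVal rho As Double) As Double
    ' Test Sqr(2 * (1 - rho * rho)) rather than Abs(rho) = 1, as discussed above.
    If Sqr(2 * (1 - rho * rho)) = 0 Then
        If rho > 0 Then
            ' equation (3): N2(x, y, 1) = N(min(x, y))
            If x < y Then Cumnorm2WithTrap = Cumnorm(x) Else Cumnorm2WithTrap = Cumnorm(y)
        Else
            ' equation (4): N2(x, y, -1) = 0 if y <= -x, else N(x) + N(y) - 1
            If y <= -x Then
                Cumnorm2WithTrap = 0
            Else
                Cumnorm2WithTrap = Cumnorm(x) + Cumnorm(y) - 1
            End If
        End If
    Else
        Cumnorm2WithTrap = DreznerBivar(x, y, rho)   ' not a limiting case: the usual algorithm
    End If
End Function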
4.3 Problems with the bivariate: negative option values

Taking the care mentioned in §4.1 turns out to be insufficient. The DW1 algorithm fails to avoid the problem of negative option values even when the univariate is double precision, and even when the correlation coefficient is quite far away from ±1. We can find inputs where the call on the minimum formula of (Stulz 1982) will return negative values even for ρ ≈ −70%. For example, with S1 = 85, S2 = 60, σ1 = 40.00%, σ2 = 25.00%, X = 100, ρ = −70.00%, r = 8.00% and τ = 2 years, using the DW1 algorithm (with an underlying Hart univariate) gives an option premium of −0.0038939. When using the DW2 algorithm the value improves to 0.0180211. The price using the Genz algorithm is 0.0180005. See Figure 4.

This valley of negative option values is exacerbated as ρ → −1; see Figure 5.

Figure 5: The price of a call on the minimum, using the DW1 bivariate cumulant. The underlying univariate is the Hart function. ρ = −97%.

4.4 Problems with the bivariate: underflow

Remarkably, I have stumbled on problems with the DW2 algorithm too. This algorithm is in Figure 6.

Figure 6: A visual basic version of the DW2 function, modified to return the usual rather than complementary probabilities.
An obvious coding failsafe strategy is
always to check that whenever division occurs
in any of your algorithms, that division cannot be by 0. An obvious example where this
immediately bears fruit is having option pricing formulae that correctly return the intrinsic value of the option at maturity, rather
than not executing. Recall that any formula of
Black-Scholes type will have one or more divisions by the square root of the annualised
term left to expiry in their execution.
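A minimal sketch of that failsafe, for the plain (non-dividend) Black-Scholes call and assuming the Cumnorm function above; the names and the zero-term convention here are illustrative only:

Public Function BlackScholesCall(ByVal S As Double, ByVal X As Double, ByVal r As Double, _
                                 ByVal sigma As Double, ByVal tau As Double) As Double
    ' Assumes sigma > 0.
    Dim d1 As Double, d2 As Double
    If tau <= 0 Then
        ' at (or past) expiry the formula would divide by Sqr(tau) = 0,
        ' so return the intrinsic value instead of failing to execute
        If S > X Then BlackScholesCall = S - X Else BlackScholesCall = 0
    Else
        d1 = (Log(S / X) + (r + sigma * sigma / 2) * tau) / (sigma * Sqr(tau))
        d2 = d1 - sigma * Sqr(tau)
        BlackScholesCall = S * Cumnorm(d1) - X * Exp(-r * tau) * Cumnorm(d2)
    End If
End Function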
This problem occurs in the DW2 algorithm, but in quite a subtle way. Examining Figure 6, we see that a quantity h7 is defined to be exp(−h3/2). Later on we have a loop which involves evaluation of the quantity exp(−h3/(1 + r2))/r2/h7. If h3 is large, then h7 might evaluate as 0 to the precision of the language we are employing, that is, it underflows (see footnote 3). In this case, the above fraction is of the form 0/0, and will not evaluate. However, this form suggests l'Hôpital's rule, and sure enough, the necessary calculations can be made, showing that this quantity is equal to 0, in the sense of a limit.
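The Figure 7 code is the author's resolution; as an indication of the idea, the sketch below (in the notation of Figure 6, taking h3 and r2 as inputs) combines the two exponentials analytically, so the troublesome term is computed as exp(−h3/(1 + r2) + h3/2)/r2 and is simply taken as 0 when the combined exponent is so negative that the exponential would underflow anyway:

Function UnderflowSafeTerm(ByVal h3 As Double, ByVal r2 As Double) As Double
    ' Equivalent to Exp(-h3 / (1 + r2)) / r2 / Exp(-h3 / 2), but without the 0/0 form.
    Dim expnt As Double
    expnt = -h3 / (1 + r2) + h3 / 2          ' = -h3 * (1 - r2) / (2 * (1 + r2))
    If expnt < -700 Then                     ' Exp would underflow to 0 in double precision
        UnderflowSafeTerm = 0
    Else
        UnderflowSafeTerm = Exp(expnt) / r2
    End If
End Function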
The closed form American option pricing model of (Bjerksund & Stensland 2002)
uses the bivariate normal, and use of the
DW2 algorithm without the above modification will fail if the dividend yield is nonzero but very small. For example, with a
strike of 100, term of half a year, risk free
rate of 10%, dividend yield of 0.10%, and
volatility 20% the option price will fail for
calls. At one point the bivariate function is
called with a set of parameters that is reminiscent of those that we saw in §4.1, and
this time the underflow explained above
occurs. Note that if the dividend yield were
zero, then it is known that it is sub-optimal
to exercise calls early, and the model of
(Bjerksund & Stensland 2002) correctly
diverts to the Black-Scholes formula.
So, this modified DW2 algorithm might
be the algorithm of choice. It is not as accurate as the Genz algorithm, but does not
have any material inaccuracies, and is certainly a lot more compact. The modified
algorithm is in Figure 7.
Figure 7: A visual basic version of the DW2 function, modified to return the usual rather than complementary probabilities, with the underflow problem resolved.
5. The Trivariate Cumulative Normal Function
The cumulative trivariate normal distribution is the function

N_3(x_1, x_2, x_3, \Sigma) = \frac{1}{(2\pi)^{3/2} \sqrt{|\Sigma|}} \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \int_{-\infty}^{x_3} \exp\left( -\tfrac{1}{2}\, X' \Sigma^{-1} X \right) dX_3 \, dX_2 \, dX_1 \qquad (5)

where Σ is the correlation matrix between the standardised (scaled) variables X1, X2, X3, and | · | denotes determinant. Denote by N3(x1, x2, x3, ρ21, ρ31, ρ32) the function N3(x1, x2, x3, Σ) where

\Sigma = \begin{pmatrix} 1 & \rho_{21} & \rho_{31} \\ \rho_{21} & 1 & \rho_{32} \\ \rho_{31} & \rho_{32} & 1 \end{pmatrix}.
Again, approximations are required. Code for the trivariate cumulative normal is not generally available. There are a few highly non-transparent publications, for example (Schervish 1984), but this code is
known to be faulty. We have used the algorithm in (Genz 2004). This has
required extensive modifications because the algorithms are implemented in Fortran, using language properties which are not readily translated. The function in (Genz 2004) returns the complementary probability; again, we have modified it to return the usual probability that Xi ≤ xi (i = 1, 2, 3) given a correlation matrix. Again, it is claimed that this algorithm is double precision; high accuracy (of our vb translation) has been
verified by testing against Niederreiter quasi-Monte Carlo integration
(using the Matlab algorithm qsimvn.m, also at the website of Genz).
In a forthcoming paper, we will look at applying this function to the
pricing of rainbow options on 3 assets, as in the work of (Johnson 1987).
FOOTNOTES & REFERENCES
Thanks to Axel Vogt for testing the accuracy of some of the function implementations, and to Espen Haug for suggesting this topic.
1. This and all of the other vba code mentioned here is available from the author's web page (West 2004).
2. In contrast to before, the reader is now warned against the use of the built-in Excel function NORMSINV, which is patently absurd. Of course, this inverse function should take arguments in the interval (0, 1) and map them to the real line. In fact, NORMSINV returns the value ±50000 for input values within 0.0000003 of 1 or 0 respectively. Given that values so close to 0 or 1 are on occasion provided by uniform random number generators, this approach is to be avoided.
3. This is dependent on the language used. But this example fails in C++, for example.
■ Abramowitz, M. & Stegun, I. A. (1974), Handbook of Mathematical Functions, With
Formulas, Graphs, and Mathematical Tables, Dover.
■ Acklam, P. J. (2002), ‘A small paper on Halley’s method’.
http://home.online.no/~pjacklam
■ Acklam, P. J. (2004), ‘An algorithm for computing the inverse normal cumulative distribution function’. http://home.online.no/~pjacklam
■ Ağca, Ş. & Chance, D. M. (2003), ‘Speed and accuracy comparison of bivariate normal
distribution approximations for option pricing’, Journal of Computational Finance 6(4),
61–96.
■ Bjerksund, P. & Stensland, G. (2002), ‘Closed form valuation of American options’.
http://www.nhh.no
■ Drezner, Z. (1978), ‘Computation of the bivariate normal integral’, Mathematics of
Computation 32, 277–279.
■ Drezner, Z. & Wesolowsky, G. (1989), ‘On the computation of the bivariate normal
integral’, Journal of Statist. Comput. Simul. 35, 101–107.
■ Genz, A. (2004), ‘Numerical computation of rectangular bivariate and trivariate normal
and t probabilities’, Statistics and Computing 14, 151–160.
http://www.sci.wsu.edu/math/faculty/genz/homepage
■ Hart, J. (1968), Computer Approximations, Wiley. Algorithm 5666 for the error function.
■ Haug, E. G. (1998), The complete guide to option pricing formulas, McGraw-Hill.
■ Hull, J. (2002), Options, Futures, and Other Derivatives, fifth edn, Prentice Hall.
■ Johnson, H. (1987), ‘Options on the maximum or the minimum of several assets’,
Journal of Financial and Quantitative Analysis 22, 277–283.
■ Schervish, M. (1984), ‘Multivariate normal probabilities with error bound’, Applied
Statistics 33, 81–87.
■ Stulz, R. M. (1982), ‘Options on the minimum or the maximum of two risky assets’,
Journal of Financial Economics XXXIII, No. 1, 161–185.
■ West, G. (2004). http://www.cam.wits.ac.za/mfinance/graeme