Topics in Computational Finance: The Barndorff-Nielsen & Shephard Stochastic Volatility Model

Martin Groth

Dissertation presented for the Degree of Philosophiæ Doctor
Department of Mathematics
University of Oslo
2007

ACKNOWLEDGEMENTS

This thesis marks the end of a five-year period in my life and I take the opportunity to thank those who have been there with me through the good and the bad days. I want to express my deepest gratitude to my supervisor Prof. Fred Espen Benth for helping me out when I most needed it. His continuous support and encouragement have been invaluable, and this thesis would not have been realised without his profound understanding of mathematics, finance and supervision. Many thanks go out to my co-authors, Paul C. Kettler, Rodwell Kufakunesu, Dr. Carl Lindberg and Olli Wallin, who have shared their time and knowledge with me. Without their help I would not have accomplished this. I am obliged to Docent Roger Pettersson at Växjö University for his enthusiastic efforts in the first years. The encouragement from Docent Magnus Wiktorsson, Lund University, the opponent at my Licentiate defense, was much appreciated. The friendly and inspiring group of colleagues at the Centre of Mathematics for Applications is responsible for making work a pleasure, all facilitated by the excellent administrative director Helge Galdal. I am grateful to all the participants at the Fourth Scandinavian Ph.D. workshop in Mathematical Finance for accepting my invitation. I also want to thank the Ph.D. committee at the Department of Mathematics for their unconditional support.

On a personal level I want to thank all my friends for their unceasing efforts to make my days include more than research. Proofreading is an unavoidable but unrewarding task and I am indebted to Camilla Malm for her kindness in carefully and courteously correcting my mistakes. I owe everything to my parents for their endless devotion, and to my siblings and their families for always caring for me.
Finally, I thank Christina for all her support and love during the final year.

Oslo, March 2007
Martin Groth

CONTENTS

• An introductory note
• Paper I: A quasi-Monte Carlo algorithm for the normal inverse Gaussian distribution and valuation of financial derivatives, by Fred Espen Benth, Martin Groth and Paul C. Kettler. Published in The International Journal of Theoretical and Applied Finance, Vol. 9, No. 6 (2006), pages 843-867.
• Paper II: The minimal entropy martingale measure and numerical option pricing for the Barndorff-Nielsen-Shephard stochastic volatility model, by Fred Espen Benth and Martin Groth. Submitted for publication.
• Paper III: Valuing volatility and variance swaps for a non-Gaussian Ornstein-Uhlenbeck stochastic volatility model, by Fred Espen Benth, Martin Groth and Rodwell Kufakunesu. Forthcoming in Applied Mathematical Finance.
• Paper IV: The implied risk aversion from utility indifference option pricing in a stochastic volatility model, by Fred Espen Benth, Martin Groth and Carl Lindberg. Submitted for publication.
• Paper V: Derivation-free Greeks for the Barndorff-Nielsen and Shephard stochastic volatility model, by Fred Espen Benth, Martin Groth and Olli Wallin. Submitted for publication.

AN INTRODUCTORY NOTE

In an expanding financial world it is necessary to analyse and understand the methods used and the models they rely on. For an investor to stay competitive and safeguard against failure, thorough and careful examination from a mathematical perspective is essential. A pure mathematical dissection is of considerable value, but with more complicated models, which are increasingly involved and technically demanding, the search for an analytical answer to pricing and hedging problems could be futile and the only possibility is to resort to numerics. This thesis is centered around numerical methods applied to problems in mathematical finance. While being in the same field, the problems differ substantially from each other.
The articles cover many of the big questions in finance: option pricing, hedging, price sensitivities, Value-at-Risk, implied volatility and risk aversion. The numerical methods vary: finite difference methods for partial differential equations, Monte Carlo and quasi-Monte Carlo, the fast Fourier transform and numerical search methods are all used where applicable. This is not a thesis where new theory is developed in numerical mathematics, nor in finance, but rather in the borderland in between, in applied mathematical finance. It adds to the understanding of stock price models with jump processes, in particular the Barndorff-Nielsen and Shephard stochastic volatility model.

The purpose of the introductory chapter is to give a brief presentation of the theory behind the material presented in the articles. Even though the aim was to make it self-contained, it requires basic knowledge of finance theory and stochastic analysis, and also some background in mathematical analysis. Numerous references are given for those interested in the original research in mathematical finance. Interested readers seeking a way into the subject should consider the following books: Björk [21] Arbitrage theory in continuous time, Cont and Tankov [35] Financial modelling with jump processes, Glasserman [64] Monte Carlo methods in financial engineering, Hull [78] Options, futures and other derivatives, and Wilmott, Dewynne and Howison [116] Option pricing, which together cover the material needed to engage with this thesis.

1. Lévy processes

Lévy processes have a central role in this thesis, although the focus is not on the processes themselves, but on their use as building blocks. The financial models studied are driven by Lévy processes and to understand how they are used some background material is needed. That said, the Lévy processes are not studied from a theoretical point of view; no new properties are derived, nor are any new insights about Lévy processes brought to the table.
In a sense this thesis is about the use of Lévy processes in mathematical finance, from a computational and applied view. For the coherence of the introduction, a brief summary of the theory needed to understand Lévy processes and how they are treated in the sequel is provided here.

A Lévy process is a stochastic process with stationary independent increments. That is, pick a series of times with a fixed time step, measure the process at those times and calculate the change between consecutive times; these increments will then have the same distribution and be independent of each other. To be formal, a Lévy process $\{X_t, t \ge 0\}$ is a càdlàg process (i.e. right continuous with left limits) with $X_0 = 0$ a.s., having the properties

• For any choice of $n \ge 1$ and $0 \le t_0 < t_1 < \cdots < t_n$, the random variables $X_{t_0}, X_{t_1} - X_{t_0}, X_{t_2} - X_{t_1}, \ldots, X_{t_n} - X_{t_{n-1}}$ are independent.
• The distribution of $X_{s+t} - X_s$ does not depend on $s$.
• $X_t$ is stochastically continuous, i.e. $\forall \varepsilon > 0$, $t \ge 0$, $\lim_{s \to t} P(|X_s - X_t| > \varepsilon) = 0$.

Stochastic continuity is not the same as the sample paths being continuous. A Lévy process may have jumps in the path but the probability that the process exhibits a jump at any given time is zero.

Let $\mu$ be a probability measure on $\mathbb{R}^d$ and let $\mu^n$ be the $n$-fold convolution with itself, $\mu^n = \mu * \cdots * \mu$. The probability measure $\mu$ is said to be infinitely divisible if for any positive integer $n$ there is a probability measure $\mu_n$ on $\mathbb{R}^d$ such that $\mu = \mu_n^n$. This implies that for any infinitely divisible distribution $\mu$ and any positive integer $n$ there exist $n$ independent, identically distributed random variables whose sum has distribution $\mu$. This closely resembles the first point in the definition of a Lévy process and indeed, denoting the distribution of $X$ by $P_X$, the following result holds true.

Theorem 1.1 (Theorem 7.10 Sato [110]).
• If $\{X_t, t \ge 0\}$ is a Lévy process in law on $\mathbb{R}^d$ then, for any $t \ge 0$, $P_{X_t}$ is infinitely divisible and, letting $P_{X_1} = \mu$, we have $P_{X_t} = \mu^t$.
• Conversely, if $\mu$ is an infinitely divisible distribution on $\mathbb{R}^d$, then there is a Lévy process in law $\{X_t, t \ge 0\}$ such that $P_{X_1} = \mu$.
• If $\{X_t\}$ and $\{X'_t\}$ are Lévy processes in law on $\mathbb{R}^d$ such that $P_{X_t} = P_{X'_t}$ then $\{X_t\}$ and $\{X'_t\}$ are identical in law.

Here a Lévy process in law is defined similarly to a Lévy process but without the càdlàg property. Examples of infinitely divisible distributions include the Gaussian, Cauchy, Poisson, compound Poisson, exponential, inverse Gaussian, normal inverse Gaussian and the generalised versions of the last two.

Letting $\langle x, y \rangle$ denote the inner product on $\mathbb{R}^d$, the characteristic function of a Lévy process can be written as
\[ E\left[e^{i\langle z, X_t \rangle}\right] = e^{t\phi(z)}, \quad z \in \mathbb{R}^d. \]
The continuous function $\phi$, called the characteristic exponent, is the cumulant generating function of $X_1$. The dependence on $t$ is linear, so the law of $X_t$ is determined by the law of $X_1$. The form of the characteristic exponent for all infinitely divisible distributions is given by the Lévy-Khintchine representation, an important result about Lévy processes. Given a Lévy process $X_t$ on $\mathbb{R}^d$, $\phi$ has the representation
\[ \phi(z) = -\frac{1}{2}\langle z, Az \rangle + i\langle \gamma, z \rangle + \int_{\mathbb{R}^d} \left( e^{i\langle z, x \rangle} - 1 - i\langle z, x \rangle 1_{|x| \le 1}(x) \right) \nu(dx), \quad z \in \mathbb{R}^d, \tag{1.1} \]
where $A$ is a symmetric nonnegative-definite $d \times d$ matrix, $\gamma \in \mathbb{R}^d$ a vector and $\nu$ is a measure on $\mathbb{R}^d$ satisfying
\[ \nu(\{0\}) = 0 \quad \text{and} \quad \int_{\mathbb{R}^d} (|x|^2 \wedge 1)\, \nu(dx) < \infty. \]
The three parts $(A, \nu, \gamma)$ are called the generating triplet for $X_t$ and are uniquely determined by the distribution of $X_1$. $A$ is called the Gaussian covariance matrix and $\nu$ the Lévy measure. For a set $B \in \mathcal{B}(\mathbb{R}^d)$, $\nu(B)$ can be interpreted as the expected number of jumps with jump size in $B$ per unit time. The triplet is unique; however, the representation (1.1) is not. Functions other than $1_{|x| \le 1}$ can be used to truncate the larger jumps in the integrand.
This affects $\gamma$, so one should clearly state the truncation function considered if it differs from the one in (1.1).

The second important result is the Lévy-Itô decomposition, which says that a Lévy process can be expressed as the sum of two independent parts, a continuous part and a part expressible as a compensated sum of independent jumps. Here the version from Cont and Tankov [35] is given, which is slightly more accessible than Sato's version. To begin with, observe that it is possible to define a measure on $[0, \infty) \times \mathbb{R}^d$ counting the jumps of $X_t$ in $[t_1, t_2]$ with jump size in $B$,
\[ J_X([t_1, t_2] \times B) = \#\{ t \in [t_1, t_2] : X_t - X_{t-} \in B \}, \]
for any measurable set $[t_1, t_2] \times B \subset [0, \infty) \times \mathbb{R}^d$. It will be required that the jump measure $J_X$ of a Lévy process $X$ is a Poisson random measure; see Cont and Tankov for the definition. The Lévy-Itô decomposition then states

Theorem 1.2 (Prop. 3.7 Cont and Tankov [35]). For a Lévy process $\{X_t, t \ge 0\}$ on $\mathbb{R}^d$, where $X_1$ has the generating triplet $(A, \nu, \gamma)$, the following holds
• $\nu$ is a Radon measure on $\mathbb{R}^d \setminus \{0\}$ and verifies
\[ \int_{|x| \le 1} |x|^2\, \nu(dx) < \infty, \qquad \int_{|x| \ge 1} \nu(dx) < \infty. \]
• The jump measure of $X_t$, denoted by $J_X$, is a Poisson random measure on $[0, \infty) \times \mathbb{R}^d$ with intensity measure $\nu(dx)\,dt$.
• There exists a $d$-dimensional Brownian motion $\{B_t, t \ge 0\}$ with covariance matrix $A$ such that
\[ X_t = \gamma t + B_t + X_t^l + \lim_{\varepsilon \downarrow 0} \tilde{X}_t^{\varepsilon}, \]
where
\[ X_t^l = \int_{|x| \ge 1,\, s \in [0,t]} x\, J_X(ds \times dx) \quad \text{and} \quad \tilde{X}_t^{\varepsilon} = \int_{\varepsilon \le |x| < 1,\, s \in [0,t]} x \{ J_X(ds \times dx) - \nu(dx)\,ds \}. \]
All parts of the decomposition are independent and the convergence is almost sure and uniform in $t$ on any bounded interval.

The first two terms of the decomposition together form a Gaussian Lévy process, which is the continuous part. The last two terms form the discontinuous jump part. The condition that the Lévy measure has finite mass for $|x| \ge 1$ makes $X_t^l$ into a compound Poisson process with an almost surely finite number of jumps.
The last term is a compensated jump integral for the small jumps, enabling processes with infinite jump activity, i.e. processes with infinitely many small jumps. Note that, without passing to the limit, the last term also forms a (compensated) compound Poisson process. An arbitrary Lévy process can therefore be approximated by a jump-diffusion, the sum of a Brownian motion with drift and a compound Poisson process.

The last concept to be defined is a subordinator, a Lévy process with almost surely nondecreasing sample paths. Hence a subordinator $\{X_t, t \ge 0\}$ is increasing, such that $X_t \ge 0$ a.s. for every $t > 0$. For a Lévy process on $\mathbb{R}$ to be increasing the characteristic triplet needs to satisfy
\[ A = 0, \qquad \int_{(-\infty, 0)} \nu(dx) = 0, \qquad \int_{(0,1]} x\, \nu(dx) < \infty \]
and
\[ \gamma_0 := \gamma - \int_{|x| \le 1} x\, \nu(dx) \ge 0. \]
The variable $\gamma_0$ is called the drift and the integral in the definition of $\gamma_0$ is finite, since otherwise the small jumps, all of which are positive, would not be summable. Hence a subordinator always has finite variation (no Brownian component and summable small jumps), although it may still have infinitely many small jumps.

2. Arbitrage pricing and Martingale measures

In order to trade claims there has to be a way to attribute a price in a manner excluding possibilities to make money out of nothing. To make a profit without risking any loss is called arbitrage, and in a working theory for financial derivatives it is necessary that there are no arbitrage opportunities. The idea of arbitrage is fundamental in finance and the quest is to find conditions such that the market model is arbitrage-free. As will be shown later, absence of arbitrage is closely connected to the existence of equivalent martingale measures which make the (discounted) price process of a claim into a martingale, concepts which will be defined below. In the Black-Scholes framework martingale pricing comes naturally from arbitrage considerations but for more complicated models this is not the case.
The martingale approach started with Harrison and Kreps [70] and Harrison and Pliska [71]. They originally considered trading strategies which only allowed for simple predictable integrands. This constraint ruled out unfavorable trading strategies such as the "doubling strategy" but was still too restrictive. Delbaen and Schachermayer [42] replaced no arbitrage with the concept of No Free Lunch with Vanishing Risk (NFLVR). The difference between the concepts is a question of functional analysis definitions, i.e. of choosing the space to work in, and is left to the reader to find out from the references. Instead of considering only simple predictable integrands, the NFLVR concept opened up the possibility of including a larger group of strategies, restricted to be admissible.

Consider a market consisting of $n$ traded risky assets whose evolutions are strictly positive and described on a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}, P)$. A real adapted process $\{X_t, t \ge 0\}$ is a martingale if for all $t$
\[ E[|X_t|] < \infty, \qquad E[X_t \mid \mathcal{F}_s] = X_s \quad \forall\, 0 \le s \le t < \infty. \tag{2.1} \]
If there exists a nondecreasing sequence of stopping times $\{\tau_k\}$ of the filtration $\{\mathcal{F}_t\}$ such that $X_{t \wedge \tau_k}$ is a martingale for all $k$, then $X_t$ is called a local martingale.

Let $X$ denote a contingent claim with maturity $T$, referred to as a $T$-claim. Assume that the risky asset prices $S(t) = [S_0(t), \ldots, S_n(t)]$ develop according to some underlying stochastics. In the Black-Scholes market the assets follow stochastic differential equations driven by Wiener processes, but for the general martingale pricing the stochastics are allowed to be semimartingales, see Protter [105]. $S_0$ is often thought of as the risk-free asset in the market, a bank account with short rate $r$. In the general theory the only assumption is that $S_0(t) > 0$ $P$-a.s. for all $t \ge 0$.

Instead of looking at the price vector process $S(t)$, consider the normalised market with price vector process
\[ Z(t) = [Z_1(t), \ldots, Z_n(t)] = \left[ \frac{S_1(t)}{S_0(t)}, \ldots, \frac{S_n(t)}{S_0(t)} \right]. \tag{2.2} \]
Here $S_0$ is used as the numeraire and in the $Z$-economy $Z_0(t) = 1$ is a risk-free asset, a money account with zero interest rate.

Let $\theta(t) = [\theta_0(t), \ldots, \theta_n(t)]$ be a portfolio, where $\theta_i(t)$ represents the number of units held of the $i$th asset at time $t$. Since a trading strategy can only depend on information available at the current time it must be assumed that $\theta(t)$ is adapted (or even predictable). The value of the portfolio at any time $t$ is given by the value process
\[ V(t; \theta) = \sum_{i=0}^{n} \theta_i(t) S_i(t). \]
The value process can equally well be defined using the normalised market, giving the $Z$-value process
\[ V^Z(t; \theta) = \sum_{i=0}^{n} \theta_i(t) Z_i(t). \]
It is necessary to narrow down the class of strategies to avoid cases such as the doubling strategy. One common way is to require the portfolio to be admissible in the sense that it is bounded from below: An adapted process $\theta_Z = [\theta_1, \ldots, \theta_n]$ is called admissible if there exists a nonnegative real number $\alpha$ such that
\[ \int_0^t \theta_Z(u)\, dZ(u) \ge -\alpha \quad \text{for all } t \in [0, T]. \]
A process $\theta(t) = [\theta_0(t), \theta_Z(t)]$ is called an admissible portfolio process if $\theta_Z$ is admissible.

The value process should reflect the actual rise and fall of the assets, i.e. there is no flow of funds in or out of the portfolio. It should be self-financing: An admissible portfolio is said to be $Z$-self-financing if
\[ dV^Z(t; \theta) = \sum_{i=0}^{n} \theta_i(t)\, dZ_i(t). \]
The choice of numeraire is not crucial for the concept of self-financing portfolios as it can be proved that a portfolio $\theta$ is $S$-self-financing if and only if it is $Z$-self-financing. Adding to this, a contingent claim $X$ is said to be reachable if there exists a portfolio $\theta$ such that $V(T; \theta) = X$. This extends straightforwardly to definitions of $S$-reachable and $Z$-reachable claims.

Arbitrage is the possibility to make a positive amount of money while starting with nothing.
Such a possibility cannot exist over time in a sound market as it will be exploited by investors making a fortune without taking any risk. A mathematical definition of arbitrage can be given using the value process: A self-financing trading strategy $\theta(t)$ is called an arbitrage if either
\[ V(0; \theta) < 0, \qquad P(V(T; \theta) \ge 0) = 1, \]
or
\[ V(0; \theta) = 0, \qquad P(V(T; \theta) \ge 0) = 1, \qquad P(V(T; \theta) > 0) > 0. \]
The concept of arbitrage-free markets is closely related to the existence of probability measures under which asset dynamics of the normalised market are martingales. Two probability measures $P$ and $Q$ on a measurable space $(\Omega, \mathcal{F})$ are said to be equivalent ($\sim$) if they define the same set of events as impossible, i.e.
\[ P \sim Q: \quad \forall A \in \mathcal{F}, \quad Q(A) = 0 \iff P(A) = 0. \]
This is important since it will be shown that pricing takes place under measures equivalent to the historical measure. If this were not the case, events which are impossible under the pricing measure could have positive probability under the historical measure, which could lead to arbitrage.

A probability measure $Q$ on $\mathcal{F}_T$ is called an equivalent martingale measure for the market model given by $Z(t)$, the numeraire $S_0$ and the time interval $[0, T]$ if it has the following properties:
• $Q \sim P$ on $\mathcal{F}_T$.
• All price processes $Z_0, Z_1, \ldots, Z_n$ are martingales under $Q$ on the time interval $[0, T]$.
If $Z_0, Z_1, \ldots, Z_n$ are local martingales under $Q$ it is called a local martingale measure.

Theorem 2.1 (First fundamental theorem of asset pricing). Consider the market model $S_0, S_1, \ldots, S_n$ where it is assumed that $S_0(t) > 0$ $P$-a.s. for all $t \ge 0$. Assume furthermore that $S_0, S_1, \ldots, S_n$ are locally bounded. Then the following conditions are equivalent:
• The model satisfies NFLVR.
• There exists a measure $Q \sim P$ such that the processes $Z_0, Z_1, \ldots, Z_n$ defined in (2.2) are local martingales under $Q$.

See Delbaen and Schachermayer [42] for a proof in the case of bounded price processes.
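As a minimal illustration of Theorem 2.1, consider a one-period market with two states. This toy market is not one of the models treated in this thesis; it is assumed here purely for the sketch. The numeraire grows from 1 to 1 + rate, and the risky asset moves from s0 to either s0 * up or s0 * down, both with positive probability under P. The normalised price is a martingale under Q exactly when the Q-probability of the up state is q = (1 + rate - down) / (up - down), and Q is equivalent to P only if 0 < q < 1. All function and parameter names below are hypothetical.

```python
def martingale_probability(up, down, rate):
    """Q-probability of the up state in a one-period, two-state market.

    Returns None when no equivalent martingale measure exists, i.e. when
    the candidate probability falls outside the open interval (0, 1).
    """
    if not down < up:
        raise ValueError("requires down < up")
    q = (1.0 + rate - down) / (up - down)
    return q if 0.0 < q < 1.0 else None


def normalised_expectation(s0, up, down, rate, q):
    """Q-expectation of the time-1 risky asset price divided by the numeraire."""
    return (q * s0 * up + (1.0 - q) * s0 * down) / (1.0 + rate)


q = martingale_probability(up=1.2, down=0.9, rate=0.05)
# Martingale property: the normalised price today equals its Q-expectation.
print(q, normalised_expectation(100.0, 1.2, 0.9, 0.05, q))

# If even the down move beats the bank account there is an arbitrage
# (borrow and buy the asset), and no equivalent martingale measure exists.
print(martingale_probability(up=1.2, down=1.1, rate=0.05))
```

In this toy market the measure Q, when it exists, is also unique, which reflects the fact that the two-state market is complete.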
The second fundamental theorem of asset pricing states that, provided the market is free of arbitrage, the market is complete, i.e. all contingent claims are reachable, if and only if the equivalent martingale measure is unique. Few of the markets studied in this thesis will be complete, and it is questioned whether market completeness is a financially realistic property. Completeness will therefore not play a significant role in the following.

Having a $T$-claim $X$, what is a reasonable price process $\Lambda(t; X)$? It is clear from the first fundamental theorem that the price has to be consistent with the market $S(t)$ and that including the claim in the market cannot give rise to any arbitrage possibilities. For the extended market $\{\Lambda(t; X), S_0, \ldots, S_n\}$ there must then exist a local martingale measure $Q$. Using the definition of a martingale (2.1), the first fundamental theorem states that the price process divided by the numeraire is a martingale, hence
\[ \frac{\Lambda(t; X)}{S_0(t)} = E^Q\left[ \frac{\Lambda(T; X)}{S_0(T)} \,\middle|\, \mathcal{F}_t \right] = E^Q\left[ \frac{X}{S_0(T)} \,\middle|\, \mathcal{F}_t \right]. \]
This gives the result:

Theorem 2.2 (General pricing formula). The arbitrage-free price process for the $T$-claim $X$ is given by
\[ \Lambda(t; X) = S_0(t)\, E^Q\left[ \frac{X}{S_0(T)} \,\middle|\, \mathcal{F}_t \right], \]
where $Q$ is a local martingale measure for the a priori given market $S_0, S_1, \ldots, S_n$ with $S_0$ as the numeraire. Assuming that there exists a short rate $r(t)$, the price process is given by the risk neutral pricing formula
\[ \Lambda(t; X) = E^Q\left[ e^{-\int_t^T r(s)\, ds} X \,\middle|\, \mathcal{F}_t \right], \tag{2.3} \]
with the money account $S_0(t) = S_0(0)\, e^{\int_0^t r(s)\, ds}$ as the numeraire.

Left to determine are the claim $X$ and the dynamics of the underlying assets, and some way to sample paths for the assets. Different approaches proposed to model the dynamics of asset prices are discussed below: models driven by Lévy processes and stochastic volatility models.
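The risk neutral pricing formula (2.3) translates directly into a Monte Carlo estimator once a way to sample the asset under Q is fixed. The sketch below assumes, for illustration only, a constant short rate and a single asset that is a geometric Brownian motion under Q with drift equal to that rate, so that the terminal price is lognormal. The function name `mc_price` and all parameter values are hypothetical.

```python
import math
import random


def mc_price(payoff, s0, rate, sigma, maturity, n_paths=100_000, seed=0):
    """Monte Carlo estimate of exp(-r T) * E^Q[payoff(S_T)] at time 0.

    Assumption: under Q the asset is a geometric Brownian motion with
    drift equal to the constant short rate, so S_T is lognormal.
    """
    rng = random.Random(seed)
    drift = (rate - 0.5 * sigma ** 2) * maturity
    scale = sigma * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        s_T = s0 * math.exp(drift + scale * rng.gauss(0.0, 1.0))
        total += payoff(s_T)
    return math.exp(-rate * maturity) * total / n_paths


# The claim X = S_T should cost about S_0, because the discounted asset
# price is a martingale under Q.
print(mc_price(lambda s: s, s0=100.0, rate=0.03, sigma=0.2, maturity=1.0))
# A European call with strike 100.
print(mc_price(lambda s: max(s - 100.0, 0.0), s0=100.0, rate=0.03, sigma=0.2, maturity=1.0))
```

Pricing the asset itself is a useful sanity check on any sampler for the Q-dynamics: if the estimate does not return the spot price up to sampling error, the simulated process is not a martingale after discounting.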
This concise exposition of the theory of derivative pricing is by no means a full treatment of the subject; that is a task left to writers of textbooks such as Benth [9], Björk [21], Duffie [46] or Musiela and Rutkowski [94]. Those interested in reading some of the original work in the field of arbitrage pricing or seeking proofs of the theory should look up the following articles: Black and Scholes [22], Delbaen and Schachermayer [42, 43], Harrison and Kreps [70], Harrison and Pliska [71] and Merton [92].

2.1. Equivalent martingale measures. The first fundamental theorem of asset pricing states that the existence of an equivalent martingale measure corresponds to the absence of arbitrage. If the market is complete, like the Black-Scholes market, the martingale measure is unique. In incomplete markets this is not true; instead there exists a range of different martingale measures which are all equivalent to the historical measure. Pricing a contingent claim involves choosing under which of these martingale measures to work.

Market incompleteness arises for several reasons: adding transaction costs, jumps in the asset dynamics or stochastic volatility all make a market incomplete. If the market model contains a Lévy process with jumps, the class of equivalent martingale measures is surprisingly large; the precise formulation of equivalence of measures for Lévy processes is found in Sato [110]. It turns out that there is considerable freedom to change the Lévy measure, but unless there is a diffusion component present the drift cannot be changed. In general one also has more freedom to change the distribution of the large jumps than the small ones.

Presuming the market is incomplete, one must decide which equivalent martingale measure to use; for Lévy processes several approaches exist. Raible [106] considers exponential Lévy models and suggests using the Esscher transform.
This is an analogue of the drift change for geometric Brownian motion. If $X$ is a Lévy process then, under suitable regularity conditions, the Esscher transform is a change of measure from the historical measure $P$ to a locally equivalent measure $Q$ with density process
\[ Z_t = \left. \frac{dQ}{dP} \right|_{\mathcal{F}_t} = \frac{e^{\theta X_t}}{E\left[e^{\theta X_t}\right]}, \]
where $\theta \in \mathbb{R}$. Let $r$ be the interest rate and assume that the Lévy process is neither almost surely decreasing nor almost surely increasing. Then there exists a real constant $\theta$ which, through the Esscher transform, ensures the existence of a locally equivalent measure $Q$ under which the discounted asset price $e^{-rt} S_t = S_0 \exp(X_t)$ is a martingale. Clearly the market will be free of arbitrage since $Q$ is an equivalent martingale measure.

Another possibility is to choose the equivalent martingale measure $Q$ that is closest to the historical measure $P$ in some sense. Examples of ways to measure the distance from $P$ are the quadratic distance
\[ E^P\left[ \left( \frac{dQ}{dP} \right)^2 \right] \]
or the relative entropy
\[ H(Q, P) = \begin{cases} E^P\left[ \dfrac{dQ}{dP} \ln \dfrac{dQ}{dP} \right], & Q \ll P, \\ +\infty, & \text{otherwise.} \end{cases} \tag{2.4} \]
The measure $Q^{ME}$ which minimises the distance in the entropy sense is called the minimal entropy martingale measure (MEMM), i.e.
\[ H(Q^{ME}, P) = \min_{Q \in \mathcal{M}} H(Q, P), \]
where $\mathcal{M}$ is the set of equivalent martingale measures. Cont and Tankov [35] claim this can be interpreted in an information theoretic setting: minimising relative entropy corresponds to choosing a martingale measure by adding the least amount of information to the prior model. Frittelli [62] studies the minimal entropy martingale measure in a general context of incomplete markets and proves that if there exists an equivalent martingale measure $Q$ with $H(Q, P) < \infty$, then $Q^{ME}$ exists, is unique and is equivalent to $P$. A similar result is proved in Grandits and Rheinländer [67], using the same assumption as Frittelli: If there exists a measure $Q \in \mathcal{M}$ such that
$H(Q, P) < \infty$, the density of $Q^{ME}$ can be written as
\[ \frac{dQ^{ME}}{dP} = c \exp\left( \int_0^T \eta_t\, dS_t \right), \tag{2.5} \]
where $c$ is a constant and $\eta$ is a predictable process such that the integral is a $Q^{ME}$-martingale, i.e.
\[ E^{Q^{ME}}\left[ \int_0^T \eta_t\, dS_t \right] = 0. \]
The measure with the representation (2.5) is not unique, so the converse need not be true; a measure with this representation is not necessarily $Q^{ME}$. To verify that a measure with this form is indeed the minimal entropy martingale measure, an additional verification result discussed in Rheinländer [107] is needed.

Two different methods to find $Q^{ME}$ in a stochastic volatility model are presented by Benth and Karlsen [15] and Rheinländer [107], the first via a solution of a semi-linear partial differential equation and the second by a duality method. The latter is stated in a general semimartingale setting with examples using the Stein-Stein model. The specific form of the MEMM in the Barndorff-Nielsen and Shephard model is discussed in connection with the introduction of the model in Section 4.3. The minimal entropy martingale measure is also studied in Fujiwara and Miyahara [63] for exponential Lévy processes, and in Benth and Meyer-Brandis [17] and Hobson [75] for stochastic volatility models. The minimal entropy measure is closely related to utility indifference pricing in the risk aversion limit, see Section 3.

3. Utility indifference pricing

There is something strikingly intuitive about the concept of arbitrage pricing in the Black-Scholes market. Taking positions in the option and the underlying asset, forming a locally riskless portfolio, determines the price if no arbitrage exists in the market. A short, non-technical argument gives the main idea in a few lines. It is just as easy to understand why the concept fails. The possibility to make a perfect replication of the option by trading in the underlying is of fundamental importance in arbitrage pricing.
In the Black-Scholes market there are several conditions to ensure this is possible, all of which are simplifications of the real world. The theory assumes that there are no transaction costs, that continuous trading is possible and that any fraction of a stock can be bought. Without these assumptions a perfect hedge is not achievable, and arbitrage pricing fails. It is somewhat paradoxical that only the contracts that can be replicated perfectly can be priced, something which makes them redundant in a sense. Market completeness implies that all options are replicable, and hence redundant. It is argued that the mere fact that options are traded implies that market completeness is not a financially justified property.

In an incomplete market there is no longer a single arbitrage-free price, nor a unique perfect hedge of an option, and therefore there is an unavoidable risk associated with trading. Instead of trying to find the one arbitrage-free price one tries to measure the risk in order to hedge and price the claim. Other strategies are needed in incomplete markets, such as superhedging [54], quadratic hedging, both mean-variance [23] and (local) risk-minimisation [58], and utility indifference pricing [76]. Superhedging is a conservative approach that tries to eliminate all risk associated with the option; quadratic hedging is a strategy minimising some quadratic function of the hedging error; while utility indifference pricing, which is discussed below, builds on the old idea of expected utility maximisation.

Hodges and Neuberger [76] study a Black-Scholes market with transaction costs. By removing the assumption that the market is friction-free it is made incomplete, so instead of arbitrage pricing they suggest an approach based on utility indifference. Let the market consist of a risky asset $S_t$ and a bond $R_t$ and let the investor have the possibility to issue an option on the risky asset.
Hodges and Neuberger's main idea is that the utility indifference price of a claim is the price at which the investor is indifferent between entering the market directly and first issuing a claim and then entering the market with the incremented wealth. Let the investor have an initial wealth $x$ at time $t$ and a utility function $u(x)$, a concave increasing function with $u(0) = 0$ that depends on a risk aversion parameter $\gamma$. Assuming that $\mathcal{A}$ is the set of admissible trading strategies, then $\pi_t \in \mathcal{A}$ is the fraction of the wealth invested in the risky asset at time $t$. The value function when no claim is issued can be defined as
\[ V^0(t, x, S) = \sup_{\pi_t \in \mathcal{A}} E\left[ u(X_T^{\pi}) \right], \]
where $X_T^{\pi}$ is the wealth at time $T$ given $\pi$. The form of the wealth dynamics depends on the specific model chosen. Assuming that the investor issues a claim with payoff $f(S_T)$, the value function will instead be
\[ V^c(t, x, S) = \sup_{\pi_t \in \mathcal{A}} E\left[ u(X_T^{\pi} - f(S_T)) \right]. \]
The utility indifference price defined by Hodges and Neuberger for a given risk aversion $\gamma$ is the price $\Lambda^{(\gamma)}$ such that
\[ V^0(t, x, S) = V^c(t, x + \Lambda^{(\gamma)}, S). \]
Then $\Lambda^{(\gamma)}$ is the price which provides the same utility in both cases: the investor is indifferent whether to issue a claim or not.

For most choices of the utility function, the utility indifference price depends on the initial wealth. Two investors with the same utility function but different amounts to invest could therefore disagree on the price of an option. The important exception is the exponential utility function, $u(x) = 1 - \exp(-\gamma x)$, leading to a price independent of the initial wealth. The exponential utility has been extensively studied because of the connection between utility indifference pricing and certain hedging and pricing strategies. Using exponential utility and letting $\gamma \to \infty$, the utility indifference price tends to the superhedging price, which in general is considered to be too high. More interesting is letting $\gamma \to 0$.
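To make the two limits concrete, consider the degenerate case, assumed here only for illustration, where the claim cannot be hedged at all: the investor holds cash at zero interest and the payoff $f$ takes finitely many values under the historical measure. With exponential utility the indifference condition then reduces to $1 - e^{-\gamma x} = 1 - e^{-\gamma (x + \Lambda^{(\gamma)})} E[e^{\gamma f}]$, so that $\Lambda^{(\gamma)} = \frac{1}{\gamma} \ln E[e^{\gamma f}]$, which does not depend on $x$. The sketch below evaluates this expression for a two-point payoff; the function name and the numbers are hypothetical.

```python
import math


def indifference_price(payoffs, probs, gamma):
    """Exponential-utility indifference price (1/gamma) * ln E[exp(gamma * f)].

    Assumes no hedging instrument and zero interest; payoffs and probs
    describe a discrete claim under the historical measure P.
    """
    top = max(payoffs)
    # Factor out exp(gamma * top) to avoid overflow for large gamma.
    tail = sum(p * math.exp(gamma * (f - top)) for f, p in zip(payoffs, probs))
    return top + math.log(tail) / gamma


payoffs, probs = [0.0, 10.0], [0.5, 0.5]
for gamma in (1e-4, 0.1, 1.0, 100.0):
    print(gamma, indifference_price(payoffs, probs, gamma))
```

In this toy setting the price increases with the risk aversion, from roughly the expected payoff for small $\gamma$ towards the largest possible payoff for large $\gamma$, which is what it costs to cover the claim in every state when no hedge is available.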
Several authors [6, 41, 55, 113] have noticed that there is a duality between the utility indifference price in the risk aversion limit and the price under the minimal entropy martingale measure. Assume the price process $S_t$ is a semimartingale and $X_t^{\pi}$ the wealth process with self-financing strategy $\pi$ and initial wealth $x$. For a contingent claim with payoff $f(S_T)$ one tries to maximise the utility over all $\pi$ in a suitable class $\Theta$,
\[ \max_{\pi \in \Theta} E^P\left[ 1 - \exp\left( -\gamma (X_T^{\pi} - f(S_T)) \right) \right]. \]
In a general semimartingale framework Delbaen et al. [41] give different choices of $\Theta$ and show that there is a dual problem where the relative entropy minus a correction is minimised,
\[ \min_{Q \in \mathcal{M}} \left\{ 1 - \exp\left( -H(Q, P) - \gamma x + \gamma E^Q[f(S_T)] \right) \right\}, \]
over a suitable class $\mathcal{M}$ of local martingale measures $Q$ for $S_t$. Hence
\[ \sup_{\pi \in \Theta} E^P\left[ 1 - \exp\left( -\gamma (X_T^{\pi} - f(S_T)) \right) \right] = 1 - \exp\left( -\inf_{Q \in \mathcal{M}} \left\{ H(Q, P) + \gamma x - \gamma E^Q[f(S_T)] \right\} \right) \]
for $\gamma > 0$. Becherer [6] shows that when taking the risk aversion limit $\gamma \to 0$, the utility optimisation problem coincides with pricing under the minimal entropy martingale measure. That is,
\[ \Lambda^{(\gamma)} = \sup_{Q \in \mathcal{M}} \left\{ E^Q[f(S_T)] - \frac{1}{\gamma} \left( H(Q, P) - H(Q^{ME}, P) \right) \right\}, \]
and taking the limit it holds that
\[ \lim_{\gamma \downarrow 0} \Lambda^{(\gamma)} = E^{Q^{ME}}[f(S_T)]. \]
The measure $Q^{ME}$ for a general continuous semimartingale is derived through duality in the method developed by Rheinländer [107], as discussed in Section 2.1. For the stochastic volatility market proposed by Barndorff-Nielsen and Shephard, see Section 4.3, the connection between $Q^{ME}$ and the risk-aversion limit of the utility indifference price under exponential utility appears in papers by Benth and Meyer-Brandis [17] and Rheinländer and Steiger [108]. In the first paper a representation of the minimal entropy martingale measure is developed for the Barndorff-Nielsen and Shephard model without leverage, which is generalised in the second paper. For this model the
For this model the representation of the utility indifference price as the solution of a semi-linear partial differential equation is also discussed in Section 4.3.

4. Exponential Lévy and Stochastic volatility models

Even before the Chicago Board Options Exchange opened as the first stock option exchange there was an interest in modelling the erratic behaviour of stock movements in order to price derivatives. The pioneer was Louis Bachelier with his thesis from 1900, followed by Samuelson [109], who introduced the geometric Brownian motion, and Mandelbrot [89], who preferred "L-stable" probability laws and multifractals. Not until Fisher Black and Myron Scholes [22], together with Robert C. Merton [92], developed the theory nowadays bearing the names of the first two did there exist a consistent way to handle options. Black and Scholes built on Samuelson’s work, where the stock price dynamics is a geometric Brownian motion,

$$dS_t = \mu S_t \, dt + \sigma S_t \, dB_t,$$

adding a risk-free money account with rate of return r. The Black-Scholes framework has been the industry standard, mainly because it is simple, clear and easy to use. Explicit formulas exist for the price of vanilla contracts and, because of the widespread use, the model is well understood.

However, the Black-Scholes model has some drawbacks noticed by market traders throughout the years. Apart from the simplifications made with regard to transaction costs, short selling and dividends, one major disadvantage is the inability of the Black-Scholes theory to explain the volatility smile. It was well known before the 1987 crash that the implied Black-Scholes volatilities of market prices gave rise to a smile, i.e. the volatilities implied by the Black-Scholes formula were higher for in-the-money and out-of-the-money options than for options with strikes around the spot price.
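To make the notion of an implied volatility concrete, the following minimal sketch prices a European call with the standard Black-Scholes formula and inverts the formula by bisection. The function names and the numerical inputs are illustrative assumptions, not taken from the thesis.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, sigma, tau):
    """Black-Scholes price of a European call with time to maturity tau."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)

def implied_vol(price, s, k, r, tau, lo=1e-6, hi=5.0, tol=1e-10):
    """Invert bs_call in sigma by bisection (the call price increases in sigma)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s, k, r, mid, tau) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Prices generated by the model itself give back the same volatility at every strike.
price_atm = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
vols = [implied_vol(bs_call(100.0, k, 0.05, 0.2, 1.0), 100.0, k, 0.05, 1.0)
        for k in (80.0, 100.0, 120.0)]
```

Applied to prices produced by the Black-Scholes model the implied volatility is flat across strikes; a smile appears only when the same inversion is applied to observed market prices.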
Empirical work clearly shows that the implied volatilities of market prices are not constant but vary with strike price and time to maturity. After the 1987 crash a more frequent appearance of skewness was noticed in the implied volatilities, resulting in more of a smirk or sneer than a smile, see Dumas et al. [47]. The common explanation is that investors became more aware of the risk of large downward movements in the market. Neither the smile nor the smirk can be explained within the Black-Scholes framework, as both indicate that the market emphasises the risk associated with large stock price movements more than the theory does. Empirical work also clearly indicates that stock price log-returns on a short time horizon exhibit a distribution with heavier tails than expected from the Black-Scholes model, and that the paths contain jumps.

A stream of new models has since been proposed to replace the Black-Scholes model, all of them intended to model the market prices, and hence the implied volatilities, in a better way. Depending on the focus of the research, different aspects have been considered important to capture in the modelling: the heavy tails of the returns, the jumps in the paths of asset prices, volatility clustering and/or dependence structures. Shortly after Black and Scholes proposed their model, Merton [93] suggested adding a jump term to the stock price dynamics to incorporate jumps with unpriced risk:

$$S_t = S_0 \exp\left[ \mu t + \sigma B_t + \sum_{i=0}^{N_t} Y_i \right],$$

where $N_t$ is a Poisson process with intensity λ independent of the Brownian motion $B_t$, and $Y_i \sim N(\alpha, \delta^2)$ are i.i.d. random variables independent of $B_t$ and $N_t$. The pricing approach Merton devises assumes that the risk associated with the jumps can be diversified away and that hedging only takes the average effect of jumps into account. Simple as it is, the assumption that the individual jumps can be ignored because the investor diversifies leaves the position exposed to the jump risk, which in many cases is an unwanted situation.

Figure 1. Left: Stock price path from the Black-Scholes model with mean $6.4 \times 10^{-4}$ and variance $2.21 \times 10^{-4}$. The mean and variance are equal to the mean and variance of the Lévy process used in Benth, Groth and Kettler [11]. Right: The log-marginal returns from the stock price. The use of a Brownian motion gives normally distributed marginal returns.

Three decades later two large classes of models can be distinguished in the literature. The first consists of models that replace the geometric Brownian motion with some other exponential model; lately a lot of research has been done on exponential Lévy models. The second consists of stochastic volatility models, where the constant volatility is replaced by some stochastic process. A third approach exists, the local volatility models, where the volatility depends on the price and time through a deterministic function,

$$dS_t = \mu S_t \, dt + \sigma(t, S_t) S_t \, dW_t.$$

Local volatility models and fitting of the local volatility surface will not be discussed further; the interested reader finds more information in Derman and Kani [45] and Dupire [49].

4.1. Exponential Lévy models.

Adding jumps can be accomplished by replacing the Brownian motion with a Lévy process, giving the so-called exponential Lévy models

$$S_t = S_0 \exp(\mu t + L_t),$$

where $L_t$ is a Lévy process with characteristic triplet $(\sigma^2, \nu, \gamma)$. An equivalent approach is to write down the dynamics directly,

$$dS_t = \mu S_t \, dt + \sigma S_t \, dL_t.$$

Exponential Lévy models can be built with marginal log-returns in a range of different distributions, with heavier tails to better fit log-return data. This is actually what Merton did, with a jump-diffusion process as the driving noise.
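A path from Merton's jump-diffusion can be simulated directly from its increments, since over a step of length dt the number of jumps is Poisson distributed and a sum of normal jumps is again normal. The sketch below is a minimal illustration with hypothetical daily parameters; it counts the jumps as i = 1, ..., N_t, which is a common convention and an assumption here.

```python
import numpy as np

def merton_log_returns(n_steps, dt, mu, sigma, lam, alpha, delta, rng):
    """Log-return increments of Merton's jump-diffusion over steps of length dt."""
    diffusion = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
    n_jumps = rng.poisson(lam * dt, size=n_steps)
    # Given n jumps, their sum is N(n*alpha, n*delta^2); it is zero when n = 0.
    jumps = alpha * n_jumps + delta * np.sqrt(n_jumps) * rng.standard_normal(n_steps)
    return diffusion + jumps

rng = np.random.default_rng(0)
# Hypothetical daily parameters, chosen only for illustration.
returns = merton_log_returns(200_000, 1.0, 5e-4, 0.01, 0.05, -0.02, 0.05, rng)
prices = 100.0 * np.exp(np.cumsum(returns))   # price path started at 100

centred = returns - returns.mean()
excess_kurtosis = (centred ** 4).mean() / returns.var() ** 2 - 3.0
```

The sample excess kurtosis is clearly positive, in contrast to the value zero for the normally distributed returns of the Black-Scholes model.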
Models built around Lévy processes go back to Mandelbrot in the 1960s but resurged in the late 1990s.

Figure 2. The normal inverse Gaussian density with three different parameter sets, (1, 0.75, −2, 1), (1, 0, 0, 1) and (1, −0.75, 2, 1). The dashed line is the standard normal distribution density.

Madan et al. [87, 88] used the variance-gamma process, Carr et al. [27, 28] the CGMY process, a subclass of tempered stable processes, Barndorff-Nielsen [3] introduced the normal inverse Gaussian process, while the use of the hyperbolic Lévy process was proposed by Eberlein and Keller [51]. The latter two are both subclasses of the family of generalised hyperbolic Lévy processes; for more information about applications to finance see [50, 52, 53, 104, 106].

The class of hyperbolic Lévy processes, especially the normal inverse Gaussian Lévy process, requires some special attention. We begin with the inverse Gaussian process IG(δ, γ), a subordinator, having probability density

$$p_{IG}(x; \delta, \gamma) = \frac{\delta}{\sqrt{2\pi}} e^{\delta\gamma} x^{-3/2} \exp\left\{ -\frac{1}{2}\left( \frac{\delta^2}{x} + \gamma^2 x \right) \right\}, \quad x > 0.$$

One way to interpret $p_{IG}(x; \delta, \gamma)$ is as the distribution of the time it takes for a Brownian motion with drift to reach a fixed distance. The mean and variance of an IG(δ, γ)-distribution are δ/γ and δ/γ³. The distribution is interesting in itself because it is one possible choice for the stationary distribution of the volatility process in the Barndorff-Nielsen and Shephard model below. The IG-Lévy process is a subordinator, a process with nondecreasing paths. As such it can be used to stochastically time change other processes, i.e. to subordinate other processes. Consider the probability space (Ω, F, P) and a Lévy process $\{X_t, t \geq 0\}$ with cumulant generating function φ(u).
If $\{S_t, t \geq 0\}$ is a subordinator with Laplace exponent l(u) then the process $\{Y_t, t \geq 0\}$ defined by $Y(t, \omega) = X(S(t, \omega), \omega)$ for each ω ∈ Ω is a Lévy process with characteristic function

$$E\left[e^{iuY_t}\right] = e^{t\, l(\varphi(u))}.$$

The process $Y_t$ is said to be subordinate to $X_t$, and in effect $S_t$ is used to change the clock of $X_t$.

Figure 3. Left: Stock price path from an exponential Lévy model with the normal inverse Gaussian Lévy process having parameters α = 136.29, β = −15.197, δ = 0.0295, µ = 0.00395. The parameter set is used in Benth, Groth and Kettler [11]. Right: The log-marginal returns from the stock price. The use of the normal inverse Gaussian Lévy process gives marginal returns with a more peaked look than expected from the normal distribution, due to the heavier tails.

Using the inverse Gaussian subordinator to time change a Brownian motion results in the normal inverse Gaussian (NIG) process. The NIG distribution was proposed by Barndorff-Nielsen [2] in the context of wind-borne sand and is a normal variance-mean mixture distribution. If $\sigma^2 \sim IG(\delta, \gamma)$ and $\varepsilon \sim N(0, 1)$ then $x = \mu + \beta\sigma^2 + \sigma\varepsilon$ has a NIG(α, β, µ, δ) distribution with density function

$$p_{NIG}(x; \alpha, \beta, \mu, \delta) = \frac{\delta\alpha}{\pi} \exp\left( \delta\sqrt{\alpha^2 - \beta^2} + \beta(x - \mu) \right) \frac{K_1(\alpha\, q(x - \mu))}{q(x - \mu)},$$

where $q(x) = \sqrt{\delta^2 + x^2}$ and x ∈ R, µ ∈ R, δ > 0, 0 ≤ |β| ≤ α. Here $K_1$ is the modified Bessel function of the third kind with index 1 and α is given as $\alpha = \sqrt{\gamma^2 + \beta^2}$. The parameters of the distribution can be interpreted through the shape of the density: increasing α gives a steeper density, increasing |β| gives an increasingly asymmetric distribution, δ scales the distribution and µ translates it, see Figure 2. The density will be asymmetric unless β = 0.
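The mixture representation gives a direct way to draw NIG variates: sample σ² from the inverse Gaussian law and plug it into x = µ + βσ² + σε. The sketch below is a minimal illustration; it assumes that NumPy's Wald sampler, parametrised by a mean and a scale, corresponds to IG(δ, γ) through mean δ/γ and scale δ², which follows from comparing the two densities. The sample moments are compared with the known NIG mean µ + δβ/√(α² − β²) and variance δα²/(α² − β²)^{3/2}.

```python
import numpy as np

def sample_nig(alpha, beta, mu, delta, size, rng):
    """Draw NIG(alpha, beta, mu, delta) variates as a normal variance-mean mixture."""
    gamma = np.sqrt(alpha ** 2 - beta ** 2)
    # IG(delta, gamma) equals the Wald law with mean delta/gamma and scale delta^2.
    sigma2 = rng.wald(delta / gamma, delta ** 2, size=size)
    eps = rng.standard_normal(size)
    return mu + beta * sigma2 + np.sqrt(sigma2) * eps

rng = np.random.default_rng(1)
alpha, beta, mu, delta = 136.29, -15.197, 0.00395, 0.0295   # parameters of Figure 3
x = sample_nig(alpha, beta, mu, delta, 200_000, rng)

gamma = np.sqrt(alpha ** 2 - beta ** 2)
theoretical_mean = mu + delta * beta / gamma
theoretical_var = delta * alpha ** 2 / gamma ** 3
```

With the parameter set of Figure 3 the theoretical mean and variance come out at roughly 6.4 × 10⁻⁴ and 2.21 × 10⁻⁴, the values quoted for the matching Black-Scholes path in Figure 1.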
The moments $\kappa_i$ of the distribution are easily calculated from the moment generating function, with mean and variance given as

$$\kappa_1 = \mu + \frac{\delta\beta}{\sqrt{\alpha^2 - \beta^2}}, \qquad \kappa_2 = \frac{\delta\alpha^2}{\left(\sqrt{\alpha^2 - \beta^2}\right)^3}.$$

The asymptotic behaviour of the distribution is

$$g(x; \alpha, \beta, \mu, \delta) \sim c|x|^{-3/2} \exp(-\alpha|x| + \beta x) \quad \text{as } x \to \pm\infty,$$

giving the distribution semi-heavy tails. The inverse Gaussian distribution can be generalised by adding a parameter λ, resulting in the generalised inverse Gaussian (GIG). A normal mean-variance mixture with GIG gives the family of generalised hyperbolic (GH) distributions, of which NIG is a special case. GH distributions are studied by Eberlein and Keller [51] in relation to financial modelling. Figure 3 shows an example path of an exponential NIG-Lévy model and its log-marginal returns, using parameters relevant for daily observed stock prices.

Exponential Lévy models share a considerable part of the quantitative properties observed in asset returns. They make it possible to model heavy or semi-heavy (exponential) tails, the increments are independent, there are jumps in the paths, and the distributions can be made asymmetric to capture differences in the behaviour of upward and downward movements. For a model to exhibit marginal returns with these stylised facts, the distribution of the returns needs four parameters: a location parameter, a scale (volatility) parameter, a parameter describing the decay of the tails and an asymmetry parameter allowing the right and left tails to differ. The family of generalised hyperbolic distributions, including the normal inverse Gaussian distribution, fulfils this requirement as shown above. The choice of distribution becomes not a question of one fitting better than another but of which one is easiest to handle for the purpose and in the circumstances considered. Not all quantitative features of returns are possible to capture with an exponential Lévy model.
Volatility clustering and correlation in volatility are observed in the market but not exhibited by exponential Lévy models. It is possible to include these features in a stochastic volatility model as discussed below. However, the presence of heavy tails makes the realised volatility have "stochastic volatility"-like behaviour, with high variability. Nor are exponential Lévy models able to handle leverage effects, an observed correlation between negative price movements and increasing volatility.

As for the Black-Scholes model, there is a partial differential equation governing the price of an option in an exponential Lévy model. Let $S_t$ be given by a stock price model of the exponential Lévy type, driven by a Lévy process $L_t$ having characteristic triplet (A, ν, γ) under Q. Consider an option with payoff function $f(S_T)$, and assume that the option price can be expressed as a function of the log forward price $X_t = \ln\left(e^{r(T-t)} S_t\right)$. The price of the option under the martingale measure Q is then

$$\Lambda(x, t) = e^{-r(T-t)} E_Q\left[ f\left(e^{x + L_{T-t}}\right) \right].$$

Assuming sufficient differentiability of the payoff and regularity of the Lévy measure, the option price satisfies the following integro-partial differential equation:

$$\frac{\partial\Lambda}{\partial t} + \gamma\frac{\partial\Lambda}{\partial x} + \frac{A}{2}\frac{\partial^2\Lambda}{\partial x^2} - r\Lambda + \int_{\mathbb{R}} \left[ \Lambda(x + z, t) - \Lambda(x, t) - z 1_{|z|<1} \frac{\partial\Lambda}{\partial x} \right] \nu(dz) = 0 \tag{4.1}$$

with x ∈ R, t ∈ (0, T) and terminal condition $\Lambda(x, T) = f(e^x)$. The introduction of the nonlocal integral term makes the pde harder to solve, both analytically and numerically, than the Black-Scholes equation. In particular, when (4.1) is restricted to a finite grid, the integral term needs to be extended beyond the boundary to make sense. Integro-partial differential equations and other aspects of exponential Lévy models in finance are discussed extensively in Cont and Tankov [35].

4.2. Stochastic volatility models.
Instead of replacing the Brownian motion as the driving source, one could add another random process, making the volatility non-constant:

$$dS_t = \mu S_t \, dt + \sigma_t(Y_t) S_t \, dB_t,$$

where $B_t$ is a Brownian motion but $\sigma_t$ is now a stochastic process, modelling the random volatility. Common driving processes for the volatility are the geometric Brownian motion, the Ornstein-Uhlenbeck process

$$dY_t = \alpha(\eta - Y_t) \, dt + \beta \, dW_t$$

and the Cox-Ingersoll-Ross (CIR) process

$$dY_t = \kappa(\eta - Y_t) \, dt + v\sqrt{Y_t} \, dW_t.$$

The process $W_t$ is another Brownian motion, correlated or uncorrelated with the Brownian motion in the stock price dynamics. For the Ornstein-Uhlenbeck process, however, there are also models where the second process is a Lévy process, as shown in the next section. Introducing stochastic volatility makes it possible to capture volatility clustering and dependence structures, while the models can also replicate implied volatility smiles. Adding a jump term to the price dynamics or choosing a jump process also makes the models realistic on a short-term scale when it comes to jumps in the paths.

The drawback is the extra dimension that is added, which has the effect that the stock price alone is no longer a Markov process. Instead it is necessary to consider a two-dimensional process. The complications that a second dimension creates for numerical methods account for a lot of the hesitation shown towards the use of stochastic volatility models. In recent years, though, there has been an increasing interest from practitioners in these models, mainly in the model suggested by Heston [72]. The volatility process in the Heston model is a Cox-Ingersoll-Ross process with a Brownian motion correlated with the Brownian motion driving the stock price, i.e.

$$dS_t = \mu S_t \, dt + \sqrt{Y_t}\, S_t \, dB_t,$$
$$dY_t = \kappa(\eta - Y_t) \, dt + v\sqrt{Y_t} \, dW_t,$$

with the correlation between the two Brownian motions given as $dB_t \, dW_t = \rho \, dt$.
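The Heston dynamics can be simulated with a simple time-stepping scheme. The sketch below is one possible discretisation, not a method taken from the thesis: an Euler step for the variance with "full truncation" (negative variance values are replaced by zero inside the coefficients) and a log-Euler step for the price. All parameter values are hypothetical.

```python
import numpy as np

def heston_terminal(s0, y0, mu, kappa, eta, v, rho, T, n_steps, n_paths, rng):
    """Simulate (S_T, Y_T) in the Heston model with a full-truncation Euler scheme."""
    dt = T / n_steps
    s = np.full(n_paths, s0, dtype=float)
    y = np.full(n_paths, y0, dtype=float)
    for _ in range(n_steps):
        z1 = rng.standard_normal(n_paths)
        # z2 has correlation rho with z1, mirroring dB dW = rho dt.
        z2 = rho * z1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n_paths)
        y_pos = np.maximum(y, 0.0)          # truncated variance used in coefficients
        s *= np.exp((mu - 0.5 * y_pos) * dt + np.sqrt(y_pos * dt) * z1)
        y += kappa * (eta - y_pos) * dt + v * np.sqrt(y_pos * dt) * z2
    return s, np.maximum(y, 0.0)

rng = np.random.default_rng(2)
s_T, y_T = heston_terminal(s0=100.0, y0=0.04, mu=0.05, kappa=2.0, eta=0.04,
                           v=0.3, rho=-0.7, T=1.0, n_steps=200,
                           n_paths=20_000, rng=rng)
```

With the log-Euler step the conditional mean of each price increment is exact, so the sample mean of the terminal price should be close to 100·exp(0.05), while the variance stays near its mean-reversion level η.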
A common feature of many of the suggested models is that the volatility process is mean reverting, like the Cox-Ingersoll-Ross process and the Ornstein-Uhlenbeck process mentioned above. This is thought to be a realistic feature observed in market data: new information perceived by the traders makes the activity jump up suddenly and then revert back towards a steady state. Assuming that the stochastic volatility model is of the Ornstein-Uhlenbeck class with dynamics

$$dS_t = \mu S_t \, dt + \sigma(Y_t) S_t \, dB_t,$$
$$dY_t = \alpha(m - Y_t) \, dt + \beta \, dW_t,$$

for some function σ(y), Fouque et al. [59] derive a pricing partial differential equation similar to the Black-Scholes pde. Denoting the instantaneous correlation coefficient between the two Brownian motions by ρ, the price of a European derivative with payoff function f(x) is given by

$$\frac{\partial\Lambda}{\partial t} + \frac{1}{2}\sigma^2(y)s^2\frac{\partial^2\Lambda}{\partial s^2} + \rho\beta s\sigma(y)\frac{\partial^2\Lambda}{\partial s\,\partial y} + \frac{1}{2}\beta^2\frac{\partial^2\Lambda}{\partial y^2} + r\left( s\frac{\partial\Lambda}{\partial s} - \Lambda \right) + \left[ \alpha(m - y) - \beta\left( \rho\frac{\mu - r}{\sigma(y)} + \gamma(t, x, y)\sqrt{1 - \rho^2} \right) \right]\frac{\partial\Lambda}{\partial y} = 0 \tag{4.2}$$

with the condition Λ(T, x, y) = f(x). Here r is the interest rate and γ(t, x, y) is an arbitrary function representing the risk premium factor from $W_t$. In the perfectly correlated case this factor does not appear. Otherwise it is the market price of risk, which needs to be selected, an issue of great debate, see [59].

Models where the second random process is another Brownian motion also include the models by Hull-White [79] and Stein-Stein [114]. Scott [111] uses a Gaussian Ornstein-Uhlenbeck process but adds normally distributed jumps with exponentially distributed arrival times, while Bates [5] adds a compound Poisson process to the stock price dynamics in the Heston model. The next section contains a more detailed examination of a model where the second added process is not a Brownian motion but a Lévy process. Several books contain sections about stochastic volatility models and their usage.
Good overviews of the different stochastic volatility models and their properties can be found in Cont and Tankov [35], while Fouque, Papanicolaou and Sircar [59] and Lewis [85] concentrate on models without jumps.

4.3. The Barndorff-Nielsen - Shephard model.

The returns predicted by most of the suggested models will, by a central limit theorem, tend towards a Gaussian distribution if sampled with low frequency. For long time horizons the Black-Scholes model could therefore seem like a reasonable choice, while on a short or moderate time scale the observed returns are typically heavy tailed, with volatility clustering and skewness. Barndorff-Nielsen and Shephard suggested in an inspiring paper [4] a model constructed to handle the short-term aspects. The stock price dynamics is driven by a Brownian motion with drift,

$$dS_t = (\mu + \beta\sigma^2(t)) S_t \, dt + \sigma(t) S_t \, dB_t, \tag{4.3}$$

but the volatility is assumed to be a stochastic process. Instead of a Brownian motion driving the volatility process, a Lévy process with only positive jumps, a subordinator, is the driving source in a process of Ornstein-Uhlenbeck type,

$$d\sigma^2(t) = -\lambda\sigma^2(t) \, dt + dL(\lambda t). \tag{4.4}$$

The process L(λt) is termed the background driving Lévy process (BDLP) and the volatility process is said to be a non-Gaussian Ornstein-Uhlenbeck process. Like the Gaussian Ornstein-Uhlenbeck process it is a mean-reverting process; however, because the subordinator only has positive jumps, the volatility jumps up and reverts down. The subordinator ensures the positivity of the process σ²(t), which is required of the squared volatility. The unusual timing L(λt) serves to decouple the modelling of the marginal distribution of the stock’s log-returns from the autocorrelation structure. Whatever the value of λ, the marginal distribution of σ²(t) will be unchanged.
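The behaviour of (4.4), upward jumps followed by exponential decay, can be sketched with a simple simulation. For tractability the example below swaps in a Gamma stationary law instead of the inverse Gaussian law discussed in this text: for a Gamma(a, b) stationary distribution the BDLP is a compound Poisson process with intensity a and exponentially distributed jumps with rate b, so the path can be generated exactly on a time grid. The parameters a and b are hypothetical.

```python
import numpy as np

def gamma_ou_path(n_steps, dt, lam, a, b, rng):
    """Exact grid simulation of d(sigma^2) = -lam*sigma^2 dt + dL(lam*t), Gamma-OU case."""
    y = np.empty(n_steps + 1)
    y[0] = rng.gamma(a, 1.0 / b)            # start in the stationary Gamma(a, b) law
    decay = np.exp(-lam * dt)
    for i in range(n_steps):
        # L(lam*t) jumps at rate lam*a in real time; sizes are Exp with mean 1/b.
        n_jumps = rng.poisson(lam * a * dt)
        u = rng.uniform(0.0, dt, size=n_jumps)        # jump times inside the step
        jumps = rng.exponential(1.0 / b, size=n_jumps)
        # Each jump decays from its arrival time until the end of the step.
        y[i + 1] = decay * y[i] + np.sum(np.exp(-lam * (dt - u)) * jumps)
    return y

rng = np.random.default_rng(3)
sigma2 = gamma_ou_path(n_steps=50_000, dt=1.0, lam=0.83, a=2.0, b=10.0, rng=rng)
```

The simulated squared volatility stays strictly positive and its long-run average is close to the stationary mean a/b, whatever value of λ is chosen; λ only changes how quickly the process forgets its past.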
A generalised model is achieved by adding a leverage term ρ dL(λt) to the stock price dynamics, which accounts for empirical studies showing that large downward moves in prices are associated with upward moves in volatility. The generalised model will not be considered here.

Figure 4. Left: Stock price path from the Barndorff-Nielsen and Shephard model without leverage simulated with parameters δ = 0.0116, γ = 54.2, µ = 0.000621, β = 0.5 and λ = 0.83. Right: The log-marginal returns from the stock price. The peaked structure is clearly visible together with a pattern of volatility clustering.

Barndorff-Nielsen and Shephard [4] proposed to use a superposition of Ornstein-Uhlenbeck processes $Y_k(t)$, with different speeds of mean reversion $\lambda_k$, to obtain a more general correlation pattern in the volatility structure. Let the volatility follow a weighted sum, with positive weights $w_k$ adding up to one,

$$\sigma^2(t) = \sum_{k=1}^{m} w_k Y_k(t),$$

where

$$dY_k(t) = -\lambda_k Y_k(t) \, dt + dL_k(\lambda_k t) \tag{4.5}$$

and $L_k(\lambda_k t)$ are assumed to be independent but not necessarily identically distributed subordinators with Lévy measures $\ell_k(dz)$. The autocorrelation function for the stationary σ²(t) then becomes

$$r(u) = \sum_{k=1}^{m} \tilde{w}_k \exp(-\lambda_k |u|),$$

where the weights $\tilde{w}_k$ are proportional to $w_k \mathrm{Var}(L_k)$. Letting some of the components represent short-term and others long-term movements, both long-range and quasi-long-range dependence in the log-returns can be modelled. Below we will sometimes use the notation $\alpha(y) = \mu + \beta y$, $\sigma(y) = \sqrt{y}$ for the parameter functions in (4.3), assuming the volatility is given by one process Y(t) of the form in (4.5). The choice of Ornstein-Uhlenbeck processes driving the volatility leads to some interesting aspects of the model.
From a modelling perspective one can choose any self-decomposable distribution D and find a stationary process of Ornstein-Uhlenbeck type which has one-dimensional marginal law D. A self-decomposable distribution has the property that for any ζ ∈ R and c ∈ (0, 1) the characteristic function of the distribution can be written as $\varphi(\zeta) = \varphi(c\zeta)\varphi_c(\zeta)$, where $\varphi_c$ is another characteristic function. Two ways to approach the modelling of the Ornstein-Uhlenbeck process exist. One is to write down the specific parametric form of the distribution D and calculate the implied behaviour of the BDLP. The other, instead of starting out with the distribution, is to pick L(λt) and construct the Ornstein-Uhlenbeck process based on it. Some restrictions apply to which Lévy processes can be used to get a self-decomposable distribution; more specifically, a necessary and sufficient condition for (4.4) to have a stationary solution is that E[log(1 + |L(1)|)] < ∞.

From the point of view of option pricing it is essential that the model is arbitrage-free. Barndorff-Nielsen and Shephard [4] use Esscher transforms to show that this is the case. Hence, there exist equivalent martingale measures under which $\exp(S_t)$ is a martingale. Since the model is a stochastic volatility model, including a jump process, the model is incomplete and more than one equivalent martingale measure exists. Pricing becomes a question of which measure to work under, for which there are several strategies as mentioned in Section 3. Nicolato and Venardos [96] investigate option pricing under structure preserving measures, i.e. measures under which the price dynamics remains of the Barndorff-Nielsen and Shephard type (4.3)-(4.4). The tractability makes it possible to price derivatives in closed form under structure preserving measures; in particular, the Laplace transform of log-prices has a simple form.
The cumulant function of the log-price, $\ln[\psi(\theta)] = \ln\left[E_Q[\exp(i\theta S_T)]\right]$ (with S denoting the log-price in this formula), under structure preserving measures and assuming the stationary law of $\sigma_t^2$ is inverse Gaussian IG(δ, γ), is given as

$$\ln[\psi(\theta)] = i\theta(S_t + r(T - t)) - \frac{\theta^2 + i\theta}{2\lambda}\left[1 - \exp\{-\lambda(T - t)\}\right]\sigma^2(t) + \delta\left(\sqrt{f_1} - \gamma\right) - \frac{\delta(\theta^2 + i\theta)}{\lambda\sqrt{f_2}}\left[ \tan^{-1}\left(\frac{\gamma}{\sqrt{f_2}}\right) - \tan^{-1}\left(\frac{\sqrt{f_1}}{\sqrt{f_2}}\right) \right],$$

where

$$f_1 = \gamma^2 + \frac{\theta^2 + i\theta}{\lambda}\left[1 - \exp\{-\lambda(T - t)\}\right], \qquad f_2 = -\gamma^2 - \frac{\theta^2 + i\theta}{\lambda}.$$

Given the characteristic function it is feasible to use numerical inversion techniques to price options in the Barndorff-Nielsen and Shephard model under structure preserving measures, see Groth [68] and also the contribution by Nicolato and Venardos to the discussion in Barndorff-Nielsen and Shephard [4].

Another choice is to use the measure which minimises the relative entropy (2.4), the minimal entropy martingale measure $Q^{ME}$. Benth and Meyer-Brandis [17] study the minimal entropy martingale measure in the Barndorff-Nielsen and Shephard model and derive the density function. Commencing with the utility maximisation problem as described above and going to the risk aversion limit, using verification theorems from Grandits and Rheinländer [67], they can identify the density. With the stochastic exponentials

$$Z'_t = \exp\left( -\int_0^t \frac{\alpha(Y_s)}{\sigma(Y_s)} \, dB_s - \frac{1}{2}\int_0^t \frac{\alpha^2(Y_s)}{\sigma^2(Y_s)} \, ds \right),$$
$$Z''_t = \exp\left( \int_0^t\!\!\int_0^\infty \ln\delta(Y_s, z, s) \, N(dz, ds) + \int_0^t\!\!\int_0^\infty (1 - \delta(Y_s, z, s)) \, \nu(dz) \, ds \right),$$

the density process of $Q^{ME}$ will, under sufficient conditions, be given as $Z_t := Z'_t Z''_t$. Here δ(y, z, t) is the function

$$\delta(y, z, t) = \frac{H(t, y + z)}{H(t, y)}$$

and H(t, y) is a function associated with the utility optimisation problem in the case when the investor is not issuing a claim.
H can be represented as

$$H(t, y) = E\left[ \exp\left( -\frac{1}{2}\int_t^T \frac{\alpha^2(Y_s)}{\sigma^2(Y_s)} \, ds \right) \,\Big|\, Y_t = y \right], \quad (t, y) \in [0, T] \times \mathbb{R}_+,$$

and it was shown by Benth and Meyer-Brandis that H(t, y) is governed by the partial differential equation

$$\frac{\partial H}{\partial t} - \frac{\alpha^2(y)}{2\sigma^2(y)} H - \lambda y\frac{\partial H}{\partial y} + \lambda\int_0^\infty (H(t, y + z) - H(t, y)) \, \nu(dz) = 0,$$

given that H(T, y) = 1, $(t, y) \in [0, T) \times \mathbb{R}_+$. The minimal entropy martingale measure for the generalised Barndorff-Nielsen and Shephard model, including the leverage term, is studied in Steiger [113]. The minimal entropy martingale measure is, as mentioned in Section 2.1, equivalent to the historical measure, which makes it suitable for option pricing.

The utility indifference pricing setting considered when the density $Z_t$ is identified also leads to an integro-pde governing the price Λ of the option the investor can issue. Using a dynamic programming approach Benth and Meyer-Brandis derive the Hamilton-Jacobi-Bellman (HJB) equations associated with the value process of the investor under $Q^{ME}$:

$$\frac{\partial\Lambda}{\partial t} + rs\frac{\partial\Lambda}{\partial s} + \frac{1}{2}\sigma^2(y)s^2\frac{\partial^2\Lambda}{\partial s^2} - \lambda y\frac{\partial\Lambda}{\partial y} + \lambda\int_0^\infty (\Lambda(t, y + z, s) - \Lambda(t, y, s))\frac{H(t, y + z)}{H(t, y)} \, \nu(dz) = r\Lambda \tag{4.6}$$

with $(t, y, s) \in [0, T) \times \mathbb{R}_+^2$ and terminal condition Λ(T, y, s) = f(s). Under the minimal entropy martingale measure the subordinator L(λt) is changed into a pure jump Markov process $\tilde{L}(\lambda t)$ with jump measure

$$\tilde{\nu}(\omega, dz, dt) = \frac{H(t, \tilde{Y}_t(\omega) + z)}{H(t, \tilde{Y}_t(\omega))} \, \nu(dz) \, dt, \tag{4.7}$$

where the stochastic process $\tilde{Y}_t$ is given as

$$d\tilde{Y}_t = -\lambda\tilde{Y}_t \, dt + d\tilde{L}(\lambda t).$$

An equation similar to (4.6), with only some sign changes, can be derived for the buyer of the claim, illustrating the problem of pricing in an incomplete market. It is known that the price under the minimal entropy martingale measure is the highest price the buyer can accept and at the same time the lowest price the seller will agree to. If the market prices deviate from this, the market will be in favour of one party.
Notice that the function δ(y, z, s) appears as a measure change in (4.7) and also in the partial differential equation (4.6). The time- and state-dependent ratio redistributes the jump measure under $Q^{ME}$, rescaling the jumps. The integro-pde (4.6) is studied numerically in Benth and Groth [10], Paper II, using finite difference methods to calculate option prices in the Barndorff-Nielsen and Shephard model.

A related equation, for a general risk aversion parameter γ, is derived in Benth and Meyer-Brandis [16], giving again a pde governing the option price $\Lambda^{(\gamma)}$ for the issuer of the claim:

$$\frac{\partial\Lambda^{(\gamma)}}{\partial t} + rs\frac{\partial\Lambda^{(\gamma)}}{\partial s} + \frac{1}{2}\sigma^2(y)s^2\frac{\partial^2\Lambda^{(\gamma)}}{\partial s^2} - \lambda y\frac{\partial\Lambda^{(\gamma)}}{\partial y} + \frac{\lambda}{\gamma}\int_0^\infty \left[ \exp\left(\gamma(\Lambda^{(\gamma)}(t, y + z, s) - \Lambda^{(\gamma)}(t, y, s))\right) - 1 \right]\frac{H(t, y + z)}{H(t, y)} \, \nu(dz) = r\Lambda^{(\gamma)}$$

with $\Lambda^{(\gamma)}(T, y, s) = f(s)$, for $(t, y, s) \in [0, T) \times \mathbb{R}_+^2$. Using the change of variable

$$\Lambda^{(\gamma)}(t, y, s) = \frac{1}{\gamma}\ln h^{(\gamma)}(t, y, s)$$

removes the exponential term in the integrand but instead introduces a non-linearity in the pde:

$$\frac{\partial h^{(\gamma)}}{\partial t} + rs\frac{\partial h^{(\gamma)}}{\partial s} + \frac{1}{2}\sigma^2(y)s^2\frac{\partial^2 h^{(\gamma)}}{\partial s^2} - \frac{1}{2}ys^2\frac{1}{h^{(\gamma)}}\left(\frac{\partial h^{(\gamma)}}{\partial s}\right)^2 - \lambda y\frac{\partial h^{(\gamma)}}{\partial y} + \lambda\int_0^\infty (h^{(\gamma)}(t, y + z, s) - h^{(\gamma)}(t, y, s))\frac{H(t, y + z)}{H(t, y)} \, \nu(dz) = rh^{(\gamma)}\ln h^{(\gamma)}. \tag{4.8}$$

After the change of variable the terminal condition is $h^{(\gamma)}(T, y, s) = \exp(\gamma f(s))$. The numerical solution of (4.8) is used in Benth, Groth and Lindberg [13], Paper IV, together with a root-finding algorithm to find the investors’ implied risk aversion from actually traded options, assuming the underlying model is the stochastic volatility model by Barndorff-Nielsen and Shephard.

5. Numerical methods

5.1. Monte Carlo and quasi-Monte Carlo methods.

Monte Carlo methods have over the years become indispensable tools in many areas, including financial engineering, and are perhaps the most flexible and applicable numerical methods available.
Based on random sampling, their elementary application is numerical integration, but there is a broad field of problems where Monte Carlo methods can be used. Assume that your problem can be cast as an integration over some measure, for which you know how to generate suitable random numbers. Then Monte Carlo integration is the easy task of sampling sequences of random numbers and using these to evaluate the integrand. The sample mean gives a probabilistic approximation of the integral and, when it is not possible to get an analytic solution, this probabilistic approach may prove to be very useful. But Monte Carlo methods have several drawbacks, the main one being the slow convergence, which makes them reliant on computational power and time.

The commonly used introductory problem is Monte Carlo integration. Let f(x) be a function integrated over the unit interval,

$$\int_0^1 f(x) \, dx.$$

Assuming the integration is with respect to the Lebesgue measure, the evaluation of the integral can be represented as the approximate calculation of an expectation E[f(U)] with U ∼ Unif[0, 1]. This expectation can be estimated by sampling points uniformly from the interval, resulting in the sequence $a_1, \ldots, a_n$, and then taking the sample mean over these points,

$$E[f(U)] \approx \frac{1}{n}\sum_{i=1}^{n} f(a_i).$$

The strong law of large numbers guarantees that this estimate converges almost surely. One of the disadvantages of Monte Carlo is that the error introduced by replacing the expectation with the sample mean is only a probabilistic measure. If f is square integrable then the error in the Monte Carlo estimate is approximately normally distributed with mean zero and standard deviation $\sigma(f)/\sqrt{n}$. Hence, Monte Carlo integration yields a probabilistic error bound of order $O(n^{-1/2})$. This error does not depend on the dimension, which makes Monte Carlo integration more attractive in higher dimensions.
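The estimator and its standard error can be sketched in a few lines. The integrand f(x) = x², with exact integral 1/3, is a hypothetical choice made only so that the result can be checked.

```python
import numpy as np

def mc_integrate(f, n, rng):
    """Monte Carlo estimate of the integral of f over [0, 1] with its standard error."""
    u = rng.random(n)                      # a_1, ..., a_n drawn from Unif[0, 1)
    values = f(u)
    estimate = values.mean()
    std_error = values.std(ddof=1) / np.sqrt(n)   # sample version of sigma(f)/sqrt(n)
    return estimate, std_error

rng = np.random.default_rng(4)
f = lambda x: x ** 2                       # exact integral is 1/3
est_n, se_n = mc_integrate(f, 100_000, rng)
est_4n, se_4n = mc_integrate(f, 400_000, rng)
```

Quadrupling the number of samples roughly halves the standard error, which is the $O(n^{-1/2})$ rate in practice: each additional digit of accuracy costs a hundred times more samples.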
Conducting Monte Carlo integration also depends on the ability to sample from the underlying distribution, which could be difficult. Together with the probabilistic error bounds, these are the main drawbacks of Monte Carlo integration according to Niederreiter [98].

For complicated financial derivatives, or models with other types of driving noise than Brownian motion, where no analytic answer can be obtained, a numerical method may be the only choice. A Monte Carlo method is an instrument which is incredibly flexible and usable under such premises. If it is known how to generate random numbers from the desired distribution, then it requires, in its basic form, little extra analytic work to get started. In the limit it will, due to the law of large numbers, give a correct answer. The key to using Monte Carlo simulation in finance is that one may write the price of an option as the expectation of the payoff depending on the stochastic development of the asset price. For many financial problems Monte Carlo simulations are especially suitable since the dimension turns out to be high or even infinite, for example when valuing a large portfolio consisting of several different types of assets. Other numerical approaches, such as solving partial differential equations, become hard to handle when the problem has more than a few dimensions. Monte Carlo methods, on the contrary, are not significantly harder to work with in high dimensions than in low ones. An Achilles heel for their use in finance has been American options. For a long time Monte Carlo methods were considered incapable of handling pricing problems involving options with American exercise, but since then both Broadie and Glasserman [26] and Tilley [115] have proposed methods to handle American options. Since Monte Carlo methods sample randomly, the points can in the short run be concentrated in a small part of the interval sampled from.
If instead the interval is divided according to a Cartesian grid with $n$ points and the points are sampled randomly in any order, the convergence can be improved. This procedure is disregarded on the basis that it requires the number of points to be known in advance to form the grid. Using a Cartesian grid rules out the possibility of sampling until a terminal condition is met, for example some convergence requirement. The concept behind quasi-Monte Carlo methods and low-discrepancy sequences is a formalisation of the idea of sampling a sequence of deterministic numbers which fill the interval or space in an evenly distributed way. In contrast to a Cartesian grid, if sampling repeatedly from a low-discrepancy sequence the points retain an even distribution in the sense of discrepancy, a notion of uniformity described below. Because these sequences do not try to mimic randomness, as the pseudo-random sequences used in Monte Carlo methods do, the error when using low-discrepancy sequences in numerical integration is deterministic. The notion of low-discrepancy is reserved for sequences with a convergence rate of order $O((\log n)^d n^{-1})$ in $d$ dimensions and with sufficiently regular integrands. In low dimensions this is clearly better than the Monte Carlo error bound, and it has the extra benefit that the bound is deterministic. In higher dimensions the advantage over Monte Carlo methods is not as prominent since the error bound depends on the dimension. But, as pointed out by Glasserman [64], for some problems in finance these methods are still more effective even in dimensions up to 150.

Discrepancy is the measure used to describe how a point set is distributed compared to a uniform distribution; hence, it is a measure of deviation from uniformity. Given a nonempty family $\mathcal{B}$ of Lebesgue-measurable subsets of $I^d$ and a point set $P = \{x_1, \dots, x_n\}$, the discrepancy of $P$ is given as
\[ D(P; \mathcal{B}) = \sup_{B \in \mathcal{B}} \left| \frac{\sum_{i=1}^{n} \chi(x_i; B)}{n} - \lambda_d(B) \right| \]
where $\lambda_d$ denotes the $d$-dimensional Lebesgue measure and $\chi(\cdot\,; B)$ the characteristic function of $B$. It is clear that $0 \le D(P; \mathcal{B}) \le 1$ always. There are a few different notions of discrepancy, of which the star discrepancy $D^*(P)$ and the extreme discrepancy $D(P)$ are the most important. The difference is the choice of subsets $\mathcal{B}$ considered, see Niederreiter [98]. It is, according to Niederreiter [98], widely believed that the star discrepancy of any $d$-dimensional point set $P$ consisting of $n$ points satisfies
\[ D^*(P) \ge c_d \frac{(\log n)^{d-1}}{n} \]
for some constant $c_d$. It is therefore usual to refer to sequences as low-discrepancy sequences if they have star discrepancy of order $O((\log n)^d / n)$. Although the $(\log n)^d$ factor becomes insignificant compared to the $n^{-1}$ term as the number of points increases, this might not be relevant for point sets of manageable size if $d$ is large. Quasi-Monte Carlo has therefore traditionally been considered inferior to Monte Carlo in higher dimensions. Sequences used for financial applications include Faure [57], Halton [69], Niederreiter [97] and Sobol [112] sequences. The construction of low-discrepancy sequences is outside the scope of this text, see Glasserman [64] and the references therein for more information.

Discrepancy plays a vital role in the Koksma-Hlawka inequality. This explains much of the great interest put into finding low-discrepancy sequences, while discrepancy itself is a rather theoretical concept. The Koksma-Hlawka inequality is a classic result providing a bound on the error introduced when substituting the integral with a sum and evaluating the integrand over a low-discrepancy sequence. The result builds on a one-dimensional result by Jurjen Koksma from 1942 which was extended by Edmund Hlawka in 1961.

Theorem 5.1 (The Koksma-Hlawka inequality). If $f$ has bounded variation $V(f)$ in the sense of Hardy-Krause on the closed hypercube $\bar{I}^d = [0, 1]^d$, then for any set of points $x_1, \dots, x_n \in I^d$ it holds that
\[ \left| \frac{1}{n}\sum_{i=1}^{n} f(x_i) - \int_{I^d} f(u)\,du \right| \le V(f)\, D^*(x_1, \dots, x_n) \tag{5.1} \]
where $D^*(x_1, \dots, x_n)$ is the star discrepancy.

This error bound provides a strict deterministic bound on the integration error but is merely of theoretical value, since it often grossly overestimates the error and both the Hardy-Krause variation and the star discrepancy are difficult to compute. The Koksma-Hlawka bound (5.1) is stated only for the unit hypercube and the Lebesgue measure, but using slightly different definitions Kainhofer [81] provides a Koksma-Hlawka bound for general measures and domains. Kainhofer also studies problems on unbounded domains, which appear frequently in finance, and uses the Hlawka-Mück method [74] for option pricing. The method enables generation of low-discrepancy sequences from arbitrary distributions, provided the distribution function is known. This is discussed in Benth, Groth and Kettler [11], Paper I, for the normal inverse Gaussian distribution.

Starting with Boyle [24] in 1977, the research on Monte Carlo methods in finance has increased rapidly. Boyle et al. [25] contains references to some of the applications of Monte Carlo in finance during the eighties and nineties, including variance reduction techniques and low-discrepancy sequences. A short and comprehensive summary can also be found in Lehoczky [83]. The use of low-discrepancy sequences in finance started surprisingly late, with the first articles on the subject not appearing until the mid-nineties. Joy et al. [80] use Faure sequences to price a variety of options including vanilla calls and Asian options. Faure sequences are also the choice of low-discrepancy sequence when Papageorgiou and Paskov [100] estimate Value-at-Risk for portfolios of stocks and mortgage obligations. The results from quasi-Monte Carlo in their study are superior to those from Monte Carlo, see also Papageorgiou and Traub [101], Paskov [102] and Paskov and Traub [103].
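Returning to the notions of star discrepancy and the Koksma-Hlawka bound (5.1), the one-dimensional case can be checked numerically. The sketch below is an illustration under stated assumptions, not the construction used in Paper I: it uses the base-2 van der Corput sequence (the one-dimensional building block of the Halton sequence), the closed-form expression for the star discrepancy of a sorted one-dimensional point set, and the test integrand $f(x) = x^2$, whose variation on $[0, 1]$ equals 1. All function names are hypothetical.

```python
import random


def van_der_corput(i, base=2):
    """Radical inverse of the positive integer i in the given base."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x


def star_discrepancy_1d(points):
    """Exact star discrepancy of a one-dimensional point set in [0, 1].

    Uses D* = 1/(2n) + max_i |x_(i) - (2i - 1)/(2n)| for the sorted points.
    """
    xs = sorted(points)
    n = len(xs)
    return 1.0 / (2 * n) + max(
        abs(x - (2 * i - 1) / (2 * n)) for i, x in enumerate(xs, start=1)
    )


n = 1024
qmc_points = [van_der_corput(i) for i in range(1, n + 1)]
rng = random.Random(1)
mc_points = [rng.random() for _ in range(n)]

d_qmc = star_discrepancy_1d(qmc_points)
d_mc = star_discrepancy_1d(mc_points)

# Integration error for f(x) = x^2 (exact integral 1/3, variation V(f) = 1).
qmc_error = abs(sum(x * x for x in qmc_points) / n - 1.0 / 3.0)
mc_error = abs(sum(x * x for x in mc_points) / n - 1.0 / 3.0)

print("star discrepancy:", d_qmc, d_mc)
print("integration error:", qmc_error, mc_error)
```

In this setting the quasi-Monte Carlo error stays below $V(f) D^*$, as (5.1) requires, while the pseudo-random point set has a markedly larger discrepancy for the same number of points.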
Glasserman [64] is an excellent source for information on Monte Carlo and quasi-Monte Carlo methods in finance, including a long list of the most important references.

5.2. Fast Fourier transform. The fast Fourier transform (FFT) is a computationally very fast and reliable method to calculate the discrete Fourier transform of a function $g_n = g(n\Delta u)$ for a range of parameter values $x_k = k\Delta x$, $k = 0, \dots, N-1$, simultaneously. Here $\Delta x = 2\pi/(N\Delta u)$, and for the FFT to be most efficient $N$ has to be an integer power of 2. The algorithm takes $N$ complex numbers as input and returns $N$ complex numbers
\[ G_k = \sum_{n=0}^{N-1} e^{-2\pi i \frac{nk}{N}} g_n, \qquad k = -N/2, \dots, N/2. \]
In the nineties research surfaced where Fourier analysis and Laplace analysis were used for transform-based methods to price options in extensions of the Black-Scholes model, see Bakshi and Chen [1], Bates [5], Chen and Scott [34], Heston [72] and Scott [111]. The models include stochastic volatility elements and jumps to give better correspondence to observed asset prices as well as interest rate options. However, the approaches of these authors could not utilise the computational power of the fast Fourier transform. Carr and Madan [32] propose a method able to price options when the characteristic function of the return is known analytically.

The foundation for Carr and Madan's use of the fast Fourier transform is the following: Assume one wants to know the price of a European option with maturity $T$. The payoff depends on the terminal spot price $S_T$ of the underlying asset. Denoting the logarithm of the spot price by $s_T$, it is necessary to know analytically the characteristic function of $s_T$, defined as $\phi_T(u) = E[\exp(ius_T)]$. Denote the logarithm of the strike price by $k$, and let $C_T(k)$ be the value of a call option with strike $\exp(k)$. If $q_T(s)$ is the risk-neutral density of the log-price then the characteristic function of $q_T$ is
\[ \phi_T(u) = \int_{-\infty}^{\infty} e^{ius} q_T(s)\,ds. \]
The value of the call can be described as an integral over this density, i.e.
\[ C_T(k) = E[f(s_T, k)] = \int_k^{\infty} e^{-rT}(e^s - e^k)\, q_T(s)\,ds. \]
The call price does not decay as $k \to -\infty$, so to damp it and obtain a square integrable function, Carr and Madan consider the modified call price $c_T(k)$,
\[ c_T(k) = \exp(\alpha k)\, C_T(k), \qquad \alpha > 0, \]
and give suggestions for appropriate choices of the damping parameter $\alpha$. Now, the Fourier transform of $c_T(k)$ is defined as
\[ \psi_T(v) = \int_{-\infty}^{\infty} e^{ivk} c_T(k)\,dk, \]
and Carr and Madan's idea is to get an analytical value of $\psi_T$ in terms of $\phi_T$ and then use the inverse Fourier transform to obtain option prices. The option price is given by the equation
\[ C_T(k) = \frac{\exp(-\alpha k)}{\pi} \int_0^{\infty} e^{-ivk} \psi_T(v)\,dv \tag{5.2} \]
where the integral can be restricted to the positive half-line since $C_T(k)$ is real. The analytic expression for $\psi_T(v)$ is determined as
\begin{align*}
\psi_T(v) &= \int_{-\infty}^{\infty} e^{ivk} \int_k^{\infty} e^{\alpha k} e^{-rT} (e^s - e^k)\, q_T(s)\,ds\,dk \\
&= \int_{-\infty}^{\infty} e^{-rT} q_T(s) \int_{-\infty}^{s} \left(e^{s+\alpha k} - e^{(1+\alpha)k}\right) e^{ivk}\,dk\,ds \\
&= \int_{-\infty}^{\infty} e^{-rT} q_T(s) \left( \frac{e^{(\alpha+1+iv)s}}{\alpha + iv} - \frac{e^{(\alpha+1+iv)s}}{\alpha + 1 + iv} \right) ds \\
&= \frac{e^{-rT} \phi_T(v - (\alpha+1)i)}{\alpha^2 + \alpha - v^2 + i(2\alpha+1)v}.
\end{align*}
After discretisation on the grid $v_j = \eta(j-1)$, with log-strikes $k_u = -b + \lambda(u-1)$ and $\lambda\eta = 2\pi/N$, and introduction of Simpson's rule weights, the option prices can be represented as
\[ C(k_u) \approx \frac{\exp(-\alpha k_u)}{\pi} \sum_{j=1}^{N} e^{-i\frac{2\pi}{N}(j-1)(u-1)} e^{ibv_j} \psi(v_j)\, \frac{\eta}{3} \left[ 3 + (-1)^j - \delta_{j-1} \right] \tag{5.3} \]
where $\delta_n$ is the Kronecker delta function, which is one for $n = 0$ and zero otherwise. Carr and Madan use this approach for the variance gamma model, which assumes that the log-price obeys a one-dimensional pure jump Markov process with stationary independent increments. The Carr-Madan method is both fast and reliable but has its limitations. One is that it requires the analytical form of the characteristic function, but probably the most severe is that the method is quite restricted in the option types it can handle. Specifically, it is unable to handle path-dependent options, such as Asian options.
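Formula (5.3) can be sketched in a few lines. The example below is a hedged illustration, not Carr and Madan's code: it assumes the Black-Scholes characteristic function for the log-price (chosen only because a closed-form call price is available for comparison), the commonly quoted grid values $N = 4096$, $\eta = 0.25$, $\alpha = 1.5$ with $b = N\lambda/2$, and a small pure-Python radix-2 FFT so that the block needs only the standard library. All function names are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for m in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * m / n) * odd[m]
        out[m], out[m + n // 2] = even[m] + t, even[m] - t
    return out


def bs_char_func(u, s0, r, sigma, T):
    """Characteristic function of log S_T under Black-Scholes (illustrative)."""
    mu = math.log(s0) + (r - 0.5 * sigma ** 2) * T
    return cmath.exp(1j * u * mu - 0.5 * sigma ** 2 * T * u * u)


def carr_madan_calls(char_func, r, T, alpha=1.5, N=4096, eta=0.25):
    """Return log-strikes k_u and call prices C(k_u) following (5.3)."""
    lam = 2.0 * math.pi / (N * eta)   # log-strike spacing, lam * eta = 2 pi / N
    b = 0.5 * N * lam                 # log-strikes cover [-b, b)
    x = []
    for j in range(N):                # j here equals j - 1 in (5.3)
        v = eta * j
        psi = math.exp(-r * T) * char_func(v - (alpha + 1.0) * 1j) / (
            alpha ** 2 + alpha - v ** 2 + 1j * (2.0 * alpha + 1.0) * v)
        simpson = (eta / 3.0) * (3.0 + (-1.0) ** (j + 1) - (1.0 if j == 0 else 0.0))
        x.append(cmath.exp(1j * b * v) * psi * simpson)
    transformed = fft(x)
    strikes = [-b + lam * u for u in range(N)]
    prices = [math.exp(-alpha * k) / math.pi * y.real
              for k, y in zip(strikes, transformed)]
    return strikes, prices


def bs_call(s0, K, r, sigma, T):
    """Closed-form Black-Scholes call price, used only as a reference value."""
    d1 = (math.log(s0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return s0 * Phi(d1) - K * math.exp(-r * T) * Phi(d2)


s0, r, sigma, T = 100.0, 0.05, 0.2, 1.0
log_strikes, fft_prices = carr_madan_calls(
    lambda u: bs_char_func(u, s0, r, sigma, T), r, T)
# Pick the grid strike closest to the money and compare with the closed form.
idx = min(range(len(log_strikes)), key=lambda i: abs(log_strikes[i] - math.log(s0)))
print(math.exp(log_strikes[idx]), fft_prices[idx],
      bs_call(s0, math.exp(log_strikes[idx]), r, sigma, T))
```

A single transform returns prices on the whole log-strike grid at once, which is the source of the method's speed; replacing `bs_char_func` with the characteristic function of another model is the only change needed, provided that function is known analytically.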
The method was generalised to include other options by Raible [106], who uses Fourier and bilateral Laplace transforms, and Lewis [86], who uses generalised Fourier transforms consistently. Carr and Madan consider a Fourier transformation in the strike price but, as shown in Groth [68], it is equivalent and natural to Fourier transform using the spot price.

5.3. PDE-methods: Finite differences and Finite elements. If the asset price is driven by a geometric Brownian motion there is a direct connection between solving the risk-neutral pricing problem and solving a boundary value problem formulated with a parabolic partial differential equation. Following the original Black-Scholes analysis one can derive the Black-Scholes partial differential equation from Itô's formula. Assume sufficient regularity and that the asset $S$ is given by the stochastic differential equation
\[ dS_t = \mu S_t\,dt + \sigma S_t\,dB_t. \]
From the Feynman-Kac formula it follows that the price $\Lambda$ of a derivative written on the underlying asset $S$ solves the partial differential equation
\[ \frac{\partial \Lambda}{\partial t} + rs\frac{\partial \Lambda}{\partial s} + \frac{1}{2}\sigma^2 s^2 \frac{\partial^2 \Lambda}{\partial s^2} - r\Lambda = 0, \qquad \Lambda(T, s) = f(s), \tag{5.4} \]
where $f(s)$ is the payoff function and $r$ is the interest rate.

For all models discussed above the price of an option has a representation as the solution to a partial differential equation. The price in the Black-Scholes model solves the one-dimensional pde in equation (5.4). If the stochastic volatility is driven by a Brownian motion the equation is the two-dimensional linear pde (4.2). As shown in Section 4 it is possible to derive an integro-pde representing the price of a contingent claim, in both exponential Lévy models and stochastic volatility models including a Lévy process. Solving an integro-pde numerically is naturally a more involved task than solving an ordinary pde. The integral term is non-local, depending on the whole solution and not only on the variables in a small neighbourhood.
The use of standard techniques to solve the equation includes finding a suitable way to represent the integral on the possibly infinite domain, either with the information at hand or by approximation. This can prove cumbersome and introduce severe numerical problems if not treated carefully. If the Lévy process driving the model has infinite activity the measure is singular at zero, causing additional problems in the implementation. Benth and Groth [10], Paper II, discuss how to solve the integro-pde in equation (4.6) using the finite difference method, while Benth, Groth and Lindberg [13], Paper IV, consider equation (4.8).

The standard techniques for solving partial differential equations are the finite difference and the finite element methods. The foundation of these methods will not be discussed here, as the methods are used only as tools. The interested reader is referred to standard textbooks on numerical solutions of partial differential equations. The main reference for pde-methods in finance is Wilmott, Dewynne and Howison [116], but the book is showing its age: it lacks any treatment of models with jumps or stochastic volatility and focuses mainly on finite difference methods. Cont and Tankov [35] includes a chapter, with numerous references, about integro-pdes in exponential Lévy markets based partly on Cont, Tankov and Voltchkova [36] and Cont and Voltchkova [37, 38]. Important research on finite element methods in finance is done by the group around Schwab [73, 90, 91].

6. Option sensitivities

Numerous research articles focus on the question of how to price options and other derivatives. Equally many, if not more, instead ask the related question of how to hedge the positions. Financial institutions need to know how to manage the risk their portfolios face from changes in the market.
The classic Black-Scholes analysis depends on the possibility of setting up a risk-free portfolio, with the rate of return equal to the interest rate, consisting of a short position in the option and a long position in $\Delta$ shares of the underlying. This quantity $\Delta$ is the sensitivity of the option price to changes in the price of the underlying asset, i.e.
\[ \Delta = \frac{\partial \Lambda}{\partial s}. \]
This is called the delta of the option and it is one of the option sensitivities often grouped together under the name the Greeks. They are all measures of how sensitive the option price is to changes in one parameter in the model of the underlying asset. Common ones are rho $\rho$, theta $\Theta$, vega $\mathcal{V}$ and gamma $\Gamma$, which are, in order, the sensitivity to the interest rate, to the passage of time and to the volatility, and the second derivative with respect to the price of the underlying. The primary one is clearly delta because of the connection with the Black-Scholes analysis and the concept of delta-hedging. Holding the portfolio described above, the option owner is instantaneously secured against any changes in the price of the asset, as the gain (loss) in the price of the option is offset by a similar fall (rise) in the value of the position in the stock. Maintaining a delta-neutral portfolio enables traders to manage the risk from asset price changes. This holds true in theory only, though, since delta-hedging is a dynamic hedging strategy that needs continuous rebalancing of the portfolio, incurring prohibitively large transaction costs. Similarly, investors can aim to keep a portfolio gamma-neutral to reduce the risk from the curvature of the option price which is not covered by the delta-hedge. See Hull [78] for a more extensive introduction.

While the prices of liquid options are observable in the market, the sensitivities are not and need to be calculated, which in reality means estimating a derivative.
For certain models and simple option types, for example European options in the Black-Scholes model, it is possible to derive analytical expressions and there is no need to resort to simulation. For more complicated contracts in advanced models this is not feasible and one needs to resort to numerical approximations. This section concentrates solely on Monte Carlo simulation of the sensitivities, with a brief coverage of three different methods, namely the finite difference, the pathwise differentiation and the likelihood ratio methods, and finally a more in-depth coverage of the Malliavin method.

Suppose the price of the option is represented as a discounted expectation similar to (2.3), with payoff function $f$ and asset price $S_t$ depending on a parameter $\theta$. Assume for clarity that the interest rate is constant. The sensitivity of the price with respect to $\theta$ is then the derivative
\[ \alpha(\theta) = \frac{\partial}{\partial\theta} E\left[ e^{-rT} f(S_T(\theta)) \right]. \]
The obvious approach to simulating $\alpha$ is to use a finite difference approximation of the derivative. Simulate $n$ independent replications of $S_T(\theta)$ and $S_T(\theta + h)$, take the averages $\hat{f}$ over the two sets of paths and let the estimate $\hat{\alpha}$ be
\[ \hat{\alpha}(\theta, h) = e^{-rT}\, \frac{\hat{f}(S_T(\theta + h)) - \hat{f}(S_T(\theta))}{h}. \]
There are some obvious drawbacks with the finite difference approach. To begin with, it has a bias dependent on the value of $h$, while the variance is proportional to $h^{-2}$. The bias is reduced by taking a smaller $h$, but this has to be weighed against the effect on the variance. Using a forward difference and independent random numbers for the two sequences, the best convergence rate is typically $O(n^{-1/4})$. The convergence rate can be improved to $O(n^{-1/2})$ by taking central differences and by using common random numbers, as suggested by Glasserman and Yao [65], which is the best that can be expected from Monte Carlo simulations.
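The effect of common random numbers on the finite difference estimator can be illustrated with a small sketch. The setting is an assumption made for illustration: a European call in the Black-Scholes model with $\theta = S_0$, so that the estimated sensitivity is the delta, and a central difference with step $h$. The function names and parameter values are hypothetical.

```python
import math
import random


def call_payoff(s0, K, r, sigma, T, z):
    """Call payoff when the terminal price is driven by the standard normal draw z."""
    s_T = s0 * math.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
    return max(s_T - K, 0.0)


def fd_delta(s0, K, r, sigma, T, h, n, common=True, seed=0):
    """Central-difference delta estimate; returns (estimate, standard error).

    With common=True the bumped and unbumped paths share the same normal draw;
    with common=False they use independent draws.
    """
    rng = random.Random(seed)
    disc = math.exp(-r * T)
    total, total_sq = 0.0, 0.0
    for _ in range(n):
        z_up = rng.gauss(0.0, 1.0)
        z_dn = z_up if common else rng.gauss(0.0, 1.0)
        d = disc * (call_payoff(s0 + h, K, r, sigma, T, z_up)
                    - call_payoff(s0 - h, K, r, sigma, T, z_dn)) / (2.0 * h)
        total += d
        total_sq += d * d
    mean = total / n
    var = (total_sq - n * mean * mean) / (n - 1)
    return mean, math.sqrt(var / n)


s0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
d1 = (math.log(s0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
exact_delta = 0.5 * (1.0 + math.erf(d1 / math.sqrt(2.0)))

crn_est, crn_se = fd_delta(s0, K, r, sigma, T, h=1.0, n=20_000, common=True)
ind_est, ind_se = fd_delta(s0, K, r, sigma, T, h=1.0, n=20_000, common=False)
print("exact delta      :", exact_delta)
print("common numbers   :", crn_est, "+/-", crn_se)
print("independent draws:", ind_est, "+/-", ind_se)
```

With independent draws the variance of each difference grows like $h^{-2}$, whereas with common random numbers the two payoffs move together and most of the noise cancels, so the standard error is far smaller for the same $n$.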
Even then, however, the convergence rate can be sensitive to the smoothness of the payoff function, leading to poor performance for options with discontinuous payoffs like binary options.

To achieve a better convergence rate than with the finite difference method, Broadie and Glasserman [26] investigate two different methods, the pathwise method and the likelihood ratio method. Instead of taking the derivative of the expectation, the pathwise method assumes $\alpha$ can be represented as
\[ \alpha(\theta) = \frac{\partial}{\partial\theta} E\left[ e^{-rT} f(S_T(\theta)) \right] = E\left[ e^{-rT} \frac{\partial}{\partial\theta} f(S_T(\theta)) \right]. \]
The last part can be considered as a pathwise derivative of the payoff function, and sufficient regularity of the payoff function is assumed in order to interchange differentiation and expectation. According to Glasserman [64] this method usually has much lower variance than the finite difference and the likelihood ratio methods. To yield an unbiased estimator the pathwise method requires that the differentiation can be moved inside the expectation, which in general demands that the payoff is pathwise continuous with respect to $\theta$. The payoff of a binary option is not continuous with respect to the price of the underlying, so the pathwise method is not applicable for the Greeks of a binary option. Neither is it applicable to barrier options, and for the same reason the pathwise method is unable to handle the gamma of an ordinary call option.

The likelihood ratio method assumes that the distribution of the underlying asset $S_T$ has a density $p$, with $\theta$ being a parameter of the density. Again assume there is enough regularity to change the order of expectation and differentiation. Using the density, the sensitivity can be written as
\[ \alpha(\theta) = \frac{\partial}{\partial\theta} E\left[ e^{-rT} f(S_T(\theta)) \right] = \int_{\mathbb{R}} e^{-rT} f(x)\, \frac{\partial}{\partial\theta} p(x)\,dx. \]
Since smoothness is rarely a problem for densities, the likelihood ratio method is applicable for a wider range of options than the pathwise method. Multiplying and dividing by $p(x)$ and rewriting the integrand leaves
\[ \alpha(\theta) = E\left[ e^{-rT} f(S_T)\, \frac{\partial \log p(S_T)}{\partial\theta} \right]. \]
Here $\partial \log p(S_T)/\partial\theta$ works as a weight function multiplying the payoff function. The product is an unbiased estimator of the derivative when applicable, but the weight often produces large variance, limiting the use of the method. The main limitation is nevertheless the need for explicit knowledge of the density, which in turn needs to depend on the parameter $\theta$. An example where the density of the marginal log-returns is not explicitly known is the Barndorff-Nielsen and Shephard model when the stationary distribution of the volatility process is inverse Gaussian.

The likelihood ratio method is interesting because the differentiation is not applied to the expectation or the payoff function; instead the payoff is multiplied by a weight function. In a sense this can be viewed as a derivative-free calculation of the Greeks. The derivative is in the weight function, which can cause high variance in the simulations. Taking this idea further, Fournié et al. [61] used Malliavin calculus to derive weight functions in an inspiring paper.

6.1. Malliavin calculus and Greeks. The drawbacks of the pathwise and the likelihood ratio methods make it hard to estimate option sensitivities for more complicated contracts and in markets where the density of the asset price is unknown. At the same time the finite difference method is prone to large bias and large variance, especially for options with discontinuous payoff functions. A method capable of handling the contracts the pathwise and likelihood ratio methods struggle with, while still producing unbiased results with low variance, is the Malliavin method proposed by Fournié et al. [61]. The idea is to use variational stochastic calculus to derive a derivative-free method of calculating the Greeks in the Black-Scholes market. The method relies on the theory called Malliavin calculus, especially the integration-by-parts formula, to devise weights multiplying the payoff.
In this way it is possible to avoid taking the derivative of the payoff function, similar to the way it is avoided in the likelihood ratio method. What follows is a short primer on Malliavin calculus; for a full account of the theory see Nualart [99].

Let $W_t$, $t \in \mathbb{R}_+$, be a $d$-dimensional Brownian motion, and let $\mathcal{C}$ denote the space of random variables $F$ of the form
\[ F = f\left( \int_0^{\infty} h_1(t)\,dW_t, \dots, \int_0^{\infty} h_n(t)\,dW_t \right), \qquad f \in \mathcal{S}(\mathbb{R}^n), \quad h_1, \dots, h_n \in L^2(\mathbb{R}_+), \]
where $\mathcal{S}(\mathbb{R}^n)$ is the space of rapidly decreasing $C^{\infty}$ functions on $\mathbb{R}^n$. For a given $F \in \mathcal{C}$ the Malliavin derivative $D_t F$ of $F$ is the process $D_t F$, $t \in \mathbb{R}_+$, in $L^2(\Omega \times \mathbb{R}_+)$ defined by
\[ D_t F = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\left( \int_0^{\infty} h_1(t)\,dW_t, \dots, \int_0^{\infty} h_n(t)\,dW_t \right) h_i(t), \qquad t \in \mathbb{R}_+, \text{ a.s.} \]
Define the norm on $\mathcal{C}$ by
\[ \|F\|_{1,2} = \left( E[F^2] \right)^{1/2} + \left( E\left[ \int_0^{\infty} |D_t F|^2\,dt \right] \right)^{1/2}, \qquad F \in \mathcal{C}. \]
Let $\mathbb{D}^{1,2}$ denote the Banach space which is the completion of $\mathcal{C}$ with respect to the norm $\|\cdot\|_{1,2}$. The derivative operator $D$ is a closed linear mapping defined on $\mathbb{D}^{1,2}$ with values in $L^2(\Omega \times \mathbb{R}_+)$. The derivative operator obeys a chain rule, i.e. if $\psi : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable with bounded partial derivatives and $F = (F_1, \dots, F_n)$ is a random vector whose components belong to $\mathbb{D}^{1,2}$, then $\psi(F) \in \mathbb{D}^{1,2}$ and
\[ D_t \psi(F) = \sum_{i=1}^{n} \frac{\partial \psi}{\partial x_i}(F)\, D_t F_i, \qquad t \in \mathbb{R}_+, \text{ a.s.} \]
The divergence operator $\delta$, also called the Skorohod integral, exists and is the adjoint of $D$. Assuming $u$ is a stochastic process in $L^2(\Omega \times \mathbb{R}_+)$, then $u \in \mathrm{Dom}(\delta)$ if and only if for all $F \in \mathbb{D}^{1,2}$ it holds that
\[ \left| E\left[ \langle DF, u \rangle_{L^2(\mathbb{R}_+)} \right] \right| := \left| E\left[ \int_0^{\infty} D_t F\, u(t)\,dt \right] \right| \le K(u)\, \|F\|_{1,2}, \]
where $K(u)$ is a constant independent of $F$. If $u \in \mathrm{Dom}(\delta)$, then $\delta(u)$ is defined by the following integration-by-parts formula
\[ E[F\delta(u)] = E\left[ \langle DF, u \rangle_{L^2(\mathbb{R}_+)} \right], \qquad \forall F \in \mathbb{D}^{1,2}. \]
The domain of $\delta$ contains all adapted processes which belong to $L^2(\Omega \times \mathbb{R}_+)$, and for such processes the Skorohod integral coincides with the Itô integral. That is, for an adapted process $u \in L^2(\Omega \times \mathbb{R}_+)$,
\[ \delta(u) = \int_0^{\infty} u(t)\,dW_t. \]
Also, if $F \in \mathbb{D}^{1,2}$ then for all $u \in \mathrm{Dom}(\delta)$ such that $F\delta(u) - \int_0^T D_t F\, u(t)\,dt \in L^2(\Omega)$ it holds that
\[ \delta(Fu) = F\delta(u) - \int_0^T D_t F\, u(t)\,dt. \]
The main result for computation of sensitivities with Malliavin calculus is the following: Let $(F^{\alpha})_{\alpha}$ be a family of random variables, continuously differentiable in $\mathrm{Dom}(D)$ with respect to the parameter $\alpha$, and let $u(t)$, $t \in [0, T]$, be a process in $L^2(\Omega \times \mathbb{R}_+)$. Assuming that $\langle DF^{\alpha}, u \rangle_{L^2(\mathbb{R}_+)} \ne 0$ a.s., then
\[ \frac{\partial}{\partial\alpha} E\left[ f(F^{\alpha}) \right] = E\left[ f(F^{\alpha})\, \delta\left( u\, \frac{\partial F^{\alpha}/\partial\alpha}{\langle DF^{\alpha}, u \rangle_{L^2(\mathbb{R}_+)}} \right) \right] \tag{6.1} \]
for all functions $f$ such that $f(F^{\alpha}) \in L^2(\Omega)$. Using (6.1) one can compute Malliavin weights, assuming it is allowed to interchange differentiation and expectation. The process $u$ is a weighting function which can be chosen to get an optimal tuning for specific contracts. The Malliavin weights produce unbiased estimates and do not rely on explicit knowledge of the stock price density, as the likelihood ratio method does. The result in Fournié et al. [61] suggests that the method gives significantly lower variance for options with discontinuous payoffs.

The research grew rapidly after the first article, with the same analysis done for other types of contracts, with other weighting functions and in other models, see [7, 8, 18, 19, 20, 60, 66]. As noticed in Kohatsu-Higa and Montero [82], the likelihood ratio method is similar to the Malliavin method if the density is known. It was also shown by Chen and Glasserman [33] that taking a time-step approximation using Euler schemes, applying the likelihood ratio method and then passing to the continuous-time limit results in the same weights as in the Malliavin method for several important cases, namely delta, rho and vega.

The Malliavin method sparked interest in doing similar research on models including jumps. Except for the pure-jump setting examined by El-Khatib and Privault [56], the main idea has been to consider the derivative in the direction of the Wiener process.
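Returning to formula (6.1), the simplest case can be worked out by hand and checked by simulation. In the Black-Scholes model with $F = S_T = S_0 \exp((r - \sigma^2/2)T + \sigma W_T)$, parameter $S_0$ and the constant choice $u \equiv 1$ on $[0, T]$, one has $\partial S_T/\partial S_0 = S_T/S_0$ and $D_t S_T = \sigma S_T$, so that $\langle DS_T, u \rangle = \sigma T S_T$ and the weight is $\delta(1/(S_0 \sigma T)) = W_T/(S_0 \sigma T)$. The sketch below applies this weight to a digital option paying one unit if $S_T > K$, a payoff for which the pathwise derivative vanishes almost surely. The digital contract, the parameter values and the function names are assumptions chosen for illustration; in this setting the weight coincides with the likelihood ratio weight.

```python
import math
import random


def digital_delta_weight(s0, K, r, sigma, T, n, seed=0):
    """Monte Carlo delta of a digital call using the weight W_T / (s0 * sigma * T)."""
    rng = random.Random(seed)
    disc = math.exp(-r * T)
    total = 0.0
    for _ in range(n):
        w_T = math.sqrt(T) * rng.gauss(0.0, 1.0)       # terminal Brownian value
        s_T = s0 * math.exp((r - 0.5 * sigma ** 2) * T + sigma * w_T)
        payoff = 1.0 if s_T > K else 0.0                # discontinuous payoff
        total += disc * payoff * w_T / (s0 * sigma * T)
    return total / n


def digital_delta_exact(s0, K, r, sigma, T):
    """Closed-form Black-Scholes delta of the digital call, for reference."""
    d2 = (math.log(s0 / K) + (r - 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    pdf = math.exp(-0.5 * d2 * d2) / math.sqrt(2.0 * math.pi)
    return math.exp(-r * T) * pdf / (s0 * sigma * math.sqrt(T))


s0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
estimate = digital_delta_weight(s0, K, r, sigma, T, n=200_000)
print(estimate, digital_delta_exact(s0, K, r, sigma, T))
```

The payoff is never differentiated; all the sensitivity information sits in the weight, which is why the same simulated paths can be reused for any payoff function $f$.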
León et al. [84] were the first to consider simple Lévy processes, a linear combination of a Brownian motion and several Poisson processes with fixed jump size. Developing a Malliavin calculus for simple Lévy processes, they showed that the analysis can be made on the Wiener space and the formulas from the pure Wiener case can be used. A similar approach is considered by Davis and Johansson [39], while Debelley and Privault [40] extend the idea to cover general jump-diffusions. The directional derivative approach is also applied to the Barndorff-Nielsen and Shephard model in Benth, Groth and Wallin [14], Paper V.

7. Volatility derivatives

Volatility is the simplest measure of the uncertainty attached to a financial asset. It was considered a constant quantity in the Black-Scholes theory, something which has been disputed because the volatility is known to change over time. How it changes and how it can be modelled were discussed in Section 4. Given a stochastic model for the dynamics of the volatility it is a short leap to the idea of constructing contracts written on realised volatility, and trading these contracts to hedge against the changes.

Calculating the Greeks of an option can tell investors about the exposure they face from changes in underlying parameters in the models, but not how to hedge it away. Trading in the underlying asset can help the investor reduce the risk associated with changes in the price, the delta-exposure. In the Black-Scholes market this is the only risk perceived, since it is assumed that all other parameters are constant over the time horizon considered. The inability of the Black-Scholes model to capture the implied volatility raises the question of the risk associated with changes in the volatility of the underlying asset.
A change in the volatility will influence the price of the option, possibly without changing the price of the underlying asset, but this change is not possible to hedge by the usual delta-hedging approach. The exposure to volatility is measured by vega ($\mathcal{V}$), the sensitivity to changes in the volatility parameter in the Black-Scholes model. An investor sitting on a large portfolio might find that his vega-exposure is high and wish to hedge away this risk. The market has met this demand by offering derivatives written on realised variance and volatility. In 1993 the Chicago Board Options Exchange (CBOE) introduced a volatility index (VIX) which became the benchmark for stock market volatility. It measures the market expectation of the 30-day volatility based on S&P 500 index option prices with a range of strike prices. Accompanying the VIX there exists a family of derivative products written with the VIX as the underlying, including futures and options.

The structure of a volatility contract is in principle not different from contracts on other underlying assets. Let the realised volatility $\sigma_R(T)$ over a period $[0, T]$ be defined as
\[ \sigma_R(T) = \sqrt{ \frac{1}{T} \int_0^T \sigma^2(s)\,ds }. \]
The process $\sigma^2(s)$ depends on the model, from constant in the Black-Scholes model to a non-Gaussian Ornstein-Uhlenbeck process in the Barndorff-Nielsen and Shephard model. A volatility swap is the simplest contract, paying at time $T$ the amount
\[ N(\sigma_R(T) - \Sigma) \]
where $\Sigma$ is the strike, a predefined level of volatility, and $N$ is a notional, turning the volatility difference into money. The strike $\Sigma$ is chosen such that the swap is entered into at zero cost. A variance swap is similarly defined as
\[ N(\sigma_R^2(T) - \Sigma^2). \]
The extension to options on realised volatility or variance is obvious. In effect the buyer swaps a fixed volatility against the actual realised volatility.
Under the risk-neutral probability measure $Q$ the fixed level of volatility, sometimes referred to as the price of the swap, can be expressed as
\[ \Sigma(t, T) = E_Q[\sigma_R(T) \mid \mathcal{F}_t] \]
and the price of the variance swap as
\[ \Sigma^2(t, T) = E_Q[\sigma_R^2(T) \mid \mathcal{F}_t]. \]
The fair price of variance can be calculated directly by calculating the risk-neutral expectation of a variance swap, which requires specifying a model for the variance, see for example Benth, Groth and Kufakunesu [12], Paper III, who price swaps in the Barndorff-Nielsen and Shephard model. Much of the interest has rather been focused on how to replicate swaps on realised variance and volatility. Early work by Derman et al. [44], Dupire [48] and Neuberger [95] shows that a continuously sampled variance swap in a diffusive market is possible to replicate by trading in the asset and its options. Assume that the price $S_t$ of the asset has dynamics
\[ dS_t = \mu(t) S_t\,dt + \sigma(t) S_t\,dW_t \tag{7.1} \]
where the drift $\mu(t)$ and the continuously sampled volatility $\sigma(t)$ are arbitrary functions of time and other parameters. Applying Itô's lemma to $\log S_t$ and subtracting the result from (7.1) divided by $S_t$ gives
\[ \frac{dS_t}{S_t} - d(\log S_t) = \frac{1}{2}\sigma^2(t)\,dt \]
and hence
\[ \sigma_R^2 = \frac{2}{T} \left[ \int_0^T \frac{dS_t}{S_t} - \log\frac{S_T}{S_0} \right]. \]
Taking the conditional expectation gives the price of the variance swap. For replication one can notice that the first term inside the brackets can be considered as a continuously rebalanced position of being long $1/S_t$ shares. The second term represents a static short position in a claim on $\log(S_T/S_0)$. The problem of trading the logarithmic contract can be solved by synthesising it with liquid options on the asset. If an arbitrary put-call separator $\kappa > 0$ is picked, then $\log(S_T/S_0) = \log(S_T/\kappa) + \log(\kappa/S_0)$, where the last term is a constant, and the log-payoff can be decomposed such that
\[ -\log\frac{S_T}{\kappa} = -\frac{S_T - \kappa}{\kappa} + \int_0^{\kappa} \frac{1}{K^2} (K - S_T)^+\,dK + \int_{\kappa}^{\infty} \frac{1}{K^2} (S_T - K)^+\,dK. \]
This suggests that in addition to the $1/S_t$ shares held one should hold a short position in $1/\kappa$ forward contracts struck at $\kappa$, a long position in $1/K^2$ put options at $K$ for all strikes from $0$ to $\kappa$ and a similar position in call options for all strikes from $\kappa$ to $\infty$, all contracts expiring at $T$. The fair price of the swap follows from the initial value of each part.

Swaps and options written on volatility are known to be more difficult to price and hedge than their variance counterparts. Naively, the price of a volatility swap could be thought to be the square root of the price of the variance swap. By Jensen's inequality it is easy to see that this is in general not the case, since the square root only gives an upper bound, i.e.
\[ E[\sigma_R(T) \mid \mathcal{F}_t] \le \sqrt{ E[\sigma_R^2(T) \mid \mathcal{F}_t] }. \]
The common knowledge was that the replication strategy for volatility swaps was highly model-dependent, something which was challenged in recent papers by Carr and Lee [30, 31]. Trading dynamically in the underlying together with positions in European options (calls, puts and straddles), Carr and Lee generate a synthetic volatility swap without specifying a model for the volatility. The replication strategy is more involved than for variance derivatives but holds under a general assumption about the correlation between the stock and the volatility. For pricing of volatility options Carr and Lee assume the time-$t$ conditional distribution of the volatility is a displaced lognormal, and derive explicit formulas. They show how to hedge options by trading variance and volatility swaps; however, the formulas are rather complex.

The market interest in volatility derivative contracts drives the academic research interest. Apart from the references mentioned, work has been done by Windcliff, Forsyth and Vetzal [117] for a model with jumps in the asset price dynamics, while Howison, Rafailidis and Rasmussen [77] study a stochastic volatility model with mean-reverting lognormal volatility dynamics.
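Returning to the replication argument for the variance swap, the decomposition of the logarithmic payoff into a forward position and strips of puts and calls weighted by $1/K^2$ can be verified numerically for a given terminal price. The sketch below is only an illustration: it writes the left-hand side relative to the separator, as $-\log(S_T/\kappa)$, discretises the two strike integrals with the midpoint rule, and truncates the call strip at an assumed upper strike `k_max`; the function name and all numerical values are hypothetical.

```python
import math


def log_contract_replication(s_T, kappa, n_strikes=20_000, k_max=None):
    """Forward leg plus put and call strips approximating -log(s_T / kappa)."""
    if k_max is None:
        k_max = 10.0 * max(s_T, kappa)          # assumed truncation of the call strip
    forward_leg = -(s_T - kappa) / kappa        # short 1/kappa forwards struck at kappa

    dk_put = kappa / n_strikes                  # puts with strikes in (0, kappa)
    puts = sum(
        max(K - s_T, 0.0) / K ** 2 * dk_put
        for K in (dk_put * (i + 0.5) for i in range(n_strikes))
    )

    dk_call = (k_max - kappa) / n_strikes       # calls with strikes in (kappa, k_max)
    calls = sum(
        max(s_T - K, 0.0) / K ** 2 * dk_call
        for K in (kappa + dk_call * (i + 0.5) for i in range(n_strikes))
    )
    return forward_leg + puts + calls


kappa = 100.0
for s_T in (80.0, 100.0, 130.0):
    print(s_T, log_contract_replication(s_T, kappa), -math.log(s_T / kappa))
```

In practice only a finite set of strikes is traded, so the quality of the hedge depends on how well the discrete strips approximate the integrals, which the parameter `n_strikes` mimics here.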
Also notable in the field is the paper by Carr et al. [29], which studies properties of the volatility in a model driven by pure jump processes, in particular the class of CGMY processes.

References
[1] G. Bakshi and Z. Chen. An alternative valuation model for contingent claims. J. Financial Econ., 44:123–165, 1997.
[2] O. E. Barndorff-Nielsen. Exponentially decreasing distributions for the logarithm of particle size. Proc. Royal Society London A, 353:401–419, 1977.
[3] O. E. Barndorff-Nielsen. Processes of normal inverse Gaussian type. Finance and Stochastics, 2:41–68, 1998.
[4] O. E. Barndorff-Nielsen and N. Shephard. Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics. J. Royal Stat. Society, 63:167–241, 2001.
[5] D. A. Bates. Jumps and stochastic volatility: Exchange rate processes implicit in Deutsche mark options. Review of Financial Studies, 9(1):69–107, 1996.
[6] D. Becherer. Rational hedging and valuation with utility-based preferences. PhD thesis, Technische Universität Berlin, 2001.
[7] E. Benhamou. Smart Monte Carlo: various tricks using Malliavin calculus. Quant. Finance, 2(5):329–336, 2002.
[8] E. Benhamou. Optimal Malliavin weighting function for the computation of the Greeks. Mathematical Finance, 13(1), 2003.
[9] F. E. Benth. An introduction to mathematical finance. Springer-Verlag, Berlin, 2004.
[10] F. E. Benth and M. Groth. The minimal entropy martingale measure and numerical option pricing for the Barndorff-Nielsen and Shephard stochastic volatility model. Submitted, 2006.
[11] F. E. Benth, M. Groth, and P. C. Kettler. A quasi-Monte Carlo algorithm for the normal inverse Gaussian distribution and valuation of financial derivatives. Int. J. Theor. Applied Finance, 9(6), 2006.
[12] F. E. Benth, M. Groth, and R. Kufakunesu. Valuing volatility and variance swaps for a non-Gaussian Ornstein-Uhlenbeck stochastic volatility model. Submitted, 2006.
[13] F. E. Benth, M. Groth, and C. Lindberg.
The implied risk aversion from utility indifference option pricing in a stochastic volatility model. Submitted, 2007.
[14] F. E. Benth, M. Groth, and O. Wallin. Derivative-free Greeks for the Barndorff-Nielsen and Shephard stochastic volatility model. Submitted, 2007.
[15] F. E. Benth and K. H. Karlsen. A PDE-representation of the density of the minimal entropy martingale measure in stochastic volatility markets. Stoch. Stoch. Rep., 77(2):109–137, 2005.
[16] F. E. Benth and T. Meyer-Brandis. Indifference pricing and the minimal entropy martingale measure in a stochastic volatility model with jumps. Preprint Pure Math. Univ. of Oslo, 3, 2004.
[17] F. E. Benth and T. Meyer-Brandis. The density process of the minimal entropy martingale measure in a stochastic volatility model with jumps. Finance and Stochastics, 9(4), 2005.
[18] H.-P. Bermin. A general approach to hedging options: applications to barrier and partial barrier options. Mathematical Finance, 12:199–218, 2002.
[19] H.-P. Bermin. Hedging options: The Malliavin calculus approach versus the δ-hedging approach. Mathematical Finance, 13(1), 2003.
[20] G. Bernis, E. Gobet, and A. Kohatsu-Higa. Monte Carlo evaluation of Greeks for multidimensional barrier and lookback options. Mathematical Finance, 13(1), 2003.
[21] T. Björk. Arbitrage theory in continuous time. Oxford University Press, Oxford, 1998.
[22] F. Black and M. Scholes. The pricing of options and corporate liabilities. J. Political Econ., 81:637–659, 1973.
[23] N. Bouleau and D. Lamberton. Residual risks and hedging strategies in Markovian markets. Stoch. Proc. Appl., 33:131–150, 1989.
[24] P. Boyle. Options: A Monte Carlo approach. J. Financial Econ., 4:323–338, 1977.
[25] P. Boyle, M. Broadie, and P. Glasserman. Monte Carlo methods for security pricing. J. Econ. Dynamics Control, 21:1267–1321, 1997.
[26] M. Broadie and P. Glasserman. Estimating security price derivatives by simulation. Management Science, 42:269–285, 1996.
[27] P. Carr, H. Geman, D.
B. Madan, and M. Yor. The fine structure of asset returns: An empirical investigation. J. Business, 75, 2002.
[28] P. Carr, H. Geman, D. B. Madan, and M. Yor. Stochastic volatility for Lévy processes. Mathematical Finance, 13:345–382, 2003.
[29] P. Carr, H. Geman, D. B. Madan, and M. Yor. Pricing options on realized variance. Finance and Stochastics, 9:453–475, 2005.
[30] P. Carr and R. Lee. Pricing and hedging options on realized volatility and variance. Preprint, 2006.
[31] P. Carr and R. Lee. Robust replication of volatility derivatives. Preprint, 2006.
[32] P. Carr and D. B. Madan. Option valuation using the fast Fourier transform. J. Computational Finance, 2:61–73, 1998.
[33] N. Chen and P. Glasserman. Malliavin Greeks without Malliavin calculus. Preprint, 2006.
[34] R.-R. Chen and L. Scott. Pricing interest rate options in a two-factor Cox-Ingersoll-Ross model of the term structure. Review of Financial Studies, 5(4):613–636, 1992.
[35] R. Cont and P. Tankov. Financial modelling with jump processes. Chapman & Hall/CRC, Boca Raton, Florida, 2005.
[36] R. Cont, P. Tankov, and E. Voltchkova. Option pricing models with jumps: Integro-differential equations and inverse problems. In P. Neittaanmäki, T. Rossi, E. Korotov, J. Periaux, and D. Knörzer, editors, Proceedings 4th European Congress. ECCOMAS, 2004.
[37] R. Cont and E. Voltchkova. A finite difference scheme for option pricing in jump diffusion and exponential Lévy models. In P. Neittaanmäki, T. Rossi, E. Korotov, J. Periaux, and D. Knörzer, editors, Proceedings 4th European Congress. ECCOMAS, 2004.
[38] R. Cont and E. Voltchkova. Integro-differential equations for option prices in exponential Lévy models. Finance and Stochastics, 9:299–325, 2005.
[39] M. H. A. Davis and M. Johansson. Malliavin Monte Carlo Greeks for jump-diffusions. Stoch. Proc. Appl., 116(1):101–129, 2006.
[40] V. Debelley and N. Privault.
Sensitivity analysis of European options in jump-diffusion models via the Malliavin calculus on the Wiener space. Preprint, 2004.
[41] F. Delbaen, P. Grandits, T. Rheinländer, D. Samperi, M. Schweizer, and C. Stricker. Exponential hedging and entropic penalties. Mathematical Finance, 12(2):99–123, 2002.
[42] F. Delbaen and W. Schachermayer. A general version of the fundamental theorem of asset pricing. Mathematische Annalen, 300:463–520, 1994.
[43] F. Delbaen and W. Schachermayer. The fundamental theorem for unbounded processes. Mathematische Annalen, 312:215–250, 1998.
[44] K. Demeterfi, E. Derman, M. Kamal, and J. Zou. More than you ever wanted to know about volatility swaps. Goldman Sachs, March 1999.
[45] E. Derman and I. Kani. Riding on a smile. RISK, 7(2):32–39, 1994.
[46] D. Duffie. Dynamic asset pricing theory. Princeton University Press, Princeton, New Jersey, 2001.
[47] B. Dumas, J. Fleming, and R. E. Whaley. Implied volatility functions: Empirical tests. J. Finance, 53(6):2059–2106, 1998.
[48] B. Dupire. Model art. RISK, September:118–120, 1993.
[49] B. Dupire. Pricing with a smile. RISK, 7(1), 1994.
[50] E. Eberlein. Application of generalized hyperbolic Lévy motions to finance. In O. E. Barndorff-Nielsen, T. Mikosch, and S. I. Resnick, editors, Lévy processes, Theory and Applications, pages 319–336. Birkhäuser, Boston, 2001.
[51] E. Eberlein and U. Keller. Hyperbolic distributions in finance. Bernoulli, 1(3):281–299, 1995.
[52] E. Eberlein and K. Prause. The generalized hyperbolic model: financial derivatives and risk measures. In Mathematical Finance - Bachelier Congress 2000, pages 245–267. Springer-Verlag, 1998.
[53] E. Eberlein and S. Raible. Term structure models driven by general Lévy processes. Mathematical Finance, 9(1):31–53, 1999.
[54] N. El Karoui and M. Quenez. Dynamic programming and pricing of contingent claims in an incomplete market. SIAM J. Control Optim., 33:29–66, 1995.
[55] N. El Karoui and R. Rouge.
Pricing via utility maximization and entropy. Mathematical Finance, 10(2):259–276, 2000.
[56] Y. El-Khatib and N. Privault. Computations of Greeks in a market with jumps via the Malliavin calculus. Finance and Stochastics, 8:161–179, 2004.
[57] H. Faure. Discrépance de suites associées à un système de numération (en dimension s). Acta Arithmetica, 43:337–351, 1982.
[58] H. Föllmer and M. Schweizer. Hedging of contingent claims under incomplete information. In M. Davis and R. Elliott, editors, Applied stochastic analysis, pages 389–414. Gordon and Breach, London, 1991.
[59] J.-P. Fouque, G. Papanicolaou, and R. Sircar. Derivatives in financial markets with stochastic volatility. Cambridge University Press, Cambridge, 2000.
[60] E. Fournié, J.-M. Lasry, J. Lebuchoux, and P.-L. Lions. Applications of Malliavin calculus to Monte Carlo methods in finance II. Finance and Stochastics, 5:201–236, 2001.
[61] E. Fournié, J.-M. Lasry, J. Lebuchoux, P.-L. Lions, and N. Touzi. Applications of Malliavin calculus to Monte Carlo methods in finance. Finance and Stochastics, 3:391–412, 1999.
[62] M. Frittelli. The minimal entropy martingale measure and the valuation problem in incomplete markets. Mathematical Finance, 10(1):39–52, 2000.
[63] T. Fujiwara and Y. Miyahara. The minimal entropy martingale measure for geometric Lévy processes. Finance and Stochastics, 7:509–531, 2003.
[64] P. Glasserman. Monte Carlo methods in financial engineering. Springer-Verlag, New York, 2004.
[65] P. Glasserman and D. Yao. Some guidelines and guarantees for common random numbers. Management Science, 38(6):884–908, 1992.
[66] E. Gobet and A. Kohatsu-Higa. Computation of Greeks for barrier and lookback options using Malliavin calculus. Electronic Communications in Probability, 8:51–62, 2003.
[67] P. Grandits and T. Rheinländer. On the minimal entropy martingale measure. Annals of Prob., 30:1003–1038, 2002.
[68] M. Groth. Simulation in financial mathematics.
Licentiate Thesis, Växjö University, 2005.
[69] J. Halton. On the efficiency of certain quasi-random sequences of points in evaluating multidimensional integrals. Numerische Mathematik, 2:84–90, 1960.
[70] J. M. Harrison and D. M. Kreps. Martingales and arbitrage in multiperiod securities markets. J. Economic Theory, 20:381–408, 1979.
[71] J. M. Harrison and S. R. Pliska. Martingales and stochastic integrals in the theory of continuous trading. Stoch. Proc. Appl., 11:215–260, 1981.
[72] S. L. Heston. A closed-form solution for options with stochastic volatility, with applications to bond and currency options. Review Financial Studies, 6(2):327–343, 1993.
[73] N. Hilber, A. Matache, and C. Schwab. Sparse wavelet methods for option pricing under stochastic volatility. J. Computational Finance, 8(4), 2005.
[74] E. Hlawka and R. Mück. A transformation of equidistributed sequences. In Applications of number theory to numerical analysis, pages 371–388. Academic Press, 1972.
[75] D. Hobson. Stochastic volatility models, correlation, and the q-optimal measure. Mathematical Finance, 14(4):537–556, 2004.
[76] S. Hodges and A. Neuberger. Optimal replication of contingent claims under transaction costs. Review Futures Markets, 8:222–239, 1989.
[77] S. Howison, A. Rafailidis, and H. Rasmussen. On the pricing and hedging of volatility derivatives. Applied Math. Finance, 11(4):317–346, 2004.
[78] J. C. Hull. Options, futures and other derivatives; Fifth edition. Prentice Hall, Upper Saddle River, New Jersey, 2003.
[79] J. C. Hull and A. White. The pricing of options on assets with stochastic volatility. J. Finance, 42:281–300, 1987.
[80] C. Joy, P. P. Boyle, and K. S. Tan. Quasi-Monte Carlo methods in numerical finance. Management Science, 42(6):926–939, 1996.
[81] R. Kainhofer. Quasi-Monte Carlo algorithms with applications in numerical analysis and finance. PhD thesis, Technische Universität Graz, 2003.
[82] A. Kohatsu-Higa and M. Montero. Malliavin calculus in finance. In S.
T. Rachev and G. A. Anastassiou, editors, Handbook of computational and numerical methods in finance, pages 111–174. Springer-Verlag, Berlin, 2004.
[83] J. P. Lehoczky. Simulation methods for option pricing. In M. A. Dempster and S. R. Pliska, editors, Mathematics of Derivative Securities, pages 528–544. Cambridge University Press, Cambridge, 1997.
[84] J. A. León, J. L. Solé, F. Utzet, and J. Vives. On Lévy processes, Malliavin calculus and market models with jumps. Finance and Stochastics, 6:197–225, 2002.
[85] A. L. Lewis. Option valuation under stochastic volatility. Finance Press, Newport Beach, California, 2000.
[86] A. L. Lewis. A simple option formula for general jump-diffusions and other exponential Lévy processes. Envision Financial Systems, 2001.
[87] D. B. Madan and F. Milne. Option pricing with variance gamma martingale components. Mathematical Finance, 1:39–55, 1991.
[88] D. B. Madan and E. Seneta. The variance gamma (V.G.) model for share market returns. J. Business, 63(4):511–524, 1990.
[89] B. B. Mandelbrot. Fractals and scaling in finance; discontinuity, concentration, risk. Springer-Verlag, New York, 1997.
[90] A.-M. Matache, P. Nitsche, and C. Schwab. Wavelet Galerkin pricing of American options on Lévy driven assets. Quant. Finance, 5(4):403–424, 2005.
[91] A.-M. Matache, T. von Petersdorff, and C. Schwab. Fast deterministic pricing of options on Lévy driven assets. Math. Modelling Num. Analysis, 38(1):37–72, 2004.
[92] R. C. Merton. Theory of rational option pricing. Bell J. Econ. Manag. Sciences, 4:141–183, 1973.
[93] R. C. Merton. Option pricing when underlying stock returns are discontinuous. J. Financial Econ., 3:125–144, 1976.
[94] M. Musiela and M. Rutkowski. Martingale methods in financial modelling. Springer-Verlag, Berlin, 1997.
[95] A. Neuberger. The log contract. J. Portfolio Management, 20(2), 1994.
[96] E. Nicolato and E. Venardos.
Option pricing in stochastic volatility models of the Ornstein-Uhlenbeck type. Mathematical Finance, 13(4):445–466, 2003.
[97] H. Niederreiter. Low-discrepancy and low-dispersion sequences. J. Number Theory, 30:51–70, 1988.
[98] H. Niederreiter. Random number generation and quasi-Monte Carlo methods. SIAM, Philadelphia, Pennsylvania, 1992.
[99] D. Nualart. Malliavin calculus and related topics. Springer-Verlag, Berlin, 1995.
[100] A. Papageorgiou and S. Paskov. Deterministic simulation for risk management. J. Portfolio Management, 25th anniversary issue:122–127, 1999.
[101] A. Papageorgiou and J. Traube. Beating Monte Carlo. RISK, 9:63–65, 1996.
[102] S. Paskov. New methodologies for valuing derivatives. In S. Pliska and M. Dempster, editors, Mathematics of Derivative Securities, pages 545–582. Cambridge University Press, Cambridge, 1997.
[103] S. Paskov and J. Traube. Faster valuation of financial derivatives. J. Risk Management, 22:113–120, 1995.
[104] K. Prause. The generalized hyperbolic model: Estimation, financial derivatives, and risk measures. PhD thesis, Albert-Ludwigs-Universität Freiburg, 1999.
[105] P. Protter. Stochastic integration and differential equations. Springer-Verlag, New York, 2003.
[106] S. Raible. Lévy processes in finance: Theory, numerics, and empirical facts. PhD thesis, Albert-Ludwigs-Universität Freiburg, 2000.
[107] T. Rheinländer. An entropy approach to the Stein/Stein model with correlation. Finance and Stochastics, 9(3), 2005.
[108] T. Rheinländer and G. Steiger. The minimal martingale measure for general Barndorff-Nielsen/Shephard models. Annals Applied Prob., 16(3), 2006.
[109] P. Samuelson. Rational theory of warrant pricing. Industrial Management Review, 6:13–32, 1965.
[110] K.-I. Sato. Lévy Processes and infinitely divisible distributions. Cambridge University Press, Cambridge, 1999.
[111] L. O. Scott.
Pricing stock options in a jump-diffusion model with stochastic volatility and interest rates: Applications of Fourier inversion methods. Mathematical Finance, 7(4):413–424, 1997.
[112] I. Sobol. The distribution of points in a cube and the approximate evaluation of integrals. USSR Computational Math. and Math. Physics, 7(4):86–112, 1967.
[113] G. Steiger. The optimal martingale measure for investors with exponential utility function. PhD thesis, Swiss Federal Institute of Technology, 2005.
[114] E. Stein and J. Stein. Stock price distributions with stochastic volatility: An analytical approach. Review Financial Studies, 4:727–752, 1991.
[115] J. Tilley. Valuing American options in a path simulation model. Transactions Society of Actuaries, 43:83–104, 1993.
[116] P. Wilmott, J. Dewynne, and S. Howison. Option pricing, mathematical models and computation. Oxford Financial Press, Oxford, 1993.
[117] H. Windcliff, P. A. Forsyth, and K. Vetzal. Pricing methods and hedging strategies for volatility derivatives. J. Banking Finance, 30:409–431, 2006.

I
A quasi-Monte Carlo algorithm for the normal inverse Gaussian distribution and valuation of financial derivatives
Fred Espen Benth, Martin Groth, and Paul C. Kettler
International Journal of Theoretical and Applied Finance Vol. 9, No. 6 (2006) p. 843-867

A QUASI-MONTE CARLO ALGORITHM FOR THE NORMAL INVERSE GAUSSIAN DISTRIBUTION AND VALUATION OF FINANCIAL DERIVATIVES
FRED ESPEN BENTH, MARTIN GROTH AND PAUL C. KETTLER

Abstract. We propose a quasi-Monte Carlo (qMC) algorithm to simulate variates from the normal inverse Gaussian (NIG) distribution. The algorithm builds on a Monte Carlo technique found in Rydberg [13], and relies on sampling three independent uniform variables. We apply the algorithm to three problems appearing in finance. First, we consider the valuation of plain vanilla call options and Asian options.
The next application considers the problem of deriving implied parameters for the underlying asset dynamics based on observed option prices. We employ our proposed algorithm together with the Newton Method, and show how we can find the scale parameter of the NIG distribution of the log-returns in the case of a call or an Asian option. We also provide an extensive error analysis for this method. Finally we study the calculation of Value-at-Risk for a portfolio of nonlinear products where the returns are modeled by NIG random variables.

1. Introduction

The fair price of a financial derivative can be expressed in terms of a risk-neutral expectation of a random pay-off. In some cases the expectation is explicitly computable, the Black & Scholes formula for call options on assets modeled by a geometric Brownian motion being the prime example. However, for an Asian option, for instance, closed-form expressions for the price no longer exist, and numerical methods are called for. This may even be the case when considering plain vanilla call options written on assets with non-normal returns. In the present paper we propose a quasi-Monte Carlo algorithm for the valuation of expectations of functionals of normal inverse Gaussian distributed random variables.

Barndorff-Nielsen [1] proposed to model the log-returns of asset prices by using the normal inverse Gaussian (NIG) distribution. This family of distributions has proven to fit the semi-heavy tails observed in financial time series of various kinds extremely well (see e.g. Rydberg [13], or Eberlein and Keller [2], who apply the hyperbolic distribution, a close relative of the NIG). The time dynamics of the asset prices are modeled by an exponential Lévy process. To price derivatives, even simple call and put options, we need to consider the numerical evaluation of the expectation. Raible [12] has considered a Fourier method to evaluate call and put options.
An alternative is the Monte Carlo method, which, however, converges rather slowly. The quasi-Monte Carlo (qMC) method has been applied with success in financial applications by many authors (see Glasserman [4], and references therein), and has very powerful convergence properties. Even though it samples deterministically, it is often considered as a kind of Monte Carlo algorithm. Most of the work done on applying these simulation techniques to finance has concentrated on problems where one needs to simulate from the normal distribution. One exception is Kainhofer [7], who proposes a qMC algorithm for NIG variables based on a technique proposed by Hlawka and Mück [6] to produce low-discrepancy samples for general distributions. His method requires knowledge of the cumulative NIG distribution function, which needs to be computed using numerical integration.

Date: 12 September 2005.
2000 Mathematics Subject Classification. 49M15, 65C05, 68Q25, 65D30.
Key words and phrases. Quasi-Monte Carlo, normal inverse Gaussian distribution, Newton-Raphson method, option pricing, implied volatility.

We propose a qMC algorithm based on a simulation method for generalized inverse Gaussian distributions suggested by Michael, Schucany, and Haas [9]. The algorithm requires the simulation of three independent uniform random variables, and NIG samples are calculated via explicit transformations of these. For simplicity, the algorithm is given in one dimension, but extends readily to many dimensions. Our qMC algorithm for NIG variates does not require the numerical inversion of the NIG cumulative distribution function. We apply our algorithm to three financial problems: two one-dimensional option pricing problems and a multivariate portfolio problem.
The first involves the pricing of a plain vanilla call option and an Asian call option, the latter being a call option written on the average of the asset price over a specified time period. We can approximate the price of the latter as an expectation of a functional of a NIG distribution, which we evaluate based on our qMC algorithm. We compare our results with the algorithm proposed by Kainhofer [7].

Our next application involves finding the “implied volatility” from a call and an Asian option based on a NIG model. More precisely, given the price of an Asian option, and supposing that the log-returns of the underlying asset are NIG distributed, how can we find one (or more) of the parameters of the NIG distribution? This is an inverse problem, where we try to find the parameter of the NIG distribution for which the quoted price is attained. A natural approach is to use Newton’s method, which involves calculating the option price along with its derivative. Thus, we need to calculate two expectations involving a multivariate NIG, and iterate these until convergence is reached. We provide a general analysis of the convergence properties of such an algorithm.

Our final application is on Value-at-Risk (VaR). This is somewhat detached from option pricing, but still is an interesting application of our qMC algorithm. We consider a portfolio of assets where the log-returns are modeled using NIG distributions (independently!), and compare with a crude Monte Carlo algorithm. Since the calculation of the VaR for a portfolio can be recast as finding a quantile, we may apply Newton’s method. However, it turns out that this is not a fruitful approach compared to the usual one with (quasi-) Monte Carlo and sorting.

The paper is organized as follows: In the next section we present the theory relating to pricing options with the NIG distribution. Following that we investigate a quasi-Monte Carlo algorithm for simulating NIG distributed random variables.
Continuing, we go about finding implied parameters using Newton’s Method and qMC. Next we turn attention to applications to finance. Finally, we summarize our conclusions.

2. Pricing options with the NIG distribution

Let $(\Omega, \mathcal{F}, P)$ be a probability space equipped with a filtration $\{\mathcal{F}_t\}_{t\in[0,T]}$ satisfying the usual conditions (see e.g. Karatzas and Shreve [8]), with $T < \infty$ being the time horizon. Let $L(t)$ be a Lévy process which is right-continuous with left limits (RCLL, or càdlàg), and consider the following exponential model for the asset price dynamics
\[ S(t) = S(0)\exp(L(t)). \tag{2.1} \]
In this paper we will mostly be concerned with the exponential NIG-Lévy process dynamics, meaning that $L(t)$ has increments distributed according to a NIG distribution. The NIG family of distributions is specified by four parameters. A random variable $X$ is said to be NIG distributed with parameters $\mu$, $\beta$, $\alpha$ and $\delta$, denoted $X \sim \mathrm{NIG}(\alpha, \beta, \mu, \delta)$, where $\mu$ is the location, $\beta$ the skewness, $\alpha$ the tail-heaviness and $\delta$ the scale. The density of a $\mathrm{NIG}(\alpha, \beta, \mu, \delta)$-variable is given by
\[ p(x; \mu, \beta, \alpha, \delta) = \frac{\delta\alpha}{\pi}\exp\left(\delta\sqrt{\alpha^2 - \beta^2} + \beta(x - \mu)\right)\frac{K_1\left(\alpha s(x - \mu)\right)}{s(x - \mu)} \tag{2.2} \]
where $x \in \mathbb{R}$, $\mu \in \mathbb{R}$, $\delta > 0$, $0 \le |\beta| \le \alpha$ and $s(x) = \sqrt{\delta^2 + x^2}$, and where $K_1$ is the modified Bessel function of the third kind with index 1. Specifically,
\[ K_1(y) = \frac{y}{4}\int_0^\infty \exp\left(-\left(t + \frac{y^2}{4t}\right)\right)t^{-2}\,dt, \qquad y > 0. \]
The NIG family of distributions is infinitely divisible, which means that there exists a Lévy process such that for $\Delta t > 0$,
\[ L(t + \Delta t) - L(t) \sim L(\Delta t) \sim \mathrm{NIG}(\alpha, \beta, \mu, \delta), \]
for every $t \ge 0$. It turns out that this Lévy process is a pure-jump process, and the associated Lévy measure is absolutely continuous with respect to the Lebesgue measure; its density $\ell$ can be calculated explicitly as
\[ \ell(z) = \pi^{-1}\delta\alpha\,|z|^{-1}K_1(\alpha|z|)\,e^{\beta z}. \tag{2.3} \]
Note that $\int_{\mathbb{R}} \min(1, z^2)\,\ell(z)\,dz < \infty$. We refer to Barndorff-Nielsen [1], Geman [3], and Rydberg [13], for a discussion of the NIG distribution and the corresponding Lévy process.

Considering an asset dynamics given by the exponential NIG-Lévy process, we can find the price of a European call option with strike price $K$ at exercise time $T$ as
\[ C(0) = e^{-rT}E_Q\left[\max(S(T) - K, 0)\right], \tag{2.4} \]
where $r$ is the risk-free interest rate and $Q$ is an equivalent martingale measure. The exponential NIG-Lévy model gives rise to an incomplete market, thus leading to a continuum of equivalent martingale measures that can be used for risk-neutral pricing. However, we choose the approach of Raible [12], and consider the Esscher transform method to derive a $Q$-measure for pricing. This approach is so-called structure preserving, in the sense that we search for equivalent probability measures $Q$ such that the distribution of $L(T)$ remains in the class of NIG distributions and where the log-return of $S$ is the risk-free return $r$. Thus, supposing $L(T) \sim \mathrm{NIG}(\alpha, \beta, \mu, \delta)$, we can recast the expression in Equation (2.4) as
\[ C(0) = e^{-rT}E\left[\max\left(S(0)e^{X} - K, 0\right)\right] \tag{2.5} \]
where $X$ is a $\mathrm{NIG}(\alpha, \hat\beta, \mu, \delta)$-variable with
\[ \hat\beta = -\frac{1}{2} + \operatorname{sgn}(\beta)\sqrt{\frac{(\mu - r)^2}{\delta^2 + (\mu - r)^2}\,\alpha^2 - \frac{(\mu - r)^2}{4\delta^2}}. \]
In this paper we shall also be concerned with Asian options written on $S(t)$. Consider such an option with exercise at time $T$ and strike price $K$ on the average over the time span up to $T$. The risk-neutral price is
\[ A(0) = e^{-rT}E_Q\left[\max\left(\frac{1}{T}\int_0^T S(t)\,dt - K, 0\right)\right]. \tag{2.6} \]
Again applying the Esscher transform, we have that $L(t)$ is still a NIG-Lévy process, and approximating the integral with a Riemann sum (footnote 2) yields the price
\[ A(0) = e^{-rT}E\left[\max\left(\frac{S(0)}{T}\sum_{i=1}^{N}\exp(L(t_i))\,\Delta t - K, 0\right)\right]. \tag{2.7} \]
By using the independent increment property of the Lévy process we may rewrite the sum as a function of $N$ increments of $L$, that is, as a function $g : \mathbb{R}^N \to \mathbb{R}$ such that
\[ A(0) = e^{-rT}E\left[\max\left(g(X_1, \dots, X_N) - K, 0\right)\right]. \tag{2.8} \]
Here, $X_i = L(t_i) - L(t_{i-1})$, for $i = 1, \dots, N$.
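The rewriting of the Riemann sum as a function of the increments can be sketched in a few lines of Python. This is only an illustration under the assumption of a regular partition with $\Delta t = T/N$, so that the Riemann weight $\Delta t/T$ equals $1/N$; the function names `asian_average` and `asian_call_payoff` are hypothetical and do not appear in the paper.

```python
import math
from itertools import accumulate


def asian_average(s0, increments):
    """Illustrative g(X_1, ..., X_N): discrete average of S(t_i) = s0 * exp(L(t_i)).

    Assumes a regular partition, so each of the N time points carries weight 1/N.
    The levels L(t_i) are recovered as partial sums X_1 + ... + X_i of the increments.
    """
    levels = list(accumulate(increments))  # L(t_1), ..., L(t_N)
    return s0 / len(levels) * sum(math.exp(level) for level in levels)


def asian_call_payoff(s0, strike, increments):
    """Undiscounted payoff max(g(X_1, ..., X_N) - K, 0) for one path of increments."""
    return max(asian_average(s0, increments) - strike, 0.0)
```

Averaging `asian_call_payoff` over many simulated increment vectors and discounting by $e^{-rT}$ would then give an estimate of the expectation in (2.8).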
For simplicity we focus on regular time partitions, $\Delta t = t_i - t_{i-1}$. From the considerations above we see that both the call and the Asian pricing functional can be written as
\[ C(0) = E\left[f(X_1, \dots, X_d)\right] \tag{2.9} \]
where $d \ge 1$ and the $X_i$ are i.i.d. $\mathrm{NIG}(\alpha, \beta, \mu, \delta)$-variables. We note that numerous other types of options can be expressed in the same way, for instance spread options and barrier options. The number $d$ gives the dimensionality of the problem, and the function $f$ is connected to the payoff of the option and the exponential function giving the asset price dynamics. The rest of the paper is concerned with developing and analyzing a qMC method to evaluate the expectation in Equation (2.9).

Footnote 2. Note that in practice there exist no Asian options with continuous averaging. The Asian options traded in the market have discrete averaging, also known as Bermudian options, and thus a simple Riemann approximation is the most natural.

3. A quasi-Monte Carlo algorithm for simulating NIG distributed random variables

We develop a quasi-Monte Carlo method for simulating expectations of functions of NIG distributed random variables. We include some discussion of convergence, together with a numerical evaluation of the convergence rate.

Consider the simulation algorithm for sampling from a $\mathrm{NIG}(\alpha, \beta, \mu, \delta)$-distributed variable $X$ proposed by Rydberg [13], building on work by Michael, Schucany and Haas [9] (referred to from now on as the Rydberg-MC method):
• Sample $Z$ from $\mathrm{IG}(\delta^2, \alpha^2 - \beta^2)$
• Sample $Y$ from $N(0, 1)$
• Return $X = \mu + \beta Z + \sqrt{Z}\,Y$

The sampling of $Z$ consists of first drawing a random variable $V$ which is $\chi^2(1)$-distributed, defining a random variable
\[ W = \xi + \frac{\xi^2 V}{2\delta^2} - \frac{\xi}{2\delta^2}\sqrt{4\xi\delta^2 V + \xi^2 V^2} \]
and then letting
\[ Z = W \cdot 1_{\{U_1 \le \frac{\xi}{\xi + W}\}} + \frac{\xi^2}{W} \cdot 1_{\{U_1 \ge \frac{\xi}{\xi + W}\}}, \]
with $U_1$ being uniformly distributed and $\xi = \delta/\sqrt{\alpha^2 - \beta^2}$. This provides us with a Monte Carlo algorithm for simulating a $\mathrm{NIG}(\alpha, \beta, \mu, \delta)$-distributed variable $X$.
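The three steps above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `sample_nig`, the use of Python's standard `random` module as the source of pseudo-random numbers, and the parameter values in the usage note are all assumptions made for the example.

```python
import math
import random


def sample_nig(alpha, beta, mu, delta, rng=random):
    """Draw one NIG(alpha, beta, mu, delta) variate via the mixture X = mu + beta*Z + sqrt(Z)*Y.

    Z is inverse Gaussian, sampled by the root-selection step described in the text;
    rng is any object offering gauss() and random(), e.g. a random.Random instance.
    """
    xi = delta / math.sqrt(alpha ** 2 - beta ** 2)
    v = rng.gauss(0.0, 1.0) ** 2  # chi-square(1) as the square of a standard normal
    w = (xi + xi ** 2 * v / (2.0 * delta ** 2)
         - xi / (2.0 * delta ** 2) * math.sqrt(4.0 * xi * delta ** 2 * v + xi ** 2 * v ** 2))
    u1 = rng.random()
    # Keep the smaller root W with probability xi / (xi + W), otherwise take xi^2 / W.
    z = w if u1 <= xi / (xi + w) else xi ** 2 / w
    y = rng.gauss(0.0, 1.0)
    return mu + beta * z + math.sqrt(z) * y
```

As a sanity check one can compare the sample mean and variance of many draws, for instance with the illustrative parameters $(\alpha, \beta, \mu, \delta) = (2, 0.5, 0, 1)$, against the NIG moments $\mu + \delta\beta/\sqrt{\alpha^2 - \beta^2}$ and $\delta\alpha^2/(\alpha^2 - \beta^2)^{3/2}$.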
From the algorithm, we see that to sample from $X$ we basically need to sample a standard normal $Y$, a $\chi^2(1)$-distributed $V$, and a uniform variable selecting between the two roots. The first two can be obtained from two independent uniforms by a transformation using the normal distribution function, $Y = \Phi^{-1}(U_1)$ and $V = (\Phi^{-1}(U_2))^2$, and we relabel the selecting uniform (denoted $U_1$ in the Monte Carlo algorithm above) as $U_3$. We are thus led to the conclusion that sampling from $X$ entails sampling from three independent uniformly distributed random variables:
\[ X = \mu + \beta\,q(U_2, U_3) + \sqrt{q(U_2, U_3)}\,\Phi^{-1}(U_1), \]
where $\Phi$ is the cumulative distribution function of the standard normal distribution, and
\[ q(x, y) = w(x) \cdot 1_{\{y \le \frac{\xi}{\xi + w(x)}\}} + \frac{\xi^2}{w(x)} \cdot 1_{\{y \ge \frac{\xi}{\xi + w(x)}\}}, \]
with
\[ w(x) = \xi + \frac{\xi^2\left(\Phi^{-1}(x)\right)^2}{2\delta^2} - \frac{\xi}{2\delta^2}\sqrt{4\xi\delta^2\left(\Phi^{-1}(x)\right)^2 + \xi^2\left(\Phi^{-1}(x)\right)^4}. \]
These considerations give us a scheme to sample low-discrepancy sequences for the NIG distribution by combining three low-discrepancy sequences and appealing to the fast inversion algorithm for the normal distribution given by Moro [10]. We refer to this qMC algorithm for NIG as the Rydberg-qMC method.

We now discuss some issues on the convergence of this algorithm applied to calculating the prices of financial derivatives based on NIG models. First, in view of Equation (2.9) and the algorithm above, we can write
\[ C(0) = E\left[f(X_1, \dots, X_d)\right] = E\left[h\left(U_1^1, U_2^1, U_3^1, U_1^2, U_2^2, U_3^2, \dots, U_1^d, U_2^d, U_3^d\right)\right] \]
for $d$ independent triples of three independent uniform random variables $(U_1^i, U_2^i, U_3^i)$, $i = 1, \dots, d$. The function $h$ is a combination of $f$ and the transforms above. We can state this as an integration over the unit hypercube:
\[ C(0) = \int_{[0,1)^{3d}} h(y_1, \dots, y_{3d})\,dy. \]
Thus finding the price $C(0)$ using our proposed qMC algorithm amounts to a $3d$-dimensional integration problem. If we have a low-discrepancy sequence $\{y_k\}_{k=1,\dots}$ in $[0,1)^{3d}$, the Koksma-Hlawka bound says that
\[ \left|\int_{[0,1)^{3d}} h(y_1, \dots, y_{3d})\,dy - \frac{1}{N}\sum_{k=1}^{N} h(y_k)\right| \le V(h)\,c(d)\,\frac{\log^{3d} N}{N} \]
where $V(h)$ is the variation of $h$ in the sense of Hardy and Krause (see e.g. Glasserman [4]) and $c(d)$ is a constant only dependent on the dimension $d$. Note that this bound is only valid for functions $h$ with finite variation, $V(h) < \infty$, which in general is not the case in financial applications since $h$ may be unbounded. Also, the result predicts a rather slow convergence in higher dimensions. In practical examples the rate of convergence is, however, much better (see Papageorgiou [11] for a discussion of convergence related to financial applications). We provide some numerical results indicating the convergence rate for our algorithm. A mathematical analysis of the properties of the algorithm will be provided elsewhere.

[Figure 1: four panels, for the intervals [−1, 1], [−4, −1], [−0.25, 0.1] and [2.5, 3]; horizontal axis from 0 to 10000, vertical axis from 0 to 0.1.]
Figure 1. Convergence of the Rydberg-qMC algorithm for the estimate of an indicator function over an interval with distribution NIG(1, 0.75, 1, 0). The smooth curves show the function $c \cdot \log^3 N/N$ with the constant $c = 0.2$.

In Fig. 1 we display some simulations of the convergence rate. We use a Niederreiter sequence to generate uniformly distributed low-discrepancy numbers and the Rydberg-qMC algorithm to get normal inverse Gaussian distributed numbers. We simulate an indicator function $\chi_{[a,b]}(x)$ and compare to a simulated true value. Fig. 1 shows the relative error of the quasi-Monte Carlo simulation together with the smooth curve $c \cdot \log^3 N/N$ with the constant $c = 0.2$. It is clear that for these simulations the convergence rate of the Rydberg-qMC numbers is of order $\log^3 N/N$ or better, and other simulations also confirm this.

Our proposed Rydberg-qMC algorithm is an alternative to the Hlawka-Mück method for qMC simulations from general distributions.
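A small experiment in the spirit of the Rydberg-qMC construction can be sketched as below. It is a rough illustration under several stated assumptions: a three-dimensional Halton sequence (bases 2, 3 and 5) stands in for the Niederreiter sequence, Python's `statistics.NormalDist().inv_cdf` stands in for Moro's inversion algorithm, and the names `radical_inverse`, `nig_from_uniforms` and `qmc_expectation` are hypothetical.

```python
import math
from statistics import NormalDist

# Stand-in for Moro's fast inversion of the standard normal distribution function.
_INV_PHI = NormalDist().inv_cdf


def radical_inverse(k, base):
    """Van der Corput radical inverse of the integer k >= 1 in the given base."""
    x, f = 0.0, 1.0 / base
    while k > 0:
        k, digit = divmod(k, base)
        x += digit * f
        f /= base
    return x


def nig_from_uniforms(u1, u2, u3, alpha, beta, mu, delta):
    """Map a point of the open unit cube to a NIG variate: u1 -> Y, u2 -> V, u3 -> root choice."""
    xi = delta / math.sqrt(alpha ** 2 - beta ** 2)
    v = _INV_PHI(u2) ** 2
    w = (xi + xi ** 2 * v / (2.0 * delta ** 2)
         - xi / (2.0 * delta ** 2) * math.sqrt(4.0 * xi * delta ** 2 * v + xi ** 2 * v ** 2))
    z = w if u3 <= xi / (xi + w) else xi ** 2 / w
    return mu + beta * z + math.sqrt(z) * _INV_PHI(u1)


def qmc_expectation(f, n, alpha, beta, mu, delta):
    """Average f over the first n Halton points pushed through the NIG transform."""
    total = 0.0
    for k in range(1, n + 1):  # start at k = 1 so that no coordinate equals 0
        u1, u2, u3 = (radical_inverse(k, b) for b in (2, 3, 5))
        total += f(nig_from_uniforms(u1, u2, u3, alpha, beta, mu, delta))
    return total / n
```

Passing an indicator such as `lambda x: 1.0 if a <= x <= b else 0.0` as `f` gives estimates of the kind discussed in the text, and evaluating them for increasing `n` lets one compare the observed error with a curve of the form $c \cdot \log^3 N/N$.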
The Hlawka-Mück method is used by Kainhofer [7] to generate qMC samples from a NIG distribution. To sample a point set from a distribution with cumulative distribution function $F$ we start with a uniformly distributed set $\omega = (x_1, x_2, \dots, x_n)$ on the half-open unit interval with discrepancy $D_n(\omega)$. We then let

$$y_k = \frac{1}{n}\sum_{r=1}^{n} \chi_{[0, x_k]}\big(F(x_r)\big)$$

and get the $F$-distributed point set $\tilde\omega = (y_1, y_2, \dots, y_n)$. Hence, every point in $\tilde\omega$ is of the form $i/n$, $i = 0, \dots, n$, and we observe that we need at least a numerical approximation of the cumulative distribution function. If $M = \sup_{x \in [0,1)} f(x)$, where $f(x)$ is the corresponding density function, then the discrepancy of $\tilde\omega$ is bounded by

$$D_{n,F}(\tilde\omega) \le (1 + M)\, D_n(\omega),$$

see Kainhofer [7]. We shall refer to this algorithm as the HM-method; it extends readily to higher dimensions.

Since the Hlawka-Mück method only applies to distributions supported on the unit hypercube, Kainhofer [7] considers a transformation between the real line and the unit interval given by the double-exponential distribution with parameter $\lambda > 0$, having cumulative distribution function

$$H_\lambda(x) = \begin{cases} \tfrac{1}{2}\exp(\lambda x), & \text{if } x < 0, \\ 1 - \tfrac{1}{2}\exp(-\lambda x), & \text{if } x \ge 0, \end{cases} \tag{3.1}$$

and inverse given as

$$H_\lambda^{-1}(x) = \begin{cases} \tfrac{1}{\lambda}\log(2x), & \text{if } x \le \tfrac{1}{2}, \\ -\tfrac{1}{\lambda}\log(2 - 2x), & \text{if } x > \tfrac{1}{2}. \end{cases}$$

To prevent having an argument equal to zero in the logarithm, Kainhofer [7] suggests shifting zero by $1/n$, where $n$ is the number of points in the sequence. This is shown to have minor influence on the properties of the sequence.

4. Finding implied parameters using Newton's Method and qMC

In finance one is often interested in the implied volatility, that is, the volatility of the asset price dynamics yielding a certain option price. If the option in question is of Asian type, one cannot resort to the Black & Scholes formula to derive the implied volatility, but needs to employ a numerical procedure involving calculation of the option price and a search for the volatility matching a given price.
If the underlying asset is modeled using an exponential NIG-Lévy process, there are essentially three parameters to search for in a risk neutral pricing paradigm. We shall later concentrate on deriving the implied $\delta$, and use the Newton Method in conjunction with our proposed qMC algorithm to find the implied $\delta$ from a given Asian option price.

We can state the problem in general as follows: Let $x \in \mathbb{R}$ be a parameter of the distribution for a random variable $X$ (multi-dimensional in general). Define $p$ to be

$$p = E\left[f(X(x))\right], \tag{4.1}$$

where we use the notation $X(x)$ to indicate that the distribution of $X$ depends on $x$. Here, $f$ is some function (in our context, the payoff from some option), and we assume that $f(X(x))$ has finite variance. The problem is to find $x$ for a given $p$, when the family of distributions for $X(x)$ is known but parametrized by $x$. For notational simplicity, define the function $g : \mathbb{R} \to \mathbb{R}$ to be

$$g(x) = E\left[f(X(x))\right]. \tag{4.2}$$

It is natural to use the Newton Method to find $x$. However, this requires an evaluation of $g$ along with its derivative $g'(x)$, and in our situation we do not have a functional expression even for $g(x)$ when $X$ is NIG distributed. To evaluate $g(x)$ for a given $x$ we will apply our Rydberg-qMC algorithm, but this introduces an error in the estimation. Moreover, when estimating the derivative $g'(x)$ by numerical differentiation (which requires a re-estimation of the function $g$ at a slightly perturbed location) this error may become even bigger. We provide an error analysis of the methods in question, and show that a careful increase in the length of the sampling sequence at each Newton step preserves the quadratic convergence property of the Newton algorithm.

Suppose $g \in C^2$, with $g'(x) \ne 0$ in $U$, and $|g''(x)| \le K$ uniformly in $U$ for some subset $U \subset \mathbb{R}$.
Suppose further that there exists a low-discrepancy sequence for the distribution of $X(x)$ with convergence independent of $x \in U$, and given by the rate $\log^d N / N$, where $N$ is the length of the sequence and $d$ the dimension. Recall that for the NIG distribution the dimension is $3 \times d$, with $d$ being the dimension of $X$. Newton's Method takes the form

$$x^N_{i+1} = x^N_i - \frac{g^N(x^N_i) - p}{(g^N)'(x^N_i)} \tag{4.3}$$

after selecting an initial point $x_0$. In the process it makes a functional evaluation $g^N(x)$ by qMC, wherein the superscript $N$ denotes the number of samples at step $i$. It will later be natural to index $N$ by $i$, that is $N(i)$, to indicate that the number of samples in the qMC-sequence may depend on the step $i$ in the Newton iteration. If we skip the index $N$, and write $g(x)$, we mean that the function $g$ is evaluated accurately. At each step the algorithm estimates the derivative $(g^N)'(x^N_i)$ by the secant method, using for the second point $g^N(x^N_i + \Delta_i)$, with the increment $\Delta_i$ chosen carefully to preserve accuracy in the next step.

We now move on to analyze the convergence properties of the method when the functional evaluations are made by qMC. The analysis addresses in particular the functional form of the requisite number of samples in the sequence, depending on the step index $i$.

4.1. Convergence to a fixed-point.

With exact valuation of $g(x)$ and $g'(x)$, the $i$th step takes $x_i$ to $x_{i+1}$ as follows.

$$x_{i+1} = x_i - \frac{g(x_i) - p}{g'(x_i)} \tag{4.4}$$

The second term is the exact error of the algorithm at step $i$, say $\varepsilon_i$. So,

$$\varepsilon_i := x_{i+1} - x_i = -\frac{g(x_i) - p}{g'(x_i)}.$$

With qMC valuations, the approximate error of the algorithm would be, say, $\varepsilon^N_i$. So,

$$\varepsilon^N_i := x^N_{i+1} - x_i = -\frac{g^N(x_i) - p}{\dfrac{g^N(x_i + \Delta_i) - g^N(x_i)}{\Delta_i}}.$$

It is desired to keep the difference of these error terms small.
To this end, write the difference, say $\iota_i$, as

$$\iota_i := \varepsilon^N_i - \varepsilon_i = x^N_{i+1} - x_{i+1} = -\frac{g^N(x_i) - p}{\dfrac{g^N(x_i + \Delta_i) - g^N(x_i)}{\Delta_i}} - \varepsilon_i. \tag{4.5}$$

We know, from the specification of qMC, that for some constant $c_i > 0$,

$$\left|g^N(x_i) - g(x_i)\right| \vee \left|g^N(x_i + \Delta_i) - g(x_i + \Delta_i)\right| \le c_i\, \frac{\log^d N}{N} \xrightarrow[N \to \infty]{} 0, \tag{4.6}$$

where $d$ is the dimension of the valuation domain. This fact, along with the continuity of $g'(x)$, guarantees from Equation (4.5) that

$$\lim_{\substack{\Delta_i \to 0 \\ N \to \infty}} \iota_i = 0, \tag{4.7}$$

and thus for sufficiently small $\Delta_i$ and large $N$, the introduction of qMC valuations compromises neither the existence of the successive approximations $\{x_i\}$ of Newton's Method, nor their accuracy. A consequence is that the algorithm produces a virtual fixed point at a solution.

4.2. Rate of convergence.

We approach convergence of the Newton-qMC algorithm in three parts, determining
(1) the choice of $\Delta_i$ to ensure that for sufficiently large $N$, $\iota_i$ is small,
(2) the choice of $N$, with corresponding estimate for $\iota_i$,
(3) an implicit function $N(i)$ expressing the number of samples through the steps.

4.2.1. Choice of $\Delta_i$.

This subsection presents a basic error analysis for using the secant method to approximate a derivative, in the context of a Newton's method step, and using the introduced notation. Similar analyses appear in many places under the heading "numeric differentiation." A good source is Griewank [5], which contains an extensive bibliography encompassing the relevant issues.

Looking to Equation (4.7) we wish to select an appropriate value of $\Delta_i$ so that step $i$ of the algorithm can provide a sufficiently accurate value $x^N_{i+1}$. Herein we take "sufficiently accurate" to mean that any inaccuracy in estimating $g'(x_i)$ by substituting the exact secant slope adds no more error to $x^N_{i+1}$ than the estimated error of the algorithm at the following step, $\tilde\varepsilon_{i+1}$, a value developed below as Equation (4.13). This error is estimable from the quadratic convergence of Newton's Method, wherein $\varepsilon_{i+1}$ is $O(\varepsilon_i^2)$.
Specifically,

$$\varepsilon_{i+1} \approx -\frac{g''(x_{i-1})}{2 g'(x_{i-1})}\, \varepsilon_i^2. \tag{4.8}$$

We make these concepts more precise, and end the narrative with the principal result, Equation (4.14) below.

Refer first to Equation (4.4). Continuing with exact analysis, that is, without yet invoking qMC valuations, let us estimate the effect of using a secant approximation to $g'(x_i)$. Let this estimate be

$$\bar g'(x_i) := \frac{g(x_i + \Delta_i) - g(x_i)}{\Delta_i}, \tag{4.9}$$

and then let

$$\hat x_{i+1} := x_i - \frac{g(x_i) - p}{\bar g'(x_i)},$$

and further let

$$\hat\varepsilon_i := \hat x_{i+1} - x_i.$$

By Taylor's expansion

$$\bar g'(x_i) = g'(x_i) + \tfrac{1}{2} g''(x_i)\Delta_i,$$

ignoring third and higher order terms. So,

$$\hat\varepsilon_i = -\frac{g(x_i) - p}{g'(x_i) + \tfrac{1}{2} g''(x_i)\Delta_i}.$$

The effect of the secant approximation $\bar g'(x_i)$, therefore, is to induce a second order error to $x^N_{i+1}$ of magnitude

$$\kappa_i := \hat\varepsilon_i - \varepsilon_i = -\frac{g(x_i) - p}{g'(x_i) + \tfrac{1}{2} g''(x_i)\Delta_i} - \varepsilon_i,$$

and so, using $g(x_i) - p = -\varepsilon_i\, g'(x_i)$,

$$\kappa_i = -\frac{\tfrac{1}{2} g''(x_i)\Delta_i\, \varepsilon_i}{g'(x_i) + \tfrac{1}{2} g''(x_i)\Delta_i}.$$

But $|g''(x_i)| \le K$, and therefore one may first choose

$$|\Delta_i| \le \frac{|g'(x_i)|}{K} \tag{4.10}$$

to ensure that

$$|\kappa_i| \le K \left| \frac{\Delta_i\, \varepsilon_i}{g'(x_i)} \right|. \tag{4.11}$$

One may further choose $\Delta_i$ to meet any desired maximal value for $|\kappa_i|$. To this end, return to the estimated error of the algorithm at the following step, $\tilde\varepsilon_{i+1}$. In the iteration of Newton's Method at the $i$th step we have in hand the error terms $\hat\varepsilon_{i-2}$ and $\hat\varepsilon_{i-1}$. These are related, at least approximately insofar as qMC valuations are incorporated, by Equation (4.8), adjusted back two iterations. Thus we may infer

$$\hat\varepsilon_{i-1} \approx -\frac{g''(x_{i-3})}{2 g'(x_{i-3})}\, \hat\varepsilon_{i-2}^2.$$

We assume that the coefficient herein is bounded on the domain of convergence through the iterations, and thus that

$$\nu := \sup_{i \ge 2} \frac{|\hat\varepsilon_{i-1}|}{\hat\varepsilon_{i-2}^2}$$

exists. It follows readily that

$$\tilde\varepsilon_i \le \nu\, \hat\varepsilon_{i-1}^2, \tag{4.12}$$

and that

$$\tilde\varepsilon_{i+1} \le \nu^3\, \hat\varepsilon_{i-1}^4. \tag{4.13}$$

This last estimate is the one we merge with Equation (4.11) to provide a choice of $\Delta_i$.
Remembering the first constraint on $\Delta_i$, as expressed in Equation (4.10), we have

$$|\Delta_i| \le \frac{|g'(x_i)|}{K} \wedge \nu^3\, \frac{\hat\varepsilon_{i-1}^4}{|\varepsilon_i|}\, \frac{|g'(x_i)|}{K} = \frac{|g'(x_i)|}{K} \left( 1 \wedge \nu^3\, \frac{\hat\varepsilon_{i-1}^4}{|\varepsilon_i|} \right). \tag{4.14}$$

In practice neither $\varepsilon_i$ nor $g'(x)$ is known in advance, so we substitute in the former instance the value $\tilde\varepsilon_i$ from Equation (4.12), and in the latter instance the value $\bar g'(x_i)$ from Equation (4.9).

4.2.2. Number of samples for a step.

Again looking to Equation (4.7), we wish to select an appropriate value of $N$ so that step $i$ of the algorithm can provide a sufficiently accurate value $x^N_{i+1}$, ending the narrative with the principal result, Equation (4.18) below. We take "sufficiently accurate" to mean that any inaccuracy in estimating $g'(x_i)$ by approximating $g(x_i)$ and $g(x_i + \Delta_i)$ by $g^N(x_i)$ and $g^N(x_i + \Delta_i)$, respectively, adds no more error to $x^N_{i+1}$ than the estimated error of the algorithm at the following step, $\tilde\varepsilon_{i+1}$.

With a choice of $\Delta_i$ made, we look to the outer error bound for qMC, as expressed in Equation (4.6), as a guide in selecting sample size. To proceed it is first necessary to estimate the coefficient $c_i$ empirically, for there are some variables which are intractable analytically, such as the effect of a particular choice of sampling scheme. It may well be also that $c_i$ is not sensitive to the step of the iteration, and so may be chosen uniformly.

Refer to Equation (4.5) and Equation (4.6). It is desired to select the number of samples $N$ such that

$$|\iota_i| \le |\tilde\varepsilon_{i+1}|.$$

To this end, assume that

$$g^N(x_i) = g(x_i) + \zeta_i, \qquad g^N(x_i + \Delta_i) = g(x_i + \Delta_i) + \eta_i,$$

and

$$|\zeta_i| + |\eta_i| \le c_i\, \frac{\log^d N}{N}. \tag{4.15}$$

Then, after some elementary manipulation,

$$\iota_i = -\frac{g(x_i) - p + \zeta_i}{\dfrac{g(x_i + \Delta_i) - g(x_i)}{\Delta_i} + \dfrac{\eta_i - \zeta_i}{\Delta_i}} - \varepsilon_i.$$

This calculation considers the combined effects of estimating $g'(x_i)$, and of using qMC to value $g(x_i)$ and $g(x_i + \Delta_i)$.
Insofar as the error in $g'(x_i)$ has already been accounted for, replace the term

$$\frac{g(x_i + \Delta_i) - g(x_i)}{\Delta_i}$$

above with $g'(x_i)$, to focus on the error induced by qMC valuations. Thus, we wish to set

$$\left| -\frac{g(x_i) - p + \zeta_i}{g'(x_i) + \dfrac{\eta_i - \zeta_i}{\Delta_i}} - \varepsilon_i \right| = \left| \frac{\zeta_i + \varepsilon_i\, \dfrac{\eta_i - \zeta_i}{\Delta_i}}{g'(x_i) + \dfrac{\eta_i - \zeta_i}{\Delta_i}} \right| \le |\tilde\varepsilon_{i+1}|. \tag{4.16}$$

Assume by Equation (4.15) that we have chosen $N$ sufficiently large that

$$|\zeta_i| + |\eta_i| \le \frac{|\Delta_i\, g'(x_i)|}{2},$$

and thus that

$$\left| \frac{\eta_i - \zeta_i}{\Delta_i} \right| \le \frac{|\eta_i| + |\zeta_i|}{|\Delta_i|} \le \frac{|g'(x_i)|}{2}.$$

Replace the denominator in Equation (4.16) by $\tfrac{1}{2}|g'(x_i)|$, which is no larger in magnitude, giving

$$\frac{\left| \zeta_i + \varepsilon_i\, \dfrac{\eta_i - \zeta_i}{\Delta_i} \right|}{\tfrac{1}{2}|g'(x_i)|} \le |\tilde\varepsilon_{i+1}|$$

as sufficient. Enlarging the numerator gives

$$\frac{|\zeta_i| + \left|\dfrac{\varepsilon_i}{\Delta_i}\right| \big(|\eta_i| + |\zeta_i|\big)}{\tfrac{1}{2}|g'(x_i)|} = \frac{\left(1 + \left|\dfrac{\varepsilon_i}{\Delta_i}\right|\right)|\zeta_i| + \left|\dfrac{\varepsilon_i}{\Delta_i}\right| |\eta_i|}{\tfrac{1}{2}|g'(x_i)|} \le |\tilde\varepsilon_{i+1}| \tag{4.17}$$

as a further sufficient condition. This last expression can be driven to zero with large $N$.

The formulation to calculate a sufficient $N$ is evident. If Equation (4.15) holds, then also

$$|\zeta_i| \le c_i\, \frac{\log^d N}{N} \quad \text{and} \quad |\eta_i| \le c_i\, \frac{\log^d N}{N}$$

independently. These relations combined with Equation (4.17) lead to

$$c_i\, \frac{\log^d N}{N}\, \frac{1 + 2\left|\dfrac{\varepsilon_i}{\Delta_i}\right|}{\tfrac{1}{2}|g'(x_i)|} \le |\tilde\varepsilon_{i+1}| \tag{4.18}$$

as a sufficient condition on $N$. One may solve this relation numerically to guarantee that the qMC induced error is small, that is, within the bound of $\tilde\varepsilon_{i+1}$, as expressed. In practice neither $\varepsilon_i$ nor $g'(x)$ is known in advance, so we substitute in the former instance the value $\tilde\varepsilon_i$ from Equation (4.12), and in the latter instance the value $\bar g'(x_i)$ from Equation (4.9).

Under some circumstances convergence of qMC may be faster than that indicated herein. For a discussion see Papageorgiou [11].

4.2.3. Step dependent qMC sampling.

Recall that $N$ is the number of samples taken for qMC valuation of $g(x)$ and $g'(x)$ at Newton step $i$. We shall indicate this dependence as $N(i)$. The principal result herein is Proposition 4.1.
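To make the step dependence $N(i)$ concrete, the following is a minimal sketch, not the authors' implementation, of the recursion $\log^d N(i+1)/N(i+1) = \nu c\,(\log^d N(i)/N(i))^2$ that this subsection arrives at as Equation (4.21). The function names and the bisection solver are illustrative assumptions; the sketch also assumes $N(0) \ge e^d$, so that the rate $\log^d N / N$ is decreasing and the next sample size is uniquely determined.

```python
import math


def log_ratio(n, d):
    """The qMC rate log(n)**d / n that appears in Equation (4.21)."""
    return math.log(n) ** d / n


def solve_samples(target, d, hi=1e18):
    """Return N >= e**d with log(N)**d / N == target, found by bisection.

    On [e**d, infinity) the rate is strictly decreasing, so the root is unique.
    The upper bracket `hi` is an arbitrary assumption of this sketch.
    """
    lo = math.exp(d)
    if not (log_ratio(hi, d) < target <= log_ratio(lo, d)):
        raise ValueError("target rate is outside the bracketed range")
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic scale
        if log_ratio(mid, d) > target:
            lo = mid  # rate still too large: more samples are needed
        else:
            hi = mid
    return hi


def sample_schedule(n0, nu, c, d, steps):
    """Iterate the recursion of Equation (4.21) starting from N(0) = n0."""
    sizes = [float(n0)]
    for _ in range(steps):
        rate = log_ratio(sizes[-1], d)
        sizes.append(solve_samples(nu * c * rate * rate, d))
    return sizes


if __name__ == "__main__":
    # Parameters captioned in Table 1: N(0) = 1000, nu = 1, c = 2, d = 3.
    for i, n in enumerate(sample_schedule(1000, nu=1.0, c=2.0, d=3, steps=5)):
        print(i, round(n), round(math.log(n), 3))
```

With the parameters captioned in Table 1, the schedule should agree with the tabulated sample sizes up to rounding, which illustrates how quickly the required number of points grows from one Newton step to the next.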
From Equation (4.12) we have implied, given the assumed stability of $\nu$ and the faithful prediction of $\hat\varepsilon_i$ by $\tilde\varepsilon_i$, that

$$|\tilde\varepsilon_{i+1}| = \nu\, \tilde\varepsilon_i^2.$$

Combined therefore with Equation (4.6) we have

$$|\tilde\varepsilon_{i+1}| = \nu \left( c_i\, \frac{\log^d N(i)}{N(i)} \right)^2, \tag{4.19}$$

but by the same reasoning,

$$|\tilde\varepsilon_{i+1}| = c_{i+1}\, \frac{\log^d N(i+1)}{N(i+1)}. \tag{4.20}$$

One may assume that the series $\{c_i\}$ is stable through the Newton steps, especially as the steps get smaller as a solution is approached. Assume, therefore, $c := c_i$, $i \ge 0$, as this approximate value. Equations (4.19) and (4.20) therefore imply a relationship between $N(i+1)$ and $N(i)$. This is given implicitly by

$$\frac{\log^d N(i+1)}{N(i+1)} = \nu c \left( \frac{\log^d N(i)}{N(i)} \right)^2. \tag{4.21}$$

Table 1 below shows an example of the evolving number of samples necessary to maintain accuracy, computed recursively from Equation (4.21) above, for the captioned parameters.

Iteration i    Samples N(i)      log N(i)
0                       1,000       6.908
1                       2,035       7.618
2                       7,534       8.927
3                      80,926      11.301
4                   5,969,401      15.602
5              16,024,385,755      23.497

Table 1. qMC Samples by Iteration: N(0) = 1000, ν = 1, c = 2, d = 3

Next, we state formally this observed growth of $N(i)$.

Proposition 4.1 (Log Samples Limit). If

$$\gamma := \nu c\, \frac{\log^d N(0)}{N(0)} < 1,$$

then

$$\liminf_{i \to \infty} \frac{\log N(i)}{2^i \log \gamma^{-1}} \ge 1.$$

Proof. Take Equation (4.21) above, and compute the Newton error by recursion to step $i$ beginning at step 0. The result is this relationship:

$$\frac{\log^d N(i)}{N(i)} = (\nu c)^{2^i - 1} \left( \frac{\log^d N(0)}{N(0)} \right)^{2^i} = (\nu c)^{2^i - 1} \left( \frac{\gamma}{\nu c} \right)^{2^i} = \frac{\gamma^{2^i}}{\nu c}.$$

Assuming $N(i) > 1$,

$$\frac{\log N(i)}{2^i \log \gamma^{-1}} = \frac{d \log \log N(i)}{2^i \log \gamma^{-1}} + \frac{\log(\nu c)}{2^i \log \gamma^{-1}} + 1.$$

As $i \to \infty$ the denominators of these terms increase without bound, because $\log \gamma^{-1} > 0$ by the hypothesis. Therefore, the second term on the right converges to zero. If the first term on the right also converges to zero, then the assertion follows with a limit of one. Otherwise the limit inferior is greater.

5. Applications to finance

In this Section we consider three applications of our qMC method for simulating normal inverse Gaussian distributed variables. The first example covers the valuation of a plain vanilla call option and different Asian options when the underlying asset price dynamics is driven by a geometric NIG-Lévy process. Next we consider the problem of recovering parameters of the underlying asset price dynamics when observing option prices. This is a problem similar to finding the implied volatility in the Black & Scholes context; however, in our situation we need to resort to simulation since there is no analytical option pricing formula. We find the implied $\delta$ in the NIG distribution from the observed plain vanilla call and Asian option prices, and our method combines qMC valuation of these option prices with Newton's method to iterate toward the implied value. In our final application we apply the qMC method to derive the Value-at-Risk measure for a portfolio of assets. We compare with standard Monte Carlo, but also demonstrate how we can use Newton's method to simulate VaR, even though not much is gained with this approach. In our applications we focus on both accuracy and efficiency in terms of speed.

5.1. Calculating option prices with qMC for NIG.

We consider the problem of pricing options written on an asset whose dynamics are given by an exponential NIG-Lévy process. We suppose that the parameters of the NIG distributed log-returns under the equivalent martingale measure given by the Esscher transform of the asset are

$$\mu = 0.00395, \quad \beta = -15.1977, \quad \alpha = 136.29, \quad \delta = 0.0295,$$

which are the same set of parameters as in Kainhofer [7, Ch. 8]. We note in passing that these parameters are relevant for daily observed stock price log-returns (see e.g. Rydberg [13] for empirical analysis of Danish stock returns). We suppose further that the stock price today is $S(0) = 100$ and that the risk-free interest rate is $r = 3.75\%$ yearly.
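As an illustration of how the parameter set above enters the sampling scheme, the following is a minimal sketch of a three-uniform NIG sampling map of the kind used in the Rydberg-qMC method. It rests on several assumptions that are not taken from the paper: $\xi$ is taken to be $\delta/\sqrt{\alpha^2 - \beta^2}$ (the mean of the inverse Gaussian mixing variable), the standard library's normal quantile function stands in for Moro's inversion, and a hand-written Halton sequence in bases 2, 3 and 5 stands in for the Niederreiter sequence. All function names are hypothetical.

```python
import math
from statistics import NormalDist

# Normal quantile function; a stand-in for Moro's inversion algorithm.
_PHI_INV = NormalDist().inv_cdf


def halton(index, base):
    """Radical-inverse (van der Corput) value of a positive `index` in `base`."""
    result, weight = 0.0, 1.0 / base
    while index > 0:
        result += weight * (index % base)
        index //= base
        weight /= base
    return result


def nig_from_uniforms(u1, u2, u3, alpha, beta, mu, delta):
    """Map three uniforms in (0, 1) to one NIG(alpha, beta, mu, delta) variate."""
    gamma = math.sqrt(alpha ** 2 - beta ** 2)
    xi = delta / gamma  # assumption: xi is the mean of the mixing variable
    v = _PHI_INV(u2) ** 2  # chi-square(1) variate built from u2
    w = (xi + xi ** 2 * v / (2 * delta ** 2)
         - xi / (2 * delta ** 2)
         * math.sqrt(4 * xi * delta ** 2 * v + xi ** 2 * v ** 2))
    # Pick one of the two roots; u3 is the selecting uniform.
    q = w if u3 <= xi / (xi + w) else xi ** 2 / w
    return mu + beta * q + math.sqrt(q) * _PHI_INV(u1)


def nig_halton_sample(n, alpha, beta, mu, delta):
    """Draw n NIG variates driven by a 3-dimensional Halton sequence."""
    return [
        nig_from_uniforms(halton(k, 2), halton(k, 3), halton(k, 5),
                          alpha, beta, mu, delta)
        for k in range(1, n + 1)  # start at 1 to avoid the point 0
    ]


if __name__ == "__main__":
    xs = nig_halton_sample(4096, alpha=136.29, beta=-15.1977,
                           mu=0.00395, delta=0.0295)
    print(sum(xs) / len(xs))  # compare with mu + delta * beta / gamma
```

A quick sanity check under these assumptions is to compare the sample mean and variance with the closed-form NIG moments $\mu + \delta\beta/\gamma$ and $\delta\alpha^2/\gamma^3$, where $\gamma = \sqrt{\alpha^2 - \beta^2}$.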
Consider first European at-the-money call options with a common strike $K = 100$ and exercise horizons of four, eight, or twelve weeks, calculated by weekly sampling with NIG parameters as above. We now compare our proposed Rydberg-qMC method with the HM-qMC method. To show the superiority of qMC methods over crude Monte Carlo, we also include a comparison with the Rydberg-MC method and an acceptance-rejection Monte Carlo method (AR-MC). For the HM-qMC method, we apply $\lambda = 95.2271$ as in Kainhofer [7] in the double-exponential transformation of Equation (3.1). For both Monte Carlo algorithms we use the built-in functions in Matlab for simulating uniform and normal variables. A Halton sequence is used for the HM-qMC method, while for Rydberg-qMC we base our sampling on a three dimensional Niederreiter sequence.

[Figure 2: plot titled "Vanilla Call, Relative error", with curves for Hlawka-Mück, Rydberg-qMC, Rydberg-MC and A-R; logarithmic horizontal axis from 10^2 to 10^4, logarithmic vertical axis from 10^-3 to 10^-1.]
Figure 2. Comparison of the different methods when calculating the price of a vanilla call option

We compare the four approaches in terms of their relative error, where the "correct" price is obtained from a long Monte Carlo simulation. In Fig. 2 we have plotted the relative error as a function of the number of samples in the sequence, with log-scale on both axes. The error for the Hlawka-Mück method is generally lower than for the other methods, but our Rydberg-qMC method is superior to the two Monte Carlo methods. The results for the two Monte Carlo methods are the mean over ten consecutive runs and we observe that the two methods perform equivalently for all sets of points. Our quasi-Monte Carlo method is slightly worse off than the Hlawka-Mück method in accuracy, which is expected. In one dimension the Hlawka-Mück points fill the space more evenly than the qMC points.
Hence, we expect that the Hlawka-Mück method is more accurate for a given point set, but are confident that the quasi-Monte Carlo method performs better than ordinary Monte Carlo.

In accuracy the Hlawka-Mück method seems to outperform the other methods. However, the execution times differ significantly, see Table 2. The generation of the Hlawka-Mück numbers involves calculating the cumulative distribution function using numerical integration and iterates over all points repeatedly, which makes the method very slow compared to the other ones. Even though the integration is in only one dimension, and thus avoids the curse of dimensionality in multi-dimensional integration, the execution time for the Hlawka-Mück method compared to the other ones is a clear indication that there is more work done than necessary. The quasi-Monte Carlo method we propose has slightly lower accuracy than the Hlawka-Mück method, but when considering the time it takes to reach a certain level of accuracy our method is clearly competitive. The Rydberg-MC method is the fastest method for a given point set but it suffers, along with the Acceptance-Rejection method, from lower accuracy.

Points        H-M        qMC    Rydberg-MC       A-R
32         0.1400     0.0500        0.0000    0.0084
64         0.1600     0.0100        0.0000    0.0164
128        0.3200     0.0300        0.0000    0.0344
256        0.6900     0.0400        0.0100    0.0672
512        1.6200     0.1000        0.0100    0.1372
1024       4.3500     0.1800        0.0300    0.2704
2048      16.6100     0.3800        0.0600    0.5404
4096      91.1800     0.7800        0.1200    1.0812
8192     912.4300     1.6900        0.3300    2.1676
16386   8861.8200     4.1100        0.9000    4.3488

Table 2. Table of the execution times in seconds for the vanilla call option price with different sizes of the sequence

We also consider the same Asian option pricing problem that Kainhofer [7] examines. The option is sampled in weekly intervals and the parameters for the distribution are taken from Kainhofer. We let the options, as noted, have maturities of four, eight, or twelve weeks, and use a Sobol sequence for all quasi-Monte Carlo methods. We see from Figure 3 that our method is not as accurate as the Hlawka-Mück method in general, but still better than crude Monte Carlo. In 12 dimensions we observe that we do not get nearly as good results for the Hlawka-Mück method as in Kainhofer [7]. This could perhaps be attributed to a better numerical integration in Kainhofer, who uses Mathematica to do the integration before applying the Hlawka-Mück method. We do the integration within the method, using native Matlab routines. Also, even if the Hlawka-Mück method gives a better result over a given number of points, we may reach the same accuracy in shorter time with our method, using more points.

[Figure 3: three panels titled "4 dimensions", "8 dimensions" and "12 dimensions", each with curves for Hlawka-Mück, Rydberg-qMC and Rydberg-MC; logarithmic horizontal axes from 10^2 to 10^4.]
Figure 3. Log-Log plot of the relative errors for the Asian call option price with different sizes of the sequence. Quasi-Monte Carlo results are from a single run, Monte Carlo results are the average over 25 runs.

5.2. Finding implied parameters from option prices.

We next turn our attention to the problem of finding parameters implied from given prices. We could imagine that we have option prices quoted on some market and a model for the dynamics of the underlying assets. For example, we could imagine that the underlying asset follows some stochastic model making the log-returns normal inverse Gaussian distributed. Since the methods can only work with one single parameter, for a four-parameter distribution such as the normal inverse Gaussian we must have some other way to assess the other three parameters beforehand. When we have the other parameters in place it is an easy task for the algorithm to find the remaining parameter sought from the given price.

δ        δ̂         Time       δ        δ̂         Time
0.001    0.00098    0.30912    0.011    0.01090    0.17700
0.002    0.00192    0.69724    0.012    0.01177    0.19173
0.003    0.00287    0.22056    0.013    0.01269    0.21593
0.004    0.00390    0.34375    0.014    0.01357    0.18810
0.005    0.00494    0.20619    0.015    0.01460    0.17739
0.006    0.00599    0.17977    0.016    0.01563    0.17744
0.007    0.00698    0.18817    0.017    0.01664    0.17814
0.008    0.00799    0.18907    0.018    0.01768    0.19812
0.009    0.00898    0.18820    0.019    0.01869    0.18875
0.010    0.00993    0.18844    0.020    0.01968    0.17994

Table 3. Simulated δ with a combination of Newton's method and quasi-Monte Carlo simulations. Columns three and six give the time until the methods terminate. The relatively long time for δ = 0.002 is due to the maximal number of iterations being exceeded before any other terminal condition was met.

In any such computational process, a good stopping rule is essential. Among such rules are these.
(1) Perform a predetermined number of iterations, based on an analysis of errors,
(2) Iterate until successive absolute differences fall below some threshold, and
(3) Iterate until successive absolute differences fail to get smaller, choosing the next to last value as best.
The third of these is a good choice for Newton's Method if one desires full machine accuracy across computing platforms in a production environment.

We start testing the method with a European call option. The method is implemented in C++ and we use a very long Monte Carlo simulation to get a "true value" for the option, which will be the designated target. We use parameter values from Rydberg [13] for the NIG distribution and choose the set of parameters for Deutsche Bank as our test example. The estimated value of the scale parameter is in this case δ = 0.012, but to test the model we try a range of values such that δ ∈ {0.001, 0.002, . . . , 0.020}. We implement a few different termination criteria for Newton's method, settling on a combination of the first and second rules as listed above.
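The combination of the first and second stopping rules can be sketched as follows. This is an illustrative sketch only, not the C++ implementation referred to in the text: the function name, the default tolerances and the fixed secant increment `delta` are assumptions, and `g` stands for any routine that returns the (possibly qMC-estimated) price for a given parameter value.

```python
def newton_secant(g, p, x0, delta=1e-4, tol=1e-8, max_iter=50):
    """Solve g(x) = p by Newton's method with a secant estimate of g'.

    Stops when successive iterates differ by less than `tol` (rule 2)
    or after `max_iter` iterations (rule 1), whichever comes first.
    Returns (x, iterations_used, converged).
    """
    x = x0
    for iteration in range(1, max_iter + 1):
        gx = g(x)
        # Secant slope from a second evaluation at the perturbed point.
        slope = (g(x + delta) - gx) / delta
        if slope == 0.0:
            raise ZeroDivisionError("secant slope vanished; cannot take a Newton step")
        x_new = x - (gx - p) / slope
        if abs(x_new - x) < tol:
            return x_new, iteration, True
        x = x_new
    return x, max_iter, False


if __name__ == "__main__":
    # Deterministic stand-in for a pricing function: solve x**3 = 8.
    root, steps, converged = newton_secant(lambda x: x ** 3, p=8.0, x0=3.0)
    print(root, steps, converged)
```

Returning the `converged` flag separately makes it possible to distinguish runs that hit the iteration cap, such as the δ = 0.002 case noted in the caption of Table 3, from runs that met the difference threshold.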
We iterate until the difference of successive values falls below a selected small threshold, but stop after a chosen maximum number of iterations. We found that using 4096 points for the quasi-Monte Carlo method gave a good balance between speed and accuracy in the simulations, and proved sufficient for our research needs.

We see from Table 3 that the method finds the given value of δ within a few percent relative error in about one fifth of a second when the method terminates before reaching the maximal number of iterations allowed. It should be noted that the method compares option prices with a precision of order $10^{-5}$, which is much more precise than what is quoted as market prices. Also, the Monte Carlo simulation of the "true price" adds some additional error which we do not have if we consider the quoted price as the true observed price.

δ        δ̂         Time       δ        δ̂         Time
0.001    0.00098    1.08032    0.011    0.01089    0.74758
0.002    0.00197    1.08491    0.012    0.01194    0.74309
0.003    0.00300    1.08687    0.013    0.01292    0.74683
0.004    0.00397    0.97535    0.014    0.01379    0.75018
0.005    0.00492    0.97139    0.015    0.01475    0.74524
0.006    0.00592    0.86727    0.016    0.01564    0.85670
0.007    0.00691    0.86690    0.017    0.01668    0.85645
0.008    0.00787    0.85736    0.018    0.01770    0.85376
0.009    0.00890    0.86405    0.019    0.01889    0.86557
0.010    0.00980    0.63186    0.020    0.01984    0.87636

Table 4. Simulated δ with a combination of Newton's method and quasi-Monte Carlo simulations for an Asian option over ten days. δ̂ is the estimate when we assume we have quoted price with two decimals.

This method easily extends to path dependent options such as Asian options. To illustrate this we test the method using a 10-day Asian option with daily normal inverse Gaussian distributed log-returns and parameters as above. We apply a Sobol sequence for the low-discrepancy numbers and the effective dimension is 30. We now consider a case where we have quoted option prices with only two decimals precision.
In reality this is the situation we would find if we used real data as the basis for our root finding algorithm. We lower the required accuracy in Newton's method to account for this. The method now requires more time, since we need much more work to evaluate the option in each qMC step. As we can see in Table 4, the time for the simulation is now around a second. The precision in the estimates is overall not significantly worse than in the previous results. Clearly, having prices quoted with many decimals is not crucial for the result. The error in the Monte Carlo or quasi-Monte Carlo evaluations is probably more influential than the error in the terminal condition of Newton's method.

Following the convergence analysis in Section 4.2.3 we tried an approach where we increased the number of qMC points in the function evaluation for every step of Newton's method. This would ensure that the error in the function evaluation is of the same order as the expected error from the Newton iteration. However, we found that this did not improve the convergence, rather the opposite. We believe that this is a practical problem, because the number of points and iterations is comparatively small. The change in the function evaluation we experience by changing the number of points disturbs Newton's method, requiring more iterations to get the same accuracy. However, if we let the number of points and iterations approach infinity, the convergence analysis shows that we converge to the correct answer, while with a fixed number of points the method will converge to an estimate with an error bounded by the qMC error.

5.3. Calculating the Value-at-Risk for a portfolio.

Let $X$ be a random variable describing the portfolio position at time $T$. We are interested in finding the Value-at-Risk $\mathrm{VaR}_T(p)$ for a given risk level $p \in (0, 1)$ at time $T$, defined as:

$$\Pr\left[X \le \mathrm{VaR}_T(p)\right] = p. \tag{5.1}$$

We can rewrite this as $E\left[\mathbf{1}_{\{X \le \mathrm{VaR}_T(p)\}}\right] = p$.
To this end, define the function $g : \mathbb{R}_+ \to [0, 1]$ as

$$g(x) = E\left[\mathbf{1}_{\{X \le x\}}\right] \tag{5.2}$$

and note that $\mathrm{VaR}_T(p)$ is a solution of the equation $g(x) = p$. We can find this solution by using a fixed-point iteration in conjunction with some simulation method enabling us to calculate $g(x)$ for a given $x$. We suggest using quasi-Monte Carlo techniques for the latter. Letting $x_0 \in \mathbb{R}_+$ be our initial guess of $\mathrm{VaR}_T(p)$, we can use Newton's Method to iterate as follows:

$$x_{n+1} = x_n - \frac{g(x_n) - p}{g'(x_n)}. \tag{5.3}$$

We now elaborate a bit on the form of $g$. We let $X$ be the value of a portfolio of $n$ risky assets or a mixture of assets and options on these, represented as

$$X = \sum_{i=1}^{n} f_i\big(S_1(T), \dots, S_m(T)\big). \tag{5.4}$$

Here $S_j(t)$, $j = 1, \dots, m$, are $m$ independent geometric NIG Lévy processes and $f_j$ are the pay-off functions. If asset number $j$ is a stock, then $f_j(x_1, \dots, x_m) = x_j$, while if it is a call option we can write it as $f_j(x_1, \dots, x_m) = [x_j - K]^+$. However, the specification of the $f_j$'s can be chosen rather freely. We conclude with

$$g(x) = E\left[\mathbf{1}_{\{0 \le \sum_{i=1}^{n} f_i(S_1(T), \dots, S_m(T)) \le x\}}\right].$$

We then turn our attention to the simulation of Value-at-Risk with the combined qMC and Newton's method approach. We use an ordinary Newton-Raphson method and a Sobol sequence [14, 15] to generate the uniform quasi-random numbers. For the numerical derivative we keep track of the closest point larger than the current estimate. We then use the difference between the function values at the two points divided by the distance between them.

Our test case is a portfolio consisting of 10 options. We use normal inverse Gaussian log-returns employing the proposed quasi-Monte Carlo (Rydberg-qMC) sampling algorithm, and let the options have different parameters to reflect the different heaviness of the tails. Observe that we estimate the quantile rather than the possible loss. For a true value we use a Monte Carlo simulation over 100,000 points.

One concern we must address is the problem with the number of points.
Using a Sobol sequence to generate quasi-random numbers we would preferably use $2^k$ points, where $k$ is a positive integer. However, as we are interested, for example, in Value-at-Risk at 5%, using $2^{10} = 1024$ points gives a subsample of $0.05 \cdot 1024 = 51.2$ points, which is not an integer. We could use a number of points such that the subsample is an integer, but the risk is that this practice would destroy the advantage of the quasi-random numbers.

VaR     Lognormal   True      MC        qMC Sorting       qMC Newton
0.010    9.7652     4.2166    4.9494     3.8483      =     3.8483
0.012   10.4651     5.1591    6.0128     3.9128      <     4.0000
0.014   11.1104     6.0225    7.9668     4.5565      =     4.5565
0.016   11.7137     6.9381    8.2702     4.8286      =     4.8286
0.018   12.2833     7.8956   10.0260     5.4323      <     5.6703
0.020   12.8253     8.7591   12.2965     6.3773      =     6.3773
0.022   13.3442     9.6287   12.9095     7.0618      <     7.8710
0.024   13.8433    10.4094   14.3063     8.4053      <     8.7925
0.026   14.3254    11.2482   14.6840     9.1187      <     9.4216
0.028   14.7927    11.9398   15.6789    10.6132      <    10.8920
0.030   15.2469    12.8199   18.0643    11.6665      =    11.6665
0.032   15.6895    13.5697   18.3838    12.5485      <    13.3410
0.034   16.1217    14.3761   19.0249    14.1657      =    14.1657
0.036   16.5445    15.1415   19.3859    14.4826      <    14.6097
0.038   16.9588    15.9424   20.3904    14.9184      <    15.4281
0.040   17.3655    16.6828   22.1250    15.5063      =    15.5063
0.042   17.7651    17.5082   22.8483    16.0907      =    16.0907
0.044   18.1583    18.3226   23.4318    16.3529      =    16.3529
0.046   18.5455    19.0592   23.6134    16.5205      <    17.2845
0.048   18.9273    19.7754   24.1981    17.8377      <    17.8994

Table 5. Results for Value-at-Risk, in order: Log Normal, True over 100,000 points, Monte Carlo, quasi-Monte Carlo and sorting, quasi-Monte Carlo and Newton's Method. The sign before the last column indicates whether the sorting value equals (=) or lies below (<) the Newton value.

Figure 4. Plots of the quantile value for our test portfolio (curves: True, Quasi-MC with sorting, Q-MC with Newton; horizontal axis: VaR level from 0.01 to 0.05).

VaR     MC           qMC Sorting   qMC Newton
0.010   0.06258900   0.2563974     0.2412919
0.012   0.07825003   0.2359742     0.2495200
0.014   0.08729312   0.2661687     0.2155142
0.016   0.09785151   0.2145515     0.2720427
0.018   0.07999894   0.2354247     0.2664269
0.020   0.06265327   0.2047915     0.2147248
0.022   0.07821451   0.2614528     0.2394932
0.024   0.09920062   0.2495174     0.2204215
0.026   0.07245710   0.2384592     0.2364952
0.028   0.06869601   0.2336857     0.2383182
0.030   0.06599299   0.2326620     0.2714227
0.032   0.07176859   0.2544114     0.2112069
0.034   0.07416334   0.2097142     0.2486333
0.036   0.07306887   0.2751248     0.2388638
0.038   0.06022343   0.2146338     0.2116795
0.040   0.08779631   0.2060886     0.2044529
0.042   0.06652290   0.2058392     0.2299584
0.044   0.07902179   0.2013111     0.2250775
0.046   0.07487483   0.2545944     0.2380706
0.048   0.08748519   0.1969024     0.2340689

Table 6. Times in seconds to calculate Value-at-Risk for, in order: Monte Carlo, quasi-Monte Carlo and sorting, quasi-Monte Carlo and Newton's Method.

Taking $n$ points and letting the desired level of Value-at-Risk be VaR, the way our method works we cannot hope for a better value for VaR than one between the VaR-point and the point above it in the sorted point set; see Table 5 and Figure 4. We see that we do better than the Monte Carlo method, but we are not very close to the true solution. Our hope is that our method is faster than sorting the points to find this point. We perform 10 consecutive runs and take the mean of the run times to smooth out computer-dependent variations. As we see in Table 6, our method is comparable with the sorting approach. But if we look more closely at what takes time in the algorithms, we see that drawing the quasi-random numbers and calculating the portfolio takes more than 0.20 seconds. This can be compared with the time needed to sort 1000 points, which is about 0.004 seconds. Hence, the time to be gained with our approach is insignificant compared with the time it takes to draw the random numbers.
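The sorting approach and the non-integer subsample size discussed above can be sketched as follows. This is only an illustration: the function name `var_by_sorting` and the stand-in portfolio values are hypothetical, and the sketch simply returns the two order statistics that bracket the quantile.

```python
import math
import numpy as np

def var_by_sorting(values, p):
    """Return p*n and the two order statistics bracketing the p-quantile."""
    ordered = np.sort(values)
    position = p * len(ordered)                      # generally not an integer
    lower = max(math.floor(position), 1)             # 1-based rank of the point below
    upper = min(math.ceil(position), len(ordered))   # 1-based rank of the point above
    return position, ordered[lower - 1], ordered[upper - 1]

# Stand-in portfolio values 1024, 1023, ..., 1 (hypothetical data).
values = np.arange(1024, 0, -1, dtype=float)
position, below, above = var_by_sorting(values, 0.05)
```

With 1024 points and a 5% level the position falls strictly between the 51st and 52nd sorted points, which is the bracketing interval referred to in the text.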
It is clear that our method gives no advantage over the sorting approach. It appears that the methods proposed do not show any improvement when calculating the Value-at-Risk.

Figure 5. Plots of the times to calculate VaR for 1000 points (curves: MC, Quasi-MC with sorting, Q-MC with Newton; horizontal axis: VaR level from 0.01 to 0.05).

6. Conclusions

We have proposed a qMC-algorithm to draw NIG-variates. The algorithm is applied to three problems appearing in finance, namely valuation of options, finding implied parameters from quoted option prices and deriving the Value-at-Risk for a nonlinear portfolio. Our algorithm is compared with several other ways to compute prices numerically, and it is demonstrated that it works efficiently and accurately. When finding implied parameters, we combine the qMC algorithm with a Newton Method, for which we also provide an analysis of convergence properties.

Our qMC-algorithm is based on a Monte Carlo simulation algorithm suggested by Rydberg [13]. It is an alternative to the general Hlawka-Mück method for sampling non-uniform distributions, and we argue for its superiority in the sense of computational speed and simplicity. Our proposed sampling technique involves simulating three uniform variables based on low-discrepancy sequences, instead of performing a numerical integration to obtain the cumulative distribution function, as is the case for the Hlawka-Mück method.

References

[1] O. E. Barndorff-Nielsen. Processes of normal inverse Gaussian type. Finance and Stochastics, 2:41–68, 1998.
[2] E. Eberlein and U. Keller. Hyperbolic distributions in finance. Bernoulli, 1:281–299, 1995.
[3] H. Geman. Pure jump Lévy processes for asset price modelling. J. Banking Finance, 26:1257–1317, 2002. Special issue: Beyond VaR.
[4] P. Glasserman. Monte Carlo Methods in Financial Engineering. Springer-Verlag, New York, 2003.
[5] A. Griewank. A mathematical view of automatic differentiation. Acta Numer., 12:1–78, 2003.
[6] E. Hlawka and R. Mück. A transformation of equidistributed sequences. In S. K. Zaremba, editor, Applications of Number Theory to Numerical Analysis, pages 371–388. Academic Press, New York, 1972.
[7] R. F. Kainhofer. Quasi-Monte Carlo algorithms with applications in numerical analysis and finance. PhD thesis, Graz University of Technology, Austria, Apr. 2003.
[8] I. Karatzas and S. E. Shreve. Brownian Motion and Stochastic Calculus. Springer-Verlag, New York, 2nd edition, 1991.
[9] J. R. Michael, W. R. Schucany, and R. W. Haas. Generating random variates using transformations with multiple roots. Amer. Statist., 30:88–90, 1976.
[10] B. Moro. The full Monte. Risk, 8(2):57–58, Feb. 1995.
[11] A. Papageorgiou. Sufficient conditions for fast quasi-Monte Carlo convergence. J. Complexity, 19:332–351, 2003.
[12] S. Raible. Lévy Processes in Finance: Theory, Numerics, and Empirical Facts. PhD thesis, Albert-Ludwigs-Universität, Freiburg, 2000.
[13] T. H. Rydberg. The normal inverse Gaussian Lévy process: simulation and approximation. Comm. Statist. Stochastic Models, 13(4):887–910, 1997.
[14] I. M. Sobol. The distribution of points in a cube and the approximate evaluation of integrals. USSR Comput. Math. Math. Phys., 7(4):86–112, 1967.
[15] I. M. Sobol. Multidimensional Quadrature Formulas and Haar Functions. Izdat. “Nauka”, Moscow, 1969.

II

The minimal entropy martingale measure and numerical option pricing for the Barndorff-Nielsen - Shephard stochastic volatility model

Fred Espen Benth and Martin Groth

Submitted

Abstract. We develop and apply a numerical scheme for pricing options in the stochastic volatility model proposed by Barndorff-Nielsen and Shephard.
This non-Gaussian Ornstein-Uhlenbeck type of volatility model gives rise to an incomplete market, and we consider the option prices under the minimal entropy martingale measure. To numerically price options with respect to this risk neutral measure, one needs to consider a Black & Scholes type of partial differential equation, with an integro-term arising from the volatility process. We suggest finite difference schemes to solve this parabolic integro-partial differential equation, and derive appropriate boundary conditions for the finite difference method. As an application of our algorithm, we consider price deviations from the Black & Scholes formula for call options, and the implications of the stochastic volatility on the shape of the volatility smile.

1. Introduction

Barndorff-Nielsen and Shephard proposed in [6] to model the price dynamics of financial assets as a geometric Brownian motion where the (squared) volatility process follows a non-Gaussian Ornstein-Uhlenbeck (OU) process. This stochastic dynamics gives rise to an incomplete financial market, where there exists a continuum of risk-neutral probabilities for arbitrage-free valuation of options. Nicolato and Venardos [15] have applied structure preserving martingale measures to price European options in terms of Laplace transforms, suitable for numerical inversion techniques. In the present paper we study the problem of pricing European options under the minimal entropy martingale measure (MEMM), and propose a numerical method for solving the associated parabolic integro-partial differential equation.

The BNS-model assumes that the squared volatility is given as an Ornstein-Uhlenbeck process reverting to zero, with the stochastic innovations given by a subordinator process. This modeling perspective has the advantage of capturing both the heavy tails and the dependency structure observed in financial return data.
Furthermore, it allows for an easy way to achieve this empirically by separating the modelling of the return distribution and the autocorrelation function of the returns. The reader is directed to [6] for more details.

Based on utility indifference pricing, it is known that for an issuer of an option having exponential risk preferences, the lowest acceptable price will be given by the discounted expected payoff with respect to the MEMM. Mathematically, we reach this price as the indifference price when the risk aversion tends to zero. On the other hand, this price is the highest acceptable price for the buyer, which gives us a rationale for choosing a pricing measure in the family of all equivalent martingale measures.

Date: 3 October 2005.
Key words and phrases. Integro-partial differential equation, Lax-Wendroff scheme, Stochastic volatility, Lévy process, minimal entropy martingale measure.

In Benth and Meyer-Brandis [7] the density process for the MEMM is derived for the BNS-model, together with the associated parabolic integro-partial differential equation giving the price dynamics of options. A crucial ingredient is a function which rescales the Lévy jump measure of the subordinator process for the volatility dynamics under the MEMM, turning the Lévy dynamics into a state dependent Markov jump process.

The option price dynamics satisfies a parabolic integro-partial differential equation (integro-PDE, for short) which consists of a standard Black & Scholes operator together with a non-local integral operator. Discretizing this equation using finite differences leads to the problem of finding suitable boundary conditions on the finite solution domain. Given the detailed description of the state dynamics of the price process under the MEMM, we are able to derive asymptotics for the option price which yield boundary conditions for the numerical algorithm, arbitrarily far out from the solution domain.
This enables us to consider the integral term even outside the solution domain, a convenient feature when considering the integral part of the problem. We suggest using operator splitting on the two-dimensional problem and derive finite difference schemes, namely Lax-Wendroff schemes. We treat the integral part as a source term and use a simple trapezoidal rule to evaluate the integral numerically.

Approaching the problem of pricing options by solving the associated integro-PDE allows for a simple way to consider sensitivity measures like the delta or the gamma of the option by numerical differentiation. Other methods, like inversion of Laplace transforms and Monte Carlo methods, provide us with a price only for specified values of the volatility and the underlying asset.

Using our numerical algorithm, we analyze the price difference between the Black & Scholes formula and the MEMM option price. The BNS-model is assumed to be driven by an inverse Gaussian subordinator, leading to normal inverse Gaussian distributed returns, and we collect parameter estimates from Nicolato and Venardos [15]. It turns out that the difference depends crucially on the moneyness of the option, and that the Black & Scholes price can be both greater and less than the MEMM option price. For options far out-of-the-money or far in-the-money the difference is negligible, while it is crucial for options close to at-the-money. A further analysis reveals that pricing options under the MEMM produces a volatility smile.

The paper is organized as follows: In the next section we recall some background on the model and the minimal entropy martingale measure. Section three concerns the boundary conditions of the finite solution domain of the linked system of PDEs. Finite difference schemes are proposed in Section four and finally in Section five we apply the numerical solver to some test problems and discuss the results.

2. An integro-Black & Scholes PDE for the MEMM price

Consider a market consisting of two assets, a bond and a risky asset, with price processes denoted $R(t)$ and $S(t)$, respectively. We assume that the bond yields a risk-free rate of $r$, and thus has the standard price dynamics
\[
dR(t) = r R(t)\, dt ,
\]
with initial value $R(0) = 1$. The risky asset evolves according to the stochastic volatility model proposed by Barndorff-Nielsen and Shephard [6], where the squared volatility is given by a non-Gaussian Ornstein-Uhlenbeck process:
\[
dS(t) = (\mu + \beta Y(t))\, S(t)\, dt + \sqrt{Y(t)}\, S(t)\, dB(t), \qquad S(0) = s > 0 \tag{2.1}
\]
\[
dY(t) = -\lambda Y(t)\, dt + dL(\lambda t), \qquad Y(0) = y > 0, \tag{2.2}
\]
where $B(t)$ is a Brownian motion and $L(t)$ is a pure-jump subordinator (that is, an increasing pure-jump Lévy process with no drift). We let $\{\mathcal{F}_t\}_{t \ge 0}$ be the completion of the filtration $\sigma(B(s), L(\lambda s);\, s \le t)$ generated by the Brownian motion and the subordinator such that $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ becomes a complete filtered probability space. The Lévy measure of the subordinator is denoted $\ell(dz)$, and satisfies by definition $\int_0^\infty \min(1, z)\, \ell(dz) < \infty$. We impose a stronger exponential integrability condition on the Lévy measure, given by
\[
\int_1^\infty \{ e^{cz} - 1 \}\, \ell(dz) < \infty , \tag{2.3}
\]
for the constant
\[
c = \frac{\beta^2}{\lambda} \big( 1 - e^{-\lambda T} \big) .
\]
Note that under this integrability condition, the moment generating function of $L(1)$ exists for all $|\theta| \le c$, being defined as
\[
E[\exp(\theta L(1))] = \exp(\phi(\theta)) \quad \text{where} \quad \phi(\theta) = \int_0^\infty \{ e^{\theta z} - 1 \}\, \ell(dz) .
\]

Benth and Meyer-Brandis [7] derived the density process of the MEMM for the stochastic volatility model described above under the exponential integrability condition (2.3). We now recall some results from their paper which will be useful in our context, and note that these results have been extended by Rheinländer and Steiger [16] to the BNS-model with leverage.
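To make the variance dynamics (2.2) concrete, here is a minimal simulation sketch. It swaps in a compound Poisson subordinator with exponentially distributed jumps, which is an assumption made purely for simplicity and is not the inverse Gaussian subordinator used in the paper's numerical section; the function name and the jump parameters are hypothetical. For a finite-activity driver the update over each step is exact: the old level decays by $e^{-\lambda \Delta t}$ and each jump decays from its arrival time to the end of the step.

```python
import numpy as np

def simulate_variance(y0, lam, jump_intensity, mean_jump, horizon, n_steps, rng):
    """Simulate dY = -lam*Y dt + dL(lam*t) for a compound Poisson L (assumption)."""
    dt = horizon / n_steps
    path = np.empty(n_steps + 1)
    path[0] = y0
    for k in range(n_steps):
        # L(lam*t) jumps at rate lam*jump_intensity in calendar time.
        n_jumps = rng.poisson(lam * jump_intensity * dt)
        # Time left between each jump and the end of the step.
        time_to_step_end = rng.uniform(0.0, dt, size=n_jumps)
        sizes = rng.exponential(mean_jump, size=n_jumps)
        path[k + 1] = (path[k] * np.exp(-lam * dt)
                       + np.sum(sizes * np.exp(-lam * time_to_step_end)))
    return path

rng = np.random.default_rng(0)
# lam is the value quoted later in the text; the jump parameters are made up.
path = simulate_variance(0.01, 2.4958, 5.0, 0.002, 1.0, 250, rng)
no_jumps = simulate_variance(0.01, 2.4958, 0.0, 0.002, 1.0, 250, rng)
```

Since the jumps are non-negative, the simulated variance stays strictly positive, and without jumps it reduces to the deterministic decay $y e^{-\lambda t}$.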
If we let $Q^{ME}$ denote the MEMM, the density process $Z(t)$ can be represented as
\[
Z(t) := Z^B(t)\, Z^L(t)
\]
where $Z^B(t)$ and $Z^L(t)$ are defined as the stochastic exponentials
\[
Z^B(t) = \exp\left( -\int_0^t \frac{\mu + \beta Y(s)}{\sqrt{Y(s)}}\, dB(s) - \frac{1}{2} \int_0^t \frac{(\mu + \beta Y(s))^2}{Y(s)}\, ds \right),
\]
\[
Z^L(t) = \exp\left( \int_0^t \int_0^\infty \ln \delta(Y(s), z, s)\, N(dz, ds) + \int_0^t \int_0^\infty \big(1 - \delta(Y(s), z, s)\big)\, \ell(dz)\, ds \right).
\]
Here, $N(dz, dt)$ is the Poisson random measure of $L$ and the function $\delta(y, z, t)$ is defined as
\[
\delta(y, z, t) := \frac{H(t, y + z)}{H(t, y)}
\]
where
\[
H(t, y) = E\left[ \exp\left( -\frac{1}{2} \int_t^T \left( \frac{\mu^2}{Y(s)} + 2\mu\beta + \beta^2 Y(s) \right) ds \right) \,\Big|\, Y(t) = y \right], \tag{2.4}
\]
for $(t, y) \in [0, T] \times \mathbb{R}_+$. This function will play a key role in the derivation of MEMM prices for claims, since it gives the jump characteristics of the subordinator under $Q^{ME}$. In fact, the dynamics of the processes $S(t)$ and $Y(t)$ under $Q^{ME}$ are given by
\[
dS(t) = \sqrt{Y(t)}\, S(t)\, d\hat{B}(t),
\]
\[
dY(t) = -\lambda Y(t)\, dt + d\hat{L}(\lambda t),
\]
where $\hat{B}(t)$ is a Brownian motion. The subordinator is transformed to a pure jump Markov process $\hat{L}(t)$, having jump measure
\[
\hat{\ell}(\omega, dz, dt) = \frac{H(t, Y(t, \omega) + z)}{H(t, Y(t, \omega))}\, \ell(dz)\, dt.
\]
We observe that the function $H(t, y)$ rescales the jumps of the subordinator process. Moreover, the jump measure becomes time-inhomogeneous and state-dependent, thus $\hat{L}$ is not even an independent increment (or Sato) process under the MEMM, except for the case $\mu = 0$. We find that (2.4) is the Feynman-Kac representation of the integro-PDE
\[
\partial_t H(t, y) - \frac{1}{2} \left( \frac{\mu^2}{y} + 2\mu\beta + \beta^2 y \right) H(t, y) + \mathcal{L}_Y H(t, y) = 0 , \quad (t, y) \in [0, T) \times \mathbb{R}_+ \tag{2.5}
\]
with
\[
\mathcal{L}_Y H(t, y) = -\lambda y\, \partial_y H(t, y) + \lambda \int_0^\infty \{ H(t, y + z) - H(t, y) \}\, \ell(dz) , \tag{2.6}
\]
and terminal data $H(T, y) = 1$, $y \in \mathbb{R}_+$. We have used the notation $\partial_x$ for partial differentiation with respect to the argument $x$ of a function.

In general, it is hard to derive an explicit expression for the expectation in (2.4) defining the function $H$. However, for the special case when $\mu = 0$ we can derive a solution, as proved in Benth and Meyer-Brandis [7].
Since we will need this later, we include the result here:

Lemma 2.1. Assume $\mu = 0$. Then it holds that
\[
H(t, y) = \exp\big( b(t)\, y + c(t) \big),
\]
where $b(t)$ and $c(t)$ are defined as
\[
b(t) = -\frac{\beta^2}{2\lambda} \big( 1 - \exp(-\lambda (T - t)) \big), \qquad c(t) = \lambda \int_t^T \phi(b(u))\, du . \tag{2.7}
\]

We proceed further to discuss the price of claims under the MEMM. Consider a contingent claim of European type with payoff $f(S(T))$ at the exercise time $T$. We suppose that $f$ is of linear growth in order to assure integrability under the MEMM. Let $\Lambda(t, y, s)$ denote the minimal entropy price of the contingent claim at time $t$ conditioned on $S(t) = s$ and $Y(t) = y$,
\[
\Lambda(t, y, s) = e^{-r(T - t)}\, E_{Q^{ME}}\big[ f(S(T)) \mid Y(t) = y,\ S(t) = s \big] .
\]
Since $S(t)$ is a martingale with respect to $Q^{ME}$, we easily see that the above price is well-defined due to the linear growth of $f$. We may rewrite the price as
\[
\Lambda(t, y, s) = e^{-r(T - t)}\, E\big[ f(\hat{S}(T)) \mid Y(t) = y,\ S(t) = s \big] ,
\]
with $\hat{S}$ following the $Q^{ME}$-dynamics given above, which is the Feynman-Kac representation of the following Black & Scholes integral equation
\[
\partial_t \Lambda(t, y, s) + r s\, \partial_s \Lambda(t, y, s) + \frac{1}{2} y s^2\, \partial_{ss} \Lambda(t, y, s) + \mathcal{L}^{MEMM}_Y \Lambda(t, y, s) = r \Lambda(t, y, s) , \tag{2.8}
\]
with $(t, y, s) \in [0, T) \times \mathbb{R}_+^2$,
\[
\mathcal{L}^{MEMM}_Y \Lambda(t, y, s) = -\lambda y\, \partial_y \Lambda(t, y, s) + \lambda \int_0^\infty \big( \Lambda(t, y + z, s) - \Lambda(t, y, s) \big) \frac{H(t, y + z)}{H(t, y)}\, \ell(dz) ,
\]
and terminal condition $\Lambda(T, y, s) = f(s)$, $(y, s) \in \mathbb{R}_+ \times \mathbb{R}_+$. We shall approach the calculation of option prices under the MEMM by solving numerically the integro-PDE above. We remark that to solve for $\Lambda$, knowledge of $H$ is required, which also solves an integro-PDE. Thus, we must consider a coupled system of two integro-PDEs when calculating the option prices for the BNS-model under the MEMM.

3. Boundary conditions for the integro-PDEs on the solution domains

The coupled system of integro-PDEs (2.5) and (2.8) is defined on the positive half plane for both $s$ and $y$.
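The explicit solution of Lemma 2.1 is straightforward to evaluate numerically, as the following sketch suggests. It assumes the inverse Gaussian subordinator of the paper's numerical section and uses the closed form $\phi(\theta) = \delta\theta / \sqrt{\gamma^2 - 2\theta}$ for its cumulant; this closed form is a standard result for the IG-OU background driver but is not quoted in the text, so it should be read as an assumption. The function names and the trapezoidal evaluation of $c(t)$ are likewise illustrative choices.

```python
import numpy as np

def phi_ig(theta, delta, gamma):
    """Assumed cumulant of L(1) for the IG driver, valid for theta < gamma**2 / 2."""
    return delta * theta / np.sqrt(gamma**2 - 2.0 * theta)

def b(t, T, beta, lam):
    """b(t) from (2.7)."""
    return -beta**2 / (2.0 * lam) * (1.0 - np.exp(-lam * (T - t)))

def c(t, T, beta, lam, delta, gamma, n=400):
    """c(t) = lam * int_t^T phi(b(u)) du, by the trapezoidal rule on n intervals."""
    u = np.linspace(t, T, n + 1)
    f = phi_ig(b(u, T, beta, lam), delta, gamma)
    du = (T - t) / n
    return lam * du * (np.sum(f) - 0.5 * (f[0] + f[-1]))

def H_mu_zero(t, y, T, beta, lam, delta, gamma):
    """H(t, y) = exp(b(t) y + c(t)) for the case mu = 0 (Lemma 2.1)."""
    return np.exp(b(t, T, beta, lam) * y + c(t, T, beta, lam, delta, gamma))

# Parameter values quoted in the paper's numerical section; T = 1 is hypothetical.
params = dict(T=1.0, beta=0.5, lam=2.4958, delta=0.0872, gamma=11.98)
h_terminal = H_mu_zero(1.0, 0.5, **params)
h_small_y = H_mu_zero(0.0, 0.1, **params)
h_large_y = H_mu_zero(0.0, 0.2, **params)
```

Since $b(t) \le 0$, the argument of $\phi$ is never positive, so the square root is always well defined, and $H$ decreases in $y$ from the terminal value 1.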
Applying a finite difference method to solve this system numerically requires that we constrain the problem to a finite solution domain, where we must impose conditions on the solution along the boundary of the domain. Furthermore, the integral terms in both PDEs will naturally extend beyond any finite truncation of the $y$-direction, and we need to find conditions which enable us to analyze the integral also outside the solution domain. In this section we derive the necessary boundary conditions required to use finite difference methods to find $\Lambda$.

3.1. Boundary conditions for H. We start by deriving some asymptotic results for the function $H(t, y)$ when $y$ becomes large and small. These results will give us the correct boundary conditions when truncating the solution domain in the spatial dimension $y$.

Lemma 3.1. It holds that
\[
\lim_{y \to \infty} | H(t, y; \mu) - H(t, y; 0) | = 0 ,
\]
where the notation $H(t, y; \mu)$ is introduced in order to emphasize the dependency on $\mu$ in (2.4).

Proof. When $y \to \infty$, we see that $\mu^2 / Y^{t,y} \to 0$ a.s. The result holds by dominated convergence.

From Lemma 3.1 we see that for large values of $y$, we have that $H(t, y; \mu) \approx H(t, y; 0)$, and an explicit representation for $H(t, y; 0)$ is given in Lemma 2.1. Thus, after truncating the solution domain in $y$ to the interval $[0, y_{\max}]$, we impose the condition $H(t, y; \mu) = H(t, y; 0)$ for $y \ge y_{\max}$ in the numerical approximation procedure. Observe that the asymptotics in Lemma 3.1 also give us a condition on the integrand in (2.6) whenever the argument $y + z$ is outside the solution domain $[0, y_{\max}]$. The following holds for the case $y = 0$:

Lemma 3.2. Suppose $\mu \neq 0$. Then $H(t, 0) = 0$.

Proof. We have that
\[
H(t, y) \le c\, E\left[ \exp\left( -\frac{1}{2} \int_0^{T - t} \frac{\mu^2}{Y^y(s)}\, ds \right) \right],
\]
for some positive constant $c$, where
\[
Y^y(s) = y e^{-\lambda s} + e^{-\lambda s} \int_0^s e^{\lambda u}\, dL(\lambda u) .
\]
Letting $y \downarrow 0$, we see that
\[
0 \le \lim_{y \downarrow 0} H(t, y) \le c\, E\left[ \exp\left( -\frac{1}{2} \int_0^{T - t} \frac{\mu^2}{Y^0(s)}\, ds \right) \right],
\]
if the limit exists.
We prove that the right-hand side of this expression is equal to zero, from which we can conclude the claim. This is shown by demonstrating that the integral with respect to $s$ inside the exponential diverges to infinity. The singularity occurs at the lower integration limit. For $\epsilon > 0$ sufficiently small we have that for $s \le \epsilon$,
\[
Y^0(s) = \int_0^s e^{-\lambda (s - u)}\, dL(\lambda u) \approx L(\lambda s) \quad \text{a.s.}
\]
Furthermore, from Prop. 8, p. 84 in Bertoin [8], we know that for a subordinator $L$ it holds that $\lim_{t \downarrow 0} t^{-1} L(t) = d$, a.s., where $d$ is the drift of $L$. Thus, for $s \le \epsilon$, it holds approximately
\[
\frac{1}{Y^0(s)} = \frac{s^{-1}}{s^{-1} Y^0(s)} \approx \frac{s^{-1}}{d \lambda} ,
\]
which is singular when integrating from zero. Thus, the lemma holds.

In our numerical calculations, we impose the boundary condition $H(t, 0) = 0$ for $t \in [0, T)$ when $\mu \neq 0$. Note that for $\mu = 0$, we find from Lemma 2.1 that
\[
H(t, 0) = \exp\left( \lambda \int_t^T \phi(b(u))\, du \right),
\]
which is not equal to zero. Hence, the two cases $\mu = 0$ and $\mu \neq 0$ lead to completely different boundary conditions. In most practical situations, $\mu \neq 0$, and this is also the case we shall focus on when applying our numerical solution algorithm in the next section.

3.2. Boundary conditions for Λ. The domain of the integro-PDE (2.8) is $(t, y, s) \in [0, T) \times \mathbb{R}_+^2$. Introducing a finite difference approximation, we shall consider the truncated domain $(t, y, s) \in [0, T) \times [0, y_{\max}] \times [0, s_{\max}]$, which requires conditions on the solution $\Lambda$ at the boundaries $s = 0$, $s = s_{\max}$, $y = 0$ and $y = y_{\max}$. Observe that when $S(t) = s = 0$, we have $S(u) = 0$ for all $u \in [t, T)$. Hence, we find that
\[
\Lambda(t, y, 0) = e^{-r(T - t)} f(0) ,
\]
which we use as a boundary condition for $s = 0$ in the integro-PDE (2.8).

Let us consider the boundaries $s = s_{\max}$ and $y = y_{\max}$, where the stock price and/or the volatility is large. It turns out that the MEMM price $\Lambda$ behaves like a Black & Scholes price with time dependent volatility when the volatility becomes large.
We know from the Black-Scholes framework that as the volatility tends to infinity and $S \to \infty$, the price of a European call option will converge to the stock price. Similarly, for a European put option the price of the option will approach the strike price. See Lewis [14] for a discussion of large volatility asymptotics for stochastic volatility models where the volatility is driven by a Brownian motion. For large stock prices, the asymptotics of $\Lambda$ are the same as those we would get for constant volatility, that is, the Black & Scholes model. We give the details in the following Lemma and Proposition. First, let us prove the following lemma.

Lemma 3.3. It holds that
\[
\lim_{y \to \infty} \frac{\int_0^T Y^y(t)\, dt}{\int_0^T y e^{-\lambda t}\, dt} = 1 , \quad \text{a.s.} \tag{3.1}
\]

Proof. Note that
\[
\int_0^T Y^y(t)\, dt = \int_0^T y e^{-\lambda t}\, dt + \int_0^T e^{-\lambda t} \int_0^t e^{\lambda s}\, dL(\lambda s)\, dt ,
\]
and since $\int_0^t e^{\lambda s}\, dL(\lambda s) \ge 0$, we find that $\lim_{y \to \infty} \int_0^T Y^y(t)\, dt = \infty$, a.s. Moreover, since
\[
\lim_{y \to \infty} \frac{H(t, y + z)}{H(t, y)} = e^{b(t) z}
\]
(with $b(t)$ as in equation (2.7)), we have
\[
\lim_{y \to \infty} \hat{\ell}(\omega, dz, dt) = e^{b(t) z}\, \ell(dz)\, dt , \quad \text{a.s.}
\]
and the jump measure of $\hat{L}$ under $Q^{ME}$ converges to the jump measure of a pure jump independent increment process, denoted by $\tilde{L}$. It therefore holds that $\lim_{y \to \infty} \hat{L}(t) = \tilde{L}(t)$, a.s. Thus, by dominated convergence we find
\[
\frac{1}{y} \int_0^t e^{\lambda s}\, d\hat{L}(\lambda s) \le \frac{1}{y} e^{\lambda t} \hat{L}(\lambda t) \to 0 , \quad \text{a.s.} ,
\]
when $y \to \infty$. Hence, we conclude that
\[
\frac{\int_0^T Y^y(t)\, dt}{\int_0^T y e^{-\lambda t}\, dt} = 1 + \frac{\int_0^T e^{-\lambda t}\, \frac{1}{y} \int_0^t e^{\lambda s}\, d\hat{L}(\lambda s)\, dt}{\int_0^T e^{-\lambda s}\, ds} \to 1 \quad \text{a.s.} ,
\]
and the Lemma follows.

We find the following asymptotics for $\Lambda$ when $y \to \infty$:

Proposition 3.4. We have
\[
\lim_{y \to \infty} \frac{\Lambda(t, y, s)}{\Lambda^{BS}(t, s; \sigma^2_{t,T}(y))} = 1
\]
where $\Lambda^{BS}(t, s; \sigma^2)$ is the Black & Scholes price for an option with payoff function $f$ written on an underlying having volatility $\sigma$, and
\[
\sigma^2_{t,T}(y) = \frac{y}{\lambda (T - t)} \big( 1 - e^{-\lambda (T - t)} \big) .
\]

Proof. The jump process $\hat{L}$ and the Brownian motion $\hat{B}$ are independent under $Q^{ME}$, and we can express the option price as an integral with respect to the density of the integrated variance as follows (see Hull and White [11] and Nicolato and Venardos [15]):
\[
\Lambda(t, y, s) = \int_0^\infty \Lambda^{BS}\big(t, s; x / (T - t)\big)\, q_{ME}(x)\, dx ,
\]
where $q_{ME}$ is the density of $\int_t^T Y^{t,y}(s)\, ds$ under the MEMM. Rewriting this in terms of the density of
\[
\frac{\int_t^T Y^{t,y}(s)\, ds}{\int_t^T y \exp(-\lambda (s - t))\, ds} ,
\]
which we denote by $\tilde{q}_{ME}$, we get
\[
\Lambda(t, y, s) = \int_0^\infty \Lambda^{BS}\big(t, s; x \sigma^2_{t,T}\big)\, \tilde{q}_{ME}(x)\, dx .
\]
Observe that by Lemma 3.3 we have that $\tilde{q}_{ME}(x)\, dx \to \delta_1(dx)$ when $y \to \infty$, where $\delta_1$ is the Dirac measure concentrated at 1. Hence, we find
\[
\frac{\Lambda(t, y, s)}{\Lambda^{BS}(t, s; \sigma^2_{t,T})} = \int_0^\infty \frac{\Lambda^{BS}(t, s; x \sigma^2_{t,T})}{\Lambda^{BS}(t, s; \sigma^2_{t,T})}\, \tilde{q}_{ME}(x)\, dx
= \int_0^\infty \frac{\Lambda^{BS}(t, s; x \sigma^2_{t,T})}{\Lambda^{BS}(t, s; \sigma^2_{t,T})}\, \{ \tilde{q}_{ME}(x)\, dx - \delta_1(dx) \} + 1 .
\]
The first integral term converges to zero when $y \to \infty$ since the ratio
\[
\frac{\Lambda^{BS}(t, s; \sigma^2_{t,T}\, x)}{\Lambda^{BS}(t, s; \sigma^2_{t,T})}
\]
can be bounded and the signed measure $\tilde{q}_{ME}\, dx - \delta_1(dx)$ tends to zero. Hence, the proposition follows.

The proposition above yields that for large values of $y$, $\Lambda$ is given by the Black & Scholes price when the underlying asset has a time dependent volatility given by $\sqrt{y} \exp(-\lambda t / 2)$. For example, if we consider a call option, this price can be explicitly calculated, as stated in the next corollary:

Corollary 3.5. Assume $f(x) = (x - K)^+$. Then $\Lambda^{BS}$ defined in Prop. 3.4 is given by the Black & Scholes pricing formula for a call option at time $t$ written on an asset with price $s$ and volatility $\sigma_{t,T}(y)$.

The knowledge of the asymptotic behaviour of $\Lambda$ in $y$ permits us to consider the integral term in the integro-PDE (2.8) also for values of $y$ outside of the solution domain. Hence, we do not need to truncate the integral term in any unnatural way when we are close to the boundary of the solution domain in the $y$-direction. The next Proposition states the asymptotic behaviour when $s \to \infty$:

Proposition 3.6. Suppose that $f(s)/s \to c$ for some constant $c$ when $s \to \infty$.
Then
\[
\lim_{s \to \infty} \frac{\Lambda(t, y, s)}{s} = c .
\]

Proof. We have by dominated convergence
\[
\frac{\Lambda(t, y, s)}{s} = E_{Q^{ME}}\left[ \frac{f(S^{t,s,y}(T))}{S^{t,s,y}(T)}\, S^{t,1,y}(T) \right] \to c\, E_{Q^{ME}}\big[ S^{t,1,y}(T) \big] = c ,
\]
when $s \to \infty$.

We now finish our study of the boundary behaviour of the solution $\Lambda$ by analyzing the case $y = 0$. First we note that the process $Y$ is not defined for the initial state $y = 0$, since the jump measure explodes. Indeed, what we observe from a heuristic point of view is that the closer we are to $y = 0$, the greater the ratio $H(t, y + z)/H(t, y)$ becomes, and thus the stronger the process will be pushed away from this state. A reflecting boundary at $y = 0$ would imply a no-flow condition on $\Lambda$ at this boundary, i.e. the Neumann condition that the derivative of $\Lambda$ in the direction of $y$ vanishes at $y = 0$. To gain further understanding of the boundary behaviour of $\Lambda$ at $y = 0$, consider the following heuristic argument: Suppose that $\Lambda$ is analytic in $y$, having a series expansion
\[
\Lambda(t, y + z, s) - \Lambda(t, y, s) = \sum_{n=1}^{\infty} \frac{\partial_y^n \Lambda(t, y, s)}{n!}\, z^n .
\]
Inserting this into the integro-PDE (2.8), we see that the integral part will be a sum of terms like
\[
\frac{\partial_y^n \Lambda(t, y, s)}{n!\, H(t, y)} \int_0^\infty z^n H(t, y + z)\, \ell(dz) , \quad n \ge 1 .
\]
Since $H(t, y) \to 0$ when $y \downarrow 0$, we must have that $\partial_y^n \Lambda(t, y, s) \to 0$, otherwise the integral terms will diverge. Hence, all the derivatives of $\Lambda$ should vanish at the boundary $y = 0$, which shows the strong reflection at $y = 0$ of the volatility process under the MEMM. When considering the numerical solution, we impose the condition that the derivatives up to a certain order vanish at the boundary, the simplest choice being a Neumann condition at the boundary $y = 0$, that is,
\[
\partial_y \Lambda(t, 0, s) = 0 .
\]
Such a choice may be defended by the work of Barles et al. [3, 4, 5], who have analyzed the sensitivity of boundary conditions related to finance problems. They found that artificial boundary conditions have negligible impact on the solution outside a boundary layer.
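The boundary value at $y = y_{\max}$ suggested by Proposition 3.4 and Corollary 3.5 can be sketched for a call option as below. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the normal distribution function is built from `math.erf`, and the formula requires $s > 0$ and $T - t > 0$.

```python
import math

def effective_variance(y, lam, tau):
    """sigma^2_{t,T}(y) = y (1 - exp(-lam*tau)) / (lam*tau), with tau = T - t > 0."""
    return y * (1.0 - math.exp(-lam * tau)) / (lam * tau)

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, strike, r, variance, tau):
    """Black & Scholes call price with constant variance over the remaining time tau."""
    total_sd = math.sqrt(variance * tau)
    d1 = (math.log(s / strike) + r * tau + 0.5 * variance * tau) / total_sd
    d2 = d1 - total_sd
    return s * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d2)

def call_boundary_value_large_y(s, y_max, strike, r, lam, tau):
    """Approximate Lambda(t, y_max, s) by the Black & Scholes price of Prop. 3.4."""
    return bs_call(s, strike, r, effective_variance(y_max, lam, tau), tau)

# K = 200, r = 0 and lam = 2.4958 are quoted in the text; y_max = 1 and tau = 1 are examples.
boundary_value = call_boundary_value_large_y(200.0, 1.0, 200.0, 0.0, 2.4958, 1.0)
```

Because the variance process reverts to zero between jumps, the effective variance is always below the starting level $y$ and approaches it as $T - t \to 0$.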
This means that even wrongly stated conditions may be smoothed out when moving into the solution domain of interest. In our case we have weakened the strong analytical condition, but believe that the true level of volatility is sufficiently far away from $y = 0$ that the impact is relatively small. Note also that we do not have exact information about the solution for $s$ and $y$ being large, requiring similar considerations to defend the appropriateness of the numerical boundary conditions.

Based on the derived boundary conditions for the two integro-PDE problems, we now move on to develop a finite difference scheme appropriate for our equations.

4. Derivation of finite difference schemes

In order to calculate the option price we need to solve (2.8) using a numerical method. Applying the finite difference method we derive numerical schemes, which involves truncating the infinite solution domain and solving the problem on an appropriate grid. We also have to numerically approximate the integral involved, possibly with a Lévy measure that has a singularity at zero. Let us first concentrate on the case with $r = 0$.

The function $H(t, y)$ appears as a measure change in the integral part of the integro-PDE for the option price. Hence, we need to solve (2.5) first, in order to arrive at the correct option price. The non-local integral term in (2.5) needs to be numerically approximated with the information available. Since we only know the value of $H(t, y)$ at the grid points we use a trapezoidal integration scheme and treat the integral as a fully explicit source term. However, if we use only the points in the grid we get fewer points to integrate over as we get closer to the boundary $y = y_{\max}$. The approximation of the integral would then be less accurate for large $y$, which is an undesirable feature. By adding extra points in $y$ and assigning the explicit solution beyond $y_{\max}$, in accordance with Lemma 3.1, we can make sure we get a coherent treatment of the integral.
If the number of extra points $n$ is large enough, the decay of the measure will make sure we capture the influence from the integral. It will then be unnecessary to integrate over more than $n$ points anywhere. Reducing the number of integration points this way gives a clear speed-up.

To solve the integro-PDE (2.5) we derive an implicit Lax-Wendroff scheme
\[
\frac{R}{2} \left( -1 + \frac{\lambda \Delta\tau}{2} + g \Delta\tau - R \right) H^{n+1}_{k-1}
+ \left( 1 + R^2 - \frac{g^2 \Delta\tau^2}{2} + g \Delta\tau - \frac{a g_y \Delta\tau^2}{2} \right) H^{n+1}_{k}
+ \frac{R}{2} \left( 1 - \frac{\lambda \Delta\tau}{2} - g \Delta\tau - R \right) H^{n+1}_{k+1}
= H^n_k + F^n_k
\]
where $R = \lambda y_k \Delta\tau / \Delta y$, $a = \lambda y$ and $F^n_k$ is the integral term. Furthermore, $g$ is the function
\[
g(y) = \frac{1}{2} \left( \frac{\mu^2}{y} + 2\mu\beta + \beta^2 y \right)
\]
and $g_y$ the derivative of this function with respect to $y$.

We now turn our attention to equation (2.8). Since option pricing is a problem in two spatial dimensions we use Godunov dimensional splitting [10] and, following Strang [17], we approximate the exact solution operator by successive use of one-dimensional operations, i.e.
\[
S(T) \Lambda_0 \approx \left[ S^s\!\left( \tfrac{\Delta t}{2} \right) S^y(\Delta t)\, S^s\!\left( \tfrac{\Delta t}{2} \right) \right]^n \Lambda_0 .
\]
Here $S(T)$ is the exact solution operator of (2.8), approximated by one-dimensional operators, $S^s(t)$ and $S^y(t)$, and we iterate over $n$ time steps. Since we treat the integral as a non-homogeneous term and we integrate over $y$, it seems natural to include the integral operator $\mathcal{L}^{MEMM}_Y$ in $S^y$. More information about dimensional splitting for conservation laws can be found in Kröner [13].

The modeling of the volatility will in most situations result in an infinite activity Lévy process, having a singularity in the jump measure at zero. We run into numerical difficulties if we try to numerically integrate from zero in such cases. Hence, we need to start the integration at the first grid point larger than zero. Some of the pure-jump Lévy processes we want to consider are dominated by the small jumps, and a cut-off of the integral close to zero may lead to a loss of significant parts of the integral.
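The treatment of the integral in (2.6) as an explicit source term can be sketched as follows: a trapezoidal rule that starts at the first grid point above zero and uses the same number of nodes at every grid point, with the values beyond $y_{\max}$ supplied by the caller (in the setting of the text these would be the explicit solution from Lemma 2.1). The function name, its signature and the finite test density below are hypothetical choices for illustration only.

```python
import numpy as np

def integral_source(H_ext, levy_density, dy, n_extra, lam):
    """lam * int (H(y+z) - H(y)) l(dz) by the trapezoidal rule on z = dy, ..., n_extra*dy.

    H_ext holds H on the solution grid followed by n_extra extension points,
    so that every grid point integrates over the same number of nodes.
    """
    n_grid = len(H_ext) - n_extra
    z = dy * np.arange(1, n_extra + 1)      # first node is the first grid point > 0
    weights = np.full(n_extra, dy)
    weights[0] = weights[-1] = 0.5 * dy     # trapezoidal end weights
    w = weights * levy_density(z)
    source = np.empty(n_grid)
    for k in range(n_grid):
        increments = H_ext[k + 1:k + 1 + n_extra] - H_ext[k]
        source[k] = lam * np.sum(w * increments)
    return source

# Check on a linear H with the finite (hypothetical) density exp(-z).
dy, n_extra, n_grid, lam = 0.01, 1000, 50, 2.4958
H_linear = dy * np.arange(n_grid + n_extra)
source = integral_source(H_linear, lambda z: np.exp(-z), dy, n_extra, lam)
```

For a density with a singularity at zero, such as the inverse Gaussian case, the part of the integral below the first node is not covered by this rule and has to be handled separately.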
To make up for this we approximate part of the integral term by a drift in the integro-PDE for the price: letting $\epsilon$ be the first grid point larger than zero, we use the approximation
\[
\int_0^{\epsilon}\big(\Lambda(t,y+z,s)-\Lambda(t,y,s)\big)\frac{H(t,y+z)}{H(t,y)}\,\ell(dz)\approx\xi(t,y)\,\Lambda_y(t,y,s),
\]
where
\[
\xi(t,y)=\int_0^{\epsilon}z\,\frac{H(t,y+z)}{H(t,y)}\,\ell(dz).
\]
The Lévy measure integrates $z$ close to zero, thus the integral makes sense. However, we need to calculate $\xi$ numerically, which we now discuss. Since we only have knowledge of the integrand at the grid points, there are only the two end points available for numerical integration of $\xi$. To work around this problem we assume that $H(t,y)$ is close to linear between two grid points. Then we can use linear interpolation between the points of the grid and evaluate the integrand at an arbitrary number of points. However, we still need to avoid zero because of the singularity.

If we include the terms introduced by the risk-free rate of return $r>0$ in the $S^s$-operator, we get the Black & Scholes PDE with Dirichlet boundary conditions. Using a transformation to dimensionless parameters we can always reduce this to the heat equation. We decide to use the simple implicit finite difference scheme, here illustrated in the case $r=0$, in which case the equation is reduced to the heat equation immediately:
\[
\frac{\Lambda^{n+1}_{k,l}-\Lambda^{n}_{k,l}}{\Delta t}
-\frac{\sigma^2(y_k)\,s_l^2}{2}\,
\frac{\Lambda^{n+1}_{k,l+1}-2\Lambda^{n+1}_{k,l}+\Lambda^{n+1}_{k,l-1}}{\Delta s^2}=0,
\]
where $\Lambda^n_{k,l}$ is $\Lambda(t,y,s)$ evaluated at the point $(n,k,l)$ in the $(t,y,s)$ grid.

For the solution operator $S^y$ we use the same approach as for $H(t,y)$ to derive a non-homogeneous Lax-Wendroff scheme: let $a=a(y)=\lambda y$ and $\Lambda^n_{k,l}$ be as above. A Lax-Wendroff scheme is then
\[
\frac{R}{2}\Big(-1+\frac{\lambda\Delta t}{2}-R\Big)\Lambda^{n+1}_{k-1,l}
+(1+R^2)\,\Lambda^{n+1}_{k,l}
+\frac{R}{2}\Big(1-\frac{\lambda\Delta t}{2}-R\Big)\Lambda^{n+1}_{k+1,l}
=\Lambda^{n}_{k,l}+F^{n}_{k,l},
\]
where $R=\lambda y_k\Delta\tau/\Delta y$ and $F^n_{k,l}$ is the integral term.

5. Numerical valuation of call options under MEMM

In this section we apply our numerical pricing algorithm to the valuation of European call options under the MEMM. The simulated prices are contrasted with those obtained from the Black & Scholes formula based on a geometric Brownian motion with comparable parameters. Further, we study the volatility smile in the context of our pricing approach.

We have proposed schemes to handle the partial differential equations for both integro-PDEs in the coupled system. The finite difference schemes described above have been implemented in C++ and run on designated simulation servers. The integrand depends on both $s$ and $y$, and hence we need to do an integration for every grid point and time step, leading to a significant increase in the simulation load compared to ordinary PDEs. Solving on a $100\times100$ grid with 35 extra points in the $y$-direction and 50 time steps takes about 11 seconds. We observed that making the grid finer in the $y$-variable gave a super-linear increase in the simulation time.

Assume that the squared volatility has an inverse Gaussian stationary distribution, IG$(\gamma,\delta)$. As noted by Barndorff-Nielsen and Shephard [6], this choice of volatility process implies that the log-returns of $S(t)$ become approximately normal inverse Gaussian distributed. The Lévy measure of the subordinator $L(t)$ is then
\[
\ell(dz)=\frac{\delta}{2\sqrt{2\pi}}\,z^{-3/2}\,(1+\gamma z)\exp\Big(-\frac{1}{2}\gamma z\Big)\,dz .
\]
Below follow some results from the simulations, starting with the solution to (2.5). For the volatility process we use the same parameters as Nicolato and Venardos [15]: $\lambda=2.4958$, $\gamma=11.98$, $\delta=0.0872$. For the option we let the strike be $K=200$ and suppose zero interest rate, $r=0$. We let the constants in the market model (2.1) be $\mu=0.05$, $\beta=0.5$. In most examples below we work with a grid of $251\times201$ points, except for the comparison with the Black & Scholes prices, where we choose a much finer grid of $1501\times401$ points.

[Figure 1. Simulations of $H(t,y)$, with the parameter values from Nicolato and Venardos [15].]

Since $H(t,y)$ occurs as a measure change in the integral of the partial differential equation (2.8), we need to simulate it before we can solve for the option price. Figure 1 shows a plot of the function $H(t,y)$ based on the chosen parameters. In Figures 2-4 we show the resulting option prices as a function of $t$, $s$ and $y$. Note that in none of the figures have we plotted the whole solution area, which was $(s,y)\in[0,600]\times[0,1]$. In Fig. 2 we have fixed $y=0.1528$, while in Fig. 4 the asset price is set to $s=340$.

Not surprisingly, we see in Fig. 2 that the shape of the price surface as a function of $s$ and $t$ closely resembles the Black & Scholes price surface. However, an interesting question is to what extent the two pricing methodologies differ. To get a better picture of how the MEMM prices relate to prices from the Black & Scholes formula we first need to determine what volatility we should use in the Black & Scholes formula. We suggest the following procedure: we find the expectation of $Y_t$ for the stationary distribution and let the squared volatility in Black & Scholes, $\sigma^2_{BS}$, be equal to this expectation. We then choose the starting value $y$ for the process $Y_t$ to be the point in the grid closest to $\sigma^2_{BS}$. This means that we compare Black & Scholes prices with constant volatility to our indifference prices, which have a volatility fluctuating around this constant level. As we see in Fig. 5, the difference has a "W-shape", where MEMM prices are lower at-the-money and higher in-the-money and out-of-the-money. This reflects the Black & Scholes model's inability to capture the risk of large price movements.
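The reference side of this comparison can be sketched in a few lines. The snippet below is a minimal illustration, not the C++ implementation used for the simulations. It assumes that the stationary inverse Gaussian law is parametrised so that its mean equals $\delta/\gamma$ (which is consistent with the variance level of about 0.0073 quoted in Fig. 5), and it takes a maturity of one time unit as an assumption. The helper names `stationary_variance`, `normal_cdf` and `black_scholes_call` are hypothetical.

```python
import math

def stationary_variance(delta, gamma):
    """Mean of the stationary squared volatility, assuming E[Y] = delta / gamma."""
    return delta / gamma

def normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(s, strike, variance, maturity, r=0.0):
    """Black & Scholes call price with constant squared volatility `variance`."""
    vol = math.sqrt(variance * maturity)
    d1 = (math.log(s / strike) + r * maturity + 0.5 * vol * vol) / vol
    d2 = d1 - vol
    return s * normal_cdf(d1) - strike * math.exp(-r * maturity) * normal_cdf(d2)

# Parameters from the text; the maturity is an assumption made for this sketch.
delta, gamma, strike, maturity = 0.0872, 11.98, 200.0, 1.0
sigma2_bs = stationary_variance(delta, gamma)
reference_prices = {s: black_scholes_call(s, strike, sigma2_bs, maturity)
                    for s in (160.0, 200.0, 240.0)}
```

The MEMM price computed at the grid point closest to `sigma2_bs` would then be subtracted from these reference prices to obtain a difference curve of the kind shown in Fig. 5.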
The form of the price difference is similar to results from Eberlein [9], who prices options in an exponential Lévy model with the hyperbolic distribution, based on a structure-preserving risk-neutral measure obtained through the Esscher transform.

Let us consider the implied volatility yielded by our MEMM prices. We simulated prices for a range of strikes and calculated the implied Black & Scholes volatility given by these, assuming the spot price is $s=200$. As we see in Fig. 6 we get a skewed volatility smile.

[Figure 2. Option prices under the MEMM as a function of time and underlying asset price. Axes: $S$, $t$, MEMM price.]

[Figure 3. Option prices under the MEMM as a function of the underlying asset price and squared volatility level. Axes: $S$, $y$, MEMM price.]

[Figure 4. Option prices under the MEMM as a function of the squared volatility level and time. Axes: $y$, $t$, MEMM price.]

[Figure 5. Plot of the difference between the Black & Scholes price and the MEMM price. Panel title: Black-Scholes price with variance = 0.007333 minus MEMM price with y = 0.00733. Axes: stock price-strike ratio, price difference.]

[Figure 6. Plot of the implied Black-Scholes volatility produced by the MEMM prices. Axes: spot price-strike ratio, implied Black-Scholes volatility.]

5.1. Pricing of the jump risk under MEMM. An interesting question is how the jump risk is priced under the MEMM. We know that the MEMM transforms the jump measure of the subordinator $L$ by a ratio of the function $H$.
Thus, small and big jumps are rescaled according to the time- and state-dependent ratio $H(t,y+z)/H(t,y)$. We have done some numerical tests demonstrating how the jump measure is redistributed under the MEMM. In Fig. 7 we have plotted the ratio for the parameters chosen in the numerical examples above for two different values of $y$. We have fixed $t=1$ and let $y=0.0317$ (left) and $y=0.1650$ (right). We see that smaller jumps are scaled up, before the ratio dampens the bigger jumps. For small values of $y$ the left picture indicates that all jumps up to quite large jump sizes are scaled up, while in the right picture jumps of size larger than 0.7 are scaled down. In fact, we observe here that the large jumps are less influential under the MEMM than under the objective measure, showing that the MEMM assigns less weight to these. The positive jump risk price is assigned to small jumps, and we note that for very small $y$ this upscaling is substantial, increasing towards infinity as we approach zero from the right.

[Figure 7. Plots of the ratio $H(t,y+z)/H(t,y)$, illustrating the scaling of the jumps. Here $y=0.0317$ (left) and $y=0.165$ (right). Axes: $z$, $H(t,y+z)/H(t,y)$.]

5.2. A discussion of convergence. We end this section with a few words on convergence of our numerical procedure. In the present paper we have not considered this question from a theoretical point of view, but refer the reader to the works by Amadori [1], Amadori, Karlsen and La Chioma [2] and Jakobsen and Karlsen [12], where convergence is analysed for integro-PDEs similar to ours. In order to justify that our numerical solution of $\Lambda$ indeed converges, we have tested the algorithm by step-wise refining the grid. The relative distance between each resulting numerical solution and the one obtained on the finest grid is shown in Figure 8.
We see from this plot that the relative error decreases, indicating that we have convergence. It is the goal of future studies to analyse the convergence and stability of the numerical scheme from a mathematical point of view.

[Figure 8. Plot of the relative error with respect to the solution obtained on the finest grid.]

References

[1] A. L. Amadori (2001). Differential and integro-differential nonlinear equations of degenerate parabolic type arising in the pricing of derivatives in incomplete market. PhD thesis, University of Roma I - La Sapienza.
[2] A. L. Amadori, K. H. Karlsen, and C. La Chioma (2004). Nonlinear degenerate integro-partial differential evolution equations related to geometric Lévy processes and applications to backward stochastic differential equations. Stoch. Stoch. Rep., 76(2), pp. 147-177.
[3] G. Barles (1997). Convergence of numerical schemes for degenerate parabolic equations arising in finance theory. In L. Rogers and D. Talay, editors, Numerical Methods in Finance. Cambridge University Press, Cambridge.
[4] G. Barles, C. Daher, and M. Romano (1995). Convergence of numerical schemes for parabolic equations arising in finance theory. Math. Models Methods Appl. Sc., 5(1), pp. 125-143.
[5] G. Barles and P. Souganidis (1991). Convergence of approximation schemes for fully nonlinear second order equations. Asymp. Anal., 4, pp. 271-283.
[6] O. E. Barndorff-Nielsen and N. Shephard (2001). Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics. J. Roy. Stat. Soc. B, 63, pp. 167-241.
[7] F. E. Benth and T. Meyer-Brandis (2005). The density process of the minimal entropy martingale measure in a stochastic volatility model with jumps. Finance Stoch., 9(4), pp. 563-575.
[8] J. Bertoin (1998). Lévy Processes. Cambridge University Press, Cambridge.
[9] E. Eberlein (2001). Application of Generalized Hyperbolic Lévy motions to Finance. In O. E. Barndorff-Nielsen, T. Mikosch, and S. I. Resnick, editors, Lévy processes, Theory and Applications, pp. 319-336. Birkhäuser, Boston.
[10] S. Godunov (1959). Finite difference methods for numerical computations of discontinuous solutions of the equations of fluid dynamics. Matematiceskij Sbornik, 47, pp. 271-306.
[11] J. C. Hull and A. White (1987). The pricing of options on assets with stochastic volatility. J. Finance, 42, pp. 281-300.
[12] E. R. Jakobsen and K. H. Karlsen (2004). A 'maximum principle for semicontinuous functions' applicable to integro-partial differential equations. To appear in: Nonlin. Diff. Eq. Appl.
[13] D. Kröner (1997). Numerical schemes for conservation laws. John Wiley & Sons and B.G. Teubner Publisher, Chichester.
[14] A. L. Lewis (2000). Option valuation under stochastic volatility. Finance Press, Newport Beach, California.
[15] E. Nicolato and E. Venardos (2003). Option pricing in stochastic volatility models of the Ornstein-Uhlenbeck type. Math. Finance, 13(4), pp. 445-466.
[16] T. Rheinländer and G. Steiger (2005). The minimal martingale measure for general Barndorff-Nielsen/Shephard models. Preprint, London School of Economics.
[17] G. Strang (1968). On the construction and comparison of difference schemes. SIAM J. Num. Anal., 5, pp. 506-517.

III

Valuing volatility and variance swaps for a non-Gaussian Ornstein-Uhlenbeck stochastic volatility model
Fred Espen Benth, Martin Groth and Rodwell Kufakunesu
Forthcoming in Applied Mathematical Finance

VALUING VOLATILITY AND VARIANCE SWAPS FOR A NON-GAUSSIAN ORNSTEIN-UHLENBECK STOCHASTIC VOLATILITY MODEL
FRED ESPEN BENTH, MARTIN GROTH AND RODWELL KUFAKUNESU

Abstract. Following the increasing awareness of the risk from volatility fluctuations, the markets for hedging contracts written on realised volatility have surged. Companies looking for means to secure against unexpected accumulation of market activity can find over-the-counter products written on volatility indices.
Since the Black and Scholes model requires a constant volatility, the need to consider other models is obvious. We investigate swaps written on powers of realised volatility in the stochastic volatility model proposed by Barndorff-Nielsen and Shephard [3]. We derive a key formula for the realised variance and are able to represent the swap price dynamics in terms of Laplace transforms, which makes fast numerical inversion methods viable. We show an example using the fast Fourier transform and compare with the approximation proposed by Brockhaus and Long [7].

1. Introduction

A constant volatility is not able to explain the volatility clustering observed in financial markets, where periods of high activity and large price movements occur. An increasing awareness of the risk associated with the fluctuations in the market activity has led to a growing focus on stochastic volatility models. Making the volatility stochastic forces the market participants to consider the impact from changes in trading intensity and measures to hedge against unwanted effects.

The risk from volatility movements can be hedged using financial instruments where the underlying asset is realised variance. Swaps on realised variance have been traded over the counter for several years, giving firms means to manage the perceived risk. The interest in such products indicates that market participants perceive the uncertainty in the variance as a feature in the market, which they need to hedge themselves against. More recently, this has spun out to a fully fledged market for hedging and speculation in financial contracts on realised variance, like the CBOE S&P 500 Volatility Index (VIX).

The industry standard model for stock returns, the Black and Scholes model, gives no room for uncertainty in the volatility, since it is considered as a constant entity.
It is well known that the model is unable to replicate the implied volatility smiles observed empirically, resulting in a flat implied volatility across strike and maturity. Clearly this is not viable when pricing contracts on realised variance, and more realistic models are needed. The interest has focused on stochastic volatility models, including models with jumps in the volatility process, see for example Carr et al. [8], who thoroughly investigate quadratic variance for infinite activity processes, more specifically the class of CGMY processes. Stochastic volatility models are undeniably more complicated to work with compared to the Black and Scholes model due to the much richer structure of randomness.

Date: 29 November 2006.
Acknowledgments: The authors thank Carl Lindberg for providing parameter estimates and an anonymous referee for very useful comments and suggestions.

We consider the problem of valuing volatility and variance swaps in the framework of the non-Gaussian Ornstein-Uhlenbeck model for stochastic volatility proposed by Barndorff-Nielsen and Shephard [3]. Instead of the constant volatility in the Black and Scholes market, the volatility is stochastic and given as a mean-reverting process driven by a subordinator, i.e. a Lévy process with positive jumps and no continuous part. The model is able to replicate the skewness and fat tails seen in high-frequency stock returns and capture implied volatility smiles. Option pricing under the Barndorff-Nielsen and Shephard model is investigated by Nicolato and Venardos [19] and in an indifference pricing setting by Benth and Meyer-Brandis [5] and Benth and Groth [4].

Transform-based option pricing methods were investigated in several papers before Carr and Madan [12] showed how to utilise the computational efficiency of the fast Fourier transform.
Given the analytical form of the risk-neutral density, the method is one of the swiftest numerical pricing algorithms. The drawback is that the risk-neutral density is not always available analytically. We will show that by casting the swap pricing problems in the form of an (inverse) Laplace transform we may use the fast Fourier transform to compute prices. We derive a general formula and provide an example where the stationary distribution of the Ornstein-Uhlenbeck process is inverse Gaussian. We compare the numerical results with the approximation by Brockhaus and Long [7]. Moreover, pricing swaptions on realised variance is also a problem to which the fast Fourier transform is applicable, and we present a short description of how to use the framework of Carr and Madan [12] to price them.

An alternative to fast Fourier transform methods is solving the partial differential equation associated with the volatility derivative. In Howison, Rafailides and Rasmussen [16], an extensive asymptotic study of the partial differential equations governing the price of volatility options is provided, where the volatility follows a diffusion dynamics. Moreover, for volatilities following jump-diffusion dynamics, the expectation of the realised variance is calculated. The model resembles the one we have in mind; however, we deal with a more general jump dynamics on the one hand, and a much more specific model on the other. Windcliff, Forsyth and Vetzal [20] analyse a jump-diffusion asset price model with volatility being a function of time and current asset price. Pricing and hedging of volatility derivatives are studied using the numerical solution of a partial integro-differential equation. Other papers in the area include the work of Detemple and Osakwe [13], where American and European volatility options are studied using a general equilibrium stochastic volatility framework.
In a general context, Carr and Lee [10] derive hedging strategies for variance and volatility swaps, and consider explicit examples including the Heston volatility model. In their paper, a displaced lognormal approximation is proposed for deriving explicit prices and hedges. In Carr and Lee [9], it is shown how to replicate variance swaps by trading vanilla options.

The rest of the paper is organised as follows: In the next section we review the Barndorff-Nielsen and Shephard stochastic volatility model, realised variance and swaps written on realised variance. Section 3 provides a key formula similar to the one found in Eberlein and Raible [15], the transform-based swap price dynamics and a subsection on options written on realised variance. Brockhaus and Long [7] suggested an approximation for the volatility swap price dynamics, which is reviewed in Section 4. In Section 5 we give an example and compare the accuracy of the Brockhaus-Long approximation with numerical results using the fast Fourier transform on our transform-based swap price dynamics.

2. The volatility model of Barndorff-Nielsen and Shephard

The stochastic volatility model of Barndorff-Nielsen and Shephard (from now on called the BNS-model) first appeared in [3]. The BNS-model is a very flexible class of stochastic volatility models, able to model accurately heavy-tailed and skewed log-returns as well as the autocorrelation in the returns. We will introduce the model based on the theory in [3], and discuss some of its analytical properties that are useful for our analysis of the volatility and variance swaps considered in this section and later.

Consider the probability space $(\Omega,\mathcal{F},P)$ and assume the asset price evolves in time as
\[
dS(t)=\big(\mu+\beta\sigma^2(t)\big)S(t)\,dt+\sqrt{\sigma^2(t)}\,S(t)\,dB(t), \tag{2.1}
\]
where $B(t)$ is a Brownian motion, $\mu$ and $\beta$ are constants and $\sigma^2(t)$ follows a non-Gaussian Ornstein-Uhlenbeck process.
The modelling perspective of Barndorff-Nielsen and Shephard [3] is to find an Ornstein-Uhlenbeck dynamics $\sigma^2(t)$ for which the marginal distribution and the autocorrelation structure of the log-returns are modelled separately. They achieve this by assuming
\[
d\sigma^2(t)=-\lambda\sigma^2(t)\,dt+dL(\lambda t), \tag{2.2}
\]
where $\lambda$ is a positive constant and $L$ is the background driving Lévy process to be specified. By supposing $L$ to be a subordinator, the positivity of the process $\sigma^2(t)$ is ensured. We denote by $\{\mathcal{F}_t\}_{t\geq0}$ the completion of the filtration $\sigma(B(s),L(\lambda s);\,s\leq t)$ generated by the Brownian motion and the subordinator, such that $(\Omega,\mathcal{F},\mathcal{F}_t,P)$ becomes a complete filtered probability space. The Lévy measure is denoted $\ell(dz)$, and is supported on the positive real line since $L$ is a subordinator.

In [3] it is shown that the log-returns of the asset become scaled mixtures of normal distributions, in the sense that the log-returns conditioned on the variance are normally distributed. Barndorff-Nielsen and Shephard [3] exploit this fact to model the marginal distribution of the log-returns (indirectly) by assuming a specific stationary distribution for $\sigma^2(t)$. Given this specification, there will exist a subordinator process $L$ such that $\sigma^2(t)$ is the solution of the Ornstein-Uhlenbeck equation (2.2). Moreover, the autocorrelation function for (the stationary) $\sigma^2(t)$ is explicitly known to be an exponentially damped function. The reason for the unusual time scaling $L(\lambda t)$ in the dynamics for $\sigma^2(t)$ is precisely this separation of the modelling of the autocorrelation (i.e. the time dynamics of the volatility) and the invariant distribution (i.e. the marginal distribution for the log-returns).

Note that from Itô's Formula for semimartingales it follows that for $s\leq t$
\[
\sigma^2(t)=\sigma^2(s)e^{-\lambda(t-s)}+\int_s^t e^{-\lambda(t-u)}\,dL(\lambda u). \tag{2.3}
\]
A more general structure is obtained by a superposition of $m$ different non-Gaussian Ornstein-Uhlenbeck processes: let $\omega_k$, $k=1,2,\ldots,m$, be positive weights summing to one, and define
\[
\sigma^2(t)=\sum_{k=1}^{m}\omega_k Y_k(t), \tag{2.4}
\]
where
\[
dY_k(t)=-\lambda_k Y_k(t)\,dt+dL_k(\lambda_k t), \tag{2.5}
\]
for independent background driving Lévy processes $L_k$, $k=1,\ldots,m$. We denote the corresponding Lévy measures $\ell_k(dz)$, $k=1,\ldots,m$, which are all supported on the positive real line under the assumption that the $L_k$'s are subordinators. In line with (2.3), we find the explicit dynamics of $Y_k(t)$ to be
\[
Y_k(t)=Y_k(s)e^{-\lambda_k(t-s)}+\int_s^t e^{-\lambda_k(t-u)}\,dL_k(\lambda_k u), \tag{2.6}
\]
for $0\leq s\leq t$.

The realised volatility $\sigma_R(T)$ over a period $[0,T]$ is defined as
\[
\sigma_R(T)=\sqrt{\frac{1}{T}\int_0^T\sigma^2(s)\,ds}. \tag{2.7}
\]
In the market, the realised volatility is not measured continuously, but rather discretely. We introduce the market realised volatility as
\[
\hat{\sigma}_R(T)=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\sigma^2(s_i)}, \tag{2.8}
\]
where we sample at time instances $s_i\in[0,T]$. In the further analysis we shall stick to the continuously defined realised volatility in (2.7). The quadratic variation of the log-prices $\ln S(t)$ is connected to the realised volatility by the following relation:
\[
[\ln S](t):=p\text{-}\lim_{r\to\infty}\sum_{i=1}^{m_r}\big(\ln S(t^r_{i+1})-\ln S(t^r_i)\big)^2=\int_0^t\sigma^2(s)\,ds
\]
for any sequence of partitions $t^r_0=0<t^r_1<\ldots<t^r_{m_r}=t$ with $\sup_i(t^r_{i+1}-t^r_i)\to0$ for $r\to\infty$.

A volatility swap is a forward contract that pays to the holder the amount
\[
c\,\big(\sigma_R(T)-\Sigma\big),
\]
where $\Sigma$ is a fixed level of volatility and the contract period is $[0,T]$. The constant $c$ is a factor converting volatility surplus or deficit into money. For simplicity, we choose $c=1$ in this paper. The fixed level of volatility $\Sigma$ is chosen so that the swap has a price equal to zero, that is, at time $0\leq t\leq T$, it is costless to enter the contract. The BNS-model gives rise to an incomplete market due to the jump feature of the volatility. Thus, there exists a continuum of risk-neutral probability measures $Q$ which may be used for pricing.
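The relation between the continuously defined realised volatility (2.7) and its sampled counterpart (2.8) can be illustrated with a small sketch. It assumes a single Ornstein-Uhlenbeck component ($m=1$) and replaces the subordinator path by a short, purely hypothetical list of jump times and sizes of $L(\lambda t)$, so that (2.3) can be evaluated in closed form. Equidistant sampling times $s_i=iT/n$ are also an assumption, as are all function names.

```python
import math

def ou_variance(t, y0, lam, jumps):
    """sigma^2(t) from (2.3) for a finite list of (time, size) jumps of L(lambda*t)."""
    value = y0 * math.exp(-lam * t)
    for tau, size in jumps:
        if tau <= t:
            value += size * math.exp(-lam * (t - tau))
    return value

def realised_vol_continuous(T, y0, lam, jumps):
    """Realised volatility (2.7), integrating each exponential term exactly."""
    integral = y0 * (1.0 - math.exp(-lam * T)) / lam
    for tau, size in jumps:
        if tau <= T:
            integral += size * (1.0 - math.exp(-lam * (T - tau))) / lam
    return math.sqrt(integral / T)

def realised_vol_sampled(T, y0, lam, jumps, n):
    """Market realised volatility (2.8) with sampling times s_i = i*T/n."""
    total = sum(ou_variance(T * i / n, y0, lam, jumps) for i in range(1, n + 1))
    return math.sqrt(total / n)

# Hypothetical path: two jumps, lambda as in Nicolato and Venardos.
jumps = [(0.2, 0.01), (0.65, 0.004)]
continuous = realised_vol_continuous(1.0, 0.0073, 2.4958, jumps)
sampled = realised_vol_sampled(1.0, 0.0073, 2.4958, jumps, 10000)
```

With a fine sampling grid the two quantities agree closely, which is the informal reason for working with the continuous definition in the analysis.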
From general theory, the fixed level of the volatility swap can be expressed as the conditional risk-neutral expectation (using the adaptedness of the fixed volatility level):
\[
\Sigma(t,T)=E_Q\big[\sigma_R(T)\,|\,\mathcal{F}_t\big], \tag{2.9}
\]
where $Q$ is a risk-neutral probability. In order to specify one price, a choice of which pricing measure $Q$ to use must be made. We introduce a parametrised subclass of risk-neutral probabilities $Q$ by the Esscher transform, which yields a parametrised class of fixed volatility levels that can be calibrated to market quotes. In this way one can choose the market's pricing measure. Note from the expression for $\Sigma(t,T)$ that
\[
\Sigma(0,T)=E_Q[\sigma_R(T)],\qquad\Sigma(T,T)=\sigma_R(T).
\]
In a completely similar manner, we define a variance swap to have the price
\[
\Sigma_2(t,T)=E_Q\big[\sigma_R^2(T)\,|\,\mathcal{F}_t\big]. \tag{2.10}
\]
To have a more compact notation, we define for $\gamma>-1$
\[
\Sigma_{2\gamma}(t,T)=E_Q\big[\sigma_R^{2\gamma}(T)\,|\,\mathcal{F}_t\big]. \tag{2.11}
\]
Below, we shall derive pricing dynamics for swaps written on all powers of the realised volatility $\sigma_R$ bigger than $-2$. Of course, our concern is the volatility and variance swap prices, corresponding to $\gamma=1/2$ and $\gamma=1$, respectively. However, as we shall see below, our framework gives prices that extend naturally to any $\gamma>-1$.

3. Valuation of volatility and variance swaps using the Laplace transform

We construct martingale measures $Q$ using the Esscher transform, following the analysis in Benth and Saltyte-Benth [6]. Assume $\theta_k(t)$, $k=1,\ldots,m$, are real-valued, measurable and bounded functions. Consider the stochastic process
\[
Z^{\theta}(t)=\exp\Big(\sum_{k=1}^{m}\Big\{\int_0^t\theta_k(s)\,dL_k(\lambda_k s)-\int_0^t\lambda_k\psi_k(\theta_k(s))\,ds\Big\}\Big),
\]
where $\psi_k(x)$ are the log-moment generating functions of $L_k(t)$, that is
\[
\psi_k(x)=\ln E[\exp(xL_k(1))].
\]
For many natural choices of $L_k$ these functions are explicitly known. We refer the reader to Section 5 for one example. Let us impose an exponential integrability condition on the Lévy measure ensuring existence of moments.
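As a small illustration of an explicitly known log-moment generating function, the sketch below uses a hypothetical choice that is not the example treated in the paper: a compound Poisson subordinator with jump rate `rate` and exponentially distributed jumps with parameter `b`. For such a process $\psi(x)=\int_0^\infty(e^{xz}-1)\,\ell(dz)$ with $\ell(dz)=\text{rate}\cdot b\,e^{-bz}\,dz$, which evaluates to $\text{rate}\cdot x/(b-x)$ for $x<b$. The code compares the closed form with a Simpson-rule evaluation of the integral; the function names and parameter values are assumptions made for the example.

```python
import math

def log_mgf_closed_form(x, rate, b):
    """psi(x) = ln E[exp(x L(1))] for compound Poisson jumps ~ Exp(b), x < b."""
    if x >= b:
        raise ValueError("psi(x) is finite only for x < b")
    return rate * x / (b - x)

def log_mgf_numeric(x, rate, b, upper=1.0, n=20000):
    """Simpson approximation of int_0^upper (e^{xz} - 1) * rate * b * e^{-bz} dz."""
    def integrand(z):
        return (math.exp(x * z) - 1.0) * rate * b * math.exp(-b * z)
    h = upper / n  # n must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

# Hypothetical parameters: mean jump size 1/b = 0.01 and a constant theta = 20.
rate, b, theta = 1.0, 100.0, 20.0
psi_exact = log_mgf_closed_form(theta, rate, b)
psi_quad = log_mgf_numeric(theta, rate, b)
```

In this illustrative case $\psi(\theta)$ is finite only for $\theta<b$, which shows why some exponential integrability of the Lévy measure is needed for $Z^{\theta}$ to be well defined.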
Condition (L): There exists a constant $\kappa>0$ such that the Lévy measure satisfies the integrability condition
\[
\int_1^{\infty}e^{z\kappa}\,\ell_k(dz)<\infty .
\]
The processes $Z^{\theta}(t)$ are well defined under natural exponential integrability conditions on the Lévy measures $\ell_k$, which we assume to hold. That is, they are well defined for $t\in[0,T]$ if condition (L) holds for $\kappa=\sup_{k=1,\ldots,m,\,s\in[0,T]}|\theta_k(s)|$. Introduce the probability measure
\[
Q^{\theta}(A)=E[\mathbf{1}_A Z^{\theta}(\tau_{\max})],
\]
where $\mathbf{1}_A$ is the indicator function and $\tau_{\max}$ is a fixed time horizon including all the trading times. We denote the expectation under the probability measure $Q^{\theta}$ by $E_{\theta}[\cdot]$. By using the time-varying $\theta$'s we have a flexible class of martingale measures $Q^{\theta}$, where we shall call $\theta$ the "market price of risk".

The following key formula for $\sigma_R^2(T)$ is useful when deriving explicit pricing formulas for the swaps in terms of Fourier transforms:

Lemma 3.1. Let $z\in\mathbb{C}$ and $\theta_k:\mathbb{R}_+\to\mathbb{R}$, $k=1,\ldots,m$, be real-valued measurable functions. Suppose condition (L) is satisfied and that
\[
|\mathrm{Re}(z)|<\Big[\frac{\lambda_k^{-1}}{T}\big(1-e^{-\lambda_k(T-s)}\big)\Big]^{-1}\kappa
\]
for all $k$, where $\kappa=\sup_{k=1,\ldots,m,\,s\in[0,T]}|\theta_k(s)|$. Then
\[
\begin{aligned}
E_{\theta}\big[e^{z\sigma_R^2(T)}\,|\,\mathcal{F}_t\big]
&=\exp\Big(\sum_{k=1}^{m}\int_t^T\lambda_k\Big\{\psi_k\Big(\frac{z\omega_k}{\lambda_kT}\big(1-e^{-\lambda_k(T-s)}\big)+\theta_k(s)\Big)-\psi_k(\theta_k(s))\Big\}\,ds\Big)\\
&\quad\times\exp\Big(\frac{z}{T}\Big(t\sigma_R^2(t)+\sum_{k=1}^{m}\frac{1}{\lambda_k}\big(1-e^{-\lambda_k(T-t)}\big)\omega_kY_k(t)\Big)\Big).
\end{aligned}
\]

Proof. From Bayes' Formula it follows that
\[
\begin{aligned}
E_{\theta}\big[\exp\big(z\sigma_R^2(T)\big)\,|\,\mathcal{F}_t\big]
&=E\Big[\exp\Big(\sum_{k=1}^{m}\frac{z\omega_k}{T}\int_0^TY_k(s)\,ds\Big)\frac{Z^{\theta}(T)}{Z^{\theta}(t)}\,\Big|\,\mathcal{F}_t\Big]\\
&=E\Big[\exp\Big(\sum_{k=1}^{m}\Big\{\frac{z\omega_k}{T}\int_0^TY_k(s)\,ds+\int_t^T\theta_k(s)\,dL_k(\lambda_ks)\Big\}\Big)\\
&\qquad\times\exp\Big(\sum_{k=1}^{m}\int_t^T-\lambda_k\psi_k(\theta_k(s))\,ds\Big)\,\Big|\,\mathcal{F}_t\Big].
\end{aligned}
\]
Since $\sigma^2(s)$ is $\mathcal{F}_s$-adapted, we have
\[
\begin{aligned}
E_{\theta}\big[\exp\big(z\sigma_R^2(T)\big)\,|\,\mathcal{F}_t\big]
&=E\Big[\exp\Big(\sum_{k=1}^{m}\Big\{\frac{z\omega_k}{T}\int_t^TY_k(s)\,ds+\int_t^T\theta_k(s)\,dL_k(\lambda_ks)\Big\}\Big)\,\Big|\,\mathcal{F}_t\Big]\\
&\quad\times\exp\Big(\sum_{k=1}^{m}\Big\{\frac{z\omega_k}{T}\int_0^tY_k(s)\,ds-\int_t^T\lambda_k\psi_k(\theta_k(s))\,ds\Big\}\Big).
\end{aligned}
\]
To this end, recall from the dynamics of $Y_k$ that
\[
\lambda_k\int_t^TY_k(s)\,ds=-Y_k(T)+Y_k(t)+\int_t^TdL_k(\lambda_ks),
\]
and invoking the explicit expression for $Y_k(T)$ in (2.6),
\[
Y_k(T)=Y_k(t)e^{-\lambda_k(T-t)}+\int_t^Te^{-\lambda_k(T-u)}\,dL_k(\lambda_ku),
\]
yields
\[
\int_t^TY_k(s)\,ds=\frac{1}{\lambda_k}Y_k(t)\big(1-e^{-\lambda_k(T-t)}\big)+\frac{1}{\lambda_k}\int_t^T\big(1-e^{-\lambda_k(T-s)}\big)\,dL_k(\lambda_ks).
\]
Thus,
\[
\begin{aligned}
E_{\theta}\big[e^{z\sigma_R^2(T)}\,|\,\mathcal{F}_t\big]
&=E\Big[\exp\Big(\sum_{k=1}^{m}\int_t^T\Big\{\frac{z\omega_k}{\lambda_kT}\big(1-e^{-\lambda_k(T-s)}\big)+\theta_k(s)\Big\}\,dL_k(\lambda_ks)\Big)\,\Big|\,\mathcal{F}_t\Big]\\
&\quad\times\exp\Big(\frac{z}{T}t\sigma_R^2(t)+\sum_{k=1}^{m}\Big\{\frac{z\omega_k}{T\lambda_k}\big(1-e^{-\lambda_k(T-t)}\big)Y_k(t)-\int_t^T\lambda_k\psi_k(\theta_k(s))\,ds\Big\}\Big)\\
&=\exp\Big(\sum_{k=1}^{m}\int_t^T\lambda_k\Big\{\psi_k\Big(\frac{z\omega_k}{\lambda_kT}\big(1-e^{-\lambda_k(T-s)}\big)+\theta_k(s)\Big)-\psi_k(\theta_k(s))\Big\}\,ds\Big)\\
&\quad\times\exp\Big(\frac{z}{T}\Big(t\sigma_R^2(t)+\sum_{k=1}^{m}\frac{1}{\lambda_k}\big(1-e^{-\lambda_k(T-t)}\big)\omega_kY_k(t)\Big)\Big),
\end{aligned}
\]
where we have used the independent increments property of the subordinator. Hence, the proof is complete.

We remark that a related formula can be found in Eberlein and Raible [15], with a further generalisation in Nicolato and Venardos [19].

Let us briefly discuss the relationship between the continuously and discretely realised volatility. Consider the market realised volatility defined in (2.8). We have that
\[
\sum_{i=1}^{n}Y_k(s_i)=Y_k(0)\sum_{i=1}^{n}e^{-\lambda_ks_i}+\int_0^T\sum_{i=1}^{n}\mathbf{1}_{u<s_i}e^{-\lambda_k(s_i-u)}\,dL_k(\lambda_ku).
\]
Thus, we can use similar calculations as in the proof above to derive a key formula for the market realised volatility. We notice that, for equidistant sampling times $s_i=iT/n$,
\[
\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}e^{-\lambda_ks_i}=\frac{1}{\lambda_kT}\big(1-e^{-\lambda_kT}\big),
\]
and hence, the key formula for $\sigma_R$ will be reasonably close to that of $\hat{\sigma}_R$. This means that we can expect the expressions that we derive for the different volatility swap contracts based on the continuously measured realised volatility to be reasonably close to the market-based swaps.

Applying the key formula in Lemma 3.1, we are now in the position to derive representations of the swap price dynamics in terms of Laplace transforms. The details are given in the next proposition:

Proposition 3.2.
For every $\gamma>-1$ and any constant $c>0$ satisfying the condition
\[
c<\Big[\frac{\lambda_k^{-1}}{T}\big(1-e^{-\lambda_k(T-s)}\big)\Big]^{-1}\kappa
\]
for all $k$, where $\kappa=\sup_{k=1,\ldots,m,\,s\in[0,T]}|\theta_k(s)|$, it holds that
\[
\Sigma_{2\gamma}(t,T)=\frac{\Gamma(\gamma+1)}{2\pi i}\int_{c-i\infty}^{c+i\infty}z^{-(\gamma+1)}\,\Psi_{\theta}(t,T,z)
\exp\Big(\frac{z}{T}\Big(t\sigma_R^2(t)+\sum_{k=1}^{m}\frac{\omega_kY_k(t)}{\lambda_k}\big(1-e^{-\lambda_k(T-t)}\big)\Big)\Big)\,dz,
\]
where
\[
\Psi_{\theta}(t,T,z)=\exp\Big(\sum_{k=1}^{m}\lambda_k\int_t^T\Big\{\psi_k\Big(\frac{z\omega_k}{\lambda_kT}\big(1-e^{-\lambda_k(T-s)}\big)+\theta_k(s)\Big)-\psi_k(\theta_k(s))\Big\}\,ds\Big).
\]

Proof. We know from the theory of Laplace transforms that
\[
x^{\gamma}=\frac{\Gamma(\gamma+1)}{2\pi i}\int_{c-i\infty}^{c+i\infty}z^{-(\gamma+1)}e^{zx}\,dz
\]
for any $c>0$ and $\gamma>-1$. Thus, under the conditions of the proposition, which make the moment generating function well defined, we have
\[
\Sigma_{2\gamma}(t,T)=\frac{\Gamma(\gamma+1)}{2\pi i}\int_{c-i\infty}^{c+i\infty}z^{-(\gamma+1)}\,E_{\theta}\big[\exp\big(z\sigma_R^2(T)\big)\,|\,\mathcal{F}_t\big]\,dz .
\]
Applying the key formula in Lemma 3.1 gives the desired result.

We remark that the expression for the swap prices in the proposition above is suitable for numerical calculations based on the fast Fourier transform (FFT) or other fast numerical inversion techniques for the Laplace transform. This seems to be a standard approach in the context of volatility swaps or general derivatives pricing (see e.g. Carr and Lee [11] and Matytsin [18]). This will be the topic of Section 5.

The variance swap price has an explicit expression, which is stated in the proposition below.

Proposition 3.3. The variance swap has a price given by the following expression:
\[
\Sigma_2(t,T)=\frac{t}{T}\sigma_R^2(t)+\sum_{k=1}^{m}\frac{\omega_k}{T\lambda_k}\big(1-e^{-\lambda_k(T-t)}\big)Y_k(t)
+\sum_{k=1}^{m}\frac{\omega_k}{T}\int_t^T\psi_k'(\theta_k(s))\big(1-e^{-\lambda_k(T-s)}\big)\,ds. \tag{3.1}
\]

Proof. We can prove this directly by taking $z\in\mathbb{R}$, differentiating with respect to $z$ in the key formula in Lemma 3.1 and then letting $z=0$.

Observe that the swap prices $\Sigma_{2\gamma}$ at time $t$ depend both on the current level of the variance $\sigma^2(t)$ and on the realised variance $\sigma_R^2(t)$. Based on this, we can go further and price options written on the swaps.

3.1. Options.

Let $f$ be a real-valued measurable function with at most linear growth.
Then the fair price $C(t)$ at time $t$ of an option paying $f(\Sigma_{2\gamma}(\tau,T))$ at exercise time $\tau>t$ is given by
\[
C(t)=e^{-r(\tau-t)}E_{\theta}\big[f(\Sigma_{2\gamma}(\tau,T))\,|\,\mathcal{F}_t\big],
\]
where $\Sigma_{2\gamma}(\tau,T)$ is given in Proposition 3.2, with $T>\tau$. For the variance swap the explicit solution in Proposition 3.3 leads to a formulation of the option pricing problem where the fast Fourier transform is applicable. We focus our discussion on call options. Using the approach by Carr and Madan [12] we can formulate the price of a call option as an inverse Fourier transform in the strike price $K$. Let $\tilde{K}=\ln(K)$ be the log of the strike price. After introducing an exponential damping to get a square integrable function we can represent the price of the option as
\[
C(t)=\frac{\exp(-\alpha\tilde{K})}{\pi}\int_0^{\infty}e^{-iv\tilde{K}}\,\Phi(v)\,dv, \tag{3.2}
\]
where
\[
\Phi(v)=\int_{-\infty}^{\infty}e^{iv\tilde{K}}\,e^{-r(\tau-t)}\,E_{\theta}\Big[e^{\alpha\tilde{K}}\big(e^{\Sigma_2(\tau,T)}-e^{\tilde{K}}\big)^{+}\,\Big|\,\mathcal{F}_t\Big]\,d\tilde{K}.
\]
Using the explicit expression for the variance swap in Proposition 3.3, the explicit solution for the non-Gaussian Ornstein-Uhlenbeck processes $Y_k(t)$ and the independent increments property of the subordinators we get that
\[
\begin{aligned}
\Phi(v)&=\frac{e^{-r(\tau-t)}}{(\alpha+iv)(\alpha+1+iv)}\exp\Big((1+\alpha+iv)\sum_{k=1}^{m}\frac{\omega_kY_k(t)}{\lambda_kT}\big(1-e^{-\lambda_k(T-t)}\big)\Big)\\
&\quad\times\exp\Big((1+\alpha+iv)\Big(\frac{t}{T}\sigma_R^2(t)+\sum_{k=1}^{m}\frac{\omega_k}{T}\int_{\tau}^{T}\psi_k'(\theta_k(s))\big(1-e^{-\lambda_k(T-s)}\big)\,ds\Big)\Big)\\
&\quad\times\exp\Big(\sum_{k=1}^{m}\int_t^{\tau}\lambda_k\psi_k\Big(\frac{\omega_k(1+\alpha+iv)}{\lambda_kT}\big(1-e^{-\lambda_k(T-s)}\big)\Big)\,ds\Big),
\end{aligned}
\]
where we recall $\psi_k(\cdot)$ to be the log-moment generating functions of the subordinators $L_k$. The details of the derivation of this formula are given in Appendix A. It is now possible to calculate the option price using the fast Fourier transform of the integral in (3.2), following the outline in Carr and Madan [12].

4. An approximation of the volatility swap price dynamics

We have seen above how we can apply techniques based on the Laplace transform to derive formulas for the swap price dynamics. An alternative approach for volatility swaps is to derive an approximation from a second-order Taylor expansion of the function $\sqrt{x}$.
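The effect of such a second-order expansion can be checked on a toy example. Expanding $\sqrt{X}$ around $E[X]$ and taking expectations gives $E[\sqrt{X}]\approx\sqrt{E[X]}-\mathrm{Var}(X)/(8E[X]^{3/2})$. The sketch below compares this with the exact value for a hypothetical two-point law of the realised variance; the distribution and the function name are assumptions chosen only so that the exact expectation is trivial to compute.

```python
import math

def sqrt_mean_second_order(mean, variance):
    """Second-order Taylor approximation of E[sqrt(X)] around E[X]."""
    return math.sqrt(mean) - variance / (8.0 * mean ** 1.5)

# Hypothetical two-point law for the realised variance X: (value, probability).
outcomes = [(0.006, 0.5), (0.009, 0.5)]
mean = sum(p * x for x, p in outcomes)
variance = sum(p * (x - mean) ** 2 for x, p in outcomes)

exact = sum(p * math.sqrt(x) for x, p in outcomes)   # E[sqrt(X)] computed directly
approx = sqrt_mean_second_order(mean, variance)      # convexity-corrected value
naive = math.sqrt(mean)                              # ignores the correction term
```

In this example the corrected value lies much closer to the exact expectation than the naive square root of the mean, which is the motivation for keeping the second-order term.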
This was suggested by Brockhaus and Long [7], and we now elaborate on this approximation for the BNS-model. Below we derive the approximate volatility swap price dynamics, and analyse the error made with this method in Section 5. The following proposition holds true:

Proposition 4.1. The volatility swap price dynamics can be expressed by
\[
\Sigma(t,T) = \frac{1}{2}\sqrt{\Sigma_2(0,T)} + \frac{\Sigma_2(t,T)}{2\sqrt{\Sigma_2(0,T)}} - \frac{\Sigma_4(t,T) - 2\Sigma_2(0,T)\Sigma_2(t,T) + \Sigma_2^2(0,T)}{8\,\Sigma_2^{3/2}(0,T)} + R(t,T),
\]
where
\[
R(t,T) = E_\theta\left[\frac{1}{16}\,\frac{\left(\sigma_R^2(T) - \Sigma_2(0,T)\right)^3}{\left(\Sigma_2(0,T) + \Theta\left(\sigma_R^2(T) - \Sigma_2(0,T)\right)\right)^{5/2}} \,\middle|\, \mathcal{F}_t\right],
\]
and $\Theta$ is a random variable such that $0 < \Theta < 1$.

Proof. For a positive random variable $X$, a second-order Taylor approximation of $\sqrt{X}$ around $E_\theta[X]$ with remainder term gives
\[
\begin{aligned}
\sqrt{X} &= \sqrt{E_\theta[X]} + \frac{1}{2}\frac{X - E_\theta[X]}{\sqrt{E_\theta[X]}} - \frac{1}{8}\frac{(X - E_\theta[X])^2}{E_\theta[X]^{3/2}} + R_X \\
&= \frac{1}{2}\sqrt{E_\theta[X]} + \frac{1}{2}\frac{X}{\sqrt{E_\theta[X]}} - \frac{1}{8}\frac{(X - E_\theta[X])^2}{E_\theta[X]^{3/2}} + R_X,
\end{aligned}
\]
where the remainder term is
\[
R_X = \frac{1}{16}\,\frac{(X - E_\theta[X])^3}{\left(E_\theta[X] + \Theta(X - E_\theta[X])\right)^{5/2}}.
\]
The constant $1/16$ is the Lagrange factor $f'''(\xi)/3!$ for $f(x) = \sqrt{x}$, since $f'''(x) = \tfrac{3}{8}x^{-5/2}$. Thus, letting $X = \sigma_R^2(T)$, so that $E_\theta[X] = \Sigma_2(0,T)$, and taking conditional expectation together with the definition of $\Sigma_{2\gamma}$, yields the result.

With the dynamics of $\Sigma_4(t,T)$ given by Proposition 3.2, we can derive approximate dynamics of the volatility swap price $\Sigma(t,T)$ based on the expression in Proposition 4.1 by ignoring the $R(t,T)$-term. How good this approximation is depends of course on the size of the remainder. We analyse the remainder term numerically in the next section.

5. Numerical studies of volatility and variance swaps

In the previous sections we have seen how the price of swaps written on all powers of realised volatility can be expressed as an inverse Laplace transform. This representation is suited for numerical solution using some inversion technique, such as the fast Fourier transform (FFT). In this section we show how to utilise the computational power of the FFT to evaluate swap prices and give a few numerical examples.
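As a rough illustration of such an inversion, the sketch below checks the Laplace identity $x^\gamma = \frac{\Gamma(\gamma+1)}{2\pi i}\int_{c-i\infty}^{c+i\infty} z^{-(\gamma+1)}e^{zx}\,dz$ used in the proof of Proposition 3.2 by evaluating the integral along the line $\mathrm{Re}\,z = c$ with NumPy's FFT. It is not the authors' implementation: the function name `power_by_fft`, the choice $c = 2$, the grid size and step, and the trapezoidal weighting are all assumptions made for this example, and the integrand is the bare $z^{-(\gamma+1)}$ rather than the full swap-price integrand.

```python
import math
import numpy as np

def power_by_fft(gam, c=2.0, n=2**15, dx=0.0005):
    """Approximate x**gam on the grid x_j = j*dx by inverting z**-(gam+1) with an FFT.

    All defaults (c, n, dx) are illustrative assumptions, not values prescribed
    by the propositions in the text.
    """
    du = 2.0 * math.pi / (n * dx)      # grid coupling required by the FFT: du*dx = 2*pi/n
    k = np.arange(n)
    z = c + 1j * du * k                # sample points on the line Re z = c
    g = z ** (-(gam + 1.0))            # Laplace transform of x**gam / Gamma(gam+1)
    g[0] *= 0.5                        # trapezoidal end-point weight
    x = dx * k
    # n * ifft(g)[j] = sum_k g[k] * exp(+2j*pi*j*k/n) = sum_k g[k] * exp(1j*u_k*x_j)
    s = n * np.fft.ifft(g)
    # the integrand at -u is the complex conjugate, so twice the real part suffices
    approx = math.gamma(gam + 1.0) * np.exp(c * x) * (du / math.pi) * s.real
    return x, approx

x, squares = power_by_fft(2.0)         # gam = 2: recovers x**2
_, roots = power_by_fft(0.5)           # gam = 1/2: recovers sqrt(x)
window = (x >= 0.05) & (x <= 0.5)      # compare away from x = 0 and from the far end
err_gamma2 = np.max(np.abs(squares[window] - x[window] ** 2))
err_gamma_half = np.max(np.abs(roots[window] - np.sqrt(x[window])))
```

The slower the integrand decays in $z$, the larger the truncation error, which is why the sketch reports the error for $\gamma = 1/2$ separately from $\gamma = 2$.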
    α        β        µ                        δ
    233.0    5.612    $-5.331\times 10^{-4}$   0.0370

Table 1. Estimated parameters for the NIG-distribution

The fast Fourier method is a computationally efficient way to compute the discrete Fourier transform
\[
\omega(k) = \sum_{j=1}^{N} e^{-i\frac{2\pi}{N}(j-1)(k-1)}\, x(j), \qquad k = 1,\dots,N, \tag{5.1}
\]
when the number of points $N$ is a power of 2, reducing the number of multiplications from order $N^2$ to $N\log_2(N)$. The use of the fast Fourier transform for option pricing was investigated by Carr and Madan [12]. The possibility to use pre-implemented and optimised versions of the algorithm from software packages, together with its speed and simplicity, makes it a competitive method. The only requirement is that we know the characteristic function of the density analytically.

Proposition 3.2 gives the price of a swap as the inverse Laplace transform of a function in a form suitable for the (inverse) fast Fourier transform. To begin with we need to discretise both $z$ and $\sigma_R$ and approximate the integral with a finite sum. As we see from the formula we actually need to discretise $\tilde{\sigma}^2 := \sigma_R^2 \times t/T$, hence we get a time scaling of the output variable. Since the FFT is restricted by sampling constraints this has the undesirable consequence that if $t$ is small compared to $T$ we get few data points in the domain of interest. To make the best use of the computational efficiency we let $N$ be a power of 2 and choose $\Delta\tilde{\sigma}^2$ sufficiently small. The discretised variable is then $\tilde{\sigma}^2(j) = \Delta\tilde{\sigma}^2\,(j-1)$. To rewrite the sum in the standard form of the fast Fourier transform it is required that
\[
\Delta z = \frac{2\pi}{N\,\Delta\tilde{\sigma}^2}
\]
and $z(k) = c + i\Delta z\,(k-1)$. Applying these discretisations gives us a summation of the form (5.1).

The background driving Lévy processes $L_k$ have to be specified to get the log-moment generating functions explicitly.
The standard approach is to specify a stationary distribution of the Ornstein-Uhlenbeck process and then derive the log-moment generating function for the Lévy process from the distribution. Two popular distributions are the inverse Gaussian and variance-gamma, see Barndorff-Nielsen and Shephard [2], Carr and Madan [12], Nicolato and Venardos [19]. Here we only consider the inverse Gaussian (IG) distribution, having an explicit density function
\[
\frac{(\gamma/\delta)^{-1/2}}{2K_{-1/2}(\delta\gamma)}\, x^{-3/2}\exp\left(-\frac{1}{2}\left(\delta^2 x^{-1} + \gamma^2 x\right)\right), \qquad x > 0,
\]
where $K_{-1/2}$ is the modified Bessel function of the third kind with index $-1/2$. The parameters of the IG distribution are $\delta$ and $\gamma$, both supposed to be positive. In this case the log-moment generating function has been calculated by Nicolato and Venardos [19] to be
\[
\psi(\theta) = \theta\delta\left(\gamma^2 - 2\theta\right)^{-1/2}.
\]
After rewriting the integrand to simplify the simulations we can implement it using Matlab's predefined command for applying FFT.

           λ         ω
    OU1    0.9127    0.9224
    OU2    0.0262    0.0776

Table 2. Estimated parameters for the decay rates and weights of the OU-processes

When specifying the stationary distribution of the Ornstein-Uhlenbeck process to be inverse Gaussian the log-returns of the stock will be approximately normal inverse Gaussian (NIG) distributed. The NIG distribution is a four parameter family of distributions proposed by Barndorff-Nielsen [1] as a flexible class to model the stylized facts of log-returns (see Eberlein and Keller [14] for the related hyperbolic distribution). The parameters of the NIG distribution are $\mu$, the location, $\beta$, the skewness, $\delta$, the scale and $\alpha$, the tail heaviness. The $\mu$ and $\beta$ parameters we recognize directly from the asset price model (2.1), whereas $\delta$ is from the volatility model and $\alpha = \sqrt{\beta^2 + \gamma^2}$. We refer the reader to Barndorff-Nielsen [1] for more on this family of distributions and its applications to finance.
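The relations between the NIG and IG parameters can be made concrete with a few lines of Python. The sketch below takes the Table 1 values, recovers $\gamma$ from $\alpha = \sqrt{\beta^2+\gamma^2}$, and evaluates the log-moment generating function $\psi$. The variance formula $\delta\alpha^2/\gamma^3$ for the NIG law and the moments $\delta/\gamma$ and $\delta/\gamma^3$ of the IG law are standard facts that are not stated in the text; the variable names are chosen for this example only.

```python
import math

# NIG parameters as quoted in Table 1.
alpha, beta, mu, delta = 233.0, 5.612, -5.331e-4, 0.0370

# alpha = sqrt(beta^2 + gamma^2)  =>  gamma = sqrt(alpha^2 - beta^2)
gamma = math.sqrt(alpha**2 - beta**2)

def psi(theta):
    """Log-moment generating function theta*delta*(gamma^2 - 2*theta)^(-1/2).

    Only defined for theta < gamma^2 / 2.
    """
    if theta >= gamma**2 / 2.0:
        raise ValueError("psi(theta) requires theta < gamma^2 / 2")
    return theta * delta / math.sqrt(gamma**2 - 2.0 * theta)

ig_mean = delta / gamma                 # mean of IG(delta, gamma)
ig_var = delta / gamma**3               # variance of IG(delta, gamma)
nig_var = delta * alpha**2 / gamma**3   # variance of NIG(alpha, beta, mu, delta)

# The slope of psi at zero should equal the stationary mean delta / gamma.
h = 1e-3
psi_slope_at_zero = (psi(h) - psi(-h)) / (2.0 * h)
```

With these inputs `nig_var` agrees with the mean-variance mixture identity `ig_mean + beta**2 * ig_var`, which is one way to sanity-check the parametrisation before using it in a pricing routine.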
In this example, we use parameters for the normal inverse Gaussian distribution estimated by Lindberg [17] for the Swedish company AstraZeneca. The parameters are estimated based on daily log-returns over the period August 1, 2003 to June 1, 2004, see Table 1. Following the analysis of Lindberg [17] we assume that we have the superposition of two Ornstein-Uhlenbeck processes, both with an inverse Gaussian law. The rates of decay and weights were also estimated at the same time, see Table 2. Left unknown are estimates of the current level of variance for both processes. For the purpose of illustration we choose these in such a way that, multiplied by the weights and added, they equal the variance of the NIG distribution. With the parameters in Table 1 we get that the variance of the NIG distribution is $1.59\times 10^{-4}$ and for the numerical tests we then let $Y_1(t) = 1.66\times 10^{-4}$ and $Y_2(t) = 7.5\times 10^{-5}$.

The variance swap has the explicit solution given in Proposition 3.3 which we use as a benchmark for the FFT-method. We use $2^{15}$ points, which gives a good tradeoff between speed and accuracy, and we can choose the step size to be $\Delta\tilde{\sigma}^2 = 0.0005$. We let $t = 31$ and $T = 61$ and plot the difference between the explicit solution and the result from the FFT-method. Figure 1 shows that we have an absolute error of the order of $10^{-5}$ or below for the simulation. We attribute the error in the prices to the precision of the FFT-algorithm. The use of another set of times, $t = 1$, $T = 31$, gives similar results but with fewer data points in the domain of interest, due to the unfortunate time scaling of the output variable.

Turning to the volatility swap we now want to compare the FFT method with the approximation of Brockhaus and Long discussed in the previous section. The approximation requires values for the variance swap prices, both for time zero and $t$. We use the explicit solution (3.1) for the variance swap prices, including the case $t = 0$, as calculated above.
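To make the inputs of the Brockhaus and Long approximation explicit, here is a minimal Python sketch of the formula in Proposition 4.1 with the remainder $R(t,T)$ dropped. The function name `brockhaus_long` and the toy check are assumptions of this example: the check replaces the realised variance by a gamma-distributed variable (shape 4, scale 0.025), for which $E[\sqrt{X}]$ is known in closed form, and is not taken from the numerical study in the text.

```python
import math

def brockhaus_long(sigma2_0, sigma2_t, sigma4_t):
    """Proposition 4.1 without the remainder term R(t, T).

    sigma2_0 : variance swap price Sigma_2(0, T)
    sigma2_t : variance swap price Sigma_2(t, T)
    sigma4_t : swap price on squared realised variance Sigma_4(t, T)
    """
    root = math.sqrt(sigma2_0)
    centred_second_moment = sigma4_t - 2.0 * sigma2_0 * sigma2_t + sigma2_0**2
    return (0.5 * root
            + sigma2_t / (2.0 * root)
            - centred_second_moment / (8.0 * sigma2_0**1.5))

# Toy check at t = 0 with X ~ Gamma(shape, scale) standing in for realised variance
# (an illustrative assumption, not the BNS law of the realised variance).
shape, scale = 4.0, 0.025
mean_x = shape * scale                               # E[X]
second_moment_x = shape * (shape + 1.0) * scale**2   # E[X^2]
exact_root = math.sqrt(scale) * math.gamma(shape + 0.5) / math.gamma(shape)  # E[sqrt(X)]
approx_root = brockhaus_long(mean_x, mean_x, second_moment_x)
```

At $t = 0$ the formula collapses to $\sqrt{m} - v/(8m^{3/2})$ with $m$ the mean and $v$ the variance, so the approximation always lies below $\sqrt{m}$, in line with Jensen's inequality.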
We simulate for the same two sets of times, first $t = 1$, $T = 31$ and second $t = 31$, $T = 61$, and plot the resulting price lines for the two methods. As seen in Figure 2 the Brockhaus and Long approximation is reasonable for values close to the expected value of the realised variance at time zero, which is approximately 0.1. When the realised variance $\sigma_R^2$ approaches higher values the approximation is increasingly poor. We notice that the Brockhaus and Long method performs better when the fraction $t/T$ is small. This is related to the values of the variance swap being smaller, which makes the Taylor expansion less sensitive.

[Figure 1. Absolute error between the explicit and FFT-solution of the variance swap price as a function of $\sigma_R$. Horizontal axis: sigmaR2; vertical axis: abs. error, in units of $10^{-6}$.]

[Figure 2. Comparison between the Brockhaus and Long approximation and the FFT-solution for the volatility swap price as a function of yearly volatility. Left: $t = 1$, $T = 31$. Right: $t = 31$, $T = 61$. Both panels are titled "FFT-solution and Approximation for volatility swap"; horizontal axis: Yearly volatility; legend: FFT-solution, Brockhaus and Long approximation.]

Appendix A

We consider Section 3.1 in more detail. Let $\tilde{K} = \ln(K)$ be the log of the strike price. After introducing an exponential damping to get a square integrable function we can represent the price of the option as
\[
C(t) = \frac{\exp(-\alpha\tilde{K})}{\pi}\int_0^\infty e^{-iv\tilde{K}}\,\Phi(v)\,dv, \tag{A.1}
\]
where
\[
\Phi(v) = \int_{-\infty}^{\infty} e^{iv\tilde{K}}\, e^{-r(\tau-t)}\, e^{\alpha\tilde{K}}\, E_\theta\left[\left(e^{\Sigma_2(\tau,T)} - e^{\tilde{K}}\right)^+ \,\middle|\, \mathcal{F}_t\right] d\tilde{K}.
\]
Now, we can use the explicit solution in Proposition 3.3, the explicit solution for the non-Gaussian Ornstein-Uhlenbeck processes $Y_k(t)$ and the independent increments property of the subordinators to find an explicit formula for $\Phi(v)$. For simplicity we assume $m = 1$, i.e. that we have only one Ornstein-Uhlenbeck process. The generalisation to the case with superposition of Ornstein-Uhlenbeck processes is straightforward. First we observe that
\[
\begin{aligned}
\Phi(v) &= E_\theta\left[\int_{-\infty}^{\infty} e^{iv\tilde{K}+\alpha\tilde{K}}\left(e^{\Sigma_2(\tau,T)} - e^{\tilde{K}}\right)^+ d\tilde{K} \,\middle|\, \mathcal{F}_t\right] e^{-r(\tau-t)} \\
&= E_\theta\left[\frac{e^{\Sigma_2(\tau,T)(1+\alpha+iv)}}{(\alpha+iv)(\alpha+iv+1)} \,\middle|\, \mathcal{F}_t\right] e^{-r(\tau-t)}.
\end{aligned}
\]
Inserting the explicit solution for $\Sigma_2(\tau,T)$ we get that
\[
\begin{aligned}
e^{r(\tau-t)}\Phi(v) ={}& E_\theta\left[\frac{\exp\left(\frac{1}{T}(1+\alpha+iv)\left(\int_0^\tau Y(s)\,ds + \frac{1}{\lambda}\left(1-e^{-\lambda(T-\tau)}\right)Y(\tau) + \int_\tau^T \psi'(\theta(s))\left(1-e^{-\lambda(T-s)}\right)ds\right)\right)}{(\alpha+iv)(\alpha+iv+1)} \,\middle|\, \mathcal{F}_t\right] \\
={}& \frac{\exp\left(\frac{1}{T}(1+\alpha+iv)\left(\int_\tau^T \psi'(\theta(s))\left(1-e^{-\lambda(T-s)}\right)ds + \int_0^t Y(s)\,ds\right)\right)}{(\alpha+iv)(\alpha+iv+1)} \\
&\times E_\theta\left[\exp\left(\frac{1}{T}(1+\alpha+iv)\left(\int_t^\tau Y(s)\,ds + \frac{1}{\lambda}\left(1-e^{-\lambda(T-\tau)}\right)Y(\tau)\right)\right) \,\middle|\, \mathcal{F}_t\right].
\end{aligned}
\]
Now we use the following relations
\[
\begin{aligned}
Y(\tau) &= Y(t) - \lambda\int_t^\tau Y(s)\,ds + L(\lambda\tau) - L(\lambda t), \\
Y(\tau) &= Y(t)e^{-\lambda(\tau-t)} + \int_t^\tau e^{-\lambda(\tau-s)}\,dL(\lambda s),
\end{aligned}
\]
to see that
\[
\lambda\int_t^\tau Y(s)\,ds = Y(t)\left(1-e^{-\lambda(\tau-t)}\right) + \int_t^\tau \left(1-e^{-\lambda(\tau-s)}\right) dL(\lambda s).
\]
Let
\[
K(t,\tau) = \frac{\exp\left((1+\alpha+iv)\left(\frac{1}{T}\int_\tau^T \psi'(\theta(s))\left(1-e^{-\lambda(T-s)}\right)ds + \frac{1}{T}\int_0^t Y(s)\,ds\right)\right)}{(\alpha+iv)(\alpha+iv+1)},
\]
then
\[
\begin{aligned}
e^{r(\tau-t)}\Phi(v) ={}& K(t,\tau)\,E_\theta\left[\exp\left(\frac{1+\alpha+iv}{\lambda T}\left(Y(t)\left(1-e^{-\lambda(\tau-t)}\right) + \int_t^\tau \left(1-e^{-\lambda(\tau-s)}\right)dL(\lambda s) + \left(1-e^{-\lambda(T-\tau)}\right)Y(\tau)\right)\right) \,\middle|\, \mathcal{F}_t\right] \\
={}& K(t,\tau)\,\exp\left(\frac{1+\alpha+iv}{\lambda T}\,Y(t)\left(1-e^{-\lambda(\tau-t)}\right)\right) \\
&\times E_\theta\left[\exp\left(\frac{1+\alpha+iv}{\lambda T}\left(\left(1-e^{-\lambda(T-\tau)}\right)\left(Y(t)e^{-\lambda(\tau-t)} + \int_t^\tau e^{-\lambda(\tau-s)}\,dL(\lambda s)\right) + \int_t^\tau \left(1-e^{-\lambda(\tau-s)}\right)dL(\lambda s)\right)\right) \,\middle|\, \mathcal{F}_t\right] \\
={}& K(t,\tau)\,\exp\left(\frac{1+\alpha+iv}{\lambda T}\,Y(t)\left(1-e^{-\lambda(\tau-t)} + \left(1-e^{-\lambda(T-\tau)}\right)e^{-\lambda(\tau-t)}\right)\right) \\
&\times E_\theta\left[\exp\left(\frac{1+\alpha+iv}{\lambda T}\int_t^\tau \left(\left(1-e^{-\lambda(T-\tau)}\right)e^{-\lambda(\tau-s)} + 1 - e^{-\lambda(\tau-s)}\right)dL(\lambda s)\right) \,\middle|\, \mathcal{F}_t\right] \\
={}& K(t,\tau)\,\exp\left(\frac{1+\alpha+iv}{\lambda T}\,Y(t)\left(1-e^{-\lambda(T-t)}\right)\right) E_\theta\left[\exp\left(\frac{1+\alpha+iv}{\lambda T}\int_t^\tau \left(1-e^{-\lambda(T-s)}\right)dL(\lambda s)\right) \,\middle|\, \mathcal{F}_t\right] \\
={}& K(t,\tau)\,\exp\left(\frac{1+\alpha+iv}{\lambda T}\,Y(t)\left(1-e^{-\lambda(T-t)}\right)\right) \exp\left(\lambda\int_t^\tau \psi\left(\frac{1}{\lambda T}(1+\alpha+iv)\left(1-e^{-\lambda(T-s)}\right)\right)ds\right),
\end{aligned}
\]
where in the last step we used the independent increments property of the subordinator. Hence, we see that
\[
\begin{aligned}
\Phi(v) ={}& \frac{e^{-r(\tau-t)}}{(\alpha+iv)(\alpha+1+iv)}\exp\left((1+\alpha+iv)\frac{Y(t)}{\lambda T}\left(1-e^{-\lambda(T-t)}\right)\right) \\
&\times \exp\left((1+\alpha+iv)\left[\frac{t}{T}\sigma_R^2(t) + \frac{1}{T}\int_\tau^T \psi'(\theta(s))\left(1-e^{-\lambda(T-s)}\right)ds\right]\right) \\
&\times \exp\left(\lambda\int_t^\tau \psi\left(\frac{1}{\lambda T}(1+\alpha+iv)\left(1-e^{-\lambda(T-s)}\right)\right)ds\right),
\end{aligned}
\]
and in the more general setting, we have the formula
\[
\begin{aligned}
\Phi(v) ={}& \frac{e^{-r(\tau-t)}}{(\alpha+iv)(\alpha+1+iv)}\exp\left((1+\alpha+iv)\sum_{k=1}^m \frac{\omega_k Y_k(t)}{\lambda_k T}\left(1-e^{-\lambda_k(T-t)}\right)\right) \\
&\times \exp\left((1+\alpha+iv)\left[\frac{t}{T}\sigma_R^2(t) + \sum_{k=1}^m \frac{\omega_k}{T}\int_\tau^T \psi_k'(\theta_k(s))\left(1-e^{-\lambda_k(T-s)}\right)ds\right]\right) \\
&\times \exp\left(\sum_{k=1}^m \lambda_k\int_t^\tau \psi_k\left(\frac{\omega_k(1+\alpha+iv)}{\lambda_k T}\left(1-e^{-\lambda_k(T-s)}\right)\right)ds\right).
\end{aligned}
\]

References

[1] Barndorff-Nielsen, O. E. (1998). Processes of normal inverse Gaussian type. Finance Stoch., 2, pp. 41-68.
[2] Barndorff-Nielsen, O. E. (2001). Superposition of Ornstein-Uhlenbeck type processes. Theory Probability Appl., 45(2), pp. 175-194.
[3] Barndorff-Nielsen, O. E. and Shephard, N. (2001). Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics. J. Roy. Stat. Soc. B, 63, pp. 167-241.
[4] Benth, F. E. and Groth, M. (2006). The minimal entropy martingale measure and numerical option pricing for the Barndorff-Nielsen and Shephard stochastic volatility model. Preprint, University of Oslo, Norway.
[5] Benth, F. E. and Meyer-Brandis, T. (2005). The density process of the minimal entropy martingale measure in a stochastic volatility model with jumps. Finance Stoch., 9(4), pp. 563-575.
[6] Benth, F. E. and Saltyte-Benth, J. (2004). The normal inverse Gaussian distribution and spot price modelling in energy markets. Intern. J. Theor. Appl. Finance, 7(2), pp. 177-192.
[7] Brockhaus, O. and Long, D. (1999). Volatility swaps made simple. RISK magazine, 2(1), pp. 92-95.
[8] Carr, P., Geman, H., Madan, D. B. and Yor, M. (2005). Pricing options on realized variance. Finance Stoch., 9, pp. 453-475.
[9] Carr, P. and Lee, R. (2006). Robust replication of volatility derivatives. Preprint, Bloomberg LP and University of Chicago.
[10] Carr, P. and Lee, R. (2006). Pricing and hedging options on realized volatility and variance. Preprint, University of Chicago.
[11] Carr, P. and Lee, R. (2004). Robust hedging of volatility derivatives. Presentation, New York, September 20, 2004. Downloadable at: http://math.uchicago.edu/~rl/voltrading.pdf
[12] Carr, P. and Madan, D. B. (1998). Option valuation using the fast Fourier transform. J. Comp. Finance, 2, pp. 61-73.
[13] Detemple, J. and Osakwe, C. (2000). The valuation of volatility options. Europ. Finance Rev., 4, pp. 21-50.
[14] Eberlein, E. and Keller, U. (1995). Hyperbolic distributions in finance. Bernoulli, 1, pp. 281-299.
[15] Eberlein, E. and Raible, S. (1999). Term structure models driven by general Lévy models. Math. Finance, 9(1), pp. 31-53.
[16] Howison, S., Rafailidis, A. and Rasmussen, H. (2004). On the pricing and hedging of volatility derivatives. Appl. Math. Finance, 11(4), pp. 317-346.
[17] Lindberg, C. (2005). The estimation of a stochastic volatility model based on the number of trades. Preprint, Chalmers University, Sweden.
[18] Matytsin, A. (1999). Modelling volatility and volatility derivatives. Presentation, New York, 25 September 1999. Downloadable at: http://www.math.columbia.edu/ smirnov/Matytsin.pdf
[19] Nicolato, E. and Venardos, E. (2003). Option pricing in stochastic volatility models of the Ornstein-Uhlenbeck type. Math. Finance, 13(4), pp. 445-466.
[20] Windcliff, H., Forsyth, P. A. and Vetzal, K. R. (2006). Pricing methods and hedging strategies for volatility derivatives. J. Banking Finance, 30, pp. 409-431.
IV

The implied risk aversion from utility indifference option pricing in a stochastic volatility model

Fred Espen Benth, Martin Groth and Carl Lindberg

Submitted

Abstract. In recent decades, there has been a growing interest in utility indifference based approaches to the pricing of derivatives in incomplete markets. In this paper we consider a stochastic volatility model defined as a positive non-Gaussian Ornstein-Uhlenbeck process, and price Call and Put options using the indifference methodology in the case of exponential utility. The purpose of the study is to investigate empirically the implied risk aversion for a representative agent in the option market, as a function of time to maturity and strike price. Our studies are based on price data for two companies, Microsoft and Volvo, where we calibrate the stochastic volatility model using historical price returns. The implied risk aversion is found by numerically inverting the indifference pricing equation, given observed option prices. The numerical inversion involves solving an integro-partial differential equation. We find that the option prices in the market are basically set by the issuer, in the sense that it is the issuer's indifference prices that match the market prices. Since the stochastic volatility model explains the stylized facts of returns rather well, we expect the implied risk aversion to be rather flat with respect to maturity and strike price of the options. We find on the contrary a clear smile effect for short dated options, which may be explained by the issuer's fear of a market crash (in the case of the issuance of a Put option). Although the stochastic volatility model explains the heavy tails of the returns, the crash risk seems to be unexplained by the stochastic volatility model.

Date: 8 January 2007.
Key words and phrases. Stochastic volatility, utility indifference option pricing, risk aversion, Lévy processes.

1. Introduction

The volatility smile is a well-known signature of the mismatch between the theoretical Black & Scholes price and the realized market price of Call options. The Black & Scholes pricing paradigm supposes a frictionless market where hedging of the option can be done continuously at no cost and (logarithmic) returns of the underlying asset are independent and normally distributed. In reality, transaction costs are incurred when trading in the market, and returns may be dependent and leptokurtic. Many models have been suggested, going beyond the geometric Brownian motion, to explain the stylized facts of observed asset price returns and the volatility smile. In recent years, the stochastic volatility model of Barndorff-Nielsen and Shephard [3] has gained a lot of attention for its flexibility in explaining both the heavy tails and the dependency structure of asset returns. They propose to use a geometric Brownian motion model for the asset price dynamics, where the volatility (in fact the squared volatility) follows a sum of non-Gaussian Ornstein-Uhlenbeck processes. The model is sufficiently sophisticated for a precise modeling of asset returns, besides being analytically tractable for derivatives pricing and portfolio optimization (see Benth, Karlsen and Reikvam [5] and Lindberg [14], [15]).

The crucial insight of Black & Scholes [8] and Merton [18] in their seminal papers is the independence of risk preferences in the pricing of options. However, this is strongly related to the hypothesis of completeness of the market, which in practice does not hold. A perfect hedge of an option is not possible in the real market, thus incurring a certain risk associated with issuing (or being short) an option.
Therefore, the price of the option will be a reflection of the cost of a partial hedge together with a premium charged for taking on the unhedgeable risk. The latter is dependent on the issuer's risk preferences. Using the stochastic volatility model of Barndorff-Nielsen and Shephard [3] puts us in an incomplete market, and the question of option pricing involves choosing a risk-neutral pricing measure (or an equivalent martingale measure). This can be done by appealing to techniques which take the risk preferences of the investor directly into account.

In the last decades, utility indifference pricing has become an increasingly popular tool for a theoretical analysis of the pricing problem in incomplete markets. First proposed by Hodges and Neuberger [13] for pricing of Call options on a geometric Brownian motion stock dynamics in a market with transaction costs, it has later been used for other stock price models and different market set-ups. Closely related to our paper are Becherer [4] and Rheinländer and Steiger [21]. The utility indifference approach is usually based on the choice of an exponential utility function, since then it is in most cases possible to derive explicit prices (or at least efficiently computable prices) and these prices coincide with the Black & Scholes price when the market context "degenerates to the complete case". The exponential utility function has one parameter, measuring the risk aversion of the investor. Letting the risk aversion tend to zero we obtain a price which coincides with the one induced from the minimal entropy martingale measure (see Benth and Meyer-Brandis [7] and Rheinländer and Steiger [21]). An alternative approach is to choose a martingale measure based on a structure preserving Esscher transform (see Nicolato and Venardos [19]). However, such an approach does not take into account any risk preferences of the investor explicitly (although one implicitly conjectures a risk preference by choosing this transform).
In Nicolato and Venardos [19] and Benth and Groth [6] it is demonstrated that a volatility smile is produced when using the stochastic volatility model of Barndorff-Nielsen and Shephard. In the former paper, an analysis of option prices for the S&P500 index is performed when a leverage effect is included in the dynamical model. Benth and Groth [6] price options under the minimal entropy martingale measure using a numerical solution of an integro-partial differential equation.

The purpose of this paper is to investigate the implied risk aversion from option prices. To the best of our knowledge, no one has so far investigated this practical approach to utility indifference pricing. Based on a hypothesis that the underlying asset price dynamics follows the Barndorff-Nielsen and Shephard model and that there is a representative agent in the market pricing options using a utility indifference method with exponential utility, we back out the implied risk aversion from theoretical prices. The theoretical prices for a given risk aversion can be calculated by solving numerically a nonlinear integro-partial differential equation, being a generalization of the Black & Scholes equation. Backing out the implied risk aversion from market prices for options, we are able to study the risk aversion as a function of maturity time and exercise price of the option. Of course, if the market were using a utility indifference pricing approach, the implied risk aversion should be flat. We investigate this question empirically for options written on two stocks: Microsoft listed at NYSE and Volvo listed at the Swedish stock exchange OMX Stockholmsbörsen. The former is a very liquid asset and option, while the latter is traded in a significantly thinner market. Using historical time series for the asset prices and trading volumes, we fit the stochastic volatility model.
The estimation procedure is based on a technique developed by Lindberg [16], efficiently calibrating the stochastic volatility model with a high degree of statistical precision. From this we calculate option prices by solving an integro-partial differential equation using advanced numerical methods.

Our results indicate that prices are in favour of the issuer, since the observed trade prices are above the minimal entropy martingale measure prices. We also find a smile, or rather a smirk, effect in the implied risk aversion. The results tell us that even when using a highly sophisticated stochastic volatility model, which explains the dependency and distributional properties of the returns close to perfectly, together with a pricing approach taking risk preferences into account, there are still risks unaccounted for. The obvious explanation is of course that we have not taken transaction costs into account. However, this cannot be the only reason, since a large part of the option trades are naked, that is, the short position is not covered by a hedge, thus making transaction costs irrelevant. There is also an interesting shape of the implied risk aversion which may be explained by differences in out-of-the-money and in-the-money positions. We find that although our stochastic model for the asset prices includes heavy-tailed returns, the market is pricing in a premium for potential crashes.

The paper is organized as follows. We introduce the Barndorff-Nielsen and Shephard stochastic volatility model in Section 2 together with short sections about the minimal entropy martingale measure and utility indifference pricing. Section 3 contains the estimation of the parameters in the model. We solve the indifference pricing problem numerically in Section 4, i.e. calculate theoretical option prices. Finally in Section 5 we use the numerical framework and market prices to backtrack the implied risk aversion in the market for the two option classes studied.

2. The model

2.1. Model definitions. For $0 \le t \le T < \infty$, we assume as given a complete probability space $(\Omega, \mathcal{F}, P)$ with a filtration $\{\mathcal{F}_t\}_{0\le t\le T}$ satisfying the usual conditions. We take a subordinator $L$, and denote its Lévy measure by $l(dz)$. A subordinator is defined to be a Lévy process taking values in $[0,\infty)$, which gives that its sample paths are increasing. The Lévy measure $l$ of a subordinator satisfies the condition
\[
\int_{0+}^{\infty} \min(1,z)\,l(dz) < \infty.
\]
We assume that we use the càdlàg version of $L$. Denote by $Y$ the OU stochastic process whose dynamics are governed by
\[
dY(t) = -\lambda Y(t)\,dt + dL(\lambda t), \tag{2.1}
\]
where $\lambda > 0$ denotes the rate of decay. We call processes with these dynamics news processes. The unusual timing of $L$ is chosen so that the marginal distribution of $Y$ will be unchanged for any value of $\lambda$. The stationary news process $Y$ can be written as
\[
Y(t) = Y_0 e^{-\lambda t} + \int_0^t e^{-\lambda(t-u)}\,dL(\lambda u), \qquad t \ge 0, \tag{2.2}
\]
where $Y_0 := Y(0)$, and we assume that $L(0) = 0$. The variable $Y_0$ has the stationary marginal distribution of the process and is independent of $L(t) - L(0)$, $t \ge 0$. Further, if $Y_0 \ge 0$, then $Y(t) > 0$ for all $t \in [0,T]$, since $L$ is non-decreasing. The square root of the process $Y$ is called the volatility, denoted by $\sqrt{Y}$. In general, the volatility can be expressed as a linear combination of Ornstein-Uhlenbeck processes of the form in Equation (2.2). However, we consider for simplicity the case with only one such process.

Consider a Wiener process $W$ independent of $L$. We use the filtration
\[
\{\mathcal{F}_t\}_{0\le t\le T} := \{\sigma(W(t), L(\lambda t))\}_{0\le t\le T},
\]
to make the OU process and the Wiener process simultaneously adapted. Define the stock price $S$ to have the dynamics
\[
dS(t) = S(t)\left[\left(\mu + \beta Y(t)\right)dt + \sqrt{Y(t)}\,dW(t)\right],
\]
where $\mu$ is the constant mean rate of return, and $\beta$ is the skewness parameter. This dynamics implies the explicit stock price process
\[
S(t) = S(0)\exp\left(\int_0^t \left(\mu + \left(\beta - \tfrac{1}{2}\right)Y(u)\right)du + \int_0^t \sqrt{Y(u)}\,dW(u)\right). \tag{2.3}
\]
The model allows for the increments of the logreturns $R(t) := \log\left(S(t)/S(0)\right)$ to have semi-heavy tails as well as both volatility clustering and skewness. The increments of the logreturns $R$ are stationary since
\[
R(s) - R(t) = \log\frac{S(s)}{S(0)} - \log\frac{S(t)}{S(0)} = \log\frac{S(s)}{S(t)} \overset{\mathcal{L}}{=} R(s-t), \tag{2.4}
\]
where "$\overset{\mathcal{L}}{=}$" denotes equality in law. We assume the usual risk-free bond dynamics
\[
dB(t) = rB(t)\,dt,
\]
with a constant interest rate $r > 0$.

2.2. The minimal entropy martingale measure. We recall a few results from [7] for the convenience of the reader. Assume that the Lévy measure $l$ satisfies
\[
\int_1^\infty \left\{e^{\alpha z} - 1\right\} l(dz) < \infty,
\]
for the constant
\[
\alpha = \frac{\beta^2}{\lambda}\left(1 - e^{-\lambda T}\right).
\]
It is shown in [7] that under this condition on $l$, the density process of the minimal entropy martingale measure (MEMM), denoted by $Q^{ME}$, can be represented as
\[
Z(t) := Z^W(t)\,Z^L(t),
\]
where
\[
Z^W(t) = \exp\left(-\int_0^t \frac{\mu + \beta Y(u)}{\sqrt{Y(u)}}\,dW(u) - \frac{1}{2}\int_0^t \frac{(\mu + \beta Y(u))^2}{Y(u)}\,du\right),
\]
and
\[
Z^L(t) = \exp\left(\int_0^t\int_0^\infty \log\delta(Y(u),z,u)\,N(dz,du) + \int_0^t\int_0^\infty \left(1 - \delta(Y(u),z,u)\right) l(dz)\,du\right)
\]
for the Poisson random measure $N(dz,du)$ of $L$. The function $\delta(y,z,t)$ is defined as
\[
\delta(y,z,t) := \frac{H(t,y+z)}{H(t,y)},
\]
where
\[
H(t,y) = E\left[\exp\left(-\frac{1}{2}\int_t^T \left(\frac{\mu^2}{Y(u)} + 2\mu\beta + \beta^2 Y(u)\right)du\right) \,\middle|\, Y(t) = y\right], \tag{2.5}
\]
for $(t,y) \in [0,T]\times\mathbb{R}_+$. It turns out that $H(t,y)$ solves the integro-pde
\[
\partial_t H(t,y) - \frac{1}{2}\left(\frac{\mu^2}{y} + 2\mu\beta + \beta^2 y\right)H(t,y) + \mathcal{L}_\sigma H(t,y) = 0, \tag{2.6}
\]
for $(t,y) \in [0,T)\times\mathbb{R}_+$ with
\[
\mathcal{L}_\sigma H(t,y) = -\lambda y\,\partial_y H(t,y) + \lambda\int_{0+}^{\infty}\left\{H(t,y+z) - H(t,y)\right\} l(dz),
\]
and terminal data $H(T,y) = 1$, $y \in \mathbb{R}_+$.

2.3. Utility indifference pricing. The concept of utility indifference pricing was proposed by [13]. The idea springs from realizing that in incomplete markets, arbitrage pricing theory does not give unique option prices, so additional criteria are required. The utility indifference price for an issuer of an option is the price at which she is indifferent between selling a contract and entering the market on her own account.
The approach requires that the investor chooses a utility function, the most common one being the exponential utility function
\[
U(x) = 1 - \exp(-\gamma x),
\]
where $\gamma > 0$ is the risk aversion parameter. This choice has the advantage that the price of the option becomes independent of the issuer's wealth, but most of all it allows for explicit computations. For a mathematical foundation of the following analysis, we refer to Becherer [4], Benth and Meyer-Brandis [7] and Rheinländer and Steiger [21].

We denote by $\mathcal{A}$ the set of $\mathcal{F}_t$-adapted controls $\pi$ for which there exist wealth processes $X^\pi(t)$ that solve
\[
dX(u) = X(u)\left[\pi(u)\left(\mu + \beta Y(u)\right)du + r\,du + \pi(u)\sqrt{Y(u)}\,dW(u)\right], \qquad X(t) = x.
\]
The value function for the optimal control problem, if the investor does not issue a claim, is
\[
V^0(t,x,y) = \sup_{\pi\in\mathcal{A}} E\left[1 - \exp\left(-\gamma X(T)\right) \,\middle|\, X(t) = x,\, Y(t) = y\right].
\]
If the investor issues a claim $f(S(T))$, the value function becomes
\[
V(t,x,y,s) = \sup_{\pi\in\mathcal{A}} E\left[1 - \exp\left(-\gamma\left(X(T) - f(S(T))\right)\right) \,\middle|\, X(t) = x,\, Y(t) = y,\, S(t) = s\right].
\]
Hence the utility indifference price for the claim $f(S(T))$ is given by the unique solution $\Lambda^{(\gamma)}(t,y,s)$ to the equation
\[
V^0(t,x,y) = V\left(t,\, x + \Lambda^{(\gamma)}(t,y,s),\, y,\, s\right).
\]
Provided the value functions are sufficiently smooth we can apply the dynamic programming method to solve the two stochastic control problems. In the process we derive the Hamilton-Jacobi-Bellman (HJB) equations associated with the value functions. It happens that equation (2.6) corresponds to the first case, when no claim is issued. Solving for the second value function, when a claim is issued, we arrive at the HJB-equation for the utility indifference price of the option. The form of the integro-pde depends on whether we look at the problem from the seller's or the buyer's side, differing only in the sign of terms in the equation.
The integro-pde governing the price $\Lambda^{(\gamma)}$ for the issuer of a claim becomes
\[
\begin{aligned}
r\Lambda^{(\gamma)} ={}& \Lambda^{(\gamma)}_t + \frac{1}{2}ys^2\Lambda^{(\gamma)}_{ss} - \lambda y\Lambda^{(\gamma)}_y + rs\Lambda^{(\gamma)}_s \\
&+ \lambda\int_0^\infty \frac{1}{\gamma}\left(\exp\left(\gamma\left(\Lambda^{(\gamma)}(t,y+z,s) - \Lambda^{(\gamma)}(t,y,s)\right)\right) - 1\right)\frac{H(t,y+z)}{H(t,y)}\,l(dz),
\end{aligned} \tag{2.7}
\]
with $\Lambda^{(\gamma)}(T,y,s) = f(s)$, for $(t,y,s) \in [0,T)\times\mathbb{R}^2_+$, where $H$ is given by Equation (2.6). Hence, to obtain the price $\Lambda^{(\gamma)}$ one has to solve a system of two coupled integro-pde. For completeness, we also include the integro-pde for the indifference price of the buyer of the option, denoted $\tilde{\Lambda}^{(\gamma)}$:
\[
\begin{aligned}
r\tilde{\Lambda}^{(\gamma)} ={}& \tilde{\Lambda}^{(\gamma)}_t + \frac{1}{2}ys^2\tilde{\Lambda}^{(\gamma)}_{ss} - \lambda y\tilde{\Lambda}^{(\gamma)}_y + rs\tilde{\Lambda}^{(\gamma)}_s \\
&- \lambda\int_0^\infty \frac{1}{\gamma}\left(\exp\left(-\gamma\left(\tilde{\Lambda}^{(\gamma)}(t,y+z,s) - \tilde{\Lambda}^{(\gamma)}(t,y,s)\right)\right) - 1\right)\frac{H(t,y+z)}{H(t,y)}\,l(dz),
\end{aligned}
\]
with $\tilde{\Lambda}^{(\gamma)}(T,y,s) = f(s)$, for $(t,y,s) \in [0,T)\times\mathbb{R}^2_+$.

The lowest acceptable utility indifference price for an issuer of a claim is reached when the risk aversion $\gamma$ tends to zero. This price coincides with the arbitrage free price under MEMM, but also with the maximal utility indifference price for a buyer of the same claim. This makes MEMM particularly interesting to study. In the risk aversion limit $\gamma \downarrow 0$, equation (2.7) simplifies to (see [7])
\[
r\Lambda = \Lambda_t + \frac{1}{2}ys^2\Lambda_{ss} - \lambda y\Lambda_y + rs\Lambda_s + \lambda\int_0^\infty \left(\Lambda(t,y+z,s) - \Lambda(t,y,s)\right)\frac{H(t,y+z)}{H(t,y)}\,l(dz). \tag{2.8}
\]
We have used the short-hand notation $\Lambda^{(0)} := \Lambda$ here.

It is well known that a higher risk aversion leads to higher issuer prices, so if the option prices we observe in the market are higher than the prices under MEMM, we can assume the issuer has a risk aversion $\gamma > 0$. If they, on the other hand, are lower, the same applies but for the buyer. Using Equation (2.7), market prices and a root finding algorithm, we shall find the implied risk aversion from the market.

3. On estimating the BNS model to price and volume data

In this section we use the approach from [16] to analyze observed asset prices from two stocks, Microsoft and Volvo. The estimation approach of [16] involves using both observed stock prices and the traded volume of the asset.
The latter is used to get information on the volatility variations. We have available time series of daily adjusted closing prices and daily trading volume for the Microsoft stock traded at the New York Stock Exchange in the period 1 January 2004 to 18 September 2006. For the Swedish company Volvo we also have daily adjusted closing prices and daily traded volume of its B shares, collected from the OMX Stockholmsbörsen over the time period 1 August 2004 to 30 December 2005.

We start by presenting a discrete time version of the BNS model together with the method to fit this to the observations. Next, we apply the estimation method to the available data sets.

Assume that the logreturns
$$R_c(\Delta),\ R_c(2\Delta) - R_c(\Delta),\ \ldots,\ R_c(d\Delta) - R_c((d-1)\Delta)$$
are observed, with $R_c$ defined by Equation (2.4). From now on, $\Delta$ is assumed to be one day, and the number of consecutive observations in our time series data is $d + 1$. It is reasonable to assume that the approximation
$$\int_{t-\Delta}^{t} \sqrt{Y(s)}\, dW(s) \approx \sqrt{Y(t)}\, \varepsilon(t) \tag{3.1}$$
holds, with $\varepsilon(t) \sim N(0, 1)$ being i.i.d., unless some $\lambda_j$ are large so that the volatility processes will be volatile. The model in Equations (2.3) and (2.4) then takes the discrete time form
$$R(t) = \mu + \beta Y(t) + \sqrt{Y(t)}\, \varepsilon(t), \tag{3.2}$$
where $t = 1, 2, \ldots$, and $\varepsilon(\cdot)$ is a sequence of independent $N(0, 1)$ variables.

It was argued in [16] that one should not try to fit the logreturns directly to data. This is due to the severe parameter instability, or large flexibility, of many of the marginal distributions typically used in finance, such as the normal inverse Gaussian (NIG) distribution. Instead, it was proposed that one should try to measure $Y$ with parameters $\mu$ and $\beta$ such that the empirical normalized logreturns
$$\xi(\cdot) := \frac{R(\cdot) - (\mu + \beta Y(\cdot))}{\sqrt{Y(\cdot)}} \tag{3.3}$$
are i.i.d. and $N(0, 1)$. If we can do this, it is easy to model $Y$ within the framework in [3], thanks to the large flexibility of the BNS model.
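The discrete time model and the normalization of the logreturns can be illustrated with a short simulation. This is a minimal sketch under stated assumptions: the variance series is drawn from a gamma distribution purely as a hypothetical stand-in for an observed volatility proxy, the drift parameters are illustrative values, and the function names are not taken from [16].

```python
import math
import random
import statistics


def simulate_returns(variances, mu, beta, rng):
    """Draw R(t) = mu + beta * Y(t) + sqrt(Y(t)) * eps(t) with eps(t) ~ N(0, 1)."""
    return [mu + beta * y + math.sqrt(y) * rng.gauss(0.0, 1.0) for y in variances]


def normalized_returns(returns, variances, mu, beta):
    """Compute xi(t) = (R(t) - (mu + beta * Y(t))) / sqrt(Y(t))."""
    return [(r - (mu + beta * y)) / math.sqrt(y) for r, y in zip(returns, variances)]


def kurtosis(sample):
    """Plain sample kurtosis (equal to 3 for a normal distribution)."""
    mean = statistics.fmean(sample)
    second = statistics.fmean([(v - mean) ** 2 for v in sample])
    fourth = statistics.fmean([(v - mean) ** 4 for v in sample])
    return fourth / second ** 2


rng = random.Random(0)
# Hypothetical positive variance series standing in for the volatility proxy.
variances = [rng.gammavariate(2.0, 5e-5) for _ in range(5000)]
mu, beta = -7.7e-4, 8.65  # illustrative drift parameters

returns = simulate_returns(variances, mu, beta, rng)
xi = normalized_returns(returns, variances, mu, beta)

print("kurtosis of raw returns:       ", round(kurtosis(returns), 2))
print("kurtosis of normalized returns:", round(kurtosis(xi), 2))
print("mean and std of normalized returns:",
      round(statistics.fmean(xi), 3), round(statistics.pstdev(xi), 3))
```

When the variance series is measured correctly, the normalized series should look like standard normal noise, while the raw returns inherit heavier tails from the random variance. With real data one would follow this with formal normality tests.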
Normalizing the returns in this way verifies the validity of the discrete time model, and allows us to better understand the structure of the process that generated the returns $R(\cdot)$. It is important to get $\xi(\cdot)$ and the model for $Y(\cdot)$ correct, since it is these quantities that generate the model, and hence contain the key to understanding it. The next priority is to get the parameters for the distribution of $Y(\cdot)$. Equation (3.2) gives an implied distribution of the returns $R(\cdot)$ that we have a good comprehension of.

The procedure is illustrated by using NIG($\mu, \beta, \delta, \gamma$) as the marginal distribution of the returns $R(\cdot)$. This implies that the volatility $Y(\cdot)$ has an inverse Gaussian distribution IG($\delta, \gamma$). We proceed as follows.

1. Find volatility processes $Y(\cdot)$ and parameters $\mu$ and $\beta$ for each stock so that the normalized returns $\xi(\cdot)$ become independent $N(0, 1)$.

For this purpose, we assume that the discrete time volatility process $Y(\cdot)$ is a constant times some measure of trading intensity $z(\cdot)$ on each trading day, i.e.,
$$Y(\cdot) = \theta z(\cdot). \tag{3.4}$$
The idea of this model is to try to verify that a function of some measure of trading intensity can be used as $Y(\cdot)$ in Equation (3.3) to obtain $\xi(\cdot)$ that are i.i.d. and $N(0, 1)$. We then model $Y(\cdot)$ within the framework of [3]. If we can do this, we have asserted that our continuous time stochastic volatility model is reasonable. Furthermore, we get an economical interpretation of the volatility.

Note that we do not claim that the number of trades, the number of traded stocks, or any other measure of trading intensity, can always be used to model the volatility for all stocks. However, we have experienced that very often one can use such measures to obtain good estimates of the volatility for relatively long periods of time. Advantages are that we can get stable parameter estimates easily and with only daily data. The next step of the estimation procedure is:

2. Estimate parameters $\delta$ and $\gamma$ so that the empirical distributions of $Y(\cdot)$ from Equation (3.4) fit the IG($\delta, \gamma$) distribution.

Hence, we have specified the NIG-distribution for $R(\cdot)$. We could do this estimation simultaneously for IG and NIG. However, since the NIG-distribution is very flexible and unstable, we know that even if we would get a slightly better fit this way, it would be at the cost of less understanding of the process. The third and final step in the calibration of the BNS-model says:

3. Use the estimates of the volatility processes $Y(\cdot)$ to estimate the rates of decay $\lambda_j$.

This is done by matching the empirical autocorrelation function with the autocorrelation function of the continuous time volatility process $Y$. The autocorrelation $\rho_Y$ of the volatility process becomes
$$\rho_Y(h) = \frac{\mathrm{Cov}(Y(h), Y(0))}{\mathrm{Var}(Y(0))} = \exp(-\lambda |h|), \qquad h \ge 0.$$
The rate of decay $\lambda$ is therefore obtained from the discrete time volatilities $Y(1), \ldots, Y(d)$ by minimizing the least-squares distance between the theoretical and empirical autocorrelation functions.

We now move on to implement this statistical approach to calibrating the stock price process and its stochastic volatility model to observed prices and volume data. We discuss mainly the statistical analysis for the Microsoft stock, and report only some major results for the Volvo stock.

3.1. Microsoft. For Microsoft, we choose
$$Y = \theta \times (\text{Normalized Traded Volume})^{3/2}$$
as a simple model for the volatility, where the exponent 3/2 was picked ad hoc since it gave nice normalized returns. This parameter could of course also be made part of the optimization algorithm, but in our experience, the results remain approximately the same for exponents between 1 and 2. Further, we have no economical intuition as to why we should prefer one exponent over another. To get a better scaling, we use 'Normalized Traded Volume', which is the traded volume divided by its standard deviation. This model turns out to give a good fit.
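Step 3 of the procedure, matching the empirical autocorrelation function to $\exp(-\lambda |h|)$ in the least-squares sense, can be sketched as follows. This is a minimal illustration under assumptions of its own: the volatility path is simulated from a discretely sampled process with exponential innovations (a hypothetical choice, not the inverse Gaussian model of the text), ten lags are used, and the minimization is a plain grid search rather than whatever optimizer was used for the reported estimates.

```python
import math
import random


def autocorrelation(series, max_lag):
    """Empirical autocorrelation of `series` at lags 1, ..., max_lag."""
    n = len(series)
    mean = sum(series) / n
    centred = [v - mean for v in series]
    variance = sum(c * c for c in centred) / n
    return [sum(centred[t] * centred[t + h] for t in range(n - h)) / (n * variance)
            for h in range(1, max_lag + 1)]


def fit_decay_rate(acf, grid):
    """Return the grid value of lambda minimizing sum_h (exp(-lambda*h) - acf(h))^2."""
    def squared_distance(lam):
        return sum((math.exp(-lam * h) - rho) ** 2
                   for h, rho in enumerate(acf, start=1))
    return min(grid, key=squared_distance)


# Simulated daily volatility path with autocorrelation exp(-true_rate * h).
rng = random.Random(1)
true_rate = 0.5
phi = math.exp(-true_rate)
level = 1.0
path = []
for _ in range(20000):
    level = phi * level + rng.expovariate(1.0)  # positive jumps, hypothetical law
    path.append(level)

acf = autocorrelation(path, max_lag=10)
grid = [0.01 * k for k in range(1, 501)]  # candidate rates 0.01, ..., 5.00
estimate = fit_decay_rate(acf, grid)
print("true rate:", true_rate, "  estimated rate:", round(estimate, 2))
```

A grid search is used here only because it is simple and cannot get stuck; any one-dimensional minimizer could replace it.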
Judging from Figures 1 and 2, we have little reason to suspect that $\xi$ would not come from an i.i.d. sample, although the autocorrelation for $|\xi(\cdot)|$ shows a significant positive dependence on a few too many lags. Moreover, the empirical cdf of $\xi$ and the normal probability plot in Figure 3 indicate a very nice fit of $\xi$ to the normal distribution, in particular when compared to the normal probability plot of the raw returns, see Figure 3. Further, $\xi$ passes the Kolmogorov-Smirnov test for normality with a p-value of 0.97, as well as the Jarque-Bera normality test based on skewness and kurtosis with a p-value of 0.12.

Figure 1. Left: The estimated autocorrelation function for the absolute normalized returns $|\xi|$ for the Microsoft stock from January 1, 2004, to September 18, 2006. Right: The estimated autocorrelation function for the normalized returns for Microsoft during the same time period. The figures show the first 40 lags, and the straight lines parallel to the x-axes are the asymptotic 95% confidence bands, which are given here as $\pm 1.96/\sqrt{\text{number of observations}}$.

Figure 2. Left: The normalized returns $\xi$ for the Microsoft stock during January 1, 2004, to September 18, 2006. Right: The empirical cdf for $\xi$ for Microsoft during the same time period, and the standard normal cdf.

Since the mean value parameters $\mu$ and $\beta$ are connected through the relation
$$E[R] = \mu + \beta E[Y], \tag{3.5}$$
it is misleading to look at confidence intervals for these parameters. Instead, we check robustness of the results by testing the hypothesis $H_0: \mu = \beta = 0$.
Under this hypothesis, the Kolmogorov-Smirnov and the Jarque-Bera tests give the p-values 0.83 and 0.09 respectively, which indicates that the model is not very sensitive to these parameters. Under $H_0$, we can use standard normal statistical theory to get a 95% confidence interval for $\theta$. The interval is $[2.51 \times 10^{-5},\ 3.12 \times 10^{-5}]$. Since the effect of $\mu$ and $\beta$ is small, the confidence interval for $\hat{\theta}$ without the hypothesis $H_0$ will be similar. However, it is hard to calculate this exactly.

The implied NIG-distribution and the estimated IG-distribution fit their empirical densities well, see Figure 4. In addition, the volatility process $Y$ has the characteristic look of a news process, see Figure 5. The parameter estimates are $\hat{\mu}_M = -7.70 \times 10^{-4}$, $\hat{\beta}_M = 8.65$, $\hat{\delta}_M = 0.0186$, $\hat{\gamma}_M = 194$, $\hat{\lambda}_M = 1.14$, and $\hat{\theta}_M = 2.78 \times 10^{-5}$.

Figure 3. Left: The normal probability plot of the normalized returns $\xi$ for the Microsoft stock during January 1, 2004, to September 18, 2006. Right: The normal probability plot of the returns for Microsoft during the same time period. The theoretical quantiles are on the y-axes.

Figure 4. Left: Plot of empirical density of the returns and the implied NIG density obtained from the estimated IG density for the Microsoft stock during January 1, 2004, to September 18, 2006. Right: Plot of empirical density of $\hat{\theta} \cdot (\text{Number of trades per day})$ and the estimated IG-density during the same time period.

3.2. Volvo.
For Volvo, the model
$$Y = \theta \times (\text{Normalized Traded Volume})^{2}$$
was used, where, analogous to above, the 2 in the exponent was chosen because it gave good normalized returns, but could equally well have been part of the optimization procedure. The same figures as in the analysis of the Microsoft stock all looked good, see for example Figure 6. The p-values for the Kolmogorov-Smirnov test and the Jarque-Bera test were 0.73 and 0.72, respectively. The parameter estimates are $\hat{\mu}_V = 6.21 \times 10^{-4}$, $\hat{\beta}_V = 1.27$, $\hat{\delta}_V = 0.0116$, $\hat{\gamma}_V = 54.2$, $\hat{\lambda}_V = 0.83$, and $\hat{\theta}_V = 6.63 \times 10^{-5}$.

4. Solving the integro-pde for indifference pricing numerically

We saw in Section 2.3 that the utility indifference price of a claim could be represented as the solution of a coupled system of integro-pdes. Numerical solution of integro-pdes in the context of finance has been studied extensively over the last decade.

Figure 5. Left: The price process in USD for the Microsoft stock from January 1, 2004, to September 18, 2006. Right: The estimated volatility process $\hat{\theta} \cdot (\text{Number of trades per day})^{3/2}$ for Microsoft during the same time period.

Figure 6. Left: The normal probability plot of the normalized returns $\xi$ for Volvo B during August 1, 2004, to December 30, 2005. The theoretical quantiles are on the y-axes. Right: The estimated autocorrelation function for the absolute normalized returns $|\xi|$ for the Volvo B stock from August 1, 2004, to December 30, 2005. The figure shows the first 40 lags, and the straight lines parallel to the x-axes are the asymptotic 95% confidence bands $\pm 1.96/\sqrt{\text{number of observations}}$.
For Lévy processes the finite difference method has been used by Andersen and Andreasen [1] and Cont and Voltchkova [9]. Finite element methods for Lévy driven processes were studied by Matache, Petersdorff and Schwab [17], and stochastic volatility models driven by Brownian motions by Hilber, Matache and Schwab [12]. For the BNS model we build upon work by Benth and Groth [6], who use finite differences to solve Equation (2.8) and find option prices under the minimal entropy martingale measure. Since we are interested in both the MEMM prices and those derived from the general risk aversion in Equation (2.7), we must adapt the methodology used in [6].

Solving Equation (2.8) with finite differences implies restricting the equation to a finite grid. The problem is by nature unbounded, since the stock price and volatility could in theory take arbitrarily large values. Because of the restriction to a finite grid we need to find appropriate boundary conditions where necessary. We also have to approximate the non-local integral term on a sufficient range of points. The approximation should be able to capture the main influence from the integral, since the Lévy measure will kill off the integral for sufficiently large $z$. For simplicity we use a simple trapezoid scheme to approximate the integral.

    Boundary        Boundary condition
    s = 0           Dirichlet
    s = Smax        Dirichlet
    y = 0           Neumann
    y = Ymax        Dirichlet

Table 1. Boundary conditions for the integro-pdes. The Dirichlet condition is to use appropriate Black-Scholes prices, while we have a strong reflection giving a Neumann condition at y = 0.

To handle the two-dimensional problem we use Godunov operator splitting [11], following suggestions by Strang [22]. This gives us two one-dimensional equations which we solve iteratively. It is possible that the subordinator $L(t)$ is of infinite activity, which gives the Lévy measure a singularity at zero.
Since the singularity cannot be handled by the trapezoid scheme, we add a diffusion term to make up for the part of the integral close to zero.

Regarding the general risk aversion equation, for numerical stability we make the change of variable
$$\Lambda^{(\gamma)}(t, y, s) = \frac{1}{\gamma} \ln h^{(\gamma)}(t, y, s).$$
This transforms Equation (2.7) into the non-linear integro-pde
$$\partial_t h^{(\gamma)} + \frac{1}{2} y s^2 \partial_{ss} h^{(\gamma)} + r s\, \partial_s h^{(\gamma)} - \frac{1}{2} y s^2 \frac{(\partial_s h^{(\gamma)})^2}{h^{(\gamma)}} + \mathcal{L}^{memm} h^{(\gamma)} = r h^{(\gamma)} \tag{4.1}$$
with terminal condition $h^{(\gamma)}(T, y, s) = \exp(\gamma f(s))$. Here
$$\mathcal{L}^{memm} h(t, y) = -\lambda y\, \partial_y h(t, y) + \lambda \int_0^{\infty} \{h(t, y+z) - h(t, y)\} \frac{H(t, y+z)}{H(t, y)}\, \ell(dz).$$
This is a nonlinear integro-pde, where the only nonlinearity in the equation is in the quadratic term $(\partial_s h^{(\gamma)})^2 / h^{(\gamma)}$. We remark that this non-linear term is less severe to handle than the appearance of an exponential term in the integrand.

For Equation (2.8) we use implicit schemes, deriving a Lax-Wendroff scheme for the non-homogeneous equation involving the integral. For Equation (4.1) we need to use an explicit scheme for the non-linear one-dimensional equation. This forces us to take significantly shorter time steps when running the solver.

Benth and Groth [6] derive suitable boundary conditions for the integro-pde, which we have collected in Table 1. The Dirichlet boundary conditions mean using Black-Scholes prices at the boundaries, i.e. as the variables go to infinity the prices will adjust to the corresponding Black-Scholes prices. Further motivation for the choice of boundary conditions and the methodology applied to handle the integral can be found in [6]. Boundary conditions for Equation (4.1) are similar. For the sake of visualization we have used interpolation between the points in the data set where necessary to plot the result.

4.1. MEMM prices. Given the parameters estimated above we can use the implemented solver to calculate option prices under MEMM.
We know that theoretically the MEMM price is the highest price the buyer and the lowest price the seller can agree on. Comparing with bid/ask prices indicates whether the market is in favour of either one of them. If the MEMM prices are below the bid prices the market will be in favour of the seller, while if the ask prices are below the MEMM prices the opposite is the case. This also gives us an indication of who takes the greatest risk in the market.

4.1.1. Microsoft. The calibration data for the Microsoft stock runs until September 18, 2006, so for comparison with the calculated MEMM prices we take bid/ask prices from September 18, 2006, for a range of options with different strikes and maturities. The spot price at the time was $26.85 and we assume a fixed interest rate of 4.94%, which was the three month treasury yield at the time. The Microsoft stock is highly traded and liquid, and the option market for Calls and Puts has good liquidity as well.

Looking at the illustration in Figure 7(a) we see that MEMM prices are significantly lower than the bid prices for Call options, which is also true for Put options. This clearly suggests that the market is in favour of the issuer of the claim, letting the buyer take on the largest part of the risk. Hence, the market prices are such that the seller gets a compensation for bearing the risk of being short the option. Of course, the buyer knows the maximal loss when entering the position, whereas the seller needs to take into account that the position needs to be liquidated or hedged in order to control potential and uncertain losses. We notice that the difference increases with time to exercise, reflecting that the future is more uncertain than the present, leading to a higher risk premium. In this respect, we cannot disregard the possibility that the market operates with a higher or lower interest rate than the 4.94% we used in the simulations.
However, this should have opposite effects on Call and Put prices, making one of them even more in favour of the issuer.

Looking at the implied Black-Scholes volatilities, Figure 7(b), we see that the volatilities, as the prices suggest, are close for short maturities, displaying a skewed smile. For long maturities the implied volatilities for the MEMM prices are close to zero, now with more of a smirk than a smile. The bid prices are almost constant for the Call options while Put options display a flat smile. It appears that the market prices are not consistent with the prices under the MEMM.

As we see from Figure 7(c), the differences between the bid prices and the MEMM prices peak around the spot price with a 10-12% mispricing. One may speculate that this could be a reflection of the instability of the hedging portfolio around the strike, where one can have big changes in the hedging position when the spot is close to the strike. The hedge is more stable when the strike is farther from spot (in either direction), and thus a hedge does not need to be updated so frequently to be accurate. Looking at the percentage error in Figure 7(d) gives another perspective, showing that the mispricing of the options with very small prices is substantially higher. The MEMM prices of far-out-of-the-money options are counted in hundredths or thousandths of a dollar. Quoted prices of these options, on the other hand, are usually around five or ten cents, giving a percentage error close to 100%. One should bear in mind that the volume traded at the quoted prices is insignificant, if not zero, for the mentioned options.

To conclude, we have that the prices under MEMM are not close to the quoted prices but significantly lower than both bid and ask prices. This tells us that for the Microsoft options the risk in the market is carried by the buyer of the options. The large observed difference indicates that the market perceives a higher risk aversion than zero, which is assumed in the MEMM prices. In the next section we will investigate the risk aversion further.

Figure 7. Illustrations of features and differences between theoretical MEMM prices and bid prices for Microsoft call options taken September 18, 2006. (a): Option prices. (b): Implied volatility. (c): Difference between MEMM prices and market prices. (d): Mispricing in percentage error between MEMM prices and bid prices.

4.1.2. Volvo. The Stockholm stock exchange is substantially smaller than the stock markets in the United States. Compared with the NYSE's 2005 average daily dollar volume of 56.1 billion, the Stockholm stock exchange's 14,876 million Swedish crowns is rather trifling. Together with the late introduction of options on stocks listed on the Stockholm stock exchange, this makes it a much less liquid market. We expect the Volvo options to be traded less frequently than the Microsoft options, which is indicated by volume data. We are interested in whether there are any obvious differences in the risk aversion due to this fact, or whether the same features as for Microsoft are visible for the Volvo options. The Volvo options are quoted on December 30, 2005, with a stock price at the time of 374.5 SEK.

Figure 8. (a): Plot of implied Black-Scholes volatilities of call options on the Volvo stock, bid prices from December 30, 2005, and simulated MEMM prices. (b): Plot of implied Black-Scholes volatilities of call options on the Volvo stock, ask prices from December 30, 2005, and simulated MEMM prices.
We assumed an interest rate of 3%, which was close to but slightly higher than the 3-month STIBOR at the time, but there was a general consensus at the time that the Swedish central bank would increase the repo rate during the year.

The main difference compared to Microsoft is that the MEMM prices for Call options written on Volvo are above the bid prices for a large range of strikes and maturities, more precisely for far-in-the-money options. This is illustrated through the implied volatilities in Figure 8. Thus, the buyer's price may be decisive for the trades. The bid prices for in-the-money options result in an indistinguishably small implied volatility, which means that the bid price is close to the present value of the payoff from the option. For out-of-the-money options, the implied volatilities are above the ones given by the MEMM prices, with an implied volatility around 15-20%. The ask prices are above the MEMM prices for all Call options, and looking at Put options there are only a few cases of bid prices falling below the MEMM prices. As observed for Microsoft, the price difference peaks around the strike, but the percentage error is not as grave as for the Microsoft options. This is due to the nominal value of the stocks: a higher nominal price of the stock gives a higher nominal value for options on the flanks. This in turn makes the percentage error appear less severe.

5. The implied risk aversion

In this section we calculate the implied risk aversion $\gamma$ from quoted (bid/ask) Call and Put option prices. We proceed as follows. For a given option price, we iterate $\gamma$ until we reach an agreement between the market quote and the indifference price. For each iteration of $\gamma$, we use the numerical algorithm to solve the integro-partial differential equation as described in detail in the previous section. For the root-finding we use Ridders' method as described in Press et al. [20], avoiding taking a numerical derivative. In general the algorithm executes in 5-7 iterations, but in some cases twice as many are needed.

We have collected the results for Microsoft in two figures (Figs. 9-10), where the implied risk aversion as a function of maturity and strike of Put and Call options are plotted, respectively. The implied risk aversion for Put options is decreasing with the strike price. There is an apparent effect that the risk aversion decreases more sharply to the left of the current spot price (compounded by the interest rate up to exercise time) than to the right. In fact, for some maturities we even see an increase with the strike to the right. The opposite effect is observed for Call options in Fig. 10, where the implied risk aversion is increasing with the strike, but more sharply to the right of today's spot price (compounded by the interest rate up to exercise time). This tells us that for Put options the market is averse to crashes, meaning that the Put option becomes far-in-the-money. The same effect holds for Call options, where big price jumps upward bring the options far-in-the-money. We can reflect this relatively high risk aversion towards such abrupt and big price changes back to the underlying asset price model, which seems not to capture the sudden big movements in prices as much as desired by the market. We know that the model produces a volatility smile, but despite this and the modelling of the heavy-tailed logreturns, the market is still pricing in the fear of a crash (or the opposite for Call options). One may mend this (at least for Put options) by introducing a leverage effect in the stock price model; however, for reasons of mathematical and numerical complexity we have not done so (see, however, Rheinländer and Steiger [21]).

From the option prices we notice that Calls with high strike close to maturity have unrealistically high prices, considering that they are unlikely to be exercised.
Clearly the price is there to make a market and not as a fair price. For the options far from maturity we see a slight upward slope, opposite to what we observed for the Put options. However, for the options with an exercise date in the near future we see that the aversion is higher for low strikes and falling towards the spot price. This could be a consequence of the amounts of money involved in transactions with Call options with low strikes. The unboundedness of the payoff functions could make this a very costly deal in terms of transactions and money transfers. It could be that the issuers mark up the price to cover expenses inflicted upon them for this. This could also explain why this feature is not as prominent for the Put options, since the payoff is bounded in that case.

The aversion towards market crashes is also reflected in the decreasing implied risk aversion with time to maturity for far-out-of-the-money Put options. For Put options close to maturity, we face a market crash risk, while the longer the time to maturity, the less reason there is for such options to be exercised due to a market crash. Hence, we clearly see the effect of crash risk in the risk aversion, which is not clear at all in the price difference between bid and MEMM (see Fig. 7(c)), but is clear in the implied volatility (see Fig. 7(b)). The opposite picture holds for Call options, naturally, since here it is the possibility of an upward jump that worries the issuer, and which is difficult to hedge in the short term.

Another effect is that the risk aversion flattens out with increasing time to maturity. When the exercise time is far in the future, the market seems to have a more overall view on the risk, with an aversion being less dependent on the strike. This is in line with the understanding that the sample space for the asset prices is more spread out in the future, and we have weaker information on whether the option will be exercised or not.
In the long term, large price movements will have time to even out, thus controlling the issuer's risk of being exercised against. However, the overall picture of a decreasing/increasing risk aversion holds for Put/Call options, as for the case when we have short time to maturity.

As we can see from Figs. 11-12, the same conclusions hold for the implied risk aversion from Volvo.∗ Surprisingly, the implied risk aversions for both Call and Put options seem to be approximately one order of magnitude lower than for Microsoft. In view of liquidity, one may have expected the opposite effect. On the other hand, the implied risk aversion is a complicated nonlinear function of many effects, including model parameters like the distribution and volatility. Thus, it is not clear how the liquidity comes in and affects the risk aversion in our situation.

∗ Note that we have only considered the implied risk aversion from those prices which are bigger than the MEMM prices, that is, being the issuer's prices.

Figure 9. Plot of market aversions simulated from Microsoft put options, bid/ask prices from September 18, 2006.

Figure 10. Plot of market aversions simulated from Microsoft call options, bid/ask prices from September 18, 2006.

Figure 11. Plot of market aversions simulated from Volvo put options, bid/ask prices from December 30, 2005.
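The root-finding step of Section 5 can be sketched with a small implementation of Ridders' method. This is a minimal sketch under strong assumptions: the pricing function below is a hypothetical closed-form toy (no hedging, zero interest rate, three-point payoff) standing in for the numerical integro-pde solver, and the quote is generated from the toy itself so that the recovered risk aversion can be checked. None of the names or numbers come from the paper.

```python
import math


def ridders(func, lo, hi, tol=1e-10, max_iter=60):
    """Find a root of `func` in [lo, hi] with Ridders' method (no derivatives)."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0.0:
        raise ValueError("root is not bracketed")
    estimate = None
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        scale = math.sqrt(f_mid * f_mid - f_lo * f_hi)  # positive while bracketed
        sign = 1.0 if f_lo > f_hi else -1.0
        new = mid + (mid - lo) * sign * f_mid / scale
        if estimate is not None and abs(new - estimate) <= tol:
            return new
        estimate = new
        f_new = func(new)
        if f_new == 0.0:
            return new
        # Keep the root bracketed between the two closest points.
        if f_mid * f_new < 0.0:
            lo, f_lo, hi, f_hi = mid, f_mid, new, f_new
        elif f_lo * f_new < 0.0:
            hi, f_hi = new, f_new
        else:
            lo, f_lo = new, f_new
        if abs(hi - lo) <= tol:
            return estimate
    return estimate


def toy_price(gamma):
    """Hypothetical stand-in for the indifference price as a function of gamma."""
    payoffs, probs = (0.0, 1.0, 4.0), (0.5, 0.3, 0.2)
    moment = sum(p * math.exp(gamma * f) for f, p in zip(payoffs, probs))
    return math.log(moment) / gamma


quote = toy_price(1.3)  # pretend this is an observed market quote
implied_gamma = ridders(lambda g: toy_price(g) - quote, 1e-6, 10.0)
print("implied risk aversion:", round(implied_gamma, 6))
```

In the actual procedure each evaluation of the pricing function is a full numerical solve, so a method that needs few evaluations and no derivative is attractive; the bracket endpoints here are arbitrary and would have to be chosen to suit the quotes at hand.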
[Figure 12: aversion (vertical axis, 0 to 0.1) plotted against strike (horizontal axis, 280 to 460) for the Jan06, March06, May06 and Jan07 bid and ask series.]

Figure 12. Plot of market aversions simulated from Volvo call options, bid/ask prices from December 30, 2005.

Microsoft Put Options
Maturity          Strike  MEMM price   Bid price   Ask price
October 06          15.0        0.00           .        0.05
                    17.5        0.00           .        0.05
                    20.0        0.00           .        0.05
                    22.5        0.00           .        0.05
                    25.0        0.03        0.05        0.10
                    27.5        0.76        0.75        0.75
                    30.0        3.00        3.00        3.00
January 07          12.0        0.00           .        0.05
                    15.0        0.00           .        0.05
                    17.0        0.00           .        0.05
                    19.5        0.00           .        0.10
                    20.0        0.00        0.05        0.10
                    22.0        0.00        0.10        0.20
                    22.5        0.00        0.10        0.15
                    24.5        0.00        0.30        0.35
                    25.0        0.01        0.35        0.40
                    27.0        0.18        0.95        1.05
                    27.5        0.40        1.20        1.25
                    29.5        2.15        2.60        2.65
                    30.0        2.62        3.30        3.10
                    32.0        4.57        5.50        5.10
                    32.5        5.06        5.50        5.60
                    37.0        9.47       10.10       10.10
January 08          15.0        0.00        0.05        0.20
                    17.5        0.00        0.15        0.25
                    20.0        0.00        0.35        0.50
                    22.5        0.00        0.75        0.75
                    25.0        0.00        1.25        1.30
                    27.5        0.03        2.15        2.25
                    30.0        1.22        3.50        3.70
                    35.0        5.86        7.90        8.10
                    40.0       10.53       12.90       13.10
January 09          15.0        0.00        0.20        0.25
                    20.0        0.00        0.70        0.75
                    22.5        0.00        1.15        1.25
                    25.0        0.00        1.85        1.95
                    30.0        0.36        4.10        4.20
                    35.0        4.28        7.90        8.10

Table 2. Prices for puts on the Microsoft stock. Bid and Ask prices from September 18, 2006, MEMM prices simulated with parameter estimates from Section 3.1.

Microsoft Call Options
Maturity          Strike  MEMM price   Bid price   Ask price
October 06          7.50       19.39       19.40       19.60
                   10.00       16.90       17.00       17.10
                   12.50       14.41       14.50       14.60
                   15.00       11.92       12.00       12.10
                   17.50        9.44        9.50        9.60
                   20.00        6.95        7.00        7.10
                   22.50        4.46        4.50        4.70
                   25.00        2.00        2.10        2.20
                   27.50        0.24        0.35        0.40
                   30.00        0.01           .        0.05
                   32.50        0.00           .        0.05
January 07         12.00       15.06       15.00       15.20
                   15.00       12.12       12.10       12.20
                   17.00       10.15       10.10       10.30
                   19.50        7.70        7.70        7.80
                   20.00        7.20        7.20        7.40
                   22.00        5.24        5.30        5.40
                   22.50        4.75        4.80        5.00
                   24.50        2.79        3.10        3.20
                   25.00        2.31        2.65        2.75
                   27.00        0.51        1.25        1.35
                   27.50        0.23        1.00        1.05
                   29.50        0.02        0.35        0.40
                   30.00        0.01        0.25        0.30
                   32.00        0.00        0.05        0.10
                   32.50        0.00        0.05        0.10
January 08         15.00       12.83       12.50       12.70
                   17.50       10.49       10.30       10.50
                   20.00        8.16        8.20        8.30
                   22.50        5.82        6.20        6.30
                   25.00        3.48        4.30        4.50
                   27.50        1.18        2.85        2.95
                   30.00        0.03        1.70        1.75
                   35.00        0.00        0.45        0.50
                   40.00        0.00        0.10        0.20
January 09         15.00       13.51       12.90       13.10
                   20.00        9.06        9.00        9.20
                   22.50        6.84        7.20        7.40
                   25.00        4.62        5.60        5.70
                   30.00        0.32        3.00        3.20
                   35.00        0.00        1.45        1.60
                   40.00        0.00        0.65        0.75

Table 3. Prices for calls on the Microsoft stock. Bid/Ask prices from September 18, 2006, MEMM prices simulated with parameters estimated in Section 3.1.

Volvo Put Options
Maturity          Strike  MEMM price   Bid price   Ask price
January 06        330.00        0.03        0.15        0.55
                  350.00        0.33        0.40        0.60
                  370.00        3.02        3.65        4.50
                  390.00       16.13       15.50       18.00
                  410.00       34.85       33.00       37.00
                  430.00       54.64       53.00       57.00
                  450.00       74.56       73.00       76.75
                  470.00       94.49       93.00       97.00
March 06          280.00        0.00           .        1.00
                  290.00        0.02           .        1.00
                  300.00        0.03           .        1.00
                  310.00        0.11        0.03        0.50
                  330.00        0.59        1.20        2.20
                  350.00        2.44        4.50        5.50
                  370.00        7.76       10.50       12.00
                  390.00       18.82       20.75       24.75
                  410.00       34.67       35.25       39.25
                  430.00       52.98       53.25       57.25
                  450.00       72.29       73.00       76.75
                  470.00       91.92       93.00       97.00
May 06            270.00        0.01           .        1.00
                  280.00        0.03        0.25        1.25
                  290.00        0.08        0.70        1.70
                  300.00        0.14        1.40        2.40
                  310.00        0.33        2.20        3.65
                  330.00        1.13        5.75        7.25
                  350.00        3.33       11.00       13.50
                  370.00        8.47       20.50       23.00
                  390.00       18.88       32.25       37.00
                  410.00       34.01       48.00       52.25
                  430.00       51.57       65.50       69.75
                  450.00       70.30       84.50       88.50
January 07        230.00        0.00        0.40        1.35
                  250.00        0.00        1.00        2.10
                  270.00        0.00        2.55        3.90
                  290.00        0.01        5.00        6.00
                  310.00        0.03        8.75       10.25
                  330.00        0.15       13.50       16.25
                  350.00        0.67       20.75       23.75
                  390.00        9.44       41.25       46.00
                  430.00       42.55       70.50       74.75

Table 4. Prices for puts on the Volvo stock. Bid/Ask prices from December 30, 2005, MEMM prices simulated with parameters estimated in Section 3.2.

Volvo Call Options
Maturity          Strike  MEMM price   Bid price   Ask price
January 06        277.88       97.11       95.50       99.50
                  290.00       85.01       83.25       87.25
                  297.04       77.98       76.25       80.25
                  310.00       65.04       63.25       67.25
                  330.00       45.10       43.25       47.25
                  350.00       25.45       23.75       27.50
                  370.00        8.18        8.75       10.50
                  390.00        1.35        2.00        2.40
                  410.00        0.15           .        1.00
                  430.00        0.01           .        1.00
                  450.00        0.00           .        1.00
                  470.00        0.00           .        1.00
March 06          280.00       96.30       94.50       98.50
                  290.00       86.37       84.50       88.50
                  300.00       76.45       74.75       78.75
                  310.00       66.59       65.00       69.00
                  330.00       47.21       46.25       50.25
                  350.00       29.21       29.25       32.75
                  370.00       14.70       16.50       19.00
                  390.00        5.97        8.00        9.50
                  410.00        2.06        2.95        4.25
                  430.00        0.62        0.90        1.90
                  450.00        0.17        0.02        1.00
May 06            280.00       97.79       95.00       99.00
                  290.00       87.95       85.00       89.00
                  300.00       78.13       75.25       79.25
                  310.00       68.44       65.75       69.75
                  330.00       49.49       47.50       51.50
                  350.00       31.95       31.25       35.50
                  370.00       17.39       19.00       21.75
                  390.00        8.13       10.50       12.00
                  410.00        3.62        5.00        6.25
                  430.00        1.54        2.05        3.05
                  450.00        0.64        0.60        1.55
January 07        290.00       93.37       85.50       89.50
                  310.00       74.01       67.75       71.75
                  330.00       54.74       52.00       56.25
                  350.00       35.88       39.00       43.25
                  390.00        6.10       20.75       23.25
                  430.00        0.61       10.00       11.50

Table 5. Prices for calls on the Volvo stock. Bid/Ask prices from December 30, 2005, MEMM prices simulated with parameters estimated in Section 3.2.

References

[1] Andersen, L., Andreasen, J. (2000): Jump-diffusion models: Volatility smile fitting and numerical methods for pricing, Review of Derivatives Research 4, 231-262.
[2] Barndorff-Nielsen, O. E., Shephard, N.
(2001): Modelling by Lévy processes for financial econometrics, in: O. E. Barndorff-Nielsen, T. Mikosch and S. Resnick (Eds.), Lévy Processes - Theory and Applications, Boston: Birkhäuser, 283-318.
[3] Barndorff-Nielsen, O. E., Shephard, N. (2001): Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics, Journal of the Royal Statistical Society: Series B 63 (with discussion), 167-241.
[4] Becherer, D. (2006): Bounded solutions to backward SDE's with jumps for utility optimization and indifference hedging. To appear in Annals of Applied Probability.
[5] Benth, F. E., Karlsen, K. H., Reikvam, K. (2003): Merton's portfolio optimization problem in a Black and Scholes market with non-Gaussian stochastic volatility of Ornstein-Uhlenbeck type, Mathematical Finance 13(2), 215-244.
[6] Benth, F. E., Groth, M. (2005): The minimal entropy martingale measure and numerical option pricing for the Barndorff-Nielsen - Shephard stochastic volatility model, Submitted.
[7] Benth, F. E., Meyer-Brandis, T. (2005): The density process of the minimal entropy martingale measure in a stochastic volatility model with jumps, Finance and Stochastics 9, 563-575.
[8] Black, F., Scholes, M. (1973): The Pricing of Options and Corporate Liabilities, Journal of Political Economy 81(3), 637-654.
[9] Cont, R., Voltchkova, E. (2003): Finite difference methods for option pricing in jump-diffusion and exponential Lévy models, Finance and Stochastics 9, 299-325.
[10] Engle, R. F., Ng, V. K., Rothschild, M. (1990): Asset pricing with a factor-ARCH covariance structure, Journal of Econometrics 45, 213-237.
[11] Godunov, S. (1959): Finite difference methods for numerical computations of discontinuous solutions of the equations of fluid dynamics, Matematiceskij Sbornik 47, 271-306.
[12] Hilber, N., Matache, A., Schwab, C. (2005): Sparse Wavelet Methods for Option Pricing under Stochastic Volatility, Journal of Computational Finance 8(4).
[13] Hodges, S. D., Neuberger, A. (1989): Optimal replication of contingent claims under transaction costs, Review of Futures Markets 8, 222-239.
[14] Lindberg, C. (2006): News-generated dependence and optimal portfolios for n stocks in a market of Barndorff-Nielsen and Shephard type, Mathematical Finance 16(3), 549-568.
[15] Lindberg, C. (2006): Portfolio optimization and a factor model in a stochastic volatility market, Stochastics 78(5), 259-279.
[16] Lindberg, C. (2006): The estimation of a stochastic volatility model based on the number of trades, Submitted.
[17] Matache, A., von Petersdorff, T., Schwab, C. (2004): Fast deterministic pricing of options on Lévy driven assets, Mathematical Modelling and Numerical Analysis 38(1), 37-72.
[18] Merton, R. (1973): Theory of Rational Option Pricing, Bell Journal of Economics and Management Science 4(1), 141-183.
[19] Nicolato, E., Venardos, E. (2003): Option pricing in stochastic volatility models of the Ornstein-Uhlenbeck type, Mathematical Finance 13, 445-466.
[20] Press, W. H., Teukolsky, S. A., Vetterling, W. T., Flannery, B. P. (1992): Numerical Recipes in C, Cambridge: Cambridge University Press.
[21] Rheinländer, T., Steiger, G. (2006): The minimal entropy martingale measure for general Barndorff-Nielsen/Shephard models, Annals of Applied Probability 16(3), 1319-1351.
[22] Strang, G. (1968): On the construction and comparison of difference schemes, SIAM J. Num. Anal. 5, 506-517.

V

Derivation-free Greeks for the Barndorff-Nielsen and Shephard stochastic volatility model
Fred Espen Benth, Martin Groth and Olli Wallin
Submitted

DERIVATIVE-FREE GREEKS FOR THE BARNDORFF-NIELSEN AND SHEPHARD STOCHASTIC VOLATILITY MODEL

FRED ESPEN BENTH, MARTIN GROTH, AND OLLI WALLIN

Abstract. We derive derivative-free formulas for the Delta and other Greeks of options written on an asset modeled by a geometric Brownian motion with stochastic volatility of Barndorff-Nielsen and Shephard type.
The method applies the Malliavin calculus on Wiener space, which moves the differentiation from the payoff function of the option to a random weight function. Our method paves the way for simple Monte Carlo approaches, illustrated by several numerical examples.

1. Introduction

Option price sensitivities, commonly referred to as the Greeks, are essential tools for investors trying to hedge their positions. Being measurements of how a contract responds to shifts in the parameters of the underlying model, the Greeks are used to manage the risk from unfavourable changes. Informally, one can think of the Greeks as derivatives of the risk-neutral price with respect to a parameter $\theta$:
\[ \frac{\partial}{\partial\theta} E[\phi(S(T))], \]
where $\phi(S(T))$ is the payoff function and $S(T)$ the underlying asset, depending on $\theta$. The Greeks are unobservable quantities in the market, and hence we need to choose a model for the underlying asset to obtain an estimate of them.

Given a model, option prices can conveniently be calculated using Monte Carlo methods. Their flexibility and low implementation threshold often make them the preferred pricing tool in finance. However, calculating the option sensitivities often requires substantially greater effort than calculating the price of the option. The slow convergence is especially prominent for discontinuous payoffs. To speed up the convergence, several different methods and variance reduction techniques have been proposed.

The finite difference method is the simplest and crudest way to approximate the derivative using Monte Carlo simulation. Simulating two different paths with a small difference in the parameter and forming a finite difference gives an approximation of the sensitivity. The method is universally applicable; however, the estimates are known to be biased and prone to large variance. Broadie and Glasserman [8] proposed two different unbiased methods to improve the convergence rate, both assuming we can exchange the order of expectation and differentiation.
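The finite difference approach described above can be sketched in a few lines. The example is only an illustration under assumptions that are not part of the paper: constant volatility (the Black-Scholes case), a digital call payoff, a central difference with common random numbers, and hypothetical function names. The closed-form Black-Scholes Delta of the digital call is included so that the estimate can be compared with a known value.

```python
import math
import random


def digital_delta_fd(x, strike, r, sigma, maturity, bump, n_paths, seed=0):
    """Central finite-difference Monte Carlo Delta of a digital call.

    Assumes constant volatility, so that
    S(T) = x * exp((r - sigma^2 / 2) T + sigma sqrt(T) G), G standard normal.
    The same draw G is reused for the bumped-up and bumped-down prices
    (common random numbers), which lowers the variance of the difference.
    """
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    disc = math.exp(-r * maturity)
    total = 0.0
    for _ in range(n_paths):
        growth = math.exp(drift + vol * rng.gauss(0.0, 1.0))
        up = 1.0 if (x + bump) * growth > strike else 0.0
        down = 1.0 if (x - bump) * growth > strike else 0.0
        total += disc * (up - down) / (2.0 * bump)
    return total / n_paths


def digital_delta_exact(x, strike, r, sigma, maturity):
    """Closed-form Black-Scholes Delta of the digital call, for comparison."""
    root_t = sigma * math.sqrt(maturity)
    d2 = (math.log(x / strike) + (r - 0.5 * sigma ** 2) * maturity) / root_t
    density = math.exp(-0.5 * d2 ** 2) / math.sqrt(2.0 * math.pi)
    return math.exp(-r * maturity) * density / (x * root_t)


if __name__ == "__main__":
    exact = digital_delta_exact(100.0, 100.0, 0.05, 0.2, 1.0)
    estimate = digital_delta_fd(100.0, 100.0, 0.05, 0.2, 1.0,
                                bump=1.0, n_paths=200_000)
    print(f"exact {exact:.5f}, finite difference {estimate:.5f}")
```

Only the paths that cross the strike between the two bumped starting values contribute to the estimate, so shrinking the bump reduces the bias but inflates the variance. This is the trade-off that motivates the unbiased alternatives.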
The pathwise method assumes that the dynamics of the model depend on the parameter and differentiates the paths of the model. In contrast, the likelihood ratio method assumes that the probability density of the price depends on the parameter $\theta$ and instead differentiates the measure. Both methods are reported to have significantly lower variance than the finite difference method but are not as widely applicable. The pathwise method is unable to handle discontinuous payoffs, while the likelihood ratio method is restricted by requiring explicit knowledge of the density of the underlying model.

Date: February 28, 2007.
Key words and phrases. Ornstein-Uhlenbeck process, subordinators, stochastic volatility, Malliavin derivative, Greeks, Monte Carlo methods.
We thank Tommi Sottinen for fruitful and interesting discussions.

Recent developments suggest using an approach based on variational stochastic calculus, referred to as Malliavin calculus. Using an integration-by-parts formula, Fournié et al. [16] derive expressions for the Greeks involving weight functions such that the payoff function is not differentiated. The method proved to outperform the finite difference method for discontinuous payoffs, while remaining less restricted than the pathwise and likelihood ratio methods. However, for smoother payoffs, like those of vanilla options, the Malliavin method is not reported to be significantly better than the finite difference method. The pioneering work on the Black-Scholes model spurred considerable research activity aimed at finding optimal weighting functions and performing similar analyses for other contracts and models. Chen and Glasserman [11] list some of the important references.

Models including a different source of randomness than a Brownian motion provide an additional complexity, because the Malliavin calculus covers only the Wiener space.
There exist several papers developing a similar Malliavin theory for Poisson random fields (Benth and Løkka [5], Nualart and Vives [24], Carlen and Pardoux [9] and Bichteler, Gravereaux and Jacod [7]). El-Khatib and Privault [15] derived Malliavin weights for a market driven by Poisson processes using an integration-by-parts formula, but the domain of the differential operator excludes many option types, for example European claims. Jump-diffusion models are considered in several papers: Leon et al. [19], Davis and Johansson [13] and Debelley and Privault [14], the two former considering markets where the jump sizes are deterministic. Due to the lack of a chain rule for the jump component, the general idea is to take a directional derivative and use the analysis on the Wiener space.

Barndorff-Nielsen and Shephard [2] proposed a stochastic volatility model suitable for capturing the characteristics of high-frequency stock price data. Intra-day sampled log-returns are known to exhibit heavy tails, skewness and volatility clustering. The Barndorff-Nielsen and Shephard (BNS) model features stock price dynamics driven by a Brownian motion together with a non-Gaussian Ornstein-Uhlenbeck process describing the volatility. The mean-reverting volatility process includes jumps given by a subordinator, that is, a Lévy process with non-negative increments. While the model is able to generate realistic asset prices, it is analytically tractable enough for derivative pricing and portfolio optimisation, see Benth and Groth [3] and Lindberg [20], [21].

For the BNS model the density of the price distribution is not known explicitly. For options with a discontinuous payoff function, neither the pathwise nor the likelihood ratio method will be directly applicable for simulations of the Greeks. We use the Malliavin calculus on the Wiener space to derive weight functions for the Greeks, assuming the stock price is given by the Barndorff-Nielsen and Shephard model.
The weights here resemble the weights in the Black-Scholes market, but now involve a stochastic volatility. We consider both options depending exclusively on the terminal value of the stock and discretely sampled path-dependent options.

The organisation of the paper is as follows. In the next section we introduce the Barndorff-Nielsen and Shephard model and the properties we use in later sections. Section 3 discusses the Malliavin calculus in the product space we are interested in. The Malliavin weights for the Greeks in the BNS model are derived in Section 4, while Section 5 gives several numerical examples.

2. The Barndorff-Nielsen and Shephard model

In this section, we give a brief review of the Barndorff-Nielsen and Shephard model, with a view towards option pricing. We consider a financial market where a risk-free asset and a single risky asset (a stock) are traded up to a fixed time $T > 0$. In particular, we assume that the stock price dynamics $S(t) = x \exp(X(t))$ are defined on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, P)$, with $P$ denoting the physical probability measure, and that log-prices follow Black and Scholes type dynamics
\[ dX(t) = (\mu + \beta\sigma^2(t))\,dt + \sigma(t)\,dW(t) + \rho\,dZ(\lambda t), \quad X(0) = 0, \tag{2.1} \]
with stochastic volatility given by a non-Gaussian Ornstein-Uhlenbeck (OU) process
\[ d\sigma^2(t) = -\lambda\sigma^2(t)\,dt + dZ(\lambda t), \quad \sigma^2(0) > 0. \tag{2.2} \]
Here $W$ is a Brownian motion, and $Z$ is a subordinator commonly referred to as the background driving Lévy process (BDLP). We denote by $\kappa(\cdot)$ the cumulant generating function $\kappa(z) := \log(E[\exp(zZ(1))])$, which uniquely specifies the distribution of $Z(t)$ for all $t \in [0, T]$. Moreover, $r > 0$ is the risk-free rate of return and $\lambda > 0$, $\rho \le 0$ are constants related to the mean-reversion rate of the volatility and the leverage effect, respectively.
The Brownian motion $W$ and the subordinator $Z$ are independent, and $\mathbb{F} = (\mathcal{F}_t)_{t\in[0,T]}$ is assumed to be the augmented natural filtration of $(W, Z)$. The parameters $\mu$ and $\beta$ are constants. Note that the solution of (2.2) can be written explicitly as
\[ \sigma^2(t) = \sigma^2(0)e^{-\lambda t} + \int_0^t e^{\lambda(s-t)}\,dZ(\lambda s), \quad \sigma^2(0) > 0. \tag{2.3} \]
Clearly the volatility process $\sigma^2 = (\sigma^2(t))_{t\in[0,T]}$ is then bounded from below by the deterministic function $\sigma^2(0)e^{-\lambda t}$ and is, in particular, strictly positive on $[0, T]$. Later we shall deal with processes of the form $1/\sigma^n(t)$ for $n = 1, 2$, which are thus bounded from above by a constant. Barndorff-Nielsen and Shephard [2] propose to use a superposition of Ornstein-Uhlenbeck processes as the model of the squared volatility. We restrict ourselves to only one here, but our results can easily be extended to the general case.

Following a standard procedure in the mathematical finance literature, we can now choose a concrete model by specifying a distribution for $Z$ through the cumulant $\kappa$. Let us start by stating some additional assumptions on $Z$.

Assumption 1. The subordinator $Z$ has no drift and its Lévy measure has density $w(\cdot)$, so that $\kappa(\cdot)$, when it is well defined, takes the form
\[ \kappa(z) = \int_{\mathbb{R}_+} (e^{zy} - 1)\,w(y)\,dy. \]
Moreover, $\hat z := \sup\{z \in \mathbb{R} : \kappa(z) < \infty\}$ satisfies $\hat z > \max\{0, 2\lambda^{-1}(1 + \beta + \rho)\}$ and $\lim_{z\to\hat z} \kappa(z) = +\infty$.

It can be seen using the formula for the Laplace transform of $X(t)$, computed in Nicolato and Venardos [22], that the condition $\hat z > 2\lambda^{-1}(1 + \beta + \rho)$ is sufficient for square integrability of $S$. Furthermore, $\hat z > 0$ implies that the variance process $\sigma^2$ has an invariant distribution which, in particular, is self-decomposable. A deep connection between self-decomposable distributions and OU-processes is that the converse is also true: for every self-decomposable distribution $\mu$ on $\mathbb{R}_+$ there is a subordinator $Z$ such that $\mu$ is the invariant distribution of $\sigma^2$.
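To make the explicit solution (2.3) concrete, the following sketch simulates a variance path on a grid. The choice of BDLP is our assumption for illustration only: $Z$ is taken to be compound Poisson with intensity `nu` and Exp(`alpha`) jump sizes, the standard Gamma-OU example, whose invariant law is Gamma with mean `nu / alpha`. The function name and parameter values are hypothetical.

```python
import math
import random


def simulate_gamma_ou(sigma2_0, lam, nu, alpha, horizon, n_steps, seed=0):
    """Simulate sigma^2 on an equidistant grid for a Gamma-OU variance process.

    Illustrative assumption: the BDLP Z is compound Poisson with intensity nu
    and Exp(alpha) jump sizes, so t -> Z(lam * t) jumps at rate lam * nu.
    Over each step the explicit solution (2.3) is applied exactly:
    sigma^2(t + dt) = exp(-lam dt) sigma^2(t)
                      + sum over jumps at s_k of exp(-lam (t + dt - s_k)) J_k.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    path = [sigma2_0]
    for _ in range(n_steps):
        value = math.exp(-lam * dt) * path[-1]
        # Jump times inside the step: accumulate exponential waiting times.
        elapsed = rng.expovariate(lam * nu)
        while elapsed < dt:
            jump = rng.expovariate(alpha)  # jump size with mean 1 / alpha
            value += math.exp(-lam * (dt - elapsed)) * jump
            elapsed += rng.expovariate(lam * nu)
        path.append(value)
    return path


if __name__ == "__main__":
    path = simulate_gamma_ou(0.04, 2.0, 4.0, 100.0, horizon=200.0, n_steps=20_000)
    print(f"time average {sum(path) / len(path):.4f}, invariant mean {4.0 / 100.0:.4f}")
```

Because the jumps are non-negative, every simulated value stays above the deterministic floor $\sigma^2(0)e^{-\lambda t}$, and over a long horizon the time average should settle near the invariant mean.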
The cumulant generating function $\kappa$ of $Z$ can be easily recovered from the cumulant $\kappa_\mu$ of $\mu$ by
\[ \kappa(z) = z\,\frac{d\kappa_\mu}{dz}(z). \tag{2.4} \]
This one-to-one correspondence makes it possible to build stochastic volatility models of OU-type by first stating the invariant distribution for $\sigma^2(t)$. An important example is the case when $\mu$ is an inverse Gaussian (IG) distribution, since for $\rho = 0$ the marginal distributions of log-returns are approximately normal inverse Gaussian (NIG). This class of distributions has been shown to fit empirical return distributions very well. The cumulant function of an IG($\delta$, $\gamma$) distribution is
\[ \kappa_{IG}(z) = \delta\gamma - \delta(\gamma^2 - 2z)^{1/2}, \]
so it follows from (2.4) that
\[ \kappa(z) = z\delta(\gamma^2 - 2z)^{-1/2} \]
is the cumulant function for the corresponding BDLP. For definitions and properties of invariant and self-decomposable distributions, and the connection with OU-processes, we refer the reader to the book by Sato [28].

Let us now turn to option pricing theory under the Barndorff-Nielsen and Shephard model. By the first fundamental theorem of asset pricing, the arbitrage-free price of an option can be expressed as the discounted expectation of the payoff under an equivalent martingale measure (EMM) $Q$, which is also commonly called the risk neutral measure. For the BNS model, these measures were characterized by Nicolato and Venardos in [22]. In general, under $Q$ the jump process $Z$ does not remain a Lévy process, and $W$ and $Z$ may be dependent. Thus, the log-price process $X$ may no longer be described by a BNS model. In this article, we restrict our main attention to the class of measures $Q$ that do retain the general form of the model (2.1), (2.2), but with possibly different parameters and Lévy measure for $Z$.
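As a worked check of the relation (2.4) in the inverse Gaussian example, assuming the usual parametrisation of the IG($\delta$, $\gamma$) law, the differentiation can be written out step by step:

```latex
\begin{align*}
\kappa_{IG}(z) &= \delta\gamma - \delta\,(\gamma^2 - 2z)^{1/2}, \qquad z < \gamma^2/2, \\
\frac{d\kappa_{IG}}{dz}(z) &= -\delta \cdot \tfrac{1}{2}\,(\gamma^2 - 2z)^{-1/2} \cdot (-2)
  = \delta\,(\gamma^2 - 2z)^{-1/2}, \\
\kappa(z) &= z\,\frac{d\kappa_{IG}}{dz}(z) = z\,\delta\,(\gamma^2 - 2z)^{-1/2}.
\end{align*}
```

In the notation of Assumption 1 this gives $\hat z = \gamma^2/2$, and $\kappa(z) \to +\infty$ as $z \uparrow \hat z$, so the blow-up condition on $\kappa$ is met in this example.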
It was shown in [22] that under any such structure preserving $Q$, the risk neutral dynamics of the log-price have the form
\[ dX(t) = \left(r - \lambda\kappa(\rho) - \tfrac{1}{2}\sigma^2(t)\right) dt + \sigma(t)\,dW(t) + \rho\,dZ(\lambda t), \tag{2.5} \]
\[ d\sigma^2(t) = -\lambda\sigma^2(t)\,dt + dZ(\lambda t), \quad \sigma^2(0) > 0, \tag{2.6} \]
where $\kappa$ is now the cumulant function of $Z$ under the measure $Q$. In subsequent sections, we shall assume directly that the risk neutral BNS model (2.5), (2.6) has been given and that Assumption 1 holds for $\kappa$ with respect to the measure $Q$. Note that the integrability criteria given in Assumption 1 collapse to $\hat z > \lambda^{-1}$ for the no leverage case $\rho = 0$, which is the situation we consider in Section 5.

We remark in passing that we could easily have included other measure changes which are not structure preserving. In fact, our theory will be valid for any martingale measure under which $W$ and $Z$ are independent. One interesting example is the minimal entropy martingale measure for the BNS model, which turns $Z$ into a Markov process with state-dependent jumps (see [6]). This more general situation requires different integrability hypotheses for the jump process $Z$.

For many specifications of $\mu$ or $Z$, the Laplace transform of $X(t)$ given in [22] has a fairly explicit form. This makes it possible to compute option prices and Greeks using numerical transform methods if the payoff depends on the terminal value only. Although these methods are potentially superior in simple cases, we do not consider them here because of their limited applicability and instead refer the reader to [10], [12], [22], [29] and the references therein.

3. Malliavin calculus with respect to Brownian motion

We base our derivation of derivative-free formulas of the sensitivities on the Malliavin calculus, as presented in Fournié et al. [16]. To do this, we work on the product of the canonical spaces for the Brownian motion $W$ and the subordinator $Z$.
This allows us to do Malliavin calculus with respect to Brownian motion in the classical setting of [23]. Let $(\Omega^W, \mathcal{F}^W, Q^W)$ be the canonical Wiener space for Brownian motion (see, for example, [17]) and correspondingly, let $(\Omega^Z, \mathcal{F}^Z, Q^Z)$ be the canonical space for the Lévy process $Z$ ([1], [28]). Furthermore, let $\mathbb{F}^W$ and $\mathbb{F}^Z$ be the augmented natural filtrations generated by $W$ and $Z$, respectively. Then, by independence of $W$ and $Z$, we can model the risk-neutral dynamics of the BNS model (2.5) on the filtered probability space given by the product
\[ (\Omega, \mathcal{F}, \mathbb{F}, Q) = (\Omega^W \otimes \Omega^Z,\ \mathcal{F}^W \otimes \mathcal{F}^Z,\ \mathbb{F}^W \otimes \mathbb{F}^Z,\ Q^W \otimes Q^Z). \]
There exists a regular conditional probability of $Q$ given the sigma-algebra $\mathcal{G}$ generated by events of the form $\Omega^W \times \{\omega^Z\}$, $\omega^Z \in \Omega^Z$, and it is denoted by $Q(\cdot\,|\,\omega^Z)$. By independence, this measure coincides with the Wiener measure. We denote by $E_{Q^W}$, $E_{Q^Z}$ the expectations under the measures $Q^W$ and $Q^Z$, respectively. Furthermore, we use $E$ to denote the expectation under the product measure $Q$, so that
\[ E = E_Q = E_{Q^Z} E_{Q(\cdot|\omega^Z)} = E_{Q^Z} E_{Q^W} = E_{Q^W} E_{Q^Z}. \]

Now, let $F = F(\omega^W, \omega^Z)$ be a random variable on $(\Omega, \mathcal{F}, Q)$. From standard measure theory it follows that for every fixed $\omega^Z \in \Omega^Z$, the mapping $\omega^W \mapsto F(\omega^W, \omega^Z)$, $\omega^W \in \Omega^W$, is a random variable on $(\Omega^W, \mathcal{F}^W, Q^W)$. Assuming further that this random variable is Malliavin differentiable, we can apply the usual Malliavin calculus on the Wiener space. Moreover, it follows by applying this result on each $\mathcal{F}_t$, $t \in [0, T]$, that if $X$ is an $\mathbb{F}$-adapted stochastic process on $(\Omega, \mathcal{F}, \mathbb{F}, Q)$, then, for fixed $\omega^Z \in \Omega^Z$, the process $(X(t, \cdot, \omega^Z))_{t\in[0,T]}$ is an $\mathbb{F}^W$-adapted stochastic process on $(\Omega^W, \mathcal{F}^W, \mathbb{F}^W, Q^W)$.

Finally, suppose a process $u$ is progressively measurable with respect to $\mathbb{F}^Z$. Then, for almost every $\omega^Z \in \Omega^Z$, the mapping $t \mapsto u(t, \omega^Z)$ is measurable and deterministic. Furthermore, if $u \in L^2([0, T] \times \Omega^Z)$, then $t \mapsto u(t, \omega^Z)$ is in $L^2([0, T])$ for almost every $\omega^Z \in \Omega^Z$.
We recall here that every adapted process which is measurable has a progressively measurable modification, and henceforth we shall work with this modification.

Let us next recall the Malliavin calculus on Wiener space in view of sensitivity analysis for the Barndorff-Nielsen and Shephard model. The above discussion hints at a natural way to use Malliavin calculus in our setting. We let $\mathcal{S}_{BNS}$ denote the set of smooth random variables $F$ of the form
\[ F = f\left(\int_0^T h_1(t)\,dW(t), \ldots, \int_0^T h_m(t)\,dW(t), \omega^Z\right), \]
where $h_1, \ldots, h_m \in L^2([0, T] \times \Omega)$ are $\mathbb{F}$-adapted and $f : \mathbb{R}^m \times \Omega^Z \to \mathbb{R}$ is such that $f(\cdot, \omega^Z) \in C_2^\infty(\mathbb{R}^m)$ for $\omega^Z \in \Omega^Z$. Note that, for fixed $\omega^Z \in \Omega^Z$, the random variable $F(\cdot, \omega^Z)$ belongs to the set $\mathcal{S}$ of random variables on the Wiener space that are smooth in the classical sense of [23]. Given $F \in \mathcal{S}_{BNS}$, the Malliavin derivative of $F$ with respect to Brownian motion is the process $(D_t F)_{t\in[0,T]}$ in $L^2([0, T] \times \Omega)$ defined by
\[ D_t F := \sum_{j=1}^m \partial_{x_j} f\left(\int_0^T h_1(t)\,dW(t), \ldots, \int_0^T h_m(t)\,dW(t), \omega^Z\right) h_j(t). \]
Again, this is nothing but the classical definition done $\omega^Z$-wise. On $L^2(\Omega^W, \mathcal{F}^W, Q^W)$, define the norm
\[ \|F\|_{1,2} := \left( E_{Q^W}[F^2] + E_{Q^W}\left[\int_0^T |D_t F|^2\,dt\right] \right)^{1/2} \]
and denote by $\mathbb{D}^{1,2}$ the closure under $\|\cdot\|_{1,2}$ of the set of smooth Wiener random variables $\mathcal{S}$. The normed space $(\mathbb{D}^{1,2}, \|\cdot\|_{1,2})$ is a Banach space, and the Malliavin derivative is a closed linear operator on $\mathbb{D}^{1,2}$ taking values in $L^2([0, T] \times \Omega^W)$. Now, we denote by $\mathbb{D}^{1,2}_{BNS}$ the set of random variables $F \in L^2(\Omega)$ such that $F(\cdot, \omega^Z) \in \mathbb{D}^{1,2}$ for almost every $\omega^Z \in \Omega^Z$. Then we also have the existence of a sequence $F_n \in \mathcal{S}_{BNS}$ such that
\[ E[(F_n - F)^2] + E\left[\int_0^T |D_t F_n - D_t F|^2\,dt\right] = E_{Q^Z}\left[\|F_n - F\|^2_{1,2}\right] \to 0. \]
Let us illustrate the calculus with the following example.

Example 3.1. Let us consider the random variable
\[ F = \int_0^T \sigma(t)\,dW(t). \]
Fixing $\omega^Z$, the mapping $t \mapsto \sigma(t, \omega^Z)$ is a deterministic function in $L^2([0, T])$, so $F(\cdot, \omega^Z)$ is a Malliavin differentiable random variable on the Wiener space. We thus have $D_t F = \sigma(t)$, $t \in [0, T]$, almost surely.

It is also clear that since we are doing Malliavin calculus with respect to Brownian motion only, anything that is $\mathcal{F}^Z$-measurable vanishes on differentiation.

Property 3.1. If $F$ is $\mathcal{F}^Z$-measurable, then $D_t F = 0$ for $t \in [0, T]$.

The Malliavin derivative satisfies the chain rule, which we state here in a form suitable for our purposes:

Property 3.2. Let $\phi : \mathbb{R}^m \to \mathbb{R}$ be a continuously differentiable function and let $(F_1, \ldots, F_m)$ be a random vector whose components belong to $\mathbb{D}^{1,2}_{BNS}$. Suppose furthermore that
\[ E[|\phi(F_1, \ldots, F_m)|^2] + E\left[\int_0^T \Big|\sum_{j=1}^m \partial_{x_j}\phi(F_1, \ldots, F_m)\,D_t F_j\Big|^2\,dt\right] < \infty. \]
Then
\[ D_t \phi(F_1, \ldots, F_m) = \sum_{j=1}^m \partial_{x_j}\phi(F_1, \ldots, F_m)\,D_t F_j. \tag{3.1} \]

In standard references, this result is usually stated only for $\phi$ with bounded derivatives, which would exclude the important case of the exponential function. In the above generality, the proof (for the more general case of $F_j \in \mathbb{D}^{1,1}$) can be found in the Appendix of [25].

The Malliavin derivative has an adjoint operator called the Skorohod integral (also known as the divergence operator). Let us start by returning once more to the setting of the Wiener space, and then give the corresponding extension.

Property 3.3. Skorohod integral in the Brownian direction: Let $u \in L^2([0, T] \times \Omega^W)$. Then $u \in \mathrm{Dom}(\delta^W)$ if and only if for all $F \in \mathbb{D}^{1,2}$ we have
\[ \left| E_{Q^W}\left[\int_0^T D_t F\,u(t)\,dt\right] \right| \le C(u)\,\|F\|_{1,2}, \]
where $C(u)$ is a constant independent of $F \in \mathbb{D}^{1,2}$. If $u \in \mathrm{Dom}(\delta^W)$, then the Skorohod integral of $u$ is the a.s. unique random variable $\delta^W(u) \in L^2(\Omega^W)$ satisfying the relation
\[ E_{Q^W}[F\,\delta^W(u)] = E_{Q^W}\left[\int_0^T D_t F\,u(t)\,dt\right]. \]
We define $\mathrm{Dom}(\delta)$ to be the set of processes $u \in L^2([0, T] \times \Omega)$ such that $u$ belongs to $\mathrm{Dom}(\delta^W)$ $Q^Z$-almost surely and
\[ \left| E\left[\int_0^T D_t F\,u(t)\,dt\right] \right| < \infty. \tag{3.2} \]
If $u \in \mathrm{Dom}(\delta)$, we denote by $\delta$ the operator $\delta : L^2([0, T] \times \Omega) \to L^2(\Omega)$ defined by
\[ \delta(u)(\omega^W, \omega^Z) = \delta^W(u(\cdot, \omega^W, \omega^Z)). \]
Then it follows by Fubini's theorem that
\[ E[F\,\delta(u)] = E\left[\int_0^T D_t F\,u(t)\,dt\right]. \tag{3.3} \]
The above equality (3.3) is commonly referred to as the integration-by-parts formula, and the process $u$ is called Skorohod integrable if $u \in \mathrm{Dom}(\delta)$. One of the main properties of the Skorohod integral $\delta^W$ on the Wiener space is that all $\mathbb{F}^W$-adapted processes in $L^2([0, T] \times \Omega^W)$ are Skorohod integrable and the Skorohod integral of such processes coincides with the usual stochastic integral of Itô. Here we state the corresponding result for adapted integrands in $L^2([0, T] \times \Omega)$.

Property 3.4. For $\mathbb{F}$-adapted $u \in L^2([0, T] \times \Omega)$, we have $u \in \mathrm{Dom}(\delta)$ and
\[ \delta(u) = \int_0^T u(s)\,dW(s). \]
Furthermore,
\[ D_t \delta(u) = D_t \int_0^T u(s)\,dW(s) = u(t). \]

In the above, the claim that (3.2) holds might not seem clear at first. However, we can again use conditioning to estimate
\begin{align*}
\left( E\left[\int_0^T D_t F\,u(t)\,dt\right] \right)^2
&= \left( E_{Q^Z}\left[ E_{Q^W}[F\,\delta^W(u)] \right] \right)^2
 = \left( E\left[ F \int_0^T u(t)\,dW(t) \right] \right)^2 \\
&\le \|F\|^2_{L^2(\Omega)}\, E\left[ \left( \int_0^T u(t)\,dW(t) \right)^2 \right]
 = \|F\|^2_{L^2(\Omega)}\, E\left[ \int_0^T u^2(t)\,dt \right] \\
&\le \|F\|^2_{L^2(\Omega)}\, \|u\|^2_{L^2([0,T]\times\Omega)} < \infty,
\end{align*}
where we have used properties of the Skorohod integral on Wiener space, the Cauchy-Schwarz inequality and the Itô isometry. The following property facilitates further computation of Skorohod integrals in an important special case where the integrand is no longer adapted.

Property 3.5. Let $F \in \mathbb{D}^{1,2}_{BNS}$. For all $u \in \mathrm{Dom}(\delta)$ such that $F\delta(u) - \int_0^T D_t F\,u(t)\,dt \in L^2(\Omega)$, we have $Fu \in \mathrm{Dom}(\delta)$ and
\[ \delta(Fu) = F\,\delta(u) - \int_0^T D_t F\,u(t)\,dt. \tag{3.4} \]

Finally, it is easily seen that $\theta \mapsto S^\theta$ is pathwise differentiable (with the exception of the boundary values $x = 0$, $\sigma^2(0) = 0$) for the different parameters $\theta = x, r, \rho, \sigma^2(0)$ and $\epsilon$ (to be defined in Section 4.2).

Remark.
Instead of following the concrete program via pointwise conditioning on $\omega^Z$ outlined here, one could also proceed by viewing elements in $L^2(\Omega)$ as $L^2(\Omega^Z)$-valued random variables on the Wiener space, see [19], [23].

4. Malliavin weights for the Greeks

In this section we apply the previous results to derive formulas for the Greeks as weighted expectations of the payoff. We start by verifying some quite standard but useful lemmas. The first one justifies differentiation under the expectation.

Lemma 4.1. Let $F^\theta$ be a real valued random variable, depending on a parameter $\theta \in \mathbb{R}$. Suppose furthermore that, for almost every $\omega \in \Omega$, the mapping $\theta \mapsto F^\theta(\omega)$ is continuously differentiable in $[a, b]$, and that
\[ E\Big[\sup_{\theta\in[a,b]} |\partial_\theta F^\theta|\Big] < \infty. \]
Then the mapping $\theta \mapsto E[F^\theta]$ is differentiable in $(a, b)$, and for $\theta \in (a, b)$ we have
\[ \partial_\theta E[F^\theta] = E[\partial_\theta F^\theta]. \]

Proof. First, fix $\bar\theta \in (a, b)$ and note that by the assumptions we have
\[ \frac{1}{h}\{F^{\bar\theta+h} - F^{\bar\theta}\} \to \partial_\theta F^{\bar\theta}, \quad \text{almost surely}, \]
as $h \to 0$. Moreover, by the mean value theorem of calculus we see that
\[ \Big|\frac{1}{h}\{F^{\bar\theta+h} - F^{\bar\theta}\}\Big| \le \sup_{\theta\in[a,b]} |\partial_\theta F^\theta|. \]
Thus, we deduce by the dominated convergence theorem that
\[ \frac{1}{h}\{E[F^{\bar\theta+h}] - E[F^{\bar\theta}]\} = E\Big[\frac{1}{h}\{F^{\bar\theta+h} - F^{\bar\theta}\}\Big] \to E[\partial_\theta F^{\bar\theta}] \]
as $h \to 0$, finishing the proof.

The next lemma allows us to assume infinite smoothness of the payoff function when deriving the formulas. Let us denote by $L^2(S)$ the class of locally integrable functions $\phi$ such that the set of discontinuities of $\phi$ has Lebesgue measure zero, and which satisfy
\[ E[\phi(S(t_1), \ldots, S(t_m))^2] < \infty. \tag{4.1} \]
From now on, we denote $S(\cdot) = S^\theta(\cdot)$ to emphasize the dependence of the model on a parameter $\theta$.

Lemma 4.2. Suppose that
\[ \partial_\theta E[\phi(S^\theta(t_1), \ldots, S^\theta(t_m))] = E[\phi(S^\theta(t_1), \ldots, S^\theta(t_m))\,\pi^\theta] \tag{4.2} \]
holds for $\phi \in C_0^\infty(\mathbb{R}^m)$, $\pi^\theta \in L^2(\Omega, \mathcal{F}, Q)$. Suppose also that the mapping $\theta \mapsto \pi^\theta$ is continuous, almost surely. Then the equality (4.2) holds also for $\phi \in L^2(S)$.

Proof.
Let $\phi$ satisfy (4.1) and let $\phi_k \in C_0^\infty(\mathbb{R}^m)$, $k = 1, 2, \ldots$, be such that $\phi_k \uparrow \phi$ Lebesgue almost everywhere as $k \to \infty$. Since $X$ has transition probabilities that are absolutely continuous with respect to Lebesgue measure (see [22]), and the discontinuities of $\phi$ have measure zero, we have
\[ \phi_k(S^\theta(t_1), \ldots, S^\theta(t_m)) \uparrow \phi(S^\theta(t_1), \ldots, S^\theta(t_m)) \]
almost surely. Furthermore, the family $\phi_k(S(t_1), \ldots, S(t_m))^2$ is uniformly integrable, so
\[ \phi_k(S^\theta(t_1), \ldots, S^\theta(t_m)) \to \phi(S^\theta(t_1), \ldots, S^\theta(t_m)) \]
in $L^2(\Omega, \mathcal{F}, Q)$ (and thus also in $L^1(\Omega, \mathcal{F}, Q)$) as $k \to \infty$. Let us now define
\[ u(\theta) := E[\phi(S^\theta(t_1), \ldots, S^\theta(t_m))], \qquad u_k(\theta) := E[\phi_k(S^\theta(t_1), \ldots, S^\theta(t_m))], \]
and note that $u_k(\theta) \to u(\theta)$ for every $\theta \in [a, b]$. Furthermore, let
\[ f(\theta) := E[\phi(S^\theta(t_1), \ldots, S^\theta(t_m))\,\pi^\theta]. \]
By the Cauchy-Schwarz inequality,
\[ |\partial_\theta u_k(\theta) - f(\theta)| \le \epsilon_k(\theta)\,\psi(\theta), \]
where
\[ \epsilon_k(\theta) = \left( E\left[ \big(\phi_k(S^\theta(t_1), \ldots, S^\theta(t_m)) - \phi(S^\theta(t_1), \ldots, S^\theta(t_m))\big)^2 \right] \right)^{1/2} \]
and
\[ \psi(\theta) = \left( E[|\pi^\theta|^2] \right)^{1/2}. \]
From the assumptions it follows that $\psi$ and $\epsilon_k$ are continuous. Thus, for an arbitrary compact subset $K \subset \mathbb{R}$, we have
\[ \sup_{\theta\in K} |\partial_\theta u_k(\theta) - f(\theta)| \le C_K \sup_{\theta\in K} \epsilon_k(\theta) \]
with $C_K = \sup_{\theta\in K} \psi(\theta)$. Since $\sup_{\theta\in K} \epsilon_k(\theta) \to 0$ as $k \to \infty$, it follows that $\partial_\theta u_k(\theta) \to f(\theta)$ uniformly on compact subsets of $\mathbb{R}$, proving the lemma.

Note that the class $L^2(S)$ defined before the lemma is not the $L^2$-space on $\mathbb{R}^m$. The difference is important, as $L^2(\mathbb{R}^m)$ does not contain most of the contracts, including the call option.

We are now prepared to derive formulas for the Greeks. We treat Delta and Gamma first, and then move on to study the other Greeks in a unified manner.

4.1. Delta and Gamma. Delta and Gamma are, respectively, the first and second order derivatives of the option price with respect to $x$, the current price level of the underlying stock $S$.

Proposition 4.3. Let $a \in L^2([0, T])$ be an $\mathbb{F}$-adapted process such that
\[ \int_0^{t_i} a(t)\,dt = 1 \quad \text{almost surely} \]
for all $i = 1, 2, \ldots, m$.
Then:

(i) The Delta of the option is given by
$$\partial_x E[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))] = E[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))\,\pi^\Delta],$$
where the Malliavin weight $\pi^\Delta$ equals
$$\pi^\Delta = \int_0^T \frac{a(t)}{x\sigma(t)}\,dW(t).$$

(ii) The Gamma of the option is given by
$$\partial_x^2 E[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))] = E[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))\,\pi^\Gamma], \tag{4.3}$$
where the Malliavin weight $\pi^\Gamma$ equals
$$\pi^\Gamma = (\pi^\Delta)^2 - \frac{1}{x}\pi^\Delta - \frac{1}{x^2}\int_0^T \left(\frac{a(t)}{\sigma(t)}\right)^2 dt. \tag{4.4}$$

Proof. First note that the assumptions of Lemmas 4.1 and 4.2 hold, and thus we only need to prove the claim for $\phi \in C_0^\infty(\mathbb{R}^m)$.

(i) Applying Lemma 4.1, we compute
$$\begin{aligned}
\partial_x E[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))] &= E[e^{-rT}\partial_x \phi(S^x(t_1), \dots, S^x(t_m))] \\
&= E\Big[e^{-rT}\sum_{i=1}^m \phi_{x_i}(S^x(t_1), \dots, S^x(t_m))\,\partial_x S^x(t_i)\Big] \\
&= E\Big[e^{-rT}\sum_{i=1}^m \phi_{x_i}(S^x(t_1), \dots, S^x(t_m))\,\frac{1}{x}S^x(t_i)\Big].
\end{aligned}$$
Using $D_t S^x(t_i) = \sigma(t)S^x(t_i)1_{[0,t_i]}(t)$ and $\int_0^{t_i} a(t)\,dt = 1$, we note that
$$\int_0^T \frac{a(t)}{x\sigma(t)}\,D_t S^x(t_i)\,dt = \frac{1}{x}S^x(t_i),$$
so
$$\partial_x E[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))] = E\Big[e^{-rT}\int_0^T \sum_{i=1}^m \phi_{x_i}(S^x(t_1), \dots, S^x(t_m))\,\frac{a(t)}{x\sigma(t)}\,D_t S^x(t_i)\,dt\Big].$$
By the chain rule property it follows that
$$\partial_x E[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))] = e^{-rT}E\Big[\int_0^T D_t\phi(S^x(t_1), \dots, S^x(t_m))\,\frac{a(t)}{x\sigma(t)}\,dt\Big].$$
The claim now follows from the integration-by-parts properties (3.3) and (3.4).

(ii) Denoting $F^x := \int_0^T \frac{a(t)}{x\sigma(t)}\,dW(t)$, we note that $\partial_x F^x = -\frac{1}{x}F^x$, so by the product rule
$$\begin{aligned}
\partial_x^2 E[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))] &= \partial_x E[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))F^x] \\
&= -\frac{1}{x}E[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))F^x] \\
&\quad + E\Big[e^{-rT}\sum_{i=1}^m \phi_{x_i}(S^x(t_1), \dots, S^x(t_m))\,\frac{1}{x}S^x(t_i)F^x\Big].
\end{aligned} \tag{4.5}$$
For the second term on the last line above, we can reiterate the procedure from (i):
$$\begin{aligned}
E\Big[e^{-rT}\sum_{i=1}^m \phi_{x_i}(S^x(t_1), \dots, S^x(t_m))\,\frac{1}{x}S^x(t_i)F^x\Big] &= E\Big[e^{-rT}\int_0^T D_t\phi(S^x(t_1), \dots, S^x(t_m))\,\frac{a(t)}{x\sigma(t)}F^x\,dt\Big] \\
&= E\Big[e^{-rT}\phi(S^x(t_1), \dots, S^x(t_m))\,\delta\Big(\frac{a(\cdot)}{x\sigma(\cdot)}F^x\Big)\Big].
\end{aligned}$$
Finally, applying (3.4) with $D_t F^x = \frac{a(t)}{x\sigma(t)}$, we have
$$\delta\Big(\frac{a(\cdot)}{x\sigma(\cdot)}F^x\Big) = F^x\int_0^T \frac{a(t)}{x\sigma(t)}\,dW(t) - \int_0^T \left(\frac{a(t)}{x\sigma(t)}\right)^2 dt = (F^x)^2 - \int_0^T \left(\frac{a(t)}{x\sigma(t)}\right)^2 dt.$$
Combining this with (4.5) finishes the proof.

4.2. Rho and the three Vegas.

Next we investigate sensitivities with respect to other model parameters. We call Rho, as always, the sensitivity with respect to the interest rate level $r$. It is common practice to call the sensitivity of the option price to parameters affecting the random fluctuations of the stock price Vega (although this is not a Greek letter), and we have three different measures here. We call the sensitivities with respect to the starting value $\sigma^2(0)$ of the variance process and the leverage parameter $\rho$ Vega1 and Vega3, respectively. We name Vega2 the sensitivity of the option price with respect to changes in the whole volatility structure. More precisely, let $\sigma_\epsilon(u) = \sigma(u) + \epsilon\tilde\sigma(u)$, where $\tilde\sigma$ is a bounded and adapted process such that $\sigma_\epsilon$ is uniformly bounded away from zero. Furthermore, put
$$X^\epsilon_{\tilde\sigma}(t) = \int_0^t \Big[r - \frac{1}{2}\sigma_\epsilon^2(u)\Big]\,du + \int_0^t \sigma_\epsilon(u)\,dW(u),$$
and $S^\epsilon_{\tilde\sigma}(t) = x e^{X^\epsilon_{\tilde\sigma}(t)}$. Then we define
$$\text{Vega2} = \partial_\epsilon E[e^{-rT}\phi(S^\epsilon_{\tilde\sigma}(t_1), \dots, S^\epsilon_{\tilde\sigma}(t_m))]\big|_{\epsilon=0}.$$

In what follows, we let $b : [0, T] \to \mathbb{R}$ be an $\mathbb{F}$-adapted process satisfying
$$\int_{t_{j-1}}^{t_j} b(t)\,dt = 1 \quad \text{almost surely}$$
for all $j = 1, \dots, m$. Furthermore, let
$$a(t) = \sum_{j=1}^m b(t)\big(\partial_\theta X^\theta(t_j) - \partial_\theta X^\theta(t_{j-1})\big)1_{[t_{j-1},t_j]}(t), \tag{4.6}$$
where $X$ is the log-price process. Notice that the process $a$ so defined is not adapted in general. Now, we prove a general formula from which the above Greeks can be derived.

Theorem 4.4. Let $\phi \in L^2(S)$. Then
$$\partial_\theta E[\phi(S^\theta(t_1), \dots, S^\theta(t_m))] = E\Big[\phi(S^\theta(t_1), \dots, S^\theta(t_m))\,\delta\Big(\frac{a(\cdot)}{\sigma(\cdot)}\Big)\Big]. \tag{4.7}$$

Proof. Recall again that by Lemma 4.2 we may assume $\phi \in C_0^\infty(\mathbb{R}^m)$. Here
$$\begin{aligned}
\partial_\theta E[\phi(S^\theta(t_1), \dots, S^\theta(t_m))] &= E[\partial_\theta \phi(S^\theta(t_1), \dots, S^\theta(t_m))] \\
&= E\Big[\sum_{i=1}^m \phi_{x_i}(S^\theta(t_1), \dots, S^\theta(t_m))\,\partial_\theta S^\theta(t_i)\Big] \\
&= E\Big[\sum_{i=1}^m \phi_{x_i}(S^\theta(t_1), \dots, S^\theta(t_m))\,S^\theta(t_i)\,\partial_\theta X^\theta(t_i)\Big].
\end{aligned}$$
We note that
$$\int_0^T \frac{a(t)}{\sigma(t)}\,D_t S^\theta(t_i)\,dt = S^\theta(t_i)\,\partial_\theta X^\theta(t_i),$$
so that
$$\begin{aligned}
\partial_\theta E[\phi(S^\theta(t_1), \dots, S^\theta(t_m))] &= E\Big[\int_0^T \sum_{i=1}^m \phi_{x_i}(S^\theta(t_1), \dots, S^\theta(t_m))\,D_t S^\theta(t_i)\,\frac{a(t)}{\sigma(t)}\,dt\Big] \\
&= E\Big[\int_0^T D_t\phi(S^\theta(t_1), \dots, S^\theta(t_m))\,\frac{a(t)}{\sigma(t)}\,dt\Big] \\
&= E\Big[\phi(S^\theta(t_1), \dots, S^\theta(t_m))\,\delta\Big(\frac{a(\cdot)}{\sigma(\cdot)}\Big)\Big],
\end{aligned}$$
where we have again applied the chain rule and the integration-by-parts properties.

Next, we study the Malliavin weights $\pi^\theta$ for the Greeks Rho, Vega1, Vega2 and Vega3 in more detail, using the theorem above. That is, we shall find explicit forms of a random variable $\pi^\theta$ such that
$$\partial_\theta E[e^{-rT}\phi(S^\theta(T))] = E[e^{-rT}\phi(S^\theta(T))\,\pi^\theta].$$

Corollary 4.5. Let $\phi \in L^2(S)$.

(Rho) The Malliavin weight for the sensitivity of the option price with respect to the interest rate $r$ is
$$\pi^{Rho} = T(x\pi^\Delta - 1),$$
that is, $\text{Rho} = Tx \cdot \text{Delta} - T \cdot \text{price}$.

(Vega1) The Malliavin weight for the sensitivity of the option price with respect to the initial value $\sigma_0^2 := \sigma^2(0)$ of the variance process is
$$\pi^{Vega1} = \frac{1}{2}\sum_{j=1}^m \int_{t_{j-1}}^{t_j} \frac{b(t)}{\sigma(t)}\,dW(t)\left(\int_{t_{j-1}}^{t_j} \frac{e^{-\lambda t}}{\sigma(t)}\,dW(t) + \frac{1}{\lambda}\big(e^{-\lambda t_j} - e^{-\lambda t_{j-1}}\big)\right) - \frac{1}{2}\sum_{j=1}^m \int_{t_{j-1}}^{t_j} \frac{b(t)}{\sigma^2(t)}\,e^{-\lambda t}\,dt.$$

(Vega2) The sensitivity of the option price with respect to a perturbation $\tilde\sigma$ of the volatility process is given by
$$\partial_\epsilon E[e^{-rT}\phi(S^\epsilon_{\tilde\sigma}(t_1), \dots, S^\epsilon_{\tilde\sigma}(t_m))]\big|_{\epsilon=0} = E[e^{-rT}\phi(S(t_1), \dots, S(t_m))\,\pi^{Vega2}_{\tilde\sigma}],$$
with
$$\pi^{Vega2}_{\tilde\sigma} = \sum_{j=1}^m F_j \int_{t_{j-1}}^{t_j} \frac{b(t)}{\sigma(t)}\,dW(t) - \int_0^T b(t)\,\frac{\tilde\sigma(t)}{\sigma(t)}\,dt,$$
where
$$F_j = \int_{t_{j-1}}^{t_j} \tilde\sigma(t)\,dW(t) - \int_{t_{j-1}}^{t_j} \sigma(t)\tilde\sigma(t)\,dt.$$

(Vega3) The Malliavin weight for the sensitivity with respect to the leverage parameter $\rho$ is given by
$$\pi^{Vega3} = \sum_{j=1}^m \big(\Delta Z_j - \lambda\kappa'(\rho)\Delta t_j\big)\int_{t_{j-1}}^{t_j} \frac{b(t)}{\sigma(t)}\,dW(t),$$
where $\Delta Z_j := Z(\lambda t_j) - Z(\lambda t_{j-1})$ and $\Delta t_j := t_j - t_{j-1}$.

Proof. The results are a straightforward application of Theorem 4.4 and the properties given in Section 3, used to compute the Skorohod integral in a more recognizable form.
Rho: First notice that $\partial_r X(t) = t$. Choosing
$$b(t) = \sum_{j=1}^m \frac{1}{t_j - t_{j-1}}\,1_{[t_{j-1},t_j]}(t),$$
and noticing that $a(\cdot)$ given in (4.6) is now adapted (indeed $a \equiv 1$), we have
$$\delta\Big(\frac{a(\cdot)}{\sigma(\cdot)}\Big) = \int_0^T \frac{a(t)}{\sigma(t)}\,dW(t) = \int_0^T \frac{1}{\sigma(t)}\,dW(t) = Tx\pi^\Delta.$$
The result now follows from
$$\partial_r E[e^{-rT}\phi(S^r(t_1), \dots, S^r(t_m))] = -T\,E[e^{-rT}\phi(S^r(t_1), \dots, S^r(t_m))] + e^{-rT}\,\partial_r E[\phi(S^r(t_1), \dots, S^r(t_m))]. \tag{4.8}$$

Vega1: First, $\partial_{\sigma_0^2}\sigma^2(t) = e^{-\lambda t}$ and $\partial_{\sigma_0^2}\sigma(t) = \partial_{\sigma_0^2}(\sigma^2(t))^{1/2} = \frac{e^{-\lambda t}}{2\sigma(t)}$, so
$$\partial_{\sigma_0^2}X(t) = \frac{1}{2}\left\{\int_0^t \frac{e^{-\lambda s}}{\sigma(s)}\,dW(s) - \int_0^t e^{-\lambda s}\,ds\right\} = \frac{1}{2}\left\{\int_0^t \frac{e^{-\lambda s}}{\sigma(s)}\,dW(s) + \frac{1}{\lambda}\big(e^{-\lambda t} - 1\big)\right\}.$$
From this, we have
$$a(t) = \frac{1}{2}\sum_{j=1}^m (C_j + F_j)\,b(t)\,1_{[t_{j-1},t_j]}(t),$$
where
$$C_j = \frac{1}{\lambda}\big(e^{-\lambda t_j} - e^{-\lambda t_{j-1}}\big) \quad \text{and} \quad F_j = \int_{t_{j-1}}^{t_j} \frac{e^{-\lambda s}}{\sigma(s)}\,dW(s).$$
Noting finally that $D_t F_j = \frac{e^{-\lambda t}}{\sigma(t)}\,1_{[t_{j-1},t_j]}(t)$, the result follows from Properties 3.5 and 3.4.

Vega2: The proof of Theorem 4.4 does not use the specific form of the process $\sigma^2$, only its integrability properties and the fact that it is adapted to the filtration $\mathbb{F}^Z$. Thus we see that (4.7) holds with $\sigma$ replaced by $\sigma_\epsilon$ and $S$ replaced by $S^\epsilon_{\tilde\sigma}$. Now, applying Theorem 4.4 with
$$\partial_\epsilon X^\epsilon_{\tilde\sigma}(t) = -\int_0^t \sigma_\epsilon(s)\tilde\sigma(s)\,ds + \int_0^t \tilde\sigma(s)\,dW(s),$$
the result is a straightforward calculation using Properties 3.5 and 3.4.

Vega3: This is a trivial calculation using Properties 3.4 and 3.1.

Note that Rho can be stated in terms of the price and the Delta, requiring no extra computation. We will now list the Malliavin weights (except Rho, which is the same) in the simple case where the option payoff depends on the terminal value only. Here, we have taken the simple choice of $a(\cdot) = 1/T$ as the weight function for Delta and Gamma, and similarly $b(\cdot) = 1/T$ for the other Greeks.
$$\pi^\Delta = \frac{1}{Tx}\int_0^T \frac{1}{\sigma(t)}\,dW(t),$$
$$\pi^\Gamma = (\pi^\Delta)^2 - \frac{1}{T^2x^2}\int_0^T \frac{1}{\sigma^2(t)}\,dt - \frac{1}{x}\pi^\Delta,$$
$$\pi^{Vega1} = \frac{1}{2T}\int_0^T \frac{1}{\sigma(t)}\,dW(t)\left(\int_0^T \frac{e^{-\lambda t}}{\sigma(t)}\,dW(t) + \frac{1}{\lambda}\big(e^{-\lambda T} - 1\big)\right) - \frac{1}{2T}\int_0^T \frac{e^{-\lambda t}}{\sigma^2(t)}\,dt,$$
$$\pi^{Vega2}_{\tilde\sigma} = \frac{1}{T}\int_0^T \frac{1}{\sigma(t)}\,dW(t)\left(\int_0^T \tilde\sigma(t)\,dW(t) - \int_0^T \sigma(t)\tilde\sigma(t)\,dt\right) - \frac{1}{T}\int_0^T \frac{\tilde\sigma(t)}{\sigma(t)}\,dt,$$
$$\pi^{Vega3} = \Big(\frac{1}{T}Z(\lambda T) - \lambda\kappa'(\rho)\Big)\int_0^T \frac{1}{\sigma(t)}\,dW(t).$$

From these representations we also notice that the weights for Delta, Gamma, Vega2 (and Rho) agree with those in the Black and Scholes model if we replace the stochastic volatility by a constant one.

Using the basic principles developed in this chapter, it is also possible to modify numerous results that have already appeared in the diffusion setting so that they apply to the BNS model. We refer to Kohatsu-Higa and Montero [18] for a comprehensive survey and reference list.

5. Numerical examples

In the previous sections we derived Malliavin weights for a derivative-free simulation of option sensitivities in the Barndorff-Nielsen and Shephard stochastic volatility model. In this section we provide some examples to show the efficiency of the method and its possible pitfalls. We show the superior performance of the Malliavin method compared to the finite difference method in cases where the payoff function is discontinuous, but also that the methods are comparable for smoother functions. The examples focus on the three Greeks Delta, Gamma and Vega2, but similar results hold for the other Greeks as well.

For the numerical examples we consider the BNS model without the leverage effect. The invariant distribution of the variance process is assumed to be the inverse Gaussian distribution, which gives marginal log-returns that are approximately normal inverse Gaussian distributed. To have relevant parameters for the volatility dynamics we use the estimates found in Benth, Groth and Lindberg [4] for the Volvo B stock traded at the OMX in Stockholm, where $\delta = 0.0116$, $\gamma = 54.2$ and $\lambda = 0.83$. The estimation procedure uses the number of trades as a measurement of volatility in the market; see [4] for an extensive discussion. The spot price is 374.5 SEK and we assume an interest rate of 3%, which is close to the 3-month STIBOR (Stockholm Interbank Offered Rate) at the time. The contracts tested are a plain vanilla call with strike 400 SEK, a binary call with the same strike, and a knock-out option. The knock-out option has a European payoff function with strike at 380 SEK and a knock-out boundary at 400 SEK.

The implementations are done in Matlab, using generic (pseudo-)random number generators. A variance reduction could be obtained by using low-discrepancy sequences, but this would not change the structure of the results and is not applied here.

A variance reduction technique which can be applied with the Malliavin method, introduced by Fournié et al. [16], is to localise the Malliavin weights around the strike price $K$. Let $\phi(S)$ represent the payoff function, being non-smooth at the strike $K$, and suppose we are interested in the sensitivity with respect to the parameter $\theta$. The Malliavin weight introduces noise, but if the otherwise global weight is localised around $K$ the variance is reduced. Assume we can approximate $\phi(S)$ with a smooth function $\phi_\epsilon(S)$ such that $\phi(S) - \phi_\epsilon(S)$ becomes zero outside the interval $[K - \epsilon, K + \epsilon]$. Define $\Psi_\epsilon(S) = \phi(S) - \phi_\epsilon(S)$; then we see that
$$\partial_\theta E[\phi(S)] = \partial_\theta E[\phi_\epsilon(S)] + \partial_\theta E[\Psi_\epsilon(S)] = E[\phi_\epsilon'(S)\,\partial_\theta S] + E[\Psi_\epsilon(S)\,\pi^\theta],$$
where $\pi^\theta$ is the Malliavin weight for $\theta$. Localising the Malliavin weight reduces the noise at the same time as we avoid taking the derivative of the payoff function close to the strike price. The choice of the function $\phi_\epsilon$ depends on the particular payoff function.

Several things make the implementation of the Malliavin method more complicated in the BNS model than in the Black-Scholes model.
The foremost complicating factor is that we need to simulate the stochastic volatility process. Adding to the complexity, the weights contain one or several integrals which need to be estimated using a numerical integration algorithm, in this case the extended trapezoidal rule (see Press et al. [26]). A poor numerical integration adds a bias to the simulations, and the two measures of Vega suffer especially if we take a coarse discretisation. The execution time scales with the number of time steps used in the simulation and integration, so there is a trade-off between speed and accuracy.

[Figure 1. Simulation of the Gamma for a vanilla call with payoff function $f(x) = (x - K)^+$, $K = 400$. Curves: Malliavin, localised Malliavin, finite difference. Horizontal axis: number of iterations ($\times 10^4$); vertical axis: Gamma.]

For the simulation of the variance process (2.6) we use the series representation proposed in Rosiński [27]; see also Barndorff-Nielsen and Shephard [2]. Recall that the variance process (2.6) can be written explicitly as equation (2.3). To simulate paths of this process we need to simulate integrals of the form
$$\exp(-\lambda t)\int_0^{\lambda t} \exp(s)\,dZ(s). \tag{5.1}$$
Letting $\nu$ be the Lévy measure of $Z(1)$, we denote by $\nu^{-1}$ the inverse of the tail mass function $\nu^+$. Then integrals of this type can be represented as
$$\int_0^{\lambda} f(s)\,dZ(s) \stackrel{\mathcal{L}}{=} \sum_{i=1}^\infty \nu^{-1}(a_i/\lambda)\,f(\lambda r_i),$$
where $a_i$ and $r_i$ are two independent sequences of random variables with $r_i \sim \text{Unif}[0, 1]$ and $a_1 < a_2 < \dots < a_i < \dots$ being the arrival times of a Poisson process with intensity 1. Hence, replacing the upper limit $\lambda$ by $\lambda t$, we can simulate (5.1) by
$$\exp(-\lambda t)\int_0^{\lambda t} \exp(s)\,dZ(s) \stackrel{\mathcal{L}}{=} \exp(-\lambda t)\sum_{i=1}^\infty \nu^{-1}\big(a_i/(\lambda t)\big)\exp(\lambda t\,r_i). \tag{5.2}$$
For our choice of stationary distribution the explicit form of the inverse of the tail mass function is unknown, so we need to invert $\nu^+$ numerically.
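The series representation can be illustrated with a minimal Python sketch (the implementations reported here were done in Matlab, so this is not the original code). It rests on several assumptions that are not taken from the text: the function names are hypothetical; the closed form used for the tail mass of the inverse Gaussian case, $\nu^+(x) = \frac{\delta}{\sqrt{2\pi}}x^{-1/2}e^{-\gamma^2 x/2}$, is assumed from the general theory of OU processes; the inverse is obtained from a tabulated grid with linear interpolation; the sum is truncated after `n_terms` terms; and the path is advanced over a step $\Delta$ by $\sigma^2(t+\Delta) = e^{-\lambda\Delta}\sigma^2(t) + e^{-\lambda\Delta}\int_0^{\lambda\Delta}e^s\,dZ(s)$, so that the series is evaluated at $a_i/(\lambda\Delta)$ and $\lambda\Delta\,r_i$.

```python
import numpy as np


def ig_tail_mass(x, delta, gamma):
    """Tail mass nu^+(x) of Z(1) for an IG(delta, gamma) stationary variance.

    Assumed closed form (not stated in the text):
    nu^+(x) = delta / sqrt(2*pi) * x**(-1/2) * exp(-gamma**2 * x / 2).
    """
    return delta / np.sqrt(2.0 * np.pi) * x ** -0.5 * np.exp(-0.5 * gamma**2 * x)


def make_inverse_tail(tail_mass, x_min, x_max, n_grid=4000):
    """Tabulate the tail mass on a grid and return an interpolated inverse."""
    x = np.geomspace(x_min, x_max, n_grid)
    u = tail_mass(x)                 # strictly decreasing in x
    u_inc, x_dec = u[::-1], x[::-1]  # np.interp needs increasing abscissae

    def inverse(v):
        # Jumps smaller than x_min are dropped; jumps above x_max are capped.
        return np.interp(v, u_inc, x_dec, left=x_max, right=0.0)

    return inverse


def ou_innovation(lam, dt, inverse_tail, n_terms, rng):
    """Truncated series for exp(-lam*dt) * int_0^{lam*dt} exp(s) dZ(s)."""
    a = np.cumsum(rng.exponential(1.0, size=n_terms))  # Poisson arrival times
    r = rng.uniform(0.0, 1.0, size=n_terms)            # uniform jump positions
    jumps = inverse_tail(a / (lam * dt))
    return np.exp(-lam * dt) * np.sum(jumps * np.exp(lam * dt * r))


def simulate_variance(sigma2_0, lam, dt, n_steps, inverse_tail, n_terms=100, seed=0):
    """Advance the variance step by step using the explicit OU solution."""
    rng = np.random.default_rng(seed)
    path = np.empty(n_steps + 1)
    path[0] = sigma2_0
    for k in range(n_steps):
        shock = ou_innovation(lam, dt, inverse_tail, n_terms, rng)
        path[k + 1] = np.exp(-lam * dt) * path[k] + shock
    return path


# Parameters quoted in the text; the grid bounds are illustrative choices.
delta, gamma, lam = 0.0116, 54.2, 0.83
inverse_tail = make_inverse_tail(lambda x: ig_tail_mass(x, delta, gamma), 1e-10, 0.05)
# Start at delta / gamma, the mean of the IG(delta, gamma) distribution.
path = simulate_variance(delta / gamma, lam, dt=1.0 / 250.0, n_steps=250,
                         inverse_tail=inverse_tail)
print(path.min(), path.mean(), path.max())
```

Tabulating the inverse once keeps the per-step cost to a single interpolation call, at the price of a small interpolation error and of neglecting jumps below the lower grid bound.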
Note that the infinite sum appearing in (5.2) must be truncated, which gives another trade-off between speed and accuracy. A numerical inversion using a search method would have to be done for each term of the sum. Since the sum appears in every time step, this is the time-consuming part of the algorithm, and in practice it is too inefficient to be of any use. Instead we tabulate $\nu^+$ on a fine grid, invert it to get a numerical approximation of $\nu^{-1}$, and use linear interpolation to find the values. Avoiding the numerical inversion in each step gives a remarkable speed-up, at the cost of a slight error in each estimate.

Our first example is a vanilla call option, depending only on the terminal value. The Malliavin method performs, as expected, best for Gamma, where the localised version is slightly better than the finite difference method; see Figure 1. The unlocalised Malliavin method proves to be comparable to or even worse than the finite difference method, something that has been reported previously; see Fournié et al. [16].

To really utilise the Malliavin method we need an option known to produce large variance when simulated with the finite difference method. One choice is a binary option with the discontinuous payoff function $\phi(x) = 1_{\{x \ge K\}}(x)$. A similar option, considered in the original paper by Fournié et al. [16], is the option with payoff $1_{[a,b]}(x)$. The binary options are discontinuous, leading to a high variance if simulated with the finite difference method. At the same time we cannot use the pathwise or the likelihood ratio method because of the discontinuity and the choice of model. Fortunately, this is the kind of problem to which the Malliavin method is best suited. As we see in Figure 2 and Figure 3, the unlocalised Malliavin simulation outperforms the finite difference method for both Delta and Gamma.

[Figure 2. Simulation of the Delta for a binary option with payoff function $f(x) = 1_{\{x \ge K\}}(x)$, $K = 400$. Curves: Malliavin, finite difference. Horizontal axis: number of iterations ($\times 10^4$); vertical axis: Delta ($\times 10^{-3}$).]

[Figure 3. Simulation of the Gamma for a binary option with payoff function $f(x) = 1_{\{x \ge K\}}(x)$, $K = 400$. Curves: Malliavin, finite difference. Horizontal axis: number of iterations ($\times 10^4$); vertical axis: Gamma.]

[Figure 4. Left: Simulation of Vega1 for a binary option with payoff function $f(x) = 1_{\{x \ge K\}}(x)$, $K = 400$. Right: Simulation of Vega2 for a binary option with payoff function $f(x) = 1_{\{x \ge K\}}(x)$, $K = 400$, using the perturbation function $u(t) = 1$. Curves: Malliavin, finite difference. Horizontal axis: number of iterations ($\times 10^4$).]

Of particular interest are the two different measures of the sensitivity with respect to the volatility in the BNS model. The first, Vega1, perturbs the initial value of the variance process, while the other, Vega2, adds another stochastic process to the whole volatility process. The two measures are not equal and give different interpretations of the volatility sensitivity. For simplicity, the numerical tests assume that the perturbation function for Vega2 is constant and equal to 1, i.e. $u(t) = 1$. The two approaches give different results but, as we see in Figure 4 for the binary option, the simulation using the Malliavin method is superior to the finite difference method in both cases.

Path dependent options pose some additional problems in the simulations. If we look at the requirements for Delta and Gamma in Proposition 4.3, we notice that the class of functions satisfying the restriction is rather small. One obvious choice is the function
$$a_1(t) = \frac{1_{\{t \in [0, t_1]\}}(t)}{t_1}.$$
We notice that with this function the weight depends only on the first time period of the paths.
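As a brief check of the two statements above, written out under the assumptions of Proposition 4.3 and using only that $t_1 \le t_i$ for every $i$, the normalisation of $a_1$ and the resulting Delta weight read:

$$\int_0^{t_i} a_1(t)\,dt = \frac{1}{t_1}\int_0^{t_1} dt = 1 \quad (i = 1, \dots, m), \qquad \pi^\Delta = \int_0^T \frac{a_1(t)}{x\sigma(t)}\,dW(t) = \frac{1}{x t_1}\int_0^{t_1} \frac{1}{\sigma(t)}\,dW(t).$$

The weight therefore involves the Brownian increments and the volatility on $[0, t_1]$ only, whatever the number of monitoring dates $m$.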
Other possible functions are $a_1$ plus some periodic function with integral equal to zero on each interval $(t_i, t_{i+1})$, $i = 1, 2, \dots$. Tests show that including a periodic or alternating function only adds more noise to the simulations, and in the results below we therefore use $a_1$.

Path dependent options also differ in how suitable they are for the Malliavin method. We implemented and simulated a few different options, including Asian options and different variants of barrier options, not reported with graphics here. Asian options show patterns similar to vanilla options; the Malliavin methods are comparable or inferior to the finite difference method, except for the simulation of Gamma using the localised Malliavin method. For the Asian options the smoothness of the payoff function is similar to that of the vanilla options, which is why the finite difference method performs reasonably well.

The Malliavin method performs much better for a discontinuous option like a knock-in or knock-out option. We test a knock-out option with vanilla call payoff function, having strike at $K = 380$ and a knock-out barrier at 400. In Figures 5 and 6 we see results from the simulation of Gamma and Vega2 using the same parameters as above. We see again that the Malliavin method performs much better than the finite difference method.

[Figure 5. Simulation of Gamma for a knock-out option with payoff function $f(x) = (x - K)^+$, $K = 380$ and knock-out boundary at 400. Curves: Malliavin, finite difference. Horizontal axis: number of iterations ($\times 10^4$); vertical axis: Gamma.]

[Figure 6. Simulation of Vega2 for a knock-out option with payoff function $f(x) = (x - K)^+$, $K = 380$ and knock-out boundary at 400, using the perturbation function $u(t) = 1$. Curves: Malliavin, finite difference. Horizontal axis: number of iterations ($\times 10^4$); vertical axis: Vega2.]

References

[1] D. Applebaum. Lévy Processes and Stochastic Calculus. Cambridge University Press, Cambridge, UK, 2004.
[2] O. E. Barndorff-Nielsen and N. Shephard. Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in economics. J. R. Statist. Soc. B, 63, Part 2:167–241, 2001.
[3] F. E. Benth and M. Groth. The minimal entropy martingale measure and numerical option pricing for the Barndorff-Nielsen and Shephard stochastic volatility model. Submitted, 2006.
[4] F. E. Benth, M. Groth, and C. Lindberg. The implied risk aversion from utility indifference option pricing in a stochastic volatility model. Submitted, 2007.
[5] F. E. Benth and A. Løkka. Anticipative calculus for Lévy processes and stochastic differential equations. Stoch. Stoch. Reports, 76(3):191–211, 2004.
[6] F. E. Benth and T. Meyer-Brandis. The density process of the minimal entropy martingale measure in a stochastic volatility model with jumps. Finance Stoch., 9(4):563–575, 2005.
[7] K. Bichteler, J.-B. Gravereaux, and J. Jacod. Malliavin calculus for processes with jumps. Gordon and Breach, New York, 1987.
[8] M. Broadie and P. Glasserman. Estimating security price derivatives by simulation. Management Science, 42:269–285, 1996.
[9] E. Carlen and E. Pardoux. Differential calculus and integration-by-parts on Poisson space. In Stochastics, Algebra and Analysis in Classical and Quantum Dynamics. Kluwer Acad. Publ., Dordrecht, 1990.
[10] P. Carr and D. Madan. Option valuation using the fast Fourier transform. J. Comput. Finance, 2(4):61–73, 1998.
[11] N. Chen and P. Glasserman. Malliavin Greeks without Malliavin calculus. Preprint, 2006.
[12] R. Cont and P. Tankov. Financial modelling with jump processes. Chapman & Hall, London, UK, 2003.
[13] M. H. A. Davis and M. Johansson. Malliavin Monte Carlo Greeks for jump-diffusions. Stoch. Proc. Appl., 116(1):101–129, 2006.
[14] V. Debelley and N. Privault. Sensitivity analysis of European options in jump-diffusion models via the Malliavin calculus on the Wiener space. Preprint, 2004.
[15] Y. El-Khatib and N. Privault. Computations of Greeks in a market with jumps via the Malliavin calculus. Finance and Stochastics, 8:161–179, 2004.
[16] E. Fournié, J.-M. Lasry, J. Lebuchoux, P.-L. Lions, and N. Touzi. Applications of Malliavin calculus to Monte Carlo methods in finance. Finance and Stochastics, 3:391–412, 1999.
[17] I. Karatzas and S. Shreve. Brownian motion and stochastic calculus. Springer, 1991.
[18] A. Kohatsu-Higa and M. Montero. Malliavin calculus in finance. In Handbook of Computational Finance. Birkhäuser, 2004.
[19] J. A. León, J. L. Solé, F. Utzet, and J. Vives. On Lévy processes, Malliavin calculus and market models with jumps. Finance and Stochastics, 6:197–225, 2002.
[20] C. Lindberg. The estimation of a stochastic volatility model based in the number of trades. Submitted, 2006.
[21] C. Lindberg. Portfolio optimization and a factor model in a stochastic volatility market. Stochastics, 78(5):259–279, 2006.
[22] E. Nicolato and E. Venardos. Option pricing in stochastic volatility models of the Ornstein-Uhlenbeck type. Mathematical Finance, 13(4):445–466, 2003.
[23] D. Nualart. The Malliavin Calculus and related topics. Springer-Verlag, Berlin, 1995.
[24] D. Nualart and J. Vives. Anticipative calculus for the Poisson process based on the Fock space. In Séminaire de Probabilités XXIV 1988/89, Lecture Notes in Math. 1426. Springer-Verlag, Berlin, 1990.
[25] D. L. Ocone and I. Karatzas. A generalized Clark representation formula, with application to optimal portfolios. Stochastics and Stochastics Reports, 34:187–220, 1991.
[26] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C. Cambridge University Press, Cambridge, 1992.
[27] J. Rosiński. On a class of infinitely divisible processes represented as mixtures of Gaussian processes. In S. Cambanis, G. Samorodnitsky, and M. Taqqu, editors, Stable Processes and Related Topics, pages 27–41. Birkhäuser, Basel, 1991.
[28] K. Sato. Lévy processes and infinitely divisible distributions. Cambridge University Press, Cambridge, 1999.
[29] W. Schoutens. Lévy processes in finance. Wiley, 2003.