Changing Times: The Pricing Problem in Non-Linear Models∗

Robert L. Kimmel
Department of Finance, Fisher College of Business, The Ohio State University†

Fisher College of Business Working Paper Series
Charles A. Dice Center for Research in Financial Economics
Dice Center WP 2008-24
Fisher College of Business WP 2008-03-022
This Version: 17 December 2008

This paper can be downloaded without charge from: http://www.ssrn.com/abstract=1317903. An index to the working paper in the Fisher College of Business Working Paper Series is located at: http://www.ssrn.com/link/Fisher-College-of-Business.html

Abstract

Finding conditional moments and derivative prices is a common application in continuous-time financial economics, but these quantities are known in closed-form only for a few specific models. Recent research identifies a large class of models for which solutions to such problems have convergent power series, allowing approximation even when not known in closed-form. However, such power series may converge slowly or not at all for long time horizons, limiting their practical use. We develop the method of time transformation, in which the variable representing time is replaced by a non-linear function of itself. With appropriate choice of the time transformation, power series often converge for much longer time horizons, and also much faster, sometimes uniformly for all time horizons. For applications such as bond pricing, in which the time-to-maturity may be many years, rapid convergence is very important for practical application. The ability to approximate solutions accurately and in closed-form simplifies the estimation of non-affine continuous-time term structure models, since the bond pricing problem must be solved for many different parameter vectors during a typical estimation procedure.
We show through several examples that the series are easy to derive, and, using term structure models for which bond prices are known explicitly, also show that the series are extremely accurate over a wide range of interest rate levels for arbitrarily long maturities; in some cases, they are many orders of magnitude more accurate than series constructed without time transformations. Other potential applications include pricing of callable bonds and credit derivatives.

∗ I would like to thank seminar participants at INFORMS, CIREQ/CIRANO, the Econometric Society North American Winter Meetings, Princeton University, the FERM annual meeting, Quantitative Methods in Finance, the Hong Kong University of Science and Technology, Singapore Management University, The Ohio State University, the Atlanta Federal Reserve Bank, the Bachelier Society Fourth World Congress, and the China International Conference in Finance, Jin Duan, Mark Fisher, Rüdiger Frey, Kewei Hou, Nengjiu Ju, Yue-Kuen Kwok, Haitao Li, and Jun Yu, for many helpful comments and suggestions. Any remaining errors are solely the responsibility of the author. Much of the material in this paper was originally part of Complex Times: Asset Pricing and Conditional Moments under Non-Affine Diffusions.

† Columbus, OH 43210-1144. Phone: +1 614 292 1875. E-mail: kimmel.42@osu.edu.

1. Introduction

Finding conditional moments or derivative prices in an economy driven by a continuous-time process is a common problem in financial economics. However, for all but a few models, numeric methods are needed to solve such problems. For some applications (typically estimation by a maximum likelihood or minimum distance search), the pricing or conditional moment problem must be solved many times; if it must be solved numerically each time, the intense computational burden severely limits the class of models that may be used in practice.
Non-affine models of, for example, the short interest rate process, are frequently found in the literature; for just a few examples, see Chan, Karolyi, Longstaff, and Sanders (1992), Aït-Sahalia (1996), and Andersen, Benzoni, and Lund (2004). However, such studies usually examine the behavior of the observed short interest rate process, without taking bond pricing implications into account. When the term structure of interest rates as a whole is examined, however, use of affine models is much more common; see, for example, Duffie and Kan (1996), Dai and Singleton (2000), Duffee (2002), Dai and Singleton (2002), Duarte (2004), Thompson (2004), Mosburger and Schneider (2005), Aït-Sahalia and Kimmel (2008), and Cheridito, Filipović, and Kimmel (2007). Egorov, Li, and Ng (2008) use an affine model of the simultaneous evolution of yield curves in two different currencies. The widespread use of affine models in this literature is largely due to the difficulty of calculating bond prices for non-affine models. Although non-linear models with explicit term structure implications have appeared in the literature, they are drawn from a few very specific classes that allow closed-form derivation of bond prices; see, for example, Ahn and Gao (1999) and Ahn, Dittmar, and Gallant (2002).¹

Kimmel (2008) develops methods for approximation, by power series, of conditional moments and contingent claim prices in a large class of non-affine diffusion models. However, the convergence properties of such power series may be poor for long time horizons. For some applications, this is not much of a problem; for example, if conditional moments are needed for a method of moments estimation, and data are observed frequently, then approximations need only be accurate at short time intervals, such as daily or weekly. However, for other applications, accuracy at much longer time horizons is needed.
For example, in term structure applications, one may need to calculate bond prices or swap rates with maturities of many years. A power series representation of bond prices may converge for maturities up to some value, and then diverge for longer maturities; for example, in the Cox, Ingersoll, and Ross (1985) model, inspection of the expression for bond prices reveals that it has singularities (i.e., points where it is not complex differentiable) at complex values of maturity. Although such complex maturities have no meaning in a bond pricing application, nonetheless, the singularities at these points prevent convergence of the power series of the bond price beyond positive maturities of the same absolute magnitude. Even when the power series representation of a bond price converges for all maturities, as is the case for the model of Vasicek (1977), the convergence may be so slow as to render the use of the series worthless for practical applications. In these two models, bond prices are known in closed-form, so approximation is unnecessary; however, the same issues arise in other models in which bond prices are not known explicitly.

¹ The non-affine model of Ahn, Dittmar, and Gallant (2002) can be embedded within an affine model with the introduction of additional state variables. These extra state variables do not have independent variation; that is, they are instantaneously perfectly correlated with the original state variables.

We therefore develop the method of time transformations, which often both extends the interval and improves the rate of convergence of power series. This method works by replacing the time variable in the partial differential equation representation of a pricing or conditional moment problem with a non-linear function of itself. We show that proper choice of a non-linear transformation of time never decreases the range of maturities for which the power series converges, and often increases it.
A well-chosen time transformation can extend the range of convergence to all positive maturities in many cases, even if the interval of convergence is very short without the time transformation. Furthermore, appropriate use of a time transformation sometimes establishes uniform convergence for all positive time horizons; that is, the asset price or conditional moment can be approximated twenty, fifty, or one hundred years into the future just as accurately as one week ahead. Without the use of time transformations, such uniform convergence can only be achieved in a few very special cases. The rest of this paper is organized as follows. In Section 2, we review the method of series solution of conditional moment and contingent claim pricing problems, and discuss some of the problems of this approach at long time horizons. In Section 3, we consider a family of non-affine transformations of the time variable, and use these transformations to develop power series with improved convergence properties. In Section 4, we apply time transformation methods to the two large families of models studied by Kimmel (2008), and show that both the range and the rate of convergence of power series solutions to conditional moments and bond prices can often be improved dramatically. Section 5 explores several specific models in detail. For the term structure models of Ahn, Dittmar, and Gallant (2002) and Cox, Ingersoll, and Ross (1985), bond prices are known in closed-form; these models therefore serve as illustrative examples to show the accuracy of series approximations using our method. We find that approximations are both easy to derive and very accurate (even with only a few terms) over a very wide range of initial interest rates, for both short and very long maturities. 
By contrast, approximations based on power series constructed without time transformations perform very poorly for maturities beyond a year or two; they are often many orders of magnitude less accurate than approximations based on our method. We also consider other models, such as the callable bond pricing model of Jarrow, Li, Liu, and Wu (2006), in which prices are not known in closed-form, and show how to approximate them using our technique. Finally, Section 6 concludes. Proofs of all results are found in the appendix.

2. Series Solutions at Long Horizons

Continuous-time asset pricing or conditional moment problems are often based on $N$-dimensional diffusion processes:

\[ dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t \]

with an initial condition $X_{t_0} = x$, where $X_t$ is a vector of $N$ state variables, $\mu(x)$ is an $N$-vector valued function of $x$, $\sigma(x)$ is an $N \times N$-matrix valued function of $x$, and $W_t$ is an $N$-dimensional standard Brownian motion. Criteria for existence of a solution are found in, for example, Karatzas and Shreve (1991), Stroock and Varadhan (1979), or Liptser and Shiryaev (2001). The conditional moment of $\psi(x)$ is defined as:

\[ f(\Delta, x) = E\left[ \psi(X_{t+\Delta}) \,\middle|\, X_t = x \right] \tag{2.1} \]

The price of a derivative security that pays a given function $\psi(x)$ of the state vector at maturity satisfies:

\[ f(\Delta, x) = E\left[ e^{-\int_t^{t+\Delta} r(X_u)\,du}\, \psi(X_{t+\Delta}) \,\middle|\, X_t = x \right] \tag{2.2} \]

where $r(x)$ is the short interest rate. Applications from the early days of continuous-time finance were typically option pricing problems, in which $X_t$ would be the price process of a stock or other underlying asset, and $\psi(X_t)$ would be the payoff of the option as a function of the underlying asset price at maturity. However, typical bond pricing problems are also structured this way; the final condition is then $\psi(X_t) = 1$, which is the (fixed) final payoff of a zero-coupon bond at maturity. The expectation in (2.1) is normally taken under true probabilities, whereas the expectation in (2.2) is taken under risk-neutral probabilities.
Both types of problems can be formulated as a partial differential equation problem:

\[ \frac{\partial f}{\partial \Delta}(\Delta, x) = \sum_{i=1}^{N} \mu_i(x) \frac{\partial f}{\partial x_i}(\Delta, x) + \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \sigma^2_{ij}(x) \frac{\partial^2 f}{\partial x_i \partial x_j}(\Delta, x) - r(x) f(\Delta, x) \tag{2.3} \]

\[ f(0, x) = \psi(x) \tag{2.4} \]

with $r(x) = 0$ for a conditional moment problem. The equivalence of the probabilistic problem of (2.1) or (2.2), and the PDE problem of (2.3) and (2.4), is subject to technical conditions; see Levendorskii (2004a) and Levendorskii (2004b) for a recent discussion of this issue for affine models.

Techniques for approximating conditional probability densities of non-linear diffusions are derived in Aït-Sahalia (2002) (for a single state variable) and Aït-Sahalia (2008) (for multiple state variables); see Aït-Sahalia (1999) for examples, and Egorov, Li, and Xu (2001) for an extension to time-inhomogeneous diffusions. With some modification, such techniques could be used to find state price densities for the case of $r(x) \neq 0$. It might seem that conditional moment and asset pricing problems could be solved by first approximating the conditional probability density or the state price density, and then integrating the $\psi(x)$ function over this density. However, this integral may be difficult or impossible to find explicitly, and convergence of the density approximations to the true density does not guarantee convergence of conditional moment (or asset price) approximations derived in this way to the true conditional moment (or asset price). It is possible, even in cases where the integral of the final condition over the true density exists, that the integral of the final condition over an approximate density does not exist. Noting these problems, Kimmel (2008) skips the intermediate step of finding a density, and instead writes the solution to (2.3) with the specific final condition (2.4) directly as a power series in $\Delta$, centered at zero:

\[ f(\Delta, x) = a_0(x) + \sum_{n=1}^{\infty} a_n(x) \frac{\Delta^n}{n!} \tag{2.5} \]

and derives a recursive relation for the coefficients:

\[ a_0(x) = \psi(x) \tag{2.6} \]

\[ a_n(x) = \sum_{i=1}^{N} \mu_i(x) \frac{\partial a_{n-1}}{\partial x_i}(x) + \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \sigma^2_{ij}(x) \frac{\partial^2 a_{n-1}}{\partial x_i \partial x_j}(x) - r(x)\, a_{n-1}(x) \tag{2.7} \]

If the $\mu_i(x)$, $\sigma^2_{ij}(x)$, and $r(x)$ coefficients, as well as the final condition $\psi(x)$, are all infinitely differentiable,² then derivation of the power series coefficients $a_n(x)$, as in (2.6) and (2.7), is straightforward. Note that the time-homogeneity of the coefficients of (2.3) is important; if these coefficients were instead explicit functions of time, calculating the power series coefficients by a method similar to (2.6) and (2.7) would be much more difficult or perhaps impossible. Even in the time-homogeneous case, though, establishing convergence of the power series is a much trickier issue than establishing existence of the coefficients; if the conditional moment or derivative price, $f(\Delta, x)$, extended to complex values of $\Delta$, has a singularity at some $\Delta_0 \neq 0$, then the power series does not converge for any $|\Delta| > |\Delta_0|$; furthermore, if $f(\Delta, x)$ has a singularity at $\Delta_0 = 0$, the power series does not converge anywhere (except trivially at $\Delta = 0$). The region of convergence of the power series thus depends critically on the nature of $f(\Delta, x)$, not just for the positive real values of $\Delta$ that are of interest for applications, but for all complex $\Delta$. However, $f(\Delta, x)$ in most cases is known only implicitly as the solution to the PDE, so it is not possible to determine by direct inspection whether $f(\Delta, x)$ has the necessary properties for a convergent power series (and, furthermore, there would be little reason to find the power series if $f(\Delta, x)$ were known explicitly). Rather, it must be determined from the coefficients and final condition of the problem (2.3) and (2.4), whether $f(\Delta, x)$ can be represented by a power series.
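To make the recursion (2.6) and (2.7) concrete, the following is a minimal Python sketch for the scalar case with polynomial coefficients, using only the standard library. It assumes Vasicek-type risk-neutral dynamics, $\mu(x) = \kappa(\theta - x)$, constant $\sigma^2$, $r(x) = x$ and $\psi(x) = 1$, chosen here only because the bond price is known explicitly and can serve as a check. The parameter values, the helper names (`poly_add`, `series_coefficients`, and so on) and the list-of-coefficients representation are illustrative assumptions, not part of the paper.

```python
import math

# Polynomials in x are stored as coefficient lists in ascending order: [c0, c1, c2, ...].

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0.0) + (q[i] if i < len(q) else 0.0) for i in range(n)]

def poly_mul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_scale(p, c):
    return [c * pi for pi in p]

def poly_deriv(p):
    # Derivative with respect to x; a constant differentiates to the zero polynomial [0.0].
    return [i * p[i] for i in range(1, len(p))] or [0.0]

def poly_eval(p, x):
    return sum(c * x**i for i, c in enumerate(p))

def series_coefficients(mu, sigma2, r, psi, n_terms):
    """Scalar version of (2.6)-(2.7): returns [a_0, a_1, ..., a_{n_terms}]."""
    coeffs = [psi]                                   # a_0 = psi, as in (2.6)
    for _ in range(n_terms):
        prev = coeffs[-1]
        d1 = poly_deriv(prev)
        d2 = poly_deriv(d1)
        nxt = poly_add(poly_mul(mu, d1), poly_scale(poly_mul(sigma2, d2), 0.5))
        nxt = poly_add(nxt, poly_scale(poly_mul(r, prev), -1.0))
        coeffs.append(nxt)                           # a_n from a_{n-1}, as in (2.7)
    return coeffs

def truncated_series(coeffs, delta, x):
    """Partial sum of (2.5) using the supplied coefficients."""
    return sum(poly_eval(a, x) * delta**n / math.factorial(n) for n, a in enumerate(coeffs))

def vasicek_bond_price(delta, x, kappa, theta, sigma):
    """Standard closed-form Vasicek zero-coupon bond price, used only as a benchmark."""
    b = (1.0 - math.exp(-kappa * delta)) / kappa
    a = (theta - sigma**2 / (2.0 * kappa**2)) * (b - delta) - sigma**2 * b**2 / (4.0 * kappa)
    return math.exp(a - b * x)

# Illustrative (assumed) parameter values.
KAPPA, THETA, SIGMA = 0.5, 0.06, 0.02
MU = [KAPPA * THETA, -KAPPA]      # mu(x) = kappa * (theta - x)
SIGMA2 = [SIGMA**2]               # sigma^2(x) constant
R = [0.0, 1.0]                    # r(x) = x
PSI = [1.0]                       # zero-coupon bond payoff

COEFFS = series_coefficients(MU, SIGMA2, R, PSI, 12)

for maturity in (0.5, 2.0, 10.0):
    approx = truncated_series(COEFFS, maturity, 0.05)
    exact = vasicek_bond_price(maturity, 0.05, KAPPA, THETA, SIGMA)
    print(f"maturity {maturity:5.1f}: series {approx:.10f}  closed form {exact:.10f}")
```

With these assumed inputs the first coefficients are $a_1(x) = -x$ and $a_2(x) = x^2 + \kappa x - \kappa\theta$, which agree with a direct expansion of the closed-form price. Comparing the printed values across maturities gives a rough feel for how the quality of a fixed-length truncation depends on the horizon.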
Kimmel (2008) shows that, for any specification of the coefficients of the scalar version of the PDE, there exists an infinite-dimensional class of final conditions $\psi(x)$ such that the corresponding solution $f(\Delta, x)$ is analytic in some neighborhood of $\Delta = 0$; its power series then converges within any circle $|\Delta| < r$ contained within that region.³ Although he provides a full characterization of all solutions to (2.3) and (2.4) that have convergent power series representations, in practice, this characterization is somewhat tricky to use; it is difficult to determine whether a given $\psi(x)$ lies within the class of final conditions that generate solutions with convergent power series. He therefore goes on to describe two particular families of scalar PDEs for which it is possible to express the class of solutions with convergent power series in a more useful form. Using the change of variable techniques of Colton (1979), he expresses the solution to (2.3) and (2.4) in the scalar case as:

\[ f(\Delta, x) = \eta(x)\, h(\Delta, y(x)) \]

For appropriate choice of $\eta(x)$ and $y(x)$, the function $h(\Delta, y)$ satisfies the transformed problem:

\[ \frac{\partial h}{\partial \Delta}(\Delta, y) = \frac{1}{2} \frac{\partial^2 h}{\partial y^2}(\Delta, y) - r_h(y)\, h(\Delta, y) \tag{2.8} \]

\[ h(0, y) = g(y) \tag{2.9} \]

Since any scalar PDE problem can be converted to this form by change of variables, there is no loss of generality by considering only this apparently restrictive form. In particular, note that the changes in variables do not involve the time variable, so that if, for some $x$, $f(\Delta, x)$ is an analytic function in some region of $\Delta$, then $h(\Delta, y(x))$ is also an analytic function of $\Delta$ in the same region.

² As noted in Kimmel (2008), infinite differentiability of these functions is a sufficient, but not necessary condition for existence of the power series coefficients; it is neither necessary nor sufficient for the convergence of the series.

³ The equivalence of the PDE problem and the probabilistic problem must still be established.
Kimmel (2008) refers to PDEs expressed in this form as “canonical,” and explicitly characterizes the analytic solutions to (2.8) and (2.9) for two particular specifications of $r_h(y)$:⁴

\[ r_h(y) = \frac{b^2}{2}(y - a)^2 + d \tag{2.10} \]

\[ r_h(y) = \frac{a}{y^2} + \frac{b^2}{2} y^2 + d \tag{2.11} \]

These two specifications include many of the models that are commonly used in the literature. For example, the term structure models of Vasicek (1977), Cox, Ingersoll, and Ross (1985), Ahn, Dittmar, and Gallant (2002), and Ahn and Gao (1999) are all covered by one of these two cases, as is the callable bond pricing problem of Jarrow, Li, Liu, and Wu (2006). However, many models that have never previously appeared in the literature can also be expressed in one of these two forms, including many in which the state variable process is strongly non-linear (see the discussion in Kimmel (2008) and in Section 5).

For both specifications, the region of analyticity (i.e., complex differentiability) of $h(\Delta, y)$ (and therefore the interval of convergence for its power series) is determined by two attributes of the final condition $g(y)$. Specifically, for the power series to converge at all, the final condition must be everywhere analytic. Note that this requirement applies to all complex values of $y$, even if only real values are meaningful in the real-world problem. If this smoothness requirement is satisfied, then the size of the region of analyticity (and therefore the region of convergence of the power series) is determined by a growth condition on $g(y)$. For (2.10), this condition is:

\[ \left| e^{-\frac{b}{2}(y-a)^2} g(y) \right| \le k\, e^{\frac{\|y\|^2}{2}} \]

for some $k > 0$ and some norm (over the reals) $\|y\|$.⁵ A special case is obtained by taking the norm $\|y\| \equiv |y| / \sqrt{k_0}$ for some $k_0 > 0$, in which case the growth condition is:

\[ \left| e^{-\frac{b}{2}(y-a)^2} g(y) \right| \le k\, e^{\frac{|y|^2}{2 k_0}} \]

Note that, like the smoothness condition, the growth condition must be satisfied for all complex values of $y$, even if only real values are meaningful for the application.
The solution $h(\Delta, y)$ is then analytic in a region that is circular in the special case $b = 0$, but is in general an elongated ellipse-like shape.⁶ This region includes the circle $|\Delta| < [\ln(1 + k_0 |2b|)] / |2b|$, and the power series of $h(\Delta, y)$ converges within this circle. If the growth condition is not satisfied for this type of norm with any $k_0$, then the power series does not converge anywhere (except trivially at $\Delta = 0$); if the growth condition is satisfied by this type of norm for every $k_0$, then the power series converges everywhere. Within the region of analyticity in $\Delta$, the solution $h(\Delta, y)$ is also everywhere analytic in $y$, although this property is not particularly important, since we, like Kimmel (2008), consider power series only in the time variable. Analyticity of the final condition $g(y)$ is needed to establish analyticity of $h(\Delta, y)$ in $\Delta$, and therefore convergence of the power series, but analyticity of $h(\Delta, y)$ in $y$ has no relevance for convergence, since we do not consider power series in $y$.

⁴ Kimmel (2008) also considers the case where $r_h(y)$ is linear, but for our purposes, analysis of this case is not very different from (2.10) with $b = 0$.

⁵ See Kimmel (2008) for a brief discussion of such norms. Multiples of the absolute value function $\|y\| \equiv |y| / \sqrt{k_0}$ are valid norms, but so are elliptical norms such as $\|y\| \equiv \sqrt{(\mathrm{Re}\, y)^2 / k_1 + (\mathrm{Im}\, y)^2 / k_2}$. Many other types of norms are also possible.

The situation for (2.11) is more complicated, but in many ways similar to that of (2.10). For the solution $h(\Delta, y)$ to be analytic, the final condition must be of the form:⁷

\[ g(y) = \begin{cases} g_1(y)\, y^{\frac{1 - \sqrt{1+8a}}{2}} + g_2(y)\, y^{\frac{1 + \sqrt{1+8a}}{2}}, & \frac{\sqrt{1+8a}}{2} \notin \mathbb{N} \\[4pt] g_1(y)\, y^{\frac{1 - \sqrt{1+8a}}{2}} + g_2(y)\, y^{\frac{1 + \sqrt{1+8a}}{2}} \ln y, & \frac{\sqrt{1+8a}}{2} \in \mathbb{N} \end{cases} \tag{2.12} \]

where $g_1(y)$ and $g_2(y)$ are both even, everywhere analytic functions that satisfy the growth bounds:

\[ \left| e^{-\frac{b}{2} y^2} g_1(y) \right| \le k\, e^{\frac{\|y\|^2}{2}}, \qquad \left| e^{-\frac{b}{2} y^2} g_2(y) \right| \le k\, e^{\frac{\|y\|^2}{2}} \]

for some $k > 0$ and some norm (over the reals) $\|y\|$.
Analogously with (2.10), $g_1(y)$ and $g_2(y)$ must be smooth (analytic) and satisfy the growth bounds for all complex values of $y$, even though the conditional moment or pricing problem only has meaning for real values of $y$. Again taking a norm of the form $\|y\| \equiv |y| / \sqrt{k_0}$, the solution $h(\Delta, y)$ is then analytic in a region of $\Delta$ that includes the circle $|\Delta| < [\ln(1 + k_0 |2b|)] / |2b|$, and its power series converges everywhere within that circle. Except for specific values of $a$, the solution is not everywhere analytic in $y$, but can be expressed in terms of two functions that are analytic in both $\Delta$ and $y$; see Kimmel (2008). However, as in the previous case, analyticity of $h(\Delta, y)$ in $y$ is not particularly important.

The results of Kimmel (2008) therefore allow us to determine when, for certain versions of the canonical PDE, solutions are analytic in the time variable, and can therefore be approximated by the first few terms of a convergent power series. However, for many applications, we would like to approximate solutions for very large values of $\Delta$, and this presents some practical difficulties. First, if there is a singularity anywhere in the complex plane, then convergence of the power series is limited to a circle whose radius is the distance from the origin to the singularity. Therefore, if a power series is to converge for arbitrarily large time horizons, the solution $h(\Delta, y)$ must have an analytic extension to all complex values of $\Delta$, since a singularity anywhere in the complex plane limits the range of convergence for positive $\Delta$. Second, even if $h(\Delta, y)$ is everywhere analytic, and its power series converges everywhere, such convergence is, in general, not uniform. Consequently, convergence for large values of $\Delta$ may be very slow, so that many terms of the power series must be calculated for a good approximation; use of a series is then not practical.

⁶ Although this shape superficially resembles an ellipse, in that it is a smooth, closed curve that is longer in one dimension than the other, its exact shape is different.

⁷ We follow Kimmel (2008) here, but note that the first case listed in (2.12) may be used whether or not $\sqrt{1+8a}/2$ is an integer. If it is, the presence of $g_2(y)$ is redundant; simply adding $y^{\sqrt{1+8a}} g_2(y)$ to $g_1(y)$, and setting $g_2(y)$ equal to zero, generates exactly the same final condition.

Table 1 describes the convergence properties of series representations of bond prices in several term structure models. In all of these models, bond prices are known in closed-form.⁸ In the model of Vasicek (1977), the bond pricing problem reduces to the canonical problem with $r_h(y)$ specified by (2.10), and the final condition satisfies the corresponding growth condition for all values of $k_0$. Consequently, bond prices are everywhere analytic, and a power series converges for all maturities. However, this convergence is not uniform, and can be very slow for long maturities. The situation for the model of Cox, Ingersoll, and Ross (1985) is quite different. Here, the bond pricing problem reduces to the canonical problem (2.8) and (2.9), with $r_h(y)$ specified by (2.11). The model of Ahn, Dittmar, and Gallant (2002) is similar to that of Cox, Ingersoll, and Ross (1985), but with $r_h(y)$ specified by (2.10) instead. However, in both cases, the applicable growth condition is satisfied for some $k_0$ and not for others, so that analyticity of bond prices cannot be established for all $\Delta$. In fact, in both models, bond prices have singularities at complex values of $\Delta$. Although such complex values are meaningless for applications, the singularities nonetheless prevent convergence of the power series for positive values. As such, bond prices in these models can be represented by a power series up to a certain maturity; beyond that maturity, the series diverges.
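The guaranteed circle of convergence quoted earlier for both canonical specifications, $|\Delta| < [\ln(1 + k_0 |2b|)] / |2b|$, is easy to evaluate numerically. The short Python sketch below does so; the function name `guaranteed_radius` and the sample values of $k_0$ and $b$ are hypothetical, and the case $b = 0$ is read as the limit of the expression, which equals $k_0$.

```python
import math

def guaranteed_radius(k0, b):
    """Radius ln(1 + k0*|2b|) / |2b| of the circle in Delta quoted in the text.

    For b = 0 the expression is interpreted as its limit, which is k0.
    """
    two_b = abs(2.0 * b)
    if two_b == 0.0:
        return k0
    return math.log1p(k0 * two_b) / two_b

# Hypothetical values, chosen only to show how the bound behaves.
for k0 in (1.0, 10.0, 100.0):
    for b in (0.0, 0.1, 0.5):
        print(f"k0 = {k0:6.1f}, b = {b:3.1f}: radius = {guaranteed_radius(k0, b):8.4f}")
```

For $b \neq 0$ the radius grows only logarithmically in $k_0$, so a growth condition that holds only up to some finite $k_0$ translates into a bounded range of maturities for which convergence is guaranteed.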
The situation is worse still for the model of Ahn and Gao (1999), in which bond prices have a singularity at $\Delta = 0$. The power series for bond prices in this model does not converge for any positive maturity.

In all four of the models described in Table 1, bond prices are known in closed-form, so discussion of the convergence properties of their power series is for illustrative purposes only. However, the results of Kimmel (2008) apply to many other models, in which bond prices (or other derivative prices or conditional moments) are not known in closed-form, and similar issues arise in those models. For example, Jarrow, Li, Liu, and Wu (2006) consider a callable bond pricing problem, in which bond prices satisfy:⁹

\[ \frac{\partial f}{\partial \Delta}(\Delta, x) = \kappa(\theta - x) \frac{\partial f}{\partial x}(\Delta, x) + \frac{\sigma^2 x}{2} \frac{\partial^2 f}{\partial x^2}(\Delta, x) - \left( c_1 x + \frac{c_2}{x} \right) f(\Delta, x) \]

\[ f(0, x) = 0 \]

This problem can, by change of variables, be converted to the canonical form (2.8) and (2.9), with $r_h(y)$ given by (2.11). With some parameter restrictions, the final condition is sufficiently smooth, and also satisfies the growth condition for some $k_0$, but not for other values. Consequently, the solution to their problem can be represented by a power series that converges for short maturities, but diverges for long maturities. In this problem, the bond price has a singularity for complex values of $\Delta$; although these values have no meaning in the context of the real-world problem, such singularities prevent convergence of a power series for positive values of $\Delta$.

⁸ For one of the models, that of Ahn and Gao (1999), the definition of “closed-form” must include confluent hypergeometric functions, which are usually defined as the solution to an ordinary differential equation. With a stricter definition of “closed-form,” bond prices in this model are not known explicitly.

⁹ Our notation differs somewhat from theirs.

We propose the method of time transformations to remedy these problems. Specifically, we apply a change of variables to $f(\Delta, x)$, the solution to (2.3) and (2.4), replacing the time variable, $\Delta$, with a non-linear function of itself, $\tau(\Delta)$. The solution, expressed in terms of $\tau$ instead of $\Delta$, satisfies a modified general PDE, with the same final condition. By expressing the pricing PDE in terms of $\tau$ instead of $\Delta$, the range of convergence of the power series solution can often be improved. In particular, if the solution $h(\Delta, y)$ is well-behaved for all positive values of $\Delta$ (which are the values of interest in real-world applications), a well-chosen transformation of the time variable extends the convergence of a power series to all $\Delta \in [0, +\infty)$. In other words, even if there are singularities for negative or complex values of $\Delta$, so that a power series directly in $\Delta$ converges only for a limited range of positive values, the method of time transformation can extend this range to any arbitrarily long range. If certain additional conditions are met, the time transformation also makes such convergence uniform; in this case, the power series representation of $h(\Delta, y)$ converges just as quickly for $\Delta$ equal to 30 or 50 years as for $\Delta$ equal to one week. Furthermore, the time transformation we use preserves the ease of calculation of power series coefficients through a recursive relation that is a slightly modified version of (2.7). In the next section, we study the method of time transformations in detail, and examine its effect on convergence of the power series representations of solutions to (2.3) and (2.4).

3. Time Transformations

As noted in the preceding section and in Kimmel (2008), a singularity at a negative or complex value of $\Delta$ (i.e., values not of interest for typical applications) nonetheless causes the power series representation of a derivative price or conditional moment $f(\Delta, x)$ to fail to converge on the interval $\Delta \in [0, +\infty)$ (i.e., the values normally of interest in applications).
However, provided any singularities in $f(\Delta, x)$ are not near the positive real axis (which is usually the case in applications), it is nonetheless possible, as we now show, to construct power series that converge for all values of $\Delta \in [0, +\infty)$. In some cases, convergence on this interval can even be uniform, so that the series converges just as quickly for a time period of ten, twenty, or one hundred years into the future, as it does for one week ahead. This goal can be realized by performing a change of the time variable, replacing $\Delta$ with some function $\tau(\Delta)$, and constructing a power series in $\tau(\Delta)$ instead of one in $\Delta$ directly. A power series converges within a circle in the complex plane whose radius is the distance from the point of expansion to the nearest singularity. However, if the function $\tau(\Delta)$ is not affine in $\Delta$, then the transformation effectively distorts time, stretching it in some directions and compressing it in others, so that a circle in the complex plane of $\tau(\Delta)$ is a non-circular shape in the complex plane of $\Delta$. By appropriate choice of $\tau(\Delta)$, singularities in $f(\Delta, x)$ can be moved away from the origin, whereas the interval $\Delta \in [0, +\infty)$ can be compressed, so that each point moves closer to the origin, and falls within the circle of convergence.

Throughout, we use the same time transformation (or more precisely, since it depends on a parameter, family of time transformations), described in Section 3.1. However, we use this transformation for two distinct purposes, which are described in Sections 3.2 and 3.3, respectively. The first use increases the range of convergence of the power series of the conditional moment or asset price sought; that is, the time transformation allows a series to converge for a longer range of time horizons than a series constructed without the time transformation. We refer to this type of convergence as “small circle convergence,” for reasons discussed below.
If the price or conditional moment is well behaved for all positive time horizons, then its power series can be made to converge for all values out to $+\infty$ through use of the time transformation. Even in this case, though, this convergence is in general not uniform; that is, it is slower for longer time horizons. Then only a small number of terms in the power series may be required to approximate the solution very accurately for small $\Delta$, but the number of terms needed increases with $\Delta$, so that practical use of the power series is limited to smaller values of $\Delta$. The second use of the time transformation, which we refer to as “large circle convergence,” establishes uniform convergence out to $+\infty$. If convergence is uniform, then the solution to (2.3) and (2.4) can be approximated by a given number of terms in the power series, and the accuracy of the approximation does not depend on $\Delta$; solutions at a one hundred year time horizon are as easy to approximate as solutions at a one week time horizon, for example. Much stronger restrictions than in the small circle case must be satisfied; specifically, $f(\Delta, x)$ must have the right type of behavior as $\Delta$ approaches $+\infty$, and use of the time transformation for large circle convergence often requires that the state variable be changed as well. In both the small and large circle convergence cases, however, the computation of a power series in $\tau$ (instead of $\Delta$) proceeds by a simple recursive relation.

In this section, we examine this family of transformations, and the conditions needed for both small and large circle convergence. Throughout, we use the term structure model of Cox, Ingersoll, and Ross (1985) (for which bond prices are known explicitly) as an illustrative example, but the issues that arise for this model also arise for other models in which prices are not known explicitly.

3.1. The Basic Time Transformation

The basic time transformation used throughout is:

\[ \tau = \tau_k(\Delta) \equiv 1 - \exp(-k\Delta) \tag{3.1} \]

for some $k \neq 0$.
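As a quick numerical illustration of (3.1), the following minimal Python sketch evaluates the transformation at a few positive horizons and at one complex point. The value $k = \pi/3$, the list of horizons, and the singularity placed at $\Delta = i$ are all assumptions made purely for illustration; they are not taken from any model discussed in the paper.

```python
import cmath
import math

def tau_k(delta, k):
    """Time transformation (3.1): tau = 1 - exp(-k * delta); delta may be complex."""
    return 1.0 - cmath.exp(-k * delta)

K = math.pi / 3.0                           # illustrative choice of k (assumption)
HORIZONS = [1.0 / 52.0, 1.0, 10.0, 30.0]    # horizons in years (assumption)
TAUS = [tau_k(d, K).real for d in HORIZONS]

# A hypothetical singularity at Delta = i: a power series in Delta centered at
# zero could then converge at most for |Delta| < 1.
SINGULARITY_IN_DELTA = 1j
SINGULARITY_IN_TAU = tau_k(SINGULARITY_IN_DELTA, K)

for d, t in zip(HORIZONS, TAUS):
    print(f"Delta = {d:8.4f}  ->  tau = {t:.12f}")
print("distance of singularity from origin in Delta:", abs(SINGULARITY_IN_DELTA))
print("distance of singularity from origin in tau:  ", abs(SINGULARITY_IN_TAU))
```

In this toy configuration every positive horizon is mapped into the interval $[0, 1)$, while the image of the hypothetical singularity lies at distance one from the origin in the $\tau$ plane. Note that for very long horizons $\tau$ becomes indistinguishable from one in double precision, so a practical implementation may need more careful arithmetic near $\tau = 1$.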
In typical applications, it is most useful to choose k to be a real positive number, but the results here are valid for any complex (but non-zero) value of k. The inverse transformation is:

$$\Delta = \Delta_k(\tau) = -\frac{\ln(1-\tau)}{k} \tag{3.2}$$

Note that every number (except zero) has infinitely many complex logarithms; we choose the principal branch of the logarithm function, −π < Im [ln (1 − τ )] ≤ +π. With this choice, ∆ is an analytic (i.e., complex differentiable) function of τ for all values of τ except real values τ ≥ 1. We then express the solution to (2.3) and (2.4), f (∆, x), as:

$$f(\Delta, x) = w(\tau, x) = w(\tau_k(\Delta), x) \tag{3.3}$$

with the inverse relation:

$$w(\tau, x) = f(\Delta, x) = f(\Delta_k(\tau), x) \tag{3.4}$$

Note that τ (0) = 0. Note also that f (∆, x), without additional restrictions, can be expressed in the form (3.3) only for −π < Im [−k∆] ≤ +π; however, for real k, this range includes all real ∆, which are the values that matter for applications. If f (∆, x) is an analytic function around ∆ = 0, then w (τ, x) is also an analytic function around τ = 0; the reverse implication also holds. Therefore, either both have a convergent power series (around ∆ = 0 and τ = 0, respectively), or neither does.

The power series for w (τ, x) can be found by essentially the same method used to find the power series for f (∆, x). By substituting the expression for f (∆, x) from (3.3) into (2.3) and (2.4), the PDE can then be rewritten in terms of τ and w (τ, x) instead of ∆ and f (∆, x):

$$k(1-\tau)\frac{\partial w}{\partial \tau}(\tau, x) = \sum_{i=1}^{N} \mu_i(x)\frac{\partial w}{\partial x_i}(\tau, x) + \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \sigma^2_{ij}(x)\frac{\partial^2 w}{\partial x_i \partial x_j}(\tau, x) - r(x)\,w(\tau, x) \tag{3.5}$$

$$w(0, x) = \psi(x) \tag{3.6}$$

This PDE is of the same form as (2.3) and (2.4), except that the derivative with respect to the time variable in (3.5) has an explicitly time-dependent coefficient that is not present in (2.3). However, because of the linear form of this coefficient, it is still possible to derive a power series for w (τ, x) from a simple recursive relation.
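The transformation (3.1) and its principal-branch inverse (3.2) can be sketched in a few lines. This is only an illustration: the function names `tau_k` and `delta_k` and the value k = 0.1 are choices made here, not part of the paper. The second round trip shows why (3.3) holds only inside the strip −π < Im [−k∆] ≤ +π.

```python
import cmath
import math

def tau_k(delta, k):
    """Time transformation (3.1): tau = 1 - exp(-k * delta)."""
    return 1 - cmath.exp(-k * delta)

def delta_k(tau, k):
    """Inverse transformation (3.2); cmath.log is the principal branch."""
    return -cmath.log(1 - tau) / k

k = 0.1  # illustrative value only

# Inside the strip -pi < Im(-k * delta) <= pi the round trip recovers delta.
inside = 3.0 + 20.0j            # Im(k * delta) = 2 < pi
round_trip_inside = delta_k(tau_k(inside, k), k)

# Outside the strip the principal branch returns a different point with the
# same tau: the imaginary part is shifted by a multiple of 2 * pi / k.
outside = 3.0 + 40.0j           # Im(k * delta) = 4 > pi
round_trip_outside = delta_k(tau_k(outside, k), k)

print(round_trip_inside)   # close to (3+20j)
print(round_trip_outside)  # close to 3 + (40 - 20*pi)j
```

For real k and real ∆ the strip condition is always met, so the round trip is exact up to rounding for every maturity of practical interest.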
We now write the power series representation of w (τ, x) in τ , rather than of f (∆, x) in ∆:

$$w(\tau, x) = b_0(x) + \sum_{n=1}^{\infty} b_n(x)\frac{\tau^n}{n!} \tag{3.7}$$

The initial coefficient b0 (x) follows immediately from (3.6):

$$b_0(x) = \psi(x) \tag{3.8}$$

By plugging (3.7) into (3.5), and gathering terms of like order in τ , we find the subsequent coefficients bn (x) for n ≥ 1 recursively:

$$b_n(x) = \frac{1}{k}\sum_{i=1}^{N} \mu_i(x)\frac{\partial b_{n-1}}{\partial x_i}(x) + \frac{1}{2k}\sum_{i=1}^{N}\sum_{j=1}^{N} \sigma^2_{ij}(x)\frac{\partial^2 b_{n-1}}{\partial x_i \partial x_j}(x) - \left[\frac{r(x)}{k} + 1 - n\right] b_{n-1}(x) \tag{3.9}$$

A power series in τ can therefore be calculated just as easily as a power series in ∆, using a very similar recursive relation; the only differences are that the right-hand side has been divided through by k, and an additional term appears in the coefficient of bn−1 (x) in (3.9) that does not appear in the coefficient of an−1 (x) in (2.7). As in the case discussed in Section 2, time-homogeneity of the coefficients of the original pricing PDE is important, but so is the particular form of the time transformation. Calculating the coefficients bn (x) would be much more difficult, or perhaps impossible, if the coefficients of the original PDE were to depend explicitly on time, or if some other type of time transformation were chosen.

The inverse time transformation ∆k (τ ) is an analytic function in a neighborhood of τ = 0. If f (∆, x) is an analytic function around ∆ = 0, then it follows from (3.4) that w (τ, x) is also an analytic function around τ = 0. If a solution f (∆, x) to (2.3) and (2.4) cannot be found explicitly, it can still be approximated by a power series in two different ways, which we summarize in the following remark.

Remark 1. If the solution f (∆, x) to the conditional moment or asset pricing problem (2.3) and (2.4) can be shown to be analytic in a neighborhood of ∆ = 0 (through the results of Kimmel (2008) or by some other method), then it can be approximated through a truncated power series in two distinct ways:

1.
The first few terms of the power series of f (∆, x) (the solution to the original problem, (2.3) and (2.4)) in ∆ can be calculated using (2.6) and (2.7).

2. The first few terms of the power series of w (τ, x) (the solution to the transformed problem, (3.5) and (3.6)) in τ can be calculated using (3.8) and (3.9). An approximation of f (∆, x) (the solution to the original problem (2.3) and (2.4)) can then be constructed from the approximation to w (τ, x) using (3.3).

With either of these methods, f (∆, x) can be approximated by the first few terms of a convergent power series in a neighborhood of ∆ = 0. Since the power series converges in this neighborhood, the approximation becomes more accurate when more terms are added. However, the size and shape of the neighborhood of ∆ = 0 in which the series converges is, in general, not the same with the two methods. The first method could even be considered a special case of the second. If we scale the time transformation by k, then we see that the scaled transformation τk (∆) /k approaches ∆ as k goes to zero:

$$\lim_{k \to 0} \frac{\tau_k(\Delta)}{k} = \Delta$$

Throughout, we sometimes speak loosely of the power series of w (τ, x) converging for certain values of ∆. A power series in τ converges within |τ | < s for some s ≥ 0. Therefore, a statement that a power series in τ converges in a given region of ∆ simply means that |τk (∆)| < s everywhere within that region of ∆. Since the relation between τ and ∆ depends on a parameter k, the value of k must be stated explicitly, or be clear from the context, for such statements to be meaningful.

If all that is needed is an approximation of f (∆, x) in some very small neighborhood of ∆ = 0, it usually makes little difference which method is used; if f (∆, x) is analytic around ∆ = 0, then a power series in ∆ converges around this point, and the power series of w (τ, x) also converges around τ = 0, which corresponds to a neighborhood of ∆ = 0.
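The second method of Remark 1 can be sketched for a scalar state variable whose drift, squared diffusion, discount rate, and final condition are all polynomials in x, so that the derivatives in (3.9) can be taken exactly on coefficient lists. The helper names (`p_add`, `b_coefficients`, `f_approx`, and so on) are hypothetical, and the sketch assumes that (2.3) has the form implied by (3.5). The illustrative problem is the conditional mean of an Ornstein-Uhlenbeck process (µ (x) = −κx, σ = 1, r = 0, ψ (x) = x), whose exact solution x exp (−κ∆) is discussed in Section 3.3.2; the parameter values are arbitrary.

```python
import math

# Polynomials in the scalar state x are stored as coefficient lists [c0, c1, ...].
def p_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0.0) + (q[i] if i < len(q) else 0.0) for i in range(n)]

def p_scale(p, c):
    return [c * a for a in p]

def p_mul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def p_diff(p):
    return [i * p[i] for i in range(1, len(p))] or [0.0]

def p_eval(p, x):
    return sum(c * x**i for i, c in enumerate(p))

def b_coefficients(mu, sigma2, r, psi, k, n_terms):
    """b_0..b_{n_terms} from (3.8) and the scalar (N = 1) case of (3.9)."""
    b = [psi]
    for n in range(1, n_terms + 1):
        prev = b[-1]
        d1 = p_diff(prev)
        d2 = p_diff(d1)
        drift = p_scale(p_mul(mu, d1), 1.0 / k)
        diffusion = p_scale(p_mul(sigma2, d2), 0.5 / k)
        # Bracketed coefficient [r(x)/k + 1 - n] applied to b_{n-1}.
        bracket = p_add(p_scale(p_mul(r, prev), 1.0 / k), p_scale(prev, 1.0 - n))
        b.append(p_add(p_add(drift, diffusion), p_scale(bracket, -1.0)))
    return b

def f_approx(b, k, delta, x):
    """Truncated series (3.7) in tau, mapped back to delta through (3.3)."""
    tau = 1.0 - math.exp(-k * delta)
    return sum(p_eval(bn, x) * tau**n / math.factorial(n) for n, bn in enumerate(b))

kappa, k = 0.5, 0.25   # illustrative values, chosen so that kappa / k = 2
b = b_coefficients(mu=[0.0, -kappa], sigma2=[1.0], r=[0.0], psi=[0.0, 1.0],
                   k=k, n_terms=4)
approx = f_approx(b, k, delta=40.0, x=0.07)
exact = 0.07 * math.exp(-kappa * 40.0)
print(b, approx, exact)
```

With these values the coefficients b3 and b4 vanish, so the truncated series reproduces x exp (−κ∆) at every horizon. For models with non-polynomial coefficients, a symbolic algebra package would replace the list-based helpers.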
If f (∆, x) is not analytic at ∆ = 0, then neither power series converges for any values except ∆ = 0 and τ = 0, respectively, at which f (0, x) = ψ (x) trivially. However, as we will see in the next few sections, for large values of ∆, there are two significant advantages to using the second method for many applications. First, the neighborhood of ∆ = 0 in which the series converges can often be larger when the second method is used; this phenomenon is examined in Section 3.2. Second, convergence within that neighborhood can often be faster with the second method; this phenomenon is studied in Section 3.3.

3.2. Small Circle Convergence

The radius of convergence of a power series is the distance from the point of expansion to the nearest singularity in the function being represented. The power series representation of f (∆, x) around ∆ = 0 therefore converges in the region |∆| < s for some s ≥ 0, with s = +∞ possible. (For finite s, the power series may or may not converge on |∆| = s.) Similarly, the power series representation of w (τ, x) around τ = 0 converges in the region |τ | < r for some r > 0. For financial and economic applications, we typically care only about positive values of ∆, so the relevant values of ∆ for which the first power series converges are ∆ ∈ [0, s). We now show that, for appropriate choice of k, use of the basic time transformation results in a power series representation of w (τ, x) in τ that converges on at least ∆ ∈ [0, s), and often for larger values of ∆. That is, appropriate use of the time transformation never decreases the interval of ∆ for which a power series converges, and often increases it.

As discussed earlier, the circle |τ | < r within which the power series representation of w (τ, x) converges implicitly specifies a region of ∆. Since the basic time transformation is non-affine, a circle in the complex plane of τ corresponds to a non-circular shape in the complex plane of ∆.
We now consider the requirements for convergence of the power series within a circle in |τ | < r with radius 0 < r ≤ 1, and the corresponding region of ∆. The implications of convergence in a circle with radius r > 1 are quite different, and are discussed in Section 3.3. Within |τ | < r for r ≤ 1, the inverse time transformation is analytic in τ . Consequently, the function w (τ, x) = f (∆, x) is analytic in τ at all points ∆ = ∆k (τ ), |τ | < r, where f (∆, x) is analytic in ∆ (with x representing a vector of state variables). The following theorem characterizes, for 0 < r < 1, a region of ∆ whose analyticity guarantees the convergence of the power series in τ ; the case of r = 1 is treated separately. By convention, the arccosine function takes values in [0, π].

Theorem 1. Let k ≠ 0, and let 0 < r < 1. Holding x fixed, if f (∆, x) is defined and analytic in ∆ in the region where:

$$-\arccos\sqrt{1 - r^2} < \operatorname{Im}(k\Delta) < +\arccos\sqrt{1 - r^2} \tag{3.10}$$

$$-\ln\left(\cos[\operatorname{Im}(k\Delta)] + \sqrt{\cos^2[\operatorname{Im}(k\Delta)] - (1 - r^2)}\right) < \operatorname{Re}(k\Delta) < -\ln\left(\cos[\operatorname{Im}(k\Delta)] - \sqrt{\cos^2[\operatorname{Im}(k\Delta)] - (1 - r^2)}\right) \tag{3.11}$$

then the function w (τ, x) = f (∆, x), with ∆ ≡ ∆k (τ ) defined by (3.2), is analytic within the circle |τ | < r. Conversely, if w (τ, x) is defined and analytic in the circle |τ | < r, then f (∆, x) = w (τ, x), with τ ≡ τk (∆) defined by (3.1), is analytic in ∆ in the region indicated by (3.10) and (3.11).

Proof: See appendix.

Figure 1 shows the shape of the region described by (3.10) and (3.11) for k = 0.1; these shapes are bounded and elongated regions, extending farther in the real positive direction than in other directions. If w (τ, x) is analytic within |τ | < r for one of the values of r shown (with k = 0.1), then its power series converges within that circle, and f (∆, x) can be recovered from this power series, using (3.3), within the corresponding region.
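The region in Theorem 1 is the image of the disc |τ | < r under the principal-branch inverse (3.2), which can be checked numerically. The sketch below, with hypothetical function names and arbitrary values of k and r, tests the inequalities (3.10) and (3.11) directly and compares the result with |τk (∆)| < r on random points. The sample is restricted to |Im (k∆)| < π, because points outside that strip can also map into the disc without belonging to the principal-branch region.

```python
import cmath
import math
import random

def in_theorem1_region(delta, k, r):
    """True if delta satisfies (3.10) and (3.11) for the given k and 0 < r < 1."""
    z = complex(k * delta)
    a, b = z.real, z.imag
    gap = 1.0 - r * r
    if abs(b) >= math.acos(math.sqrt(gap)):             # (3.10)
        return False
    root = math.sqrt(max(math.cos(b) ** 2 - gap, 0.0))
    lower = -math.log(math.cos(b) + root)               # (3.11), left bound
    upper = -math.log(math.cos(b) - root)               # (3.11), right bound
    return lower < a < upper

def in_tau_circle(delta, k, r):
    """True if |tau_k(delta)| < r, with tau_k from (3.1)."""
    return abs(1 - cmath.exp(-k * delta)) < r

random.seed(0)
k, r = 0.1, 0.9          # illustrative values only
mismatches = 0
for _ in range(5000):
    # Sample inside the principal strip |Im(k * delta)| < pi.
    delta = complex(random.uniform(-40, 60), random.uniform(-31, 31))
    if in_theorem1_region(delta, k, r) != in_tau_circle(delta, k, r):
        mismatches += 1
print(mismatches)
```

A membership test of this kind is all that is needed to decide whether a known singularity of f (∆, x) blocks convergence within a given circle in τ .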
Not surprisingly, circles in τ with larger radius correspond to larger regions of ∆; as the radius r approaches 1, all of the positive real axis is eventually included in the corresponding region of ∆. However, this region is bounded in other directions; as r approaches 1 (holding k fixed at 0.1), the region of ∆ described by (3.10) and (3.11) never has real part smaller than −10 ln 2, and never has imaginary part larger than +5π or smaller than −5π.

The figure also shows the locations of the singularities of the bond price function for the model of Cox, Ingersoll, and Ross (1985), with parameter values κ = 0.5 and σ = 0.15. (See Table 1 for a description of the risk-neutral interest rate process in this model. We do not specify the value of the θ parameter, because it does not affect the location of the singularities.) The closest singularities to the origin have modulus of approximately 8.237, so the power series (in ∆) representation of bond prices converges only for maturities of less than this amount. (Note that the interval of convergence depends on the values of the κ and σ parameters, as per Table 1.) However, by applying the time transformation with k = 0.1, and constructing a power series representation in τ instead of ∆, the interval of convergence can be much larger. As shown, these singularities all lie outside the r = 0.99 circle, which is the largest circle shown; in fact, the singularities lie outside of any circle with r < 1, as can be confirmed by checking (3.10) and (3.11). Since the r = 0.99 circle includes maturities of more than 50 years, a power series in τ representation of bond prices (with k = 0.1) converges for maturities at least this large. Use of the time transformation in this example therefore increases the interval of the power series convergence by a factor of more than six. (As we see later, the interval of convergence can be improved further still with larger values of k.)
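The singularity locations quoted above can be reproduced under one assumption that is not stated in this section: that bond prices take the standard closed form for the Cox, Ingersoll, and Ross (1985) model, whose denominator (γ + κ)(exp (γ∆) − 1) + 2γ, with γ = √(κ² + 2σ²), vanishes at the singular maturities. The function names are hypothetical, and the membership test repeats the inequalities (3.10) and (3.11).

```python
import math

def cir_singularities(kappa, sigma, m_max=4):
    """Zeros of (gamma + kappa) * (exp(gamma * delta) - 1) + 2 * gamma (assumed form)."""
    gamma = math.sqrt(kappa ** 2 + 2.0 * sigma ** 2)
    real = math.log((gamma - kappa) / (gamma + kappa)) / gamma
    points = []
    for m in range(m_max):
        imag = (2 * m + 1) * math.pi / gamma
        points += [complex(real, imag), complex(real, -imag)]
    return points

def in_theorem1_region(delta, k, r):
    """True if delta satisfies (3.10) and (3.11)."""
    z = complex(k * delta)
    a, b = z.real, z.imag
    gap = 1.0 - r * r
    if abs(b) >= math.acos(math.sqrt(gap)):
        return False
    root = math.sqrt(max(math.cos(b) ** 2 - gap, 0.0))
    return -math.log(math.cos(b) + root) < a < -math.log(math.cos(b) - root)

sing = cir_singularities(0.5, 0.15)
closest = min(abs(s) for s in sing)          # radius of convergence in delta
blocked = [(r, s) for r in (0.5, 0.9, 0.999999)
           for s in sing if in_theorem1_region(s, 0.1, r)]
print(closest, blocked)
```

Under this assumption the closest singularities have modulus of roughly 8.24, in line with the value quoted in the text, and none of them falls inside the region of Theorem 1 for k = 0.1 at the radii tried.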
Figure 2 illustrates the effects of the time transformation in a different way. One of the shapes shown is a circle in ∆, with radius equal to 50; the other shape, contained within the circle, is a circle in τ projected onto the complex plane of ∆, as described by (3.10) and (3.11) for k = 0.1, with r approximately equal to 0.9866 (i. e., similar to the shapes shown in Figure 1, but with a different radius). With this combination of r and k, the elongated shape reaches a value of +50 on the positive real axis, touching the circle in ∆ at this point. A power series representation of f (∆, x), the solution to (2.3) and (2.4), can be calculated using (2.6) and (2.7). However, if f (∆, x) has a singularity anywhere inside the circle, the radius of convergence of the power series extends only as far as this singularity, and the series diverges for values of ∆ approaching +50. Similarly, a power series representation of w (τ, x), the solution to (3.5) and (3.6), can be calculated using (3.8) and (3.9); f (∆, x) can then be recovered using (3.3). If there is a singularity in f (∆, x) within the elongated shape shown in Figure 2, then the power series representation of w (τ, x) does not converge for all |τ | < 0.9866, and it is not possible to approximate values of f (∆, x) for ∆ approaching +50 this way either. However, if there are singularities inside the circle in Figure 2, but outside the elongated shape, then the power series representation of f (∆, x) does not converge for all |∆| < 50, but the power series representation of w (τ, x) does converge for all |τ | < 0.9866, which effectively includes ∆ ∈ [0, +50). In this case, it is not possible to approximate the solution f (∆, x) for a larger range of ∆ by finding a power series directly, but it is possible by approximating w (τ, x) with a truncated power series instead, and then applying (3.3). 
In fact, for positive real k, a power series in τ always converges for positive real values of ∆ at least as large as those for which a power series directly in ∆ converges. Therefore, if positive real values of ∆ are of interest (as is usually the case in applications), the use of the time transformation (with positive real values of k) never decreases the range of convergence, and may well increase it. The figure also shows the points of singularity of the bond price function in the model of Cox, Ingersoll, and Ross (1985), with κ = 0.5 and σ = 0.15. As shown, there are eight singularities within the circle in ∆, preventing convergence of a power series for maturities up to ∆ = 50. However, none of these singularities are within the circle in τ , so that use of the time transformation (with k = 0.1) extends convergence at least to ∆ = 50.

Both Figures 1 and 2 show circles in τ mapped to ∆ when the parameter of the time transformation is k = 0.1. For the bond pricing example with the parameters chosen, the value k = 0.1 is sufficient to construct convergent power series for very long maturities. However, the location of the singularities in the bond pricing function depends on the parameters, and could be closer to the origin; in this case, the time transformation with k = 0.1 would result in a convergent power series for a shorter range of ∆. For example, continuing with the model of Cox, Ingersoll, and Ross (1985), but with parameter values κ = 1 (instead of κ = 0.5) and σ = 0.15, then there would be singularities inside all of the circles shown in Figure 1, except the r = 0.4 circle. The interval of convergence using k = 0.1 would then be considerably smaller than when κ = 0.5, including maturities up to some value that is considerably less than 20 years. However, when the parameter of transformation k is larger, the elongation of the circles in τ (when mapped to ∆) is more extreme.
Figure 3 shows circles in τ mapped to ∆ when the parameter of the time transformation is k = 0.2, with the axes drawn to the same scale as Figure 1. Note, for example, that circles in τ that extend to ∆ = 50 in the positive real direction include substantially smaller regions of ∆ in other directions than when k = 0.1. For example, with k = 0.2, the circle with r = 0.99999 (i.e., the largest circle shown in Figure 3) extends to approximately ∆ = +61.030 in the positive direction, but to ∆ = −3.466 in the negative direction. With k = 0.1, the circle with r = 0.99 (i.e., the largest circle shown in Figure 1) extends not quite as far in the positive direction, to approximately ∆ = +52.958, but extends almost twice as far as the k = 0.2 circle in the negative direction, to ∆ = −6.906. The r = 0.99999 circle in Figure 3 also extends only to about ±7.84 along the imaginary axis, whereas the r = 0.99 circle in Figure 1 extends to about ±14.69 along the imaginary axis. Larger positive values of k mean smaller regions of ∆ are needed to extend the same distance in the positive real direction. Figure 3 also shows the points of singularity for bond prices in the Cox, Ingersoll, and Ross (1985) model with κ = 1; as shown, they are all outside all of the circles shown, guaranteeing convergence for maturities up to more than 50 years. By contrast, with k = 0.1, convergence cannot be guaranteed even to 20 years, as the singularities appear inside all of the circles (except the r = 0.4 circle) in Figure 1.

Figure 4 shows the effect of varying values of k directly, with circles in τ mapped back to ∆ for various combinations of k and r. The combinations are chosen so that the largest positive value of ∆ falling within the circle is always the same (approximately +30.40). As shown, the shapes for large values of k are contained entirely within the shapes for smaller values of k, but still extend the same distance in the positive real direction.
The significance of this phenomenon is that higher values of k can cause the power series for w (τ, x) to converge for larger (real positive) values of ∆. For example, if there are no singularities in f (∆, x) within the k = 0.25 shape, but there is a singularity within the k = 0.05 shape, then the power series for w (τ, x) converges for at least |τ | < 0.999 when k = 0.25, which includes the values ∆ ∈ [0, 30.40). However, for k = 0.05, the radius of convergence is less than 0.6105, and does not include the same interval in ∆. Larger values of k never decrease the range of ∆ for which a power series converges, and may well increase it. The qualitative implications of Theorem 1 may therefore be summarized as follows:

Remark 2. For applications in which real positive values of ∆ are of interest, if the solution f (∆, x) to a conditional moment or derivative pricing problem (2.3) and (2.4) has a power series in ∆ that converges for some ∆ ∈ [0, s), then:

1. The range of convergence using the time transformation with k > 0 is never smaller than the range without the time transformation, and may well be larger.

2. Increasing the value of k > 0 in the time transformation never decreases the range of convergence, and may well increase it.

3. If the solution has no singularity for any real ∆ ≥ 0, then the range of convergence can be made arbitrarily large by using the time transformation with a sufficiently large value of k > 0.

Theorem 1 describes the region in ∆ that corresponds to |τ | < r for 0 < r < 1. If w (τ, x) (for a given value of k) can be shown to be analytic for |τ | < 1, then it is possible to approximate f (∆, x) for all ∆ ∈ [0, +∞) even when a power series directly in ∆ diverges for large values, by first approximating w (τ, x) with a truncated power series in τ , and then applying (3.3). The power series then converges for values of τ that correspond to all ∆ ∈ [0, +∞). The following corollary addresses this case:

Corollary 1. Let k ≠ 0 be a constant.
Holding x fixed, if f (∆, x) is defined and analytic in ∆ in the region where:

$$-\frac{\pi}{2} < \operatorname{Im}(k\Delta) < +\frac{\pi}{2} \tag{3.12}$$

$$\operatorname{Re}(k\Delta) > -\ln\left(2\cos[\operatorname{Im}(k\Delta)]\right) \tag{3.13}$$

then the function w (τ, x) = f (∆, x), where ∆ = ∆k (τ ) is defined by (3.2), is analytic in τ in the region |τ | < 1. Conversely, if w (τ, x) is defined and analytic in τ in the region |τ | < 1, then f (∆, x) = w (τ, x), where τ = τk (∆) is defined by (3.1), is analytic in ∆ in the region indicated by (3.12) and (3.13).

Proof: See appendix.

The shape described by (3.12) and (3.13) is simply the union of the shapes described by (3.10) and (3.11) for all r < 1. As shown in Figure 5, the unit circle |τ | = 1 maps to an open shape in ∆; if the parameter of the time transformation k is positive, then the opening is toward the right. For larger values of k, the circle (mapped to ∆) follows the positive real axis in ∆ more closely; provided f (∆, x) has no singularities in the neighborhood of the positive real axis, convergence for arbitrarily large positive real values of ∆ can be established by choosing a sufficiently large value of k. For example, suppose f (∆, x) has a singularity at ∆ = −5, and nowhere else. A power series directly in ∆ diverges for |∆| > 5. However, the power series representation of w (τ, x) converges for all |τ | < 1, provided k > (ln 2) /5. This value can be determined approximately from the graph, or exactly from (3.13). The interval of convergence is thus extended from ∆ ∈ [0, 5) to ∆ ∈ [0, +∞) through use of the time transformation with sufficiently large k. Similarly, for the example of the term structure model of Cox, Ingersoll, and Ross (1985), a value of k = 0.1 suffices to establish convergence on ∆ ∈ [0, +∞) when κ = 0.5 and σ = 0.15. For κ = 1 and σ = 0.15, a value of k = 0.15 suffices for convergence on ∆ ∈ [0, +∞). The qualitative implications of the corollary may thus be summarized as:

Remark 3.
For applications in which real positive values of ∆ are of interest, if the solution f (∆, x) to a conditional moment or contingent claim pricing problem (2.3) and (2.4) has no singularities near a real positive value of ∆, then:

1. The range of convergence of the power series representation of the solution with the time transformation with k > 0 is never smaller than the range without the time transformation, and may well be larger.

2. Increasing the value of k > 0 in the time transformation never decreases the range of convergence of the power series representation of the solution, and may well increase it.

3. For sufficiently large k > 0, the range of convergence of the power series solution includes ∆ ∈ [0, +∞).

Although use of the basic time transformation establishes convergence of a power series for arbitrarily large values of ∆ when there are no singularities near the positive real axis, such convergence is, in general, not uniform on ∆ ∈ [0, +∞). For example, in the model of Vasicek (1977), bond prices are everywhere analytic, so that their power series representation converges everywhere, but this convergence is not uniform for all ∆ (see footnote 10). For the model of Cox, Ingersoll, and Ross (1985), bond prices are not everywhere analytic, but the range of convergence of a power series can be made to include ∆ ∈ [0, +∞) by use of the time transformation with sufficiently large k > 0. However, in this case also, the convergence is, in general, not uniform. In such cases, for very large ∆, the power series still converges, but the rate of convergence may be very slow. If it can be established that a power series converges uniformly on ∆ ∈ [0, +∞), then convergence is guaranteed to occur at a minimum rate no matter how large ∆ is.
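The conditions (3.12) and (3.13) of Corollary 1 give a direct test of whether a singularity blocks convergence on the whole unit circle in τ . The sketch below uses hypothetical function names and a coarse grid of k values. It covers the single singularity at ∆ = −5 discussed after the corollary, and the Cox, Ingersoll, and Ross (1985) case under the assumption (not stated in this section) that the singularities are the zeros of the standard bond price denominator (γ + κ)(exp (γ∆) − 1) + 2γ, with γ = √(κ² + 2σ²).

```python
import math

def in_corollary1_region(delta, k):
    """True if delta satisfies (3.12) and (3.13), i.e. maps inside |tau| < 1."""
    z = complex(k * delta)
    a, b = z.real, z.imag
    if abs(b) >= math.pi / 2.0:                    # (3.12)
        return False
    return a > -math.log(2.0 * math.cos(b))        # (3.13)

def first_safe_k(singularities, k_grid):
    """First grid value of k for which no singularity lies in the region."""
    for k in k_grid:
        if not any(in_corollary1_region(s, k) for s in singularities):
            return k
    return None

def cir_singularities(kappa, sigma, m_max=6):
    """Assumed singular maturities of CIR bond prices (see lead-in)."""
    gamma = math.sqrt(kappa ** 2 + 2.0 * sigma ** 2)
    real = math.log((gamma - kappa) / (gamma + kappa)) / gamma
    return [complex(real, sign * (2 * m + 1) * math.pi / gamma)
            for m in range(m_max) for sign in (1, -1)]

k_grid = [0.01 * j for j in range(1, 101)]
k_single = first_safe_k([-5.0], k_grid)           # compare with ln(2) / 5

cir_fast = cir_singularities(1.0, 0.15)
blocked_at_010 = any(in_corollary1_region(s, 0.10) for s in cir_fast)
blocked_at_015 = any(in_corollary1_region(s, 0.15) for s in cir_fast)
blocked_slow = any(in_corollary1_region(s, 0.10) for s in cir_singularities(0.5, 0.15))
print(k_single, blocked_at_010, blocked_at_015, blocked_slow)
```

In practice the same test could be run over a grid of k to pick the smallest value that clears all known singularities, which keeps the mapped region as generous as possible in the other directions.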
Uniform convergence on ∆ ∈ [0, +∞) can sometimes be established for series solutions to asset pricing or conditional moment problems, but the conditions needed are quite different than the conditions that establish uniform convergence on finite time intervals. We examine this situation in detail in the next section. 3.3. Large Circle Convergence The previous section considers the problem of extending the range of convergence of a series solution to a conditional moment or asset pricing problem; specifically, it examines convergence properties of power series in τ for circles |τ | < r with 0 < r ≤ 1, and establishes sufficient conditions for convergence on intervals such as ∆ ∈ [0, T ) with T > 0, or, in the case of r = 1, on ∆ ∈ [0, +∞). As discussed, if the solution to an asset pricing or conditional moment problem is well-behaved on ∆ ∈ [0, +∞), then the solution to a transformed problem, constructed using the time transformation, can be shown to be analytic within the circle |τ | < 1. The power series of the solution to the transformed problem then converges within this circle in τ , which effectively includes all positive values of ∆, even if a power series constructed directly in ∆ converges only for a very short range of values. However, even when convergence can be extended to all positive values, such convergence is, in general, only uniform on bounded sets of ∆. Thus, even if a power series converges for all ∆ ∈ [0, +∞), the convergence may be increasingly slow for larger and larger values of ∆. Calculation of a large number of terms in the series may be needed before a good approximation to the solution f (∆, x) is obtained when ∆ is large, and this may severely limit the practicality of the series method of solution. 
By contrast, if a series converges uniformly on ∆ ∈ [0, +∞), then the rate of convergence does not depend on the value of ∆, and f (∆, x) can be approximated for large and small values of ∆ equally well with the same number of terms. We now derive several results that can be helpful in deriving a series of approximations that converge uniformly, for all time horizons, to an asset price or conditional moment.

Footnote 10: The only functions for which power series converge uniformly for all ∆ are polynomials, and bond prices in the Vasicek (1977) model are not polynomials in maturity.

3.3.1. Large Circle Convergence with Change of Time

A power series representation of f (∆, x) in ∆ never converges uniformly for all ∆, except in the special cases in which f (∆, x) is a polynomial in ∆. However, for k > 0, the basic time transformation maps the positive real line, ∆ ∈ [0, +∞), to a finite interval, τ ∈ [0, 1). A power series converges uniformly on compact sets that fall within the circle of convergence; thus, if the radius of convergence of a power series for w (τ, x) (the solution to the transformed problem, (3.5) and (3.6)) in τ can be shown to be greater than 1, then this series converges uniformly for |τ | ≤ 1, and the circle in τ includes all positive ∆. Uniform convergence on ∆ ∈ [0, +∞), possible with a power series in ∆ only in very limited special cases, may therefore be possible with a power series in τ , since, for positive values of the parameter k of the time transformation, the unbounded interval ∆ ∈ [0, +∞) maps to the bounded interval τ ∈ [0, 1).

Although uniform convergence on ∆ ∈ [0, +∞) is a powerful and useful result, it is much harder to establish than simple convergence on the same interval. The implications of Theorem 1 and Corollary 1 are bidirectional.
If the solution f (∆, x) to the original problem, (2.3) and (2.4), can be shown to be analytic in ∆ within a given region, then it follows that the solution w (τ, x) to the transformed problem, (3.5) and (3.6), is analytic in a corresponding circle in τ . But the reverse implication also holds; if w (τ, x) is analytic in τ within a circle, then f (∆, x) is analytic within the region indicated in the theorem or corollary statement. Establishing analyticity of the solution to either problem within the given region is sufficient to establish convergence of the power series representation of w (τ, x), and f (∆, x) can then be recovered using (3.3). However, for circles in τ with radius greater than one, the implication breaks down in one direction. Specifically, although analyticity of w (τ, x) in a circle in τ still implies analyticity of f (∆, x) within a corresponding region of ∆, the reverse implication does not hold. The problem is that circles in τ with radius greater than one include the point τ = 1, which does not correspond to any value of ∆. For k > 0, as ∆ approaches +∞, τk (∆) approaches one, but never reaches it for finite values of ∆. Consequently, even if f (∆, x) is everywhere analytic in ∆, the function w (τ, x) may fail to be analytic at τ = 1, and its power series then diverges for all |τ | > 1. The following theorem is essentially analogous to Theorem 1 and Corollary 1, but for circles with r > 1. However, with the earlier results, to establish convergence of a power series solution within a circle in τ , it suffices either to show that w (τ, x) (the solution to the transformed problem) is analytic in τ within a circle, or that f (∆, x) (the solution to the original problem) is analytic in the corresponding region of ∆. When r > 1, however, only the first method may be used; convergence can only be shown by establishing analyticity of the solution to the transformed problem. Theorem 2. 
Suppose for some x, some k ≠ 0, and some region of ∆, that f (∆, x) can be written as:

$$f(\Delta, x) = w(\tau_k(\Delta), x) \tag{3.14}$$

where w (τ, x) is analytic in the circle |τ | < r for some r > 1. Denote by wn (τ, x) the power series (in τ ) of w (τ, x) including terms up to order n:

$$w_n(\tau, x) \equiv b_0(x) + \sum_{i=1}^{n} b_i(x)\,\tau^i$$

where the bi (x) are the power series coefficients, and define:

$$f_n(\Delta, x) \equiv w_n(\tau_k(\Delta), x) \tag{3.15}$$

Then holding x fixed, fn (∆, x) converges to f (∆, x) for all complex ∆ such that:

$$\operatorname{Re}(k\Delta) > -\ln\left(\cos[\operatorname{Im}(k\Delta)] + \sqrt{\cos^2[\operatorname{Im}(k\Delta)] + r^2 - 1}\right) \tag{3.16}$$

Furthermore, for any 1 ≤ s < r, fn (∆, x) converges uniformly to f (∆, x) for all complex ∆ such that:

$$\operatorname{Re}(k\Delta) \geq -\ln\left(\cos[\operatorname{Im}(k\Delta)] + \sqrt{\cos^2[\operatorname{Im}(k\Delta)] + s^2 - 1}\right)$$

Proof: See appendix.

Theorem 2 provides a means of constructing a series of approximations to f (∆, x) that converges uniformly on ∆ ∈ [0, +∞). The solution to the asset pricing or conditional moment problem, f (∆, x), is first expressed in terms of w (τ, x), using the time transformation. If w (τ, x) can be shown to be analytic in a circle that includes τ = 1 (in Section 5, we use the results of Kimmel (2008), but other methods may be feasible as well), then it is possible to construct approximations to f (∆, x) that converge uniformly for all positive ∆. Each wn (τ, x) is the power series of w (τ, x), truncated after n + 1 terms. From these approximations to w (τ, x), approximations to f (∆, x) can be constructed through (3.15). The theorem establishes that, given analyticity of w (τ, x) in a sufficiently large circle of τ , the approximations to f (∆, x) converge uniformly in the indicated region. For real ∆ and k > 0, this region simplifies to ∆ ∈ [− ln (1 + s) /k, +∞). Since s > 0, this interval includes ∆ ∈ [0, +∞), so for most applications, it suffices to show that w (τ, x) is analytic in |τ | < r for any r > 1; increasing the value of r does not result in improved convergence properties on ∆ ∈ [0, +∞).
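The uniform-convergence region of Theorem 2 is the set of ∆ with |τk (∆)| ≤ s, which can be confirmed numerically. The sketch below uses a hypothetical helper `theorem2_bound` and arbitrary values k = 0.15 and s = 1.01. It evaluates the right-hand side of the bound, recovers the real-axis endpoint (where Im (k∆) = 0 and the bound on k∆ reduces to − ln (1 + s)), and compares region membership with the disc condition on random complex points.

```python
import cmath
import math
import random

def theorem2_bound(im_kdelta, s):
    """Right-hand side of the uniform-convergence condition in Theorem 2, for s > 1."""
    c = math.cos(im_kdelta)
    return -math.log(c + math.sqrt(c * c + s * s - 1.0))

k, s = 0.15, 1.01                        # illustrative values only
delta_min = theorem2_bound(0.0, s) / k   # smallest real delta in the region
tau_at_min = 1.0 - math.exp(-k * delta_min)

random.seed(1)
mismatches = 0
for _ in range(5000):
    delta = complex(random.uniform(-20, 60), random.uniform(-100, 100))
    z = k * delta
    in_region = z.real >= theorem2_bound(z.imag, s)
    in_disc = abs(1 - cmath.exp(-z)) <= s
    mismatches += in_region != in_disc
print(delta_min, tau_at_min, mismatches)
```

The endpoint `delta_min` is negative and maps to τ = −s, so every ∆ ≥ 0 lies strictly inside the region.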
However, as discussed below, it is impossible to apply Theorem 2 at all for some problems, and even when applicable, there are strong restrictions on the choice of the parameter k in the time transformation.

The region described by (3.16) is quite different from that described in either Theorem 1 or Corollary 1; Figure 6 shows this region for circles of various radii with k = 0.15. All three circles (in τ ) shown have radius greater than one, but for the goal of establishing uniform convergence on ∆ ∈ [0, +∞), it suffices to show analyticity of w (τ, x) in τ within any circle with r > 1; clearly, the smaller this radius is (while still being greater than one), the easier this task tends to be. Every point to the right of the curves shown in Figure 6 falls within the circle |τ | < r for the corresponding value of r, and every point to the left falls outside the same circle. However, as previously noted, the point τ = 1, which falls within any circle with radius greater than one, does not correspond to any value of ∆, regardless of the value of the parameter k. Consequently, analyticity of f (∆, x) everywhere to the right of the curves shown in Figure 6, while necessary for analyticity of w (τ, x) within the corresponding circle in τ , is not sufficient; w (τ, x) may still fail to be analytic at τ = 1.

The figure also shows the points of singularity of bond prices in the model of Cox, Ingersoll, and Ross (1985); as shown, all such singularities are outside the r = 1.01 circle. Since the r = 1 circle is included inside the r = 1.01 circle, this establishes convergence of a power series (in τ ) on ∆ ∈ [0, +∞). However, inspection of the bond price function (which is known explicitly) shows that there is no choice of k at all that makes this function analytic at τ = 1, corresponding to the limiting value of τ as ∆ approaches +∞.
It might seem that a series approximation method that results in uniform convergence for all maturities is not feasible for this model, but, as is shown in Section 4, a modified version of Theorem 2 (see Theorem 3 below) may be used to establish the desired result.

Another way to see that analyticity of f(∆, x) in ∆, even everywhere, does not necessarily mean analyticity within |τ| < r for r > 1, is to note that Theorem 2 produces f(∆, x) functions that are periodic in ∆. To find a w(τ, x) that generates a given f(∆, x) using (3.14), the choice of k in the time transformation is severely restricted by the periodicity of f(∆, x); furthermore, if f(∆, x) is not periodic, then there is no value of k at all for which an appropriate w(τ, x) can be found. In the case of r ≤ 1, convergence of a power series (in τ) follows from the analyticity of the function being represented within the region of ∆ corresponding to |τ| < r. In the r > 1 case, however, uniform convergence does not follow from analyticity of the function at all points to the right of the circles (in τ) shown in Figure 6, because the point τ = 1 does not correspond to any value of ∆.

3.3.2. Examples

Consider the example of an Ornstein-Uhlenbeck process. The expected value of such a process satisfies (2.1) with coefficients µ(x) = −κx, σ(x) = 1, and r(x) = 0, and with final condition ψ(x) = x. If κ > 0, the process is stationary, and its expected value converges to a limit as the time horizon approaches positive infinity. The solution is known explicitly, and is f(∆, x) = x exp(−κ∆), which is everywhere analytic in ∆. It can be expressed in terms of τ instead of ∆:

$$w(\tau, x) = f\left(\Delta_k(\tau), x\right) = x\,(1 - \tau)^{\kappa/k}$$

This solution is analytic in τ within a circle |τ| < r for some r > 1 (in fact, for any r > 1) provided κ is a positive integer multiple of k.
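The role of the ratio κ/k in the Ornstein-Uhlenbeck example can be illustrated with a short Python sketch. It assumes the time transformation τ_k(∆) = 1 − exp(−k∆) (not restated in this section) and uses the standard binomial series for (1 − τ)^(κ/k); the function names are hypothetical. When κ/k is a positive integer the series terminates, so the truncated approximation reproduces x exp(−κ∆) exactly once n ≥ κ/k.

```python
import math


def ou_series_coeffs(alpha, n):
    """Coefficients b_0..b_n of (1 - tau)**alpha as a power series in tau."""
    coeffs = [1.0]
    for i in range(1, n + 1):
        # Binomial recursion: b_i = b_{i-1} * (i - 1 - alpha) / i
        coeffs.append(coeffs[-1] * (i - 1 - alpha) / i)
    return coeffs


def ou_approx(delta, x, kappa, k, n):
    """Order-n approximation to E[x_delta | x_0 = x] built from the series in tau."""
    tau = 1 - math.exp(-k * delta)  # assumed form of tau_k(delta)
    coeffs = ou_series_coeffs(kappa / k, n)
    return x * sum(c * tau ** i for i, c in enumerate(coeffs))


if __name__ == "__main__":
    kappa, x = 1.0, 1.5
    for delta in (0.5, 3.0, 30.0):
        exact = x * math.exp(-kappa * delta)
        # k = kappa / 2: the series terminates after the tau**2 term.
        print(delta, exact, ou_approx(delta, x, kappa, 0.5, 2))
```

When κ/k is not a positive integer, the recursion never reaches zero, so the series does not terminate and any finite truncation carries an approximation error.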
A power series in τ then converges uniformly for all |τ| ≤ 1, which includes the interval ∆ ∈ [0, +∞) (provided κ, and therefore k, are positive). Note, however, that the function w(τ, x) is not analytic in τ at τ = 1 for any other values of k. For such values, a power series in τ converges within the circle |τ| < 1, corresponding to ∆ ∈ [0, +∞), but this convergence is not uniform. A series representation of the solution that converges for all ∆ ∈ [0, +∞) is therefore possible, as long as κ > 0. However, note that the situation is very different from the case of small circle convergence. In that case, increasing the value of k in the time transformation never decreases the range of convergence, although it may fail to increase it. Here, increasing k can actually cause the convergence properties of the series to get worse; uniform convergence only happens for specific values of k, specifically, those which are κ divided by a positive integer. The largest such value is κ itself, and increasing k to larger positive values causes the convergence of a series on ∆ ∈ [0, +∞) to cease to be uniform. In the large circle case, a value of k which is too large can be just as bad as one which is too small.

Now, consider the same general PDE, but with final condition ψ(x) = exp(−cx²). The solution (which is the conditional expectation of the final condition) is given by:

$$f(\Delta, x) = \frac{\exp\left(-\dfrac{c x^2 \exp(-2\kappa\Delta)}{1 + \frac{c}{\kappa}\left(1 - \exp(-2\kappa\Delta)\right)}\right)}{\sqrt{1 + \frac{c}{\kappa}\left(1 - \exp(-2\kappa\Delta)\right)}}$$

If we choose k = 2κ/n for any integer n > 0, then this solution can be expressed as:

$$w(\tau, x) = f\left(\Delta_{2\kappa/n}(\tau), x\right) = \frac{\exp\left(-\dfrac{c x^2 (1-\tau)^n}{1 + \frac{c}{\kappa}\left[1 - (1-\tau)^n\right]}\right)}{\sqrt{1 + \frac{c}{\kappa}\left[1 - (1-\tau)^n\right]}}$$

For the specific case n = 1, the above expression simplifies to:

$$w(\tau, x) = f\left(\Delta_{2\kappa}(\tau), x\right) = \frac{\exp\left(-\dfrac{c x^2 (1-\tau)}{1 + \frac{c}{\kappa}\tau}\right)}{\sqrt{1 + \frac{c}{\kappa}\tau}}$$

This function has a singularity at τ = −κ/c. If |c| < |κ|, then the singularity lies outside the unit circle in τ, and the conditions of Theorem 2 are satisfied for some r > 1.
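The closed-form expressions in this example can be checked numerically. The Python sketch below makes two assumptions that are not stated in this section: that the Ornstein-Uhlenbeck transition density with σ(x) = 1 is Gaussian with mean x exp(−κ∆) and variance (1 − exp(−2κ∆))/(2κ) (a standard result), and that the time transformation is τ_k(∆) = 1 − exp(−k∆). The function names are illustrative. It compares the closed form for f(∆, x) with a direct trapezoidal integration of the conditional expectation, and with the n = 1 expression for w(τ, x).

```python
import math


def f_closed(delta, x, kappa, c):
    """Closed-form conditional expectation of exp(-c * X**2), as given in the text."""
    e = math.exp(-2 * kappa * delta)
    denom = 1 + (c / kappa) * (1 - e)
    return math.exp(-c * x * x * e / denom) / math.sqrt(denom)


def w_n1(tau, x, kappa, c):
    """The same solution in terms of tau for k = 2 * kappa (the n = 1 case)."""
    denom = 1 + (c / kappa) * tau
    return math.exp(-c * x * x * (1 - tau) / denom) / math.sqrt(denom)


def f_quadrature(delta, x, kappa, c, steps=4000):
    """Trapezoidal integration against the assumed Gaussian transition density."""
    mean = x * math.exp(-kappa * delta)
    var = (1 - math.exp(-2 * kappa * delta)) / (2 * kappa)
    sd = math.sqrt(var)
    lo, width = mean - 12 * sd, 24 * sd / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * width
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-c * y * y - (y - mean) ** 2 / (2 * var))
    return total * width / math.sqrt(2 * math.pi * var)


if __name__ == "__main__":
    kappa, c, x, delta = 1.0, 0.5, 0.7, 0.3
    tau = 1 - math.exp(-2 * kappa * delta)
    print(f_closed(delta, x, kappa, c), f_quadrature(delta, x, kappa, c), w_n1(tau, x, kappa, c))
    print("singularity of w at tau =", -kappa / c)
```

With κ = 1 and c = 0.5, the singularity sits at τ = −2, outside the unit circle, which is the |c| < |κ| case discussed in the text.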
For c > 0, the series of approximations to f(∆, x) constructed using (3.15) converges uniformly on ∆ ∈ [0, +∞), provided the tails of the final condition do not go to zero too quickly (i.e., provided |c| < |κ|). However, in the case |c| > |κ|, the singularity at τ = −κ/c lies within the unit circle in τ, and the conditions of Theorem 2 are not satisfied for k = 2κ. Choosing k = 2κ/n for some n > 1 does not improve the situation. Consequently, for c > 0, if the tails of the final condition go to zero too quickly, the conditions of Theorem 2 cannot be satisfied for any value of k at all, and the basic time transformation cannot establish uniform convergence on ∆ ∈ [0, +∞) (although the weaker conditions of Corollary 1 do establish non-uniform convergence on this interval). In the small circle case, provided there is no singularity near the positive real line, larger values of k always increase the range of convergence, until that range includes ∆ ∈ [0, +∞). Here, the situation is very different; a choice of k which is too large is just as bad as one which is too small, in that it fails to establish uniform convergence, and even when there are no singularities near the positive real line, it may be impossible to establish uniform convergence with any choice of k at all.

The qualitative aspects of Theorem 2 may be summarized as:

Remark 4. The same time transformation (3.1) that can extend the range of convergence of a power series solution to (2.3) and (2.4), can also sometimes establish uniform convergence of the power series on ∆ ∈ [0, +∞), in which case the solution to the pricing or conditional moment problem can be approximated equally accurately for all positive ∆. However:

1. Analyticity of the solution f(∆, x), even for all ∆, is not sufficient for uniform convergence of the power series on ∆ ∈ [0, +∞), for any value of the parameter k.

2. If the solution w(τ, x) to the transformed problem, (3.5) and (3.6), is analytic in τ within a circle |τ| < r for any combination of k > 0 and r > 1, then the power series of the solution to the transformed problem converges uniformly on |τ| ≤ 1. The solution f(∆, x) to the original problem, (2.3) and (2.4), can then be recovered from w(τ, x) using (3.3), and the power series effectively converges uniformly on ∆ ∈ [0, +∞).

3. The transformed problem can only be analytic in a circle |τ| < r with r > 1 for specific values of k > 0; it may be that no value of k at all suffices. Analyticity for all such |τ| < 1 is always preserved when k is increased to a larger positive value, but in general, analyticity at τ = 1 is then lost. Increasing the value of k > 0 in the time transformation never decreases the range of convergence of a power series, but can decrease the rate of convergence; since the series converges uniformly only for specific values of k, increasing k can cause uniform convergence to become non-uniform.

3.3.3. Large Circle Convergence with Change of Time and State

To apply Theorem 2, analyticity of the solution to the transformed problem, (3.5) and (3.6), must be established. However, this task is difficult; for example, the results of Kimmel (2008) do not apply (at least not directly) to such time-inhomogeneous problems. We therefore often find it more convenient to work with a modified version of Theorem 2, which is based on a change of both time and state variables, as well as the dependent variable; it is sometimes possible to use the change of state variable to eliminate the time-dependency introduced by the change of time variable. We find it convenient, in this case, to focus only on scalar problems. We also assume the PDE is already in the canonical form (2.8) and (2.9); as previously discussed, this results in no loss of generality.
We then express the solution of the original problem as:

$$h(\Delta, y) = \nu(\Delta, y)\, w\left(\tau(\Delta), z(\Delta, y)\right) \tag{3.17}$$

If w(τ, z) is analytic in τ, Theorem 2 can still be applied. However, it establishes uniform convergence in ∆ of approximations to w(τ(∆), z) holding z fixed, rather than what is desired, which is uniform convergence of h(∆, y) holding y fixed; furthermore, it is harder to establish the conditions of the theorem, because of the time-homogeneity of the coefficients of the PDE solved by w(τ, z). We therefore consider a modified version of Theorem 2 that considers convergence of a series of approximations to w(τ, z), with y held fixed instead of z. The particular transformations used are:

$$\nu(\Delta, y) = e^{\lambda\Delta}\,\xi(y) \qquad z(\Delta, y) = \sqrt{k}\left[\theta + e^{-\frac{k\Delta}{2}}(y - \theta)\right]$$

for arbitrary numbers λ and θ, and an arbitrary function ξ(y). By substituting the expression for h(∆, y) from (3.17) into the canonical PDE, a different PDE and final condition, satisfied by w(τ, z), can be derived. In general, this alternate PDE does not admit straightforward analysis. However, as we see in Section 4, in the specific cases where r_h(y) is specified by either (2.10) or (2.11), it is possible to choose k, λ, θ, and ξ(y) so that the PDE satisfied by w(τ, z) is quite simple, and is covered by the results of Kimmel (2008). This may not seem particularly noteworthy, since, for these two specifications of r_h(y), the original problems are also covered by the results of Kimmel (2008). However, there is an important difference between the two situations. For the original problem, these results apply to h(∆, y) with time variable ∆ and state variable y; if h(∆, y) is everywhere analytic, then a power series converges on the interval ∆ ∈ [0, +∞), but convergence is in general not uniform. However, in the transformed problem, the results of Kimmel (2008) apply to w(τ, z), with τ as the time variable and z as the state variable.
Provided k > 0, the interval ∆ ∈ [0, +∞) maps to τ ∈ [0, 1), and uniform convergence of a power series in τ on this interval is often possible. The following theorem states that if w(τ, z) is analytic in a sufficiently large region, then, with some restrictions on k and λ, it is possible to construct a series of approximations to h(∆, y), the solution to the original problem (2.8) and (2.9), that converge uniformly for all ∆. The problem of establishing that w(τ, z) has the sufficient properties to allow application of the theorem is examined in Section 4.

Theorem 3. Suppose the solution h(∆, y) to a conditional moment or asset pricing problem can be written as:

$$h(\Delta, y) = e^{\lambda\Delta}\,\xi(y)\, w\left(\tau_k(\Delta),\ \sqrt{k}\left[\theta + \left(1 - \tau_{k/2}(\Delta)\right)(y - \theta)\right]\right)$$

for some k ≠ 0, arbitrary numbers λ and θ, and an arbitrary function ξ(y), where w(τ, z) is an analytic function of both variables for all z and for all |τ| < r for some r > 1. Denote by w_n(τ, z) the power series approximation (in τ) to w(τ, z) including terms up to order n in τ:

$$w_n(\tau, z) \equiv b_0(z) + \sum_{i=1}^{n} b_i(z)\,\tau^i$$

where the b_i(z) are the power series coefficients. Define:

$$h_n(\Delta, y) \equiv e^{\lambda\Delta}\,\xi(y)\, w_n\left(\tau_k(\Delta),\ \sqrt{k}\left[\theta + \left(1 - \tau_{k/2}(\Delta)\right)(y - \theta)\right]\right)$$

Then for a fixed value of y, h_n(∆, y) converges to h(∆, y) for all complex ∆ such that:

$$\operatorname{Re}(k\Delta) > -\ln\left(\cos\left[\operatorname{Im}(k\Delta)\right] + \sqrt{\cos^2\left[\operatorname{Im}(k\Delta)\right] + r^2 - 1}\right) \tag{3.18}$$

Furthermore, for any 1 ≤ s < r and for any real c, h_n(∆, y) converges uniformly to h(∆, y) for all complex ∆ such that:

$$\operatorname{Re}(k\Delta) \ge -\ln\left(\cos\left[\operatorname{Im}(k\Delta)\right] + \sqrt{\cos^2\left[\operatorname{Im}(k\Delta)\right] + s^2 - 1}\right) \quad\text{and}\quad \operatorname{Re}(\lambda\Delta) \le c \tag{3.19}$$

Proof: See appendix.

The region of ∆ for which h_n(∆, y) converges, for a given w(τ, z), is the same as in Theorem 2. However, this theorem may fail to establish uniform convergence in regions where Theorem 2 does. For example, suppose k is positive. If w(τ, z) is analytic in a circle |τ| < r for some r > 1, then Theorem 2 establishes uniform convergence on ∆ ∈ [0, +∞).
However, Theorem 3 does not establish uniform convergence for these values if λ is positive (or more generally, if the real part of λ is positive); in this case, the uniformly convergent approximations w_n(τ, z) are premultiplied by a quantity that grows without bound as ∆ approaches positive infinity. The theorem still establishes uniform convergence on ∆ ∈ [0, T] for any T > 0, but uniform convergence on the entire positive real axis remains elusive. However, if the real part of λ is negative (or zero), uniform convergence on ∆ ∈ [0, +∞) follows by choosing any c ≥ 0.

Theorem 3 is particularly useful for several of the cases considered in Section 4. In those cases, a change of time variables introduces time-inhomogeneity to the coefficients of the pricing partial differential equation. But a corresponding change of the state vector restores time-homogeneity. The results of Kimmel (2008) can be used to establish analyticity of the solution to the transformed problem, and Theorem 3 can then be applied to establish uniform convergence of a power series in τ on an interval that includes ∆ ∈ [0, +∞), taking into account the change of both the time and the state variables. Note that this result can be used even in some non-stationary models; for example, uniform convergence of series for bond prices for all positive maturities can be established in the model of Ahn, Dittmar, and Gallant (2002), even when the driving diffusion process is non-stationary. In this case, bond prices are already known explicitly, but the results of this section also apply to a large class of non-linear models for which bond prices are not known explicitly. The next section characterizes in detail two large families of conditional moment and contingent claim pricing problems to which the results of this section can easily be applied.

4. Applying the Time Transformation

The previous section develops tools to improve the convergence properties of power series representations of solutions to general diffusion problems. Specifically, it shows how to extend the range of convergence of a series solution to a conditional moment or asset pricing problem, and how to improve the rate of convergence by making it uniform for all positive time horizons, when appropriate conditions are met. However, to apply these results in practice, it is necessary to show that either the original problem (2.8) and (2.9), or a transformed problem that arises from change of the time variable (and possibly also the state and dependent variables), has an analytic solution; furthermore, since the solution is unknown (which is the whole reason for constructing a series representation), this determination must be made with knowledge only of the coefficients of the PDE problem and the final condition. In this section, we consider the two broad families of such problems discussed in Kimmel (2008), and show how the results of the previous section may be applied to these problems. Both of these families of problems involve scalar diffusions. We assume these problems have already been transformed into the canonical form (2.8) and (2.9); as previously discussed, this results in no loss of generality.

4.1. Brownian Motion

The first case we consider is the one Kimmel (2008) refers to as the “Brownian motion” case. This terminology is used because the canonical form of the PDE to be solved is one that arises when a state variable follows a Brownian motion. Note, however, that many non-linear problems also give rise to the same canonical PDE, so this terminology should not be interpreted to mean that the results apply only to problems based on a Brownian motion.
This PDE is (2.8) and (2.9) with r_h(y) specified by (2.10):

$$\frac{\partial h}{\partial \Delta}(\Delta, y) = \frac{1}{2}\frac{\partial^2 h}{\partial y^2}(\Delta, y) - \left[\frac{b^2}{2}(y - a)^2 + d\right] h(\Delta, y) \tag{4.1}$$

$$h(0, y) = g(y) \tag{4.2}$$

Kimmel (2008) expresses the region of analyticity of h(∆, y), given smoothness and growth conditions on g(y). However, as discussed in the previous section, even if h(∆, y) were to be everywhere analytic, this still would not guarantee uniform convergence of a power series on ∆ ∈ [0, +∞). We therefore consider the problem of establishing analyticity of w(τ, z), the function in Theorem 3 from which h(∆, y) is constructed, for an appropriate choice of k > 0, λ, θ, and ξ(y). If analyticity within |τ| < r can be established for some r > 1 (with the additional restriction on λ), then Theorem 3 applies, and uniform convergence for all positive ∆ can be established. The following theorem provides conditions under which such analyticity in w(τ, z) can be established.

Theorem 4. Let b ≠ 0, a, and d be arbitrary numbers, and for √(2b), choose either square root. Let g(y) be analytic for all complex y, and let there exist some c > 0 and some norm (over the reals) ‖y‖ such that g(y) satisfies the bound:

$$\left| e^{\frac{y^2}{4}}\, g\!\left(\frac{y}{\sqrt{2b}}\right) \right| \le c\, e^{\frac{\|y\|^2}{2}}$$

Then there exists a w(τ, z), defined and analytic for all complex z and ‖√τ‖ < 1, that satisfies the partial differential equation with final condition:

$$\frac{\partial w}{\partial \tau}(\tau, z) = \frac{1}{2}\frac{\partial^2 w}{\partial z^2}(\tau, z)$$

$$w(0, z) = e^{\frac{\left(z - a\sqrt{2b}\right)^2}{4}}\, g\!\left(\frac{z}{\sqrt{2b}}\right)$$

Furthermore, h(∆, y), defined by:

$$h(\Delta, y) \equiv e^{-\frac{b}{2}(y - a)^2 - \left(\frac{b}{2} + d\right)\Delta}\, w\left(\tau_{2b}(\Delta),\ \sqrt{2b}\left[a + e^{-b\Delta}(y - a)\right]\right)$$

satisfies (4.1) and (4.2) for all complex y and ∆ such that ‖√(τ_{2b}(∆))‖ < 1.

Proof: See appendix.

Theorem 3 takes as given a w(τ, z) that is analytic in τ, and shows how to use the power series of this function to construct a series of convergent approximations to another function, h(∆, y). Theorem 4 does three things.
First, given smoothness and growth conditions on g(y), it shows the existence of a w(τ, z) that satisfies the conditions of Theorem 3, with:

$$k = 2b \qquad \lambda = -\frac{b}{2} - d \qquad \theta = a \qquad \xi(y) = e^{-\frac{b}{2}(y - a)^2}$$

Second, it establishes that w(τ, z) solves a PDE with final condition, so that, since w(τ, z) is analytic for ‖√τ‖ < 1, its power series can be found by the recursive method of (2.6) and (2.7) (with the appropriate changes in notation). Finally, it establishes that the h(∆, y) constructed in Theorem 3 is the solution to the problem (4.1) and (4.2). If b is positive and the positive square root √(2b) is chosen, and the region of analyticity of w(τ, z) established by Theorem 4 includes |τ| ≤ 1, then Theorem 3 applies and, with an additional restriction on b and d, establishes uniform convergence of a series on ∆ ∈ [0, +∞).

Since the power series of w(τ, z) converges within a circle, Theorem 4 is best applied with a norm of the type ‖y‖ = |y|/k_0. However, the result is no harder to derive with a general norm, and the more general result can sometimes be useful.¹¹ For this reason, we express the result in terms of any norm over the reals, although in all examples considered in Section 5, a circular norm suffices.

We further note that the PDE (4.1) and (4.2) is unchanged if b is replaced by −b. The final condition for the PDE satisfied by the solution w(τ, z) to the transformed problem is then potentially complex, even when the final condition in the original problem is a real function. However, this presents no problems; the changes of variables that construct h(∆, y) from w(τ, z) ensure that the former is a real function (provided the final condition is a real function), for either choice. The theorem therefore establishes two different ways to construct a solution to (4.1) and (4.2). Although the two solutions coincide (where they are both defined), they are based on different functions w(τ, z), which may have different regions of analyticity (in τ).
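The construction in Theorem 4 can be checked numerically in a case where w(τ, z) has an elementary form. The Python sketch below is an illustration under stated assumptions, not part of the formal results: it takes a Gaussian final condition g(y) = exp(−κy²/2) with real b > 0, for which the final condition of w is the exponential of a quadratic in z, and uses the standard Gaussian solution of the heat equation (our own algebra). It also assumes τ_{2b}(∆) = 1 − exp(−2b∆). The residual of (4.1) is then estimated by central finite differences; all function names are hypothetical.

```python
import math


def w_gaussian(tau, z, a, b, kappa):
    """Solution of w_tau = w_zz / 2 with w(0, z) = exp(A z^2 + B z + C)."""
    A = (1 - kappa / b) / 4
    B = -a * math.sqrt(2 * b) / 2
    C = a * a * b / 2
    s = 1 - 2 * A * tau  # must remain positive for the formula to apply
    return math.exp((A * z * z + B * z + B * B * tau / 2) / s + C) / math.sqrt(s)


def h_theorem4(delta, y, a, b, d, kappa):
    """h built from w by the change of time, state, and dependent variables."""
    tau = 1 - math.exp(-2 * b * delta)  # assumed form of tau_{2b}(delta)
    z = math.sqrt(2 * b) * (a + math.exp(-b * delta) * (y - a))
    prefactor = math.exp(-0.5 * b * (y - a) ** 2 - (0.5 * b + d) * delta)
    return prefactor * w_gaussian(tau, z, a, b, kappa)


def residual_4_1(delta, y, a, b, d, kappa, eps=1e-4):
    """Finite-difference estimate of h_delta - h_yy / 2 + [b^2 (y-a)^2 / 2 + d] h."""
    def h(dl, yy):
        return h_theorem4(dl, yy, a, b, d, kappa)
    h_t = (h(delta + eps, y) - h(delta - eps, y)) / (2 * eps)
    h_yy = (h(delta, y + eps) - 2 * h(delta, y) + h(delta, y - eps)) / eps ** 2
    return h_t - 0.5 * h_yy + (0.5 * b * b * (y - a) ** 2 + d) * h(delta, y)


if __name__ == "__main__":
    a, b, d, kappa = 0.3, 1.2, 0.1, 0.5
    print("residual of (4.1):", residual_4_1(0.7, 0.4, a, b, d, kappa))
    print("h(0, y) vs g(y):", h_theorem4(0.0, 0.4, a, b, d, kappa), math.exp(-0.5 * kappa * 0.16))
```

The residual is zero up to finite-difference error, and h(0, y) reproduces g(y), which is what the theorem asserts for this particular final condition.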
Even if the region of analyticity in τ were the same, these regions would map to different regions of ∆, because the parameter of the basic time transformation is k = 2b for one and k = −2b for the other. Since we are usually interested in problems in which the coefficients of the PDE are real, b will be either real or imaginary; if it is real, then the preferred choice of k is normally the positive value, which helps to establish uniform convergence for ∆ ∈ [0, +∞). The negative choice can help to establish uniform convergence on ∆ ∈ (−∞, 0], but this is of no interest for practical applications. Many problems involving conditional moments or bond pricing with a mean-reverting process are covered by Theorem 4 (after a change of variables), although other cases are covered as well. Conditional moments and bond prices under the term structure models of Vasicek (1977) and Ahn, Dittmar, and Gallant (2002), after a change of variables that puts the appropriate PDE in the canonical form, are covered by this case. For both models, conditional moments of the state variable are everywhere analytic in the time horizon; this follows by applying the results of Kimmel (2008) to the conditional moment sought. The final conditions are sufficiently smooth, and grow at a polynomial rate, more slowly than any exponential function, which is sufficient to guarantee existence of an everywhere analytic moment. If the state variable process is also stationary, Theorem 4 establishes uniform convergence for all ∆ ∈ [0, +∞). Bond prices in the Vasicek (1977) model are everywhere analytic, so a series in ∆ converges for all maturities, whether or not the interest rate process is stationary. However, this convergence is not uniform. 
Provided the interest rate process is stationary, Theorem 4 establishes uniform convergence of series representations of bond prices for all maturities.¹² For the scalar version of the model of Ahn, Dittmar, and Gallant (2002), considered in detail in Section 5, series representations of bond prices directly in maturity converge only for a limited range of maturities. However, Theorem 4 establishes that a series in τ instead of ∆ converges uniformly for all maturities, even if the state variable process is not stationary. Of course, the value of Theorem 4 is not to establish a method of finding convergent series for bond prices in models for which they are already known explicitly, but to find bond prices, other contingent claim prices, or conditional moments in the wide variety of problems, many non-linear, whose canonical form is given by (4.1) and (4.2). For many such problems, the solutions are not known in closed-form, but can be approximated accurately by our methods.

¹¹ If w(τ, z) has singularities within |τ| < 1, then its power series does not converge within any circle that includes τ = 1, and consequently, the approximations to h(∆, y) do not converge on ∆ ∈ [0, +∞). However, provided the singularities are away from the positive real axis in τ, the time transformation can be applied again, and the results of Section 3.2 can extend the region of convergence to τ ∈ [0, 1]. This uniformly convergent series of approximations to w(τ, z) could then be used to construct a uniformly convergent series of approximations to h(∆, y) on ∆ ∈ [0, +∞), with a slightly modified version of Theorem 3. However, we do not formally state and prove such a result.

¹² These results are not shown, but can be found in a manner similar to the analysis of other models in Section 5.

We finally note that Theorem 4 can be extended; if the conditions hold for any norm, then the PDE solution is analytic for all values of ∆.
However, uniform convergence of a power series could still only be established in certain directions in the complex plane. We do not formally state and prove such a result.

4.2. General Affine

The other case we consider is (2.8) and (2.9) with r_h(y) specified by (2.11):

$$\frac{\partial h}{\partial \Delta}(\Delta, y) = \frac{1}{2}\frac{\partial^2 h}{\partial y^2}(\Delta, y) - \left(\frac{a}{y^2} + \frac{b^2}{2}y^2 + d\right) h(\Delta, y) \tag{4.3}$$

$$h(0, y) = g(y) \tag{4.4}$$

If a = 0, this PDE is a special case of the PDE (4.1). As in that case, the convergence properties of the power series for h(∆, y) can be considerably improved if a time transformation is applied first. Several term structure models that have appeared in the literature reduce to the general affine case after changes of variables. Conditional moments and bond prices in an affine model (specifically, one in which the state variable follows the square-root process of Feller (1951), as in Cox, Ingersoll, and Ross (1985)) are covered by this case; for this reason, Kimmel (2008) refers to this version of the canonical PDE as the “general affine” case. However, many non-affine pricing or conditional moment problems also reduce (when expressed in the canonical form) to this case, so the results of this section are not limited to affine problems. As with the Brownian motion case, Kimmel (2008) expresses the region of analyticity of h(∆, y) in terms of the smoothness and growth properties of g(y), although the class of final conditions that generate analytic solutions is more complicated to characterize. Specifically, the final condition must be of the form:

$$g(y) = g_1(y)\, y^{\frac{1 - \sqrt{1 + 8a}}{2}} + g_2(y)\, y^{\frac{1 + \sqrt{1 + 8a}}{2}} \tag{4.5}$$

where g_1(y) and g_2(y) are everywhere analytic and even functions.
In the case where √(1 + 8a)/2 ∈ N, the alternate final condition:

$$g(y) = g_1(y)\, y^{\frac{1 - \sqrt{1 + 8a}}{2}} + g_2(y)\, y^{\frac{1 + \sqrt{1 + 8a}}{2}} \ln y \tag{4.6}$$

may be used instead; in this case, √(1 + 8a) should be interpreted as the positive square root.¹³ But also as with the Brownian motion case, analyticity of h(∆, y), even everywhere in ∆, does not establish uniform convergence on ∆ ∈ [0, +∞). The following theorem establishes analyticity of w(τ, z), the auxiliary function from Theorem 3, used to construct the solution h(∆, y):

¹³ As noted in Section 2, Kimmel (2008) requires the alternate final condition in the case where √(1 + 8a)/2 is an integer. However, the final condition (4.5) can still be used, as the g_2(y) part of the final condition is redundant in this case, and could be incorporated into the g_1(y) part of the final condition instead.

Theorem 5. Let b ≠ 0, a, and d be arbitrary numbers, and for √(2b), choose either square root. Let g_1(y) and g_2(y) be even functions that are analytic for all complex y, and let there exist some c > 0 and some norm (over the reals) ‖y‖ such that g_1(y) and g_2(y) satisfy the bounds:

$$\left| e^{\frac{y^2}{4}}\, g_1\!\left(\frac{y}{\sqrt{2b}}\right) \right| \le c\, e^{\frac{\|y\|^2}{2}} \qquad \left| e^{\frac{y^2}{4}}\, g_2\!\left(\frac{y}{\sqrt{2b}}\right) \right| \le c\, e^{\frac{\|y\|^2}{2}}$$

Then there exist w_1(τ, z) and w_2(τ, z), analytic for all complex z and ‖√τ‖ < 1, that satisfy the partial differential equations with final conditions:

$$\frac{\partial w_1}{\partial \tau}(\tau, z) = \frac{1 - \sqrt{1 + 8a}}{2z}\frac{\partial w_1}{\partial z}(\tau, z) + \frac{1}{2}\frac{\partial^2 w_1}{\partial z^2}(\tau, z)$$

$$\frac{\partial w_2}{\partial \tau}(\tau, z) = \frac{1 + \sqrt{1 + 8a}}{2z}\frac{\partial w_2}{\partial z}(\tau, z) + \frac{1}{2}\frac{\partial^2 w_2}{\partial z^2}(\tau, z)$$

$$w_1(0, z) = e^{\frac{z^2}{4}}\, g_1\!\left(\frac{z}{\sqrt{2b}}\right) \qquad w_2(0, z) = e^{\frac{z^2}{4}}\, g_2\!\left(\frac{z}{\sqrt{2b}}\right)$$

Furthermore, h(∆, y), defined by:

$$h(\Delta, y) \equiv e^{-\frac{b}{2}y^2 - \left(\frac{b}{2} + d\right)\Delta}\left[\left(\frac{z}{\sqrt{2b}}\right)^{\frac{1 - \sqrt{1 + 8a}}{2}} w_1(\tau, z) + \left(\frac{z}{\sqrt{2b}}\right)^{\frac{1 + \sqrt{1 + 8a}}{2}} w_2(\tau, z)\right]$$

where τ = τ_{2b}(∆) and z = √(2b) e^{−b∆} y, satisfies (4.3) and (4.4), with g(y) specified by (4.5), for all complex y and ∆ such that y ≠ 0 and ‖√(τ_{2b}(∆))‖ < 1.

Proof: See appendix.
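A simple special case gives a numerical sanity check of the construction in Theorem 5. The Python sketch below is our own illustration under stated assumptions: it takes real b > 0, y > 0, and 1 + 8a ≥ 0, assumes τ_{2b}(∆) = 1 − exp(−2b∆), and uses the constant auxiliary functions w_1 ≡ 1 (with w_2 ≡ 0) or w_2 ≡ 1 (with w_1 ≡ 0), which trivially solve the respective PDEs and correspond to g_i(y) = exp(−by²/2). The residual of (4.3) is estimated by central finite differences; the function names are hypothetical.

```python
import math


def h_constant_w(delta, y, a, b, d, sign):
    """h from Theorem 5 with w_1 = 1, w_2 = 0 (sign = -1) or w_2 = 1, w_1 = 0 (sign = +1)."""
    exponent = (1 + sign * math.sqrt(1 + 8 * a)) / 2
    z_scaled = math.exp(-b * delta) * y  # equals z / sqrt(2b)
    prefactor = math.exp(-0.5 * b * y * y - (0.5 * b + d) * delta)
    return prefactor * z_scaled ** exponent


def residual_4_3(delta, y, a, b, d, sign, eps=1e-4):
    """Finite-difference estimate of h_delta - h_yy / 2 + (a / y^2 + b^2 y^2 / 2 + d) h."""
    def h(dl, yy):
        return h_constant_w(dl, yy, a, b, d, sign)
    h_t = (h(delta + eps, y) - h(delta - eps, y)) / (2 * eps)
    h_yy = (h(delta, y + eps) - 2 * h(delta, y) + h(delta, y - eps)) / eps ** 2
    return h_t - 0.5 * h_yy + (a / (y * y) + 0.5 * b * b * y * y + d) * h(delta, y)


if __name__ == "__main__":
    a, b, d = 0.75, 1.1, 0.2
    for sign in (-1, +1):
        print(sign, residual_4_3(0.5, 0.9, a, b, d, sign))
```

Both residuals vanish up to finite-difference error, reflecting the fact that the exponents (1 ± √(1 + 8a))/2 are the two roots of p² − p − 2a = 0.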
This theorem is in many ways similar to Theorem 4, but is also somewhat more complicated, in that the final condition is expressed in terms of two everywhere analytic functions, and the growth condition is applied to each. Like Theorem 4, it establishes existence of analytic (in τ) auxiliary functions, and specifies the PDE (with final condition) satisfied by these functions; their power series can therefore be found by the recursive method of (2.6) and (2.7). It allows application of Theorem 3, with:

$$k = 2b \qquad \lambda = -b\left(1 \pm \frac{\sqrt{1 + 8a}}{2}\right) - d \qquad \theta = 0 \qquad \xi(y) = e^{-\frac{b}{2}y^2}\, y^{\frac{1 \pm \sqrt{1 + 8a}}{2}}$$

To apply Theorem 3 to the w_1(τ, z) part of the solution, the minus signs should be chosen in λ and ξ(y); to apply the theorem to the w_2(τ, z) part of the solution, the plus signs should be chosen. The theorem further shows that the h(∆, y) constructed in Theorem 3 is the solution to the problem (4.3) and (4.4), with g(y) specified by (4.5). If b is positive, and the conditions of the theorem are satisfied for the norm ‖y‖ = |y|/k_0 for any k_0 > 1 (choosing the positive square root √(2b)), then the power series for w(τ, z) converges uniformly (holding z fixed) for the values of τ that include ∆ ∈ [0, +∞). With some additional parameter restrictions, Theorem 3 establishes uniform convergence of the series of approximations of h(∆, y), constructed by truncating the power series of w_1(τ, z) and w_2(τ, z) and applying the definition of h(∆, y) in the theorem statement, on ∆ ∈ [0, +∞), holding y fixed. As with Theorem 4, circular norms, of the form ‖y‖ = |y|/k_0, are the most useful, since the power series of w_1(τ, z) and w_2(τ, z) converge within a circle. However, in some cases, non-circular norms could still be useful, since a second application of the time transform (with a slight modification of Theorem 3) would allow uniform approximation of w_1(τ, z) and w_2(τ, z) on τ ∈ [0, 1] whenever ‖1‖ < 1, even if the region ‖√τ‖ < 1 does not include |τ| ≤ 1.
(See the brief discussion in the previous section.) We do not formally state and prove such a result.

As with Theorem 4, it is possible to replace b with −b; the PDE satisfied by h(∆, y), (4.3) and (4.4), is then unaltered. The final condition satisfied by the transformed problem may well then be complex, even if the original final condition is a real function. However, this is not a problem since, if the final condition is a real function, the changes of variables that construct h(∆, y) from w_1(τ, z) and w_2(τ, z) ensure that the former is a real function, even if the latter are not. Provided the quadratic coefficient in (4.3) (i.e., b²/2) is positive, then b is real, and can be chosen to be either positive or negative. The positive choice is most useful for establishing uniform convergence on ∆ ∈ [0, +∞); the negative choice may help to establish uniform convergence on ∆ ∈ (−∞, 0], but this is not a particularly useful result for most real world problems.

We also note that it is possible to modify Theorem 5 to use the alternate final condition (4.6) in the case √(1 + 8a)/2 ∈ N. This requires a slight modification to the construction of h(∆, y) from w_1(τ, z) and w_2(τ, z), but also to the partial differential equation solved by w_1(τ, z); this equation is no longer homogeneous, but includes a term on w_2(τ, z) and a term on its first derivative. The recursive procedure to calculate the coefficients is still feasible, but must be modified for this inhomogeneous case. We do not consider this possibility further.

As with Theorem 4, many conditional moment or bond pricing problems for models that have appeared in the literature, including (but not limited to) those with a mean-reverting state process, are covered by Theorem 5. In Section 5, we apply this theorem and Theorem 3 to bond prices under the model of Cox, Ingersoll, and Ross (1985), and find that series using the time transformation converge uniformly and rapidly for all positive maturities.
We also apply these results to the callable bond pricing model of Jarrow, Li, Liu, and Wu (2006) (for which prices are not known in closed-form), and find that the series representation of bond prices in this model also converges uniformly for all positive maturities.

5. Examples

We now consider several examples, and show how to construct approximations of solutions to several asset pricing problems, using time transformations to extend the range and increase the rate of convergence. For two of the problems considered, in Sections 5.1 and 5.2, the quantity sought is already known explicitly, so these cases serve only as illustrative examples, to test the accuracy of approximations based on power series, and to see how much the accuracy is improved by the use of time transformations. For both of these cases, which are bond pricing models, we show that naïve power series (i.e., those constructed directly in the time variable ∆, without using a time transformation) diverge for maturities beyond a certain value; however, time transformation methods extend the range of convergence to include all maturities. Furthermore, we show that approximations based on power series using time transformations are much more accurate than those based on naïve power series, for an extremely wide range of maturities and values of the state variable, virtually certain to include any values needed for a real world application. For the other examples considered, including the callable bond pricing problem of Jarrow, Li, Liu, and Wu (2006), the solution sought is not known in closed-form, so we do not compare approximate solutions based on power series to the true solution. For these cases, we show how to derive a series representation of the solution, and to determine its convergence properties.

5.1. ADG Model

We now construct approximations to bond prices under the interest rate model of Ahn, Dittmar, and Gallant (2002).
Since bond prices in this model are known in closed-form, this model allows us to determine the accuracy of approximations constructed using our methods. The risk-neutral interest rate process under the scalar version of the model of Ahn, Dittmar, and Gallant (2002) is given by (see footnote 14):

$$dx_t = \kappa\,(\theta - x_t)\,dt + \sigma\,dW_t$$

$$r_t = x_t^2 + \phi$$

where Wt is a Brownian motion. Zero-coupon bond prices f(∆, x) then satisfy the PDE with final condition:

$$\frac{\partial f}{\partial \Delta}(\Delta, x) = \kappa\,(\theta - x)\,\frac{\partial f}{\partial x}(\Delta, x) + \frac{\sigma^2}{2}\frac{\partial^2 f}{\partial x^2}(\Delta, x) - \left(x^2 + \phi\right) f(\Delta, x)$$

$$f(0, x) = 1$$

This PDE can be converted to the canonical form by the change of dependent and independent variables discussed in Section 2. With the specific coefficients of the PDE above, these changes of variables are:

$$y(x) = \frac{x - \theta}{\sigma} \qquad f(\Delta, x) = e^{\frac{\kappa (x - \theta)^2}{2\sigma^2}}\, h(\Delta, y(x))$$

The canonical form PDE, with final condition, is then:

$$\frac{\partial h}{\partial \Delta}(\Delta, y) = \frac{1}{2}\frac{\partial^2 h}{\partial y^2}(\Delta, y) - \left[\frac{b^2}{2}(y - a)^2 + d\right] h(\Delta, y) \qquad (5.1)$$

$$h(0, y) = e^{-\frac{\kappa}{2} y^2} \qquad (5.2)$$

with:

$$a \equiv -\frac{2\theta\sigma}{b^2} \qquad b \equiv \sqrt{\kappa^2 + 2\sigma^2} \qquad d \equiv \frac{\kappa^2\theta^2}{b^2} - \frac{\kappa}{2} + \phi$$

The quantity inside the square root in the definition of b is clearly positive, and b itself is assigned the positive square root.

Footnote 14: Our notation differs somewhat from that used by these authors, and is also slightly less general. Ahn, Dittmar, and Gallant (2002) allow linear as well as quadratic interest rate specifications, nesting the model of Vasicek (1977), which is precluded by our parameterization. However, as our purpose here is to provide an illustrative example rather than to construct the most comprehensive model, this less general parameterization suffices.

We now apply Theorem 4, with:

$$g(y) \equiv e^{-\frac{\kappa}{2} y^2} \qquad \|y\| \equiv |y| \sqrt{\frac{1}{2} - \frac{\kappa}{2b}}$$

Note that the expression inside the square root in the definition of the norm ‖y‖ is necessarily between 0 and 1/2, provided κ is not negative. If κ is negative, this expression could approach 1 arbitrarily closely, depending on the relative magnitudes of κ and σ, but is always smaller than 1, as long as σ ≠ 0.
In all cases, the square-root sign should be interpreted as the positive square root. Theorem 4 establishes the existence of a w(τ, z) that is analytic for all ‖√τ‖ < 1, and that satisfies the PDE in the theorem statement; since ‖1‖ < 1, it follows that the region of analyticity of w(τ, z) established by Theorem 4, ‖√τ‖ < 1, includes a circle |τ| < r for some r > 1. The solution w(τ, z) is known explicitly for this model:

$$w(\tau, z) = \frac{\exp\left[\dfrac{\left(1 - \frac{\kappa}{b}\right)\left(z - \dfrac{a\sqrt{2b}}{1 - \frac{\kappa}{b}}\right)^2}{4\left[1 - \frac{\tau}{2}\left(1 - \frac{\kappa}{b}\right)\right]} - \dfrac{a^2 \kappa}{2\left(1 - \frac{\kappa}{b}\right)}\right]}{\sqrt{1 - \frac{\tau}{2}\left(1 - \frac{\kappa}{b}\right)}}$$

However, even if it were not known in closed-form, a power series could be calculated using the recursive relation of (3.8) and (3.9), since w(τ, z) is known to solve the PDE (with final condition) in Theorem 4. The first few terms are:

$$w(\tau, z) = e^{\frac{1}{4}\left(1 - \frac{\kappa}{b}\right)\left(z - \frac{a\sqrt{2b}}{1 - \kappa/b}\right)^2 - \frac{a^2\kappa}{2\left(1 - \kappa/b\right)}} \left\{ 1 + \tau\left[\frac{1}{4}\left(1 - \frac{\kappa}{b}\right) + \frac{1}{8}\left(1 - \frac{\kappa}{b}\right)^2\left(z - \frac{a\sqrt{2b}}{1 - \kappa/b}\right)^2\right] + \frac{\tau^2}{2}\left[\frac{3}{16}\left(1 - \frac{\kappa}{b}\right)^2 + \frac{3}{16}\left(1 - \frac{\kappa}{b}\right)^3\left(z - \frac{a\sqrt{2b}}{1 - \kappa/b}\right)^2 + \frac{1}{64}\left(1 - \frac{\kappa}{b}\right)^4\left(z - \frac{a\sqrt{2b}}{1 - \kappa/b}\right)^4\right] + \ldots \right\}$$

Since ‖1‖ < 1, this power series converges within a circle that includes τ ∈ [0, 1]. Theorem 4 also provides a construction of an h(∆, y) that is the solution to (5.1) and (5.2). This solution can therefore be approximated by first approximating w(τ, z) with a few terms of its power series, and then applying the construction in the theorem statement to the approximation. Convergence of a series of approximations to h(∆, y) on ∆ ∈ [0, +∞) is then established by Theorem 3, with only mild restrictions on the value of φ (φ ≥ 0, precluding negative interest rates, suffices). Since b is positive, the interval τ ∈ [0, 1] maps to ∆ ∈ [0, +∞), and (given the restriction on φ) Theorem 3 then establishes uniform convergence on this range. Note that stationarity of the state variable process is not required for this uniform convergence.

Table 2 shows bond yields based on approximations (in τ) with terms up to order 2 and 4, and also power series approximations in ∆ (i.e., without using any time transformation methods) with terms up to order 8, and compares these approximate yields to the true yields. The parameter values used are κ = 0.4995, θ = 0.1860, σ = 0.0812 and φ = 0.0088, which are chosen so that the unconditional mean and variance of the interest rate process are 5% and 0.001, respectively, and the kurtosis coefficient is 5. For these parameter values, a naïve power series representation of bond prices directly in maturity converges for maturities up to approximately 8.7 years, and diverges for longer maturities (see Table 1). Since the relation between the state variable and the interest rate is not one-to-one, Table 2 shows results for four values of the state variable: two values corresponding to an initial interest rate of 6%, and two values corresponding to an initial interest rate of 600%. For each initial value of the state variable, bond prices are calculated for an extremely wide range of maturities, ranging from 1 to 10,000 years.

The accuracy of the approximations based on series in τ is clear when they are compared to the exact bond prices, and to those based on a naïve power series directly in ∆. As shown, even the 2-term approximations in τ are extremely accurate over this huge range of interest rates and maturities. By far the largest yield approximation error occurs with a maturity of 1 year and an initial interest rate of r = 600% (corresponding to an initial state variable value of x = 2.44769); this error is slightly more than one tenth of a basis point, when the exact yield is almost 400%. Most approximate yields are off by less than one hundredth of a basis point, and often by much less than that. In relative terms, the largest error is less than one part in one hundred thousand of the yield being approximated; for very long maturities, the error is closer to one part in four hundred million.
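The comparison between the 2-term series in τ and the exact yield can be sketched numerically. The Python sketch below is illustrative only and is not the author's code. It assumes that the time transformation is τ = 1 − exp(−2b∆), that the transformed state variable is z = √(2b)[a + (y − a)exp(−b∆)], and that h(∆, y) = exp[−(b/2 + d)∆] exp[−b(y − a)²/2] w(τ, z). These assumptions follow from the Gaussian solution of the canonical form PDE (5.1) and (5.2), but the statement of Theorem 4 is not reproduced in this section, so they should be checked against it. All function names are hypothetical.

```python
import math

# Parameter values quoted in the text for the ADG example.
KAPPA, THETA, SIGMA, PHI = 0.4995, 0.1860, 0.0812, 0.0088

B = math.sqrt(KAPPA**2 + 2.0 * SIGMA**2)                 # b
A = -2.0 * THETA * SIGMA / B**2                          # a
D = KAPPA**2 * THETA**2 / B**2 - KAPPA / 2.0 + PHI       # d
Q = 1.0 - KAPPA / B                                      # recurring factor (1 - kappa/b)


def transformed_variables(delta, x):
    """Map (maturity, state) to (y, tau, z); the tau and z maps are assumptions."""
    y = (x - THETA) / SIGMA
    tau = 1.0 - math.exp(-2.0 * B * delta)
    z = math.sqrt(2.0 * B) * (A + (y - A) * math.exp(-B * delta))
    return y, tau, z


def w_exact(tau, z):
    """Closed-form w(tau, z)."""
    zc = z - A * math.sqrt(2.0 * B) / Q
    u = 1.0 - 0.5 * tau * Q
    return math.exp(Q * zc**2 / (4.0 * u) - A**2 * KAPPA / (2.0 * Q)) / math.sqrt(u)


def w_series(tau, z):
    """Power series of w(tau, z) with terms up to order 2 in tau."""
    zc2 = (z - A * math.sqrt(2.0 * B) / Q) ** 2
    prefactor = math.exp(Q * zc2 / 4.0 - A**2 * KAPPA / (2.0 * Q))
    c1 = Q / 4.0 + Q**2 * zc2 / 8.0
    c2 = 3.0 * Q**2 / 16.0 + 3.0 * Q**3 * zc2 / 16.0 + Q**4 * zc2**2 / 64.0
    return prefactor * (1.0 + tau * c1 + 0.5 * tau**2 * c2)


def bond_yield(delta, x, w):
    """Continuously compounded zero-coupon yield built from a given w function."""
    y, tau, z = transformed_variables(delta, x)
    # log h = lambda * delta + log xi(y) + log w, with lambda = -(b/2 + d) (assumed).
    log_h = -(B / 2.0 + D) * delta - B * (y - A) ** 2 / 2.0 + math.log(w(tau, z))
    log_price = KAPPA * y**2 / 2.0 + log_h   # f = exp(kappa * y^2 / 2) * h
    return -log_price / delta
```

Under these assumptions, calling `bond_yield(1.0, 2.44769, w_exact)` and `bond_yield(1.0, 2.44769, w_series)` should give a yield a little below 400% and a 2-term error on the order of a tenth of a basis point, in line with the discussion of Table 2.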
For the 4-term approximations in τ, the errors are much smaller, generally tiny fractions of a basis point; in only one case is the relative error slightly more than one part per billion. By contrast, approximations based on a power series in ∆, even though they include terms up to order 8 in ∆, are much less accurate. For a maturity of one year in the two cases in which the initial interest rate is 6%, these approximations are accurate to within a basis point, but are very inaccurate in the two cases corresponding to r = 600%, producing large negative yields instead of the correct large positive yields. For longer maturities, the accuracy is poor for all four initial values of the state variable process; even at a maturity of five years, which is within the interval of convergence, the approximation error is so large as to render the approximations useless in three of the four cases. For still longer maturities, series constructed directly in ∆ diverge, and the table shows extreme inaccuracy in these cases. The method of time transformation here has both extended the range of maturities for which power series converge, to positive infinity, and also improved their accuracy, often by many orders of magnitude relative to series derived without time transformations.

5.2. CIR Model

We now construct approximations to bond prices under the interest rate model of Cox, Ingersoll, and Ross (1985). Since bond prices are known in closed-form, this model, like that of the previous section, allows us to evaluate the accuracy of approximations constructed using time transformations. The risk-neutral interest rate process is given by:

$$dr_t = \kappa\,(\theta - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t$$

where Wt is a Brownian motion. We require 2κθ ≥ σ² so that the interest rate process cannot achieve the boundary value of rt = 0 with positive probability (see Feller (1951)).
Zero-coupon bond prices f(∆, r) then satisfy the PDE with final condition:

$$\frac{\partial f}{\partial \Delta}(\Delta, r) = \kappa\,(\theta - r)\,\frac{\partial f}{\partial r}(\Delta, r) + \frac{\sigma^2 r}{2}\frac{\partial^2 f}{\partial r^2}(\Delta, r) - r f(\Delta, r) \qquad (5.3)$$

$$f(0, r) = 1 \qquad (5.4)$$

This PDE can be converted to canonical form by the change of dependent and independent variables discussed in Section 2 in Kimmel (2008). With the specific coefficients given by (5.3), these changes of variables are:

$$y(r) = \frac{2\sqrt{r}}{\sigma} \qquad f(\Delta, r) = \left(\frac{4r}{\sigma^2}\right)^{\frac{1}{4} - \frac{\theta\kappa}{\sigma^2}} e^{\frac{r\kappa}{\sigma^2}}\, h(\Delta, y(r))$$

The canonical form PDE, with final condition, is then:

$$\frac{\partial h}{\partial \Delta}(\Delta, y) = \frac{1}{2}\frac{\partial^2 h}{\partial y^2}(\Delta, y) - \left[\frac{a}{y^2} + \frac{b^2}{2} y^2 + d\right] h(\Delta, y) \qquad (5.5)$$

$$h(0, y) = y^{\alpha} e^{-\frac{\kappa}{4} y^2} \qquad (5.6)$$

with:

$$a \equiv \frac{2\theta^2\kappa^2}{\sigma^4} - \frac{2\theta\kappa}{\sigma^2} + \frac{3}{8} \qquad b \equiv \frac{\sqrt{\kappa^2 + 2\sigma^2}}{2} \qquad d \equiv -\frac{\theta\kappa^2}{\sigma^2} \qquad \alpha \equiv \frac{1 + \sqrt{1 + 8a}}{2} = \frac{2\theta\kappa}{\sigma^2} - \frac{1}{2}$$

The boundary nonattainment condition is needed to establish equality of the two expressions for α. The quantity inside the square root sign in the expression for b is clearly positive, and b is taken to be the positive square root. We now apply Theorem 5, with:

$$g_1(y) \equiv 0 \qquad g_2(y) \equiv e^{-\frac{\kappa}{4} y^2} \qquad \|y\| \equiv |y| \sqrt{\frac{1}{2} - \frac{\kappa}{4b} + \epsilon}$$

for some ε > 0. Note that the expression inside the square root sign in the definition of the norm ‖y‖ (leaving aside ε) is necessarily between 0 and 1/2 if κ is not negative; if κ is negative, this value is always less than one, but can approach one arbitrarily closely if the magnitude of σ is small compared to the magnitude of κ. In either case, by choosing a sufficiently small ε, ‖1‖ < 1. Theorem 5 establishes the existence of w1(τ, z) and w2(τ, z) that are analytic for all ‖√τ‖ < 1, and that satisfy the PDEs in the theorem statement; it is easily verified that w1(τ, z) is everywhere zero. Since ‖1‖ < 1, it follows that the region of analyticity of w2(τ, z) established by Theorem 5, ‖√τ‖ < 1, includes a circle |τ| < r for some r > 1.
The solution w2(τ, z) for this problem is known explicitly:

$$w_2(\tau, z) = \frac{\exp\left[\dfrac{z^2\left(1 - \frac{\kappa}{2b}\right)}{4 - 2\tau\left(1 - \frac{\kappa}{2b}\right)}\right]}{\left[1 - \frac{1}{2}\left(1 - \frac{\kappa}{2b}\right)\tau\right]^{\alpha + \frac{1}{2}}}$$

However, even if it were not known in closed-form, a power series could be calculated using the recursive relation of (3.8) and (3.9), since w2(τ, z) is known to solve the PDE (with final condition) stated in Theorem 5. The first few terms are:

$$w_2(\tau, z) = e^{\frac{z^2}{4}\left(1 - \frac{\kappa}{2b}\right)} \left\{ 1 + \tau\left[\left(\frac{1}{2} + \alpha\right)\left(\frac{1}{2} - \frac{\kappa}{4b}\right) + \frac{z^2}{2}\left(\frac{1}{2} - \frac{\kappa}{4b}\right)^2\right] + \frac{\tau^2}{2}\left[\left(\frac{1}{2} + \alpha\right)\left(\frac{3}{2} + \alpha\right)\left(\frac{1}{2} - \frac{\kappa}{4b}\right)^2 + z^2\left(\frac{3}{2} + \alpha\right)\left(\frac{1}{2} - \frac{\kappa}{4b}\right)^3 + \frac{z^4}{4}\left(\frac{1}{2} - \frac{\kappa}{4b}\right)^4\right] + \ldots \right\}$$

Since ‖1‖ < 1, the power series converges within a circle that includes τ ∈ [0, 1]. Theorem 5 also provides a construction of an h(∆, y) that is the solution to (5.5) and (5.6). This solution can therefore be approximated by first approximating w2(τ, z) with a few terms of its power series, and then applying the construction in the theorem statement to the approximation. Theorem 3 can then be applied; note that b is positive, so the interval τ ∈ [0, 1] maps to ∆ ∈ [0, +∞). Furthermore, the λ coefficient in the theorem is necessarily negative, so the theorem establishes uniform convergence of approximations to h(∆, y), based on the power series of w2(τ, z), for all positive ∆.

The accuracy of the approximation is clear when it is compared to the exact bond prices, and to those obtained by a power series approximation of f(∆, r) directly in ∆. Table 3 shows bond yields based on approximations (in τ) with terms up to order 2 and 4, and also power series approximations in ∆ (i.e., without using any time transformation methods) with terms up to order 8, and compares these approximate yields to the true yields. As shown, even the 2-term approximations in τ are extremely accurate over a huge range of interest rates (r = 0.06%, r = 6%, and r = 600%) and maturities (from 1 to 10,000 years).
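A comparison of this kind can be sketched in a few lines of Python. The sketch below is illustrative and not the author's code. The parameter values are hypothetical (they are not those behind Table 3) and are chosen only so that 2κθ ≥ σ² holds. The sketch assumes τ = 1 − exp(−2b∆), z = √(2b) y exp(−b∆), and h(∆, y) = exp(λ∆) y^α exp(−by²/2) w2(τ, z) with λ = −[b(α + 1/2) + d]; the statement of Theorem 5 is not reproduced in this section, so these should be checked against it. The exact yield uses the standard Cox, Ingersoll, and Ross closed-form bond price, written in terms of exp(−γ∆) to avoid overflow at long maturities.

```python
import math

# Hypothetical parameter values (NOT those behind Table 3); 2*kappa*theta >= sigma^2.
KAPPA, THETA, SIGMA = 0.5, 0.06, 0.15

B = math.sqrt(KAPPA**2 + 2.0 * SIGMA**2) / 2.0      # b
D = -THETA * KAPPA**2 / SIGMA**2                    # d
ALPHA = 2.0 * THETA * KAPPA / SIGMA**2 - 0.5        # alpha
C = 0.5 - KAPPA / (4.0 * B)                         # recurring factor (1/2 - kappa/(4b))


def yield_tau_series(delta, r):
    """Yield from the series of w2 with terms up to order 2 in tau."""
    y2 = 4.0 * r / SIGMA**2                                  # y^2
    tau = 1.0 - math.exp(-2.0 * B * delta)                   # assumed time transformation
    z2 = 2.0 * B * y2 * math.exp(-2.0 * B * delta)           # z^2 (assumed state map)
    c1 = (0.5 + ALPHA) * C + 0.5 * z2 * C**2
    c2 = ((0.5 + ALPHA) * (1.5 + ALPHA) * C**2
          + z2 * (1.5 + ALPHA) * C**3
          + 0.25 * z2**2 * C**4)
    log_w2 = 0.5 * z2 * C + math.log(1.0 + tau * c1 + 0.5 * tau**2 * c2)
    lam = -(B * (ALPHA + 0.5) + D)                           # assumed lambda
    # f = y^(-alpha) e^(kappa y^2/4) h and h = e^(lam*delta) y^alpha e^(-b y^2/2) w2,
    # so the powers of y cancel in the log price.
    log_price = lam * delta + (KAPPA / 4.0 - B / 2.0) * y2 + log_w2
    return -log_price / delta


def yield_closed_form(delta, r):
    """Yield from the standard CIR closed-form bond price A(delta) * exp(-B(delta) * r)."""
    g = math.sqrt(KAPPA**2 + 2.0 * SIGMA**2)
    e = math.exp(-g * delta)
    denom = g + KAPPA + e * (g - KAPPA)
    b_fn = -2.0 * math.expm1(-g * delta) / denom
    log_a = (2.0 * KAPPA * THETA / SIGMA**2) * (
        math.log(2.0 * g) - math.log(denom) + (KAPPA - g) * delta / 2.0)
    return (b_fn * r - log_a) / delta
```

For instance, `yield_tau_series(10.0, 0.06)` and `yield_closed_form(10.0, 0.06)` can be compared directly; the size of the difference depends on the (hypothetical) parameters and should not be read as a reproduction of Table 3.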
The largest relative error occurs with a maturity of 1 year and an initial interest rate of r = 0.06%, and is less than four parts in one thousand; most errors are much smaller. For the 4-term approximations in τ, the errors are typically measured in a few thousandths of a basis point, or less, providing far greater accuracy than is likely to be needed in any real application. By contrast, the approximations in ∆, even though they include terms up to order 8 in ∆, are severely inaccurate for all but the shortest maturities. With the very high initial interest rate of r = 600%, this approximation is not even accurate for a maturity of one year, the approximation error being twice the size of the exact yield itself. For the two smaller initial interest rates of r = 6% and r = 0.06%, the approximation in ∆ is quite accurate for a maturity of one year, but deviates from the true yield by about 40 basis points (in both cases) with a maturity of five years, despite the fact that the series converges (for these parameter values) for maturities up to approximately 8.24 years. For maturities of 10 years or longer, the approximations based on series in ∆ are, for all initial interest rates shown, so inaccurate as to be useless for any reasonable application, as might be expected, since the series diverges for these maturities. As with the case of the Ahn, Dittmar, and Gallant (2002) model, the method of time transformation here has not only extended the range of maturities for which the series converges, but also improved the accuracy of the series by many orders of magnitude.

5.3. Callable Bonds

Jarrow, Li, Liu, and Wu (2006) consider a model in which the prices of callable bonds satisfy:

$$\frac{\partial f}{\partial \Delta}(\Delta, x) = \kappa\,(\theta - x)\,\frac{\partial f}{\partial x}(\Delta, x) + \frac{\sigma^2 x}{2}\frac{\partial^2 f}{\partial x^2}(\Delta, x) - \left(c_1 x + \frac{c_2}{x}\right) f(\Delta, x) \qquad (5.7)$$

$$f(0, x) = 1 \qquad (5.8)$$

with c1 > 0 and 2θκ ≥ σ². This PDE can be converted to the canonical form by the change of dependent and independent variables as discussed in Section 2.
With the specific coefficients given by (5.7), these changes of variables are:

$$y(x) = \frac{2\sqrt{x}}{\sigma} \qquad f(\Delta, x) = \left(\frac{4x}{\sigma^2}\right)^{\frac{1}{4} - \frac{\theta\kappa}{\sigma^2}} e^{\frac{x\kappa}{\sigma^2}}\, h(\Delta, y(x))$$

The canonical form PDE, with final condition, is then:

$$\frac{\partial h}{\partial \Delta}(\Delta, y) = \frac{1}{2}\frac{\partial^2 h}{\partial y^2}(\Delta, y) - \left[\frac{a}{y^2} + \frac{b^2}{2} y^2 + d\right] h(\Delta, y) \qquad (5.9)$$

$$h(0, y) = y^{\alpha} e^{-\frac{\kappa}{4} y^2} \qquad (5.10)$$

with:

$$a \equiv \frac{2\theta^2\kappa^2}{\sigma^4} - \frac{2\theta\kappa - 4c_2}{\sigma^2} + \frac{3}{8} \qquad b \equiv \frac{\sqrt{\kappa^2 + 2c_1\sigma^2}}{2} \qquad d \equiv -\frac{\theta\kappa^2}{\sigma^2} \qquad \alpha \equiv \frac{2\theta\kappa}{\sigma^2} - \frac{1}{2}$$

The quantity inside the square root sign in the expression for b is positive, and b takes the value of the positive square root. The results of Kimmel (2008) apply to (5.9) and (5.10) (with a restriction on the values of a and α), establishing that h(∆, y) is analytic in ∆. However, the region of analyticity (and therefore the interval of convergence of a power series) is bounded; the power series of h(∆, y) diverges for large ∆. But, using our results, the range of convergence can be extended to ∆ ∈ [0, +∞), and this convergence can be made uniform. We now apply Theorem 5, with:

$$g_1(y) \equiv e^{-\frac{\kappa}{4} y^2} \qquad g_2(y) \equiv 0 \qquad \|y\| \equiv |y| \sqrt{\frac{1}{2} - \frac{\kappa}{4b} + \epsilon}$$

for any ε > 0 (see footnote 15). Note that the expression inside the square root sign in the definition of the norm ‖y‖ (leaving aside ε) is necessarily between 0 and 1/2 if κ is not negative, and between 1/2 and 1 (but never equal to 1) if κ is negative. Either way, with a sufficiently small choice of ε > 0, we have ‖1‖ < 1. Theorem 5 establishes the existence of w1(τ, z) and w2(τ, z) that are analytic for all ‖√τ‖ < 1, and that satisfy the PDEs (with final conditions) in the theorem statement; consequently, their power series can be found using the recursive method of (3.8) and (3.9). It is clear that w2(τ, z) is everywhere zero; the region of analyticity of w1(τ, z) established by Theorem 5, ‖√τ‖ < 1, includes a circle |τ| < r for some r > 1, because ‖1‖ < 1 and the norm is circular.

Footnote 15: This choice of g1(y) and g2(y) is appropriate for c2 ≥ 0. If c2 < 0, then the definitions of g1(y) and g2(y) can be reversed; this allows application of Theorem 3 with a weaker restriction on the parameters. If the two parts of the final condition are reversed, then the references to w1(τ, z) and w2(τ, z) in the subsequent text should also be reversed.

The power series of w1(τ, z) therefore converges uniformly on a region that includes τ ∈ [0, 1]. The first few terms of this series are:

$$w_1(\tau, z) = \left(\frac{z}{\sqrt{2b}}\right)^{\alpha - \gamma} e^{\frac{z^2}{4}\left(1 - \frac{\kappa}{2b}\right)} \left\{ 1 + \tau\left[\frac{(\alpha - \gamma)(\alpha + \gamma - 1)}{2z^2} + \frac{2\alpha + 1}{4}\left(1 - \frac{\kappa}{2b}\right) + \frac{z^2}{8}\left(1 - \frac{\kappa}{2b}\right)^2\right] + \frac{\tau^2}{2}\left[\frac{(\alpha - \gamma)(\alpha - \gamma - 2)(\alpha + \gamma - 1)(\alpha + \gamma - 3)}{4z^4} + \frac{(2\alpha - 1)(\alpha - \gamma)(\alpha + \gamma - 1)}{4z^2}\left(1 - \frac{\kappa}{2b}\right) + \left(\frac{(2\alpha + 3)(2\alpha + 1)}{16} + \frac{(\alpha - \gamma)(\alpha + \gamma - 1)}{8}\right)\left(1 - \frac{\kappa}{2b}\right)^2 + \frac{(2\alpha + 3)z^2}{16}\left(1 - \frac{\kappa}{2b}\right)^3 + \frac{z^4}{64}\left(1 - \frac{\kappa}{2b}\right)^4\right] + \ldots \right\}$$

where:

$$\gamma \equiv \frac{1 - \sqrt{1 + 8a}}{2}$$

Theorem 5 also provides a construction of an h(∆, y) that is the solution to (5.9) and (5.10). This solution can therefore be approximated by first approximating w1(τ, z) with a few terms of its power series, and then applying the construction in the theorem statement to the approximation. Theorem 3, given sufficient restrictions on α, then establishes uniform convergence of the approximations to h(∆, y) on ∆ ∈ [0, +∞). Note that b is positive; the interval τ ∈ [0, 1] then maps to ∆ ∈ [0, +∞). As with the previous two models, the method of time transformation here has not only extended the range of maturities for which the series converges, but also established uniform convergence on the entire range.

5.4. Other Models and Applications

For two of the three models considered in the preceding sections, the solution to the pricing problem is known in closed-form. In all three models, the state variable follows an affine process. However, as noted in Kimmel (2008), the same canonical form PDEs that arise in affine diffusion problems also arise in many non-linear problems.
Continuing with bond pricing as the motivating example (the task in two of the three preceding sections), we note that non-linear term structure models in which bonds can be priced using our results are rather easily constructed. This can be seen simply by reversing the change of variables (in particular, the change of dependent variable) needed to put a pricing problem in the canonical form. We begin with:

$$\frac{\partial h}{\partial \Delta}(\Delta, y) = \frac{1}{2}\frac{\partial^2 h}{\partial y^2}(\Delta, y) - r_h(y)\, h(\Delta, y)$$

$$h(0, y) = g(y)$$

where rh(y) is specified by either (2.10) or (2.11). Suppose that the final condition g(y) satisfies the conditions of Theorem 4 (when rh(y) is given by (2.10)) or Theorem 5 (when rh(y) is given by (2.11)) for the norm ‖y‖ = |y|(1 − ε) for some 0 < ε < 1, and that b² > 0. Then by either Theorem 4 or 5, w(τ, z) is analytic for all |τ| < 1/(1 − ε)², and its power series converges uniformly on |τ| < 1. It follows that approximations to h(∆, y) that converge uniformly on ∆ ∈ [0, +∞) can be constructed, using either Theorem 4 or 5 (depending on the specification of rh(y)), and also Theorem 3. In the case of rh(y) given by (2.10), g(y) must be everywhere analytic and satisfy:

$$\left| e^{\frac{b}{2} z^2} g(z) \right| \le c\, e^{b (1 - \epsilon)^2 |z|^2}$$

In the case of rh(y) given by (2.11), it is g1(y) and g2(y) that must be everywhere analytic and satisfy the same growth condition. Provided the additional constraint on the sign of λ (from Theorem 3) is satisfied, the result is uniform convergence of approximations to h(∆, y) for all ∆.

The class of functions g(y) (or g1(y) and g2(y)) that satisfy these conditions is extremely broad. Polynomials, exponential functions, products of the two, linear combinations of the products, etc., all qualify, as do functions of the form exp(ky²) for k < 1, such exponentials multiplied by polynomials, linear combinations thereof, etc.
Periodic functions such as sin(ky) and cos(ky), these functions multiplied by polynomials, linear combinations of such functions, etc., also qualify. Some functions that may even seem to be non-analytic at first glance can also qualify; Kimmel (2008) cites functions of the form exp(k√y) + exp(−k√y), which, despite the appearance of the square root, are analytic in y; functions such as sin(√y)/√y are also analytic in y. Such functions can be multiplied by polynomials, multiplied by each other, added together, etc., to form final conditions that satisfy the conditions of Theorem 4 or 5. Given such a final condition, the power series of the corresponding PDE solution can be found, and the series converges for values of τ corresponding to all ∆ ∈ [0, +∞).

For such final conditions, a corresponding term structure model can be reverse engineered, by reversing the change of dependent variable. Take:

$$f(\Delta, y) = h(\Delta, y) / g(y)$$

Then f(∆, y) satisfies:

$$\frac{\partial f}{\partial \Delta}(\Delta, y) = \frac{g'(y)}{g(y)}\frac{\partial f}{\partial y}(\Delta, y) + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(\Delta, y) - \left[r_h(y) - \frac{g''(y)}{2 g(y)}\right] f(\Delta, y)$$

$$f(0, y) = 1$$

so that f(∆, y) may be interpreted as the price of a zero-coupon bond in a model driven by a state variable and interest rate process (see footnote 16):

$$dY_t = \frac{g'(Y_t)}{g(Y_t)}\,dt + dW_t$$

$$r_t = r_h(Y_t) - \frac{g''(Y_t)}{2 g(Y_t)}$$

Bond prices in the implied model can be approximated by truncated power series, and the series converge for all maturities ∆ ∈ [0, +∞). There are two types of choices of g(y) for which the implied state variable process is affine, or can be converted to an affine process by change of independent variable; those two choices are:

$$g(y) = e^{c_0 + c_1 y + c_2 y^2} \qquad g(y) = c_0\, e^{c_1 y^2} y^{c_2}$$

Any other choice of final condition generates a non-affine term structure model, but bond prices can still be approximated uniformly using our methods in such models.

The above analysis assumes a g(y) (or g1(y) and g2(y)) that satisfies the growth condition for a norm of the form ‖y‖ = |y|(1 − ε).
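The reverse-engineering step can be illustrated with a small Python sketch. It is a sketch under stated assumptions rather than part of the paper: the helper `implied_model` and the example functions are hypothetical names, the quadratic form used for rh(y) is the canonical-form potential of Section 5.1 with arbitrary illustrative coefficients, and the derivatives of g are supplied analytically by the caller.

```python
import math


def implied_model(g, dg, d2g, r_h):
    """Return (drift, short_rate) functions implied by a final condition g.

    drift(y)      = g'(y) / g(y)
    short_rate(y) = r_h(y) - g''(y) / (2 g(y))
    """
    def drift(y):
        return dg(y) / g(y)

    def short_rate(y):
        return r_h(y) - d2g(y) / (2.0 * g(y))

    return drift, short_rate


# Illustrative quadratic r_h(y) = (b^2/2)(y - a)^2 + d with hypothetical coefficients.
A_COEF, B_COEF, D_COEF = 0.0, 0.5, 0.6


def r_h(y):
    return 0.5 * B_COEF**2 * (y - A_COEF) ** 2 + D_COEF


# Choice 1: g(y) = exp(c0 + c1*y + c2*y^2) gives a drift that is linear in y.
C0, C1, C2 = 0.0, 0.1, -0.2
affine_drift, affine_rate = implied_model(
    lambda y: math.exp(C0 + C1 * y + C2 * y * y),
    lambda y: (C1 + 2.0 * C2 * y) * math.exp(C0 + C1 * y + C2 * y * y),
    lambda y: ((C1 + 2.0 * C2 * y) ** 2 + 2.0 * C2) * math.exp(C0 + C1 * y + C2 * y * y),
    r_h,
)

# Choice 2: g(y) = cosh(y), a linear combination of exponentials, gives the
# non-linear drift tanh(y) and lowers the short rate by a constant 1/2.
cosh_drift, cosh_rate = implied_model(math.cosh, math.sinh, math.cosh, r_h)
```

In this sketch the first choice reproduces a linear (affine) drift, while the second yields a bounded, non-linear drift even though the canonical form PDE is unchanged.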
However, it is possible to construct term structure models for which bond prices can be approximated uniformly for all maturities, from an even broader class of final conditions. For a very simple example, take g(y) = exp(−3bc0y²/2) for some c0 ≥ 1. This final condition does not satisfy the growth condition for any norm of the form ‖y‖ = |y|(1 − ε). However, it does satisfy the growth condition with other types of norms, and for some of these norms, ‖1‖ < 1. For example, take:

$$\|y\| \equiv \sqrt{\frac{1}{1 + \epsilon_1}(\mathrm{Re}\, y)^2 + \frac{4 + \epsilon_2}{3c_0 - 1}(\mathrm{Im}\, y)^2}$$

where ε1 > 1 and ε2 > 0. Theorems 5 and 3 still apply with this norm, but establish the analyticity of w(τ, z) (the solution to the transformed problem) only within ‖√τ‖ < 1; this region does not include the circle |τ| ≤ 1, needed for uniform convergence for all ∆ ∈ [0, +∞). However, the “small circle” results can be applied to the transformed problem, resulting in a compound time transform. First, the time and state variable are changed (as per Theorems 5 and 3), so that the solution w(τ, z) can be shown to be analytic in a region that includes τ = 1; next, a time transformation is applied to the transformed problem, extending the range of convergence, as discussed in Section 3.2. A power series in the second transformed time variable then converges on an interval that includes ∆ ∈ [0, +∞).

Most of the examples discussed so far are the pricing of default-free zero-coupon bonds in term structure models, although we have also considered the callable bond pricing problem of Jarrow, Li, Liu, and Wu (2006). Another potential application of time transformation methods is the pricing of credit derivatives, such as credit default swaps. When calculating prices of such instruments, it is often necessary to calculate quantities such as:

$$f(\Delta, x) = E\left[ e^{-\int_t^{t + \Delta} (r_u + \lambda_u)\, du} \right]$$

where the short interest rate rt and the default intensity λt are both functions of the state variable process Xt. For appropriate rt and λt processes, our methods provide a way of approximating quantities such as f(∆, x), possibly uniformly for all time horizons, even when the closed-form solution is unknown.

Footnote 16: It is still necessary to establish that the coefficients of the PDE and the final condition are sufficiently regular so that the probabilistic problem (i.e., finding the risk-neutral expectation of the discounted payoff of the bond) is equivalent to the partial differential equation problem. These conditions can be verified on a case-by-case basis, using well-known results in the extant literature.

Finally, we note that although some of our methods apply only to single-factor models, many multi-factor problems can be decomposed into a system of single-factor problems. For example, if multiple state variables follow independent processes, and enter the interest rate in an additively separable way, then the problem of pricing a zero-coupon bond in the multi-factor setting is equivalent to the problem of pricing zero-coupon bonds in several single-factor models. If our methods apply to each of the single-factor models, then the price of the bond in the multi-factor model can be approximated, perhaps uniformly in all maturities, even if closed-form bond prices are not known.

6. Conclusion

We have developed the method of time transformations to improve both the range and the rate of convergence of power series representations of solutions to asset pricing or conditional moment problems that arise in a continuous-time setting. In some cases, our methods allow accurate approximation of prices (or conditional moments) for arbitrarily long time horizons. These methods make feasible the rapid calculation of bond prices for many models in which such calculation would otherwise not be practical, and therefore make feasible estimation techniques for non-affine models based on likelihood or minimum distance searches.
We use the term structure models of Ahn, Dittmar, and Gallant (2002) and of Cox, Ingersoll, and Ross (1985), for which bond prices are already known in closed-form, to evaluate our methods. We find that, although naïve power series expansion of the bond price function performs very poorly for long maturities, use of time transformations dramatically improves the accuracy of the series, resulting in uniform convergence for all maturities; comparison to true bond yields shows extreme accuracy with only a few terms in the power series over a very wide range of initial interest rates and bond maturities. We also use the callable bond pricing model of Jarrow, Li, Liu, and Wu (2006), in which prices are not known in closed-form, and show that our methods yield approximations of bond prices in this model that also converge uniformly for all maturities. We also consider other potential applications, such as pricing of credit derivatives.

Possible future work includes extension of our method to multivariate diffusions. Some multivariate diffusion problems can be broken into independent scalar diffusion problems; for this class of multiple diffusion problems, no additional work is needed to apply our methods. However, in general, this is not the case; one state variable may appear in the drift or diffusion coefficient of another state variable, or multiple state variables may not enter the final condition or interest rate function additively. It is possible to construct a large class of models for which the pricing or conditional moment PDE is the same as the pricing PDE for multiple affine diffusions. Since expectations of polynomials of affine diffusions are analytic in the time horizon, at least a partial characterization of the final conditions with analytic moments is possible; as in the univariate case, each final condition with analytic moments corresponds to a term structure model with analytic bond prices.
Such methods remain to be explored in full detail, however.

References

Ahn, D., and B. Gao (1999): “A Parametric Nonlinear Model of Term Structure Dynamics,” Review of Financial Studies, 12, 721–762.
Ahn, D.-H., R. F. Dittmar, and A. R. Gallant (2002): “Quadratic Term Structure Models: Theory and Evidence,” Review of Financial Studies, 15, 243–288.
Aït-Sahalia, Y. (1996): “Testing Continuous-Time Models of the Spot Interest Rate,” Review of Financial Studies, 9, 385–426.
Aït-Sahalia, Y. (1999): “Transition Densities for Interest Rate and Other Nonlinear Diffusions,” Journal of Finance, 54, 1361–1395.
Aït-Sahalia, Y. (2002): “Maximum-Likelihood Estimation of Discretely-Sampled Diffusions: A Closed-Form Approximation Approach,” Econometrica, 70, 223–262.
Aït-Sahalia, Y. (2008): “Closed-Form Likelihood Expansions for Multivariate Diffusions,” Annals of Statistics, 36, 906–937.
Aït-Sahalia, Y., and R. L. Kimmel (2008): “Estimating Affine Multifactor Term Structure Models Using Closed-Form Likelihoods,” NBER Working Paper.
Andersen, T. G., L. Benzoni, and J. Lund (2004): “Stochastic Volatility, Mean Drift, and Jumps in the Short Rate Diffusion: Sources of Steepness, Level and Curvature,” working paper, Northwestern University, University of Minnesota, and Nykredit Bank.
Chan, K. C., G. A. Karolyi, F. A. Longstaff, and A. B. Sanders (1992): “An Empirical Comparison of Alternative Models of the Short-Term Interest Rate,” Journal of Finance, 48, 1209–1227.
Cheridito, P., D. Filipović, and R. L. Kimmel (2007): “Market Price of Risk Specifications for Affine Models: Theory and Evidence,” Journal of Financial Economics, 83, 123–170.
Colton, D. (1979): “The Approximation of Solutions to the Backwards Heat Equation in a Nonhomogeneous Medium,” Journal of Mathematical Analysis and Applications, 72, 418–429.
Cox, J. C., J. E. Ingersoll, and S. A. Ross (1985): “A Theory of the Term Structure of Interest Rates,” Econometrica, 53, 385–408.
Dai, Q., and K. J. Singleton (2000): “Specification Analysis of Affine Term Structure Models,” Journal of Finance, 55, 1943–1978.
Dai, Q., and K. J. Singleton (2002): “Expectations puzzles, time-varying risk premia, and affine models of the term structure,” Journal of Financial Economics, 63, 415–441.
Duarte, J. (2004): “Evaluating an Alternative Risk Preference in Affine Term Structure Models,” Review of Financial Studies, 17, 379–404.
Duffee, G. R. (2002): “Term Premia and Interest Rate Forecasts in Affine Models,” Journal of Finance, 57, 405–443.
Duffie, D., and R. Kan (1996): “A Yield-Factor Model of Interest Rates,” Mathematical Finance, 6, 379–406.
Egorov, A., H. Li, and D. Ng (2008): “A Tale of Two Yield Curves: Modeling the Joint Term Structure of Dollar and Euro Interest Rates,” Discussion paper, West Virginia University, University of Michigan, Cornell University.
Egorov, A., H. Li, and Y. Xu (2001): “Maximum Likelihood Estimation of Time Inhomogeneous Diffusions,” Discussion paper, Cornell University.
Feller, W. (1951): “Two Singular Diffusion Problems,” Annals of Mathematics, 54, 173–182.
Jarrow, R., H. Li, S. Liu, and C. Wu (2006): “Reduced-Form Valuation of Callable Corporate Bonds: Theory and Evidence,” working paper.
Karatzas, I., and S. E. Shreve (1991): Brownian Motion and Stochastic Calculus. Springer-Verlag, New York.
Kimmel, R. (2008): “Complex Times: Asset Pricing and Conditional Moments under Non-Affine Diffusions,” Ohio State University working paper.
Levendorskii, S. (2004a): “Consistency Conditions for Affine Term Structure Models,” Stochastic Processes and Their Applications, 109, 225–261.
Levendorskii, S. (2004b): “Consistency Conditions for Affine Term Structure Models II: Option Pricing under Diffusions with Embedded Jumps,” University of Texas working paper.
Liptser, R. S., and A. N. Shiryaev (2001): Statistics of Random Processes. Springer Verlag, Berlin, second edn.
Mosburger, G., and P. Schneider (2005): “Modelling International Bond Markets with Affine Term Structure Models,” Discussion paper, Vienna University of Economics and Business Administration.
Stroock, D. W., and S. R. S. Varadhan (1979): Multidimensional Diffusion Processes. Springer-Verlag, New York.
Thompson, S. (2004): “Identifying Term Structure Volatility from the LIBOR-Swap Curve,” Discussion paper, Harvard University.
Vasicek, O. (1977): “An Equilibrium Characterization of the Term Structure,” Journal of Financial Economics, 5, 177–188.

A. Appendix: Proofs

This appendix includes proofs of all theorems and corollaries in the main text.

A.1. Proof of Theorem 1

We first show that, for a given value of k ≠ 0 and 0 < r < 1, ∆ satisfies (3.11) if and only if:

$$|\tau_k(\Delta)| < r \qquad (A.1)$$

Note that (3.11) is equivalent to:

$$\left(\exp[-\mathrm{Re}(k\Delta)] - \cos[\mathrm{Im}(k\Delta)]\right)^2 < \cos^2[\mathrm{Im}(k\Delta)] - \left(1 - r^2\right)$$

This follows by multiplying (3.11) through by negative one (which reverses the inequalities), exponentiating and subtracting cos[Im(k∆)] from all three expressions (which preserves the inequalities), and squaring the results (which replaces two inequalities by one). After some rearrangement, this inequality can be written as:

$$\exp[-2\,\mathrm{Re}(k\Delta)] - 2\exp[-\mathrm{Re}(k\Delta)]\cos[\mathrm{Im}(k\Delta)] + 1 < r^2 \qquad (A.2)$$

But note that:

$$|\tau_k(\Delta)| = \sqrt{1 - 2\exp[-\mathrm{Re}(k\Delta)]\cos[\mathrm{Im}(k\Delta)] + \exp[-2\,\mathrm{Re}(k\Delta)]} \qquad (A.3)$$

The quantity inside the square root sign is the left-hand side of (A.2). Therefore, if ∆ satisfies (3.11), it also satisfies (A.1). Since all the steps can be reversed, it follows that if ∆ satisfies (A.1), it must also satisfy (3.11). These results hold in particular for ∆ = ∆k(τ) for some τ ≠ 1. But for any τ ≠ 1, τk(∆k(τ)) = τ, so it follows from the above results that if |τ| < r, then ∆ = ∆k(τ) satisfies (3.11).
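The modulus formula (A.3) and the inversion property τk(∆k(τ)) = τ can be checked numerically. The Python sketch below is illustrative; it assumes the time transformation has the form τk(∆) = 1 − exp(−k∆), which is the form implied by (A.3), with inverse ∆k(τ) = −ln(1 − τ)/k taken on the principal branch of the logarithm. The function names are hypothetical.

```python
import cmath
import math


def tau_k(delta, k):
    """Assumed time transformation: tau = 1 - exp(-k * delta), complex arguments allowed."""
    return 1.0 - cmath.exp(-k * delta)


def delta_k(tau, k):
    """Assumed inverse transformation, using the principal branch of the logarithm."""
    return -cmath.log(1.0 - tau) / k


def abs_tau_formula(delta, k):
    """Right-hand side of (A.3), written in terms of Re(k*delta) and Im(k*delta)."""
    kd = k * delta
    return math.sqrt(1.0
                     - 2.0 * math.exp(-kd.real) * math.cos(kd.imag)
                     + math.exp(-2.0 * kd.real))


# A complex maturity and a (real) rate constant chosen arbitrarily for the check.
delta_example = 0.7 + 0.3j
k_example = 1.2
modulus_direct = abs(tau_k(delta_example, k_example))
modulus_formula = abs_tau_formula(delta_example, k_example)

# Round trip: a point inside the unit circle maps to a maturity and back.
tau_example = 0.4 - 0.2j
tau_round_trip = tau_k(delta_k(tau_example, k_example), k_example)
```

For real, positive k and real maturities, this transformation maps ∆ ∈ [0, +∞) onto τ ∈ [0, 1), which is why convergence of a series in τ on a region containing [0, 1] translates into convergence for all positive maturities.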
Satisfaction of (3.10) follows from the definition of the inverse time transformation:

$$k\Delta_k(\tau) = -\ln(1-\tau) = \begin{cases} -\ln|1-\tau| + \imath \arccos\dfrac{\operatorname{Re}(1-\tau)}{|1-\tau|}, & \operatorname{Im}\tau \geq 0 \\[2ex] -\ln|1-\tau| - \imath \arccos\dfrac{\operatorname{Re}(1-\tau)}{|1-\tau|}, & \operatorname{Im}\tau < 0 \end{cases}$$

The argument to the arccos function is always positive, so it takes values in $[0, \pi/2)$. Holding $|\tau|$ fixed, the smallest possible argument is $\sqrt{1 - |\tau|^2}$, which results when $\operatorname{Re}\tau = |\tau|^2$; the imaginary part of $k\Delta_k(\tau)$ thus takes its maximum and minimum values $\pm\arccos\sqrt{1 - r^2}$ when $\tau = r^2 \pm \imath r\sqrt{1 - r^2}$. Satisfaction of (3.10) follows immediately.

We now note that $\tau_k(\Delta)$ is everywhere analytic in $\Delta$. The inverse transformation $\Delta_k(\tau)$ has a singularity at $\tau = 1$, and, since we choose the logarithm such that $-\pi < \operatorname{Im}\ln(z) \leq +\pi$, it also has a branch cut discontinuity on the portion of the positive real axis $\tau \in [1, +\infty)$. But these points lie outside of $|\tau| < r$, so in this region, $\Delta_k(\tau)$ is analytic in $\tau$. The analyticity of $f(\Delta, x)$ in $\Delta$ given the analyticity of $h(\tau, x)$ in $\tau$, as well as the converse, then follow immediately from the composition of two analytic functions.

A.2. Proof of Corollary 1

The restrictions on $\Delta$ given by (3.12) and (3.13) are simply those of (3.10) and (3.11) with the value of $r$ approaching 1 from below (note that the upper limit on $\operatorname{Re}(k\Delta)$ approaches $+\infty$ in this case). Thus, if $|\tau| < 1$, $\Delta_k(\tau)$ satisfies (3.12) and (3.13), and conversely, if $\Delta$ satisfies (3.12) and (3.13), then $|\tau_k(\Delta)| < r$ for some $r < 1$. Therefore, given analyticity of $h(\tau, x)$ for some $|\tau| < 1$, analyticity of $f(\Delta, x)$ at $\Delta = \Delta_k(\tau)$ follows from Theorem 1. The converse follows analogously.

A.3. Proof of Theorem 2

The condition $|\tau_k(\Delta)| < r$ is equivalent to:

$$\left(\exp[-\operatorname{Re}(k\Delta)] - \cos[\operatorname{Im}(k\Delta)]\right)^2 < r^2 - 1 + \cos^2[\operatorname{Im}(k\Delta)]$$

This follows by substituting in the right-hand side of (A.3) for $|\tau_k(\Delta)|$, squaring both sides (which preserves the inequality), and rearranging. However, for $r > 1$, the right-hand side is strictly positive for any value of $\operatorname{Im}(k\Delta)$.
Consequently, the only requirement for $|\tau_k(\Delta)| < r$ with $r > 1$ is:

$$-\sqrt{r^2 - 1 + \cos^2[\operatorname{Im}(k\Delta)]} < \exp[-\operatorname{Re}(k\Delta)] - \cos[\operatorname{Im}(k\Delta)] < +\sqrt{r^2 - 1 + \cos^2[\operatorname{Im}(k\Delta)]}$$

Note that the left inequality is always satisfied. The right inequality is equivalent to:

$$\operatorname{Re}(k\Delta) > -\ln\left(\cos[\operatorname{Im}(k\Delta)] + \sqrt{r^2 - 1 + \cos^2[\operatorname{Im}(k\Delta)]}\right) \tag{A.4}$$

So if $\Delta$ satisfies (A.4), then $|\tau_k(\Delta)| < r$. Since $w(\tau, x)$ is analytic for all $|\tau| < r$, its power series converges in this region, and converges uniformly in $|\tau| \leq s$ for any $0 < s < r$. The series approximations $f_n(\Delta, x)$ are simply $w_n(\tau, x)$ evaluated at $\tau = \tau_k(\Delta)$, so $f_n(\Delta, x)$ converges in the region of $\Delta$ corresponding to $|\tau| < r$, and uniformly in the region of $\Delta$ corresponding to $|\tau| \leq s$ for any $0 < s < r$.

A.4. Proof of Theorem 3

The function $w(\tau, z)$ also satisfies the conditions of Theorem 2 (with $x$ replaced by $z$). So $w_n(\tau_k(\Delta), z)$ converges to $w(\tau_k(\Delta), z)$ for all $\Delta$ that satisfy (3.18) and for all $z$. But $h_n(\Delta, y)$ and $h(\Delta, y)$ are simply $w_n(\tau, z)$ and $w(\tau, z)$ premultiplied by $\exp(\lambda\Delta)\,\xi(y)$, with $\tau = \tau_k(\Delta)$ and $z = z(\Delta, y) \equiv \sqrt{k}\left[\theta + \left(1 - \tau_{k/2}(\Delta)\right)(y - \theta)\right]$ substituted in. Therefore, for any $y$ and for any $\Delta$ that satisfies (3.18), $h_n(\Delta, y)$ converges to $h(\Delta, y)$.

All that remains is to establish uniform convergence on (3.19). To do this, we note that $\left[1 - \tau_{k/2}(\Delta)\right]^2 = 1 - \tau_k(\Delta)$, from which it follows that $\left|\tau_{k/2}(\Delta)\right| \leq 1 + \sqrt{1 + |\tau_k(\Delta)|}$. Therefore, if $|\tau_k(\Delta)| \leq s$, we have $\left|\tau_{k/2}(\Delta)\right| \leq 1 + \sqrt{1 + s}$, so $\tau_{k/2}(\Delta)$ belongs to a compact set. Since $z(\Delta, y)$ is continuous, $z$ also belongs to a compact set. It follows that $w_n(\tau, z)$ converges uniformly to $w(\tau, z)$. Furthermore, from the restriction on $\lambda$, the value of $\exp(\lambda\Delta)\,\xi(y)$ is bounded. Consequently, $h_n(\Delta, y)$ converges uniformly to $h(\Delta, y)$.

A.5. Proof of Theorem 4

The theorem establishes the existence (and analyticity) of a solution $w(\tau, z)$ to one PDE with final condition, and another solution $h(\Delta, y)$ to a different PDE with final condition.
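The first result, the existence and analyticity of $w(\tau, z)$, is established by application of Theorem 2 in Kimmel (2008). The function:

$$g_w(z) \equiv e^{\frac{\left(z - a\sqrt{2b}\right)^2}{4}}\, g\!\left(\frac{z}{\sqrt{2b}}\right)$$

satisfies the conditions of that theorem, so there exists a $w(\tau, z)$, defined and analytic for all complex $z$ and $\tau$ such that $\left\|\sqrt{\tau}\right\| < 1$, that satisfies:

$$\frac{\partial w}{\partial \tau}(\tau, z) = \frac{1}{2}\frac{\partial^2 w}{\partial z^2}(\tau, z) \tag{A.5}$$

$$w(0, z) = g_w(z) \tag{A.6}$$

Then $h(\Delta, y)$ is defined and analytic for all complex $y$ and $\Delta$ such that $\left\|\sqrt{\tau_{2b}(\Delta)}\right\| < 1$, and:

$$\frac{\partial h}{\partial \Delta}(\Delta, y) = e^{-\frac{b}{2}(y-a)^2 - \left(\frac{b}{2}+d\right)\Delta}\left[-\left(\frac{b}{2} + d\right) w(\tau, z) + 2b e^{-2b\Delta}\frac{\partial w}{\partial \tau}(\tau, z) - \sqrt{2b}\, b\, e^{-b\Delta}(y - a)\frac{\partial w}{\partial z}(\tau, z)\right]$$

$$\frac{\partial^2 h}{\partial y^2}(\Delta, y) = e^{-\frac{b}{2}(y-a)^2 - \left(\frac{b}{2}+d\right)\Delta}\left[\left(b^2 (y-a)^2 - b\right) w(\tau, z) + 2b e^{-2b\Delta}\frac{\partial^2 w}{\partial z^2}(\tau, z) - (2b)^{\frac{3}{2}} (y - a)\, e^{-b\Delta}\frac{\partial w}{\partial z}(\tau, z)\right]$$

where $\tau = \tau_{2b}(\Delta)$ and $z = \sqrt{2b}\left[a + e^{-b\Delta}(y - a)\right]$. Substituting these into the PDE in the theorem statement, and taking advantage of the fact that $w(\tau, z)$ is a solution of (A.5) and (A.6), it can be seen that $h(\Delta, y)$ is indeed a solution. Furthermore, $h(\Delta, y)$ satisfies the final condition, since:

$$h(0, y) = e^{-\frac{b}{2}(y-a)^2}\, w\!\left(\tau_{2b}(0), \sqrt{2b}\,y\right) = e^{-\frac{b}{2}(y-a)^2}\, w\!\left(0, \sqrt{2b}\,y\right) = e^{-\frac{b}{2}(y-a)^2}\, g_w\!\left(\sqrt{2b}\,y\right) = g(y)$$

So $h(\Delta, y)$ satisfies both the general PDE and the final condition.

A.6. Proof of Theorem 5

The proof is similar to the proof of Theorem 4, but uses Lemma 5 instead of Theorem 2 from Kimmel (2008).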
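The final conditions:

$$g_{w1}(z) \equiv e^{\frac{z^2}{4}}\, g_1\!\left(\frac{z}{\sqrt{2b}}\right) \qquad g_{w2}(z) \equiv e^{\frac{z^2}{4}}\, g_2\!\left(\frac{z}{\sqrt{2b}}\right)$$

satisfy the conditions of Lemma 5, so there exist $w_1(\tau, z)$ and $w_2(\tau, z)$, analytic for all $z$ and all $\left\|\sqrt{\tau}\right\| < 1$, that satisfy:

$$\frac{\partial w_1}{\partial \tau}(\tau, z) = \frac{1 - \sqrt{1+8a}}{2z}\frac{\partial w_1}{\partial z}(\tau, z) + \frac{1}{2}\frac{\partial^2 w_1}{\partial z^2}(\tau, z) \tag{A.7}$$

$$\frac{\partial w_2}{\partial \tau}(\tau, z) = \frac{1 + \sqrt{1+8a}}{2z}\frac{\partial w_2}{\partial z}(\tau, z) + \frac{1}{2}\frac{\partial^2 w_2}{\partial z^2}(\tau, z) \tag{A.8}$$

$$w_1(0, z) = g_{w1}(z) \tag{A.9}$$

$$w_2(0, z) = g_{w2}(z) \tag{A.10}$$

Then $h(\Delta, y)$ is defined and analytic for all complex $y \neq 0$ and $\Delta$ such that $\left\|\sqrt{\tau_{2b}(\Delta)}\right\| < 1$, and:

$$\begin{aligned}
\frac{\partial h}{\partial \Delta}(\Delta, y) ={}& e^{-\frac{b}{2}y^2 - \left(\frac{b}{2}+d\right)\Delta}\left(\frac{z}{\sqrt{2b}}\right)^{\frac{1-\sqrt{1+8a}}{2}}\left[2b e^{-2b\Delta}\frac{\partial w_1}{\partial \tau}(\tau, z) - bz\frac{\partial w_1}{\partial z}(\tau, z) - \left[b\left(1 - \frac{\sqrt{1+8a}}{2}\right) + d\right] w_1(\tau, z)\right] \\
&+ e^{-\frac{b}{2}y^2 - \left(\frac{b}{2}+d\right)\Delta}\left(\frac{z}{\sqrt{2b}}\right)^{\frac{1+\sqrt{1+8a}}{2}}\left[2b e^{-2b\Delta}\frac{\partial w_2}{\partial \tau}(\tau, z) - bz\frac{\partial w_2}{\partial z}(\tau, z) - \left[b\left(1 + \frac{\sqrt{1+8a}}{2}\right) + d\right] w_2(\tau, z)\right]
\end{aligned}$$

$$\begin{aligned}
\frac{\partial^2 h}{\partial y^2}(\Delta, y) ={}& e^{-\frac{b}{2}y^2 - \left(\frac{b}{2}+d\right)\Delta}\left(\frac{z}{\sqrt{2b}}\right)^{\frac{1-\sqrt{1+8a}}{2}}\left[2b\left(e^{-2b\Delta}\frac{1 - \sqrt{1+8a}}{z} - z\right)\frac{\partial w_1}{\partial z}(\tau, z) + 2b e^{-2b\Delta}\frac{\partial^2 w_1}{\partial z^2}(\tau, z) + \left(b^2 y^2 - 2b\left(1 - \frac{\sqrt{1+8a}}{2}\right) + \frac{2a}{y^2}\right) w_1(\tau, z)\right] \\
&+ e^{-\frac{b}{2}y^2 - \left(\frac{b}{2}+d\right)\Delta}\left(\frac{z}{\sqrt{2b}}\right)^{\frac{1+\sqrt{1+8a}}{2}}\left[2b\left(e^{-2b\Delta}\frac{1 + \sqrt{1+8a}}{z} - z\right)\frac{\partial w_2}{\partial z}(\tau, z) + 2b e^{-2b\Delta}\frac{\partial^2 w_2}{\partial z^2}(\tau, z) + \left(b^2 y^2 - 2b\left(1 + \frac{\sqrt{1+8a}}{2}\right) + \frac{2a}{y^2}\right) w_2(\tau, z)\right]
\end{aligned}$$

where $\tau = \tau_{2b}(\Delta)$ and $z = \sqrt{2b}\, e^{-b\Delta} y$. Substituting these into the PDE and taking advantage of the fact that $w_1(\tau, z)$ and $w_2(\tau, z)$ are the solutions of (A.7) through (A.10), it can be seen that $h(\Delta, y)$ is a solution to the general PDE in the theorem statement. Furthermore, $h(\Delta, y)$ satisfies the final condition, since:

$$\begin{aligned}
h(0, y) &= e^{-\frac{b}{2}y^2}\left[y^{\frac{1-\sqrt{1+8a}}{2}}\, w_1\!\left(\tau_{2b}(0), \sqrt{2b}\,y\right) + y^{\frac{1+\sqrt{1+8a}}{2}}\, w_2\!\left(\tau_{2b}(0), \sqrt{2b}\,y\right)\right] \\
&= e^{-\frac{b}{2}y^2}\left[y^{\frac{1-\sqrt{1+8a}}{2}}\, w_1\!\left(0, \sqrt{2b}\,y\right) + y^{\frac{1+\sqrt{1+8a}}{2}}\, w_2\!\left(0, \sqrt{2b}\,y\right)\right] \\
&= e^{-\frac{b}{2}y^2}\left[y^{\frac{1-\sqrt{1+8a}}{2}}\, g_{w1}\!\left(\sqrt{2b}\,y\right) + y^{\frac{1+\sqrt{1+8a}}{2}}\, g_{w2}\!\left(\sqrt{2b}\,y\right)\right] \\
&= g(y)
\end{aligned}$$

So $h(\Delta, y)$ satisfies both the general PDE and the final condition.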
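The substitution step in A.6 can be spot-checked with finite differences. The sketch below is illustrative only and rests on three assumptions that are not taken from the theorem statement: (i) the PDE being verified is $\partial h/\partial\Delta = \tfrac{1}{2}\,\partial^2 h/\partial y^2 - \left(a/y^2 + \tfrac{1}{2}b^2y^2 + d\right)h$, which is the form implied by the derivative expressions above; (ii) $\tau_{2b}(\Delta) = 1 - e^{-2b\Delta}$; and (iii) the hypothetical test solution $w_1(\tau, z) = z^2 + (c+1)\tau$ with $c = 1 - \sqrt{1+8a}$, which satisfies (A.7), together with $w_2 = 0$. The names `h` and `pde_residual` are introduced here for the example.

```python
import math


def h(delta: float, y: float, a: float, b: float, d: float) -> float:
    """h built as in A.6 from the test solution w1 = z**2 + (c + 1) * tau and w2 = 0."""
    s = math.sqrt(1.0 + 8.0 * a)
    p1 = (1.0 - s) / 2.0                      # exponent attached to w1
    c = 1.0 - s                               # numerator of the first-order term in (A.7)
    tau = 1.0 - math.exp(-2.0 * b * delta)    # assumed form of tau_{2b}(delta)
    z = math.sqrt(2.0 * b) * math.exp(-b * delta) * y
    w1 = z * z + (c + 1.0) * tau              # hypothetical solution of (A.7)
    prefactor = math.exp(-0.5 * b * y * y - (0.5 * b + d) * delta)
    return prefactor * (z / math.sqrt(2.0 * b)) ** p1 * w1


def pde_residual(delta: float, y: float, a: float, b: float, d: float,
                 eps: float = 1e-4) -> float:
    """Central-difference residual of h_Delta - 0.5*h_yy + (a/y^2 + b^2*y^2/2 + d)*h."""
    h0 = h(delta, y, a, b, d)
    h_delta = (h(delta + eps, y, a, b, d) - h(delta - eps, y, a, b, d)) / (2.0 * eps)
    h_yy = (h(delta, y + eps, a, b, d) - 2.0 * h0 + h(delta, y - eps, a, b, d)) / eps**2
    potential = a / y**2 + 0.5 * b**2 * y**2 + d
    return h_delta - 0.5 * h_yy + potential * h0


# Arbitrary illustrative parameter values (not taken from the paper).
residual = pde_residual(delta=0.7, y=1.3, a=1.0, b=0.5, d=0.02)
print(f"h = {h(0.7, 1.3, 1.0, 0.5, 0.02):.6f}, PDE residual = {residual:.2e}")
```

A residual that is small relative to the value of `h` is consistent with the reconstructed derivative expressions; it is a numerical sanity check, not a substitute for the proof.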
[Figure 1 plot: "Projection of Circles $|\tau| = r$ onto $\Delta$ for Basic Time Transformation ($k = 0.1$)". Horizontal axis: $\operatorname{Re}(\Delta)$; vertical axis: $\operatorname{Im}(\Delta)$. Curves shown for $r = 0.40$, $0.70$, $0.90$, $0.97$, and $0.99$; CIR singularities marked.]

Figure 1: This figure shows the position of circles $|\tau| = r$ with radii $0 < r < 1$ projected onto the plane of $\Delta$, with the parameter of transformation in (3.1) given by $k = 0.1$. As shown, the projections of the circles in $\tau$ onto $\Delta$ are elongated shapes which extend farther in the positive real direction than in any other direction. In general, the direction of elongation is the same as the direction of $k$ in the complex plane. If a function is analytic in $\Delta$ at every point inside one of the circles in $\tau$, then the power series representation of that function in $\tau$ converges within that circle. The figure also shows the singularity points for bond prices in the model of Cox, Ingersoll, and Ross (1985), with $\kappa = 0.5$ and $\sigma = 0.15$. (The value of $\theta$ in this model does not affect the location of the singularities.) For these parameter values, the singularities nearest the origin are located at $\Delta = -5.865 \pm 5.784\imath$, and, as a result, the power series representation of the bond price in $\Delta$ converges only for maturities of up to 8.237 years. However, these singularities lie outside all of the circles shown; therefore, the power series representation of the bond price in $\tau = \tau_k(\Delta)$ (for $k = 0.1$) converges in a region at least as large as the region enclosed by the largest circle shown, which includes maturities of more than 50 years.

[Figure 2 plot: "Required Region of Analyticity with and without Time Transformation ($k = 0.1$)". Horizontal axis: $\operatorname{Re}(\Delta)$; vertical axis: $\operatorname{Im}(\Delta)$. Curves shown: without transformation, with transformation ($k = 0.1$); CIR singularities marked.]

Figure 2: This figure shows a circle in $\Delta$ with $r = 50$ (i.e., a circle without the time transformation), and a circle in $\tau$ with an approximate radius $r = 0.9866$ (i.e., a circle with the time transformation, using $k = 0.1$), projected onto the plane of $\Delta$.
As shown, the circle in $\tau$ is completely contained within and makes up only a small portion of the circle in $\Delta$; however, both include $\Delta \in [0, 50)$. The use of the time transformation therefore effectively extends the interval of convergence of a power series. If the function being represented has a singularity inside the circle in $\Delta$, but outside the circle in $\tau$, then a power series in $\Delta$ does not converge on the interval $\Delta \in [0, +50)$. However, a power series in $\tau$ converges on the interval $\tau \in [0, 0.9866)$ (the ending point being approximate), which corresponds to $\Delta \in [0, +50)$; the value of $f(\Delta, x)$ on this interval can then be found from $w(\tau, x)$ by applying (3.3). The figure also shows the points of singularity of the bond price function for the model of Cox, Ingersoll, and Ross (1985), with $\kappa = 0.5$ and $\sigma = 0.15$. As shown, there are no singularities within the circle in $\tau$, so that convergence of the series representation (in $\tau$) of bond prices is guaranteed for maturities of up to at least 50 years; however, there are eight singularities within the circle in $\Delta$, so that a series (in $\Delta$) converges only for a much smaller range of maturities. (Specifically, series in $\Delta$ converge on $\Delta \in [0, +8.237)$.)

[Figure 3 plot: "Projection of Circles $|\tau| = r$ onto $\Delta$ for Basic Time Transformation ($k = 0.2$)". Horizontal axis: $\operatorname{Re}(\Delta)$; vertical axis: $\operatorname{Im}(\Delta)$. Curves shown for $r = 0.70000$, $0.90000$, $0.99000$, $0.99900$, and $0.99999$; CIR singularities marked.]

Figure 3: This figure shows the position of circles $|\tau| = r$ with radii $0 < r < 1$ projected onto the plane of $\Delta$, with the parameter of transformation in (3.1) given by $k = 0.2$. Note that the circles in $\tau$ are more elongated and follow the positive real axis more closely than in the $k = 0.1$ case, shown in Figure 1. A power series in $\tau$ therefore always converges for at least as large a range of positive real values of $\Delta$ in the $k = 0.2$ case as in the $k = 0.1$ case.
Increasing the (positive real) value of $k$ never decreases the interval of convergence for positive real $\Delta$, and often increases it. The figure also shows the points of singularity of the bond price function for the model of Cox, Ingersoll, and Ross (1985), with $\kappa = 0.5$ and $\sigma = 0.15$. Note that all of the singularities lie outside all of the circles shown, so that, with $k = 0.2$, convergence for maturities of at least 60 years can be established. As can be seen from Figure 1, the interval of convergence for a series based on $k = 0.1$ is substantially smaller.

[Figure 4 plot: "Projection of Circles $|\tau| = r$ onto $\Delta$ for Basic Time Transformation". Horizontal axis: $\operatorname{Re}(\Delta)$; vertical axis: $\operatorname{Im}(\Delta)$. Curves shown for $k = 0.05$, $0.10$, $0.15$, $0.20$, and $0.25$.]

Figure 4: This figure shows circles in $\tau$ projected onto the plane of $\Delta$, for various combinations of $r$ and $k$. For each choice of $k$ shown in the legend, the corresponding value of $r$ was chosen so that all circles reach the same maximum value on the positive real axis, of approximately $\Delta = 30.4$. The radii are approximately $r = 0.6105$, $r = 0.9066$, $r = 0.9792$, $r = 0.9954$, and $r = 0.9990$ for $k = 0.05$ through $k = 0.25$, respectively. As shown, the regions enclosed by the circles are smaller for larger values of $k$, even though all circles reach the same maximum positive real value $\Delta = 30.4$. Use of larger values of $k$ can therefore effectively increase the range of convergence of a power series. If there are singularities inside the $k = 0.05$ circle but outside the $k = 0.25$ circle, then use of the basic time transformation ensures convergence of a power series on $\Delta \in [0, +\infty)$ if $k = 0.25$, but only for a smaller range of $\Delta$ if $k = 0.05$.
[Figure 5 plot: "Projection of Circle $|\tau| = 1$ onto $\Delta$ for Basic Time Transformation". Horizontal axis: $\operatorname{Re}(\Delta)$; vertical axis: $\operatorname{Im}(\Delta)$. Curves shown for $k = 0.05$, $0.10$, $0.15$, $0.20$, and $0.25$.]

Figure 5: This figure shows the position of the circles $|\tau| = 1$ projected onto the plane of $\Delta$, for different values of the parameter of transformation of (3.1). As shown, the unit circle projections onto $\Delta$ are elongated shapes opening up toward the right (i.e., toward large positive real values) in the complex plane of $\Delta$. In general, the direction of the opening is the same as the direction of $k$ in the complex plane. If a function is analytic in $\Delta$ at every point inside one of the circles in $\tau$ (i.e., to the right of the corresponding shapes in the figure), then the power series in $\tau$ converges within that region. For positive values of $k$, all circles in $\tau$ with radius equal to one include $\Delta \in [0, +\infty)$.

[Figure 6 plot: "Projection of Circle $|\tau| = r$ onto $\Delta$ for Basic Time Transformation ($k = 0.15$)". Horizontal axis: $\operatorname{Re}(\Delta)$; vertical axis: $\operatorname{Im}(\Delta)$. Curves shown for $r = 1.01$, $5.00$, and $25.00$; CIR singularities marked.]

Figure 6: This figure shows the position of circles $|\tau| = r$ projected onto the plane of $\Delta$, where the parameter of the transformation of (3.1) is $k = 0.15$. As shown, the projections of the circles in $\tau$ onto $\Delta$ are periodic functions of the imaginary part of $\Delta$. Points to the right of the curves are inside the corresponding circles in $\tau$, and points to the left are outside the same circles. If a function is analytic within the indicated circle in $\tau$, then it is also analytic as a function of $\Delta$ in the indicated region. Note that all the circles shown include the interval $\Delta \in [0, +\infty)$. However, analyticity of a function to the right of one of the curves shown is not sufficient to ensure analyticity within the indicated circle in $\tau$, and therefore also not sufficient to establish uniform convergence of a power series on $\Delta \in [0, +\infty)$; analyticity at $\tau = 1$ is also required, but this point does not correspond to any value of $\Delta$.
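For radii $r > 1$, the boundary between points inside and outside a circle in $\tau$ is characterized by condition (A.4) in the proof of Theorem 2. The sketch below compares (A.4) with a direct evaluation of $|\tau_k(\Delta)| < r$ on a grid of complex $\Delta$ values. It assumes $\tau_k(\Delta) = 1 - \exp(-k\Delta)$, the form implied by (A.3); definition (3.1) is not reproduced in this section, and the plotted figures may rest on a differently parameterized transformation, so the code is a check of (A.4) only and is not meant to reproduce the figures. The function names and the grid are hypothetical.

```python
import cmath
import math


def tau_k(delta: complex, k: float) -> complex:
    """Assumed basic time transformation: tau = 1 - exp(-k * delta)."""
    return 1.0 - cmath.exp(-k * delta)


def satisfies_a4(delta: complex, k: float, r: float) -> bool:
    """Condition (A.4), valid for r > 1."""
    kd = k * complex(delta)
    c = math.cos(kd.imag)
    # For r > 1 the square root exceeds |c|, so the logarithm's argument is positive.
    return kd.real > -math.log(c + math.sqrt(r * r - 1.0 + c * c))


def count_disagreements(k: float, r: float) -> int:
    """Count grid points where (A.4) and the direct test |tau_k(delta)| < r differ."""
    mismatches = 0
    for re_part in range(-30, 41, 5):
        for im_part in range(-80, 81, 7):
            delta = complex(re_part, im_part)
            direct = abs(tau_k(delta, k)) < r
            if direct != satisfies_a4(delta, k, r):
                mismatches += 1
    return mismatches


for radius in (1.01, 5.0, 25.0):
    print(radius, count_disagreements(k=0.15, r=radius))
```

Because (A.4) depends on $\operatorname{Im}(k\Delta)$ only through a cosine, the boundary repeats with period $2\pi/k$ in the imaginary direction, which is the periodicity noted in the caption.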
Table 1: Convergence Properties of Bond Prices in Several Models

| Model | Interest Rate Process (Risk-Neutral) | Canonical PDE Coefficient | Final Condition | Singularities | Range of Power Series Convergence |
|---|---|---|---|---|---|
| Vas | $dr = \kappa(\theta - r)\,dt + \sigma\,dW$ | [not recoverable from the extracted text] | [not recoverable from the extracted text] | None | $\Delta \in [0, +\infty)$ |
| CIR | $dr = \kappa(\theta - r)\,dt + \sigma\sqrt{r}\,dW$ | [not recoverable from the extracted text] | [not recoverable from the extracted text] | $\Delta = \dfrac{\ln\left[1 + \frac{\kappa}{\sigma^2}\left(\kappa - \sqrt{\kappa^2 + 2\sigma^2}\right)\right] + (2n+1)\pi\imath}{\sqrt{\kappa^2 + 2\sigma^2}},\ n \in \mathbb{N}$ | $\Delta \in \left[0, \dfrac{\sqrt{\left(\ln\left[1 + \frac{\kappa}{\sigma^2}\left(\kappa - \sqrt{\kappa^2 + 2\sigma^2}\right)\right]\right)^2 + \pi^2}}{\sqrt{\kappa^2 + 2\sigma^2}}\right)$ |
| ADG | $r = x^2 + d$, $dx = \kappa(\theta - x)\,dt + \sigma\,dW$ | [not recoverable from the extracted text] | [not recoverable from the extracted text] | $\Delta = \dfrac{\ln\left[1 + \frac{\kappa}{\sigma^2}\left(\kappa - \sqrt{\kappa^2 + 2\sigma^2}\right)\right] + (2n+1)\pi\imath}{2\sqrt{\kappa^2 + 2\sigma^2}},\ n \in \mathbb{N}$ | $\Delta \in \left[0, \dfrac{\sqrt{\left(\ln\left[1 + \frac{\kappa}{\sigma^2}\left(\kappa - \sqrt{\kappa^2 + 2\sigma^2}\right)\right]\right)^2 + \pi^2}}{2\sqrt{\kappa^2 + 2\sigma^2}}\right)$ |
| AG | $dr = \kappa r(\theta - r)\,dt + \sigma r^{3/2}\,dW$ | [not recoverable from the extracted text] | [not recoverable from the extracted text] | $\Delta = 0$ | $\Delta = 0$ |

This table shows information relevant to the convergence properties of the power series representations of bond prices in four term structure models. In all four models, bond prices are known in closed form. The first column specifies the model; “Vas” denotes the model of Vasicek (1977), “CIR” refers to Cox, Ingersoll, and Ross (1985), “ADG” denotes the model of Ahn, Dittmar, and Gallant (2002), and “AG” refers to Ahn and Gao (1999). The second column specifies the interest rate processes in the respective models. The next two columns show the coefficients and final conditions of the pricing PDE for each model, when expressed in the canonical form of (2.8) and (2.9). The next column shows the location of singularities in the bond price, as a function of $\Delta$. The last column shows the interval of convergence of the power series for positive values of $\Delta$. Note that in the Vas model, there is no singularity anywhere in the bond price function; the power series representation of bond prices therefore converges for all maturities. In the AG model, there is a singularity at $\Delta = 0$, and consequently, a power series representation of bond prices does not converge for any non-zero maturity. For the CIR and ADG models, the bond price function has singularities at complex values of $\Delta$, which prevent convergence of a series representation for large positive values of $\Delta$. The range of convergence for these two models can be extended by the use of time transformation methods; uniform convergence in maturity can also be established under these two models and under the Vas model using time transformations.

Table 2: Bond Yield Approximations in ADG Model

| Maturity (years) | Exact yield | Order 2 approx. in τ: yield | Order 2 approx. in τ: rel. error | Order 4 approx. in τ: yield | Order 4 approx. in τ: rel. error | Order 8 approx. in ∆: yield | Order 8 approx. in ∆: rel. error |
|---|---|---|---|---|---|---|---|
| Typical initial interest rate: r = 6%, x = +0.226223 | | | | | | | |
| 1 | 5.850026% | 5.850060% | 5.75 × 10⁻⁶ | 5.850026% | 4.25 × 10⁻¹⁰ | 5.850024% | −3.31 × 10⁻⁷ |
| 5 | 5.354871% | 5.354894% | 4.37 × 10⁻⁶ | 5.354871% | 7.53 × 10⁻¹⁰ | 5.366424% | 2.16 × 10⁻³ |
| 10 | 5.112256% | 5.112268% | 2.31 × 10⁻⁶ | 5.112257% | 4.01 × 10⁻¹⁰ | N/A | N/A |
| 20 | 4.967059% | 4.967064% | 1.19 × 10⁻⁶ | 4.967059% | 2.06 × 10⁻¹⁰ | N/A | N/A |
| 50 | 4.878684% | 4.878686% | 4.83 × 10⁻⁷ | 4.878684% | 8.39 × 10⁻¹¹ | N/A | N/A |
| 100 | 4.849224% | 4.849225% | 2.43 × 10⁻⁷ | 4.849224% | 4.22 × 10⁻¹¹ | N/A | N/A |
| 1000 | 4.822709% | 4.822709% | 2.44 × 10⁻⁸ | 4.822709% | 4.25 × 10⁻¹² | N/A | N/A |
| 10000 | 4.820058% | 4.820058% | 2.44 × 10⁻⁹ | 4.820058% | 4.25 × 10⁻¹³ | N/A | N/A |
| Typical initial interest rate: r = 6%, x = −0.226223 | | | | | | | |
| 1 | 3.255580% | 3.255598% | 5.64 × 10⁻⁶ | 3.255580% | 3.10 × 10⁻¹⁰ | 3.254145% | −4.41 × 10⁻⁴ |
| 5 | 2.593916% | 2.593937% | 8.21 × 10⁻⁶ | 2.593916% | 1.36 × 10⁻⁹ | −54.71145% | −2.21 × 10¹ |
| 10 | 3.510528% | 3.510540% | 3.33 × 10⁻⁶ | 3.510528% | 5.78 × 10⁻¹⁰ | −84.53861% | −2.51 × 10¹ |
| 20 | 4.156640% | 4.156646% | 1.42 × 10⁻⁶ | 4.156640% | 2.46 × 10⁻¹⁰ | −70.68393% | −1.80 × 10¹ |
| 50 | 4.554494% | 4.554496% | 5.17 × 10⁻⁷ | 4.554494% | 8.99 × 10⁻¹¹ | −43.11839% | −1.05 × 10¹ |
| 100 | 4.687128% | 4.687130% | 2.51 × 10⁻⁷ | 4.687128% | 4.37 × 10⁻¹¹ | −27.13691% | −6.79 × 10⁰ |
| 1000 | 4.806500% | 4.806500% | 2.45 × 10⁻⁸ | 4.806500% | 4.26 × 10⁻¹² | −4.558774% | −1.95 × 10⁰ |
| 10000 | 4.818437% | 4.818437% | 2.44 × 10⁻⁹ | 4.818437% | 4.25 × 10⁻¹³ | −0.640115% | −1.13 × 10⁰ |
| High initial interest rate: r = 600%, x = +2.44769 | | | | | | | |
| 1 | 393.2333% | 393.2344% | 2.80 × 10⁻⁶ | 393.2333% | 7.53 × 10⁻¹⁰ | −596.6695% | −2.52 × 10⁰ |
| 5 | 135.6025% | 135.6026% | 2.82 × 10⁻⁷ | 135.6025% | 5.86 × 10⁻¹¹ | −385.3111% | −3.84 × 10⁰ |
| 10 | 71.67386% | 71.67387% | 1.71 × 10⁻⁷ | 71.67386% | 3.01 × 10⁻¹¹ | −248.7733% | −4.47 × 10⁰ |
| 20 | 38.29582% | 38.29583% | 1.54 × 10⁻⁷ | 38.29582% | 2.67 × 10⁻¹¹ | −152.2866% | −4.98 × 10⁰ |
| 50 | 18.21030% | 18.21030% | 1.29 × 10⁻⁷ | 18.21030% | 2.25 × 10⁻¹¹ | −75.61808% | −5.15 × 10⁰ |
| 100 | 11.51503% | 11.51503% | 1.02 × 10⁻⁷ | 11.51503% | 1.78 × 10⁻¹¹ | −43.36144% | −4.77 × 10⁰ |
| 1000 | 5.489290% | 5.489290% | 2.15 × 10⁻⁸ | 5.489290% | 3.73 × 10⁻¹² | −6.178866% | −2.13 × 10⁰ |
| 10000 | 4.886716% | 4.886716% | 2.41 × 10⁻⁹ | 4.886716% | 4.19 × 10⁻¹³ | −0.802100% | −1.16 × 10⁰ |
| High initial interest rate: r = 600%, x = −2.44769 | | | | | | | |
| 1 | 365.1619% | 365.1622% | 9.19 × 10⁻⁷ | 365.1619% | 1.55 × 10⁻¹⁰ | −627.5459% | −2.72 × 10⁰ |
| 5 | 105.7295% | 105.7295% | 1.37 × 10⁻⁷ | 105.7295% | 1.88 × 10⁻¹¹ | −391.1627% | −4.70 × 10⁰ |
| 10 | 54.34346% | 54.34348% | 2.08 × 10⁻⁷ | 54.34346% | 3.55 × 10⁻¹¹ | −251.6678% | −5.63 × 10⁰ |
| 20 | 29.52725% | 29.52725% | 1.99 × 10⁻⁷ | 29.52725% | 3.47 × 10⁻¹¹ | −153.7253% | −6.21 × 10⁰ |
| 50 | 14.70262% | 14.70263% | 1.60 × 10⁻⁷ | 14.70262% | 2.79 × 10⁻¹¹ | −76.19144% | −6.18 × 10⁰ |
| 100 | 9.761194% | 9.761195% | 1.21 × 10⁻⁷ | 9.761194% | 2.10 × 10⁻¹¹ | −43.64775% | −5.47 × 10⁰ |
| 1000 | 5.313906% | 5.313906% | 2.22 × 10⁻⁸ | 5.313906% | 3.85 × 10⁻¹² | −6.207464% | −2.17 × 10⁰ |
| 10000 | 4.869178% | 4.869178% | 2.42 × 10⁻⁹ | 4.869178% | 4.21 × 10⁻¹³ | −0.804959% | −1.17 × 10⁰ |

This table shows bond yields for the model of Ahn, Dittmar, and Gallant (2002), calculated exactly, by series approximation in $\Delta$, and by series approximation in $\tau$. For series in $\tau$, approximations including terms up to order $\tau^2$ and $\tau^4$ are included; for series in $\Delta$, terms up to order $\Delta^8$ are included. The parameters used are $\kappa = 0.4995$, $\theta = 0.1860$, $\sigma = 0.0812$, and $\phi = 0.0088$. For these parameter values, the unconditional mean and variance of the interest rate process are 5% and 0.001, respectively; the unconditional kurtosis of the interest rate process is 5. For each approximation, the relative error is shown, i.e., the approximation error divided by the exact yield. Since each possible value of the interest rate corresponds to two different values of the state variable, we show results for four different initial values of the state variable, two corresponding to an interest rate of 6% and two corresponding to an interest rate of 600%. As shown, the approximations in $\tau$ are highly accurate with only a small number of terms, across an extremely wide range of initial interest rates and maturities. By contrast, the approximations in $\Delta$, even with a larger number of terms, are accurate only for very short maturities, and when the initial interest rate is not very large. For these parameter values, the series in $\Delta$ diverges for maturities greater than approximately 5.24 years. Entries of “N/A” indicate that the bond price approximation is zero or negative, so that the corresponding yield approximation is not real-valued.

Table 3: Bond Yield Approximations in CIR Model

| Maturity (years) | Exact yield | Order 2 approx. in τ: yield | Order 2 approx. in τ: rel. error | Order 4 approx. in τ: yield | Order 4 approx. in τ: rel. error | Order 8 approx. in ∆: yield | Order 8 approx. in ∆: rel. error |
|---|---|---|---|---|---|---|---|
| Typical initial interest rate: r = 6% | | | | | | | |
| 1 | 6.409818% | 6.416892% | 1.10 × 10⁻³ | 6.409823% | 7.82 × 10⁻⁷ | 6.409817% | −1.88 × 10⁻⁷ |
| 5 | 7.124507% | 7.138838% | 2.01 × 10⁻³ | 7.124555% | 6.82 × 10⁻⁶ | 6.751295% | −5.24 × 10⁻² |
| 10 | 7.379918% | 7.388510% | 1.16 × 10⁻³ | 7.379951% | 4.47 × 10⁻⁶ | −15.58451% | −3.11 × 10⁰ |
| 20 | 7.523945% | 7.528294% | 5.78 × 10⁻⁴ | 7.523962% | 2.24 × 10⁻⁶ | −35.51708% | −5.72 × 10⁰ |
| 50 | 7.611073% | 7.612813% | 2.29 × 10⁻⁴ | 7.611080% | 8.85 × 10⁻⁷ | −28.93838% | −4.80 × 10⁰ |
| 100 | 7.640116% | 7.640986% | 1.14 × 10⁻⁴ | 7.640120% | 4.41 × 10⁻⁷ | −20.01544% | −3.62 × 10⁰ |
| 1000 | 7.666256% | 7.666343% | 1.13 × 10⁻⁵ | 7.666256% | 4.39 × 10⁻⁸ | −3.842947% | −1.50 × 10⁰ |
| 10000 | 7.668869% | 7.668878% | 1.13 × 10⁻⁶ | 7.668870% | 4.39 × 10⁻⁹ | −0.568490% | −1.07 × 10⁰ |
| Low initial interest rate: r = 0.06% | | | | | | | |
| 1 | 1.749033% | 1.755732% | 3.83 × 10⁻³ | 1.749037% | 2.62 × 10⁻⁶ | 1.749034% | 1.07 × 10⁻⁶ |
| 5 | 5.003263% | 5.017505% | 2.85 × 10⁻³ | 5.003311% | 9.61 × 10⁻⁶ | 5.416320% | 8.26 × 10⁻² |
| 10 | 6.246237% | 6.254826% | 1.38 × 10⁻³ | 6.246270% | 5.27 × 10⁻⁶ | N/A | N/A |
| 20 | 6.954521% | 6.958871% | 6.25 × 10⁻⁴ | 6.954538% | 2.42 × 10⁻⁶ | N/A | N/A |
| 50 | 7.383299% | 7.385039% | 2.36 × 10⁻⁴ | 7.383306% | 9.12 × 10⁻⁷ | −24.30220% | −4.29 × 10⁰ |
| 100 | 7.526229% | 7.527099% | 1.16 × 10⁻⁴ | 7.526233% | 4.47 × 10⁻⁷ | −18.86522% | −3.51 × 10⁰ |
| 1000 | 7.654867% | 7.654954% | 1.14 × 10⁻⁵ | 7.654867% | 4.40 × 10⁻⁸ | −3.783540% | −1.49 × 10⁰ |
| 10000 | 7.667731% | 7.667739% | 1.13 × 10⁻⁶ | 7.667731% | 4.39 × 10⁻⁹ | −0.563010% | −1.07 × 10⁰ |
| High initial interest rate: r = 600% | | | | | | | |
| 1 | 472.4884% | 472.6014% | 2.39 × 10⁻⁴ | 472.4887% | 8.09 × 10⁻⁷ | −480.3992% | −2.02 × 10⁰ |
| 5 | 219.2489% | 219.2735% | 1.12 × 10⁻⁴ | 219.2490% | 5.34 × 10⁻⁷ | −363.4862% | −2.66 × 10⁰ |
| 10 | 120.7479% | 120.7569% | 7.41 × 10⁻⁵ | 120.7480% | 2.92 × 10⁻⁷ | −238.0046% | −2.97 × 10⁰ |
| 20 | 64.46632% | 64.47067% | 6.75 × 10⁻⁵ | 64.46634% | 2.61 × 10⁻⁷ | −146.9416% | −3.28 × 10⁰ |
| 50 | 30.38848% | 30.39022% | 5.73 × 10⁻⁵ | 30.38848% | 2.22 × 10⁻⁷ | −73.49005% | −3.42 × 10⁰ |
| 100 | 19.02882% | 19.02969% | 4.57 × 10⁻⁵ | 19.02882% | 1.77 × 10⁻⁷ | −42.29912% | −3.22 × 10⁰ |
| 1000 | 8.805126% | 8.805213% | 9.88 × 10⁻⁶ | 8.805126% | 3.82 × 10⁻⁸ | −6.072789% | −1.69 × 10⁰ |
| 10000 | 7.782757% | 7.782765% | 1.12 × 10⁻⁶ | 7.782757% | 4.33 × 10⁻⁹ | −0.791494% | −1.10 × 10⁰ |

This table shows bond yields for the model of Cox, Ingersoll, and Ross (1985), calculated exactly, by series approximation in $\Delta$, and by series approximation in $\tau$. For series in $\tau$, approximations including terms up to order $\tau^2$ and $\tau^4$ are included; for series in $\Delta$, terms up to order $\Delta^8$ are included. The parameters used are $\kappa = 0.5000$, $\theta = 0.0800$, and $\sigma = 0.1500$. For each approximation, the relative error is shown, i.e., the approximation error divided by the exact yield. As shown, the approximations in $\tau$ are highly accurate with only a small number of terms, across an extremely wide range of initial interest rates and maturities. By contrast, the approximations in $\Delta$, even with a larger number of terms, are accurate only for very short maturities, and when the initial interest rate is not very large. For these parameter values, the series in $\Delta$ diverges for maturities greater than approximately 8.24 years. Entries of “N/A” indicate that the bond price approximation is zero or negative, so that the corresponding yield approximation is not real-valued.
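The divergence horizons quoted in the captions of Tables 2 and 3 are the distances from the origin to the nearest bond price singularity in the complex plane of $\Delta$. A minimal sketch of that calculation follows, using the CIR singularity expression from Table 1 with $n = 0$. The function names and the `divisor_scale` argument are hypothetical. The value `divisor_scale = 2.0` used for the ADG model is an assumption: the ADG entry of Table 1 is damaged in the source text, and the factor of two is chosen because it reproduces the 5.24-year figure quoted in the Table 2 caption.

```python
import math


def nearest_singularity(kappa: float, sigma: float, divisor_scale: float = 1.0) -> complex:
    """Singularity nearest the origin (n = 0), upper half-plane branch.

    divisor_scale = 1.0 gives the CIR expression of Table 1;
    divisor_scale = 2.0 is the assumed ADG variant described above.
    """
    gamma = math.sqrt(kappa**2 + 2.0 * sigma**2)
    log_term = math.log(1.0 + (kappa / sigma**2) * (kappa - gamma))
    return complex(log_term, math.pi) / (divisor_scale * gamma)


def convergence_radius(kappa: float, sigma: float, divisor_scale: float = 1.0) -> float:
    """Radius of convergence of the untransformed series in Delta."""
    return abs(nearest_singularity(kappa, sigma, divisor_scale))


# CIR parameters from Table 3: the singularity is close to -5.865 + 5.784i,
# so the radius is close to 8.237 years.
cir_point = nearest_singularity(kappa=0.5, sigma=0.15)
print(cir_point, abs(cir_point))

# ADG parameters from Table 2 (assumed factor of two in the divisor):
# the radius is close to 5.24 years.
adg_radius = convergence_radius(kappa=0.4995, sigma=0.0812, divisor_scale=2.0)
print(adg_radius)
```

The conjugate point (with $-\pi$ in place of $+\pi$) lies at the same distance from the origin, so it does not change the radius.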