3 Approximation of the Model and Parameters Estimation

In Correction as of 8-12-2008, page 1, the model is presented as

$$d\ln(r_t) = \left[1-(1-F)^{dt}\right]\left[\ln(T_t)-\ln(r_{t-dt})\right] + (1-F)^{dt} D_t\,dt + (1-F)^{dt}\sigma\sqrt{dt}\,\varepsilon_t \qquad (3.1)$$

Convergence of $(1-F)^{dt} D_t\,dt$

In Correction as of 8-12-2008, page 10, a calculable formula for $D_t$ is presented, given $F$, $\sigma$, $t$, $\sigma_T$, $\alpha$, $\beta$. We can verify by numerical tests that $D_t$ is relatively small compared to $\{\ln(r_t)\}$. Some results are attached below.

Table 3.1 Convergence of $(1-F)^{dt} D_t\,dt$

Mean Reversion  Alpha  Beta  Target Volatility  Mean of D_t   Sdev of D_t   Mean of (1-F) D_t dt
0.2             8      4     0.5                0.028996634   0.000301201   0.001933109
0.2             8      1     0.5                0.032955065   0.000115053   0.002197004
0.2             2      4     0.5                0.032714718   0.000133706   0.002180981
0.2             2      1     0.5                0.027579286   3.02452E-05   0.001838619
0.8             8      4     0.5                0.296378125   0.016555595   0.004939635
0.8             8      1     0.5                0.375057671   0.007657011   0.006250961
0.8             2      4     0.5                0.365161335   0.008317215   0.006086022
0.8             2      1     0.5                0.355641648   0.002590482   0.005927361

(The VBA code for calculating $D_t$ is in “3.1-convergence of Dt”)

From the table above, we can see that the mean of $(1-F) D_t\,dt$ is relatively small compared to the log of the interest rate. Therefore, we can approximate the original model by

$$d\ln(r_t) \approx \left[1-(1-F)^{dt}\right]\left[\ln(T_t)-\ln(r_{t-dt})\right] + (1-F)^{dt}\sigma\sqrt{dt}\,\varepsilon_t \qquad (3.2)$$

or

$$\ln(r_t) \approx \left[1-(1-F)^{dt}\right]\ln(T_t) + (1-F)^{dt}\ln(r_{t-dt}) + (1-F)^{dt}\sigma\sqrt{dt}\,\varepsilon_t \qquad (3.3)$$

In other words, in (3.3), apart from the noise term, $\ln(r_t)$ is a weighted average of $\ln(T_t)$ and $\ln(r_{t-dt})$. Notice that a larger $F$ assigns more weight to $\ln(T_t)$; that is, a larger $F$ brings the interest rate to the regime target more quickly.

In (3.3), we need to input two vectors and a mean reversion factor to yield a smooth path. The first vector, $V_1$, indicates when the regime switches take place; the second vector, $V_2$, indicates what the new regime target is when the previous regime is replaced. The mean reversion is denoted as $F$ in (3.1)~(3.3).
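The noise-free recursion in (3.3) can be sketched in Python as follows. This is only a minimal illustration, not the VBA implementation referenced in this paper. The function names `smooth_path` and `mse`, the argument layout (a list of regime start months beginning with 0, and a list of log targets, one per regime), and the default monthly step dt = 1/12 are all assumptions made for the example.

```python
def smooth_path(ln_r0, switch_months, ln_targets, F, n_months, dt=1.0 / 12.0):
    """Noise-free version of (3.3): ln r_t = (1 - w) ln T_t + w ln r_{t-dt},
    with w = (1 - F)^dt.

    switch_months[k] is the (hypothetical) month index at which regime k
    starts, with switch_months[0] == 0; ln_targets[k] is its log target.
    """
    w = (1.0 - F) ** dt
    path = [ln_r0]
    regime = 0
    for month in range(1, n_months):
        # Move to the next regime once its start month has been reached.
        while regime + 1 < len(switch_months) and month >= switch_months[regime + 1]:
            regime += 1
        path.append((1.0 - w) * ln_targets[regime] + w * path[-1])
    return path


def mse(path, observed):
    """Mean squared error between a smooth path and the observed log rates."""
    return sum((p - o) ** 2 for p, o in zip(path, observed)) / len(observed)
```

With a candidate pair of vectors and a value of F, `mse(smooth_path(...), observed_ln_rates)` gives the fit criterion used in the estimation procedure that follows.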
As discussed above, (3.3) is an effective approximation of the original model. Given the two vectors, a mean reversion factor, and an initial interest rate, (3.3) yields a smooth interest rate path. If we regard this smooth path as an estimate of the historical path, we can calculate the MSE for a specified set $\{V_1, V_2, F\}$. The next goal is to find the set $\{V_1, V_2, F\}$ that yields the minimum MSE and to regard it as the real historical situation. In practice, it is impossible to try every set in the parameter space, so a tradeoff between accuracy and efficiency needs to be made.

One Possible Way of Parameters Estimation

(1) Find a $V_1$: We start by assuming that if there is a regime switch, it must occur on a date when a new monthly interest rate becomes available. This assumption reduces the continuous distribution of $V_1$ to a discrete distribution. Next, we assume that the gap between two regime switches must be greater than 12 months.

Let $\ln(r_i)$ represent the natural logarithm of the $i$th month's interest rate, and let $\Delta\ln(r_i) = \ln(r_i) - \ln(r_{i-1})$. We already have an estimator of the coefficient of the noise term, denoted as $W$. Next define a vector $d(i)$ by

$$d(i) = \begin{cases} 1, & \text{if } W < \Delta\ln(r_i) \\ 0.5, & \text{if } 0 \le \Delta\ln(r_i) \le W \\ -0.5, & \text{if } -W \le \Delta\ln(r_i) \le 0 \\ -1, & \text{if } \Delta\ln(r_i) < -W \end{cases}$$

Now, define another vector $\mathrm{Cr}(i)$ by

$$\mathrm{Cr}(i) = \begin{cases} 1, & \text{if } \sum_{j=i}^{i+12} d(j) \ge \text{Criteria} \\ 0, & \text{if } \sum_{j=i}^{i+12} d(j) < \text{Criteria} \end{cases}$$

If $\mathrm{Cr}(i) = 1$, we say that there could be a regime switch in the $i$th month. If $\mathrm{Cr}(i) = 0$, we say that there is no regime switch in the $i$th month.

Next, let us define sets $S_i$ to store the months with $\mathrm{Cr}(\cdot) = 1$. Define $S_i = \{s_{i,1}, s_{i,2}, s_{i,3}, \cdots, s_{i,n_i}\}$ such that
(i) $s_{i,1} < s_{i,2} < \cdots < s_{i,n_i}$;
(ii) $\mathrm{Cr}(s_{i,1}) = \mathrm{Cr}(s_{i,2}) = \cdots = \mathrm{Cr}(s_{i,n_i}) = 1$;
(iii) $\mathrm{Cr}(j) = 0$ for all $j$ such that $s_{i,1} < j < s_{i,n_i}$ and $j \ne s_{i,1}, j \ne s_{i,2}, \cdots, j \ne s_{i,n_i}$;
(iv) $\mathrm{Cr}(j) = 0$ for all $j$ such that $s_{i,n_i} < j < s_{i,n_i} + 12$.

Now, if there are $K$ sets $S_i$, we assume that there are $K$ regime switching points along the historical path: $S_1$ is the set of candidates for the 1st regime switching point, $S_2$ is the set of candidates for the 2nd regime switching point, etc. There are then $n_1 n_2 \cdots n_K$ combinations of $V_1$. These are the combinations we would like to consider.

(2) Find a $V_2$: Assume we have selected a $V_1$ in the previous step. How do we find a good enough $V_2$? If a $V_1$ is given, we know exactly where the regime switching points are located. In each regime, we try different interest rate targets (discretely) until we find the one that yields the minimum MSE (or, equivalently, the minimum sum of squared errors, SSE) in that regime. We take this target as our estimate and keep the smooth path generated by it. The end point of this smooth path is the starting point of the smooth path over the next regime, so that the overall smooth path is continuous.

Remark: First, given $V_1$, in order to find a $V_2$ that yields the global minimum MSE, we have to assume a value of $F$ (mean reversion). The prior value of $F$ is believed to be between 0.2 and 0.5. Second, the calculation in steps (1) and (2) is likely to be time-consuming, since we intend to find a $V_2$ that yields the global minimum MSE. It would be tempting to try all possible $\{V_1, V_2, F\}$ in the parameter space $\Theta$; however, that is unachievable. The purpose of the method above is to find, in a cautious and reasonable manner, a subset of $\Theta$, denoted $\Theta_0$, and to try all possible $\{V_1, V_2, F\}$ in this subset. We cannot ensure that the solution from $\Theta_0$ is the global solution, but it should be reasonably good, since one needs to try hundreds of millions of scenarios to find this solution from $\Theta_0$.

(3) One Way to Improve Efficiency: It is possible that in step (1), $n_1 n_2 \cdots n_K$ is a very large number and the calculation is too massive to handle.
To avoid massive calculation, we can trade accuracy for efficiency. The following example illustrates the idea. Suppose that in step (1) there are $20 \times 20 \times 20 \times 20 \times 20$ combinations of $V_1$, and that in step (2) we want to try 10 discrete targets in each regime. Theoretically, we have to examine $20^5 \times 10^5 = 320{,}000{,}000{,}000$ different paths and select the one that yields the minimum MSE. However, an alternative is available. Recall that $S_i = \{s_{i,1}, s_{i,2}, s_{i,3}, \cdots, s_{i,n_i}\}$.
(i) Find the curve that yields the minimum MSE in the 1st regime (i.e., from the 1st month to some point in $S_1$), regardless of the regimes thereafter. We have to examine $20 \times 10 = 200$ paths here. Once the path that yields the minimum MSE is found, take the point where that path ends as the desired switching point and its target as the desired target.
(ii) Start from the desired switching point and desired target obtained for the previous regime and follow the same procedure as in step (i) to find the next regime switching point and target.
(iii) Repeat step (ii) until we find all the desired regime switching points and regimes.

Remark on the alternative way: The total number of paths we need to examine is $20 \times 10 \times 5 = 1000$; in other words, the calculation is greatly reduced. However, the $V_1$ and $V_2$ found using this alternative method are not guaranteed to be a globally optimal solution, even though they also yield a small MSE.

(4) A Recursive Process to Derive $F$, $V_1$ and $V_2$

We should keep in mind that only by assuming a prior value of $F$ can we use the previous methods to derive $V_1$, $V_2$ and then estimate the other parameters. After that, we can use a recursive process to update the prior estimate of $F$ until it converges. The process consists of the following steps:
(i) Assume a prior value of $F$, say $F^{(1)}$, then apply the methods mentioned above to derive $V_1$ and $V_2$, and denote them as $V_1^{(1)}$, $V_2^{(1)}$, respectively.
(ii) Since we know that

$$\ln(r_t) \approx \left[1-(1-F)^{dt}\right]\ln(T_t) + (1-F)^{dt}\ln(r_{t-dt}) + (1-F)^{dt}\sigma\sqrt{dt}\,\varepsilon_t,$$

define

$$\ln(\hat r_t) = \left[1-(1-F)^{dt}\right]\ln(T_t) + (1-F)^{dt}\ln(r_{t-dt})$$

and the sum of squared errors

$$SSE = \sum_t \left(\ln(\hat r_t) - \ln(r_t)\right)^2.$$

Next, based on $V_1^{(1)}$, $V_2^{(1)}$, try different values of $F$ until we find the $F$ that yields the minimum SSE. Denote this $F$ as $F^{(2)}$ and take it as the new $F$ (a posterior estimate).
(iii) Combining step (i) and step (ii) creates a recursive process, which can be expressed as

$$F^{(1)} \to V_1^{(1)}, V_2^{(1)} \to F^{(2)} \to V_1^{(2)}, V_2^{(2)} \to F^{(3)} \to V_1^{(3)}, V_2^{(3)} \to \cdots.$$

The convergence is not proved theoretically, but in practice the sequence converges rapidly.

Remark on the Estimation of F in the Recursive Process: In (3.2), we have

$$d\ln(r_t) \approx \left[1-(1-F)^{dt}\right]\left[\ln(T_t)-\ln(r_{t-dt})\right] + (1-F)^{dt}\sigma\sqrt{dt}\,\varepsilon_t,$$

and to estimate $F$ in a minimum MSE (or SSE) manner, we are minimizing

$$\sum_t \left[(1-F)^{dt}\sigma\sqrt{dt}\,\varepsilon_t\right]^2,$$

or approximately minimizing

$$\sum_t \left[d\ln(r_t) - h_t\right]^2, \quad \text{where } h_t = \left[1-(1-F)^{dt}\right]\left[\ln(T_t)-\ln(r_{t-dt})\right].$$

We know that $\partial|h_t|/\partial F > 0$; in other words, $|h_t|$ increases with $F$. Now, assume that there is a very large change in the interest rate at $t_K$; then we are in fact minimizing

$$\left[d\ln(r_{t_K}) - h_{t_K}\right]^2 + \sum_{t \ne t_K} \left[d\ln(r_t) - h_t\right]^2.$$

Since we are minimizing the sum of squared errors, the effect of this very large change $d\ln(r_{t_K})$ is amplified. In order to minimize the whole term, we have to accept a larger $F$, which gives a larger $|h_t|$ across the whole period. Therefore, in regimes with high volatility, estimating $F$ by the minimum sum of squared errors is not recommended, because these large interest rate changes greatly reduce the accuracy of the method. Alternatively, one can do the estimation with a minimum sum of absolute errors approach.
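Step (ii) and the remark above can be sketched in Python as a grid search over F. This is a hedged illustration rather than the VBA implementation referenced in this paper. The names `one_step_errors` and `fit_F`, the `loss` switch, the candidate grid, and the monthly step dt = 1/12 are assumptions made for the example. The log targets are assumed to be already expanded from the two vectors into one value per month.

```python
def one_step_errors(ln_r, ln_target, F, dt=1.0 / 12.0):
    """Errors ln r_t - ln r_hat_t, where ln r_hat_t is the prediction of
    step (ii) built from the observed previous log rate ln r_{t-dt}."""
    w = (1.0 - F) ** dt
    return [
        ln_r[t] - ((1.0 - w) * ln_target[t] + w * ln_r[t - 1])
        for t in range(1, len(ln_r))
    ]


def fit_F(ln_r, ln_target, candidates, loss="sse", dt=1.0 / 12.0):
    """Return the candidate F with the smallest total loss.

    loss="sse" uses the sum of squared errors; loss="sae" uses the sum of
    absolute errors, which gives less weight to a single very large move.
    """
    def total(F):
        errors = one_step_errors(ln_r, ln_target, F, dt)
        if loss == "sse":
            return sum(e * e for e in errors)
        return sum(abs(e) for e in errors)

    return min(candidates, key=total)
```

For example, `fit_F(ln_r, ln_target, [0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50])` searches the range in which the prior value of F is believed to lie. Alternating this search with the derivation of the two vectors gives the recursion in step (iii).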
(5) Use MLE to estimate $\alpha$, $\beta$, $\mu_T$, $\sigma_T$

If we know when and to what values the regime targets change, we can use MLE to estimate these parameters. For $\mu_T$, $\sigma_T$, we know that the MLE estimators are the sample mean and the biased sample standard deviation, respectively. For $\alpha$, $\beta$, it is a little more complicated. We can work out the likelihood function involving $\alpha$, $\beta$.
(1) Since we observe $t_1$, the probability density of this event is $f_{\hat\tau}(t_1)$.
(2) $f_{t_1,t_2}(t_1, t_2) = f_{t_2|t_1}(t_2|t_1)\, f_{\hat\tau}(t_1)$, where $f_{t_2|t_1}(t_2|t_1) = f_\tau(t_2 - t_1)$. We can further infer that

$$f_{t_1,t_2,\ldots,t_n}(t_1, t_2, \ldots, t_n) = f_{t_n|t_1,t_2,\ldots,t_{n-1}}(t_n|t_1, t_2, \ldots, t_{n-1})\, f_{t_1,t_2,\ldots,t_{n-1}}(t_1, t_2, \ldots, t_{n-1}).$$

So we can conclude that

$$f_{t_1,t_2,\ldots,t_{J-1}}(t_1, t_2, \ldots, t_{J-1}) = f_{\hat\tau}(t_1) \prod_{i=1}^{J-2} f_\tau(t_{i+1} - t_i).$$

(3) Since $(t_J - t_{J-1})$ is a censored observation, the term we would like to maximize is then

$$f_{\hat\tau}(t_1) \prod_{i=1}^{J-2} f_\tau(t_{i+1} - t_i) \int_{t_J - t_{J-1}}^{\infty} f_\tau(v)\,dv = f_{\hat\tau}(t_1) \prod_{i=1}^{J-2} f_\tau(t_{i+1} - t_i)\, S_\tau(t_J - t_{J-1}),$$

where $S_\tau$ denotes the survival function corresponding to $f_\tau$.

The VBA code implementing this estimation is in “3.2-path mimic” and “3.3-MLE for parameters”.

(6) Applying the Parameter Estimation Technique

Applying this technique, we obtain a convergent set $\{V_1, V_2, F\}$, which is shown in Figure 3.1.

Figure 3.1 $\{V_1, V_2, F\}$ [Plot over months 1 to 701; series: Ln(rt), Smooth, Regime; vertical axis from -4.5 to -1.5]

The estimated parameters are:

Table 3.2 Estimation of Parameters

Parameters                          Estimation
$\alpha$                            7.100068
$\beta$                             1.139999
Mean Target                         0.060762
Target Volatility, i.e., $\sigma_T$ 0.669872
Initial Target                      0.021237
F                                   0.3674902

Noise Analysis And Kalman Filter

Once we find the convergent $F^{(n)}$, $V_1^{(n)}$ and $V_2^{(n)}$, the residual term (noise) at time $t$ is $\left(\ln(\hat r_t) - \ln(r_t)\right)$. The noise over the historical data is presented below.

Figure 3.2 noise [Plot of the noise series against month, 0 to 800; vertical axis from -0.4 to 0.2]

In the figure above, we can see that the residual terms appear to be heteroskedastic; in other words, the size of the residuals differs across periods. One way to explain this heteroskedasticity is that, in addition to the endogenous noise term $(1-F)^{dt}\sigma\sqrt{dt}\,\varepsilon_t$, there are exogenous noises that are independent of the model. These exogenous noises can be observation errors, which represent mispricing by the market. Introducing these exogenous noises, denoted by $v_t$, we can regard the historical interest rate path as

$$\ln(\hat r_t) = \ln(r_t) + v_t,$$

where, in the $k$th regime, $v_t \sim N(0, R_k)$, $E[v_i] = E[v_j]$, $Var[v_i] = Var[v_j]$, and $v_i \perp v_j$ for all $i \ne j$.

One important feature of $v_t$ is that it does not affect the stochastic model; it only affects the observation of the interest rate path. A Kalman filter operates recursively on streams of noisy input data to produce a statistically optimal estimate of the underlying system state. After introducing $v_t$ into this stochastic model, we can construct a Kalman filter in the $k$th regime (from the $m_k$th month to the $(m_{k+1} - 1)$th month) by taking the following steps:
(i) Initialization. $Q = \left(\sigma(1-F)^{dt}\sqrt{dt}\right)^2$, $R = R_k$, $\mathrm{post\_ln}(r_{m_k}) = \ln(r_{m_k})$, $\mathrm{post\_}P_{m_k} = 0$.
(ii) Predict. $\mathrm{prior\_ln}(r_{m_k+1}) = \left[1-(1-F)^{dt}\right]\ln(T_t) + (1-F)^{dt}\,\mathrm{post\_ln}(r_{m_k})$, $\mathrm{prior\_}P_{m_k+1} = (1-F)^{2dt}\,\mathrm{post\_}P_{m_k} + Q$.
(iii) Update. $e_{m_k+1} = \ln(\hat r_{m_k+1}) - \mathrm{prior\_ln}(r_{m_k+1})$, $S_{m_k+1} = \mathrm{prior\_}P_{m_k+1} + R$, $K_{m_k+1} = \mathrm{prior\_}P_{m_k+1} / S_{m_k+1}$, $\mathrm{post\_ln}(r_{m_k+1}) = \mathrm{prior\_ln}(r_{m_k+1}) + K_{m_k+1}\, e_{m_k+1}$, $\mathrm{post\_}P_{m_k+1} = (1 - K_{m_k+1})\,\mathrm{prior\_}P_{m_k+1}$.
(iv) Repeat steps (ii) and (iii) until the whole interest rate path is covered.

$\mathrm{prior\_ln}(r_t)$ is the estimated real interest rate path yielded by the Kalman filter.

The VBA code implementing this estimation is in “3.4-kalman filter”.

4 References

1. Bridgeman, J. G., “Random switching times among randomly parameterized regime of random interest rate scenarios”, Actuarial Research Clearing House (ARCH) 2007.1 (January 2007)
2. Bridgeman, J. G., “Combinatorics for Moments of a Randomly Stopped Quadratic Variation Process”, University of Connecticut, 2012.8
3. Bridgeman, J. G., “Moments of a Regime-Switching Stochastic Interest Rate Model with Randomized Regimes”, University of Connecticut, 2007.12
4. Brown, R. G. and Hwang, P. Y. C., Introduction to Random Signals and Applied Kalman Filtering, Wiley (1997)