Difference Equations

“…I believe that the theory that space is continuous is wrong… I rather suspect that the simple ideas of geometry, extended down into infinitely small space, are wrong.” Richard Feynman, The Character of Physical Law, pp. 166–7.

Suppose

\dot{x} = f(x, y)
\dot{y} = g(x, y)

Since \dot{x} = \lim_{\Delta t \to 0} \frac{x(t) - x(t - \Delta t)}{\Delta t}, if the time step Δt is small then it is approximately true that

\frac{x(t) - x(t - \Delta t)}{\Delta t} = f(x(t - \Delta t), y(t - \Delta t))

or

x(t) = \Delta t \, f(x(t - \Delta t), y(t - \Delta t)) + x(t - \Delta t).

Think of Δt as the interval between data-collection events. So we have a difference equation system:

x(t) = \Delta t \, f(x(t - \Delta t), y(t - \Delta t)) + x(t - \Delta t)
y(t) = \Delta t \, g(x(t - \Delta t), y(t - \Delta t)) + y(t - \Delta t)

This is roughly Euler’s (pronounced “oiler’s”) method.

Example from Lynch’s system 10.7. The original system is

\dot{x} = -y + x(1 - 2x^2 - 3y^2)
\dot{y} = x + y(1 - 2x^2 - 3y^2)        (1)

Suppose you don’t know the parameters (the 1’s, 2’s, and 3’s) in the functions above, but you do know the form of the equations, i.e., you know

\dot{x} = Ay + x(B + Cx^2 + Dy^2)
\dot{y} = Ex + y(F + Gx^2 + Hy^2)        (2)

and you want to statistically fit (2) to data that have been collected at intervals of Δt. Using the results above, we construct the difference equations

x(t) = \Delta t \{ A \, y(t - \Delta t) + x(t - \Delta t)[B + C \, x(t - \Delta t)^2 + D \, y(t - \Delta t)^2] \} + x(t - \Delta t)
y(t) = \Delta t \{ E \, x(t - \Delta t) + y(t - \Delta t)[F + G \, x(t - \Delta t)^2 + H \, y(t - \Delta t)^2] \} + y(t - \Delta t)        (3)

Distributing and combining terms, this is the same as

x(t) = \Delta t A \, y(t - \Delta t) + (1 + \Delta t B) \, x(t - \Delta t) + \Delta t C \, x(t - \Delta t)^3 + \Delta t D \, x(t - \Delta t) \, y(t - \Delta t)^2
y(t) = \Delta t E \, x(t - \Delta t) + (1 + \Delta t F) \, y(t - \Delta t) + \Delta t G \, y(t - \Delta t) \, x(t - \Delta t)^2 + \Delta t H \, y(t - \Delta t)^3        (4)

It is easy to “play” with a system like this in Excel. (See “Predator Prey example from 4 point 7.xls” in Lynch.)

[Excel screenshot: the fx line gives the formula defining cell B3, the second x term, in terms of the previous x and y together with the time step in column H.]

With two clicks in the bottom right corner of B3, that formula can be applied all the way down the B column. With the same sort of formula, you can generate a column of y terms. It is important to make Δt as small (“short”) as possible.
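The same “play” can be done in a few lines of Python instead of Excel. This is a minimal sketch of iterating the difference equation system (3) for the Lynch example (1); the initial condition (0.5, 0.5), the step size, and the step count are my own choices, not from the notes.

```python
# Iterate the difference equation (Euler's method) for system (1):
#   x-dot = -y + x(1 - 2x^2 - 3y^2)
#   y-dot =  x + y(1 - 2x^2 - 3y^2)
def simulate(x0, y0, dt, n_steps):
    """x(t) = dt*f(x(t-dt), y(t-dt)) + x(t-dt), and likewise for y."""
    xs, ys = [x0], [y0]
    for _ in range(n_steps):
        x, y = xs[-1], ys[-1]
        s = 1 - 2 * x**2 - 3 * y**2
        xs.append(x + dt * (-y + x * s))
        ys.append(y + dt * ( x + y * s))
    return xs, ys

xs, ys = simulate(0.5, 0.5, dt=0.001, n_steps=20000)
```

With a small Δt the orbit settles onto the system’s limit cycle; cranking Δt up makes the discrete orbit drift away from it, which is the point of the figures that follow.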
To illustrate, I solve the differential equations in (1) and get the figure below. Notice the limit cycle.

[Figure: phase portrait of system (1); x runs from −0.5 to 1.5, y from −0.6 to 0.6, and the trajectory settles onto a limit cycle.]

If I use the difference equation (3) with the exact same parameters and a time step of 0.3, I get the next figure. (The “dots” are the discrete values obtained with the difference equation.)

[Figure: same axes; discrete values with Δt = 0.3 overlaid on the limit cycle.]

But if I use a time step of 0.01, I get:

[Figure: same axes; discrete values with Δt = 0.01.]

So it looks like a small Δt gives a difference equation that is closer to the differential equation. (See “difference equation comparison with differential.nb”.)

The payoff is that you can now do a least squares regression of x(t) on y(t − Δt), x(t − Δt), x(t − Δt)^3, and x(t − Δt) y(t − Δt)^2 from the first equation in (4) to get ordinary least squares estimates of the parameters A, B, C, and D. Similarly, we can estimate the parameters E, F, G, and H for y(t) from the second equation in (4). (Δt, your sampling time interval, will be known going in.) When you run these regressions, the coefficients reported by the OLS package will be Δt·A, (1 + Δt·B), Δt·C, and Δt·D, so some minor algebra (using the time step Δt) will be needed to recover A, B, C, and D, if you want to use those in the differential equation (2). Many researchers just pose the problem directly as a difference equation problem, and don’t bother either deriving it all from differential equations or going back from the difference equation to the differential equation.

Here’s a check without randomness in Minitab. Notice that we get the parameters exactly and that the t-ratios are infinite with R² = 100%.
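The regression step can be sketched in Python with NumPy’s least squares standing in for Minitab. The simulation settings here (Δt = 0.01, 800 observations, starting point (0.5, 0.5)) are my choices for illustration; the parameter values are the true ones from system (1).

```python
import numpy as np

dt, n = 0.01, 800

# Generate noiseless data from the difference equation (3) with the true
# parameters A = -1, B = 1, C = -2, D = -3 (and E = 1, F = 1, G = -2, H = -3).
x = np.empty(n + 1)
y = np.empty(n + 1)
x[0], y[0] = 0.5, 0.5
for t in range(n):
    s = 1 - 2 * x[t]**2 - 3 * y[t]**2
    x[t + 1] = x[t] + dt * (-y[t] + x[t] * s)
    y[t + 1] = y[t] + dt * ( x[t] + y[t] * s)

# Regress x(t) on y(t-dt), x(t-dt), x(t-dt)^3, x(t-dt)*y(t-dt)^2 -- equation (4).
X = np.column_stack([y[:-1], x[:-1], x[:-1]**3, x[:-1] * y[:-1]**2])
coef, *_ = np.linalg.lstsq(X, x[1:], rcond=None)

# Undo the Delta-t algebra: coef = [dt*A, 1 + dt*B, dt*C, dt*D].
A, B, C, D = coef[0] / dt, (coef[1] - 1) / dt, coef[2] / dt, coef[3] / dt
print(A, B, C, D)  # close to -1, 1, -2, -3
```

Because the noiseless data satisfy (4) exactly, the regression recovers the coefficients to machine precision, matching the “t-ratios are infinite, R² = 100%” check below.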
The regression equation is
x = -0.000000 + 1.01 xt-1 - 0.0100 yt-1 - 0.0200 x^3 - 0.0300 x y^2

800 cases used; 1 case contains missing values

Predictor       Coef          Stdev       t-ratio    p
Constant    -0.00000000   0.00000000        *        *
xt-1          1.01000       0.00000         *        *
yt-1        -0.0100000     0.0000000        *        *
x^3         -0.0200000     0.0000000        *        *
x y^2       -0.0300000     0.0000000        *        *

s = 0    R-sq = 100.0%    R-sq(adj) = 100.0%

The regression equation is
y = 0.000000 + 0.0100 xt-1 + 1.01 yt-1 - 0.0200 y x^2 - 0.0300 y^3

800 cases used; 1 case contains missing values

Predictor       Coef          Stdev       t-ratio    p
Constant     0.00000000   0.00000000        *        *
xt-1         0.0100000     0.0000000        *        *
yt-1          1.01000       0.00000         *        *
y x^2       -0.0200000     0.0000000        *        *
y^3         -0.0300000     0.0000000        *        *

s = 0    R-sq = 100.0%    R-sq(adj) = 100.0%

Now suppose “nature” uses the true x’s and y’s but there is noise in our measurements of the outputs x(t) and y(t). Call the outputs with errors x + ex and y + ey.

The regression equation is
x+ex = -0.00811 yt-1 + 1.00 xt-1 - 0.0197 x^3 - 0.0333 x y^2

800 cases used; 1 case contains missing values

Predictor      Coef        Stdev     t-ratio      p
Noconstant
yt-1       -0.008108     0.003378     -2.40    0.017
xt-1        1.00323      0.00667     150.33    0.000
x^3        -0.019666     0.006227     -3.16    0.002
x y^2      -0.03330      0.03521      -0.95    0.344

The regression equation is
y+ey = 0.00772 xt-1 + 1.01 yt-1 - 0.0393 y x^2 - 0.0185 y^3

800 cases used; 1 case contains missing values

Predictor      Coef        Stdev     t-ratio      p
Noconstant
xt-1        0.007723     0.003912      1.97    0.049
yt-1        1.00516      0.02957      33.99    0.000
y x^2      -0.03931      0.05510      -0.71    0.476
y^3        -0.01854      0.08151      -0.23    0.820

[Figure: “Actual + and Fit o from discrete equation and Minitab”; Fit x on the horizontal axis (−0.5 to 1.5), Fit y on the vertical axis (−0.8 to 0.7).]

The noisier the data, the harder it is to make decent estimates of the parameters (naturally). So, like everything else in statistics, this isn’t foolproof.

Here’s a picture from Mathematica. (See “predator prey example from Lynch 4 point 7.nb”.) The dots are the synthetic data, and the sliders let us change the parameters while watching the fixed data in the background. Here I’ve simplified equation system (2) to

\dot{x} = -y + x(1 - a x^2 - b y^2)
\dot{y} = x + y(1 - c x^2 - d y^2)

with the sliders set at a = 2., b = 3., c = 2., d = 3.
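The noise effect can also be seen outside Minitab by redoing the regression on noisy measurements. This sketch reuses the same simulated trajectory; the noise level σ = 0.001 and the random seed are arbitrary choices of mine, so the exact estimates will differ from the Minitab run above.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n, sigma = 0.01, 800, 0.001

# True (noiseless) trajectory of the difference equation for system (1).
x = np.empty(n + 1)
y = np.empty(n + 1)
x[0], y[0] = 0.5, 0.5
for t in range(n):
    s = 1 - 2 * x[t]**2 - 3 * y[t]**2
    x[t + 1] = x[t] + dt * (-y[t] + x[t] * s)
    y[t + 1] = y[t] + dt * ( x[t] + y[t] * s)

# "Nature" uses the true x and y, but we observe x + ex and y + ey.
xo = x + sigma * rng.standard_normal(n + 1)
yo = y + sigma * rng.standard_normal(n + 1)

# Same regression as before (no constant), but on the noisy observations.
X = np.column_stack([yo[:-1], xo[:-1], xo[:-1]**3, xo[:-1] * yo[:-1]**2])
coef, *_ = np.linalg.lstsq(X, xo[1:], rcond=None)
# coef is now only approximately [dt*A, 1 + dt*B, dt*C, dt*D]:
# roughly [-0.01, 1.01, -0.02, -0.03], with estimation error.
```

Raising sigma makes the estimates visibly worse, which is the “noisier data, worse estimates” point in the text.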
[Mathematica output: x and y returned as InterpolatingFunction objects on the domain [0, 20], plotted in the phase plane (x from −0.5 to 1.5, y from −0.6 to 0.6) against the synthetic data.]

See Lynch’s discussion of the Hénon map for another two-variable difference equation example.

(“difference equations.doc”; same name in Minitab.)
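The Hénon map Lynch refers to is itself a two-variable difference equation, with no underlying differential equation needed: x(t) = 1 − a·x(t−1)² + y(t−1), y(t) = b·x(t−1). A quick sketch, using the classic parameter values a = 1.4, b = 0.3 (my choice; these are the standard chaotic-attractor values, not taken from these notes):

```python
# Henon map: a two-variable difference equation with a chaotic attractor.
def henon(x0, y0, n_steps, a=1.4, b=0.3):
    """Iterate x(t) = 1 - a*x(t-1)^2 + y(t-1), y(t) = b*x(t-1)."""
    pts = [(x0, y0)]
    for _ in range(n_steps):
        x, y = pts[-1]
        pts.append((1 - a * x**2 + y, b * x))
    return pts

orbit = henon(0.0, 0.0, 1000)
```

Plotting the points in `orbit` traces out the familiar boomerang-shaped attractor.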