STAT 497 LECTURE NOTES 8: ESTIMATION

ESTIMATION
• After specifying the order of a stationary ARMA process, we need to estimate the parameters.
• We will assume (for now) that:
1. The model order (p and q) is known, and
2. The data has zero mean.
• If (2) is not a reasonable assumption, we can subtract the sample mean $\bar{Y}$ and fit a zero-mean ARMA model
$$\phi(B)X_t = \theta(B)a_t, \qquad X_t = Y_t - \bar{Y},$$
then use $X_t + \bar{Y}$ as the model for $Y_t$.

ESTIMATION
– Method of Moment Estimation (MME)
– Ordinary Least Squares (OLS) Estimation
– Maximum Likelihood Estimation (MLE)
– Least Squares Estimation
  • Conditional
  • Unconditional

THE METHOD OF MOMENT ESTIMATION
• It is also known as Yule-Walker estimation. It is easy but not efficient, and it works well only for AR models with large n.
• BASIC IDEA: Equate sample moment(s) to population moment(s), and solve these equation(s) to obtain the estimator(s) of the unknown parameter(s):
$$E(Y_t) = \mu \approx \bar{Y} = \frac{1}{n}\sum_{t=1}^{n}Y_t
\qquad\text{or}\qquad
E(Y_tY_{t-k}) = \gamma_k \approx \hat{\gamma}_k = \frac{1}{n}\sum_{t=k+1}^{n}Y_tY_{t-k}.$$

THE METHOD OF MOMENT ESTIMATION
• Let $\Gamma_n$ be the variance/covariance matrix of $X$ with the given parameter values.
• Yule-Walker for AR(p): regress $X_t$ onto $X_{t-1},\dots,X_{t-p}$.
• Durbin-Levinson algorithm with $\gamma$ replaced by $\hat{\gamma}$.
• Yule-Walker for ARMA(p, q): method of moments; not efficient.

THE YULE-WALKER ESTIMATION
• For a stationary (causal) AR(p),
$$E(X_{t-k}X_t) = \sum_{j=1}^{p}\phi_j\,E(X_{t-k}X_{t-j}) + E(X_{t-k}a_t), \qquad k = 0,1,\dots,p,$$
which gives
$$\gamma_0 = \phi'\gamma_p + \sigma_a^2 \qquad\text{and}\qquad \Gamma_p\phi = \gamma_p,$$
where $\phi = (\phi_1,\phi_2,\dots,\phi_p)'$ and $\gamma_p = (\gamma_1,\dots,\gamma_p)'$. To calculate the values $E(X_{t-k}a_t)$, we use the random shock form (RSF) of the process: $X_t = \psi(B)a_t$.

THE YULE-WALKER ESTIMATION
• To find the Yule-Walker estimators, we replace $\gamma_k$ by $\hat{\gamma}_k$ (or $\rho_k$ by $\hat{\rho}_k$). The Yule-Walker equations for $\hat{\phi}$ are
$$\hat{\Gamma}_p\hat{\phi} = \hat{\gamma}_p, \qquad \hat{\sigma}_a^2 = \hat{\gamma}_0 - \hat{\phi}'\hat{\gamma}_p.$$
• These are forecasting equations.
• We can use the Durbin-Levinson algorithm.

THE YULE-WALKER ESTIMATION
• If $\hat{\gamma}_0 > 0$, then $\hat{\Gamma}_m$ is nonsingular.
• If $\{X_t\}$ is an AR(p) process,
$$\hat{\phi} \overset{asymp.}{\sim} N\!\left(\phi,\; \frac{\sigma_a^2}{n}\Gamma_p^{-1}\right), \qquad
\hat{\sigma}_a^2 \overset{p}{\to} \sigma_a^2, \qquad
\hat{\phi}_{kk} \overset{asymp.}{\sim} N\!\left(0,\;\frac{1}{n}\right) \text{ for } k > p.$$
Hence, we can use the sample PACF to test for the AR order, and we can calculate approximate confidence intervals for the parameters.

THE YULE-WALKER ESTIMATION
• If $X_t$ is an AR(p) process and n is large,
$$\sqrt{n}\,(\hat{\phi} - \phi) \overset{approx.}{\sim} N\!\left(0,\; \hat{\sigma}_a^2\,\hat{\Gamma}_p^{-1}\right).$$
• A 100(1 − α)% approximate confidence interval for $\phi_j$ is
$$\hat{\phi}_j \pm z_{\alpha/2}\,\frac{\hat{\sigma}_a}{\sqrt{n}}\sqrt{\left[\hat{\Gamma}_p^{-1}\right]_{jj}}.$$

THE YULE-WALKER ESTIMATION
• AR(1): $Y_t = \phi Y_{t-1} + a_t$. Find the MME of $\phi$. It is known that $\rho_1 = \phi$, and
$$\hat{\rho}_1 = \frac{\sum_{t=2}^{n}(Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=1}^{n}(Y_t - \bar{Y})^2}.$$

THE YULE-WALKER ESTIMATION
• So, the MME of $\phi$ is
$$\tilde{\phi} = \frac{\sum_{t=2}^{n}(Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=1}^{n}(Y_t - \bar{Y})^2} = \hat{\rho}_1.$$
• Also, $\sigma_a^2$ is unknown. Since $\gamma_0 = \dfrac{\sigma_a^2}{1-\phi^2}$, we can use the variance of the process to obtain the MME of $\sigma_a^2$.

THE YULE-WALKER ESTIMATION
$$\gamma_0 = \frac{\sigma_a^2}{1-\phi^2} \approx \hat{\gamma}_0 = \frac{1}{n}\sum_{t=1}^{n}(Y_t - \bar{Y})^2
\;\Rightarrow\;
\tilde{\sigma}_a^2 = \left(1 - \tilde{\phi}^2\right)\frac{1}{n}\sum_{t=1}^{n}(Y_t - \bar{Y})^2 = \left(1 - \hat{\rho}_1^2\right)\hat{\gamma}_0.$$

THE YULE-WALKER ESTIMATION
• AR(2): $Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + a_t$. Find the MME of all unknown parameters.
• Using the Yule-Walker equations,
$$\rho_1 = \phi_1 + \phi_2\rho_1, \qquad \rho_2 = \phi_1\rho_1 + \phi_2.$$

THE YULE-WALKER ESTIMATION
• So, equate the population autocorrelations to the sample autocorrelations and solve for $\phi_1$ and $\phi_2$:
$$\hat{\rho}_1 = \phi_1 + \phi_2\hat{\rho}_1, \qquad \hat{\rho}_2 = \phi_1\hat{\rho}_1 + \phi_2.$$

THE YULE-WALKER ESTIMATION
$$\tilde{\phi}_1 = \frac{\hat{\rho}_1\left(1-\hat{\rho}_2\right)}{1-\hat{\rho}_1^2}
\qquad\text{and}\qquad
\tilde{\phi}_2 = \frac{\hat{\rho}_2 - \hat{\rho}_1^2}{1-\hat{\rho}_1^2}.$$
Using these and the process variance formula, we can obtain the MME of $\sigma_a^2$:
$$\tilde{\sigma}_a^2 = \hat{\gamma}_0\left(1 - \tilde{\phi}_1\hat{\rho}_1 - \tilde{\phi}_2\hat{\rho}_2\right).$$
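The Yule-Walker / method-of-moment calculations above for an AR(2) can be checked numerically. The R sketch below (the simulated series and object names are purely illustrative, not part of the notes) computes the MMEs from the sample autocorrelations by hand and compares them with R's built-in Yule-Walker fit.

# a minimal sketch: MME / Yule-Walker estimation for an AR(2), using a simulated series
set.seed(497)
y <- arima.sim(model = list(ar = c(0.6, -0.3)), n = 500)   # true phi1 = 0.6, phi2 = -0.3

r  <- acf(y, lag.max = 2, plot = FALSE)$acf
r1 <- r[2]; r2 <- r[3]                                     # sample rho_1 and rho_2

# solve rho1 = phi1 + phi2*rho1 and rho2 = phi1*rho1 + phi2
phi1 <- r1 * (1 - r2) / (1 - r1^2)
phi2 <- (r2 - r1^2) / (1 - r1^2)

# MME of sigma_a^2 from the process variance: gamma0_hat * (1 - phi1*rho1 - phi2*rho2)
g0     <- mean((y - mean(y))^2)                            # gamma0_hat with divisor n
sigma2 <- g0 * (1 - phi1 * r1 - phi2 * r2)

c(phi1 = phi1, phi2 = phi2, sigma2 = sigma2)
ar(y, order.max = 2, aic = FALSE, method = "yule-walker")  # built-in Yule-Walker fit for comparison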
THE YULE-WALKER ESTIMATION
• AR(1): since $\gamma_0 = \dfrac{\sigma_a^2}{1-\phi^2}$,
$$\hat{\sigma}_a^2 = \left(1 - \hat{\phi}\hat{\rho}_1\right)\hat{\gamma}_0 = \left(1 - \hat{\rho}_1^2\right)\hat{\gamma}_0.$$
• AR(2):
$$\hat{\sigma}_a^2 = \left(1 - \hat{\phi}_1\hat{\rho}_1 - \hat{\phi}_2\hat{\rho}_2\right)\hat{\gamma}_0.$$

THE YULE-WALKER ESTIMATION
• MA(1): $Y_t = a_t - \theta a_{t-1}$.
• Again using the autocorrelation of the series at lag 1,
$$\rho_1 = \frac{-\theta}{1+\theta^2} \approx \hat{\rho}_1
\;\Rightarrow\; \hat{\rho}_1\theta^2 + \theta + \hat{\rho}_1 = 0
\;\Rightarrow\; \tilde{\theta}_{1,2} = \frac{-1 \pm \sqrt{1 - 4\hat{\rho}_1^2}}{2\hat{\rho}_1}.$$
Choose the root that satisfies the invertibility condition.

THE YULE-WALKER ESTIMATION
• For real roots we need $1 - 4\hat{\rho}_1^2 \ge 0$, i.e., $\hat{\rho}_1^2 \le 0.25$, i.e., $-0.5 \le \hat{\rho}_1 \le 0.5$.
– If $|\hat{\rho}_1| = 0.5$, there is a unique real root, but it is non-invertible.
– If $|\hat{\rho}_1| > 0.5$, no real root exists and the MME fails.
– If $|\hat{\rho}_1| < 0.5$, there are two real roots and a unique invertible solution.

THE YULE-WALKER ESTIMATION
• This example shows that the MMEs for MA and ARMA models are complicated.
• More generally, regardless of whether the model is AR, MA or ARMA, the MMEs are sensitive to rounding errors. They are usually used only to provide the initial estimates needed for a more efficient nonlinear estimation method.
• The moment estimators are not recommended for final estimation results and should not be used if the process is close to being nonstationary or noninvertible.

THE MAXIMUM LIKELIHOOD ESTIMATION
• Assume that $a_t \overset{i.i.d.}{\sim} N(0,\sigma_a^2)$.
• Under this assumption we can work with the joint pdf
$$f(a_1,\dots,a_n) = f(a_1)\cdots f(a_n)$$
instead of $f(y_1,\dots,y_n)$, which cannot be written as a product of marginal pdfs because of the dependency between time series observations.

MLE METHOD
• For the general stationary ARMA(p, q) model
$$\dot{Y}_t = \phi_1\dot{Y}_{t-1} + \dots + \phi_p\dot{Y}_{t-p} + a_t - \theta_1 a_{t-1} - \dots - \theta_q a_{t-q},$$
or
$$a_t = \dot{Y}_t - \phi_1\dot{Y}_{t-1} - \dots - \phi_p\dot{Y}_{t-p} + \theta_1 a_{t-1} + \dots + \theta_q a_{t-q},$$
where $\dot{Y}_t = Y_t - \mu$.

MLE
• The joint pdf of $(a_1,a_2,\dots,a_n)$ is given by
$$f(a_1,\dots,a_n \mid \phi,\mu,\theta,\sigma_a^2) = \left(2\pi\sigma_a^2\right)^{-n/2}\exp\!\left(-\frac{1}{2\sigma_a^2}\sum_{t=1}^{n}a_t^2\right),$$
where $\phi = (\phi_1,\dots,\phi_p)'$ and $\theta = (\theta_1,\dots,\theta_q)'$.
• Let $Y = (Y_1,\dots,Y_n)'$ and assume that the initial conditions $Y_* = (Y_{1-p},\dots,Y_0)'$ and $a_* = (a_{1-q},\dots,a_0)'$ are known.

MLE
• The conditional log-likelihood function is given by
$$\ln L_*(\phi,\mu,\theta,\sigma_a^2) = -\frac{n}{2}\ln\!\left(2\pi\sigma_a^2\right) - \frac{S_*(\phi,\mu,\theta)}{2\sigma_a^2},$$
where
$$S_*(\phi,\mu,\theta) = \sum_{t=1}^{n}a_t^2\left(\phi,\mu,\theta \mid Y, Y_*, a_*\right)$$
is the conditional sum of squares. Initial conditions: $Y_* = \bar{Y}$ and $a_* = E(a_t) = 0$.

MLE
• Then, we can find the estimators of $\phi = (\phi_1,\dots,\phi_p)$, $\theta = (\theta_1,\dots,\theta_q)$ and $\mu$ such that the conditional likelihood function is maximized. Usually, numerical nonlinear optimization techniques are required. After obtaining all the estimators,
$$\hat{\sigma}_a^2 = \frac{S_*(\hat{\phi},\hat{\mu},\hat{\theta})}{d.f.},$$
where d.f. = (number of terms used in the sum of squares) − (number of parameters) = (n − p) − (p + q + 1) = n − (2p + q + 1).

MLE
• AR(1): $Y_t = \phi Y_{t-1} + a_t$ where $a_t \overset{i.i.d.}{\sim} N(0,\sigma_a^2)$.
$$f(a_1,\dots,a_n) = \left(2\pi\sigma_a^2\right)^{-n/2}\exp\!\left(-\frac{1}{2\sigma_a^2}\sum_{t=1}^{n}a_t^2\right)$$
$$Y_1 = \phi Y_0 + a_1 \quad (\text{let's take } Y_0 = 0)$$
$$Y_2 = \phi Y_1 + a_2 \;\Rightarrow\; a_2 = Y_2 - \phi Y_1$$
$$Y_3 = \phi Y_2 + a_3 \;\Rightarrow\; a_3 = Y_3 - \phi Y_2$$
$$\vdots$$
$$Y_n = \phi Y_{n-1} + a_n \;\Rightarrow\; a_n = Y_n - \phi Y_{n-1}$$

MLE
• The Jacobian of the transformation from $(Y_2,\dots,Y_n)$ to $(a_2,\dots,a_n)$ is lower triangular with ones on the diagonal, so
$$J = \left|\frac{\partial(a_2,\dots,a_n)}{\partial(Y_2,\dots,Y_n)}\right| =
\begin{vmatrix}
1 & 0 & \cdots & 0\\
-\phi & 1 & \cdots & 0\\
\vdots & \ddots & \ddots & \vdots\\
0 & \cdots & -\phi & 1
\end{vmatrix} = 1,$$
and hence
$$f(Y_2,\dots,Y_n \mid Y_1) = f(a_2,\dots,a_n)\,|J| = f(a_2,\dots,a_n).$$

MLE
• Then, the likelihood function can be written as
$$L(\phi,\sigma_a^2) = f(Y_1,\dots,Y_n) = f(Y_1)\,f(Y_2,\dots,Y_n \mid Y_1) = f(Y_1)\,f(a_2,\dots,a_n)$$
$$= \left(2\pi\gamma_0\right)^{-1/2}\exp\!\left(-\frac{Y_1^2}{2\gamma_0}\right)
\left(2\pi\sigma_a^2\right)^{-(n-1)/2}\exp\!\left(-\frac{1}{2\sigma_a^2}\sum_{t=2}^{n}\left(Y_t - \phi Y_{t-1}\right)^2\right),$$
where $Y_1 \sim N(0,\gamma_0)$ with $\gamma_0 = \dfrac{\sigma_a^2}{1-\phi^2}$.

MLE
• Hence,
$$L(\phi,\sigma_a^2) = \left(2\pi\sigma_a^2\right)^{-n/2}\left(1-\phi^2\right)^{1/2}
\exp\!\left\{-\frac{1}{2\sigma_a^2}\left[\sum_{t=2}^{n}\left(Y_t - \phi Y_{t-1}\right)^2 + \left(1-\phi^2\right)Y_1^2\right]\right\}.$$
• The log-likelihood function:
$$\ln L(\phi,\sigma_a^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma_a^2 + \frac{1}{2}\ln\!\left(1-\phi^2\right)
- \frac{1}{2\sigma_a^2}\Big[\underbrace{\sum_{t=2}^{n}\left(Y_t - \phi Y_{t-1}\right)^2}_{S_*(\phi)} + \left(1-\phi^2\right)Y_1^2\Big],$$
where $S(\phi) = S_*(\phi) + \left(1-\phi^2\right)Y_1^2$.

MLE
• Here, $S_*(\phi)$ is the conditional sum of squares and $S(\phi)$ is the unconditional sum of squares.
• To find the value of $\phi$ where the likelihood function is maximized, solve
$$\frac{\partial \ln L(\phi,\sigma_a^2)}{\partial\phi} = 0 \;\Rightarrow\; \hat{\phi}.$$
• Then,
$$\hat{\sigma}_a^2 = \frac{S(\hat{\phi})}{n}.$$

MLE
• If we neglect $\ln(1-\phi^2)$, then the MLE reduces to the unconditional LSE:
$$\max_{\phi} L(\phi,\sigma_a^2) \;\equiv\; \min_{\phi} S(\phi).$$
• If we neglect both $\ln(1-\phi^2)$ and $(1-\phi^2)Y_1^2$, then the MLE reduces to the conditional LSE:
$$\max_{\phi} L(\phi,\sigma_a^2) \;\equiv\; \min_{\phi} S_*(\phi).$$

MLE
• MLEs are asymptotically unbiased, efficient, consistent and sufficient for large sample sizes, but the joint pdf is hard to deal with.

CONDITIONAL LEAST SQUARES ESTIMATION
• AR(1): $Y_t = \phi Y_{t-1} + a_t$ where $a_t \overset{i.i.d.}{\sim} N(0,\sigma_a^2)$, so $a_t = Y_t - \phi Y_{t-1}$ and
$$SSE = \sum_{t=2}^{n}a_t^2 = \sum_{t=2}^{n}\left(Y_t - \phi Y_{t-1}\right)^2 = S_*(\phi)$$
for the observed $Y_1,\dots,Y_n$. Setting the derivative to zero,
$$\frac{dS_*(\phi)}{d\phi} = -2\sum_{t=2}^{n}Y_{t-1}\left(Y_t - \phi Y_{t-1}\right) = 0
\;\Rightarrow\;
\hat{\phi} = \frac{\sum_{t=2}^{n}Y_tY_{t-1}}{\sum_{t=2}^{n}Y_{t-1}^2}.$$
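Since this part of the notes relates exact maximum likelihood to the conditional and unconditional sum-of-squares criteria, the short R sketch below (simulated series and object names are illustrative only, not from the notes) fits the same AR(1) with arima() under method = "ML" (exact Gaussian likelihood) and method = "CSS" (conditional sum of squares), and compares both with the closed-form conditional LSE derived above.

# a minimal sketch: exact MLE vs. conditional sum of squares for an AR(1)
set.seed(8)
y <- arima.sim(model = list(ar = 0.7), n = 200)                                # true phi = 0.7

fit_ml  <- arima(y, order = c(1, 0, 0), include.mean = FALSE, method = "ML")   # exact (unconditional) MLE
fit_css <- arima(y, order = c(1, 0, 0), include.mean = FALSE, method = "CSS")  # conditional sum of squares

# closed-form conditional LSE: phi_hat = sum(Y_t * Y_{t-1}) / sum(Y_{t-1}^2), t = 2,...,n
phi_cls <- sum(y[-1] * y[-length(y)]) / sum(y[-length(y)]^2)

c(ML = coef(fit_ml), CSS = coef(fit_css), CLS = phi_cls)

For a series this long all three estimates are typically very close, which is the point made above: the differences between the criteria matter mainly for short series or parameters near the stationarity/invertibility boundary.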
CONDITIONAL LSE
• If the process mean is different from zero,
$$\hat{\phi} = \frac{\sum_{t=2}^{n}\left(Y_t - \bar{Y}\right)\left(Y_{t-1} - \bar{Y}\right)}{\sum_{t=2}^{n}\left(Y_{t-1} - \bar{Y}\right)^2},$$
which is approximately the MME of $\phi$.

CONDITIONAL LSE
• MA(1): $Y_t = a_t - \theta a_{t-1}$, $|\theta| < 1$, $a_t \sim$ Gaussian $WN(0,\sigma_a^2)$. The inverted form (an AR(∞) representation) is
$$a_t = Y_t + \theta Y_{t-1} + \theta^2 Y_{t-2} + \dots$$
– Non-linear in terms of the parameters.
– Still a least squares problem, but $S_*(\theta)$ cannot be minimized analytically.
– Numerical nonlinear optimization methods such as Newton-Raphson or Gauss-Newton are needed.
• A similar problem arises in the ARMA case.

UNCONDITIONAL LSE
• Minimize $S(\phi)$ with respect to $\phi$, where
$$S(\phi) = \sum_{t=2}^{n}\left(Y_t - \phi Y_{t-1}\right)^2 + \left(1-\phi^2\right)Y_1^2,
\qquad \frac{dS(\phi)}{d\phi} = 0 \;\Rightarrow\; \hat{\phi}.$$
• This equation is nonlinear in $\phi$.
• We need nonlinear optimization techniques.

BACKCASTING METHOD
• Obtain the backward form of the ARMA(p, q) model,
$$\left(1 - \phi_1 F - \dots - \phi_p F^p\right)Y_t = \left(1 - \theta_1 F - \dots - \theta_q F^q\right)e_t,$$
where $F^jY_t = Y_{t+j}$.
• Instead of forecasting, backcast the past values of $Y_t$ and $a_t$, $t \le 0$. Obtain the unconditional log-likelihood function, then obtain the estimators.

EXAMPLE
• If there are only 2 observations in the time series (not realistic):
$$Y_1 = a_1, \qquad Y_2 = a_2 - \theta a_1, \qquad a_1, a_2 \overset{i.i.d.}{\sim} N(0,\sigma_a^2).$$
Find the MLE of $\theta$ and $\sigma_a^2$.

EXAMPLE
• US Quarterly Beer Production from 1975 to 1997.
> par(mfrow=c(1,3))
> plot(beer)
> acf(as.vector(beer),lag.max=36)
> pacf(as.vector(beer),lag.max=36)

EXAMPLE (contd.)
> library(uroot)
Warning message: package 'uroot' was built under R version 2.13.0
> HEGY.test(wts = beer, itsd = c(1, 1, c(1:3)), regvar = 0, selectlags = list(mode = "bic", Pmax = 12))
---- HEGY test ----
Null hypothesis: Unit root.
Alternative hypothesis: Stationarity.
HEGY statistics:
           Stat.  p-value
tpi_1     -3.339    0.085
tpi_2     -5.944    0.010
Fpi_3:4   13.238    0.010
> CH.test(beer)
---- Canova & Hansen test ----
Null hypothesis: Stationarity.
Alternative hypothesis: Unit root.
L-statistic: 0.817
Critical values:
  0.10  0.05 0.025  0.01
 0.846  1.01  1.16  1.35

EXAMPLE (contd.)
> plot(diff(beer),ylab='First Difference of Beer Production',xlab='Time')
> acf(as.vector(diff(beer)),lag.max=36)
> pacf(as.vector(diff(beer)),lag.max=36)

EXAMPLE (contd.)
> HEGY.test(wts = diff(beer), itsd = c(1, 1, c(1:3)), regvar = 0, selectlags = list(mode = "bic", Pmax = 12))
---- HEGY test ----
Null hypothesis: Unit root.
Alternative hypothesis: Stationarity.
HEGY statistics:
           Stat.  p-value
tpi_1     -6.067     0.01
tpi_2     -1.503     0.10
Fpi_3:4    9.091     0.01
Fpi_2:4    7.136       NA
Fpi_1:4   26.145       NA

EXAMPLE (contd.)
> fit1=arima(beer,order=c(3,1,0),seasonal=list(order=c(2,0,0), period=4))
> fit1
Call: arima(x = beer, order = c(3, 1, 0), seasonal = list(order = c(2, 0, 0), period = 4))
Coefficients:
          ar1      ar2      ar3    sar1    sar2
      -0.7380  -0.6939  -0.2299  0.2903  0.6694
s.e.   0.1056   0.1206   0.1206  0.0882  0.0841
sigma^2 estimated as 1.79:  log likelihood = -161.55,  aic = 335.1

> fit2=arima(beer,order=c(3,1,0),seasonal=list(order=c(3,0,0), period=4))
> fit2
Call: arima(x = beer, order = c(3, 1, 0), seasonal = list(order = c(3, 0, 0), period = 4))
Coefficients:
          ar1      ar2      ar3    sar1    sar2    sar3
      -0.8161  -0.8035  -0.3529  0.0444  0.5798  0.3387
s.e.   0.1065   0.1188   0.1219  0.1205  0.0872  0.1210
sigma^2 estimated as 1.646:  log likelihood = -158.01,  aic = 330.01
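As a follow-up to the two seasonal ARIMA fits above, a short R sketch for comparing them and checking the residuals of the preferred model is given below. It assumes the beer series and the objects fit1 and fit2 from the transcript above are available; note also that the HEGY.test/CH.test syntax shown earlier comes from an old version of the uroot package and may not run unchanged in current releases.

# a minimal sketch: model comparison and residual diagnostics for the two fits above
AIC(fit1); AIC(fit2)                                 # fit2 has the smaller AIC in the output above

tsdiag(fit2)                                         # standardized residuals, residual ACF, Ljung-Box p-values
Box.test(residuals(fit2), lag = 12,
         type = "Ljung-Box", fitdf = 6)              # fitdf = 6 estimated AR/SAR coefficients

If the Ljung-Box statistics show no significant residual autocorrelation, the estimated model can be carried forward to forecasting; otherwise the order specification should be revisited.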