Distribution of Residual Autocorrelations in Autoregressive-Integrated Moving Average Time Series Models

Author(s): G. E. P. Box and David A. Pierce*
Source: Journal of the American Statistical Association, Vol. 65, No. 332 (Dec., 1970), pp. 1509-1526 (Theory and Methods Section)
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2284333

Many statistical models, and in particular autoregressive-moving average time series models, can be regarded as means of transforming the data to white noise, that is, to an uncorrelated sequence of errors. If the parameters are known exactly, this random sequence can be computed directly from the observations; when this calculation is made with estimates substituted for the true parameter values, the resulting sequence is referred to as the "residuals," which can be regarded as estimates of the errors.
If the appropriate model has been chosen, there will be zero autocorrelation in the errors. In checking adequacy of fit it is therefore logical to study the sample autocorrelation function of the residuals. For large samples the residuals from a correctly fitted model resemble very closely the true errors of the process; however, care is needed in interpreting the serial correlations of the residuals. It is shown here that the residual autocorrelations are to a close approximation representable as a singular linear transformation of the autocorrelations of the errors so that they possess a singular normal distribution. Failing to allow for this results in a tendency to overlook evidence of lack of fit. Tests of fit and diagnostic checks are devised which take these facts into account.

* G. E. P. Box is professor of statistics, University of Wisconsin. David A. Pierce is on leave from the Department of Statistics, University of Missouri, Columbia, as statistician, Research Department, Federal Reserve Bank of Cleveland. This work was supported jointly by the Air Force Office of Scientific Research under Grant AFOSR-69-1803 and by the U. S. Army Research Office under Grant DA-ARO-D-31-124-G917.

1. INTRODUCTION

An approach to the modeling of stationary and non-stationary time series such as commonly occur in economic situations and control problems is discussed by Box and Jenkins [4, 5], building on the earlier work of several authors beginning with Yule [19] and Wold [17], and involves iterative use of the three-stage process of identification, estimation, and diagnostic checking. Given a discrete time series $z_t, z_{t-1}, z_{t-2}, \ldots$ and using $B$ for the backward shift operator such that $Bz_t = z_{t-1}$, the general autoregressive-integrated moving average (ARIMA) model of order $(p, d, q)$ discussed in [4, 5] may be written

$$\phi(B)\nabla^d z_t = \theta(B) a_t \tag{1.1}$$

where $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, $\{a_t\}$ is a sequence of independent normal deviates with common variance $\sigma_a^2$, to be referred to as "white noise," and where the roots of $\phi(B) = 0$ and $\theta(B) = 0$ lie outside the unit circle. In other words, if $w_t = \nabla^d z_t = (1 - B)^d z_t$ is the $d$th difference of the series $z_t$, then $w_t$ is the stationary, invertible, mixed autoregressive (AR)-moving average (MA) process given by

$$w_t = \sum_{i=1}^{p} \phi_i w_{t-i} - \sum_{j=1}^{q} \theta_j a_{t-j} + a_t,$$

and permitting $d > 0$ allows the original series to be (homogeneously) nonstationary. In some instances the model (1.1) will be appropriate after a suitable transformation is made on $z$; in others $z$ may represent the noise structure after allowing for some systematic model.

This general class of models is too rich to allow immediate fitting to a particular sample series $\{z_t\} = z_1, z_2, \ldots, z_n$, and the following strategy is therefore employed:

1. A process of identification is used to find a smaller subclass of models worth considering to represent the stochastic process.
2. A model in this subclass is fitted by efficient statistical methods.
3. An examination of the adequacy of the fit is made.

The object of the third or diagnostic checking stage is not merely to determine whether there is evidence of lack of fit but also to suggest ways in which the model may be modified when this is necessary. Two basic methods for doing this are suggested:

Overfitting. The model may be deliberately overparameterized in a way it is feared may be needed and in a manner such that the entertained model is obtained by setting certain parameters in the more general model at fixed values, usually zero. One can then check the adequacy of the original model by fitting the more general model and considering whether or not the additional parameters could reasonably take on the specified values appropriate to the simpler model.
Diagnostic checks applied to the residuals. The method of overfitting is most useful where the nature of the alternative feared model is known. Unfortunately, this information may not always be available, and less powerful but more general techniques are needed to indicate the way in which a particular model might be wrong. It is natural to consider the stochastic properties of the residuals $\hat{a} = (\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_n)'$ calculated from the sample series using the model (1.1) with estimates $\hat\phi_1, \hat\phi_2, \ldots, \hat\phi_p$; $\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_q$ substituted for the parameters. In particular their autocorrelation function

$$\hat{r}_k = \sum \hat{a}_t \hat{a}_{t-k} \Big/ \sum \hat{a}_t^2 \tag{1.2}$$

may be studied.

Now if the model were appropriate and the $a$'s for the particular sample series were calculated using the true parameter values, then these $a$'s would be uncorrelated random deviates, and their first $m$ sample autocorrelations $r = (r_1, r_2, \ldots, r_m)'$, where $m$ is small relative to $n$ and

$$r_k = \sum a_t a_{t-k} \Big/ \sum a_t^2, \tag{1.3}$$

would for moderate or large $n$ possess a multivariate normal distribution [1]. Also it can readily be shown that the $\{r_k\}$ are uncorrelated with variances

$$V(r_k) = \frac{n-k}{n(n+2)} \approx \frac{1}{n}, \tag{1.4}$$

from which it follows in particular that the statistic $n(n+2)\sum_{k=1}^{m}(n-k)^{-1} r_k^2$ would for large $n$ be distributed as $\chi^2$ with $m$ degrees of freedom; or as a further approximation,

$$n \sum_{k=1}^{m} r_k^2 \sim \chi^2_m. \tag{1.5}$$

It is tempting to suppose that these same properties might to a sufficient approximation be enjoyed by the $\hat{r}$'s from the fitted model; and diagnostic checks based on this supposition were suggested by Box and Jenkins [4] and Box, Jenkins, and Bacon [6]. If this assumption were warranted, approximate standard errors of $1/\sqrt{n}$ [or more accurate standard errors of $\sqrt{(n-k)/n(n+2)}$] could be attached to the $\hat{r}$'s and a quality-control-chart type of approach used, with particular attention being paid to the $\hat{r}$'s of low order for the indication of possible model inadequacies.
Also it might be supposed that Equation (1.5) with $\hat{r}$'s replacing $r$'s would still be approximately valid, so that large values of this statistic would place the model under suspicion.

It was pointed out by Durbin [10], however, that this approximation is invalid when applied to the residual autocorrelations from a fitted autoregressive model. For example, he showed that $\hat{r}_1$ calculated from the residuals of a first order autoregressive process could have a much smaller variance than $r_1$ for white noise.

The present paper therefore considers in some detail the properties of the $\hat{r}$'s and in particular their covariance matrix, both for AR processes (Sections 2 and 3) and for MA and ARIMA processes (Section 5). This is done with the intention of obtaining a suitable modification to the above diagnostic checking procedures (Sections 4 and 5.3).

The problem of testing fit in time series models has been considered previously by several authors. Quenouille [14]¹ developed a large-sample procedure for AR processes based on their sample partial autocorrelations, which possesses the same degree of accuracy as the present one.² Quenouille's test was subsequently extended [3, 15, 18] to cover MA and mixed models. Whittle [16] proposed tests based on the likelihood ratio and resembling the overfitting method above. The present procedure (a) is a unified method equally applicable to AR, MA, and general ARIMA models, (b) is motivated by the intuitive idea that the residuals from a correct fit should resemble the true errors of the process, and (c) can be used to suggest particular modifications in the model when lack of fit is found [5].

2. DISTRIBUTION OF RESIDUAL AUTOCORRELATIONS FOR THE AUTOREGRESSIVE PROCESS

In this section we obtain the joint large-sample distribution of the residual autocorrelations $\hat{r} = (\hat{r}_1, \ldots, \hat{r}_m)'$, where $\hat{r}_k$ is given by (1.2), for an autoregressive process.
This is done by first setting forth some general properties of AR processes, using these to obtain a set of linear constraints (2.9) satisfied by the $\{\hat{r}_k\}$, and then approximating $\hat{r}_k$ by a first order Taylor expansion (2.22) about the white noise autocorrelation $r_k$. Finally, these results are combined in matrix form to establish a linear relationship (2.27) between $\hat{r}$ and $r$ analogous to that between the residuals and true errors in a standard regression model, from which the distribution (2.29) of $\hat{r}$ readily follows. Subsections 2.5-2.7 then discuss examples and applications of this distribution.

¹ See also [11].
² The authors are grateful to a referee for this observation.

2.1 The Autoregressive Process

The general AR process of order $p$,

$$\phi(B) y_t = a_t, \tag{2.1}$$

where $B$, $\phi(B)$, and $\{a_t\}$ are as in (1.1), can also be expressed as a moving average of infinite order by writing $\psi(B) = \phi^{-1}(B) = (1 + \psi_1 B + \psi_2 B^2 + \cdots)$ to obtain

$$y_t = \psi(B) a_t = \sum_{j=0}^{\infty} \psi_j a_{t-j}, \tag{2.2}$$

where $\psi_0 = 1$. By equating coefficients in the relation $\psi(B)\phi(B) = 1$, it is seen that the $\psi$'s and $\phi$'s satisfy the relation

$$\psi_\nu = \begin{cases} \phi_1 \psi_{\nu-1} + \cdots + \phi_{\nu-1}\psi_1 + \phi_\nu, & \nu \le p \\ \phi_1 \psi_{\nu-1} + \cdots + \phi_p \psi_{\nu-p}, & \nu > p. \end{cases} \tag{2.3}$$

Therefore by setting $\psi_\nu = 0$ for $\nu < 0$, we have

$$\psi_0 = 1; \qquad \phi(B)\psi_\nu = 0, \quad \nu > 0. \tag{2.4}$$

Suppose then we have a series $\{y_t\}$ generated by the model (2.1) or (2.2), where in general $y_t = \nabla^d z_t$ can be the $d$th difference ($d = 0, 1, 2, \ldots$) of the actual observations. Then for given values $\dot\phi = (\dot\phi_1, \ldots, \dot\phi_p)'$ of the parameters we can define

$$\dot{a}_t = a_t(\dot\phi) = y_t - \dot\phi_1 y_{t-1} - \cdots - \dot\phi_p y_{t-p} = \dot\phi(B) y_t \tag{2.5}$$

and the corresponding autocorrelation

$$\dot{r}_k = r_k(\dot\phi) = \sum \dot{a}_t \dot{a}_{t-k} \Big/ \sum \dot{a}_t^2. \tag{2.6}$$

Thus, in particular,

1. $a_t(\phi) = a_t$ as in (2.1), (2.2);
2. $a_t(\hat\phi) = \hat{a}_t$ are the residuals when (2.1) is fitted and least squares estimates $\hat\phi$ obtained; and
3. $r_k(\hat\phi) = \hat{r}_k$ and $r_k(\phi) = r_k$ are respectively the residual and white noise autocorrelations (1.2) and (1.3).

2.2 Linear Constraints on the $\hat{r}$'s

It is known that the residuals $\{\hat{a}_t\}$ above satisfy the orthogonality conditions

$$\sum_{t=p+1}^{n} \hat{a}_t y_{t-j} = 0, \qquad 1 \le j \le p. \tag{2.7}$$

Therefore if we let

$$\hat\psi(B) = \hat\phi^{-1}(B) = (1 - \hat\phi_1 B - \cdots - \hat\phi_p B^p)^{-1}, \tag{2.8}$$

then $y_t = \hat\psi(B)\hat{a}_t$, and from (2.7) we have

$$0 = \sum_t \sum_k \hat\psi_k \hat{a}_t \hat{a}_{t-k-j} \quad\Longrightarrow\quad 0 = \sum_k \hat\psi_k \hat{r}_{k+j} = \sum_k \psi_k \hat{r}_{k+j} + O_p(1/n), \tag{2.9}$$

where the symbol $O_p$ introduced in (2.9) denotes "order in probability" as defined in [13].

In leading up to (2.9) we have presumably summed an infinite number of autocorrelations from a finite series. However since $\{y_t\}$ is stationary we have $\psi_k \to 0$ as $k$ becomes large; and unless $\phi$ is extremely close to the boundary of the stationarity region, this dying off of $\{\psi_k\}$ is fast so that the summation can generally be stopped at a value of $k$ much less than $n$. More precisely, we are assuming that $n$ is larger than a fixed number $N$ and for such $n$ there exists a sequence of numbers $m_n$ such that

(a) all $\psi_j$ where $j > m_n - p$ are of order $1/\sqrt{n}$ or smaller, and
(b) the ratio $m_n/n$ is itself of order $1/\sqrt{n}$.

Then in (2.9) and in all following discussion the error in stopping the summations at $k = m$ (we write $m$ for $m_n$ in the sequel) can to the present degree of approximation be ignored; and (b) also ensures that "end effects" (such as there being only $n - k$ terms summed in the numerator of $r_k$ compared with $n$ terms in the denominator) can also be neglected.

2.3 Linear Expansion of $\hat{r}_k$ about $r_k$

The root mean square error of $\hat\phi_j$, $1 \le j \le p$, defined by $\sqrt{E(\hat\phi_j - \phi_j)^2}$, is of order $1/\sqrt{n}$, and we can therefore approximate $\hat{r}_k$ by a first order Taylor expansion about $\dot\phi = \phi$ (evaluating the derivatives, however, at $\hat\phi$ rather than $\phi$ in order to obtain the simplification (2.12) below). Thus

$$\hat{r}_k = r_k + \sum_{j=1}^{p} (\phi_j - \hat\phi_j)\hat\delta_{jk} + O_p(1/n), \tag{2.10}$$

where

$$\hat\delta_{jk} = -\left.\frac{\partial \dot{r}_k}{\partial \dot\phi_j}\right|_{\dot\phi = \hat\phi}. \tag{2.11}$$

Now

$$\frac{\partial}{\partial \dot\phi_j}\Big[\sum \dot{a}_t^2\Big] = 0 \quad \text{at } \dot\phi = \hat\phi \tag{2.12}$$

so that

$$\hat\delta_{jk} = -\left[\frac{\partial \dot{c}_k}{\partial \dot\phi_j}\right]_{\dot\phi = \hat\phi} \Big[\sum \hat{a}_t^2\Big]^{-1} \tag{2.13}$$

where

$$\dot{c}_k = \sum_t \dot{a}_t \dot{a}_{t-k} = \sum_t [\dot\phi(B) y_t][\dot\phi(B) y_{t-k}] = \sum_t \sum_{i=0}^{p}\sum_{j=0}^{p} \dot\phi_i \dot\phi_j\, y_{t-i}\, y_{t-k-j}, \tag{2.14}$$

where in (2.14) and below, $\dot\phi_0 = \hat\phi_0 = \phi_0 = -1$.
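From (2.13) and (2.14) it follows that

$$\hat\delta_{jk} = -\frac{\sum y_t^2}{\sum \hat{a}_t^2} \sum_{i=0}^{p} \hat\phi_i \big[r^{(y)}_{k-i+j} + r^{(y)}_{k+i-j}\big] = -\frac{\sum_{i=0}^{p} \hat\phi_i \big[r^{(y)}_{k-i+j} + r^{(y)}_{k+i-j}\big]}{\sum_{i=0}^{p}\sum_{j=0}^{p} \hat\phi_i \hat\phi_j\, r^{(y)}_{i-j}} \tag{2.15}$$

where

$$r^{(y)}_\nu = \sum y_t y_{t-\nu} \Big/ \sum y_t^2.$$

Let us approximate $\hat\delta_{jk}$ by replacing $\hat\phi$'s and $r^{(y)}$'s in (2.15) by $\phi$'s and $\rho$'s (the theoretical parameters and autocorrelations of the autoregressive process $\{y_t\}$) and denote the result by $\delta_{jk}$. That is,

$$\delta_{jk} = -\frac{\sum_{i=0}^{p} \phi_i \big[\rho_{k-i+j} + \rho_{k+i-j}\big]}{\sum_{i=0}^{p}\sum_{j=0}^{p} \phi_i \phi_j\, \rho_{i-j}}. \tag{2.16}$$

Now from Bartlett's formula [2, Equation (7)] we have

$$r^{(y)}_k = \rho_k + O_p(1/\sqrt{n}), \tag{2.17}$$

and as in the discussion preceding (2.10), $\hat\phi_j = \phi_j + O_p(1/\sqrt{n})$; thus

$$\hat\delta_{jk} = \delta_{jk} + O_p(1/\sqrt{n}). \tag{2.18}$$

Since $\phi_j - \hat\phi_j$ is itself $O_p(1/\sqrt{n})$, the product $(\phi_j - \hat\phi_j)\hat\delta_{jk}$ changes only by $O_p(1/n)$, so that equation (2.10) holds when $\hat\delta_{jk}$ is replaced by $\delta_{jk}$.

By making use of the recursive relation which is satisfied by the autocorrelations of an autoregressive process, namely

$$\rho_\nu - \phi_1 \rho_{\nu-1} - \cdots - \phi_p \rho_{\nu-p} = \phi(B)\rho_\nu = 0, \qquad \nu \ge 1, \tag{2.19}$$

expression (2.16) can be simplified to yield

$$\delta_{jk} = \frac{\sum_{i=0}^{p} \phi_i\, \rho_{k-j+i}}{\sum_{i=0}^{p} \phi_i\, \rho_i}. \tag{2.20}$$

Thus $\delta_{jk}$ depends only on $(k - j)$, and we therefore write $\delta_{k-j} = \delta_{jk}$. Then it is straightforward to show that

(a) $\delta_0 = 1$,
(b) $\delta_\nu = 0$, $\nu < 0$, and thus
(c) $\phi(B)\delta_\nu = 0$, $\nu \ge 1$.

Comparing (a), (b), and (c) with the corresponding results (2.4) for $\psi_\nu$, we therefore have $\delta_\nu = \psi_\nu$, that is

$$\delta_{jk} = \psi_{k-j}, \tag{2.21}$$

whence, for $k = 1, 2, \ldots, m$,

$$\hat{r}_k = r_k + \sum_{j=1}^{p} (\phi_j - \hat\phi_j)\psi_{k-j} + O_p(1/n). \tag{2.22}$$

2.4 Representation of $\hat{r}$ as a Linear Transformation of $r$

We can now establish a relationship between the residual autocorrelations $\hat{r}$ and the white noise autocorrelations $r$. Let

$$X = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \psi_1 & 1 & \cdots & 0 \\ \psi_2 & \psi_1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ \psi_{m-1} & \psi_{m-2} & \cdots & \psi_{m-p} \end{pmatrix} = [X_1 \mid X_2 \mid \cdots \mid X_p], \tag{2.23}$$

an $m \times p$ matrix. Then to $O_p(1/n)$ we can write (2.22) in matrix form as

$$\hat{r} = r + X(\phi - \hat\phi), \tag{2.24}$$

where from (2.9)

$$\hat{r}'X = 0. \tag{2.25}$$

If we now multiply (2.24) on both sides by

$$Q = X(X'X)^{-1}X', \tag{2.26}$$

then using (2.25) we obtain

$$\hat{r} = (I - Q)r. \tag{2.27}$$

It is known [1] that $r$ is very nearly normal for $n$ moderately large.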
The vector of residual autocorrelations is thus approximately a linear transformation of a multi-normal variable and is therefore itself normally distributed. Specifically,

$$r \sim N(0, (1/n)I), \tag{2.28}$$

and hence

$$\hat{r} \sim N(0, (1/n)[I - Q]). \tag{2.29}$$

Note that the matrix $I - Q$ is idempotent of rank $m - p$, so that the distribution of $\hat{r}$ has a $p$-dimensional singularity.

2.5 Further Consideration of the Covariance Structure of the $\hat{r}$'s

It is illuminating to examine in greater detail the covariance matrix of $\hat{r}$, or equivalently the matrix $Q$. The latter matrix is idempotent of rank $p$, and its non-null latent vectors are the columns of $X$. Also,

$$X'X \approx \begin{pmatrix} \sum \psi_j^2 & \sum \psi_j \psi_{j-1} & \cdots & \sum \psi_j \psi_{j-p+1} \\ \sum \psi_j \psi_{j-1} & \sum \psi_j^2 & \cdots & \sum \psi_j \psi_{j-p+2} \\ \vdots & \vdots & & \vdots \\ \sum \psi_j \psi_{j-p+1} & \sum \psi_j \psi_{j-p+2} & \cdots & \sum \psi_j^2 \end{pmatrix} = \frac{\sigma_y^2}{\sigma_a^2} \begin{pmatrix} 1 & \rho_1 & \cdots & \rho_{p-1} \\ \rho_1 & 1 & \cdots & \rho_{p-2} \\ \vdots & \vdots & & \vdots \\ \rho_{p-1} & \rho_{p-2} & \cdots & 1 \end{pmatrix}, \tag{2.30}$$

which when multiplied by $\sigma_a^2$ is the autocovariance matrix of the process itself.

Let $c^{ij}$ be the $(ij)$th element of $(X'X)^{-1}$ (given explicitly in [9]), and similarly $q_{ij}$ for $Q$. If $\xi_j' = (\psi_{j-1}, \ldots, \psi_{j-p})$ denotes the $j$th row of $X$, then

$$q_{ij} = \xi_i'(X'X)^{-1}\xi_j = \sum_{k=1}^{p}\sum_{l=1}^{p} \psi_{i-k}\, c^{kl}\, \psi_{j-l} = (-n)\,\mathrm{cov}[\hat{r}_i, \hat{r}_j] \quad \text{if } i \ne j. \tag{2.31}$$

Since the elements of each column of $X$ satisfy the recursive relation (2.4), we have $\phi(B)\xi_j = 0$, and hence

$$\phi(B) q_{ij} = 0, \tag{2.32}$$

where in (2.32) $B$ can operate either on $i$ or on $j$. This establishes an interesting recursive structure in the residual autocorrelation covariance matrix $(1/n)(I - Q)$ and provides an important clue as to how rapidly the covariances die out and the variances approach 1 (in units of $1/n$). Also, because of this property the entire covariance matrix is determined by specifying the elements

$$\begin{matrix} q_{11} & q_{12} & \cdots & q_{1p} \\ & q_{22} & \cdots & q_{2p} \\ & & \ddots & \vdots \\ & & & q_{pp} \end{matrix} \tag{2.33}$$

of $Q$, which are readily obtained by inverting the $X'X$ matrix (2.30).

2.6 Covariance Matrix of $\hat{r}$ for first and second order processes

Consider, for example, the first order autoregressive process $y_t = \phi y_{t-1} + a_t$, which in accordance with (2.2) we can write as

$$y_t = (1 - \phi B)^{-1} a_t = \sum_{j=0}^{\infty} \phi^j a_{t-j}. \tag{2.34}$$

For this process, $\psi_j = \phi^j$ and $(X'X)^{-1} = 1 - \phi^2$.
From (2.31) the $(ij)$th element of $Q$ is therefore $\phi^{i+j-2}(1 - \phi^2)$, so that approximately the covariance matrix of the sample residual autocorrelations is

$$\frac{1}{n}(I - Q) = \frac{1}{n}\begin{pmatrix} \phi^2 & -\phi + \phi^3 & -\phi^2 + \phi^4 & \cdots \\ -\phi + \phi^3 & 1 - \phi^2 + \phi^4 & -\phi^3 + \phi^5 & \cdots \\ -\phi^2 + \phi^4 & -\phi^3 + \phi^5 & 1 - \phi^4 + \phi^6 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}. \tag{2.35}$$

For the second order process

$$y_t = (1 - \phi_1 B - \phi_2 B^2)^{-1} a_t = \psi(B) a_t, \tag{2.36}$$

we have

$$(X'X)^{-1} = \frac{\sigma_a^2}{\sigma_y^2 (1 - \rho_1^2)} \begin{pmatrix} 1 & -\rho_1 \\ -\rho_1 & 1 \end{pmatrix}, \qquad \rho_1 = \frac{\phi_1}{1 - \phi_2}, \qquad \sigma_y^2 = \frac{(1 - \phi_2)\,\sigma_a^2}{(1 + \phi_2)\big[(1 - \phi_2)^2 - \phi_1^2\big]}.$$

Thus

$$q_{11} = 1 - \phi_2^2, \qquad q_{12} = -\phi_1 \phi_2 (1 + \phi_2), \qquad q_{22} = 1 - \phi_2^2 - \phi_1^2 (1 + \phi_2)^2,$$

from which $Q$ and $(1/n)(I - Q)$ may be determined using (2.32). In particular,

$$V(\hat{r}_1) = \frac{1}{n}\phi_2^2, \qquad V(\hat{r}_2) = \frac{1}{n}\big[\phi_2^2 + \phi_1^2(1 + \phi_2)^2\big], \qquad V(\hat{r}_k) = \frac{1}{n}\big[1 - \phi_1 q_{k,k-1} - \phi_2 q_{k,k-2}\big], \quad k \ge 3. \tag{2.37}$$

From these examples we can see a general pattern emerging. As in (2.33) the first $p$ variances and corresponding covariances will be heavily dependent on the parameters $\phi_1, \ldots, \phi_p$ and in general can depart sharply from the corresponding values for white noise autocorrelations, whereas for $k \ge p + 1$ a "1" is introduced into the expression for variances (as in (2.35) and (2.37)), and the recursion (2.32) ensures that as $k$ increases the $\{\hat{r}_k\}$ behave increasingly like the corresponding $\{r_k\}$ with respect to both their variances and covariances.

2.7 The distribution of $n\sum \hat{r}_k^2$

We have remarked earlier that if the fitted model is appropriate and the parameters $\phi$ are exactly known, then the calculated $a_t$'s would be uncorrelated normal deviates, their serial correlations $r$ would be approximately $N(0, (1/n)I)$, and thus $n\sum_{k=1}^{m} r_k^2$ would possess a $\chi^2$ distribution with $m$ degrees of freedom. We now see that if $m$ is taken sufficiently large so that the elements after the $m$th in the latent vectors of $Q$ are essentially zero, then we should expect that to the order of approximation we are here employing, the statistic

$$n\sum_{k=1}^{m} \hat{r}_k^2 \tag{2.38}$$

obtained when estimates are substituted for the true parameters $\phi$
in the model, will still be distributed as $\chi^2$, only now with $m - p$ rather than $m$ degrees of freedom. This result is of considerable practical interest because it suggests that an overall test of the type discussed in [4] can in fact be justified when suitable modifications coming from a more careful analysis are applied. Later we consider in more detail the use of this test, along with procedures on individual $\hat{r}$'s, in diagnostic checking.

3. MONTE CARLO EXPERIMENT

We have made certain approximations in deriving the distribution of the residual autocorrelations, and it is therefore of interest to investigate this distribution empirically through repeated sampling and to compare the results with (2.29). This was done for the first order AR process for $\phi = 0, \pm.1, \pm.3, \pm.5, \pm.7, \pm.9$. For given $\phi$, $s = 50$ sets of $n = 200$ random normal deviates were generated on the computer using a method described in [7], with separate aggregates of deviates obtained for each parameter value. For the $j$th set a series $\{y_t^{(j)}\}$ was generated using formula (2.34), $\hat\phi^{(j)}$ was estimated, $\{\hat{a}_t^{(j)}\}$ determined, and the quantities

$$\hat{r}_k^{(j)} = \sum \hat{a}_t^{(j)} \hat{a}_{t-k}^{(j)} \Big/ \sum \big[\hat{a}_t^{(j)}\big]^2 \tag{3.1}$$

computed for $1 \le k \le m = 20$, $1 \le j \le s = 50$. This yielded sample variances and covariances

$$C_{kl} = \frac{1}{50}\sum_{j=1}^{50} \hat{r}_k^{(j)} \hat{r}_l^{(j)} \tag{3.2}$$

and sample correlations

$$r^*_{kl} = C_{kl} \big/ \sqrt{C_{kk} C_{ll}}. \tag{3.3}$$

The results of this Monte Carlo sampling are set out in detail in [8] and in general confirm the adequacy of the approximations used. As an example of these calculations, Table 1 compares the empirical variances (3.2) of $\hat{r}_k$ and correlations (3.3) of $(\hat{r}_1, \hat{r}_k)$ with their theoretical counterparts obtained from (2.35). Allowing for the sampling error of the Monte Carlo estimates themselves, there is good agreement between the two sets of quantities, a phenomenon which occurred also for the other values of $\phi$ considered.
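The sampling scheme of (3.1)-(3.2) can be sketched in a few lines. The following is a minimal illustration only, not the authors' program: it uses Python's standard `random.gauss` instead of the generator of [7], the function names and the seed are hypothetical, and the default of `s = 50` replications can be raised to reduce sampling error.

```python
import math
import random


def simulate_ar1(phi, n, rng):
    """Generate n observations of y_t = phi*y_{t-1} + a_t with unit-variance normal a_t."""
    # Start from the stationary distribution so that no burn-in is needed.
    y_prev = rng.gauss(0.0, 1.0) / math.sqrt(1.0 - phi * phi)
    series = []
    for _ in range(n):
        y_prev = phi * y_prev + rng.gauss(0.0, 1.0)
        series.append(y_prev)
    return series


def residual_autocorrelations(y, m):
    """Fit AR(1) by least squares; return the first m residual autocorrelations (3.1)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    phi_hat = num / den
    resid = [y[t] - phi_hat * y[t - 1] for t in range(1, len(y))]
    ss = sum(a * a for a in resid)
    return [
        sum(resid[t] * resid[t - k] for t in range(k, len(resid))) / ss
        for k in range(1, m + 1)
    ]


def monte_carlo_variances(phi, n=200, s=50, m=20, seed=1970):
    """Return n * C_kk for k = 1..m, the scaled second moments of (3.2)."""
    rng = random.Random(seed)  # the seed is an arbitrary illustrative choice
    sums = [0.0] * m
    for _ in range(s):
        r_hat = residual_autocorrelations(simulate_ar1(phi, n, rng), m)
        for k in range(m):
            sums[k] += r_hat[k] ** 2
    return [n * total / s for total in sums]


if __name__ == "__main__":
    scaled = monte_carlo_variances(phi=0.5, s=400)
    print("n*V(r1): empirical", round(scaled[0], 3), "theoretical", 0.5 ** 2)
    print("n*V(r2): empirical", round(scaled[1], 3), "theoretical", 1 - 0.5 ** 2 + 0.5 ** 4)
```

For $\phi = .5$ the first scaled variance should fall near $\phi^2 = .25$ and the second near $1 - \phi^2 + \phi^4 = .8125$, up to Monte Carlo error.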
Since the large-sample variance $\phi^2/n$ of $\hat{r}_1$ departs the most from the common variance of $1/n$ for white noise autocorrelations, an examination of the empirical behavior of this quantity is of particular interest. Thus Figure 1 shows the sample variance of $\hat{r}_1$ for $\phi = 0, \pm.1, \pm.3, \pm.5, \pm.7, \pm.9$ in relation to the parabola $V(\hat{r}_1) = \phi^2/n$, with reasonable agreement between the two. (The coefficient of variation of the sample variance of $\hat{r}_k$ for $\phi \ne 0$ is approximately $\sqrt{2/s} = 1/5$, independent of $k$ and $n$; at $\phi = 0$, $V(\hat{r}_1) = O(1/n^2)$.)

Table 1. THEORETICAL [AS IN (2.35)] AND EMPIRICAL (FROM MONTE CARLO SAMPLING) VARIANCES AND CORRELATIONS OF SAMPLE RESIDUAL AUTOCORRELATIONS FROM FIRST-ORDER AR PROCESS WITH $\phi = .5$

        Variance of r̂_k (multiplied by n)    Correlation between r̂_1 and r̂_k
   k       Theoretical    Empirical            Theoretical    Empirical
   1          .250          .244                  1.000         1.000
   2          .813          .676                  -.832         -.812
   3          .953          .864                  -.384         -.301
   4          .988          .967                  -.189         -.186
   5          .997          .741                  -.094         -.366
   6          .999         1.240                  -.047         -.221
   7         1.000          .870                  -.023          .083
   8         1.000         1.203                  -.012         -.148
   9         1.000          .982                  -.006         -.009
  10         1.000          .881                  -.003         -.080

Figure 1. THEORETICAL (LINE) AND EMPIRICAL (DOTS) VARIANCES OF $\hat{r}_1$ [horizontal axis: $\phi$ from $-1.0$ to $1.0$; curve: $V(\hat{r}_1) = \phi^2/n$]

There are several additional comparisons which can be made based on certain functions of the $\hat{r}$'s. Thus we have seen that

$$\sum_{k=1}^{\infty} \hat\phi^{k-1} \hat{r}_k = 0 \tag{3.4}$$

and in the course of our derivations we have had to make the approximation

$$\sum_{k=1}^{20} \phi^{k-1} \hat{r}_k \approx 0. \tag{3.5}$$

Some indication of the validity of this approximation is gained by examining the actual values of $\sum_{k=1}^{20} \phi^{k-1}\hat{r}_k$ from the sampling experiment, which were found to be distributed about zero with a variance of about one-hundredth that which would have been expected from the same linear form in white noise autocorrelations.
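The theoretical quantities used in these comparisons follow directly from (2.35). The sketch below (function names are illustrative, not from the paper) evaluates the scaled variances, the correlations with $\hat{r}_1$, and the large-sample variance of the linear form in (3.5), which the theory makes essentially zero for the residual autocorrelations.

```python
import math


def q_element(phi, i, j):
    """(i, j)th element of Q for an AR(1) process, with lags numbered from 1."""
    return phi ** (i + j - 2) * (1.0 - phi * phi)


def scaled_covariance(phi, i, j):
    """n * cov(r_hat_i, r_hat_j): the (i, j)th element of I - Q in (2.35)."""
    identity = 1.0 if i == j else 0.0
    return identity - q_element(phi, i, j)


def correlation_with_first(phi, k):
    """Large-sample correlation between r_hat_1 and r_hat_k."""
    cov = scaled_covariance(phi, 1, k)
    return cov / math.sqrt(scaled_covariance(phi, 1, 1) * scaled_covariance(phi, k, k))


def linear_form_variance(phi, m):
    """n * variance of sum_k phi^(k-1) * r_hat_k over the first m lags, from I - Q."""
    return sum(
        phi ** (i - 1) * phi ** (j - 1) * scaled_covariance(phi, i, j)
        for i in range(1, m + 1)
        for j in range(1, m + 1)
    )


if __name__ == "__main__":
    phi = 0.5
    for k in range(1, 11):
        print(k, round(scaled_covariance(phi, k, k), 3),
              round(correlation_with_first(phi, k), 3))
    # For white noise autocorrelations the same linear form has scaled variance
    # sum_k phi^(2(k-1)), which is far from zero.
    print(linear_form_variance(phi, 20), sum(phi ** (2 * (k - 1)) for k in range(1, 21)))
```

With $\phi = .5$ the first function gives the theoretical variance column of Table 1, and the second gives the theoretical correlation column.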
Of considerable importance because of its role in diagnostic checking is an examination of the quantity

$$n\sum_{k=1}^{m} \hat{r}_k^2 = 200\sum_{k=1}^{20} \hat{r}_k^2 \tag{3.6}$$

which as in (2.38) should possess a $\chi^2$-distribution with $\nu = m - 1 = 19$ degrees of freedom. Such a distribution has a mean and variance of 19 and 38, respectively, with which the Monte Carlo values can be compared. When this was done, the overall or pooled empirical mean was found to be significantly different from 19. This difference is plausible, however, when it is realized that the statistic $n\sum_{k=1}^{m} \hat{r}_k^2$ possesses a $\chi^2_{m-p}$ distribution only insofar as the white noise autocorrelations $r = (r_1, \ldots, r_m)'$ have a common variance of $1/n$; and from (1.4) it is seen that this approximation overestimates the true variance of a given $r_k$ by a factor of $(n+2)/(n-k)$. In particular, for $n = 200$, $m = 20$, and a typical value of $k = 10$, the actual variance $V(r_k)$ is $190/202 = 94$ percent of the $1/n$ approximation. Since the residual autocorrelations $\hat{r}$ are by (2.27) a linear transformation of $r$, it is reasonable to expect that a comparable depression of the variances of $\{\hat{r}_k\}$ would occur, and this would account for the discrepancy between the theoretical and empirical means of the statistic $200\sum \hat{r}_k^2$ encountered above. (This phenomenon would also explain the tendency for the empirical variances themselves, such as those in Table 1, to take on values averaging about 5 percent lower than those based on the matrix $(1/n)(I - Q)$ of (2.29).)

4. USE OF RESIDUAL AUTOCORRELATIONS IN DIAGNOSTIC CHECKING

We have obtained the large sample distribution of the residual autocorrelations $\hat{r}$ from fitting the correct model to a time series, and we have discussed the ways in which this distribution departs significantly from that of the white noise autocorrelations $r$. It is desirable now to consider the practical implications of these results in examining the adequacy of fit of a model.
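The overall statistic (3.6), its reference degrees of freedom, and the variance factor implied by (1.4) are simple to compute. The following is a small sketch with hypothetical function names; the residual autocorrelations in the usage lines are made-up illustrative numbers, not values from the paper.

```python
def portmanteau_statistic(r_hat, n):
    """n * sum of squared residual autocorrelations, as in (2.38) and (3.6)."""
    return n * sum(r * r for r in r_hat)


def degrees_of_freedom(m, p, q=0):
    """Reference chi-square degrees of freedom: m minus the number of fitted parameters."""
    return m - p - q


def variance_factor(n, k):
    """Ratio of the variance (n - k) / (n (n + 2)) in (1.4) to the 1/n approximation."""
    return (n - k) / (n + 2)


if __name__ == "__main__":
    # Hypothetical residual autocorrelations from an AR(1) fit to n = 200 observations.
    r_hat = [0.03, -0.11, 0.08, -0.02, 0.05, -0.09, 0.01, 0.04, -0.06, 0.02]
    stat = portmanteau_statistic(r_hat, n=200)
    nu = degrees_of_freedom(m=len(r_hat), p=1)
    print("statistic:", round(stat, 2), "degrees of freedom:", nu)
    # The typical depression of variance quoted in the text for n = 200, k = 10.
    print("variance factor:", round(variance_factor(200, 10), 2))
```

The statistic is then referred to a $\chi^2$ table with the computed degrees of freedom; no quantile function is included here because the standard library does not provide one.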
First of all it appears that even though the $\hat{r}$'s have a variance/covariance matrix which can differ very considerably from that of the $r$'s, the statistic $n\sum_{k=1}^{m} \hat{r}_k^2$ will (since the matrix $I - Q$ is idempotent) still possess a $\chi^2$-distribution, only now with $m - p$ rather than $m$ degrees of freedom. Thus the overall $\chi^2$-test discussed in Section 1 may be justified to the same degree of approximation as before when the number of degrees of freedom is appropriately modified.

However, regarding the "quality-control-chart" procedure, that is the comparison of the $\{\hat{r}_k\}$ with their standard errors, some modification is clearly needed. Figure 2 shows the straight-line standard error bands of width $1/\sqrt{n}$ associated with any set of white noise autocorrelations $\{r_k\}$. These stand in marked contrast to the corresponding bands for the residual autocorrelations $\{\hat{r}_k\}$, derived from their covariance matrix $(1/n)(I - Q)$ and shown in Figure 3 for selected first and second order AR processes. Since it is primarily the $r$'s of small lags that are most useful in revealing model inadequacies, we see that the consequence of treating $\hat{r}$'s as $r$'s in the diagnostic checking procedure can be a serious underestimation of significance, that is, a failure to detect lack of fit in the model when it exists. Of course, if the model would have been judged inadequate anyway, our conviction in this regard is now strengthened.

Figure 2. STANDARD ERROR LIMITS FOR WHITE NOISE AUTOCORRELATIONS $r_k$ [horizontal axis: lag $k = 1, \ldots, 6$]

Suppose, for example, that we identify a series of length 200 as first order autoregressive and after fitting find $\hat\phi = .5$. Suppose also that $\hat{r}_1 = .10$. Now the standard error of $r_1$ for white noise is $1/\sqrt{n} = .07$, so that $\hat{r}_1$ is well within the limits in Figure 2. Therefore if we erroneously regarded these as limits on $\hat{r}_1$ we would probably not conclude that this model was inadequate.
However, if the true process actually were first order autoregressive (say with $\phi = .5$), the standard error of $\hat{r}_1$ would be $|\phi|/\sqrt{n} = .035$; since the observed $\hat{r}_1 = .10$ is almost three times this value, we should be very suspicious of the adequacy of this fit.

The situation is further complicated by the existence of rather high correlations between the $\hat{r}$'s, especially between those of small lags. For the first order process, the most serious correlation is, from (2.35) and for $\phi > 0$,

$$\rho[\hat{r}_1, \hat{r}_2] = -\frac{1 - \phi^2}{\sqrt{1 - \phi^2 + \phi^4}},$$

which, for example, approaches $-1$ as $\phi \to 0+$ and is still as large as $-.6$ for $\phi = .7$. Correlation among the $\hat{r}$'s is even more prevalent in second and higher order processes, where (as for variances) those involving lags up to $k = p$ can be particularly serious. From then on their magnitude is controlled by the recursive relationship (2.32); in particular, the closer $\phi$ is to the boundary of the stationarity region, the slower will be the dying out of $\mathrm{cov}(\hat{r}_k, \hat{r}_l)$ or $\rho(\hat{r}_k, \hat{r}_l)$, although often in these situations the less serious will the initial correlations $\rho(\hat{r}_1, \hat{r}_2)$, $\rho(\hat{r}_2, \hat{r}_3)$, $\rho(\hat{r}_1, \hat{r}_3)$, etc., tend to be.

We have thus seen that the departure of the distribution of the residual autocorrelations $\hat{r}$ from that of white noise autocorrelations $r$ is serious enough to warrant some modifications in their use in diagnostic checking. The residual autocorrelation function, however, remains a powerful device for this purpose.

Figure 3. STANDARD ERROR LIMITS FOR RESIDUAL AUTOCORRELATIONS $\hat{r}_k$ [four panels: (a) AR(1), (b) AR(1), (c) AR(2), (d) AR(2); horizontal axis: lag $k = 1, \ldots, 6$]

5.
DISTRIBUTION OF RESIDUAL AUTOCORRELATIONS FOR THE MOVING AVERAGE AND GENERAL ARIMA PROCESSES

In obtaining the distribution of $\hat{r} = (\hat{r}_1, \ldots, \hat{r}_m)'$ for the pure autoregressive process in Section 2, considerable use was made of the recursive relation $\phi(B)\rho_k = 0$, which is not satisfied by moving average models $y_t = \theta(B)a_t$, or more generally by mixed models of the form (1.1) with $w_t = \nabla^d z_t$ denoting the stationary $d$th difference. It is fortunate, therefore, that these models have in common with the pure AR models (2.1) an important property (derived in Section 5.1) because of which the distribution of their residual autocorrelations can be found as an immediate consequence of the autoregressive solution (2.29).

This property is that if two time series, (a) the mixed autoregressive-moving average series (1.1), and (b) an autoregressive series

$$\pi(B)x_t = (1 - \pi_1 B - \cdots - \pi_{p+q}B^{p+q})x_t = a_t \tag{5.1}$$

are both generated from the same set of deviates $\{a_t\}$, and moreover if

$$\pi(B) = \phi(B)\theta(B), \tag{5.2}$$

then when these models are each fitted by least squares, their residuals, and hence also their residual autocorrelations, will be very nearly the same. Therefore if a mixed model of order $(p, d, q)$ is correctly identified and fitted, its residual autocorrelations for $n$ sufficiently large will be distributed as though the model had been of order $(p+q, d, 0)$ with the relations between the two sets of parameters given by (5.2). In particular the $\psi$'s comprising the $X$-matrix (2.23) for the model (1.1) are the coefficients in $\psi(B) = [\phi(B)\theta(B)]^{-1}$.

5.1 Equality of Residuals in AR and ARIMA Models

Let $w_t$ and $x_t$ be as in (1.1) and (5.1); (5.2) then implies

$$w_t = \theta^2(B)x_t. \tag{5.3}$$

As in (2.5), define

$$\dot{a}_t^{AR} = a_t^{AR}(\dot\pi) = \dot\pi(B)x_t = -\sum_{i=0}^{p+q} \dot\pi_i x_{t-i}, \tag{5.4}$$

where $\dot\pi_0 = -1$, and now also

$$\dot{a}_t^* = a_t^*(\dot\phi, \dot\theta) = \dot\phi(B)\dot\theta^{-1}(B)w_t = \Big[\sum_{i=0}^{p} \dot\phi_i B^i\Big]\Big[\sum_{j=0}^{q} \dot\theta_j B^j\Big]^{-1} w_t, \tag{5.5}$$

where $\dot\phi_0 = \dot\theta_0 = -1$. We will expand these quantities about the true parameter values and go through a least squares estimation in each case which is analogous to writing the linear regression model $y = X\beta + \epsilon$ as

$$e = y - X\dot\beta = X(\beta - \dot\beta) + \epsilon = X\delta + \epsilon, \tag{5.6}$$

for fixed $\dot\beta$, and then performing the regression directly on $e$ rather than on $y$.

The equality of the residuals in the two cases depends heavily on the fact that the derivatives in each expansion involve the same autoregressive variable $x_t$. Thus

$$\frac{\partial \dot{a}_t^{AR}}{\partial \dot\pi_j} = -x_{t-j}, \qquad 1 \le j \le p + q,$$

irrespective of $\dot\pi$; and

$$\left.\frac{\partial \dot{a}_t^*}{\partial \dot\phi_j}\right|_{(\phi, \theta)} = -\theta^{-1}(B)w_{t-j} = -\theta(B)x_{t-j}, \qquad 1 \le j \le p,$$

$$\left.\frac{\partial \dot{a}_t^*}{\partial \dot\theta_j}\right|_{(\phi, \theta)} = \phi(B)\theta^{-2}(B)w_{t-j} = \phi(B)x_{t-j}, \qquad 1 \le j \le q.$$

Then

$$\dot{a}_t^{AR} = a_t^{AR} + \sum_{j=1}^{p+q} (\pi_j - \dot\pi_j)x_{t-j} \tag{5.7}$$

and approximately

$$\dot{a}_t^* = a_t^* + \sum_{j=1}^{p} (\phi_j - \dot\phi_j)\theta(B)x_{t-j} - \sum_{j=1}^{q} (\theta_j - \dot\theta_j)\phi(B)x_{t-j} \tag{5.8}$$

$$= a_t^* + \sum_{i=0}^{p}\sum_{j=0}^{q} \big[\dot\phi_i(\theta_j - \dot\theta_j) - \dot\theta_j(\phi_i - \dot\phi_i)\big]x_{t-i-j}, \tag{5.9}$$

where in passing from (5.8) to (5.9) the coefficients of $\theta(B)$ and $\phi(B)$ have been replaced by their dotted counterparts, which changes only second order terms. Thus letting $\beta = (\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q)'$ and collecting the coefficients of $x_{t-1}, \ldots, x_{t-p-q}$ in (5.9), we see that

$$\dot{a}_t^* = a_t^* + \sum_{k=1}^{p+q} \big[A(\beta - \dot\beta)\big]_k\, x_{t-k}, \tag{5.10}$$

where $A$ is a $(p+q)$-square matrix whose elements involve $\dot\beta$ but not the true parameter values $\beta$. For example, if $p = q = 1$, (5.9) gives

$$A = \begin{pmatrix} 1 & -1 \\ -\dot\theta & \dot\phi \end{pmatrix}. \tag{5.11}$$

Now equations (5.7) and (5.9) can be written as

$$\dot{a}^{AR} = a + X(\pi - \dot\pi) \tag{5.12}$$

$$\dot{a}^* = a + XA(\beta - \dot\beta) \tag{5.13}$$

where $X$ here denotes the matrix whose $(t, j)$th element is $x_{t-j}$, where the error in (5.13) is $O(|\beta - \dot\beta|^2)$, and where we have made use of the fact that, at $\dot\pi = \pi$ and $\dot\beta = \beta$,

$$a_t^{AR} = a_t^* = a_t. \tag{5.14}$$

Thus in (5.12) the sum of squares $\hat{a}'\hat{a} = \sum \hat{a}_t^2 = \sum [a_t^{AR}(\hat\pi)]^2$ is minimized as a function of $\hat\pi$ when

$$\hat\pi - \dot\pi = (X'X)^{-1}X'\dot{a}^{AR}, \tag{5.15}$$

while in (5.13) if we write $\dot{a}^* = a + X[A(\beta - \dot\beta)] = a + Z(\beta - \dot\beta)$, then the sum of squares $\hat{a}'\hat{a} = \sum \hat{a}_t^2 = \sum [a_t^*(\hat\beta)]^2$ is minimized as a function of $\hat\beta$ when

$$\hat\beta - \dot\beta = (Z'Z)^{-1}Z'\dot{a}^* = A^{-1}(X'X)^{-1}X'\dot{a}^*,$$

that is,

$$A(\hat\beta - \dot\beta) = (X'X)^{-1}X'\dot{a}^*. \tag{5.16}$$

Then by setting $\dot{a} = a$ in (5.15) and (5.16), we have from (5.14) the important equality

$$\hat\pi - \pi = A(\hat\beta - \beta) = (X'X)^{-1}X'a \tag{5.17}$$

and finally by setting "$\cdot$" = "$\wedge$" in (5.12) and (5.13), it follows from (5.17) that to $O_p(1/n)$

$$\hat{a}^{AR} = a + X(\pi - \hat\pi) = a + XA(\beta - \hat\beta) = \hat{a}^*, \tag{5.18}$$

and thus (to the same order) $\hat{r}^{AR} = \hat{r}^*$, as we set out to show.
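One practical consequence of (5.1)-(5.2) is that the large-sample covariance matrix $(1/n)(I - Q)$ of a mixed model can be computed from the $\psi$ weights of $[\phi(B)\theta(B)]^{-1}$ exactly as for a pure AR process of order $p + q$. The following is a minimal sketch under that reading; the function names are hypothetical and the small linear solver is an illustrative convenience, not part of the paper.

```python
def multiply_operators(phi, theta):
    """Coefficients pi_1..pi_{p+q} of pi(B) = phi(B) * theta(B), as in (5.2).

    phi and theta hold phi_1..phi_p and theta_1..theta_q for operators of the
    form 1 - c_1 B - c_2 B^2 - ...
    """
    a = [1.0] + [-c for c in phi]
    b = [1.0] + [-c for c in theta]
    prod = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    return [-c for c in prod[1:]]


def psi_weights(pi, m):
    """psi_0..psi_{m-1} of 1/pi(B), using the recursion (2.4)."""
    psi = [1.0]
    for v in range(1, m):
        psi.append(sum(pi[i] * psi[v - 1 - i] for i in range(len(pi)) if v - 1 - i >= 0))
    return psi


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    size = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(size)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]


def scaled_variances(phi, theta, m):
    """n * V(r_hat_k) for k = 1..m: the diagonal of I - Q with X built as in (2.23)."""
    pi = multiply_operators(phi, theta)
    order = len(pi)
    psi = psi_weights(pi, m)
    rows = [[psi[k - j] if k >= j else 0.0 for j in range(order)] for k in range(m)]
    xtx = [[sum(row[a] * row[b] for row in rows) for b in range(order)] for a in range(order)]
    return [1.0 - sum(x * z for x, z in zip(row, solve(xtx, row))) for row in rows]


if __name__ == "__main__":
    # ARMA(1,1) with illustrative values phi = .5, theta = .3, treated as AR(2) in pi.
    print(multiply_operators([0.5], [0.3]))
    print([round(v, 4) for v in scaled_variances([0.5], [0.3], 30)[:4]])
```

For a first order MA process with parameter $\theta$ this reproduces the AR(1) values of (2.35) with $\phi$ replaced by $\theta$, in line with the equality of residuals shown above.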
5.2 Monte Carlo Experiment

The equality (5.18) between the residuals from the autoregressive and mixed models depends on the accuracy of the expansion (5.8), that is, on the extent of linearity in the moving average model, between the true and estimated values $\theta$ and $\hat\theta$. It is therefore worthwhile to confirm this model-duality by generating and fitting pairs of series of the form (1.1) and (5.1) and comparing their residuals, or more to our purpose, their residual autocorrelations. This was done for $p + q = 1$ and $p + q = 2$ for series of length 200. Some indication of the closeness of the agreement is obtained from the few results for first order AR and MA processes shown in Table 2, where it is seen that the residual autocorrelations $\hat{r}_k^{AR}$ and $\hat{r}_k^{MA}$ are equal or nearly equal to the second decimal place.

Table 2. RESIDUAL CORRELATIONS FROM FIRST ORDER AR AND MA TIME SERIES GENERATED FROM SAME WHITE NOISE ($n = 200$)

               φ = θ = .1           φ = θ = .3           φ = θ = .9
    k        r̂_k^AR   r̂_k^MA      r̂_k^AR   r̂_k^MA      r̂_k^AR   r̂_k^MA
    1        -.029     -.010        .003     -.005       -.048     -.057
    2         .164      .169        .044      .045        .157      .151
    3         .096      .099       -.098     -.096        .008      .009
    4        -.050     -.049        .014      .021       -.126     -.127
    5        -.003     -.006        .057      .058        .034      .035
    6        -.143     -.144        .010      .012       -.091     -.090
    7        -.023     -.026       -.004      .001       -.001     -.000
    8        -.040     -.041       -.054     -.046       -.038     -.035
    9         .010      .009        .052      .052       -.004      .000
   10        -.049     -.049       -.065     -.067        .113      .116
 φ̂ or θ̂      .159      .057        .543      .451        .922      .870

A sampling experiment of the type described in Section 3 was also performed for the first order MA process. The results were very similar, which is to be expected in view of (5.18).

5.3 Conclusions

We have shown above that to a close approximation the residuals from any moving average or mixed autoregressive-moving average process will be the same as those from a suitably chosen autoregressive process. We have further confirmed the adequacy of this approximation by empirical calculation. It follows from this that we need not consider separately these two classes of processes; more precisely,

1.
We can immediately use the AR result to write down the variance/covariance matrix of $\hat r$ for any autoregressive-integrated moving average process (1.1) by considering the corresponding variance/covariance matrix of $\hat r$ from the pure AR process

$$\pi(B)x_t = \theta(B)\phi(B)x_t = a_t. \tag{5.19}$$

2. All considerations regarding the use of residual autocorrelations in tests of fit and diagnostic checking discussed in Section 4 for the autoregressive model therefore apply equally to moving average and mixed models.

3. In particular it follows from the above that a "portmanteau" test for the adequacy of any ARIMA process is obtained by referring $n\sum_{k=1}^{m}\hat r_k^2$ to a $\chi^2$ distribution with $\nu$ degrees of freedom, where $\nu = m - p - q$.

REFERENCES

[1] Anderson, R. L., "Distribution of the Serial Correlation Coefficient," The Annals of Mathematical Statistics, 13 (March 1942), 1-13.
[2] Bartlett, M. S., "On the Theoretical Specification and Sampling Properties of Autocorrelated Time Series," Journal of the Royal Statistical Society, Series B, 8 (April 1946), 27-41.
[3] Bartlett, M. S. and Diananda, P. H., "Extensions of Quenouille's Tests for Autoregressive Schemes," Journal of the Royal Statistical Society, Series B, 12 (April 1950), 108-15.
[4] Box, G. E. P. and Jenkins, G. M., Statistical Models for Prediction and Control, Technical Reports #72, 77, 79, 94, 95, 99, 103, 104, 116, 121, and 122, Department of Statistics, University of Wisconsin, Madison, Wisconsin, 1967.
[5] Box, G. E. P. and Jenkins, G. M., Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day, Inc., 1970.
[6] Box, G. E. P., Jenkins, G. M. and Bacon, D. W., "Models for Forecasting Seasonal and Non-Seasonal Time Series," in B. Harris, ed., Spectral Analysis of Time Series, New York: John Wiley & Sons, Inc., 1967.
[7] Box, G. E. P. and Muller, M. E., "Note on the Generation of Random Normal Deviates," The Annals of Mathematical Statistics, 29 (June 1958), 610-11.
[8] Box, G. E. P. and Pierce, D. A., "Distribution of Residual Autocorrelations in Integrated Autoregressive-Moving Average Time Series Models," Technical Report #154, Department of Statistics, University of Wisconsin, Madison, April, 1968.
[9] Durbin, J., "Efficient Estimation of Parameters in Moving Average Models," Biometrika, 46 (December 1959), 306-16.
[10] Durbin, J., "Testing for Serial Correlation in Least-Squares Regression When Some of the Regressors are Lagged Dependent Variables," Econometrica, 38 (May 1970), 410-21.
[11] Grenander, U. and Rosenblatt, M., Statistical Analysis of Stationary Time Series, New York: John Wiley & Sons, Inc., 1957.
[12] Mann, H. B. and Wald, A., "On the Statistical Treatment of Linear Stochastic Difference Equations," Econometrica, 11 (July 1943), 173-220.
[13] Mann, H. B. and Wald, A., "On Stochastic Limit and Order Relationships," The Annals of Mathematical Statistics, 14 (September 1943), 217-26.
[14] Quenouille, M. H., "A Large-Sample Test for the Goodness of Fit of Autoregressive Schemes," Journal of the Royal Statistical Society, Series A, 110 (June 1947), 123-9.
[15] Walker, A. M., "Note on a Generalization of the Large-Sample Goodness of Fit Test for Linear Autoregressive Schemes," Journal of the Royal Statistical Society, Series B, 12 (April 1950), 102-7.
[16] Whittle, P., "Tests of Fit in Time Series," Biometrika, 39 (December 1952), 309-18.
[17] Wold, H., A Study in the Analysis of Stationary Time Series, Stockholm: Almquist and Wiksell, 1938.
[18] Wold, H., "A Large-Sample Test for Moving Averages," Journal of the Royal Statistical Society, Series B, 11 (April 1949), 297-305.
[19] Yule, G. U., "On a Method of Investigating Periodicities in Disturbed Series, with Special Reference to Wolfer's Sunspot Numbers," Philosophical Transactions, (A) 226 (July 1927), 267-98.