STAT 497 APPLIED TIME SERIES ANALYSIS

INTRODUCTION

DEFINITION
• A stochastic process $\{Y_t\}$ is a collection of random variables, or a process that develops in time according to probabilistic laws.
• The theory of stochastic processes gives us a formal way to look at time series variables.

DEFINITION
• A time series is a sequence of actual, fixed values, like: 61, 63, 58, 64, 56, 48, 39, 42, ...
• A stochastic process is a sequence of random variables that have some kind of specified correlation or other distributional relationship between them.

DEFINITION
• Stochastic processes are often used in modeling time series data: we assume that the time series we have was produced by a stochastic process, find the parameters of a stochastic process that would be likely to produce that series, and then use that stochastic process as a model in predicting future values of the time series.

DEFINITION
• $Y(w, t)$: stochastic process, where $w$ belongs to the sample space and $t$ belongs to the index set.
• For a fixed $t$, $Y(w, t)$ is a random variable.
• For a given $w$, $Y(w, t)$, as a function of $t$, is called a sample function or a realization.

DEFINITION
• A time series is a realization or sample function from a certain stochastic process.
• A time series is a set of observations generated sequentially in time. Therefore, the observations are dependent on each other, which means that we do NOT have a random sample.
• We assume that observations are equally spaced in time.
• We also assume that closer observations might have stronger dependency.

DEFINITION
• A discrete time series is one in which the set $T_0$ of times at which observations are made is a discrete set. A continuous time series is obtained when observations are recorded continuously over some time interval.

EXAMPLES
• Data in business, economics, engineering, environment, medicine, earth sciences, and other areas of scientific investigation are often collected in the form of time series:
• Hourly temperature readings
• Daily stock prices
• Weekly traffic volume
• Annual growth rate
• Seasonal ice cream consumption
• Electrical signals

EXAMPLES
[Plots of example time series; the final one shows the sunspot numbers series.]

OBJECTIVES OF TIME SERIES ANALYSIS
• Understanding the dynamic or time-dependent structure of the observations of a single series (univariate analysis)
• Forecasting future observations
• Ascertaining the leading, lagging and feedback relationships among several series (multivariate analysis)

STEPS IN TIME SERIES ANALYSIS
• Model Identification
– Time series plot of the series
– Check for the existence of a trend or seasonality
– Check for sharp changes in behavior
– Check for possible outliers
• Remove the trend and the seasonal component to get stationary residuals.
• Estimation
– MME
– MLE
• Diagnostic Checking
– Normality of error terms
– Independence of error terms
– Constant error variance (homoscedasticity)
• Forecasting
– Exponential smoothing methods
– Minimum MSE forecasting

CHARACTERISTICS OF A SERIES
• For a time series $\{Y_t,\ t = 0, \pm 1, \pm 2, \ldots\}$:

THE MEAN FUNCTION: $\mu_t = E(Y_t)$, which exists iff $E|Y_t| < \infty$. The expected value of the process at time $t$.

THE VARIANCE FUNCTION: $\sigma_t^2 = Var(Y_t) = E[(Y_t - \mu_t)^2] = E(Y_t^2) - \mu_t^2$

CHARACTERISTICS OF A SERIES
• THE AUTOCOVARIANCE FUNCTION:
$\gamma_{t,s} = Cov(Y_t, Y_s) = E[(Y_t - \mu_t)(Y_s - \mu_s)] = E(Y_t Y_s) - \mu_t \mu_s, \quad t, s = 0, \pm 1, \pm 2, \ldots$
The covariance between the value at time $t$ and the value at time $s$ of a stochastic process $\{Y_t\}$.
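Given a single observed series, the autocovariance at lag $k$ is estimated by averaging products of deviations from the sample mean taken $k$ time points apart. Below is a minimal sketch in Python, assuming numpy and the common divisor-$n$ convention; the function name is an illustrative choice, not from the slides, and the example series reuses the short sequence from the definition above.

```python
import numpy as np

def sample_autocov(y, k):
    """Sample autocovariance at lag k: average of products of
    deviations from the sample mean, k time points apart
    (divisor n convention)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    ybar = y.mean()
    return np.sum((y[k:] - ybar) * (y[:n - k] - ybar)) / n

# The short example series from the definition above
y = [61, 63, 58, 64, 56, 48, 39, 42]
print(sample_autocov(y, 0))  # lag 0: the sample variance
print(sample_autocov(y, 1))  # lag-1 autocovariance
```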
• THE AUTOCORRELATION FUNCTION:
$\rho_{t,s} = Corr(Y_t, Y_s) = \dfrac{\gamma_{t,s}}{\sqrt{\sigma_t^2 \sigma_s^2}}, \quad -1 \le \rho_{t,s} \le 1$
The correlation of the series with itself.

EXAMPLE
• Moving average process: Let $\epsilon_t \sim$ i.i.d.$(0, 1)$, and $X_t = \epsilon_t + 0.5\,\epsilon_{t-1}$.

EXAMPLE
• RANDOM WALK: Let $e_1, e_2, \ldots$ be a sequence of i.i.d. r.v.s with 0 mean and variance $\sigma_e^2$. The observed time series $\{Y_t,\ t = 1, 2, \ldots, n\}$ is obtained as
$Y_1 = e_1$
$Y_2 = e_1 + e_2 \Rightarrow Y_2 = Y_1 + e_2$
$Y_3 = e_1 + e_2 + e_3 \Rightarrow Y_3 = Y_2 + e_3$
$\vdots$
$Y_t = e_1 + \cdots + e_t \Rightarrow Y_t = Y_{t-1} + e_t$

A RULE ON THE COVARIANCE
• If $c_1, c_2, \ldots, c_m$ and $d_1, d_2, \ldots, d_n$ are constants and $t_1, t_2, \ldots, t_m$ and $s_1, s_2, \ldots, s_n$ are time points, then
$Cov\left(\sum_{i=1}^{m} c_i Y_{t_i}, \sum_{j=1}^{n} d_j Y_{s_j}\right) = \sum_{i=1}^{m} \sum_{j=1}^{n} c_i d_j\, Cov(Y_{t_i}, Y_{s_j})$
$Var\left(\sum_{i=1}^{m} c_i Y_{t_i}\right) = \sum_{i=1}^{m} c_i^2\, Var(Y_{t_i}) + 2 \sum_{i=2}^{m} \sum_{j=1}^{i-1} c_i c_j\, Cov(Y_{t_i}, Y_{t_j})$

JOINT PDF OF A TIME SERIES
• Remember that
$F_{X_1}(x_1)$: the marginal cdf
$f_{X_1}(x_1)$: the marginal pdf
$F_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$: the joint cdf
$f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$: the joint pdf

JOINT PDF OF A TIME SERIES
• For the observed time series, say we have two points, $t$ and $s$.
• The marginal pdfs: $f_{Y_t}(y_t)$ and $f_{Y_s}(y_s)$
• The joint pdf: $f_{Y_t, Y_s}(y_t, y_s) \ne f_{Y_t}(y_t) \cdot f_{Y_s}(y_s)$, since the observations are dependent.

JOINT PDF OF A TIME SERIES
• Since we have only one observation for each r.v. $Y_t$, inference is too complicated if the distributions (or moments) change for all $t$ (i.e., change over time). So, we need a simplification.
[Plot of a single realization, highlighting that the r.v.s $Y_4$ and $Y_6$ are each observed only once.]

JOINT PDF OF A TIME SERIES
• To be able to identify the structure of the series, we need the joint pdf of $Y_1, Y_2, \ldots, Y_n$. However, we have only one sample; that is, one observation from each random variable. Therefore, it is very difficult to identify the joint distribution. Hence, we need an assumption to simplify our problem. This simplifying assumption is known as STATIONARITY.

STATIONARITY
• The most vital and common assumption in time series analysis.
• The basic idea of stationarity is that the probability laws governing the process do not change with time.
• The process is in statistical equilibrium.

TYPES OF STATIONARITY
• STRICT (STRONG OR COMPLETE) STATIONARY PROCESS: Consider a finite set of r.v.s $(Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n})$ from a stochastic process $\{Y(w, t);\ t = 0, \pm 1, \pm 2, \ldots\}$.
• The n-dimensional distribution function is defined by
$F_{Y_{t_1}, \ldots, Y_{t_n}}(y_1, \ldots, y_n) = P(w : Y_{t_1} \le y_1, \ldots, Y_{t_n} \le y_n)$
where $y_i$, $i = 1, 2, \ldots, n$, are any real numbers.

STRONG STATIONARITY
• A process is said to be first order stationary in distribution if its one-dimensional distribution function is time-invariant, i.e., $F_{Y_{t_1}}(y_1) = F_{Y_{t_1+k}}(y_1)$ for any $t_1$ and $k$.
• Second order stationary in distribution if $F_{Y_{t_1}, Y_{t_2}}(y_1, y_2) = F_{Y_{t_1+k}, Y_{t_2+k}}(y_1, y_2)$ for any $t_1, t_2$ and $k$.
• n-th order stationary in distribution if $F_{Y_{t_1}, \ldots, Y_{t_n}}(y_1, \ldots, y_n) = F_{Y_{t_1+k}, \ldots, Y_{t_n+k}}(y_1, \ldots, y_n)$ for any $t_1, \ldots, t_n$ and $k$.

STRONG STATIONARITY
• n-th order stationarity in distribution = strong stationarity.
• Shifting the time origin by an amount $k$ has no effect on the joint distribution, which must therefore depend only on the time intervals between $t_1, t_2, \ldots, t_n$, not on absolute time $t$.

STRONG STATIONARITY
• So, for a strong stationary process (see the simulation sketch after this list):
i) $f_{Y_{t_1}, \ldots, Y_{t_n}}(y_1, \ldots, y_n) = f_{Y_{t_1+k}, \ldots, Y_{t_n+k}}(y_1, \ldots, y_n)$
ii) $E(Y_t) = E(Y_{t+k}) \Rightarrow \mu_t = \mu_{t+k} = \mu,\ \forall t, k$. The expected value of the series is constant over time, not a function of time.
iii) $Var(Y_t) = Var(Y_{t+k}) \Rightarrow \sigma_t^2 = \sigma_{t+k}^2 = \sigma^2,\ \forall t, k$. The variance of the series is constant over time (homoscedastic).
iv) $Cov(Y_t, Y_s) = Cov(Y_{t+k}, Y_{s+k}) \Rightarrow \gamma_{t,s} = \gamma_{t+k, s+k},\ \forall t, k$. The autocovariance is not constant, but it does not depend on time itself; it depends only on the time interval, which we call the "lag" $k$.
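Properties (ii)-(iv) can be checked empirically for the moving average process $X_t = \epsilon_t + 0.5\,\epsilon_{t-1}$ from the example above, whose theoretical moments are $\mu = 0$, $\gamma_0 = 1.25$, $\gamma_1 = 0.5$ and $\gamma_k = 0$ for $k \ge 2$. A minimal simulation sketch in Python, assuming numpy; the seed, sample size and function name are arbitrary choices, not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)    # arbitrary seed for reproducibility
n = 100_000
eps = rng.standard_normal(n + 1)  # i.i.d. (0, 1) errors
x = eps[1:] + 0.5 * eps[:-1]      # X_t = eps_t + 0.5 * eps_{t-1}

xbar = x.mean()

def gamma_hat(k):
    """Sample autocovariance at lag k (divisor n convention)."""
    return np.sum((x[k:] - xbar) * (x[:n - k] - xbar)) / n

print(xbar)          # near the constant mean mu = 0
print(gamma_hat(0))  # near gamma_0 = 1 + 0.5**2 = 1.25
print(gamma_hat(1))  # near gamma_1 = 0.5
print(gamma_hat(2))  # near gamma_2 = 0: only lags 0 and 1 matter
```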
STRONG STATIONARITY
[Plot of a realization of a stationary series: the observations $Y_1, Y_2, \ldots, Y_n$ fluctuate around the constant mean $\mu$ with constant variance $\sigma^2$.]
$Cov(Y_2, Y_1) = \gamma_{2,1} = \gamma_1$
$Cov(Y_3, Y_2) = \gamma_{3,2} = \gamma_1$
$Cov(Y_n, Y_{n-1}) = \gamma_{n,n-1} = \gamma_1$
$Cov(Y_3, Y_1) = \gamma_{3,1} = \gamma_2$
$Cov(Y_1, Y_3) = \gamma_{1,3} = \gamma_2$
Affected only by the time lag, $k$.

STRONG STATIONARITY
v) $Corr(Y_t, Y_s) = Corr(Y_{t+k}, Y_{s+k}) \Rightarrow \rho_{t,s} = \rho_{t+k, s+k},\ \forall t, k$. Substituting $t - k$ for $t$ and $t$ for $s$ gives $\rho_{t-k, t} = \rho_{t, t+k} = \rho_k,\ \forall t, k$.
• It is usually impossible to verify a distribution, particularly a joint distribution function, from an observed time series. So, we use a weaker sense of stationarity.

WEAK STATIONARITY
• WEAK (COVARIANCE) STATIONARITY OR STATIONARITY IN THE WIDE SENSE: A time series is said to be covariance stationary if its first and second order moments are unaffected by a change of time origin.
• That is, we have constant mean and variance, with the covariance and correlation being functions of the time difference only.

WEAK STATIONARITY
$E(Y_t) = \mu,\ \forall t$
$Var(Y_t) = \sigma^2 < \infty,\ \forall t$
$Cov(Y_t, Y_{t-k}) = \gamma_k,\ \forall t$
$Corr(Y_t, Y_{t-k}) = \rho_k,\ \forall t$
From now on, when we say "stationary", we imply weak stationarity.

EXAMPLE
• Consider a time series $\{Y_t\}$ where $Y_t = e_t$ and $e_t \sim$ i.i.d.$(0, \sigma^2)$. Is the process stationary?

EXAMPLE
• MOVING AVERAGE: Suppose that $\{Y_t\}$ is constructed as $Y_t = \dfrac{e_t + e_{t-1}}{2}$ and $e_t \sim$ i.i.d.$(0, \sigma^2)$. Is the process $\{Y_t\}$ stationary?

EXAMPLE
• RANDOM WALK: $Y_t = e_1 + e_2 + \cdots + e_t$ where $e_t \sim$ i.i.d.$(0, \sigma^2)$. Is the process $\{Y_t\}$ stationary?

EXAMPLE
• Suppose that the time series has the form $Y_t = a + bt + e_t$ where $a$ and $b$ are constants and $\{e_t\}$ is a weakly stationary process with mean 0 and autocovariance function $\gamma_k$. Is $\{Y_t\}$ stationary?

EXAMPLE
• $Y_t = (-1)^t e_t$ where $e_t \sim$ i.i.d.$(0, \sigma^2)$. Is the process $\{Y_t\}$ stationary?

STRONG VERSUS WEAK STATIONARITY
• Strict stationarity means that the joint distribution depends only on the 'difference' $h$, not the times $(t_1, \ldots, t_k)$.
• Finite variance is not assumed in the definition of strong stationarity; therefore, strict stationarity does not necessarily imply weak stationarity. For example, a process of i.i.d. Cauchy r.v.s is strictly stationary but not weakly stationary.
• A nonlinear function of a strictly stationary process is still strictly stationary, but this is not true for weak stationarity. For example, the square of a covariance stationary process may not have finite variance.
• Weak stationarity usually does not imply strict stationarity, as higher moments of the process may depend on time $t$.

STRONG VERSUS WEAK STATIONARITY
• If the process $\{X_t\}$ is a Gaussian time series, which means that the distribution functions of $\{X_t\}$ are all multivariate normal, weak stationarity also implies strict stationarity. This is because a multivariate normal distribution is fully characterized by its first two moments.

STRONG VERSUS WEAK STATIONARITY
• For example, white noise is stationary but may not be strictly stationary, whereas Gaussian white noise is strictly stationary. Also, general white noise only implies uncorrelatedness, while Gaussian white noise also implies independence, because if a process is Gaussian, uncorrelatedness implies independence. Therefore, Gaussian white noise is just i.i.d. $N(0, \sigma^2)$.

STATIONARITY AND NONSTATIONARITY
• Stationary and nonstationary processes are very different in their properties, and they require different inference procedures. We will discuss this in detail throughout this course.
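As a concrete contrast between the two, the random walk example above has $Var(Y_t) = t\,\sigma_e^2$, which grows with $t$, so the process cannot be weakly stationary; its first difference $Y_t - Y_{t-1} = e_t$, however, is i.i.d. and hence stationary. A minimal Monte Carlo sketch in Python, assuming numpy; the number of replications, the series length and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed
reps, n = 10_000, 100
e = rng.standard_normal((reps, n))    # i.i.d. (0, 1) errors, sigma_e^2 = 1
y = np.cumsum(e, axis=1)              # Y_t = e_1 + ... + e_t (random walk)

# Variance across replications at t = 10, 50, 100: it grows roughly
# like t * sigma_e^2, so the random walk is not stationary.
print(y[:, [9, 49, 99]].var(axis=0))  # approx. 10, 50, 100

# The first difference Y_t - Y_{t-1} = e_t is i.i.d., hence stationary.
d = np.diff(y, axis=1)
print(d.var())                        # approx. sigma_e^2 = 1
```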