PCD – Préparation et caractérisation des données
Chapter 10 – Attributes for time series
Andrei Popescu-Belis

Plan
1. Definition of time series and main objectives of their study
2. Modeling time series
3. Seasonality
4. Outliers and missing data
5. Preprocessing using smoothing

1. Definition of time series and main objectives of their study

Definition
• A sequence of measures of a given phenomenon taken at regular time intervals
  • the interval between observations can be of any length
  • e.g., hourly, daily, weekly, monthly, quarterly, annually, or every n years
  • it is assumed that these time periods are equally spaced
• Examples
  • quarterly Italian GDP of the last 10 years
  • weekly supermarket sales of the previous year
  • yesterday's hourly temperature measurements

What is not a time series?
• Cross-sectional data = multiple objects observed at a particular time
  • e.g., customers' behavioral data at today's update, companies' account balances at the end of the last year
• Time series data = one single object (product, country, sensor, …) observed over multiple equally-spaced time periods
• Time series can have other attributes apart from time-based features
  • if those are also time-based → multivariate time series
  • if those are static → univariate time series with static features

Objectives of time series analysis in PCD
• Summary description
  • graphical or numerical
  • picture the behavior of the time series
• Interpretation of specific features of the series
  • e.g., seasonality, trend, relationship with other series
• Decomposition, smoothing
Source: https://xkcd.com/523/

Objectives of time series analysis in APN
• Forecasting: predict the values of the time series at t+1, t+2, …, t+n
  • causal models: regression analysis (AR)
  • smoothing: moving averages (MA), exponential smoothing
  • notions of time series analysis (up to ARIMA)
• Classification of time series
• There is no measure of reliability for descriptive analysis, but there are error measures for forecasting
"It is difficult to make predictions, especially about the future."
"Forecasting is the art of saying what will happen in the future and then explaining why it didn't."

2. Modeling time series

Decomposition of a time series
• Four main components
  • seasonal component (S) | trend (T) | cyclical component (C) | irregular (I)
• Additive model: Yt = St + Tt + Ct + It
  • multiplicative models exist too and are sometimes more appropriate
• Consequence on processing (see the R sketch below)
  1. remove the seasonal component using smoothing
  2. remove the trend using regression
  3. remove the cyclical component (if there is one) using percentage ratios
  4. obtain the residual part (irregular noise)
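As an illustration of the additive model, here is a minimal R sketch; the simulated series and its parameter values are illustrative assumptions, not course data. It builds a monthly series with a linear trend, a fixed seasonal pattern and white noise, then recovers the components with R's built-in decompose():

set.seed(42)
trend <- 0.5 * (1:120)                               # linear trend over 10 years
seasonal <- rep(10 * sin(2 * pi * (1:12) / 12), 10)  # fixed annual pattern
noise <- rnorm(120, sd = 2)                          # irregular component (white noise)
y_ts <- ts(trend + seasonal + noise, frequency = 12, start = c(2006, 1))
plot(decompose(y_ts, type = "additive"))             # estimate and plot S, T and I

decompose() estimates the trend with moving averages (see section 5); the returned seasonal, trend and random components can then be inspected or subtracted individually.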
1. Seasonal component
• Fluctuations that recur during specific time periods
  • periodical patterns repeating at a constant frequency
  • for instance, because the values are influenced by seasonal factors
  • e.g., the quarter of the year, the day of the week, etc.
• Seasonality is always of a fixed and known period
  • but it can be affected by "calendar" effects
  • e.g., monthly production could differ if there are 4 or 5 Sundays in a month
→ More details below

2. Trend
• Describes the long-term movement of the series
  • generally, a long-term increase or decrease
  • not necessarily a linear evolution: it could also be, e.g., exponential
  • reflects the underlying level of the series
  • e.g., influence of a growth in population, or of inflation
→ Other names
  • secular trend
  • trend-cycle component, if combined with the cycle

3. Cycle
• Long-term fluctuations that occur regularly in the series
  • oscillatory component: upward, then downward, and so on
  • smooth and quasi-periodic fluctuations around the trend
  • no set repetitions or seasonality; cycles may vary in length
→ Difficult to detect, because cycles may be confused with the trend component
→ Not all time series have cycles

4. Irregular (residual) component
• What remains after the previous components have been removed
  • random fluctuations that cannot be attributed to the systematic components
  • short-term fluctuations in the series, which are not predictable
• Assumed to be the sum of random factors (white noise) not relevant for describing the dynamics of the series
→ The study of the residual may help to detect outliers or anomalies

STL decomposition into 3 components in R
• Use the Loess algorithm to decompose a time series into seasonal, trend and irregular components
  • STL = Seasonal Decomposition of Time Series by Loess
• Example in R: load and plot the time series, then decompose it

library(lubridate)              # for ymd() and days()
load('time_mat.Rdata')          # cycling incidents in Montreal over 5 years (provides incident_rate)
startday <- ymd("2006-01-01", tz = 'EST')
all_days <- startday + c(0:(365*5)) * days(1)   # vector of all days from 2006 to 2010 included
plot(all_days, incident_rate, type = "l", xlab = "Date", ylab = "Number of incidents")
ir_ts <- ts(incident_rate, frequency = 365)     # create the time series object
plot(stl(ir_ts, s.window = 365))                # perform STL decomposition & plot the result

• In the STL plot, the gray bars have the same absolute size (here, ca. 10) to allow comparisons between the four curves

3. Seasonality

Forms of seasonal effects
1. Additive
  • components are independent from each other
  • e.g., an increase in the trend will not cause an increase in the magnitude of seasonal dips
  • the difference between raw data and trend is roughly constant at similar periods of time (same months, same quarters, etc.), irrespective of the evolution of the trend
2. Multiplicative
  • components are not independent from each other
  • the amplitude of the seasonality increases with an increasing trend (and vice versa)
→ If the variation in the seasonal pattern (around the trend-cycle) appears to be proportional to the level of the time series, then a multiplicative model may be more appropriate

Causes of seasonality
• Depend on the domain and the periodicity of the seasonality
1. Climate: variation of the weather across seasons
  • e.g., the seasons' influence on agricultural data, or on data about electricity use
2. Institutions, social habits, administrative rules
  • e.g., influence of Christmas (or other holidays), the fiscal year, the academic calendar, etc.
3. Indirect seasonality: due to seasonality affecting a cause of the parameter under study
  • e.g., the toy industry is affected by Christmas early

Calendar effects
• In a normal February, there are 4 days of each type: Monday, Tuesday, etc.
  • but all other months have an excess of some types of days!
• If the value of the series is higher on some days compared to others, then the series may have a trading day effect
  • e.g., production usually increases with the number of working days in the month, because many businesses close on Saturday and Sunday
• Calendar effects are more complex for holidays that are not always on the same day of a month (US: Labor Day, Thanksgiving) or not even in the same month (EU: Easter, Ascension, Whit Monday)

Role of calendar effects on seasonality
• If data points are months and data has annual seasonality, then all is well ☺
• But what about weekly observations? Most years have 52 weeks, but some years have 53 weeks ☹
  • the seasonality cycle (averaged over many years) has a period of 52.18 data points (365.25 days/year divided by 7 days/week)
• If the period of observations is smaller than a week, it gets more complicated
  • hourly data can have daily seasonality (F = 24), and/or weekly seasonality (F = 24 × 7 = 168), and/or annual seasonality (F = 24 × 365.25 = 8766)
Source: KNIME tutorial L4-TS, © 2021 KNIME AG

Seasonal and calendar adjustments
• A form of noise reduction or smoothing that seeks to eliminate seasonal and calendar effects from the series
  • for data collected at a higher frequency than annual, e.g., monthly or weekly
  • makes data comparable across countries, hemispheres, etc.
• Procedure (see the R sketch below)
  a. decompose the time series into its components, including the seasonal one (which is periodical and has zero mean)
  b. remove (by subtraction) the seasonal component from the series
  c. if present, also remove calendar effects
  • the result is a combination of trend and residual only (if there are no cycles)
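A minimal sketch of steps (a) and (b), reusing the STL example above; the column name follows the output of stl(), and the calendar-effect step (c) is omitted:

fit <- stl(ir_ts, s.window = 365)               # step (a): STL fit, as in the slide above
seasonal_part <- fit$time.series[, "seasonal"]  # the periodic, zero-mean component
ir_adjusted <- ir_ts - seasonal_part            # step (b): seasonally adjusted series
plot(ir_adjusted, ylab = "Seasonally adjusted incidents")

The forecast package offers seasadj() as a shortcut for the same subtraction.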
"Seasonal and calendar adjustments transform the world we live in into a world with no such effects: the temperature is exactly the same in winter as in summer, there are no holidays, Christmas is abolished, and people work every day of the week with the same intensity, with no break over the weekend."
"Introduction to Seasonal Adjustment", Dario Buono and Véronique Elter, EUROSTAT, 2015

4. Outliers and missing data

Outliers
• Data points that do not fit the tendency of the series, or which are outside an expected range
1. Additive outlier: affects the value of one observation, not the others
2. Temporary change: one observation is too high (or too low), and then the deviation reduces gradually until the time series returns to the initial level
3. Level shift: starting from a given moment, the level undergoes a permanent change (due to a change either in the phenomenon or in its measurement)
[Figure: monthly series, Jan. 1998 – Jan. 2006, illustrating an additive outlier, a temporary change, and a level shift. Source: D. Buono & V. Elter, "Introduction to Seasonal Adjustment", EUROSTAT, 2015]
• If they cannot be otherwise removed, then (1) and (2) are included in the irregular (residual) component, and (3) affects the long-term trend

Toolkits for outlier/anomaly detection
• Examples: TODS, Merlion, DeepAD, etc.
• AnomalyKiTS: see the article "Anomaly Detection Toolkit for Time Series" by Dhaval Patel et al. (IBM), Proceedings of AAAI, 2022
• PyOD: a popular outlier detection toolkit, but with no support for time series

Imputation of missing data
• Clearly not with the mean of the entire series
  • one cannot use values from the future
• With the rolling mean / moving average
  • fill in with the average over the previous n periods
• Interpolate between the previous and the next value(s)
  • typically linear interpolation, but not only
  • implies some "looking into the future", which may be acceptable
• See the R sketch below
Source: https://neptune.ai/blog/time-series-prediction-vs-machine-learning
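A minimal R sketch of the two usable strategies; the toy series, the helper function fill_rolling() and the use of the zoo package are illustrative assumptions:

library(zoo)                                 # provides na.approx() for interpolation
y <- c(5, 7, NA, 6, 8, NA, NA, 9)            # toy series with missing values

# 1) rolling mean over the previous n = 2 observed values (past-only, no look-ahead)
fill_rolling <- function(x, n = 2) {
  for (i in which(is.na(x))) {
    past <- tail(na.omit(x[1:(i - 1)]), n)   # last n known values before position i
    x[i] <- mean(past)
  }
  x
}
fill_rolling(y)

# 2) linear interpolation between the previous and next known values
#    (this "looks into the future", which may be acceptable)
na.approx(y, na.rm = FALSE)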
5. Preprocessing using smoothing

First idea of smoothing: moving average (MA)
• Smoothing techniques are used to reduce irregularities in time series data and to filter out (part of) the white noise
• Methods
  • centered MA: consider an odd number of observations 2n + 1, with the data point being averaged in the center
  • MA of order 2n+1 at time t: MA(2n+1)t = [Yt+n + … + Yt + … + Yt–n] / (2n+1)
  • weighted past average: MA(3)t = w1·Yt + w2·Yt–1 + w3·Yt–2, with weights e.g. 1/2, 1/3, 1/6
→ When smoothing with past values only, the last smoothed value MAt can be used to forecast Yt+1, which is better than just using the value Yt

Time series analysis: exponential smoothing
• Use an exponential smoothing constant w, between 0 and 1
• The exponentially smoothed series Et, based on the original time series Yt, combines the present with the past:
  E1 = Y1
  E2 = wY2 + (1 – w)E1
  …
  Et = wYt + (1 – w)Et–1
• Exponentially weighted moving average (EWMA)
  • removes rapid fluctuations in the time series = less sensitive to short-term changes
  • identifies the overall trend
• Note: w affects the smoothness of the result
Source: J.T. McClave, P.G. Benson, T.T. Sincich, Statistics for Business and Economics, Ch. 13, Pearson Education, 2011

Example
• Stock prices at the end of each month, from January 2005 to December 2006 (black line in the figure)
• A lower value of w (blue line, w = 0.2) puts less emphasis on the current value of the series at time t and gives a smoother curve, while a larger value of w (red line, w = 0.8) puts more emphasis on the current value and leads to a more variable curve, closer to the original data
[Figure: monthly stock prices, Jan. 2005 – Dec. 2006, with EWMA curves for w = 0.2 and w = 0.8. Source: Statistics for Business and Economics, Pearson Education, 2011, Chapter 13]

Formulation of the problem of forecasting
• The real problem
  • given a time series (Yt)1≤t≤n
  • for any given p > n, use (Yt)1≤t≤n to predict Yp → the prediction (= forecast) is noted Fp
• The approach for development and testing
  • for any given k < n, use (Yt)1≤t≤k to predict Yk+1
  • compare the forecast Fk+1 with the actual Yk+1 and obtain a performance score

Application of EWMA to forecasting
• Solution to the problem: take the forecast Ft for the previous period and modify it using the forecast error Yt – Ft with a factor w
  • Ft+1 = Ft + w(Yt – Ft)
  • also written as Ft+1 = wYt + (1 – w)Ft
  • which is exactly the equation of the exponentially weighted moving average (EWMA)
• How to choose the smoothing factor w? (see the R sketch below)
  • 1D grid search: minimize the sum of the squares of the forecast errors over the training data
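A minimal R sketch of EWMA forecasting with the 1D grid search for w; the series y is a placeholder for any numeric vector, and the function name is illustrative:

# one-step-ahead EWMA forecasts: F[t+1] = w*Y[t] + (1-w)*F[t], with F[1] = Y[1]
ewma_forecast <- function(y, w) {
  f <- numeric(length(y))
  f[1] <- y[1]
  for (t in 1:(length(y) - 1)) f[t + 1] <- w * y[t] + (1 - w) * f[t]
  f
}

# 1D grid search: pick the w minimizing the sum of squared forecast errors
grid <- seq(0.05, 0.95, by = 0.05)
sse <- sapply(grid, function(w) sum((y - ewma_forecast(y, w))^2))
best_w <- grid[which.min(sse)]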
Holt-Winters forecasting method
• A widely used method to model trend and seasonal variation
  • Holt (1957) extended exponential smoothing to allow forecasting with trends
  • Winters (1960) extended it to capture seasonality as well
  • see C. Chatfield, "The Holt-Winters Forecasting Procedure", Journal of the Royal Statistical Society, Series C, vol. 27, no. 3, 1978
→ outperformed by more recent & complex methods (e.g., ARIMA)
• Holt's exponential smoothing: two constants (w, v) and two equations
• Holt-Winters: extended to three equations (level, trend, seasonal variation)

Computing the two components of Holt
1. Exponentially-smoothed component Et, with constant 0 < w < 1
2. Trend component Tt, with its own smoothing constant 0 < v < 1
  • v close to 0 → more weight to the remote past trend
  • v close to 1 → more weight to the recent past trend
• Given the time series (Yt)1≤t≤n:
  E2 = Y2                            T2 = Y2 – Y1
  E3 = wY3 + (1 – w)(E2 + T2)        T3 = v(E3 – E2) + (1 – v)T2
  …                                  …
  Et = wYt + (1 – w)(Et–1 + Tt–1)    Tt = v(Et – Et–1) + (1 – v)Tt–1
• Forecasting k steps ahead (k = 1, 2, …): Ft+k = Et + kTt
• See the R sketch at the end of this chapter

Example of forecasting
• Using the same stock prices as above (January 2005 – December 2006), we compare 3 models for forecasting the price over the next 4 months
  • model I: exponential smoothing with w = 0.2
  • model II: exponential smoothing with w = 0.8
  • model III: Holt with w = 0.8 and v = 0.7

  Model | MAD  | MAPE | RMSE
  I     | 5.53 | 9.50 | 6.06
  II    | 5.28 | 9.11 | 5.70
  III   | 3.31 | 5.85 | 3.44

Source: Statistics for Business and Economics, Pearson Education, 2011, Chapter 13

Evaluation measures for forecasting
• e.g., MAD, MAPE, RMSE, as used in the table above (see the R sketch below)
Source: KNIME tutorial L4-TS, © 2021 KNIME AG

To be continued in APN – Apprentissage Non supervisé (autumn, S5)
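As an appendix-style illustration, here is a minimal R sketch of Holt's two recursions and of the three evaluation measures used above; the series y, the constants and the function names are placeholders, and this follows the slide's equations rather than a library implementation:

# Holt's two-component smoothing, following the recursions above (needs length(y) >= 3)
holt_forecast <- function(y, w, v, k = 1) {
  n <- length(y)
  level <- numeric(n); trend <- numeric(n)        # E and T in the slide's notation
  level[2] <- y[2]; trend[2] <- y[2] - y[1]
  for (t in 3:n) {
    level[t] <- w * y[t] + (1 - w) * (level[t - 1] + trend[t - 1])
    trend[t] <- v * (level[t] - level[t - 1]) + (1 - v) * trend[t - 1]
  }
  level[n] + (1:k) * trend[n]                     # forecasts F[n+1], ..., F[n+k]
}

# evaluation measures, comparing forecasts f with actual values a
mad_err  <- function(a, f) mean(abs(a - f))       # mean absolute deviation
mape_err <- function(a, f) 100 * mean(abs((a - f) / a))   # in percent
rmse_err <- function(a, f) sqrt(mean((a - f)^2))

R's built-in HoltWinters() function covers the full three-equation version with seasonality.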