Semi-automatic integration in Dutch Supply and Use Tables – with special reference to time-series
Marcel Pommée
National Accounts Department
Statistics Netherlands (CBS)
Centraal Bureau voor de Statistiek

Outline
• SUTs, characteristics
• Reasons for automation
• Automation with ‘machines’
• Balancing machine
• Quarterly machine
• Time-series machine
• Time-series project
• Summary

SUTs, characteristics
• Annually, quarterly (t+45 and t+30 ?)
• Detail: industries (120), commodities (630), expenditure categories (50)
• Focus on year-on-year changes, no seasonal adjustment
• Simultaneous balancing of current and constant prices
• Data gaps filled with assumptions and extrapolations
• Manual balancing

Reasons for automation
• Efficiency gains: budget reductions of up to 34% expected in 2018
• Part of the redesign of the chain of economic statistics:
    • More structured process
    • Top-down approach
    • Focus on major problems
• Quality
    • Visibility of the various adjustment steps (transparency)
    • Consistency over time
    • Reproducible results (consistency)
    • First GDP estimates quickly available (analysis)

Automation with ‘machines’
• Quadratic optimization model
    • Minimizing the adjustment needed to the growth rates of quarterly series
    • T-1 and unbalanced data in current and constant prices available
• Only semi-automatic integration
    • Major problems tackled manually
    • Small problems resolved through automation
• Machines for different purposes:
    • Balancing machine: balancing a single SUT
    • Quarterly machine: rebasing years and aligning quarters
    • Time-series machine: rebasing time-series

Balancing machine
• Balancing a single SUT: major problems solved manually
• Hard and soft constraints
    • Supply is equal to use by commodity
    • Preserve price indices by commodity
    • Preserve i/o-ratios by branches of industry
    • Compute trade and transport margins
    • Compute taxes and subsidies on products
    • Upper and lower bounds for individual variables
    • Fixation of variables
    • Weighting of variables based on quality of data source
    • Specific relations (imports and re-exports, building materials and construction)

Quarterly machine
• Compilation cycle: final (F), preliminary (P), very preliminary (V)
• Quarterly machine
    • Input: F-year and 12 quarters of previous cycle
    • Output: rebased P- and V-year and 12 aligned quarters
• Updating of P- and V-year
    • Selected information added
    • With balancing machine

Time-series machine
• Time-series (1990-2009) based on benchmark revision 2010
• Reconstruction of complete SU and IO tables
• Earlier
    • Year-by-year compilation => very time-consuming
    • Difficult to preserve price and volume indices of original series
• Time-series machine
    • New levels given by benchmark year and reference years
    • Reference years, e.g. 1987, 1995, 2001 (previous revision years)
    • Preservation of price and volume indices of original series
    • Iterative process, manual intervention
    • Result: fully consistent time-series in current and constant prices

Time-series project (1)
• Year 2010
    • ESA 2010 conceptual revision and benchmark (statistical) revision
    • Revised GDP 7.6% higher (concepts 3% and benchmark 4.6%)
• Covers
    • Fully consistent ANA, QNA, ASA, QSA, LA, SUTs, IO-tables, and regional accounts
• Planning
    • Benchmark year 2010: publication 6th March 2014
    • 2001-2009, up to 2013: publication 20th June 2014
    • 1995-2000: publication 24th September 2014
• Extremely tight schedule

Time-series project (2)
• Time-series 2001-2009: series of problems
    • Start-up problems with the time-series machine (coding errors, retrieving data, capacity limitations, processing time)
    • Specifying constraints takes a lot of time (fixation of variables, notably government data; weighting of variables, notably prices)
    • Many interdependencies with other NA-modules (LA, ASA, government data and financial institutions, FISIM)
    • Time-series machine had to make quite large adjustments to the original series due to substantial level shifts in the revision year

Time-series project (3)
• Time-series 2001-2009: consequences
    • Difficult to understand what the machine is doing (black box)
    • Planning deadlines were not met
    • Hardly any documentation
    • Machine results less than optimal: extensive manual interventions => publication is on a provisional basis
    • Highly motivated team, but tension and frustration due to adversities

Time-series project (4)
• Time-series 1995-2000: gaining experience
• Compilation process split into parts:
    • First, a basic run without some constraints (no commodity balancing) => fewer adjustments by the machine
    • Second, manual intervention to solve major problems (imbalances)
    • Final run to solve minor inconsistencies and restore all constraints
• Results were much better
    • Better understanding of the machine output
    • Little or almost no manual intervention needed

Time-series project (5)
• Time-series 1995-2000: some lessons learned
    • Getting the time-series machine to run smoothly takes a lot of time
    • Manual adjustments are mostly complex, as they often affect large parts of the time-series or all of it
    • Because of all the interdependencies, it is important to stick to the planning
    • Complexities warrant an experienced team of compilers

Summary
• Almost 3 years of experience with ‘machines’
• Pros
    • Efficiency gains: fewer FTEs
    • More robust statistical process
    • Improved quality
• Cons
    • Investment to develop and to build
    • Programming errors
    • More complex: takes time to understand what the machine is doing
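
Illustration (not from the slides): the balancing idea behind the ‘machines’, a quadratic optimization that adjusts the data as little as possible while making supply equal to use by commodity, can be sketched for a toy case. This is a minimal sketch under stated assumptions, not the CBS implementation: the function name `balance`, the figures, and the variances are hypothetical; the adjustment is applied to levels rather than growth rates; and only the hard supply = use constraint, reliability weighting, and fixation (variance 0) are shown. Price indices, i/o-ratios, bounds, and soft constraints are left out.

```python
import numpy as np


def balance(x0, A, variances):
    """Weighted least-squares balancing (hypothetical helper).

    Minimizes sum((x - x0)**2 / variances) subject to A @ x = 0.
    A variance of 0 fixes a variable at its original value.
    Closed form via Lagrange multipliers:
        x = x0 - V A' (A V A')^-1 A x0,  with V = diag(variances).
    """
    x0 = np.asarray(x0, dtype=float)
    V = np.diag(np.asarray(variances, dtype=float))
    imbalance = A @ x0                       # supply minus use per commodity
    multipliers = np.linalg.solve(A @ V @ A.T, imbalance)
    return x0 - V @ A.T @ multipliers


# Toy data (assumed): two commodities, each with the variables
# [output, imports, intermediate use, final use].
x0 = np.array([100.0, 20.0, 70.0, 45.0,     # commodity 1: supply 120, use 115
               200.0, 10.0, 120.0, 95.0])   # commodity 2: supply 210, use 215

# One constraint row per commodity: output + imports - uses = 0.
A = np.array([
    [1, 1, -1, -1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, -1, -1],
], dtype=float)

# Assumed variances: larger means a less reliable source that may move more.
# Imports get variance 0, i.e. they are fixed.
variances = np.array([4.0, 0.0, 1.0, 4.0,
                      4.0, 0.0, 1.0, 4.0])

x = balance(x0, A, variances)
print("balanced values:", np.round(x, 2))
print("remaining imbalance:", np.round(A @ x, 10))
```

In this sketch each imbalance is spread over the non-fixed variables in proportion to their variances, so the least reliable sources absorb most of the adjustment. Variables with large discrepancies would, in the semi-automatic approach described above, be corrected manually before such a run.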