UNECE Workshop on Short-Term Statistics (STS) and Seasonal Adjustment
14 – 17 March 2011, Astana, Kazakhstan

STS Compilation with Multiple Data Sources
Anu Peltola
Economic Statistics Section, UNECE
March 2011 UNECE Statistical Division

Overview
Data collection
• Sampling
• Administrative data
• Combining multiple data sources
Compilation of results
• Data editing
• Non-response and weighting
• Treatment of non-comparable changes
Publication
Improvement

Theoretical Concept – A Key to Good Quality
Define the purpose of an indicator
Links to the real world
• What should it describe?
• Who are the users/uses (internal/external)?
• Possible data sources
Links to other statistics
• Differences in concepts, scope, methods
• Goal variables – national accounts/SBS
• Regular benchmarking
• Follow-up of differences
[Figure: continuous improvement over time through the Plan, Do, Check, Act cycle, by Deming]

Quality Production Process
Bring the collected data to the level of the intended statistical output!
[Process diagram with the steps: Collection of data; Correction of systematic errors in data; Check for the most important observations; Index calculation; Publication]

Data Collection: Statistical Units
Corner stones of business statistics
• Legal unit -> enterprise (services) -> enterprise groups
• Establishment (for industry/construction)
Business registers are fundamentally important
• Bridge between administrative and statistical units
• Definition of the economic activity class (ISIC/NACE)
• Improve its comprehensiveness – use as a frame
• Examine opportunities to use administrative data
• Interactive: update with information from STS
Sources: UN: International Recommendations for the Index of Industrial Production & EC: STS Methodological manual

System of Statistics
[Figure] Source: Statistics Finland, Strategy for economic statistics

Data Collection: Questionnaire Design
Give clear instructions
• Explain the concepts to the respondents
Revisions to earlier months
• Aim to pre-fill the questionnaire with data given earlier
• Leave space for reporting revisions
Always test changes to questionnaires
Inform the respondents of the use of data
Develop useful feedback for respondents
• Your company compared to others in the same activity

Data Collection: Sampling in Practice
Many surveys are for units above a size threshold
• Based on business register and periodically reviewed
• Covering small units is burdensome and raises coverage problems
In drawing a sample, special attention to be paid to:
• Level of details to be published
• Resources available
• Accuracy and timeliness required
• Response burden
Simple/stratified sampling by activity and size

[Figure: Total population of units in the Business Register, stratified by economic activity and split into large, medium and small units]
Covered on a complete enumeration basis (large units)
Covered by sampling (medium units)
Covered mainly by administrative sources (small units)
> Business Register to be kept up-to-date with new units

Data Collection: Administrative Data Sources
Administrative registers or datasets can be used as:
• Single source in their own right
• Frame for sampling via the Business Register
• Complementary source
• Validation
• Data source for small enterprises
For STS, only limited administrative sources are available:
• VAT (value added tax)
• Social security data (employment and labor cost)
• Building permits, etc.

Data Collection: Pros and Cons of Admin Data?
+ Reduction of response burden
+ Reduction of costs, data collection and manual work
+ Total populations: detailed classifications/regional indicators
+ Better quality and coverage (of smallest units)
- Data content, units, concepts and definitions may differ
- Dependence on few large data suppliers
- Timeliness: may require use of estimation
- Access and confidentiality
- Non-observed economy unlikely to be included
- Requires good IT capacity by the supplier and the NSO

Data Collection: Administrative Data and Quality
National ID system for enterprises
New production methods:
• to correct for negative values and different concepts
• slow accumulation > estimation of missing data
The most important units to direct collection
• Active co-operation with large enterprises
Development of questionnaires:
• Simplification – part of information from registers
• Efficiency – electronic data collection

Data Collection: Legislative Issues
Compulsory to use existing data (if suitable) in statistics production
Guaranteed access to administrative sources
State government and social security institutions obliged to deliver their data to the NSO
• Free of charge or compensation of direct costs
• Co-operation in making changes in data collection
To ensure data
confidentiality
• Individual data collected for statistics should not be handed over to any use other than statistics or research!

Compilation: Central Role of VAT Data
[Figure] Source: Statistics Finland

Compilation: Linking Admin and Survey Data
[Diagram: linking the Business Register (BR), the sample survey and VAT data. Labels recovered from the diagram:]
Business Register, e.g. 290 000 units
• Unit IDs
• Activity code
• Location
• Mergers
• LKAU (regional)
Sample, e.g. 2000 units
• Turnover
• Mergers
VAT, e.g. 250 000 units (small & medium enterprises)
• Turnover
• Estimates for output and missing data
Flows: sample, basic info; optimal sampling; feedback to BR; updates to BR; activity of units; combining; revision; 1. release; 2. release

Compilation: Data Control and Editing
Studying data to identify errors
• Detect errors that have a significant influence
• Check whether values are within given ranges
• Check whether values for related variables are coherent
• Compare to past responses (previous months and a year ago)
Give top priority to outliers and errors that have the largest impact on the results
Outlier values require careful treatment
• May be correct but caused by unusual circumstances
Source: Methodology of Short-Term Business Statistics, EC

Compilation: Treating Non-Response
Controlling response burden
• Better planning of data collection process
• Offering various channels for respondents
Reducing the effect of non-response
• Alternative source, e.g.
administrative data
• Imputation based on historical data
• Mean value imputation, donor/nearest neighbour, regression of variables

Compilation: Comparing Unit Level Data
[Chart: monthly values of one unit (vertical axis 0 to 80000) for months 1 to 12, previous year against current year; one highlighted change of 115%]

Compilation: Impact on the Results
[Chart: index (vertical axis 60 to 180) for months 1 to 8, index without a unit against index with a unit; data labels: -4.12, 29.70, -2.53, -1.20, -2.40, -2.33, -1.13, -0.58]

Compilation: Non-Comparable Changes (NCCs)
Structural changes in the population:
• New units are set up and others stop existing
• Units may be taken over, merged or split up
• Units may expand, contract or change their activities
Reasons for large changes
1) Errors
2) Actual changes that are comparable
3) Actual changes that are non-comparable
UN Guide on the Impact of Globalization on National Accounts > helps with STS as well

Compilation: Example of NCCs
Previous year
• Unit A: Turnover = 100 million
• Unit B: Turnover = 75 million
• Exchange of goods between the units: 50 million
Current year
• Unit AB: Turnover = (100-50) + 75 = 125 million
Turnover drops from 175 to 125 million (about 29%) due to the merger!
No change in the level of activity!

Compilation: Alternative Treatments of NCCs
1. All changes are recorded as they are (actual)
− Contaminated with apparent, non-comparable changes
− Difficult to obtain a picture of economic reality
+ Simplicity
2. Panel method
• Only same units in both periods are included
− Start-ups and closures would be cancelled out
− Seriously biased results in highly dynamic populations
+ Simplicity

Compilation: Alternative Treatments of NCCs
3. Overlapping method
• Actual comparable changes are not adjusted
• Other changes are made comparable by a.
Collecting comparable information (largest units)
b. Replacing the non-comparable figure with an estimate
c. Taking the unit out of the calculation (no effect on results)
− Requires more work
+ Results reflect actual changes in economic activity

Compilation: Confrontation with Other Sources
Regular confrontation may reveal discrepancies
Aim at coherence: value = price × output
First at the aggregated level and, where necessary, at lower levels (largest units)
Knowledge of differences between statistics helps communication with users
Quality reviews of indicators to be undertaken

New Requirements for STS?
Globalization
• Internationally comparable data needed
• Treatment of more complex business activities
Increasing amount of services
• Detection of turning points
• Output and price measures, industrial services
Longer time series and seasonal adjustment
Coherence
• Compare to National Accounts and between price/volume/value indicators
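
Illustration: coherence between price, volume and value indicators
The coherence aim, value = price × output, can be checked mechanically when the value, price and volume indices share a base period. The following is a minimal sketch only: the function name, the index series (base = 100) and the tolerance of one index point are all hypothetical and do not come from the presentation.

```python
def coherence_gaps(value_idx, price_idx, volume_idx, tolerance=1.0):
    """List (period, gap) pairs where the value index differs from
    price * volume / 100 by more than `tolerance` index points.
    All three series are assumed to use the same base period (= 100)."""
    gaps = []
    series = zip(value_idx, price_idx, volume_idx)
    for period, (v, p, q) in enumerate(series, start=1):
        implied_value = p * q / 100.0   # value implied by price x volume
        gap = v - implied_value
        if abs(gap) > tolerance:
            gaps.append((period, round(gap, 2)))
    return gaps


# Hypothetical indices for three periods (assumed figures).
value = [100.0, 104.0, 110.0]
price = [100.0, 102.0, 103.0]
volume = [100.0, 102.0, 103.0]

flagged = coherence_gaps(value, price, volume)
print(flagged)  # periods whose discrepancy needs follow-up
```

In line with the slide on confrontation with other sources, such a check would typically be run first at the aggregated level and repeated at lower levels only where a gap is flagged.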
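
Illustration: the merger example as arithmetic
The merger example from the NCC slides can be reproduced as a short calculation. This sketch uses the figures from the example (Unit A = 100, Unit B = 75, exchange of goods = 50, all in millions). The function names are hypothetical, and making the previous year comparable by removing the internal exchange of goods is only one assumed way of applying the overlapping method (replacing a non-comparable figure by an estimate).

```python
def actual_change(prev_units, curr_total):
    """Percentage change when all changes are recorded as they are."""
    prev_total = sum(prev_units)
    return (curr_total - prev_total) / prev_total * 100.0


def comparable_change(prev_units, internal_deliveries, curr_total):
    """Percentage change after the previous year is made comparable
    by removing turnover that becomes internal after the merger."""
    prev_comparable = sum(prev_units) - internal_deliveries
    return (curr_total - prev_comparable) / prev_comparable * 100.0


prev = [100.0, 75.0]           # Units A and B, previous year
internal = 50.0                # exchange of goods between A and B
curr = (100.0 - 50.0) + 75.0   # Unit AB, current year = 125

raw = actual_change(prev, curr)
adjusted = comparable_change(prev, internal, curr)

print(f"recorded as-is:  {raw:.1f}%")
print(f"made comparable: {adjusted:.1f}%")
```

The as-is figure shows a fall in turnover although the level of activity is unchanged, which is the contamination that the overlapping method is meant to remove.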