Data Collection Administrative Data Sources

UNECE Workshop on Short-Term Statistics (STS)
and Seasonal Adjustment
14 – 17 March 2011, Astana, Kazakhstan
STS Compilation
with Multiple Data Sources
Anu Peltola
Economic Statistics Section, UNECE
Overview

• Data collection
  • Sampling
  • Administrative data
  • Combining multiple data sources
• Compilation of results
  • Data editing
  • Non-response and weighting
  • Treatment of non-comparable changes
• Publication
• Improvement
March 2011
UNECE Statistical Division
Theoretical Concept –
A Key to Good Quality

• Define the purpose of an indicator
• Links to the real world
  • What should it describe?
  • Who are the users/uses (internal/external)?
  • Possible data sources
• Links to other statistics
  • Differences in concepts, scope, methods
  • Goal variables – national accounts/SBS
  • Regular benchmarking
  • Follow-up of differences
[Figure: continuous improvement cycle by Deming (Plan, Do, Check, Act), with axes Time and Quality]

Production Process

Bring the collected data to the level of the intended statistical output!
[Figure: production process flow: Collection of data -> Correction of systematic errors in data -> Check for the most important observations -> Index calculation -> Publication]
Data Collection
Statistical Units

• Cornerstones of business statistics
  • Legal unit -> enterprise (services) -> enterprise groups
  • Establishment (for industry/construction)
• Business registers are fundamentally important
  • Bridge between administrative and statistical units
  • Definition of the economic activity class (ISIC/NACE)
  • Improve its comprehensiveness – use as a frame
  • Examine opportunities to use administrative data
  • Interactive: update with information from STS
UN: International Recommendations for the Index of Industrial Production & EC: STS Methodological Manual
System of Statistics
Source: Statistics Finland, Strategy for economic statistics
Data Collection
Questionnaire Design

• Give clear instructions
  • Explain the concepts to the respondents
• Revisions to earlier months
  • Aim to pre-fill the questionnaire with data given earlier
  • Leave space for reporting revisions
• Always test changes to questionnaires
• Inform the respondents of the use of data
• Develop useful feedback for respondents
  • Your company compared to others in the same activity
Data Collection
Sampling in Practice

• Many surveys are for units above a size threshold
  • Covering small units is burdensome and raises coverage problems
• Based on the business register and periodically reviewed
• In drawing a sample, special attention is to be paid to:
  • Level of detail to be published
  • Resources available
  • Accuracy and timeliness required
  • Response burden
• Simple/stratified sampling by activity and size
[Figure: total population of units in the Business Register, stratified by economic activity]
• Large units: covered on a complete enumeration basis
• Medium units: covered by sampling or administrative sources
• Small units: covered mainly by administrative sources
> Business Register to be kept up-to-date with new units
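The coverage scheme above can be sketched in code. This is only a minimal illustration: the size thresholds (250 and 50 employees), the 20% sampling fraction, the field names and the register records are all hypothetical and do not come from the workshop material.

```python
import random
from collections import defaultdict

# Hypothetical thresholds and sampling fraction, chosen only for illustration.
LARGE_MIN_EMPLOYEES = 250
MEDIUM_MIN_EMPLOYEES = 50
MEDIUM_SAMPLING_FRACTION = 0.2


def assign_coverage(register, seed=0):
    """Split register units into take-all, sampled and admin-covered groups.

    `register` is a list of dicts with keys 'id', 'activity', 'employees'.
    Medium units are sampled independently within each activity stratum.
    """
    rng = random.Random(seed)
    take_all, admin_only = [], []
    medium_by_activity = defaultdict(list)
    for unit in register:
        if unit["employees"] >= LARGE_MIN_EMPLOYEES:
            take_all.append(unit["id"])
        elif unit["employees"] >= MEDIUM_MIN_EMPLOYEES:
            medium_by_activity[unit["activity"]].append(unit["id"])
        else:
            admin_only.append(unit["id"])

    sampled = []
    for activity in sorted(medium_by_activity):
        ids = medium_by_activity[activity]
        # Draw at least one unit per stratum.
        n = max(1, round(len(ids) * MEDIUM_SAMPLING_FRACTION))
        sampled.extend(rng.sample(ids, n))
    return {"take_all": take_all, "sampled": sampled, "admin_only": admin_only}


# Made-up register: 2 large, 10 medium and 5 small units in two activities.
register = (
    [{"id": f"L{i}", "activity": "C", "employees": 500} for i in range(2)]
    + [{"id": f"M{i}", "activity": "C" if i < 5 else "F", "employees": 80}
       for i in range(10)]
    + [{"id": f"S{i}", "activity": "F", "employees": 5} for i in range(5)]
)
coverage = assign_coverage(register)
```

In practice the thresholds and sampling fractions would follow from the required level of detail, accuracy and response burden rather than from fixed constants.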
Data Collection
Administrative Data Sources

• Administrative registers or datasets can be used as:
  • Single source in their own right
  • Frame for sampling via the Business Register
  • Complementary source
  • Validation
  • Data source for small enterprises
• For STS, limited administrative sources are available:
  • VAT (value added tax)
  • Social security data (employment and labor cost)
  • Building permits, etc.
Data Collection
Pros and Cons of Admin Data?
+ Reduction of response burden
+ Reduction of costs, data collection and manual work
+ Total populations - detailed classifications/regional indicators
+ Better quality and coverage (of smallest units)
- Data content, units, concepts and definitions may differ
- Dependence on few large data suppliers
- Timeliness - may require use of estimation
- Access and confidentiality
- Non-observed economy unlikely to be included
- Requires good IT capacity by the supplier and the NSO
Data Collection
Administrative Data and Quality


• National ID-system for enterprises
• New production methods:
  • to correct for negative values and different concepts
  • slow accumulation > estimation of missing data
• The most important units to direct collection
  • Active co-operation with large enterprises
• Development of questionnaires:
  • Simplification – part of information from registers
  • Efficiency – electronic data collection
Data Collection
Legislative Issues



• Compulsory to use existing data (if suitable) in statistics production
• Guaranteed access to administrative sources
• State government and social security institutions obliged to deliver their data to the NSO
  • Free of charge or compensation of direct costs
  • Co-operation in making changes in data collection
• To ensure data confidentiality
  • Individual data collected for statistics should not be handed over to any use other than statistics or research!
Compilation
Central Role of VAT Data
Source: Statistics Finland
Compilation
Linking Admin and Survey Data
[Figure: linking administrative and survey data. Elements shown:]
• Business Register, e.g. 290 000 units: unit IDs, activity code, location, mergers, LKAU (regional)
• Sample, e.g. 2000 units: turnover, mergers
• VAT, e.g. 250 000 units (small & medium enterprises): turnover, estimates for output and missing data
• Flows: sample and basic info, optimal sampling, feedback to BR, updates to BR on activity of units, combining, 1st release, revision, 2nd release
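The linking step can be illustrated with a small sketch. It assumes that all three sources share a common unit ID and that survey data is preferred over VAT data when both exist. The preference rule, the unit IDs and the figures are assumptions made for illustration, not a description of any NSO's actual procedure.

```python
def combine_sources(register, survey, vat):
    """Link survey and VAT turnover to register units by unit ID.

    Survey data is preferred where available (an assumption of this sketch);
    VAT turnover is used for the remaining units; units found in neither
    source are listed separately so that they can be estimated.
    """
    combined, missing = {}, []
    for unit_id, activity in register.items():
        if unit_id in survey:
            combined[unit_id] = {
                "activity": activity,
                "turnover": survey[unit_id],
                "source": "survey",
            }
        elif unit_id in vat:
            combined[unit_id] = {
                "activity": activity,
                "turnover": vat[unit_id],
                "source": "vat",
            }
        else:
            missing.append(unit_id)
    return combined, missing


# Hypothetical register (unit ID -> activity code) and turnover data.
register = {"U001": "C10", "U002": "C10", "U003": "F41", "U004": "F41"}
survey = {"U001": 1200.0}
vat = {"U001": 1180.0, "U002": 90.0, "U003": 45.0}

combined, missing = combine_sources(register, survey, vat)
```

Units that end up in `missing` correspond to the "estimates for missing data" element of the figure, for example when VAT records accumulate slowly.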
Compilation
Data Control and Editing

• Studying data to identify errors
  • Detect errors that have a significant influence
  • Check whether values are within given ranges
  • Check whether values for related variables are coherent
  • Compare to past responses (previous months and a year ago)
• Give top priority to outliers and errors that have the largest impact on the results
• Outlier values require careful treatment
  • May be correct but caused by unusual circumstances
Source: Methodology of Short-Term Business Statistics, EC
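The checks listed above can be sketched as simple rules. The limits used here (a plausible range, a maximum ratio of 2 to earlier responses) and the wages/employees coherence rule are illustrative assumptions. Flagged values are candidates for review, not automatic errors, since an outlier may be correct.

```python
def edit_checks(current, previous_month, year_ago, low, high, max_ratio=2.0):
    """Return a list of flags for one unit's reported value.

    `low`/`high` bound the plausible range; `max_ratio` is the largest
    accepted ratio to an earlier response. All limits are illustrative.
    """
    flags = []
    if not low <= current <= high:
        flags.append("out_of_range")
    for label, past in (("previous_month", previous_month), ("year_ago", year_ago)):
        if past is not None and past > 0:
            ratio = current / past
            if ratio > max_ratio or ratio < 1.0 / max_ratio:
                flags.append(f"large_change_vs_{label}")
    return flags


def wages_coherent(wages, employees):
    """Related variables should agree: wages are only expected with employees."""
    return not (wages > 0 and employees == 0)


def prioritise(flagged_units, stratum_total):
    """Order flagged units by their share of the stratum total, largest first."""
    return sorted(
        flagged_units,
        key=lambda u: abs(u["turnover"]) / stratum_total,
        reverse=True,
    )


flags = edit_checks(current=64500, previous_month=31000, year_ago=30000,
                    low=0, high=100000)
ordered = prioritise(
    [{"id": "a", "turnover": 500}, {"id": "b", "turnover": 64500}],
    stratum_total=200000,
)
```

Sorting by share of the stratum total is one simple way to give top priority to the observations with the largest impact on the results.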
Compilation
Treating Non-Response

• Controlling response burden
  • Better planning of data collection process
  • Offering various channels for respondents
• Reducing the effect of non-response
  • Alternative source, e.g. administrative data
  • Imputation based on historical data
  • Mean value imputation, donor/nearest neighbour, regression of variables
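Two of the imputation options above can be sketched as follows: historical imputation (the unit's previous value moved forward by the growth of responding units) and mean value imputation as a fallback for units without history. The unit names and values are made up, and the sketch assumes at least one respondent with a previous value in the stratum.

```python
def impute_missing(current, previous):
    """Fill missing current-period values (None) for units in one stratum.

    Units with a previous value get it moved forward by the growth rate of
    the responding units; units with no history get the respondents' mean.
    Assumes at least one respondent also reported in the previous period.
    """
    respondents = [u for u, v in current.items() if v is not None]
    matched = [u for u in respondents if previous.get(u)]
    growth = sum(current[u] for u in matched) / sum(previous[u] for u in matched)
    mean_value = sum(current[u] for u in respondents) / len(respondents)

    imputed = {}
    for unit, value in current.items():
        if value is not None:
            imputed[unit] = value            # reported value is kept
        elif previous.get(unit):
            imputed[unit] = previous[unit] * growth  # historical imputation
        else:
            imputed[unit] = mean_value       # mean value imputation
    return imputed


previous = {"A": 100.0, "B": 200.0, "C": 50.0}
current = {"A": 110.0, "B": 220.0, "C": None, "D": None}
imputed = impute_missing(current, previous)
```

Here unit C is imputed from its own history and the respondents' growth of 10%, while unit D, which has no history, receives the respondents' mean.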
Compilation
Comparing Unit Level Data
[Figure: monthly values of one unit over months 1 to 12, previous year vs. current year, on a scale from 0 to 80 000; a change of 115% is highlighted]
Compilation
Impact on the Results
[Figure: index (scale 60 to 180) over months 1 to 8, index with a unit vs. index without a unit; data labels: -4.12, 29.70, -2.53, -1.20, -2.40, -2.33, -1.13, -0.58]
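The influence of a single unit on the aggregate can be checked by computing the change with and without it. The figures below are invented for illustration and are not the data behind the chart.

```python
def year_on_year_change(current, previous, exclude=()):
    """Percentage change of the aggregate, optionally leaving units out."""
    cur = sum(v for u, v in current.items() if u not in exclude)
    prev = sum(v for u, v in previous.items() if u not in exclude)
    return 100.0 * (cur / prev - 1.0)


# Hypothetical stratum: unit C more than doubles, the others barely move.
previous = {"A": 400.0, "B": 300.0, "C": 300.0}
current = {"A": 404.0, "B": 297.0, "C": 645.0}

with_unit = year_on_year_change(current, previous)
without_unit = year_on_year_change(current, previous, exclude={"C"})
impact = with_unit - without_unit  # percentage points attributable to unit C
```

A large `impact` signals that the unit's report deserves a check before the index is published.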
Compilation
Non-Comparable Changes (NCCs)
• Structural changes in the population:
  • New units are set up and others stop existing
  • Units may be taken over, merged or split up
  • Units may expand, contract or change their activities
• Reasons for large changes
  1) Errors
  2) Actual changes that are comparable
  3) Actual changes that are non-comparable
• UN Guide on the Impact of Globalization on National Accounts > helps with STS as well
Compilation
Example of NCCs
Previous year
• Unit A: turnover = 100 million
• Unit B: turnover = 75 million
• Exchange of goods between A and B: 50 million
Current year
• Unit AB: turnover = (100-50) + 75 = 125 million
Turnover drops from 175 to 125 million (about 29%) due to a merger!
No change in the level of activity!
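The arithmetic of this example can be written out step by step. The variable names are illustrative; the figures are those of the example.

```python
turnover_a = 100.0  # million, previous year
turnover_b = 75.0   # million, previous year
intra_flows = 50.0  # million, exchange of goods between A and B

# Recorded totals: both units separately vs. the merged unit.
previous_total = turnover_a + turnover_b
merged_turnover = (turnover_a - intra_flows) + turnover_b
apparent_change = 100.0 * (merged_turnover / previous_total - 1.0)

# Comparable basis: remove the internal deliveries from the previous year too.
comparable_previous = previous_total - intra_flows
comparable_change = 100.0 * (merged_turnover / comparable_previous - 1.0)
```

The recorded totals fall from 175 to 125 million, an apparent change of about -28.6%, while the change on a comparable basis is zero.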
Compilation
Alternative Treatments of NCCs
1. All changes are recorded as they are (actual)
  - Contaminated with apparent, non-comparable changes
  - Difficult to obtain a picture of economic reality
  + Simplicity
2. Panel method
  • Only same units in both periods are included
  - Start-ups and closures would be cancelled out
  - Seriously biased results in highly dynamic populations
  + Simplicity
Compilation
Alternative Treatments of NCCs
3. Overlapping method
  • Actual comparable changes are not adjusted
  • Other changes are made comparable by
    a. Collecting comparable information (largest units)
    b. Replacing the non-comparable figure by an estimate
    c. Taking the unit out of the calculation (no effect on results)
  - Requires more work
  + Results reflect actual changes in economic activity
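The three treatments can be compared on a small made-up population. In this sketch units A and B merge into AB, unit C continues and unit D is a start-up. The overlapping method is represented by option (a): a comparable previous-year figure for AB is assumed to have been collected. All names and values are hypothetical.

```python
def change(cur_total, prev_total):
    """Percentage change between two totals."""
    return 100.0 * (cur_total / prev_total - 1.0)


def actual_method(previous, current):
    """1. All changes recorded as they are."""
    return change(sum(current.values()), sum(previous.values()))


def panel_method(previous, current):
    """2. Only units present in both periods are included."""
    common = previous.keys() & current.keys()
    return change(sum(current[u] for u in common),
                  sum(previous[u] for u in common))


def overlapping_method(previous, current, restated):
    """3. Non-comparable previous-year figures are replaced by comparable ones.

    `restated` maps a new unit to a tuple of (old units it replaces,
    comparable previous-year figure).
    """
    adjusted = dict(previous)
    for new_unit, (old_units, comparable_value) in restated.items():
        for old in old_units:
            adjusted.pop(old, None)
        adjusted[new_unit] = comparable_value
    return change(sum(current.values()), sum(adjusted.values()))


previous = {"A": 100.0, "B": 75.0, "C": 40.0}
current = {"AB": 125.0, "C": 42.0, "D": 10.0}
restated = {"AB": (["A", "B"], 125.0)}

actual = actual_method(previous, current)
panel = panel_method(previous, current)
overlapping = overlapping_method(previous, current, restated)
```

With these figures the actual method shows a fall of about 17.7% caused mostly by the merger, the panel method shows only unit C's growth of 5%, and the overlapping method gives about 7.3% because it keeps the start-up while neutralising the merger.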
Compilation
Confrontation with Other Sources





• Regular confrontation may reveal discrepancies
• Aim at coherence: value = price x output
• First at the aggregated level and where necessary at lower levels (largest units)
• Knowledge of differences between statistics helps communication with users
• Quality reviews of indicators to be undertaken
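The coherence rule value = price x output can be turned into a simple check between indices. The sketch assumes all three indices share the same base (100) and reference period; the index values are invented.

```python
def coherence_gap(value_index, price_index, volume_index):
    """Difference between the value index and price x volume, in index points.

    All indices are assumed to have base 100 and the same reference period.
    """
    implied_value = price_index * volume_index / 100.0
    return value_index - implied_value


gap = coherence_gap(value_index=108.0, price_index=103.0, volume_index=104.0)
```

A gap that is persistently far from zero at the aggregated level would be a reason to look at lower levels, starting with the largest units.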
New Requirements for STS?

• Globalization
  • Internationally comparable data needed
  • Treatment of more complex business activities
• Increasing amount of services
  • Output and price measures, industrial services
• Detection of turning points
  • Longer time series and seasonal adjustment
• Coherence
  • Compare to National Accounts and between price/volume/value indicators