The Use of Administrative Sources for Economic Statistics: An Overview
Steven Vale
Office for National Statistics
UK
Contents
• Definitions
• Advantages of using administrative
data
• Common problems
• Quality of administrative data
• Using administrative data in practice
• Conclusions
Narrow Definition
[Diagram: classification of data sources]
• Primary (statistical)
• Secondary (non-statistical)
– Public sector
– Private sector
Wider Definition
[Diagram: classification of data sources]
• Primary (statistical)
• Secondary (non-statistical)
– Public sector
– Private sector
Administrative sources
are sources containing
information which is not
primarily collected for
statistical purposes.
Reasons for this Definition
• Privatisation of some government
functions
• Growth of private sector “value-added
re-sellers”
• User interest in new types of data
Benefits of Administrative Data
• Cost
– Surveys / censuses are expensive,
administrative data are often “free”
• Response burden
– Reduced burden on data suppliers
– Statistics can be compiled more
frequently with no extra burden
Benefits of Administrative Data
• Coverage
– Full coverage of target population
– No survey errors and lower non-response
– Better small-area data
• Timeliness (sometimes!)
• Public image
– Making use of existing data can enhance
the prestige of a statistical organisation by
making it seem more efficient
Population Census Costs
2000-2001
• UK, €367m, €6.2 per person
• Austria, €56m, €6.9 per person
• Finland, €0.8m, €0.2 per person
Source: Eurostat, Documentation of the 2000
Round of Population and Housing Censuses in the
EU, EFTA and Candidate Countries, Table 22
Common Problems
• Administrative units do not always
coincide with statistical units
• Conversion via automatic rules for
simple cases
• Profiling for more complex cases
– Gives a better understanding of
complex business structures
– Expensive and needs trained staff
Common Problems
• Different definitions and classifications
– Administrative and statistical priorities are
often different
– Conversion matrices needed for different
classifications
• Timeliness
– Data arrive too late
– Data relate to a different time period
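A conversion matrix between classifications can be sketched as a table of shares that redistributes administrative totals across statistical codes. The codes, shares and totals below are hypothetical and only illustrate the mechanics; a real matrix would come from the classification correspondence tables for the source in question.

```python
# Hypothetical conversion matrix: for each administrative activity code,
# the share of its value assigned to each statistical code (shares sum to 1).
conversion = {
    "A1": {"S10": 1.0},               # one-to-one mapping
    "A2": {"S10": 0.3, "S20": 0.7},   # one-to-many mapping, split by share
}

# Hypothetical turnover totals by administrative code.
admin_totals = {"A1": 100.0, "A2": 200.0}


def convert(totals, matrix):
    """Redistribute totals from administrative codes to statistical codes."""
    out = {}
    for admin_code, value in totals.items():
        for stat_code, share in matrix[admin_code].items():
            out[stat_code] = out.get(stat_code, 0.0) + value * share
    return out


stat_totals = convert(admin_totals, conversion)
print(stat_totals)
```

Because every row of shares sums to 1, the grand total is preserved by the conversion; only its distribution across codes changes.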
VAT Birth Lags
[Chart: distribution of VAT birth lags; axes: lag in days (0 to 1000) and frequency in thousands (0 to 200)]
VAT Birth Lags
• 2/3 of businesses are on the register
within 2 months of start-up
• Mean lag = 4 months due to “outliers”
• Median = Approx. 40 days
• Some businesses pre-register, giving negative lags
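The gap between the mean and the median lag is what a long right tail produces. A minimal sketch with made-up lags (not the VAT data) shows a few very late registrations pulling the mean far above the median:

```python
from statistics import mean, median

# Hypothetical registration lags in days; a negative value means the
# business registered before it started trading.
lags_days = [-10, 5, 20, 30, 35, 40, 45, 55, 60, 400, 900]

mean_lag = mean(lags_days)      # pulled upwards by the two late registrations
median_lag = median(lags_days)  # middle value, insensitive to the tail

# Share of businesses registered within roughly two months of start-up.
share_within_60 = sum(lag <= 60 for lag in lags_days) / len(lags_days)

print(f"mean = {mean_lag:.1f} days, median = {median_lag} days")
print(f"share within 60 days = {share_within_60:.2f}")
```

For a skewed distribution like this, the median is usually the more informative summary of the typical lag.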
Common Problems
• Change management
– Risk of changes in government policy,
thresholds, definitions, coverage etc.
– Need contingency plans
• Data from multiple sources
– Matching / linking issues
– Data conflicts – priority rules
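A priority rule for conflicting values can be as simple as an ordered list of sources, taking the first one that reports a value. The ordering below is an assumption chosen for illustration, not the rule used by any particular register.

```python
# Hypothetical priority order: the first source in the list that has a
# value wins. The ordering is illustrative only.
PRIORITY = ["survey", "vat", "paye"]


def resolve(values_by_source, priority=PRIORITY):
    """Return (source, value) for the highest-priority source with a value."""
    for source in priority:
        value = values_by_source.get(source)
        if value is not None:
            return source, value
    return None, None  # no source reported this variable


# Two sources disagree on employment for the same unit; VAT outranks PAYE here.
print(resolve({"vat": 120, "paye": 95}))
```

In practice the priority order often differs by variable, since a source that is reliable for turnover may be weak for employment.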
Quality of Administrative Data
• There are many aspects to quality
• Administrative data will be better than
survey data in some aspects but not
in others
• It is important to look at overall quality
• Do the data meet the needs of users?
Three Aspects of Quality
• Quality of incoming data
• Quality of processing
(matching, merging, ...)
• Quality of outputs - likely to be
different to survey based outputs,
but are they better?
Quality Measurement
• How to measure the quality of data
from administrative sources?
– Comparing sources
– Quality check surveys
– Knowledge of source (metadata)
– Quality reports / templates
Quality Templates
Companies House Data
• Framework: Contract
• Frequency: Quarterly updates, continuous
on-line access
• Timeliness: Good
• Quality: Good
• Delivery: CD-ROM / Internet
• Key content: Legal name, company number
Using Administrative Data
• Conversion to statistical concepts and
definitions
• Linking / Matching
– Exact Matching - linking records from
two or more sources, often using
common identifiers
– Probabilistic Matching - determining the
probability that records from different
sources should match, using a
combination of variables
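The two approaches can be sketched side by side. Exact matching compares a common identifier; the probabilistic version below combines agreement on several variables into a match probability using likelihood ratios, assuming the variables are independent. The records, field names, prior, and m- and u-probabilities are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical m- and u-probabilities per comparison variable:
# m = P(fields agree | same unit), u = P(fields agree | different units).
M_U = {"name": (0.95, 0.01), "postcode": (0.90, 0.05)}


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match(a, b, key="company_number"):
    """Exact matching: both records carry the same non-missing identifier."""
    return a.get(key) is not None and a.get(key) == b.get(key)


def match_probability(a, b, prior=0.01):
    """Probabilistic matching: posterior odds = prior odds x likelihood ratios."""
    log_odds = math.log(prior / (1 - prior))
    for field, (m, u) in M_U.items():
        agrees = normalise(a[field]) == normalise(b[field])
        ratio = m / u if agrees else (1 - m) / (1 - u)
        log_odds += math.log(ratio)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical records from two sources; the second has no identifier.
rec_a = {"company_number": "01234567", "name": "Acme Widgets Ltd", "postcode": "NP10 8XG"}
rec_b = {"company_number": None, "name": "ACME  WIDGETS LTD", "postcode": "np10 8xg"}
rec_c = {"company_number": None, "name": "Other Trading Co", "postcode": "SW1A 1AA"}

print(exact_match(rec_a, rec_b))
print(round(match_probability(rec_a, rec_b), 3))
print(match_probability(rec_a, rec_c))
```

A common design is to try exact matching first and fall back to the probabilistic score only for records without a shared identifier, sending borderline scores for clerical review.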
UK Business Register
[Diagram: the Business Register and the sources linked to it]
• VAT
• PAYE
• Company registrations
• Dun and Bradstreet
• Survey inputs
• Geographic information systems
• Satellite registers
Satellite Registers
Examples of Satellite Registers
• Tourism - hotel register (category,
number of beds)
• Transport - vehicle or ship register
(type, capacity)
• Distributive trades - buildings register
(building size, sales area)
Conclusions
• Administrative sources should be
defined in the widest sense
• There are many benefits in using
administrative data, particularly
reduced costs
• There are problems when using
administrative data, but usually
someone has found a solution
Conclusions
• Most problems can be reduced by
effective planning and detailed
knowledge of the source
• The benefits are often greater than
the costs
Thank you for listening.
Any Questions?
steve.vale@ons.gov.uk