Data Quality – UK activities
Iain Macleay
Head of Energy Balances, Prices and Publications
27 September 2013
Contents
1. Aspects of quality
2. Standard errors
3. Revisions
4. Risk based quality reviews
5. Quality training
Aspects of quality
DECC follow UK statistical practice:
- Relevance
- Accuracy
- Timeliness & punctuality
- Accessibility & clarity
- Comparability
- Coherence
Timeliness & punctuality
DECC release data to a timetable pre-announced a year in
advance – all releases at 9:30am.
Energy Trends:
- Thursday 26 September
- Thursday 19 December
- Thursday 27 March
- Thursday 26 June
Dates set for coming year, by the DECC Chief Statistician
– no political interference
If data not released at 9:30 – DECC need to report the breach
to the UK National Statistician's Office
Data released as soon as available
Accuracy
Difficult to measure, but …
- Sample sizes of surveys published with
information on coverage
- Where useful, standard errors published
- Weighting to adjust for coverage
- Administrative sources used where
appropriate
- Check accuracy of recording by comparing
data sources (volume surveys, price surveys,
company reports)
Sample sizes and standard errors for Quarterly Fuels
Inquiry
- published in industrial price methodology note
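As an illustration of the two accuracy bullets on standard errors and weighting, here is a minimal sketch in Python. It assumes simple random sampling with no finite population correction, and the prices and weights are hypothetical values invented for the example. The method DECC actually uses is set out in the methodology note and is not reproduced here.

```python
import math


def mean_and_standard_error(values):
    """Return the sample mean and its standard error (simple random sample)."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance with the usual n - 1 denominator.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


def weighted_mean(values, weights):
    """Return the mean of values, weighted to adjust for coverage."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical prices reported by five survey respondents.
prices = [2.0, 2.5, 3.0, 3.5, 4.0]
# Hypothetical coverage weights (e.g. share of the market each represents).
weights = [10, 10, 20, 30, 30]

mean, standard_error = mean_and_standard_error(prices)
coverage_weighted_mean = weighted_mean(prices, weights)

print(f"unweighted mean = {mean:.2f}, standard error = {standard_error:.4f}")
print(f"coverage-weighted mean = {coverage_weighted_mean:.2f}")
```

The weighted mean differs from the unweighted one whenever respondents with higher or lower prices represent a larger share of the population, which is the case weighting is meant to correct.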
Relevance
- As working in policy departments –
regular liaison so data meet needs.
- Also try to anticipate their future needs.
- Regular survey of wider user
community, every 2 to 3 years, to check
we are meeting their needs – results
published on web
- Review content of press notices and
channels of communication (tweets
etc)
Accessibility & clarity
- Data presented in consistent format
- Helpful commentary drawing users to key
points of interest (even if politically difficult),
written independently by the statistics team
- Clear info on contact details of DECC
statistical teams
- All info available for free on web
- Metadata published – detailed method notes
on web
- Some info on revisions published
Revisions – final consumption annual
growth after one quarter
Standard t-test for the revisions
- Number of observations (n): 32
- Mean of the revisions (m): -0.1074
- Variance of the revisions (s2): 0.4026
- t-statistic: -0.9571
- t-critical (±): 2.0395
- Test significant at 5% significance level? No
Test for significance of mean revisions
- Test used: Standard
- Test significant? No
Adjusted t-stat for the revisions
- First order autocorrelation (a) of revisions: -0.0654
- Adjusted variance of the revisions: 0.3531
- Number of independent observations (n*): 32
- t-adjusted: -1.0219
- t-adjusted critical (±): 2.0369
- Test significant at 5% significance level? No
Mean revisions = -0.1074
Absolute mean revisions = 0.4842
Is test significant? No
[Chart: Revisions in the year-on-year percentage growth rate estimates in final consumption, after 1 quarter. Vertical axis: percentage points, -2.0 to 1.5. Horizontal axis: Q1-2005 to Q1-2013.]
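The figures on the revisions slide can be reproduced from the summary statistics alone. The sketch below assumes the t-statistic is m / sqrt(s2 / n) and that the adjusted variance is s2 × (1 + a) / (1 − a). Both formulas are inferred from the published numbers, not stated in the slides. The critical values are copied from the slide, not computed. Small differences in the fourth decimal place come from the rounding of the inputs.

```python
import math


def t_statistic(mean, variance, n):
    """t-statistic for testing whether the mean revision differs from zero."""
    return mean / math.sqrt(variance / n)


def adjusted_variance(variance, autocorr):
    """Variance adjusted for first order autocorrelation (assumed formula)."""
    return variance * (1 + autocorr) / (1 - autocorr)


# Summary statistics as published on the slide.
n = 32          # number of observations
m = -0.1074     # mean of the revisions
s2 = 0.4026     # variance of the revisions
a = -0.0654     # first order autocorrelation of the revisions

# Critical values at the 5% level, copied from the slide.
T_CRITICAL_STANDARD = 2.0395
T_CRITICAL_ADJUSTED = 2.0369

t_standard = t_statistic(m, s2, n)
s2_adjusted = adjusted_variance(s2, a)
t_adjusted = t_statistic(m, s2_adjusted, n)

significant_standard = abs(t_standard) > T_CRITICAL_STANDARD
significant_adjusted = abs(t_adjusted) > T_CRITICAL_ADJUSTED

print(f"t-statistic = {t_standard:.4f}, significant: {significant_standard}")
print(f"adjusted variance = {s2_adjusted:.4f}")
print(f"t-adjusted = {t_adjusted:.4f}, significant: {significant_adjusted}")
```

Neither statistic exceeds its critical value in absolute terms, which matches the "No" recorded for both tests: the mean revision is not significantly different from zero at the 5% level.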
Coherence & comparability
- Monthly data consistent with quarterly
and annual data – revised in line with
better, more complete information
- Standard geographies used where
possible
- Energy balance format used so supply
and demand consistent
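One way to picture the coherence checks above is as a simple automated comparison. The sketch below is illustrative only: the function names, the tolerance and all figures are hypothetical and do not describe DECC's systems.

```python
def quarterly_totals(monthly):
    """Sum consecutive groups of three monthly values into quarterly totals."""
    return [sum(monthly[i:i + 3]) for i in range(0, len(monthly), 3)]


def is_coherent(monthly, quarterly, tolerance=1e-6):
    """Check that monthly data add up to the published quarterly data."""
    derived = quarterly_totals(monthly)
    return len(derived) == len(quarterly) and all(
        abs(d - q) <= tolerance for d, q in zip(derived, quarterly)
    )


def statistical_difference(supply, demand):
    """Gap between total supply and total demand in a balance."""
    return supply - demand


# Hypothetical monthly figures for two quarters.
monthly = [10.0, 12.0, 11.0, 9.0, 8.0, 7.0]

consistent = is_coherent(monthly, [33.0, 24.0])    # quarterly matches monthly
inconsistent = is_coherent(monthly, [33.0, 25.0])  # second quarter is out by 1
gap = statistical_difference(supply=57.0, demand=57.0)

print(consistent, inconsistent, gap)
```

A check of this kind flags where a revision to one frequency has not been carried through to the others.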
Quality reviews
- Data collections and publications
should be reviewed on a regular basis
- Tricky in practice – time consuming
activity
- Risk based approach being trialled
- Methodically go through checklist
- Most activities fairly low risk
Risk based review template
- Sources: Census; Admin; Survey
- Methods: Data acquisition/questionnaire design; Coverage of data; Processing, edit & imputation; Analysis; Disclosure
- Systems: System a; System b
- Processes: Data collection & preparation process; Results & analysis processes
- Quality: Relevance; Accuracy; Timeliness & punctuality; Accessibility & clarity; Comparability; Coherence
- Users & reputation: User feedback; Future user needs; Reputation
- People: People
Domestic fuel prices inquiry
Sources / Methods / Systems / Processes / Quality / Users & reputation / People
- Survey – 98% sample coverage
- Complex detailed survey, many issues including change of tariff structure etc.
- System redesigned in 2012
- Data validation & editing
- Produces bills based on standard consumption rather than actuals
- Good feedback received
- New person each year as data processed by sandwich student
- Good geographical coverage
- Spreadsheet back-up available
- Main system newly developed, but back-up used as double check
- Release 12 weeks after end of quarter
- Key policy area – so new data needs emerging
- Large survey – company 100 tariffs in 14 regions - so much scope for problems.
- Analysis
- Information published is disclosive, but pre-agreed with former monopoly suppliers
Actions
1. Meet companies to improve form filling
2. Engage pro-actively with policy to find future needs
3. Ensure good documentation
4. Have sufficient staff trained to use system
5. Check data with that from similar surveys
6. Check data against firms' published annual reports
Quality training
- How do we ensure good quality
statistics?
- Well trained staff
- Training sessions held focusing on
quality
- All staff to attend – take through stages
of statistical value chain
- In DECC two statisticians trained up to
train others