Data Quality – UK activities
Iain Macleay
Head of Energy Balances, Prices and Publications
27 September 2013

Contents
1. Aspects of quality
2. Standard errors
3. Revisions
4. Risk based quality reviews
5. Quality training

Aspects of quality
DECC follow UK statistical practice:
- Relevance
- Accuracy
- Timeliness & punctuality
- Accessibility & clarity
- Comparability
- Coherence

Timeliness & punctuality
- DECC release data to a pre-announced timetable, set a year in advance – all releases at 9:30am.
- Energy Trends: Thursday 26 September; Thursday 19 December; Thursday 27 March; Thursday 26 June
- Dates set for the coming year by the DECC Chief Statistician – no political interference
- If data are not released at 9:30, DECC need to report the breach to the UK National Statistician's Office
- Data released as soon as available

Accuracy
Difficult to measure, but …
- Sample sizes of surveys published with information on coverage
- Where useful, standard errors published
- Weighting to adjust for coverage
- Administrative sources used where appropriate
- Check accuracy of recording by comparing data sources (volume surveys, price surveys, company reports)

Sample sizes and standard errors for Quarterly Fuels Inquiry – published in industrial price methodology note

Relevance
- As working in policy departments – regular liaison so data meet needs.
- Also try to anticipate their future needs.
- Regular survey of wider user community, every 2 to 3 years, to check we are meeting their needs – results published on web
- Review content of press notices and channels of communication (tweets etc)

Accessibility & clarity
- Data presented in consistent format
- Helpful commentary drawing users to key points of interest (even if politically difficult), written independently by the statistics team
- Clear info on contact details of DECC statistical teams
- All info available for free on web
- Metadata published – detailed method notes on web
- Some info on revisions published

Revisions – final consumption annual growth after one quarter

Standard t-test for the revisions
Number of observations (n): 32
Mean of the revisions (m): -0.1074
Variance of the revisions (s2): 0.4026
t-statistic: -0.9571
t-critical (±): 2.0395
Test significant at 5% significance level? No

Test for significance of mean revisions
Test used: Standard
Test significant? No

Adjusted t-stat for the revisions
First order autocorrelation (a) of revisions: -0.0654
Adjusted variance of the revisions: 0.3531
Number of independent observations (n*): 32
t-adjusted: -1.0219
t-adjusted critical (±): 2.0369
Test significant at 5% significance level? No

[Chart: Revisions in the year-on-year percentage growth rate estimates of final consumption, after 1 quarter. Y-axis: percentage points (-2.0 to 1.5). X-axis: Q1-2005 to Q1-2013.]
Mean revisions = 0.1074
Absolute mean revisions = 0.4842
Is test significant? No

Coherence & comparability
- Monthly data consistent with quarterly and annual data – revised in line with better, more complete information
- Standard geographies used where possible
- Energy balance format used so supply and demand consistent

Quality reviews
- Data collections and publications should be reviewed on a regular basis
- Tricky in practice – time consuming activity
- Risk based approach being trialled
- Methodically go through checklist
- Most activities fairly low risk

Risk based review template
- Sources: Census; Admin; Survey
- Methods: Data acquisition/questionnaire design; Coverage of data; Processing, edit & imputation; Analysis; Disclosure
- Systems: System a; System b
- Processes: Data collection & preparation process; Results & analysis processes
- Quality: Relevance; Accuracy; Timeliness & punctuality; Accessibility & clarity; Comparability; Coherence
- Users & reputation: User feedback; Future user needs; Reputation
- People: People

Domestic fuel prices inquiry
- Sources: Survey – 98% sample coverage
- Methods: Complex detailed survey, many issues including change of tariff structure etc.
- Systems: System redesigned in 2012
- Processes: Data validation & editing
- Quality: Produces bills based on standard consumption rather than actuals
- Users & reputation: Good feedback received
- People: New person each year as data processed by sandwich student

Other points noted:
- Good geographical coverage
- Spreadsheet back-up available
- Main system newly developed, but back-up used as double check
- Release 12 weeks after end of quarter
- Key policy area – so new data needs emerging
- Large survey – company 100 tariffs in 14 regions – so much scope for problems.
- Information published is disclosive, but pre-agreed with former monopoly suppliers

Actions
1. Meet companies to improve form filling
2. Engage pro-actively with policy to find future needs
3. Ensure good documentation
4. Have sufficient staff trained to use system
5. Check data with that from similar surveys
6. Check data against firms' published annual reports

Quality training
- How do we ensure good quality statistics?
- Well trained staff
- Training sessions held focusing on quality
- All staff to attend – take through stages of statistical value chain
- In DECC two statisticians trained up to train others
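
As a rough check on the t-tests shown on the Revisions slide, the sketch below recomputes both statistics from the published summary figures. The slides do not state the formulas, so two things here are assumptions: the standard statistic is taken as m / sqrt(s2 / n), and the adjusted variance is taken as s2 × (1 + a) / (1 − a). The function and variable names are illustrative only. The critical values are copied from the slide, not computed.

```python
import math


def t_statistic(mean, variance, n):
    """One-sample t statistic for H0: mean revision = 0 (assumed form)."""
    return mean / math.sqrt(variance / n)


def adjusted_variance(variance, autocorr):
    """ASSUMED adjustment for first-order autocorrelation:
    scale the variance by (1 + a) / (1 - a)."""
    return variance * (1 + autocorr) / (1 - autocorr)


# Summary figures as published on the Revisions slide
n = 32               # number of observations
mean_rev = -0.1074   # mean of the revisions (m)
var_rev = 0.4026     # variance of the revisions (s2)
autocorr = -0.0654   # first order autocorrelation (a)

# Critical values copied from the slide (5% level, two-sided)
T_CRIT_STD = 2.0395
T_CRIT_ADJ = 2.0369

t_std = t_statistic(mean_rev, var_rev, n)
var_adj = adjusted_variance(var_rev, autocorr)
t_adj = t_statistic(mean_rev, var_adj, n)

# A test is significant when |t| exceeds the critical value
significant_std = abs(t_std) > T_CRIT_STD
significant_adj = abs(t_adj) > T_CRIT_ADJ

print(f"standard t = {t_std:.4f}, significant: {significant_std}")
print(f"adjusted variance = {var_adj:.4f}")
print(f"adjusted t = {t_adj:.4f}, significant: {significant_adj}")
```

Under these assumptions the recomputed figures agree with the slide to within about 0.001, and neither test is significant at the 5% level. The small differences are consistent with the published inputs being rounded to four decimal places.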