Professor David Rhind CBE
Chair of APPSI
The National Archives, Kew, Richmond, Surrey, TW9 4DU
Email: secretariat@appsi.gsi.gov.uk

The Rt Hon Francis Maude MP
Minister for the Cabinet Office
Cabinet Office
70 Whitehall
London SW1A 2AS

21 October 2013

Dear Minister,

You recently appointed me to the Public Sector Transparency Board (PSTB) with a particular remit to feed in the views of the Government’s Advisory Panel on Public Sector Information (APPSI) [1]. Set out below is one item often discussed by APPSI which is now highly topical.

The draft PSTB minutes contain a reference to the quality of already published data sets [2]. There is a clear tension between publishing data immediately after collection and publishing it only after full quality assurance. Both have advantages. Stephan Shakespeare sought to side-step this problem by creating a twin-track approach to data publication, with near-immediate publication followed by re-release when the data had been ‘cleaned’ and explained.

Thus far, however, this debate has been couched in very general terms. Set out below is a detailed example which emphasises the importance of detailed quality information, at least about high priority data sets, together with some conclusions about prioritisation.

Statistics and the Scottish Referendum

The Scottish Referendum will occur in August 2014. Already the debate has included very contradictory public statements made by different parties based on their very different interpretations of official statistics. The UK Statistics Authority (UKSA) recognises of course that political debate will often include impassioned statements. But, in a bid to minimise misunderstandings, that body has recently published a user guide to referendum-relevant economic and other statistics [3]. This describes which statistics are suitable for particular purposes and which are comparable and which are not (e.g.
because they are collected with different assumptions, classifications or for very different purposes by the different governments in the UK). A parallel and complementary piece of work in the Government Statistical Service has ranked the intra-UK comparability of groups of statistics. Recognising that users may have trouble finding the relevant information from the 1700 or so statistics publications each year, the Office for National Statistics will produce a consolidated compendium of comparable statistics in early 2014.

[1] I am Deputy Chair of the UK Statistics Authority (UKSA) and hence can also draw on experience there as well as from APPSI.
[2] (Page 3, National Information Infrastructure, Observations and objectives set by the Board, point 3)
[3] http://www.statisticsauthority.gov.uk/assessment/monitoring/monitoring-reports/monitoring-report-62013---official-statistics-in-the-context-of-the-referendum-on-scottish-independence.pdf

Conclusions

The conclusions which can be drawn from this and other experience are:

High priority data sets which are used for matters of national importance (as in the Referendum) need clear and detailed descriptions of their strengths and weaknesses if mistakes in analysis and misrepresentations are to be minimised.

It seems unlikely however that the Government could expend the resources required to do this for all of the many thousands of existing government data sets. That said, a core set of metadata about information provenance and quality is essential for all Open Data.

This reality emphasises the need for prioritisation of data sets as promised in the Government’s response to the Shakespeare Review.

The National Information Infrastructure should be seen to have a core of these high priority data sets, with a second tier of important data sets and a third tier of those of unknown value. Over time, as new applications are found, data sets may migrate between tiers.
My guess is that only a few hundred data sets really fall in the top tier. The process of identifying them needs to be pragmatic and is, I understand, under way under the leadership of Sir Mark Walport.

A major issue is how departments can, in an era of financial constraints, be incentivised to document data and make it easily accessed. The prioritisation scheme may help.

The UKSA is beginning a review of what the whole pan-government statistical system needs to be to meet future needs. There may be merit in such a forward-looking overview occurring more widely.

I trust this is helpful. I am copying this for information to Lord McNally (APPSI’s Minister), Sir Mark Walport, Sir Nigel Shadbolt and Paul Maltby.

Yours sincerely

David Rhind
Chairman of APPSI