DUG's submission - Demographics User Group

Public Administration Select Committee
Study 6: Statistics and Open Data
Submission on behalf of the Demographics User Group
2 September 2013
Executive Summary
The Demographics User Group (DUG)1 represents 14 major commercial companies
– Barclays, Boots, Camelot, Centrica, Co-operative Group, E.ON, Everything
Everywhere, GSK, John Lewis, Marks & Spencer, Sainsbury’s, Tesco, and
Whitbread – which make extensive use of government statistics and geographical
data to understand local markets and consumers, and make decisions about large
investments in delivering better services.
These are the tip of the iceberg of 2.3 million businesses in the UK, many of which
can increase their efficiency, and grow, by using data gathered by government, which
has the great advantage of consistent collection across the whole of the country.
The key themes of this note are to:
• Recognise the importance of government open data to business
• Welcome the acceleration in progress in recent years
• Alert PASC to the fact that users outside the public sector (such as
businesses and charities) are not able to enjoy all the free access
arrangements which have been made for public sector users
• Urge PASC to press for the National Address Gazetteer and the Postcode
Address File to be made open data
PASC’s questions
1. Why is open data important?
1.1 Companies require the best possible information available to understand their
markets, and to make major investment decisions such as the opening of new retail
outlets. Government collects huge volumes of data about citizens, and the country’s
physical infrastructure, and this has the additional advantage of often being done
consistently for the whole of Great Britain, or even the entire United Kingdom. We
endorse the Shakespeare Review’s approach that information collected at public
expense should be publicly available (subject to confidentiality limitations).
1.2 The case for open data often focuses on the opportunities for new information-based
companies to start up and to grow, but we believe that the benefits of
increasing the efficiency of existing business-to-consumer companies (such as
retailers) are much larger: for example, the use of a better address register to reduce
the failure rate of millions of home deliveries would result in considerable savings
over one year.
1 http://www.demographic.co.uk/dug.html
2. Why does the Government need an open data strategy?
2.1 The great benefit of the Government’s open data strategy is that it sets the tone
in the public services, creating the expectation that data will be released unless there
is good reason.
3. What should the Government’s aims be for the release of open data?
Are the Government’s stated key outcomes in its Open Data Strategy the
right ones?
3.1 We strongly support the Government’s policy on Open Data, believing that as the
use of data increases and extends to new users, this creates new value.
4. How can those engaged in open data, and those engaged in producing
government statistics work together effectively to produce new data?
4.1 This requires continuous dialogue, so that data owners can better understand
users’ needs. The Office for National Statistics has long put significant effort into
consultation with users of the Census – the biggest of all open datasets – and this
has been very successful. The Royal Statistical Society’s Statistics User Forum
provides an excellent mechanism for understanding the needs of a wide range of
user groups. And the recently established Open Data User Group provides a
valuable vehicle for identifying users’ needs and priorities.
5. How can more statistics and administrative data of all kinds become more
freely available?
5.1 The use of government administrative files to create new statistics needs to be
accelerated. Administrative files accumulated by departments such as HMRC, DWP,
Education, the NHS, and the Home Office are immensely rich potential sources of
information about the population and its social characteristics. In recent years more
use has been made of such files to produce aggregate statistics for small areas, but
we believe that there is scope to create much more value at relatively low cost in two
ways:
• Existing statistics, but for smaller areas. Although the 2001 Census produced
many statistics down to Output Area level (c.120 households), statistics
produced in the following decade from administrative files have been created
only for larger, cruder areas. Simply aggregating administrative record files
to OA level would increase the value of the information, and be a quick win.
• New statistics from underutilised administrative files. This was encouraged in
the Treasury Select Committee’s report “Counting the Population” in 2008,
and is being taken forward by ONS in its “Beyond 2011” investigation of
alternatives to another Census, which is creating a massive opportunity to
use government administrative information for statistical purposes. In
particular, HMRC is the obvious source of information on income and wealth.
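As an illustration only, the first point above (aggregating administrative records to Output Area level) might look something like the following minimal sketch. The record layout, the field names `oa_code` and `claimant`, and the OA codes are all hypothetical and are not taken from any real departmental file; any real release would also need to respect the confidentiality limitations noted earlier.

```python
from collections import Counter

# Hypothetical unit records, each already tagged with an Output Area code.
# Field names and code values are invented for illustration.
records = [
    {"oa_code": "E00000001", "claimant": True},
    {"oa_code": "E00000001", "claimant": False},
    {"oa_code": "E00000002", "claimant": True},
    {"oa_code": "E00000001", "claimant": True},
]


def aggregate_to_oa(records, flag):
    """Count, for each Output Area, the records where `flag` is true."""
    counts = Counter()
    for rec in records:
        if rec[flag]:
            counts[rec["oa_code"]] += 1
    return dict(counts)


# Aggregate statistics per OA; no individual record is exposed.
oa_counts = aggregate_to_oa(records, "claimant")
```

The point of the sketch is simply that, once each record carries an OA code, producing small-area counts is a grouping step rather than a new data collection exercise.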
6. Is open data presented well and of adequate quality?
a. Are the formats of the data being published accessible, useable and
understandable to the public?
b. What metadata is needed to make releases useful?
c. Who will use the data released?
6.1 The top priority for users of open data has been to see the principle of free
access established, and datasets published, even if their quality is less than perfect,
and their formats not ideal. This leaves scope for improvement, and also for better
ways to search for and find relevant data. The website www.data.gov.uk now has
more than 10,000 datasets, and effort now needs to be made to highlight those which
are of particular interest to certain categories of users. For example, the ESRC’s
Retail Research Data website (http://www.retailresearchdata.org/Default.aspx)
enables insight and store location analysts working in retail organisations to
get easy access to (just) those free datasets which may be of value to their
businesses: less means more.
6.2 In this way, usage can spread from a small number of specialists, to more
mainstream analysts, and then to the much more numerous wider public, greatly
increasing use and hence the value of the data. This has been very apparent with the
increased use of the Census in the last two decades.
7. How successful has the Government’s Open Data initiative been in changing
behaviour in the Civil Service and wider public sector?
7.1 The Government’s initiative has had significant success in changing behaviour in
Whitehall Departments, and DUG really appreciates its involvement in Transparency
Boards such as Welfare (DWP), and Tax (HMRC).
7.2 However, in the wider public sector, progress has in some cases been fiercely
resisted by BIS and the Treasury. Two matters are of great concern to us:
• Firstly, the defence of the Trading Fund model, which can result in setting
prohibitively high prices aimed at a very small captive market of business
users: this approach is still used by Ordnance Survey for some of its data.
• Secondly, arrangements have been made through the Public Sector Mapping
Agreement for public sector bodies to have free access at the point of use to
Ordnance Survey’s mapping, the National Address Gazetteer, and the Post
Office’s Postcode Address File, and this is an enlightened policy. But it does
not extend to other users such as business, or charities: this is iniquitous,
especially when the government is encouraging businesses to grow. It is to
be hoped that the Shakespeare Review’s proposal for National Core
Reference Data will finally provide a mechanism to solve the problem, but,
having seen the case repelled for 15 years, we are not optimistic.
7.3 We also feel that many civil servants do not realise that “commercial users” are of
two distinct types: value-added resellers (VARs, which sell data to other businesses),
and business-to-consumer companies (which provide services to very large numbers
of citizens). For the latter, key government data sets are "business as usual"
resources, and are therefore structurally important in maintaining and growing
profitability, jobs, and tax revenues. It follows that government should not be content
to seek only the views of VARs when wishing to understand the needs of commercial
users.
8. Which datasets are the most important?
a. What are the best examples of data being made open and resultant
benefits to business or society?
8.1 The decision to make the 2001 Census freely available was a great milestone on
the journey to open data: use by commercial companies greatly increased, and it has
been used as the basis for thousands of investment decisions. The arrival of the
2011 Census is just as significant. More broadly, members of DUG have welcomed
many new open data sources, especially (some) Ordnance Survey mapping,
postcode directories, DWP statistics for small areas, Land Registry house prices, and
GP prescribing information.
8.2 The scope for further progress is summarised in our Data Manifesto (see Annex).
Of these, our top priorities are:
• The National Address Gazetteer
• All the mapping, including boundaries, needed by government, and provided
for in the Public Sector Mapping Agreement
9. How effective is the work being undertaken by the Cabinet Office to monitor
the progress of Departments in publishing their agreed datasets?
9.1 It appears to be very effective indeed.
Keith Dugmore
Director, Demographics User Group
20 Russell House
Cambridge Street
London SW1V 4EQ
TelNos: 020 7834 0966 (landline); 07976 750094 (mobile)
Email: dugmore@demographic.co.uk
PASC – Open Data – DUG Submission
Annex: DUG members’ needs for data from government – a manifesto
(Updated January 2013)
This is the latest version of priorities identified by the 14 large commercial companies who
are members of the Demographics User Group (www.demographicsusergroup.co.uk).
We believe that they would also benefit many of the country’s other 2.3 million
businesses and, indeed, other organisations such as charities, and citizens generally.
Introduction:
• In many cases individual unit records are the ideal flexible data source, but if
they need to be protected for confidentiality, tagging them with an Output Area
code, or aggregating them to OA statistics, maximises value.
• General issues: access / timeliness / format.
• Six topics are identified (bold & in green) as priorities.
• In some cases (marked #) the 2011 Census will provide new information (but
only as at March 2011).
Each entry gives the Broad Category, the Specifics, and the Possible Government
sources (in square brackets where a source relates to a single item).

Broad Category: Geographical Backdrop
Specifics:
• (1) All the mapping, including boundaries, needed by government, and provided
for in the Public Sector Mapping Agreement [Ordnance Survey]
• Infrastructure developments & plans [LAs / CLG / OS]
• Flood maps [Env Agency]

Broad Category: Retail centres
Possible Government sources: Valuation Office Agency / LAs?
Specifics:
• Retail Outlets:
  • Numbers & type
  • Speedily updated, inc. pop-up shops
  • Historical Data

Broad Category: Workplaces (# & see some information in the 2011 Census)
Possible Government sources: Inter Departmental Business Register?
Specifics:
• Locations, and numbers of workers:
  • Head Offices
  • Local Units
  • Business & Science parks

Broad Category: People’s movements / transport / location / commuting
(# & see some information in the 2011 Census)
Possible Government sources: Department for Transport for all those in this section?
Specifics:
• Traffic flows: Mode (road, rail, bus, tram, bike, + pedestrians) and
Destinations (workplace, retail, etc.)
• Car parks; Congestion charge areas
• (2) Counts of people at locations (& mobile phone coverage)
• Time-based data:
  • Seasonality
  • Weekday / weekend
  • Day & day part

Broad Category: Telecommunication
Possible Government sources: Better data from OFCOM: more recent, and for all UK
Specifics:
• Mobile coverage / phones per cell by time
• Broadband access / usage / speed
• Cable & broadband exchange traffic

Broad Category: Addresses – home & others
Specifics:
• National Statistics Postcode Directory – omitted fields, e.g. delivery points [ONS]
• PAF, & Postcode changes [Royal Mail]
• (3) The National Address Gazetteer (& see Govt news 29 Nov 2011; footnote 2)
[GeoPlace/OS]

Broad Category: Properties – housing & business
Specifics:
• Addresses of premises (schools, hospitals, surgeries, clinics, etc.) [(various)]
• Council Tax bands for domestic properties, & receipts [VOA & LAs]
• Housing stock, & sales & their prices [Land Registry (& see Govt news 29 Nov 2011)]
• House rents [LAs / CLG?]
• House building & conversion completions [LAs / CLG]
• Planning applications – domestic and business properties [LAs / CLG]
• Valuation lists for business properties [VOA]

Broad Category: People & their circumstances (# & see some information in the
2011 Census)
Specifics:
• Aggregate data from government data silos – person & household; ideally a
single customer / citizen view [DWP, HMRC, Education, etc.]
• Electoral Roll (if not opted out) [LAs]
• County Court Judgments for debt – personal & corporate [MoJ & Registry Trust]
• (4) Household income & disposable income / cost of housing / wealth [HMRC]
• Immigration / migration; house occupancy / multiple occupation [ONS]

Broad Category: Business, the economy & investment
Specifics:
• Company information [Companies House (& see Govt news 29 Nov 2011)]
• Efficiency by area; GDP by area [ONS / HMT / BIS?]
• Levels of government investment: geographical location & nature
[ONS / HMT / BIS?]

Broad Category: Weather
Specifics:
• Weather: historic and current data, and also forecasts [Met Office (& see Govt
news 29 Nov 2011)]

Broad Category: ONS’s Neighbourhood Statistics, and new statistics created from
administrative databases
Specifics:
• (5) Recreate existing statistics at Output Area level (c.f. the current higher /
less valuable Super OA level), and create new OA-level statistics starting with
the topics identified by the Beyond 2011 project [ONS and government departments]

Broad Category: Government’s existing sample surveys (e.g. those held at the
University of Essex)
Specifics:
• (6) Provide access to anonymised unit records for analytical purposes. All
surveys should be coded with ONS’s Output Area Classification (OAC). The
Living Costs and Food Survey, the Wealth & Assets Survey, Understanding
Society, and National Well-Being (the “Happiness Index”) are of particular
interest to commercial companies. [ONS / ESRC]

2 http://www.cabinetoffice.gov.uk/resource-library/open-data-measures-autumn-statement-2011

DUG’s Data Manifesto – January 2013
Keith Dugmore