Opening data - The National Archives

Linked Government Data
John Sheridan
2 March 2011
“With linked data, when you have some of it,
you can find other, related, data.”
Tim Berners-Lee,
“Linked Data Design Issues”
http://www.w3.org/DesignIssues/LinkedData.html
Henry Maudslay
(1771–1831)
He also developed the first industrially
practical screw-cutting lathe in 1800, allowing
standardisation of screw thread sizes for the
first time. This allowed the concept of
interchangeability (an idea that was already
taking hold) to be practically applied to nuts
and bolts. Before this, all nuts and bolts had
to be made as matching pairs only. This
meant that when machines were
disassembled, careful account had to be kept
of the matching nuts and bolts ready for when
reassembly took place.
http://en.wikipedia.org/wiki/Henry_Maudslay
Five stars
*      make your stuff available on the Web (whatever format) under an open licence
**     make it available as structured data (e.g., Excel instead of image scan of a table)
***    use non-proprietary formats (e.g., CSV instead of Excel)
****   use URIs to identify things, so that people can point at your stuff
*****  link your data to other data to provide context
Three projects
• data.gov.uk
o Supporting the transparency agenda with Linked Data
• legislation.gov.uk
o First step towards a Linked Data Statute Book
• nationalarchives.gov.uk
o Semantic Knowledge Base for the Web Archive
“The Government believes that we need
to throw open the doors of public
bodies, to enable the public to hold
politicians and public bodies to account.”
The Coalition Agreement.
“We will ensure that all data published by
public bodies is published in an open
and standardised format, so that it can
be used easily and with minimal cost by
third parties.”
The Coalition Agreement.
We are:
• developing standards for responsible publishing of key
types of data (financial data, organisation data, aggregate
statistics, location data)
• developing guidance, practices and tools that make it
easy to publish data in Linked Data form, at low cost
• making it easy for people to consume data in a
programmatic way (the Linked Data API as well as native
Linked Data techniques such as the provision of SPARQL
Endpoints)
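As a rough illustration of consuming data through a SPARQL endpoint, the sketch below builds the kind of GET request URL that the SPARQL Protocol defines, with the query text carried in a `query` parameter. The endpoint address is a placeholder assumption, not a real data.gov.uk service, and no network call is made.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint address, used only to illustrate the request shape.
ENDPOINT = "http://example.org/sparql"


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET request URL carrying the query text."""
    return endpoint + "?" + urlencode({"query": query})


query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
url = sparql_get_url(ENDPOINT, query)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
```

A client would fetch `url` with any HTTP library and ask for a result format through the `Accept` header.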
STANDARDS
Director General
  Director (Operations)
    Deputy Director (A)
  Director (Strategy)
    Deputy Director (A)
     2008    2009    2010
A   1,345   1,456   2,301
B   2,112   3,543   2,111
C   2,345   2,987   2,455
D   6,342   6,256   6,123
E   7,435   7,432   8,102
Transaction  Date        Supplier           Amount
A-1263       09/09/2010  Spottiswoode & Co  £ 2,345
A-1264       09/09/2010  JSB & Sons         £ 2,111
A-1265       09/09/2010  BLG Ltd            £ 2,455
A-1266       09/09/2010  Spottiswoode & Co  £ 6,123
A-1267       09/09/2010  BLG Ltd            £ 8,102
Standards
• Re-use where we can, create where we must
• Small, high level, light weight vocabularies
o Examples include datacube, organization, provenance
• Create local specialisations
o Examples include payments, central-government
• Post hoc linking
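To make the idea of a small payments vocabulary concrete, here is a minimal sketch that turns one spending row (transaction A-1263) into subject, property, object triples. The namespaces and property names are invented for illustration and are not the published payments vocabulary.

```python
from datetime import datetime

# Hypothetical namespaces and property names, for illustration only.
PAYMENT = "http://example.org/def/payments/"
BASE = "http://example.org/id/transaction/"


def row_to_triples(transaction, date, supplier, amount):
    """Convert one spending row into (subject, property, object) triples."""
    subject = BASE + transaction
    # Normalise the UK-style date (DD/MM/YYYY) to ISO 8601.
    iso_date = datetime.strptime(date, "%d/%m/%Y").date().isoformat()
    # Strip the currency symbol and thousands separator before parsing.
    pounds = int(amount.replace("£", "").replace(",", "").strip())
    return [
        (subject, PAYMENT + "date", iso_date),
        (subject, PAYMENT + "supplier", supplier),
        (subject, PAYMENT + "amount", pounds),
    ]


triples = row_to_triples("A-1263", "09/09/2010", "Spottiswoode & Co", "£ 2,345")
```

In a fuller version the supplier would be a URI rather than a string, which is where post hoc linking to other datasets would come in.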
DATA
http://reference.data.gov.uk/id/day/2011-01-13
http://reference.data.gov.uk/id/department/CO
http://transport.data.gov.uk/id/station/WAT
http://education.data.gov.uk/id/school/341451
http://location.data.gov.uk/id/3245677362123
http://www.legislation.gov.uk/id/ukpga/2009/12/section/2
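Several of the identifiers above appear to share the shape `http://{sector}.data.gov.uk/id/{concept}/{key}`. The sketch below mints URIs on that assumption; the function and parameter names are illustrative, and the location and legislation examples do not follow exactly this pattern.

```python
from datetime import date


def reference_uri(sector: str, concept: str, key: str) -> str:
    """Mint an identifier URI, assuming the {sector}/id/{concept}/{key} shape."""
    return f"http://{sector}.data.gov.uk/id/{concept}/{key}"


# Days are keyed by their ISO 8601 date, stations and departments by a short code.
day_uri = reference_uri("reference", "day", date(2011, 1, 13).isoformat())
department_uri = reference_uri("reference", "department", "CO")
station_uri = reference_uri("transport", "station", "WAT")
```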
PRODUCTION
Gridworks (Google Refine)
Gridworks: map and export Linked Data
PUBLICATION
Linked Data API
• Open Standard
• Generic approach for creating APIs from Linked Data
• Sits on top of a Linked Data store
• Several implementations, most mature is Puelia
• Examples for education and transport
• Also, organisations, payments information
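The general idea of an API layer sitting on top of a Linked Data store can be sketched as a translation from a simple request URL into a SPARQL query. This is not Puelia's implementation or the Linked Data API's actual configuration format: the parameter names, the `PROPERTIES` mapping and the example URL are all assumptions made for illustration.

```python
from urllib.parse import urlparse, parse_qsl

# Hypothetical mapping from short API parameter names to property URIs.
PROPERTIES = {
    "type": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "label": "http://www.w3.org/2000/01/rdf-schema#label",
}


def api_url_to_sparql(url: str, default_limit: int = 10) -> str:
    """Translate simple filter parameters into a SPARQL SELECT query."""
    params = dict(parse_qsl(urlparse(url).query))
    limit = int(params.pop("limit", default_limit))
    if not params:
        raise ValueError("at least one filter parameter is required")
    patterns = []
    for name, value in sorted(params.items()):
        if name not in PROPERTIES:
            raise ValueError(f"unknown parameter: {name}")
        # Values that look like URIs become resources, everything else a literal.
        # A real implementation would also escape the value; this sketch does not.
        term = f"<{value}>" if value.startswith("http") else f'"{value}"'
        patterns.append(f"?item <{PROPERTIES[name]}> {term} .")
    body = "\n  ".join(patterns)
    return f"SELECT ?item WHERE {{\n  {body}\n}} LIMIT {limit}"


sparql = api_url_to_sparql("http://example.org/api/school?label=Waterloo&limit=5")
```

The point of the design is that API users never see SPARQL: they work with short, guessable URLs while the store underneath stays plain Linked Data.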
And wouldn’t it be cool if we had…
UNAMBIGUOUS DEFINITIONS
Legislation as data
• Three considerations for legislation as data
o Typographic layout
o Versioning / changes over time
o Semantics
• Semantic representation using RDF and Linked Data
o URIs for things
o RDF data model
o subject - property - object
• Requires granular URIs to name things
o Identifier
o Document
o Representation
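The subject, property, object model can be sketched with plain tuples and serialised as N-Triples. The subject is a legislation.gov.uk identifier URI; the property URIs are hypothetical, and the statutory instrument URI is assumed to follow the same {type}/{year}/{number} pattern.

```python
LEG = "http://www.legislation.gov.uk/id/"
# Hypothetical vocabulary namespace, for illustration only.
VOCAB = "http://example.org/def/legislation/"

# Each statement is a (subject, property, object) triple of URIs.
triples = [
    (LEG + "ukpga/2010/32/section/12/4", VOCAB + "partOf", LEG + "ukpga/2010/32"),
    # Assumed URI for SI 2010/1937, following the identifier pattern.
    (LEG + "ukpga/2010/32/section/12/4", VOCAB + "commencedBy", LEG + "uksi/2010/1937"),
]


def to_ntriples(statements):
    """Serialise URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in statements)


ntriples = to_ntriples(triples)
```

Because every term is a URI, each statement can be pointed at, merged with other data, and queried without knowing the document layout.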
“A” changes “B” when “C” says so
• Academies Act 2010 Section 19 (2) confers power on the Secretary of State
• The Secretary of State makes SI 2010/1937
• SI 2010/1937 Schedule 3 commences Academies Act 2010 Section 12 (4)
• Academies Act 2010 Section 12 (4) inserts text into Charities Act 1993 Schedule 2 (ca)
Legislation URIs
• Identifier
o http://www.legislation.gov.uk/id/{type}/{year}/{number}/section/{number}
o eg http://www.legislation.gov.uk/id/ukpga/2010/32/section/12/4
• Document
o http://www.legislation.gov.uk/{type}/{year}/{number}/section/{number}
o eg http://www.legislation.gov.uk/ukpga/2010/32/section/12#section-12-4
• Representations
o /data.xml
o /data.xht
o /data.pdf
o /data.rdf
o and for any list, /data.feed
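The three URI layers above can be sketched as small helper functions that follow the templates on this slide. The function names are illustrative, and appending `/data.{format}` to a document URI is an assumption based on the list of representations.

```python
BASE = "http://www.legislation.gov.uk"
FORMATS = {"xml", "xht", "pdf", "rdf"}


def identifier_uri(type_, year, number, section):
    """URI naming the abstract provision (note the /id/ segment)."""
    return f"{BASE}/id/{type_}/{year}/{number}/section/{section}"


def document_uri(type_, year, number, section):
    """URI of the document that describes the provision."""
    return f"{BASE}/{type_}/{year}/{number}/section/{section}"


def representation_uri(document, fmt):
    """URI of one concrete format of a document."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return f"{document}/data.{fmt}"


ident = identifier_uri("ukpga", 2010, 32, "12/4")
doc = document_uri("ukpga", 2010, 32, "12")
rdf = representation_uri(doc, "rdf")
```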
Legislation URIs, time and extents
• Identifier
o http://www.legislation.gov.uk/id/{type}/{year}/{number}/section/{number}
• eg http://www.legislation.gov.uk/id/ukpga/2010/32/section/12/4
• Document versions
o In force
• eg http://www.legislation.gov.uk/ukpga/2010/32/section/12
o Prospective
• eg http://www.legislation.gov.uk/ukpga/2010/32/section/12/prospective
o Point in time
• eg http://www.legislation.gov.uk/ukpga/2010/32/section/12/2010-12-01
o Extents
• eg http://www.legislation.gov.uk/ukpga/2010/32/section/12/england
• eg http://www.legislation.gov.uk/ukpga/2010/32/section/12/scotland
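A minimal sketch of how the version suffixes could be appended to a document URI, based only on the examples on this slide. The function name and keyword arguments are hypothetical, and the sketch deliberately allows one qualifier at a time because the slide shows no combined forms.

```python
from datetime import date


def versioned_uri(document_uri, *, point_in_time=None, prospective=False, extent=None):
    """Append at most one version qualifier to a document URI."""
    chosen = [point_in_time is not None, prospective, extent is not None]
    if sum(chosen) > 1:
        raise ValueError("choose only one of point_in_time, prospective, extent")
    if prospective:
        return f"{document_uri}/prospective"
    if point_in_time is not None:
        # Point-in-time versions are addressed by ISO 8601 date.
        return f"{document_uri}/{point_in_time.isoformat()}"
    if extent is not None:
        return f"{document_uri}/{extent}"
    # With no qualifier the URI addresses the version in force.
    return document_uri


doc = "http://www.legislation.gov.uk/ukpga/2010/32/section/12"
as_at = versioned_uri(doc, point_in_time=date(2010, 12, 1))
scottish = versioned_uri(doc, extent="scotland")
```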
Web Archive - Semantic Knowledge Base
• The National Archives operates the UK Government
Website Archive
• Second most used web archive in the world
• Links to withdrawn documents are maintained – preserving
a wide variety of information, from datasets to documents and
press releases
• Web archives are notoriously difficult to search using
standard search technology – size, number of duplicates
• SKB procured through a competitive process
• Solution being delivered by a consortium (technologies from
Ontotext, University of Sheffield)