
Data Linkage Framework: Progress towards the vision
This paper sets out how the plans for delivering the Data Linkage Framework have developed since
the publication of ‘Joined Up Data for Better Decisions’ at the end of 2012.
• The vision, the ambitions and aims remain unchanged (Annex A).
• The Guiding Principles remain unchanged (Annex B).
• The plan for a National Privacy Advisory Committee remains unchanged (Annex C).
• The commitment to transparency, and to communication and public engagement being
  embedded across the framework, remains unchanged (a separate paper on communications
  and engagement being considered by the Board covers this).
• The need for a centre of expertise that leads the development of data linkage IT,
  methodology and logistics and that delivers a linkage service remains unchanged. The details
  of how this building block of the framework should be delivered have altered somewhat in
  light of developments in Health Informatics and Research Council activity.
The main body of this paper explores the issues around the changes to what, in ‘Joined Up Data for
Better Decisions’, had been envisaged as a combined Information Gateway and Data Linkage Service.
The Data Linkage Framework Board are asked to consider the content of this paper alongside the
presentation from Steve Pavis at the Board meeting, and advise on the future direction of work.
Sara Grainger and Steve Pavis, 29 August 2013
Reimagining the Data Linkage Centre
When we published Joined Up Data for Better Decisions we thought delivery would look like the
diagram below, with a Data Linkage Service (renamed from ‘Centre’ following misperceptions that
this indicated a data warehousing model) sitting between NRS and ISD, using those organisations to
conduct indexing and linking respectively, in order to maintain separation of functions and to use
existing skills and resources in those organisations.
The establishment of a node of the Farr Institute in Scotland and the possibility of an Economic and
Social Research Council funded Administrative Data Research Centre (ADRC) offer further
opportunity to draw on academic expertise in the use of administrative data, and require that the
model outlined in Joined Up Data for Better Decisions be revisited. The proposed revised model has
the Administrative Data Research Centre and Data Linkage Service/Centre as one entity, co-located
with the Farr Institute and informatics experts, researchers, statisticians and data scientists at No9
Bioquarter, and puts the focus on customer service and support, with the eData Research and
Innovation Service (eDRIS) taking on the role of information gateway.
The table below provides information on the primary customers and variation in charging models
between Farr, ADRC and DSLS. There are overlaps between the ADRC and DSLS in that both
primarily use similar data sources (non-health administrative data, plus survey and novel sources of
data to a lesser extent), but their primary target customers differ. Farr has a
pricing model which charges customers on a per project basis, while ADRC and DSLS are free to
customers at the point of use (funding being provided on a block basis by the ESRC and Scottish
Government respectively). Consideration could be given to amalgamating the ADRC and DSLS into a
single service which uses ‘non-health administrative data’ to meet the needs of all types of
customer, as shown in the diagram on the previous page. This would provide a clearer service
offering from customers’ perspectives.
Title | Funder | Primary customer | Charging model | Collaborative partners
Farr Institute (Scotland) | MRC | Health researchers from public, private and 3rd sector orgs | Customer charged on a per project basis | Farr UK nodes; 6 Universities
Administrative Data Research Centre | ESRC | Academics (non-health) | 30 projects per year, funded centrally by ESRC | ADRC UK nodes; Scottish Universities
Data Sharing and Linkage Service | Scottish Government | Public sector policy makers; private sector; 3rd sector orgs; general public | 60 projects per year (30 SG plus 30 other), funded by Scottish Government | Scottish Government Directorates
The Scottish ADRC proposal is deliberately designed to ensure a smooth pathway for the extension
of eDRIS to support a wider range of academic research on administrative datasets (i.e. beyond
health). The proposal is for an academic research programme involving seven interconnected
research strands, with each supported by a research fellow/data scientist. These research
fellows/data scientists will quickly develop expertise within their specialist topic areas and know the
associated administrative data well. The intention, through co-location within the same physical
space, is for the eDRIS team to work very closely with the topic specialist research fellows/data
scientists. Indeed, over time, we intend to encourage periods of shadowing or possibly secondment
between the research fellow/data scientists and eDRIS research coordination roles.
The proposed Scottish ADRC is part of a UK Administrative Data Network. The network is intended to
help with consistency in approaches to sharing and linking data across the UK, allowing us to
influence how data linkage is delivered across the UK and helping us to access data from sources in
other UK government departments and devolved administrations.
Merging the Administrative Data Research Centre and Data Linkage Service would involve combining
funding and achieving economies of scale in staffing and administration while serving a broader
range of customers than either would individually, and avoiding confusion amongst customers.
eDRIS will provide researchers with a single, initial point of contact for the use of administrative data
(http://www.isdscotland.org/Products-and-Services/eDRIS/). A named ‘research coordinator’ will
work with the researcher throughout the life of the project to:
• help the researcher to understand the DSLS and wider processes around access to data,
  becoming an approved researcher, use of safe havens etc.
• ensure that the researcher is in contact with key academic staff or dataset experts, as
  required
• provide advice on the available datasets and their suitability for the proposed study,
  including advice on coding, metadata and previous work undertaken with specific datasets
• help to obtain the required data controller permissions, including working with the National
  Privacy Advisory Committee once this is established
• liaise with data suppliers and the IT hardware providers to ensure data are transferred safely
  and efficiently between the indexer, linkage agent and safe haven
• set up secure file transfer accounts and thin client accounts, and ensure that the researcher
  is able to access study data and undertake analyses
• undertake probabilistic linking when ‘indexing’ techniques are not appropriate
• develop and keep up to date with probabilistic linkage techniques
• undertake disclosure control prior to data leaving the safe haven environment.
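Where exact ‘indexing’ on shared identifiers is not possible, probabilistic linkage scores candidate record pairs on how well their fields agree. The sketch below illustrates one widely used approach, Fellegi-Sunter match weights. It is an assumption made for illustration and not a description of the method eDRIS will use. The field names, the m- and u-probabilities and the two thresholds are all hypothetical values chosen only to make the example run.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | the two records belong to the same person)
#   u = P(field agrees | the two records belong to different people)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "date_of_birth": (0.98, 0.003),
    "postcode": (0.85, 0.02),
}


def match_weight(record_a, record_b, params=FIELD_PARAMS):
    """Sum the log2 agreement or disagreement weight of each compared field."""
    total = 0.0
    for field, (m, u) in params.items():
        value_a, value_b = record_a.get(field), record_b.get(field)
        if value_a is None or value_b is None:
            continue  # a missing value contributes no evidence either way
        if value_a == value_b:
            total += math.log2(m / u)              # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


def classify(weight, upper=10.0, lower=0.0):
    """Apply two hypothetical thresholds to a total match weight."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "clerical review"
```

Pairs scoring between the two thresholds are neither accepted nor rejected automatically, which is where manual review would normally come in. In practice the probabilities and thresholds would be estimated from the data rather than fixed by hand.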
IT infrastructure and ensuring citizens’ privacy
Within Scotland there has been a degree of investment in IT infrastructure to support health
informatics, primarily through the Wellcome and research councils’ funding of the SHIP programme
and the Chief Scientist Office (CSO) Scottish Health Sciences Collaboration. The SHIP grant provided
monies for initial scoping and development of an IT infrastructure. The Beyond 2011 programme in
NRS, which is considering data linkage approaches to delivering data traditionally gathered through
the national Census, has also delivered improved technical infrastructure.
Farr and the ADRC/DSLS will utilise (and extend) the IT infrastructure put in place by SHIP, National
Services Scotland, NHS and NRS. This existing NSS infrastructure was developed after significant
consultation with both academics and the wider public. (Details of the SHIP infrastructure are
published at
http://www.scotship.ac.uk/sites/default/files/Reports/SHIP_BLUEPRINT_DOCUMENT_final_100712.pdf
and, for reasons of brevity, are not repeated in detail here.) It is a system which follows current
best practice in data linkage by removing personal identifying information as early as possible in
the process, and it ensures that privacy is maintained through a strict separation of functions
between the indexer (to be provided by NRS) and the linkage agent at Atos. Linked data are held
within a secure environment and accessed either through a Virtual Private Network (allowing
researchers to access data from their home institution) or through a physical safe setting (a
network of which is emerging across Scotland). Researchers only ever gain access to anonymised
data within a safe haven arrangement, and only then after completing a certified course and signing
both stringent user terms and conditions and confidentiality documents.
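The separation of functions described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the SHIP blueprint's actual design: the function and class names are hypothetical, identifiers are matched exactly (a real indexer would also need to handle near matches), and random tokens stand in for index keys.

```python
import secrets


def split_dataset(rows, id_fields):
    """Data supplier step: separate identifiers from content, keyed by a local row id."""
    identifiers, payload = {}, {}
    for row_id, row in enumerate(rows):
        identifiers[row_id] = tuple(row[f] for f in id_fields)
        payload[row_id] = {k: v for k, v in row.items() if k not in id_fields}
    return identifiers, payload


class Indexer:
    """Sees identifiers only, never content. Issues one random key per person."""

    def __init__(self):
        self._keys = {}  # identifiers -> index key (held only by the indexer)

    def assign_keys(self, identifiers):
        keyed = {}
        for row_id, ident in identifiers.items():
            if ident not in self._keys:
                self._keys[ident] = secrets.token_hex(8)
            keyed[row_id] = self._keys[ident]
        return keyed  # row id -> index key, with no identifiers included


def link(keyed_payloads):
    """Linkage agent step: join content on index key. Sees no identifiers.

    keyed_payloads is a list of (row id -> index key, row id -> content) pairs,
    one pair per contributing dataset.
    """
    linked = {}
    for keys, payload in keyed_payloads:
        for row_id, content in payload.items():
            linked.setdefault(keys[row_id], {}).update(content)
    return list(linked.values())  # index keys dropped before release


# Usage with two made-up datasets
health = [
    {"name": "A Smith", "dob": "1970-01-01", "admissions": 2},
    {"name": "B Jones", "dob": "1980-05-05", "admissions": 0},
]
education = [{"name": "A Smith", "dob": "1970-01-01", "qualification": "HND"}]

indexer = Indexer()
h_ids, h_payload = split_dataset(health, ("name", "dob"))
e_ids, e_payload = split_dataset(education, ("name", "dob"))
linked = link([
    (indexer.assign_keys(h_ids), h_payload),
    (indexer.assign_keys(e_ids), e_payload),
])
```

The point of the design is that no single party holds both identifiers and content: the indexer never receives the payload, and the linkage agent never receives names or dates of birth.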
At the heart of the infrastructure lies an IBM Netezza server. This is supplemented with various
other application and calculation servers to ensure that the computing performance within each of
the available analytic software packages (SPSS, STATA, SAS, Revolution R) is outstanding.
Key components of the SHIP infrastructure are currently provided by a commercial supplier (Atos)
within a secure data centre. We would like to continue to use this service initially but over the
medium term (2-3 yrs) it would be appropriate to test the market and ensure best value for money is
being achieved. The advantages of using the existing supplier initially are that:
o the DSLS can go live quickly
o security has been tested and ‘signed off’ by NHS Scotland (including penetration
  testing by an external supplier)
o secure file transfer protocols, thin client setups and analytic software (SPSS,
  Revolution R, STATA and SAS) are all in place and ready to use
o data are routinely backed up and held securely
o there are clear service level agreements and a resilient infrastructure.
ANNEX A: THE VISION, AMBITIONS AND AIMS
Our vision for the future is one where evidence of what works in delivering positive outcomes for
all of Scotland is delivered quickly and efficiently with minimal burden on front-line services. By
improving the ethical and legal governance arrangements, and the technical capacity to securely
and efficiently link statistical data, we will enable the research needed to inform policy decisions.
Scotland will be recognised the world over as a hub of innovative and powerful statistical
research, attracting investment and job creation.
We have an ambition to build on existing successful programmes collaboratively to create a
culture where legal, ethical, and secure data-linkage is accepted and expected
• By fostering collaboration between existing data linkage programmes and initiatives we will
  co-ordinate what is currently a fragmented landscape of activities to achieve immediate
  benefits through the sharing of ideas, solutions, best practice and methods.
• By encouraging collaboration in the use and procurement of data linkage ICT across the
  public sector, we will avoid unnecessary duplication and so reduce purchasing and running
  costs.
• By increasing the value of datasets, increasing the wealth of good practice and experience in
  data linkage research, and demonstrating that Scotland is a world-leader in this field, we will
  continue to encourage research investment into Scotland.
We have an ambition to minimise the risks to privacy and enhance transparency, by driving up
standards in data sharing and linkage procedures
• By recommending a set of guiding principles for all data linkage activity we will both raise
  standards and create a clear and consistent approach to data linkage across Scotland.
• By working with the Information Commissioner’s Office to increase the understanding of the
  Data Protection Act and other legislation across all those involved in linkage activities we will
  encourage respect for privacy and proportionate and effective approaches to mitigating
  privacy risks.
• By encouraging transparency, openness and public involvement in decision making we will
  increase public understanding about how and why personal data are used for statistical and
  research purposes, and ensure the public value of research involving linkage methods.
• By co-ordinating and harmonising data access and approval processes across sectors,
  without adding layers of bureaucracy, we will streamline the establishment and
  management of data linkage projects.
We have an ambition to fully realise the benefits that can be achieved through data-linkage to
maximise the value of existing data
• By enhancing the data standards and statistical capacity we will improve the quality of data
  that exists and make advances on the evidence base, particularly in terms of a joined-up
  understanding of how outcomes are achieved, allowing for more informed spending on
  public services and early interventions that save money in the long run.
ANNEX B: THE GUIDING PRINCIPLES
The Guiding Principles are the foundation of the Framework. They are designed to assist data
controllers and other decision makers (e.g. ethics committees, privacy committees, data access
panels) to adopt a common framework for decision-making and to take a proportionate approach to
managing the risks inherent in any data linkage.
Copies of the principles are available in booklet form and online.
The principles are not rules and are not prescriptive. They are principles that we recommend are
considered ahead of any data linkage activity and where they can guide deliberations on a given data
linkage practice.
The added value of the principles lies in their guiding effect for decision-makers who must decide
whether to approve data sharing or linkage. They provide a common framework for thinking about
the kinds of issues in play and for justifying decisions about linkage or sharing. They operate most
effectively when judgment must be exercised about whether linkage or sharing should take place.
For example, a linkage might be perfectly lawful but there might still be reasons to ask on what basis
it should take place, if at all. The principles assist these deliberative processes.
The principles are intended to promote the public interest in scientifically sound, ethically robust
research while appropriately protecting privacy and encouraging a proportionate approach where
actions taken to reduce the risks to privacy are in proportion to those risks, factoring in the potential
benefits of the research.
There are three central considerations that the principles aim to assist:
• do the potential public benefits from the research justify the risks to privacy?
• what can be done to mitigate the risks to privacy?
• what can be done to increase the public benefits of data linkage and sharing?
Consideration and proportionate application of the principles should help balance these
considerations, increase the public benefits from data usage and mitigate risks to privacy. A common
framework of reference for decision-making should help to promote consistency of decision-making
and also to foster a degree of trust in the high levels of protection and transparency that the system
delivers.
The principles are the basis of all other elements of delivery of the Framework:
• our engagement work intends to widely inform and educate on how application of the
  principles can deliver public benefit through data linkage
• our work on privacy and ethics intends to help data controllers and others to use the
  principles to allow them to share data appropriately
• our work on delivering data linkage infrastructure intends to apply the principles to facilitate
  research which successfully delivers both public benefit and proportionate privacy protection.
ANNEX C: THE PRIVACY ADVISORY COMMITTEE
Barriers to data linkage include considerable variation in the interpretation of the legal landscape
and public opinion as well as a range of ethical challenges and risks associated with data linkage. We
aim to address these barriers through public engagement (next annex) and through co-ordinating and
harmonising data access and approval processes across sectors without adding layers of
bureaucracy.
The primary function of a privacy advisory committee would be to offer advice to data custodians on
cross-sectoral linkage applications, increase confidence within the public on research and address
data access issues. The development of this capacity will be taken forward in close discussion with
public bodies across Scotland to avoid overlap, additional bureaucracy and to ensure added-value.
Since publication of ‘Joined up Data for Better Decisions’ NSS:ISD have explored the potential for a
National Privacy Advisory Committee for the whole of the NHS in Scotland. This has been done with
a view to the potential that, if successful, it could expand to other (non-health) sectors in Scotland,
or that a similar model could be applied elsewhere. ISD and the Office of the Chief Statistician and
Performance have been in close communication about the responses received to the consultation
and are considering questions that have arisen such as whether the initial NPAC should be for both
Health and Social Care, rather than just health, as well as, for example:
• How can we most effectively get ‘buy-in’ from all sectors, necessary for the privacy advisory
  committee to function well?
• How should we best articulate the necessity of the Privacy Committee and the value it will
  add to the landscape in Scotland?
• How will the privacy service be funded and demonstrate value for money?
• Who will it report to and how will members be selected?
• How will it function with other UK, national and local structures?
• What kind of model should it operate in terms of logistics and applications?
• How will complaints against NPAC be received?
A working group has been established to explore possible answers to these questions, with the
following remit:
• Identify key stakeholders across the public sector in Scotland
• Act as an ambassador for NPAC
• Advise on, steer and contribute to ideas for how an NPAC could work (helping to shape a
  proposal or options for consultation)
• Contribute to and direct delivery of the consultation exercise
• Provide guidance to the DLF policy team on NPAC matters, including highlighting risk
• Provide guidance on and help deliver communications and engagement
• Help with a joined-up, coordinated response with related work streams.