Data Linkage Framework: Progress towards the vision

This paper sets out how the plans for delivering the Data Linkage Framework have developed since the publication of ‘Joined Up Data for Better Decisions’ at the end of 2012.

- The vision, the ambitions and aims remain unchanged (annex a).
- The Guiding Principles remain unchanged (annex b).
- The plan for a National Privacy Advisory Committee remains unchanged (annex c).
- The commitment to transparency, and to communication and public engagement being embedded across the framework, remains unchanged (a separate paper on communications and engagement being considered by the Board covers this).
- The need for a centre of expertise that leads the development of data linkage IT, methodology and logistics and delivers a linkage service remains unchanged. The details of how this building block of the framework should be delivered have altered somewhat in light of developments in Health Informatics and Research Council activity.

The main body of this paper explores the issues around the changes to what, in ‘Joined Up Data for Better Decisions’, had been envisaged as a combined Information Gateway and Data Linkage Service.

The Data Linkage Framework Board are asked to consider the content of this paper alongside the presentation from Steve Pavis at the Board meeting, and advise on the future direction of work.

Sara Grainger and Steve Pavis, 29 August 2013

Reimagining the Data Linkage Centre

When we published ‘Joined Up Data for Better Decisions’ we thought delivery would look like the diagram below, with a Data Linkage Service (renamed from ‘Centre’ following misperceptions that this indicated a data warehousing model) sitting between NRS and ISD, using each organisation to conduct indexing and linking respectively, to maintain separation of functions and utilise existing skills and resource in those organisations.

The establishment of a node of the Farr Institute in Scotland and the possibility of an Economic and Social Research Council (ESRC) funded Administrative Data Research Centre (ADRC) offer further opportunity to leverage academic expertise in the use of administrative data, and require that the model outlined in ‘Joined Up Data for Better Decisions’ be revisited. The proposed revised model has the Administrative Data Research Centre and Data Linkage Service/Centre as one entity, co-located with the Farr Institute and informatics experts, researchers, statisticians and data scientists at No9 Bioquarter, and puts the focus on customer service and support through the eData Research and Innovation Service (eDRIS) taking on the role of information gateway.

The table below provides information on the primary customers and variation in charging models between Farr, the ADRC and the Data Sharing and Linkage Service (DSLS). There are overlaps between the ADRC and DSLS in that both primarily use similar data sources (non-health administrative data, plus survey and novel sources of data to a lesser extent), but with variation between their primary target customers. Farr has a pricing model which charges customers on a per project basis, while the ADRC and DSLS are free to customers at the point of use (funding being provided on a block basis by the ESRC and Scottish Government respectively). Consideration could be given to amalgamating the ADRC and DSLS into a single service which utilises ‘non-health administrative data’ to meet the needs of all types of customer, as shown in the diagram above. This would provide a clearer service offering from customers’ perspectives.

Title | Funder | Primary customer | Charging model | Collaborative partners
Farr Institute (Scotland) | MRC | Health researchers from public, private and 3rd sector orgs | Customer charged on a per project basis | Farr UK nodes; 6 Universities
Administrative Data Research Centre | ESRC | Academics (non-health) | 30 projects per year, funded centrally by ESRC | ADRC UK nodes; Scottish Universities
Data Sharing and Linkage Service | Scottish Government | Public sector policy makers; private sector; 3rd sector orgs; general public | 60 projects per year (30 SG plus 30 other), funded by Scottish Government | Scottish Government Directorates

The Scottish ADRC proposal is deliberately designed to ensure a smooth pathway for the extension of eDRIS to support a wider range of academic research on administrative datasets (i.e. beyond health). The proposal is for an academic research programme involving seven interconnected research strands, each supported by a research fellow/data scientist. These research fellows/data scientists will quickly develop expertise within their specialist topic areas and know the associated administrative data well. The intention, through co-location within the same physical space, is for the eDRIS team to work very closely with the topic specialist research fellows/data scientists. Indeed, over time, we intend to encourage periods of shadowing or possibly secondment between the research fellow/data scientist and eDRIS research coordination roles.

The proposed Scottish ADRC is part of a UK Administrative Data Network. The network is intended to help with consistency in approaches to sharing and linking data across the UK, allowing us to influence how data linkage is delivered across the UK and helping us to access data from sources in other UK government departments and devolved administrations.

Merging the Administrative Data Research Centre and Data Linkage Service would involve combining funding and achieving economies of scale in staffing and administration, while serving a broader range of customers than either would individually and avoiding confusion amongst customers.

eDRIS will provide researchers with a single, initial point of contact for the use of administrative data (http://www.isdscotland.org/Products-and-Services/eDRIS/). A named ‘research coordinator’ will work with the researcher throughout the life of the project to:

- help the researcher to understand the DSLS and wider processes around access to data, becoming an approved researcher, use of safe havens etc
- ensure that the researcher is in contact with key academic staff or dataset experts, as required
- provide advice on the available datasets and their suitability for the proposed study, including advice on coding, metadata and previous work undertaken with specific datasets
- help to obtain the required data controller permissions, including working with the National Privacy Advisory Committee once this is established
- liaise with data suppliers and the IT hardware providers to ensure data are transferred safely and efficiently between the indexer, linkage agent and safe haven
- set up secure file transfer accounts and thin client accounts, and ensure that the researcher is able to access study data and undertake analyses
- undertake probabilistic linking when ‘indexing’ techniques are not appropriate
- develop and keep up to date with probabilistic linkage techniques
- undertake disclosure control prior to data leaving the safe haven environment.

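As a rough illustration of the probabilistic linking mentioned in the list above, the sketch below scores a candidate pair of records using Fellegi-Sunter style match weights, one commonly used approach. This paper does not say which method eDRIS would use, so the field names, the m and u probabilities and the function name are all hypothetical and chosen only for illustration; the records are made up.

```python
import math

# Hypothetical per-field probabilities (illustrative only, not from this paper):
#   m = P(field agrees | the two records belong to the same person)
#   u = P(field agrees | the two records belong to different people)
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "date_of_birth": (0.98, 0.003),
    "postcode": (0.85, 0.02),
}


def match_weight(record_a, record_b, field_probs=FIELD_PROBS):
    """Sum the log2 agreement or disagreement weight of each compared field."""
    total = 0.0
    for field, (m, u) in field_probs.items():
        if record_a.get(field) == record_b.get(field):
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


# Made-up records for demonstration
a = {"surname": "smith", "date_of_birth": "1970-01-01", "postcode": "EH1 1AA"}
b = {"surname": "smith", "date_of_birth": "1970-01-01", "postcode": "EH8 9YL"}
c = {"surname": "jones", "date_of_birth": "1985-06-30", "postcode": "G1 1XQ"}

print(match_weight(a, b))  # positive: two of the three fields agree
print(match_weight(a, c))  # negative: none of the fields agree
```

In practice, pairs scoring above an upper threshold would be treated as links, pairs below a lower threshold as non-links, and those in between referred for review; the thresholds themselves would need to be chosen for each linkage.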
IT infrastructure and ensuring citizens’ privacy

Within Scotland there has been a degree of investment in IT infrastructure to support health informatics, primarily through the Wellcome and research councils’ funding of the SHIP programme and the Chief Scientist Office (CSO) Scottish Health Sciences Collaboration. The SHIP grant provided monies for initial scoping and development of an IT infrastructure. The Beyond 2011 programme in NRS, which is considering data linkage approaches to delivering data traditionally gathered through the national Census, has also delivered improved technical infrastructure.

Farr and the ADRC/DSLS will utilise (and extend) the IT infrastructure put in place by SHIP, National Services Scotland, NHS and NRS. This existing NSS infrastructure was developed after significant consultation with both academics and the wider public. (Details of the SHIP infrastructure are published at http://www.scotship.ac.uk/sites/default/files/Reports/SHIP_BLUEPRINT_DOCUMENT_final_100712.pdf and, for reasons of brevity, are not repeated in detail here.) It is a system which follows current best practice in data linkage by removing personal identifying information as early as possible in the process, and which ensures that privacy is maintained through a strict separation of functions between the indexer (to be provided by NRS) and the linkage agent at Atos.

Linked data are held within a secure environment and accessed either through a Virtual Private Network (allowing the researcher to access data from their home institution) or through a physical safe setting (a network of which is emerging across Scotland). Researchers only ever gain access to anonymised data within a safe haven arrangement, and only then after completing a certified course and signing both stringent user terms and conditions and confidentiality documents. At the heart of the infrastructure lies an IBM Netezza server.
This is supplemented with various other application and calculation servers to ensure that the computing performance within each of the available analytic software packages (SPSS, STATA, SAS, Revolution R) is outstanding.

Key components of the SHIP infrastructure are currently provided by a commercial supplier (Atos) within a secure data centre. We would like to continue to use this service initially, but over the medium term (2-3 years) it would be appropriate to test the market and ensure best value for money is being achieved. The advantages of using the existing supplier initially are that:

- the DSLS can go live quickly
- security has been tested and ‘signed off’ by NHS Scotland (including penetration testing by an external supplier)
- secure file transfer protocols, thin client setups and analytic software (SPSS, Revolution R, STATA and SAS) are all in place and ready to use
- data are routinely backed up and held securely
- there are clear service level agreements and a resilient infrastructure.

ANNEX A: THE VISION, AMBITIONS AND AIMS

Our vision for the future is one where evidence of what works in delivering positive outcomes for all of Scotland is delivered quickly and efficiently with minimal burden on front-line services. By improving the ethical and legal governance arrangements, and the technical capacity to securely and efficiently link statistical data, we will enable the research needed to inform policy decisions. Scotland will be recognised the world over as a hub of innovative and powerful statistical research, attracting investment and job creation.

We have an ambition to build on existing successful programmes collaboratively to create a culture where legal, ethical, and secure data-linkage is accepted and expected.

- By fostering collaboration between existing data linkage programmes and initiatives we will co-ordinate what is currently a fragmented landscape of activities to achieve immediate benefits through the sharing of ideas, solutions, best practice and methods.
- By encouraging collaboration in the use and procurement of data linkage ICT across the public sector, we will avoid unnecessary duplication and so reduce purchasing and running costs.
- By increasing the value of datasets, increasing the wealth of good practice and experience in data linkage research, and demonstrating that Scotland is a world-leader in this field, we will continue to encourage research investment into Scotland.

We have an ambition to minimise the risks to privacy and enhance transparency, by driving up standards in data sharing and linkage procedures.

- By recommending a set of guiding principles for all data linkage activity we will both raise standards and create a clear and consistent approach to data linkage across Scotland.
- By working with the Information Commissioner’s Office to increase the understanding of the Data Protection Act and other legislation across all those involved in linkage activities we will encourage respect for privacy and proportionate and effective approaches to mitigating privacy risks.
- By encouraging transparency, openness and public involvement in decision making we will increase public understanding about how and why personal data are used for statistical and research purposes, and ensure the public value of research involving linkage methods.
- By co-ordinating and harmonising data access and approval processes across sectors, without adding layers of bureaucracy, we will streamline the establishment and management of data linkage projects.

We have an ambition to fully realise the benefits that can be achieved through data-linkage to maximise the value of existing data.

- By enhancing the data standards and statistical capacity we will improve the quality of data that exists and make advances on the evidence base, particularly in terms of a joined-up understanding of how outcomes are achieved, allowing for more informed spending on public services and early interventions that save money in the long run.

ANNEX B: THE GUIDING PRINCIPLES

The Guiding Principles are the foundation of the Framework. They are designed to assist data controllers and other decision makers (e.g. ethics committees, privacy committees, data access panels) to adopt a common framework for decision-making and to take a proportionate approach to managing the risks inherent in any data linkage. Copies of the principles are available in booklet form and online.

The principles are not rules and are not prescriptive. They are principles that we recommend are considered ahead of any data linkage activity and where they can guide deliberations on a given data linkage practice.

The added value of the principles lies in their guiding effect for decision-makers who must decide whether to approve data sharing or linkage. They provide a common framework for thinking about the kinds of issues in play and for justifying decisions about linkage or sharing. They operate most effectively when judgment must be exercised about whether linkage or sharing should take place. For example, a linkage might be perfectly lawful but there might still be reasons to ask on what basis it should take place, if at all. The principles assist these deliberative processes.

The principles are intended to promote the public interest in scientifically sound, ethically robust research while appropriately protecting privacy and encouraging a proportionate approach where actions taken to reduce the risks to privacy are in proportion to those risks, factoring in the potential benefits of the research.

There are three central considerations that the principles aim to assist:

- do the potential public benefits from the research justify the risks to privacy?
- what can be done to mitigate the risks to privacy?
- what can be done to increase the public benefits of data linkage and sharing?

Consideration and proportionate application of the principles should help balance these considerations, increase the public benefits from data usage and mitigate risks to privacy. A common framework of reference for decision-making should help to promote consistency of decision-making and also to foster a degree of trust in the high levels of protection and transparency that the system delivers.

The principles are the basis of all other elements of delivery of the Framework:

- our engagement work intends to widely inform and educate on how application of the principles can deliver public benefit through data linkage
- our work on privacy and ethics intends to help data controllers and others to use the principles to allow them to share data appropriately
- our work on delivering data linkage infrastructure intends to apply the principles to facilitate research which successfully delivers both public benefit and proportionate privacy protection.

ANNEX C: THE PRIVACY ADVISORY COMMITTEE

Barriers to data linkage include considerable variation in the interpretation of the legal landscape and public opinion, as well as a range of ethical challenges and risks associated with data linkage. We aim to address these barriers through public engagement (next annex) and through co-ordinating and harmonising data access and approval processes across sectors without adding layers of bureaucracy.

The primary function of a privacy advisory committee would be to offer advice to data custodians on cross-sectoral linkage applications, increase public confidence in research and address data access issues. The development of this capacity will be taken forward in close discussion with public bodies across Scotland to avoid overlap and additional bureaucracy, and to ensure added value.

Since publication of ‘Joined Up Data for Better Decisions’, NSS:ISD have explored the potential for a National Privacy Advisory Committee (NPAC) for the whole of the NHS in Scotland. This has been done with a view to the potential that, if successful, it could expand to other (non-health) sectors in Scotland, or that a similar model could be applied elsewhere.

ISD and the Office of the Chief Statistician and Performance have been in close communication about the responses received to the consultation and are considering questions that have arisen, such as whether the initial NPAC should be for both Health and Social Care rather than just health, as well as, for example:

- How can we most effectively get the ‘buy-in’ from all sectors that is necessary for the privacy advisory committee to function well?
- How should we best articulate the necessity of the Privacy Committee and the value it will add to the landscape in Scotland?
- How will the privacy service be funded and demonstrate value for money?
- Who will it report to and how will members be selected?
- How will it function with other UK, national and local structures?
- What kind of model should it operate in terms of logistics and applications?
- How will complaints against NPAC be received?

A working group has been established to explore possible answers to these questions, with the following remit:

- Identify key stakeholders across the public sector in Scotland
- Act as an ambassador for NPAC
- Advise on, steer and contribute to ideas for how an NPAC could work (helping to shape a proposal or options for consultation)
- Contribute to and direct delivery of the consultation exercise
- Provide guidance to the DLF policy team on NPAC matters, including highlighting risk
- Provide guidance on and help deliver communications and engagement
- Help with a joined-up, coordinated response with related work streams.