Motivation The project The middleware Applications Summary

GIMI: Generic Infrastructure for Medical Informatics
Andrew Simpson, on behalf of the GIMI consortium
Oxford University Computing Laboratory
July 2008

Simpson et al GIMI

1 Motivation
2 The project
3 The middleware
    sif
    Plug-ins and data agnosticism
    Evolving access control
4 Applications
    Long-term conditions
    Mammography auditing and training
    Medical imaging in cancer care
    Other applications
5 Summary

Motivation

- Medical (and other) researchers often have significant amounts of data collected over a period of years that they would like to provide access to in real time:
  - to improve patient care;
  - to act as a training resource;
  - to achieve ‘bigger and better research’;
  - to ‘realise assets’
- On the other side of the equation, such researchers often wish to ‘get at’ large data sources in real time
- The facilitation of secure data access, sharing and integration across organisational boundaries is essential in both cases
- These issues are, of course, also relevant in many other contexts

Motivation

“Electronic records could facilitate new interfaces between care and research environments, leading to great improvements in the scope and efficiency of research. Benefits range from systematically generating hypotheses for research to undertaking entire studies based only on electronic record data . . . Clinicians and patients must have confidence in the consent, confidentiality and security arrangements for the uses of secondary data. Provided that such initiatives establish adequate information governance arrangements, within a clear ethical framework, innovative clinical research should flourish. Major benefits to patient care could ensue given sufficient development of the care-research interface via electronic records.”

J. Powell and I. Buchan. Electronic health records should support clinical research. Journal of Medical Internet Research, 7(1):e4, 2005.

Project aims

The main aim of GIMI is to develop a generic, dependable middleware layer capable of:
1 (in the short term) supporting data sharing across disparate sources to facilitate healthcare research, delivery, and training;
2 (in the medium term) facilitating data access via dynamic, fine-grained access control mechanisms;
3 (in the longer term) interfacing with technological solutions deployed within the NHS

The technology development is being validated via three applications

Partners

- University of Oxford (Computing Laboratory and Engineering Science)
- University College London
- Loughborough University
- t+ Medical
- IBM UK
- Siemens Molecular Imaging
- The National Cancer Research Institute

Funding from TSB; total project value 2.4M pounds

Work-packages

- Work-package 1: project management (OUCL)
- Work-package 2: core technology (OUCL)
- Work-package 3: long-term conditions (t+ Medical and OUES)
- Work-package 4: mammography auditing and training (UCL and Loughborough University)
- Work-package 5: medical imaging in cancer care (OUES)
- Work-package 6: clinical data management and interoperability (OUCL)

The project structure corresponds to a ‘hub-and-spoke’ model

The middleware

The aims of the project give rise to two distinct but complementary technologies:
- sif (service-oriented interoperability framework), and
- evolving access control

Not another middleware framework . . .

- Yes . . . but . . .
- Previous experience (eDiaMoND, based on a combination of GT3, OGSA-DAI and IBM products) and a lack of options three or so years ago (WSRF panned by Watson and Parastatidis; GT4 not yet released; OMII-UK not particularly portable at that point) meant that going our own way and using a WS-I+ (with our + being WS-Security) approach was the best option for us
- Data is at the heart of our concerns, so we wanted a data-oriented solution . . . and we also wanted a portable and interoperable one

Abstraction

- Issues pertaining to secure transfer and data federation are abstracted from the end user
- The middleware is agnostic to the kind of data that is shared; furthermore, it is agnostic to what is done to that data
- Via a ‘plug-in’ mechanism, domain specialists develop applications to access and manipulate remotely held data
- We aim to facilitate technology-agnosticism via data-agnosticism

Interoperability

- We assume that legacy systems have integral value: there are no pre-conceived schemas, ontologies or interfaces that render legacy data incompatible
- It is the applications and communities that determine compatibility
- It is our view that lower level interoperability should be achieved via the use of open standards; higher level (or semantic) interoperability should be achieved on an application-by-application, or domain-by-domain, basis
- The aim is to facilitate ‘bottom-up’ (rather than ‘top-down’) virtual organisations

sif and GIMI

[Figure: virtual organisations VO1, VO2 and VO3; applications App 1, App 2 and App 3; sif nodes; data sources Source 1 to Source 5]

Secure data sharing

- The responsibility for determining access to data resides with the data’s owner: it is not for us to impose pre-conceived policies on data owners; as such, expressive and flexible models should be in place to accommodate users’ needs
- We are interested in abstract (but not restrictive) representations
- We are also interested in utilising audit information to facilitate the construction of ‘meta-policies’

Technology drivers

The use of technologies that are:
- portable,
- interoperable,
- standards-based,
- freely available, and
- (where possible) open source

resulting in middleware that:
- meets our requirements with respect to abstraction, interoperability and data ownership (amongst others),
- requires minimum buy-in from data owners, and
- is straightforward for application developers to pick up and play with

A health grid architecture

[Figure: a Virtual Organisation spanning Hospital 1 and Hospital 2; labels: P1, P2, E, I, Data, ws1, ws2]

Federated database access

Plug-ins

- We wish to offer functionality for three types of plug-in: data, file, and algorithm
- By using a standard plug-in interface for each of the three types it becomes possible to add heterogeneous resources into a virtual organisation
- There is no need for the resource being advertised through the plug-in system to directly represent the physical resource: what is advertised as a single data source may come from any number of physical resources, or even another distributed system

Plug-ins (continued)

It is envisaged that the following will be ultimately accessible via plug-ins:
- Relational databases
- XML databases
- Object databases
- Picture Archiving and Communication Systems (PACS)
- Local file systems
- Network file systems
- GIMI file system
- NHS Secondary Uses Service
- Algorithms

Plug-in authoring

- Plug-in types have different schemas to conform to
- Each schema allows the author to define the interface to their plug-in
- The plug-ins are implemented in Java and packaged as Jar files
  - Can support embedded libraries and supporting files
- A plug-in defines any translation necessary between common plug-in interfaces and actual back-end implementation

Evolving access control

- Via XACML we offer the potential for fine-grained access control policies
- Furthermore, by coupling the access and audit mechanisms and allowing the description of meta-policies, data access can change as a consequence of observed actions
- The approach permits the expression of meta-policies (policies about policies), which allow policies to evolve on the basis of observed events
- Relevant events are observed as all external data access is channelled through a single access point
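
The coupling of access, audit and meta-policies described under ‘Evolving access control’ can be illustrated with a minimal Java sketch. It is not the GIMI implementation and does not use XACML: every type and method name (Policy, MetaPolicy, AccessPoint, lockOutAfter) and the lock-out rule itself are hypothetical, chosen only to show a single access point that decides each request, audits it, and then lets a policy about policies rewrite the grants on the basis of what was observed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: these types are NOT the GIMI/sif API and NOT XACML.
class EvolvingAccessSketch {

    /** One audited access attempt and its outcome. */
    static final class AuditEvent {
        final String user;
        final String resource;
        final boolean permitted;

        AuditEvent(String user, String resource, boolean permitted) {
            this.user = user;
            this.resource = resource;
            this.permitted = permitted;
        }
    }

    /** The policy: which user may read which resource (an assumed representation). */
    static final class Policy {
        private final Map<String, Set<String>> grants = new HashMap<>();

        void grant(String user, String resource) {
            grants.computeIfAbsent(user, u -> new HashSet<>()).add(resource);
        }

        void revokeAll(String user) {
            grants.remove(user);
        }

        boolean permits(String user, String resource) {
            return grants.getOrDefault(user, Collections.<String>emptySet()).contains(resource);
        }
    }

    /** A policy about policies: may rewrite the policy after inspecting the audit trail. */
    interface MetaPolicy {
        void apply(Policy policy, List<AuditEvent> audit);
    }

    /** Single access point: each request is decided, audited, then shown to the meta-policies. */
    static final class AccessPoint {
        private final Policy policy;
        private final List<MetaPolicy> metaPolicies;
        private final List<AuditEvent> audit = new ArrayList<>();

        AccessPoint(Policy policy, List<MetaPolicy> metaPolicies) {
            this.policy = policy;
            this.metaPolicies = metaPolicies;
        }

        boolean request(String user, String resource) {
            boolean permitted = policy.permits(user, resource);
            audit.add(new AuditEvent(user, resource, permitted));
            for (MetaPolicy metaPolicy : metaPolicies) {
                metaPolicy.apply(policy, Collections.unmodifiableList(audit));
            }
            return permitted;
        }
    }

    /** Example meta-policy: once a user has maxDenied refused requests, revoke all their grants. */
    static MetaPolicy lockOutAfter(int maxDenied) {
        return (policy, audit) -> {
            String user = audit.get(audit.size() - 1).user;
            long denied = audit.stream()
                    .filter(e -> e.user.equals(user) && !e.permitted)
                    .count();
            if (denied >= maxDenied) {
                policy.revokeAll(user);
            }
        };
    }

    public static void main(String[] args) {
        Policy policy = new Policy();
        policy.grant("dr_a", "patient-1");
        AccessPoint accessPoint =
                new AccessPoint(policy, Collections.singletonList(lockOutAfter(2)));

        System.out.println(accessPoint.request("dr_a", "patient-1")); // permitted by the grant
        System.out.println(accessPoint.request("dr_a", "patient-2")); // refused (first refusal)
        System.out.println(accessPoint.request("dr_a", "patient-3")); // refused (second refusal)
        System.out.println(accessPoint.request("dr_a", "patient-1")); // refused: grants revoked
    }
}
```

Because every request passes through request(), the audit trail is complete by construction, which is what allows the meta-policy to react to observed events. In a real deployment the Policy object would presumably be replaced by XACML policies and a policy decision point; that mapping is not shown here.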

Long-term conditions

- There is a drive towards self-management as a means of improving the health of patients with long-term conditions
- This needs to be supported by comprehensive IT systems for disease management, which integrate all relevant information
- Initial research has focused on the development of robust algorithms for alerting healthcare professionals when a patient’s data deviate from the expected pattern
- Telemedicine trial involving two GP surgeries from the local PCT: blood-glucose summary data are delivered from a telemedicine handset to a GP’s desktop

Application 1 authorisation requirements

- One application allows a patient to use their mobile phone to send pertinent information to a central server, where it is recorded; a second allows GPs to access the data pertaining to their own patients
- This gives rise to the requirement that a particular doctor can only view data for their own patients
- t+ Medical are providing additional applications that make use of the plug-in mechanism to provide processed data
- Access to the various plug-ins must also be restricted to those who are permitted access to the processed data

Mammography auditing and training

The main aim is to develop a prototype training tool for screening mammography which could offer radiologists a unique educational experience based on:
- intelligent selection of training activities,
- a large number of richly annotated images (previous PERFORMS cases),
- a dataset annotated with specific learning goals, and
- e-learning materials

Application 2 authorisation requirements

- The application requires access to health-related data, together with data associated with the person undergoing the training or assessment
- “Only the person who the data relates to and their supervisor (or those approved by the data subject) can see the data”
- Institutions have indicated that they need to be able to control who has access to any given annotated training data set
- It is important to be able to define policies that depend on the actions being performed: a radiologist undergoing assessment should not have any cases from their own institution contained within the data comprising the test set

Medical imaging in cancer care

- The overall aim of this application area is to:
  - develop medical image algorithms for application to breast and colorectal cancer,
  - build on previous projects, notably eDiaMoND, Mammogrid, and the NCRI Informatics Initiative demonstrator on colorectal cancer, and
  - provide a testbed and user feedback for prototypes as they are developed throughout the duration of the project
- The application is primarily concerned with providing diagnostic information based on image analysis of mammograms; to validate these algorithms, researchers require access to a large volume of image data from research archives and live healthcare systems

Application 3 authorisation requirements

- “Only allow researcher Y to see anonymised data pertaining to the following group of patients”
- “No researcher may see any data pertaining to the following patients”
- “Only data pertaining to the following patients may be viewed”
- “Each researcher has access to a customised list of attributes pertaining to particular patients”
- “A researcher can only see patients involved in a particular trial”

Other applications

- NeuroGrid
- A prototype demonstrator for the NCRI
- Facilitating access to a dementia data set based at the John Radcliffe Hospital in Oxford

It’s all about the data . . .

- GIMI has been driven by the needs of appropriate, secure data sharing
- Rather than adopt an existing toolkit-based approach, a lightweight system built from Java and web services (with a bit of C that is hidden away) has been developed
- The need for fine-grained, expressive, user-defined authorisation policies, coupled with strong auditing mechanisms (as well as the requirement that the tasks of sharing data should be simplified as much as possible), has led to the adopted approach

What the middleware doesn’t do

- Data ‘grids’, not compute grids
- Semantic interoperability
- Data management

Going on our own

Amongst other things we have had to implement the following things that would have come as standard using a grid toolkit:
- Certificate Authority (CA) management
  - Use openssl
- Proxy certificates
  - Have a custom ticketing system
- Service registries
  - Use a peer-to-peer system to locate nodes
  - Every node has the same set of services
- File storage and retrieval
  - Each node has a simple file system
  - Files are stored and retrieved from WebDAV folders
- Remote code execution
  - Computation is performed using algorithm plug-ins which typically run on their own servers

And there’s still ‘lashing’ together to do . . .

It’s just the concerns are slightly different (“will Sun’s extensions to the WS-Security libraries work with IBM JVMs?”), but we are now where we wanted to be three years ago and, most importantly, are now in a position to build applications for new collaborators in a matter of weeks

Progress