Grid Based Clinical Trial Scenarios
Dr Richard Sinnott
Technical Director, National e-Science Centre
Deputy Director Technical, Bioinformatics Research Centre
University of Glasgow
ros@dcs.gla.ac.uk
22nd September 2004
LSG-RG GGF-12, 22 Sep 2004

NeSC in the UK
[Map of UK sites: NeSC, HPC(x), Glasgow, Edinburgh, Newcastle, Belfast, Daresbury Lab, White Rose Grid, Manchester, CSAR, Oxford, RAL, Cardiff, Cambridge, Hinxton, London, Southampton]
• Previous work: test on UK Grid based on GT2
  – Demonstrated broad set of applications across it
    » Monte Carlo simulations of ionic diffusion through radiation damaged crystal structures
    » Integrated Earth system modelling
    » BLAST on the Grid
    » Grid Integration Test Script Suite
    » …
• Transition to OGSI/OGSA under discussion
• Two UK OGSA Grid e-Science projects started in January
  – UCL, Imperial College, Universities of Edinburgh and Newcastle
  – Universities of Portsmouth, Reading, Manchester, Westminster and CCLRC
• The next Grid software
  – Grid Service Core
  – National hosting environments & platforms
  – Combinations of services supported
  – Material and grids to support adopters
• Challenges/Opportunities: there are still issues to be resolved
  – OGSA definition and delivery?
  – Standards: OGSI, WSRF, …
  – …and technologies: GT3, GT4…

Overview
• Context of clinical trials challenges/requirements
• Overview of Virtual Organisations for Trials and Epidemiological Studies (VOTES) project
• Scenarios to be supported
• Initial thoughts on Grid usage
• Future Plans

Clinical Trials Background
• Reliable assessment of moderate effects of treatment of important diseases (e.g.
  cancer) requires studies that guarantee
  – strict control of bias (requiring randomization and analysis)
  – strict control of random error
• Up-to-date information on patterns of disease/frequency of clinical procedures can
  – help inform the design of a clinical trial
  – enable appropriate strategies for rapid and efficient recruitment to be devised
• Generating this evidence typically requires the collection of data on many thousands of people over a period of several years
  – essential to avoid misleading results caused by the play of chance
• Clinical trials usually require collaborative effort
  – data are collected from individuals and from existing health records from multiple investigative sites
  – study progress, data quality, analysis of results managed by one or more coordinating centres

Clinical Trials Demands
• Need to answer questions such as
  – How many men in Scotland between the ages of 45 and 65 had a heart attack in the last 5 years?
  – Of those that did, which of those men took drug X?
  – Did those taking drug X suffer any further major events?
• Also eventually would like to be able to answer questions like
  – Is the high rate of heart attacks in Scotland caused by genetic factors
    » (…or love for deep fried Mars Bars!!!)

Clinical Trials Requirements
Need to ensure
• Ethical and legal factors considered
  – Patient consent for the data to be made available
• Right people seeing right data and in right context
  – In moving data from site A to site C can it go via site B?
• Right services made available
  – For finding, accessing, querying, analysing data…
    » Note general publish model mostly not possible
• Provenance of data
  – Where did it come from, how was it generated, has it been validated, …

VOTES: Virtual Organisations for Trials and Epidemiological Studies
• 3 year (£2.8M) MRC funded project expected to start imminently
• Plans to develop framework for producing Grid infrastructures to address key components of clinical trial/observational study
  – Recruitment of potentially eligible participants
  – Data collection during the study
  – Study administration and coordination
• Involves Glasgow, Oxford, Leicester, Nottingham, Manchester, Imperial
[Diagram: Clinical Virtual Organisation Framework, used to realise CVO-1 (e.g. for data collection) and CVO-2 (e.g. for recruitment). Sites: GLA, OX, IMP, Lei, Nott. Data sources: disease registries, hospital databases, GPs, clinical trial data sets. Other label: Transfer Grid]

VOTES Goals: Recruitment of potentially eligible participants
• Major limiting step in clinical trials
  – Avoid sending out 200,000 letters inviting people to enter trial
    » Better to focus on specific target groups
    » Ideally deal with patients who have given consent
• Information is available electronically (in various DBs), therefore should in principle be able to access and use it
  – … but clinical data sets owned by variety of organisations?
  – existing approaches to accessing and using these data sets are cumbersome
    » e.g. clinical trials officer physically taking laptop to NHS office and querying DB whilst being monitored,
    » … data sets deleted once statistics recorded

VOTES Goals: Data collection during the study
• Increased usage of direct data entry systems
  – reduced need for extensive checks against paper case report forms
  – improves quality of data
    » automated checks during the interview, e.g.
      range checks, consistency within and between forms, eligibility for randomization
• Also in many clinical research settings, internet access is not always available when a clinician is interviewing a study participant
  – e.g. whilst on hospital ward
• Need remote data entry systems for large-scale studies to be portable
  – Should support offline functionality where data is transferred intermittently to and from the study databases
• Should support rapid tailoring to meet requirements of new trials or observational studies

VOTES Goals: Study Management
• Traditionally study data held on a central computer database and, as a consequence, remote access to live data substantially limited
• Better to have web (GRID)-based applications to facilitate more flexible coordination of large-scale studies
  – allow multiple users to have access to data and functions appropriate to their role in the study
    » should allow for fine grained authorisation…
• Should support access to relevant up-to-date information for others involved in the study
  – e.g. Steering Committee or the Data Monitoring Committee
• Allow logging of details of essential documents such as ethics and regulatory approvals
  – support rapid dissemination of study documentation such as changes to protocol
• Support coordinated transport of study treatment and collection of study samples
  – allow authorised study clinicians to review relevant data on individual study participants, including case report forms, blood results, and serious adverse events
• Should provide current information about trials activity
  – reports on the number of records screened, invitations produced, …
• Will allow study coordinators to identify and address barriers to recruitment at early stage

System Scenario for Patient Identification & Recruitment via GPs
[Diagram; labels as extracted, grouped by component]
• GPASS with private data sets
  – Check signature
  – If ok run XML search against GP DB using, e.g.
    XML search capability for GPASS (EMIS/VAMP)
  – GPASS pulls other data, e.g. from SCI store? Patients consent?
  – Forms auto completed and results to portal
  – Automatically produce letters for matching patients, e.g. using SCI Discharge; letters go to consenting patients
• Trials coordinator (secure/authorised log-in)
  – Portal has details of all GPs, … or can establish this someway? (SCI store for GP Code, GP Practice Code, CHI etc)
• Trials Portal
  – Personalised services: Trial #1, Trial #2, Trial #3
  – Authentication, authorisation
  – VO…
• Email
  – describes nature of trial, things to do to participate, other info ($)
  – has link to secure web service (Trial #1)
• GP with browser
  – Follow link with X.509 certs in browsers
  – GP decides to download signed XML pro-forma for Trial #1
    » Pre-designed for specific trials / protocols
    » Mostly completed (trial criteria fields); contain patient matching criteria
  – Need to get certificates issued for local GPs and tell them how to get them into browsers etc
• SCI Gateway
  – All comms through SCI Gateway
• Link in to data collection/analysis work (OGSA-DAI)
• Secure Data Repository
  – Relevant trial information added to secure repository, to be used with other Grid services (for stats, etc …)
  – Important to get meta-data too for provenance etc

Work Required
• Portal development
• Security aspects
• Web services for patient identification, data validity checking, …
• Workflows for cross checking various resources, e.g. death register, cancer register, with hospital records, GPASS info, … etc
• Engineer trial forms
  – Link to coding resources, e.g. WHO medications ontology, MedDRA for adverse event information, …
  – Should capture protocol information and data sets required
• Designing XML queries
  – Conform to existing schemas for NHS Scotland
  – Should capture criteria for patient matching for specific trials
• Establish Secure Repository for Trials
  – Basis for overall trials, history, provenance, record linkage, …
• Establish procedures (use existing ones?)
  for meta-data collection
  – Links to data annotation/provenance activities of NDCC

Grid Security
OGSA security
• Single sign-on based on (X.509) digital certificates
  – establish credentials
    » Certification authority based (RAL in UK)
• Services (and clients) have APIs for fine grained security
  – Based on GSS-API
• Provides for authentication but need authorisation
  – Various technologies for authorisation including PERMIS, CAS, …
• Collaborating with PrivilEge and Role Management Infrastructure Standards Validation (PERMIS) team
  – Led by Prof David Chadwick, University of Salford
    » (www.permis.org)

Security Authorisation
• PERMIS allows roles to be defined for who can do what on what
  – Policy = { Role x Target x Action }
    » Can user X invoke service Y and access or change data Z?
    » Policies created with PERMIS PolicyEditor (output is XML based policy)

Security Authorisation (continued)
• PERMIS Privilege Allocator then used to sign policies
  – Associates roles with specific users
  – Policies stored as attribute certificates in LDAP server
• When is authorisation done? Two main choices
  – Portal personalised for users based on their policies
    » If not allowed to invoke service then they do not get to see it
  – Actions of users (with given role) are authorised every time the service is invoked
    » They can see the service but potentially not be allowed to invoke it
    » Performance issues… but more likely scenario for authorisation
  – In both cases, if not explicitly agreed in policy then rejected and logged!
    » Both cases to be explored
• Use of the GGF SAML AuthZ specification
  – Already investigated in BRIDGES project
    » Identified issues with standards

Future Plans
• Accessing and using clinical trials data only the start
• Ideally want to link these data sets to numerous others to support
  – Systems biology
  – Wider e-Health initiatives
• Genetics Healthcare Initiative proposal recently submitted to Chief Scientist Office plans to do exactly this across Scotland

The Grid Enabled Bio-Data Future
[Figure: levels of biological data, listed as extracted]
Populations, Organisms, Physiology, Organs, Tissues, Cell signalling, Cell, Protein-protein interaction (pathways), Protein functions, Protein structures, Gene expressions, Nucleotide structures, Nucleotide sequences
+ links to plant/crops, environmental, health, … information sources

Bridges Project
[Diagram; labels as extracted, grouped by component]
• CFG Virtual Organisation: Glasgow, Edinburgh, Oxford, Leicester, Netherlands, London, each with private data
• Publicly curated data: Ensembl, OMIM, SWISS-PROT, MGI, RGD, HUGO, …
• Data hub: VO Authorisation, Information Integrator, OGSA-DAI
• BLAST; Synteny Grid Service

[Slides with titles only: Bridges Portal; MagnaVista; QTL upload; QTL browsing]
www.nesc.ac.uk

Grid Blast Client
• Allows ‘genome scale’ blasting
• Uses ScotGrid and idle compute resources of training lab Condor pool

DyVOSE Project
• Dynamic Virtual Organisations for e-Science Education project
• Two year project started 1st May 2004 funded by JISC
• Exploring advanced authorisation infrastructures for security
  – … in Grid Computing Module as part
    of advanced MSc at Glasgow
    » Will provide insight into rolling Grid out to the masses!
[Diagram: ScotGrid, GU Condor pool and other (known!) Grid resources; Education VO policies; PERMIS based authorisation checks and authorisation decisions]

Scottish Bioinformatics Research Network
• Four year proposal (£2.4M) expected to start November 2004
• Funded by Scottish Enterprise, Scottish Higher Education Funding Council, Scottish Executive Environment and Rural Affairs Department
  – Involves Glasgow, Dundee, Edinburgh, Scottish Bioinformatics Forum
• Aim to provide bioinformatics infrastructure for Scottish health, agriculture and industry
  – Infrastructure support at Dundee, Edinburgh and Glasgow to support first-rate research in bioinformatics at each academic institute
  – Infrastructure support at three institutes, to support inter-institutional sharing of compute and data resources through application of Grid computing
  – Outreach and training activities mediated by the Scottish Bioinformatics Forum

Conclusions
• NeSC Glasgow focussed largely on life sciences and security
• The GRID is happening and will influence life sciences
• Government SR allocated £115M for eScience
• Life sciences are key target areas for eScience
• Grid middleware and standards still “evolving”
• Non-trivial learning curve, immaturity of technology
• Security absolutely needs to be addressed by middleware developers AND educators
• Real explorations essential to inform wider standards developers (GGF, OASIS, IETF, W3C, ....) and middleware producers

Conclusions (continued)
• Main take home message: need coordination of
  – GRID software engineering/standards
  – Life science needs/standards
• Without this, metaphor of electric power GRID doomed
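
Illustrative sketch: authorisation model
The Security Authorisation slides describe a policy as a set of { Role x Target x Action } triples, two points at which it can be enforced (personalising the portal, or checking every invocation), and a rule that anything not explicitly agreed in policy is rejected and logged. The Python below is only a minimal in-memory sketch of that idea. It does not use the PERMIS API, XML policies, attribute certificates or LDAP; the class names, role names, targets and users are all hypothetical and chosen purely for illustration.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("authz_sketch")


@dataclass(frozen=True)
class Rule:
    """One permitted (role, target, action) triple."""
    role: str
    target: str
    action: str


class Policy:
    """Deny-by-default policy: only explicitly listed triples are allowed."""

    def __init__(self, rules, role_assignments):
        self.rules = frozenset(rules)
        # user -> set of roles; a hypothetical stand-in for signed
        # role assignments that a real deployment would hold elsewhere
        self.role_assignments = {u: set(r) for u, r in role_assignments.items()}

    def is_allowed(self, user, target, action):
        """Per-invocation check: reject and log anything not in the policy."""
        roles = self.role_assignments.get(user, set())
        allowed = any(Rule(role, target, action) in self.rules for role in roles)
        if not allowed:
            log.warning("rejected: user=%s target=%s action=%s", user, target, action)
        return allowed

    def visible_targets(self, user):
        """Portal personalisation: list only targets the user's roles mention."""
        roles = self.role_assignments.get(user, set())
        return sorted({rule.target for rule in self.rules if rule.role in roles})


# Hypothetical example data, not taken from any real trial policy.
policy = Policy(
    rules=[
        Rule("trial_coordinator", "recruitment_service", "invoke"),
        Rule("trial_coordinator", "trial_data", "read"),
        Rule("gp", "recruitment_service", "invoke"),
    ],
    role_assignments={"alice": ["trial_coordinator"], "bob": ["gp"]},
)

print(policy.visible_targets("bob"))                    # ['recruitment_service']
print(policy.is_allowed("alice", "trial_data", "read"))  # True
print(policy.is_allowed("bob", "trial_data", "read"))    # False (and logged)
```

Here visible_targets corresponds to the first choice on the slide (users never see services they cannot invoke) and is_allowed to the second (users may see a service but each call is checked). A deployment could combine both, at the performance cost the slide mentions.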