Shared Health Research Information Network (SHRINE)
Three axes for rapid production-grade deployment:
  1. POLICY
  2. TECHNOLOGY
  3. RESEARCH SCENARIOS

Andrew McMurry
Sr. Research Software Developer
Harvard Medical School Center for BioMedical Informatics
Children's Hospital Informatics Program at Harvard-MIT HST
Andrew_McMurry(@) hms.harvard.edu
https://catalyst.harvard.edu/shrine

Outline of topics covered
  Policy
    History of successful cross-institutional IRB agreements
      Integrated health care entities
      Across independent HIPAA covered entities
  Technology
    SHRINE Architecture
    Current status and roadmap
    Development Challenges and Opportunities
  Intended future translational research scenarios
    for Translational Research Requiring Human Specimens
    for Population Health Surveillance
    for Observational Studies of Genetic Variants

History of cross-institutional IRB agreements
Integrated health care entities
  Partners RPDR
  i2b2 Clinical Research Chart
  Everyday patient encounters yield huge research cohorts
  Shawn Murphy et al. (won't steal their thunder here)
  Centralized Research Patient Data Repository shared among Massachusetts General Hospital (MGH), Brigham and Women's Hospital (BWH), Faulkner Hospital (FH), Spaulding Rehabilitation Hospital (SRH), and Newton Wellesley Hospital (NWH)

History of cross-institutional IRB agreements
http://spin.chip.org/irb.html
Across independent HIPAA covered entities
  SPIN: Federated query over locally controlled de-identified databases
  Distributed pathology database shared by
    Brigham & Women's Hospital*
    Beth Israel Deaconess Medical Center*
    Cedars-Sinai Medical Center
    Dana-Farber Cancer Institute*
    Children's Hospital Boston*
    Harvard Medical School*
    Massachusetts General Hospital*
    National Institutes of Health National Cancer Institute
    Olive View Medical Center
    Regenstrief Institute
    University of California at Los Angeles Medical Center
    University of Pittsburgh Medical Center
    VA Greater LA Healthcare System
  * Participate in live “Pathology Specimen Locator”
collaboration

History of cross-institutional IRB agreements
SHRINE approach: leverage what has worked in the past
  Secure IRB approvals for the i2b2 local database at each site
  Separate set of approvals for federated queries across sites

SHRINE governance principles
  Hospital autonomy: each site remains in control over all disclosures
  Patient privacy: no attempts to re-identify patients
  Non-compete: no attempts to compare quality of care across sites

SHRINE Technical Architecture: Bird’s Eye View
  Leverage local i2b2 deployments
  Broadcast queries and aggregate responses across autonomous sites as if they were “one clinical data warehouse”
  There is no central database
  Connect sites in a peer-to-peer or hub-spoke fashion

SHRINE Technical Architecture
  Technical Architecture
  Architecture, “cell” view (2009 deliverable)
  Architecture, sequence diagram view

SHRINE Technical Architecture: Current Status
Harvard Effort
  Prototype system running live at Harvard across BIDMC, Children’s, and Partners representing both BWH and MGH.
  Uses 1 year of real patient data
  Demographics and diagnosis
  Under tight IRB control

SHRINE Technical Architecture: Current Status
National Effort: west coast partners
  University of Washington
  UCSF
  UC Davis
  Recombinant
  End-to-End Demo March 18th (3 week turnaround time)

SHRINE Technical Architecture: Current Status
National Effort: sleep study partners
  Case Western Reserve University
  University of Wisconsin-Madison
  Marshfield Clinic
  (potentially others as well)
  i2b2 users interested in using SHRINE for sleep studies

SHRINE Technical Architecture
  i2b2 single site query demo: http://I2b2.org/software
  SHRINE multi-site demo: http://cbmi-lab.med.harvard.edu:8443/i2b2

SHRINE Technical Architecture: Timeline and Roadmap
  By end of 2009, Harvard SHRINE queries for aggregate counts
    Demographics + ICD9 Diagnosis
  Current work
    Polishing demonstration software for release
    Medications and Labs
  Next Steps
    Browseable random LDS datasets
    Downloadable LDS
    No plans for PHI

Development Challenges and Opportunities
  1. Grid computing makes multi-threading look simple by comparison
       Politically impossible to send patient data to each ‘grid’ node
       Grid computing and federated queries are VERY different
       Pre-processing can be used effectively as shown in our use cases
  2. Open Source strategy
  3. Writing plug-ins for the SHRINE network

Development Challenges and Opportunities
  1. Grid computing makes multi-threading look simple by comparison
  2. Hosted retreat to address Open Source strategy
       Harvard CTSA, CHIP, i2b2, Partners, DFCI, private companies
       Science Commons, jQuery
       Actively launching an open source portal
       Test driven development with continuous integration
       Release early, release often
       All milestones measured by what we can get IRB approved and deployed with real clinical data
  3. Writing analysis plug-ins for the SHRINE network

Development Challenges and Opportunities
  1. Grid computing makes multi-threading look simple by comparison
  2. Open Source strategy
  3.
     Writing analysis plug-ins for the SHRINE network
       Using i2b2 Java Workbench (Shawn Murphy et al.)
       Using i2b2 Web Querytool (Griffin Weber et al.)
       By pre-processing results when required for patient privacy *
  * http://www.jamia.org/cgi/content/abstract/14/4/527

SHRINE: Intended Investigation Use Cases
  For translational studies requiring human specimens
  For Population Health Surveillance
  For Observational Studies of Genetic Variants*
  Examples shown here reflect current projects which will use the SHRINE infrastructure

for Translational Research Requiring Human Specimens
  NCI vision 2001: Vast collections of human specimens and relevant clinical data exist all over the country, yet are infrequently shared for cancer research.
  Challenges:
    How to link existing pathology systems for cancer research?
    How to ensure patient privacy in accordance with HIPAA?
    How to encourage hospital participation?
  Availability
    Millions of Paraffin Embedded Tissues
    Smaller Collections of Fresh / Frozen Tissues

for Translational Research Requiring Human Specimens
Shared Pathology Informatics Network
  National prototype including HMS, UCLA, Indiana, UPMC, …
  Live Production instance at HMS including 4 hospitals
  Created Open Source Tools
    caBIG adopted caTIES from SPIN
  Influenced Markle’s Common Framework federated query
  TMA construction using specimens from four sites
  http://spin.chip.org

For Population Health Surveillance
  For translational research requiring human specimens
  For Population Health Surveillance
    Geotemporal cancer disease incidence rates
    Seasonal infectious diseases such as influenza
    Disease flares such as Inflammatory Bowel Disease (IBD)
  Other use cases exist; these are the ones under concentrated study

For Population Health Surveillance: disease outbreaks

For Population Health Surveillance: seasonal influenza
  http://aegis.chip.org/flu

For Population Health Surveillance:
pharmacovigilance
  http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0000840

SHRINE: Intended Investigation Use Cases
  For translational research requiring human specimens
  For population health surveillance
  For Observational Studies of Genetic Variants*
    High throughput genotyping + High throughput phenotyping + High throughput sample acquisition =
    Orders of magnitude
      Faster to obtain huge populations for genomic studies
      Cheaper
  *Courtesy of Zak Kohane

For observational studies of genetic variants
  High throughput sample acquisition
    CRIMSON
  High throughput genotyping
    CRIMSON samples
    SNP arrays
  High throughput phenotyping
    Natural language processing
    “smoking status”
  Orders of magnitude
    Faster to obtain huge populations for genomic studies
    Cheaper
  “disruptive technology”
  Lynn Bry, MD, PhD et al.

Summary of topics covered
  Overcome statistical noise and reproducibility problems with large patient populations
  Policy
    History of cross-institutional IRB agreements
  Technology
    Architecture
    Current status and roadmap
    Development Challenges and Opportunities
  Intended future translational research scenarios
    for Translational Research Requiring Human Specimens
    for Population Health Surveillance
    for Observational Studies of Genetic Variants

Acknowledgements: Core SHRINE team
  Zak Kohane (SHRINE Lead / HMS)
  Griffin Weber (HMS CTO / BIDMC)
  Shawn Murphy (i2b2 CRC / Partners)
  Dan Nigrin (Children’s CIO)
  Ken Mandl (Public Health Use Cases / CHIP IHL)
  Susanne Churchill (i2b2 Executive Director)
  Doug Macfadden (HMS CBMI IT Director)
  Matvey Palchuk (Ontology Lead / HMS)
  Andrew McMurry (Architect / HMS)
  Could give an entire talk on all the collaborators; this is a multi-institutional effort.
  Asking forgiveness from those not listed

Acknowledgements: Core SPIN team
  Zak Kohane (SPIN PI / HMS)
  Frank Kuo (PSL Program Director / BWH)
  John Gilbertson (PSL Pathologist / MGH)
  Mark Boguski (PSL Pathologist / BIDMC)
  Antonio Perez (PSL Pathologist / Children’s)
  Mike Banos (PSL Developer / BWH)
  Ken Mandl (Biosurveillance PI / Children’s)
  Clint Gilbert (Biosurveillance Dev Lead / Children’s)
  Greg Polumbo (SPIN Developer / HMS)
  Ricardo Delima (SPIN Developer / NCI at HMS)
  Britt Fitch (SPIN Developer / HMS)
  http://spin.chip.org/community.html

Acknowledgements: Core i2b2 team
  https://www.i2b2.org/about/structure.html

Thank You
http://catalyst.harvard.edu/shrine
Andrew_McMurry (@) hms.harvard.edu
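
Backup: sketch of "broadcast queries and aggregate responses"
The bird's eye view slide describes broadcasting a query to autonomous sites and aggregating their responses, with no central database. The Python sketch below is one minimal way to picture that pattern; it is not SHRINE's actual code or API. Every name in it (SiteResult, broadcast_query, aggregate, the site labels) is hypothetical, the counts are made up for illustration, and a thread pool stands in for the network calls a real deployment would make. Each site returns only an aggregate count, and a site that declines or fails is reported rather than hidden, in the spirit of the hospital autonomy principle.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical adapter type: takes a query string, returns an aggregate count.
# In a real network this would wrap a call to a site's local database.
SiteAdapter = Callable[[str], int]


@dataclass
class SiteResult:
    site: str
    count: Optional[int]          # None when the site failed or declined
    error: Optional[str] = None   # reason reported by the site, if any


def query_site(site: str, adapter: SiteAdapter, query: str) -> SiteResult:
    """Run the query at one site; never let one site's failure break the rest."""
    try:
        return SiteResult(site, adapter(query))
    except Exception as exc:
        return SiteResult(site, None, str(exc))


def broadcast_query(sites: Dict[str, SiteAdapter], query: str) -> List[SiteResult]:
    """Send the same query to every site concurrently and collect the replies."""
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        futures = [
            pool.submit(query_site, name, adapter, query)
            for name, adapter in sites.items()
        ]
        # Results come back in the same order the sites were listed.
        return [future.result() for future in futures]


def aggregate(results: List[SiteResult]) -> int:
    """Sum the counts from the sites that answered; no patient rows are moved."""
    return sum(r.count for r in results if r.count is not None)


if __name__ == "__main__":
    def site_c_declines(query: str) -> int:
        # A site stays in control of its disclosures and may refuse a query.
        raise PermissionError("site declined this query")

    # Made-up sites and counts, for illustration only.
    demo_sites: Dict[str, SiteAdapter] = {
        "site_a": lambda query: 120,
        "site_b": lambda query: 45,
        "site_c": site_c_declines,
    }
    replies = broadcast_query(demo_sites, "demographics AND diagnosis")
    for reply in replies:
        print(reply)
    print("total across responding sites:", aggregate(replies))
```

The aggregator only ever sees per-site counts, which is the main difference from a grid design that ships patient data to each node. Any privacy pre-processing of counts would sit inside each site's adapter in this sketch.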