The DREAM Rheumatoid Arthritis Responder Challenge: Motivation, Data, Scoring and Results
Lara Mangravite, Sage Bionetworks, on behalf of the RA Challenge Organizing Team

Challenge Organizers
Solly Sieberts, Abhi Pratap, Christine Suver, Bruce Hoff, Thea Norman, Venkat Balagurusamy, Stephen Friend, Gustavo Stolovitzky

Funders
Eli Stahl, Mt Sinai
Gaurav Pandey, Mt Sinai
Jing Cui, Brigham and Women's
Andre Falcao, U Lisbon
Robert Plenge, Merck
Peter Gregersen, Feinstein Institute
Jeff Greenberg, Corrona
Dimitrios Pappas, Corrona
Kaleb Michaud, Arthritis Internet Registry
Generators of Training Dataset

Rheumatoid Arthritis Treatment
~30% of RA patients fail to respond to anti-TNF therapy
-- Predicting nonresponse would assist in precision medicine, clinical trial design, and development of new therapies
Robert Plenge

Pharmacogenetics of anti-TNF response

Drug                       N      SNP heritability (SE)   P-value
All patients               2617   0.18 (0.10)             0.02
etanercept                 716    0 (0.34)                0.5
infliximab                 857    0.62 (0.29)             0.02
adalimumab                 1027   0.36 (0.25)             0.08
infliximab + adalimumab    1899   0.36 (0.13)             0.003

n=2,706
Cui and Stahl et al., PLoS Genetics 2013
Eli Stahl

Rationale
Given sizable estimated heritability, is it possible to use genetic features to predict treatment response?
-- Polygenic approach: combined influence of weak effects
-- Population subtypes: not all individuals react similarly
Does genetic heritability foretell genetic prediction?

RA Responder Challenge Design
Discovery (phase I)
-- GWAS of treatment response in RA (n≈2,700 patients)
-- Polygenic SNP predictor of response
-- Refine model: 1) genomic data (e.g., expression profiling), 2) peer insights, etc.
-- Open collaboration on Synapse
Plenge et al.,
Nature Genetics 2013

Validation (phase II)
-- Submit models
-- GWAS of treatment response in RA (n≈1,100 patients)
-- Score models against responses
-- Peer review and publication

RA Challenge Data
Discovery Dataset (N=2076, combined set from 4 studies)
-- Genotypes: ~2.3 million SNPs
-- Clinical: ~6 traits
-- Response
Test Data (N=723, generated for this challenge)
-- Genotypes: ~2.3 million SNPs
-- Clinical: ~6 traits

RA Challenge: Build the best possible predictors of anti-TNFa response in RA
Team Phase (February - June 2014): Self-aggregate into teams and build the best possible predictor of response.
Community Phase (July - October 2014): Work together across teams to assess the contribution of genetics to prediction.

RA Responders Challenge
Subchallenge 1: Predict treatment response as measured by change in disease activity score (DAS28) in response to anti-TNFa therapy.
Scoring: Average rank of Pearson correlation and Spearman correlation.
Subchallenge 2: Identify poor responders to anti-TNFa therapy as defined by EULAR criteria.
Scoring: Average rank of AUC and PR (area under the ROC and precision-recall curves).

Team Phase Results (32 teams)
Subchallenge 1: Predicting deltaDAS. Best models: Team Guan Lab
Subchallenge 2: Predicting nonresponders. Best models: Team Guan Lab & Team SBI_Lab
Solly Sieberts

The Community Phase (July - October)
Work in collaboration to determine:
-- Whether genetic information contributes in a meaningful way to predictions.
-- The best possible predictors of response.
-- Which components of the modeling approaches are most beneficial for this question.
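
The Subchallenge 1 scoring rule (average rank of Pearson and Spearman correlation) can be illustrated with a small sketch. This is not the organizers' scoring code: the function names, the toy data, the use of mean ranks for ties, and the convention that rank 1 is best are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def average_ranks(values):
    """Ascending ranks starting at 1; tied values share their mean rank (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def leaderboard(observed, submissions):
    """Map each team to the mean of its Pearson rank and Spearman rank (1 = best)."""
    teams = list(submissions)
    p = [pearson(submissions[t], observed) for t in teams]
    s = [spearman(submissions[t], observed) for t in teams]
    # Negate so that the highest correlation receives rank 1.
    p_rank = average_ranks([-v for v in p])
    s_rank = average_ranks([-v for v in s])
    return {t: (pr + sr) / 2 for t, pr, sr in zip(teams, p_rank, s_rank)}

# Toy data (hypothetical): observed change in DAS28 and two teams' predictions.
observed = [1.2, 0.4, 2.5, 1.9, 0.1]
submissions = {
    "team_a": [1.0, 0.5, 2.0, 1.8, 0.3],
    "team_b": [0.2, 1.5, 0.9, 1.1, 1.4],
}
scores = leaderboard(observed, submissions)
```

Averaging ranks rather than raw correlations keeps either metric from dominating because of its scale. The same pattern would apply to Subchallenge 2 with AUC and PR in place of the two correlations.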
Community Phase Participants

Community Phase Logistics
First part: teams split into groups and shared knowledge to help inform one another's efforts.
Second part: all teams came together to devise an analytical plan to explicitly address these questions.

Teams share ideas and then work individually to answer:
-- Do models using genetic features improve on prediction relative to clinical models?
-- What is the contribution of feature selection vs. modeling algorithm to performance?
-- Does the use of biological priors in feature selection improve on random selection?
-- Can a supervised ensemble approach improve upon individual predictions?

Subchallenge 1: Predicting deltaDAS [figure slides]
Subchallenge 2: Predicting Nonresponders [figure slides]
Ensemble Modeling by Gaurav Pandey

Conclusions
-- Gaussian Process Regression appears to work best for this type of problem.
-- SNP selection is more important than algorithm selection in most cases.
-- Genetic information improves prediction of nonresponders over use of clinical information.
-- The ability to predict response based on clinical features may be valuable to clinicians in and of itself.

Today's Speakers: Best Performers from Independent Team Phase
-- Fan Zhu on behalf of Team Guan Lab: A generic method for predicting clinical outcomes and drug response
-- Javier Garcia-Garcia on behalf of Team SBI_Lab: Predicting response to arthritis treatments: regression-based Gaussian processes on small sets of SNPs
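
The Community Phase question about biological priors versus random selection can be framed as an empirical comparison: score the prior-based SNP set, score many random SNP sets of the same size, and see how often random selection does at least as well. The sketch below is only one possible framing and is not the analysis plan the teams used. The function names, the `score_fn` callback, and the toy scorer are hypothetical.

```python
import random

def empirical_p_value(observed_score, random_scores):
    """Fraction of random scores at least as good as the observed one (with +1 correction)."""
    exceed = sum(1 for s in random_scores if s >= observed_score)
    return (1 + exceed) / (1 + len(random_scores))

def random_feature_sets(all_features, set_size, n_sets, seed=0):
    """Draw n_sets random feature subsets of the given size, without replacement."""
    rng = random.Random(seed)
    return [rng.sample(all_features, set_size) for _ in range(n_sets)]

def compare_to_random(selected, all_features, score_fn, n_sets=200, seed=0):
    """Score the selected set and compare it with size-matched random sets."""
    observed = score_fn(selected)
    baseline = [
        score_fn(features)
        for features in random_feature_sets(all_features, len(selected), n_sets, seed)
    ]
    return observed, empirical_p_value(observed, baseline)

# Toy setup (hypothetical): 100 SNPs, of which the first 10 are truly informative.
all_snps = [f"snp{i}" for i in range(100)]
informative = set(all_snps[:10])

def toy_score(features):
    # Stand-in for "train a model on these SNPs and evaluate it on held-out data".
    return sum(1 for f in features if f in informative) / len(features)

prior_selected = all_snps[:5]
observed_score, p_value = compare_to_random(prior_selected, all_snps, toy_score)
```

In a real analysis `score_fn` would fit and evaluate a predictor using the same algorithm for every feature set, so that any difference reflects feature selection rather than the modeling method.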