The Pistoia Alliance
pistoiaalliance.org
A Construct for Pre-competitive Collaboration and Open Innovation
2010

Agenda
• Origins of Pistoia
  – History
  – Industry Drivers
  – Technology Trends
• Scope and Operations of Pistoia
  – Mission, Membership, Governance
  – Projects and Deliverables
• Discussion

Industry Driver: Externalization
Cost pressures, disruptive technologies, and other forces often drive business processes to be externalized.
[Figure: Fully Internal Model versus Selectively Integrated Model. Labels: PHARMA, CHEM CRO, BIO CRO, DATA CRO; process steps DESIGN, SYNTHESIZE, REGISTER, DISTRIBUTE, ASSAY, REPORT.]

Emerging Net-centric Pharma Processes
[Figure: network of PHARMA 1, PHARMA 2, PHARMA 3 and CRO 1, CRO 2, CRO 3, CRO 4.]

External Interaction
How the groups interact to support the wider information supply chain
[Figure: labels: BioIT Alliance, IMI, PRISM, SILA, W3C HCLS, Pistoia Alliance, CDISC; Grand Challenges & Use Cases, Pilots & Prototypes, Implementations, Product and Service Suppliers, All business models.]

Opportunity: Changing Tech Landscape
More Robust Technologies
• Web 2.0 / 3.0
• Services-Oriented Architecture
• Software-as-a-Service
• Open Source Initiatives
More Robust External Content
• Publicly available chem and bio sources
• Richer literature content
• Academic Sources of Tools and Data

The Path Forward: Standardize, Simplify, Centralize
• Standardize our interfaces and messages
• Simplify our cross-industry architectures and support models
• Centralize services to reap economies of scale and scope

Pistoia Mission
• The Mission – Pistoia will standardize and streamline data interchange in life science R&D.
• The Method – Precompetitive collaboration between life science, academia, and commercial partners.
• The Result – Standardization, simplification & centralization will drive down the cost of data exchange, cloud computing, and process outsourcing.
• The Benefit – Informatics organizations can streamline commodity services and focus investment on innovation in R&D.

Benefits of Pistoia
• R&D Organizations
  – Optimized investments
    • e.g. reduction of redundant investments across industry
  – Increased agility to leverage global R&D
    • Rapid integration, streamlined data interchange and analysis
• Informatics Solution Providers
  – New markets and business models
  – Reduced cost-of-entry to markets
  – Reduction of customized solutions, leading to higher margins

Pistoia Membership
updated: April 2010

Pistoia Summary
• Overall:
  – Not just a standards body
    • Where possible adopting/promoting existing standards.
  – Defining, refining & publishing X-Pharma use cases
  – Funding mechanism for pilot PoCs to promote standards and use case adoption
  – Influencing the vendor community
  – Helping develop business models
  – Ability to support future Informatics Innovation

Pistoia Programme Plan Q2 2010 (Draft)
[Figure: programme plan timeline covering 2009, 2010 (Now, In-flight), and 2011.
Legend: Working Groups; Domains-Governance; Domains-Direction; Pistoia Activity; Related External Initiatives; Working Group; Working Group – possible; Develop Standard; Extra funding raised; Approved; Not started; W = Workshop; P = Publication; M = Multiple Participants; Pistoia Participants shown by ticker code (AZN, GSK, ROG, PFZ, NOV, UNL).
Domains: Knowledge and Information Services (Ian Dix & Cory Brouwer); Biology; Chemistry; Translational; Infrastructure.
Items: SESL; Vocab scoping; Vocab Phase 2; Open Pharmacology (OPS); IMI; Sequence Services; ELN Query Services; ELN Query Services Phase 2; Domain Vision; Investigation/Study/Assay (ISA); CaBig; Board face to face; Planning; Pistoia EBI Advisory board?; Pistoia Web site; Collaboration Environment; Technical Vision; BBSRC; Links to external groups; Technical Governance; External Liaison; Comms Strategy.]

Current, Active Pistoia Projects
• Semantic Enrichment of Scientific Literature – An open knowledge brokering framework standard which will reduce the costs of integration from disparate sources.
• Sequence Services – A standard service to provide access to public, private & commercial data & tools, that will enable scientists to search, store & analyse all their sequence based data in a single web interface.
• ELN Query Service – A query service standard applicable for use with data types commonly found in electronic lab notebooks

SESL Overview

SESL: Biomedical Knowledge Service Framework
[Figure: labels: Multiple Consumers (Target Dossier, Compound Dossier, Disease Dossier); Consumer Firewall; Service Layer; Integrator; Open Assertion & Meta Data Mgmt Stds; Transform / Translate; Network Viz; Std Public Vocabularies; Business Rules; Knowledge Applications; Common Service Broker; Proprietary Service Broker; Supplier Firewall; Content Suppliers (Db 2, Db 3, Db 4, Corpus 1, Corpus 5); Effort required to fit DBs to service layer.]

A Production SESL Service
[Figure: Consumer Side: Exemplar Application (Disease Dossier), License, Service Layer, Integrator, Assertion & Meta Data Mgmt, Transform / Translate, Std Public Vocabularies. Supplier Side: Broker Org #1, #2, #3, #4 with repeated Service Layer, Business Rules, Transform / Translate, Integrator, Assertion & Meta Data Mgmt, and Std Public Vocabularies components, over Db 2, Db 3, Db 6, Db 7, Db 10, Db 11, Db 14, Db 15 and Corpus 1, 4, 5, 8, 9, 12, 13, 16.]

Current SESL Participants
Participant                         Contribution
AstraZeneca                         Funding
GSK                                 Funding
Roche                               Funding
Pfizer                              Funding
Unilever                            FTEs
European Bioinformatics Institute   FTEs & Hosting
Oxford University Press             Content
Nature Publishing Group             Content
Elsevier                            Content
Royal Society of Chemistry          Content

Sequence Services

The Vision / As we Propose Today
[Figure: Clients connected to Services and Data. Labels: Public Data, Private Data, Commercial data, External Research Partners, Etc…]

Sequence Services: One-Slide Status as of May 2010
Overall Status: Y

Project Description
As a drive to cut costs, encourage standards, and provide simplification, it is proposed that Pistoia commission a set of secure internet hosted sequence services. These services will ultimately provide access to public, private & commercial data & tools, that will enable scientists to search, store & analyse all their sequence based data in a single web interface.

Status
Deliverable / Milestone                  Planned         Actual
Project Initiation                       Q3-2009         achieved
Vision & UseCase definitions             Q4-2009         achieved
Engage with 3rd Party Organisations      Q1-2010
Formal Presentations for POC projects    May/June 2010
Deliver Phase 1 POC                      DD-MonthYY      DD-MonthYY
Further date listed: Q3-2010

Key Accomplishments to date
• Defined Project Vision.
• Split Vision into achievable phases of delivery.
• Defined Phase 1 use cases.
• Focus on Non-Functional use cases.
• Scoring criteria in final stages of drafting.
• 5 Vendor presentations booked during May / June 2010: Cognizant, British Telecom, ThomsonReuters, Genome Quest, & Constellation Technologies.

Issues / Risks / Escalation requests
• Risk of partners not being willing to engage. (low risk)
• Risk of not being able to find partner(s) who can undertake the work within our estimated budget. (med risk)
• Risk of service performance not reaching acceptable levels. (med risk)

Budget Summary
Budget: £210K   Actual: £0K   Variance: +£210K

Working Group
Simon Thornber (GSK), Cary O’Donnell (AstraZeneca), Quan Yang (Novartis), Monica Arenz (Novartis)

Dashboard
Schedule: Y   Cost: R   Resources: Y   Technical: G   Bus Obj: G
Project Phase: Moving to Implement

ELN Query Services

ELN Query Service Vision
[Figure: two stacks, each showing Exploitation Clients and an ELN Application over Pistoia Query Services, Core Services, and Data Services for Experiments, Analytical, and Chemical Structures.]

ELN Query Services Workgroup Dashboard: Status Report as of May 2010
Overall Status: Y

Project Description
To deliver a query service standard applicable for use with data types commonly found in electronic lab notebooks (ELNs). The initial scope will be chemistry-related ELNs, but the solution should aim to be general enough that it can be applied to other scientific notebook applications.

Project Benefits
• Searching of data stored in ELNs from different vendors.
• Lowering the costs of using ELN data with partners and CROs.

Status
Deliverable / Milestone                                 Planned    Actual
Team start-up                                           April 09   April 09
Phase 1. Publication of final user stories              Sept 09    Nov 09
Phase 2. Review of interim standard                     April 10
Phase 2. Publication of Query Service definition
Phase 3. Delivery of POC using data standard
Further dates listed: April 10, 31-Dec-10

Key Accomplishments to date
• Defined project scope/phases/deliverables.
• Delivered phase 1 user stories to define problem space.
• Engaged group for phase 2, and created RFP to engage external resource.
• RFP exercise completed, vendor for phase 2 standard development chosen.
• GGA engaged and work commenced.

Issues / Risks / Escalation requests
• Cost Issue: No budget associated with workstream. Need finance for phase 2 to service the RFP; will likely need further finance for phase 3.
• Schedule Issue: Team have a day job that takes precedence over the workstream.
• Resources Issue: Need for architecture input to assist with review of phase 2 output. John Duncan proposed.
• Technical Issue: Unlikely until phase 3.

Budget Summary
Budget: $0K   Actual: $15K   Variance: -$15K

Working Group
• Richard Bolton, GSK, Coordinator
• David Drake, AZ, team member
• Steve Trudel, Pfizer, team member
• John Duncan, Pfizer, team member
• Uwe Geissler, Novartis, team member
• Carol McNab, BMS, team member
• Vendor representatives from Symyx, Edge, Accelrys

Dashboard
Schedule: Y   Cost: Y   Resources: Y   Technical: G   Bus Obj: G
Project Phase: Work

Current Status
• There are now 31 members of the group in the Ning website from a mix of pharma (GSK, Pfizer, Novartis, AZ, BMS) and vendors (Symyx, Edge Consulting, Accelrys; CS yet to join). Migrated to Basecamp.
• Active participation at biweekly meetings from GSK/AZ/Pfizer/BMS/Symyx/Edge/Accelrys
• Agreed 3 delivery phases
  – Phase 1: Definition of problem space and creation of user stories. Complete. User Story Document ‘published’.
  – Phase 2: Creation of ELN Query services definition. End to end process run through by team to create a full model for two of the user stories. GGA chosen to complete work. Funding agreed and approved by operations team. Work started but contract not yet in place.
  – Phase 3: Creation of POC in partnership with Vendor. Not yet started. Will likely require vendor partnership, budget and technology decision.

www.pistoiaalliance.org
If you want to go fast, go alone. If you want to go far, go together.
Backup Slides

Where We Are

Who we Are: Board of Directors
GSK, AZ, Novartis, Pfizer, Lundbeck, BMS, Roche, Accelrys, ChemAxon, Symyx, CambridgeSoft, Infosys, Thomson Reuters

Pistoia Membership Levels
• Core Member ($15,000)
  – Those organisations wishing to strongly influence the strategy & direction of the Alliance
  – Have a majority on the Governance & Strategy Board
  – Pharma, Life Science, Chemicals/Biologics primary business focus
• Participating Member ($10,000)
  – Those wishing to influence the technical outcomes
  – Access to Governance & Strategy Board member openings
  – Technical & Standards Team voting
• Contributing Member (free)
  – Technical & Standards Team voting member
  – Working Group participation
  – Offer opinions on technical issues
Based around the experiences of: http://www.consortiuminfo.org/

Inconsistent Semantics degrade Effective Communication
[Figure: a CRO, using a scale of C, B, A, says “The result from our proprietary assay is ‘B’”; PHARMA replies “But we only record numbers in our assays!!”]
Greater Challenge: Unknown Semantic Collisions
Revealing assumptions is an essential component of effective communication.
Image/quote from Abdul-Malik Shakir

Crossing the Chasm
• Pistoia is the BRIDGE to cross the chasm to a more agile pre-competitive environment
[Figure labels: STANDARDIZE, SIMPLIFY, CENTRALIZE; PISTOIA, MEMBERS, SUPPLIERS.]

Effective Collaboration
[Figure labels: Participation, direction; Energy; Influence; Trust; Recognise Value (internal and external); Delivery; FTE, $$, IP Assets; Ideas, concepts, pragmatism.]

Pre-competitive Space in the Technology Lifecycle
[Figure: lifecycle stages Experiment, Innovation, Mainstream, Commodity, Legacy, with the Precompetitive Space and Opportunities for Standards marked.]

Learn from Other Industries
Transportation, Banking, Retail, Automotive, Geospatial, Clinical Healthcare

Pistoia Standards Process
[Figure: Governance & Operations (Governance & Strategy Board, Operational Team); Technical & Standards Teams; Pistoia Working Groups; Life Science Community (Software and Service Providers, Pharma/BioTech/Agro, Not for Profit, e.g. IMI, EBI). Flow labels: submit, coordinate, propose, comment, publish.]

SESL Mock Up Slide 1
Gene: abc
Relationship: Any
Disease: Diabetes
Constraint: Species: Any; Tissue: Any

SESL Mock Up Slide 2
Export to Network View | Pivot on Assertion
#   Gene    R’ship      Disease    Species   Evidence
1   abc1    Co-occurs   Diabetes   Mus       Paper UID:1234
2   abc1    Up-Reg      Diabetes   Homo      ArrayExpress: XXX
3   abc2    Co-occurs   Diabetes   Homo      Paper UID:1344
4   abc13   Co-occurs   Diabetes   Mus       Paper UID:1314
5   abc7    Mutation    Diabetes   Rattus    OMIM: XXX
6   abc1    Co-occurs   Diabetes   Mus       Paper UID:45643
7   abc1    Co-occurs   Diabetes   Homo      Paper UID:2143
8   abc1    Co-occurs   Diabetes   Mus       Paper UID:1204

SESL Mock Up Slide 3
Export to Network View | Return
#   Gene    R’ship      Disease    Species   Supporting Evidence
1   abc1    Co-occurs   Diabetes   Mus       3
2   abc1    Co-occurs   Diabetes   Homo      1
3   abc1    Up-Reg      Diabetes   Homo      1
4   abc7    Mutation    Diabetes   Rattus    1
5   abc13   Co-occurs   Diabetes   Mus       1
6   abc2    Co-occurs   Diabetes   Rattus    1

Pistoia Collaborative Working
e.g. 3 parties working together
• Past - Independence (X, Y, Z separate): We have all worked separately on our environments and with partners since we had budget and people.
• Emerging – Open Collaboration (X, Y, Z with more overlap): Agreeing the pre-competitive space allows for collaboration on Standards and Services.
• As Is - Sequence Services (X, Y, Z each holding Sequences): Companies replicate much of the same functionality and internally host external content to ensure high service levels and privacy.
• Vision - Sequence Services (X, Y, Z sharing a 3rd Party Service for Sequences): Develop services that allow decommissioning of internal services at lower or equivalent costs. Also allows for future enhancement costs to be shared.

Pistoia Domains – focused on business workflows/supply chains
[Figure labels: Enabling: Vocabulary, Knowledge and Information Services, Visualisation, Workflow, Application Integration, Others; Biology Data Services; Chemistry Data Services; Translational Data Services.]

Scope of Pistoia Efforts
Stages: Target ID, Hit ID, Lead ID, Lead Opt, Phase I, Phase II, Phase III
• Which Target? Disease Association, Bioprocess Assoc, Druggability, ‘On Target’ Safety Risk, Validation Tools, Competitive Position, Variant Selection, …
• Which Compound? DMPK Properties? BioAssay Development, Activity-Dose studies? ‘Off Target’ Safety Risk? Synthesis routes? Competitive Position? …
• Which Disease? What Biomarkers? CD positioning? Safety Biomarkers? Efficacy Biomarkers? …
Data: Genome/Genetic Data, Sequence Data, Expression Data, Structural Data, Pathway Data, Patent Data, Pharmacology Data, Literature Data

Background: How it all started
• In Pistoia, Italy
• Meeting of GSK, AZ, Pfizer and Novartis identified similar challenges and frustrations in discovery informatics
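
Appendix sketch: SESL "Pivot on Assertion"
SESL Mock Up Slides 2 and 3 above show a list of individual assertions (gene, relationship, disease, species, evidence) being collapsed into one row per assertion with a count of supporting evidence. The Python sketch below illustrates that pivot as a simple group-and-count. It is an illustration only, not the SESL implementation: the `Assertion` type and `pivot_on_assertion` function are hypothetical names, and the rows are placeholder values read off the Slide 2 mock-up, whose row pairing is itself an assumption because the slide text is flattened.

```python
from collections import Counter
from typing import NamedTuple


class Assertion(NamedTuple):
    """One evidence-backed statement, mirroring the mock-up's columns (hypothetical type)."""
    gene: str
    relationship: str
    disease: str
    species: str
    evidence: str


# Placeholder rows as read from SESL Mock Up Slide 2 (not real data).
ASSERTIONS = [
    Assertion("abc1", "Co-occurs", "Diabetes", "Mus", "Paper UID:1234"),
    Assertion("abc1", "Up-Reg", "Diabetes", "Homo", "ArrayExpress: XXX"),
    Assertion("abc2", "Co-occurs", "Diabetes", "Homo", "Paper UID:1344"),
    Assertion("abc13", "Co-occurs", "Diabetes", "Mus", "Paper UID:1314"),
    Assertion("abc7", "Mutation", "Diabetes", "Rattus", "OMIM: XXX"),
    Assertion("abc1", "Co-occurs", "Diabetes", "Mus", "Paper UID:45643"),
    Assertion("abc1", "Co-occurs", "Diabetes", "Homo", "Paper UID:2143"),
    Assertion("abc1", "Co-occurs", "Diabetes", "Mus", "Paper UID:1204"),
]


def pivot_on_assertion(assertions):
    """Group rows that share gene, relationship, disease, and species,
    and count how many pieces of evidence support each group.

    Returns (key, count) pairs, largest count first.
    """
    counts = Counter(
        (a.gene, a.relationship, a.disease, a.species) for a in assertions
    )
    return counts.most_common()


if __name__ == "__main__":
    for (gene, rel, disease, species), n in pivot_on_assertion(ASSERTIONS):
        print(f"{gene:6} {rel:10} {disease:9} {species:7} {n}")
```

With these placeholder rows, the abc1 / Co-occurs / Diabetes / Mus assertion is backed by three papers, matching the first row of the Slide 3 mock-up.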