ELN Query Service

The Pistoia Alliance
pistoiaalliance.org
A Construct for Pre-competitive Collaboration and Open Innovation
2010
Agenda
• Origins of Pistoia
– History
– Industry Drivers
– Technology Trends
• Scope and Operations of Pistoia
– Mission, Membership, Governance
– Projects and Deliverables
• Discussion
Industry Driver: Externalization
Cost pressures, disruptive technologies, and other forces often drive business processes to be externalized.
[Diagram: two operating models compared]
• Fully Internal Model
– PHARMA runs the CHEM, BIO, and DATA functions itself
• Selectively Integrated Model
– PHARMA works with a CHEM CRO, a BIO CRO, and a DATA CRO
• Process steps shown: DESIGN, SYNTHESIZE, REGISTER, DISTRIBUTE, ASSAY, REPORT
Emerging Net-centric Pharma Processes
[Diagram: network linking PHARMA 1, PHARMA 2, and PHARMA 3 with CRO 1, CRO 2, CRO 3, and CRO 4]
External Interaction
How the groups interact to support the wider information supply chain
[Diagram: Pistoia Alliance shown alongside BioIT Alliance, IMI, PRISM, SILA, W3C HCLS, and CDISC]
• Diagram labels: Grand Challenges & Use Cases; Pilots & Prototypes; Implementations; Product and Service Suppliers; All business models
Opportunity: Changing Tech
Landscape
More Robust Technologies
• Web 2.0 / 3.0
• Services-Oriented Architecture
• Software-as-a-Service
• Open Source Initiatives
More Robust External Content
• Publicly available chem and bio sources
• Richer literature content
• Academic Sources of Tools and Data
The Path Forward:
Standardize, Simplify, Centralize
• Standardize our interfaces and
messages
• Simplify our cross-industry
architectures and support models
• Centralize services to reap economies
of scale and scope
Pistoia Mission
• The Mission
– Pistoia will standardize and streamline data interchange in life
science R&D.
• The Method
– Precompetitive collaboration between life science, academia, and
commercial partners.
• The Result
– Standardization, simplification, and centralization will drive down the
cost of data exchange, cloud computing, and process outsourcing.
• The Benefit
– Informatics organizations can streamline commodity services, and
focus investment on innovation in R&D.
Benefits of Pistoia
• R&D Organizations
– Optimized investments
• e.g. reduction of redundant investments across industry
– Increased agility to leverage global R&D
• Rapid integration, streamlined data interchange and
analysis
• Informatics Solution Providers
– New markets and business models
– Reduced cost-of-entry to markets
– Reduction of customized solutions, leading to
higher margins
Pistoia Membership
updated: April 2010
Pistoia Summary
• Overall:
– Not just a standards body
• Where possible adopting/promoting existing standards.
– Defining, refining & publishing cross-pharma use cases
– Funding mechanism for pilot PoCs to promote standards and
use case adoption
– Influencing the vendor community
– Helping develop business models
– Ability to support future Informatics Innovation
Pistoia Programme Plan Q2 2010 (Draft)
[Timeline chart spanning 2009 to 2011, with "Now" and "In-flight" markers]
• Legend: Working Groups; Working Group – possible; Domains-Governance; Domains-Direction; Pistoia Activity; Related External Initiatives; W = Workshop; P = Publication; M = Multiple Participants; Extra funding raised; Develop; Standard Approved; Not started; Pistoia Participants (Ticker code)
• Domains: Knowledge and Information Services (Ian Dix & Cory Brouwer); Biology; Chemistry; Translational
• Working groups and activities: SESL; Vocab scoping; Vocab Phase 2; Sequence Services; ELN Query Services; ELN Query Services Phase 2; Domain Vision; Technical Vision; Technical Governance
• Related external initiatives: Open Pharmacology (OPS) IMI; Investigation/Study/Assay (ISA) Infrastructure; CaBig; EBI; BBSRC
• Alliance activities: Board face to face Planning; Advisory board?; Pistoia Web site; Collaboration Environment; Links to external groups; External Liaison; Comms Strategy
• Participant ticker codes shown: AZN GSK ROG PFZ UNL; AZN GSK NOV; AZN GSK NOV PFZ
Current, Active Pistoia Projects
• Semantic Enrichment of Scientific Literature
– An open knowledge brokering framework standard which will
reduce the costs of integration from disparate sources.
• Sequence Services
– A standard service to provide access to public, private &
commercial data & tools, that will enable scientists to search,
store & analyse all their sequence based data in a single web
interface.
• ELN Query Service
– A query service standard applicable for use with data types
commonly found in electronic lab notebooks
SESL Overview
SESL: Biomedical Knowledge Service Framework
[Architecture diagram, consumer side to supplier side]
• Multiple Consumers: Knowledge Applications (Target Dossier, Compound Dossier, Disease Dossier, Network Viz)
• Consumer Firewall
• Service Layer (Open Stds): Assertion & Meta Data Mgmt; Transform / Translate; Integrator; Std Public Vocabularies; Business Rules
• Service Brokers: Common Service Broker; Proprietary Service Broker
• Supplier Firewall
• Content Suppliers: Db 2, Db 3, Db 4, Corpus 1, Corpus 5
• Callout: Effort required to fit DBs to service layer
A Production SESL Service
[Architecture diagram]
• Consumer Side: Exemplar Application (Disease Dossier); License
• Supplier Side: Broker Org #1, #2, #3, and #4
• Service Layer components, each shown four times: Business Rules; Transform / Translate; Integrator; Assertion & Meta Data Mgmt; Std Public Vocabularies
• Content sources shown: Db 2, Db 3, Db 6, Db 7, Db 10, Db 11, Db 14, Db 15; Corpus 1, Corpus 4, Corpus 5, Corpus 8, Corpus 9, Corpus 12, Corpus 13, Corpus 16
Current SESL Participants
• AstraZeneca: Funding
• GSK: Funding
• Roche: Funding
• Pfizer: Funding
• Unilever: FTEs
• European Bioinformatics Institute: FTEs & Hosting
• Oxford University Press: Content
• Nature Publishing Group: Content
• Elsevier: Content
• Royal Society of Chemistry: Content
Sequence Services
The Vision
[Diagram comparing two states]
• Today: separate stacks of Client, Services, and Data, each holding Public Data, Private Data, and Commercial Data, with External Research Partners attached
• As we Propose: Clients and External Research Partners over shared Services, with Public Data, Private Data, Commercial data, etc.
Sequence Services: One-Slide Status as of May 2010
Overall Status: Y
Project Phase: Moving to Implement
Project Description
• As a drive to cut costs, encourage standards, and provide simplification, it is proposed that Pistoia commission a set of secure internet-hosted sequence services.
• These services will ultimately provide access to public, private & commercial data & tools that will enable scientists to search, store & analyse all their sequence-based data in a single web interface.
Key Accomplishments to date
• Defined Project Vision.
• Split Vision into achievable phases of delivery.
• Defined Phase 1 use cases.
• Focus on non-functional use cases.
• Scoring criteria in final stages of drafting.
• 5 vendor presentations booked during May / June 2010: Cognizant, British Telecom, ThomsonReuters, Genome Quest, & Constellation Technologies.
Deliverable / Milestone (Planned; Actual)
• Project Initiation: Q3-2009; achieved
• Vision & UseCase definitions: Q4-2009; achieved
• Engage with 3rd Party Organisations: Q1-2010
• Formal Presentations for POC projects: May/June 2010
• Deliver Phase 1 POC: Q3-2010
Issues / Risks / Escalation requests
• Risk of partners not being willing to engage. (low risk)
• Risk of not being able to find partner(s) who can undertake the work within our estimated budget. (med risk)
• Risk of service performance not reaching acceptable levels. (med risk)
Working Group
• Simon Thornber (GSK)
• Cary O’Donnell (AstraZeneca)
• Quan Yang (Novartis)
• Monica Arenz (Novartis)
Budget Summary
• Budget: £210K; Actual: £0K; Variance: +£210K
Status
• Schedule: Y; Cost: R; Resources: Y; Technical: G; Bus Obj: G
ELN Query Services
ELN Query Service Vision
[Diagram layers, in the order shown: Exploitation Clients and ELN Application; Pistoia Query Services; Core Services; Data Services; data types Experiments, Analytical, and Chemical Structures]
ELN Query Services Workgroup: Dashboard Status Report as of May 2010
Overall Status: Y
Project Phase: Work
Project Description
To deliver a query service standard applicable for use with data types commonly found in electronic lab notebooks (ELNs). The initial scope will be chemistry-related ELNs, but the solution should aim to be general enough that it can be applied to other scientific notebook applications.
Project Benefits
Searching of data stored in ELNs from different vendors. Lowering the costs of using ELN data with partners and CROs.
Key Accomplishments to date
• Defined project scope/phases/deliverables.
• Delivered phase 1 user stories to define problem space.
• Engaged group for phase 2, and created RFP to engage external resource.
• RFP exercise completed, vendor for phase 2 standard development chosen.
• GGA engaged and work commenced.
Deliverable / Milestone (Planned; Actual)
• Team start-up: April 09; April 09
• Phase 1. Publication of final user stories: Sept 09; Nov 09
• Phase 2. Review of interim standard: April 10
• Phase 2. Publication of Query Service definition
• Phase 3. Delivery of POC using data standard
• Other dates on the slide (row not recoverable): April 10; 31-Dec-10
Issues / Risks / Escalation requests
• Cost Issue: No budget associated with workstream. Need finance for phase 2 to service the RFP; will likely need further finance for phase 3.
• Schedule Issue: Team members have day jobs that take precedence over the workstream.
• Resources Issue: Need for architecture input to assist with review of phase 2 output. John Duncan proposed.
• Technical Issue: unlikely until phase 3.
Working Group
• Richard Bolton, GSK, Coordinator
• David Drake, AZ, team member
• Steve Trudel, Pfizer, team member
• John Duncan, Pfizer, team member
• Uwe Geissler, Novartis, team member
• Carol McNab, BMS, team member
• Vendor representatives from Symyx, Edge, Accelrys
Budget Summary
• Budget: $0K; Actual: $15K; Variance: -$15K
Status
• Schedule: Y; Cost: Y; Resources: Y; Technical: G; Bus Obj: G
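The project description above (one query standard that works across ELNs from different vendors) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `ExperimentRecord`, `ELNAdapter`, `InMemoryELN`, and `federated_search` are hypothetical, the sample records are invented, and none of it is taken from the Pistoia ELN Query Service definition, which was still in development when these slides were written.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class ExperimentRecord:
    """Hypothetical vendor-neutral view of one ELN experiment."""
    source: str            # which ELN the record came from
    experiment_id: str
    title: str
    keywords: frozenset


class ELNAdapter(Protocol):
    """Hypothetical interface that each vendor ELN would expose."""
    name: str

    def search(self, keyword: str) -> Iterable[ExperimentRecord]: ...


class InMemoryELN:
    """Stand-in for a vendor ELN; a real adapter would call the vendor's system."""

    def __init__(self, name, records):
        self.name = name
        self._records = [
            ExperimentRecord(name, exp_id, title, frozenset(k.lower() for k in kws))
            for exp_id, title, kws in records
        ]

    def search(self, keyword):
        wanted = keyword.lower()
        return [r for r in self._records if wanted in r.keywords]


def federated_search(adapters, keyword):
    """Send the same query to every ELN and merge the results."""
    results = []
    for adapter in adapters:
        results.extend(adapter.search(keyword))
    return sorted(results, key=lambda r: (r.source, r.experiment_id))


# Invented sample data for two hypothetical vendor systems.
eln_a = InMemoryELN("vendor_a", [("A-1", "Suzuki coupling", ["suzuki", "palladium"]),
                                 ("A-2", "Amide formation", ["amide"])])
eln_b = InMemoryELN("vendor_b", [("B-7", "Scale-up of Suzuki step", ["suzuki"])])

hits = federated_search([eln_a, eln_b], "Suzuki")
for hit in hits:
    print(hit.source, hit.experiment_id, hit.title)
```

The point of the sketch is only that client code depends on the shared interface, so a partner or CRO using a different ELN needs a new adapter and no change to the query itself.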
Current Status
• There are now 31 members of the group on the Ning website, from a mix of pharma (GSK, Pfizer, Novartis, AZ, BMS) and vendors (Symyx, Edge Consulting, Accelrys; CS yet to join). Migrated to Basecamp.
• Active participation at biweekly meetings from GSK/AZ/Pfizer/BMS/Symyx/Edge/Accelrys
• Agreed 3 delivery phases
• Phase 1: Definition of problem space and creation of user stories.
– Complete. User Story Document ‘published’
• Phase 2: Creation of ELN Query services definition.
– End-to-end process run through by team to create a full model for two of the user stories.
– GGA chosen to complete work. Funding agreed and approved by operations team. Work started but contract not yet in place.
• Phase 3: Creation of POC in partnership with vendor.
– Not yet started. Will likely require vendor partnership, budget and technology decision.
www.pistoiaalliance.org
If you want to go fast, go alone.
If you want to go far, go together.
Backup Slides
Where We Are
Who we Are: Board of Directors
GSK
AZ
Novartis
Pfizer
Lundbeck
BMS
Roche
Accelrys
ChemAxon
Symyx
CambridgeSoft
Infosys
Thomson Reuters
Pistoia Membership Levels
• Core Member ($15,000)
– Those organisations wishing to strongly influence the strategy &
direction of the Alliance
– Have a majority on the Governance & Strategy Board
– Pharma, Life Science, Chemicals/Biologics primary business focus
• Participating Member ($10,000)
– Those wishing to influence the technical outcomes
– Access to Governance & Strategy Board member openings
– Technical & Standards Team voting
• Contributing Member (free)
– Technical & Standards Team voting member
– Working Group participation
– Offer opinions on technical issues
Based around the experiences of: http://www.consortiuminfo.org/
Inconsistent Semantics degrade
Effective Communication
C
B
A
The result from our
proprietary assay is “B”
CRO
But we only record numbers
in our assays!!
PHARMA
Greater Challenge: Unknown Semantic
Collisions
Image/quote from Abdul-Malik Shakir
Revealing assumptions is an essential
component of effective communication.
Crossing the Chasm
• Pistoia is the BRIDGE to cross the chasm
to a more agile pre-competitive
environment
STANDARDIZE
PISTOIA
SIMPLIFY
MEMBERS
CENTRALIZE
SUPPLIERS
Effective Collaboration
Participation,
direction
Energy
Influence
Trust
Recognise
Value (internal
and external)
Delivery
FTE, $$, IP
Assets, Ideas,
concepts,
pragmatism
Pre-competitive Space in the
Technology Lifecycle
Experiment
Innovation
Mainstream
Commodity
Legacy
Precompetitive Space
Opportunities for Standards
Learn from Other Industries
Transportation
Banking
Retail
Automotive
Geospatial
Clinical
Healthcare
Pistoia Standards Process
Governance &
Operations
Governance
& Strategy
Board
Operational
Team
Life Science
Community
Pistoia Working
Groups
Software and
Service
Providers
submit
Technical &
Standards
Teams
coordinate
propose,
comment
Pharma/
BioTech/Agro
publish
Not for
Profit
(e.g. IMI, EBI)
SESL Mock Up Slide 1
Gene: abc
Relationship: Any
Disease: Diabetes
Constraint: Species: Any
Tissue: Any
SESL Mock Up Slide 2
Export to Network View | Pivot on Assertion
#  Gene   R’ship     Disease   Species  Evidence
1  abc1   Co-occurs  Diabetes  Mus      Paper UID:1234
2  abc1   Up-Reg     Diabetes  Homo     ArrayExpress: XXX
3  abc2   Co-occurs  Diabetes  Homo     Paper UID:1344
4  abc13  Co-occurs  Diabetes  Mus      Paper UID:1314
5  abc7   Mutation   Diabetes  Rattus   OMIM: XXX
6  abc1   Co-occurs  Diabetes  Mus      Paper UID:45643
7  abc1   Co-occurs  Diabetes  Homo     Paper UID:2143
8  abc1   Co-occurs  Diabetes  Mus      Paper UID:1204
SESL Mock Up Slide 3
Export to Network View | Return
#  Gene   R’ship     Disease   Species  Supporting Evidence
1  abc1   Co-occurs  Diabetes  Mus      3
2  abc1   Co-occurs  Diabetes  Homo     1
3  abc1   Up-Reg     Diabetes  Homo     1
4  abc7   Mutation   Diabetes  Rattus   1
5  abc13  Co-occurs  Diabetes  Mus      1
6  abc2   Co-occurs  Diabetes  Rattus   1
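The "Pivot on Assertion" step between Mock Up Slides 2 and 3 appears to group evidence rows by assertion and count the supporting items. The following is a minimal sketch of that idea, not SESL's implementation: the function name `pivot_on_assertion` and the tuple layout are assumptions, and the rows are transcribed from Slide 2. Note that Slide 2 lists abc2 under Homo while Slide 3 lists it under Rattus; the sketch follows Slide 2.

```python
from collections import Counter

# Rows from SESL Mock Up Slide 2:
# (gene, relationship, disease, species, evidence)
assertions = [
    ("abc1", "Co-occurs", "Diabetes", "Mus", "Paper UID:1234"),
    ("abc1", "Up-Reg", "Diabetes", "Homo", "ArrayExpress: XXX"),
    ("abc2", "Co-occurs", "Diabetes", "Homo", "Paper UID:1344"),
    ("abc13", "Co-occurs", "Diabetes", "Mus", "Paper UID:1314"),
    ("abc7", "Mutation", "Diabetes", "Rattus", "OMIM: XXX"),
    ("abc1", "Co-occurs", "Diabetes", "Mus", "Paper UID:45643"),
    ("abc1", "Co-occurs", "Diabetes", "Homo", "Paper UID:2143"),
    ("abc1", "Co-occurs", "Diabetes", "Mus", "Paper UID:1204"),
]


def pivot_on_assertion(rows):
    """Count evidence items per (gene, relationship, disease, species)."""
    counts = Counter(row[:4] for row in rows)
    # Most-supported assertions first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


summary = pivot_on_assertion(assertions)
for (gene, relationship, disease, species), n in summary:
    print(f"{gene:<7}{relationship:<11}{disease:<10}{species:<8}{n}")
```

With the Slide 2 rows, the abc1 / Co-occurs / Diabetes / Mus assertion collects three supporting items, in line with the first row of Slide 3.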
Pistoia Collaborative Working e.g. 3 parties working together
• Past - Independence
– [Diagram: X, Y, and Z shown separately]
– We have all worked separately on our environments and with partners since we had budget and people
• Emerging – Open Collaboration
– [Diagram: X, Y, and Z with more overlap]
– Agreeing the pre-competitive space allows for collaboration on Standards and Services
• As Is - Sequence Services
– [Diagram: X, Y, and Z with Sequences]
– Companies replicate much of the same functionality and internally host external content to ensure high service levels and privacy
• Vision - Sequence Services
– [Diagram: X, Y, and Z sharing a 3rd Party Service for Sequences]
– Develop Services that allow decommissioning of internal services at lower or equivalent costs. Also allows for future enhancement costs to be shared
Pistoia Domains –
focused on business workflows/supply chains
Enabling
Vocabulary
Knowledge and Information Services
Visualisation
Workflow
Application Integration
Others
Biology Data Services
Chemistry Data Services
Translational Data Services
Scope of Pistoia Efforts
Target ID
Hit ID
Lead ID
Lead Opt
Which Target?
Which Compound?
Disease Association
Bioprocess Assoc
Druggability
‘On Target’ Safety Risk
Validation Tools
Competitive Position
Variant Selection
…
DMPK Properties?
BioAssay Development
Activity-Dose studies?
‘Off Target’ Safety Risk?
Synthesis routes?
Competitive Position?
…
Phase II
Which Disease?
What Biomarkers?
CD positioning?
Safety Biomarkers?
Efficacy Biomarkers?
…
Genome/Genetic Data
Genome/Genetic Data
Sequence Data
Expression Data
Phase I
Structural Data
Pathway Data
Patent Data
Pharmacology Data
Literature Data
Phase III
Background—How it all started
• In Pistoia, Italy
• Meeting of GSK, AZ,
Pfizer and Novartis—
identified similar
challenges and
frustrations in discovery
informatics