HPO - El Corte Inglés

Research Infrastructures
to boost R&D
in the field of rare Diseases
Ségolène Aymé
INSERM, Paris, France
Fundacion Ramon Areces
29 Oct 2014
International Rare Disease
Research Consortium (IRDiRC)
Cooperation at international level
to stimulate, better coordinate and
maximize output of rare disease
research efforts around the world
THE CHALLENGE
[Figure: the rare disease sector. Labels on the diagram include: diagnosis; policy; voice of data (evidence); public healthcare and research system; private healthcare; clinical and disability services; clinical expertise/experts; clinical & academic; position statements; industry & manufacturers; technology (devices, instruments, bioinformatics, systems); multiple government departments; education; training; interpretation and application; natural history; genomics; transcriptomics; proteomics; metabolomics; phenomics.]
Rare Diseases Peculiarities
DISADVANTAGE
• No or little evidence available
• Small, scattered populations
• Poor coding and classification
• No jurisdiction or country with sufficient data
• Collective data and case finding are required for evidence
• Not all rare diseases are the same in terms of evidence: e.g. Cystic Fibrosis ≠ Progeria
• Orphan therapies fail the cost-effectiveness threshold
ADVANTAGE
• Clarity in the extreme
• Phenotype:
– genotype atomise disease;
– permit re-aggregation based on pathways perturbed, not clinical presentation
• New knowledge translation and the portal to individualised medicine
[Figure: Prevalence distribution of rare diseases. x-axis: estimated prevalence (per 100 000), 0 to 50; y-axis: number of diseases, 0 to 200. Chart annotations: 70%, 75% and 80% of people living with a rare disease. Diseases labelled on the chart: Motor Neurone Disease, Retinoblastoma, Angelman Syndrome, Niemann-Pick disease, Facioscapulohumeral dystrophy, Nemaline myopathy, Rett syndrome, Mucopolysaccharidosis 1-3, Congenital myopathy, Friedreich ataxia, Alport syndrome, Neurofibromatosis type 1, Charcot-Marie-Tooth disease, Familial long QT syndrome, Fetal cytomegalovirus syndrome, Partial chromosome Y deletion, Diffuse large B-cell lymphoma, Duchenne muscular dystrophy, Fragile X syndrome, Hereditary spastic paraplegia, Marfan syndrome, Myasthenia gravis, Tuberculosis, Turner Syndrome, Malaria, Mesothelioma, Hereditary breast & ovarian cancer syndrome, Systemic sclerosis, Phenylketonuria, Familial adenomatous polyposis, Noonan, Isolated Spina Bifida, Cutaneous lupus erythematosus, Huntington disease, Hemophilia A, Young adult-onset Parkinsonism, Sickle cell anemia, Williams syndrome, Cystic fibrosis.]
CURRENT STATUS OF RESEARCH
IN THE FIELD OF RARE DISEASES
BASED ON ORPHANET DATA
European rare diseases research landscape (36 countries)
5707 ongoing research projects in Orphanet
covering 2129 diseases, excluding clinical trials
Number of projects by category:
Gene search: 513
Mutations search: 595
Gene expression profile: 281
Genotype-phenotype correlation: 393
In vitro functional study: 1048
Animal model creation / study: 509
Human physiopathology study: 748
Pre-clinical gene therapy: 179
Pre-clinical cell therapy: 90
Pre-clinical vaccine development: 31
Observational clinical study: 452
Epidemiological study: 224
Diagnostic tool / protocol development: 295
Biomarker development: 158
Medical device / instrumentation development: 25
Health sociology study: 79
Health economics study: 15
Public health / health services study: 72
(February 2014)
International rare diseases clinical trial landscape
2476 ongoing national or international clinical trials
for 629 diseases in 29 countries
[Figure: pie chart, percentage of clinical trials by category. Categories: cell therapy, drug, gene therapy, medical device, protocol, vaccine and other trials. Shares shown on the chart: 78%, 16%, 2%, 1%, 1%, 1%, 1%.]
(April 2014)
Number of genes tested in each country in Europe by year
[Figure: panels for 2010, 2011, 2012 and 2013]
Possibility to diagnose Rare Diseases:
over 2 362 genes tested to date
Number of genes tested by country
Number of rare diseases tested by country
(April 2014)
Medicinal products on the European market in 2013
68 orphan medicinal products
92 medicinal products without orphan
designation with at least an indication for a
rare disease or a group of rare diseases
(January 2014)
Satisfaction for professionals
Frustration for patients
Anxiety for payors

Slow translation from bench to bedside
Limited access to innovations

Too few treatments compared to needs
Most patients feeling abandoned

High cost of diagnostic tests and drugs
Not affordable

Necessity to de-risk research
Cheaper R&D
How to speed up research and de-risk it?

Improve coordination and synergies of research at
world level
To increase the research volume and the quantity of data

Support in-silico research
To make optimal use of available data

Find new business models for R&D
To reduce the cost and provide affordable treatments
TO BOOST COORDINATION AT
WORLD LEVEL
IRDiRC policy and guidelines
Principles applying to Research activities
Sharing and collaborative work in RD research
• Sharing of data and resources
• Rapid release of data
• Interoperability and harmonization of data
• Data in open access databases
Scientific standards, requirements and regulations in RD research
• Projects should adhere to IRDiRC standards
• Develop ontologies, biomarkers and patient-centered outcome data
• Cite use of databases and biobanks in publications
IRDiRC policy and guidelines
Participation by patients and/or their representatives in research
• Act in the best interest of patients
• Involve patients in all aspects of research
• Involve patients in design and governance of registries
• Involve patients in the design, conduct and analysis of clinical trials
• Acknowledge patients' contribution in articles
IRDiRC policy and guidelines
Principles applying to Funding Bodies
• Promote the discovery of genes
• Promote the development of therapies
• Fund pre-clinical studies for proof of concept
• Promote harmonization, interoperability, sharing, open access data
• Promote coordination between human and animal models
• Promote active exchanges between stakeholders through information dissemination of ongoing projects and events
IRDiRC policy and guidelines
Endorsement of standards and tools

Endorsement of standards and tools contributing
to IRDiRC objectives
Ontologies: HPO, ORDO…
Standards: BRIF…
Data sharing: PhenomeCentral, DECIPHER…
Outcome measures: NINDS, PROMIS…
IRDiRC Recommended
• Label used to highlight tools, standards and guidelines that contribute directly to IRDiRC objectives
• Application for the ‘IRDiRC Recommended’ label is open to all, including non-IRDiRC members
• ‘IRDiRC Recommended’ may be awarded to similar tools, standards and guidelines
• Submission of a 1-2 page application
• Evaluation of the application by a review panel
• Approval/rejection of the application by the Executive Committee
INITIATIVES TO SPEED UP
DATA SHARING
Rationale
• Research produces an enormous amount of data
• If shared, these data will facilitate the development of diagnostics and treatments while ensuring efficient utilization of scarce resources
• Resources include patient and family material (extracted DNA, cell lines, pathological samples), technical protocols, informatics infrastructure, and analysis tools
• Datasets include phenotypes, genomic variants, other ‘omic’ data, natural histories, and clinical trial data…
Barriers to Data Sharing

Technical and Financial issues
Storing terabytes…Securing data
Providing the logistics for sharing data
Statistical and algorithmic issues to combine datasets

Ethical and Legal issues
Data across public and private networks
Privacy protection at national level

Cultural issues
Reluctance of researchers, institutions and regulatory bodies to share data
A ClearingHouse of Data Standards
is in development at IRDiRC

Five main fields of application
• Standards in Genomics and other OMICS
• Standards in Phenotyping
• Standards in Outcome Measures for clinical trials
• Standards in Human Data Registration
• Open-access Data Repositories to store data

Alignment with other efforts to ensure interconnection and
shareability between data
• RD-Connect
• PCORI, Comete
• ELIXIR, BD2K, Data FAIRport
Open Access Data Repositories

PhenoTips and PhenomeCentral
• Repository of data
• Hub for data sharing
• CareforRare, RD-Connect, NIH undiagnosed patients

ClinVar and ICCG
• Public archive of variants and assertions about significance
• NCBI resource

DECIPHER
• Database of Chromosome imbalances and phenotypes Using Ensembl resources
• Sanger Institute
• Wellcome Trust
INITIATIVES TO SPEED UP
DATA MINING
Rationale
Make the most of remarkable advances in the
molecular basis of human diseases
dissect the physiological pathways
improve diagnosis
develop treatments
Make rare diseases visible in health information
systems
to gain insight into them
to access real life data already collected
What is the problem? Computers are
not smart enough…

The following descriptions mean the same thing to
you:
generalized amyotrophy
generalized muscle atrophy
muscular atrophy, generalized

But your computer thinks they're completely
unrelated
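
The point above can be made concrete with a minimal Python sketch. It assumes a hand-built synonym table and a made-up concept identifier (`CONCEPT:0001` is a placeholder, not a real HPO or UMLS code); it only illustrates why a shared terminology is needed before a computer can treat the three labels as one concept.

```python
# Hypothetical synonym table: every label points to one shared concept.
# "CONCEPT:0001" is a placeholder identifier, not a real terminology code.
SYNONYMS = {
    "generalized amyotrophy": "CONCEPT:0001",
    "generalized muscle atrophy": "CONCEPT:0001",
    "muscular atrophy, generalized": "CONCEPT:0001",
}


def normalize(label: str) -> str:
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(label.lower().split())


def concept_for(label: str):
    """Return the shared concept identifier for a label, or None if unknown."""
    return SYNONYMS.get(normalize(label))


def same_concept(a: str, b: str) -> bool:
    """True only when both labels are known and map to the same concept."""
    concept_a, concept_b = concept_for(a), concept_for(b)
    return concept_a is not None and concept_a == concept_b
```

Plain string comparison treats the three descriptions as different, while `same_concept` recognises them once the synonym table exists; building and agreeing on that table is the hard part.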
Phenomes: a continuum
Group of phenomes
• Top of classification = System disorder
• Group
« Disorder » level
• Clinical criterion
• Disease, syndrome, condition, anomaly…
Subtypes
• Etiological
• Clinical
• Histopathological…
• Disease
• Malformation syndrome
• Morphological anomaly
• Biological anomaly
• Clinical syndrome
• Particular clinical situation
• No type: waiting to have a type attributed
Orphan Diseasome
An Orphan Diseasome permits investigators to
explore orphan disease (OD) or rare disease
relationships based on shared genes and shared
enriched features (e.g., Gene Ontology Biological
Process, Cellular Component, Pathways,
Mammalian Phenotype).
The red nodes represent the orphan diseases and the green ones the related genes. A disease is connected to a gene if and only if a mutation responsible for the disease has been identified in this gene.
http://research.cchmc.org/od/01/index.html
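
The disease-gene network described above can be sketched as a bipartite graph. The Python below is an illustration only: the disease and gene names are placeholders, not Orphan Diseasome data, and the function names are assumptions introduced for this example.

```python
from collections import defaultdict
from itertools import combinations

# Toy disease -> genes table (placeholder names, not real data).
DISEASE_GENES = {
    "disease_A": {"GENE1", "GENE2"},
    "disease_B": {"GENE2"},
    "disease_C": {"GENE3"},
}


def gene_to_diseases(disease_genes):
    """Invert the table: for each gene, the set of diseases linked to it."""
    index = defaultdict(set)
    for disease, genes in disease_genes.items():
        for gene in genes:
            index[gene].add(disease)
    return index


def related_diseases(disease_genes):
    """Return {(d1, d2): shared_genes} for disease pairs sharing a gene."""
    related = {}
    for d1, d2 in combinations(sorted(disease_genes), 2):
        shared = disease_genes[d1] & disease_genes[d2]
        if shared:
            related[(d1, d2)] = shared
    return related
```

In this toy table only `disease_A` and `disease_B` are related, through `GENE2`; the same idea extends to shared pathways or phenotype annotations.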
UMLS = Unified Medical Language System

ICD = International Classification of Diseases
• Dates back to 1893; maintained by WHO since 1948
• Used by most countries to code medical activity, mortality data

MeSH = Medical Subject Headings
• Controlled vocabulary thesaurus used for indexing articles for PubMed by the National Library of Medicine (USA)

SNOMED CT = Systematized Nomenclature of Medicine -- Clinical Terms
• Clinical terminology by the International Health Terminology Standards Development Organisation (IHTSDO) in Denmark
• Used in the USA and a few other countries

MedDRA = Medical Dictionary for Regulatory Activities
• Medical terminology to classify adverse event information associated with the use of medical products
• By the International Federation of Pharmaceutical Manufacturers and Associations (IFPMA)
Different resources, different terminologies
(e)HR:
SNOMED CT
Others?
Free text
Mutation/patient registries,
databases:
HPO
LDDB
PhenoDB
Elements of morphology
Others? Free text?
Tools for diagnosis:
HPO
LDDB
Orphanet
Each terminology has a purpose-driven approach

Indexing health status of individual patients for health
management (SNOMED CT)
• Detailed, focus on manifestations and complaints
• Adapted to clinical habits
• Analytical approach

Indexing health status of individual patients for
statistical purposes in public health (ICD)
• More aggregated, interpreted phenotypic features
• Aggregated concepts
• Unambiguous to avoid blanks
Purpose-driven approach (2)

Indexing health status of individual patients for clinical
research purposes (HPO / PhenoDB / Elements of morphology)
• Highly detailed to fit with the research questions
• Specific terminologies developed for disease-specific patient registries
Indexing health status of individual patients for retrieving
possible diagnoses (LDDB, POSSUM, Orphanet)
• Aggregated concepts
• Requires a judgement of clinicians about which phenomic expressions are relevant
• Unambiguous to avoid blanks
HOW TO MAKE ALL THESE
TERMINOLOGIES INTER-OPERABLE ?
Convince the terminologies to
converge in some way….


Sept 2012: start of mappings (Orphanet)
EUGT2 – EUCERD workshop (Paris, September 2012)
LDDB
Elements of Morphology
POSSUM
SNOMED CT (IHTSDO)
ICD (WHO)
PhenoDB
Orphanet
HPO
DECIPHER
Phenotype terminology project

Aims:
Map commonly used clinical terminologies (Orphanet, LDDB,
HPO, Elements of morphology, PhenoDB, UMLS, SNOMED CT,
MeSH, MedDRA):
– automatic mapping, expert validation, detection and correction of inconsistencies
Find common terms in the terminologies
Produce a core terminology
– common denominator allowing phenotypic data to be shared/exchanged between databases
Mapping Terminologies
• Orphanet: 1357 terms (Orphanet database, version 2008)
• LDDB: 1348 dysmorphological terms (Installation CD)
• Elements of Morphology: 423 terms (retrieved manually from publication AJMG, January 2009)
• HPO: 9895 terms (download bioportal, obo format, 30/08/12)
• PhenoDB: 2846 terms (given in obo format, 02/05/2012)
• UMLS: (version 2012AA) (integrating MeSH, MedDRA, SNOMED CT)
Tools

OnaGUI (INSERM U729):
ontology alignment tool
– Works with files in owl format
– I-Sub algorithm: detects syntactic similarity
– Graphical interface to check automatic mappings and manually add new ones

Metamap (National Library of
Medicine): a tool to map biomedical
text to the UMLS Metathesaurus

Perl scripts: format conversion,
launching Metamap, comparison of
results…
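
To illustrate what automatic mapping by syntactic similarity looks like, here is a minimal Python sketch. It is not the I-Sub algorithm used by OnaGUI: it substitutes the standard library's `difflib.SequenceMatcher` ratio as a stand-in similarity measure, and the threshold of 0.8 is an arbitrary assumption. Candidates produced this way would still need expert validation, as the project describes.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in for I-Sub)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_mappings(source_terms, target_terms, threshold=0.8):
    """Propose (source, target, score) pairs whose similarity passes the threshold.

    The threshold is an illustrative assumption; results are candidates for
    expert review, not validated mappings.
    """
    candidates = []
    for source in source_terms:
        for target in target_terms:
            score = similarity(source, target)
            if score >= threshold:
                candidates.append((source, target, round(score, 2)))
    # Highest-scoring candidates first.
    return sorted(candidates, key=lambda c: -c[2])
```

A purely syntactic measure finds spelling and case variants but misses true synonyms such as "amyotrophy" and "muscle atrophy", which is why the project combined it with Metamap and expert checking.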
Comparison of mappings and deduction

Perl script to compare all the mappings and infer
mappings of non-Orphanet terminologies
E.g.: Orphanet ID XX mapped to YY in HPO and ZZ in LDDB ->
deduction: YY and ZZ should probably map to each other

Retrieve HPO mappings versus UMLS, MeSH
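
The deduction step can be sketched in a few lines of Python (the original work used Perl scripts; this is an illustrative re-expression, and the data layout and function name are assumptions). Orphanet acts as the pivot: any two terms from other terminologies mapped to the same Orphanet ID become a candidate mapping.

```python
from itertools import combinations


def infer_mappings(pivot_mappings):
    """Infer candidate mappings between non-pivot terminologies.

    pivot_mappings: {orphanet_id: {terminology_name: term_id}}
    Returns a set of ((terminology, term_id), (terminology, term_id)) pairs
    that share the same Orphanet ID and therefore probably map to each other.
    """
    inferred = set()
    for targets in pivot_mappings.values():
        # Sort so each pair is emitted in a stable, reproducible order.
        for left, right in combinations(sorted(targets.items()), 2):
            inferred.add((left, right))
    return inferred


# The slide's example: Orphanet ID XX maps to YY in HPO and ZZ in LDDB.
example = {"XX": {"HPO": "YY", "LDDB": "ZZ"}}
```

Calling `infer_mappings(example)` yields the single candidate pair YY (HPO) and ZZ (LDDB). Such inferred pairs are only probable, so they go to expert review like the automatic ones.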
[Figure: first figures. Mapping counts between Orphanet/LDDB and El. Morpho, PhenoDB, HPO, UMLS…, labelled E and D. Values shown, in slide order: E: 416; E: 978; E: 2228; E: 6948; D: 275; D: 533; D: 1123; D: 2678; D: 177; D: 716; D: 409; D: 1045; D: 3268; E: 1062 (LDDB); D: 6307+4800.]
Mapping of non-Orphanet terminologies

Automatic and inferred mappings were checked by
experts
Using OnaGUI for all, except UMLS

Automatic I-Sub: 7.0 + deduction
Metamap + deduction + HPO mappings
[Figure: mapping counts between Orphanet/LDDB and El. Morpho, PhenoDB, HPO, UMLS…, labelled D and A, with the percentage marked E where given. Values shown, in slide order: D: 257 (+23 added); D: 528, 92% E; A: 674, 38% E; D: 1105, 87% E; A: 2084, 23% E; D: 2654, 83% E; A: 11731; D: 174, 50% E; A: 189, 74% E; D: 393, 93% E; A: 436, 16% E; D: 405, 84% E; A: 1248; D: 1018, 91% E; A: 4168, 6% E; D: 3222, 82% E; A: 18776; D: 7389; A: 65535.]
First list of common terms

Present in at least 2 terminologies
Definition of rules for nomenclature

Addition of terms present in each terminology as synonyms
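
As a rough illustration of the "present in at least 2 terminologies" rule, the Python sketch below counts in how many terminologies each normalised label occurs. It is a simplification: the project relied on validated mappings rather than exact label matches, and the toy term lists here are placeholders.

```python
from collections import defaultdict


def common_terms(terminologies, min_count=2):
    """Return {label: sorted terminology names} for labels found in at least
    `min_count` terminologies.

    terminologies: {terminology_name: iterable of term labels}
    Labels are compared after lower-casing and whitespace normalisation only.
    """
    seen = defaultdict(set)
    for name, labels in terminologies.items():
        for label in labels:
            seen[" ".join(label.lower().split())].add(name)
    return {
        label: sorted(names)
        for label, names in seen.items()
        if len(names) >= min_count
    }


# Placeholder term lists, for illustration only.
toy = {
    "HPO": ["Cleft palate", "Short stature"],
    "LDDB": ["cleft palate"],
    "PhenoDB": ["Hypertelorism"],
}
```

With the toy lists, only "cleft palate" qualifies, because it appears in both HPO and LDDB.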

Workshop on 21-22 October 2013 in Boston
Success!
• Reviewed 2736 terms appearing 2 or more times in the 6 terminologies in 17 hours
• 2302 terms chosen, including preferred term
• Definitions are from Elements of Morphology if available, and from HPO/Stedman’s Medical Dictionary if not
• List of terms, mapping to HPO, PhenoDB, Elements of Morphology will be available at http://ichpt.org by January 2015.
• All tools will map to this terminology to allow interoperability among resources
Adoption of a core set of >2,300 terms
common to all terminologies
Workshop on Terminologies for RD –
Paris, 12 September 2012
• Many terminologies in use to describe phenomes
• No interoperability
• Joint EuroGenTest and EUCERD workshop
• Organized by Ségolène Aymé
• Agreement to define a core set of terms common to all terminologies and a methodology
Core set identified by cross referencing
• HPO
• PhenoDB
• Orphanet
• UMLS: MeSH, MedDRA, SNOMED CT
• LDDB
• Elements of morphology
Workshop of validation, Boston
21-22 October 2013
• Workshop supported by HVP and EuroGenTest
• Organized by Ada Hamosh
• Expert review of the initial proposal
• Selection of 2,370 terms
• Decision to propose them for adoption by all terminologies
• Establishment of the International Consortium for Human Phenotype Terminologies – ICHPT
• Publication on the IRDiRC website with definitions from HPO and Elements of morphology
COMPUTERS ARE NOT SMART
FROM A TERMINOLOGY TO AN
ONTOLOGY
Why are ontologies needed?
• Ontologies are representations of knowledge in a form that is directly understandable by computers
• Ontologies allow reasoning
• Ontologies define the objects AND the relationships between the objects
Duchenne muscular dystrophy (disease) is a neuromuscular
disease (group of diseases)
Schistosomiasis (disease) is a cause of anemia (manifestation)
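
A minimal Python sketch of what "objects plus relationships" buys a computer: the two statements above are stored as typed triples, and a small function follows `is_a` links transitively. The second `is_a` triple (neuromuscular disease is a rare disease) is an assumption added only to show a two-step inference; the relation names are illustrative, not taken from ORDO or HPO.

```python
# (subject, relation, object) triples. The first and third come from the
# examples above; the second is a hypothetical addition for illustration.
TRIPLES = [
    ("Duchenne muscular dystrophy", "is_a", "neuromuscular disease"),
    ("neuromuscular disease", "is_a", "rare disease"),
    ("Schistosomiasis", "is_cause_of", "anemia"),
]


def ancestors(term, triples):
    """Return every class reachable from `term` by following is_a links."""
    parents = {}
    for subject, relation, obj in triples:
        if relation == "is_a":
            parents.setdefault(subject, set()).add(obj)
    found, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found
```

Here `ancestors("Duchenne muscular dystrophy", TRIPLES)` returns both the direct group and the group above it, a fact never stated explicitly in the data. Because the relation type is recorded, `is_cause_of` is not mistaken for a class hierarchy.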
Standardization of Phenotype Ontologies
Workshop Sympathy, 19 Apr 2013, Dublin
Organized by IRDiRC, supported by the University of Dublin, Forge and EuroGenTest
Conclusion: Adopt HPO & ORDO & cross-reference with OMIM
Rare Diseases
bioportal.bioontology.org/ontologies/ORDO
Based on Orphanet multi-hierarchical
classification of RD
Genes– diseases relationships
Cross-references:
- For RD nomenclature: OMIM, SNOMED CT, ICD-10, MeSH, MedDRA, UMLS
- For genes: OMIM, HGNC, UniProtKB, IUPHAR, Ensembl, Reactome
Phenotypic Features
bioportal.bioontology.org/ontologies/HP
ICHPT
(International Consortium for Human Phenotype
Terminologies)
2,307 terms- core terminology
Mapped to:
HPO
Elements of Morphology
Orphanet
LDDB
SNOMED CT
Pheno-DB (OMIM)
MeSH
UMLS
Available soon for download at ichpt.org
Please adopt/disseminate
HPO and ORDO
to speed up R&D
to the benefit of the patients
COMPUTERS ARE VERY SMART
THEY CAN HELP REPURPOSE DRUGS
Rationale: Make optimal use of
molecules already known

Drug Repositioning or Repurposing is a strategy used to
generate new or additional value for a drug, by targeting
diseases other than those for which it was originally intended
• Address unmet medical needs
• Reduce time to market thanks to the unbiased clinical safety and efficacy data already available
• Add value to the existing portfolio
• Increase drug pipeline
• Decrease R&D failure risks
• Decrease development costs
• Create new revenue potential
Graph Theory Enables Drug Repurposing
Gramatica et al., PLoS ONE 9(1): e84912, 2014
• 23 million articles from PubMed
• Possible to link the gathered information on drugs, physiological pathways and resulting biological activities with the pathophysiological signs & symptoms of diseases
• Possible to rank the matches in order to identify the most promising leads
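
The linking-and-ranking idea can be sketched on a toy graph. The Python below is an illustration under stated assumptions, not the method of the cited paper: the nodes and edges are invented placeholders, the graph is assumed to be acyclic, and the score is simply the number of distinct drug-to-disease paths.

```python
from collections import defaultdict

# Invented toy edges: drug -> pathway -> symptom -> disease.
EDGES = [
    ("drug_X", "pathway_1"), ("drug_X", "pathway_2"),
    ("drug_Y", "pathway_2"),
    ("pathway_1", "symptom_a"),
    ("pathway_2", "symptom_a"), ("pathway_2", "symptom_b"),
    ("symptom_a", "disease_D"), ("symptom_b", "disease_D"),
]


def build_graph(edges):
    """Build a directed adjacency list from (source, target) pairs."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)
    return graph


def count_paths(graph, start, goal):
    """Count distinct directed paths from start to goal (acyclic graph assumed)."""
    if start == goal:
        return 1
    return sum(count_paths(graph, nxt, goal) for nxt in graph.get(start, ()))


def rank_drugs(graph, drugs, disease):
    """Rank drugs by number of connecting paths, most connected first."""
    scores = {drug: count_paths(graph, drug, disease) for drug in drugs}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

On the toy graph, `drug_X` reaches `disease_D` by three paths and `drug_Y` by two, so `drug_X` ranks first. A real system would weight edges by the strength of the literature evidence rather than count paths.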
Conclusion

Open access to data
We now live in an open world
It is an opportunity in research
Evidence that open access to data is beneficial, especially
for the data producer!
Orphadata is accessed by 3000 researchers/month

Agreed standards to make data interoperable
Responsibility of institutions and of individual
researchers
Thank you for your invitation