Research Infrastructures to Boost R&D in the Field of Rare Diseases
Ségolène Aymé, INSERM, Paris, France
Fundacion Ramon Areces, 29 Oct 2014

International Rare Disease Research Consortium (IRDiRC)
Cooperation at international level to stimulate, better coordinate and maximize the output of rare disease research efforts around the world

[Diagram: THE CHALLENGE. Labels shown: RARE DISEASE SECTOR; DIAGNOSIS; VOICE OF DATA (EVIDENCE); public healthcare and research system; private healthcare; clinical and academic; clinical expertise/experts; clinical and disability services; industry and manufacturers; multiple government departments; policy; position statements; education; training; interpretation and application; natural history; genomics; transcriptomics; proteomics; metabolomics; phenomics; technology (devices, instruments, bioinformatics, systems).]

Rare Diseases Peculiarities
DISADVANTAGE
• No or little evidence available
• Small populations, scattered
• Coding and classification poor
• No jurisdiction or country with sufficient data
• Require collective data and case finding for evidence
• Not all rare diseases are the same in terms of evidence: e.g.
Cystic Fibrosis ≠ Progeria
• Orphan therapies fail the cost-effectiveness threshold
ADVANTAGE
• Clarity in the extreme
• Phenotype:
  – genotype atomise disease;
  – permit re-aggregation based on pathways perturbed, not clinical presentation
• New knowledge translation and the portal to individualised medicine

[Figure: Prevalence distribution of rare diseases. X-axis: estimated prevalence (per 100 000), 0 to 50. Y-axis: number of diseases, 0 to 200. Diseases labelled on the chart include Cystic fibrosis, Sickle cell anemia, Huntington disease, Hemophilia A, Marfan syndrome, Duchenne muscular dystrophy and Fragile X syndrome, among others. Chart annotations: "70%", "75%" and "80% of people living with a rare disease".]

CURRENT STATUS OF RESEARCH IN THE FIELD OF RARE DISEASES BASED ON ORPHANET DATA

European rare diseases research landscape (36 countries)
5707 ongoing research projects in Orphanet covering 2129 diseases, excluding clinical trials (February 2014)
Number of projects by category:
• Gene search: 513
• Mutations search: 595
• Gene expression profile: 281
• Genotype-phenotype correlation: 393
• In vitro functional study: 1048
• Animal model creation / study: 509
• Human physiopathology study: 748
• Pre-clinical gene therapy: 179
• Pre-clinical cell therapy: 90
• Pre-clinical vaccine development: 31
• Observational clinical study: 452
• Epidemiological study: 224
• Diagnostic tool / protocol development: 295
• Biomarker development: 158
• Medical device / instrumentation development: 25
• Health sociology study: 79
• Health economics study: 15
• Public health / health services study: 72

International rare diseases clinical trial landscape
2476 ongoing national or international clinical trials for 629 diseases in 29 countries (April 2014)
[Pie chart: percentage of clinical trials by category. Categories: cell therapy, drug, gene therapy, medical device, protocol, vaccine, other. Shares shown: 78%, 16%, 2%, 1%, 1%, 1%, 1%.]

Possibility to diagnose rare diseases: over 2 362 genes tested to date
[Charts: number of genes tested in each country in Europe by year (2010 to 2013); number of genes tested by country; number of rare diseases tested by country (April 2014).]

Medicinal products on the European market in 2013 (January 2014)
• 68 orphan medicinal products
• 92 medicinal products without orphan designation with at least one indication for a rare disease or a group of rare diseases

Satisfaction for professionals, frustration for patients, anxiety for payors
• Slow translation from bench to bedside
• Limited access to innovations
• Too few treatments compared to needs
• Most patients feeling abandoned
• High cost of diagnostic tests and drugs: not affordable
• Necessity to de-risk research: cheaper R&D

How to speed up research and de-risk it?
• Improve coordination and synergies of research at world level, to increase the research volume and the quantity of data
• Support in-silico research to make optimal use of available data
• Find new business models for R&D, to reduce costs and provide affordable treatments

TO BOOST COORDINATION AT WORLD LEVEL

IRDiRC policy and guidelines

Principles applying to research activities
Sharing and collaborative work in RD research
• Sharing of data and resources
• Rapid release of data
• Interoperability and harmonization of data
• Data in open access databases
Scientific standards, requirements and regulations in RD research
• Projects should adhere to IRDiRC standards
• Develop ontologies, biomarkers and patient-centered outcome data
• Cite use of databases and biobanks in publications

Participation by patients and/or their representatives in research
• Act in the best interest of patients
• Involve patients in all aspects of research
• Involve patients in the design and governance of registries
• Involve patients in the design, conduct and analysis of clinical trials
• Acknowledge patients' contributions in articles

Principles applying to funding bodies
• Promote the discovery of genes
• Promote the development of therapies
• Fund pre-clinical studies for proof of concept
• Promote harmonization, interoperability, sharing and open access data
• Promote coordination between human and animal models
• Promote active exchanges between stakeholders through dissemination of information on ongoing projects and events

Endorsement of standards and tools contributing to IRDiRC objectives
• Ontologies: HPO, ORDO…
• Standards: BRIF…
• Data sharing: PhenomeCentral, DECIPHER…
• Outcome measures: NINDS, PROMIS…

IRDiRC Recommended
• Label used to highlight tools, standards and guidelines that contribute directly to IRDiRC objectives
• Application for the ‘IRDiRC Recommended’ label is open to all,
including non-IRDiRC members
• ‘IRDiRC Recommended’ may be awarded to similar tools, standards and guidelines
• Submission of a 1-2 page application
• Evaluation of the application by a review panel
• Approval/rejection of the application by the Executive Committee

INITIATIVES TO SPEED UP DATA SHARING

Rationale
• Research produces an enormous amount of data
• If shared, these data will facilitate the development of diagnostics and treatments while ensuring efficient use of scarce resources
• Resources include patient and family material (extracted DNA, cell lines, pathological samples), technical protocols, informatics infrastructure, and analysis tools
• Datasets include phenotypes, genomic variants, other ‘omic’ data, natural histories, and clinical trial data…

Barriers to Data Sharing
Technical and financial issues
• Storing terabytes… Securing data
• Providing the logistics for sharing data
• Statistical and algorithmic issues to combine datasets
Ethical and legal issues
• Data across public and private networks
• Privacy protection at national level
Cultural issues
• Reluctance to share data from researchers / institutions / regulatory bodies

A ClearingHouse of Data Standards is in development at IRDiRC
Five main fields of application
• Standards in genomics and other omics
• Standards in phenotyping
• Standards in outcome measures for clinical trials
• Standards in human data registration
• Open-access data repositories to store data
Alignment with other efforts to ensure interconnection and shareability between data
• RD-Connect
• PCORI, Comete
• ELIXIR, BD2K, Data FAIRport

Open Access Data Repositories
PhenoTips and PhenomeCentral
• Repository of data
• Hub for data sharing
• CareforRare, RDConnect, NIH undiagnosed patients
ClinVar and ICCG (NCBI resource)
• Public archive of variants and assertions about significance
Decipher (Sanger Institute, Wellcome Trust)
• Database of chromosome imbalances and phenotypes
• Using Ensembl resources

INITIATIVES TO SPEED UP DATA MINING

Rationale
Make the most of
remarkable advances in the molecular basis of human diseases
• dissect the physiological pathways
• improve diagnosis
• develop treatments
Make rare diseases visible in health information systems
• to gain insight into them
• to access real-life data already collected

What is the problem?
Computers are not smart enough…
The following descriptions mean the same thing to you:
• generalized amyotrophy
• generalized muscle atrophy
• muscular atrophy, generalized
But your computer thinks they're completely unrelated

Phenomes: a continuum
Group of phenomes
• Top of classification = system disorder
• Group
« Disorder » level
• Clinical criterion
• Disease, syndrome, condition, anomaly…
Subtypes
• Etiological
• Clinical
• Histopathological…

• Disease
• Malformation syndrome
• Morphological anomaly
• Biological anomaly
• Clinical syndrome
• Particular clinical situation
• No type: waiting to have a type attributed

Orphan Diseasome
An Orphan Diseasome permits investigators to explore orphan disease (OD) or rare disease relationships based on shared genes and shared enriched features (e.g., Gene Ontology Biological Process, Cellular Component, Pathways, Mammalian Phenotype). The red nodes represent the orphan diseases and the green ones the related genes. A disease is connected to a gene if and only if a mutation responsible for the disease has been identified in this gene.
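The disease-gene network described above can be sketched as a small bipartite graph. This is only an illustration under stated assumptions: the four disease-gene pairs are a hand-picked toy subset rather than an Orphanet or Orphan Diseasome export, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Toy disease -> causative gene edges (illustrative subset, not a database export).
DISEASE_GENES = {
    "Duchenne muscular dystrophy": {"DMD"},
    "Becker muscular dystrophy": {"DMD"},
    "Cystic fibrosis": {"CFTR"},
    "Marfan syndrome": {"FBN1"},
}


def genes_to_diseases(disease_genes):
    """Invert the bipartite graph: gene -> set of diseases connected to it."""
    index = defaultdict(set)
    for disease, genes in disease_genes.items():
        for gene in genes:
            index[gene].add(disease)
    return index


def diseases_sharing_genes(disease_genes):
    """Return {(disease_a, disease_b): shared genes} for every pair with a common gene."""
    shared = defaultdict(set)
    for gene, diseases in genes_to_diseases(disease_genes).items():
        for a, b in combinations(sorted(diseases), 2):
            shared[(a, b)].add(gene)
    return dict(shared)


pairs = diseases_sharing_genes(DISEASE_GENES)
for (a, b), genes in pairs.items():
    print(f"{a} <-> {b}: {sorted(genes)}")
```

Inverting the graph first keeps the pairing step proportional to the number of diseases per gene, which matters once the edge list grows to thousands of diseases.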
http://research.cchmc.org/od/01/index.html

• UMLS = Unified Medical Language System
• ICD = International Classification of Diseases: in use since the 19th century and maintained by WHO since 1948; used by most countries to code medical activity and mortality data
• MeSH = Medical Subject Headings: controlled vocabulary thesaurus used for indexing articles for PubMed, by the National Library of Medicine (USA)
• SNOMED CT = Systematized Nomenclature of Medicine--Clinical Terms: clinical terminology by the International Health Terminology Standards Development Organisation (IHTSDO) in Denmark; used in the USA and a few other countries
• MedDRA = Medical Dictionary for Regulatory Activities: medical terminology to classify adverse event information associated with the use of medical products, by the International Federation of Pharmaceutical Manufacturers and Associations (IFPMA)

Different resources, different terminologies
• (e)HR: SNOMED CT; others? Free text
• Mutation/patient registries, databases: HPO; LDDB; PhenoDB; Elements of Morphology; others? Free text?
• Tools for diagnosis: HPO; LDDB; Orphanet

Each terminology has a purpose-driven approach
Indexing health status of individual patients for health management (SNOMED CT)
• Detailed, focus on manifestations and complaints
• Adapted to clinical habits
• Analytical approach
Indexing health status of individual patients for statistical purposes in public health (ICD)
• More aggregated, interpreted phenotypic features
• Aggregated concepts
• Unambiguous to avoid blanks

Purpose-driven approach (2)
Indexing health status of individual patients for clinical research purposes (HPO / PhenoDB / Elements of Morphology)
• Highly detailed to fit the research questions
• Specific terminologies developed for disease-specific patient registries
Indexing health status of individual patients for retrieving possible diagnoses (LDDB, POSSUM, Orphanet)
• Aggregated concepts
• Requires clinicians' judgement about which phenomic expressions are relevant
• Unambiguous to avoid blanks

HOW TO MAKE ALL THESE TERMINOLOGIES INTEROPERABLE?
Convince the terminologies to converge in some way…
• Sept 2012: start of mappings (Orphanet)
• EUGT2 – EUCERD workshop (Paris, September 2012)
[Diagram: terminologies and resources involved: LDDB, Elements of Morphology, POSSUM, SNOMED CT (IHTSDO), ICD (WHO), PhenoDB, Orphanet, HPO, DECIPHER.]

Phenotype terminology project
Aims:
• Map commonly used clinical terminologies (Orphanet, LDDB, HPO, Elements of Morphology, PhenoDB, UMLS, SNOMED CT, MeSH, MedDRA):
  – automatic mapping, expert validation, detection and correction of inconsistencies
• Find common terms in the terminologies
• Produce a core terminology
  – Common denominator allowing phenotypic data to be shared and exchanged between databases

Mapping
Terminologies
• Orphanet: 1357 terms (Orphanet database, version 2008)
• LDDB: 1348 dysmorphological terms (installation CD)
• Elements of Morphology: 423 terms (retrieved manually from the AJMG publication, January 2009)
• HPO: 9895 terms (downloaded from BioPortal, obo format, 30/08/12)
• PhenoDB: 2846 terms (given in obo format, 02/05/2012)
• UMLS: version 2012AA (integrating MeSH, MedDRA, SNOMED CT)
Tools
• OnaGUI (INSERM U729): ontology alignment tool
  – Works with files in owl format
  – I-Sub algorithm: detects syntactic similarity
  – Graphical interface to check automatic mappings and add mappings manually
• Metamap (National Library of Medicine): a tool to map biomedical text to the UMLS Metathesaurus
• Perl scripts: format conversion, launching Metamap, comparison of results…

Comparison of mappings and deduction
• Perl script to compare all the mappings and infer mappings of non-Orphanet terminologies
• E.g.: Orphanet ID XX mapped to YY in HPO and ZZ in LDDB -> deduction: YY and ZZ should probably map
• Retrieve HPO mappings versus UMLS, MeSH
[Table: first figures, mappings of Orphanet and LDDB against El. Morpho, PhenoDB, HPO, UMLS… Values as labelled on the slide, in reading order: E: 416; E: 978; E: 2228; E: 6948; D: 275; D: 533; D: 1123; D: 2678; D: 177; D: 716; D: 409; D: 1045; D: 3268; E: 1062; D: 6307+4800.]

Mapping of non-Orphanet terminologies
• Automatic and inferred mappings were checked by experts
• Using OnaGUI for all, except UMLS
• Automatic: I-Sub 7.0 + deduction; Metamap + deduction + HPO mappings
[Table: figures for LDDB against El. Morpho, PhenoDB, HPO, UMLS… Values as labelled on the slide, in reading order: D: 257 + 23 added; D: 528, 92%E; A: 674, 38%E; D: 1105, 87%E; A: 2084, 23%E; D: 2654, 83%E; A: 11731; D: 174, 50%E; A: 189, 74%E; D: 393, 93%E; A: 436, 16%E; D: 405, 84%E; A: 1248; D: 1018, 91%E; A: 4168, 6%E; D: 3222, 82%E; A: 18776; D: 7389; A: 65535.]

First list of common terms
• Present in at least 2 terminologies
• Definition of rules for nomenclature
• Addition of terms present in each terminology as synonyms

Workshop on 21-22 October 2013 in Boston
• Success! Reviewed 2736 terms appearing 2 or more times in the 6 terminologies in 17 hours
• 2302 terms chosen, including preferred term
• Definitions are from Elements of Morphology if available, and from HPO/Stedman’s Medical Dictionary if not
• List of terms, with mapping to HPO, PhenoDB and Elements of Morphology, will be available at http://ichpt.org by January 2015.
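The deduction step of the mapping project (Orphanet ID XX mapped to YY in HPO and ZZ in LDDB, so YY and ZZ should probably map) can be sketched as follows. This is a minimal Python illustration, not the project's Perl scripts: every identifier below is a placeholder rather than a real code, the function names are hypothetical, and the output is only a list of candidates that would still need the expert check the slides describe.

```python
from itertools import combinations

# Hypothetical pivot mappings: Orphanet term -> {terminology: mapped term}.
# All identifiers are placeholders invented for this sketch.
ORPHANET_MAPPINGS = {
    "ORPHA_TERM_1": {"HPO": "HPO_A", "LDDB": "LDDB_X"},
    "ORPHA_TERM_2": {"HPO": "HPO_B", "PhenoDB": "PDB_Q"},
    "ORPHA_TERM_3": {"LDDB": "LDDB_Y"},
    "ORPHA_TERM_4": {},
}


def deduce_mappings(pivot_mappings):
    """Propose mappings between non-pivot terminologies that share a pivot term.

    If one pivot term maps to YY in terminology A and to ZZ in terminology B,
    the pair ((A, YY), (B, ZZ)) is returned as a candidate for expert review.
    """
    candidates = set()
    for targets in pivot_mappings.values():
        for left, right in combinations(sorted(targets.items()), 2):
            candidates.add((left, right))
    return candidates


def terms_in_at_least(pivot_mappings, minimum=2):
    """Pivot terms present in at least `minimum` terminologies, counting the pivot itself."""
    return sorted(
        pivot
        for pivot, targets in pivot_mappings.items()
        if 1 + len(targets) >= minimum
    )


deduced = deduce_mappings(ORPHANET_MAPPINGS)
common = terms_in_at_least(ORPHANET_MAPPINGS)
print(sorted(deduced))
print(common)
```

Using a single pivot terminology keeps the number of direct alignments linear in the number of terminologies; the trade-off is that any error in a pivot mapping propagates to every deduced pair, which is why the deduced candidates are reviewed rather than accepted automatically.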
All tools will map to this terminology to allow interoperability among resources

Adoption of a core set of >2,300 terms common to all terminologies
Workshop on Terminologies for RD – Paris, 12 September 2012
• Many terminologies in use to describe phenomes; no interoperability
• Joint EuroGenTest and EUCERD workshop, organized by Ségolène Aymé
• Agreement to define a core set of terms common to all terminologies, and a methodology
• Core set identified by cross-referencing HPO, PhenoDB, Orphanet, UMLS (MeSH, MedDRA, SNOMED CT), LDDB and Elements of Morphology
Workshop of validation, Boston, 21-22 October 2013
• Workshop supported by HVP and EuroGenTest, organized by Ada Hamosh
• Expert review of the initial proposal
• Selection of 2,370 terms
• Decision to propose them for adoption by all terminologies
• Establishment of the International Consortium for Human Phenotype Terminologies (ICHPT)
• Publication on the IRDiRC website with definitions from HPO and Elements of Morphology

COMPUTERS ARE NOT SMART
FROM A TERMINOLOGY TO AN ONTOLOGY

Why are ontologies needed?
• Ontologies are representations of knowledge in a form that is directly understandable by computers
• Ontologies allow reasoning
• Ontologies define the objects AND the relationships between the objects
  – Duchenne muscular dystrophy (disease) is a neuromuscular disease (group of diseases)
  – Schistosomiasis (disease) is a cause of anemia (manifestation)

Standardization of Phenotype Ontologies Workshop
Sympathy, 19 Apr 2013, Dublin
• Organized by IRDiRC, supported by the University of Dublin, Forge and EuroGenTest
• Conclusion: adopt HPO and ORDO, and cross-reference with OMIM

Standardisation of Phenotype Ontologies
Rare Diseases: bioportal.bioontology.org/ontologies/ORDO
• Based on the Orphanet multi-hierarchical classification of RD
• Gene–disease relationships
• Cross-references:
  – for RD nomenclature: OMIM, SNOMED CT, ICD10, MeSH, MedDRA, UMLS
  – for genes: OMIM, HGNC, UniProtKB, IUPHAR, Ensembl, Reactome
Phenotypic Features: bioportal.bioontology.org/ontologies/HP
ICHPT (International Consortium for Human Phenotype Terminologies)
• 2,307 terms: core terminology
• Mapped to: HPO, Elements of Morphology, Orphanet, LDDB, SNOMED CT, PhenoDB (OMIM), MeSH, UMLS
• Available soon for download at ichpt.org

Please adopt/disseminate HPO and ORDO to speed up R&D to the benefit of the patients

COMPUTERS ARE VERY SMART
THEY CAN HELP REPURPOSE DRUGS

Rationale: make optimal use of molecules already known
Drug repositioning or repurposing is a strategy used to generate new or additional value for a drug by targeting diseases other than those for which it was originally intended
• Address unmet medical needs
• Reduce time to market thanks to the information already available: unbiased clinical safety and efficacy data
• Add value to the existing portfolio
• Increase the drug pipeline
• Decrease R&D failure risks
• Decrease development costs
• Create new revenue potential

Graph Theory Enables Drug Repurposing
Gramatica et al., PLoS ONE 9(1): e84912, 2014
• 23 million articles from PubMed
• Possible to link the gathered information on drugs,
physiological pathways and resulting biological activities with the pathophysiological signs and symptoms of diseases
• Possible to rank the matches in order to identify the most promising leads

Conclusion
Open access to data
• We now live in an open world
• It is an opportunity in research
• Evidence that open access to data is beneficial, especially for the data producer!
• Orphadata is accessed by 3000 researchers per month
Agreed standards to make data interoperable
• Responsibility of institutions and of individual researchers

Thank you for your invitation