Javed Mostafa
Jane Greenberg
Rahul Deshmukh
Lina Huang

Outline
• Clinical Trials
• SPIROMICS
• Application of Ontologies and Controlled Vocabularies
• Use Cases - Ontologies in Clinical Trials
• SPIRO-V: Role of Ontology and Controlled Vocabularies in SPIROMICS
• SPIRO-V: Where We Are Now
• Controlled Vocabulary Management - The Road Ahead
• Demo
• Questions?

Clinical Trials
• Conducted by government organizations, pharmaceutical companies, academic research centers, etc.
• Mostly to assess safety and effectiveness of a new medication or device
• Types
  • Treatments - Combination of drugs
  • Diagnostics
  • Quality of Life - For patients with chronic illness

Clinical Trials (Contd.)
• Phases
  • Phase 0 - Protocol, patient identification
  • Phase I - Small group (20-80); safety and side effects of drug/treatment
  • Phase II - Larger group (100-300)
  • Phase III - Large group (1000-3000)
  • Phase IV - Drug’s risks, benefits and optimal uses
• Duration - 6 to 8 years
• Cost for pharmaceutical companies between $100 and $800 million
• In 2005: 8000 clinical trials, $24 billion invested

SPIROMICS
• Subpopulations and intermediate outcome measures in COPD study (SPIROMICS)
• Primary Goals
  • Identify and validate markers of disease severity
  • Identify disease subpopulations
• Secondary Goals
  • Clarify the natural history of COPD
  • Develop bioinformatics infrastructure
  • Generate clinical, radiographic and genetic data that can be used for future multisite clinical trials

Application of Ontologies and Controlled Vocabularies
• Ontology, Controlled Vocabulary
• Communication
  • For people to talk the same language
• Indexing
  • To improve retrieval and analysis of data
• Functions
  • Retrieval
  • Browsing
  • Visualization

Use Cases - Ontologies in Clinical Trials
• Locating eligible patients for clinical trials - IBM, Columbia University
  • Matched patient data to SNOMED-CT ontology
  • Semantic gulf between raw data and clinician’s interpretation
• Structural representation of a disease ontology - Influenza Infectious Disease Ontology
  • Coverage of infectious disease domain
• Clinical Trial Data Management System - CancerGrid
  • Model of study or dataset
  • Forms, services, metadata registry, etc.

SPIRO-V: Vision
[Diagram labels: SPIRO-V; Clinical Trial Application; Ontology Visualization; SPIROMICS Knowledge Base; Ontology; SPIROMICS Controlled Vocabularies; Patient Identification; Mapping; Specimen Tracking; Clinical Trial DBMS; Controlled Vocabularies Editing/Management]

Where We Are Now
• Controlled Vocabularies Harvesting
• Goal - A COPD vocabulary set to accurately describe all the SPIROMICS cohorts, phenotypes, and outcome measures
• Two Approaches
  • Manual
  • Automatic

Manual Approach
• Manual approach to collect vocabularies from authoritative sources on COPD
• Domain experts conduct quality control
• Downside: low-efficiency communication; coordination takes time
• Consolidated Excel spreadsheet

Automatic Approach
• A relational database back end to store the terms, definitions and associations
• Incorporation of the VCGS automatic metadata generation system for rapid harvesting
• Provide human review functionality to control the quality of the terms and associations generated
• Manage the controlled vocabularies development process

VCGS

Manage Controlled Vocabulary
• Visualize the vocabulary set and make it browsable, searchable, and editable
• Vocabulary gathering workflow: Suggest candidates -> review -> release/reject
• Collaborative initiatives: co-authoring and discussion

Demo
• Database
  • Physical database in MySQL
• VCGS
• TemaTres
• SPIRO-V

Questions?

References
• http://clinicaltrials.gov/ct2/home
• http://www.cscc.unc.edu/spir/
• http://iswc2007.semanticweb.org/papers/809.pdf
• http://influenzaontologywiki.igs.umaryland.edu/wiki/index.php/Main_Page
• http://www.cancergrid.org/

Collaborative thesaurus editing
• Level of user privileges
  • Common users can suggest a candidate term, an association or a definition and provide feedback
  • Authorized users can reject/accept the suggestions
  • Document changes and comments from different users
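
Sketch: suggest -> review -> release/reject
The workflow and user privileges described in the slides could be modeled roughly as below. This is only an illustrative sketch, not the SPIRO-V implementation. The table names, column names, role labels ("common", "authorized") and function names are all assumptions made for the example, and Python's built-in sqlite3 module stands in for the MySQL database mentioned in the demo so that the snippet runs on its own.

```python
import sqlite3

# Hypothetical schema: the slides only say that terms, definitions and
# associations are stored relationally; every name below is an assumption.
SCHEMA = """
CREATE TABLE users (
    name TEXT PRIMARY KEY,
    role TEXT NOT NULL CHECK (role IN ('common', 'authorized'))
);
CREATE TABLE suggestions (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL CHECK (kind IN ('term', 'definition', 'association')),
    content TEXT NOT NULL,
    suggested_by TEXT NOT NULL REFERENCES users(name),
    status TEXT NOT NULL DEFAULT 'candidate'
        CHECK (status IN ('candidate', 'released', 'rejected'))
);
CREATE TABLE change_log (
    id INTEGER PRIMARY KEY,
    suggestion_id INTEGER NOT NULL REFERENCES suggestions(id),
    user_name TEXT NOT NULL REFERENCES users(name),
    event TEXT NOT NULL,
    note TEXT
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_user(conn, name, role):
    conn.execute("INSERT INTO users (name, role) VALUES (?, ?)", (name, role))


def _log(conn, suggestion_id, user_name, event, note):
    # Every change and comment is recorded, as the slides require.
    conn.execute(
        "INSERT INTO change_log (suggestion_id, user_name, event, note) "
        "VALUES (?, ?, ?, ?)",
        (suggestion_id, user_name, event, note),
    )


def suggest(conn, user_name, kind, content, note=None):
    """Any registered user may suggest a candidate term, definition or association."""
    known = conn.execute(
        "SELECT 1 FROM users WHERE name = ?", (user_name,)
    ).fetchone()
    if known is None:
        raise PermissionError(f"unknown user: {user_name}")
    cur = conn.execute(
        "INSERT INTO suggestions (kind, content, suggested_by) VALUES (?, ?, ?)",
        (kind, content, user_name),
    )
    _log(conn, cur.lastrowid, user_name, "suggested", note)
    return cur.lastrowid


def review(conn, user_name, suggestion_id, accept, note=None):
    """Only authorized users may release or reject a pending candidate."""
    row = conn.execute(
        "SELECT role FROM users WHERE name = ?", (user_name,)
    ).fetchone()
    if row is None or row[0] != "authorized":
        raise PermissionError(f"{user_name} may not review suggestions")
    current = conn.execute(
        "SELECT status FROM suggestions WHERE id = ?", (suggestion_id,)
    ).fetchone()
    if current is None or current[0] != "candidate":
        raise ValueError("suggestion is not awaiting review")
    new_status = "released" if accept else "rejected"
    conn.execute(
        "UPDATE suggestions SET status = ? WHERE id = ?",
        (new_status, suggestion_id),
    )
    _log(conn, suggestion_id, user_name, new_status, note)
    return new_status


if __name__ == "__main__":
    db = open_db()
    add_user(db, "alice", "common")
    add_user(db, "bob", "authorized")
    sid = suggest(db, "alice", "term", "emphysema", note="example candidate")
    print(review(db, "bob", sid, accept=True, note="looks fine"))
```

Keeping the status on the suggestion row and the history in a separate log table is one possible design; it lets the current vocabulary be queried simply while still preserving who changed what and why.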

Ontology, Thesaurus, Controlled Vocabularies
[Diagram labels: Ontology; Thesaurus; Controlled Vocabularies; marker "Where We Are"]