KMeD: A Knowledge-Based Multimedia Medical Database System
Wesley W. Chu
Computer Science Department
University of California, Los Angeles
http://www.cobase.cs.ucla.edu

KMeD
• A Knowledge-Based Multimedia Medical Distributed Database System (October 1, 1991 to September 30, 1993)
• A Cooperative, Spatial, Evolutionary Medical Database System (July 1, 1993 to June 30, 1997)
• Knowledge-Based Image Retrieval with Spatial and Temporal Constructs (May 1, 1997 to April 30, 2001)
Wesley W. Chu, Computer Science Department
Alfonso F. Cardenas, Computer Science Department
Ricky K. Taira, Department of Radiological Sciences

Research Team
Students
• John David N. Dionisio
• Chih-Cheng Hsu
• David Johnson
• Christine Chih
Collaborators
• Computer Science Department: Alfonso F. Cardenas
• UCLA Medical School: Denise Aberle, MD; Robert Lufkin, MD; Ricky K. Taira, MD

An NIH Grant at UCLA (2001-2005)
A Medical Digital Library: A Digital File Room for Patient Care, Education, and Research
• Wesley W. Chu, PhD
• Hooshang Kangarloo, MD
• Usha Sinha, PhD
• David B. Johnson, PhD
• Bernard Churchill, MD

Significance
• Query multimedia data based on image content and spatial predicates
• Use domain knowledge to relax and interpret medical queries
• Present an integrated view of multiple temporal and evolutionary data in a timeline metaphor
• Retrieve scenario-specific free-text documents in a Medical Digital Library

Overview
• Image retrieval by feature and content
• Query relaxation
• Spatial query answering
• Similarity query answering
• Visual query interface
• Timeline interface
• Retrieval of scenario-specific free-text medical documents

Image Retrieval by Content
• Features: size, shape, texture, density, histology
• Spatial relations: angle of coverage, shortest distance, overlapping ratio, contact ratio, relative direction
• Evolution of object growth: fusion, fission

Characteristics of Medical Queries
• Multimedia
• Temporal
• Evolutionary
• Spatial
• Imprecise

Representing Temporal and Evolutionary Objects
• Evolution: object O evolves into a new object O'
• Fusion: objects O1, …, Om fuse into a new object O
• Fission: object O splits into objects O1, …, On

Representing Temporal and Evolutionary Objects (cont.)
• Case a: The object exists with its supertype or aggregated type.
• Case b: The life span of the object starts after and ends with its supertype or aggregated type.
• Case c: The life span of the object starts with and ends before its supertype or aggregated type.
• Case d: The life span of the object starts after and ends before its supertype or aggregated type.
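The four life-span cases (a to d) above can be told apart by comparing start and end times. The following is a minimal sketch, not the KMeD implementation: the `LifeSpan` class, the `lifespan_case` function, the use of integer years, and the rule that an object's life span must lie inside its parent's are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LifeSpan:
    """Closed interval [start, end]; the time unit (here a year) is an assumption."""
    start: int
    end: int


def lifespan_case(obj: LifeSpan, parent: LifeSpan) -> str:
    """Classify obj's life span against its supertype or aggregated type.

    Returns "a", "b", "c", or "d", following the four cases on the slide.
    """
    # Assumption: a subtype or component never exists outside its parent's life span.
    if obj.start < parent.start or obj.end > parent.end:
        raise ValueError("object life span must lie within the parent's life span")
    starts_with = obj.start == parent.start
    ends_with = obj.end == parent.end
    if starts_with and ends_with:
        return "a"  # exists together with the parent
    if ends_with:
        return "b"  # starts after, ends with
    if starts_with:
        return "c"  # starts with, ends before
    return "d"      # starts after, ends before


# Hypothetical data: a micro-lesion that appears later than the lesion it belongs to.
lesion = LifeSpan(1990, 2000)
micro_lesion = LifeSpan(1993, 2000)
print(lifespan_case(micro_lesion, lesion))
```

With the hypothetical dates shown, the micro-lesion starts after the lesion and ends with it, so the call reports case b.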
An Example of a Temporal and Evolutionary Object
[Figure: a Lesion and two MicroLesion objects]

Spatial Distance and Angle of Coverage of Two Objects

Query Modification Techniques
• Relaxation
• Generalization
• Specialization
• Association

Generalization and Specialization
• Generalization: Specific Query → Conceptual Query → More Conceptual Query
• Specialization: More Conceptual Query → Conceptual Query → Specific Query

Type Abstraction Hierarchy
• Presents an abstract view of:
  • Types
  • Attribute values
  • Image features
  • Temporal and evolutionary behavior
  • Spatial relationships among objects
• Provides multi-level knowledge representation

TAH Generation for Numerical Attribute Values
• Relaxation error: the difference between the exact value and the returned approximate value
• The expected error is weighted by the probability of occurrence of each value
• DISC (Distribution Sensitive Clustering) is based on the attribute values and the frequency distribution of the data

TAH Generation for Numerical Attribute Values (cont.)
• Computation complexity: O(n²), where n is the number of distinct values in a cluster
• DISC performs better than the Biggest Gap (value only) or Max Entropy (frequency only) methods
• MDISC is developed for multiple-attribute TAHs
• Computation complexity: O(mn²), where m is the number of attributes

Query Relaxation
[Flowchart: Query → Database → Answers? If yes, Display. If no, Relax Attribute using TAHs → Query Modification → back to Database]

A Cooperative Query Answering Example
• Query: Find the treatment used for the tumor similar-to (loc, size) X1 on 12-year-old Korean males.
• Relaxed query: Find the treatment used for the tumor Class X on preteen Asians.
• Association: The success rate, side effects, and cost of the treatment.
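The relaxation loop in the example above (no exact answer for "Korean", so the condition is generalized to "Asian") can be sketched as a walk up a type abstraction hierarchy. This is an illustrative sketch under stated assumptions, not the CoBase or KMeD code: the `TAH_PARENT` table mirrors the ethnic-group hierarchy shown on the Type Abstraction Hierarchies slide, while the function names, the record layout, and the sample records are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Child -> parent links; node names follow the ethnic-group TAH on the slides.
TAH_PARENT: Dict[str, str] = {
    "Korean": "Asian", "Chinese": "Asian", "Japanese": "Asian", "Filipino": "Asian",
    "Asian": "Ethnic Group", "African": "Ethnic Group", "European": "Ethnic Group",
}


def covered_values(concept: str, parent: Dict[str, str]) -> Set[str]:
    """Return the concept together with every node below it in the hierarchy."""
    covered = {concept}
    changed = True
    while changed:
        changed = False
        for child, par in parent.items():
            if par in covered and child not in covered:
                covered.add(child)
                changed = True
    return covered


def relax(records: List[dict], attribute: str, value: str,
          parent: Dict[str, str]) -> Tuple[str, List[dict]]:
    """Generalize `value` one TAH level at a time until some record matches."""
    concept = value
    while True:
        allowed = covered_values(concept, parent)
        answers = [r for r in records if r[attribute] in allowed]
        # Stop when answers exist or the root of the hierarchy has been reached.
        if answers or concept not in parent:
            return concept, answers
        concept = parent[concept]


# Hypothetical records; the treatment names are placeholders.
records = [
    {"ethnicity": "Chinese", "treatment": "T1"},
    {"ethnicity": "European", "treatment": "T2"},
]
print(relax(records, "ethnicity", "Korean", TAH_PARENT))
```

Here the query on "Korean" finds nothing, so it is relaxed to "Asian" and returns the record for the Chinese patient. A fuller version would also honor the relaxation controls of the user model (relaxation order, unrelaxable objects, ranking), which this sketch omits.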
Type Abstraction Hierarchies for Medical Domain
• Tumor (location, size): Class X [loc1 loc3] [s1 s3], with members X1 [loc1 s1], X2 [loc2 s2], and X3 [loc3 s3]; Class Y [locY sY]
• Age: Preteens (9, 10, 11, 12), Teen, Adult
• Ethnic Group: Asian (Korean, Chinese, Japanese, Filipino), African, European

Knowledge-Based Image Model
[Figure: three-level image model]
• Knowledge level: TAHs for SR(t,b), Tumor Size, SR(t,l), and Lateral Ventricle
• Schema level: Brain, Tumor, and Lateral Ventricle, related by SR(t,l) and SR(t,b)
• Representation level: features and contents
Legend: SR: Spatial Relation; b: Brain; t: Tumor; l: Lateral Ventricle

Knowledge-Based Query Processing
Queries → Query Analysis and Feature Selection → Knowledge-Based Content Matching via TAHs → Query Relaxation → Query Answers

User Model
To customize query conditions and knowledge-based query processing
• User type
• Default parameter values
• Feature and content matching policies
  • Complete match
  • Partial match

User Model (cont.)
• Relaxation control policies
  • Relaxation order
  • Unrelaxable object
  • Preference list
  • Measure for ranking

Query Preprocessing
• Segment and label contours for objects of interest
• Determine relevant features and spatial relationships (e.g., location, containment, intersection) of the selected objects
• Organize the features and spatial relationships of objects into a feature database
• Classify the feature database into a Type Abstraction Hierarchy (TAH)

Similarity Query Answering
• Determine relevant features based on query input
• Select a TAH based on these features
• Traverse the TAH nodes to match all the images with similar features in the database
• Present the images and rank their similarity (e.g., by mean square error)

Visual Query Language and Interface
• Point-click-drag interface
• Objects may be represented iconically
• Spatial relationships among objects are represented graphically

Visual Query Example
Retrieve brain tumor cases where a tumor is located in the region as indicated in the picture

A Visual Query Example

A Visual Temporal Query Example

Implementation
• Sun Sparc 20 workstations (128 MB RAM, 24-bit frame buffer)
• Oracle Database Management System
• X/Motif Development Environment, C++
• Mass storage of images (9 GB)

Summary I
• Image retrieval by feature and content
• Matching and relaxing images based on features
• Processing of queries based on spatial relationships among objects
• Answering of imprecise queries
• Expression of queries via a visual query language
• Integrated view of temporal multimedia data in a timeline metaphor

A Knowledge-Based Approach to Retrieve Scenario-Specific Free Text in a Medical Digital Library

NIH Program Project Grant (2000-2005)
A 5-year, $10M joint interdisciplinary project between Medical School and CS faculty
• Project 1: teleradiology infrastructure
• Project 2: neuroradiology workstation
• Project 3: multimedia information architecture
• Project 4: natural language processing for medical reports
• Project 5: medical digital library

Project 5 Personnel
• Project leader: Wesley W. Chu
• Graduate students: Victor Z. Liu, Wenlei Mao, Qinghua Zou
• Consultants: Hooshang Kangarloo, M.D.; Denise Aberle, M.D.

Data in a Medical Digital Library
• Structured data (patient lab data, demographic data, …): CoBase
• Images (X rays, MRI, CT scans): KMeD
• Free text
  • Patient reports
  • Teaching files
  • Literature
  • News articles

System Overview
[Figure: an ad-hoc query, or a patient report for content correlation, is submitted to the Medical Digital Library (MDL), which returns query results drawn from patient reports, medical literature, teaching materials, and news articles]

Scenario-Specific Retrieval
Sample patient report:
… Tissue Source: LUNG (FINE NEEDLE ASPIRATION) (LEFT LOWER LOBE) …
FINAL DIAGNOSIS:
- LUNG NODULE, LEFT LOWER LOBE (FINE NEEDLE ASPIRATION):
- LUNG CANCER, SMALL CELL, STAGE II. …
Scenarios:
• How to diagnose the disease: diagnosis-related articles
• How to treat the disease: treatment-related articles

Challenge I: Indexing
Extracting domain-specific key concepts in the free text for indexing
• Free text: Lung cancer, small cell, stage II
• Concept terms in knowledge source: stage II small cell lung cancer
• Conventional methods use NLP
  • Not scalable
  • Cannot adapt to various forms of word permutation

Challenge II: Terms used in the query are too general
Expanding the general terms in the query to specific terms that are used in the document
• Query: lung cancer, diagnosis options
• Specific terms: chest x-ray, bronchography, …
• Document: … the effectiveness of chest x-ray and bronchography on patients with lung cancer …

Challenge III: Mismatching between terms used in query and documents
Example
• Query: … lung cancer, …
• Document 1: … lung carcinoma …
• Document 2: … lung neoplasm …
• Document 3: anti-cancer drug combinations…

Application: Query Answering via Templates
Sample templates: "<disease>, treatment", "<disease>, diagnosis"
[Figure: "lung cancer" → IndexFinder → template "<disease>, treatment" gives "lung cancer, treatment" → Query Expansion gives "lung cancer, radiotherapy, chemotherapy, cisplatin, …" → Phrase-based VSM → relevant documents]

Application: Scenario-Specific Content Correlation
[Figure: Patient Report → IndexFinder → Scenario Selection (query templates, e.g. treatment, diagnosis, etc.) → Query Expansion → Phrase-based VSM → relevant documents]

Summary of MDL
A knowledge-based (UMLS) approach provides scenario-specific medical free-text retrieval
• IndexFinder: uses word permutation as well as syntactic and semantic filtering to extract domain-specific key concepts in the free text for indexing
• Knowledge-based query expansion: transforms general terms in the query into the scenario-specific terms used in the documents, giving the query a higher probability of matching the relevant documents
• Phrase-based indexing: transforms document indexing into a phrase paradigm (a concept and its word stems) to improve retrieval effectiveness

Acknowledgement
This research is supported in part by NIC/NIH Grant #4442511-33780

Demo
http://fargo.cs.ucla.edu/umls/search.aspx
Test Texts
• Technically successful left lower lobe nodule biopsy.
• Preliminary localization CT images again demonstrate a left lower lobe nodule adjacent to the posterior segmental bronchus.
• CT scans obtained during biopsy demonstrate the coaxial cannula adjacent to the proximal aspect of the nodule.
• Surrounding pulmonary parenchymal hemorrhage as a result of the biopsy is also noted.
• There may be a tiny left apical air collection in the pleural space lateral to the apical bulla.
• Formal cytologic evaluation of the withdrawn specimen is pending at this time, although abnormal appearing "spindle" cells were identified during on-site cytopathologic evaluation of specimen adequacy.
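The word-permutation idea behind IndexFinder, described in the summary above, can be illustrated with order-insensitive matching: a concept term is reported when all of its words occur in the free text, in any order. This is only a sketch of that one idea and not the IndexFinder algorithm itself; the function names are hypothetical, the small `knowledge_source` list stands in for a UMLS-style vocabulary, and the syntactic and semantic filtering and word stemming mentioned in the slides are left out.

```python
import re
from typing import FrozenSet, List


def word_set(text: str) -> FrozenSet[str]:
    """Lower-case the text and return its distinct alphanumeric words."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def find_concepts(free_text: str, concept_terms: List[str]) -> List[str]:
    """Return the concept terms whose words all appear in the text, in any order."""
    text_words = word_set(free_text)
    return [term for term in concept_terms if word_set(term) <= text_words]


# The report line comes from the sample report in the slides;
# the concept list is a made-up stand-in for a knowledge source.
report_line = "LUNG CANCER, SMALL CELL, STAGE II."
knowledge_source = ["stage II small cell lung cancer", "lung cancer", "lung neoplasm"]
print(find_concepts(report_line, knowledge_source))
```

For the sample report line, both "stage II small cell lung cancer" and "lung cancer" are matched even though the report lists the words in a different order, while "lung neoplasm" is not, which is the synonym mismatch that the query expansion step is meant to address.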