Mapping Clinical Narrative to LOINC: A Preliminary Report
Charles A. Sneiderman, M.D., Ph.D.
Marcelo Fiszman, M.D., Ph.D.
Honglan Jin, Ph.D.
Thomas C. Rindflesch, Ph.D.
Lister Hill National Center for Biomedical Communications

Introduction
• Pilot project
  • Current limitations
    • Identify physiologic functions only
    • In published clinical case reports [see CAS_report.pdf]
• Motivation for addressing clinical LOINC
  • Knowledge-intensive NLP for clinical narrative
  • Clinical observations less likely structured
  • LOINC standard for communicating observations
• Side benefit
  • Possibility of semi-automated LOINC coding

Background
• Previous published research
  • Terminologies mapped to LOINC
  • No mapping of documents to LOINC
• Limitation of existing NLP methods
  • MetaMap: Interaction with LOINC in Metathesaurus [see CAS_examples.doc]
  • String matching: LOINC specification not accommodated [see CAS_examples.doc]
• LOINC mapping is knowledge intensive

Methods: Overview
• MetaMap to UMLS first
• Then apply knowledge-based rules
  • Extension of “Lexically Assign, Logically Refine” strategy of Dolin et al. (1998)
• Evaluation against coding by FP (CS) checked by Cardiologist (BB) [see CAS_annotate.txt]

Methods: Knowledge
• Use canonical document structure
  • Physical examination section
    • Begin: Lexical cues (e.g. examination)
    • End: Semantic types (e.g. Diagnostic Procedure)
• Use UMLS Semantic types
  • To identify physiologic functions (Physiologic Function, Organism Function, Clinical Attribute, Organism Attribute)
• Use syntactic context
  • Disambiguation: “BP” followed by quantitative concept

Methods: LOINC structure
• Vital Signs
  • Blood pressure: system (4th field) = arterial
  • Respiratory rate: system = respiratory
  • Heart rate: system = XXX
• Quantitative (QN) in 5th field (scale)
• Choose most general LOINC
  • No periods in any field
  • No “^” (other than “^Patient”) in any field
  • No “difference” in 2nd field

Discussion
• Preliminary results [see CAS_output.txt]
• Initial phase
  • Assess feasibility
  • Note issues faced

Next Steps
• Expand rules
  • Based on structured knowledge
    • What information does LOINC encode
    • How is it represented
• Exploit LOINC information model (Forrey et al. 1996; Huff et al. 1998; McDonald et al. 2003)

References
Dolin RH, Huff SM, Rocha RA, Spackman KA, Campbell KE. Evaluation of a “lexically assign, logically refine” strategy for semi-automated integration of overlapping terminologies. J Am Med Inform Assoc. 1998 Mar-Apr;5(2):203-13. PMID: 9524353
Forrey AW, McDonald CJ, DeMoor G, Huff SM, Leavelle D, Leland D, Fiers T, Charles L, Griffin B, Stalling F, Tullis A, Hutchins K, Baenziger J. Logical observation identifier names and codes (LOINC) database: a public use set of codes and names for electronic reporting of clinical laboratory test results. Clin Chem. 1996 Jan;42(1):81-90. PMID: 8565239
Huff SM, Rocha RA, McDonald CJ, De Moor GJ, Fiers T, Bidgood WD Jr, Forrey AW, Francis WG, Tracy WR, Leavelle D, Stalling F, Griffin B, Maloney P, Leland D, Charles L, Hutchins K, Baenziger J. Development of the Logical Observation Identifier Names and Codes (LOINC) vocabulary. J Am Med Inform Assoc. 1998 May-Jun;5(3):276-92. PMID: 9609498
McDonald CJ, Huff SM, Suico JG, Hill G, Leavelle D, Aller R, Forrey A, Mercer K, DeMoor G, Hook J, Williams W, Case J, Maloney P. LOINC, a universal standard for identifying laboratory observations: a 5-year update. Clin Chem. 2003