Computational Intelligence in Biomedical and Health Care Informatics
HCA 590 (Topics in Health Sciences)
Rohit Kate

Clinical Natural Language Processing

Reading
Paper: What can natural language processing do for clinical decision support? Dina Demner-Fushman, Wendy Chapman, Clement McDonald. Journal of Biomedical Informatics 42 (2009) 760-772
Paper: 2010 i2b2/VA Challenge on Concepts, Assertions, and Relations in Clinical Text. Uzuner Ö., South B., Shen S., DuVall S. Journal of the American Medical Informatics Association 2011;18(5):552-556

Clinical Decision Support Systems
• A clinical decision support (CDS) system is any computer program designed to help healthcare professionals make clinical decisions or to present them with patient-specific assessments and recommendations
– Suggest diagnosis and medications
– Trigger reminders
– Flag abnormal values
– Alert about drug interactions
– Remind the user of overlooked diagnoses
– Provide advice based on patient-specific data

CDS Systems and Narrative Text
• Such computer-based support is much more effective if the computer system has access to electronic medical records (EMRs) and has the ability to process them
• A major portion of patient records, including radiology reports, operative notes, discharge summaries, etc.,
are recorded as narrative text (dictated, transcribed or directly entered) in a natural language such as English
• Facts that should activate a CDS system are often found only in free text

CDS Systems and NLP
• Much of the data that could support CDS is textual and therefore cannot be leveraged by a CDS system without natural language processing (NLP)
• NLP to be used for CDS needs to be:
– Reliable
– High-quality
– Modular and flexible
– Fast

Active and Passive CDS with NLP
• Active: The system leverages available information and pushes patient-specific information to the user
• Passive: The users themselves seek the support
• Users: Depending upon the application, besides clinicians, users could be patients, researchers, administrators, students, and coders
• Besides free text in the clinical records, NLP for CDS could also process biomedical literature, web pages, etc.
[Figure from the paper: Active and Passive CDS with NLP]

Example of an Idealized NLP-CDS System
• It would monitor the EMR for insertions of new data
• When free text is entered, for example “Right lower lung opacity, which could be contusion or pneumonia”, the NLP system will kick in to analyze it
• The NLP system will extract the information that the disorder could be “pulmonary contusion” or “pneumonia” and pass it to the CDS system
• The CDS system will look up decision rules for suspected pneumonia, retrieve the results of the blood test, and evaluate the white blood cell count
• If the count is high, the system will suggest in a reminder message (possibly in natural language) that the patient is more likely to have pneumonia than pulmonary contusion, and why
• Idealizing further, the NLP system may look up medical literature, solicit more information, and present natural language summaries
– For example, present summaries of the best approaches to manage both disorders
– May look up medical publications, guidelines, and actionable recommendations available in free text
• This
idealized system will have to deal with all the challenges of clinical NLP

Challenges of Clinical Language Processing
• Good Performance
– Performance should be good enough for clinical applications and should not be significantly worse than that of medical experts
– The system should have the flexibility to trade off precision and recall
• Recovery of Implicit Information
– The NLP system should contain enough medical knowledge to make appropriate inferences
– “rupture” means “rupture of membranes”
– “patchy opacity” and “focal infiltrate” may indicate “pneumonia”
• Interoperability: The NLP system should integrate seamlessly into clinical information systems
– Many different interchange formats (e.g. HL7)
– Different types of reports with different formats; text may contain tables, structured fields, etc.
– Output of the NLP system should be mapped to an appropriate controlled vocabulary, e.g. UMLS, SNOMED or ICD
• Training set availability
– Patient records are confidential, and using them requires approval of an institutional review board (IRB)
– There are methods to de-identify records by removing names etc., but identifying the names etc. is not easy
– These issues do not arise when processing literature
• Limited availability in electronic form
– Many clinical documents are still written on paper
– Optical Character Recognition (OCR) is not accurate, especially on physicians’ notes
• Expressiveness
– More than 200 different expressions for severity information: faint, mild, borderline, 3rd degree, mild to moderate, etc.
– Complex modifiers: “no improvement in pneumonia” in text will match a query “improvement in pneumonia”
• A lot of abbreviations, which could be ambiguous
– pvc may mean pulmonary vascular congestion in a chest X-ray report and premature ventricular complexes in an electrocardiogram report
• Compactness of text
– Very compact, containing many abbreviations
– Sentence boundaries poorly delineated
  Admit 10/23 71 yo woman h/o DM, HTN, Dilated CM/CHF, Afib s/p embolic event, chronic diarrhae, admitted with SOB.
• Rare events
– Medical errors and adverse events are not reported frequently, so it is difficult to train a system to detect them

Shared Tasks in Clinical Language Processing
• Evaluation
– Gold-standard data is difficult to obtain; annotating data is time-consuming for medical experts
– Evaluation competitions, or shared tasks, are very useful: they help compare different systems on the same data
• i2b2 shared tasks 2008-2012:
– https://www.i2b2.org/NLP/Obesity/
– https://www.i2b2.org/NLP/Medication/
– https://www.i2b2.org/NLP/Relations/
– https://www.i2b2.org/NLP/Coreference/
– https://www.i2b2.org/NLP/TemporalRelations/
• ShARe/CLEF eHealth 2013-2014:
– https://sites.google.com/site/shareclefehealth/
– http://clefehealth2014.dcu.ie/
• SemEval 2014 Task 7 - Analysis of Clinical Text:
– http://alt.qcri.org/semeval2014/task7/
• TREC Medical Records task

i2b2 2010: Concepts
• Concepts:
– Medical Problems
– Treatments
– Tests
• System input: raw text of medical records
• System output: A plain text file that contains entries of the form:
  c=“concept text” offset || t=“concept type”
  (offset indicates line and token numbers of the document)
  For example:
– c=“cancer” 5:8 5:8 || t=“problem”
– c=“chemotherapy” 5:4 5:4 || t=“treatment”
– c=“chest x-ray” 6:12 6:13 || t=“test”

i2b2 2010: Assertions
• Assertions (attributes of medical problems):
– Present
– Absent
– Possible
– Conditional
– Hypothetical
– Not associated with the patient
• System input:
raw text of medical records and given concepts
• System output: Assertions on all problem concepts (and only problem concepts)
  c=“concept text” offset || t=“concept type” || a=“assertion value”
  For example:
– c=“hypertension” 5:4 5:4 || t=“problem” || a=“absent”
– c=“diabetes” 6:12 6:12 || t=“problem” || a=“possible”

i2b2 2010: Relations
• Extract the relations that exist between the concepts:
– medical problems and treatments
  • 6 possible relations
– medical problems and tests
  • 3 possible relations
– medical problems and other medical problems
  • 2 possible relations
• System input: raw text of medical records with given concepts and assertions (optional)
• System output: relations between pairs of concepts in the following format:
– c="a cardiac catheterization" 9:12 9:14 || r="TeCP" || c="chest pain" 9:5 9:6
– c="a cardiac catheterization" 9:12 9:14 || r="TeRP" || c="an occluded right coronary artery" 9:23 9:27
– c="a cardiac catheterization" 9:12 9:14 || r="TeRP" || c="a 40-50% proximal stenosis" 9:29 9:32

i2b2 2010: Data
• 349 Training reports
– 97 discharge summaries from Partners
– 73 discharge summaries from Beth-Israel Deaconess Medical Center
– 98 discharge summaries from University of Pittsburgh Medical Center
– 81 progress notes from University of Pittsburgh Medical Center
• 477 Test reports
– 133 discharge summaries from Partners
– 123 discharge summaries from Beth-Israel Deaconess Medical Center
– 102 discharge summaries from University of Pittsburgh Medical Center
– 119 progress notes from University of Pittsburgh Medical Center

i2b2 2010: Best Results
• A total of 41 teams participated (22 for concepts, 21 for assertions and 16 for relations)
Best F-measures (harmonic mean of precision & recall):
• Concepts: 85% F-measure
• Assertions: 92.6% F-measure
• Relations: 73.7% F-measure
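
Sketch: scoring concept output with the F-measure
The concept entry format and the F-measure described in these slides can be tied together in a short Python sketch. It parses entries of the form c=“concept text” offset || t=“concept type” and computes precision, recall and their harmonic mean. The names Concept, parse_concept and f_measure are hypothetical helpers introduced only for illustration, and exact matching on text, offsets and type is an assumption; the official challenge scoring may differ in its details.

```python
import re
from typing import NamedTuple, Set, Tuple


class Concept(NamedTuple):
    text: str
    start: Tuple[int, int]  # (line number, token number) of the first token
    end: Tuple[int, int]    # (line number, token number) of the last token
    ctype: str              # "problem", "treatment" or "test"


# Accept straight or curly double quotes, since the slides show both.
_Q = '["\u201c\u201d]'
_ENTRY = re.compile(
    rf'c={_Q}(?P<text>.*?){_Q}\s+'
    rf'(?P<sl>\d+):(?P<st>\d+)\s+(?P<el>\d+):(?P<et>\d+)'
    rf'\s*\|\|\s*t={_Q}(?P<ctype>.*?){_Q}'
)


def parse_concept(line: str) -> Concept:
    """Parse one concept entry; raise ValueError if the line does not match."""
    m = _ENTRY.search(line)
    if m is None:
        raise ValueError(f"not a concept entry: {line!r}")
    return Concept(
        text=m["text"].lower(),
        start=(int(m["sl"]), int(m["st"])),
        end=(int(m["el"]), int(m["et"])),
        ctype=m["ctype"].lower(),
    )


def f_measure(gold: Set[Concept], system: Set[Concept]) -> Tuple[float, float, float]:
    """Return (precision, recall, F) under exact matching (an assumption)."""
    true_positives = len(gold & system)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f


# Gold entries are the three examples from the Concepts slide;
# the system output is made up: it finds two of the three concepts.
gold = {parse_concept(s) for s in [
    'c="cancer" 5:8 5:8 || t="problem"',
    'c="chemotherapy" 5:4 5:4 || t="treatment"',
    'c="chest x-ray" 6:12 6:13 || t="test"',
]}
system = {parse_concept(s) for s in [
    'c="cancer" 5:8 5:8 || t="problem"',
    'c="chemotherapy" 5:4 5:4 || t="treatment"',
]}

p, r, f = f_measure(gold, system)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
# prints: precision=1.00 recall=0.67 F=0.80
```

In this made-up case the system is perfectly precise but misses one concept, so the harmonic mean (0.80) sits below the arithmetic mean of precision and recall (about 0.83), which is why the F-measure penalizes an imbalance between the two.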