Text Mining of Medical Documents

Michael Elhadad - Raphael Cohen
Dept of Computer Science
Natural Language Processing
• Analyze free text to extract “information”
• Key challenges:
– Ambiguity: heart, ברק (Hebrew: “lightning”, or the name Barak)
– Variability: diabetes, dm, diab.
• Applications:
– Search
– Text Mining: information extraction, relations
– Summarization
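The variability challenge above can be illustrated with a minimal sketch, assuming a small hand-built lookup table. The `VARIANTS` table and the `normalize_terms` function are hypothetical names introduced only for this illustration and hold just the three surface forms from the slide. A real system would likely rely on a curated terminology, and a plain lookup like this does nothing about ambiguity.

```python
import re

# Hypothetical variant table: each surface form maps to one canonical concept.
# The entries are the slide's own example, not a real terminology.
VARIANTS = {
    "diabetes": "diabetes",
    "dm": "diabetes",
    "diab": "diabetes",
}

def normalize_terms(text):
    """Return the canonical concepts found in text, in order of appearance."""
    # Lowercase and keep alphabetic tokens only, so "DM," and "diab." match.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [VARIANTS[t] for t in tokens if t in VARIANTS]

print(normalize_terms("Pt with DM, father had diab. as well"))
```

Both mentions in the sample sentence map to the same concept, which is what lets search or text mining count them together.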
NLP for Medical Domain
Opportunity
• Availability of online textual documents
– EHR: mostly textual (discharge notes)
– Scientific literature (PubMed)
Challenge
• Methods developed on “regular language” fail on “medical language”
Specific Interest
• EHR
– Exploit rich textual data in EHR.
– In Hebrew!
• Hebrew NLP
– Complex morphology, no dictionaries, no UMLS
• Domain Adaptation
– Machine learning methods to port NLP models from one domain to the medical domain.
Recent Work in Domain
• Raphael Cohen, Michael Elhadad and Ohad S Birk, Analysis of free online physician advice services, PLOS ONE, 2013
• Raphael Cohen, Noemie Elhadad, Michael Elhadad, Redundancy in Electronic Health Record Corpora: Analysis, Impact on Text Mining Performance and Mitigation Strategies, BMC Bioinformatics, 2013
• Raphael Cohen and Michael Elhadad, Syntactic Dependency Parsers for Biomedical-NLP, AMIA Proceedings 2012, pp. 121-128
• Raphael Cohen, Yoav Goldberg and Michael Elhadad, Domain Adaptation of a Dependency Parser with a Class-Class Selectional Preference Model, ACL 2012, SRW
• Raphael Cohen, Avitan Gefen, Michael Elhadad and Ohad S Birk, CSI-OMIM - Clinical Synopsis Search in OMIM, BMC Bioinformatics 2011, 12:65