I2B2 Shared Task 2011
Coreference Resolution in Clinical Text
David Hinote
Carlos Ramirez

What is coreference resolution?
• Nouns, pronouns, and phrases that refer to the same object, person, or idea are coreferent.
  o Example: "Alexander was playing soccer yesterday. He fell and broke his knee."
  o "Alexander", "He", and "his" refer to the same person, so they are said to be coreferent.

The i2b2 Challenge
I2B2: "Informatics for Integrating Biology and the Bedside"
• This program has issued an NLP challenge involving coreference resolution.
• The challenge is to find coreferential relations within a given medical document.
• The concepts that can be coreferent are all annotated.
• There are 5 classes of concepts:
  o Problem
  o Person
  o Treatment
  o Test
  o Pronoun

Concept Mentions
People
• Any mention that refers to a person or group of people
  o Dr. Lightman, the patient, cardiology
Problems
• A mention that refers to the reason the subject of the document is in the hospital
  o Heart attack, blood pressure, broken leg
Tests
• Tests performed by doctors
  o EKG, temperature, CAT scan
Treatments
• Solutions to the problem mentions, or work performed to cure patients
  o Brain surgery, ice pack, Tylenol
Pronouns
• Can refer to any of the four other types of mentions

Approaches for Competing
• Using tools already made and publicly available
  o Stanford NLP
  o BART Coreference
  o LingPipe
  o CherryPicker
  o Reconcile
  o ARKRef
  o Apache OpenNLP
• Coding our own coreference tool

Other Coreference Tools
• We obtained versions of other coreference tools and tested them on our data.
• All the tools we found were either still in early development or were built for one specific purpose (e.g. coreference on the MUC datasets) and not maintained afterwards.
• Our testing shows that even the best of these tools does not perform acceptably on our data.
• After our attempts to train the other tools on our data failed, we decided to code our own approach.
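
To make the idea of coreferent mentions concrete, here is a minimal sketch of how annotated mentions and pairwise coreferent links could be grouped into chains. It is written in Python for brevity and is not the authors' Java code; the `Mention` structure, the `build_chains` helper, and the use of union-find are all assumptions made for illustration.

```python
from dataclasses import dataclass

# The five concept classes named in the slides.
CONCEPT_CLASSES = {"problem", "person", "treatment", "test", "pronoun"}


@dataclass(frozen=True)
class Mention:
    """One annotated concept mention (hypothetical structure)."""
    text: str
    concept_class: str
    position: int  # order of appearance in the document


def build_chains(mentions, links):
    """Merge pairwise coreferent links into chains using union-find."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the representative of m's group.
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path halving
            m = parent[m]
        return m

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for m in mentions:
        groups.setdefault(find(m), []).append(m)
    # Keep only groups with two or more mentions, in document order.
    return [sorted(g, key=lambda m: m.position)
            for g in groups.values() if len(g) > 1]


# "Alexander was playing soccer yesterday. He fell and broke his knee."
alexander = Mention("Alexander", "person", 0)
he = Mention("He", "pronoun", 1)
his = Mention("his", "pronoun", 2)

chains = build_chains([alexander, he, his], [(alexander, he), (he, his)])
chain_texts = [[m.text for m in chain] for chain in chains]
print(chain_texts)
```

Two pairwise links are enough to place all three mentions in one chain, which is why a resolver only has to decide links between pairs of mentions.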

Other Tools Statistics

Algorithm
• Because the data we are working on is so specific, we chose a rule-based approach to coreference resolution.
• This means that we study the characteristics of each kind of coreferent link ourselves and program a method for that link manually.
• We examine the concepts in a file, and if they meet our criteria, we create a method to link them.
• The idea is to create rules that are specific, yet general enough to apply to similar mentions in all documents.

Our Application
• To help visualize coreferent links and see which links our program detects, we use a GUI written in Java.
• We develop the program using the Mercurial version control system, which keeps each other's code up to date.
• It uses our coded algorithms to determine coreferent links between the given concept mentions.
• It displays coreferent links as lines.
  o Blue for true links.
  o Red for links that are detected by our algorithm.

Our Application
Programmed in Java, our application can use databases and the internet to gather information about the concept mentions being tested.
• We have set up a database to hold data that gives meaning to the concept mentions being tested, or to certain key words in a sentence that contains a mention. If words or phrases meet our criteria, they can be added to the appropriate table straight from the program window.
• For each mention, the program also extracts information from Google.com searches, which can provide a wealth of information about the mention.

Sample file
• Viewing Concepts & I2B2 Chain File with both UHD and I2B2 Links Shown

Statistics for our System

Progress
• We are currently at around a 75% F1 score (averaged over all test files).
• Most algorithms for resolving coreference tend to have accuracy in the 60% range.
• With the time we have left, we expect to increase this score.
• We still haven't added detection for "Treatment" type concepts, which make up a significant percentage of the concepts we miss when computing our F1 score.
• Detection for "Test" type concepts still needs work.

Current Work: Test Mentions
• Precision on "Test" type concepts is relatively low (30%).
• Mainly this is because many of the tests involve specific body parts (e.g. "chest x-ray" and "chest CT" are sometimes linked by our rules).
• Tests also often involve times (e.g. "an x-ray was performed on 5 Aug." would link with "the x-ray on... December 10, 2010").
• They also involve position (e.g. "x-ray on left lung" and "x-ray on right lung").

Current Work: Problem Mentions
• Work on these mentions is about 50% complete.
• To finish, a few more database tables will need to be set up and certain types of medical vocabulary loaded into them.
• We will also need a system for finding phrases that use different words but mean the same thing, i.e. a thesaurus.

Possible future problems
• The main risk with a rule-based approach is that our rules might be too specific to work with the contest data once it is distributed.
• Given the execution speed of our program, we should have enough time to make any necessary modifications in the three days between the release of the contest data and the submission of results.
• A smaller concern is that our application is built for a very specific purpose and is probably hard to generalize beyond medical documents.
• Most coreference resolution tools are this way, though.
• Not being able to code fast enough!

Future Necessities
• A reliable way to find the temporal setting of a particular sentence.
  o Did the injury described happen 20 years ago, or is the doctor giving instructions for a future case?
    Such mentions are not coreferent even though they may use the same word.
• Thesaurus work
  o Finding phrases that mean the same thing but use completely different words
• Output
  o The program does not yet output files in the I2B2 competition format; we will add this feature as the competition deadline draws near.
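
The false links on "Test" mentions described under the current work on test mentions (different test types, sides, or dates) could be filtered by a rule along these lines. This is a minimal sketch in Python, not the authors' Java implementation; the vocabularies, the `attributes` and `may_link` helpers, and the idea that dates arrive already normalized to month-day strings are all assumptions made for illustration.

```python
import re

# Tiny hypothetical vocabularies; a real system would load these from
# database tables of medical terms.
TEST_TYPES = {"x-ray", "ct", "ekg", "mri"}
BODY_PARTS = {"chest", "lung", "knee", "head"}
SIDES = {"left", "right"}


def attributes(text):
    """Extract test type, body part, and side from a mention's text."""
    tokens = re.findall(r"[a-z0-9-]+", text.lower())

    def first(vocab):
        return next((t for t in tokens if t in vocab), None)

    return {
        "test_type": first(TEST_TYPES),
        "body_part": first(BODY_PARTS),
        "side": first(SIDES),
    }


def may_link(text_a, text_b, date_a=None, date_b=None):
    """Return True if two test mentions have no conflicting attribute."""
    a, b = attributes(text_a), attributes(text_b)
    # Both mentions must name the same known test type.
    if a["test_type"] is None or a["test_type"] != b["test_type"]:
        return False
    # Body part and side must agree whenever both mentions state them.
    for key in ("body_part", "side"):
        if a[key] and b[key] and a[key] != b[key]:
            return False
    # Dates (assumed normalized elsewhere) must agree when both are known.
    if date_a and date_b and date_a != date_b:
        return False
    return True


different_type = may_link("chest x-ray", "chest CT")
different_side = may_link("x-ray on left lung", "x-ray on right lung")
different_date = may_link("an x-ray", "the x-ray", "08-05", "12-10")
unspecified = may_link("chest x-ray", "the x-ray")
print(different_type, different_side, different_date, unspecified)
```

The rule only rejects a link when both mentions state an attribute and the values differ, so an underspecified mention such as "the x-ray" can still link to "chest x-ray". That choice favors recall; a stricter rule would require every stated attribute to be matched.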