THE ANNOTATION GUIDELINE MANUAL: EXTRACTING ADVERSE DRUG EVENT INFORMATION FROM CLINICAL NARRATIVES IN ELECTRONIC MEDICAL RECORDS

Version 2.1
October 29, 2015

Steven Belknap, Elaine Freund, Nadya Frid, Edgard Granillo, Heather Keating, Zuofeng Li, Rashmi Prasad, Balaji Ramesh, Victoria Wang, Hong Yu

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Recommended Citation
Belknap, S., Freund, E., Frid, N., Granillo, E., Keating, H., Li, Z., Prasad, R., Ramesh, B., Wang, V., Yu, H. "THE ANNOTATION GUIDELINE MANUAL: EXTRACTING ADVERSE DRUG EVENT INFORMATION FROM CLINICAL NARRATIVES IN ELECTRONIC MEDICAL RECORDS." University of Massachusetts Medical School's Biomedical Informatics Natural Language Processing (UMMS BioNLP) Group. Hong Yu, 29 October 2015. Web. (access date) URL.

Table of Contents

INTRODUCTION
    General Background
    Guidelines Background
NAMED ENTITY OR ANNOTATION FIELDS*
    PHI and PII Annotation
    Medication Annotation: Drug and Drug Attributes
        Drug
        Dosage (and form)
        Route
        Frequency
        Duration
    Medication Annotation: Medication Related Entities and Attributes
        Indications and Adverse Events
        Indication
        Adverse event
        MedDRA Annotation
        SSLIF and Severity
        SSLIF
        Severity
    Choosing A Span
    Assertion Categories
        Outcome Assertions (default notMentioned)
        Period Assertions (default current)
        Presence Assertions
    Annotation Of Relations
ANNOTATION PRACTICE
    General Considerations
        Anaphoric pronouns
        Anaphoric Pronouns vs. Co-referent Items
        Articles
        Titles
        Prepositions
        Test results
        Longitudinal Information
        Punctuation and Spelling
    APPENDIX 1: Additional Annotation Examples and Information
        SSLIF
        Chemotherapy
        General Terms
        Double Annotation
    APPENDIX 2: Protected Health Information (PHI) and Personally Identifiable Information
    APPENDIX 3: Entity and Attribute Tables
    APPENDIX 4: Routes of Drug Administration and Abbreviations
    APPENDIX 5: Frequency of Drug Administration and Other Abbreviations
    APPENDIX 6: Reference Tables (Cardio/Cancer)
    APPENDIX 7: Annotation Tool Notes
    APPENDIX 8: Annotation and Tooling Notes Prior to ADE
    APPENDIX 9: Summary - Annotation Processes/Tooling Changes/Post Processing Notes for the ADE Pharmacovigilance Project
    APPENDIX 10: Entity and Attribute Diagrams

INTRODUCTION

General Background

An adverse event (AE) is an injury to a patient, and an adverse drug event (ADE) is "an injury resulting from a medical intervention related to a drug" (1). ADEs are common and occur at a rate of 2.4–5.2 per 100 hospitalized adult patients (1–4). Each ADE is estimated to increase the length of hospital stay by 2.2 days and to increase the hospital cost by $3,244 (3,5). Severe ADEs are between the fourth and sixth leading causes of death in the United States (6). Significant healthcare savings could be realized through prevention of ADEs and through early detection and mitigation of ADEs (5,7,8).

When a clinician recognizes an ADE, a hospital system typically prompts an appropriate response, such as discontinuation of the drug, adjustment of dose, administration of an antidote (e.g., blood transfusion, antihistamines, antiarrhythmics, or intravenous fluid resuscitation), or other action. While particular instances of ADEs may be recognized and appropriately ameliorated, these events are often not coded in diagnostic or billing fields of the medical record and are therefore “lost” to pharmacoepidemiologists, regulatory agencies, and clinicians. One result of this loss is a paucity of high-quality information that can lead to errors in assessment of toxicity from cancer drugs (9).
The lack of timely and accurate ADE information has led to confusion for patients and prescribers, especially when the FDA takes regulatory action (10) that appears to be inconsistent with the available data, as recently happened with clopidogrel (11).

Studies have shown that the occurrence of an ADE is often buried in the EMR narrative (e.g., (12)). The ADE is not separately recorded as a diagnosis code or as other data accessible in the structured fields and is therefore difficult to detect and assess. However, manual abstraction of data from discharge notes and from other unstructured text remains a significant impediment to progress in pharmacovigilance research. Rapid, accurate, and automated detection of ADEs in any patient population would provide significant cost and logistical advantages over manual ADE detection (e.g., chart review or voluntary reporting) (13). Consequently, robust biomedical natural language processing (BioNLP) approaches that accurately detect ADEs in EMR narratives would be of great interest to other pharmacovigilance researchers and would also have potential application in clinical settings. Current projects utilizing EMR annotation involve clinical narratives from cancer, cardiovascular and diabetes patients.

Guidelines Background

These guidelines are being used to annotate patient Electronic Medical Records (EMRs), which will be made publicly available as a corpus with high-quality annotation of ADEs. This corpus will also be used to train an innovative NLP system that is part of a pharmacovigilance toolkit. The toolkit will be integrated into the open source translational research platform i2b2 (14), so these annotation guidelines generally align with the i2b2 guidelines.
The annotation objectives are to identify relevant named entities (diseases, medications and ADEs), the discourse relations between them (e.g., causal, temporal and contrastive relations), severity, and the Naranjo elements used for assessing causality.

The annotation tools use Protégé with the Knowtator plugin (15) and incorporate HHS PHI and PII terms, the Naranjo scoring system (16) and MedDRA (17) terms in the user interface. The guidelines have been iteratively developed during usage and with experts across many domains. The guidelines and tooling will continue to develop and be refined throughout the annotation process and as research progresses.

Short videos demonstrating use of the annotation tooling are available (you may want to use another browser if the links do not open in IE). Alternatively, you can go to the UMass BioNLP Annotation Resource Page. The videos are:
1 Getting Started - Annotation
2 Annotation Tool Orientation
3 First Annotation PHI
4 Spans and Corrections
5 Relations Annotation
6 Adverse Events and MedDRA
7 More on Attributes

In brief, you will open a record in the annotation tool and it will look similar to the picture below. The first panel lists the classes [1], the second panel is the medical record window [2] and the third panel is an attribute annotation window [3]. To annotate most classes, click the class in the left panel or in the fast annotate bar [4] and highlight the text in the middle panel. Some additional attributes and associations [5] are made from the class panel and the annotation window. A few are made from the annotation window alone, for example Period.

There is a website with the annotation guidelines, videos on how to use the tool, and other resources.
http://ummsres12.umassmed.edu/jt/index.php/annotation

NAMED ENTITY OR ANNOTATION FIELDS*

IMPORTANT CONVENTIONS NOTE: ENTITIES AND ATTRIBUTES ARE UNDERLINED IN THE SAME COLOR IN WHICH THEY ARE HIGHLIGHTED IN THE ANNOTATION TOOL. YELLOW HIGHLIGHTED TERMS AND SPANS IN THIS DOCUMENT INDICATE THAT THESE ARE ANNOTATABLE, BUT ARE NOT AN INDICATION OF CLASS TYPE. THEY ARE ALSO THE TERM TO WHICH AN ASSERTION APPLIES. ON RARE OCCASION SPECIFIC CLASS COLOR HIGHLIGHTS ARE USED FOR CLARITY.

PHI and PII Annotation

To implement the Health Insurance Portability and Accountability Act (HIPAA) (18), the Dept. of Health and Human Services published a national standard for the electronic exchange, privacy and security of health information. The “Privacy Rule” protects all individually identifiable health information transmitted in any form and calls this information “Protected Health Information (PHI)” and “Personally Identifiable Information (PII).” There are 18 common identifiers associated with PHI and PII that must be removed to de-identify data for use or release. These include name, address, date, Social Security Number, etc.; the complete list is in Appendix 2, which describes the PHI data types. PHI is annotated both to build named entity recognition for NLP and to support removal during de-identification. How the PHI classes are to be used is described below.

PHI: You can use the general class for marking something that you know is PHI but which does not clearly fall into the categories below.

Date: This class covers all aspects of date (except year) directly related to an individual, including birth date, admission date, discharge date, date of death.

Age over 89: Another date identifier applies to all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older.
Medical Record Number: Use this class to include medical record numbers, health beneficiary plan numbers and account numbers of any type.

Social Security Number: self-explanatory

Location: It will be valuable for machine learning to annotate addresses with some granularity. Most of these location identifiers are self-explanatory, but Named Sites would include things such as Hospitals, Universities, Organizations, named buildings, Landmarks, etc. Annotate the full name of the location. If you are not sure, you can use the less granular class of Location (or PHI) to annotate.

Telephone/FAX/Pager: self-explanatory

Name: All aspects of any name are to be annotated: first name, last name, initials, names following titles and indicators, nicknames, logins, handles.

Identifiers: This class covers certificate/license numbers; vehicle identifiers and serial numbers, including license plates; device identifiers and serial numbers; and biometric identifiers (which would be mostly images, and we almost surely will not see that type of data).

Electronic Identifiers: e-mail, web sites, IP addresses, username and password

* Classes are underlined in the color used as a highlighter in annotation.

Comments and Examples for Inter-Annotator Agreement

• It is extremely important to annotate PHI. It is the tag that will be used in the de-identification process, so it is essential that you annotate PHI as something, even if you are unsure or the class is incorrect. You may use general classes such as PHI or Location if you need to, but do NOT leave PHI unannotated.
• Use PHI when there are mixed data types, such as a number-date combination following accession.
• Pre-name indicators such as Dr., Mr., Mrs., and Ms. do not need to be removed. They are used as tags for name PHI filters. Similarly, you do not need to annotate post-name tags (Sr., Jr., III, M.D., Ph.D.).
• Year is not considered PHI.
“2011” does not need to be annotated in the example “She was diagnosed in 2011.” However, if a standalone year is pre-annotated, there is no need to remove the annotation.
• Week is not PHI in a context such as “week 3 of CHOP chemotherapy”.
• Day of week is not PHI if it is not informative, for example:
    o Please call the wound clinic to set up an appointment on Monday or Tuesday
    o Hospital Day #1
    o Upon questioning the patient that he did wear Crocs without socks for a portion of time last Sunday during a very hot day
• Do not separate a date phrase. May 18, 1997 is annotated as a single date span.
• Locations: Annotate the entire location name (i.e., include “Hospital”, “University”, “Campus”, etc.). Annotate abbreviations for buildings as Named Sites when you know that is the reference (e.g., AS for Albert Sherman Center); otherwise use Location. Annotate wings, rooms, corridors, hallways as Location. If you are unsure, or the specific type of location cannot be captured in a contiguous span, use the general Location. The key thing is to annotate any PHI in any PHI category so that it is removed in de-identification. Examples include:
    o ACC, MLN, AS-X = named sites or location
    o Numbers for wings, rooms, corridors, hallways as location, such as BB-H3
    o Clinic names such as “Cancer Clinic” or “Diabetes Clinic” are not annotated, but examples that are annotated include specific names such as: The Multiple Sclerosis Center at UMASS Memorial Health Care, or Hahnemann Family Health Center, or Emma L. Bowen Community Service Center
• Identifying numbers used by the hospital (acct, MRN, Fi, accession…etc.) should be annotated as medical record numbers (MRN).
• Accession numbers: annotate the numbers and letters in a single span.
    o 123456789 F122334455 L1 3344556622
  Some people have annotated just the numbers after the space. Either way is okay.
• Patient’s identifying numbers used by the government (passport number, driver license, social security, immigration related, etc.) should be annotated as Identifier.
• At UMMS the 5-digit number following a physician’s name is an identifier.
• For mixed data types, use the general PHI to annotate.

Medication Annotation: Drug and Drug Attributes*

When an adverse event is recognized, a physician will discontinue the drug, adjust the dose, or administer an antidote. Drug and drug-specific attributes are important elements to annotate. The information will be used to assess causal relations between an adverse event and drug administration.

Field: Drug name [Entity]
Definition: Substances that the patient has experienced or will experience, including drug class names and medications referred to with pronouns. The drug name must be mentioned either in the USP published drug list or in the Orange Book.
Examples:
    Eg 1: Lotensin 20 mg p.o. daily.
    Eg 2: He was started on azithromycin and ceftriaxone.
    Eg 3: drug classes such as oral contraceptives, macrolides, nonsteroidal anti-inflammatory, anti-infectives

Field: Dosage [Attribute]
    - Type (Discrete/Continuous)
    - Strength (Concentration/Amount)
    - Form (solid, tablet, liquid, injectable, cream)
Definition: The amount of a single medication used in each administration. Quantified description of the drug administered in each administration.
Examples:
    Eg 1: In the ER, the patient received heparin 4000 units bolus, then 1000 units per hour.
    Eg 2: Digoxin 0.125 mg every other day.

Field: Route [Attribute]
    - PO, IV, Topical, Epidural, Sublingual, Intramuscular, etc.
Definition: Method for administering the medication. For a list with abbreviations, see Appendix 4.
Examples:
    Eg 1: She continues to receive antibiotics intravenously.
    Eg 2: Glyburide 5 mg orally twice a day.

Field: Frequency [Attribute]
    - Times a day, etc.
    - Specified time of day or hours
Definition: How often each dose of the medication should be taken, including both discrete and continuous values. For a table with abbreviations, see Appendix 5.
Examples:
    Eg 1: A patient was prescribed Melphalan 5mg (1 tablet) daily.
    Eg 2: Labetalol 300 mg by mouth three times a day.

Field: Duration [Attribute]
    - Days, weeks, months, etc.
Definition: How long the medication is to be administered.
Examples:
    Eg 1: The patient received Taxol for one month.
    Eg 2: Continue home medications and Flagyl 500 mg 1 tablet p.o. q.i.d. for 10 days.

* Yellow highlighted terms and spans in this document indicate these are annotatable, but are not an indication of class type.

You can look up drugs in the U.S. Pharmacopeia by logging in to the library at this link, which is the USP site.
http://library.umassmed.edu/ebooks/ebooksredirect.cfm?ID=570

Comments and Examples for Inter-Annotator Agreement

Drug: substance used in the diagnosis, cure, mitigation, treatment, or prevention of disease.

• Drugs are substances used for the treatment or prevention of disease.
• If there are two drug names in a combination drug, capture it as one span.
    o Lisinopril-Hydrochlorothiazide 20-12.5 mg oral tablet; take 2 tablets daily.
• We annotate non-specific drug classes, for example chemotherapy, pain medication, anesthetic. General and local are two classes of anesthetics, so you would also annotate general anesthetic and local anesthetic.
• We do not annotate the term drug when it does not denote a specific drug.
• Brand and generic drug names: If both the brand name and the generic name are given, how you annotate them depends on the format.
    o If the format is “Brand name (generic name)”, annotate as one span: Tylenol (acetaminophen)
    o Another format, sometimes seen in discharge notes, is a medication table, in which the generic name is often a line below the brand name. You will need to double annotate, associating each drug name with the attributes. In this case you will annotate
        Tylenol 325 650 PO every Four Hours
      as well as
        Acetaminophen 325 650 PO every Four Hours
• A drug name may include or be followed by a percentage (letters or numbers). These are usually part of the drug name, but may also be the dose. Do not annotate dosage information as part of a drug name.
This will be handled in post-processing.
    o Latanoprost 0.05%, one drop in the affected eye once daily
• Here the drug name is followed by letters (XL) referring to an extended release form. It is annotated as part of the drug name.
    o Wellbutrin XL
• We are not annotating vaccines or their attributes. If a vaccine name, or part of it, is pre-annotated, the pre-annotation needs to be removed.

Non-drug examples

Non-drug treatment options like fluids and blood derivatives are not annotated. Blood transfusions, fluids, normal saline and red packed cells in the examples below are NOT annotated as drugs:
    o Given multiple blood transfusions
    o Pressors continued with fluids.
    o He was admitted to the hospital and hydrated with normal saline.
    o Pancytopenia, treated with G-CSF, erythropoetin and red packed cells

Not annotating drug

• Do not annotate drug in the phrase drug relationship and similar contexts, since it does not refer to a specific drug:
    o According to the pharmacovigilance center reporter and to French methodology of causality assessment, the drug relationship is unable to determine
    o According to the pharmacovigilance center reporter and to the French methodology of causality assessment, drug relationship is probable
• Do not annotate social self-medication with alcohol, tobacco, IV drugs, street drugs, etc. Only annotate what is medically provided for an indication, or self-medication with legally obtained OTC drugs. [Substance abuse can be a factor impacting outcomes and treatment, but we are not annotating it here, where the focus is on adverse events of prescribed medication.]

Dosage (and form): the amount of drug in a unit dose (physical shape of constituent substances in the product)

• You can associate more than one dosage annotation in a single span with a drug. This helps in annotating dose and form.
• If there are two concentrations written out because the medication is composed of two drugs, annotate both concentrations.
    o Lisinopril-Hydrochlorothiazide 20-12.5 mg oral tablet; take 2 tablets daily.
• We annotate x2 tabs a day as the span “2 tabs a day” (not “x2 tabs”).
• Annotate non-specific dosages such as a small amount, low-dose, high-dose.
• Percentages (and sometimes numbers) are usually part of the drug name, but they are also the dose. Annotate this dose.
    o Latanoprost 0.05%, one drop in the affected eye once daily
• Annotate concentration and form in one span where possible; otherwise more than one span can be associated with the drug.
    o 0.5mg 1-2 tablets
    o Amlodipine 10 mg oral tablet take 1 tablet by mouth every day
• Examples of items that may be annotated as dosage are cream, enteric coated, patch, and chewable forms of medications. Annotate the form along with any other indications of dosage.
• Caution: in the case of “fentanyl patch 1 patch daily”, “1 patch daily” is the dosage, whereas the “patch” in “fentanyl patch” is a route (transdermal).

NOTE: deciding between dosage, form and route can be tricky. If unsure, first look in the appendix; the next step is to ask an editor or one of our MDs.

Route: path by which a drug is taken into the body

• An epidural is an injection into a physical space, so it is a route.
• Note that you may commonly see “Patient receives injection of drug X”. This is annotated as Route (not Dosage). It would be Dosage if the number of injections were provided.
    o If Blood sugars are low do not take insulin injections

Examples
• combivent inhaler or 2 puff by inhalation four times daily
• “spray” in “Lidocaine spray”: see Appendix 4 for OROPHARYNGEAL, administration directly to the mouth and pharynx.
• “ophthalmic suspension” in “tobramycin ophthalmic suspension”: see Appendix 4 for OPHTHALMIC, administration to the external eye.

Other examples of Route, which may be found in Appendix 4, include infusion, “swish and spit”, rectal suppository, and epidural injection.

Frequency: rate of occurrence

• See Appendix 5 for common abbreviations for frequencies of drug administration.
• The adjective “weekly” is annotated as frequency, e.g., weekly Taxol.
• Use prepositions where appropriate (meaningful) in annotating frequency, such as in “with meals” or “before meals”.
• Other frequencies include AM, PM, in the morning, in the evening, etc.
• We do not annotate days when they are just a point in time, but “days” in the example below describe a frequency, so be aware that you need to annotate based on context. This is common in chemotherapy regimens.
    o Patient is presenting for induction VTD chemotherapy for multiple myeloma. I plan to administer velcade 1.3 mg/m2 on Day 1, 4, 8, 11; thalidomide 100mg daily day 1-12 then 200mg daily day 13-24; and decadron 40mg Day 2, 3, 4.
• Example of Days as Frequency (is annotated):
    o Warfarin 5mg Mon., Wed., Fri. and 7.5mg Sat., Sun., Tues, Thurs
• Example of Day/Cycle just indicating a point in time (and not annotated):
    o Patient is today presenting for a follow-up visit. He is currently Day 8 of Cycle 2 of CHOP chemotherapy for his non-Hodgkin’s Lymphoma.
• Watch for cases where a whole span is a frequency; this can be confused with a duration. In the case below, “each day for 2 weeks per month” is a frequency. A duration would be denoted by “for 6 months”.
    o chemo drugX, doseY, each day for 2 weeks per month

Duration: continuance in time; length of time

• Example of Cycles as duration:
    o Patient is presenting for induction of VTD chemotherapy for multiple myeloma. We plan to administer 3-4 cycles then do an autologous stem cell transplant.
    o Patient is here for a follow-up visit. Patient is s/p 3 cycles of VTD for this multiple myeloma.
• Example of a time span that is not a duration. In the example below, the patient has taken the drug for 3 weeks, but this does not mean the patient was prescribed a 3-week course of Zoloft.
    o The patient is taking Zoloft over the past 3 weeks
• You need a range for a duration, so a “started on” date is not sufficient information.
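The drug and attribute fields described above can be pictured as one drug span with related attribute spans. The following is a minimal sketch in Python, not the annotation tool's actual data model: the class names `Span` and `DrugAnnotation`, the helper `find_span`, and the label strings are all hypothetical, and the choice of spans for "Glyburide 5 mg orally twice a day." is one reading of the Route example in the table above.

```python
from dataclasses import dataclass, field

# Attribute classes described in this section; the label strings are assumptions.
ATTRIBUTE_LABELS = {"Dosage", "Route", "Frequency", "Duration"}


@dataclass(frozen=True)
class Span:
    start: int   # character offset of the first character
    end: int     # offset one past the last character
    label: str   # class name, e.g. "Drug" or "Route"
    text: str    # the annotated text itself


def find_span(source, phrase, label):
    """Locate the first occurrence of phrase in source and wrap it as a Span."""
    start = source.index(phrase)  # raises ValueError if the phrase is absent
    return Span(start, start + len(phrase), label, phrase)


@dataclass
class DrugAnnotation:
    drug: Span
    attributes: list = field(default_factory=list)

    def attach(self, span):
        """Relate an attribute span to this drug."""
        if span.label not in ATTRIBUTE_LABELS:
            raise ValueError(f"not a drug attribute: {span.label}")
        self.attributes.append(span)

    def texts(self, label):
        """Return the text of every related attribute with the given label."""
        return [s.text for s in self.attributes if s.label == label]


SENTENCE = "Glyburide 5 mg orally twice a day."

annotation = DrugAnnotation(find_span(SENTENCE, "Glyburide", "Drug"))
for phrase, label in [("5 mg", "Dosage"),
                      ("orally", "Route"),
                      ("twice a day", "Frequency")]:
    annotation.attach(find_span(SENTENCE, phrase, label))

print(annotation.texts("Route"))     # ['orally']
print(annotation.texts("Duration"))  # [] because the sentence states no duration
```

Keeping the attributes in a list rather than one slot per class allows several dosage spans to be related to the same drug, as the Dosage comments permit.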
More Drug Annotation Examples

• When more than one dosage is associated with a drug, double annotate it. “OxyContin 10 mg in the morning and 30 mg at night” is annotated as the drug with a relation to the first dosage and the drug with a relation to the second dosage:
    o OxyContin 10 mg in the morning and
    o OxyContin 30 mg in the evening
• When there is redundant information, capture the first instance.
    o MULTIVITAMINS (TAB-A-VITE) 1 TABLET = 1 TABLET Oral TABLET DAILY STAT and then Routine for 30 Days Directions: 1 tablet oral DAILY
• Further examples:
    o carbamazepine 200 mg extended release 2 tablets twice daily
    o spiriva 18 mcg inhalation capsule one daily,
    o Glipizide 5 mg, a half tablet daily.
    o Tramadol 50mg tabs 1-2 tabs every 4 hrs
    o Calcium + D 500mg/400IU 1 tab po BID
    o Lisinopril-Hydrochlorothiazide 20-12.5 MG Oral Tablet; TAKE 2 TABLETS DAILY;
    o albuterol 2.5 mg/3 ml 1 unit dose in nebulizer every 4 hours as needed,
    o Methyldopa 1000 in the morning and 500 in the evening.
    o Methyldopa 1000 in the morning and 500 in the evening
    o Latanoprost 0.05%, one drop in the affected eye once daily.

Medication Annotation: Medication Related Entities and Attributes

Indications and Adverse Events

Elements to annotate beyond the drug administration include why a drug is being given, the injury resulting from a medical intervention related to a drug, and the differentiation of the ADE from other signs and symptoms.

NOTE: Annotation Practice covers how to choose a span when annotating Indications, Adverse Events and SSLIF.

Field: Indication [Entity] [annotated in the class navigation bar and appears as the Drug attribute “Reason” when a relation is created]
Definition: Medical conditions for which the medication is given in the past or the present.
Examples:
    Present: The patient was diagnosed with hypertension and was treated with Accupril.
    Past: He did have some hypokalemia which was treated with p.o. K-Dur

Field: Adverse Event (AE) [Entity] [you can relate more than one AE to a drug]
Definition: Drug related injury to a patient.
Examples:
    Present: She experienced a hypersensitivity reaction while receiving intravenous Taxol (paclitaxel) therapy.
    Past: Patient had anaphylaxis after getting penicillin 10 years ago.

Adverse drug events (ADEs) are a primary focus of this project and are injuries resulting from drug-related medical interventions. These are medical signs and symptoms associated with administration of a medication. An ADE is an “injury resulting from the use of a drug”. It is an event caused by a drug at a normal dose in normal use. This includes drugs given as a single dose, prolonged use, withdrawal of a drug, and drug combinations. Note that you may see non-definitive language when an ADE is suspected. You must make the determination based on the words and context and on the ability to create a linkage to a drug.

Comments and Examples for Inter-Annotator Agreement

Indication

• When medications are given prophylactically or empirically, that is an indication for the drug. In the examples below, antibiotics is annotated as the drug since it is a drug class, and preoperative is the indication.
    o preoperative antibiotics
    o empiric antibiotics
    o omeprazole for GI prophylaxis
• Note that prophylaxis is an indication, but it is often associated with more specific information such as GI prophylaxis, DVT prophylaxis, PEP prophylaxis. Include this information in the annotation.
• Do not double annotate events as Indications and SSLIF. If a medication is given to treat the complaint, it is an Indication.
• Examples:
    o Paroxetine for depression.
    o a history of a remote CVA in the setting of chronic atrial fibrillation. He has been continued on warfarin therapy for thromboembolic protection and has apparently done well [drug=warfarin and there are three indications]
    o The patient was found to have a left lower lobe pneumonia and currently is on ceftriaxone and azithromycin.
[pneumonia is the one indication for both drugs: ceftriaxone and azithromycin]

Adverse event
Adverse drug events (ADEs) are injuries resulting from drug-related medical interventions. These are medical signs and symptoms that are associated with administration of a medication. Note that you may see non-definitive language when an ADE is suspected. You must make the determination based on the words and context and the ability to create a linkage to a drug.
Note: more than one AE can be linked to a drug.
Examples:
o “nonspecific ST-T wave changes consistent with digoxin effect” Drug=digoxin, AE=nonspecific ST-T wave changes
o “…she got her first Reclast infusion and later that day, she developed a fever and nausea…” Drug=Reclast, Route=infusion, AE=fever, nausea
SSLIF [other signs and symptoms] are described later, but note that suspected ADEs are annotated with the assertion possible. They are also annotated as an SSLIF (present).
o “….cirrhosis that was thought to be precipitated by oral contraceptives…” Drug=oral contraceptives (OC), AE=cirrhosis with possible assertion, SSLIF=cirrhosis with present assertion
Caution on annotation of allergies. An allergy is only considered an adverse event when it is related to taking a drug. Other reactions to food and non-prescribed substances should be annotated as a SSLIF. In the example below, the highlighted items are for the ADEs. Eggs and bee pollen would be SSLIFs.
o ALLERGIES: Multiple allergies including erythromycin, Celebrex, Darvocet, eggs, Levaquin, Reclast, bee pollen and Vicodin
It is okay to include annotation of words associated with allergies such as drug, med, medical, e.g. no “drug allergies” (assert absent).

MedDRA Annotation
Adverse effects are mapped to the concepts from MedDRA. Therefore, the number of ADE and MedDRA terms should be the same for any given record.
To search for MedDRA concepts, you can use http://ummsres14.umassmed.edu/OntoSolr/browse from anywhere except the Annotation Server.
On the Annotation Server, please use this URL to search and browse MedDRA terms: http://ummsqhslxweb01.umassmed.edu/OntoSolr/browse
See the UMass BioNLP Annotation Resource Page for videos demonstrating MedDRA annotation.
Annotate adverse events with the MedDRA term that best applies to the span.
“pain in the left back and the left upper abdomen” is annotated as one span. If pain is an adverse effect, there are actually two different MedDRA matches: back pain (MedDRA:10003988) and abdominal pain (MedDRA:10000081). Use the more generic MedDRA term for pain to relate to the full span (MedDRA:10033371).
Other common ADEs include:
o Nausea MedDRA:10028813
o Vomiting MedDRA:10047700
o Dehydration MedDRA:10012174
o Pyrexia (for fever) MedDRA:10037660
Drug allergies are common adverse effects and are assigned "Drug hypersensitivity" MedDRA:10013700.
Some drug allergies are not drug hypersensitivities, for instance in the case of GI sensitivity or abdominal pain. Suppose a patient has an allergy to erythromycin that causes abdominal pain. GI pain is an intolerance and a common side effect; it is commonly put in the allergy section so the patient doesn’t get the drug again. Annotate as “Drug intolerance” (MedDRA:10061822).
Contrast media is not a drug and is not routinely annotated. However, it is a pharmaceutical and we will annotate adverse events it may cause. Use the entity “Drug”; reactions to it are an ADE.
o allergy to iodine contrast: use “Iodine allergy (MedDRA:10052098)”
o contrast allergy: use “Contrast media allergy (MedDRA:10066973)”
Note that this is different from “Contrast media reaction (MedDRA:10010836)”.
Drug allergy is considered an Adverse Event in the past if a specific drug is mentioned (e.g. ALLERGIES: IMDUR). The allergy is assigned "Drug hypersensitivity" MedDRA:10013700 and "History" value to the Period Assertion and “Present” to the Presence Assertion.
Note that in spans like (no) drug allergies, drug allergy is annotated as SSLIF and not as Adverse Event since no specific drug is mentioned.
Multiple drugs are given and an AE is noted. In this case these are all antibiotics that are associated with causing diarrhea. The doctor does not distinguish, so all drugs are linked to the AE.
o Vancomycin and ceftazidime were started. His pheresis catheter was not functioning and the nurses could not draw back and it was removed. Prior to his admission, he was home on levofloxacin, acyclovir, and posaconazole. It was noted that his white blood count on admission was 100 cells and has risen to 400 since that time. The patient reported some chills and the patient noted some loose stool up to about 4-5 times a day.

SSLIF and Severity

Field: Signs, Symptoms, Abnormal Test Findings, and Diseases (SSLIF)
Definition: Medical signs, symptoms and diseases that are neither adverse effects nor reasons for administering a medication.
Example: The patient has a history of COPD.

Field: Severity [an attribute of Indication, AE and SSLIF] (annotated in class navigation bar, but must be added in the right annotation window for Indication and SSLIF)
Definition: Intensity of an adverse effect.
Example:
  Eg 1: Severe headache, moderate chest pain.
  Eg 2: The PLB has 50% stenosis just proximal to a widely patent stent.

Comments and Examples for Inter-Annotator Agreement

SSLIF
When annotating a SSLIF, include the location where it is occurring if it is provided and if it can be part of a contiguous span. Locations do not need to be highly specific.
o “headache in the back of the head”
o “headache more pronounced occipitally”
Watch for general terms which can also be part of disease names and are annotated. For example, recurrent is often part of the disease name in cancers. In the example below recurrent/recurrence is used twice as a diagnosis and twice as a general term.
See Appendix 6.
o The patient remained cancer free after that for 9 years until he had laryngeal/vocal cord recurrence. Pathology returned recurrent lymphoma, natural killer cell subtype. This is now the third recurrence in the paranasal sinus area and fourth overall recurrence.
Sometimes a term that you usually think of as a SSLIF is used in a different way and is not annotated. In the example below, the patient does suffer nasal discharge because of the lymphoma of the sinuses. Here, however, nasal discharge and respiratory sputum act like test materials, not symptoms.
o There were never any positive results on the cultures, either from nasal discharge or the respiratory sputum

Severity
Severity is often indicated by modifying words such as mild, minimal, markedly, severe, endstage, small, extremely, substantial.
Severity terms can also be phrases such as: borderline to slightly high, moderate-to-severe.
o He is rather diffusely tender to palpation
  Here ‘rather’ can be a severity meaning ‘to some degree’
o Middle turbinate was severely inflamed
Note that some terms you might consider as severity are actually part of the disease name and are not annotated. For example, large is part of the name “large B cell lymphoma”; similarly “Chronic Heart Failure (CHF) exacerbation”.
Disease stages are specific indicators of severity and should be annotated as severity. A relation must be created between a severity assertion and the SSLIF.
o a history of natural killer/T-cell lymphoma <...> At that time, the patient had radiation for stage I disease
In medical use, the following terms can be a disease name or a temporal indicator. These words are not severity: acute, chronic, acute-on-chronic, flare. Annotate the example below as follows:
o significant flares of fibromyalgia
We only annotate numbers or references to quantity as severity when there is a frame of reference or scale such as %, mm, X out of Y, etc. Annotate where you can understand the meaning, otherwise it is diagnosing.
EXCEPT if it is associated with an ADE: these events will be manually reviewed to assign CTCAE severity scores.
o fever greater than 100.5 - fever is annotated as an entity but the temperature is not annotated as severity
Multiple and innumerable would not be annotated in the examples below.
o innumerable 1mm non-blanching papules
o multiple 1mm papules
Annotate pain in his back twice to create a relation to each: (1) severe and (2) “10 on a scale of 1-10”. 8 is not annotated since in isolation there is no scale.
o Remarkable for the aforementioned severe pain in his back which she states is, without pain medicine, 10 on a scale of 1-10. At the moment, it is down to 8.
“Some” and “somewhat” are terms that can be used as descriptors of severity or as quasi severity quantifiers. “Some pain" means approximately "mild pain." Some is vague but it is a description of severity, i.e. not very severe, and can be useful.
o Don’t annotate when used as a quantifier such as “some blisters”
We do not annotate words that indicate severity dynamics such as worsening, increasing, or decreasing (we do capture decreased and increased as severity). Use your judgement.
The word "frayed" is sometimes annotated as SSLIF, sometimes as severity; it is context dependent.
In the example below, breakthrough is not a modifier. It is referring to severity and is annotated as such.
o Breakthrough pain is both sudden and severe
A severity term may refer to more than one entity. Make a relation to each applicable entity.
o Severe rash and redness. In this example, severe has relations to both rash and redness.
o Large mass measuring 5cm. Large and 5 cm are both severities for mass.
In general avoid redundancy in severity annotation. For example: for “some mild”, annotate only mild. Sometimes a second word modifies the first and it is necessary to capture both. For example, very slight shows less severity than simply slight.
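The rule above that a number counts as severity only when it comes with a frame of reference can be approximated as a rough pre-check. The sketch below is illustrative only and is not part of the annotation tool: the function name `has_frame_of_reference` and the pattern list (%, mm, cm, "X out of Y", "X on a scale of A-B") are assumptions made for this example, and it does not handle the ADE exception or any contextual judgement.

```python
import re

# Hypothetical patterns standing in for "a frame of reference or scale".
# The guideline lists %, mm, and "X out of Y" and ends with "etc.";
# cm and "on a scale of" are added here because they appear in the examples.
_NUM = r"\d+(?:\.\d+)?"
_SCALE_PATTERNS = [
    rf"{_NUM}\s*%",                                          # 50%
    rf"{_NUM}\s*(?:mm|cm)\b",                                # 5cm, 5 cm
    rf"{_NUM}\s+out\s+of\s+{_NUM}",                          # 7 out of 10
    rf"{_NUM}\s+on\s+a\s+scale\s+of\s+{_NUM}\s*-\s*{_NUM}",  # 10 on a scale of 1-10
]
_SCALE_RE = re.compile("|".join(_SCALE_PATTERNS), re.IGNORECASE)


def has_frame_of_reference(text: str) -> bool:
    """Return True if the text contains a number with an explicit scale or unit."""
    return _SCALE_RE.search(text) is not None


if __name__ == "__main__":
    for phrase in ["50% stenosis", "10 on a scale of 1-10",
                   "it is down to 8", "fever greater than 100.5"]:
        print(phrase, "->", has_frame_of_reference(phrase))
```

A check like this could only flag candidate spans for an annotator to review; whether a flagged number is actually a severity still depends on the context rules described in this section.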
Choosing A Span

When SSLIFs are written as a comma separated list (often in a review of symptoms), it may be tempting to annotate the list as one span, but it should be separated into individual terms.
o “no murmurs, gallops or clicks”
Similarly, separate terms with a / between them, as in “rash/blistering”.
Annotate locations as part of a span. Annotate the accurate span even if it is long, and include coordinated locations. For example:
o “fibromyalgia causing pain in the neck and paracervical region and down the arm” This is a long span but the pain is in both the neck and arm.
o “adenopathy in the supraclavicular or axillary regions”
Annotate the location as part of a span when it is contiguous (a single span without breaks). Otherwise do not annotate location terms that are distant from the S/S.
o pain of the lesion on the right shoulder
o swelling on the right shoulder. It is in the anterior aspect of the shoulder….
o pustular collection underneath the end of the nail about 5 days ago. It is her right middle finger.
If two SSLIFs occur in the same location, these are annotated separately, and only one will include the location.
o pain and swelling of his leg
In general we do not include prepositions except where they provide meaning or create a contiguous span to include locations and coordinated locations relative to a SSLIF. See the section on Prepositions for examples.

Assertion Categories [where assertions are for the highlighted entity]

Outcome Assertions (default notMentioned)
Annotate the Outcome field for adverse events where possible. It has four values: recovered, not completely recovered, died, not mentioned (most common). The default value for this field is notMentioned.
You may see an AE mentioned early in the note and much later the outcome is said to be resolved by drug X in the same note; it is okay to put outcome as "resolved." If this is longitudinal information from another note, you do not annotate in the current note.
Only annotate what is in the note you are working on and only annotate outcome if it is specified in the narrative.
If an adverse event's Assertion is Absent, the Outcome field is not annotated.

Period Assertions (default current)
Annotate temporal information for several entities: Adverse effect, Indication and SSLIF. This is done using the Period attribute. Period values are: current and history. The default value for this field is current.

Current
Current usually refers to the complaint being referred to in the visit. This is usually described in the “History of Present Illness” section.
Note: this section is often in the past tense since (1) the visit is to discuss an event that has already occurred and/or (2) the doctor is writing notes after having seen a patient, so even the current visit is discussed in the past tense.
Caution: this section will often also contain references to the past and future.
Examples:
o She comes with history of sudden episode of nausea, vomiting and diarrhea. She says she was apparently all right until the night before yesterday.
o She returned from a trip three weeks ago where she has had pain in the bottom of both feet
If the focus is on an ongoing symptom or disease, the event is annotated as current, even if it started in the past.
o The patient describes daily headache and throughout most of her life
o She continues to have constant pain
"Known" or "chronic" symptoms or diseases are annotated as current if there is no context favoring history.
o Patient with known ulcerative colitis who presents with lower gastrointestinal bleeding.
o Patient with chronic respiratory insufficiency
Symptoms or diseases in the Family History section are current by default if there is no context favoring history.

History
History markers are: "history of", "old", "past", "in the past", "prior", "status post", reference to a date or prior period, etc.
o She has a history of fibromyalgia.
o She did have evidence of old ischemic changes.
o In the most recent past, cervical cytology in [Date, 2 years from the visit date] revealed LGSIL and was HPV negative.
o NHL diagnosed in 2005.
o head injury as a child
o She never had another panic attack
Symptoms or diseases in the “Past Medical History” section are annotated as history if there is no context favoring current.
o PAST MEDICAL HISTORY: Fibromyalgia, hypertension, hyperlipidemia, sleep apnea, degenerative joint disease.
o PAST MEDICAL HISTORY: Paroxetine for depression
Annotate what is in the record you are viewing. Do not make assumptions or consider longitudinal information you may know. The purpose of annotation is machine learning. Patterns will be learned.
“he no longer has the abdominal pain that he originally presented” is annotated as SSLIF for abdominal pain with the assertion of absent (vs. history, absent).
Section headers such as PAST MEDICAL HISTORY and ASSESSMENT/PLAN provide context for “history” or “current”, although current is the default value and specific annotation for Period is not required.
History is heterogeneous and can mean both what happened in the past but stopped and what happened in the past and is still happening. You may see in PAST MEDICAL HISTORY things like hyperlipidemia, hypertension or a medication. In this case, annotate those SSLIF as “history”. In HISTORY OF CURRENT ILLNESS, it is “current”. In the ASSESSMENT/PLAN, the patient is still actively being treated with the medication for hypertension or hyperlipidemia. Annotate the SSLIF and the default is current. This is common for chronic diseases such as cancer, cardiovascular disease or diabetes.
Examples of using context:
o The patient is not cured, but annotate as history because of the context of the note: “Status post stage II nodular sclerosing Hodgkin's disease.” (period assertion = history)
o PAST MEDICAL HISTORY She also has hypercholesterolemia, irritable bowel syndrome, arthritis, dry eyes.
[even though it is in the PMH section, annotate as current because the verb is in the present tense]
o FAMILY HISTORY: Mother with a history of rectal cancer [follow the cue of the word history even though it is in Family History, which is often annotated as current]

Presence Assertions (default is present)
A presence assertion (modality) expresses a speaker’s degree of commitment to the believability, obligatoriness, desirability, or reality of the expressed proposition. Ascribe assertion values to medications and diseases, namely, to “drug”, “adverse effects”, “indication”, and “other signs, symptoms and diseases” entities. Options are below, but NOTE that past tense also requires inclusion of the Period Assertion “history”:
o Present: entities are or were present
o Absent: entities are or were not present
o Possible: any level of uncertainty about whether an entity is or was present or absent
o Conditional: entity occurs or occurred only under certain conditions
o Hypothetical: conjectures, based on a suggested idea or theory and often in if/then scenarios
o NotAssociatedWithPatient: entity occurs in relation to (a) a family member or (b) the general population

Presence (default)
Presence (Present/Positive) means that entities associated with the patient are/were there; exist/existed. In our annotation, the positive value ‘present’ is the default value, i.e. if an entity does not have any assertion value ascribed, it means that the value is positive/present. The drugs the patient receives are also annotated as Present.
Helpful Hint: It is very easy to confuse the Presence Assertion “Present” and the Period Assertion “current”.
Examples:
o a female patient died while receiving Taxol (Paclitaxel) therapy for the treatment of endometrial cancer [Drug and Indication are both Present]
o The patient had a history of hypertension
o She is on oxycodone 10mg for pain
*Replaced, held or discontinued drugs are annotated as “positive (present)” and not as “absent”, since they used to be taken vs. never taken.
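Because Presence and Period are separate fields with separate defaults, it can help to picture them as two independent attributes on each entity. The sketch below is an illustration under assumptions, not the annotation tool's actual data model: the class names `Presence`, `Period`, and `EntityAnnotation` and the field names are invented for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Presence(Enum):
    """Presence assertion values listed above (default: present)."""
    PRESENT = "present"
    ABSENT = "absent"
    POSSIBLE = "possible"
    CONDITIONAL = "conditional"
    HYPOTHETICAL = "hypothetical"
    NOT_ASSOCIATED_WITH_PATIENT = "notAssociatedWithPatient"


class Period(Enum):
    """Period assertion values (default: current)."""
    CURRENT = "current"
    HISTORY = "history"


@dataclass
class EntityAnnotation:
    """Hypothetical container for one annotated span."""
    span: str
    entity_type: str  # e.g. "Drug", "Indication", "AdverseEvent", "SSLIF"
    presence: Presence = Presence.PRESENT  # unannotated means present
    period: Period = Period.CURRENT        # unannotated means current


# "The patient had a history of hypertension": Present, but Period is history.
hypertension = EntityAnnotation("hypertension", "SSLIF", period=Period.HISTORY)

# "She is on oxycodone 10mg for pain": both defaults apply to the drug.
oxycodone = EntityAnnotation("oxycodone", "Drug")
```

Keeping the two fields separate makes the helpful hint above concrete: an entity can be Present without being current, as in the hypertension example.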
Comments and Examples for Inter-Annotator Agreement

Drugs are always annotated as present since they are being taken or used to be taken vs. never taken.
o At this point in time, he does not require any more antibiotics.
o she has since been discontinued on digoxin
o His enalapril was changed to lisinopril
o His aspirin was held
In the example below, the entity “anaphylactic shock” is ascribed a presence/present/positive value, since it did occur. It is only its relation to Taxol that is questioned or negated, and we do not ascribe assertion to relations in the current schema.
o The anaphylactic shock was possibly related to Taxol (relation between Taxol and anaphylactic shock is Adverse)
o The anaphylactic shock was most likely related to Taxol (relation between Taxol and anaphylactic shock is Adverse)
o The anaphylactic shock was not related to Taxol (no relation annotated between Taxol and anaphylactic shock)
Even if certain symptoms were identified as being part of another symptom and deleted from the file or renamed, they still did exist and need to be annotated as positive (default).
o The second episode of malaise, loss of consciousness, undetectable pulse, and tension were identified as being part of shock. Since these were considered manifestations of the shock and anaphylactoid reaction, the previously reported separate events of dyspnea, malaise, abdominal pain, and erythema have been deleted from the file
o Supplemental information received from the reporter via BMS Japan on January 15, 2002 indicated that the events dyspnea, blood pressure decreased and facial hot flushes were changed to anaphylactic shock
Narratives can be vague. Here the use of “not much” is questionable, but “not much” is not “none”, so the assertion for highlighted entities is present:
o He has no complaints of pain, dyspnea, or dysphagia. He really has not had much crusting or discharge from the nose, either.

Absence
“Absent” asserts that the problem does not exist in the patient.
Also annotate drugs the patient did not receive as Absent.
Examples (entity requiring the assertion is highlighted):
o no known drug allergy
o the patient denied any dizziness, shortness of breath…
o Without syncopal episodes
o The patient currently is pain free
o There were no clinical signs of congestive heart failure
o She is not a candidate for anticoagulation
o The patient had no fever
o No antibiotics were given
Do not annotate the outcome of “absent” adverse events. (If you have no adverse event, you have no outcome.)
Create relations between “absent” adverse events or indications and drugs the same way we relate “present” ones.
o He receives no chemotherapy for his lymphoma

Possible
“Possible” asserts that the patient may have a problem, but there is uncertainty expressed in the note. This assertion covers the range of possibilities from likely/probably to unlikely/doubtful.
Examples (entity requiring the assertion is highlighted):
o Questionable DVT
o Question of DVT
o Their differential is gliomatosis versus radiation effect.
o Possible anterolateral ischemia
o a consult was placed to rule out CAD
o Rule out congestive heart failure but doubt
o The differential diagnosis for his fever included possible inadequately pneumonia versus bacteremia versus UTI versus CSF infection
o It is not likely that histiocytic non-Hodgkin’s lymphoma would go to the kidney

Conditional
“Conditional” is used when the mention of the medical problem asserts that the patient experiences the problem only under certain conditions.
o DVT prophylaxis per surgical protocol, but he should receive Lovenox bridging to Coumadin as deemed safe by surgery
o shortness of breath with exertion

Hypothetical
“Hypothetical” is used for medical problems the patient may develop; for a conjecture.
Examples:
o Should her symptoms return or headache develop, please discontinue to taper and notify Dr. **NAME[ZZZ]'s office.
o Call Dr. X if increased swelling or redness of the left lower extremity or starts to have difficulty breathing
o If nausea or bleeding develop, return to the emergency room.
o Notify MD For Any of the Following : Increased Trouble Breathing or Chest Pain, Fever Temperature Higher Than 101 Degrees, Pain Not Improved by Medication
o Vancomycin is a possibility in the future
o Please return to the ED if you experience any concerning symproms such as chest pain, dizziness, severe abdominal pain, nausea, vomiting, blood in your stools or black stools.

Hypothetical vs Conditional
If/then words are cues to a hypothetical followed by a conditional response.
o “If normal, treat with oral anti-inflammatory medications and” Here the anti-inflammatory medication is annotated as ‘conditional’.
o “Should symptoms return” would indicate using the assertion ‘hypothetical’.
In the examples below, the drug is conditional and the indication is hypothetical.
o If the Gram-stain showed gram-positive cocci in chains we will consider adding vancomycin
o If his vitamin D 1,2,5-OH is not elevated, I will start treatment with VitD

Not associated with Patient
The mention of the medical problem is associated with someone who is not the patient (family or people in general, such as in educational materials).
o Family history of prostate cancer
o Brother had asthma
o The surgical procedure and risks including infection, bleeding, blindness, meningitits, CSF leak, and brain damage were discussed
o The most common diabetes symptoms include frequent urination, intense thirst and hunger, weight gain, unusual weight loss, fatigue, cuts and bruises that do not heal, male sexual dysfunction, numbness and tingling in hands and feet.
If needed, the classification can be further detailed.
For example, a drug can be “absent” because the doctor did not recommend it or because the patient refused it; or the probability of a disease can vary from very low to very high.
Assertion values should not exclude each other. A span can be assigned two assertion values from different categories:
o no family history of diabetes: history + NAWP
o Patient’s mother has cancer: current + NAWP
o No family history of breast or ovarian cancer: NAWP and history
o No family history of skin cancer: NAWP and history
o Had cancer years ago: Present and history
It is possible to have sentences containing both present and absent information.
o Patient with chronic Hepatitis C “denies any sequela of hepatitis” is annotated as 2 spans: hepatitis is present and sequela of hepatitis is absent

Drug assertions
Drugs are usually annotated as present.
o If the drug mentioned has been or is being taken, annotate as present.
o If the drug is mentioned as never having been used, annotate as absent.
o If the drug has possibly been used but there is some question, annotate as possible.
o If the drug is mentioned in the context of being considered for future use, annotate as hypothetical.

Annotation Of Relations

Relations are connections between entities and their attributes; annotate them. See Appendix 3 for a full table of possible relations.
Dosage, route, frequency, duration, indication and adverse event are drug attributes and are related to their drugs. For example, in “Albuterol 2 puffs p.o.”, “2 puffs” is linked to “Albuterol” by the Dosage relation and “p.o.” is linked to “Albuterol” by the Route relation. Severity can be linked to Indication, Adverse Event or SSLIF.
Examples:

Drug’s Relations
Context: She receives Albuterol 2 puffs p.o. q4-6h.
Relation: Dosage (Albuterol : 2 puffs); Route (Albuterol : p.o.); Frequency (Albuterol : q4-6h)
Context: The patient was treated with ampicillin for two weeks.
Relation: Duration (Ampicillin : two weeks)
Context: He later received chemotherapy for his lung cancer.
Relation: Reason/Indication (lung cancer : chemotherapy)
Context: Patient's death was due to anaphylactic shock caused by the intravenously administered penicillin.
Relation: Adverse (penicillin : anaphylactic shock)

SSLIF/Indication/ADE attributes
Context: He has severe diarrhea.
Relation: Severity (diarrhea : severe)

ANNOTATION PRACTICE

General Considerations

DO
o DO annotate the meaning of the words in the text you are annotating. Context is important. It may be that some things are annotated differently in the same or adjacent sentences (this is likely most frequent with current and history). It may mean you are able to annotate something in one context and not another.
o DO annotate misspelled words
o DO include periods in terms that normally have them (but do not go nuts over this): p.r.n., p.o.
o DO remember to annotate negations with the Presence Assertion “Absent”. Annotate negations and negated words with the Assertion “Absent” so they will be detected in the negation algorithm. Negated word examples are: nontender, anicteric, asymptomatic. (This is counterintuitive but is a flag for machine learning.)
o DO make relevant relations, regardless of distance between terms
o DO remember the panel on the right side of the tool has fields. There is a scroll bar to see them.

DO NOT
o Do not make assumptions, do not infer.
o Do not consider longitudinal information in this workflow; annotate information in the current record.
o Do not diagnose.
o Do not annotate a patient’s mistaken beliefs when medical professional commentary is contradictory.
o Do not annotate general terms such as “problem” and “disease”. They are not informative. See Appendix 1 for more examples.
o Do not annotate procedures or tests.
o Do not annotate normal findings except when they are negated abnormalities such as non-tender and atraumatic (and then use the assertion absent).
o Do not annotate parts of words: Nontender vs nontender and non-tender
o Do not annotate parts of phrases.
For example, do not annotate “hernia” if used in the phrase “hernia repair”: hernia repair vs hernia repair or hepatitis panel vs hepatitis panel
o If we do not annotate the entity, we do not annotate its attributes either.

NOTE
o Not everything is annotatable. It is okay.
o Note that “acute” in medical parlance is a temporal term meaning sudden or rapid onset and/or a short course. The opposite is “chronic”.
o Be aware of terms that are a category of disease. In cancers, recurrences reflect a category and they are treated differently than the initial occurrence. Recurrent lymphoma would be a disease (also see other lymphoma examples in Appendix 6).
o Remember Present, Absent, and NAWP are in the same category and are mutually exclusive.
o Remember Present does not mean Current.

Anaphoric pronouns
Anaphoric pronouns are pronouns that refer back to another word or phrase. We do not annotate anaphoric pronouns like it or this in the examples below, even though these refer to entities we do annotate:
o The patient had diplopia but it was resolved completely.
o The patient had anaphylactic shock. This was caused by antihistamines.

Anaphoric Pronouns vs. Co-referent Items
Coreference is when two or more expressions in a text refer to the same thing. We do not annotate anaphoric pronouns. However, we do annotate other coreferent items.
o She fell down a flight of stairs and hit her head. The injury caused chronic seizures
o The patient presents with nausea, vomiting, and abdominal pain. The patient describes the pain as crampy in nature.
In the example below, “it was down” is meaningless on its own, and “it” is an anaphoric pronoun, so annotate just the first half.
o Blood pressure was high 180/110, rechecked it was down to 137/100.
Articles
Indefinite article "a" is not included in annotated entities: in the noun phrase a malignant tumor of the breast, the span annotated is “malignant tumor of the breast”, not “a malignant tumor of the breast.”
Definite article "the" is not included in disease names, either. In the example below the adverse effect is “anaphylactic shock”, not “the anaphylactic shock”:
o The anaphylactic shock was characterized by nausea.

Titles
Titles are study names or references to consumer or biomedical literature. Section headers (usually in CAPS) are titles. Common section headers are: PAST HISTORY, REVIEWS OF SYSTEMS, PAST MEDICAL HISTORY, PLAN ASSESSMENT, MEDICATION, ALLERGY, CHIEF COMPLAINT
However, there are cases where you may want to annotate a title. ALLERGIES and PAST CHEMOTHERAPY may be appropriate to annotate in a relation.
If an indication is the title in a list of symptoms, you can use it to associate with a drug. If the phrasing of the sentence with the drug also makes the association clear, use the title as well.
o Joint pain. He is willing to try glucosamine to help with joint pain but……
Certain adverse effect reports include a clinical trial title, for example:
o Protocol title: (NON-BMS/RETRO TAXOL) RETROSPECTIVE DATA COLLECTION TAXOL IN PATIENTS WITH SOLID TUMORS. Investigator causality assessment was not provided.
Do not annotate the drug name (Taxol) and its related information in the title, since the name of the clinical trial may include drugs that an individual patient in that clinical trial does not receive. For example, a clinical trial might have this name: "A randomized, controlled, blinded clinical trial comparing miraclecillin to wondersporin." In this trial, some of the patients got miraclecillin and other patients got wondersporin, but no patients got both.
Entities like Suspect Drug/Causality in the example “Suspect Drug/Causality: paclitaxel” are treated like titles and not annotated.
“AE” in the example below is not annotated either:

AE outcome: The patient experienced death on [words marked]

The EMR section title ALLERGIES can be annotated; it is often the only mention of an ADE with a list of drugs.

“ALLERGIES: amoxicillin and vancomycin”
Allergy is an adverse event and is linked to each drug separately.

“ALLERGIES: none”
Annotate allergies where there is no drug named as a SSLIF with the assertion “absent”.

Do not annotate Departments or Specialties (not a SSLIF, Indication or ADE). “Infectious disease” below refers to a department, specialist, or specialty. Infectious disease concerns could be a fungal infection or bacterial infections.

an infectious disease consultation was obtained
the patient was placed on iv antibiotics which will be started as per infectious disease recommendations

Prepositions

Avoid prepositions except where they provide meaning or create a contiguous span, for example to include locations and coordinated locations relative to a SSLIF.

Do NOT include prepositions in cases such as:
o When annotating duration spans, e.g. “for three weeks” or “for an unknown period of time”, we do not include “for” in duration spans.
o In the noun phrase “via intravenous drip” we only annotate the “intravenous drip” span.

DO include prepositions when prepositional phrases:
• Add meaning, such as in frequency spans
– "before/after meals" and "with meals"
– Around three weeks
– Between 1 and three weeks ago
• Complete a span to include a location
– itching of her skin
– Scaling patches on her legs bilaterally
• Create a contiguous span about the SSLIF
– pain on all movements/ pain with abduction
– range of motion is limited on internal and external rotation
– dyspnea on exertion
– Significant abnormalities of retroperitoneal lymphadenopathy
– liver is diffusely increased in echogenicity and coarsened

Test results

Commented lab results are annotated.
“3 ova and parasites being negative, Giardia being negative in a stool culture that as negative.”
Annotate these items, as they are comments asserting that the findings are absent.

Results presented in paragraph form are likely from dictation, where some lab results are just read in and some are commented upon. The section below would be annotated where there are comments as follows:

Both normal and abnormal test result lists in the form of uncommented numbers are not annotated as diagnoses:

Laboratory data showed sodium of 143, potassium 4.1, chloride 105, CO2 of 26, BUN 4, creatinine 0.7, glucose 90, calcium 9.4. White count 5.4, hemoglobin 12.7, hematocrit 27.7, platelet count 247.

We assume that if a certain test or measurement result is significantly abnormal, the diagnosis is mentioned in the text separately. For example, if a patient’s blood pressure was 180/100, the report will most likely mention “high blood pressure.”

Uncommented lab results are not annotated. “Culture” and “culture pending” are lab tests and are not annotated. When the result is positive, you would annotate it as part of a diagnosis.

Nothing is annotated in the examples below, for two reasons: we do not annotate procedures or tests themselves, and we do not annotate uncommented lab results. These may be lab results in text, but the comments need to refer to something that provides additional meaning, such as an SSLIF, Indication or ADE.

‘….took a surface culture and nothing…..’ and ‘Culture pending.’
“blood culture, no growth”

Similarly, attributes and values in text are only annotated when there is an interpretation associated with them.

The cells were cdc20 positive - This is not a SSLIF on its own; it is an attribute of a tumor.
K+ is 50 - This is a value.

Usually a list of results in a table is structured data from a report; it is not commented and is therefore not annotated.
TEST | RESULT | REFERENCE RANGE
COLOR URINE | Yellow | Yellow
APPEARANCE URINE | Clear | Clear
SPEC GRAVITY, URINE | 1.012 | 1.005-1.03
pH URINE | 6.0 | 4.6-8.0
PROTEIN URINE | Negative | Negative
GLUCOSE URINE | Negative | Negative
KETONES URINE | Negative | Negative
BILIRUBIN URINE | Negative | Negative
OCCULT BLOOD URINE | Negative | Negative
NITRITE URINE | Negative | Negative
UROBILINOGEN URINE | Normal | Normal
LEUKO ESTERASE | Negative | Negative
MICROSCOPIC ? | Not Indicated |

Similarly, do NOT annotate microorganism names when given as a test request, although they can be annotated as part of a result. So “BABESIOSIS ORDERED” would not be annotated. Depending on context, “BABESIOSIS SEROLOGY POSITIVE” would be, because it indicates a specific infection.

“Hepatitis C, genotype 1b”
Annotate genotypes, subtypes, variants, etc. when given. This may give information related to treatment decisions.

Do not diagnose or interpret lab results. You can see the result, but you cannot say whether it is an SSLIF or normal from just a number.

“LABORATORY DATA: Alpha-fetoprotein tumor marker is 4.3”

Longitudinal Information

Annotate what is in the specific record you are viewing. Do not make assumptions or consider longitudinal information you may know (temporal aspects of adverse events are annotated in a separate workflow with Naranjo scoring).

“HISTORY OF PRESENT ILLNESS: ……..history of Burkett’s lymphoma….”
The patient is within a few months of treatment, which is not a timetable to consider it cured. The patient in fact does go on to have a recurrence of the lymphoma, but here it is annotated as SSLIF and as history.

Punctuation and Spelling

If symptoms are separated by a “/” they should be annotated separately (i.e. rash/blistering is 2 spans), except when they are abbreviations and will not have individual meaning (i.e. n/v, which is annotated as a span).
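The slash rule for symptoms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_symptom_spans` and the abbreviation set are hypothetical and are not part of the annotation tool. The set holds only the n/v abbreviation named above and would need to be extended for a real corpus.

```python
# Hypothetical illustration of the symptom slash rule; not part of the annotation tool.
# Assumption: slash abbreviations that have no individual meaning are kept in a lookup set.
SLASH_ABBREVIATIONS = {"n/v"}


def split_symptom_spans(text: str) -> list[str]:
    """Return the spans to annotate for a slash-separated symptom mention."""
    token = text.strip()
    # Abbreviations such as n/v are annotated as a single span.
    if token.lower() in SLASH_ABBREVIATIONS:
        return [token]
    # Otherwise each symptom separated by "/" is its own span.
    return [part.strip() for part in token.split("/") if part.strip()]
```

For instance, `split_symptom_spans("rash/blistering")` yields two spans, while `split_symptom_spans("n/v")` yields one. Drug mentions follow a different convention and are not handled by this sketch.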
If drugs are separated by a “/” they should be annotated as one span, since they are a drug combination.

If a drug name is followed by another in parentheses, this is the brand name (generic name) and is annotated as one span, e.g. taxol (paclitaxel).

Annotate misspelled words.

Annotate words with and without capitalization.

Do not annotate wrong words. In the example below, the chemotherapy is the CHOP, and it is followed by radiation therapy, not chemotherapy:
o 6 cycles of CHOP followed by radiation chemotherapy

APPENDIX 1: Additional Annotation Examples and Information

Attending Problem: Supratheraputic INR (964.2 DRUG TOXICITY WARFARIN) (linked to Coumadin)

Resident Problem: Supratheraputic INR (linked to Coumadin)

Attending Plan: No active bleeding (absent). Was 5.0 on Friday per patient. Hold Coumadin. Type and Cross for possible FFP. Consent for blood products. Recheck INR in am. If increasing (Hypothetical), would administer Vit K (conditional, link to INR in am)... subcutaneous and consider FFP if any active bleeding (hypothetical).

Resident Plan: Patient has INR of 6.6 with no evidence of bleeding (absent). Stool guaic (Absent) negative, therefore, will not reverse immediately. Plan: If guaic (hypothetical) positive, consent patient for FFP and blood products. If guaic (hypothetical) positive reverse INR. Consider vitamin K (conditional) If actively bleeding (hypothetical) FFP. check CBC in am Hold coumadin.

SSLIF

“Sit hunched over…” This is a sign of the back pain being described.

“started to have some very formed stools from diarrheal stools” A “formed stool” is a normal function, and generally not annotated. However, it may be relevant to annotate in context.

Relapse itself is not a SSLIF.

Be aware that Hodgkins Disease names are easily confused with severity [classical, nonclassical, high-grade (3 types), low-grade (3 types)].

Nausea and vomiting are separate symptoms that are often seen together. Annotate each term separately, but if the abbreviation n/v is used, annotate as one span.
Similarly, annotate n/v/d as one span for nausea/vomiting/diarrhea.

“renal function has deteriorated” is annotated as a span, since the deterioration is part of the description of the SSLIF and is not a severity dynamic per se.

Do not annotate “normal”; it is not a sign, symptom, or disease.

Locations associated with two SSLIF will be handled in post processing. Annotate as follows: the right lung has some rhonchi and wheezes in it.

We do not annotate locations or organ systems as entities, so you can’t mark them as absent. The following would not be annotated:

No change in his general medical status
Medical status is unchanged
no vision changes
change in voice quality
Speech and affect are unchanged
cardiovascular: no change in status
Cardiovascular: No established abnormality. <…>
Gastrointestinal: No established abnormality.
Genitourinary: No established abnormality.
Musculoskeletal: No established abnormality.
Neurologic: No established abnormality.
no abnormality identified in hypopharinx

Chemotherapy

Chemotherapy can be tricky because regimen names often contain information about drug, dosage, and duration.

He received CHOP chemotherapy
CHOP chemotherapy for 6 cycles

“CHOP chemotherapy” is a span because it is a drug regimen and CHOP names the four drugs used in combination. It also provides information about duration. Cycles are the number of times the treatment is repeated at a specified time interval. See http://www.lymphomation.org/chemo-CHOP.htm

Radiation recall – an ADE from the combination of radiation and chemotherapy.

“He tolerated the conditioning regimen well, although he did seem to have a sinus pain reaction during the BCNU infusion on the first day.
This pain was felt to be infusion reaction, although there was concern that there might have been a radiation recall”

Annotate sinus pain as the adverse reaction to the BCNU with the assertion as “possible.” Radiation recall is annotated as OSSLIF.

Difficult Examples

Indication to Drug Link Followed by Logic in How to Annotate:

/ID The patient developed febrile neutropenia 1/27 and blood cultures from that day revealed pseudomonas and Strep viridans. Urine cultures showed 35000 colonies of E. faecalis. Cefepime (1/27-2/27) was initiated (synercid was not used due to a history of adverse effect: myalgias) in addition to the acyclovir and diflucan (stopped 2/5) which were started earlier for prophylaxis. Daptomycin was started on 1/29 for VRE. Caspofungin was started 2/7, 2 days after diflucan was stopped as fevers persisted. The patient remained afebrile for several weeks until 2/27 at which point Cipro and meropenem were initiated, though no microbes were initiated on culture. The patient remained on acyclovir, caspofungin, ciprofloxacin, daptomycin, and meropenem (d/c'd 3/10), for the rest of this hospitalization, and remained afebrile since 3/1. He will be discharged on p.o. ciprofloxacin (to d/c when ANC >1000), voriconazole, and acyclovir--the latter two medications to take indefinitely.

Suggested Indication to Drug Links:
Pseudomonas (Cefepime)
Strep Viridans (Cefepime)
E. Faecalis (Cefepime)
VRE (Daptomycin)

Logic: The paragraph indicates that a blood culture was done, which revealed Pseudomonas and Strep viridans. A urine culture was also done, which showed E. Faecalis. These are all bacteria. The paragraph continues by stating that Cefepime was initiated. If you research this drug you will see it is an antibiotic (Cephalosporin class), and this antibiotic is used for all 3 bacteria mentioned previously. That is why I provided the first guidance.
Secondly, the paragraph also clearly provides the following indication to drug relation: "Daptomycin was started on 01/29 for VRE". That is why I provided this second guidance. The rest of the drugs in the paragraph are mentioned, but there is no clear direction as to why they were provided.

General Terms

General terms are ones that are not informative. Judgement is required, but often descriptions with greater specificity follow the use of a general term, and annotating those is preferred. Examples of general terms are:

abnormality, complaint, complication, concern, deficit, diagnosis, difficulty, diffuse, disability, disease, drug, illness, issue, medication, problem, supplement, therapy, treatment

Use your judgement. There are cases where it is appropriate to annotate these words, and you need to decide based on context.

Examples: arthritic issues, eye problem, cardiac abnormality

“supplement” may be required to add meaning, such as in "vitamin D supplement", "iron supplement", "herbal supplement".

“Reaction” is not a general term. It is often associated with an AE.

Double Annotation

When more than one frequency is associated with a drug, double annotate the drug and create an association with each frequency.

Brand name (generic name) is one span for a drug name. Sometimes discharge notes have a medication table, and often the generic is a line below the brand name. You will need to double annotate, associating each drug name with the attributes.

Signs and symptoms that are suspected ADEs are annotated as an ADE with the assertion possible. They are also annotated as an SSLIF (present).

Large mass measuring 5cm.
“Large” and “5 cm” are both severities for mass.

Not Clear

Sometimes there is no good way to annotate a phrase (bolded) because the meaning is unclear. In the example below, is this 2x/day or 2 capsules? “Daily” is the frequency, but “2” can be part of the dose or part of the frequency. Do not annotate what you are not sure of unless associated with an ADE. Report this problem.
MEDICATIONS: Aleve 220 mg 2-3 tablets p.r.n., hydrocodone/acetaminophen 5/325 p.r.n. for pain, omega-3 at 1000 mg capsules 2 daily and tramadol 50 mg p.r.n

Normal

Normal is not annotated. Examples:

sensation is diffusely intact
surgical scars

APPENDIX 2: “Protected Health Information (PHI)” and “Personally Identifiable Information”

The Privacy Rule’s Safe Harbor Method for De-identification (17).

“Under the safe harbor method, covered entities must remove all of a list of 18 enumerated identifiers and have no actual knowledge that the information remaining could be used, alone or in combination, to identify a subject of the information. The safe harbor is intended to provide covered entities with a simple, definitive method that does not require much judgment by the covered entity to determine if the information is adequately de-identified.”

1. Names; first name, last name, initials, names following titles and other indicators, login name, screen name, nickname, or handle
2. All geographical subdivisions smaller than a State, including street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code, if according to the current publicly available data from the Bureau of the Census: (1) The geographic unit formed by combining all zip codes with the same three initial digits contains more than 20,000 people; and (2) The initial three digits of a zip code for all such geographic units containing 20,000 or fewer people is changed to 000.
3. All elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death; and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older;
4. Phone numbers;
5. Fax numbers;
6. Electronic mail addresses;
7. Social Security numbers;
8. Medical record numbers;
9. Health plan beneficiary numbers;
10. Account numbers;
11. Certificate/license numbers;
12. Vehicle identifiers and serial numbers, including license plate numbers;
13. Device identifiers and serial numbers;
14. Web Universal Resource Locators (URLs);
15. Internet Protocol (IP) address numbers;
16. Biometric identifiers, including finger and voice prints;
17. Full face photographic images and any comparable images; and
18. Any other unique identifying number, characteristic, or code (note this does not mean the unique code assigned by the investigator to code the data)

APPENDIX 3: Entity and Attribute Tables

This table indicates the attribute fields shown in the Annotation Window for each Entity type. Note that some fields may be present but are not applicable. For example, Adverse is both an entity and an attribute for Adverse Effect and is duplicative.

Attribute | Entity: Adverse effect | Entity: Drug | Entity: Indication | Entity: SSLIF
Adverse | NA | Yes (>1) | No | NA
Presence (Assertion) | Yes | Yes | Yes | Yes
MedDRA | Yes | No | No | No
Outcome | Yes | No | No | No
Dose | No | Yes (>1) | No | No
Duration | No | Yes | No | No
Frequency | No | Yes | No | No
Route | No | Yes | No | No
Period | Yes | Yes | Yes | NA
Reason (Indication) | NA | Yes (>1) | NA | No
Severity | Yes | No | Yes | Yes

Notes:
Indication (Reason) is a SSLIF that is treated with a drug.
Severity has a field for assertion, but it is not used.

Key:
No – field is not present
NA – field is present but not applicable

APPENDIX 4: Routes of Drug Administration and Abbreviations

FDA Standards Manual list of routes of drug administration. For the full list, which includes FDA codes and NCI concept codes, see: (19).
http://www.fda.gov/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/DataStandardsManualmonographs/ucm071667.htm

NAME | DEFINITION | SHORT NAME
AURICULAR (OTIC) | Administration to or by way of the ear. | OTIC
BUCCAL | Administration directed toward the cheek, generally from within the mouth. | BUCCAL
CONJUNCTIVAL | Administration to the conjunctiva, the delicate membrane that lines the eyelids and covers the exposed surface of the eyeball. | CONJUNC
CUTANEOUS | Administration to the skin. | CUTAN
DENTAL | Administration to a tooth or teeth. | DENTAL
ELECTRO-OSMOSIS | Administration through the diffusion of a substance through a membrane in an electric field. | EL-OSMOS
ENDOCERVICAL | Administration within the canal of the cervix uteri. Synonymous with the term intracervical. | E-CERVIC
ENDOSINUSIAL | Administration within the nasal sinuses of the head. | E-SINUS
ENDOTRACHEAL | Administration directly into the trachea. | E-TRACHE
ENTERAL | Administration directly into the intestines. | ENTER
EPIDURAL | Administration upon or over the dura mater. | EPIDUR
EXTRA-AMNIOTIC | Administration to the outside of the membrane enveloping the fetus. | X-AMNI
EXTRACORPOREAL | Administration outside of the body. | X-CORPOR
HEMODIALYSIS | Administration through hemodialysate fluid. | HEMO
INFILTRATION | Administration that results in substances passing into tissue spaces or into cells. | INFIL
INTERSTITIAL | Administration to or in the interstices of a tissue. | INTERSTIT
INTRA-ABDOMINAL | Administration within the abdomen. | I-ABDOM
INTRA-AMNIOTIC | Administration within the amnion. | I-AMNI
INTRA-ARTERIAL | Administration within an artery or arteries. | I-ARTER
INTRA-ARTICULAR | Administration within a joint. | I-ARTIC
INTRABILIARY | Administration within the bile, bile ducts or gallbladder. | I-BILI
INTRABRONCHIAL | Administration within a bronchus. | I-BRONCHI
INTRABURSAL | Administration within a bursa. | I-BURSAL
INTRACARDIAC | Administration within the heart. | I-CARDI
INTRACARTILAGINOUS | Administration within a cartilage; endochondral. | I-CARTIL
INTRACAUDAL | Administration within the cauda equina. | I-CAUDAL
INTRACAVERNOUS | Administration within a pathologic cavity, such as occurs in the lung in tuberculosis. | I-CAVERN
INTRACAVITARY | Administration within a non-pathologic cavity, such as that of the cervix, uterus, or penis, or such as that which is formed as the result of a wound. | I-CAVIT
INTRACEREBRAL | Administration within the cerebrum. | I-CERE
INTRACISTERNAL | Administration within the cisterna magna cerebellomedularis. | I-CISTERN
INTRACORNEAL | Administration within the cornea (the transparent structure forming the anterior part of the fibrous tunic of the eye). | I-CORNE
INTRACORONAL, DENTAL | Administration of a drug within a portion of a tooth which is covered by enamel and which is separated from the roots by a slightly constricted region known as the neck. | I-CORONAL
INTRACORONARY | Administration within the coronary arteries. | I-CORONARY
INTRACORPORUS CAVERNOSUM | Administration within the dilatable spaces of the corporus cavernosa of the penis. | I-CORPOR
INTRADERMAL | Administration within the dermis. | I-DERMAL
INTRADISCAL | Administration within a disc. | I-DISCAL
INTRADUCTAL | Administration within the duct of a gland. | I-DUCTAL
INTRADUODENAL | Administration within the duodenum. | I-DUOD
INTRADURAL | Administration within or beneath the dura. | I-DURAL
INTRAEPIDERMAL | Administration within the epidermis. | I-EPIDERM
INTRAESOPHAGEAL | Administration within the esophagus. | I-ESO
INTRAGASTRIC | Administration within the stomach. | I-GASTRIC
INTRAGINGIVAL | Administration within the gingivae. | I-GINGIV
INTRAILEAL | Administration within the distal portion of the small intestine, from the jejunum to the cecum. | I-ILE
INTRALESIONAL | Administration within or introduced directly into a localized lesion. | I-LESION
INTRALUMINAL | Administration within the lumen of a tube. | I-LUMIN
INTRALYMPHATIC | Administration within the lymph. | I-LYMPHAT
INTRAMEDULLARY | Administration within the marrow cavity of a bone. | I-MEDUL
INTRAMENINGEAL | Administration within the meninges (the three membranes that envelope the brain and spinal cord). | I-MENIN
INTRAMUSCULAR | Administration within a muscle. | IM
INTRAOCULAR | Administration within the eye. | I-OCUL
INTRAOVARIAN | Administration within the ovary. | I-OVAR
INTRAPERICARDIAL | Administration within the pericardium. | I-PERICARD
INTRAPERITONEAL | Administration within the peritoneal cavity. | I-PERITON
INTRAPLEURAL | Administration within the pleura. | I-PLEURAL
INTRAPROSTATIC | Administration within the prostate gland. | I-PROSTAT
INTRAPULMONARY | Administration within the lungs or its bronchi. | I-PULMON
INTRASINAL | Administration within the nasal or periorbital sinuses. | I-SINAL
INTRASPINAL | Administration within the vertebral column. | I-SPINAL
INTRASYNOVIAL | Administration within the synovial cavity of a joint. | I-SYNOV
INTRATENDINOUS | Administration within a tendon. | I-TENDIN
INTRATESTICULAR | Administration within the testicle. | I-TESTIC
INTRATHECAL | Administration within the cerebrospinal fluid at any level of the cerebrospinal axis, including injection into the cerebral ventricles. | IT
INTRATHORACIC | Administration within the thorax (internal to the ribs); synonymous with the term endothoracic. | I-THORAC
INTRATUBULAR | Administration within the tubules of an organ. | I-TUBUL
INTRATUMOR | Administration within a tumor. | I-TUMOR
INTRATYMPANIC | Administration within the auris media. | I-TYMPAN
INTRAUTERINE | Administration within the uterus. | I-UTER
INTRAVASCULAR | Administration within a vessel or vessels. | I-VASC
INTRAVENOUS | Administration within or into a vein or veins. | IV
INTRAVENOUS BOLUS | Administration within or into a vein or veins all at once. | IV BOLUS
INTRAVENOUS DRIP | Administration within or into a vein or veins over a sustained period of time. | IV DRIP
INTRAVENTRICULAR | Administration within a ventricle. | I-VENTRIC
INTRAVESICAL | Administration within the bladder. | I-VESIC
INTRAVITREAL | Administration within the vitreous body of the eye. | I-VITRE
IONTOPHORESIS | Administration by means of an electric current where ions of soluble salts migrate into the tissues of the body. | ION
IRRIGATION | Administration to bathe or flush open wounds or body cavities. | IRRIG
LARYNGEAL | Administration directly upon the larynx. | LARYN
NASAL | Administration to the nose; administered by way of the nose. | NASAL
NASOGASTRIC | Administration through the nose and into the stomach, usually by means of a tube. | NG
NOT APPLICABLE | Routes of administration are not applicable. | NA
OCCLUSIVE DRESSING TECHNIQUE | Administration by the topical route which is then covered by a dressing which occludes the area. | OCCLUS
OPHTHALMIC | Administration to the external eye. | OPHTHALM
ORAL | Administration to or by way of the mouth. | ORAL
OROPHARYNGEAL | Administration directly to the mouth and pharynx. | ORO
OTHER | Administration is different from others on this list. | OTHER
PARENTERAL | Administration by injection, infusion, or implantation. | PAREN
PERCUTANEOUS | Administration through the skin. | PERCUT
PERIARTICULAR | Administration around a joint. | P-ARTIC
PERIDURAL | Administration to the outside of the dura mater of the spinal cord. | P-DURAL
PERINEURAL | Administration surrounding a nerve or nerves. | P-NEURAL
PERIODONTAL | Administration around a tooth. | P-ODONT
RECTAL | Administration to the rectum. | RECTAL
RESPIRATORY (INHALATION) | Administration within the respiratory tract by inhaling orally or nasally for local or systemic effect. | RESPIR
RETROBULBAR | Administration behind the pons or behind the eyeball. | RETRO
SOFT TISSUE | Administration into any soft tissue. | SOFT TIS
SUBARACHNOID | Administration beneath the arachnoid. | S-ARACH
SUBCONJUNCTIVAL | Administration beneath the conjunctiva. | S-CONJUNC
SUBCUTANEOUS | Administration beneath the skin; hypodermic. Synonymous with the term SUBDERMAL. | SC
SUBLINGUAL | Administration beneath the tongue. | SL
SUBMUCOSAL | Administration beneath the mucous membrane. | S-MUCOS
TOPICAL | Administration to a particular spot on the outer surface of the body. The E2B term TRANSMAMMARY is a subset of the term TOPICAL. | TOPIC
TRANSDERMAL | Administration through the dermal layer of the skin to the systemic circulation by diffusion. | T-DERMAL
TRANSMUCOSAL | Administration across the mucosa. | T-MUCOS
TRANSPLACENTAL | Administration through or across the placenta. | T-PLACENT
TRANSTRACHEAL | Administration through the wall of the trachea. | T-TRACHE
TRANSTYMPANIC | Administration across or through the tympanic cavity. | T-TYMPAN
UNASSIGNED | Route of administration has not yet been assigned. | UNAS
UNKNOWN | Route of administration is unknown. | UNKNOWN
URETERAL | Administration into the ureter. | URETER
URETHRAL | Administration into the urethra. | URETH
VAGINAL | Administration into the vagina. | VAGIN

APPENDIX 5: Frequency of Drug Administration and Other Abbreviations

Common frequency of drug administration abbreviations

Abbreviation | Definition
q.d. | once a day
b.i.d. | twice a day
t.i.d. | three times a day
q.i.d. | four times a day
q.h.s. | before bed
5X a day | five times a day
q.4h | every four hours
q.6h | every six hours
q.o.d. | every other day
prn. | as needed

Other Common Abbreviations

NAD – No Acute Distress
SR – Sustained Release
XL and ER – extended release
IR – immediate release

Abbreviations for Chemo regimens

ABVD
CHOP
OEPA
OEPA-COPDAC

APPENDIX 6: Reference Tables (Cardio/Cancer)

INTENSITY OF PULSE SCALE (0 – 4): (this is for your information, but we do not annotate numbers without a reference/scale)

0 indicating no palpable pulse
1+ indicating a faint, but detectable pulse
2+ suggesting a slightly more diminished pulse than normal
3+ is a normal pulse
4+ indicating a bounding pulse

The way we describe normal heart sounds is by saying “s1 and s2 present”, which corresponds to the valves closing. An S4 is an additional heart sound often heard due to a “stiff heart wall.” Some of the causes are hypertension, fibrosis, or a disease condition called HOCM. So this is a variant of the normal.
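Returning briefly to the frequency abbreviations in Appendix 5: when annotated frequency spans are later post-processed, a small lookup can turn the fixed-schedule abbreviations into an approximate number of administrations per day. The sketch below is illustrative only. The function name `doses_per_day`, the table contents, and the decision to return `None` for anything unrecognized (such as as-needed dosing) are assumptions and do not describe the annotation tool or any downstream pipeline.

```python
import re
from typing import Optional

# Hypothetical lookup built from the fixed-schedule rows of the Appendix 5 table.
DOSES_PER_DAY = {
    "q.d.": 1.0,      # once a day
    "b.i.d.": 2.0,    # twice a day
    "t.i.d.": 3.0,    # three times a day
    "q.i.d.": 4.0,    # four times a day
    "q.h.s.": 1.0,    # before bed
    "5x a day": 5.0,  # five times a day
}

# Matches interval forms such as "q.4h" or "q.6h" (every N hours).
EVERY_N_HOURS = re.compile(r"q\.?(\d+)h\.?")


def doses_per_day(abbreviation: str) -> Optional[float]:
    """Return administrations per day, or None when no fixed schedule is implied."""
    key = abbreviation.strip().lower()
    if key in DOSES_PER_DAY:
        return DOSES_PER_DAY[key]
    match = EVERY_N_HOURS.fullmatch(key)
    if match and int(match.group(1)) > 0:
        return 24 / int(match.group(1))
    # As-needed dosing and unknown strings have no fixed daily count.
    return None
```

Returning `None` instead of guessing keeps ambiguous frequency text visible to a reviewer, in line with the guidance not to annotate what is unclear.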
Non-Hodgkin Lymphoma (NHL)

1) Low-Grade Non-Hodgkin Lymphoma
Follicular Lymphoma
Chronic Lymphocytic Leukemia (CLL)
Small Lymphocytic Lymphoma
Lymphoplasmacytoid Lymphoma/Waldenstrom Macroglobulinemia
Marginal Zone Lymphoma

2) High-Grade Non-Hodgkin Lymphoma
Diffuse Large B-Cell Lymphoma
Primary CNS Lymphoma
Burkitt Lymphoma
Mantle Cell Lymphoma

3) T-Cell/Natural Killer Cell Non-Hodgkin Lymphoma
Peripheral T-Cell Lymphoma, Not Otherwise Specified (PTCL-NOS)
Angioimmunoblastic T-Cell Lymphoma
Anaplastic Large Cell Lymphoma
T-Cell Lymphoma/Natural Killer (NK) Cell Lymphoma
Hepatosplenic T-Cell Lymphoma
Enteropathy-associated T-Cell Lymphoma
Cutaneous T-Cell Lymphoma/Mycosis Fungoides

Hodgkin/Non-Hodgkin Lymphoma Classification

To avoid confusion, note that people who are not oncologists often just use the most generic names, like Hodgkin versus non-Hodgkin’s lymphoma, when describing this cancer.

Hodgkin Lymphoma

1) Classical Hodgkin Lymphoma
Nodular Sclerosis
Mixed Cellularity
Lymphocyte-rich
Lymphocyte-depleted

2) Nodular Lymphocyte Predominant

We annotate cancer stage as severity. The staging system tells you how involved the cancer is. In general, stage I is usually a tumor that is local; stage II is usually a localized tumor but larger in size or invading adjacent structures; stage III typically involves lymph nodes; and stage IV is widespread, metastatic disease. These are just general rules and they vary depending on the cancer. The reason I think this should be annotated as severity is that the stage dictates treatment. Stage I/II cancers can usually be surgically removed and have possible adjuvant chemotherapy/radiation, while metastatic Stage III/IV disease is usually treated with chemotherapy alone or with radiation. So I believe this information is clinically relevant and should be included.

When cancer drugs are given, there may be a reaction. The physician may reduce the dose or skip a week or both and may readminister.
The reaction may not reappear right away. Note that cancer patients get weaker over the course of treatment, which impacts how they react.

APPENDIX 7: Annotation Tool Notes

NLP Objectives: Current annotation informs ADE Pharmacovigilance. This includes use of current classes, MedDRA annotation, Naranjo scoring, etc. When measured against objectives, the addition of anything must be essential to this list to maintain annotation focus.

• Disease
• Drugs
• Adverse Drug Events
• Discourse relations
· Temporal relations
· Causal relations
· Contrastive relations
• Severity

Tool Use Notes

Navigating Protégé

There are three panels:

The leftmost panel is the Class Navigation Bar with the annotation schema.

The middle panel is the record window with the clinical note. You can view the complete note by using the scroll bar to the right in the panel.

The rightmost panel is for attribute annotation and comments. Note that this panel has two sections, and section sizes can be adjusted by dragging the middle bar. The upper section is for attributes. For some entities, there will be many possible attributes and a scroll bar will appear on the right. Scroll up and down to see what is available.

The comments section in the Protégé tool is free text and, as such, it is not computable. [Note: annotation of the clinical notes is to make them computable.] Therefore the best use of this field is for communication between editors and annotators.

Figure: The first panel lists the classes [1], the second panel is the medical record window [2] and the third panel is an attribute annotation window [3]. To annotate most classes, click the class in the left panel or in the fast annotate bar [4] and highlight it in the middle panel. Some additional attributes and associations [5] are made from the class panel and the annotation window. A few are made from just the annotation window, i.e. Period.

There is a website with the annotation guidelines, videos on how to use the tool, and other resources.
http://ummsres12.umassmed.edu/jt/index.php/annotation

Open Protégé and select from recent projects, or navigate to the folder with the patient file you will be annotating or editing and open it.

At the very top is a menu bar. If you select Window, you have the option to increase font size, which most people need to do.

On the third menu bar there are several tabs. At the end is a Knowtator tab. Click this to open the file if it doesn’t open automatically. A machine annotated record will appear with various colors marking it up.

Above the Knowtator tab there is a white tool bar which contains the ‘Save’ icon. It is very important to save your work when you end a session.

There is also a section for “Text source collection:”. Arrows allow you to scroll through names of patient records. The name appears at the top of the record window as “text source: date.txt” in blue. The rightmost arrows allow you to scroll between annotations of the same category within the note.

The mark-ups are visible as text highlighted in colors corresponding to classes of interest. These are described in the Annotation Guidelines, and the classes appear in the leftmost panel – the Class Navigation Bar. Most of the leftmost classes open up to subclasses, and arrows indicate which ones have further lower level classes.

A best practice is to start by scanning the text to familiarize yourself with the content. Next, read it carefully line by line and review it for 1. missing and 2. incorrect annotation mark-ups. It is particularly important to mark all PHI (Protected Health Information). These annotations will also be used to remove the personal data from the files in the de-identification process, prior to any data sharing.

To mark text, select (left click) the class type from the left Class Navigation Bar.
On first use, a drop down appears to the right of the class asking if you want to “create an annotation” or to “fast annotate.” Select “fast annotate.” This class will now appear in the “fast annotate” toolbar. On subsequent uses, you can select the class with one click directly from this bar (without the drop down). After a class has been selected for fast annotation, selecting it again in the left Class Navigation Bar gives you an additional choice to “remove” that class from fast annotate.

Once a class is selected, move your cursor to the Record Window and, using a highlight motion, highlight the term (or part of it) you would like to mark as this class. It will now be marked with the class color.

To unmark an entity, select it in the Record Window by left clicking it. Near the top of the annotation window is span edit, where you can choose clear or delete. Span edit can also shrink or enlarge an annotation.

Some relations or associations are annotated manually. Classes can be entities or attributes. A common relation is between the entity Drug and its attributes of Dosage, Route, Frequency, Duration, Indication and Adverse Event (see Appendix 2).

To annotate a relation between an entity and its attributes, left-click on the entity or entity span, then right-click on the attribute or attribute span. The attribute is highlighted in a dotted box when you have created the relation. For a drug, left-click on the drug span, then right-click on the attribute span, and continue this process for each attribute associated with the drug.

If an entity has several attributes from the same category, you must create each association separately. For example, if a Drug caused multiple adverse events, you must create each association of the Drug to an Adverse Event individually.
In other words, you must create several identical drug spans and link each one to an Adverse Event.

Confirm your work in the Annotation Window. Adjust your Annotation Panels or scroll to see all of the fields filled in; these fields will change depending on the entity.

MedDRA

MedDRA is a five-level hierarchy of medical terms:
- System organ class (SOC): most general
- High level group term (HLGT)
- High level term (HLT)
- Preferred term (PT)
- Lowest level term (LLT): most specific

Adverse events should only be assigned Preferred terms (PT). Lowest level terms are more specific, but the LLT list contains synonyms, so there is a lot of redundancy. If the search does not return a PT result but only an LLT result (e.g. Itching), double left click the LLT term and choose its corresponding PT (Pruritus).

When annotating an Adverse Event, after making the selection, you will see a pop-up window with MedDRA PT terms. Select the best option from the list. It will insert the term you highlighted into the MedDRA field. Do not choose the class MedDRA; if you do, it will insert the term you highlighted into the MedDRA code field regardless of possible term matches, and the term is not a code.

If there is no meaningful match, you can search for a synonym. For example, “expire” does not produce a result but searching on “death” will. Sometimes a fairly generic term is the best choice.

To manually enter a MedDRA code, left click on the Adverse Event. In the annotation window, click the gray box in the MedDRA code field. Click the square with a superscript + in the ConceptCode field and in the Term field to manually add information.

Follow this link to browse the most current version of the MedDRA ontology from anywhere except the Annotation Server (i.e. outside the secure environment):
http://ummsres14.umassmed.edu/OntoSolr/browse

On the Annotation Server, please use this URL to search and browse MedDRA terms:
http://ummsqhslxweb01.umassmed.edu/OntoSolr/browse

To review, change or just see the possible MedDRA matches for an AE again, left click on the AE. In the MedDRA Code field, pick the diamond in a star and the pop up should reappear by the term.

To check on a MedDRA annotation, double click the grey square in the MedDRA Code field. The annotation window will refresh: the MedDRA “instance” will be at the top and the annotated MedDRA values will appear. The MedDRA code will be in the ConceptCode field, the MedDRA Code field will be empty, and the term you selected will be in the Term field.

Note: if you have added MedDRA terms correctly, the number of adverse event terms annotated should equal the number of MedDRA terms annotated.

Attributes

Assertion, Period, Outcome

Default Values

There is no need to annotate default values for attributes.

Assertion = Present
Period = Current
Outcome = notMentioned

When annotating a span, or returning to an annotation, select the term or span to mark with an assertion and the fields will appear in the Annotation Window. Click the “Add Instance” icon (the square with the superscript +) and select the appropriate option.

Assertion: Present, Absent, Conditional, Hypothetical
Period: Current, History
Outcome:

When you are done, be sure to SAVE your work by clicking on the Save icon.

APPENDIX 8: Annotation and Tooling Notes Prior to ADE

Deviations from i2b2 Guidelines

Conditional: “Conditional” is used when the mention of the medical problem asserts that the patient experiences the problem only under certain conditions.

He got 1 day of voriconazole for possible presumed aspergillosis, but given that he was improving on the other antibiotics and his CT was not consistent with aspergillosis and he was no longer on immunosuppression, it seemed like a less likely diagnosis.
His urine and blood cultures were all negative. Given these findings that presumed diagnosis is community-acquired pneumonia, he will complete a 10-day course of azithromycin and Omnicef. The patient has been instructed to return if his fevers or cough worsen and he gets worsening shortness of breath as this may indicate that the patient has a recurrent aspergillosis.

We have not come across examples of the Conditional value in our corpus so far. We believe that the i2b2 examples of the Conditional value fall into the “Present” category. For example, “dyspnea on exertion” is a medical term and should be annotated as “dyspnea on exertion” with the Present assertion value (not as just “dyspnea” with the Conditional assertion value).

Prepositions: We do not include prepositions when annotating duration spans. For example, in “for three weeks” or “for an unknown period of time”, we do not include “for” in the duration span. Here we differ from i2b2, where the preposition “for” is included in duration spans.

Export the Annotation to XML (see footnote 3)

To get annotations out of Knowtator, use the XML export. Select the menu option Knowtator -> Export annotations to XML and then follow the directions. This will generate one XML file per text source in your collection. The XML format directly parallels the data model that Knowtator uses for storing annotations in Protégé. Looking at the XML files may actually help you understand how Knowtator represents annotations in Protégé.
Source: http://knowtator.sourceforge.net/faq.shtml

Footnote 3: This is old text and refers to outdated versions of the Knowtator plugin, MedDRA, and i2b2, but we wanted to retain this historical information.

Table: Annotation Tools and other Resources
Software | URL | Document
Protégé | http://protege.cim3.net/download/oldreleases/3.3.1/basic/ |
Knowtator | http://knowtator.sourceforge.net/ | http://knowtator.sourceforge.net/install.shtml
MedDRA Browser | http://www.meddramsso.com/subscriber_download_tools_browser.asp | Need MedDRA131E, import the folder named MedAscii.
I2b2 Medication Annotation Guideline | http://lancet.googlecode.com/files/Preliminary.Annotation.Guidelines.7.9.pdf |

Semi-Automated Annotation with the BioNLP named entity tagger Lancet

[This section was written some time ago and it is not clear how much of it is still applicable.]

To increase annotation speed, we apply the BioNLP named entity tagger Lancet, which is trained on the annotated data, to automatically identify the named entities. An annotator then corrects the automatically labeled corpus. The corrected corpus is fed into the learner and used to train a new model. These interactive steps are repeated until satisfactory performance is reached. This section guides the oracle (here, the annotator) on how to correct the automatically labeled corpus.

After importing the NLP tool's annotations, the annotator attribute of each annotation is assigned an NLP tool name, such as Lancet UWM.

First, a default annotator is assigned. You can configure this by clicking Knowtator -> Configure -> default Annotator. This configuration does not change the attributes of any existing annotation; it applies only to new annotations.

For partially correct annotations, the annotator needs to delete the annotation first and then re-annotate. Otherwise, the correction work will not be recorded by Knowtator.

When an entity is annotated more than once, keep the correct annotation and delete the others. If both or all of them are correct, delete the extras until only one is left.

Please look through the whole article and insert any missing annotations. There is a trade-off between precision and recall of the NLP tools. Here, we prefer a high precision annotation.

APPENDIX 9: Summary - Annotation Processes/Tooling Changes/Post Processing Notes for the ADE Pharmacovigilance Project

1. Reviewed PHI and PII requirements. Significantly streamlined PHI annotation and made all PHI markings the same color for a simplified view.
2.
We now have access to MedDRA releases, and the terms used in annotation are now updated with each release. MedDRA annotation has been brought into the annotation tool, a major time saver. Upon selecting an adverse event, a MedDRA pop-up window shows the top 10 matches to the selected term or span. Selecting a MedDRA term from the pop-up window populates the MedDRA fields with term and concept codes. Manual annotation of MedDRA terms is still possible, and an updated browser on the virtual machine makes searching easier.
3. Inter-annotator agreement is a priority as the team expands. Established a regular meeting to compare and discuss annotations. Incorporating conclusions, rules and examples into the guidelines is now a routine process.
4. Updated the guidelines with more examples, filled in gaps, and created a section with examples specifically to aid inter-annotator agreement. There is also a history section, and another section for tracking major changes to processes and tooling.
5. Created videos demonstrating use of the Annotation Tool.
6. Added a webpage for annotation-related resources.
7. Established a workflow process and file system on the virtual machine to enable annotation, editing and other separated workflows.
8. Post annotation processing will include:
   a. assigning CTCAE (20) categories to severity annotations
   b. assigning default values for
      i. Assertion = present
      ii. Period = current
      iii. Outcome = notMentioned
   c. Negated words are marked with the Assertion “Absent”. They will be detected in the negation algorithm. Negated word examples are: nontender, anicteric, asymptomatic.
   d. Annotation of actual dosage calculated from drug dosage, frequency
   e. Drug names with XL or % in the names are still to be dealt with. Not all have the XL, and we have moved to putting the % in with the dose.
   f. Locations associated with two SSLIF will be handled in post processing. Annotate as follows:
      i. the right lung has some rhonchi and wheezes in it.
      ii. pain and swelling of his leg
      iii.
joint aches and pains
9. Do not annotate general terms such as “problem” and “disease”. See Appendix 1 for more examples.
10. Make all relevant relations, regardless of the distance between terms.
11. Multiple indications for a drug are allowed.
12. Multiple ADEs are allowed for a drug.
13. Stage of cancer is a severity. Logic: We annotated stage as severity. The staging system tells you how involved the cancer is. In general, stage I is usually a tumor that is local; stage II is usually a localized tumor but larger in size or invading adjacent structures; stage III typically involves lymph nodes; and stage IV is widespread, metastatic disease. These are just general rules and they vary depending on the cancer. The reason I think this should be annotated as severity is that the stage dictates treatment. Stage I/II cancers can usually be surgically removed and have possible adjuvant chemotherapy/radiation, while metastatic Stage III/IV disease is usually treated with chemotherapy alone or with radiation. So I believe this information is clinically relevant and should be included.
14. Numbers are not annotated as severity unless there is a scale or frame of reference, UNLESS it is an ADE.
15. You can now associate more than one span of dosage information to a drug. NLP will use it for learning and actual dosage can be determined computationally. Examples:
   carbamazepine 200 mg extended release 2 tablets twice daily
   spiriva 18 mcg inhalation capsule one daily
   Glipizide 5 mg, a half tablet daily
   Tramadol 50mg tabs 1-2 tabs every 4 hrs
16. So it is not lost: in the beginning, OSSD was used for Other Signs, Symptoms and other Diseases. It was changed to S/S/LIF, Signs, Symptoms and Laboratory Findings (as per Balaji, who put this in Knowtator). We are only annotating commented lab results.

APPENDIX 10: Entity and Attribute Diagrams

Diagram: PHI
DATE, MRN, AGE OVER 90, SSN, LOCATION, PHONE/FAX/PAGER, NAME, IDENTIFIERS, ELECTRONIC IDENTIFIERS, ST, PO BOX, STATE / ZIP CODE, COMPANY, HOSPITAL, NAMED SITE (ABBREV.
OF BUILDINGS), CITY

Diagram: MEDICATION (Drug Name)
*ADE
PRESENCE ASSERTION: PRESENT (default), ABSENT, POSSIBLE, CONDITIONAL, HYPOTHETICAL, NOT ASSOC./PT
*DOSES
*DURATION
*FREQUENCY
*ROUTE
PERIOD: CURRENT (default), HISTORY, UNK
*REASON / INDICATION
* Association in record and annotation window

Diagram: SSLIF
ADE: Yes, association in record and annotation window
PRESENCE ASSERTION: PRESENT (default), ABSENT, POSSIBLE, CONDITIONAL, HYPOTHETICAL, NOT ASSOC./PT
PERIOD: CURRENT (default), HISTORY, UNK
SEVERITY TYPE: association in annotation window only. Example: mild, markedly, severe, very, stage IB

Diagram: ADE
ADVERSE: N/A
PRESENCE ASSERTION: PRESENT (default), ABSENT, POSSIBLE, CONDITIONAL, HYPOTHETICAL, NOT ASSOC./PT
MedDRA CODE
OUTCOME: RECOVER, DIED, NOT COMPLETELY RECOVER, NOT MENTIONED (default)
PERIOD: CURRENT (default), HISTORY, UNK
REASON: N/A
SEVERITY TYPE. Example: severe, mild, markedly, very

Diagram: REASON/INDICATION
PRESENCE ASSERTION: PRESENT (default), ABSENT, POSSIBLE, CONDITIONAL, HYPOTHETICAL, NOT ASSOC./PT
PERIOD: CURRENT (default), HISTORY, UNK
SEVERITY TYPE. Example: mild, markedly, severe, very, stage X
REASON/INDICATION: N/A
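The default attribute values shown in the Appendix 10 diagrams (and listed under post annotation processing in Appendix 9) could be filled in programmatically along the following lines. This is only a minimal Python sketch under stated assumptions, not the project's actual post-processing code: the names `Assertion`, `Period`, `Outcome`, `AdverseEvent` and `apply_defaults` are hypothetical, and only the attribute values themselves are taken from the diagrams.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class Assertion(Enum):
    # Values copied from the PRESENCE ASSERTION box in the diagrams.
    PRESENT = "present"  # documented default
    ABSENT = "absent"
    POSSIBLE = "possible"
    CONDITIONAL = "conditional"
    HYPOTHETICAL = "hypothetical"
    NOT_ASSOC_PT = "not assoc./pt"


class Period(Enum):
    CURRENT = "current"  # documented default
    HISTORY = "history"
    UNK = "unk"


class Outcome(Enum):
    RECOVER = "recover"
    DIED = "died"
    NOT_COMPLETELY_RECOVER = "not completely recover"
    NOT_MENTIONED = "notMentioned"  # documented default


@dataclass(frozen=True)
class AdverseEvent:
    """Hypothetical record for one annotated ADE span."""
    text: str
    assertion: Optional[Assertion] = None  # None means the annotator left it blank
    period: Optional[Period] = None
    outcome: Optional[Outcome] = None


def apply_defaults(event: AdverseEvent) -> AdverseEvent:
    """Return a copy in which blank attributes take the documented defaults."""
    return replace(
        event,
        assertion=event.assertion if event.assertion is not None else Assertion.PRESENT,
        period=event.period if event.period is not None else Period.CURRENT,
        outcome=event.outcome if event.outcome is not None else Outcome.NOT_MENTIONED,
    )


if __name__ == "__main__":
    blank = AdverseEvent(text="rash")
    negated = AdverseEvent(text="nausea", assertion=Assertion.ABSENT)
    print(apply_defaults(blank))
    print(apply_defaults(negated))
```

Leaving unannotated attributes as `None` and filling them in a separate step mirrors the rule that annotators do not mark default values; explicitly annotated values such as Absent are left untouched.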