Annotation Guidelines

THE ANNOTATION GUIDELINE MANUAL:
EXTRACTING ADVERSE DRUG EVENT INFORMATION FROM
CLINICAL NARRATIVES IN ELECTRONIC MEDICAL RECORDS
Version 2.1
October 29, 2015
Steven Belknap, Elaine Freund, Nadya Frid, Edgard Granillo, Heather Keating, Zuofeng Li,
Rashmi Prasad, Balaji Ramesh, Victoria Wang, Hong Yu
This work is licensed under a
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Belknap, S., Freund, E., Frid, N., Granillo, E., Keating, H., Li, Z., Prasad, R., Ramesh, B., Wang, V., Yu, H. "THE
ANNOTATION GUIDELINE MANUAL: EXTRACTING ADVERSE DRUG EVENT INFORMATION FROM CLINICAL
NARRATIVES IN ELECTRONIC MEDICAL RECORDS." University of Massachusetts Medical School's Biomedical
Informatics Natural Language Processing (UMMS BioNLP) Group. Hong Yu, 29 October 2015. Web. (access date)
URL.
Table of Contents
INTRODUCTION
  General Background
  Guidelines Background
NAMED ENTITY OR ANNOTATION FIELDS*
  PHI and PII Annotation
  Medication Annotation: Drug and Drug Attributes
    Drug
    Dosage (and form)
    Route
    Frequency
    Duration
  Medication Annotation: Medication Related Entities and Attributes
    Indications and Adverse Events
    Indication
    Adverse event
    MedDRA Annotation
    SSLIF and Severity
    SSLIF
    Severity
  Choosing A Span
  Assertion Categories
    Outcome Assertions (default notMentioned)
    Period Assertions (default current)
    Presence Assertions
  Annotation Of Relations
ANNOTATION PRACTICE
  General Considerations
    Anaphoric pronouns
    Anaphoric Pronouns vs. Co-referent Items
    Articles
    Titles
    Prepositions
    Test results
    Longitudinal Information
    Punctuation and Spelling
  APPENDIX 1: Additional Annotation Examples and Information
    SSLIF
    Chemotherapy
    General Terms
    Double Annotation
  APPENDIX 2: “Protected Health Information (PHI)” and “Personally Identifiable Information (PII)”
  APPENDIX 3: Entity and Attribute Tables
  APPENDIX 4: Routes of Drug Administration and Abbreviations
  APPENDIX 5: Frequency of Drug Administration and Other Abbreviations
  APPENDIX 6: Reference Tables (Cardio/Cancer)
  APPENDIX 7: Annotation Tool Notes
  APPENDIX 8: Annotation and Tooling Notes Prior to ADE
  APPENDIX 9: Summary - Annotation Processes/Tooling Changes/Post Processing Notes for the ADE Pharmacovigilance Project
  APPENDIX 10: Entity and Attribute Diagrams
INTRODUCTION
General Background
An adverse event (AE) is an injury to a patient, and an adverse drug event (ADE) is "an injury resulting
from a medical intervention related to a drug" (1). ADEs are common and occur at a rate of 2.4–5.2 per
100 hospitalized adult patients (1–4). Each ADE is estimated to increase the length of hospital stay by
2.2 days and to increase the hospital cost by $3,244 (3,5). Severe ADEs are between the fourth and
sixth leading causes of death in the United States (6). Significant healthcare savings could be realized
through prevention of ADEs and through early detection and mitigation of ADEs (5,7,8). When a
clinician recognizes an ADE, a hospital system typically prompts an appropriate response, such as
discontinuation of the drug, adjustment of dose, administration of an antidote (e.g., blood transfusion,
antihistamines, antiarrhythmics, or intravenous fluid resuscitation), or other action. While particular
instances of ADEs may be recognized and appropriately ameliorated, these events are often not coded
in diagnostic or billing fields of the medical record and are therefore “lost” to pharmacoepidemiologists,
regulatory agencies, and clinicians. One result of this loss is a paucity of high-quality information that
can lead to errors in assessment of toxicity from cancer drugs (9). The lack of timely and accurate ADE
information has led to confusion for patients and prescribers, especially when the FDA takes regulatory
action (10) that appears to be inconsistent with the available data, as recently happened with
clopidogrel (11).
Studies have shown that the occurrence of an ADE is often buried in the EMR narrative (e.g., (12)). The
ADE is not separately recorded as a diagnosis code or in other structured fields and is therefore
difficult to detect and assess. Detecting it requires manual abstraction of data from
discharge notes and other unstructured text, which remains a significant impediment to progress in
pharmacovigilance research. Rapid, accurate, and automated detection of ADEs in any patient
population would provide significant cost and logistical advantages over manual ADE detection (e.g.,
chart review or voluntary reporting) (13). Consequently, robust biomedical natural language processing
(BioNLP) approaches that accurately detect ADEs in EMR narratives would be of great interest to other
pharmacovigilance researchers and also would have potential application in clinical settings.
Current projects utilizing EMR annotation involve clinical narratives from cancer, cardiovascular and
diabetes patients.
Guidelines Background
These guidelines are being used to annotate patient Electronic Medical Records (EMRs), which will be
made publicly available as a corpus with high-quality annotation of ADEs. This corpus will also be used
to train an innovative NLP system that is part of a pharmacovigilance toolkit. The toolkit will be
integrated into the open source translational research platform i2b2 (14), so these annotation guidelines
generally align with the i2b2 (14) guidelines. The annotation objectives are to identify relevant
named entities (diseases, medications and ADEs); the discourse relations (e.g., causal, temporal and
contrastive relations) between them; and severity and the elements of the Naranjo method for assessing
causality.
The annotation tools use Protégé with the Knowtator plugin (15) and
incorporate HHS PHI and PII terms, the Naranjo scoring system (16) and
MedDRA (17) terms in the user interface. The guidelines have been
developed iteratively during use and with input from experts across many domains.
The guidelines and tooling will continue to be developed and refined throughout
the annotation process and as research progresses.
Short videos demonstrating use of the annotation tooling are available (you
may want to use another browser if the links do not open in IE). Alternatively
you can go to the UMass BioNLP Annotation Resource Page:
1 Getting Started - Annotation
2 Annotation Tool Orientation
3 First Annotation PHI
4 Spans and Corrections
5 Relations Annotation
6 Adverse Events and MedDRA
7 More on Attributes
In brief, you will open a record in the annotation tool and it will look similar to the picture below. The first
panel lists the classes [1], the second panel is the medical record window [2] and the third panel is an
attribute annotation window [3]. To annotate most classes, click the class in the left panel or in the fast
annotate bar [4] and highlight it in the middle panel. Some additional attributes and associations [5] are
made from the class panel and the annotation window. A few are made from just the annotation
window, i.e. Period.
There is a website with the annotation guidelines, videos on how to use the tool, and other resources.
http://ummsres12.umassmed.edu/jt/index.php/annotation
NAMED ENTITY OR ANNOTATION FIELDS*1
IMPORTANT CONVENTIONS NOTE:
• ENTITIES AND ATTRIBUTES ARE UNDERLINED IN THE SAME COLOR IN WHICH THEY ARE HIGHLIGHTED IN THE
ANNOTATION TOOL.
• YELLOW HIGHLIGHTED TERMS AND SPANS IN THIS DOCUMENT INDICATE THAT THESE ARE ANNOTATABLE,
BUT ARE NOT AN INDICATION OF CLASS TYPE. THEY ARE ALSO THE TERMS TO WHICH AN ASSERTION
APPLIES. ON RARE OCCASIONS SPECIFIC CLASS COLOR HIGHLIGHTS ARE USED FOR CLARITY.
PHI and PII Annotation
To implement the Health Insurance Portability and Accountability Act (HIPAA) (18), the Dept. of Health and
Human Services published a national standard for the electronic exchange, privacy and security of
health information. The “Privacy Rule” protects all individually identifiable health information
transmitted in any form and calls this information “Protected Health Information (PHI)” and “Personally
Identifiable Information (PII).” There are 18 common identifiers associated with PHI and PII, which
must be removed to de-identify data for use or release. These include name, address,
date, Social Security Number, etc.; the complete list is in Appendix 2, which describes PHI
datatypes. PHI is annotated both to build named entity recognition in NLP and to support removal during
de-identification. How the PHI classes are to be used is described below.
PHI: You can use the general class for marking something that you know is
PHI but which does not clearly fall into the categories below.
Date: This class covers all aspects of date (except year) directly related to an
individual, including birth date, admission date, discharge date, date of death.
Age over 89: Another date identifier applies to all ages over 89 and all
elements of dates (including year) indicative of such age, except that such
ages and elements may be aggregated into a single category of age 90 or
older.
Medical Record Number: Use this class to include medical record numbers,
health beneficiary plan numbers and account numbers of any type.
1 * Classes are underlined in the color used as a highlighter in annotation
Social Security Number: self-explanatory
Location: It will be valuable for machine learning to annotate address with some granularity. Most of
these location identifiers are self-explanatory but Named Sites would include things such as Hospitals,
Universities, Organizations, named buildings, Landmarks, etc. Annotate the full name of the location. If
you are not sure, you can use the less granular class of Location (or PHI) to annotate.
Telephone/FAX/Pager: self-explanatory
Name: All aspects of any name are to be annotated, first name, last name, initials, names following
titles and indicators, nicknames, logins, handles.
Identifiers: This class covers certificate/license numbers; vehicle identifiers and serial numbers
including license plates; device identifiers and serial numbers; and biometric identifiers (which would be
mostly images and we almost surely will not see that type of data).
Electronic Identifiers: e-mail, web sites, IP addresses, username and password
Comments and Examples for Inter-Annotator Agreement
It is extremely important to annotate PHI. It is the tag that will be used in the de-identification process,
so it is essential that you annotate PHI as something, even if you are unsure of the class or the class may be incorrect.
You may use general classes such as PHI or Location if you need to, but do NOT leave PHI
unannotated. Use PHI when there are mixed data types, such as a number-date combination following
accession.
Pre name indicators such as Dr., Mr., Mrs., and Ms. do not need to be removed. They are used as
tags for name PHI filters. Similarly, you do not need to annotate post name tags (Sr., Jr., III, M.D.,
Ph.D.).
Year is not considered PHI. “2011” does not need to be annotated in the example “She was
diagnosed in 2011.” However, if a standalone year is pre-annotated, there is no need to remove the
annotation.
Week is not PHI when it appears in a context such as “week 3 of CHOP chemotherapy”.
Day of week is not PHI if it is not informative, for example:
o Please call the wound clinic to set up an appointment on Monday or Tuesday
o Hospital Day #1
o Upon questioning the patient that he did wear Crocs without socks for a portion of time
last Sunday during a very hot day
Do not separate a date phrase. May 18, 1997 is annotated as a single date span.
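The date conventions above (a full date phrase is a single span; a standalone year is not PHI) can be illustrated with a short Python sketch. This is not part of the annotation tooling: the names DATE_PHRASE and find_date_spans are hypothetical, and the pattern is an assumption that only covers dates written as month name, day and year.

```python
import re

# Hypothetical pattern: month name, day, optional comma, four-digit year.
# A bare year such as "2011" has no month or day, so it never matches.
MONTHS = (
    "January|February|March|April|May|June|July|August|"
    "September|October|November|December"
)
DATE_PHRASE = re.compile(rf"\b(?:{MONTHS})\s+\d{{1,2}},?\s+\d{{4}}\b")


def find_date_spans(text):
    """Return (start, end, matched_text) for each full date phrase in text."""
    return [(m.start(), m.end(), m.group()) for m in DATE_PHRASE.finditer(text)]


# The whole phrase is returned as one span rather than as separate pieces.
print(find_date_spans("Admitted May 18, 1997 for chest pain."))
# A standalone year is not returned at all.
print(find_date_spans("She was diagnosed in 2011."))
```

Other date formats (numeric dates, abbreviated months) would need additional patterns; context-dependent cases such as day of week still require annotator judgment.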
Locations: Annotate the entire location name (i.e., include “Hospital”, “University”, “Campus”, etc.).
Annotate abbreviations for buildings as Named Sites when you know that is the reference (e.g., AS
for Albert Sherman Center); otherwise use Location. Annotate wings, rooms, corridors and hallways as
Location. If you are unsure, or the specific type of location cannot be captured in a contiguous span, use the
general Location. The key thing is to annotate any PHI in any PHI category so it is removed in
de-identification. Examples include:
o ACC, MLN, AS-X = named sites or location
o #s for wings, rooms, corridors, hallways as location, such as BB-H3
o Clinic names such as “Cancer Clinic” or “Diabetes Clinic” are not annotated, but examples
that are annotated include specific names such as: The Multiple Sclerosis Center at
UMASS Memorial Health Care, or Hahnemann Family Health Center, or Emma L.
Bowen Community Service Center
Identifying numbers used by the hospital: acct, MRN, Fi, accession…etc., should be annotated as
medical record #s (MRN).
Accession numbers - annotate # and letters in a single span.
o 123456789
o F122334455
o L1 3344556622
Some people have annotated just the numbers after the space. Either way is okay.
Patient’s identifying numbers used by the government: passport#, driver license, social security,
immigration related, etc. should be annotated as Identifier.
At UMMS the 5 digit number following a physician’s name is an identifier.
For mixed data types, use the general PHI to annotate.
Medication Annotation: Drug and Drug Attributes2
When an adverse event is recognized, a physician will discontinue the drug, adjust the dose, or
administer an antidote. Drug and drug specific attributes are important elements to annotate.
Information will be used to assess causal relations between an adverse event and drug administration.
Field: Drug name [Entity]
Definition: Substances that the patient has experienced or will experience, including drug class names or medications referred to with pronouns. The drug name must appear either in the USP published drug list or in the Orange Book.
Examples:
Eg 1: Lotensin 20 mg p.o. daily.
Eg 2: He was started on azithromycin and ceftriaxone.
Eg 3: Drug classes such as oral contraceptives, macrolides, nonsteroidal anti-inflammatory, anti-infectives.

Field: Dosage [Attribute]
- Type (Discrete/Continuous)
- Strength (Concentration/Amount)
- Form (solid, tablet, liquid, injectable, cream)
Definition: The amount of a single medication used in each administration. Quantified description of the drug administered in each administration.
Examples:
Eg 1: In the ER, the patient received heparin 4000 units bolus, then 1000 units per hour.
Eg 2: Digoxin 0.125 mg every other day.

Field: Route [Attribute]
- PO, IV, Topical, Epidural, Sublingual, Intramuscular, etc.
Definition: Method for administering the medication. For a list with abbreviations, see Appendix 4.
Examples:
Eg 1: She continues to receive antibiotics intravenously.
Eg 2: Glyburide 5 mg orally twice a day.

Field: Frequency [Attribute]
- Times a day, etc.
- Specified time of day or hours
Definition: How often each dose of the medication should be taken, including both discrete and continuous values. For a table with abbreviations, see Appendix 5.
Examples:
Eg 1: A patient was prescribed Melphalan 5mg (1 tablet) daily.
Eg 2: Labetalol 300 mg by mouth three times a day.

Field: Duration [Attribute]
- Days, weeks, months, etc.
Definition: How long the medication is to be administered.
Examples:
Eg 1: The patient received Taxol for one month.
Eg 2: Continue home medications and Flagyl 500 mg 1 tablet p.o. q.i.d. for 10 days.
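As an illustration of how one Drug entity and its attributes relate, the fields in the table above can be sketched as a small Python data structure. This is a hypothetical representation, not the format used by the annotation tool: the class names Span and DrugAnnotation and the helper span_of are introduced here for illustration, and the span boundaries chosen for the example sentence are an assumed reading of the Route example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """A contiguous stretch of text, stored as character offsets."""
    start: int
    end: int
    text: str


@dataclass
class DrugAnnotation:
    """Hypothetical record: one Drug entity plus its attribute spans."""
    drug: Span
    dosages: List[Span] = field(default_factory=list)  # several dosage spans may attach to one drug
    route: Optional[Span] = None
    frequency: Optional[Span] = None
    duration: Optional[Span] = None


def span_of(sentence: str, phrase: str) -> Span:
    """Locate the first occurrence of phrase in sentence and return it as a Span."""
    start = sentence.index(phrase)
    return Span(start, start + len(phrase), phrase)


# Assumed annotation of the Route example; no duration is stated, so it stays None.
sentence = "Glyburide 5 mg orally twice a day."
annotation = DrugAnnotation(
    drug=span_of(sentence, "Glyburide"),
    dosages=[span_of(sentence, "5 mg")],
    route=span_of(sentence, "orally"),
    frequency=span_of(sentence, "twice a day"),
)
print(annotation)
```

Keeping dosages as a list reflects the guideline that more than one dosage span can be associated with a single drug.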
2 Yellow highlighted terms and spans in this document indicate these are annotatable, but are not an indication of class
type.
You can look up drugs in the U.S. Pharmacopeia by logging in to the library at this link to the
USP site: http://library.umassmed.edu/ebooks/ebooksredirect.cfm?ID=570
Comments and Examples for Inter-Annotator Agreement
Drug – a substance used in the diagnosis, cure, mitigation, treatment, or prevention of disease.
If there are two drug names in a combination drug, capture them as one span.
o Lisinopril-Hydrochlorothiazide 20-12.5 mg oral tablet; take 2 tablets daily.
We annotate non-specific drug classes; for example chemotherapy, pain medication, anesthetic.
General and local are two classes of anesthetics so you would also annotate general anesthetic
and local anesthetic. We do not annotate the term “drug” when it does not denote a specific drug.
Brand and generic drug names:
If the drug is named along with the drug class, how you annotate it depends on the format. If the
format is “Brand name (generic name)”, annotate it as one span.
o Tylenol (acetaminophen)
Another format, sometimes seen in discharge notes, is a medication table in which the
generic name often appears on the line below the brand name. You will need to double annotate,
associating each drug name with the attributes. In this case you will annotate
o Tylenol 325 650 PO every Four Hours as well as
o Acetaminophen 325 650 PO every Four Hours
A drug name may include or be followed by a percentage (letters or numbers). These are usually part
of the drug name, but may also be the dose. Do not annotate dosage information as part of a drug
name. This will be handled in post-processing.
o Latanoprost 0.05%, one drop in the affected eye once daily
Here the drug name is followed by letters (XL) referring to an extended release form. They are annotated
as part of the drug name.
o Wellbutrin XL
We are not annotating vaccines or their attributes. If a vaccine name, or part of it, is pre-annotated,
the pre-annotation needs to be removed.
Non-drug examples
Non-drug treatment options like fluids and blood derivatives are not annotated. Blood
transfusions, fluids, normal saline and red packed cells in the examples below are NOT
annotated as drugs:
o Given multiple blood transfusions
o Pressors continued with fluids.
o He was admitted to the hospital and hydrated with normal saline.
o Pancytopenia, treated with G-CSF, erythropoetin and red packed cells
Not annotating drug
Do not annotate “drug” in the phrase “drug relationship” and similar contexts, since it does not refer to a
specific drug:
o According to the pharmacovigilance center reporter and to French methodology of
causality assessment, the drug relationship is unable to determine
o According to the pharmacovigilance center reporter and to the French methodology of
causality assessment, drug relationship is probable
Do not annotate social self-medication with alcohol, tobacco, IV drugs, street drugs, etc.
Only annotate what is medically provided for an indication, or self-medication with legally
obtained OTC drugs. [Substance abuse can be factors impacting outcomes and treatment, but
we are not annotating them here where the focus is on adverse events of prescribed
medication.]
Dosage (and form) – the amount of drug in a unit dose (physical shape of constituent substances in the
product)
You can associate more than one dosage annotation in a single span with a drug. This helps in
annotating dose and form.
If there are two concentrations written out due to the medication being composed of two drugs,
annotate both concentrations.
o Lisinopril-Hydrochlorothiazide 20-12.5 mg oral tablet; take 2 tablets daily.
For “x2 tabs a day”, annotate the span “2 tabs a day” (not “x2 tabs”).
Annotate non-specific dosages such as a small amount, low-dose, high-dose.
Percentages (and sometimes numbers) are usually part of the drug name, but they may also be the dose.
Annotate this dose.
o Latanoprost 0.05%, one drop in the affected eye once daily
Annotate concentration and form in one span where possible; otherwise more than one span can
be associated with the drug.
o 0.5mg 1-2 tablets
o Amlodipine 10 mg oral tablet take 1 tablet by mouth every day
Examples of items that may be annotated as dosage are cream, enteric coated, patch and chewable
forms of medications. Annotate the form along with any other indications of dosage. Use caution in the
case of “fentanyl patch 1 patch daily”: “1 patch daily” is a dosage, whereas “fentanyl patch” indicates a route
(transdermal). NOTE: deciding between dosage, form and route can be tricky. If unsure, first look in
the appendix; the next step is to ask an editor or one of our MDs.
Route – path by which a drug is taken into the body
An epidural is an injection into a physical space so it is a route.
Note that you may commonly see “Patient receives injection of drug X”. This is annotated as Route (not
Dosage). It would be a Dosage only if the number of injections were provided.
o If Blood sugars are low do not take insulin injections
Examples
• combivent inhaler or 2 puff by inhalation four times daily
• “spray” in “Lidocaine spray”: see Appendix 4 for OROPHARYNGEAL, administration directly
to the mouth and pharynx.
• “ophthalmic suspension” in “tobramycin ophthalmic suspension”: see Appendix 4 for
OPHTHALMIC, administration to the external eye.
Other examples of Route found in Appendix 4 include: infusion, “swish and spit”, rectal
suppository and epidural injection.
Frequency – rate of occurrence
See Appendix 5 for common abbreviations for frequencies of drug administration.
The adjective “weekly” is annotated as frequency, e.g., weekly Taxol
Use prepositions where appropriate (meaningful) in annotating frequency such as in “with meals” or
“before meals”
Other frequencies include AM, PM, in the morning, in the evening, etc.
We do not annotate days when they are just a point in time, but “Day” in the example below
describes a frequency, so be aware that you need to annotate based on context. This is common
in chemotherapy regimens.
o Patient is presenting for induction VTD chemotherapy for multiple myeloma. I plan to
administer velcade 1.3 mg/m2 on Day 1, 4, 8, 11; thalidomide 100mg daily day 1-12 then
200mg daily day 13-24; and decadron 40mg Day 2, 3, 4.
Example of Days as Frequency (is annotated)
o Warfarin 5mg Mon., Wed., Fri. and 7.5mg Sat., Sun., Tues, Thurs
Example of Day/Cycle just indicating a point in time (and not annotated)
o Patient is today presenting for a follow-up visit. He is currently Day 8 of Cycle 2 of
CHOP chemotherapy for his non-Hodgkin’s Lymphoma.
Watch for cases where a whole span is a frequency; this can be confused with a duration. In the case
below, “each day for 2 weeks per month” is a frequency. A duration would be denoted by “for 6 months”.
o chemo drugX, doseY, each day for 2 weeks per month
Duration – continuance in time; length of time
Example of Cycles as duration
o Patient is presenting for induction of VTD chemotherapy for multiple myeloma. We plan to
administer 3-4 cycles then do an autologous stem cell transplant.
o Patient is here for a follow-up visit. Patient is s/p 3 cycles of VTD for this multiple myeloma.
Example of a time span that is not a duration: in the example below, the patient has taken the drug
for 3 weeks, but this does not mean the patient was prescribed a 3-week course of Zoloft.
o The patient is taking Zoloft over the past 3 weeks
You need a range for a duration, so a “started on” date alone is not sufficient information.
More Drug Annotation Examples
When more than one dosage is associated with a drug, double annotate it.
“OxyContin 10 mg in the morning and 30 mg in the evening” is annotated as the drug with a relation to the first
dosage and the drug with a relation to the second dosage:
o OxyContin 10 mg in the morning and
o OxyContin 30 mg in the evening
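The double-annotation rule for the OxyContin example can be sketched in Python as one record per dosage, with the drug repeated in each. The function name double_annotate and the dictionary layout are hypothetical and only illustrate the idea; they do not describe how the annotation tool stores relations. Treating “in the morning” and “in the evening” as frequencies follows the Frequency section.

```python
def double_annotate(drug, dosage_frequency_pairs):
    """Return one record per (dosage, frequency) pair, repeating the drug in each."""
    return [
        {"drug": drug, "dosage": dosage, "frequency": frequency}
        for dosage, frequency in dosage_frequency_pairs
    ]


# One drug mention with two dosages yields two drug-attribute records.
records = double_annotate(
    "OxyContin",
    [("10 mg", "in the morning"), ("30 mg", "in the evening")],
)
for record in records:
    print(record)
```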
When there is redundant information, capture the first instance.
o MULTIVITAMINS (TAB-A-VITE) 1 TABLET = 1 TABLET Oral TABLET DAILY STAT and then Routine for 30 Days
Directions: 1 tablet oral DAILY
o carbamazepine 200 mg extended release 2 tablets twice daily
o spiriva 18 mcg inhalation capsule one daily,
o Glipizide 5 mg, a half tablet daily.
o Tramadol 50mg tabs 1-2 tabs every 4 hrs
o Calcium + D 500mg/400IU 1 tab po BID
o Lisinopril-Hydrochlorothiazide 20-12.5 MG Oral Tablet; TAKE 2 TABLETS DAILY;
o albuterol 2.5 mg/3 ml 1 unit dose in nebulizer every 4 hours as needed,
o Methyldopa 1000 in the morning and 500 in the evening
o Latanoprost 0.05%, one drop in the affected eye once daily.
Medication Annotation: Medication Related Entities and Attributes
Indications and Adverse Events
Elements beyond the drug administration to annotate include: why a drug is being given, the injury
resulting from a medical intervention related to a drug, and differentiating the ADE from other signs and
symptoms.
NOTE: Annotation Practice covers how to choose a span when annotating Indications, Adverse Events
and SSLIF.
Field: Indication [Entity]
[annotated in the class navigation bar and appears as the Drug attribute “Reason” when a relation is created]
Definition: Medical conditions for which the medication is given in the past or the present.
Examples:
Present: The patient was diagnosed with hypertension and was treated with Accupril.
Past: He did have some hypokalemia which was treated with p.o. K-Dur

Field: Adverse Event (AE) [Entity]
[you can relate more than one AE to a drug]
Definition: Drug related injury to a patient.
Examples:
Present: She experienced a hypersensitivity reaction while receiving intravenous Taxol (paclitaxel) therapy.
Past: Patient had anaphylaxis after getting penicillin 10 years ago.
Adverse drug events (ADEs) are a primary focus of this project. They are injuries resulting from
drug-related medical interventions: medical signs and symptoms associated with administration of
a medication.
An ADE is an “injury resulting from the use of a drug”. It is an event caused by a drug at a normal dose
in normal use. This includes a drug given as a single dose, prolonged use, withdrawal of a drug, and
drug combinations. Note that you may see non-definitive language when an ADE is suspected. You
must make the determination based on the words and context and the ability to create a linkage to a
drug.
Comments and Examples for Inter-Annotator Agreement
Indication
When medications are given prophylactically or empirically, that is an indication for the drug. In the
example below, antibiotics is annotated as the drug since it is a drug class and preoperative is the
indication
o preoperative antibiotics
o empiric antibiotics
o omeprazole for GI prophylaxis
Note that prophylaxis is an indication, but it is often associated with more specific information such
as GI prophylaxis, DVT prophylaxis, PEP prophylaxis. Include this information in the annotation.
Do not double annotate events as Indications and SSLIF. If a medication is given to treat the
complaint, it is an Indication.
Examples:
o Paroxetine for depression.
o a history of a remote CVA in the setting of chronic atrial fibrillation. He has been
continued on warfarin therapy for thromboembolic protection and has apparently done
well [drug=warfarin and there are three indications]
o The patient was found to have a left lower lobe pneumonia and currently is on
ceftriaxone and azithromycin. [pneumonia is the one indication for both drugs:
ceftriaxone and azithromycin]
Adverse event
Adverse drug events (ADEs) are injuries resulting from drug-related medical interventions. These are
medical signs and symptoms that are associated with administration of a medication. Note that you
may see non-definitive language when an ADE is suspected. You must make the determination based
on the words and context and the ability to create a linkage to a drug. Note: more than one AE can be
linked to a drug.
Examples:
o “nonspecific ST-T wave changes consistent with digoxin effect”
Drug=digoxin, AE=nonspecific ST-T wave changes
o “…she got her first Reclast infusion and later that day, she developed a fever and
nausea…”
Drug=Reclast, Route=infusion, AE=fever, nausea
SSLIF [other signs and symptoms] are described later, but note that suspected ADEs are
annotated with the assertion possible. They are also annotated as an SSLIF (present).
o “….cirrhosis that was thought to be precipitated by oral contraceptives…”
Drug=oral contraceptives (OC), AE=cirrhosis with possible assertion,
SSLIF=cirrhosis with present assertion
Caution on annotation of allergies. An allergy is only considered an adverse event when it is related
to taking a drug. Reactions to food and non-prescribed substances should be annotated as a
SSLIF. In the example below, the highlighted items are the ADEs. Eggs and bee pollen would be
SSLIFs.
o ALLERGIES: Multiple allergies including erythromycin, Celebrex, Darvocet, eggs,
Levaquin, Reclast, bee pollen and Vicodin
It is okay to include in the annotation words associated with allergies, such as drug, med, medical,
e.g. no “drug allergies” (assert absent).
MedDRA Annotation
Adverse effects are mapped to the concepts from MedDRA. Therefore, the number of ADE and
MedDRA terms should be the same for any given record.
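The one-to-one rule between ADEs and MedDRA terms lends itself to a quick consistency check. A minimal sketch follows, assuming each adverse event is held as a dictionary with the hypothetical keys `span` and `meddra`; the codes used are the ones listed in this section for Pyrexia and Nausea.

```python
def missing_meddra(adverse_events):
    """Return the spans of adverse events that have no MedDRA code yet.

    adverse_events: list of dicts with the hypothetical keys "span" and "meddra".
    """
    return [ae["span"] for ae in adverse_events if not ae.get("meddra")]


record = [
    {"span": "fever", "meddra": "10037660"},   # Pyrexia
    {"span": "nausea", "meddra": "10028813"},  # Nausea
    {"span": "vomiting", "meddra": None},      # not yet coded
]

# An empty list means the ADE count and the MedDRA term count agree.
missing = missing_meddra(record)
print(missing)
```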
To search for MedDRA concepts, you can use http://ummsres14.umassmed.edu/OntoSolr/browse from
anywhere except the Annotation Server. On the Annotation Server, please use this URL to search and
browse MedDRA terms: http://ummsqhslxweb01.umassmed.edu/OntoSolr/browse
See the UMass BioNLP Annotation Resource Page for videos demonstrating MedDRA annotation.
Annotate adverse events with the MedDRA term that best applies to the span.
 “pain in the left back and the left upper abdomen” is annotated as one span. If pain is an
adverse effect, there are actually two different MedDRA matches: back pain
(MedDRA:10003988) and abdominal pain (MedDRA:10000081). Use the more generic
MedDRA term for pain to relate to the full span (MedDRA:10033371).
 Other common ADEs Include: Nausea MedDRA:10028813; Vomiting MedDRA:10047700;
Dehydration MedDRA:10012174; Pyrexia (for fever) MedDRA: 10037660;
Drug allergies are common adverse effects and are assigned "Drug hypersensitivity" (MedDRA:10013700).
Some drug allergies are not drug hypersensitivities, for instance GI sensitivity or abdominal pain.
Example: a patient has an allergy to erythromycin that causes abdominal pain. GI pain is an
intolerance and a common side effect; it is commonly put in the allergy section so the patient does not
get the drug again. Annotate as “Drug intolerance” (MedDRA:10061822).
Contrast media is not a drug and not routinely annotated. However it is a pharmaceutical and we
will annotate adverse events it may cause. Use the entity “Drug” and reactions to it are an ADE.
o allergy to iodine contrast use “Iodine allergy (MedDRA:10052098)”
o contrast allergy use “Contrast media allergy (MedDRA:10066973)” Note that this is different
from “Contrast media reaction (MedDRA:10010836)”
Drug allergy is considered an Adverse Event in the past if a specific drug is mentioned (e.g.
ALLERGIES: IMDUR). The allergy is assigned "Drug hypersensitivity" MedDRA:10013700 and
"History" value to the Period Assertion and “Present” to the Presence Assertion. Note that in spans like
(no) drug allergies, drug allergy is annotated as SSLIF and not as Adverse Event since no specific drug
is mentioned.
Multiple drugs are given and an AE is noted. In this case these are all antibiotics that are associated
with causing diarrhea. The doctor does not distinguish so all drugs are linked to the AE.
 Vancomycin and ceftazidime were started. His pheresis catheter was not
functioning and the nurses could not draw back and it was removed. Prior to his admission, he
was home on levofloxacin, acyclovir, and posaconazole. It was noted that his white blood count
on admission was 100 cells and has risen to 400 since that time. The patient reported some
chills and the patient noted some loose stool up to about 4-5 times a day.
SSLIF and Severity
Field: Signs, Symptoms, Abnormal Test Findings, and Diseases (SSLIF)
Definition: Medical signs, symptoms and diseases that are neither adverse effects nor reasons for administering a medication.
Example: The patient has a history of COPD.

Field: Severity [an attribute of Indication, AE and SSLIF] (annotated in class navigation bar, but must be added in the right annotation window for Indication and SSLIF)
Definition: Intensity of an adverse effect.
Example:
Eg 1: Severe headache, moderate chest pain.
Eg 2: The PLB has 50% stenosis just proximal to a widely patent stent.
Comments and Examples for Inter-Annotator Agreement
SSLIF
When annotating a SSLIF, include the location of where it is occurring if provided and if it can be
part of a contiguous span. Locations do not need to be highly specific.
o “headache in the back of the head”
o “headache more pronounced occipitally”
Watch for general terms which can also be part of disease names and are annotated. For example,
recurrent is often part of the disease name in cancers. In the example below recurrent/recurrence is
used twice as a diagnosis and twice as a general term. See Appendix 6
o The patient remained cancer free after that for 9 years until he had laryngeal/vocal cord
recurrence. Pathology returned recurrent lymphoma, natural killer cell subtype. This is
now the third recurrence in the paranasal sinus area and fourth overall recurrence.
Sometimes a term that you usually think of as a SSLIF is used in a different way and is not
annotated. In the example below, the patient does suffer nasal discharge because of the lymphoma
of the sinuses. Here, however, nasal discharge and respiratory sputum act like test materials, not
symptoms.
o There were never any positive results on the cultures, either from nasal discharge or
the respiratory sputum
Severity
Severity is often indicated by modifying words such as mild, minimal, markedly, severe, endstage,
small, extremely, substantial. Severity terms can also be phrases such as: borderline to slightly
high, moderate-to-severe.
• He is rather diffusely tender to palpation. Here ‘rather’ can be a severity meaning ‘to some
degree’.
• Middle turbinate was severely inflamed
Note that some terms you might consider as severity are actually part of the disease name and are
not annotated. For example, large is part of the name “large B cell lymphoma” or “Chronic Heart
Failure (CHF) exacerbation”
Disease stages are specific indicators of severity and should be annotated as severity. A relation
must be created between a severity assertion and the SSLIF
o a history of natural killer/T-cell lymphoma <...> At that time, the patient had radiation
for stage I disease
In medical use, the following terms can act as a disease name or a temporal indicator. These
words are not severity: acute, chronic, acute-on-chronic, flare. Annotate the example below as
follows:
o significant flares of fibromyalgia
We only annotate numbers or references to quantity as severity when there is a frame of reference
or scale such as %, mm, X out of Y, etc. Annotate where you can understand the meaning,
otherwise it is diagnosing. EXCEPT if it is associated with an ADE – these events will be
manually reviewed to assign CTCAE severity scores.
o fever greater than 100.5 - fever is annotated as an entity but the temperature is not
annotated as severity
Multiple and innumerable would not be annotated in the examples below.
o innumerable 1mm non-blanching papules
o multiple 1mm papules
Annotate pain in his back twice to create a relation to each – (1) severe and (2) “10 on a scale
of 1-10”. 8 is not annotated since in isolation there is no scale.
o Remarkable for the aforementioned severe pain in his back which she states is,
without pain medicine, 10 on a scale of 1-10. At the moment, it is down to 8.
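The "frame of reference or scale" rule for numeric severities can be approximated with a pattern check. The sketch below is only a pre-filter under stated assumptions: the pattern list (percent sign, mm/cm, "X out of Y", "on a scale of") and the function name are hypothetical, and the annotator still has to judge whether the meaning is understandable.

```python
import re

# Hypothetical patterns for "a frame of reference or scale".
SCALE_PATTERN = re.compile(
    r"""
      \d+(?:\.\d+)?\s*%                    # percent, e.g. 50%
    | \d+(?:\.\d+)?\s*(?:mm|cm)\b          # length unit, e.g. 5cm or 1 mm
    | \d+\s+out\s+of\s+\d+                 # X out of Y
    | \d+\s+on\s+a\s+scale\s+of\s+\d+      # 10 on a scale of 1-10
    """,
    re.IGNORECASE | re.VERBOSE,
)


def has_frame_of_reference(text: str) -> bool:
    """True when the text contains a number together with a unit or scale."""
    return SCALE_PATTERN.search(text) is not None


for phrase in ["50% stenosis", "10 on a scale of 1-10", "it is down to 8",
               "fever greater than 100.5"]:
    print(phrase, "->", has_frame_of_reference(phrase))
```

A bare number such as "8" or "100.5" fails the check, which matches the examples above where those values are not annotated as severity.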
“Some” and “somewhat” are terms that can be used as descriptors of severity or as quasi severity
quantifiers.
 “Some pain" means approximately "mild pain." Some is vague but it is a description of
severity, i.e. not very severe, and can be useful.
o Don’t annotate when used as a quantifier such as “some blisters”
We do not annotate words that indicate severity dynamics such as worsening or increasing,
decreasing (we do capture decreased and increased as severity).
Use your judgement. The word "frayed" is sometimes annotated as SSLIF, sometimes as severity, it
is context dependent. In the example below, breakthrough is not a modifier. It is referring to severity
and is annotated as such.
 Breakthrough pain is both sudden and severe
A severity term may refer to more than one entity. Make a relation to each applicable entity.
o Severe rash and redness. In this example, severe has relations to both rash and
redness.
o Large mass measuring 5cm. Large and 5 cm are both severities for mass.
In general avoid redundancy in severity annotation. For example: some mild, annotate only mild.
Sometimes a second word modifies the first and it is necessary to capture both. For example very
slight shows less severity than simply slight.
Choosing A Span
When SSLIFs are written as a comma-separated list (often in a review of symptoms), the list may
look like a single span, but it should be separated into individual terms, as in “no murmurs, gallops or clicks”.
Similarly, separate terms with a / between them, as in “rash/blistering”.
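Splitting a coordinated list into individual terms can be sketched as below. This is an illustration only: the function name and the separator set (comma, slash, "or", "and") are assumptions, and the sketch cannot tell a list of symptoms from coordinated locations such as "pain in the neck and paracervical region", which stay in one span.

```python
import re


def split_sslif_list(span: str) -> list[str]:
    """Split a comma-, slash-, or conjunction-separated list into terms."""
    parts = re.split(r"\s*(?:,|/|\bor\b|\band\b)\s*", span)
    # Drop empty strings left by sequences such as ", or".
    return [part for part in parts if part]


print(split_sslif_list("murmurs, gallops or clicks"))
print(split_sslif_list("rash/blistering"))
```

The negation cue ("no") is left out of the input here because negation is recorded through the Presence Assertion rather than inside the span.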
Annotate locations as part of a span.
Annotate the accurate span even if it is long, and include coordinated locations. For example:
 “fibromyalgia causing pain in the neck and paracervical region and down the arm” This is a
long span but the pain is in both the neck and arm.
 “adenopathy in the supraclavicular or axillary regions”
Annotate the location as part of a span when it is contiguous (a single span without breaks).
Otherwise do not annotate location terms that are distant to the S/S.
 pain of the lesion on the right shoulder
 swelling on the right shoulder. It is in the anterior aspect of the shoulder….
 pustular collection underneath the end of the nail about 5 days ago. It is her right middle
finger.
If two SSLIF occur in the same location these are annotated separately, and only one will include the
location.
 pain and swelling of his leg
In general we do not include prepositions except where they provide meaning or create a contiguous
span to include locations and coordinated locations relative to a SSLIF. See the section on Prepositions
for examples.
Assertion Categories
– [where assertions are for the highlighted entity]
Outcome Assertions (default notMentioned)
Annotate the Outcome field for adverse events where possible. It has four values: recovered, not
completely recovered, died, not mentioned (most common). The default value for this field is
notMentioned.
You may see an AE mentioned early in a note and, much later in the same note, its outcome said to be
resolved by drug X. In that case it is okay to put the outcome as "resolved." If this is longitudinal
information from another note, do not annotate it in the current note. Only annotate what is in the note
you are working on, and only annotate the outcome if it is specified in the narrative.
If an adverse event's Assertion is Absent, the Outcome field is not annotated.
Period Assertions (default current)
Annotate temporal information for several entities: Adverse effect, Indication and SSLIF. This is done
using the Period attribute. Period values are: current and history. The default value for this field is
current.
Current
Current usually refers to the complaint being referred to in the visit. This is usually described in “History
of Present Illness” section.
 Note: this section is often in the past tense since (1) the visit is to discuss an event that has
already occurred and/or (2) the doctor is writing notes after having seen a patient so even the
current visit is discussed in past tense.
 Caution: this section will often also contain references to the past and future.
Examples:
o She comes with history of sudden episode of nausea, vomiting and diarrhea. She says
she was apparently all right until the night before yesterday.
o She returned from a trip three weeks ago where she has had pain in the bottom of both
feet
If the focus is on an ongoing symptom or disease the event is annotated as current, even if it
started in the past.
o The patient describes daily headache and throughout most of her life
o She continues to have constant pain
"Known" or "chronic" symptoms or diseases are annotated as current if there is no context favoring
history
o Patient with known ulcerative colitis who presents with lower gastrointestinal bleeding.
o Patient with chronic respiratory insufficiency
Symptoms or diseases in Family History section are current by default if there is no context favoring
history
History
History markers are: "history of", "old", "past", "in the past", "prior", "status post", reference to a
date or prior period, etc.
o She has a history of fibromyalgia.
o She did have evidence of old ischemic changes.
o In the most recent past, cervical cytology in [Date, 2 years from the visit date] revealed
LGSIL and was HPV negative.
o NHL diagnosed in 2005.
o head injury as a child
o She never had another panic attack
Symptoms or diseases in “Past Medical History” section are annotated as history if there is no
context favoring current.
o PAST MEDICAL HISTORY: Fibromyalgia, hypertension, hyperlipidemia, sleep apnea,
degenerative joint disease.
o PAST MEDICAL HISTORY: Paroxetine for depression
Annotate what is in the record you are viewing. Do not make assumptions or consider longitudinal
information you may know. The purpose of annotation is machine learning. Patterns will be learned.
 “he no longer has the abdominal pain that he originally presented” is annotated as SSLIF for
abdominal pain with the assertion of absent (vs. history, absent).

Section headers such as PAST MEDICAL HISTORY and ASSESSMENT/PLAN provide
context for “history” or “current” although current is the default value and specific annotation
for Period is not required.

History is heterogeneous and can mean both what happened in the past but stopped and
what happened in the past and still happening. You may see in PAST MEDICAL HISTORY
things like hyperlipidemia, hypertension or a medication. In this case, annotate those SSLIF
as “history”. In HISTORY OF CURRENT ILLNESS, it is “current”

In the “ASSESSMENT/PLAN” section, the patient is still actively being treated with the medication for
hypertension or hyperlipidemia. Annotate the SSLIF; the default is current. This is
common for chronic diseases such as cancer, cardiovascular disease or diabetes.
Examples of using context:
The patient is not cured, but annotate as history because of the context in the note:
• “Status post stage II nodular sclerosing Hodgkin's disease.” (period assertion = history)
• PAST MEDICAL HISTORY She also has hypercholesterolemia, irritable bowel syndrome,
arthritis, dry eyes. [even though it is in the PMH section, annotate as current because the verb
is in the present tense]
• FAMILY HISTORY: Mother with a history of rectal cancer [follow the cue of the word history
even though it is in Family History, which is often annotated as current]
Presence Assertions (default is present)
Presence (modality) expresses a speaker’s degree of commitment to the
expressed proposition’s believability, obligatoriness, desirability, or reality. Ascribe assertion values to
medications and diseases, namely, to “drug”, “adverse effects”, “indication”, and “other signs,
symptoms and diseases” entities.
Options are below, but NOTE that past tense also requires inclusion of the Period Assertion “history”:
• Present – entities are or were present
• Absent – entities are or were not present
• Possible – any level of uncertainty about whether an entity is or was present or absent
• Conditional – entity occurs or occurred only under certain conditions
• Hypothetical – conjectures, based on a suggested idea or theory and often in if/then scenarios
• NotAssociatedWithPatient – entity occurs in relation to (a) a family member or (b) the general
population
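The three assertion categories described in this section (Outcome, Period, Presence) and their defaults can be summarized as a small data model. This is a sketch under assumptions: the class names and the string values (for example `notCompletelyRecovered`) are hypothetical spellings; only `notMentioned` is given as a field value in this manual.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    RECOVERED = "recovered"
    NOT_COMPLETELY_RECOVERED = "notCompletelyRecovered"  # assumed spelling
    DIED = "died"
    NOT_MENTIONED = "notMentioned"


class Period(Enum):
    CURRENT = "current"
    HISTORY = "history"


class Presence(Enum):
    PRESENT = "present"
    ABSENT = "absent"
    POSSIBLE = "possible"
    CONDITIONAL = "conditional"
    HYPOTHETICAL = "hypothetical"
    NOT_ASSOCIATED_WITH_PATIENT = "notAssociatedWithPatient"


@dataclass
class Assertions:
    """One value per category; the defaults follow the section headings."""
    presence: Presence = Presence.PRESENT
    period: Period = Period.CURRENT
    outcome: Outcome = Outcome.NOT_MENTIONED


# "The patient had a history of hypertension": present, but in the past.
hypertension = Assertions(period=Period.HISTORY)
print(hypertension.presence.value, hypertension.period.value)
```

Keeping one field per category reflects that a span takes at most one value from each category, while values from different categories can be combined.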
Presence (default)
Presence (Present/Positive) means that entities associated with the patient are/were there;
exist/existed.
In our annotation, the positive value ‘present’ is the default value, i.e. if an entity does not have any
assertion value ascribed it means that the value is positive/present. The drugs the patient receives are
also annotated as Present.
Helpful Hint: It is very easy to confuse the Presence Assertion “Present” and the Period Assertion
“current”.
Examples:
 a female patient died while receiving Taxol (Paclitaxel) therapy for the treatment of endometrial
cancer [Drug and Indication are both Present]
 The patient had a history of hypertension
 She is on oxycodone 10mg for pain
*Replaced, held or discontinued drugs are annotated as “positive (present)” and not as “absent”, since
they used to be taken vs. never taken.
Comments and Examples for Inter-Annotator Agreement
Drugs are always annotated as present since they are being taken or used to be taken vs. never taken.
o At this point in time, he does not require any more antibiotics.
o she has since been discontinued on digoxin
o His enalapril was changed to lisinopril
o His aspirin was held
In the example below, the entity “anaphylactic shock” is ascribed a presence/present/positive value,
since it did occur. It is only its relation to Taxol that is questioned or negated and we do not ascribe
assertion to relations in the current schema.
o The anaphylactic shock was possibly related to Taxol (relation between Taxol and
anaphylactic shock is Adverse)
o The anaphylactic shock was most likely related to Taxol (relation between Taxol and
anaphylactic shock is Adverse)
o The anaphylactic shock was not related to Taxol (no relation annotated between Taxol
and anaphylactic shock)
Even if certain symptoms were identified as being part of another symptom and deleted from the file or
renamed, they still did exist and need to be annotated as positive (default).
o The second episode of malaise, loss of consciousness, undetectable pulse, and tension
were identified as being part of shock. Since these were considered manifestations of
the shock and anaphylactoid reaction, the previously reported separate events of
dyspnea, malaise, abdominal pain, and erythema have been deleted from the file
o Supplemental information received from the reporter via BMS Japan on January 15,
2002 indicated that the events dyspnea, blood pressure decreased and facial hot flushes
were changed to anaphylactic shock
Narratives can be vague. Here the use of “not much” is questionable, but “not much” is not “none”, so
the assertion for highlighted entities is present:
o He has no complaints of pain, dyspnea, or dysphagia. He really has not had much
crusting or discharge from the nose, either.
Absence
“Absent” asserts that the problem does not exist in the patient. Also annotate drugs the patient did not
receive as Absent.
Examples (entity requiring the assertion is highlighted):
o no known drug allergy
o the patient denied any dizziness, shortness of breath…
o Without syncopal episodes
o The patient currently is pain free
o There were no clinical signs of congestive heart failure
o She is not a candidate for anticoagulation
o The patient had no fever
o No antibiotics were given
Do not annotate the outcome of “absent” adverse events. (If you have no adverse event, you have no
outcome.)
Create relations between “absent” adverse events or indications and drugs the same way we relate
“present” ones.
o He receives no chemotherapy for his lymphoma
Possible
“Possible” asserts that the patient may have a problem, but there is uncertainty expressed in the note.
This assertion covers the range of possibilities from likely/probable to unlikely/doubtful.
Examples (entity requiring the assertion is highlighted):
o Questionable DVT
o Question of DVT
o Their differential is gliomatosis versus radiation effect.
o Possible anterolateral ischemia
o a consult was placed to rule out CAD
o Rule out congestive heart failure but doubt
o The differential diagnosis for his fever included possible inadequately pneumonia versus
bacteremia versus UTI versus CSF infection
o It is not likely that histiocytic non-Hodgkin’s lymphoma would go to the kidney
Conditional
“Conditional” is used when the mention of the medical problem asserts that the patient experiences the
problem only under certain conditions.
 DVT prophylaxis per surgical protocol, but he should receive Lovenox bridging to Coumadin as
deemed safe by surgery
 shortness of breath with exertion
Hypothetical
“Hypothetical” is used for medical problems the patient may develop; for a conjecture.
Examples:
• Should her symptoms return or headache develop, please discontinue to taper and
notify Dr. **NAME[ZZZ]'s office.
• Call Dr. X if increased swelling or redness of the left lower extremity or starts to have
difficulty breathing
• If nausea or bleeding develop, return to the emergency room.
• Notify MD For Any of the Following : Increased Trouble Breathing or Chest Pain, Fever Temperature Higher Than 101 Degrees, Pain Not Improved by Medication
• Vancomycin is a possibility in the future
• Please return to the ED if you experience any concerning symproms such as chest pain,
dizziness, severe abdominal pain, nausea, vomiting, blood in your stools or black stools.
Hypothetical vs Conditional
If/then words are cues to a hypothetical followed by a conditional response.
• “If normal, treat with oral anti-inflammatory medications and” Here the anti-inflammatory
medication is annotated as ‘conditional’.
“Should symptoms return” would indicate using the assertion ‘hypothetical’.
In this example, the drug is conditional and the indication is hypothetical
o If the Gram-stain showed gram-positive cocci in chains we will consider adding
vancomycin
o If his vitamin D 1,2,5-OH is not elevated, I will start treatment with VitD
Not associated with Patient
The mention of the medical problem is associated with someone who is not the patient (family or
people in general, such as educational materials).
• Family history of prostate cancer
• Brother had asthma
• The surgical procedure and risks including infection, bleeding, blindness, meningitits, CSF leak,
and brain damage were discussed
• The most common diabetes symptoms include frequent urination, intense thirst and hunger,
weight gain, unusual weight loss, fatigue, cuts and bruises that do not heal, male sexual
dysfunction, numbness and tingling in hands and feet.
If needed, the classification can be further detailed. For example, a drug can be “absent” because the
doctor did not recommend it, because the patient refused it; or the probability of a disease can vary
from very low to very high.
Assertion values should not exclude each other.
A span can be assigned two assertion values from different categories:
• no family history of diabetes: history + NAWP
• Patient’s mother has cancer: current + NAWP
• No family history of breast or ovarian cancer: NAWP and history
• No family history of skin cancer: NAWP and history
• Had cancer years ago: Present and history
It is possible to have sentences containing both present and absent information.
 Patient with chronic Hepatitis C “denies any sequela of hepatitis” is annotated as 2 spans:
hepatitis is present and sequela of hepatitis is absent
Drug assertions
Drugs are usually annotated as present.
If the drug mentioned has been or is being taken, annotate as present.
If the drug is mentioned as never having been used, annotate as absent.
If the drug has possibly been used but there is some question, annotate as possible.
If the drug is mentioned in the context of being considered for future use, annotate as hypothetical.
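The four drug rules above amount to a lookup with "present" as the fallback. The sketch below assumes hypothetical status labels (`taken`, `discontinued`, `never_used`, `uncertain`, `future`) that an annotator or pre-annotation script might assign; it does not cover the conditional case discussed under Hypothetical vs Conditional.

```python
# Hypothetical status labels mapped to the Presence Assertion for a drug.
DRUG_ASSERTION_RULES = {
    "taken": "present",          # has been or is being taken
    "discontinued": "present",   # replaced, held or discontinued: still present
    "never_used": "absent",
    "uncertain": "possible",
    "future": "hypothetical",    # being considered for future use
}


def drug_presence(status: str) -> str:
    """Return the assertion for a drug mention; drugs default to present."""
    return DRUG_ASSERTION_RULES.get(status, "present")


print(drug_presence("discontinued"))
print(drug_presence("future"))
```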
Annotation Of Relations
Annotate relations (connections) between entities and their attributes. See Appendix 3 for a full
table of possible relations.
Dosage, route, frequency, duration, indication and adverse event are drug attributes and are related to
their drugs. For example, in Albuterol 2 puffs p.o. “2 puffs” is linked to “Albuterol” by “Dosage” relation
and “p.o.” is linked to “Albuterol” by Route relation. Severity can be linked to Indication, Adverse Event
or SSLIF.
Examples:
Drug’s Relations
Context: She receives Albuterol 2 puffs p.o. q4-6h.
Relation (:): Dosage (Albuterol : 2 puffs); Route (Albuterol : p.o.); Frequency (Albuterol : q4-6h)
Context: The patient was treated with ampicillin for two weeks.
Relation (:): Duration (Ampicillin : two weeks)
Context: He later received chemotherapy for his lung cancer.
Relation (:): Reason/Indication (lung cancer: chemotherapy)
Context: Patient's death was due to anaphylactic shock caused by the intravenously administered penicillin.
Relation (:): Adverse (penicillin : anaphylactic shock)
SSLIF/Indication/ADE attributes
Context: He has severe diarrhea.
Relation (:): Severity (diarrhea : severe)
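The relation examples above can be held as simple (relation, entity, attribute) triples. This is a minimal sketch; the tuple layout and the `attributes_of` helper are hypothetical and do not describe how the annotation tool stores relations.

```python
from collections import defaultdict

# (relation type, entity, attribute) triples taken from the examples above.
relations = [
    ("Dosage", "Albuterol", "2 puffs"),
    ("Route", "Albuterol", "p.o."),
    ("Frequency", "Albuterol", "q4-6h"),
    ("Duration", "ampicillin", "two weeks"),
    ("Severity", "diarrhea", "severe"),
]


def attributes_of(entity, triples):
    """Group the attributes related to one entity by relation type."""
    grouped = defaultdict(list)
    for relation_type, head, attribute in triples:
        if head == entity:
            grouped[relation_type].append(attribute)
    return dict(grouped)


print(attributes_of("Albuterol", relations))
```

Because relations are stored independently of the text order, an attribute can be linked to its entity regardless of the distance between the two spans.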
ANNOTATION PRACTICE
General Considerations
DO
• DO annotate the meaning of the words in the text you are annotating. Context is important. It
may be that some things are annotated differently in the same or adjacent sentences (this is
likely most frequent with current and history). It may mean you are able to annotate something
in one context and not another.
• DO annotate misspelled words
• DO include periods in terms that normally have them (but do not go nuts over this): p.r.n., p.o.
• DO remember to annotate negations with the Presence Assertion “Absent”
o Annotate negations and negated words with the Assertion “Absent” so it will be detected
in the negation algorithm. Negated word examples are: nontender, anicteric,
asymptomatic. (This is counterintuitive but is a flag for machine learning)
• DO make relevant relations, regardless of distance between terms
• DO remember the panel on the right side of the tool has fields. There is a scroll bar to see them.
DO NOT
 Do not make assumptions, do not infer.
 Do not consider longitudinal information in this workflow– annotate information in current record
 Do not diagnose
 Do not annotate a patient’s mistaken beliefs when medical professional commentary is
contradictory
 Do not annotate general terms such as “problem” and “disease”. They are not informative. See
Appendix 1 for more examples.
 Do not annotate procedures or tests.
 Do not annotate normal except when they are negated abnormalities such as non-tender and
atraumatic (and then use the assertion absent)
 Do not annotate parts of words Nontender vs nontender and non-tender
 Do not annotate parts of phrases. For example do not annotate “hernia” if used in the phrase
“hernia repair”
o hernia repair vs hernia repair or hepatitis panel vs hepatitis panel
 If we do not annotate the entity we do not annotate its attributes either
NOTE
• Not everything is annotatable. It is okay.
• Note “acute” in medical parlance is a temporal term meaning sudden or rapid onset and/or a
short course. The opposite is “chronic”.
• Be aware of terms that are a category of disease. In cancers, recurrences reflect a category and
they are treated differently than the initial occurrence. Recurrent lymphoma would be a disease
(also see other lymphoma examples in Appendix 6).
• Remember Present, Absent, and NAWP are in the same category and are mutually exclusive.
• Remember Present does not mean Current.
Anaphoric pronouns
Anaphoric pronouns are the pronouns that refer back to another word or phrase. We do not annotate
anaphoric pronouns like it or this in examples below even though these refer to entities we do annotate:
The patient had diplopia but it was resolved completely.
The patient had anaphylactic shock. This was caused by antihistamines.
Anaphoric Pronouns vs. Co-referent Items
Coreference is when two or more expressions in a text refer to the same thing. We do not annotate
anaphoric pronouns. However, we do annotate other coreferent items.
 She fell down a flight of stairs and hit her head. The injury caused chronic seizures
 The patient presents with nausea, vomiting, and abdominal pain. The patient describes the pain
as crampy in nature.

In the example below, “it was down” is meaningless on its own as well as “it” being an anaphoric
pronoun, so annotate just the first half
 Blood pressure was high 180/110, rechecked it was down to 137/100.
Articles
Indefinite article "a" is not included in annotated entities: in the noun phrase a malignant tumor of the
breast, the span annotated is “malignant tumor of the breast” not “a malignant tumor of the breast.”


Definite article "the" is not included in disease names, either. In the example below the adverse
effect is “anaphylactic shock” not “the anaphylactic shock”:
The anaphylactic shock was characterized by nausea.
Titles
Titles are study names, references to consumer or biomedical literature. Section headers (usually in
CAPS) are titles. Common section headers are: PAST HISTORY, REVIEWS OF SYSTEMS, PAST MEDICAL
HISTORY, PLAN ASSESSMENT, MEDICATION, ALLERGY, CHIEF COMPLAINT
However, there are cases where you may want to annotate a title. ALLERGIES and PAST
CHEMOTHERAPY may be appropriate to annotate in a relation.
If indication is the title in a list of symptoms, you can use it to associate with drug.
If the phrasing of the sentence with the drug also makes the association clear, use the title as well.
• Joint pain. He is willing to try glucosamine to help with joint pain but……
Certain adverse effect reports include a clinical trial title, for example:
• Protocol title: (NON-BMS/RETRO TAXOL) RETROSPECTIVE DATA COLLECTION TAXOL IN
PATIENTS WITH SOLID TUMORS. Investigator causality assessment was not provided.
Do not annotate the drug name (Taxol) and its related information in the title, since the name of the
clinical trial may include drugs that an individual patient in that clinical trial does not receive. For
example a clinical trial might have this name: "A randomized, controlled, blinded clinical trial comparing
miraclecillin to wondersporin." In this trial, some of the patients got miraclecillin and other patients got
wondersporin, but no patients got both.
Entities like Suspect Drug/Causality in the example “Suspect Drug/Causality: paclitaxel”
are treated like titles and are not annotated.
AE in the example below is not annotated either:
• AE outcome: The patient experienced death on [words marked]
The EMR section title ALLERGIES can be annotated; it is often the only mention of an ADE with a list of
drugs.
• “ALLERGIES: amoxicillin and vancomycin” Allergy is an adverse event and is linked to each
drug separately.
• “ALLERGIES: none” Where no drug is named, annotate allergies as a SSLIF with the
assertion “absent.”
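The two ALLERGIES cases can be illustrated with a small parser. This is only a sketch under stated assumptions: the function name, the dictionary keys, and the splitting on commas and "and" are hypothetical, and the guideline text does not state an assertion value for the drug-linked case, so it is left as None here.

```python
import re


def parse_allergies_line(line):
    """Turn an 'ALLERGIES: ...' line into a list of annotation dictionaries."""
    header, sep, body = line.partition(":")
    if not sep or header.strip().upper() != "ALLERGIES":
        return []  # not an ALLERGIES section title
    title = header.strip()
    body = body.strip().rstrip(".")
    if body.lower() == "none":
        # No drug named: the title is a SSLIF with the assertion "absent".
        return [{"entity": "SSLIF", "span": title, "assertion": "absent", "drug": None}]
    # Assumed list format: drugs separated by commas and/or the word "and".
    drugs = [d.strip() for d in re.split(r",|\band\b", body) if d.strip()]
    # One adverse event per drug, each linked separately.
    return [{"entity": "ADE", "span": title, "assertion": None, "drug": d} for d in drugs]


for row in parse_allergies_line("ALLERGIES: amoxicillin and vancomycin"):
    print(row["entity"], "->", row["drug"])
```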
Do not annotate departments or specialties (they are not a SSLIF, Indication or ADE). Infectious disease below
refers to a department, specialist, or specialty.
• Infectious disease concerns could be a fungal infection or bacterial infections. an infectious
disease consultation was obtained
• the patient was placed on iv antibiotics which will be started as per infectious disease
recommendations
Prepositions
Avoid prepositions except where they provide meaning or create a contiguous span, such as when
including locations and coordinated locations relative to a SSLIF.
Do NOT include prepositions in these cases:
• When annotating duration spans, e.g. for three weeks or for an unknown period of time, we
do not include for in the duration span.
• In the noun phrase via intravenous drip we annotate only the intravenous drip span.
DO include prepositions when prepositional phrases:
• Add meaning, such as in frequency spans
– "before/after meals" and "with meals"
– Around three weeks
– Between 1 and three weeks ago
• Complete a span to include a location
– itching of her skin
– Scaling patches on her legs bilaterally
• Create a contiguous span about the SSLIF
– pain on all movements/ pain with abduction
– range of motion is limited on internal and external rotation
– dyspnea on exertion
– Significant abnormalities of retroperitoneal lymphadenopathy
– liver is diffusely increased in echogenicity and coarsened
Test results
Commented lab results are annotated.
• “3 ova and parasites being negative, Giardia being negative in a stool culture that as negative.”
Annotate these items because they are comments asserting that the findings are absent.
Results presented in paragraph form are likely from dictation where some lab results are just read in,
and some are commented upon. The section below would be annotated where there are comments as
follows:
Both normal and abnormal test result lists in the form of uncommented numbers are not annotated as
diagnoses:
Laboratory data showed sodium of 143, potassium 4.1, chloride 105, CO2 of 26,
BUN 4, creatinine 0.7, glucose 90, calcium 9.4. White count 5.4, hemoglobin
12.7, hematocrit 27.7, platelet count 247.
We assume that if a certain test or measurement result is significantly abnormal, the diagnosis is
mentioned in the text separately. For example, if a patient’s blood pressure was 180/100, the report will
most likely mention “high blood pressure.”
Uncommented lab results are not annotated. Culture and culture pending are lab tests and are not
annotated. When the result is positive, you would annotate it as part of a diagnosis. Nothing is
annotated in the examples below for two reasons: we do not annotate procedures or tests themselves, and
we do not annotate uncommented lab results. These may be lab results in text, but the comments need to
refer to something that provides additional meaning, such as an SSLIF, Indication or ADE.
• ‘….took a surface culture and nothing…..’ and ‘Culture pending.’
• “blood culture, no growth”
Similarly, attributes and values in text are annotated only when there is an interpretation associated with
them.
• The cells were cdc20 positive - This is not a SSLIF on its own; it is an attribute of a tumor.
• K+ is 50 - this is a value.
Usually a list of results in a table is structured data from a report; such results are not commented and
therefore are not annotated.
COLOR URINE            Yellow           Yellow
APPEARANCE URINE       Clear            Clear
SPEC GRAVITY, URINE    1.012            1.005-1.03
pH URINE               6.0              4.6-8.0
PROTEIN URINE          Negative         Negative
GLUCOSE URINE          Negative         Negative
KETONES URINE          Negative         Negative
BILIRUBIN URINE        Negative         Negative
OCCULT BLOOD URINE     Negative         Negative
NITRITE URINE          Negative         Negative
UROBILINOGEN URINE     Normal           Normal
LEUKO ESTERASE         Negative         Negative
MICROSCOPIC ?          Not Indicated
Similarly, do NOT annotate microorganism names when they are given as a test request; they can be
annotated as part of a result. So “BABESIOSIS ORDERED” would not be annotated. Depending on context,
“BABESIOSIS SEROLOGY POSITIVE” would be, because it indicates a specific infection.
“Hepatitis C, genotype 1b” Annotate genotypes, subtypes, variants, etc. when given. This may give
information related to treatment decisions.
Do not diagnose or interpret lab results. You can see the result but you cannot say if it is an SSLIF or
normal from just a number. “LABORATORY DATA: Alpha-fetoprotein tumor marker is 4.3”
Longitudinal Information
Annotate what is in the specific record you are viewing. Do not make assumptions or consider
longitudinal information you may know (temporal aspects of adverse events are annotated in a
separate workflow with Naranjo scoring).
• “HISTORY OF PRESENT ILLNESS: ……..history of Burkett’s lymphoma….” The patient is within a
few months of treatment, which is too soon to consider the lymphoma cured. The patient does in fact
go on to have a recurrence of the lymphoma, but here it is annotated as SSLIF and as history.
Punctuation and Spelling
• If symptoms are separated by a “/” they should be annotated separately (i.e. rash/blistering is 2
spans) except when they are abbreviations and will not have individual meaning (i.e. n/v, which
is annotated as one span).
• If drugs are separated by a “/” they should be annotated as one span since they are a drug
combination.
• If a drug name is followed by another in parentheses, this is the brand name (generic name) and
is annotated as one span, e.g. taxol (paclitaxel).
• Annotate misspelled words.
• Annotate words with and without capitalization.
• Do not annotate wrong words. In the example below, the chemotherapy is the CHOP and it is followed by
radiation therapy, not chemotherapy:
o 6 cycles of CHOP followed by radiation chemotherapy
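The slash rules for symptoms and drugs can be sketched as a small helper. This is an illustrative sketch rather than part of the annotation tool: the function name, the entity-type strings, and the contents of the abbreviation set are assumptions (the set would need to be maintained by the project).

```python
# Assumed set of slash abbreviations that have no individual meaning when split.
SLASH_ABBREVIATIONS = {"n/v", "n/v/d"}


def split_slash_span(span, entity_type):
    """Return the list of spans to annotate for a slash-separated mention."""
    if entity_type == "drug":
        return [span]  # drug combinations stay together as one span
    if span.lower() in SLASH_ABBREVIATIONS:
        return [span]  # abbreviations are annotated as one span
    # Otherwise each symptom is annotated separately.
    return [part.strip() for part in span.split("/") if part.strip()]


print(split_slash_span("rash/blistering", "symptom"))
print(split_slash_span("n/v", "symptom"))
print(split_slash_span("hydrocodone/acetaminophen", "drug"))
```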
APPENDIX 1: Additional Annotation Examples and Information
Attending Problem: Supratheraputic INR (964.2 DRUG TOXICITY WARFARIN) (linked to Coumadin)
Resident Problem: Supratheraputic INR (linked to Coumadin)
Attending Plan: No active bleeding (absent). Was 5.0 on Friday per patient. Hold Coumadin. Type and
Cross for possible FFP. Consent for blood products. Recheck INR in am. If increasing (Hypothetical),
would administer Vit K (conditional, link to INR in am)... subcutaneous and consider FFP
if any active bleeding (hypothetical).
Resident Plan: Patient has INR of 6.6 with no evidence of bleeding (absent). Stool
guaic (Absent) negative, therefore, will not reverse immediately.
Plan: If guaic (hypothetical) positive, consent patient for FFP and blood products.
If guaic (hypothetical) positive reverse INR. Consider vitamin K (conditional) If
actively bleeding (hypothetical) FFP. check CBC in am Hold coumadin.
SSLIF
• “Sit hunched over…” This is a sign of the back pain being described.
• “started to have some very formed stools from diarrheal stools” A “formed stool” is a normal
function, and generally not annotated. However, it may be relevant to annotate in context.
• Relapse itself is not a SSLIF.
• Be aware that Hodgkins Disease names are easily confused with severity [classical,
nonclassical, high-grade (3 types), low-grade (3 types)].
• Nausea and vomiting are separate symptoms that are often seen together. Annotate each term
separately, but if the abbreviation n/v is used, annotate it as one span. Similarly, annotate n/v/d as
one span for nausea/vomiting/diarrhea.
• renal function has deteriorated is annotated as a span since the deterioration is part of the
description of the SSLIF and is not a severity dynamic per se.
• Do not annotate normal; it is not a sign, symptom, or disease.
• Locations associated with two SSLIF will be handled in post processing. Annotate as follows:
the right lung has some rhonchi and wheezes in it.
We do not annotate locations or organ systems as entities, so you can’t mark them as absent. The
following would not be annotated.
• No change in his general medical status
• Medical status is unchanged
• no vision changes
• change in voice quality
• Speech and affect are unchanged
• cardiovascular: no change in status
• Cardiovascular: No established abnormality. <…>
• Gastrointestinal: No established abnormality.
• Genitourinary: No established abnormality.
• Musculoskeletal: No established abnormality.
• Neurologic: No established abnormality.
• no abnormality identified in hypopharinx
Chemotherapy
Chemotherapy can be tricky because regimen names often contain information about drug, dosage,
and duration.
• He received CHOP chemotherapy
• CHOP chemotherapy for 6 cycles
CHOP chemotherapy is a span because it is a drug regimen and CHOP names the four drugs used in
combination. It also provides information about duration. Cycles are the number of times the treatment
is repeated at a specified time interval. See http://www.lymphomation.org/chemo-CHOP.htm
Radiation recall – an ADE from the combination of radiation and chemotherapy.
“He tolerated the conditioning regimen well, although he did seem to have a sinus pain reaction
during the BCNU infusion on the first day. This pain was felt to be infusion reaction, although
there was concern that there might have been a radiation recall”
Annotate sinus pain as the adverse reaction to the BCNU with the assertion as “possible.”
Radiation recall is annotated as OSSLIF
Difficult Examples
Indication to Drug Link Followed by Logic in How to Annotate:
/ID The patient developed febrile neutropenia 1/27 and blood cultures from that day revealed
pseudomonas and Strep viridans. Urine cultures showed 35000 colonies of E. faecalis. Cefepime (1/27-
2/27) was initiated (synercid was not used due to a history of adverse effect: myalgias) in addition to the
acyclovir and diflucan (stopped 2/5) which were started earlier for prophylaxis. Daptomycin was started
on 1/29 for VRE. Caspofungin was started 2/7, 2 days after diflucan was stopped as fevers persisted.
The patient remained afebrile for several weeks until 2/27 at which point Cipro and meropenem were
initiated, though no microbes were initiated on culture. The patient remained on acyclovir, caspofungin,
ciprofloxacin, daptomycin, and meropenem (d/c'd 3/10), for the rest of this hospitalization, and
remained afebrile since 3/1. He will be discharged on p.o. ciprofloxacin (to d/c when ANC >1000),
voriconazole, and acyclovir--the latter two medications to take indefinitely.
Suggested Indication to Drug Links:
Pseudomonas (Cefepime)
Strep Viridans (Cefepime)
E. Faecalis (Cefepime)
VRE (Daptomycin)
Logic:
The paragraph indicates that a blood culture was done, which revealed Pseudomonas and
Strep viridans. A urine culture was also done, which showed E. faecalis. These are all bacteria.
The paragraph continues by stating that Cefepime was initiated. If you research this drug you will
see that it is an antibiotic (cephalosporin class) and that it is used for all three bacteria
mentioned previously. That is why the first three links are suggested.
Second, the paragraph clearly provides the following indication to drug relation:
"Daptomycin was started on 1/29 for VRE". That is why the fourth link is suggested.
The rest of the drugs in the paragraph are mentioned, but there is no clear indication of why they
were given.
General Terms
General terms are ones that are not informative. Judgement is required, but descriptions with
greater specificity often follow the use of a general term, and annotating those is preferred.
Examples of general terms are:
Abnormality, concern, complaint, complication, deficit, disability, diagnosis, difficulty, diffuse,
disease, illness, issue, problem, drug, medication, supplement, therapy, treatment
Use your judgement. There are cases where it is appropriate to annotate these words, and you
need to decide based on context. Examples:
• arthritic issues, eye problem, cardiac abnormality
• supplement may be required to add meaning, such as in "vitamin D supplement",
"iron supplement", "herbal supplement"
Reaction is not a general term. It is often associated with an AE.
Double Annotation
When more than one frequency is associated with a drug, double annotate the drug and create
an association with each frequency.
Brand name (generic name) is one span for a drug name. Sometimes discharge notes have a
medication table, and often the generic name is on the line below the brand name. You will need to double
annotate, associating each drug name with the attributes.
Signs and symptoms that are suspected ADEs are annotated as an ADE with the assertion
possible. They are also annotated as an SSLIF (present).
• Large mass measuring 5cm. Large and 5 cm are both severities for mass.
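Double annotation of a drug with more than one frequency can be pictured as one drug annotation per frequency, all sharing the same character span. The sketch below is illustrative only: the DrugAnnotation class, the double_annotate helper, and the example sentence are hypothetical and do not reflect how the annotation tool stores its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrugAnnotation:
    start: int      # character offset where the drug mention begins
    end: int        # character offset just past the drug mention
    text: str       # the drug mention itself
    frequency: str  # the single frequency associated with this copy


def double_annotate(text, drug, frequencies):
    """Create one annotation of the same drug span for each frequency."""
    start = text.index(drug)  # raises ValueError if the drug is not in the text
    end = start + len(drug)
    return [DrugAnnotation(start, end, drug, freq) for freq in frequencies]


note = "Tylenol 500 mg b.i.d. and p.r.n."  # made-up sentence for illustration
for annotation in double_annotate(note, "Tylenol", ["b.i.d.", "p.r.n."]):
    print(annotation)
```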
Not Clear
Sometimes there is no good way to annotate a phrase (bolded) because the meaning is
unclear. In the example below (“capsules 2 daily”), is this 2x/day or 2 capsules? Daily is the frequency,
but 2 can be part of the dose or part of the frequency. Do not annotate what you are not sure of unless
it is associated with an ADE. Report this problem.
• MEDICATIONS: Aleve 220 mg 2-3 tablets p.r.n., hydrocodone/acetaminophen 5/325
p.r.n. for pain, omega-3 at 1000 mg capsules 2 daily and tramadol 50 mg p.r.n
Normal
Normal is not annotated. Examples:
• sensation is diffusely intact
• surgical scars
APPENDIX 2: Protected Health Information (PHI) and Personally
Identifiable Information
The Privacy Rule’s Safe Harbor Method for De-identification (17).
“Under the safe harbor method, covered entities must remove all of a list of 18 enumerated
identifiers and have no actual knowledge that the information remaining could be used, alone or
in combination, to identify a subject of the information. The safe harbor is intended to provide
covered entities with a simple, definitive method that does not require much judgment by the
covered entity to determine if the information is adequately de-identified.”
1. Names; first name, last name, initials, names following titles and other indicators, login name,
screen name, nickname, or handle
2. All geographical subdivisions smaller than a State, including street address, city, county,
precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code,
if according to the current publicly available data from the Bureau of the Census: (1) The
geographic unit formed by combining all zip codes with the same three initial digits contains
more than 20,000 people; and (2) The initial three digits of a zip code for all such geographic
units containing 20,000 or fewer people is changed to 000.
3. All elements of dates (except year) for dates directly related to an individual, including birth
date, admission date, discharge date, date of death; and all ages over 89 and all elements of
dates (including year) indicative of such age, except that such ages and elements may be
aggregated into a single category of age 90 or older;
4. Phone numbers;
5. Fax numbers;
6. Electronic mail addresses;
7. Social Security numbers;
8. Medical record numbers;
9. Health plan beneficiary numbers;
10. Account numbers;
11. Certificate/license numbers;
12. Vehicle identifiers and serial numbers, including license plate numbers;
13. Device identifiers and serial numbers;
14. Web Universal Resource Locators (URLs);
15. Internet Protocol (IP) address numbers;
16. Biometric identifiers, including finger and voice prints;
17. Full face photographic images and any comparable images; and
18. Any other unique identifying number, characteristic, or code (note this does not mean the
unique code assigned by the investigator to code the data)
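Two of the identifiers above, zip codes (item 2) and ages over 89 (item 3), are generalized rather than removed outright. The following is a minimal sketch of those two rules only, assuming hypothetical function names. The set of three-digit prefixes covering 20,000 or fewer people must come from current Census Bureau data; the value used in the example is a made-up placeholder.

```python
def generalize_zip(zip_code, small_population_zip3):
    """Keep only the initial three digits, or '000' for sparsely populated prefixes."""
    prefix = zip_code[:3]
    return "000" if prefix in small_population_zip3 else prefix


def generalize_age(age):
    """Aggregate all ages over 89 into a single '90 or older' category."""
    return "90 or older" if age > 89 else str(age)


# Placeholder prefix set for illustration; not real Census data.
example_small_prefixes = {"999"}
print(generalize_zip("01655", example_small_prefixes))
print(generalize_zip("99950", example_small_prefixes))
print(generalize_age(93))
```

This sketch does not cover the other sixteen identifiers, which are removed rather than generalized.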
APPENDIX 3: Entity and Attribute Tables
This table indicates the attribute fields shown in the Annotation Window for each Entity type.
Note that some fields may be present but are not applicable. For example Adverse is an entity
and attribute for Adverse Effect and is duplicative.
Attribute \ Entity      Adverse effect   Drug        Indication   SSLIF
Adverse                 NA               Yes (>1)    No           NA
Presence (Assertion)    Yes              Yes         Yes          Yes
MedDRA                  Yes              No          No           No
Outcome                 Yes              No          No           No
Dose                    No               Yes (>1)    No           No
Duration                No               Yes         No           No
Frequency               No               Yes         No           No
Route                   No               Yes         No           No
Period                  Yes              Yes         Yes          NA
Reason (Indication)     NA               Yes (>1)    NA           No
Severity                Yes              No          Yes          Yes
Notes:
• Indication (Reason) is a SSLIF that is treated with a drug.
• Severity has a field for assertion but it is not used.
Key:
• No – field is not present
• NA – field is present but not applicable
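The entity and attribute table can also be held as a lookup structure, for example to list which attribute fields are usable for a given entity type. The sketch below is an assumption-laden illustration: the variable and function names are hypothetical, and the cell values are a reading of the table in this appendix, so they should be checked against the Annotation Window before being relied on.

```python
ENTITIES = ("Adverse effect", "Drug", "Indication", "SSLIF")

# One row per attribute; values follow the entity order in ENTITIES.
_ROWS = {
    "Adverse": ("NA", "Yes (>1)", "No", "NA"),
    "Presence (Assertion)": ("Yes", "Yes", "Yes", "Yes"),
    "MedDRA": ("Yes", "No", "No", "No"),
    "Outcome": ("Yes", "No", "No", "No"),
    "Dose": ("No", "Yes (>1)", "No", "No"),
    "Duration": ("No", "Yes", "No", "No"),
    "Frequency": ("No", "Yes", "No", "No"),
    "Route": ("No", "Yes", "No", "No"),
    "Period": ("Yes", "Yes", "Yes", "NA"),
    "Reason (Indication)": ("NA", "Yes (>1)", "NA", "No"),
    "Severity": ("Yes", "No", "Yes", "Yes"),
}
ATTRIBUTE_TABLE = {attr: dict(zip(ENTITIES, values)) for attr, values in _ROWS.items()}


def usable_attributes(entity):
    """List attributes marked 'Yes' (including 'Yes (>1)') for an entity type."""
    return [attr for attr, row in ATTRIBUTE_TABLE.items() if row[entity].startswith("Yes")]


print(usable_attributes("SSLIF"))
```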
APPENDIX 4: Routes of Drug Administration and Abbreviations
FDA Standards Manual list of routes of drug administration. For the full list, which includes FDA
codes and NCI concept codes, see (19):
http://www.fda.gov/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/DataStandardsManualmonographs/ucm071667.htm
NAME
DEFINITION
SHORT NAME
AURICULAR (OTIC)
Administration to or by way of the ear.
OTIC
BUCCAL
Administration directed toward the cheek, generally
from within the mouth.
BUCCAL
CONJUNCTIVAL
Administration to the conjunctiva, the delicate
membrane that lines the eyelids and covers the
exposed surface of the eyeball.
CONJUNC
CUTANEOUS
Administration to the skin.
CUTAN
DENTAL
Administration to a tooth or teeth.
DENTAL
ELECTRO-OSMOSIS
Administration through the diffusion of a substance
through a membrane in an electric field.
EL-OSMOS
ENDOCERVICAL
Administration within the canal of the cervix
uteri. Synonymous with the term intracervical.
E-CERVIC
ENDOSINUSIAL
Administration within the nasal sinuses of the head.
E-SINUS
ENDOTRACHEAL
Administration directly into the trachea.
E-TRACHE
ENTERAL
Administration directly into the intestines.
ENTER
EPIDURAL
Administration upon or over the dura mater.
EPIDUR
EXTRA-AMNIOTIC
Administration to the outside of the membrane
enveloping the fetus
X-AMNI
EXTRACORPOREAL
Administration outside of the body.
X-CORPOR
HEMODIALYSIS
Administration through hemodialysate fluid.
HEMO
INFILTRATION
Administration that results in substances passing into
tissue spaces or into cells.
INFIL
INTERSTITIAL
Administration to or in the interstices of a tissue.
INTERSTIT
INTRA-ABDOMINAL
Administration within the abdomen.
I-ABDOM
INTRA-AMNIOTIC
Administration within the amnion.
I-AMNI
INTRA-ARTERIAL
Administration within an artery or arteries.
I-ARTER
INTRA-ARTICULAR
Administration within a joint.
I-ARTIC
INTRABILIARY
Administration within the bile, bile ducts or
gallbladder.
I-BILI
INTRABRONCHIAL
Administration within a bronchus.
I-BRONCHI
INTRABURSAL
Administration within a bursa.
I-BURSAL
INTRACARDIAC
Administration within the heart.
I-CARDI
INTRACARTILAGINOUS
Administration within a cartilage; endochondral.
I-CARTIL
INTRACAUDAL
Administration within the cauda equina.
I-CAUDAL
INTRACAVERNOUS
Administration within a pathologic cavity, such
as occurs in the lung in tuberculosis.
I-CAVERN
INTRACAVITARY
Administration within a non-pathologic cavity, such as
that of the cervix, uterus, or penis, or such as that
which is formed as the result of a wound.
I-CAVIT
INTRACEREBRAL
Administration within the cerebrum.
I-CERE
INTRACISTERNAL
Administration within the cisterna magna
cerebellomedularis.
I-CISTERN
INTRACORNEAL
Administration within the cornea (the transparent
structure forming the anterior part of the fibrous tunic
of the eye).
I-CORNE
INTRACORONAL, DENTAL
Administration of a drug within a portion of a tooth
which is covered by enamel and which is separated
from the roots by a slightly constricted region known as
the neck.
I-CORONAL
INTRACORONARY
Administration within the coronary arteries.
I-CORONARY
INTRACORPORUS CAVERNOSUM
Administration within the dilatable spaces of the
corporus cavernosa of the penis.
I-CORPOR
INTRADERMAL
Administration within the dermis.
I-DERMAL
INTRADISCAL
Administration within a disc.
I-DISCAL
INTRADUCTAL
Administration within the duct of a gland.
I-DUCTAL
INTRADUODENAL
Administration within the duodenum.
I-DUOD
INTRADURAL
Administration within or beneath the dura.
I-DURAL
INTRAEPIDERMAL
Administration within the epidermis.
I-EPIDERM
INTRAESOPHAGEAL
Administration within the esophagus.
I-ESO
INTRAGASTRIC
Administration within the stomach.
I-GASTRIC
INTRAGINGIVAL
Administration within the gingivae.
I-GINGIV
INTRAILEAL
Administration within the distal portion of the small
intestine, from the jejunum to the cecum.
I-ILE
INTRALESIONAL
Administration within or introduced directly into a
localized lesion.
I-LESION
INTRALUMINAL
Administration within the lumen of a tube.
I-LUMIN
INTRALYMPHATIC
Administration within the lymph.
I-LYMPHAT
INTRAMEDULLARY
Administration within the marrow cavity of a bone.
I-MEDUL
INTRAMENINGEAL
Administration within the meninges (the three
membranes that envelope the brain and spinal cord).
I-MENIN
INTRAMUSCULAR
Administration within a muscle.
IM
INTRAOCULAR
Administration within the eye.
I-OCUL
INTRAOVARIAN
Administration within the ovary.
I-OVAR
INTRAPERICARDIAL
Administration within the pericardium.
I-PERICARD
INTRAPERITONEAL
Administration within the peritoneal cavity.
I-PERITON
INTRAPLEURAL
Administration within the pleura.
I-PLEURAL
INTRAPROSTATIC
Administration within the prostate gland.
I-PROSTAT
INTRAPULMONARY
Administration within the lungs or its bronchi.
I-PULMON
INTRASINAL
Administration within the nasal or periorbital sinuses.
I-SINAL
INTRASPINAL
Administration within the vertebral column.
I-SPINAL
INTRASYNOVIAL
Administration within the synovial cavity of a joint.
I-SYNOV
INTRATENDINOUS
Administration within a tendon.
I-TENDIN
INTRATESTICULAR
Administration within the testicle.
I-TESTIC
INTRATHECAL
Administration within the cerebrospinal fluid at any
level of the cerebrospinal axis, including injection into
the cerebral ventricles.
IT
INTRATHORACIC
Administration within the thorax (internal to the ribs);
synonymous with the term endothoracic.
I-THORAC
INTRATUBULAR
Administration within the tubules of an organ.
I-TUBUL
INTRATUMOR
Administration within a tumor.
I-TUMOR
INTRATYMPANIC
Administration within the aurus media.
I-TYMPAN
INTRAUTERINE
Administration within the uterus.
I-UTER
INTRAVASCULAR
Administration within a vessel or vessels.
I-VASC
INTRAVENOUS
Administration within or into a vein or veins.
IV
INTRAVENOUS BOLUS
Administration within or into a vein or veins all at once.
IV BOLUS
INTRAVENOUS DRIP
Administration within or into a vein or veins over a
sustained period of time.
IV DRIP
INTRAVENTRICULAR
Administration within a ventricle.
I-VENTRIC
INTRAVESICAL
Administration within the bladder.
I-VESIC
INTRAVITREAL
Administration within the vitreous body of the eye.
I-VITRE
IONTOPHORESIS
Administration by means of an electric current where
ions of soluble salts migrate into the tissues of the
body.
ION
IRRIGATION
Administration to bathe or flush open wounds or body
cavities.
IRRIG
LARYNGEAL
Administration directly upon the larynx.
LARYN
NASAL
Administration to the nose; administered by way of the
nose.
NASAL
NASOGASTRIC
Administration through the nose and into the stomach,
usually by means of a tube.
NG
NOT APPLICABLE
Routes of administration are not applicable.
NA
OCCLUSIVE DRESSING TECHNIQUE
Administration by the topical route which is then
covered by a dressing which occludes the area.
OCCLUS
OPHTHALMIC
Administration to the external eye.
OPHTHALM
ORAL
Administration to or by way of the mouth.
ORAL
OROPHARYNGEAL
Administration directly to the mouth and pharynx.
ORO
OTHER
Administration is different from others on this list.
OTHER
PARENTERAL
Administration by injection, infusion, or implantation.
PAREN
PERCUTANEOUS
Administration through the skin.
PERCUT
PERIARTICULAR
Administration around a joint.
P-ARTIC
PERIDURAL
Administration to the outside of the dura mater of the
spinal cord.
P-DURAL
PERINEURAL
Administration surrounding a nerve or nerves.
P-NEURAL
PERIODONTAL
Administration around a tooth.
P-ODONT
RECTAL
Administration to the rectum.
RECTAL
RESPIRATORY (INHALATION)
Administration within the respiratory tract by inhaling
orally or nasally for local or systemic effect.
RESPIR
RETROBULBAR
Administration behind the pons or behind the eyeball.
RETRO
SOFT TISSUE
Administration into any soft tissue.
SOFT TIS
SUBARACHNOID
Administration beneath the arachnoid.
S-ARACH
SUBCONJUNCTIVAL
Administration beneath the conjunctiva.
S-CONJUNC
SUBCUTANEOUS
Administration beneath the skin;
hypodermic. Synonymous with the term SUBDERMAL.
SC
SUBLINGUAL
Administration beneath the tongue.
SL
SUBMUCOSAL
Administration beneath the mucous membrane.
S-MUCOS
TOPICAL
Administration to a particular spot on the outer surface
of the body. The E2B term TRANSMAMMARY is a
subset of the term TOPICAL.
TOPIC
TRANSDERMAL
Administration through the dermal layer of the skin to
the systemic circulation by diffusion.
T-DERMAL
TRANSMUCOSAL
Administration across the mucosa.
T-MUCOS
TRANSPLACENTAL
Administration through or across the placenta.
T-PLACENT
TRANSTRACHEAL
Administration through the wall of the trachea.
T-TRACHE
TRANSTYMPANIC
Administration across or through the tympanic cavity.
T-TYMPAN
UNASSIGNED
Route of administration has not yet been assigned.
UNAS
UNKNOWN
Route of administration is unknown.
UNKNOWN
URETERAL
Administration into the ureter.
URETER
URETHRAL
Administration into the urethra.
URETH
VAGINAL
Administration into the vagina.
VAGIN
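A route mention in a note can be mapped to the short names in this table with a simple lookup. The sketch below covers only a handful of rows copied from the table, and the function name and matching logic (exact match on the upper-cased full name or short name) are assumptions; free-text variants such as "by mouth" would need additional handling.

```python
# A few NAME -> SHORT NAME pairs taken from the table above (not the full list).
ROUTE_SHORT_NAMES = {
    "INTRAMUSCULAR": "IM",
    "INTRAVENOUS": "IV",
    "INTRAVENOUS BOLUS": "IV BOLUS",
    "INTRAVENOUS DRIP": "IV DRIP",
    "NASOGASTRIC": "NG",
    "ORAL": "ORAL",
    "SUBCUTANEOUS": "SC",
    "SUBLINGUAL": "SL",
    "TOPICAL": "TOPIC",
}


def normalize_route(mention):
    """Return the short name for a route mention, or None if it is not in the excerpt."""
    key = mention.strip().upper()
    if key in ROUTE_SHORT_NAMES:
        return ROUTE_SHORT_NAMES[key]
    if key in ROUTE_SHORT_NAMES.values():
        return key  # the mention already is a short name
    return None


print(normalize_route("intravenous drip"))
print(normalize_route("sc"))
```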
APPENDIX 5: Frequency of Drug Administration and Other
Abbreviations
Common frequency of drug administration abbreviations
Abbreviation   Definition
q.d.           once a day
b.i.d.         twice a day
t.i.d.         three times a day
q.i.d.         four times a day
q.h.s.         before bed
5X a day       five times a day
q.4h           every four hours
q.6h           every six hours
q.o.d.         every other day
p.r.n.         as needed
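For downstream processing, the fixed-schedule abbreviations can be converted to a number of administrations per day. This is a rough sketch with a hypothetical function name; it assumes that q.h.s. counts as once a day and that hourly schedules run around the clock (24 divided by the interval), and it returns None for "as needed" because no fixed count applies.

```python
# Keys are abbreviations with periods and spaces removed and letters lower-cased.
DOSES_PER_DAY = {
    "qd": 1,
    "bid": 2,
    "tid": 3,
    "qid": 4,
    "qhs": 1,      # assumption: bedtime dosing is once a day
    "5xaday": 5,
    "q4h": 6,      # 24 hours / 4
    "q6h": 4,      # 24 hours / 6
    "prn": None,   # as needed: no fixed number of doses
}


def doses_per_day(abbreviation):
    """Look up doses per day; raises KeyError for abbreviations not in the table."""
    key = abbreviation.lower().replace(".", "").replace(" ", "")
    return DOSES_PER_DAY[key]


print(doses_per_day("b.i.d."))
print(doses_per_day("q.4h"))
```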
Other Common Abbreviations
• NAD – No Acute Distress
• SR – Sustained Release
• XL and ER – extended release
• IR – immediate release
Abbreviations for Chemo regimens
• ABVD
• CHOP
• OEPA
• OEPA-COPDAC
APPENDIX 6: Reference Tables (Cardio/Cancer)
INTENSITY OF PULSE SCALE (0 – 4): (this is for your information, but we do not annotate
numbers without a reference/scale)
• 0 indicating no palpable pulse
• 1+ indicating a faint, but detectable pulse
• 2+ suggesting a slightly more diminished pulse than normal
• 3+ is a normal pulse
• 4+ indicating a bounding pulse.
The way we describe normal heart sounds is by saying “s1 and s2 present” which
corresponds to the valves closing. An S4 is an additional heart sound often heard due to a
“stiff heart wall.” Some of the causes are hypertension, fibrosis, or a disease condition called
HOCM. So this is a variant of the normal.
Non-Hodgkin Lymphoma (NHL)
1) Low-Grade Non-Hodgkin Lymphoma
• Follicular Lymphoma
• Chronic Lymphocytic Leukemia (CLL)
• Small Lymphocytic Lymphoma
• Lymphoplasmacytoid Lymphoma/Waldenstrom Macroglobulinemia
• Marginal Zone Lymphoma
2) High-Grade Non-Hodgkin Lymphoma
• Diffuse Large B-Cell Lymphoma
• Primary CNS Lymphoma
• Burkitt Lymphoma
• Mantle Cell Lymphoma
3) T-Cell/Natural Killer Cell Non-Hodgkin Lymphoma
• Peripheral T-Cell Lymphoma, Not Otherwise Specified (PTCL-NOS)
• Angioimmunoblastic T-Cell Lymphoma
• Anaplastic Large Cell Lymphoma
• T-Cell Lymphoma/Natural Killer (NK) Cell Lymphoma
• Hepatosplenic T-Cell Lymphoma
• Enteropathy-associated T-Cell Lymphoma
• Cutaneous T-Cell Lymphoma/Mycosis Fungoides
Hodgkin/Non-Hodgkin Lymphoma Classification
To avoid confusion, note that often people who are not oncologists just use the most generic
names like Hodgkin versus non-Hodgkin’s lymphoma when describing this cancer.
Hodgkin Lymphoma
1) Classical Hodgkin Lymphoma
• Nodular Sclerosis
• Mixed Cellularity
• Lymphocyte-rich
• Lymphocyte-depleted
2) Nodular Lymphocyte Predominant
We annotate cancer stage as severity. The staging system tells you how involved the
cancer is.
In general, stage I is usually a tumor that is local; stage II is usually a localized tumor but
larger in size or invading adjacent structures; stage III typically involves lymph nodes; and
stage IV is widespread, metastatic disease. These are just general rules and they vary
depending on the cancer.
The reason I think this should be annotated as severity is that the stage dictates treatment.
Stage I/II cancers can usually be surgically removed and have possible adjuvant
chemotherapy/radiation, while metastatic Stage III/IV disease is usually treated with
chemotherapy alone or with radiation. So I believe this information is clinically relevant and
should be included.
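Stage mentions of the simple form "stage I" through "stage IV" can be located with a regular expression so they can be proposed as severity spans. This is a minimal sketch with a hypothetical function name; it deliberately ignores substages (for example IIIA) and numeric forms (for example "stage 4"), which a real pre-annotator would need to handle.

```python
import re

# Longer numerals are listed first so "IV" and "III" are preferred over "I".
_STAGE = re.compile(r"\bstage\s+(IV|III|II|I)\b", re.IGNORECASE)


def find_stage_spans(text):
    """Return (start, end, matched_text) for each simple stage mention."""
    return [(m.start(), m.end(), m.group(0)) for m in _STAGE.finditer(text)]


print(find_stage_spans("Stage IV metastatic disease"))
```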
When cancer drugs are given, there may be a reaction. The physician may reduce the dose or skip a
week or both and may readminister. The reaction may not reappear right away. Note that cancer patients
get weaker over the course of treatment, which impacts how they react.
APPENDIX 7: Annotation Tool Notes
NLP Objectives:
Current annotation informs ADE Pharmacovigilance. This includes use of current classes,
MedDRA annotation, Naranjo scoring, etc. To maintain annotation focus, anything added must be
essential to the objectives in this list.
Disease
Drugs
Adverse Drug Events
Discourse relations
· Temporal relations
· Causal relations
· Contrastive relations
Severity
Tool Use Notes
Navigating Protégé
There are three panels: The leftmost panel is the Class Navigation Bar with the annotation
schema. The middle panel is the record window with the clinical note. You can view the
complete note by using the scroll bar to the right in the panel. The rightmost panel is for attribute
annotation and comments. Note this panel has two sections and section sizes can be adjusted
by dragging the middle bar. The upper section is for attributes. For some entities, there will be
many possible attributes and a scroll bar will appear on the right. Scroll up and down to see
what is available. The comments section in the Protégé tool is free text and as such, it is not
computable. [Note that annotation of the clinical notes is to make them computable.] Therefore the
best use of this field is for communication between editors and annotators.
Figure: The first panel lists the classes [1], the second panel is the medical record window [2]
and the third panel is an attribute annotation window [3]. To annotate most classes, click the
class in the left panel or in the fast annotate bar [4] and highlight it in the middle panel. Some
additional attributes and associations [5] are made from the class panel and the annotation
window. A few are made from just the annotation window, i.e. Period.
There is a website with the annotation guidelines, videos on how to use the tool, and other
resources. http://ummsres12.umassmed.edu/jt/index.php/annotation
Open Protégé and you can select from recent projects or you can navigate to the folder with the
patient file you will be annotating or editing, and open it. At the very top is a menu bar. If you
select Window, you have the option to increase font size, which most people need to do. On
the third menu bar there are several tabs. At the end is a Knowtator tab. Click this to open the
file if it doesn’t open automatically. A machine annotated record will appear with various colors
marking it up.
Above the Knowtator tab there is a white tool bar which contains the ‘Save’ icon. It is very
important to save your work when you end a session. There is also a section for “Text source
collection:” Arrows allow you to scroll through names of patient records. The name appears at
the top of the record window as “text source: date.txt” in blue. The rightmost arrows allow you to
scroll between annotations of the same category within the note.
The mark-ups are visible as text highlighted in colors corresponding to classes of interest.
These are described in the Annotation Guidelines and the classes appear in the leftmost panel
– the Class Navigation Bar. Most of the leftmost classes open up to subclasses and arrows
indicate which ones have further lower level classes.
A best practice is to start by scanning the text to familiarize yourself with the content. Next, read
it carefully line by line and review it for (1) missing and (2) incorrect annotation mark-ups. It is
particularly important to mark all PHI (Protected Health Information). These annotations will also
be used to remove the personal data from the files in the de-identification process, prior to any
data sharing.
To mark text, select (left click) the class type from the left Class Navigation Bar. On first use, a
drop down appears to the right of the class asking if you want to “create an annotation” or to
“fast annotate.” Select “fast annotate.” This class will now appear in the “fast annotate” bar.
You can select classes with one click directly from this bar (without the drop down) on
subsequent uses of this class. After selecting a class for fast annotation, selecting it again in the
left Class Navigation Bar will give you an additional choice to “remove” that class from fast
annotate. Once a class is selected, move your cursor to the Record Window and using a
highlight motion, highlight the term (or part of it) you would like to mark as this class. It will now
be marked with the class color. To unmark an entity, select it in the Record Window by left
clicking it. Near the top of the annotation window is span edit, where you can choose clear or
delete. Span edit can also shrink or enlarge an annotation.
There are relations or associations that are annotated manually. Classes can be entities or
attributes. A common relation is between the entity Drug and its attributes of Dosage, Route,
Frequency, Duration, Indication and Adverse Event (see Appendix 2). To annotate a relation
between an entity and its attributes, left-click on the entity or entity span, then right-click on the
attribute or attribute span. The attribute is highlighted in a dotted box when you have created
the relation. For a drug, for example, left-click on the drug span, then right-click on the attribute
span, and continue this process for each attribute associated with the drug. If an entity has
several attributes from the same category, you must create each association separately. For
example, if a Drug caused multiple adverse effects, you must create each association of the Drug
to the Adverse Event individually. In other words, you must create several identical drug spans
and link each one to an Adverse Event. Confirm your work in the Annotation Window. Adjust your
Annotation Panels or scroll to see all of the fields that are filled in; these fields change
depending on the entity.
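The one-association-per-attribute rule can be pictured as data. The following is a minimal sketch in Python, not Knowtator's internal model: the names Span, Relation, span_of and link_each are hypothetical, and the sentence and the drug name "DrugX" are invented for illustration. It models only the rule that each attribute gets its own link; it does not reproduce the tool's requirement of repeated drug spans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A marked stretch of text (hypothetical structure)."""
    label: str  # class name, e.g. "Drug" or "AdverseEvent"
    start: int  # character offset where the span begins
    end: int    # character offset just past the end of the span
    text: str


@dataclass(frozen=True)
class Relation:
    """One association between an entity span and one attribute span."""
    entity: Span
    attribute: Span


def span_of(label, phrase, text):
    """Build a Span for the first occurrence of `phrase` in `text`."""
    start = text.index(phrase)
    return Span(label, start, start + len(phrase), phrase)


def link_each(entity, attributes):
    """Create one Relation per attribute, never one combined link."""
    return [Relation(entity, attribute) for attribute in attributes]


# Invented sentence, for illustration only.
note = "Rash and nausea developed after starting DrugX."
drug = span_of("Drug", "DrugX", note)
events = [
    span_of("AdverseEvent", "Rash", note),
    span_of("AdverseEvent", "nausea", note),
]
relations = link_each(drug, events)
```

Two adverse events give two separate relations, which matches what should be visible in the Annotation Window after linking each one.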
MedDRA
MedDRA is a five-level hierarchy of medical terms:
- System organ class (SOC): most general
- High level group term (HLGT)
- High level term (HLT)
- Preferred term (PT)
- Lowest level term (LLT): most specific
All adverse events should be assigned only Preferred Terms (PT). Lowest level terms are
more specific, but the LLT list contains synonyms, so there is a lot of redundancy.
If the search does not return any PT result but only an LLT result (e.g. Itching), double left click the
LLT term and choose its corresponding PT (Pruritus).
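The LLT-to-PT step can be sketched as a lookup. This is a minimal illustration under stated assumptions, not the tool's search: the table holds only the Itching/Pruritus pair mentioned above, the function name to_preferred_term is hypothetical, and a real table would be loaded from a MedDRA release.

```python
# Tiny illustrative tables; a real mapping would come from a MedDRA release.
PREFERRED_TERMS = {"Pruritus"}
LLT_TO_PT = {"itching": "Pruritus"}


def to_preferred_term(term):
    """Return the PT for a term, or None when no mapping is known."""
    key = term.strip().lower()
    # A term that is already a PT maps to itself.
    for pt in PREFERRED_TERMS:
        if pt.lower() == key:
            return pt
    # Otherwise fall back to the LLT synonym table.
    return LLT_TO_PT.get(key)
```

A return value of None corresponds to the case where no meaningful match is found and a synonym search is needed.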
When annotating an Adverse Event, after making the selection, you will see a pop up window
with MedDRA PT terms. Select the best option from the list. It will insert the term you
highlighted into the MedDRA field. Do not choose the class MedDRA; if you do, it will insert the
term you highlight into the MedDRA code field regardless of possible term matches, and it is not
a code. If there is no meaningful match, you can search for a synonym. For example, “expire”
does not produce a result but searching on “death” will. Sometimes a fairly generic term is the
best choice.
To manually enter a MedDRA code, left click on the Adverse Event. In the annotation window,
click the gray box in the MedDRA code field. Click the square with a superscript + in the
ConceptCode field and in the Term to manually add information.

Follow this link to browse the most current version of the MedDRA ontology from
anywhere except the Annotation Server (i.e. outside the secure environment).
http://ummsres14.umassmed.edu/OntoSolr/browse

On the Annotation Server, please use this URL to search and browse MedDRA terms:
http://ummsqhslxweb01.umassmed.edu/OntoSolr/browse
To review, change or just see the possible MedDRA matches for an AE again, left click on the
AE. In the MedDRA Code field, pick the diamond in a star and the pop up should reappear by
the term.
To check a MedDRA annotation, double click the grey square in the MedDRA Code field.
The annotation window will refresh, the MedDRA “instance” will be at the top, and the
annotated MedDRA values will appear: the MedDRA code in the ConceptCode field, an empty
MedDRA Code field, and the term you selected in the Term field.
Note: if you have added MedDRA terms correctly, then the number of adverse event terms
annotated should be equal to the number of MedDRA terms annotated.
Attributes
Assertion, Period, Outcome
Default Values
There is no need to annotate default values for attributes.
Assertion = Present
Period = Current
Outcome = notMentioned
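Because defaults are not annotated, any downstream consumer has to fill them in. The following is a minimal sketch in Python of that step; the function name with_defaults and the dictionary representation are assumptions, and it treats all three attributes as applicable to every entity, which is a simplification.

```python
# Default values as listed above.
DEFAULTS = {
    "Assertion": "Present",
    "Period": "Current",
    "Outcome": "notMentioned",
}


def with_defaults(annotated):
    """Return a copy of the attributes with unannotated ones set to defaults."""
    filled = dict(DEFAULTS)
    # Explicitly annotated values override the defaults; None counts as unset.
    filled.update({k: v for k, v in annotated.items() if v is not None})
    return filled
```

For example, an annotation carrying only Period = History keeps that value and receives Assertion = Present and Outcome = notMentioned.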
When annotating a span, or returning to an annotation, select the term or span to mark with an
assertion and the fields will appear in the Annotation Window. Click the “Add Instance” icon (the
square with the superscript +) and select the appropriate option.
Assertion: Present, Absent, Conditional, Hypothetical,
Period: Current, History
Outcome:
When you are done, be sure to SAVE your work by clicking on the Save icon.
APPENDIX 8: Annotation and Tooling Notes Prior to ADE
Deviations from i2b2 Guidelines
Conditional:
“Conditional” is used when the mention of the medical problem asserts that the patient
experiences the problem only under certain conditions.
He got 1 day of voriconazole for possible presumed aspergillosis, but given that he was
improving on the other antibiotics and his CT was not consistent with aspergillosis and he was
no longer on immunosuppression, it seemed like a less likely diagnosis. His urine and blood
cultures were all negative. Given these findings that presumed diagnosis is community-acquired
pneumonia, he will complete a 10-day course of azithromycin and Omnicef. The patient has
been instructed to return if his fevers or cough worsen and he gets worsening shortness of
breath as this may indicate that the patient has a recurrent aspergillosis
We have not come across examples of conditional value in our corpus so far. We believe that
i2b2 examples of Conditional value fall into "Present" category. For example, “dyspnea on
exertion” is a medical term and should be annotated as “dyspnea on exertion” with Present
assertion value (not as just “dyspnea” with Conditional assertion value).
Prepositions:
We do not include prepositions when annotating duration spans. For example, in “for three weeks” or
“for an unknown period of time” we do not include “for” in the duration span. Here we differ from
i2b2, where the preposition “for” is included in duration spans.
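When converting spans that follow the i2b2 convention, the leading preposition can be trimmed mechanically. This is a minimal sketch in Python, assuming spans are (start, end) character offsets; the function name trim_duration_span is hypothetical, and only “for” is listed because it is the only preposition named above.

```python
import re

# Only "for" is named in the guideline; extend this tuple if other
# prepositions should be excluded as well (an assumption, not a rule).
PREPOSITIONS = ("for",)
_LEADING = re.compile(
    r"^(?:%s)\s+" % "|".join(map(re.escape, PREPOSITIONS)),
    re.IGNORECASE,
)


def trim_duration_span(text, start, end):
    """Shrink a (start, end) span so it excludes a leading preposition."""
    match = _LEADING.match(text[start:end])
    if match:
        start += match.end()
    return start, end
```

Applied to the span “for three weeks”, the returned offsets cover only “three weeks”; a span without a leading preposition is returned unchanged.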
Export the Annotation to XML3
One way to get annotations out of Knowtator is to use the XML export. Select the menu option Knowtator
-> Export annotations to XML and then follow the directions. This will generate one XML file per
text source in your collection. The XML format used directly parallels the data model that
Knowtator uses for storing annotations in Protégé. Looking at the XML files may actually be
helpful to understand how Knowtator represents annotations in Protégé.
http://knowtator.sourceforge.net/faq.shtml
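An exported file can be read with a standard XML parser. The sketch below is illustrative only: the element and attribute names (annotation, mention, span, spannedText, classMention, mentionClass) and the sample content are assumptions about what an export looks like, so check them against your own files before relying on this. The function name read_annotations is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed shape of an export; the sample content is invented.
SAMPLE = """
<annotations textSource="note1.txt">
  <annotation>
    <mention id="m1"/>
    <span start="0" end="12"/>
    <spannedText>azithromycin</spannedText>
  </annotation>
  <classMention id="m1">
    <mentionClass id="Drug">Drug</mentionClass>
  </classMention>
</annotations>
"""


def read_annotations(xml_text):
    """Return one dict per annotation: class, start, end, text."""
    root = ET.fromstring(xml_text.strip())
    # Map each mention id to its class name.
    classes = {
        cm.get("id"): cm.findtext("mentionClass")
        for cm in root.iter("classMention")
    }
    rows = []
    for ann in root.iter("annotation"):
        mention = ann.find("mention")
        span = ann.find("span")
        if mention is None or span is None:
            continue  # skip annotations without a span or mention
        rows.append({
            "class": classes.get(mention.get("id")),
            "start": int(span.get("start")),
            "end": int(span.get("end")),
            "text": ann.findtext("spannedText"),
        })
    return rows
```

Calling read_annotations(SAMPLE) yields a single row for the Drug span; for real files, read the file contents and pass them in the same way.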
Table: Annotation Tools and other Resources
Software | URL | Document
Protégé | http://protege.cim3.net/download/oldreleases/3.3.1/basic/ |
Knowtator | http://knowtator.sourceforge.net/ | http://knowtator.sourceforge.net/install.shtml
MedDRA Browser | http://www.meddramsso.com/subscriber_download_tools_browser.asp | Need MedDRA131E, import the folder named MedAscii.
I2b2 Medication Annotation Guideline | http://lancet.googlecode.com/files/Preliminary.Annotation.Guidelines.7.9.pdf |
3 This is old text and refers to outdated versions of the Knowtator Plugin, MedDRA, and i2b2, but we wanted to
retain this historical information.
Semi-Automated Annotation with the BioNLP named entity tagger−Lancet
[This section was written some time ago and it is not clear how much of this is still applicable]
To increase the annotation speed, we apply the BioNLP named entity tagger Lancet, which is
trained on the annotated data, to automatically identify the named entities. An annotator then
corrects the automatically labeled corpus. The annotated corpus will be fed into the learner and
used to train a new model. Such interactive steps are repeated until a satisfactory performance
is met. This section is to guide the oracle on how to correct the automatically labeled corpus.
After importing the NLP tool's annotations, the annotator attribute of each annotation is assigned
an NLP tool name, such as Lancet UWM.
First, a default annotator is assigned. You can configure this by clicking Knowtator -> Configure -> default Annotator. This configuration does not change the attributes of any existing annotation
and applies only to new annotations.
In the event of partially correct annotations, the annotator needs to delete the annotation first
and then re-annotate. Otherwise, the correction work will not be recorded by Knowtator.
When an entity is annotated more than once, please keep the correct annotation
and delete the other ones. If both or all of them are correct, delete the extras until only one is left.
Please look through the whole article and insert the absent annotations.
There is a trade-off between precision and recall of the NLP tools. Here, we prefer a high
precision annotation.
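The rule for repeated annotations (keep one copy, delete the rest) can be sketched for the simple case where the copies are identical. This is a minimal illustration in Python; the dictionary representation and the function name drop_duplicates are assumptions, and deciding which of several differing annotations is correct remains a manual step.

```python
def drop_duplicates(annotations):
    """Keep the first annotation for each (label, start, end); drop repeats."""
    seen = set()
    kept = []
    for ann in annotations:
        key = (ann["label"], ann["start"], ann["end"])
        if key not in seen:
            seen.add(key)
            kept.append(ann)
    return kept
```

Annotations that overlap but differ in label or offsets are left untouched, so the annotator still reviews them by hand.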
APPENDIX 9: Summary - Annotation Processes/Tooling Changes/Post
Processing Notes for the ADE Pharmacovigilance Project
1. Reviewed PHI and PII requirements. Significantly streamlined PHI annotation and made
all PHI markings the same color for a simplified view.
2. We now have access to MedDRA releases and the terms used in annotation are now
updated with each release. MedDRA annotation has been brought into the annotation
tool; a major time saver. Upon selecting an adverse event, a MedDRA pop-up window
shows the top 10 matches to the selected term or span. Selection of a MedDRA term
from the pop-up window populates the MedDRA fields with term and concept codes.
Manual annotation of MedDRA terms is still possible and an updated browser on the
virtual machine makes searching an easier process.
3. Inter-annotator agreement is a priority as the team expands. Established a regular
meeting to compare and discuss annotations. Incorporation of conclusions, rules and
examples in the guidelines is now a routine process.
4. Updated guidelines with more examples, filled in gaps, and created a section with examples
to specifically aid inter-annotator agreement. There is also a history section, and another for
tracking major changes to processes and tooling.
5. Created videos demonstrating use of the Annotation Tool.
6. Added a webpage for Annotation related resources.
7. Established a workflow process and file system on the virtual machine to enable
annotation, editing and other separated workflows.
8. Post annotation processing will include:
a. assigning CTCAE (20) categories to severity annotations
b. assigning default values for
i. Assertion = present
ii. Period = current
iii. Outcome = notMentioned.
c. Negated words are marked with the Assertion “Absent”. They will be detected in
the negation algorithm. Negated word examples are: nontender, anicteric,
asymptomatic.
d. Annotation of actual dosage calculated from drug dosage, frequency
e. Drug names with XL or % in the names to be dealt with. Not all have the XL, and
we have moved to putting the % in with dose
f. Locations associated with two SSLIF will be handled in post processing.
Annotate as follows:
i. the right lung has some rhonchi and wheezes in it.
ii. pain and swelling of his leg
iii. joint aches and pains
9. Do not annotate general terms such as “problem” and “disease”. See Appendix 1 for
more examples.
10. Make all relevant relations, regardless of distance between terms
11. Multiple indications for a drug are allowed
12. Multiple ADEs are allowed for a drug
13. Stage of cancer is a severity. Logic:
We annotated stage as severity. The staging system tells you how involved the
cancer is.
In general, stage I is usually a tumor that is local; stage II is usually a localized tumor
but larger in size or invading adjacent structures; stage III typically involves lymph
nodes; and stage IV is widespread, metastatic disease. These are just general rules
and they vary depending on the cancer.
The reason I think this should be annotated as severity is that the stage dictates
treatment. Stage I/II cancers can usually be surgically removed and have possible
adjuvant chemotherapy/radiation, while metastatic Stage III/IV disease is usually
treated with chemotherapy alone or with radiation. So I believe this information is
clinically relevant and should be included.
14. Do not annotate numbers as severity unless there is a scale or frame of reference,
or unless it is an ADE.
15. You can now associate more than one span of dosage information to a drug. NLP will
use it for learning and actual dosage can be determined computationally.
carbamazepine 200 mg extended release 2 tablets twice daily
spiriva 18 mcg inhalation capsule one daily
Glipizide 5 mg, a half tablet daily.
Tramadol 50mg tabs 1-2 tabs every 4 hrs
16. So it is not lost: in the beginning OSSD was used for other Signs, Symptoms and other
Diseases. It was changed to S/S/LIF, Signs, Symptoms and Laboratory Findings (as per
Balaji who put this in Knowtator). We are only annotating commented lab results.
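Item 15 above notes that actual dosage can be determined computationally from the annotated spans. The following is a minimal arithmetic sketch in Python using the examples from that item; the function names are hypothetical, the values are assumed to be already extracted as numbers, and the result is arithmetic only, not clinical guidance.

```python
def daily_dose(strength, units_per_dose, doses_per_day):
    """Total amount per day, in the same unit as `strength`."""
    return strength * units_per_dose * doses_per_day


def doses_per_day_from_interval(hours_between_doses):
    """Convert 'every N hrs' into a number of doses per 24 hours."""
    return 24 / hours_between_doses


# carbamazepine 200 mg, 2 tablets, twice daily
carbamazepine_mg = daily_dose(200, 2, 2)

# Glipizide 5 mg, a half tablet daily
glipizide_mg = daily_dose(5, 0.5, 1)

# Tramadol 50 mg, 1-2 tabs every 4 hrs: a range rather than a single value
per_day = doses_per_day_from_interval(4)
tramadol_mg_range = (daily_dose(50, 1, per_day), daily_dose(50, 2, per_day))
```

Keeping the tramadol result as a (low, high) pair preserves the "1-2 tabs" range instead of collapsing it to one number.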
APPENDIX 10: Entity and Attribute Diagrams
PHI
DATE
MRN
AGE OVER 90
SSN
LOCATION
PHONE/FAX/PAGER
NAME
IDENTIFIERS
ELECTRONIC IDENTIFIERS
ST
PO BOX
STATE / ZIP CODE
COMPANY
HOSPITAL
NAMED SITE (ABBREV. OF BUILDINGS)
CITY
*ADE
PRESENT (default)
ABSENT
POSSIBLE
CONDITIONAL
HYPOTHETICAL
NOT ASSOC./PT
PRESENCE
ASSERTION
MEDICATION
(Drug Name)
*DOSES
*DURATION
*FREQUENCY
*ROUTE
CURRENT (default)
PERIOD
HISTORY
UNK
*REASON / INDICATION
*Association in record and annotation window
ADE
PRESENCE
ASSERTION
Yes
association in record and annotation window
PRESENT (default)
ABSENT
POSSIBLE
CONDITIONAL
HYPOTHETICAL
NOT ASSOC./PT
SSLIF
CURRENT (default)
PERIOD
HISTORY
UNK
SEVERITY TYPE
association in annotation
window only
Example:
mild
markedly
severe
very
stage IB
ADVERSE
PRESENCE
ASSERTION
N/A
PRESENT (default)
ABSENT
POSSIBLE
CONDITIONAL
HYPOTHETICAL
NOT ASSOC./PT
MEDDRA CODE
ADE
OUTCOME
PERIOD
RECOVER
DIED
NOT COMPLETELY RECOVER
NOT MENTIONED (default)
CURRENT (default)
HISTORY
UNK
REASON
SEVERITY TYPE
N/A
Example:
severe
mild
markedly
very
PRESENCE
ASSERTION
PRESENT (default)
ABSENT
POSSIBLE
CONDITIONAL
HYPOTHETICAL
NOT ASSOC./PT
CURRENT (default)
PERIOD
REASON/INDICATION
HISTORY
UNK
SEVERITY TYPE
REASON/INDICATION
Example:
mild
markedly
severe
very
stage X
N/A