ASIST Proceedings Template

Issues in the Development of a Thesaurus for Patients’ Chief Complaints
in the Emergency Department
Stephanie W. Haas
School of Information and Library Science, CB#3360, 100 Manning Hall, University of North Carolina,
Chapel Hill, NC 27599-3360. Email: stephani@ils.unc.edu
Debbie A. Travers
Department of Emergency Medicine, School of Medicine, CB#7594, University of North Carolina,
Chapel Hill, NC 27599-7594. Email: dtravers@med.unc.edu
When a patient visits the Emergency Department
(ED), the reason the patient is seeking care is
recorded as the Chief Complaint (CC). Beyond its
role in the patient’s care, there is interest in the
CC for secondary uses. Clinicians and
epidemiologists can use CC for research. ED
clinicians and administrators incorporate CC data
into quality monitoring and improvement efforts.
Public health officials can use it as data for health
surveillance. But there is no controlled
vocabulary for recording CC, or standard for a CC
component in the patient record. Travers (2003)
completed a crucial first step toward the creation
of a thesaurus for CC by analyzing a corpus of
CCs to determine the nature of the language used
by triage nurses, and the concepts that were
expressed. Her analysis also illuminated many
issues concerning the content and structure of a
CC thesaurus that must be discussed before the
thesaurus can be developed. Using Cimino’s
1998 article, “Desiderata for Controlled Medical
Vocabularies in the Twenty-First Century”, as a
framework, we discuss these issues and the
resulting decisions that the thesaurus
development team, along with other stakeholders,
will encounter.
Introduction
When a patient visits the Emergency Department (ED),
the triage nurse asks the reason the patient is seeking care.
This information is recorded as the Chief Complaint (CC),
and it influences many aspects of the ED visit, such as the
speed with which the patient is seen, or any immediate
treatment needed. The initial actions of the ED clinical
team therefore depend on the CC. Beyond its role in the
patient’s care, there is interest in the CC for secondary uses.
Clinicians and epidemiologists can use CC for research.
ED clinicians and administrators incorporate CC data into
quality monitoring and improvement efforts. Public health
officials can use it as data for health surveillance, both for
bioterrorism surveillance and other kinds of problems, such
as identifying potential SARS victims, or early
identification of flu outbreaks. But the CC is not collected
in a way that will readily support these uses; there is no
controlled vocabulary for recording CC, or standard for a
CC component in the patient record. In a survey of North
Carolina and Seattle EDs, Travers et al. (2003) found that
CC was collected in a variety of formats, including free
text, locally developed or adapted term lists and
commercially-provided lists.
Without a standardized
vocabulary, aggregating CC data requires immense human
effort to collect all the CCs that express the same concept.
Without this “translation” step, results are likely to be
unreliable.
There have been calls for a standardized CC vocabulary
from many sources. Initial work by the Emergency Nurses
Association (ENA) on an emergency nursing minimum data
set was incorporated into a multi-disciplinary effort
sponsored by the Centers for Disease Control and
Prevention, which led to the release of Data Elements for
Emergency Department Systems (DEEDS) 1.0 in 1997 (Bradley,
1995, 1996; NCIPC, 1997; Pollock et al., 1998). DEEDS
data element 4.06 is “Chief Complaint”. Since no standard
vocabulary exists for documentation of CC, the DEEDS
and ENA leaders recommended evaluation and adaptation
of established terminologies as a solution to the need for an
ED CC controlled vocabulary.
The shortcomings of existing CC data have been
documented, and include lack of CCs in electronic form
and CCs documented in a variety of formats (free text,
locally developed lists, and field lengths varying from 12 to
250 characters) (Travers et al., 2003; Coben et al., 1996;
Cline, 2000). In spite of the lack of a CC terminology, CC data are
currently used for many public health, academic, and
hospital-based surveillance applications. For example, ED
CC data are a component of several regional bioterrorism
surveillance systems, in which various techniques such as
keyword searches and natural language processing (NLP)
techniques (e.g., stemming) are used to identify certain CC
terms indicative of potential biological agent exposure
(Greenko, Mostashari, Fine & Layton, 2003; Ivanov,
Wagner, Chapman & Olszewski, 2002). One reason for the
interest in the CC for disease surveillance is that the CC is
recorded at the start of the ED visit, and therefore has the
potential to be available in a timelier manner than
conventional disease-reporting methods. Researchers have
shown that CC data can be used to detect disease outbreaks
up to 2 weeks sooner than the conventional approach,
which involves waiting for lab reports and final diagnoses
to trigger reporting (Tsui et al., 2001; Teich et al., 2002).
Given that the CC is already used as a source of data for
surveillance and other secondary uses, improving the
quality of CC data is of utmost importance.
A CC thesaurus would represent the concepts, preferred
terms, and synonyms needed to support a controlled
vocabulary for CC. Developing a thesaurus is a significant
undertaking that should involve a variety of stakeholders:
ED clinicians, those interested in secondary uses of CC
such as research or health surveillance, and professional
and standards organizations whose approval and continuing
support would be necessary for its widespread acceptance
and use, as well as ongoing maintenance. Additional
operational constraints arise from the busy ED
environment. Triage nurses typically have only a couple of
minutes to evaluate a patient’s condition. Any information
system based on the thesaurus must allow nurses to enter an
accurate CC quickly, and the CC must be informative to the
other ED clinicians who base decisions about the patient’s
initial care on the CC.
Travers (2003) completed a crucial first step toward the
creation of a thesaurus for CC by analyzing a corpus of CCs
to determine the nature of the language used by triage
nurses, and the concepts that were expressed. Her analysis
also illuminated many issues concerning the content and
structure of a CC thesaurus that must be discussed before
the thesaurus can be developed. Using Cimino’s 1998
article, “Desiderata for Controlled Medical Vocabularies in
the Twenty-First Century”, as a framework, we discuss
these issues and the resulting decisions that the thesaurus
development team, along with other stakeholders, will
encounter.
Analysis of Chief Complaints
We present a brief overview of Travers’ research here;
details may be found in her dissertation and related papers
(Travers, 2003; Travers & Bodenreider, 2002; Travers &
Haas, 2003). She created a corpus of one year’s worth of
CCs, collected from three southeastern EDs representing
urban, rural and suburban academic medical centers. CCs
were entered as free text at two of the sites. At the third
site, nurses could use a vendor-developed controlled list of
terms, enter the CC as free text, or both. The CC could be
up to 30 characters at two of the sites, but was limited to 12
at the other.
Travers developed the Emergency Medical Text
Processor (EMT-P), a set of modules that cleans and
normalizes text and extracts standardized terms. CC
entries, like other kinds of unedited free text, are messy.
Travers found a variety of abbreviations, truncations,
misspellings, local expressions, coordinate structures,
punctuation, and other interesting (and occasionally
puzzling) characteristics. She then identified the concepts
expressed in the cleaned CCs, and mapped them to
concepts identified in the National Library of Medicine’s
Unified Medical Language System Metathesaurus®. Her
work accomplished several goals, including the
development of a list of almost 4,000 concepts used by
triage nurses in CCs, collection of groups of words,
phrases, abbreviations and other expressions that nurses
used to name each concept, and identification of ED
concepts that were not in the Metathesaurus®. These
missing concepts are not included in any of the UMLS
component vocabularies, and would need to be added in
order to provide complete coverage of the ED CC domain.
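The cleaning and normalization step described above can be illustrated with a minimal sketch. This is not EMT-P itself: the function name, the two lookup tables, and their contents are hypothetical stand-ins chosen for illustration, and a real system would need far larger resources and a mapping step to UMLS concepts.

```python
import re

# Hypothetical lookup tables for illustration only; they are not EMT-P's data.
ABBREVIATIONS = {"abd": "abdominal", "pn": "pain", "ha": "headache"}
MISSPELLINGS = {"hemmoroid": "hemorrhoid", "hemroid": "hemorrhoid"}


def normalize_cc(entry: str) -> list[str]:
    """Clean one free-text CC entry and return its normalized parts."""
    text = entry.lower()
    # Split coordinate structures such as "abd pn/ha" on common separators.
    parts = re.split(r"[/&,;]", text)
    cleaned = []
    for part in parts:
        # Replace stray punctuation with spaces, keeping "." so that numbers
        # such as 100.7 survive, then trim leading/trailing periods per word.
        words = re.sub(r"[^a-z0-9. ]", " ", part).split()
        words = [w.strip(".") for w in words]
        # Correct known misspellings first, then expand abbreviations.
        words = [MISSPELLINGS.get(w, w) for w in words if w]
        words = [ABBREVIATIONS.get(w, w) for w in words]
        if words:
            cleaned.append(" ".join(words))
    return cleaned
```

Under these assumptions, an entry such as "ABD PN/HA" would be split into two parts and each part expanded separately before any concept matching takes place.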
This paper focuses on a set of findings from Travers’
analysis of CC terms that raise specific issues for the design
and development of a CC thesaurus. In the next section, we
summarize Cimino’s (1998) desiderata for controlled
medical vocabularies, which provide a useful framework in
which to consider these findings. In the following sections,
we discuss each issue and its implications for the CC
thesaurus.
Desiderata
We can group the most pertinent of Cimino’s desiderata
into those relating to the content, structure, and
administration of the thesaurus.
Content
First and foremost, the thesaurus should provide complete
coverage of all the concepts in the domain. In other words,
users of the vocabulary should be able to express whatever
they need to. This implies a process of laying out the scope
of the domain, and then enumerating the concepts therein.
Cimino urges that a controlled vocabulary be “concept
oriented”. Concepts should be distinct from each other.
Terms should refer to a single concept, and should be neither
overly vague nor ambiguous. Note that this does not rule out the
inclusion of synonyms. A thesaurus can represent concepts
and a preferred term for the concept, but also include other
terms, phrases, abbreviations, etc. that refer to the same
concept.
Finally, there must be some way to recognize gaps in the
coverage of the thesaurus, and define procedures for filling
them. Gaps may occur because of discoveries in the
discipline, recognition of distinctions that had not been
made before, or simple omissions at the time of original
development.
Structure
Decisions about the structure of the thesaurus are closely
related to decisions about its scope and content. Cimino
recommends that a controlled vocabulary provide varying
levels of granularity in concepts. A very specific concept
may be more informative, but often, especially in the early
stages of medical care such as triage, only a more general
concept can be used. For example, the CC may be rash,
and only later in the visit can the specific type or cause of
the rash be determined.
A related question is what the desired “unit” of concept is.
If only simple “single-element” units are included, then
there must be some rules for combining them to create more
complex concepts. Cimino illustrates this with the example
of bone fractures. One could try to enumerate all the
possible combinations of types of fractures and bone
names, or one could include the fracture types and the
bones, and specify post-coordination rules that will
eliminate the possibility of producing nonsensical
combinations.
Usage rules could be expanded to include representation
of the context in which a concept can be used, or
relationships with other data fields. This is directly related
to the scope of the thesaurus itself, and how it articulates
with those of related domains.
Administration
Two of the desiderata concern the creation of a controlled
vocabulary. First, what is the purpose for which it will be
used? Cimino urges that vocabularies be designed to
support many purposes, rather than just one. This implies
that designers should know how the thesaurus will be used.
The next question is whether a new controlled vocabulary is
actually needed, or whether an existing one can be used,
perhaps with minor adaptations. Uncontrolled additions to
a vocabulary can lead to redundancy and muddled
concepts. On the other hand, proliferation of many special-purpose
vocabularies complicates data sharing and
merging. Finally, it is important to consider who is
responsible for maintaining and updating the vocabulary.
Buy-in by stakeholder organizations is important to foster
use of the vocabulary, but shared responsibility (for
maintenance, for example) can be difficult to coordinate.
Findings for Thesaurus Development
In this section, we use Cimino’s content, structure, and
administration desiderata as a framework for discussing the
implications of Travers’ findings for the design of the CC
thesaurus. Travers also identified additional issues specific
to the ED that are included in our discussion.
Content of the Thesaurus
Although the corpus from which Travers (2003) extracted
CC concepts was large, it cannot be assumed that all
concepts that triage nurses need to express were used in it.
It is likely, however, that the most commonly-used concepts
were found. The 564 most frequent CC entry types
accounted for 50% of the entry tokens. There was general
agreement among the three EDs from which the sample was
drawn as to the most common concepts. The 4 most
frequent CC concepts in the corpus (abdominal pain, chest
pain, fever, and headache) accounted for 18% of the entry
tokens. These same concepts were the 4 most frequent
concepts found in the National Center for Health Statistics
(NCHS) annual survey of EDs, where they accounted for
about 20% of patient visits (McCaig & Burt, 2001, 2003;
McCaig & Ly, 2002).
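The type and token figures above rest on a simple cumulative count: rank the distinct entries (types) by frequency and ask how many are needed to cover a given share of all entries (tokens). A minimal sketch of that calculation follows; the function name and the sample data in the usage note are hypothetical, not drawn from the corpus.

```python
from collections import Counter


def types_needed_for_coverage(entries, target=0.5):
    """Return how many of the most frequent entry types are needed
    to account for at least `target` of all entry tokens."""
    counts = Counter(entries)
    total = sum(counts.values())
    covered = 0
    # Walk the types from most to least frequent, accumulating tokens.
    for n, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered >= target * total:
            return n
    # Only reached for empty input (returns 0) or a target above 1.0.
    return len(counts)
```

For a toy list with five "chest pain" entries, three "fever", one "rash", and one "headache", a single type already covers half of the tokens, which mirrors on a small scale the skew reported for the corpus.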
EDs in different parts of the country, or those that serve
different populations, may use additional terms for known
concepts, and may also need additional concepts (although
these are likely to be less common ones). For example,
Travers found that each ED had a different coding scheme
for describing special patient populations, such as
unidentified patients, or those with major trauma or
cardiopulmonary arrest. The controlled vocabulary must
either provide a single standardized scheme, or the
thesaurus must allow for easy addition of local variants.
Although the latter approach would allow individual EDs to
continue using their customary scheme, it has drawbacks
for aggregating data across EDs.
Given the expectations of encountering new concepts and
new ways of expressing concepts, any interface based on
the CC thesaurus must allow triage nurses to add additional
free text entries to the CC when they feel that the existing
vocabulary is insufficient. This is not necessarily bad news:
these entries have a role to play in satisfying the next
requirement.
Since it is impossible to definitively identify all possible
CC concepts, or anticipate new ones, effort should be
focused on developing means for recognizing and
incorporating new concepts into the thesaurus. The
matching strategies that Travers used to map concepts
found in the CCs to concepts in the UMLS Metathesaurus®
can be the basis for regular monitoring to identify new
concepts. Free text CCs that do not match any UMLS
concept, or that are incorrectly matched, may be evidence
of either a new or previously unseen concept, or a new or
previously unseen way of expressing a known concept.
These could signal the need to update the thesaurus.
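One way to picture this kind of regular monitoring is a routine that tallies normalized CCs with no match in the current term list and reports the recurring ones for expert review. The sketch below assumes exact string matching against a flat set of known terms, which is far simpler than the matching strategies Travers used, and the `min_count` threshold is an illustrative parameter. It also cannot detect entries that match the wrong concept; those would still need expert validation.

```python
from collections import Counter


def unmatched_entries(normalized_ccs, known_terms, min_count=2):
    """List CCs that match no known term, most frequent first.

    Entries seen fewer than `min_count` times are omitted so that
    reviewers see recurring candidates rather than one-off noise.
    """
    known = {term.lower() for term in known_terms}
    misses = Counter(
        cc.lower() for cc in normalized_ccs if cc.lower() not in known
    )
    return [(cc, n) for cc, n in misses.most_common() if n >= min_count]
```

Each reported entry would then be judged by domain experts as either a new concept or a new way of expressing a known one.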
Mapping CC concepts to concepts in the UMLS helps
fulfill Cimino’s requirement of concept orientation for the
CC thesaurus. In addition, the mapping process highlighted
several areas of concept definition that need investigation.
Some common CC concepts were not found in the UMLS.
For example, patients are frequently brought to the ED for
clearance, that is, to verify their medical stability before they are
sent to a jail or substance abuse detoxification center. CCs
such as jail clearance or detox clearance represent a
function of the ED whose concepts are possible additions to
the UMLS.
Other CC concepts were closely related to concepts in the
UMLS, but were not considered matches by the panel of
experts who validated a sample of matches (Travers, 2003).
These mismatches or partial matches often represent
differences in perspective between EM clinicians and the
medical vocabularies present in the UMLS. For example,
the UMLS contains motor vehicle accident (MVA), which
includes bicycles and motorcycles, but seems to be
restricted to accidents on roads. EM clinicians need more
specific concepts, such as motor vehicle crash (MVC), or
motorcycle crash. The corpus included related CCs such as
ped vs. car and bike vs. car, indicating collisions involving
pedestrians or bicycles.
In other cases, the UMLS contains multiple concepts,
which, for the purposes of representing CCs, may be
deemed by experts to be similar enough to be combined.
For example, the UMLS includes both the concept
shortness of breath, and the concept dyspnea. The phrase
shortness of breath is also listed as a synonym for dyspnea.
These examples of “missed synonymy” are often an artifact
of merging multiple vocabularies (Hole & Srinivasan,
2000). The thesaurus development team will need to
decide whether to merge similar concepts into a single
concept.
Other concepts also require consideration by domain
experts. For example, the term congestion is ambiguous
when seen in isolation away from the ED setting. In the
validation study (Travers, 2003), one expert said that even
though the usual assumption in the ED is that it refers to
nasal congestion, the UMLS concept that was proposed as
a match was broader, and could refer to nasal, pulmonary
or hepatic congestion. These examples emphasize the
importance of defining concepts in a specific clinical
context, in this case, the ED.
Analysis of the CC corpus revealed extensive use of
synonyms, abbreviations, and truncations, especially for
common concepts. For example, the most common
concept, abdominal pain, had 7 synonyms. However, the
analysis also highlighted the frequent misspellings of terms
and abbreviations, for example, the 10 different ways that
hemorrhoid was spelled. Should the thesaurus include ill-formed
(but common) synonyms in addition to well-formed
ones? Another way of framing the question is, should the
thesaurus represent CC language the way it is used, or in its
preferred form? Analysis of free-text CCs, as well as other
types of clinical notes, will require recognition of these
misspelled words. The EMT-P system currently contains
modules that recognize and transform them to standard
terms, but an argument could be made that including them
in the thesaurus would make it more robust. This is a
question of “division of labor” between the thesaurus itself
and additional processing routines needed to support
various applications.
Structure of the Thesaurus
Many of Travers’ findings concern the scope and
structure of the thesaurus, but also have implications for the
structure of the patient record.
Co-morbidities are existing medical conditions that may
be related to the patient’s CC or affect patient management.
Common co-morbidities found in the CC corpus included
patient currently pregnant and diabetes. These are not the
reasons patients come to the ED; their visits are due to
specific complaints such as labor pains or altered mental
status. The thesaurus must include co-morbidities, but
recording them in a separate field in the patient record may
be more precise than combining them with the CC.
When patients’ visits were associated with an injury, CCs
in the corpus frequently included the method of injury
(MOI), such as MVC/Neck Pain, or Alt MS/Fell (altered
mental state, fell). Further, many CCs consisted only of an
MOI, with no information about symptoms. The frequency
with which MOI was used indicates that the thesaurus
should include MOI concepts, but as with co-morbidities,
perhaps they should be recorded in a separate field. This
would also make extracting injury-related data for research
and public health surveillance easier.
Several types of modifiers and qualifiers were found in
the CCs. Modifiers are words that alter the severity,
location, or acuity of a CC (Chute and Elkin, 1997), such as
right leg, or acute arm pain. Qualifiers qualify, or
circumscribe, the meaning of a term, such as history of
seizure. Cimino mentioned the choice between including
only simple concepts which can then be combined as they
are needed, or pre-coordinating them into complex
concepts. The UMLS is inconsistent in this regard, since it
is derived from the structures of its component
vocabularies. For example, it includes left flank pain, but
not left femur pain as concepts. One approach would be for
the thesaurus to include the most common combinations,
such as right leg, and allow post-coordination when triage
nurses need to express less common combinations. (Note
the advantage that the corpus-based approach to developing
the thesaurus gives; frequency information is readily
available.) The thesaurus would then need rules to prohibit
ill-formed combinations, such as right rash. It might be
possible to base at least some rules on the UMLS semantic
types, such as allowing a description of laterality (right or
left) to modify a body location, but not a sign or symptom.
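A rule of this kind can be sketched as a small table of allowed pairings between modifier types and semantic types. The type labels below are simplified stand-ins for UMLS semantic types, and the term-to-type assignments are hypothetical examples rather than values taken from the Metathesaurus.

```python
# Hypothetical, simplified type assignments for illustration.
MODIFIER_TYPE = {"right": "laterality", "left": "laterality"}
SEMANTIC_TYPE = {
    "leg": "body location",
    "flank": "body location",
    "rash": "sign or symptom",
    "pain": "sign or symptom",
}

# Each pair states that a modifier type may be applied to a semantic type.
ALLOWED_PAIRS = {("laterality", "body location")}


def can_combine(modifier: str, head: str) -> bool:
    """Return True if the rule table permits post-coordinating the two terms."""
    pair = (MODIFIER_TYPE.get(modifier.lower()), SEMANTIC_TYPE.get(head.lower()))
    # Unknown terms yield None in the pair and are rejected by default.
    return pair in ALLOWED_PAIRS
```

With this table, right leg is accepted and right rash is rejected; rejecting unknown terms by default is a conservative choice that an actual interface might relax.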
Other candidates for pre-coordination are frequently
occurring conjoined CCs. For example, nausea and
vomiting frequently occurred together, and there is a
UMLS concept that represents this combination. Other
frequent combinations, such as fever and vomiting, might
also be included as a pre-coordinated concept. This implies
that in some cases, more than one CC can be included in a
single field; we discuss the need for multiple fields below.
Some allowable combinations of CCs may also be based on
semantic types. For example, a common combination in the
corpus was body part plus a lesion or injury type, such as
bump on head or finger laceration.
Several kinds of expressions of time were found in the
CCs. Temporal expressions represented duration of a
complaint (for 3 weeks), the frequency with which a
problem occurred (twice a day), and the time at which an
event such as an injury or medical procedure took place
(yesterday, June 4th). These expressions require more
analysis to determine their structure and use in the CC. As
with other qualifiers, rules for their combination with other
CC terms are needed; for example, leg injury every a.m.
does not make sense.
Numbers were used to express dates and times, but also
other information such as temperatures, and visual acuity.
As with temporal expressions, further analysis is needed to
understand their structure and use.
We have already mentioned that rules are needed for
post-coordination of qualifiers and modifiers with CC
concepts, and for combining CCs. Other kinds of rules could
express relationships between fields in the patient record.
For example, MOI was frequently the only information in
the CC. If instead it is recorded in its own field, should
there be a rule also requiring an entry in the CC field?
Another example concerns identifying an entry such as
temp 100.7 as a synonym for fever. It may depend on other
information in the record, such as the time of day, and the
age of the patient. These sorts of rules go beyond
specifying well-formed post-coordinations into the realm of
inference.
The final structural issue we present concerns the number
of CC fields the patient record should contain. Our survey
of EDs (Travers et al., 2003) found that most electronic
records allowed only one CC field, but some allowed 2, 3
or more. The size of the fields ranged from 12 characters to
“paragraphs”. Travers found that many entries in the
corpus included multiple CC concepts, such as abdominal
pain & headache & fever, which contains three separate
concepts. This raises the issue of whether a standardized
patient record should have more than one CC field. If so,
what guidelines should be specified for their use? For
example, one field could be designated the primary field,
which should always have an entry, while the other(s) could
be secondary, used only when more than one CC is
needed.
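The primary and secondary field guideline suggested above could look like the following sketch. The record class, the separator set, and the choice to treat the first listed complaint as primary are all assumptions made for illustration; deciding which complaint is primary is exactly the kind of guideline that stakeholders would need to settle.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ChiefComplaintRecord:
    primary: str                                         # must always be filled
    secondary: list[str] = field(default_factory=list)   # used only if needed


def split_entry(entry: str) -> ChiefComplaintRecord:
    """Split a multi-concept CC entry into primary and secondary fields."""
    parts = [p.strip() for p in re.split(r"[&/,;]", entry) if p.strip()]
    if not parts:
        # Enforce the guideline that the primary field always has an entry.
        raise ValueError("primary CC field must have an entry")
    # Assumption: the first complaint listed is treated as primary.
    return ChiefComplaintRecord(primary=parts[0], secondary=parts[1:])
```

Applied to abdominal pain & headache & fever, this yields one primary complaint and two secondary complaints.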
Administration of the Thesaurus
Cimino urged that a controlled vocabulary be designed to
support many purposes. The primary purpose of the CC
thesaurus is to support a controlled vocabulary for
recording CC in ED patient records. A thesaurus allows the
development of a concept-oriented vocabulary, and the
designation of preferred terms and synonyms (possibly
including ill-formed synonyms). A crucial element of
support for this primary purpose lies in the design of user
interfaces that will help triage nurses select and/or enter
informative, unambiguous terms in the hectic environment
of the ED. There are many possibilities. Nurses could pick
one or more CCs from a list of preferred terms. The list
could be sorted according to frequency of use, organized by
type of problem, body part or system, or some other
scheme. Another approach would allow nurses to start to
enter free text CCs, and be prompted with suggested
completions drawn from the preferred term list, from which
they would select the most accurate one. Or, free text
entries could be converted to controlled terms by the
information system in the background, using EMT-P.
Research is needed to determine the most appropriate
interface designs, but the thesaurus should not limit the
possibilities.
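The prompted-completion approach can be sketched as a prefix search over the preferred term list, ordered by frequency of use. The function name, the frequency table, and the counts in the usage note are hypothetical; an actual interface would need usability research of the kind called for above.

```python
def suggest(prefix: str, term_frequencies: dict[str, int], limit: int = 5) -> list[str]:
    """Return up to `limit` preferred terms that start with `prefix`,
    most frequently used first (ties broken alphabetically)."""
    p = prefix.lower().strip()
    if not p:
        return []
    matches = [t for t in term_frequencies if t.lower().startswith(p)]
    matches.sort(key=lambda t: (-term_frequencies[t], t))
    return matches[:limit]
```

For example, with illustrative counts of 900 for abdominal pain, 120 for abscess, and 80 for abrasion, typing "ab" would offer those three terms in that order.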
The thesaurus will also support a variety of secondary
analyses of CC, which could be used for bioterrorism and
other types of health surveillance, research, quality
improvement efforts of ED care, etc. As long as there is no
controlled vocabulary for CC, this information is
exceedingly difficult and time-consuming to use.
Another possible use for the thesaurus is to support
processing of clinical notes, such as nurse and physician
notes, in the ED. Applications could include translating
concepts used in the notes into a standardized vocabulary,
indexing, searching, or summarization. This assumes that
there is substantial similarity in the concepts and language
used in clinical notes and the CC, an assumption which
needs to be tested. Since CC and ED clinical notes are both
within the domain of emergency medicine, there is likely to
be abundant overlap of concepts, but the structure of the
language may differ. The limited length of many CC fields,
along with the fast pace of triage, encourages (or forces)
substantial use of abbreviations, acronyms, and truncations.
Space limitations may not be as severe for notes; on the
other hand, time constraints may have the same effect.
Stetson, Johnson, Scotch and Hripcsak (2002) did find
differences in the language used by hospital clinicians in
various types of notes and reports.
By mapping the CC concepts to UMLS concepts, Travers
was able to provide some information about whether there
is an existing medical vocabulary that could be adapted for
the CC. We have already mentioned that there are some
concepts missing from the UMLS; any vocabulary would
need to have these added. The two vocabularies that
contained the most concepts in the CC corpus were Clinical
Terms Version 3 (also known as the Read Codes, England,
National Health Service Centre for Coding and
Classification, 1999), which covered 65% of the CC
concepts, and Systematized Nomenclature of Human and
Veterinary Medicine: SNOMED International, Version 3.5
(Cote, Roger A., editor, College of American Pathologists,
1998), which covered 50% of the CC concepts. These two
vocabularies have recently been merged into SNOMED CT
(College of American Pathologists, 2002), which is being
incorporated into the UMLS Metathesaurus®. This seems
to be a good candidate for consideration.
The last administrative requirement from Cimino that we
mention here concerns who takes responsibility for
distributing and maintaining the thesaurus. This is a
question that must be resolved by the various organizations
that have an interest in medical vocabularies, and will also
depend on the decision to adapt an existing vocabulary or
create a new one.
Conclusions and Future Work
Cimino’s 1998 “Desiderata” laid out requirements for a
sound controlled vocabulary or thesaurus on a general
level, applicable to any medical vocabulary. Travers
(2003), and Travers & Haas (2003) reported on research
that gathered and analyzed a large portion of the
information needed to build a thesaurus that would satisfy
these requirements, including concept orientation,
recognition of synonyms, and means of identifying needed
updates and modifications of the thesaurus over time. We
have placed Cimino’s other requirements in the specific
context of the CC, framing questions relating to scope and
coverage (e.g., pre- or post-coordination of concepts, and
merging similar concepts), and the structure of the
thesaurus (e.g., inclusion and use of MOI and qualifiers and
modifiers).
Further work is needed to clarify these issues and develop
guidelines for the development of the CC thesaurus. First,
we will expand the comparison of CC terms and concepts
with existing vocabularies to determine whether any can be
easily adapted for our needs. We will explore both the
coverage of the vocabularies, and policy and licensing
decisions as to their availability.
The second step is to expand the CC corpus and continue
to search for new concepts and terms. The original corpus
contains CCs from 3 southeastern teaching hospitals, and it
is possible that EDs in other parts of the country or in other
types of hospitals (e.g., smaller nonacademic hospitals) see
different kinds of problems, or express them differently.
Our intent is to identify systematic areas in which coverage
is needed, rather than to attempt the impossible task of
finding every last concept or term. For example, EDs in
other parts of the country may see different kinds of injuries
than those seen in the southeast (e.g., injuries arising from
outdoor activities in cold weather, such as skiing or ice
fishing). We plan to partner with researchers in other parts
of the country to pursue this goal.
Finally, we plan to convene a symposium of experts and
other stakeholders to discuss the open issues of content and
policy in the design of the CC thesaurus and fields in the
patient record that we have introduced in this paper.
It should be noted that the development of the CC
thesaurus alone is not sufficient to make CC data more
readily available for use; there are additional operational
issues. One concern was revealed in the ED survey
(Travers et al., 2003). In North Carolina, 46% of the
responding EDs reported that CC was collected on paper,
rather than in electronic form. A data entry process would
be required before CC data could be aggregated, and the
effort and delay would make any kind of real-time
surveillance infeasible. The same survey reported that all
16 EDs in Seattle collected CC electronically, but clearly
one cannot assume that CCs will be immediately available
from EDs in all parts of the country.
A controlled vocabulary for CC is necessary to increase
the utility of this data element for practice, clinical research,
and health surveillance purposes. In this paper we have
outlined some of the decisions that lie ahead for the
thesaurus development team and the ED clinical
community.
ACKNOWLEDGMENTS
National Library of Medicine training grant number
LM07071 supported Travers’ work on this project.
Funding was also provided by the North Carolina Office of
Public Health Preparedness and Response in the
Bioterrorism Branch of the Epidemiology Section of the
Division of Public Health. The authors wish to thank
members of NCEDD (http://www.ncedd.org) for technical
support and helpful discussions.
REFERENCES
Bradley, V. (1995). Toward a common language: emergency
nursing uniform data set (ENUDS). Journal of Emergency
Nursing, 21(3), 248-250.
Bradley, V. (1996). Innovative informatics: development of an
emergency data set: a worthwhile challenge. Journal of
Emergency Nursing, 22(3), 238-240.
Chute, C. G. & Elkin, P. L. (1997). A clinically derived
terminology: Qualification to reduction. Proceedings of the
AMIA Symposium 1997, 570-574.
Cimino, J. J. (1998). Desiderata for controlled medical
vocabularies in the twenty-first century. Methods of
Information in Medicine, 37, 394-403.
Cline, S. (2000). Morbidity and mortality associated with
Hurricane Floyd – North Carolina, September-October 1999.
Morbidity and Mortality Weekly Report, 49(17), 369-372.
Coben, J.H., Dearwater, S.R., Garrison, H.G. & Dixon, B. W.
(1996). Evaluation of the emergency department logbook for
population-based surveillance of firearm-related injury. Injury
Prevention, 28(1), 188-193.
Greenko, J., Mostashari, F., Fine, A., & Layton, M. (2003).
Clinical evaluation of the emergency medical services (EMS)
ambulance dispatch-based syndromic surveillance system, New
York City. Journal of Urban Health: Bulletin of the New York
Academy of Medicine, 80(2, suppl. 1).
Hole, W. T. & Srinivasan, S. (2000). Discovering missed
synonymy in a large concept-oriented Metathesaurus.
Proceedings of the AMIA Symposium 2000, 354-358.
Ivanov, O., Wagner, M. M., Chapman, W. W. & Olszewski, R. T.
(2002). Accuracy of three classifiers of acute gastrointestinal
syndrome for syndromic surveillance. Proceedings of the
AMIA Symposium 2002, 345-349.
McCaig, L. F. & Burt, C. S. (2001). National Ambulatory
Medical Care Survey: 1999 emergency department summary.
Advance data from vital and health statistics; no. 320.
Hyattsville, MD: National Center for Health Statistics.
McCaig, L. F. & Burt, C. S. (2003). National Ambulatory
Medical Care Survey: 2001 emergency department summary.
Advance data from vital and health statistics; no. 335.
Hyattsville, MD: National Center for Health Statistics.
McCaig, L. F. & Ly, N. (2002). National Ambulatory Medical
Care Survey: 2000 emergency department summary. Advance
data from vital and health statistics; no. 326. Hyattsville, MD:
National Center for Health Statistics.
National Center for Injury Prevention and Control (NCIPC).
(1997). Data elements for emergency department systems,
release 1.0. Atlanta, GA: Centers for Disease Control and
Prevention.
Pollock, D. A., Adams, D. L., Bernardo, L. M., Bradley, V.,
Brandt, M. D. & Davis, T. E. (1998). Data elements for
emergency department systems (DEEDS), release 1.0: a
summary report. Journal of Emergency Nursing, 24, 35-44.
Stetson, P. D., Johnson, S. B., Scotch, M. & Hripcsak, G. (2002).
The sublanguage of cross-coverage. Proceedings of the AMIA
Symposium 2002, 742-746.
Teich, J. M., Wagner, M. M., Mackenzie, C. F. & Schafer, K. O.
(2002). The informatics response in disaster, terrorism, and
war. Journal of the American Medical Informatics Association,
9, 97-104.
Travers, D. A. (2003). Identification of Concepts from Emergency
Department Text Using Natural Language Processing
Techniques and The Unified Medical Language System®.
Doctoral Dissertation, University of North Carolina at Chapel
Hill.
Travers, D. A. & Bodenreider, O. (2002). Identifying medical
concepts in free text chief complaint data. Academic
Emergency Medicine, 9, 511 (abstract).
Travers, D. A. & Haas, S. W. (2003). Using nurses’ natural
language entries to build a concept-oriented terminology for
patients: Chief complaints in the emergency department.
Journal of Biomedical Informatics, 36(4-5), 260-270.
Travers, D. A., Waller, A., Haas, S. W., Lober, W. B. & Beard, C.
(2003). Emergency department data for bioterrorism
surveillance: Electronic data availability, timeliness, sources
and standards. Proceedings of the AMIA Symposium 2003,
664-668.
Tsui, F. C., Wagner, M. M., Datao, V. & Chang, C. C. (2001).
Value of ICD-9-coded chief complaints for detection of
epidemics. Proceedings of the AMIA Symposium 2001, 711-715.