readme

advertisement
File: readme.doc
G. Moody
July 1989
Copyright (C) Massachusetts Institute of Technology 1989.
reserved.
All rights
Overview of this disk
=====================
This CD-ROM contains over two hundred hours' worth of two-lead ECG
recordings,
most of which are annotated beat-by-beat. Approximately one-fourth of
the
disk is occupied by the MIT-BIH Arrhythmia Database, and the rest
contains
supplementary ECG recordings comprising seven additional databases. Each
of
the databases occupies its own directory on this disk. You will also
find
three directories on this disk which do not contain ECG recordings;
these are
named `\bin', `\lib', and `\udb'. With the exception of `.doc' and `.h'
files,
the files on this disk are binary (i.e., they are not directly
printable).
There are several hundred ECG recordings on this CD-ROM, ranging in
length from
20 seconds to 24 hours. The program `view' (in the `\bin' directory)
allows
you to examine any of these recordings interactively. Run it without any
command-line arguments for instructions on its use.
The rest of this file briefly describes the contents of this CD-ROM.
More
information can be found in the manuals which accompany the disk. If you
need additional copies of these manuals, or if you have further
questions,
please write to:
Beth Israel Hospital
Biomedical Engineering Division, KB-26
330 Brookline Avenue
Boston, MA 02215 USA
or call (617)735-4553.
Programs
========
The `\bin' directory contains a few application programs which can be run
under
MS-DOS to examine the ECG recordings. You may copy these programs to a
directory on your magnetic disk which is in your PATH. If you wish to
run them
directly from the CD-ROM, simply add the `\bin' directory of your CD-ROM
drive
to your PATH. (See your DOS manual for information on how to do this.)
See
`\bin\bin.doc' for further information about the applications.
A larger package of applications is available on floppy disks for MS-DOS
or
Macintosh systems. These applications, along with the DB library
functions
(see below), are also available in C source form for use on UNIX systems
or for
customization. Write to the address above for details.
Subroutine Library
==================
If you wish to construct your own database applications, the `\lib'
directory
of this disk contains the DB library, a set of functions which provide
access
to database files. The DB library can be used with the Microsoft C
compiler
under MS-DOS. See `\lib\lib.doc' for details about files in `\lib' and
`\udb'.
ECG Recordings
==============
The ECG recordings on this disk are organized into eight databases, each
in its
own directory. Each database record contains a continuous ECG recording
from a
single subject. Records are identified by unique record names (for
example,
MIT-BIH Arrhythmia Database record names are 3-digit numbers). All files
belonging to a record have the same primary file name (which is the
record
name), and extensions which indicate the file type.
Header Files
-----------Each record includes a header (.hea) file, which contains general
information
about the record such as its length. Header files occupy one disk block
each.
If you have a utility for converting UNIX text files to MS-DOS format,
you can
use it to produce printable versions of these files if you are curious
about
their contents.
Signal Files
-----------With the exception of the Atrial Fibrillation Database (see below), each
record includes a signal (.dat) file, which contains digitized ECG
signals.
Most of the signal files contain two signals (with samples from each
signal
stored alternately), but several of those in the ST Change Database
contain
only one signal, and one record in the Long-Term Database contains three
signals. The sampling frequencies range from 128 to 360 samples per
second per
signal (see the descriptions of the individual databases below). Signal
files
occupy most of the disk, and range in size from 20 Kb to about 45 Mb.
Signal
files may be examined using `rdsamp' (in the `\bin' directory of this
disk).
Annotation Files
---------------Most records include reference annotation (.atr) files. The reference
annotation files for the MIT-BIH Arrhythmia Database are the most
complete and
accurate of these; they describe each beat by an annotation (i.e., a
label),
and include additional annotations which describe rhythm and signal
quality
changes. As noted below, reference annotation files for the other
databases
are less extensive (some omit rhythm or signal quality annotations, and
others
contain only rhythm annotations) and may contain small numbers of errors
since
they have not been subjected to the intense scrutiny directed at the MITBIH
Arrhythmia Database during its development and in the nine years between
its
initial release and the publication of this disk.
The Atrial Fibrillation Database contains a set of machine-generated
annotation
files with `.qrs' extensions (see below), in addition to its reference
annotation (.atr) files.
Annotation files vary in length, according to their contents and the
duration
of the record, from less than 1 Kb to several hundred Kb. These files
may be
examined using `rdann' (in the `\bin' directory of this disk).
ECG Databases
=============
This section briefly describes each of the eight databases on this CDROM,
lists the record names for each, and includes a few references for
further
reading.
MIT-BIH Arrhythmia Database (\mitdb)
-----------------------------------Record names:
100
105
111
116
122
101
106
112
117
123
102
107
113
118
124
103
108
114
119
104
109
115
121
\----- (the `100 series') ----/
200
201
202
203
205
\-----
207
213
220
230
208
214
221
231
209
215
222
232
210
217
223
233
212
219
228
234
(the `200 series') ----/
This database consists of 48 annotated records, obtained from 47 subjects
studied by the Arrhythmia Laboratory of Beth Israel Hospital in Boston
between
1975 and 1979. About 60% of the records were obtained from inpatients.
The
database contains 23 records (the `100 series') chosen at random from a
set of
over 4000 24-hour Holter tapes, and 25 records (the `200 series')
selected from
the same set to include a variety of rare but clinically important
phenomena
which would not be well-represented by a small random sample. Several
records
in the 200 series were chosen specifically because features of the
rhythm,
QRS morphology, or signal quality may be expected to present significant
difficulty to arrhythmia detectors.
Each record is slightly over 30 minutes in length. Each signal file
contains
two signals sampled at 360 Hz. The header files include information
about the
leads used, the patient's age, sex, and medications. (This information
is
reproduced in the MIT-BIH Arrhythmia Database Directory, which
accompanies this
disk.) The reference annotation files include beat, rhythm, and signal
quality
annotations. Each of the roughly 109,000 beats was manually annotated by
at
least two cardiologists working independently; their annotations were
compared, consensus on disagreements was obtained, and the reference
annotation
files were prepared.
Four records (102, 104, 107, and 217) include paced beats. The original
analog tapes do not represent the pacemaker artifacts with sufficient
fidelity
to permit them to be recognized by pulse amplitude (or slew rate) and
duration
alone, the method commonly used for real-time processing. The database
records
reproduce the analog recordings with sufficient fidelity to permit use of
pacemaker artifact detectors designed for tape analysis, however.
The MIT-BIH Arrhythmia Database has been used for evaluation of
arrhythmia
detectors at approximately 100 sites worldwide prior to the publication
of this
CD-ROM. Since its initial release in 1980, sixteen errors in beat
annotations
have been discovered and corrected. No such errors have been found since
1987.
Reference:
Mark, R.G., Schluter, P.S., Moody, G.B., Devlin, P.H., and Chernoff, D.
An annotated ECG database for evaluating arrhythmia detectors.
Frontiers of Engineering in Health Care: Proceedings of the 4th Annual
Conference of the IEEE Engineering in Medicine and Biology Society,
pp. 205-210. New York: IEEE Press (1982).
Noise Stress Test Database (\nstdb)
----------------------------------Record names:
bw
118_02
em
118_04
ma
118_06
118_08
118_10
118_12
119_02
119_04
119_06
119_08
119_10
119_12
This database consists of 15 thirty-minute records. Three of these
(records
`bw', `em', and `ma') contain noise of the types typically observed in
ECG
recordings. They were obtained using a Holter recorder on an active
subject,
with leads placed so that the subject's ECG is not visible. Two signals
were recorded simultaneously. Record `bw' contains primarily baseline
wander,
a low-frequency signal usually caused by motion of the subject or the
leads.
Record `em' contains electrode motion artifact (usually the result of
intermittent mechanical forces acting on the electrodes), with
significant
amounts of baseline wander and muscle noise as well. Record `ma'
contains
primarily muscle noise (EMG), with a spectrum which overlaps that of the
ECG,
but which extends to higher frequencies. Electrode motion artifact is
usually
the most troublesome type of noise for arrhythmia detectors since it can
closely mimic characteristics of the ECG.
The remaining records reproduce MIT-BIH Arrhythmia Database records 118
and 119
with `em' noise added at various levels. Since the correct beat labels
are
known for these records, they may be used to test the noise tolerance of
an
arrhythmia detector. (`\mitdb\118.atr' and `\mitdb\119.atr' are the
reference
annotation files for these records; there are no annotation files in the
`\nstdb' directory.) Records 118 and 119 were chosen for this purpose
because
they are not intrinsically noisy, and because most arrhythmia detectors
can
analyze them without significant errors. Thus any errors which occur in
the
analysis of the records to which noise has been added can be attributed
to the
noise, and not to any inherent difficulty in analyzing the ECG itself.
The
names of these records are of the form `rrr_nn', where `rrr' is the name
of the
original ECG record and `nn' indicates the noise-to-signal ratio during
the
noisy periods (02 = noise 20% as large as the QRS complexes, 12 = noise
120% as
large as the QRS complexes, etc.).
Reference:
Moody, G.B., Muldrow, W.K., and Mark, R.G.
A noise stress test for arrhythmia detectors.
Computers in Cardiology 11:381-384 (1984).
ST Change Database (\stdb)
-------------------------Record names:
300
305
301
306
302
307
303
308
304
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
This database consists of 28 unannotated records ranging in length from
13 to
67 minutes, obtained from 28 subjects. Records 300 to 323 were obtained
during exercise stress tests, using an FM instrumentation tape recorder;
these
records exhibit transient ST depression in response to exercise-induced
ischemia. The header files for these records include information about
the
gain of the signals which can be used to calibrate ST measurements in
terms
of body surface potentials.
Records 324 to 327 were obtained from Holter tapes, and show ST
elevation.
Records 313 to 317 and 319 to 323 contain only one signal; the rest
contain
two signals. All signals were sampled at 360 Hz.
Reference:
Albrecht, P.
S-T segment characterization for long-term automated ECG analysis.
MIT M.S. thesis (1983).
Malignant Ventricular Arrhythmia Database (\vfdb)
------------------------------------------------Record names:
418
423
419
424
420
425
421
426
422
427
428
429
430
602
605
607
609
610
611
612
614
615
This database consists of 22 thirty-five-minute records, obtained from
Holter
tapes of 16 subjects. It is annotated only with respect to rhythm
changes,
which include 89 episodes of ventricular tachycardia, 60 episodes of
ventricular flutter, and 42 episodes of ventricular fibrillation. The
signal
files contain two signals, each sampled at 250 Hz.
References:
Greenwald, S.D., Albrecht, P., Moody, G.B., and Mark, R.G.
Estimating confidence limits for arrhythmia detector performance.
Computers in Cardiology 12:383-386 (1985).
Greenwald, S.D.
Development and evaluation of a ventricular fibrillation detector.
MIT M.S. thesis (1986).
Atrial Fibrillation/Flutter Database (\afdb)
-------------------------------------------Record names:
00735
04126
03665
04746
04015
04908
04043
04936
04048
05091
05121
05261
06426
06453
06995
07162
07859
07879
07910
08215
08219
08378
08405
08434
08455
This database may be useful for development and evaluation of atrial
fibrillation/flutter detectors which rely on timing information only.
It
consists of 25 ten-hour records (obtained from Holter tapes of 25
subjects)
containing about 300 episodes of atrial fibrillation and 40 episodes of
atrial
flutter. Because of space limitations, it is not feasible to include all
250
hours of the ECG signals on this disk. The `\afdb' directory contains
two sets
of annotation files for all 25 records, and a signal file for record
04936.
The signal file contains two signals, sampled at 250 Hz. The reference
annotation (.atr) files contain only rhythm change annotations. The beat
annotation (.qrs) files were produced by an automated QRS detector, and
all
beats are labelled normal; the R-R interval sequences may be recovered
from
these files and used as input to the atrial fibrillation/flutter detector
to be
tested. Note that the .qrs files have not been audited, and contain a
small
number of errors.
Reference:
Moody, G.B., and Mark, R.G.
A new method for detecting atrial fibrillation using R-R intervals.
Computers in Cardiology 10:227-230 (1983).
ECG Compression Test Database (\cdb)
-----------------------------------Records:
08730
11442
11950
12247
12431
12490
12531
12621
12713
12921
(7)
(4)
(5)
(4)
(5)
(2)
(5)
(4)
(4)
(1)
12936
12940
12981
13005
13030
13045
13059
13130
13139
13227
(3)
(5)
(3)
(2)
(5)
(3)
(7)
(2)
(3)
(6)
13229
13274
13301
13317
13370
13380
13420
13425
13431
13508
(4)
(5)
(4)
(5)
(2)
(5)
(12)
(5)
(3)
(5)
13543
13556
13559
13580
13585
13590
13649
13687
(6)
(7)
(1)
(2)
(8)
(6)
(4)
(4)
This database consists of 168 unannotated records, each 20.48 seconds in
duration, obtained from Holter tapes from 38 subjects. Each subject is
identified by one of the five-digit numbers in the table above; the
number
in parentheses next to each of these subject numbers is the number of
records
which were obtained from that subject's Holter tape. The record names
are
of the form `sssss_nn', where `sssss' is the subject number and `nn' is a
segment number, beginning at 01; thus the records for subject 12490 are
named `12490_01' and `12490_02'.
The records exhibit a wide variety of arrhythmias, conduction
disturbances, and
noise. Many were selected because the characteristics of the signal or
noise
may be expected to pose a problem for an ECG compression method which is
not
exactly invertible. By comparing diagnoses made on the basis of
compressed
versions of these records with independent diagnoses made from the
uncompressed
versions, the ability of an ECG compression method to preserve clinically
important waveform details can be assessed. Each record contains two
signals,
sampled at 250 Hz.
Reference:
Moody, G.B., Mark, R.G., and Goldberger, A.L.
Evaluation of the "TRIM" ECG data compressor.
Computers in Cardiology 15 (1988).
Supraventricular Arrhythmia Database (\svdb)
-------------------------------------------Record names:
800
805
801
806
802
807
803
808
804
809
810
811
812
This database contains the first 13 thirty-minute records of a database
we are
building to supplement the examples of supraventricular arrhythmias in
the
MIT-BIH Arrhythmia Database. The records were obtained from Holter tapes
of
13 subjects. They have been annotated using a semi-automated method
which
gives highly accurate results, but the annotations have not been audited
to
the extent of those in the MIT-BIH Arrhythmia Database, and a small
number of
errors may be present. The reference annotation files include beat and
signal
quality annotations, but no rhythm annotations. Each record contains two
signals, sampled at 128 Hz.
Long-Term Database (\ltdb)
-------------------------Record names:
14046
14157
14134
14172
14149
14184
15814
This database contains seven annotated long-term records ranging in
length
from 14 to 24 hours. These records are complete Holter tapes from seven
subjects. As for the Supraventricular Arrhythmia Database, the records
have been annotated using a semi-automated method, and a small number of
errors may be present. The reference annotation files include beat and
signal
quality annotations, but no rhythm annotations. Six of the records
contain two
signals; record 15814 contains three. All signals are sampled at 128 Hz.
-----------------------------------------------------------------------------Macintosh is a registered trademark of Apple Computer, Inc.
Microsoft and MS-DOS are registered trademarks of Microsoft Corporation.
UNIX is a registered trademark of AT&T.
Download