automatic labeling and classification of brain

Tianxia Gong, Shimiao Li, Jie Wang, Chew Lim Tan (1)∗
Boon Chuan Pang, C. C. Tchoyoson Lim, Cheng Kiang Lee (2)
Qi Tian, Zhuo Zhang (3)
(1) School of Computing, National University of Singapore
(2) National Neuroscience Institute, Tan Tock Seng Hospital
(3) Institute for Infocomm Research
Automatic medical image classification is difficult because
of the lack of training data. As manual labeling is too
costly, we provide an automatic labeling solution to this problem by making use of the radiology report associated with
the medical images. We first segment and reconstruct the 3D
regions of interest (ROIs) from the medical images, and extract pathology and anatomy information from the associated
report. We use an anatomical atlas to map the ROIs to the
anatomy part(s) and match the pathology information of the
same anatomy part(s) from the text. In this way, the ROIs
are automatically labeled with pathology types that serve as
class labels, and a large training data set is generated automatically. We extract the volume, color, location, and shape features of the
ROIs, and classify the types of ROIs using these features. The
overall evaluation results are promising for doctors and medical
professionals. Our experiment is conducted using traumatic
brain injury CT images; however, our framework of automatically labeling and classifying medical cases can be extended
to medical images of other modalities or of other anatomical
parts.
Index Terms— Biomedical image processing, object
recognition, biomedical informatics
In recent years, a large number of medical images have been produced daily in hospitals and medical institutions. As manually processing medical images requires expertise and is thus
expensive and slow, computerized image processing is greatly
needed in this domain. One important task is to identify
the regions of interest (ROIs) in these images and classify the
images according to these ROIs. Many works have used different techniques to achieve this goal; however, the lack of
labeled data makes the training of machine learning based
classifiers difficult. Moreover, most works focused on image processing only and neglected the valuable information
contained in the associated text reports. We propose a novel
framework that solves the above problems by making use of
both medical images and their associated radiology reports to
generate a training corpus for region based medical image
classification. A CT or an MRI examination usually consists
of a radiology report and a series of images. An example
brain CT radiology report and the key frames of the images
are shown in Figure 1.
There is a intracerebral hematoma present in the
left frontal lobe with associated acute subdural
hematoma in left frontal and parietal lobe. The
hematoma over the right parietal lobe is lenticular in shape and represents acute extradural
hematoma. No skull vault fracture is seen.
Fig. 1. The key frames of CT image series
∗ This research is supported in part by MOE grant R252-000-349-112.
Medical image classification and automatic annotation have become popular research topics in recent years. Numerous
tasks have been proposed in the tracks in [1]. The tasks
focus on classifying medical images into different categories
of acquisition modality (CT, X-ray, MR, etc.), body orientation, body region, and biological system. A few research
works focused on classifying or automatically annotating medical images by pathology. [2] manually
labeled 35 brain CT image series and classified the 2D images according to the hematoma types found in the images.
The main constraint is that each ROI must be explicitly labeled. Such region based manual labeling is expensive, as it
requires a large amount of effort from medical professionals.
To overcome this difficulty, [3] used an unsupervised approach
to label the images with keywords extracted from the associated
reports. [3] used a machine translation approach to learn the
alignments between pathology labels and the regions of interest iteratively, but the lack of region-labeled training data
affects the image classification accuracy to some extent.
We propose a framework to automatically map the information extracted from both text and images, generate a region
based labeled image corpus for training, and classify new
image series using the trained model. Figure 2 shows the architecture of our system. We take a series of images as input
and reconstruct it in 3D. For image series with a large distance
between scans, we interpolate the intermediate images before
3D reconstruction. Then we set a threshold automatically and
segment the 3D region(s) of interest. We register the 3D image to the reference 3D brain CT model we build and label
the ROI(s) with the anatomy region(s) they overlap with in the
3D brain atlas. For the free-text radiology report, we first extract the pathology and anatomy expressions and map them to
standard terms; then we extract the logical relation between
the anatomy terms and the pathology terms from their syntactic relation in the sentence. Finally, we compare the anatomy
labels of the ROI from the image processing result with the anatomy
terms modifying the pathology term in the text processing result.
If they match, we set the class of the ROI to the pathology
term that the anatomy terms modify in the report.
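The final matching step described above can be illustrated with a minimal sketch. The `Finding` record, the `label_roi` function, and the rule that an ROI is labeled only when exactly one finding has the same set of anatomy terms are assumptions made for this illustration; the framework itself only states that the two sets of anatomy labels are compared. The sample findings are those of the example report.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Finding:
    """One structured finding from a report (hypothetical layout)."""
    pathology: str
    locations: FrozenSet[str]


def label_roi(roi_locations: Iterable[str],
              findings: List[Finding]) -> Optional[str]:
    """Return the pathology whose anatomy terms equal the ROI's atlas labels."""
    target = frozenset(roi_locations)
    matches = [f.pathology for f in findings if f.locations == target]
    # Assumption: leave the ROI unlabeled unless exactly one finding matches.
    return matches[0] if len(matches) == 1 else None


findings = [
    Finding("intracerebral hematoma", frozenset({"left frontal lobe"})),
    Finding("subdural hematoma",
            frozenset({"left frontal lobe", "left parietal lobe"})),
    Finding("extradural hematoma", frozenset({"right parietal lobe"})),
]

print(label_roi(["left frontal lobe", "left parietal lobe"], findings))
print(label_roi(["left occipital lobe"], findings))
```

Requiring an exact set match keeps ambiguous ROIs out of the training corpus at the cost of discarding some cases; a looser overlap criterion would be an equally plausible choice.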
3.1. ROI 3D reconstruction
To label the regions of interest of each image series, we first
need to segment the ROIs and reconstruct them in 3D. For
brain CT images, we first use the procedures in [4] to preprocess the image series. We remove the skull area from the image to
keep only the brain content inside the skull. We also remove
the cupping artifacts, i.e. the “white ring” of brain area adjacent
to the skull that results from the imaging process. For the convenience of the
registration process at a later stage, we rotate the images so that
the ideal midline is vertical. Then we reduce the noise,
normalize the intensity of the whole image series, and use the
method in [4] to segment the ROIs. In some cases, the
slices are too sparse for the ROIs to be reconstructed directly
in 3D, so we interpolate the intermediate slices prior to
3D reconstruction. We use the shape based method described
in [5] to do the interpolation. Figure 3 demonstrates two segmentation and 3D reconstruction examples of traumatic brain
injury (subdural hematoma and extradural hematoma).
Fig. 3. Segmentation and 3D reconstruction examples
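To make the idea of interpolating intermediate slices concrete, the sketch below implements the simpler binary form of shape based interpolation: each segmented slice is converted to a signed distance map, the two maps are blended linearly, and the blend is thresholded at zero. This is not the grey-level method of [5]; the brute-force distance computation, the function names, and the toy 5x5 masks are assumptions chosen to keep the example self-contained.

```python
import math
from typing import List

Mask = List[List[bool]]


def signed_distance(mask: Mask) -> List[List[float]]:
    """Brute-force signed distance: positive inside the ROI, negative outside."""
    cells = [(r, c) for r in range(len(mask)) for c in range(len(mask[0]))]
    inside = [p for p in cells if mask[p[0]][p[1]]]
    outside = [p for p in cells if not mask[p[0]][p[1]]]
    far = float(len(mask) + len(mask[0]))  # fallback when one class is empty
    dist = [[0.0] * len(mask[0]) for _ in mask]
    for r, c in cells:
        # Distance to the nearest pixel of the opposite class.
        others = outside if mask[r][c] else inside
        d = min((math.hypot(r - r2, c - c2) for r2, c2 in others), default=far)
        dist[r][c] = d if mask[r][c] else -d
    return dist


def interpolate_slice(lower: Mask, upper: Mask, t: float) -> Mask:
    """Blend two signed distance maps at fraction t in [0, 1] and threshold."""
    d0, d1 = signed_distance(lower), signed_distance(upper)
    return [[(1.0 - t) * a + t * b > 0.0 for a, b in zip(row0, row1)]
            for row0, row1 in zip(d0, d1)]


# Toy slices: a 3x3 square shrinking to a single pixel.
square = [[1 <= r <= 3 and 1 <= c <= 3 for c in range(5)] for r in range(5)]
dot = [[r == 2 and c == 2 for c in range(5)] for r in range(5)]
middle = interpolate_slice(square, dot, 0.5)
for row in middle:
    print("".join("#" if v else "." for v in row))
```

At t = 0 and t = 1 the function returns the two input slices unchanged, so the interpolated slices form a smooth transition between the segmented shapes.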
3.2. Mapping 3D ROI to anatomical locations
After we reconstruct the whole image series in 3D and segment the ROI(s), we need to map them to their respective anatomical locations in the brain, so that when comparing with the
associated radiology report, we know which ROI(s) the report describes. We use the data provided in [6] to construct
a 3D brain model for an average brain and mark the different
anatomical parts [7]. Figure 4 shows some slices of the 3D
brain anatomical atlas, with different colors denoting different anatomical parts. We use the method in [8] to perform 3D
image registration and label the ROI in each case with the
anatomy region(s) it overlaps with in the reference 3D brain
CT model. For example, after image registration, the 3D extradural hematoma region shown in Figure 3 (right) overlaps
with the frontal and parietal lobes of the left side of the brain,
so its anatomical location is labeled as “left frontal lobe”
and “left parietal lobe”.
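Once the ROI and the atlas share a coordinate frame, the overlap labeling reduces to a lookup, as the following minimal sketch suggests. The dictionary representation of the atlas, the `anatomy_labels` function, and the optional `min_fraction` threshold are assumptions made for illustration; registration itself (the method in [8]) is taken as already done.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Voxel = Tuple[int, int, int]


def anatomy_labels(roi_voxels: Iterable[Voxel],
                   atlas: Dict[Voxel, str],
                   min_fraction: float = 0.0) -> List[str]:
    """List the atlas regions that a registered ROI overlaps.

    `atlas` maps voxel coordinates in the reference model to region names.
    `min_fraction` is a hypothetical knob: a region is reported only if
    more than this fraction of the ROI lies inside it (0.0 = any overlap).
    """
    voxels = list(roi_voxels)
    if not voxels:
        return []
    counts = Counter(atlas[v] for v in voxels if v in atlas)
    return sorted(name for name, n in counts.items()
                  if n / len(voxels) > min_fraction)


# Toy atlas with four labeled voxels.
atlas = {
    (0, 0, 0): "left frontal lobe",
    (1, 0, 0): "left frontal lobe",
    (2, 0, 0): "left parietal lobe",
    (3, 0, 0): "right parietal lobe",
}
roi = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
print(anatomy_labels(roi, atlas))
print(anatomy_labels(roi, atlas, min_fraction=0.5))
```

With the default threshold the toy ROI is labeled with both the left frontal and left parietal lobes; raising the threshold would suppress regions that the ROI only touches marginally.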
Fig. 2. The general framework
Fig. 4. Some sample slices from the 3D brain anatomy map
3.3. Pathology and anatomy term extraction from radiology report
In the text processing part, we take the radiology reports as input, extract pathology terms and the anatomy terms that modify
them, and output them in a structured form.
First, we reduce misspelt words and variants to standard forms, such as “hemorrhage” for “hemorhage”. We add
a negation filter to detect negative expressions, so that terms
referring to pathologies not found in the images are not
extracted. We use MetaMap [9] for term mapping. MetaMap
identifies phrases containing medical terms and maps the terms to the
Unified Medical Language System (UMLS) Metathesaurus,
which contains terms from various controlled vocabularies. We use the method described in [10] to find the relations
between the extracted pathology and anatomy terms so that
we know where each pathology occurs. For example, after term
mapping and parsing, the syntactic relations of the words and
terms in the example report from Section 1 are shown in
Figure 5. We convert the syntactic relations to logical relations
between the pathology terms and anatomy terms:
pathology concept: intracerebral hematoma
anatomical location: left frontal lobe
pathology concept: subdural hematoma
anatomical location: left frontal lobe
anatomical location: left parietal lobe
pathology concept: extradural hematoma
anatomical location: right parietal lobe
Fig. 5. The typed dependency tree of example sentence.
3.4. Matching ROI from descriptions in image and text
For each case, after the ROI in the images is mapped to
its anatomical location(s), we search for the medical finding
extracted from its report that has the same anatomical description. Then we label the ROI with the pathology
concept in that finding. For example, the hematoma region
in Figure 3 (right) is mapped to anatomical locations “left
frontal lobe” and “left parietal lobe”. We search for a medical
finding with these two anatomy concepts, and if one is found in
the structured term extraction result, we take its pathology
concept and label the hematoma with it.
3.5. Automatic ROI classification
We extract the color, size, location, and shape features for
hematoma classification. For the reconstructed 3D ROIs, we
fit an ellipsoid to the region and use the lengths of the three axes
of the ellipsoid as shape descriptors. Extent and solidity are
also important shape features for distinguishing different types of
hematoma. Some image series are too sparse to yield a good 3D
reconstruction, and the 3D features of ROIs in these cases may
be unreliable, so we also use the slice with the largest ROI
area to extract 2D shape features. Table 1 shows the details of
each feature.
Table 1. Features for hematoma regions
Feature        Description
Color          The average intensity of the hematoma
Size           The volume of the hematoma region
Location       The anatomy locations of the hematoma
3D Axes        Lengths of the three axes of the fitting ellipsoid
3D Extent      hematoma volume/bounding box volume
3D Solidity    hematoma volume/convex hull volume
2D Axes        Lengths of the two axes of the fitting ellipse
2D Extent      hematoma area/bounding box area
2D Solidity    hematoma area/convex hull area
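The two ratio features in Table 1 can be sketched for the 2D case as follows. The pixel-set representation, the function names, and the choice to take the convex hull over pixel corners (so that the hull area is never smaller than the pixel area) are assumptions made for this illustration; the 3D versions follow the same ratios with voxels, a bounding box volume, and a 3D hull.

```python
from typing import Iterable, List, Set, Tuple

Pixel = Tuple[int, int]


def extent(pixels: Set[Pixel]) -> float:
    """ROI area divided by the area of its axis-aligned bounding box."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    box_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return len(pixels) / box_area


def convex_hull(points: Iterable[Pixel]) -> List[Pixel]:
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))

    def cross(o: Pixel, a: Pixel, b: Pixel) -> int:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq: Iterable[Pixel]) -> List[Pixel]:
        chain: List[Pixel] = []
        for p in seq:
            # Drop points that would make a clockwise or collinear turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))


def polygon_area(vertices: List[Pixel]) -> float:
    """Shoelace formula for a simple polygon."""
    total = 0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def solidity(pixels: Set[Pixel]) -> float:
    """ROI area divided by the area of the convex hull of its pixel corners."""
    corners = [(r + dr, c + dc)
               for r, c in pixels for dr in (0, 1) for dc in (0, 1)]
    return len(pixels) / polygon_area(convex_hull(corners))


l_shape = {(0, 0), (1, 0), (1, 1)}
print(extent(l_shape), round(solidity(l_shape), 3))
```

For the L-shaped toy region the extent is 0.75 and the solidity is about 0.857, whereas a filled square gives 1.0 for both; concave or thin regions score lower on both features.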
We obtained 429 CT image series of severe traumatic brain
injury with associated radiology reports from the National Neuroscience Institute, Tan Tock Seng Hospital, Singapore. The
pathology terms of hematoma types extracted from the radiology reports are: subdural hematoma (SDH), extradural
hematoma (EDH), intracerebral hematoma (ICH), intraventricular hematoma (IVH), and subarachnoid hematoma (SAH).
These extracted hematoma types serve as class labels. The
anatomy terms extracted from the text correspond to the
anatomy parts in the brain atlas. They include the ventricle and
the left/right side of the four lobes: frontal, parietal, temporal,
and occipital. Thirty-eight ROIs in 29 image series were manually labeled with pathology classes. Of these 38 ROIs, 34 were
assigned the correct class label by the automatic process
described in Section 3.4, i.e. the overall precision of
automatic class label assignment when creating
training instances is 89.5%. We assigned pathology
class labels to the remaining 400 cases and generated a training corpus. We used a support vector machine (SVM) for classification. The ten-fold cross
validation results shown in Table 2 are used to evaluate the
system performance.
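The labeling precision quoted above is simply 34/38, and per-class precision and recall follow from the usual definitions. The sketch below checks the arithmetic and shows one way to compute the per-class scores; the `precision_recall` helper and the toy label lists are hypothetical and are not data from the experiment.

```python
from typing import Dict, List, Tuple


def precision_recall(true_labels: List[str],
                     predicted: List[str]) -> Dict[str, Tuple[float, float]]:
    """Per-class (precision, recall); 0.0 when a denominator is empty."""
    scores = {}
    for cls in sorted(set(true_labels) | set(predicted)):
        tp = sum(t == cls and p == cls for t, p in zip(true_labels, predicted))
        n_pred = sum(p == cls for p in predicted)
        n_true = sum(t == cls for t in true_labels)
        scores[cls] = (tp / n_pred if n_pred else 0.0,
                       tp / n_true if n_true else 0.0)
    return scores


# Precision of automatic label assignment on the manually labeled ROIs.
labeling_precision = 34 / 38
print(f"{labeling_precision:.1%}")  # 89.5%

# Toy labels only, to show the per-class computation.
truth = ["SDH", "SDH", "SDH", "EDH", "IVH"]
guess = ["SDH", "SDH", "EDH", "EDH", "SDH"]
for cls, (p, r) in precision_recall(truth, guess).items():
    print(cls, round(p, 3), round(r, 3))
```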
Apart from the class labeling errors produced in the automatic
training corpus creation process, other factors that affect classification accuracy include the errors produced at different
phases: interpolation error, segmentation error, image registration/anatomy labeling error, and term extraction error.
Relatively small ROIs appear in very few scans,
or even just one scan, in the image series; thus the interpolation, 3D reconstruction, and feature extraction results for such
ROIs are poorer. Segmentation error affects the SAH cases
more than the others, as SAH is less distinct in
intensity than the other types. Image registration error, ROI anatomy labeling error, and term extraction error are minimal and have the least
effect on the system performance. The classification result is
also affected by the unbalanced data set. SDH is the most frequent class label, but only a few cases are labeled with IVH.
Therefore, the classification is biased towards SDH, which results in higher recall for that class, whereas the classification result for IVH
is poorer than for most of the other classes.
Table 2. Classification results
precision 80.6 82.9 83.3 75.0
87.9 79.1 78.9 66.7
As the data sets used in our work and in related works [2]
and [3] are of the same type and from the same source,
we consider the experiment results comparable, although the specific image series used for training and testing
are different. Though the ROIs are manually labeled in [2], our ROI
classification result is better than the results from [2] because
the automatic region based class labeling process in our work
provides a much larger data set for training, whereas the
data set in [2] is very small due to the high cost of manual
medical image labeling. Our result is also slightly better than
the result from [3], because the region specific labels along
with the ROI features provide more insight into the ROIs in
our work, and the ROI-annotation alignment error in [3] is
greater than the automatic ROI labeling error in our work.
We propose a framework that uses both medical images and text
to automatically generate a training corpus for region based
medical image classification. The accuracy of assigning correct class labels to the respective regions of interest in the process of training data set generation is promising for medical
professionals. The ROI classification result is better than the
results from related works due to the novel automatic region
based class labeling process, as it provides more insight into
the ROIs and generates a larger data set for training and testing. Though our experiment was conducted using traumatic
brain injury data, the framework we described in the paper
can be extended to medical images of other modalities or of
other anatomical parts.
[1] Tatiana Tommasi, Barbara Caputo, Petra Welter,
Mark Oliver Guld, and Thomas M. Deserno, “Overview
of the CLEF 2009 medical image annotation track,” CLEF
working notes 2009, 2009.
[2] Tianxia Gong, Ruizhe Liu, Chew Lim Tan, Neda Farzad,
Cheng Kiang Lee, Boon Chuan Pang, Qi Tian, Suisheng
Tang, and Zhuo Zhang, “Classification of CT brain images of head trauma,” in Proceedings of the Second
IAPR International Workshop on Pattern Recognition in
Bioinformatics, 2007, pp. 401–408.
[3] Tianxia Gong, Shimiao Li, Chew Lim Tan, Boon Chuan
Pang, C. C. Tchoyoson Lim, Cheng Kiang Lee, Qi Tian,
and Zhuo Zhang, “Automatic pathology annotation on
medical images: A statistical machine translation framework,” in Proceedings of the 20th International Conference on Pattern Recognition, 2010, pp. 2504–2507.
[4] Shimiao Li, Tianxia Gong, Jie Wang, Ruizhe Liu,
Chew Lim Tan, Tze Yun Leong, Boon Chuan Pang,
C. C. Tchoyoson Lim, Cheng Kiang Lee, Qi Tian, and
Zhuo Zhang, “TBIdoc: 3D content-based CT image retrieval system for traumatic brain injury,” in Proceedings of SPIE Medical Imaging 2010, 2010.
[5] George J. Grevera and Jayaram K. Udupa, “Shape-based interpolation of multidimensional grey-level images,” IEEE Transactions on Medical Imaging, vol. 15,
no. 6, pp. 881–892, 1996.
[6] Jie Wang, Shimiao Li, Chew Lim Tan, Boon Chuan
Pang, C. C. Tchoyoson Lim, Cheng Kiang Lee, Qi Tian,
and Zhuo Zhang, “Registration based CT slice indexing
for severe traumatic brain injury,” 2011, submitted.
[7] Hanna Damasio, Human Brain Anatomy in Computerized Images, Oxford University Press, 2005.
[8] Andriy Myronenko and Xubo Song, “Intensity-based
image registration by minimizing residual complexity,”
IEEE Transactions on Medical Imaging, vol. 29, no. 11,
pp. 1882–1891, 2010.
[9] Alan R. Aronson, “Effective mapping of biomedical text
to the UMLS Metathesaurus: The MetaMap program,” in
Proceedings of American Medical Informatics Association Annual Symposium (AMIA’01), 2001, pp. 17–21.
[10] Tianxia Gong, Chew Lim Tan, Tze Yun Leong,
Cheng Kiang Lee, Boon Chuan Pang, Tchoyoson Lim,
Qi Tian, Suisheng Tang, and Zhuo Zhang, “Text mining in radiology reports,” in Proceedings of the Eighth
IEEE International Conference on Data Mining, 2008,
pp. 815–820.