Detection and Classification of Breast Cancer
Nandi Nwe Win, Nang Aye Aye Htwe

Abstract— Breast cancer is the second most lethal cancer for
women in the world today. X-ray mammography is the most
widely used method for early detection of breast cancer. To
detect the breast cancer region, Canny edge detection is used.
To separate this region from the background, a thresholding
method is used. This paper presents an implementation of a
detection and classification system for cancerous tissues.
Malignant and benign abnormalities are selected from the
segmented images, and texture-based features are then
extracted using the Gray Level Difference Method (GLDM). For
pattern classification between malignant and benign samples,
the optimum subset of texture features is modeled using an
Artificial Neural Network (ANN). Detection and classification
of cancerous tissues are implemented in the MATLAB
programming language.
Index Terms—Artificial Neural Network, Canny Operation,
Digital Mammograms, Feature Extraction, Gray Level
Difference Method (GLDM), Thresholding
I. INTRODUCTION
Cancer is the uncontrolled growth of cells. Breast cancer is
the uncontrolled growth of cells in the breast region. Breast
cancer is the second leading cause of cancer deaths in women
today. Early detection of the cancer can reduce the mortality
rate. Mammography has a reported cancer detection rate of
70-90%, which means that 10-30% of breast cancers are missed
with mammography [1]. Early detection of breast cancer can be
achieved using digital mammography, typically through
detection of characteristic masses and/or microcalcifications.
A mammogram is an X-ray of the breast tissue which is designed
to identify abnormalities. The presence of clustered
microcalcifications in X-ray mammograms is considered an
important indicator for the detection of breast cancer,
especially for individual microcalcifications with diameters
up to about 0.7 mm and with an average diameter of 0.3 mm [2].
Studies have shown that radiologists can miss a significant
proportion of abnormalities in addition to having high rates
of false positives. Therefore, it would be valuable to develop
a computer-aided method for mass/tumour classification based
on features extracted from the Region of Interest (ROI) in
mammograms [3]. Pattern recognition in image processing
requires the extraction of features from regions of the image,
and the processing of these features with a pattern
recognition algorithm. Features are observable patterns in the
image which give some information about the image. For every
pattern classification problem, the most important stage is
feature extraction, and the accuracy of the classification
depends on it. Much research has been done in mammography
towards detecting one or more abnormal structures:
circumscribed masses [5], spiculated lesions [6] and
microcalcifications [4]. Other researchers have focused on
classifying breast lesions as benign or malignant. There are
different feature descriptors, such as GLDM (Gray Level
Difference Method), LBP (Local Binary Patterns), GLRLM (Grey
Level Run Length Method), Haralick and Gabor texture features,
and there are classification methods such as SVM, C4.5, the
K-NN classifier and ANN. In this paper we apply the GLDM
feature extraction method to a set of mammography images and
then test its performance with ANN classification.
Manuscript received Oct 15, 2011.
Nandi Nwe Win, Department of Information Technology, Mandalay
Technological University, Mandalay, Myanmar, 09-256269894 (e-mail:
anonymous.mdy.85@gamil.com).
Nang Aye Aye Htwe, Department of Information Technology, Mandalay
Technological University, Mandalay, Myanmar, 095661208 (e-mail:
htwe.aye@gmail.com).
II. RELATED WORKS
In the literature, a number of techniques are described to
detect and classify the presence of breast cancer in digital
mammograms. A lot of research has been done on the textural
analysis of mammographic images.
Papadopoulos et al. [7] presented a hybrid intelligent
system for the identification of microcalcification clusters in
digital mammograms, which can be summarised in three steps:
edge detection, segmentation, and feature extraction and
classification.
The work in [8] investigates the accuracy of a detection
methodology that uses Haralick texture features as an input
to an ANN (Artificial Neural Network) to classify the images
as benign or malignant. Weidong Xu et al. proposed a new
algorithm based on ANN for detecting masses automatically [9].
III. BACKGROUND THEORY
In this paper, there are five main parts: image acquisition,
edge detection, image segmentation, feature extraction and
classification.
A. Image Acquisition
Digital mammograms are used as the standard inputs to the
proposed framework. The mammography dataset was obtained
from the Mammographic Image Analysis Society (MIAS)
database. MIAS mammography images are digitized at a 200
micron pixel edge, with a size of 1024 × 1024 pixels. Each
pixel in the grayscale mammogram image has an intensity in
the range [0, 255] (8-bit). Breast images from the MIAS
database are shown in Figure 1.
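As an illustration of the acquisition step, the following minimal sketch loads an 8-bit grayscale image into a list of pixel rows. It assumes the images are stored as binary PGM (P5) files, a format in which MIAS images are commonly distributed; the paper itself does not state the file format, and the function name `parse_pgm` is hypothetical.

```python
def parse_pgm(data: bytes):
    """Parse an 8-bit binary PGM (P5) image into (width, height, rows).

    Assumption: the file format is P5 with maxval <= 255; the paper does
    not say how its mammograms were stored.
    """
    tokens, pos = [], 0
    while len(tokens) < 4:
        # Skip whitespace between header tokens.
        while data[pos:pos + 1].isspace():
            pos += 1
        # Skip comment lines that start with '#'.
        if data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    magic = tokens[0]
    width, height, maxval = (int(t) for t in tokens[1:])
    if magic != b"P5" or maxval > 255:
        raise ValueError("only 8-bit binary PGM (P5) is handled here")
    pos += 1  # exactly one whitespace byte separates the header from pixels
    pixels = data[pos:pos + width * height]
    if len(pixels) != width * height:
        raise ValueError("truncated pixel data")
    rows = [list(pixels[r * width:(r + 1) * width]) for r in range(height)]
    return width, height, rows


# Tiny in-memory example: a 3 x 2 image with a comment in the header.
demo = b"P5\n# demo image\n3 2\n255\n" + bytes([0, 128, 255, 10, 20, 30])
width, height, rows = parse_pgm(demo)
print(width, height, rows)
```

For a real mammogram the bytes would come from `open(path, "rb").read()`, and each entry of `rows` would hold 1024 intensities in the range [0, 255].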
All Rights Reserved © 2012 IJSETR
International Journal of Science, Engineering and Technology Research (IJSETR)
Volume 1, Issue 1, July 2012
Figure 1. Breast images in MIAS database
B. Canny Edge Detection
The Canny edge detector is widely regarded as an optimal edge
detector. It was designed to improve on the many edge
detectors already published at that time. It is important
that edges occurring in images are not missed and that there
are no responses to non-edges. The Canny method finds edges
by isolating noise from the image without disturbing the
features of the edges in the image. The experimental result
for a tested breast image using the Canny method is shown in
Figure 2.
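To make the idea concrete, the sketch below shows two of the stages that the Canny method builds on: a Sobel gradient magnitude and double-threshold hysteresis. It is a simplified illustration, not the full Canny detector, because Gaussian smoothing and non-maximum suppression are omitted. The function names and the threshold values are hypothetical, and the paper's own MATLAB settings are not known.

```python
from collections import deque


def sobel_magnitude(img):
    """Sobel gradient magnitude of a 2-D list of intensities.

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def hysteresis(mag, low, high):
    """Keep strong pixels (>= high) plus weak pixels (>= low) that are
    8-connected, directly or through other weak pixels, to a strong one."""
    h, w = len(mag), len(mag[0])
    edges = [[0] * w for _ in range(h)]
    queue = deque()
    for y in range(h):
        for x in range(w):
            if mag[y][x] >= high:
                edges[y][x] = 1
                queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx] and mag[ny][nx] >= low):
                    edges[ny][nx] = 1
                    queue.append((ny, nx))
    return edges


# Synthetic image with one vertical step edge (dark left, bright right).
image = [[0] * 4 + [200] * 4 for _ in range(6)]
edges = hysteresis(sobel_magnitude(image), low=100, high=400)
for row in edges:
    print(row)
```

The two thresholds express the requirement stated above: the high threshold suppresses responses to non-edges, while the low threshold keeps weaker parts of a true edge from being missed.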
Figure 2. Edge Detection Using Canny Method
C. Image Segmentation
The goal of image segmentation is to find regions that
represent objects or meaningful parts of objects.
Segmentation divides an image into its constituent regions or
objects. Thresholding is used for segmentation because it is
the most suitable method for the present application: it
yields an image in which ‘1’ represents the breast tumor and
‘0’ represents the background. The segmented breast is shown
in Figure 3.
Figure 3. Image Segmentation Using Thresholding
D. Texture Features Extraction Using Gray Level Difference
Method (GLDM)
Texture feature extraction is a very important process in
classification. Texture features have been widely used in
mammogram classification because they are able to distinguish
between abnormal and normal cases. The Gray Level Difference
Method (GLDM) is a good feature extraction method for our
implementation. An example of the gray level difference
method is shown in Figure 4(a) and (b).
Figure 4. (a) Original image (b) GLDM for Original image
(distance=1, direction=0).
The Grey-Level Difference Method is based on the statistics
of the second-order joint conditional probability density
function p(i | d), where i is the grey-level (i.e. intensity)
difference between two pixels and d is the displacement
between them. The feature vectors are then derived from this
function using the features listed in Table I.
TABLE I
DESCRIPTION OF TEXTURE FEATURES
No.   Feature
1     Contrast
2     Mean
3     Entropy
4     Inverse Difference Moment
5     Angular Second Moment
6     Area
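The following sketch shows one way to estimate p(i | d) and to compute the five GLDM statistics named in Table I. The formulas are the commonly used textbook definitions and are an assumption here, since the formula column of Table I is not available in this text; the authors' MATLAB implementation may scale or define them differently. The default displacement (dx=1, dy=0) corresponds to distance 1 and direction 0, as in Figure 4. All function names are hypothetical.

```python
import math


def gldm_pdf(img, dx=1, dy=0, levels=256):
    """Estimate p(i | d): the probability that two pixels separated by the
    displacement d = (dx, dy) differ in grey level by i."""
    h, w = len(img), len(img[0])
    counts = [0] * levels
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[abs(img[y][x] - img[y2][x2])] += 1
                total += 1
    if total == 0:
        raise ValueError("displacement is larger than the image")
    return [c / total for c in counts]


def gldm_features(p):
    """Textbook GLDM statistics (assumed definitions, entropy in bits)."""
    return {
        "contrast": sum(i * i * pi for i, pi in enumerate(p)),
        "mean": sum(i * pi for i, pi in enumerate(p)),
        "entropy": -sum(pi * math.log2(pi) for pi in p if pi > 0),
        "inverse_difference_moment": sum(pi / (i * i + 1)
                                         for i, pi in enumerate(p)),
        "angular_second_moment": sum(pi * pi for pi in p),
    }


# Small example with 8 grey levels and a horizontal displacement of 1.
sample = [[0, 1, 1],
          [2, 2, 4]]
p = gldm_pdf(sample, dx=1, dy=0, levels=8)
features = gldm_features(p)
for name, value in features.items():
    print(name, round(value, 4))
```

Repeating the computation for several displacements and directions gives one feature vector per setting, which is how texture descriptors of this kind are usually extended.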
A complete set of 360 features is used for the
classification of breast images. The resulting feature
vectors are shown in Figure 5. Finally, these sets of
features are used to classify the breast images.
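Of the six features in Table I, Area is the only one that is not a GLDM statistic, and the paper does not say how it is computed. One plausible reading, stated here purely as an assumption, is that it measures the size of the region labelled ‘1’ by the thresholding step of Section III-C. The sketch below shows that reading with a simple pixel count; the threshold value and function names are hypothetical.

```python
def threshold_segment(image, threshold):
    """Binary mask: 1 where the intensity exceeds the threshold, else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def region_area(mask):
    """Area taken as the number of foreground pixels (an assumed definition)."""
    return sum(sum(row) for row in mask)


# Synthetic 4 x 4 patch with a bright 2 x 2 region in the centre.
patch = [
    [10, 12, 11, 9],
    [13, 200, 210, 10],
    [11, 205, 198, 12],
    [9, 10, 12, 11],
]
mask = threshold_segment(patch, threshold=128)
for row in mask:
    print(row)
print("area:", region_area(mask))
```

A fixed global threshold is the simplest choice; in practice the value would have to be tuned to the intensity range of the mammograms being segmented.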
Figure 5. Extracted Features
E. Classification
A neural network is a powerful tool for pattern
classification applications and is composed of three layers,
as shown in Figure 6.
[Figure 6 labels: Input Layer (Contrast, Mean, Entropy,
Angular Second Moment, Inverse Difference Moment, Area);
Hidden Layer; Output Layer (Malignant or Benign)]
Figure 6. Architecture of Artificial Neural Networks
The classification process is divided into the training
phase and the testing phase. The classifier is trained and
tested on mammogram images. In the training phase, known data
are given. In the testing phase, unknown data are given and
the classification is performed using the trained classifier.
The accuracy of the classification depends on the efficiency
of the training. Neural networks are trained by experience:
when an unknown input is fed into a trained network, it can
generalize from past experience and produce a result. Six
features are fed to the input layer. The hidden layer has 20
neurons, and the output layer produces either 1 (Benign) or
0 (Malignant).
IV. SYSTEM DESIGN
A. Design of the Proposed System
In this system, the Canny method, a thresholding technique,
the Gray Level Difference Method and an Artificial Neural
Network are applied to implement the breast cancer detection
and classification system. In the image acquisition step, we
use images from the MIAS database. A total of 80 mammograms
are used for training and testing. These images are already
preprocessed. Applying the GLDM feature extractor yields the
following values: Contrast, Angular Second Moment, Entropy,
Mean, Inverse Difference Moment and Area. The ANN classifier
is applied to these features and classifies the input image
as malignant or benign. The overall block diagram of the
system is shown in Figure 7.
[Figure 7 flow: Input: Digital Mammogram -> Image Acquisition
-> Edge Detection -> Image Segmentation -> Texture Feature
Extraction -> Classification (Artificial Neural Network) ->
Result: Malignant or Benign]
Figure 7. Overall Block Diagram of the System
V. EXPERIMENT
For the experiment we used the MIAS database. It is a
collection of 100 images. We implemented the GLDM feature
extraction method in MATLAB (V-7.1, R-12). These images are
already preprocessed. After applying the GLDM feature
extractor, the features listed in Table I are obtained. The
ANN classifier is applied to these features and classifies
the input image as malignant or non-malignant. This paper
gives results for the two images shown in Figure 8 and
Figure 9.
Figure 8. Input Image 1 for GLDM
Figure 9. Input Image 2 for GLDM
TABLE II
GRAY LEVEL DIFFERENCE METHOD EXTRACTED FEATURES
Features                    Image 1 (Benign)   Image 2 (Malignant)
Angular Second Moment       216.0473           159.2742
Contrast                    52.1763            44.7974
Inverse Difference Moment   0.9604             0.6952
Mean                        0.3044             0.2616
Entropy                     0.0117             0.0099
Area                        0                  144.2500
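To illustrate how a feature vector such as those in Table II passes through the network described in Section III-E (six inputs, one hidden layer of 20 neurons, one output), the sketch below implements a forward pass only. The sigmoid activations, the 0.5 cutoff and the randomly initialised weights are all assumptions; the weights are untrained, so the printed labels carry no diagnostic meaning, and the paper's MATLAB network may differ in every one of these choices.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def init_network(n_in=6, n_hidden=20, seed=0):
    """Untrained weights (hypothetical values); index 0 of each row is the bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(features, hidden, output):
    """Return the output activation for one feature vector."""
    h = [sigmoid(w[0] + sum(wi * x for wi, x in zip(w[1:], features)))
         for w in hidden]
    return sigmoid(output[0] + sum(wi * hi for wi, hi in zip(output[1:], h)))


def classify(features, hidden, output, cutoff=0.5):
    """Map the activation to the paper's labels: 1 (Benign) or 0 (Malignant)."""
    return 1 if forward(features, hidden, output) >= cutoff else 0


# Feature vectors from Table II, in the row order of that table.
image1 = [216.0473, 52.1763, 0.9604, 0.3044, 0.0117, 0.0]
image2 = [159.2742, 44.7974, 0.6952, 0.2616, 0.0099, 144.25]
hidden, output = init_network()
for name, vec in (("image1", image1), ("image2", image2)):
    print(name, round(forward(vec, hidden, output), 4),
          classify(vec, hidden, output))
```

In a working system the weights would be fitted on the training set, for example by backpropagation, and the inputs would normally be rescaled first, because features with large magnitudes such as those in the first rows of Table II saturate the sigmoid units.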
After extracting the features, the user runs the program to
obtain the final result. Figure 10 shows the results of
breast classification as malignant or benign.
Figure 10. Classification result of the program
To evaluate the performance of the system, known images from
a training data set and unknown images from a test data set
are used. The system’s accuracy of breast classification is
described in Table III.
TABLE III
THE ACCURACY RATE OF BREAST CLASSIFICATION
Image set      Cancer   Non-cancer   Total   Correct predictions   Accuracy rate
Training set   30       30           60      60                    100%
Testing set    50       50           100     100                   100%
VI. CONCLUSIONS
Classification is a vital stage in the performance of the
Canny-based breast cancer detection method. A GLDM feature
vector is calculated for each image cell, which improves
computational performance. The system reduces the false
positive rate, and thereby unnecessary biopsies and health
care costs as well. ANN shows very good performance in
medical diagnostic systems. Computational time is around 36
seconds for each breast classification. The system was
evaluated on 60 images containing malignant and benign masses
of different sizes and shapes. Using the ANN classifier,
breast cancer diagnosis with a training accuracy of 100% and
a testing accuracy of 100% is achieved.
ACKNOWLEDGMENT
First of all, the author is grateful to her parents, who
offered strong moral and physical support, care and kindness.
The author is highly grateful to Dr. Myint Thein, the
Pro-Rector of Mandalay Technological University, for his
permission to complete this paper. The author is deeply
thankful to Dr. Aung Myint Aye and Dr. Nang Aye Aye Htwe,
Mandalay Technological University, for their overall support
during the writing of this paper.
REFERENCES
[1] R. G. Bird, R. G. Wallace, and B. C. Yankaskas, “Analysis of
cancers missed at screening mammography,” Radiology, vol. 184,
pp. 613–617, 1992.
[2] D. B. Kopans, Breast Imaging. Philadelphia, PA: J.B. Lippincott,
pp. 81–95, 1989.
[3] M. Sampat, M. Markey, A. Bovik et al., “Computer-aided detection
and diagnosis in mammography,” Handbook of Image and Video
Processing, vol. 10, no. 4, pp. 1195–1217, 2005.
[4] R. Strickland and H. Hahn, “Wavelet transforms for detecting
microcalcifications in mammograms,” IEEE Transactions on Medical
Imaging, vol. 15, no. 2, pp. 218–229, 1996.
[5] M. Giger, F. Yin, K. Doi, C. Metz, R. Schmidt, and C. Vyborny,
“Investigation of methods for the computerized detection and
analysis of mammographic masses,” in Proceedings of SPIE, vol.
1233, 1990, p. 183.
[6] S. Liu and E. J. Delp, “Multiresolution detection of stellate
lesions in mammograms,” in Proceedings of the IEEE International
Conference on Image Processing, 1997, pp. 109–112.
[7] Y. Cairns, I. W. Ricketts, D. Folkes, M. Nimmo, P. E. Preece,
A. Thompson, and C. Walker, “The automated detection of clusters
of microcalcifications,” in Proc. Inst. Elect. Eng. Colloquium
on Applications of Image Processing in Mass Health Screening,
pp. 3/1–5, 1982.
[8] A. Papadopoulos, D. I. Fotiadis, and A. Likas, “An Automatic
Microcalcification Detection System Based on a Hybrid Neural
Network Classifier,” Artificial Intelligence in Medicine, vol. 25,
pp. 149–167, 2002.
[9] R. M. Welch, K. S. Kuo, S. K. Sengupta, and D. W. Chen, “Cloud
field classification based upon high spatial resolution textural
feature (I): gray-level cooccurrence matrix approach,” J. Geophys.
Res., vol. 93, pp. 12,663–12,681, Oct. 1988.