Breast Cancer Classification with Statistical Features of
Wavelet Coefficient of Mammograms
Shital Lahamage1, Harishchandra Patil2
1 PG Student, Cummins College of Engineering, Pune
2 Associate Professor, Cummins College of Engineering, Pune
1 sheetalsonare9@gmail.com
2 ht_patil143@yahoo.com
the lesion is more than surrounding tissues. These
abnormalities are classified into two classes, benign and
malignant. Normally, it is very tedious for a radiologist to
distinguish between benign and malignant masses. This study
involves a novel classification approach and resulted in
good accuracy rates in classifying benign and malignant masses.
The results obtained were analysed for efficiency using
performance metrics such as accuracy and sensitivity, with
the help of an SVM.
Abstract— Mammography is an X-ray imaging technique
for diagnosing breast tumors. Segmentation of the tumor in
mammogram images is a difficult task because contrast is poor
and lesions are surrounded by tissue with similar
characteristics. Feature extraction from mammogram images
is a critical task for cancer classification. In this paper, a
methodology to classify breast cancer using features extracted
from mammograms is proposed. The method includes image
enhancement, breast region (ROI) selection and the discrete
wavelet transform (DWT) for feature extraction. Contrast
Limited Adaptive Histogram Equalization (CLAHE) image
enhancement improves the image quality for processing.
DWT is used for image decomposition, and statistical features
are extracted from the low-frequency coefficients. Principal
Component Analysis (PCA) is used for data reduction and a
Support Vector Machine (SVM) for classification. The method is
applied to a set of images provided by the Mammographic
Image Analysis Society (MIAS). The performance of the
system is evaluated using a dataset containing 80 images,
and an accuracy of about 90.47% is obtained.
A number of methods have been used to classify or
detect abnormalities in mammograms. The main task of these
methods is extraction of the ROI that contains the abnormalities.
A variety of techniques have been developed for mass
detection, but most follow a two-step scheme. First,
features are computed for each pixel and each pixel is
classified. Second, regions are classified as normal or
abnormal according to features such as size, shape, or contrast.
Various techniques for pre-processing and ROI extraction
on mammograms are available in the literature [1-8].
The region of interest (ROI) is detected in [1] using Kittler’s
method for segmentation. Chengdan et al. [2] proposed
marker-controlled watershed for breast region
segmentation. Another approach, proposed in [3],
uses morphological operations and seeded region growing to remove
digitization noise, suppress artefacts and remove the
pectoral muscle to accentuate the ROI. Another approach was
suggested in [4] to classify mammograms using DWT and the
RT transform with SVM as the classifier. Maha Sharkas [5]
presented a new method for detection of microcalcifications (MCs)
using the contourlet transform and
principal component analysis (PCA) to extract features,
with SVM for classification. Andy Tirtajaya [6] proposed
a methodology based on the dual-tree complex wavelet
transform (DT-CWT) for feature extraction with an SVM
classifier to classify calcifications as benign or malignant.
The construction and evaluation of a superimposed classifier for
mammograms using DWT was proposed in [7]. A new
methodology tested in [8] uses db3 with three-level
decomposition to classify tumours as normal or abnormal,
or as benign or malignant. Pravin Hajare [9] proposed a
method using a Gabor filter, PCA and SVM for breast tissue
classification. The method proposed here uses DWT for feature
extraction, PCA for data reduction and SVM to
classify tumours into two classes, benign and malignant.
Keywords— Region of Interest (ROI), Discrete Wavelet
Transform (DWT), CLAHE enhancement, Principal
Component Analysis (PCA), Support Vector Machine (SVM).
I. INTRODUCTION
Breast cancer is the most common type of cancer in
women. Despite the tremendous growth of the medical field, the
cause of the cancer is still unknown. Therefore mammograms play
an important role in the early diagnosis of breast cancer.
Mammography is an X-ray imaging technique used for breast
cancer screening. It is simple, cheap, effective and easily
available. There are two types of mammography: film
mammography and digital mammography. For this
experimentation we have chosen digital mammography, because
good contrast is achieved over dense breast tissue, image
acquisition is fast and the patient is exposed to radiation
for only a small amount of time.
Breast cancer is a type of cancer originating from
breast tissue, most commonly from the inner lining of the ducts
or from the lobules that supply the ducts with milk. Cancer
originating from a duct is called ductal carcinoma, and cancer
originating from a lobule is called lobular carcinoma. The most
common abnormalities present in mammograms are masses and
calcifications. These are very small in size and contrast of
II. CAD SYSTEM
The CAD system consists of six parts, shown in Fig. 1:
image acquisition, pre-processing, detection (cancer
area selection), feature extraction, feature selection and
classification.
B. Image Enhancement
Before any image processing algorithm can be applied to a
mammogram, pre-processing steps are important in
order to limit the search for abnormalities without undue
influence from the background of the mammogram. Digital
mammograms are medical images that are difficult to
interpret, so a preparation phase is needed to
improve the image quality and make the detection of the cancer
area more accurate. The main objective of this
process is to improve the quality of the image and make it
ready for further processing. Here, Contrast Limited
Adaptive Histogram Equalization (CLAHE)
is applied to the image. The lesion area enhanced by CLAHE,
shown in Fig. 2, is used for further analysis.
Figure: 2 Left: Original Mammogram, Right: Enhanced Image
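The contrast-limiting idea behind CLAHE can be illustrated with a minimal sketch. The code below is a simplified, global variant written for illustration only: it clips the histogram of the whole image and equalizes once, whereas full CLAHE does this per tile and interpolates between neighbouring tiles. The function name `clipped_equalize` and the `clip_fraction` parameter are assumptions, not part of the method described in this paper.

```python
def clipped_equalize(image, clip_fraction=0.01, levels=256):
    """Equalize an 8-bit grayscale image (list of rows) using a clipped histogram.

    Simplified global illustration; real CLAHE works tile by tile.
    """
    pixels = [v for row in image for v in row]
    total = len(pixels)

    # Histogram of grey levels.
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1

    # Clip each bin at the limit and spread the clipped excess over all bins.
    limit = max(1, int(clip_fraction * total))
    excess = sum(max(0, h - limit) for h in hist)
    hist = [min(h, limit) + excess / levels for h in hist]

    # Cumulative distribution, rescaled to the output grey range.
    mapping, running = [], 0.0
    for h in hist:
        running += h
        mapping.append(round((levels - 1) * running / total))

    return [[mapping[v] for v in row] for row in image]
```

A smaller `clip_fraction` limits how strongly any single grey level can be stretched, which is what keeps noise in flat regions from being over-amplified.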
C. Detection / Breast Region Selection
Figure: 1 Classification System
The original mammograms are 1024x1024 pixels, and almost
50% of the images contain a lot of noise. Therefore a manual
cropping operation is applied to remove
unwanted portions of the image such as labels and artefacts.
The breast region is cropped according to the x, y image
coordinates of the centre of the abnormality and the approximate
radius (in pixels) of a circle enclosing the abnormality, and is
then resized to 256x256. The selected breast
region containing the abnormality is shown in Fig. 3.
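The cropping step can be sketched as follows. This is an illustrative sketch only: the cropping in this work is manual, and the helper names `crop_roi` and `resize_nearest` are hypothetical. The sketch assumes the centre is already expressed as (column, row) with the origin at the top-left corner; the coordinate origin used by the MIAS annotations should be checked, and if y is measured from the bottom edge it must first be converted with row = height - 1 - y. Nearest-neighbour resizing is likewise an assumption.

```python
def crop_roi(image, cx, cy, radius):
    """Crop the square that bounds the circle (cx, cy, radius).

    cx is a column index and cy a row index (top-left origin assumed).
    The square is clipped at the image borders.
    """
    height, width = len(image), len(image[0])
    top, bottom = max(0, cy - radius), min(height, cy + radius + 1)
    left, right = max(0, cx - radius), min(width, cx + radius + 1)
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(image, out_h, out_w):
    """Resize a 2-D list to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]
```

For a mammogram, a call such as `resize_nearest(crop_roi(image, cx, cy, radius), 256, 256)` would give the 256x256 region described above.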
A. Image Acquisition
In this study, 80 cancerous mammography images were selected
from MIAS, which currently has 322 “normal”, “benign” and
“malign” cases [11]. Only
circumscribed masses, ill-defined masses, spiculated masses,
architectural distortion and asymmetry are considered.
MIAS provides the following associated patient and image
information.
There are three major groups for classifying breast density:
• Fatty (F) (106 images).
• Fatty-glandular (G) (104 images).
• Dense-glandular (D) (112 images).
The abnormalities are also described with their kind:
• CALC Calcification.
• CIRC Well-defined/circumscribed masses.
• SPIC Spiculated masses.
• MISC Other, ill-defined masses.
• ARCH Architectural distortion.
Figure: 3 Breast cancer area of image
Information about the x, y image coordinates of the centre of the
abnormality and the approximate radius (in pixels) of a circle
enclosing the abnormality is also provided.
D. Feature Extraction - Discrete Wavelet Transform
The discrete wavelet transform (DWT) is a linear
transformation that operates on a data vector whose length
is an integer power of two, transforming it into a
numerically different vector of the same length. It is a tool
that separates data into different frequency components
and then studies each component with a resolution matched
to its scale. The DWT is computed with a cascade of filtering
followed by factor-2 sub-sampling (Fig. 4).
Figure: 4 DWT Tree
Figure: 6 Sub-band images
H and L denote the high-pass and low-pass filters respectively,
and ↓2 denotes sub-sampling by two. The outputs of these filters are
given by equations (1) and (2):
$$a_{j+1}[p] = \sum_{n=-\infty}^{\infty} l[n-2p]\, a_j[n] \tag{1}$$
$$d_{j+1}[p] = \sum_{n=-\infty}^{\infty} h[n-2p]\, a_j[n] \tag{2}$$
Decomposing the rows results in two intermediate sub-images.
Then the same procedure is applied to each column of the
intermediate sub-images. For one-level decomposition,
this yields four quarter-sized sub-images LL(m, n),
LH(m, n), HL(m, n) and HH(m, n). In hierarchical
wavelet decomposition, the sub-image LL is further
decomposed into another four sub-images. The texture
features selected for the mammograms are listed in Table 1.
The elements a_j are used for the next step (scale) of the
transform, and the elements d_j, called wavelet coefficients,
form the output of the transform. l[n] and h[n] are the
coefficients of the low-pass and high-pass filters respectively.
On scale j+1 there are only half as many a and d elements
as on scale j. Consequently, the DWT can be repeated until
only two a_j elements remain in the
analysed signal; these elements are called scaling function
coefficients. The DWT algorithm for two-dimensional pictures
is similar: the DWT is performed first for all image rows
and then for all columns, as shown in Fig. 5.
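The filter-and-subsample step of equations (1) and (2) and its row-then-column extension can be sketched as below. The sketch assumes the orthonormal Haar wavelet, chosen only because its two-tap filters keep the code short; the paper does not state which wavelet was used. The function names `dwt_step` and `dwt2_level` are hypothetical, and sub-band naming conventions (which of LH and HL is which) vary between texts.

```python
import math

# Orthonormal Haar analysis filters (an assumed choice of wavelet).
LOW = (1 / math.sqrt(2), 1 / math.sqrt(2))
HIGH = (1 / math.sqrt(2), -1 / math.sqrt(2))


def dwt_step(signal):
    """One analysis step: filter, then keep every second sample.

    With two-tap filters, a[p] = l[0]*x[2p] + l[1]*x[2p+1], and likewise for d.
    The signal length is assumed to be even.
    """
    approx, detail = [], []
    for p in range(len(signal) // 2):
        first, second = signal[2 * p], signal[2 * p + 1]
        approx.append(LOW[0] * first + LOW[1] * second)
        detail.append(HIGH[0] * first + HIGH[1] * second)
    return approx, detail


def dwt2_level(image):
    """One 2-D level: transform all rows, then all columns of each half."""
    low_rows, high_rows = zip(*(dwt_step(row) for row in image))

    def columns(block):
        # Transform every column, then transpose back to row-major order.
        cols = [dwt_step(list(col)) for col in zip(*block)]
        low = [list(r) for r in zip(*(c[0] for c in cols))]
        high = [list(r) for r in zip(*(c[1] for c in cols))]
        return low, high

    ll, lh = columns(low_rows)    # low-pass along rows
    hl, hh = columns(high_rows)   # high-pass along rows
    return ll, lh, hl, hh
```

A second decomposition level is obtained by calling `dwt2_level` again on the returned `ll` sub-image.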
Figure: 5 Wavelet decomposition for two-dimensional pictures

The main feature of the DWT is its multistage representation of a
function. Using wavelets, a given function can be
analysed at various levels of resolution. The DWT is also
invertible and can be orthogonal [14].

TEXTURE FEATURES:
In this work only one set of DWT-derived features is
considered. It is a vector containing features of the
wavelet coefficients calculated in the sub-bands at successive
scales. As a result of this transform there are four sub-band
images at each scale (Fig. 4).

Features              Formulas
Mean                  $\mu = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} p(i,j)$
Standard Deviation    $\sigma = \sqrt{\frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} (p(i,j) - \mu)^2}$
Energy                $E = \sum_{i,j} p(i,j)^2$
Entropy               $h = -\sum_{k=0}^{L-1} Pr_k \log_2 Pr_k$
Skewness              $S = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( \frac{p(i,j) - \mu}{\sigma} \right)^3$
Variance              $Var = (S.D)^2 = \sigma^2$
Homogeneity           $H = \sum_{i,j} \frac{p(i,j)}{1 + |i - j|}$
Kurtosis              $K = \left\{ \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[ \frac{p(i,j) - \mu}{\sigma} \right]^4 \right\} - 3$
Smoothness            $R = 1 - \frac{1}{1 + \sigma^2}$
Table 1: Texture & Statistical Feature
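The features of Table 1 can be computed from a sub-band of coefficients roughly as follows. This is a minimal sketch under stated assumptions: the function name `statistical_features` is hypothetical, skewness is taken as the standard third standardised moment, the entropy is computed from a histogram with an assumed number of bins (`bins`), and the standardised moments are set to zero for a constant block to avoid division by zero.

```python
import math


def statistical_features(coeffs, bins=16):
    """Compute the Table 1 features for a 2-D list of coefficients p(i, j)."""
    values = [v for row in coeffs for v in row]
    count = len(values)

    mean = sum(values) / count
    variance = sum((v - mean) ** 2 for v in values) / count
    std = math.sqrt(variance)
    energy = sum(v * v for v in values)

    # Standardised moments; defined as 0 for a constant block (std == 0).
    if std > 0:
        skewness = sum(((v - mean) / std) ** 3 for v in values) / count
        kurtosis = sum(((v - mean) / std) ** 4 for v in values) / count - 3
    else:
        skewness = kurtosis = 0.0

    # Entropy of the value histogram (the bin count is an assumption).
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    hist = [0] * bins
    for v in values:
        hist[min(bins - 1, int((v - lo) / width))] += 1
    entropy = -sum((h / count) * math.log2(h / count) for h in hist if h)

    homogeneity = sum(
        v / (1 + abs(i - j))
        for i, row in enumerate(coeffs)
        for j, v in enumerate(row)
    )
    smoothness = 1 - 1 / (1 + variance)

    return {
        "mean": mean, "std": std, "variance": variance, "energy": energy,
        "entropy": entropy, "skewness": skewness, "kurtosis": kurtosis,
        "homogeneity": homogeneity, "smoothness": smoothness,
    }
```

Applied to each low-frequency sub-band in turn, the returned values can be concatenated into one feature vector per image.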
E. Feature Selection
Feature selection and dimensionality reduction eliminate
data that are closely related to other
data items in a set. The result is a smaller set of features
that preserves most of the properties of the original
large data set. A commonly used dimensionality reduction
technique is Principal Component Analysis (PCA).
PCA is a mathematical
procedure that uses an orthogonal transformation to convert a
set of observations of possibly correlated variables into a
set of values of linearly uncorrelated variables called
principal components. It is a useful statistical technique that
has found application in fields such as face recognition and image
compression, and it is a common technique for finding
patterns in data of high dimension. PCA is the simplest
of the true eigenvector-based multivariate analyses. Its
operation can be thought of as revealing the internal
structure of the data in a way that best explains the variance
in the data. If a multivariate dataset is visualized as a set of
coordinates in a high-dimensional data space, PCA can
supply the user with a lower-dimensional picture of this
object when viewed from its most informative viewpoint.
This is done by using only the first few principal
components, so that the dimensionality of the transformed
data is reduced.
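The projection onto the first few principal components can be sketched without external libraries as below. This is an illustrative sketch, not the implementation used in this work: it finds the leading eigenvectors of the covariance matrix by power iteration with deflation, uses a fixed start vector for reproducibility (which must not be orthogonal to the leading eigenvector), and assumes `n_components` does not exceed the rank of the data. The function name `pca` and its parameters are hypothetical.

```python
import math


def pca(rows, n_components, iterations=200):
    """Return (scores, components) for the top principal components."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centred = [[r[k] - means[k] for k in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(d)]
        for a in range(d)
    ]

    components = []
    for _ in range(n_components):
        # Fixed, normalised start vector (an arbitrary but reproducible choice).
        start = [1.0 / (k + 1) for k in range(d)]
        scale = math.sqrt(sum(x * x for x in start))
        vec = [x / scale for x in start]

        # Power iteration: repeated multiplication converges to the
        # eigenvector with the largest eigenvalue.
        for _step in range(iterations):
            nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0:
                break
            vec = [x / norm for x in nxt]

        eigenvalue = sum(
            vec[a] * cov[a][b] * vec[b] for a in range(d) for b in range(d)
        )
        components.append(vec)

        # Deflation: remove the variance explained by this component.
        cov = [
            [cov[a][b] - eigenvalue * vec[a] * vec[b] for b in range(d)]
            for a in range(d)
        ]

    scores = [
        [sum(c[k] * comp[k] for k in range(d)) for comp in components]
        for c in centred
    ]
    return scores, components
```

Each row of `scores` is the reduced feature vector for the corresponding input row.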
Figure 7: Support Vector Machine with a hyperplane
Given a set of training
examples, each marked as belonging to one of two
categories, an SVM training algorithm builds a model that
predicts whether a new example falls into one category or
the other. More formally, a support vector machine
constructs a hyperplane or set of hyperplanes in a high- or
infinite-dimensional space, which can be used for
classification, regression or other tasks. Intuitively, a good
separation is achieved by the hyperplane that has the
largest distance to the nearest training data points of any
class (the so-called functional margin), since in general the
larger the margin, the lower the generalization error of the
classifier.
The basic principle of SVMs is that of a maximum margin
classifier. Using kernel methods, the data can first be
implicitly mapped to a high-dimensional kernel space. The
maximum margin classifier is determined in the kernel space,
and the corresponding decision function in the original
space can be non-linear. Data that are not linearly separable in
the original feature space can thus become linearly separable in
the kernel space. This is illustrated in Fig. 8. The aim of the
SVM classification method is to find an optimal hyperplane
separating relevant and irrelevant vectors by
maximizing the size of the margin (between both classes).
F. Classification
Support vector machines (SVMs) are based on the
Structural Risk Minimization principle from statistical
learning theory. SVMs have also been applied to different
real-world problems such as face recognition, cancer diagnosis
and text categorization. The idea of structural risk
minimization is to find a hypothesis h with the lowest true
error. In their basic form, support vector machines find the
hyperplane that separates the training data with maximum
margin. SVM is a useful technique for data classification.
A classification task usually involves training and
testing data consisting of data instances. Each
instance in the training set contains one “target value”
(class label) and several “attributes” (features). The
standard SVM (Fig. 7) takes a set of input data and predicts,
for each given input, which of two possible classes the
input belongs to, which makes the SVM a non-probabilistic
binary linear classifier.
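A linear maximum-margin classifier of the kind described above can be sketched with plain sub-gradient descent on the regularised hinge loss. This is only an illustration of the principle, not the solver used in this work: practical SVM libraries use dedicated optimisers and support kernels. The function names, the labels (+1 for malignant, -1 for benign) and the hyper-parameters `lam`, `epochs` and `rate` are all assumptions.

```python
def train_linear_svm(samples, labels, lam=0.01, epochs=1000, rate=0.1):
    """Fit weights w and bias b by sub-gradient descent on the hinge loss.

    Minimises lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))).
    Labels must be +1 or -1.
    """
    n, d = len(samples), len(samples[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wk for wk in w], 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wk * xk for wk, xk in zip(w, x)) + b)
            if margin < 1:  # inside the margin or misclassified
                for k in range(d):
                    grad_w[k] -= y * x[k] / n
                grad_b -= y / n
        w = [wk - rate * gk for wk, gk in zip(w, grad_w)]
        b -= rate * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wk * xk for wk, xk in zip(w, x)) + b >= 0 else -1
```

In the pipeline described here, each sample would be the PCA-reduced feature vector of one ROI.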
Figure 8: The function f embeds the data from the original space (a) into the kernel
space (b), where the non-linear pattern becomes linear.
III. EXPERIMENTAL WORK & RESULT
This section is divided into two parts: the first is SVM
classification with the training and testing datasets, and the
second is single-image testing with the SVM.
A. Training-Testing Dataset
Each instance in the training set contains one “target value” (class label),
benign or malignant, together with several texture and
statistical features of the image. The testing dataset also consists
of a number of images on which the classification process is tested;
here, 10 images are used as the testing dataset. The cropped
ROIs are saved as the training and testing datasets. Each ROI is cropped
according to the x, y coordinates of the centre of the abnormality
and the radius of the lesion. The dataset used in this work is listed in
Table 2.
Dataset       Benign   Malign
Training      5        5
Testing       5        5
Other Data    24       36
Table 2 Dataset

Figure: 9 (a) Original image, (b) Enhanced image, (c) Cropped ROI, (d) 1st
level decomposed ROI, (e) 2nd level decomposed ROI
B. Single Image Testing
In the following, we give a few examples to
show the application of the proposed method. Here we used
60 different mammogram images, all digitized
at a resolution of 1024×1024 pixels. Since these images
were stored in JPEG format, they were converted to grayscale
images. Each selected image was enhanced, cropped manually
according to the information given by MIAS and resized to
256×256 pixels. The proposed algorithm uses DWT to
decompose the image at 2 levels for feature extraction, and
the extracted features are then given to the SVM for classification.
Examples of 4 images are shown in Fig. 9 for single-image
testing.
C. Performance measures
We have tested the performance of the classifier by
calculating and analysing accuracy, sensitivity and
specificity for malignant and benign detection. With TP, TN, FP
and FN denoting the numbers of true positives, true negatives,
false positives and false negatives (malignant being the positive
class), these are defined as follows:
Accuracy: number of correctly classified masses / total number of
masses.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$
Sensitivity: number of correctly classified malignant masses
/ total number of malignant masses.
$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{4}$$
Specificity: number of correctly classified benign masses /
total number of benign masses.
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{5}$$
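Equations (3) to (5) translate directly into code. The sketch below is illustrative: the function name `performance_measures` is hypothetical, and the counts in the usage note are made-up numbers chosen for easy arithmetic, not results from this work.

```python
def performance_measures(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    tp/fn count malignant masses classified correctly/incorrectly;
    tn/fp count benign masses classified correctly/incorrectly.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity
```

For example, with the hypothetical counts TP = 8, TN = 9, FP = 1 and FN = 2, the function returns an accuracy of 17/20, a sensitivity of 8/10 and a specificity of 9/10.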
The accuracy, sensitivity and specificity of the proposed DWT-based
method are given in Table 3, together with results previously
obtained by others.

Method               Accuracy %   Sensitivity %   Specificity %
DWT [14]             89%          87%             87%
Gabor Wavelet [14]   86%          89%             85%
DWT [15]             89.41%       95.56%          82.5%
Proposed             90.47%       91.42%          89.79%
Table: 3 Performance Measures

IV. CONCLUSION
In this work, breast cancer classification is carried out with
good results. Breast region enhancement is achieved using the
CLAHE enhancement technique. The manual cropping method
correctly extracts the particular region containing the abnormality.
In the proposed algorithm, multi-resolution image
analysis is performed to obtain a decomposed image with
DWT for feature extraction. With the help of PCA, we
obtained the features that best explain
the variance in the data. The extracted features improved the
classification result obtained with the SVM. From the final result,
we see that we achieved a good classification accuracy of about
90.47%, with sensitivity 91.42% and specificity 89.79%, for
all types of lesions.

References
[1] Pragathi J, H. T. Patil, “Segmentation Method for ROI Detection in
Mammogram Images using Wiener Filter and Kittler’s Method”,
Department of Instrumentation & Control Engineering, Cummins
College of Engineering, Pune, International Conference on Recent
Trends in Engineering & Technology, 2013.
[2] Chengadan Pei, Chunmei Wang, Shengzhou Xu, “Segmentation of
the Breast Region in Mammograms using Marker-controlled
Watershed Transform”, IEEE.
[3] Jawad Nagi, Sameem Abdul Kareem, Farrukh Nagi, Syed Khaleel
Ahmed, “Automated Breast Profile Segmentation for ROI Detection
Using Digital Mammogram”, College of Engineering, University of
Malaya, Malaysia, IEEE EMBS Conference on Biomedical
Engineering & Science, pp. 87-92, 2010.
[4] Salim Lahmiri, Mounir Boukadoum, “DWT and RT-Based Approach
for Feature Extraction and Classification of Mammograms with
SVM”, Department of Computer Science, University of Quebec at
Montreal, IEEE, pp. 412-415, 2011.
[5] Maha Sharkas, Mohamed Al-Sharkawy, “Detection of
Microcalcification in Mammograms Using Support Vector Machine”,
Department of Electronics & Communication, AAST, IEEE, pp.
179-184, 2011.
[6] Andy Tirtajaya, Diaz D. Santika, “Classification of
Microcalcification Using Dual-Tree Complex Wavelet Transform
and Support Vector Machine”, IEEE, 2nd International Conference on
Advances in Computing, Control & Telecommunication
Technologies, pp. 164-166, 2010.
[7] Cristiane Bastos Rocha Ferreira, Dibio Leandro Borges, “Analysis of
Mammogram Classification Using a Wavelet Transform
Decomposition”, Elsevier Science, pp. 973-982, 2002.
[8] Ibrahima Faye, Brahim Belhaouari Samir, Mohamed M. M.
Eltoukhy, “Digital Mammograms Classification Using a Wavelet
Based Feature Extraction Method”, IEEE, 2nd International
Conference on Computer & Electrical Engineering, pp. 318-322,
2009.
[9] Pravin S. Hajare, Vaibhav V. Dixit, “Breast Tissue Classification
Using Gabor Filter, PCA and Support Vector Machine”, International
Journal of advancement in electronics and computer engineering
(IJAECE), Volume 1, Issue 4, 2012.
[10] Pragathi. J, H. T. Patil, “Multiresolution Analysis for
Computer-Aided Mass Detection in Mammogram Using Pixel Based
Segmentation Method”, International Conference on Recent Trends in
Information Technology (ICRTIT), pp. 214-220, 2003.
[11] http://peipa.essex.ac.uk/ipa/pix/mias/
[12] Lori Mann Bruce, Reza R. Adhami, “Classifying Mammographic
Mass Shapes Using the Wavelet Transform Modulus-Maxima
Method”, IEEE Transaction On Medical Imaging, vol. 18, pp. 214-220,
1999.
[13] Wang T.C. and Karayiannis N.B., “Detection of microcalcifications
in digital mammograms using wavelets”, IEEE Trans. Med. Imaging,
vol. 17, no. 4, pp. 498-509, 1998.
[14] S. M. Salve, V. A. Chakkarwar, “Classification of Mammographic
images using Gabor Wavelet and Discrete Wavelet Transform”,
International Journal of Advanced Research in Electronics and
Communication Engineering (IJARECE), Volume 2, Issue 5, May
2013.
[15] J. Anitha, J. Dinesh Peter, “A Wavelet Based Morphological Mass
Detection and Classification in Mammograms”, 2012 IEEE.