International Journal of Engineering Trends and Technology (IJETT) – Volume 20 Number 3 – Feb 2015
Retrieval of Compressed Medical Images Using Data Mining Techniques
Enireddy Vamsidhar#1, Dr. Reddi Kiran Kumar*2
# Research Scholar, Department of CSE, JNTUK, Kakinada, Andhra Pradesh, INDIA
* Asst. Professor, Department of Computer Science, Krishna University, Machilipatnam, Andhra Pradesh, INDIA
Abstract— Advances in technology in the medical field are creating a large amount of digital data in the form of digital medical images. These images are stored in large databases for easy accessibility, and Image Retrieval (IR) is used to retrieve diagnostic cases similar to a query medical image, helping the healthcare professional in analysing the query image. Image compression is utilized for the storage and transmission of the images. A study has been done on the retrieval of compressed images. The proposed method integrates content-based image retrieval of diagnostic cases similar to the query medical image with image compression techniques to minimize bandwidth utilization. The Daubechies wavelet is used for image compression with minimal loss. Edge and texture features are extracted from the compressed medical images using the Sobel edge detector and Gabor transforms respectively. The features are then reduced using information gain, and the classification accuracy of retrieval is evaluated using Naïve Bayes, Support Vector Machine, IBL, CART and Random Forest.
Keywords— Medical Images, Image retrieval, Compression,
Data Mining, Support Vector Machine, Naïve Bayes, IBL,
CART, Random Forest
I. INTRODUCTION
With the advance of medical technologies, digital images such as X-rays, MRI, ECG and CT have become the norm for diagnosis and treatment. These digital medical images are stored in large databases for easy accessibility, and content-based image retrieval (CBIR) is used to retrieve diagnostic cases similar to the query medical image [1][2]. Image retrieval using conventional methods like indexing or semantics is not feasible, as the databases contain a huge amount of data and the image content is more versatile than the semantics. CBIR extracts relevant features from the images using different algorithms; on presenting a query image, images are retrieved from the database based on these features. Features such as colour, texture and shape are automatically extracted by CBIR systems. Similarity measures are used to compare the features extracted from the query image with the features of the images stored in the database, and images with features similar to those of the query are retrieved.
CBIR is now widely applied in medical image applications; many CBIR systems are reviewed in the literature [3][4]. A major problem is storing the large amount of diagnostic data in the form of medical images; the efficient transmission of this data within the available bandwidth is another task. Image compression can be utilized to reduce the amount of data [5]. During the compression process, redundancies in the image are removed, resulting in a compact representation of the image. Compression is of two types: lossless compression and lossy compression. In lossless compression the original image is perfectly recovered, whereas in lossy compression a minor loss of detail occurs when the image is recovered. The major advantage of lossy compression is that a high compression ratio is achieved. Medical image compression cannot afford to lose any details on recovery of the image, as loss of information in a diagnostically important region may lead to problems. Thus, the compression ratio achieved through lossless compression for medical images is very low. A commonly used approach to overcome this issue is to segment the medical image into a region of interest (ROI) and the rest of the image: the ROI is compressed using lossless compression and the non-ROI is compressed using lossy compression, achieving a better compression ratio while preserving the quality of the diagnostically crucial region.
In this paper, the retrieval of diagnostic cases similar to a query medical image, where the images are compressed to minimize bandwidth utilization, is investigated. The Daubechies wavelet is used for image compression with a decomposition level of one to reduce the losses. Edge and texture features are extracted from the compressed medical images using the Sobel edge detector and Gabor transforms respectively. The classification accuracy of retrieval is evaluated using Naïve Bayes, Support Vector Machine, CART, IBL and Random Forest.
II. METHODOLOGY
In this paper the medical images were compressed using the Daubechies wavelet with a decomposition level of one so as to obtain a high PSNR. The low-level features, edges and textures, are extracted from the compressed medical images using the Sobel edge detector and Gabor transforms. Feature reduction is done using the gain ratio, and the reduced features are used for classification. Fig. 1 shows the detailed methodology of the work.
[Fig 1. Detailed methodology: MRI input image → image compression (Daubechies wavelet) → texture features using Gabor filter and edge features using Sobel edge detector → feature selection (information gain) → classification and performance measurement]
Daubechies had a lasting impact on the field with her
construction of the first family of compactly supported,
orthogonal wavelet bases [6]. Due to their remarkable
properties and ease of implementation, the Daubechies
wavelets have become popular and led to a number of
successful signal processing applications, such as
compression, denoising, classification, or fusion.
In the Daubechies wavelet transforms, the scaling signals and wavelets have slightly longer supports, i.e., they produce averages and differences using just a few more values from the signal. This change, however, provides a tremendous improvement in the capabilities of these transforms, providing a set of powerful tools for performing basic signal processing tasks. The Daub4 wavelet transform is defined in essentially the same way as the Haar wavelet transform. If a signal $f$ has an even number $N$ of values, then the 1-level Daub4 transform is the mapping $f \xrightarrow{D_1} (\mathbf{a}^1 \mid \mathbf{d}^1)$ from the signal $f$ to its first trend sub-signal $\mathbf{a}^1$ and first fluctuation sub-signal $\mathbf{d}^1$. Each value $a_m$ of $\mathbf{a}^1 = (a_1, \ldots, a_{N/2})$ is equal to a scalar product $a_m = f \cdot V^1_m$ of $f$ with a 1-level scaling signal $V^1_m$. Likewise, each value $d_m$ of $\mathbf{d}^1 = (d_1, \ldots, d_{N/2})$ is equal to a scalar product $d_m = f \cdot W^1_m$ of $f$ with a 1-level wavelet $W^1_m$.
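As a concrete illustration, here is a minimal sketch of a one-level 2-D Daubechies decomposition and reconstruction, assuming the PyWavelets library (PyWavelets names the four-tap Daub4 filter "db2"); the optional thresholding of the fluctuation coefficients is an illustrative addition, not part of the transform itself:

```python
# Minimal one-level 2-D Daub4 analysis/synthesis sketch (PyWavelets "db2").
import numpy as np
import pywt

def compress_one_level(image: np.ndarray, threshold: float = 0.0) -> np.ndarray:
    # Trend sub-image cA and fluctuation sub-images cH, cV, cD.
    cA, (cH, cV, cD) = pywt.dwt2(image.astype(float), "db2")
    # Optionally zero small fluctuation values; threshold=0.0 keeps the
    # round trip lossless up to floating-point error.
    cH, cV, cD = (np.where(np.abs(c) < threshold, 0.0, c) for c in (cH, cV, cD))
    return pywt.idwt2((cA, (cH, cV, cD)), "db2")
```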
In image processing and computer vision, edge detection is a fundamental tool, particularly in the areas of feature detection and feature extraction, which aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities [7]. Detecting the edges in an image is a very important step towards understanding its features. Edges contain meaningful features and carry significant information. Edge detection significantly reduces the amount of image data and filters out information that may be regarded as less relevant, while preserving the important structural properties of an image [8]. Using the information in the edges of an image, redundancies can sometimes be removed [9]; eliminating this redundancy can be done through edge detection. When image edges are detected, every kind of redundancy present in
image edges are detected, every kind of redundancy present in
the image is removed [10]. The Sobel Edge Detector generates
a series of gradient magnitudes with a simple convolution
kernel. The gradient of an image say f(x,y) at the location
(x,y) is given by the vector
=
=
+2P4 +P7 ) where P1 to P9 are the pixels in the sub image as
shown in Figure 2 [12].
P1
P2
P3
P4
P5
P6
P7
P8
P9
)=
+
1/2
The magnitude of the gradient is approximated as∆ = | | +
and the direction of the gradient vector is given by
( , )=
where the angle is measured along the x-
axis. The equivalent digital form of the gradient is given by
Sobel operators and the equation is given by Gx=(P7 +2P8
+P9 ) - (P1 +2P2 +P3 ) and similarly Gy=(P3 +2P6 +P9) - (P1
ISSN: 2231-5381
-1
0
0
0
1
2
1
(b)
-1
0
1
-2
0
2
-1
0
1
(c)
Fig.2 shows Sobel masks. (a) Sub image (b)Sobel mask for horizontal
direction (c) Sobel mask for vertical direction
The masks in Figure 2 (b) computes Gx at the centre of the
3X3 region and the other is used to compute Gy .Gabor filters
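A minimal sketch of this edge-feature step, assuming NumPy and SciPy are available; the convolution routine is an illustrative choice rather than the authors' implementation:

```python
# Sobel gradient sketch using the masks of Fig. 2.
import numpy as np
from scipy.ndimage import convolve

KX = np.array([[-1, -2, -1],
               [ 0,  0,  0],
               [ 1,  2,  1]], dtype=float)  # Fig. 2(b): horizontal mask (Gx)
KY = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)    # Fig. 2(c): vertical mask (Gy)

def sobel_magnitude(image: np.ndarray) -> np.ndarray:
    gx = convolve(image.astype(float), KX)
    gy = convolve(image.astype(float), KY)
    return np.abs(gx) + np.abs(gy)          # the |Gx| + |Gy| approximation
```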
Gabor filters model texture for image interpretation tasks, exploiting the strong relations between the outputs of different filters. Texture can be defined as the regular repetition of an element or pattern on a surface. The Gabor filter is capable of multi-scale and multi-resolution analysis and consists of a tunable band-pass filter with selectivity for orientation, spectral bandwidth and spatial extent. Visually different image regions can have the same first-order statistics; using second-order statistics improves the situation by taking into account not just grey pixel levels but also the spatial relationships between them. The Gabor filter and Gabor transform are governed by the "Uncertainty Principle" [13], and the Gabor function provides accurate time-frequency localization. A two-dimensional Gabor function $g(x,y)$ and its Fourier transform $G(u,v)$ are given by:

$$g(x,y) = \exp\left( -\frac{x^2 + y^2}{2\sigma^2} \right) \exp\big( j\omega (x\cos\theta + y\sin\theta) \big)$$

where $\sigma$ is the spatial spread, $\omega$ is the frequency and $\theta$ is the orientation, and

$$G(u,v) = \exp\left\{ -\frac{1}{2} \left[ \frac{(u-W)^2}{\sigma_u^2} + \frac{v^2}{\sigma_v^2} \right] \right\}$$

where $\sigma_u = 1/(2\pi\sigma_x)$ and $\sigma_v = 1/(2\pi\sigma_y)$.
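A minimal sketch of Gabor texture features, assuming scikit-image's gabor filter; the frequency/orientation grid and the mean/standard-deviation statistics are illustrative assumptions:

```python
# Gabor texture-feature sketch over a small frequency/orientation bank.
import numpy as np
from skimage.filters import gabor

def gabor_features(image: np.ndarray,
                   frequencies=(0.1, 0.2, 0.4),
                   n_orientations=4) -> np.ndarray:
    feats = []
    for freq in frequencies:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations          # orientation of the filter
            real, imag = gabor(image, frequency=freq, theta=theta)
            mag = np.hypot(real, imag)                   # filter response magnitude
            feats.extend([mag.mean(), mag.std()])        # simple per-filter statistics
    return np.asarray(feats)
```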
On pre-processing the data, the features of the data set are
identified as either being significant to the classification
process, or redundant. These redundant features can be
removed and this process is known as feature selection.
Redundant features are generally found to be closely
correlated with one or more other features. As a result,
omitting them from the process does not degrade classification accuracy. In fact, the accuracy may improve due to the resulting data reduction and the removal of noise and measurement errors associated with the omitted features. Therefore, choosing a good subset of features proves significant in improving the performance of the system [14]. In this work the gain ratio is used, a modification of the information gain that solves the bias towards features with a larger set of values exhibited by information gain. When choosing an attribute, the gain ratio takes the number and size of branches into account: it corrects the information gain by the intrinsic information of a split (how much information is needed to know which branch an instance belongs to), where the intrinsic information is the entropy of the distribution of instances into branches; it is large when the instances are spread evenly over the branches and small when all instances belong to one branch. For a given class attribute y and feature x, the calculation is as follows:
$$\text{gain ratio}(y, x) = \frac{\text{gain}(y, x)}{\text{intrinsic info}(x)}$$

where

$$\text{intrinsic info}(x) = -\sum_i \frac{|s_i|}{|s|} \log_2 \frac{|s_i|}{|s|}$$

Here $|s|$ is the total number of instances and $|s_i|$ is the number of instances taking the $i$-th value of feature x. Applying the gain ratio to every dataset feature provides an estimate of each feature's importance, with all features ranked from the most influential to the least by sorting the gain ratios. The top k features are then used to construct a simple classifier.
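A minimal sketch of this gain-ratio computation for a discrete feature, in plain NumPy; it is written for clarity rather than speed:

```python
# Gain-ratio sketch for a discrete feature x against class labels y.
import numpy as np

def entropy(labels: np.ndarray) -> float:
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gain_ratio(y: np.ndarray, x: np.ndarray) -> float:
    values, counts = np.unique(x, return_counts=True)
    weights = counts / counts.sum()                        # |s_i| / |s|
    cond = sum(w * entropy(y[x == v]) for v, w in zip(values, weights))
    info_gain = entropy(y) - cond                          # gain(y, x)
    intrinsic = float(-(weights * np.log2(weights)).sum()) # intrinsic info(x)
    return info_gain / intrinsic if intrinsic > 0.0 else 0.0
```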
These selected features are utilized for the classification of the images. A database of 100 images, containing lung and brain images, was taken. The classification accuracy of retrieval is evaluated using Naïve Bayes, Support Vector Machine, IBL, CART and Random Forest.
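A hedged sketch of this evaluation step using scikit-learn stand-ins for the five classifiers (GaussianNB, SVC, KNeighborsClassifier for IBL, DecisionTreeClassifier for CART, RandomForestClassifier); the synthetic data is a placeholder for the extracted image features, not the paper's dataset:

```python
# Evaluation sketch: five stand-in classifiers scored by 10-fold accuracy.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder for the 100-image feature matrix and lung/brain labels.
X, y = make_classification(n_samples=100, n_features=20, random_state=0)

classifiers = {
    "Naive Bayes": GaussianNB(),
    "C-SVM (linear kernel)": SVC(kernel="linear"),
    "IBL (1-NN)": KNeighborsClassifier(n_neighbors=1),
    "CART": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=10)  # 10-fold classification accuracy
    print(f"{name}: {scores.mean():.2%}")
```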
Bayesian classification represents a supervised learning method as well as a statistical method for classification. Naïve Bayes uses Bayes' theorem and is a probabilistic method used for prediction. Since it is a supervised learning method, during training the conditional probabilities of each attribute given the predicted class are estimated from the training data set. Even from small training data, the means and variances of the parameters can be obtained, which is sufficient for classification. Naïve Bayes is commonly used as it provides good results and easy interpretation of the results [15]. The disadvantage is that the classifier assumes the occurrences of attributes to be independent; therefore the correlation between the attributes is ignored.

Naïve Bayes classifies a given input, represented by its feature vector, to the most likely class. Learning is simplified by the assumption that the features are independent given the class:

$$P(X \mid C) = \prod_{i=1}^{n} P(X_i \mid C)$$

where $X = (X_1, \ldots, X_n)$ is the feature vector and $C$ is a class.
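A toy numeric illustration of this factorization, with made-up per-feature conditional probabilities for a single class:

```python
# Toy illustration: multiply per-feature conditionals for one class.
p_xi_given_c = [0.9, 0.6, 0.75]   # hypothetical P(X_i | C) values
p_x_given_c = 1.0
for p in p_xi_given_c:
    p_x_given_c *= p               # independence: P(X|C) = product of P(X_i|C)
print(p_x_given_c)                 # 0.405
```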
SVMs (support vector machines) are a useful technique for data classification. A classification task generally separates data into training and testing sets, where each instance in the training set contains a target value (class label) and several attributes. The SVM is a learning method for binary classification whose basic idea is to locate a hyperplane that separates d-dimensional data into its two classes perfectly [16]. A key insight in SVMs is that the higher-dimensional space does not need to be handled directly (only the formula for the dot product in that space is needed). The aim of the SVM is the production of a model (based on the training data) that predicts the target values of the test data when given only the test data attributes. Given a training set of instance–label pairs $(x_i, y_i)$, $i = 1, \ldots, l$, the support vector machine requires the solution of the following optimization problem [17]:

$$\min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i$$

subject to

$$y_i \left( w^T \phi(x_i) + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0$$
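A minimal sketch mapping this soft-margin objective onto scikit-learn's SVC, where the parameter C weights the slack terms; the data is a synthetic stand-in:

```python
# Soft-margin SVM sketch: C weighs the slack variables xi_i in the objective.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=10, random_state=0)
# SVC solves min (1/2) w^T w + C * sum(xi_i) subject to the margin constraints.
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print("support vectors per class:", clf.n_support_)
```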
A classification problem can be restricted to consideration of a two-class problem without loss of generality [18].

Instance-based learning is a machine learning method that classifies new examples by comparing them to those already seen and held in memory. Such methods are "lazy" in the sense that they perform little work when learning from the dataset, but expend more effort classifying new examples. The simplest method, nearest neighbour, performs no work at all when learning; it stores all examples in memory verbatim. The effort is transferred to classification time, when the system decides which example in memory to use to classify the new one. Case-based reasoning systems perform a small amount of work indexing new cases, resulting in a reduction in classification effort. The examples stored in memory are called exemplars, and are retained in an exemplar database [19].
On the basis of the stored examples, predictions are derived [20], accomplished by means of the nearest-neighbour estimation principle [21]. Let the instance space $X$ be equipped with a distance measure $\Delta(\cdot,\cdot)$, i.e., $\Delta(x, x')$ is the distance between instances $x, x' \in X$. Usually the Euclidean distance is used and the attributes are normalized. The distance between two instances $x_i$ and $x_j$ is defined as

$$d(x_i, x_j) \equiv \sqrt{ \sum_{r=1}^{n} \big( a_r(x_i) - a_r(x_j) \big)^2 }$$

where $a_r(x)$ is the actual value of the $r$-th attribute of instance $x$. $Y$ is the output space, and $\langle x, y \rangle \in X \times Y$ is called a labelled instance, a case, or an example. In classification, $Y$ is a finite (usually small) set comprised of $m$ classes $\{c_1, \ldots, c_m\}$, whereas $Y = \mathbb{R}$ in regression. IBL reduces the number of training instances stored to a small set of representative examples. Another advantage of IBL is that it can be used in problems other than classification.
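A minimal sketch of this nearest-neighbour step with min-max normalized attributes and Euclidean distance; the arrays are assumed NumPy feature matrices:

```python
# Nearest-neighbour sketch with min-max normalized attributes.
import numpy as np

def nearest_neighbour_label(train_X: np.ndarray, train_y: np.ndarray,
                            query: np.ndarray):
    lo, hi = train_X.min(axis=0), train_X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)         # guard constant attributes
    nX = (train_X - lo) / span                     # normalize stored exemplars
    nq = (query - lo) / span                       # normalize the query the same way
    dists = np.sqrt(((nX - nq) ** 2).sum(axis=1))  # Euclidean distance d(x_i, x_q)
    return train_y[int(np.argmin(dists))]          # label of the closest exemplar
```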
A CART (Classification and Regression Trees) tree is a binary decision tree constructed by repeatedly splitting a node into two child nodes, beginning with the root node that contains the whole learning sample [22]. The basic idea of tree growing is to choose, among all the possible splits at each node, the split whose resulting child nodes are the "purest". If X is a nominal categorical variable with I categories, there are $2^{I-1} - 1$ possible splits for this predictor. If X is an ordinal categorical or continuous variable with K different values, there are K - 1 different splits on X. A tree is grown starting from the root node by repeatedly applying the following steps at each node.

1. Find each predictor's best split. For each continuous and ordinal predictor, sort its values from the smallest to the largest and go through each value from the top to examine each candidate split point (call it v: if x ≤ v, the case goes to the left child node; otherwise it goes to the right) to determine the best. The best split point is the one that maximizes the splitting criterion when the node is split according to it. For each nominal predictor, examine each possible subset of categories (call it A: if x ∈ A, the case goes to the left child node; otherwise it goes to the right) to find the best split.

2. Find the node's best split. Among the best splits found in step 1, choose the one that maximizes the splitting criterion.

3. Split the node using its best split found in step 2 if the stopping rules are not satisfied. A sketch of this split search for a continuous predictor is given below.
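The sketch scans one continuous predictor's sorted values and picks the threshold with the largest impurity decrease; Gini impurity is assumed as the splitting criterion, since the paper does not specify one:

```python
# Best-split sketch for one continuous predictor using Gini impurity.
import numpy as np

def gini(labels: np.ndarray) -> float:
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float((p ** 2).sum())

def best_split(x: np.ndarray, y: np.ndarray):
    order = np.argsort(x)
    x, y = x[order], y[order]
    best_v, best_gain = None, 0.0
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue                  # cannot split between equal values
        v = 0.5 * (x[i] + x[i - 1])   # candidate threshold: x <= v goes left
        left, right = y[:i], y[i:]
        gain = gini(y) - (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if gain > best_gain:
            best_v, best_gain = v, gain
    return best_v, best_gain
```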
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest [23]. The common element in all of these procedures is that for the k-th tree a random vector $\Theta_k$ is generated, independent of the past random vectors $\Theta_1, \ldots, \Theta_{k-1}$ but with the same distribution, and a tree is grown using the training set and $\Theta_k$, resulting in a classifier $h(x, \Theta_k)$ where x is an input vector. The analysis shows that the accuracy of a random forest depends on the strength of the individual tree classifiers and a measure of the dependence between them [24]. Medical image retrieval problems often have the property that there are many input variables, often in the hundreds or thousands, each containing only a small amount of information.

A single tree classifier will then have accuracy only slightly better than a random choice of class, but combining trees grown using random features can produce improved accuracy. Given an ensemble of classifiers $h_1(x), h_2(x), \ldots, h_K(x)$, with the training set drawn at random from the distribution of the random vector $(X, Y)$, define the margin function as

$$mg(X, Y) = \operatorname{av}_k I\big(h_k(X) = Y\big) - \max_{j \neq Y} \operatorname{av}_k I\big(h_k(X) = j\big)$$

where $I(\cdot)$ is the indicator function. The margin measures the extent to which the average number of votes at $(X, Y)$ for the right class exceeds the average vote for any other class; the larger the margin, the more confidence in the classification. The generalization error is given by

$$PE^{*} = P_{X,Y}\big( mg(X, Y) < 0 \big)$$

where the subscripts X, Y indicate that the probability is over the $(X, Y)$ space. In random forests, $h_k(X) = h(X, \Theta_k)$.
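A minimal sketch estimating this margin from a fitted forest's per-tree votes, assuming scikit-learn and the two-class case (so the max over wrong classes reduces to one minus the right-class vote share); the data is a synthetic stand-in:

```python
# Margin mg(X, Y) estimated from per-tree votes, binary case.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Each row holds one tree's predictions; with 0/1 labels the trees'
# class-index outputs coincide with the labels themselves.
votes = np.stack([tree.predict(X) for tree in forest.estimators_])
right = (votes == y).mean(axis=0)   # av_k I(h_k(X) = Y)
margin = right - (1.0 - right)      # mg(X, Y) in the two-class case
print(f"fraction with mg < 0 (training-set PE* estimate): {(margin < 0).mean():.2%}")
```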
III. RESULTS AND DISCUSSIONS
Medical images are compressed using the Daubechies wavelet. The low-level features, edges and textures, are extracted from the compressed medical images using the Sobel edge detector and Gabor transforms. The extracted features are then reduced using the information gain, and the selected features are used for the classification. The classification accuracy of retrieval is evaluated using Naïve Bayes, Support Vector Machine [25], IBL, CART [26] and Random Forest. Table 1 tabulates the classification accuracy and RMSE, and Figure 2 shows the same.
Technique                          | Classification Accuracy | RMSE
Naïve Bayes                        | 92%                     | 0.2828
C-SVM with linear kernel           | 91%                     | 0.3
nu-SVM with linear kernel          | 92%                     | 0.2828
Classification and Regression Tree | 88%                     | 0.3325
Instance Based Learner             | 93%                     | 0.2646
Random Forest                      | 92%                     | 0.2632

Table 1: Classification Accuracy
[Fig. 2 Graph showing classification accuracy and RMSE]

Table 2 lists the precision, recall and F-measure for the various classification techniques. Figure 3 shows the precision, Figure 4 shows the recall, and Figure 5 shows the F-measure.

Technique                 | Precision | Recall | F-Measure
Naïve Bayes               | 0.92      | 0.92   | 0.92
C-SVM with linear kernel  | 0.91      | 0.91   | 0.91
nu-SVM with linear kernel | 0.92      | 0.92   | 0.92
CART                      | 0.913     | 0.84   | 0.875
IBL                       | 0.957     | 0.9    | 0.928
Random Forest             | 0.920     | 0.920  | 0.920

Table 2: Precision, Recall and F-Measure

[Fig. 3 Graph showing the precision]
[Fig. 4 Graph showing the recall]
[Fig. 5 Graph showing the f-measure]
IV. CONCLUSION
This paper investigated the Image Retrieval (IR) problem on compressed images. The medical images are compressed and retrieved using traditional techniques, and the classification accuracy obtained is comparable to the accuracies obtained on uncompressed images. In future, work needs to be carried out to investigate the effectiveness of soft computing classification algorithms for compressed medical image retrieval.
REFERENCES
[1] Lehmann, T.M., Schubert, H., Keysers, D., Kohnen, M., Wein, B.B., The IRMA code for unique classification of medical images, in Proceedings of SPIE 5033, 109-117 (2003).
[2] Samuel G. Armato III, et al., Lung Image Database Consortium: Developing a resource for the medical imaging research community, Radiology 232, 739-748 (2004).
[3] Crucianu, M., Ferecatu, M., Boujemaa, N., Relevance feedback for image retrieval: a short survey, in State of the Art in Audiovisual Content-Based Retrieval, Information Universal Access and Interaction, 2004.
[4] Muller, H., Michoux, N., Bandon, D., Geissbuhler, A., A review of content-based image retrieval systems in medical applications – Clinical benefits and future directions, International Journal of Medical Informatics 73, 1-23 (2004).
[5] Cerra, D., Datcu, M., Image retrieval using compression-based techniques, in Proceedings of the International Conference on Source and Channel Coding (SCC), 1-6 (2010).
[6] I. Daubechies, "Orthonormal bases of compactly supported wavelets," Commun. Pure Appl. Math., vol. 41, pp. 909-996, Nov. 1988.
[7] Pal, N.R., Pal, S.K. (1993). A review on image segmentation techniques. Pattern Recognition, 26(9), 1277-1294.
[8] Fisher, Y. (1996). Fractal Image Compression: Theory and Application. Institute for Nonlinear Science, University of California, San Diego, USA.
[9] Osuna, E., Freund, R., Girosi, F. (1997). Training support vector machines: An application to face detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[10] Sparr, G. (2002). Image processing and pattern classification for character recognition. Centre for Mathematical Sciences, Lund University, 2, 25-78.
[11] Raman Maini and Himanshu Aggarwal, "Study and comparison of various image edge detection techniques," International Journal of Image Processing (IJIP), Volume 3, Issue 1, pp. 1-12.
[12] S. Annadurai and R. Shanmugalakshmi, Fundamentals of Digital Image Processing, third impression, Pearson Education, pp. 232-233.
[13] C.J. Setchell and N.W. Campbell, "Using colour Gabor texture features for scene understanding," in Proc. 7th International Conference on Image Processing and its Applications, Vol. 67(5), pp. 372-376.
[14] Zubair A. Baig, Abdulrhman S. Shaheen, and Radwan AbdelAal, "One-dependence estimators for accurate detection of anomalous network traffic," International Journal for Information Security Research (IJISR), Volume 1, Issue 4, December 2011.
[15] Besserve, M., Garnero, L., Martinerie, J., Cross-spectral discriminant analysis (CSDA) for the classification of brain computer interfaces, 3rd International IEEE/EMBS Conference on Neural Engineering (CNE '07), pp. 375-378, 2007.
[16] Dustin Boswell, "Introduction to Support Vector Machines," 2002.
[17] Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin, "A Practical Guide to Support Vector Classification," 2010.
[18] Steve R. Gunn, "Support Vector Machines for Classification and Regression," 1998.
[19] Brent Martin, "Instance-Based Learning: Nearest Neighbour with Generalisation," thesis, Department of Computer Science, University of Waikato, Hamilton, New Zealand, March 1995.
[20] Ammar Shaker and Eyke Hüllermeier, IBLStreams: A system for instance-based classification and regression on data streams, Journal of Evolving Systems.
[21] Belur V. Dasarathy, editor, Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques, IEEE Computer Society Press, Los Alamitos, California, 1991.
[22] Breiman, L., Friedman, J.H., Olshen, R., and Stone, C.J., 1984, Classification and Regression Trees, Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, California.
[23] Breiman, L., "Random Forests," Technical Report, Statistics Department, UC Berkeley, January 2001.
[24] Amit, Y. and Geman, D., Shape quantization and recognition with randomized trees, Neural Computation 9, 1545-1588, 1997.
[25] Vamsidhar Enireddy and Kiran Kumar Reddi, A data mining approach for compressed medical image retrieval, International Journal of Computer Applications 52(5):26-30, August 2012.
[26] Vamsidhar Enireddy and Kiran Kumar Reddi, Application of CART and IBL for image retrieval, International Journal of Computer Science and Telecommunications, Volume 3, Issue 12, December 2012.