Detection of Cervical Cancer Using Image Processing Tools

1Monika Jain, 2Shipra Roy, 3Naina Jain
Department of Electronics and Instrumentation Engineering, Galgotias College of Engineering & Technology, Greater Noida.
monikajain.bits@gmail.com, shipraroy1710@gmail.com, jainnaina05@gmail.com

Abstract: This paper addresses the detection of cervical cancer cells based on cell nuclei distribution and on shape and size analysis. The PAP smear test is a useful and easy method for detecting abnormalities in cervical cells. An automated detection system for cervical cancer cells has been developed. The output of the test shows that, by using structure-based segmentation and shape analysis, the system is able to differentiate between normal and cancerous cells. The proposed approach is implemented in MATLAB®, a multi-paradigm numerical computing environment and fourth-generation programming language that allows implementation of algorithms. The MATLAB Image Processing Toolbox was used to segment the digital images and calculate various statistical data. By taking into account the cell nuclei distribution and the shape and size features, MATLAB® can be programmed to distinguish normal cervical cells from abnormal ones.

Keywords: Pap Smear, Cervical Cancer, Image Processing, MATLAB

I. INTRODUCTION

Cervical cancer is a disease that occurs when cells in the cervix begin to grow out of control and invade nearby tissues or spread throughout the body. Tumors can be divided into two groups, benign and malignant. A benign (non-cancerous) tumor does not invade and destroy the tissue in which it originates, nor does it spread to distant sites in the body. A malignant (cancerous) tumor invades and destroys the tissue in which it originates and can spread to other sites in the body via the bloodstream and the lymphatic system.
The automated detection and segmentation of cell nuclei in PAP smear images is one of the most interesting fields in cytological image analysis, as observed by Plissiti et al. [1]. Such images show a high degree of cell overlap, cells with more than one nucleus, and a lack of homogeneity in image intensity. These factors challenge any method that attempts to handle the complexity of conventional cervical cell images and achieve a correct segmentation. In addition, the nucleus is a very important structure within the cell and it presents significant changes when the cell is affected by a disease, so the exact definition of the nucleus boundary is a crucial task. The recognition and quantification of these changes in nucleus morphology and density contribute to differentiating between normal and abnormal cells in PAP smear images. The segmentation of nuclei in cytological images has been studied by several researchers [2-8].

In this paper we present a method for the analysis of PAP smear images based on histogram and structuring element based segmentation and on shape and size analysis of the cell nuclei. The methodology includes segmentation, calculation of the cell nuclei distribution, and shape and size analysis of the cell nuclei.

The first-order statistics of an image are given by its histogram, which describes the distribution of the gray level values over the dynamic range [0, L-1], where L is the number of gray levels. The histogram describes the image as a whole and hence can be considered a global statistic of the image.

Morphological operations are based on shapes. In a morphological operation, each pixel value in the output image is based on a comparison of the corresponding pixel in the input image with its neighbours. Dilation and erosion are the two most common morphological operations. Dilation adds pixels to the object boundaries of an image and erosion removes pixels from the object boundaries.
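The two operations just described can be illustrated on a small binary image. The following is a minimal sketch in plain Python, not the MATLAB toolbox routines used in this work; the function names `dilate` and `erode`, the 5x5 test image and the cross-shaped structuring element are hypothetical, and the sketch assumes a symmetric structuring element and treats pixels outside the image as background.

```python
def dilate(image, selem):
    """Binary dilation of a 0/1 image by a symmetric structuring element."""
    rows, cols = len(image), len(image[0])
    cr, cc = len(selem) // 2, len(selem[0]) // 2  # origin at the element centre
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for i, srow in enumerate(selem):
                for j, s in enumerate(srow):
                    rr, c2 = r + i - cr, c + j - cc
                    inside = 0 <= rr < rows and 0 <= c2 < cols
                    # Set the pixel if any element position hits foreground.
                    if s and inside and image[rr][c2]:
                        out[r][c] = 1
    return out


def erode(image, selem):
    """Binary erosion; pixels outside the image count as background."""
    rows, cols = len(image), len(image[0])
    cr, cc = len(selem) // 2, len(selem[0]) // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            fits = True
            for i, srow in enumerate(selem):
                for j, s in enumerate(srow):
                    rr, c2 = r + i - cr, c + j - cc
                    inside = 0 <= rr < rows and 0 <= c2 < cols
                    # The element must lie entirely on foreground pixels.
                    if s and not (inside and image[rr][c2]):
                        fits = False
            out[r][c] = 1 if fits else 0
    return out


# Hypothetical 5x5 test image with a single foreground pixel.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 1
cross = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0]]

dilated = dilate(image, cross)    # the pixel grows into a 5-pixel cross
restored = erode(dilated, cross)  # erosion shrinks it back to one pixel
```

In this toy case dilation turns the single pixel into a five-pixel cross, and eroding that cross with the same element returns the original image.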
The number of pixels added to or removed from the object boundaries depends on the size and shape of the structuring element. For an image A and a structuring element B, dilation and erosion are denoted $A \oplus B$ and $A \ominus B$ respectively. Another morphological operation is the filling operation, which is carried out on binary or grayscale images. For binary images, the filling operation changes connected background pixels to foreground pixels. For grayscale images, it raises the intensity values of dark areas that are surrounded by lighter areas to the same intensity level as the surrounding pixels.

This work is based on Marroquin et al. [9] and Hernandez et al. [10]. The aim of this study was to provide assistance to pathologists and to enhance the accuracy of statistical data.

II. METHODOLOGY

The features to be extracted from the PAP smear images include:
1. Size of the cell nuclei
2. Shape of the cell nuclei
3. Cell nuclei distribution

The methodology is arranged into three steps: simplification and image enhancement, segmentation, and feature extraction.

1) Simplification and image enhancement

The digitized images are coloured, in RGB mode. The matrix corresponding to a colour image is three-dimensional, with one plane per colour component, which makes segmentation on those components tedious. Because gray level images are easier to process, we converted the images to gray level using the rgb2gray function in MATLAB. We then enhanced the contrast of the images, guided by histogram analysis, using the imadjust function in MATLAB.

2) Segmentation

The cell nuclei are darker than the surrounding cytoplasm, and all the cell nuclei tend to have the same gray level. We created the histogram of each image to view the density distribution of the different shades of gray. As the cell nuclei are darker, we filtered out the light areas and created a uniform background.
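The steps up to this point (gray level conversion, contrast enhancement, histogram inspection and removal of the light areas) can be sketched as follows. This is an illustrative plain-Python sketch, not the MATLAB code used in this work. The function names, the tiny 2x3 test image and the threshold of 100 are hypothetical. The luminance weights are those documented for rgb2gray, while the min-max stretch is a simplified stand-in for imadjust, which by default also saturates a small fraction of the darkest and brightest pixels.

```python
def rgb_to_gray(rgb):
    """Convert rows of (R, G, B) tuples to gray levels by a weighted sum."""
    return [[round(0.2989 * r + 0.5870 * g + 0.1140 * b) for (r, g, b) in row]
            for row in rgb]


def stretch_contrast(gray, levels=256):
    """Linearly map the darkest pixel to 0 and the brightest to levels - 1."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in gray]
    return [[round((v - lo) * (levels - 1) / (hi - lo)) for v in row]
            for row in gray]


def histogram(gray, levels=256):
    """Count how many pixels take each gray level in [0, levels - 1]."""
    counts = [0] * levels
    for row in gray:
        for v in row:
            counts[v] += 1
    return counts


def keep_dark(gray, threshold):
    """Binary mask: 1 where the pixel is darker than the threshold."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


# Hypothetical pixel colours: dark nucleus, lighter cytoplasm, bright background.
nucleus, cytoplasm, background = (40, 30, 60), (180, 150, 190), (230, 230, 230)
rgb = [[nucleus, cytoplasm, background],
       [cytoplasm, nucleus, background]]

gray = rgb_to_gray(rgb)
enhanced = stretch_contrast(gray)
hist = histogram(enhanced)
mask = keep_dark(enhanced, threshold=100)  # threshold chosen by eye here
```

In practice the threshold would be chosen from the histogram of each image. A fixed value is used here only to keep the example short.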
Then, using a structuring element function, we created binary images containing only the cell nuclei. We considered only a round shape for segmenting the cell nuclei, but not all nuclei are round, so the binary images were processed again with dilation and filling functions. As a result of applying dilation and erosion, extra parts that did not belong to the nuclei were removed and the boundaries of the cell nuclei became prominent. We then applied the filling operation to create a uniform intensity level inside the cell nuclei.

3) Feature Extraction

a) Cell nuclei distribution

The number of normal and abnormal cells in a cytological image is a good criterion for identifying abnormality. The size of the cell nuclei is an important factor in determining whether they are normal or not. We calculated the area of each nucleus in pixels. We then calculated the cell nuclei distribution per image and presented the result in tabular form, with one column holding the area and the other holding the number of nuclei with that area.

b) Shape and size analysis

Shape is another important feature for the classification of cervical cells in PAP smear images. Generally, cell nuclei are round, and elongation occurs with abnormality. The degree of change in shape is therefore a good measure for analysing the nuclei. We considered two factors.

The first is compactness, a dimensionless shape feature. It is defined by Sheng et al. [11] as $D = M^2 / P$, where M is the perimeter and P is the area of the cell nucleus. The calculation was done in pixels.

The second is eccentricity. Under normal conditions the cell nuclei are round, but the roundness is not perfect and the shape can be treated as an ellipse. We calculated the lengths of the semimajor and semiminor axes and the eccentricity of each nucleus using the formula $E = \sqrt{(a^2 - b^2) / b^2}$, where a and b are the semimajor and semiminor axes of the ellipse respectively.
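The three features defined above can be sketched in a few lines. This is a hedged plain-Python illustration rather than the MATLAB code used in this work. The function names and the small binary mask are hypothetical, regions are grouped with 4-connectivity as an assumption, and the eccentricity follows the formula exactly as written in the text (the more common ellipse eccentricity divides by $a^2$ instead of $b^2$; both are 0 for a circle).

```python
import math
from collections import Counter, deque


def nucleus_areas(mask):
    """Return the pixel area of every 4-connected foreground region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue, area = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def compactness(perimeter, area):
    """D = M^2 / P with M the perimeter and P the area."""
    return perimeter ** 2 / area


def eccentricity(a, b):
    """E = sqrt((a^2 - b^2) / b^2) for semimajor a and semiminor b."""
    return math.sqrt((a * a - b * b) / (b * b))


# Hypothetical binary mask with two 2x2 "nuclei" and one single-pixel one.
mask = [[1, 1, 0, 0, 0],
        [1, 1, 0, 1, 0],
        [0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 1, 1, 0, 0]]

areas = nucleus_areas(mask)
distribution = Counter(areas)  # area in pixels -> number of nuclei

# An ideal circle of radius 10 has D = 4*pi, the smallest value D can take.
circle_d = compactness(2 * math.pi * 10, math.pi * 10 ** 2)
round_e = eccentricity(5, 5)      # circle: a == b
elongated_e = eccentricity(5, 3)  # ellipse with a > b
```

The `distribution` counter corresponds to the two-column table described above: one entry per distinct area, with the number of nuclei of that area as its value.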
An eccentricity of 0 corresponds to a circle, and as the value increases the deviation from a circle becomes more significant, as observed by Marroquin et al. [9].

III. RESULTS AND DISCUSSION

The pre-processing step excludes all of the background and leaves for further processing only the parts of the image that contain isolated cells or cell clusters, which reduces the region of interest in the image. This method has been applied to several PAP smear images defined by an expert observer. The nucleus centroid detection step showed that the resulting points in the image indicate the areas of the nuclei, as confirmed by the expert observer. Two images are shown here: one showing cancerous cells at an initial stage (mild dysplasia) and the other showing normal cervical cells. The statistics were then obtained and compared to confirm the differences. The results are shown below.

1) Simplification and Image Enhancement

Figures 1 and 2 show the results of applying the techniques described above for simplification and enhancement to the two images.

Figure 1 - Original image showing cancerous cells at an initial stage (mild dysplasia), converted to gray level and then enhanced.

Figure 2 - Original image showing normal cervical cells, converted to gray level and then enhanced.

2) Segmentation

The images were then segmented for the region of interest, which is the cell nuclei. They are shown below.

Figure 3 - Segmented image showing only the cell nuclei.

3) Feature Extraction

b) Compactness

Compactness is the dimensionless shape measure of the cell nuclei. A normal nucleus has a well-formed, compact shape. Cells with abnormality gradually deform and their compactness decreases. We computed the compactness of the cell nuclei and found that normal nuclei have higher compactness values than abnormal nuclei.

c) Eccentricity

Eccentricity is a measure of the roundness of the cell nuclei. In general, it can be said to be calculated from the width and height of the nucleus. Normal nuclei show minimal difference between width and height and thus have greater roundness. Uncontrolled growth of the nuclei does not preserve this uniform proportion, and as a result their eccentricity deviates farther from zero.

Figure 4 - Compactness and eccentricity histogram of abnormal cervical cells.

Figure 5 - Compactness and eccentricity histogram of normal cells.

IV. CONCLUSION

An effective method to identify and classify cervical cancer is increasingly needed, because early detection and the choice of correct therapy may save the patient. Medical images have various limitations, such as low quality, the presence of noise, and human error in interpretation. Digital image processing can help pathologists to a great extent. The statistical data can be used to differentiate normal from questionable samples while the pathologist looks at the slide under a microscope, which saves considerable time.

Some ideas for future enhancement include designing an interactive system in which a pathologist can supply a grayscale threshold, or automating the process using histogram analysis or fuzzy logic. Another enhancement would be to establish cutoff values between normal and abnormal values and to classify the abnormal values according to the stage of the cancer. The images processed are magnified and the calculations are done in pixels, so if a relation among magnification, pixels and actual size is established, the analysis will be more efficient. Previously CT images were used to detect cervical cancer, but MRI images can be used in future work because of their high resolution. Artificial neural networks, contour methods and wavelets are among the other methods used to detect cervical cancer at an early stage.

V. REFERENCES

[1] Plissiti M.E., Charchanti A., Krikoni O.
and Fotiadis D.I., “Automated segmentation of cell nuclei in PAP smear images”, ITAB Proceedings International Special Topic Conference on Information Technology in Biomedicine, Ioannina, Greece, 26-28 October 2006.
[2] Bamford P., Lovell B., “Unsupervised cell nucleus segmentation with active contours”, Signal Processing 71(2), pp. 203-213, 1998.
[3] Mahanta L.B., Nath D.Ch., “Cervix Cancer Diagnosis from Pap Smear Images Using Structure Based Segmentation and Shape Analysis”, Dept. of Statistics, Gauhati University, Vol. 3, No. 2, February 2012, ISSN 20798407.
[4] Bamford P., Lovell B., “A water immersion algorithm for cytological image segmentation”, Proceedings of the APRS Image Segmentation Workshop, pp. 75-79, University of Technology Sydney, Sydney, 1996.
[5] Mouroutis T., Roberts S.J., “Robust cell nuclei segmentation using statistical modelling”, IOP Bioimaging, 6, pp. 79-91, 1998.
[6] Garrido A., Perez de la Blanca N., “Applying deformable templates for cell image segmentation”, Pattern Recognition 33, pp. 821-832, 2000.
[7] Lee K.M., Street W.N., “Learning shapes for automatic image segmentation”, Proc. INFORMS-KORMS Conference, pp. 1461-1468, Seoul, Korea, June 2000.
[8] Begelman G., Gur E., Rivlin E., Rudzsky M., Zalevsky Z., “Cell nuclei segmentation using fuzzy logic engine”, International Conference on Image Processing, Vol. 5, pp. 2937-2940, October 2004.
[9] Marroquin E. Martinez, Vos C., Santamaria E., Jove X., Socoro J.C., “Non Linear Image Analysis for Fuzzy Classification of Breast Cancer”, IEEE Proceedings of International Conference on Image Processing, Vol. 2, pp. 943-946, 1996.
[10] Hernandez L., Gothreaux P., Shih L., “Towards Realtime Biopsy Image Analysis and Cell Segmentation”, Proceedings of IPCV, pp. 81-87, 2006.
[11] Sheng L., Rangayyan R.M., Desautels L., “Application of Shape Analysis to Mammographic Calcification”, IEEE Trans. on Medical Imaging, Vol. 13, No. 2, 1994.