Applied Mechanics and Materials Vols. 602-605 (2014) pp 2147-2152
© (2014) Trans Tech Publications, Switzerland
doi:10.4028/www.scientific.net/AMM.602-605.2147
Submitted: 2014-06-01; Accepted: 2014-06-11; Online: 2014-08-11

Image Classification Recognition for Rock Micro-Thin Section Based on Probabilistic Neural Networks

Xinshan Wei1,a, Xiaohua Qin2,b, Chunlong Rong1,c, Junxiang Nan1,d, Guojian Cheng3,e
1 Research Institute of E&D, Changqing Oilfield Company, CNPC, Xi'an, 710021, China
2 Communication Branch of Changqing Oilfield Company, CNPC, Xi'an, 710021, China
3 School of Computer Science, Xi'an Shiyou University, Xi'an, Shaanxi, 710065, China
a wxs_cq@petrochina.com, b qxh_cq@petrochina.com, c rcl_cq@petrochina.com, d njx_cq@petrochina.com, e gjcheng@xsyu.edu.cn

Keywords: Color image segmentation, K-means clustering, Probabilistic neural network, Rock thin section image, Pattern recognition

Abstract. To automate the recognition of pores in rock thin-section images, a method combining K-means clustering with a probabilistic neural network (PNN) is proposed and applied to rock thin-section images. First, K-means clustering is used as the segmentation algorithm: each rock image is divided into two classes, a sufficient set of features is extracted, and good classification results are obtained on the testing dataset. Second, 100 rock thin-section images are used as a validation dataset, organized into 20 groups of 5 images with 200 sample points per group. Experiments show that the probabilistic neural network can serve as a rock texture classifier; the average correct classification rate is about 95.12%, which meets the needs of practical applications.

Introduction

Reservoir porosity is one of the key parameters in the prediction and calculation of oil and gas reserves, and obtaining it accurately is the basis of geological modeling and stratigraphic interpretation. Among the methods for calculating reservoir porosity, rock thin-section image analysis has a clear technical advantage in the identification and calculation of porosity and in reservoir evaluation, providing a reliable basis for the accurate identification of oil and gas layers [1].

Compared with gray images, color images contain more valuable spatial information. A commonly used color space is CIE L*a*b* (with the asterisks omitted, written Lab for short) [2]. This color model is closer to human color perception than RGB, and Euclidean distance in this space corresponds well to perceived color difference [3], so image segmentation is implemented in this color space. In recent years, many techniques have been studied for color image segmentation, such as the morphology-based watershed algorithm, region growing, and combinations of the two. However, the watershed algorithm is prone to over-segmentation [4,5] and is very sensitive to noise, which makes it unsuitable for rock microstructure images. The traditional region-growing rule [6] merges adjacent pixels whose gray-level difference does not exceed a fixed threshold, but choosing this threshold manually leads to poor robustness. K-means clustering segmentation is therefore used in this paper.

The main components of rock thin-section images are clastic particles and pores; the pores have been filled with red or blue colloid, and the experimental images use red colloid. The main purpose of this paper is to identify the pore regions after segmentation.
First, the RGB image is converted to the CIE Lab space and, using the K-means clustering method, divided into two classes, target and background. A probabilistic neural network is then trained and tested so that the pore structure can be identified accurately after segmentation.

K-means algorithm and feature extraction

K-means algorithm. The main idea of the K-means algorithm is to divide the n objects of an image into k classes. First, k points are chosen at random in the Lab space of the image as the initial clustering centers. Each remaining point of the image is then assigned to the most similar class according to a similarity criterion between the point and the clustering centers. The clustering center of each new class is recalculated, and the remaining objects are classified again [7, 8]. The algorithm iterates in this way until the similarity criterion function converges to a fixed threshold or the maximum number of iterations is reached. Given the data set X = {x_1, x_2, ..., x_n}, the algorithm proceeds as follows:

Step 1: Set up k initial clustering centers $C_j^{(t)} = (a_j^{(t)}, b_j^{(t)})$, where $C_j^{(t)}$ is the clustering center of the j-th class in the t-th iteration, $a_j^{(t)}$ is its red-green (a*) component, $b_j^{(t)}$ is its blue-yellow (b*) component, and j = 1, 2, ..., k.

Step 2: Assign each remaining sample $x_i$ in the dataset to the nearest class. Euclidean distance is used to measure similarity and to choose the class of each sample $x_i$. The Euclidean distance between the sample $x_i = (a_i, b_i)$ and the clustering center $C_j^{(t)} = (a_j^{(t)}, b_j^{(t)})$ is defined as

$$D^2\left(x_i, C_j^{(t)}\right) = \left(a_i - a_j^{(t)}\right)^2 + \left(b_i - b_j^{(t)}\right)^2. \qquad (1)$$

Step 3: Recalculate the clustering center of each class from the assignment of Step 2.

Step 4: If the distance D of Eq. (1) converges to a fixed threshold or the iteration limit is reached, stop and output the final k-class segmentation; otherwise, return to Step 2.

Fig. 1 Rock thin-section images (a1-a3), background images (b1-b3) and pore target images (c1-c3)

To achieve the minimum difference within each class and the maximum difference between classes after segmentation, the algorithm is executed repeatedly until all elements within each class are stable [9]. Fig. 1 shows the images after segmentation: (b1) and (c1) are, respectively, the background and porosity images obtained from the original color rock image (a1); (b2) and (c2) correspond to (a2); and so on.
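To make this segmentation step concrete, a minimal MATLAB sketch is given below. It assumes the Image Processing Toolbox (for the RGB-to-Lab conversion) and the Statistics Toolbox (for kmeans); the file name rock_slice.jpg, the replicate count, and the choice of which cluster is labelled as pore are illustrative assumptions rather than details taken from the paper.

```matlab
% Sketch: K-means segmentation of a rock thin-section image in CIE Lab space.
rgb   = im2double(imread('rock_slice.jpg'));      % placeholder file name
cform = makecform('srgb2lab');                    % RGB -> CIE L*a*b*
lab   = applycform(rgb, cform);

% Cluster on the chromatic a*, b* channels only, as in Steps 1-4 above.
ab = reshape(lab(:,:,2:3), [], 2);                % one row of (a*, b*) per pixel
k  = 2;                                           % pore target vs. background
[idx, centers] = kmeans(ab, k, 'Replicates', 3);  % Euclidean distance, Eq. (1)

labels   = reshape(idx, size(rgb,1), size(rgb,2)); % per-pixel class map
poreMask = (labels == 2);  % which cluster is the pore class must be checked,
                           % e.g. against the red colloid colour of the centers
```

Clustering only on the a* and b* channels makes the result depend on chromaticity rather than lightness, which matches the use of the red colloid colour to mark the pores.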
Feature extraction. Image features here refer to the characteristics of individual pixels rather than of the image as a whole. For color rock thin-section images, the rich colour information they contain should be fully exploited: by selecting appropriate features, the target and the background can be distinguished by their different characteristics. For the two types of images obtained after clustering (classes (b) and (c) in Fig. 1), 20 sample points are selected in the target area of each image. Each point contributes 24 features: the R, G, B values of the current pixel and of its four-neighbourhood (5 × 3 = 15 values), and the mean, median and mean square deviation of each of the R, G, B components within its 3×3 neighbourhood (3 × 3 = 9 values). Namely, in an M×N image, for the pixel $P_{ij}$, i = 1, 2, ..., M-2, j = 1, 2, ..., N-2, the feature vector can be represented as

$$F_{ij} = \left[P(i,j),\, P(i-1,j),\, P(i,j-1),\, P(i,j+1),\, P(i+1,j),\, \mathrm{Mean}(i,j),\, \mathrm{Mse}(i,j),\, \mathrm{Mid}(i,j)\right], \qquad (2)$$

where $P(m,n) = \left(f_R(m,n), f_G(m,n), f_B(m,n)\right)$.

The mean is

$$\mathrm{Mean}(i,j) = \left(\frac{1}{9}\sum_{m=i-1}^{i+1}\sum_{n=j-1}^{j+1} f_R(m,n),\; \frac{1}{9}\sum_{m=i-1}^{i+1}\sum_{n=j-1}^{j+1} f_G(m,n),\; \frac{1}{9}\sum_{m=i-1}^{i+1}\sum_{n=j-1}^{j+1} f_B(m,n)\right). \qquad (3)$$

The mean square deviation is

$$\mathrm{Mse}(i,j) = \left(\frac{1}{9}\sum_{m=i-1}^{i+1}\sum_{n=j-1}^{j+1}\big(f_R(m,n)-\mathrm{Mean}_R\big)^2,\; \frac{1}{9}\sum_{m=i-1}^{i+1}\sum_{n=j-1}^{j+1}\big(f_G(m,n)-\mathrm{Mean}_G\big)^2,\; \frac{1}{9}\sum_{m=i-1}^{i+1}\sum_{n=j-1}^{j+1}\big(f_B(m,n)-\mathrm{Mean}_B\big)^2\right), \qquad (4)$$

where $\mathrm{Mean}_R$, $\mathrm{Mean}_G$ and $\mathrm{Mean}_B$ are the corresponding components of $\mathrm{Mean}(i,j)$ in Eq. (3).

Using the K-means algorithm, the 15 images are each divided into two classes, background rock and target pore. Applying the formulas of this section to the two segmented images and selecting 20 points at random from each, a total of 600 × 24 feature data are extracted. Part of the data is shown in Table 1.

Table 1 Part of the texture feature attributes

No.   R(i,j)   G(i,j)   B(i,j)   …   MeanR    …   MidR   …   MseR     …   Mark
1     170      91       110      …   157.83   …   153    …   103.17   …   1
…     …        …        …        …   …        …   …      …   …        …   …

Probabilistic neural network algorithm

The algorithm principle of PNN. The probabilistic neural network (PNN) is a feedforward neural network developed from the radial basis neural network (RBNN). The PNN uses probability density functions as its nonlinear transformation; through this intermediate mapping it can form complex nonlinear decision regions, so a PNN with a Gaussian kernel is widely used as a pattern classifier. A PNN has a four-layer structure [13]: an input layer, a pattern layer, a summation layer and an output layer. The input layer performs no computation; it simply passes the input vector X = (x1, x2, ..., xn) to the network. The first hidden layer, the pattern layer, receives the input pattern and computes

$$f(X, W_i) = \exp\!\left(-\frac{(X - W_i)^{T}(X - W_i)}{2\delta^{2}}\right), \qquad (5)$$

where $W_i$ is the weight vector connecting the input layer to the i-th pattern neuron and δ is the smoothing factor, which plays a vital role in classification. The second hidden layer, the summation layer, computes for every class the sum of the pattern-layer outputs, i.e. the weighted sum of the radial basis outputs belonging to that class. The last layer is the output, or competitive, layer: it receives the probability density values from the summation layer, the neuron with the largest value outputs 1, and all other neurons output 0.
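As a rough illustration of the network just described, the following MATLAB sketch implements the pattern, summation and competition layers directly from Eq. (5). The function name pnn_classify and the data layout (one sample per row) are assumptions made for illustration and are not code from the paper.

```matlab
function labels = pnn_classify(trainX, trainY, testX, delta)
% PNN_CLASSIFY  Sketch of the PNN described above (Eq. 5).
%   trainX : n x d matrix of training feature vectors (pattern-layer weights W_i)
%   trainY : n x 1 vector of class labels (e.g. 1 = background, 2 = pore)
%   testX  : m x d matrix of feature vectors to classify
%   delta  : smoothing factor of the Gaussian kernel in Eq. (5)
classes = unique(trainY);
m = size(testX, 1);
scores = zeros(m, numel(classes));
for i = 1:m
    x = testX(i, :);
    % Pattern layer: Gaussian kernel of Eq. (5) for every training sample.
    d2 = sum((trainX - repmat(x, size(trainX, 1), 1)).^2, 2);
    f  = exp(-d2 / (2 * delta^2));
    % Summation layer: accumulate the kernel outputs per class.
    for c = 1:numel(classes)
        scores(i, c) = sum(f(trainY == classes(c)));
    end
end
% Output (competition) layer: winner takes all.
[~, idx] = max(scores, [], 2);
labels = classes(idx);
end
```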
Selection of Spread value. The neural network toolbox of MATLAB provides the function newpnn(P, T, Spread), where P contains the input sample features (24 × 400 input data in this paper) and T the output classification (2 × 400 output data in this paper); the PNN function requires the input and output matrices to have the same number of columns. Spread is the distribution density of the PNN, and its value has a large influence on the classification performance of the network. When Spread is close to zero, the network behaves as a nearest-neighbour classifier; as Spread becomes larger, the network takes several nearby training samples into account. In practice the value must be chosen for the specific application. Fig. 2 shows the classification error rate for Spread values between 0.5 and 3.5 on 10 selected test data sequences. The error rate is lowest when Spread is between 2.1 and 2.4, so Spread is set to 2.3 in the experiments.

Experiments

The experimental environment is MATLAB 7.0, and the overall experimental flow is shown in Fig. 3. Feature extraction and classification recognition were carried out according to Fig. 3. Fig. 4(a) shows the overall network error on the 200 test samples: four samples are misclassified as class 1 (they should be class 2) and seven are misclassified as class 2 (they should be class 1). The prediction effect on part of the dataset is shown in Fig. 4(b); five classification errors are located between samples 150 and 200. The testing accuracy is about 94.5% (see Table 2).

Fig. 2 The classification error rate for different Spread values
Fig. 3 The experiment flow chart
Fig. 4 PNN network prediction results: (a) prediction error, (b) prediction effect

Table 2 The classification accuracy on the total dataset

Dataset                          1st Class Error   2nd Class Error   Total Error   Correct Rate
Training dataset (400 samples)   0 (0.00%)         0 (0.00%)         0 (0.00%)     100%
Testing dataset (200 samples)    4 (2.00%)         7 (3.50%)         11 (5.50%)    94.5%
Total samples (600)                                                  11 (1.83%)    98.17%

Finally, 100 rock thin-section images of size 256 × 256 are selected and grouped 5 images to a group, giving 20 groups in total; 200 sample points are extracted from each group for classification testing. Fig. 5 shows the 20 experiments carried out according to the dotted box in Fig. 3. After extracting sufficient feature vectors, the average recognition accuracy over these experiments is 95.12%. The experiments show that the PNN classification results meet the practical requirements and that the method can be used for other measurements related to rock pores.

Fig. 5 The classification accuracy on the experimental data
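A sketch of how such an experiment could be scripted around the toolbox function newpnn mentioned above is shown below. The variable names features and marks, the 400/200 split order, and the 0.1 step of the Spread sweep are assumptions made for illustration, not details from the paper.

```matlab
% Sketch of the training/testing procedure with the toolbox PNN.
% Assumes 'features' is a 24 x 600 matrix of feature vectors (Eq. 2) and
% 'marks' is a 1 x 600 vector of class labels (1 = background, 2 = pore).
Ptrain = features(:, 1:400);    Ttrain     = ind2vec(marks(1:400));
Ptest  = features(:, 401:600);  labelsTest = marks(401:600);

spreads = 0.5:0.1:3.5;                   % assumed sweep step over the range in Fig. 2
errRate = zeros(size(spreads));
for s = 1:numel(spreads)
    net = newpnn(Ptrain, Ttrain, spreads(s));   % build the PNN
    pred = vec2ind(sim(net, Ptest));            % competitive output -> class index
    errRate(s) = mean(pred ~= labelsTest);      % test classification error rate
end
[bestErr, bestIdx] = min(errRate);
bestSpread = spreads(bestIdx);           % around 2.3 in the paper's experiment
```

Retaining the Spread value with the lowest test error corresponds to the sweep behind Fig. 2 and the accuracy figures reported in Table 2.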
Conclusion

Classification recognition of rock thin-section images has been implemented by combining K-means clustering with a probabilistic neural network. For thin-section images from the Erdos basin, the method shows a high classification rate and a good recognition effect. However, the actual rock sample pictures are very large and of low resolution; further work will study how to obtain higher-resolution images and how to improve the method so that it satisfies the real-time requirements of working conditions. For the extracted target rock pores, how to calculate a series of parameters related to 3D reconstruction is also a research topic.

Acknowledgements

This work was supported by the National Science and Technology Key Project on the E&P of large oil/gas fields and coal bed methane, Reservoir E&D demonstration project of the Erdos basin low-permeability lithologic formation (2011zx05044).

References

[1] Qing-li Liu, Guo-ping Wu, Jian-ce Hu, Wei-guo Hou: Strike the Reservoir Porosity Using the Casting Sheet Image [J]. Journal of Geomatics Science and Technology, Vol. 26(1) (2009), p. 69. (In Chinese)
[2] Xiao-min Pang, Zi-jian Min, Jiang-ming Kan: Color image segmentation based on HSI and LAB color space [J]. Journal of Guangxi University (Nat. Sci. Ed.), Vol. 36(6) (2011), p. 976. (In Chinese)
[3] Li-xue Chen, Zhao-jiong Chen: Image Retrieval Algorithms Based on Lab Space [J]. Computer Engineering, Vol. 34(13) (2008), p. 224. (In Chinese)
[4] Jia-hong Yang, Jie Liu, Jian-cheng Zhong, Ming-zhi He: A Color Image Segmentation Algorithm by Integrating Watershed with Automatic Seeded Region Growing [J]. Journal of Image and Graphics, Vol. 15(1) (2010), p. 63. (In Chinese)
[5] Zhi-qiang Wei, Miao Yang: Segmentation of color building images based on Watershed and Region merging [J]. Infrared Millim. Waves, Vol. 27(6) (2008), p. 447. (In Chinese)
[6] Xiao-ming Xiao, Zhi Ma, Zi-xing Cai: Road Image Segmentation Based on an Adaptive Region Growing [J]. Control Engineering of China, Vol. 18(3) (2011), p. 364. (In Chinese)
[7] Te-Wei Chen, Yi-Ling Chen, Shao-Yi Chen: Fast Image Segmentation Based on K-Means Clustering with Histograms in HSV Color Space [J]. Multimedia Signal Processing, Vol. 10 (2008), p. 322.
[8] Yong Zhou, Haibin Shi: Adaptive K-means clustering for Color Image Segmentation [J]. Advances in Information Sciences and Service Sciences (AISS), Vol. 3(10) (2011), p. 216.
[9] Anil Z. Chitade, S. K. Katiyar: Colour based image segmentation using K-means clustering [J]. International Journal of Engineering Science and Technology, Vol. 2(10) (2010), p. 5319.
[10] Lian-huan Xiong, Han-ping Hu, De-hua Li: Segmenting the Color Image and Detecting the Edge by BP NN Method [J]. Huazhong Univ. of Sci. & Tech., Vol. 27(2) (1999), p. 87. (In Chinese)
[11] Qing Yang, Jingran Guo, Donggxu Zhang: Fault Diagnosis Based on Fuzzy C-means Algorithm of the Optimal Number of Clusters and Probabilistic Neural Network [J]. International Journal of Intelligent Engineering and Systems, Vol. 2(4) (2011), p. 51.
[12] Hui Wang, Lin Yang, Jin-hua Ding, Ming-ying Li: Classification and recognition of wood block texture based on PNN [J]. Journal of Dalian Polytechnic University, Vol. 28(5) (2009), p. 387. (In Chinese)
[13] Gui-ying Wang, Shi-jun Zhang, Si-yao Pan: Investigation of Transformer Fault Diagnosis Method Based on PNN [J]. Computer Measurement & Control, Vol. 20(7) (2012), p. 1760. (In Chinese)