P009 THE DEFINITION OF A LITHOLOGIC WELL PROFILE WITH APPLICATION OF ARTIFICIAL NEURAL NETWORKS

V.E. Lyalin, Dr.Sc.*; V.V. Vasilyev, Dr.-Eng**; S.I. Kachurin, Dr.-Eng*; L.E. Tonkov, Dr.-Eng**
* Izhevsk State Technical University, 7 Studencheskaya St, Izhevsk, Russia
** Izhevsk Scientific and Technical Center, Sidanko, 175 Svobody St, Izhevsk, Russia

Abstract

Artificial intelligence makes it possible to reach a higher level of quality in well log data processing, because it allows the use of automated hardware-software complexes that function practically without human involvement. The purpose of this work was to develop a multilayer neural network (NN) and to justify its application to the task of qualitative well log data interpretation. Such a network is of great importance for processing and express interpretation of logging data directly at the well. The NN was trained with the error back-propagation algorithm. The input and output data were normalized and centered before training. Two estimates were used to evaluate the quality of interpretation: mean-square deviation and cross correlation. Two methods were used to process the output signal: rounding of the probability estimates with a preset rounding threshold, and a fuzzy logic system with a defined set of rules whose inputs are the width and the height of a burst in the NN output signal. The computational experiments showed good recognition of geological differences by the NN. Processing the NN output with a rounding threshold gave up to 85% coincidence of the network estimates with reference data, and processing with fuzzy logic gave up to 93%.
Application of NN to the interpretation of well logging data will make it possible to enhance the reliability of interpretation results and to reduce the time required for solving this problem.

9th European Conference on the Mathematics of Oil Recovery, Cannes, France, 30 August - 2 September 2004

Introduction

Computers used in geophysics now make it possible to move away from conventional methods of collecting and processing information, to process large data arrays, and to eliminate the human factor in the interpretation of survey results. At the same time, systems that make use of artificial intelligence are few in number. Artificial intelligence makes it possible to reach a higher level of quality in processing well logging results, because it allows the use of automated hardware-software complexes that function virtually without human assistance. The aim of this work is therefore to develop a multilayer neural network (NN) and to justify its application to the problem of qualitative interpretation of well logging data. Such a network is of great importance for processing and express interpretation of logging data directly at the well.

Despite the simplicity of their construction and functioning, NNs can accumulate the regularities known so far, generalize facts, and make proper assessments even when noisy data are presented at the NN input. NNs are widely used abroad in various systems, for example for image identification, forecasting, and control. Application of NN to the interpretation of well logging data will therefore make it possible to enhance the reliability of interpretation results and to reduce the time required for solving this problem.
Setting a Problem

A well log can be represented as a function of well depth:

$$u = f(d) + \xi, \tag{1}$$

where $u$ is the value of the logging method, $d$ is the depth, and $\xi$ is a random component (noise or interference). Because the log is recorded in digital form with a uniform sampling interval, the depth $d$ can be expressed in terms of the sampling interval $\Delta d$ and the interval number $i$:

$$d_i = \Delta d \cdot i. \tag{2}$$

The interpreter's conclusions can also be described as a function of well depth:

$$(y^1, \ldots, y^k) = f(d),$$

where $y^1, \ldots, y^k$ are the results of qualitative interpretation for the various types of lithologic formations and kinds of reservoir saturation ($y^j \in \{0, 1\}$, $j = 1, \ldots, k$) and $d$ is the depth. When the interpretation is made on the basis of digital logs, the results are discrete quantities:

$$(y_i^1, \ldots, y_i^k) = f(\Delta d \cdot i), \tag{3}$$

where $y_i^1, \ldots, y_i^k$ are the results of qualitative interpretation for the various types of lithologic formations and kinds of reservoir saturation at depth reading $i$, $\Delta d$ is the depth sampling interval, and $i$ is the depth interval number.

Thus the main goal of qualitative interpretation is to obtain the relation between the well logs and the types of lithologic formations at a specified well depth. In mathematical notation:

$$(y_d^1, \ldots, y_d^k) = f(u_d^1, \ldots, u_d^l), \tag{4}$$

where $y^1, \ldots, y^k$ are the results of qualitative interpretation for the various types of lithologic formations and kinds of reservoir saturation ($y^j \in \{0, 1\}$), $u^1, \ldots, u^l$ are the values of the well logs, and $d$ is the depth.

Problem Formalization

A multilayer neural network (NN) can calculate an output vector $y$ for any input vector $x$, that is, it gives the value of a certain vector function. In other words, the network constructs a mapping $X \to Y$ for all $x \in X$. Thus, in order to state a problem for the NN it is necessary to derive the rules for representing the input data as vectors $x$ and the solution as vectors $y$.

The input data of the problem are described by formula (1) in the continuous case. Discretization over depth is performed according to formula (2).
Since several log surveys are conducted in a well, in the general case the input data for one depth interval can be represented by the vector

$$U_i = (u_i^1, \ldots, u_i^l), \tag{5}$$

where $l$ is the number of logging methods recorded in the well and $i$ is the depth interval number.

In some cases one value of each log curve is not enough to determine the type of lithologic formation exactly, because the behaviour of the curve in the neighbourhood of depth interval $i$ cannot be analysed. The solution is to include several values of each log curve at adjacent depths in one input vector:

$$U_i = (u_{i-w}^1, u_{i-(w-1)}^1, \ldots, u_i^1, \ldots, u_{i+w}^1, \ldots, u_{i-w}^l, u_{i-(w-1)}^l, \ldots, u_i^l, \ldots, u_{i+w}^l), \tag{6}$$

where $w$ is the number of adjacent depths taken on each side of interval $i$. For $w = 0$ the approach reduces to (5). This approach increases the dimensionality of the NN input, but allows the network to analyze the behaviour of the log curves.

Solving the problem of lithologic stratification of a well log amounts to obtaining relation (4) and, as a result, function (3). Each formation can be described by the vector

$$C = (l_1, \ldots, l_k, p_1, \ldots, p_n, s_1, \ldots, s_m), \tag{7}$$

where $l_1, \ldots, l_k$ are the probabilities that the formation belongs to one or another type of rock, $p_1, \ldots, p_n$ are the probabilities that the formation belongs to one or another type of reservoir, and $s_1, \ldots, s_m$ are the probabilities that the formation is saturated with one or another type of fluid. The following conditions hold:

$$0 \le l_i \le 1, \ \sum_i l_i = 1; \quad 0 \le p_i \le 1, \ \sum_i p_i = 1; \quad 0 \le s_i \le 1, \ \sum_i s_i = 1. \tag{8}$$

Accordingly, vector (7) for depth interval $i$ is written as

$$C_i = (l_{1,i}, \ldots, l_{k,i}, p_{1,i}, \ldots, p_{n,i}, s_{1,i}, \ldots, s_{m,i}). \tag{9}$$

Depending on the problem statement (determination of reservoirs only, water saturation only, etc.), some components of vector (9) can be omitted. In practice this mode of formalization has proved to be the best one.
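As an illustration of the windowed input vector (6), the following Python sketch assembles one vector per depth interval from an array of sampled logs. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `build_windowed_inputs`, the NumPy array layout (rows are depth intervals, columns are logging methods), and the decision to skip the first and last w intervals, where the window would run past the ends of the log, are all assumptions made here.

```python
import numpy as np


def build_windowed_inputs(logs: np.ndarray, w: int) -> np.ndarray:
    """Build input vectors of form (6) from sampled logs.

    logs: array of shape (n_depths, l); column j holds logging method j+1.
    w:    number of adjacent depths taken on each side of interval i.

    Returns an array of shape (n_depths - 2*w, l * (2*w + 1)). Each row lists
    the window of method 1 (depths i-w .. i+w), then the window of method 2,
    and so on, which is the ordering used in (6).
    """
    if logs.ndim != 2:
        raise ValueError("logs must be a 2-D array (depth intervals x methods)")
    if w < 0:
        raise ValueError("w must be non-negative")
    n_depths = logs.shape[0]
    if n_depths < 2 * w + 1:
        raise ValueError("log is shorter than one data window")

    rows = []
    # Assumption: boundary intervals without a full window are skipped.
    for i in range(w, n_depths - w):
        window = logs[i - w : i + w + 1, :]   # shape (2*w + 1, l)
        rows.append(window.T.reshape(-1))     # group values method by method
    return np.array(rows)
```

With w = 0 the function returns the array unchanged, which corresponds to vector (5). With five logging methods, w = 1 and w = 2 give rows of 15 and 25 components.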
A limitation of this method is that after network training conditions (8) are fulfilled only approximately:

$$\sum_i l_i \approx 1; \quad \sum_i p_i \approx 1; \quad \sum_i s_i \approx 1. \tag{10}$$

This is because the NN is an analog system: most results produced by neural networks are approximate. Furthermore, during training the conditions (8) imposed on the probabilities are not supplied to the network directly, but are contained implicitly in the data set on which the network is trained.

Basic Data

Before the NN can be used it must be trained; during training the neuron weights of all network layers are adjusted. This finally makes it possible to construct an adequate mapping of the input data (log surveys) into the output data (interpretation results). To do this, two disjoint data samples of identical structure must be formed: a training sample and a testing sample. Both samples contain input data vectors and the associated output data vectors. The training sample, as the name suggests, is used as reference data directly in training the network. The testing sample is required to check the training performance.

All input data must be normalized. The purpose of normalization is to bring the heterogeneous ranges of permissible well log values to the unit interval and to bring the data to uniform dimensions. Furthermore, it helps to avoid problems that arise when training the network, namely permanent saturation or dormancy of the neurons of the input layer and network "paralysis" (a condition in which training stops). It also reduces training time and improves training performance. In operation, the results produced by the network must be denormalized.
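The normalization and denormalization steps can be sketched as follows. The paper states only that the data are brought to the unit interval; the min-max formula used here, the function names `fit_unit_scaler`, `normalize` and `denormalize`, and the choice to compute the scaling parameters from the training sample alone are assumptions made for illustration.

```python
import numpy as np


def fit_unit_scaler(train: np.ndarray):
    """Return per-column minimum and range computed from the training sample.

    train: array of shape (n_samples, n_features).
    """
    lo = train.min(axis=0)
    hi = train.max(axis=0)
    # Assumption: a constant column gets range 1 to avoid division by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return lo, span


def normalize(x: np.ndarray, lo: np.ndarray, span: np.ndarray) -> np.ndarray:
    """Map values to the unit interval using the training-sample parameters."""
    return (x - lo) / span


def denormalize(z: np.ndarray, lo: np.ndarray, span: np.ndarray) -> np.ndarray:
    """Inverse of normalize: map unit-interval values back to original units."""
    return z * span + lo
```

Under these assumptions, values from the testing sample that fall outside the range seen in training map slightly outside [0, 1]; whether to clip them is a design decision the paper does not address.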
Estimation of Training Performance

Two values are used in this work to estimate identification quality: the mean-square deviation (11) and the cross correlation (12).

$$\varepsilon = \frac{1}{n} \sum_{i=1}^{n} (Y_i - y_i)^2, \tag{11}$$

where $Y_i$ is the interpreter's conclusion at depth interval $i$, $y_i$ is the network estimate at depth interval $i$, and $n$ is the number of depth intervals.

$$\rho = \frac{\mathrm{Cov}(Y, C)}{\sigma_Y \cdot \sigma_C}, \tag{12}$$

$$\mathrm{Cov}(Y, C) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})(c_i - \bar{c}),$$

where $Y = (y_1, \ldots, y_n)$ is the set of the interpreter's conclusions, $C = (c_1, \ldots, c_n)$ is the set of network estimates, $\bar{y}$ and $\bar{c}$ are their mean values, $n$ is the number of depth intervals, and $\sigma_Y$, $\sigma_C$ are the standard deviations of $Y$ and $C$ respectively.

The estimates are interpreted as follows: the network is trained optimally when the mean-square deviation is minimal and the cross correlation is maximal.

Enhancing Identification Reliability

Because the trained NN produces estimates of the interpretation output, algorithms that process the network output signal can be used to enhance the reliability of identification while reducing the estimates to absolute values. One such algorithm is rounding of the network estimates with a preset rounding-off threshold. The threshold can be chosen by the interpreter who trains the network, or calculated automatically, for example by minimizing the mean-square error of the network estimates (or by maximizing the cross correlation) on the training sample.

Besides the rounding-off threshold, other methods of processing the NN output are available. Of particular note is the use of fuzzy logic to reduce the NN estimates of a formation's membership in one class or another to absolute values. The membership estimates form a function that has increasing and decreasing sections.
In other words, the function is divided into a set of "bursts": curve intervals bounded by an increasing section on one side and a decreasing section on the other. A suitable fuzzy model was obtained by taking the burst width and height as input parameters. The system output is the degree of conformity of the burst to a formation of the identified lithology. Trapezoidal membership functions were used; they are shown in Fig. 1.

[Figure: three panels of trapezoidal membership functions P with the terms low, medium and high, for burst width (axis from 0 to 30), burst height (axis from 0 to 1), and degree of conformity (axis from 0 to 1)]

Fig. 1. Membership Functions for Fuzzy System

Fuzzy inference was performed according to the Mamdani mechanism. The first-maximum method is used as the defuzzification rule, by which the fuzzy set obtained at the system output is converted to numerical form. According to this method, the crisp value is the smallest value at which the maximum of the aggregate fuzzy set is reached. To reduce the fuzzy system output to absolute estimates, a rounding-off threshold is applied in the same way as for the NN.

Identification efficiency can be enhanced further by training the fuzzy logic system, for example with genetic algorithms. In this case the parameters and shape of the membership functions and the rules of the fuzzy system can be taken as the variables to be optimized.

Conditions of an Experiment on Reservoir Identification

The following logging methods were selected as NN input data: lateral logging (BK), well diameter (DS), acoustic logging (DT), gamma-ray logging (GR), and neutron gamma-ray logging (NGR). These methods were chosen because they are included in the available data set for all wells. In the course of the experiment several NNs were constructed and trained for each variant of problem formalization.
The network that reproduced the specified mapping most accurately was then selected from the resulting set. To identify reservoirs, the NN output vector (9) was reduced to the form

$$C_i = (l_i),$$

where $l_i$ is the probability that the formation at depth $i$ is a reservoir. Thus all NNs used for reservoir identification contain one neuron in the output layer.

Applying the algorithm without a data window to this group of methods implies an input data vector (5) of the form

$$U_i = (u_i^1, u_i^2, u_i^3, u_i^4, u_i^5),$$

where $u_i^1$ is the value of BK at depth $i$, $u_i^2$ is the value of DS at depth $i$, $u_i^3$ is the value of DT at depth $i$, $u_i^4$ is the value of GR at depth $i$, and $u_i^5$ is the value of NGR at depth $i$.

The data window allows the NN to analyze not only the current values of the logging methods but also the behaviour of the curves in the neighbourhood of the estimated depth. Windows of 3 and 5 depth intervals were used in the experiment. The input vectors then take the forms

$$U_i = (u_{i-1}^1, u_i^1, u_{i+1}^1, u_{i-1}^2, u_i^2, u_{i+1}^2, u_{i-1}^3, u_i^3, u_{i+1}^3, u_{i-1}^4, u_i^4, u_{i+1}^4, u_{i-1}^5, u_i^5, u_{i+1}^5),$$

$$U_i = (u_{i-2}^1, \ldots, u_{i+2}^1, u_{i-2}^2, \ldots, u_{i+2}^2, u_{i-2}^3, \ldots, u_{i+2}^3, u_{i-2}^4, \ldots, u_{i+2}^4, u_{i-2}^5, \ldots, u_{i+2}^5).$$

The NN input layers thus contained 5, 15 and 25 neurons respectively.

Reservoir Identification Output Based on Neural Networks (NN)

The results of the experiment on reservoir identification using NN are given in Fig. 2.

Fig. 2.
NN performance related to reservoir identification with various data windows: a) without data window; b) with data window (3 intervals); c) with data window (5 intervals)

The mean-square deviation and the cross correlation factor for the 10 wells of the test sample are given in Table 1.

Table 1. Estimates of NN-Based Reservoir Identification Without Data Windows and With Data Windows of 3 and 5 Intervals

        Mean-square deviation                     Cross correlation factor
Well    1 interval   3 intervals   5 intervals    1 interval   3 intervals   5 intervals
1       0.018        0.015         0.010          0.58         0.75          0.81
2       0.015        0.014         0.012          0.62         0.65          0.73
3       0.015        0.015         0.012          0.64         0.65          0.70
4       0.014        0.014         0.011          0.70         0.72          0.76
5       0.015        0.015         0.012          0.65         0.65          0.72
6       0.015        0.015         0.012          0.55         0.60          0.73
7       0.016        0.015         0.012          0.60         0.63          0.75
8       0.015        0.014         0.010          0.64         0.67          0.79
9       0.014        0.014         0.011          0.67         0.68          0.77
10      0.017        0.015         0.012          0.56         0.60          0.70

As is evident from Table 1, reservoir identification efficiency increases when a data window is used. At the same time the dimensionality of the interpreting NN increases, which increases the training time.

Results of Applying the Algorithms for Processing the Network Output Signal

The optimal result of the rounding-off threshold and the fuzzy system performance are given in Fig. 3.

[Figure panels: Rounding-off Threshold; Fuzzy System Performance]

Fig. 3.
Performance Related to NN Output Signal Processing Algorithms

The results of applying the rounding-off threshold and the fuzzy system are compared in Table 2.

Table 2. Comparison of NN Output Signal Processing Methods (% coincidences)

        Rounding-off threshold
Well    0.1    0.3    0.5    0.7    0.9    Fuzzy logic
1       82     88     89     86     79     90
2       81     79     76     71     65     85
3       84     80     74     68     61     87
4       87     84     78     71     60     89
5       84     77     69     60     58     88
6       78     76     74     71     66     82
7       80     75     71     69     65     84
8       82     78     75     71     69     86
9       84     80     78     73     67     87
10      79     74     70     64     57     83

As the table shows, the fuzzy logic method produces the best results. A further major advantage of the fuzzy system is that, once constructed, it does not depend on the well to which it is applied, unlike the rounding-off threshold. The drawback of the fuzzy system is that it requires additional implementation of the fuzzy logic method.

Conclusion

The computational experiments demonstrated good identification of geological differences by the NN system. Processing the NN output made it possible to obtain up to 85% coincidence of the network estimates with reference data when the rounding-off threshold was used and up to 93% when the fuzzy logic method was used. It should be noted that these estimates are relative: there is no proof that the experts' identification of reservoirs is absolutely correct, so the stated estimates are derived values that include possible mistakes of the experts. In other words, the accuracy of identifying lithologic formations from a well log using NN may be higher than stated.
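To make the quality estimates (11) and (12) and the rounding-off step compared in Table 2 concrete, the following Python sketch computes both estimates, applies a rounding-off threshold, counts the percentage of coincidences, and selects a threshold automatically. It is an illustrative sketch, not the authors' code. All function names are hypothetical. Rounding estimates at or above the threshold up to 1 is an assumption, as is selecting the threshold by minimizing the mean-square deviation of the rounded estimates on the training sample; the candidate thresholds are those listed in Table 2.

```python
import numpy as np


def mean_square_deviation(reference, estimate):
    """Estimate (11): mean of squared differences between the two series."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.mean((reference - estimate) ** 2))


def cross_correlation(reference, estimate):
    """Estimate (12): covariance over the product of standard deviations."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    cov = np.mean((reference - reference.mean()) * (estimate - estimate.mean()))
    denom = reference.std() * estimate.std()  # population form (1/n), as in Cov
    # The ratio is undefined when either series is constant.
    return float(cov / denom) if denom > 0 else float("nan")


def round_with_threshold(estimate, threshold):
    """Reduce network estimates to 0/1 values.

    Assumption: estimates at or above the threshold are rounded up to 1.
    """
    return (np.asarray(estimate, dtype=float) >= threshold).astype(int)


def percent_coincidences(reference, estimate, threshold):
    """Share of depth intervals where the rounded estimate equals the reference."""
    labels = round_with_threshold(estimate, threshold)
    return float(100.0 * np.mean(labels == np.asarray(reference, dtype=int)))


def select_threshold(reference, estimate, candidates=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Pick the candidate threshold with the smallest mean-square deviation
    between the rounded estimates and the reference (training sample)."""
    errors = [
        mean_square_deviation(reference, round_with_threshold(estimate, t))
        for t in candidates
    ]
    return candidates[int(np.argmin(errors))]
```

A threshold chosen this way is tied to the sample it was fitted on, which is consistent with the remark above that the rounding-off threshold, unlike the fuzzy system, depends on the well.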