Data mining VIMS data for information on truck condition

Tad S. Golosinski and Hui Hu
University of Missouri-Rolla, Rolla, MO 65409-0450, USA

Ralph Elias
University of Botswana, Gaborone, Botswana

ABSTRACT: The paper presents initial research on the use of data mining to analyze the condition of an off-highway mining truck. The raw data was collected using the VIMS system of Caterpillar in a Botswana mine. The data mining tool was the IBM Intelligent Miner for Data. The results indicate that data mining allows identification and quantification of relations between the various types of data. As such, data mining offers the potential for development of a predictive tool for prognosticating equipment condition and performance. Development of this capacity requires further research.

1 INTRODUCTION

Modern mining equipment is fitted with numerous sensors that monitor its condition and performance. The data collected by these sensors is used to alert the operator to the existence of abnormal operating conditions and to perform an emergency shut-down if the pre-set values of the monitored parameters are exceeded. This data is also used for post-failure diagnostics and for reporting and analysis of equipment performance.

It is believed that the availability of this voluminous data, together with the availability of sophisticated data processing methods and tools, may allow for extraction of additional information contained in the data. One method that may be of value is data mining (Golosinski, 2001).

The research presented in this paper investigates the use of data collected from various sensors installed on a mining truck to develop a predictive tool that would allow for reliable projection of both the equipment performance and condition into the future. Subject to the research was data collected by a variety of sensors installed on an off-highway mining truck (CAT 785) by the VIMS (Vital Information Management System) of Caterpillar. The data mining tool was the IBM Intelligent Miner for Data.

2 DATA MINING

Data mining is the next step beyond online analytical processing (OLAP) for querying data warehouses. It is frequently used in the analysis of customer relationships in the retail industry, for financial analysis and research, for fraud and risk management, supply chain management and e-business (Westphal and Blaxton, 1998). The representative methods used in data mining are:
- Association discovery
- Sequential pattern discovery
- Clustering
- Classification
- Value prediction
- Similar time sequences

Numerous data mining techniques are used. As an example, predictive model creation is supported by supervised induction techniques, link analysis is supported by association discovery and sequence discovery techniques, database segmentation is supported by clustering techniques, and deviation detection is supported by statistical techniques.

Analysis of results is usually accomplished through various forms of visualization that facilitate identification of patterns hidden in the data, as well as better comprehension of the information extracted by the data mining techniques. Data mining involves problem definition, data selection and preparation, data analysis and presentation of results.

3 VIMS: VITAL INFORMATION MANAGEMENT SYSTEM

Caterpillar's Vital Information Management System (VIMS) is installed on selected CAT mining equipment. It is a powerful tool for machine management that provides operators, service personnel and managers with information on a wide range of vital machine functions and with information on equipment production and performance. VIMS monitors and records the indications of numerous sensors that are integrated into the vehicle design. It has the capacity to alert the operator if these indications exceed the pre-set critical values and to conduct an emergency equipment shut-down if so programmed (Caterpillar, 2000). The flow of VIMS data is illustrated in fig. 1.

Figure 1. VIMS schematic. (Diagram labels: on-board VIMS, data download, data transfer, ACCESS database, data mining.)

The VIMS unit records the occurrence of certain VIMS events and real-time machine conditions. The recorded data can be downloaded from the on-board VIMS unit to a notebook computer, or it can be sent to the central control unit via radio (VIMS Wireless). All collected data is grouped into the following seven categories:

3.1.1 Event list
The event list is a record of stored events ("what happened and when") that have occurred on the machine. The record contains the last 500 events that occurred before equipment shut-down, listed in chronological order.

3.1.2 Snapshot
The snapshot stores a segment of equipment history recorded in real time for all monitored parameters at a one-second interval. The snapshot relates to a set of pre-defined events and is triggered automatically if one of these events occurs (e.g. an abnormal condition or emergency situation).

3.1.3 Data logger
The data logger records all the machine parameters that are monitored by VIMS. The data is sampled in real time at one-second intervals. The logger is started and stopped at the operator's command and can record data for up to 30 minutes.

3.1.4 Trends
Trend information reported by VIMS consists of the minimums, maximums and averages of the selected indications calculated over a pre-selected period of time.

3.1.5 Cumulative
Cumulative information refers to the number of occurrences of specific events over a pre-set period of time. Examples of cumulative information are the total engine revolutions or the total fuel consumption over the life of the machine or component.

3.1.6 Histogram
Histogram information records the performance history of a selected parameter since the last reset. For example, a histogram of the engine speed would indicate the percentages of time that the engine operated within pre-specified speed ranges.
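
The trend, cumulative and histogram categories described above amount to simple summaries of a stream of one-second samples. The following is a minimal sketch of such summaries, not the VIMS implementation: the function names, the sample values and the choice to accumulate a fuel rate in l/h into litres are assumptions made for illustration only.

```python
from statistics import mean


def trend(samples):
    """Minimum, maximum and average of one parameter over a period."""
    return {"min": min(samples), "max": max(samples), "avg": mean(samples)}


def cumulative_fuel_litres(fuel_rate_l_per_h, interval_s=1.0):
    """Total fuel used, assuming each sample is a rate in l/h held for interval_s seconds."""
    return sum(rate * interval_s / 3600.0 for rate in fuel_rate_l_per_h)


def histogram(samples, edges):
    """Percentage of samples falling in each range [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for value in samples:
        for i in range(len(edges) - 1):
            if edges[i] <= value < edges[i + 1]:
                counts[i] += 1
                break
    return [100.0 * c / len(samples) for c in counts]


# Hypothetical one-second samples (not taken from the VIMS data set).
engine_speed_rpm = [700, 720, 1500, 1650, 1700, 1900, 1850, 1200, 710, 705]
fuel_rate_l_per_h = [20, 22, 150, 180, 190, 240, 230, 110, 21, 20]

speed_trend = trend(engine_speed_rpm)
speed_histogram = histogram(engine_speed_rpm, edges=[0, 1000, 1500, 2000])
fuel_total = cumulative_fuel_litres(fuel_rate_l_per_h)

print(speed_trend)
print(speed_histogram)
print(round(fuel_total, 4))
```

The histogram here reports percentages of samples, which for a fixed one-second sampling interval equal percentages of operating time.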

3.1.7 Payload
Indications of the payload measurement system, if installed, may be recorded if so specified.

4 INTELLIGENT MINER OF IBM

A variety of data mining software is commercially available from numerous vendors. It includes Intelligent Miner of IBM (International Business Machines Corporation), MineSet of SGI (Silicon Graphics Inc.), Clementine of ISL (Integral Solutions Limited of U.K.) and others. The IBM Intelligent Miner (IM) version 6.1 was used for the data mining work reported in this paper (IBM, 2000). It offers a large choice of algorithms, is easy to use, and has proven itself useful in many commercial applications.

The IM includes the following mining and statistics functions:
1. Mining functions: associations, demographic and neural clustering, sequential patterns and similar sequences, tree and neural classification, and neural and RBF (Radial Basis Function) prediction.
2. Statistics functions: bivariate statistics, linear regression, principal component analysis, univariate curve fitting and factor analysis.

The IM allows modeling of pre-defined phenomena for events that can be either usual or unusual. Usual events describe the situation that is considered normal and for which the relations between different attributes are sought. For example, relations between truck operating and mechanical attributes can be defined, or a relation between the engine load and the truck payload. Definition and quantification of these relations may help to improve the efficiency of truck operation or help with operator training.

The unusual events are failures of the monitored machine. Data mining of these events may allow for definition of algorithms that would facilitate prediction of machine failure events and failure situations. For example, if a rule with high confidence is found in the large volume of VIMS data, this rule can be used to improve the VIMS system itself for more reliable prediction of machine failure. An emergency condition based on this rule can be set up in VIMS so that the operator is alerted when the condition arises. Moreover, proper data analysis is helpful for the design of the machine. Based on the relations and rules discovered from the off-board data, the machine designer can focus attention on the important components.

5 DATA PREPARATION

The first step of data mining is data preparation. It involves data clean-up, and identification and extraction of the data that is of interest to the problem at hand. This data must be presented in a form that represents all information consistently.

To facilitate IM data mining of the VIMS recorded data, the data has to be adapted to a format acceptable to the IM. The original VIMS data, downloaded from an on-board VIMS unit, can be merged into a Microsoft ACCESS 97 database using the VIMS PC99 software. Unfortunately the IM does not accept the Access data format. Consequently the data first has to be converted into ASCII format, which is one of the data input formats accepted by version 6.1 of the Intelligent Miner software.

6 DATA MINING

Subject to data mining were 4437 VIMS records, each of which contained values of 90 parameters recorded on a CAT 789 truck operating in a diamond mine in Botswana. The records were taken on March 8, 1999 and represent 4437 seconds of consecutive truck operation (sampling rate of one record per second).

The purpose of the investigations was to confirm the feasibility of determining which condition and performance parameters of the truck are related to its fuel consumption rate. After that, the strength of the relation between each of these parameters and the fuel consumption rate was to be tested.

Three different data mining methods were used to define and quantify the relations between the recorded data streams. These were classification, demographic clustering, and principal component analysis combined with factor analysis (Bernson and Smith, 1997).
All three are briefly summarized below.

6.1 Relationship discovery with Principal Component Analysis and Factor Analysis

Principal Component Analysis (PCA) is used in statistics to extract the main relationships in data of high dimensionality. A common way to find the principal components of a data set is to calculate the eigenvectors of the data correlation matrix. These vectors give the directions in which the data cloud is stretched most. The projections of the data on the eigenvectors are the principal components. The corresponding eigenvalues indicate the amount of information that the respective principal components represent. Principal components corresponding to large eigenvalues represent much of the information in the data set and thus tell much about the relations between the data points.

In the Intelligent Miner, PCA looks for standardized linear combinations of the original variables. This tool can be used to summarize data and to identify linear relationships among variables. It is also a dimension-reduction technique. In contrast to factor analysis, principal component analysis tries to transform the vector describing the original variables linearly into a lower-dimensional subspace.

Another benefit of this analysis tool is its handling of missing values. If a valid record contains missing values, these values are replaced with the mean value of the field or variable in question. Since the data may contain many missing values, this tool is also useful as a preparatory step for running a mining function that uses the generated components as input fields. Moreover, time is saved because the mining method need not be run on all variables.

Factor Analysis is an exploratory approach that aims to make sense of multivariate data in a systematic manner. It searches for hidden variables in order to reduce data involving many variables down to a small number of dimensions.
Factor Analysis discovers the relationships among variables in terms of a few underlying, but unobservable, random quantities called factors. It handles missing and invalid values in the same way as Principal Component Analysis. The factor loadings window, shown as fig. 8, offers a graphical representation of the factors. If the factors have no clear interpretation, factor rotation can simplify the factor structure and help the user to better identify the meaning of the calculated factors. Its application in this case is similar to that of principal component analysis.

6.2 Database segmentation with Demographic Clustering

In contrast to Principal Component Analysis and Factor Analysis, clustering searches for hidden groups and classifies data into related clusters on the basis of the values of several variables. Demographic Clustering provides fast and natural clustering of very large databases. It automatically determines the number of clusters to be generated. Similarities between records are determined by comparing their field values. The clusters are then defined so that Condorcet's criterion is maximized (IBM, 2000). This tool presents the percentage of the parameter of interest that appears in the whole population (all records), and the clustering percentage, i.e. the percentage of the current cluster that the parameter accounts for. Different combinations of these two percentages make it possible to uncover interesting relations in the data set.

6.3 Classification

Classification is used to segregate database records into pre-defined classes based on selected criteria. Thus this technique can be used to define what truck operating or condition parameters define the fuel consumption rate, what parameters define the cycle time, and the like.

7 RESULTS AND DISCUSSION

7.1 Statistical analysis

The chart in fig. 2 presents the IM display (Principal Components Result Viewer) of the principal attributes, generated by applying the IM Principal Component Analysis tool to selected VIMS data. In total it lists 65 principal attributes as related to the truck fuel consumption rate, out of the total of 90 listed in the database.

Figure 2. Principal Component Analysis.

One of the principal utilities of this method is to reduce the number of parameters of interest that will form the input to the other data mining methods, and thus to simplify the further investigations. In the case under consideration the statistical analysis allowed reduction of the parameters of interest from 90 to 65, or by some 30%.

Another utility is to allow quantification of the correlation between the various parameters under consideration. It is expressed as correlation coefficients of the input variables. Fig. 3 and fig. 4 present the values of these coefficients for several representative parameters. To illustrate the relations involved, fig. 3 presents several negative correlation coefficients for parameters determined to be related to the engine fuel consumption rate. In this case, the turbocharger air inlet pressure has the largest negative correlation coefficient with the engine fuel consumption rate. Likewise the turbocharger outlet pressure has the largest positive correlation (fig. 4). This appears logical, as a properly operating turbocharger can boost engine power by up to 40%.

Figure 3. Negative correlation coefficients (engine fuel consumption rate).

Figure 4. Positive correlation coefficients (engine fuel consumption rate).

Besides this correlation, other correlations were also defined.
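
The two statistical steps discussed in this section, correlation coefficients with the fuel consumption rate and principal components obtained from the eigenvectors of the correlation matrix, can be sketched as follows. This is an illustrative sketch using NumPy, not the Intelligent Miner algorithm: the data is synthetic, and the column names and the relations built into the synthetic values are assumptions.

```python
import numpy as np

# Synthetic stand-in for VIMS records. Column names and the way the values
# are generated are assumptions made for this illustration only.
rng = np.random.default_rng(0)
n = 500
engine_load = rng.uniform(10, 100, n)                      # % of full load
boost_pressure = 1.5 * engine_load + rng.normal(0, 10, n)  # follows engine load
ground_speed = rng.uniform(0, 35, n)                       # unrelated to fuel here
fuel_rate = 3.0 * engine_load + rng.normal(0, 20, n)       # l/h

names = ["engine_load", "boost_pressure", "ground_speed", "fuel_rate"]
X = np.column_stack([engine_load, boost_pressure, ground_speed, fuel_rate])

# Correlation matrix of the columns (variables are in columns, so rowvar=False).
R = np.corrcoef(X, rowvar=False)

# Correlation of every input parameter with the fuel rate, strongest first.
target = names.index("fuel_rate")
ranking = sorted(
    ((name, R[i, target]) for i, name in enumerate(names) if i != target),
    key=lambda item: abs(item[1]),
    reverse=True,
)

# Principal components: eigenvectors of the correlation matrix.
eigenvalues, eigenvectors = np.linalg.eigh(R)  # eigh returns ascending order
order = np.argsort(eigenvalues)[::-1]          # re-sort, largest eigenvalue first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
explained = eigenvalues / eigenvalues.sum()    # share of variance per component

# Project the standardized data onto the eigenvectors to get component scores.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
scores = Z @ eigenvectors

for name, r in ranking:
    print(f"{name:15s} r = {r:+.2f}")
print("share of variance per component:", np.round(explained, 3))
```

Components with a small share of variance are the candidates for removal when the number of input parameters is to be reduced before running other mining functions.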

Some other truck parameters that have high positive correlation to the fuel consumption rate include:
- Boost Pressure, calculated by subtracting atmospheric pressure from the turbocharger outlet pressure
- Engine Load, calculated from the engine speed, throttle switch position, throttle position, boost pressure, and atmospheric pressure, and expressed as a percentage of full load
- Right Exhaust Temperature and Left Exhaust Temperature, the temperatures within the exhaust manifold of the engine on both sides of the truck.

It is relevant for the purpose of the subject research that the relations between various operating parameters of the truck can be found and quantified using data mining techniques, in this case the Principal Component Analysis technique. Full discussion of the defined correlations is beyond the scope of this paper.

7.2 Demographic Clustering

Following the principal component analysis, the remaining data set was data mined using the IM demographic clustering technique. As a result the data set was segmented into 9 clusters, as shown in fig. 5. The three largest clusters each account for 14% of the whole data set.

Figure 5. Demographic Clustering: IM output.

Fig. 6 and 7 show a zoom of the clusters related to haul distance and to truck payload. The haul distance cluster, shown in fig. 6, indicates that the haul distance is one of the main determinants of the fuel consumption rate. In this cluster the percentage of 6 to 10 mile long hauls is approximately 40%, while the same percentage for the whole population is only 5%. One possible explanation is that on the long hauls the truck fuel consumption rate is larger since the truck spends more time running at full load. On short hauls more time is spent on loading, dumping, maneuvering and waiting, the truck activities during which fuel consumption is low.

Figure 6. Demographic Clustering: haul distance cluster (horizontal scale: haul distance in miles; vertical scale: %).

Fig. 7 shows the payload cluster. It indicates that the truck was running empty in all records of the analyzed cluster (100% of the cluster), while in the whole population it was empty in only around 50% of the records. In all records of this cluster the truck was traveling in 4th gear at a speed of 25 to 35 MPH and the fuel consumption rate was average. The fact that the truck was running empty for all the hauls in this cluster does not allow valid conclusions to be drawn on the fuel consumption rate.

Figure 7. Demographic Clustering: payload cluster (horizontal scale: payload in tons; vertical scale: %).

The other clusters identified in this work are presented in fig. 5. These contain a variety of other information related to truck performance.

7.3 Tree-Classification

A sample of the results obtained using the classification technique of data mining is shown in fig. 8. It presents the statistical information and the confusion matrix of the data mining run.

The tree-classification mining function builds a classification model as a binary decision tree. Each interior node of the binary decision tree tests an attribute of a record. If the attribute value satisfies the test, the record is sent down the left branch of the node. If the attribute value does not satisfy the test, the record is sent down the right branch of the node.

In the upper left corner of the display the 4 classes are marked with different colors. They are shown in the tree map as solid squares, while the solid circles are the decision nodes. The binary decision tree consists of the root node on top, followed by non-leaf nodes and leaf nodes. Branches connect a node to 2 other nodes. Root and non-leaf nodes are represented as pie charts. Leaf nodes are represented as rectangles. Clicking on a node displays its characteristics in the pane at the bottom of the window (see fig. 9).
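
The routing rule of the binary decision tree described above (left branch if the node test is satisfied, right branch otherwise) can be sketched as follows. This is a hypothetical illustration and not the tree built by the Intelligent Miner: the thresholds are borrowed from values quoted in this paper, but the tree shape, the field names and the remaining leaf labels are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    label: str                                     # predominant class label of the node
    test: Optional[Callable[[dict], bool]] = None  # split criterion; None for a leaf
    left: Optional["Node"] = None                  # followed when the test is satisfied
    right: Optional["Node"] = None                 # followed when the test is not satisfied


def classify(node: Node, record: dict) -> str:
    """Send a record down the tree until a leaf node is reached."""
    while node.test is not None:
        node = node.left if node.test(record) else node.right
    return node.label


# Hypothetical tree: structure and field names are assumptions for illustration.
tree = Node(
    label="low",
    test=lambda r: r["ground_speed_mph"] > 31.5,
    left=Node(
        label="low",
        test=lambda r: r["actual_gear"] > 5,
        left=Node(label="low"),
        right=Node(label="100-200(l/h)"),
    ),
    right=Node(
        label="100-200(l/h)",
        test=lambda r: r["payload_t"] > 126.85,
        left=Node(label="high"),
        right=Node(label="100-200(l/h)"),
    ),
)

fast_empty = {"ground_speed_mph": 33.0, "actual_gear": 6, "payload_t": 0.0}
slow_loaded = {"ground_speed_mph": 14.0, "actual_gear": 2, "payload_t": 140.0}
print(classify(tree, fast_empty))
print(classify(tree, slow_loaded))
```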

The information displayed for the selected node includes:
- Label: the predominant class label of the selected node.
- Test: the split criterion for this node. This applies only to non-leaf nodes and specifies a simple selection.
- Records: the number of records contained in each of the sub-nodes of the selected node.
- Distributions: the number of records corresponding to each of the possible class labels. The classification is most meaningful if all records belong to one leaf node only. However, by pruning the binary decision tree, records of other nodes can be assigned to the selected node.
- Purity: the percentage of correctly classified records assigned to a node.

Engine Fuel Rate
Number of classes = 4
Errors = 1205 (27.78%)
Confusion matrix for pruned tree

Predicted class -> | low  | 100-200(l/h) | 200-300(l/h) | high |
--------------------------------------------------------------------------
low                | 1150 |          278 |          122 |  194 | total = 1744
100-200(l/h)       |  100 |          801 |           20 |   35 | total = 956
200-300(l/h)       |   89 |          121 |          386 |   61 | total = 657
high               |  100 |           24 |           61 |  795 | total = 980
--------------------------------------------------------------------------
                   | 1439 |         1224 |          589 | 1085 | total = 4337

Figure 8. Classification result statistics.

For the fuel consumption rate run the IM defined four classes that group the various input parameters. These allow definition of the parameters that contribute most to the high fuel consumption rate. Selected observations that can be made in this case are:
- When the ground speed is in the range from 12.25 MPH to 15.5 MPH and the payload is over 126.85 t, 96.8% of the 283 records indicate a high engine fuel consumption rate;
- When the ground speed is more than 31.5 MPH and the actual gear is higher than 5, all 146 records show a low engine fuel consumption rate;
- The ground speed has more impact on the engine fuel consumption rate than do the other parameters.

Such observations are made by tracking the thicker black line with the arrow that links the nodes and continues on to the rectangles at the foot of the figure.
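
The error figure reported with the confusion matrix of fig. 8 can be checked by simple arithmetic. The sketch below assumes that rows are actual classes and columns are predicted classes, and that the first and last rows of the matrix are read so that they agree with the printed row and column totals; the variable names are illustrative.

```python
classes = ["low", "100-200(l/h)", "200-300(l/h)", "high"]

# Rows: actual class, columns: predicted class (values from fig. 8).
confusion = [
    [1150, 278, 122, 194],
    [100, 801, 20, 35],
    [89, 121, 386, 61],
    [100, 24, 61, 795],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(classes)))  # diagonal entries
errors = total - correct
error_rate = errors / total

# Share of each actual class that was classified correctly.
per_class_correct = {
    name: confusion[i][i] / sum(confusion[i]) for i, name in enumerate(classes)
}

# 1205 errors of 4337 records (27.78%)
print(f"{errors} errors of {total} records ({100 * error_rate:.2f}%)")
for name, share in per_class_correct.items():
    print(f"{name:14s} {100 * share:.1f}% correct")
```

The per-class shares show where the 27.78% of errors concentrate, which the overall error rate alone does not reveal.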

Since the original plot of the classification tree is in color, tracking the lines through it in the IM is fairly straightforward.

Figure 9. Classification tree.

8 CONCLUSIONS

The investigations presented above indicate that data mining techniques can be used to analyze the performance of mining equipment. In particular, the relations between its various operating, condition and performance parameters can be defined and quantified. These relations, in turn, can be used to develop a predictive capability related to equipment condition and performance. Further research is needed to develop this capacity.

ACKNOWLEDGEMENTS

The investigations presented in this paper were funded by a grant from the Research Board of the University of Missouri System. The support and cooperation of Caterpillar, Inc. and of Debswana Diamond Mining Company are gratefully acknowledged.

REFERENCES

Bernson, A. and Smith, S. J. 1997. Data warehousing, data mining and OLAP. McGraw-Hill.
Caterpillar, Inc. 1999. Vital Information Management System (VIMS): System Operation Testing and Adjusting. Company publication.
Golosinski, T. S. 2001. Data mining uses in mining. Proceedings, APCOM 2001, Beijing, China.
IBM (International Business Machines Corporation). 2000. Manual: Using the Intelligent Miner for Data. Company publication.
Westphal, C. and Blaxton, T. 1998. Data mining solutions. John Wiley and Sons, Inc.