A Novel Machine Learning Technique for Online Health Monitoring of High-speed Trains

LINHAO ZHANG, YIQING NI, SIUKAI LAI and SHENGGUO WANG

ABSTRACT

To ensure the operation safety and ride comfort of high-speed trains, a combination of smart sensory systems and intelligent identification models for online condition monitoring and assessment is highly desired. During routine operations, various dynamic responses induced by the wheel-rail interaction can cause severe wheel defects. The deterioration of train wheels, normally classified as “out-of-roundness” (OOR), can seriously threaten operation safety and cause catastrophic derailment events. Conventional model-based prognostic methods often require an in-depth understanding of the wheel-track system to develop suitable mathematical models, which are rather cumbersome. To compensate for the deficiencies of model-based prognostic approaches, data-driven methods have been increasingly applied in various engineering fields. This research introduces a random forest (RF)-based method for online condition prediction and monitoring of train wheels. The RF-based method is a novel machine learning technique that possesses good stability and high accuracy for data classification with little parameter adjustment in the modeling process. A crucial step for the successful implementation of the RF-based technique is the data mining process that extracts valuable feature information from the raw data. Therefore, the Teager-Kaiser energy operator (TKEO) and the wavelet packet decomposition (WPD) technique are integrated for feature extraction in this work. The optimized feature subsets can thus be employed in the presented data-driven model for the online health monitoring of high-speed train wheels.

_____________
Lin-Hao Zhang, Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, P.R. China.
Yi-Qing Ni, Department of Civil and Environmental Engineering, and Hong Kong Branch of National Rail Transit Electrification and Automation Engineering Technology Research Center, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, P.R. China.
Siu-Kai Lai, Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, P.R. China.
Sheng-Guo Wang, College of Engineering, University of North Carolina at Charlotte, Charlotte, NC 28223-0001, USA.

INTRODUCTION

High-speed railway (HSR) is currently regarded as an environmentally friendly mode of transport that can greatly benefit large numbers of people by strengthening social networks and business activities. Wheelsets are among the crucial components of a high-speed train, and any deterioration poses a significant threat to service life and running quality. Wheel defects, generally known as “out-of-roundness” (OOR), include wheel flats, wheel spalling, corrugation and polygonization, and can cause severe damage to both tracks and vehicle components [1, 2]. To maintain safe, comfortable and economical trips on high-speed trains, more attention must therefore be paid to the health status of wheelsets.

In the literature, there are plenty of research studies on the performance assessment of wheelsets. Conventional inspection techniques (e.g., the wheel impact load detector (WILD)) are suitable for inspecting a large number of wheels, but the inspection is often one-off. In addition, various wheel-rail interaction models have been proposed to investigate the dynamic responses of wheels and rail structures due to the presence of wheel defects [3, 4]. Nevertheless, the assumptions and simplifications adopted in model-based approaches can greatly affect the accuracy and effectiveness of fault detection.
In recent years, data-driven methods based on monitoring data have emerged as an alternative for the long-term assessment of high-speed trains [5, 6]. This paper proposes a novel strategy that combines smart sensory systems and intelligent identification models for online health monitoring of high-speed trains. The random forest (RF)-based prognostic method, which possesses good stability and high accuracy for data classification, is employed. In the present study, the monitoring data were acquired from an on-board sensing system installed on an in-service high-speed train, before and after the wheel lathing procedure (i.e., a process that makes out-of-round wheels round again in a depot). Since the information hidden in the measured data is crucial for identifying the health status of train wheels, the Teager-Kaiser energy operator (TKEO) and the wavelet packet decomposition (WPD) technique are integrated for feature extraction before implementing the RF-based method.

RF-BASED METHODOLOGY FOR CONDITION ASSESSMENT

This section presents the major procedures for implementing the RF-based method for online condition assessment of train wheels. It consists of three phases, namely (i) data pre-processing with a moving window, (ii) feature extraction in both the time domain and the frequency domain, and (iii) construction of an RF model. Each phase is discussed in detail in the following sub-sections.

Data Pre-processing with Moving Window

To effectively identify the operational performance of wheels from the online monitoring data, a fixed-size moving window is used to extract various sub-datasets from the original signals.
The raw data processed by the moving window at time $t_k$ can be arranged in matrix form as

$$
\mathbf{M}_k(t) =
\begin{bmatrix}
m_1(t_k) & m_2(t_k) & \cdots & m_{N_s}(t_k) \\
m_1(t_{k+1}) & m_2(t_{k+1}) & \cdots & m_{N_s}(t_{k+1}) \\
\vdots & \vdots & \ddots & \vdots \\
m_1(t_{k+N_w}) & m_2(t_{k+N_w}) & \cdots & m_{N_s}(t_{k+N_w})
\end{bmatrix},
\quad k = 1, \ldots, \frac{N_p}{N_w(1-Q)} \tag{1}
$$

where $N_s$ is the number of sensors deployed on the high-speed train, $N_p$ is the number of sampling points, $N_w$ is the number of measurements within each moving window, and $Q$ denotes the overlap degree of the window along the column. At each time step $t_k$, $m_i(t_k)$ is the datum of the $i$-th sensor, and the data segment $\mathbf{M}_k(t)$ represents the running condition of the train wheels within the moving time window at time segment $t_k$. To extract the information in the original signals, the TKEO is used to transform the data into the TK time domain, yielding the amplitude-modulated signals in Eq. (2), which are defined in Eq. (6):

$$
\mathbf{A}_k(t) =
\begin{bmatrix}
a_1(t_k) & a_2(t_k) & \cdots & a_{N_s}(t_k) \\
a_1(t_{k+1}) & a_2(t_{k+1}) & \cdots & a_{N_s}(t_{k+1}) \\
\vdots & \vdots & \ddots & \vdots \\
a_1(t_{k+N_w}) & a_2(t_{k+N_w}) & \cdots & a_{N_s}(t_{k+N_w})
\end{bmatrix},
\quad k = 1, \ldots, \frac{N_p}{N_w(1-Q)} \tag{2}
$$

Statistical approaches are then employed for feature extraction, as presented in Table I. Together with the other features extracted in the frequency domain by the WPD method and the Sperling index, the feature matrix is expressed as

$$
\mathbf{F}(t_k) = \begin{bmatrix} f_1(t_k) & f_2(t_k) & \cdots & f_{N_v}(t_k) \end{bmatrix},
\quad k = 1, \ldots, \frac{N_p}{N_w(1-Q)} \tag{3}
$$

where $N_v$ is the number of features extracted at each time segment.

Feature Extraction Strategies

TIME-DOMAIN FEATURES: TKEO

The TKEO $\Psi[\cdot]$ was first proposed by Kaiser [7]. It has good adaptability and high time resolution, and requires neither complicated signal transform procedures nor any band-pass or low-pass filtering. It is defined as

$$
\Psi[x(t)] = [\dot{x}(t)]^2 - x(t)\,\ddot{x}(t) \tag{4}
$$

where $\dot{x}(t)$ and $\ddot{x}(t)$ are the first and second time derivatives of the original signal $x(t)$, respectively. In discrete time, it can be expressed as

$$
\Psi[x(n)] = x(n)^2 - x(n+1)\,x(n-1) \tag{5}
$$

where only three sampling points are required for the energy computation at each time instant.
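As a minimal sketch of the pre-processing described above (not the authors' implementation), the moving window and the discrete operator of Eq. (5) might be written as follows. The function names `sliding_windows` and `tkeo`, the use of NumPy, and the window length expressed in samples are assumptions made for illustration; the paper does not state the sampling rate, so the sample counts used here are hypothetical.

```python
import numpy as np


def sliding_windows(signal, n_w, q):
    """Yield windows of n_w samples with fractional overlap q (0 <= q < 1).

    The step between window starts is n_w * (1 - q); only windows that fit
    entirely inside the signal are returned (an assumption of this sketch).
    """
    step = max(1, int(round(n_w * (1.0 - q))))
    for start in range(0, len(signal) - n_w + 1, step):
        yield signal[start:start + n_w]


def tkeo(x):
    """Discrete Teager-Kaiser energy, psi[n] = x[n]^2 - x[n+1] * x[n-1].

    Three neighbouring samples are needed per output value, so the result
    is two samples shorter than the 1-D input.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[2:] * x[:-2]


if __name__ == "__main__":
    # Hypothetical test signal: a pure sinusoid A*cos(omega*n + phi).
    # For such a signal the operator equals A^2 * sin(omega)^2 at every n.
    n = np.arange(1000)
    sig = 2.0 * np.cos(0.3 * n + 0.1)

    # A 75% overlap as used in the paper; n_w = 100 samples is illustrative.
    for window in sliding_windows(sig, n_w=100, q=0.75):
        energy = tkeo(window)
        # Per-window statistics (mean, RMS, ...) would be computed here.
        _ = energy.mean()
```

For a pure sinusoid the operator returns a constant proportional to the squared amplitude and to the squared sine of the digital frequency, which is why it tracks energy fluctuations caused by changes in either quantity.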
To capture the variation of signals in the form of energy fluctuation, DESA-1 is employed to estimate the amplitude-modulated signals [8]:

$$
|a(n)| = \sqrt{\frac{\Psi[x(n)]}{1 - \left\{1 - \dfrac{\Psi[y(n)] + \Psi[y(n+1)]}{4\,\Psi[x(n)]}\right\}^{2}}} \tag{6}
$$

$$
f(n) = \arccos\left\{1 - \frac{\Psi[y(n)] + \Psi[y(n+1)]}{4\,\Psi[x(n)]}\right\} \tag{7}
$$

where $y(n) = x(n) - x(n-1)$. The functions $a(n)$ and $f(n)$ are the amplitude-modulated signal and the instantaneous frequency-modulated signal, respectively. Statistical measures are then applied for feature extraction, as shown in Table I.

FREQUENCY-DOMAIN FEATURES: WPD AND SPERLING INDEX

Two frequency-domain techniques are used to extract features in this work. The WPD method can evaluate the variation of energy by decomposing the original signals into $2^N$ bands [9]. In Figure 1, a significant change is observed when comparing the power spectral density (PSD) of the acceleration on the axle box acquired before and after the wheel lathing process. Similarly, the WPD technique clearly reveals the variation of energy in the second band (4.88 Hz – 9.77 Hz). This agrees well with the available results [10], in which vibration frequencies ranging from 5 Hz to 10 Hz are attributed mainly to the wheel-rail contact bouncing at the two sides of the abrasion concave caused by wheel defects. Hence, the energy value of the second frequency band obtained by the WPD technique is selected as a feature in RF modeling. In addition, the vibration-based Sperling index, which relates the subjective comfort feeling of passengers to objective physical variables, is selected as a feature to reflect the running status of train vehicles [11]. The details are presented in Table II.

Figure 1. Acceleration on the axle box before and after lathing: (a) original signals in PSD; (b) corresponding energy values in WPD.

TABLE I. SUMMARY OF TIME-DOMAIN FEATURES.

| Label | Expression | Specification and illustration |
|---|---|---|
| ME | $m = \frac{1}{N}\sum_{i=1}^{N} x_i$ | Mean value of the amplitude of TK signals. |
| RM | $x_{rms} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^2}$ | Root mean square, also known as the quadratic mean. |
| SF | $f = x_{rms} \Big/ \frac{1}{N}\sum_{i=1}^{N} \lvert x_i \rvert$ | Shape factor is a value affected by the shape of waveforms. |
| SK | $s = \frac{1}{N}\sum_{i=1}^{N}(x_i - m)^3 \Big/ \left[\sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - m)^2}\right]^3$ | Skewness measures the asymmetry of the probability distribution of a real-valued random variable around its mean value. |
| KU | $k = \frac{1}{N}\sum_{i=1}^{N}(x_i - m)^4 \Big/ \left[\frac{1}{N}\sum_{i=1}^{N}(x_i - m)^2\right]^2$ | Kurtosis is a descriptor of the shape of a probability distribution. |
| CF | $c = \max(x) / x_{rms}$ | Crest factor is the ratio of the peak value of a waveform to its effective value. |

TABLE II. SUMMARY OF FREQUENCY-DOMAIN FEATURES.

| Label | Expression | Specification and illustration |
|---|---|---|
| EN | $E = \int \lvert S_{Nj}(t) \rvert^2 \, dt = \sum_{k=1}^{n} \lvert x_{jk} \rvert^2$ | The calculation of energy by the WPD technique using the acceleration data [12]. |
| SI | $W = \left(W_1^{10} + W_2^{10} + \cdots + W_n^{10}\right)^{0.1} = \left(\sum_{i=1}^{n} W_i^{10}\right)^{0.1}$, with $W_i = 7.08\left[\frac{A_i^2}{f_i} F(f_i)\right]^{0.1}$ | The Sperling index is a specific indicator relating the subjective comfort feeling of passengers to the objective physical variables of a running vehicle [13]. |

RF-based Classification

The random forest (RF) model is an ensemble learning technique for classification. It has good stability and is not sensitive to noise. As shown in Figure 2, an RF model combines a large number of individual decision-tree classifiers and lets them vote for the most popular class to achieve a high level of accuracy [14]. The three major steps for implementing this RF classification technique are as follows:

Step 1: Data subset generation for training. The feature matrix $\mathbf{F}(t)$ is employed as the original input dataset of the RF. Using a bagging process, $k$ training sub-datasets are randomly selected from the original $\mathbf{F}(t)$ dataset. Because the samples are drawn with replacement and no sampling data are deleted, the data left unused form the out-of-bag (OOB) datasets, which are used to estimate the accuracy.
Step 2: Growth of classifier trees. Unlike conventional decision trees, the RF technique does not require any pruning to achieve high performance. It randomly selects a fixed number ($m$) of split features from $\mathbf{F}(t)$. The Gini index is then used to decide the best feature for the division at each splitting node as the trees grow.

Step 3: Selection of the most popular class. The margin function in the RF assists in selecting the correct class. Finally, the RF technique counts how often the samples appear at the same terminal classification node and then votes for the best classification.

Figure 2. Architectural hierarchy of an RF classification model.

ILLUSTRATIVE APPLICATION

As mentioned above, an on-board sensing system was installed on an in-service high-speed train. Both piezoelectric and optical fiber sensors were used to continuously collect various types of data, including acceleration, strain, temperature and sound data, from the trailer bogie, axle box and interior car floor of the train [15]. According to Tables I and II, $N_v$ (= 21) features are extracted from the raw data and recorded in the matrix columns $\mathbf{F}(t) = [f_1(t), f_2(t), \ldots, f_{N_v}(t)]$. In this study, a moving window with a width of 100 s and a 75% overlap degree is employed. To validate the effectiveness of the proposed RF-based prognostic technique, the monitoring data acquired from the train before and after the wheel lathing process are used. At each time segment $t_k$, $\mathbf{F}(t_k)$ is labeled with a classification tag that represents the status of the train wheels.

TABLE III. CLASSIFICATION ACCURACY (%) WITH DIFFERENT SPLIT FEATURES.
Number of split features $m$ selected for RF modeling and prediction ($k$ = 2000, $N_v'$ = 10):

| Category | m = 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Class 1 | 99.83 | 99.72 | 99.65 | 99.55 | 99.55 | 99.51 | 99.38 | 99.27 | 99.27 | 99.13 |
| Class 2 | 97.96 | 97.96 | 98.10 | 98.14 | 98.14 | 98.21 | 98.14 | 98.10 | 98.10 | 97.96 |
| Average | 98.90 | 98.84 | 98.88 | 98.85 | 98.85 | 98.88 | 98.83 | 98.74 | 98.69 | 98.55 |

(Note: $N_v'$ is the number of features retained after the elimination work, i.e., the selection of relatively important features from the original matrix $\mathbf{F}(t)$ according to the Boruta feature selection; Class 1 is the estimated accuracy for the well-behaved state of wheels, and Class 2 is the estimated accuracy for the out-of-round wheels.)

TABLE IV. CLASSIFICATION RESULTS OF THE OPTIMIZED RF MODEL.

Identification accuracy of the wheel conditions ($k$ = 1500, $m$ = 6):

| Classifier | Well-behaved | Out-of-round | Overall |
|---|---|---|---|
| RF | 99.62% | 98.21% | 98.91% |

Figure 3. A metric multi-dimensional scaling representation for the online monitoring data of the in-service high-speed train. [Axes: Dimension 1 and Dimension 2; legend: Classification 1, Classification 2.]

Two parameters, the number of trees ($k$) and the number of split features ($m$), are required for RF modeling. In this work, $k$ is set to 1500, since the generalization error converges as the number of trees increases in accordance with the “Strong Law of Large Numbers” [16]. To ensure robustness and reduce the data processing load, the Boruta feature selection (BFS), one of the powerful RF-based importance measures [17], is used to estimate the significance of the features; it also rearranges the sequence of elements in $\mathbf{F}(t)$. Finally, the 10 most important features ($N_v'$ = 10) in $\mathbf{F}_{BFS}(t)$ are considered for RF modeling, based on a convergence study.

Table III shows the influence of the number of split variables ($m$) on the classifier accuracy when the tree number $k$ is equal to 2000. The best performance of the RF model in recognizing classification 1 (i.e., the well-behaved state of wheels) reaches 99.83% when $m$ is set to 1. Similarly, the RF model using 6 split features ($m$ = 6) identifies the second classification (i.e., out-of-round wheels) best, with an accuracy of 98.21%. Regarding the average classification accuracy, the RF models employing 1, 3 and 6 split features achieve relatively higher values (98.88% – 98.90%) than the other cases. It is worth noting that the identification accuracy of the out-of-round wheels (class 2) deserves more attention, since wheel defects can affect the running safety of high-speed trains. From a modeling perspective, a value of $m$ greater than 1 is preferred to balance the tree strength and the correlation among trees in an RF structure. Therefore, the RF model trained with 1500 trees ($k$ = 1500), 6 split variables ($m$ = 6) and $\mathbf{F}_{BFS}(t)$ with $N_v'$ = 10 is selected as the optimal one for condition assessment of the train wheels.

In this research, 50% of the collected data are randomly selected for RF modeling, and the remaining samples (5676 sets) are used to test the model performance. As shown in Table IV, the proposed method can effectively identify the out-of-round wheels (class 2) with an accuracy of 98.21%, and the average recognition accuracy is up to 98.91%. The metric multi-dimensional scaling results from the optimized RF model are illustrated in Figure 3. It is observed that the features characterizing the same wheel condition are well clustered and that the clusters are clearly separated.

CONCLUSIONS

Operation safety is always of paramount importance for high-speed trains. The quality of train wheels is a dominant safety factor, and their deterioration can cause catastrophic derailment events. This motivates the exploitation of innovative techniques for online condition assessment of high-speed trains. The use of data-driven approaches based on online monitoring data offers an effective avenue for tracing the health status of high-speed trains.
This paper presents a novel RF-based strategy that combines the TKEO and WPD techniques for real-time condition monitoring of train wheels. The RF technique possesses good stability and involves little parameter adjustment in classification. In RF modeling, the moving window technique is used to achieve fast and continuous status identification. The feature extraction that uncovers the useful information contained in the raw data is a crucial step for the successful implementation of the RF-based method for online health monitoring of high-speed trains. An illustrative example is provided herein, and the test results show that the RF model achieves excellent performance in identifying out-of-round wheels.

ACKNOWLEDGEMENTS

The work described in this paper was supported by funding from the Innovation and Technology Commission of Hong Kong SAR Government (Grants No. K-BBY1 and 1-BBYJ) to the Hong Kong Branch of Chinese National Rail Transit Electrification and Automation Engineering Technology Research Center. Prof. Sheng-Guo Wang appreciates the Fulbright award program and HK PolyU support for his work as a US Fulbright-PolyU senior scholar of 2016-2017 at the HK PolyU.

REFERENCES

1. Johansson, A., and Nielsen, J. C. 2003. “Out-of-round railway wheels—wheel-rail contact forces and track response derived from field tests and numerical simulations,” Proc. Instit. Mech. Eng., Part F: J. Rail Rapid Transit, 217(2): 135-146.
2. Barke, D. W., and Chiu, W. K. 2005. “A review of the effects of out-of-round wheels on track and vehicle components,” Proc. Instit. Mech. Eng., Part F: J. Rail Rapid Transit, 219(3): 151-175.
3. Nielsen, J. C., and Oscarsson, J. 2004. “Simulation of dynamic train–track interaction with state-dependent track properties,” J. Sound Vib., 275(3-5): 515-532.
4. Alexandrou, G., Kouroussis, G., and Verlinden, O. 2016. “A comprehensive prediction model for vehicle/track/soil dynamic response due to wheel flats,” Proc. Instit. Mech. Eng., Part F: J. Rail Rapid Transit, 230(4): 1088-1104.
5. Ni, Y. Q., Liu, X. Z., Zhao, W. Z., and Liang, S. L. 2015. “Outlier Detection in Sensor-assisted Online Ride Comfort Assessment of High-speed Trains,” Proceeding of the 10th International Workshop on Structural Health Monitoring, Stanford, California, USA.
6. Zhang, L. H., Wang, Y. W., Ni, Y. Q., and Lai, S. K. 2018. “Online condition assessment of high-speed trains based on Bayesian forecasting approach and time series analysis,” Smart Struct. Syst., 21(5): 705-713.
7. Kaiser, J. F. 1990. “On a simple algorithm to calculate the 'energy' of a signal,” Proc. Int. Conf. Acoust., Speech, Signal Process., Apr. 1990, pp. 381-384.
8. Kvedalen, E. 2003. “Signal processing using the Teager Energy Operator and other nonlinear operators,” PhD thesis, University of Oslo.
9. Ekici, S., Yildirim, S., and Poyraz, M. 2008. “Energy and entropy-based feature extraction for locating fault on transmission lines by using neural network and wavelet packet decomposition,” Expert Syst. Appl., 34(4): 2937-2944.
10. Huang, Z. W., Cui, D. B., Du, X., and Jin, X. S. 2013. “Influence of deviated wear of wheel on performance of high-speed train running on straight tracks,” J. China Railw. Soc., 35(2): 14-20.
11. Zhou, J., Goodall, R., Ren, L., and Zhang, H. 2009. “Influences of car body vertical flexibility on ride quality of passenger railway vehicles,” Proc. Instit. Mech. Eng., Part F: J. Rail Rapid Transit, 223(5): 461-471.
12. Zhou, Q., Zhou, H., Zhou, Q., Yang, F., and Luo, L. 2014. “Structure damage detection based on random forest recursive feature elimination,” Mech. Syst. Signal Process., 46(1): 82-90.
13. China National Bureau of Standards. 1985. Railway Vehicles Specification for Evaluation of Dynamic Performance and Accreditation Test, GB5599, Beijing, China.
14. Breiman, L. 2001. “Random forests,” Mach. Learn., 45(1): 5-32.
15. Wang, X., Ni, Y. Q., Zhang, L. H., and Sun, Q. 2016. “Understanding of dynamic interaction of an in-service high-speed train via on-board monitoring,” Presented at the 1st International Workshop on Structural Health Monitoring for Railway System, October 12-15, 2016.
16. Rodriguez-Galiano, V. F., Ghimire, B., Rogan, J., Chica-Olmo, M., and Rigol-Sanchez, J. P. 2012. “An assessment of the effectiveness of a random forest classifier for land-cover classification,” ISPRS J. Photogramm. Remote Sens., 67: 93-104.
17. Kursa, M. B., and Rudnicki, W. R. 2010. “Feature selection with the Boruta package,” J. Stat. Softw., 36(11): 1-13.