IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 69, NO. 10, OCTOBER 2020 7319

An Improved Quantum-Inspired Differential Evolution Algorithm for Deep Belief Network

Wu Deng, Member, IEEE, Hailong Liu, Member, IEEE, Junjie Xu, Member, IEEE, Huimin Zhao, Member, IEEE, and Yingjie Song, Member, IEEE

Abstract— The deep belief network (DBN) is one of the most representative deep learning models. However, its network structure and parameters are largely determined by experience. In this article, an improved quantum-inspired differential evolution algorithm, namely MSIQDE, is proposed. It combines the Mexh wavelet function, the standard normal distribution, an adaptive quantum state update, and quantum nongate mutation to avoid premature convergence and improve the global search ability. The MSIQDE, with its global optimization ability, is then used to optimize the parameters of the DBN to construct an optimal DBN model, on which a new fault classification method, namely MSIQDE-DBN, is built. Finally, vibration data of rolling bearings from Case Western Reserve University and from a real-world engineering application are used to verify the performance of the MSIQDE-DBN method. The experimental results show that the MSIQDE achieves better optimization performance, and that the MSIQDE-DBN obtains higher classification accuracy than the other compared methods.

Index Terms— Deep belief network (DBN), fault classification, multistrategies, parameter optimization, quantum-inspired differential evolution (QDE).

Manuscript received October 21, 2019; revised January 26, 2020; accepted March 13, 2020. Date of publication March 25, 2020; date of current version September 15, 2020.
This work was supported in part by the National Natural Science Foundation of China under Grant 61771087, Grant 51605068, Grant 51475065, and Grant 51879027, in part by the Open Project Program of the Traction Power State Key Laboratory of Southwest Jiaotong University under Grant TPL2002, in part by the State Key Laboratory of Mechanical Transmissions of Chongqing University under Grant SKLMT-KFKT-201803, and in part by the Key Laboratory of Air Traffic Control Operation Planning and Safety Technology of CAUC under Grant 600001010932. The Associate Editor coordinating the review process was John Sheppard. (Corresponding author: Huimin Zhao.) Wu Deng is with the College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China, and also with the Traction Power State Key Laboratory, Southwest Jiaotong University, Chengdu 610031, China (e-mail: dw7689@163.com). Hailong Liu is with the School of Electronics and Information Engineering, Dalian Jiaotong University, Dalian 116028, China (e-mail: 18340807855@163.com). Junjie Xu is with the College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China (e-mail: connyadmin@163.com). Huimin Zhao is with the College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China, and also with the State Key Laboratory of Mechanical Transmissions, Chongqing University, Chongqing 400044, China (e-mail: hm_zhao1977@126.com). Yingjie Song is with the Co-innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, Shandong Technology and Business University, Yantai 264005, China (e-mail: songyj@sdtbu.edu.cn). Color versions of one or more of the figures in this article are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIM.2020.2983233 I. 
INTRODUCTION

In the research of fault diagnosis, time-domain or frequency-domain analysis methods are usually applied to vibration monitoring data to diagnose faults [1]–[3]. The time-domain signal can be used directly to extract fault features, which helps to preserve the basic characteristics of the signal. In time-domain analysis, the dimensionless indexes of pulse, kurtosis, peak, and waveform are widely used, but these indexes are sensitive only to some fault types and may perform poorly for others [4]–[7]. Therefore, the traditional dimensionless indexes are combined and optimized to construct new dimensionless indexes for fault classification. However, it is difficult to obtain a dimensionless index with good classification ability for sample data with large aliasing.

As a typical representative of deep learning, the deep belief network (DBN) forms a more abstract high-level representation by combining low-level features to discover distributed feature representations [8]. It can obtain high-level features directly from low-level signals, layer by layer, using greedy learning. The DBN avoids the complexity and uncertainty brought by traditional feature extraction and makes recognition more intelligent. Therefore, the DBN is widely applied in fault classification [9]–[14]. Tran et al. [15] presented a new diagnosis method using the Teager–Kaiser energy operator and DBN. Chen and Li [16] presented an approach using a sparse autoencoder and DBN for multisensor feature fusion. Shao et al. [17] presented an electric locomotive bearing fault diagnosis method using DBN. Zhao et al. [18] presented a fault diagnosis method using principal component analysis (PCA) and a broad learning system. Qin et al. [19] presented an optimized DBN with logistic sigmoid units. However, the structure and parameters of the DBN are largely determined by experience.
This not only introduces man-made diagnosis errors, but also hinders the optimization of the network structure, which results in higher calculation cost and slower speed, so that the actual needs of fault classification cannot be met.

The differential evolution (DE) algorithm is a heuristic optimization algorithm that uses the differences between individuals to guide the search in the solution space [20]. Quantum computing is a new computing technique for solving various problems [21]. Quantum-inspired DE (QDE) makes full use of the fast performance of quantum computing and the optimization ability of the DE [22]. It can prevent premature convergence, promote fast convergence, and ensure the diversity of the population. In recent years, researchers have studied QDE for solving complex problems. Draa et al. [23] presented a QDE for the N-queens problem. Su and Yang [24] presented a QDE to learn the Takagi–Sugeno fuzzy model. Xu et al. [25] presented a multiobjective evolutionary algorithm to optimize the design of the hydraulic excavator shovel attachment. Other optimization methods are presented in [26]–[31]. In general, the QDE can effectively prevent premature convergence, promote fast convergence, and ensure the diversity of the population. However, it has low global search ability, and its parameters are difficult to determine.

0018-9456 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY JALANDAR. Downloaded on April 14,2023 at 17:37:24 UTC from IEEE Xplore. Restrictions apply.
To avoid selecting the DBN parameters by experience, the QDE, which has better optimization performance, is used to select the parameters of the DBN. An improved QDE algorithm based on multistrategies, namely MSIQDE, is proposed to optimize the parameters of the DBN, which yields a fault classification method (MSIQDE-DBN) for the fault classification problem. The MSIQDE-DBN method can eliminate the interference of human factors and adaptively select the optimal parameters of the DBN, so as to effectively improve the classification accuracy and meet the actual requirements.

The rest of this article is organized as follows. Section II introduces the QDE. Section III describes an improved QDE with multistrategies. Section IV describes an optimized DBN model. A fault classification method is discussed in Section V. Section VI discusses and compares the experimental results. Section VII gives the conclusion and future work.

II. QDE

A. Chromosome Coding With Real Number

In QDE, the Q-bit is used to represent chromosomes. A pair of complex numbers $(\alpha, \beta)$ is used to define a Q-bit, which is represented as the vector $[\alpha, \beta]^T$. So a Q-bit can be expressed as

$$|\varphi\rangle = \alpha|0\rangle + \beta|1\rangle \tag{1}$$

where $\alpha$ and $\beta$ represent the probabilistic amplitudes of the corresponding states. The normalization condition $|\alpha|^2 + |\beta|^2 = 1$ must be satisfied. A quantum chromosome with length $m$ can be expressed as

$$q = \begin{bmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_m \\ \beta_1 & \beta_2 & \cdots & \beta_m \end{bmatrix}. \tag{2}$$

A 3-Q-bit chromosome can be expressed as

$$q = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} & 1/2 \\ 1/\sqrt{2} & -1/\sqrt{2} & \sqrt{3}/2 \end{bmatrix}. \tag{3}$$

Because the squared probabilistic amplitudes of the two states of each single Q-bit sum to 1, i.e., $\alpha^2 + \beta^2 = 1$, $\alpha$ and $\beta$ can be mapped one to one onto $\cos\theta$ and $\sin\theta$, respectively, so a Q-bit can also be expressed as $(\cos\theta, \sin\theta)^T$. In the QDE, there are NP individuals in the population, and each individual is composed of D Q-bits. Then, the $i$th individual $x_i$ can be expressed as

$$x_i = \begin{bmatrix} x_{i,1} & x_{i,2} & \cdots & x_{i,m} \end{bmatrix} = \begin{bmatrix} \cos(\theta_{i,1}) & \cos(\theta_{i,2}) & \cdots & \cos(\theta_{i,m}) \\ \sin(\theta_{i,1}) & \sin(\theta_{i,2}) & \cdots & \sin(\theta_{i,m}) \end{bmatrix}, \qquad \theta_{i,j} = \pi \cdot \mathrm{rand} \tag{4}$$

where rand is a random number in [0, 1], $i \in (1, 2, \ldots, \mathrm{NP})$, and NP is the number of individuals in the population.

B. Quantum Mutation Operation

In the QDE, a quantum chromosome is randomly selected from the quantum population. Its Q-bit is used as the base vector, and the Q-bits of two other quantum chromosomes are used to form the difference vector. The generation method is described as

$$\theta_i^t = \theta_{r1}^t + F_0 \cdot \mathrm{rand} \cdot \left(\theta_{r2}^t - \theta_{r3}^t\right) \tag{5}$$

where $F_0$ is the mutation coefficient and $t$ is the number of iterations. $r1$, $r2$, and $r3$ are randomly generated, mutually exclusive integers in [1, NP]. The mutated quantum chromosomes are then screened. If a chromosome value falls outside the range $[-\pi, \pi]$, it is regenerated randomly. The screening method is described as

$$\theta_{i,j}^{t,\mathrm{new}} = \begin{cases} \theta_{i,j}^{t}, & \text{if } \theta_{i,j}^{t} > -\pi \text{ and } \theta_{i,j}^{t} < \pi \\ \pi \cdot \mathrm{rand}, & \text{else.} \end{cases} \tag{6}$$

C. Quantum Crossover Operation

To ensure the diversity of the quantum population and enhance the global search ability, quantum mutation individuals and predetermined parent individuals are mixed to generate new individuals. The crossover operation is described as

$$\bar{\theta}_{i,j}^{t} = \begin{cases} \theta_{i,j}^{t}, & \text{if } \mathrm{rand} < \mathrm{CR} \text{ or } j = \mathrm{rand} \\ \theta_{i,j}^{t,\mathrm{new}}, & \text{else} \end{cases} \tag{7}$$

where CR is the quantum crossover coefficient.

D. Quantum Selection Operation

In the QDE, a greedy strategy is used: the objective function values of the test vectors are evaluated, and the individuals with better fitness values are selected for the next generation. The selection process is described as

$$u_i^k = \begin{cases} \theta_i^t, & \text{if } f\left(x_i^t\right) < f\left(\bar{x}_i^t\right) \\ \bar{\theta}_{i,j}^t, & \text{else} \end{cases} \tag{8}$$

where $x_i^k$ is the chromosome of the $k$th iteration without quantum mutation and quantum crossover.

III.
IMPROVED QDE WITH MULTISTRATEGIES

To address the low global search ability of the QDE and the difficulty of determining its parameters, an improved QDE (MSIQDE) is proposed in this article. It uses the multistrategies of the standard normal distribution, the Mexh wavelet function, the adaptive quantum state update, and the quantum nongate mutation. In the MSIQDE, the Mexh wavelet function is used to set the mutation coefficient so as to avoid falling into the local optimum and to ensure the diversity of the population. The standard normal distribution is used to set the crossover coefficient, which increases the diversity of the parameter values and improves the global search performance. The adaptive quantum state update is used to dynamically adjust the position of the quantum chromosome at each iteration to speed up the convergence. The quantum nongate mutation is used to mutate the quantum chromosome while preserving the memorized optimal position. So the MSIQDE with these strategies can effectively improve the global search ability and population diversity, avoid premature convergence, accelerate the convergence, and prevent falling into the local optimum.

A. Mutation Coefficient Using Mexh Wavelet

In the QDE, the mutation coefficient $F_0$ is the most important factor. To keep the diversity of the population and to improve the convergence speed, the Mexh wavelet function is used to improve the mutation coefficient. The Mexican hat (Mexh) wavelet is the second derivative of the Gauss function. It has good localization characteristics in both the time and frequency domains, and $\int_{\mathbb{R}} \psi(t)\,dt = 0$. Therefore, it is used to improve the mutation coefficient, whose value is randomly selected in (0, 1). The mutation coefficient based on the Mexh wavelet is described as

$$F_0 = \frac{2}{\sqrt{3}} \cdot \pi^{-\frac{1}{4}} \cdot \left(1 - x^2\right) \cdot e^{-\frac{x^2}{2}}. \tag{9}$$

B.
Crossover Coefficient Using Standard Normal Distribution

In the quantum crossover operation, the crossover coefficient CR is the crossover probability, which reflects the probability of inheriting information from the parent population. A larger CR makes the new population depend more on the mutation process and inherit less information from the parent population, which widens the global search and reduces the possibility of falling into the local optimum. On the contrary, a smaller CR encourages local search around the parent population, which can accelerate the convergence and improve the solution accuracy. In this article, the standard normal distribution is used to improve the CR. The expression is described as

$$\mathrm{CR} = N(0, 1). \tag{10}$$

C. Adaptive Quantum State Update Strategy

In addition to the traditional quantum mutation, crossover, selection, and quantum gate mutation, an adaptive state update of the quantum rotation angle is used to adjust the position of the quantum chromosome at each iteration. The adaptive state update strategy of the quantum rotation angle is described as

$$\theta_i^t = \theta_{\min} + \mathrm{fit} \cdot (\theta_{\max} - \theta_{\min}) \cdot \mathrm{rand} \cdot \exp(G/G_m) \tag{11}$$

$$\mathrm{fit} = (\mathrm{fit}_{\mathrm{Best}} - \mathrm{fit}_i)/\mathrm{fit}_{\mathrm{Best}} \tag{12}$$

where $\theta_{\min}$ is the lower limit of the angle, which is set to $0.001\pi$, and $\theta_{\max}$ is the upper limit of the angle, which is set to $0.05\pi$. $\mathrm{fit}_{\mathrm{Best}}$ is the global optimal value, and $\mathrm{fit}_i$ is the fitness value of the current individual. rand is a random number in [0, 1], and $G$ and $G_m$ are the current number and the maximum number of iterations, respectively.

D. Quantum Nongate Mutation Strategy

In quantum computation, a series of transformations of Q-bits is used to realize a logical transformation function. A quantum device that realizes a logical transformation within a certain time interval is called a quantum gate. In this article, the quantum nongate is used to mutate the quantum chromosome, and the mutation probability Pm is set.
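As a small numerical illustration of the quantum nongate named here, the sketch below applies the NOT-gate matrix to a real-coded Q-bit $(\cos\theta, \sin\theta)^T$ and checks that this is the same as replacing $\theta$ by $\pi/2 - \theta$; it then performs that replacement on each angle of a chromosome with probability Pm. The helper names, the NumPy random generator, the seed, and the example values are assumptions made for illustration and are not the authors' implementation.

```python
import numpy as np

# Quantum NOT gate: swaps the two amplitudes of a Q-bit.
NOT_GATE = np.array([[0.0, 1.0],
                     [1.0, 0.0]])


def qbit(theta):
    """Real-coded Q-bit (cos(theta), sin(theta))."""
    return np.array([np.cos(theta), np.sin(theta)])


def nongate_mutation(theta, pm, rng):
    """Replace each angle by pi/2 - theta with probability pm (hypothetical helper)."""
    theta = np.asarray(theta, dtype=float)
    mask = rng.random(theta.shape) < pm      # True where the Q-bit is mutated
    return np.where(mask, np.pi / 2 - theta, theta)


rng = np.random.default_rng(0)               # assumed seed, for repeatability only
theta = rng.uniform(0.0, np.pi, size=5)      # example rotation angles

# Applying the NOT gate equals the angle substitution theta -> pi/2 - theta.
for t in theta:
    assert np.allclose(NOT_GATE @ qbit(t), qbit(np.pi / 2 - t))

mutated = nongate_mutation(theta, pm=0.3, rng=rng)
```

Working on the angles rather than on the amplitude pairs keeps the chromosome in the same real-coded form used by the other operators, which is one reasonable design choice among several.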
If Pm > rand, then some quantum bits in the quantum chromosome are randomly selected to mutate. The memorized optimal position is kept. The quantum nongate mutation strategy is described as

$$u_i^k = \begin{bmatrix} \cos(\theta_{i,1}) & \cos(\theta_{i,2}) & \cdots & \cos(\theta_{i,m}) \\ \sin(\theta_{i,1}) & \sin(\theta_{i,2}) & \cdots & \sin(\theta_{i,m}) \end{bmatrix} \tag{13}$$

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \cos(\theta_{i,j}) \\ \sin(\theta_{i,j}) \end{bmatrix} = \begin{bmatrix} \cos\left(\frac{\pi}{2} - \theta_{i,j}\right) \\ \sin\left(\frac{\pi}{2} - \theta_{i,j}\right) \end{bmatrix} \tag{14}$$

$$v_{i,j}^k = \begin{cases} \frac{\pi}{2} - \theta_{i,j}^k, & \text{if } \mathrm{Pm} > \mathrm{rand} \\ \theta_{i,j}^k, & \text{else.} \end{cases} \tag{15}$$

IV. OPTIMIZE THE PARAMETERS OF DBN

A. DBN

The DBN is a neural network formed by stacking multiple restricted Boltzmann machine (RBM) networks and a supervised back propagation (BP) network [12]. Each RBM consists of a visible layer and a hidden layer. Let $v_i$ and $h_j$ denote the $i$th neuron in the visible layer and the $j$th neuron in the hidden layer, respectively. For a group of $(v, h)$, the energy function of the RBM is defined as

$$E(v, h|\theta) = -\sum_{i=1}^{I} a_i v_i - \sum_{j=1}^{J} b_j h_j - \sum_{i=1}^{I} \sum_{j=1}^{J} v_i \omega_{ij} h_j \tag{16}$$

where $\theta = (\omega_{ij}, a_i, b_j)$ are the parameters of the RBM, $\omega_{ij}$ is the weight between the node $v_i$ in the visible layer and the node $h_j$ in the hidden layer, and $a_i$ and $b_j$ are the bias values of $v_i$ and $h_j$, respectively. According to the energy function, the joint probability distribution of $(v, h)$ can be obtained as

$$p(v, h|\theta) = e^{-E(v,h|\theta)} / Z(\theta) \tag{17}$$

where $Z(\theta) = \sum_{v} \sum_{h} e^{-E(v,h|\theta)}$ is the normalization factor. Because there are no connections within a layer of the RBM, the activation states of the hidden nodes are independent of each other once the states of the visible nodes are given. Therefore, the activation probability of the $j$th node in the hidden layer is described as

$$p(h_j = 1|v, \theta) = \sigma\left(b_j + \sum_{i=1}^{I} v_i \omega_{ij}\right) \tag{18}$$

where $\sigma(x) = 1/(1 + e^{-x})$ is the sigmoid function. Similarly, the activation probability of the $i$th node in the visible layer can be obtained as

$$p(v_i = 1|h, \theta) = \sigma\left(a_i + \sum_{j=1}^{J} h_j \omega_{ij}\right). \tag{19}$$
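The two activation probabilities in (18) and (19) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the array names `W`, `a`, and `b`, the layer sizes, the random example values, and the Bernoulli sampling step are all chosen here for illustration, with `W[i, j]` standing for the weight between visible node $v_i$ and hidden node $h_j$.

```python
import numpy as np


def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def p_hidden_given_visible(v, W, b):
    """p(h_j = 1 | v) = sigma(b_j + sum_i v_i * W[i, j])."""
    return sigmoid(b + v @ W)


def p_visible_given_hidden(h, W, a):
    """p(v_i = 1 | h) = sigma(a_i + sum_j W[i, j] * h_j)."""
    return sigmoid(a + W @ h)


rng = np.random.default_rng(0)                 # assumed seed
I, J = 6, 4                                    # assumed numbers of visible / hidden nodes
W = 0.1 * rng.standard_normal((I, J))          # example weights
a = np.zeros(I)                                # visible biases
b = np.zeros(J)                                # hidden biases

v = rng.integers(0, 2, size=I).astype(float)   # a binary visible vector
ph = p_hidden_given_visible(v, W, b)           # hidden activation probabilities
h = (rng.random(J) < ph).astype(float)         # sample binary hidden states
pv = p_visible_given_hidden(h, W, a)           # reconstruction probabilities
```

Because the nodes of one layer are conditionally independent given the other layer, each conditional reduces to a single matrix-vector product followed by an elementwise sigmoid.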
The RBM is a stochastic neural network with a sigmoid activation function. The parameter $\theta = (\omega_{ij}, a_i, b_j)$ is obtained iteratively so that it fits the given training data. The parameter $\theta^*$ can be obtained by maximizing the logarithmic likelihood function on the training set

$$\theta^* = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} \sum_{t=1}^{T} \ln p\left(v^{(t)}|\theta\right). \tag{20}$$

The gradient of the logarithmic likelihood of the RBM is approximated by the contrastive divergence algorithm, and the stochastic gradient ascent method is used to maximize the logarithmic likelihood function. The update formulas of the parameters are described as

$$\Delta\omega_{ij} = \varepsilon\left(\langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{recon}}\right) \tag{21}$$

$$\Delta a_i = \varepsilon\left(\langle v_i \rangle_{\mathrm{data}} - \langle v_i \rangle_{\mathrm{recon}}\right) \tag{22}$$

$$\Delta b_j = \varepsilon\left(\langle h_j \rangle_{\mathrm{data}} - \langle h_j \rangle_{\mathrm{recon}}\right) \tag{23}$$

where $\varepsilon$ is the learning rate of pretraining, $\langle\cdot\rangle_{\mathrm{data}}$ is the mathematical expectation over the training data, and $\langle\cdot\rangle_{\mathrm{recon}}$ is the mathematical expectation over the reconstructed model. For the output layer, if the output of the $i$th node is $o_i$, the expected output is $d_i$, and the sensitivity is $\delta_i$, then $\delta_i$ is described as

$$\delta_i = o_i (1 - o_i)(d_i - o_i). \tag{24}$$

The expression of the sensitivity in the $l$th hidden layer is described as

$$\delta_i^l = y_i^l \left(1 - y_i^l\right) \sum_{j} \omega_{ij}^l \delta_j^{l+1}. \tag{25}$$

The update formulas of the weights and biases of each layer in the DBN under the learning rate $\varepsilon$ are described as

$$\omega_{ij}^l = \omega_{ij}^l + \varepsilon \cdot y_i^l \delta_j^{l+1} \tag{26}$$

$$a_j^l = a_j^l + \varepsilon \cdot \delta_j^{l+1} \tag{27}$$

$$b_j^l = b_j^l + \varepsilon \cdot \delta_j^{l+1}. \tag{28}$$

B. Optimized Parameters of DBN Using MSIQDE

The DBN model has a strong ability to approximate nonlinear functions, and it is widely applied in the fields of online prediction, classification, and fault diagnosis. In applications, the connection weights and biases of the DBN are initialized randomly, which leads to insufficient global optimization ability, so that training eventually falls into the local optimum.
Therefore, the connection weights and biases of the DBN seriously affect the classification accuracy. The MSIQDE algorithm offers global search ability, population diversity, and fast convergence. Therefore, the MSIQDE with global optimization ability is introduced into the DBN to optimize the connection weights of each node. A parameter optimization method for the DBN based on the MSIQDE is proposed to obtain an optimized DBN (MSIQDE-DBN) model in order to avoid the blindness of parameter selection, reduce the impact on modeling accuracy, and enhance the classification and forecasting ability.

C. Model of the MSIQDE-DBN

The flow of the MSIQDE-DBN is shown in Fig. 1.

Fig. 1. Flow of the MSIQDE-DBN.

V. FAULT CLASSIFICATION BASED ON MSIQDE-DBN

A. New Fault Classification Method

Fault diagnosis selects an appropriate method to determine the fault type, location, severity, and so on. In past years, different fault diagnosis methods have been proposed to realize fault classification and to obtain better classification results. But some of these methods cannot effectively recognize early faults or classify faults with large-scale data. The DBN is a neural network that is a probabilistic generative model. Compared with the traditional discriminative neural network, the generative model establishes a joint distribution between the observation data and the labels. Therefore, the MSIQDE-DBN is applied to fault classification in order to propose a new fault classification method that achieves higher accuracy for rotating machinery.

B. Detailed Steps of Fault Classification Method

Step 1: The data are normalized and divided into training set (TrainData) and test set (TestData).
Step 2: Initialize the parameters of the MSIQDE-DBN, which include the population number (NP), quantum chromosome length (N), the maximum number of iterations (Gm), mutation probability (Pm), crossover probability (Pc), the number of hidden layers, the number of connecting nodes in each layer, learning rate, initial momentum factor, unsupervised training times, supervised iteration times, and so on.
Step 3: The quantum chromosome length is determined according to the number of layers of RBMs. The quantum population is initialized and the quantum chromosome is encoded.
Step 4: According to the objective of classification, the fitness function is constructed.
Step 5: The quantum population is evaluated, and the initial fitness value of each individual in the population is calculated.
Step 6: The quantum mutation operation and quantum crossover operation are carried out, and the population is normalized to the feasible solution space.
Step 7: Implement the quantum mutation operation and the quantum crossover operation.
Step 8: The parameters of the DBN and the fitness value of the new population are obtained and compared with those of the original population in order to carry out the greedy selection operation.
Step 9: The updated quantum population is obtained by the adaptive quantum state update and quantum nongate mutation.
Step 10: The parameters of the updated DBN and the fitness value of the quantum population are obtained from the individuals of the updated population, and the optimal values are saved.
Step 11: Compare the calculation results. If the end conditions are met, the results are output and the optimal MSIQDE-DBN is obtained. Otherwise, return to Step 5.
Step 12: The training set is used to train the MSIQDE-DBN to obtain an optimal MSIQDE-DBN classification method.
Step 13: The test set is input into the MSIQDE-DBN to obtain the fault classification results.

Fig. 2. Fitness value of individual for time-domain data.

TABLE I
CLASSIFICATION ACCURACY OF TIME-DOMAIN SIGNALS

VI. EXPERIMENT AND ANALYSIS

In this section, experiments on two data sets, i.e., the data of rolling bearings from Case Western Reserve University (CWRU) and the data of a real-world engineering application, are used to evaluate the performance of the MSIQDE-DBN. It is compared with the DBN, DE-DBN, QDE-DBN, quantum-behaved genetic algorithm (QGA)-DBN, and quantum-behaved particle swarm optimization (QPSO)-DBN in terms of classification accuracy and running time for time-domain data and frequency-domain data, respectively. The parameters are set as follows. The DBN is set as five layers of neurons and four RBMs. The population number is 50, the length of the chromosome is 3, and the numbers of iterations are 200 and 20 for the time-domain data and frequency-domain data, respectively. The lower and upper limits of the number of nodes are 16 and 1500, respectively. The training times of RBM are 20, and the number of unsupervised training of RBM is 80. The fine-tuning times of BP are set as 100 and 200 for the time-domain data, and as 10 and 20 for the frequency-domain data. The learning rate of the first three layers of RBM is 0.012, and that of the last layer of linear RBM is 0.001. The penalty coefficient of weight is 0.0002, and the initial momentum is 0.5. All experiments are carried out on an Intel(R) Core(TM) i5-7400 CPU (3 GHz) with 8-GB RAM, Windows 10, and MATLAB R2018a.

A. Experiments on the CWRU Data Set

1) Experimental Data: In this article, the experiments are carried out on the CWRU data set, i.e., the 12K drive end rolling bearing data from the CWRU [32]. The sampling frequency is 12 kHz, and the data are divided into no-load (HP0), 1 HP (HP1), 2 HP (HP2), and 3 HP (HP3).
Under each load, the faults include the inner ring fault, outer ring fault, and rolling element fault. For each type of fault, there are three damage sizes: 0.1778, 0.3556, and 0.5334 mm. The data are segmented into samples of length 1024, and there are ten classes.

2) Experiment Results and Analysis for Time-Domain Data: In the experiment, the training data are 500 × 1024 and the test data are 300 × 1024. The fitness value of the individual for the bearing time-domain data (error rate of fault diagnosis) is shown in Fig. 2. The fault classification accuracies of the time-domain data with different numbers of hidden layer nodes are shown in Table I. The comparison results of the error rate changes are shown in Fig. 3. As can be seen from Table I and Fig. 3, the optimal classification accuracies are 47.62%, 55.92%, 59.38%, 59.38%, 58.85%, and 64.62% using the DBN, DE-DBN, QDE-DBN, QGA-DBN, QPSO-DBN, and MSIQDE-DBN, respectively. The proposed MSIQDE-DBN improves the classification accuracy over the DBN by 17 percentage points (from 47.62% to 64.62%). With the increase in the fine-tuning value, the classification accuracies and the running time also gradually increase. Therefore, the experiment results show that the MSIQDE can effectively and feasibly optimize the parameters of the DBN, and the MSIQDE-DBN can obtain better classification accuracy than the other models.

3) Experiment Results and Analysis for Frequency-Domain Data: The 12K drive end data are transformed into frequency spectrum data using the fast Fourier transform (FFT). The training data are 300 × 513 and the test data are 1300 × 513. The fitness value of the individual for the bearing frequency-domain data (error rate of fault classification) is shown in Fig. 4. The classification accuracies of the frequency-domain data with different numbers of hidden layer nodes are shown in Table II. The comparison results of the error rate changes are shown in Fig. 5. As can be seen from Table II and Fig. 5, the optimal classification accuracies are 98.84%, 99.23%, 99.38%, 99.15%, 99.15%, and 99.7% using the DBN, DE-DBN, QDE-DBN, QGA-DBN, QPSO-DBN, and MSIQDE-DBN, respectively. When the frequency spectrum of the bearing vibration signal is the input, the number of nodes has a greater impact on the classification results. If the parameters of the DBN are unreasonable, the classification accuracy will be unsatisfactory. The MSIQDE algorithm is used to select reasonable connection weights of each node to obtain an optimized DBN model, which can effectively improve the classification accuracy. Therefore, the experiment results show that the MSIQDE algorithm can effectively optimize the parameters of the DBN model and that the MSIQDE-DBN can obtain high classification accuracy. The running time for the frequency-domain data is much less than that for the time-domain data.

Fig. 3. Comparison results of error rate.

Fig. 4. Fitness value of individual for frequency-domain data.

TABLE II
CLASSIFICATION ACCURACY OF FREQUENCY-DOMAIN SIGNALS

Fig. 5. Comparison results of error rate.

Fig. 6. Fitness value of individual for time-domain data.

B. Experiments on the Actual Engineering Data Set

1) Experimental Data: In this article, the experiments are carried out on the actual engineering data set, i.e., the vibration signals of the QPZZ-II rotary machinery at 1500 r/min. The sampling frequency is 12 kHz and the sampling time is 100 s. Nine kinds of fault vibration signals and one normal vibration signal are collected under no-load. The vibration signal is segmented into samples of length 1024, and there are ten classes.
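The data preparation described for both data sets (cutting the vibration record into samples of length 1024 and, for the frequency-domain experiments, applying the FFT to obtain 513 spectral values per sample) can be sketched as follows. This is only an illustration under assumptions: non-overlapping windows, the one-sided magnitude spectrum, and the synthetic test tone are choices made here, since the article does not state these details. The size 513 is consistent with a one-sided spectrum of 1024 points, because 1024 / 2 + 1 = 513.

```python
import numpy as np


def segment(signal, length=1024):
    """Cut a 1-D signal into non-overlapping samples of the given length."""
    n = len(signal) // length                  # number of complete samples
    return np.reshape(signal[: n * length], (n, length))


def magnitude_spectrum(samples):
    """One-sided FFT magnitude; 1024 time points give 1024 // 2 + 1 = 513 bins."""
    return np.abs(np.fft.rfft(samples, axis=1))


fs = 12_000                                    # sampling frequency in Hz (from the text)
t = np.arange(10 * 1024) / fs                  # synthetic record, for illustration only
x = np.sin(2 * np.pi * 1500 * t)               # 1500 Hz tone, which falls exactly on bin 128

samples = segment(x)                           # shape (10, 1024)
spectra = magnitude_spectrum(samples)          # shape (10, 513)
peak_hz = np.argmax(spectra[0]) * fs / 1024    # frequency of the strongest bin
```

Each row of `samples` or `spectra` would then be one input vector for the network, after the normalization mentioned in Step 1.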
2) Experiment Results and Analysis for Time-Domain Data: The training data are 500 × 1024 and the test data are 300 × 1024. The fitness value of the individual for the bearing time-domain data (error rate of fault diagnosis) is shown in Fig. 6. The classification accuracies of the time-domain signals with different numbers of hidden layer nodes are shown in Table III. The comparison results of the error rate are shown in Fig. 7.

TABLE III
CLASSIFICATION ACCURACY OF TIME-DOMAIN SIGNALS

Fig. 7. Comparison results of error rate.

Fig. 8. Fitness value of individual for frequency-domain data.

TABLE IV
CLASSIFICATION ACCURACY OF FREQUENCY-DOMAIN SIGNALS

As can be seen from Table III and Fig. 7, the optimal classification accuracies are 50.31%, 59%, 61.31%, 60.69%, 61.38%, and 66.23% using the DBN, DE-DBN, QDE-DBN, QGA-DBN, QPSO-DBN, and MSIQDE-DBN, respectively. The MSIQDE-DBN improves the classification accuracy over the DBN by about 16 percentage points. With the increase in the fine-tuning value, the classification accuracies and the running time also gradually increase. Therefore, the experiment results show that the MSIQDE can effectively and feasibly optimize the parameters of the DBN, and the resulting model obtains better classification accuracy than the other models.

3) Experiment Results and Analysis for Frequency-Domain Data: The collected data are transformed into frequency spectrum data using the FFT. The training data are 300 × 513 and the test data are 1300 × 513. The fitness value of the individual for the bearing frequency-domain data (error rate of fault classification) is shown in Fig. 8. The classification accuracies of the frequency-domain data with different numbers of hidden layer nodes are shown in Table IV. The comparison results of the error rate changes are shown in Fig. 9.
As can be seen from Table IV and Fig. 9 for the frequency-domain data of actual engineering, the optimal classification accuracies are 95.23%, 96.46%, 96.38%, 96%, 96.23%, and 96.92%, and the running times are 5.85, 13.96, 11.76, 31.95, 9.16, and 25.02 s using the DBN, DE-DBN, QDE-DBN, QGA-DBN, QPSO-DBN, and MSIQDE-DBN, respectively. The running time of the DBN, 5.85 s, is the shortest. The optimal classification accuracy of the MSIQDE-DBN, 96.92%, is the highest. Therefore, the experiment results show that the MSIQDE can better optimize the parameters of the DBN and that the MSIQDE-DBN can effectively classify the faults of rolling bearings in an actual engineering application. It is an effective classification method for rotating machinery in industrial applications.

Fig. 9. Comparison results of error rate.

From the experiment results for the time- and frequency-domain data of CWRU and actual engineering, we can see that the DBN model handles frequency-domain data much better than time-domain data. The running times of the DBN model for frequency-domain data are also much shorter than those for time-domain data. The optimal classification accuracies of the MSIQDE-DBN reach 99.7% and 96.92% for the frequency-domain data of CWRU and actual engineering, respectively. However, the running time of the MSIQDE-DBN is slightly longer. In general, the MSIQDE-DBN has better generalization ability and robustness in fault classification.

VII.
CONCLUSION AND FUTURE WORK

In this article, an improved QDE (MSIQDE) algorithm with multistrategies of the Mexh wavelet function, standard normal distribution, adaptive quantum state update, and quantum nongate mutation is presented to optimize the parameters of the DBN and obtain an optimal DBN, which is then applied to fault classification to propose a new fault classification method, called MSIQDE-DBN. The data of CWRU and an actual engineering application are used to assess the effectiveness of the MSIQDE-DBN method. The MSIQDE-DBN is compared with five methods: DBN, DE-DBN, QDE-DBN, QGA-DBN, and QPSO-DBN. The optimal classification accuracies of the MSIQDE-DBN can reach 99.7% and 96.92% for the frequency-domain data of CWRU and the actual engineering application, respectively. Therefore, the experiment results demonstrate that the MSIQDE algorithm has better optimization performance than the compared methods. The MSIQDE-DBN method obtains better classification accuracy and shows better generalization ability and robustness in fault classification. In future research, the structure of the DBN can be optimized layer by layer to further improve the model. The complexity of the MSIQDE-DBN also needs to be reduced in future work.

REFERENCES

[1] S. Lu, R. Yan, Y. Liu, and Q. Wang, “Tacholess speed estimation in order tracking: A review with application to rotating machine fault diagnosis,” IEEE Trans. Instrum. Meas., vol. 68, no. 7, pp. 2315–2332, Jul. 2019.
[2] H. D. Shao, C. Junsheng, J. Hongkai, Y. Yu, and W. Zhantao, “Enhanced deep gated recurrent unit and complex wavelet packet energy moment entropy for early fault prognosis of bearing,” Knowl.-Based Syst., vol. 188, Jan. 2020, Art. no. 105022, doi: 10.1016/j.knosys.2019.105022.
[3] Y. Liu, Y. Mu, K. Chen, Y. Li, and J.
Guo, “Daily activity feature selection in smart homes based on Pearson correlation coefficient,” Neural Process. Lett., to be published, doi: 10.1007/s11063-019-10185-8.
[4] Q. Hu, A. Qin, Q. Zhang, J. He, and G. Sun, “Fault diagnosis based on weighted extreme learning machine with wavelet packet decomposition and KPCA,” IEEE Sensors J., vol. 18, no. 20, pp. 8472–8483, Oct. 2018.
[5] H. Zhao, H. Liu, J. Xu, and W. Deng, “Performance prediction using high-order differential mathematical morphology gradient spectrum entropy and extreme learning machine,” IEEE Trans. Instrum. Meas., early access, Oct. 21, 2019, doi: 10.1109/TIM.2019.2948414.
[6] Z. He, H. Shao, X. Zhang, J. Cheng, and Y. Yang, “Improved deep transfer auto-encoder for fault diagnosis of gearbox under variable working conditions with small training samples,” IEEE Access, vol. 7, pp. 115368–115377, 2019.
[7] T. Li, Z. Qian, and T. He, “Short-term load forecasting with improved CEEMDAN and GWO-based multiple kernel ELM,” Complexity, vol. 2020, pp. 1–20, Feb. 2020.
[8] T. Kuremoto, S. Kimura, K. Kobayashi, and M. Obayashi, “Time series forecasting using a deep belief network with restricted Boltzmann machines,” Neurocomputing, vol. 137, pp. 47–56, Aug. 2014.
[9] H. Zhao, J. Zheng, W. Deng, and Y. Song, “Semi-supervised broad learning system based on manifold regularization and broad network,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 67, no. 3, pp. 983–994, Mar. 2020.
[10] F. Jia, Y. Lei, J. Lin, X. Zhou, and N. Lu, “Deep neural networks: A promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data,” Mech. Syst. Signal Process., vols. 72–73, pp. 303–315, May 2016.
[11] Y. Liu, X. Wang, Z. Zhai, R. Chen, B. Zhang, and Y. Jiang, “Timely daily activity recognition from headmost sensor events,” ISA Trans., vol. 94, pp. 379–390, Nov. 2019.
[12] C. Zhang, K. C. Tan, H. Li, and G. S.
Hong, “A cost-sensitive deep belief network for imbalanced classification,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 1, pp. 109–122, Jan. 2019.
[13] J. Sun, C. Yan, and J. Wen, “Intelligent bearing fault diagnosis method combining compressed data acquisition and deep learning,” IEEE Trans. Instrum. Meas., vol. 67, no. 1, pp. 185–195, Jan. 2018.
[14] L. Wen, X. Li, L. Gao, and Y. Zhang, “A new convolutional neural network-based data-driven fault diagnosis method,” IEEE Trans. Ind. Electron., vol. 65, no. 7, pp. 5990–5998, Jul. 2018.
[15] V. T. Tran, F. AlThobiani, and A. Ball, “An approach to fault diagnosis of reciprocating compressor valves using Teager–Kaiser energy operator and deep belief networks,” Expert Syst. Appl., vol. 41, no. 9, pp. 4113–4122, Jul. 2014.
[16] Z. Chen and W. Li, “Multisensor feature fusion for bearing fault diagnosis using sparse autoencoder and deep belief network,” IEEE Trans. Instrum. Meas., vol. 66, no. 7, pp. 1693–1702, Jul. 2017.
[17] H. Shao, H. Jiang, H. Zhang, and T. Liang, “Electric locomotive bearing fault diagnosis using a novel convolutional deep belief network,” IEEE Trans. Ind. Electron., vol. 65, no. 3, pp. 2727–2736, Mar. 2018.
[18] H. Zhao, J. Zheng, J. Xu, and W. Deng, “Fault diagnosis method based on principal component analysis and broad learning system,” IEEE Access, vol. 7, pp. 99263–99272, 2019.
[19] Y. Qin, X. Wang, and J. Zou, “The optimized deep belief networks with improved logistic sigmoid units and their application in fault diagnosis for planetary gearboxes of wind turbines,” IEEE Trans. Ind. Electron., vol. 66, no. 5, pp. 3814–3824, May 2019.
[20] B. Dorronsoro and P. Bouvry, “Improving classical and decentralized differential evolution with new mutation operator and population topologies,” IEEE Trans. Evol. Comput., vol. 15, no. 1, pp. 67–98, Feb. 2011.
[21] R. Barends et al., “Digitized adiabatic quantum computing with a superconducting circuit,” Nature, vol. 534, no. 7606, pp. 222–226, 2016.
[22] D. Zouache and A. Moussaoui, “Quantum-inspired differential evolution with particle swarm optimization for knapsack problem,” J. Inf. Sci. Eng., vol. 31, no. 5, pp. 1757–1773, 2015.
[23] A. Draa et al., “A quantum-inspired differential evolution algorithm for solving the N-queens problem,” Int. Arab J. Inf. Technol., vol. 7, no. 1, pp. 21–27, 2010.
[24] H. Su and Y. Yang, “Differential evolution and quantum-inquired differential evolution for evolving Takagi–Sugeno fuzzy models,” Expert Syst. Appl., vol. 38, no. 6, pp. 6447–6451, 2011.
[25] G. Xu, H. Ding, and Z. Feng, “Optimal design of hydraulic excavator shovel attachment based on multiobjective evolutionary algorithm,” IEEE/ASME Trans. Mechatronics, vol. 24, no. 2, pp. 808–819, Apr. 2019.
[26] W. Deng, H. Zhao, L. Zou, G. Li, X. Yang, and D. Wu, “A novel collaborative optimization algorithm in solving complex optimization problems,” Soft Comput., vol. 21, no. 15, pp. 4387–4398, Aug. 2017.
[27] H. Chen, S. Jiao, A. A. Heidari, M. Wang, X. Chen, and X. Zhao, “An opposition-based sine cosine approach with local search for parameter estimation of photovoltaic models,” Energy Convers. Manage., vol. 195, pp. 927–942, Sep. 2019.
[28] W. Deng, J. Xu, and H. Zhao, “An improved ant colony optimization algorithm based on hybrid strategies for scheduling problem,” IEEE Access, vol. 7, pp. 20281–20292, 2019.
[29] Y. Xu, H. Chen, J. Luo, Q. Zhang, S. Jiao, and X. Zhang, “Enhanced moth-flame optimizer with mutation strategy for global optimization,” Inf. Sci., vol. 492, pp. 181–203, Aug. 2019.
[30] R. Chen, S.-K. Guo, X.-Z. Wang, and T.-L. Zhang, “Fusion of multi-RSMOTE with fuzzy integral to classify bug reports with an imbalanced distribution,” IEEE Trans. Fuzzy Syst., vol. 27, no. 12, pp. 2406–2420, Dec. 2019.
[31] T. Li, J. Shi, X. Li, J. Wu, and F. Pan, “Image encryption based on pixel-level diffusion with dynamic filtering and DNA-level permutation with 3D Latin cubes,” Entropy, vol. 21, no. 3, p. 319, 2019.
[32] Case Western Reserve University Bearing Data Center. Accessed: Jul. 31, 2017. [Online]. Available: http://csegroups.case.edu/bearingdatacenter/home

Wu Deng (Member, IEEE) received the Ph.D. degree in computer application technology from Dalian Maritime University, Dalian, China, in 2012. Since 2019, he has been a Professor with the College of Electronic Information and Automation, Civil Aviation University of China, Tianjin, China. His research interests include artificial intelligence, optimization methods, and fault diagnosis.

Huimin Zhao (Member, IEEE) received the Ph.D. degree in mechanical engineering from Dalian Maritime University, Dalian, China, in 2013. Since 2019, she has been a Professor with the College of Electronic Information and Automation, Civil Aviation University of China, Tianjin, China. Her research interests include artificial intelligence, signal processing, and fault diagnosis.

Hailong Liu (Member, IEEE) received the B.S. degree in optoelectronic engineering from Shandong Normal University, Jinan, China, in 2017. He is currently pursuing the Ph.D. degree with Dalian Jiaotong University, Dalian, China. His research interests include deep learning and fault classification.

Junjie Xu (Member, IEEE) received the Ph.D. degree in computer application technology from Dalian Maritime University, Dalian, China, in 2016. Since 2016, she has been a Lecturer with the College of Computer Science and Technology, Civil Aviation University of China, Tianjin, China. Her research interests include artificial intelligence and information safety.

Yingjie Song (Member, IEEE) received the Ph.D. degree in computer application technology from Dalian Maritime University, Dalian, China, in 2013.
Since 2013, she has been an Associate Professor with the College of Computer Science and Technology, Shandong Institute of Business and Technology, Yantai, China. Her research interests include artificial intelligence and signal processing.