10844 IEEE SENSORS JOURNAL, VOL. 21, NO. 9, MAY 1, 2021

Weld Defect Detection From Imbalanced Radiographic Images Based on Contrast Enhancement Conditional Generative Adversarial Network and Transfer Learning

Runyuan Guo, Han Liu, Member, IEEE, Guo Xie, Member, IEEE, and Youmin Zhang, Senior Member, IEEE

Abstract—When a sensor data-based detection method is used to detect potential defects in industrial products, the data are normally imbalanced. This problem hinders improvement of the robustness and accuracy of the defect detection system. In this work, welding defect detection is taken as an example: based on imbalanced radiographic images, a welding defect detection method using a generative adversarial network combined with transfer learning is proposed to solve the data imbalance and improve the accuracy of defect detection. First, a new model named the contrast enhancement conditional generative adversarial network is proposed, which is used as a global resampling method for data augmentation of X-ray images. While solving the limitation on feature extraction caused by the low contrast of some images, the data distribution is balanced and the number of image samples is expanded. Then, the Xception model is introduced as the feature extractor of the target network for transfer learning, and based on the obtained balanced data, fine-tuning is performed through frozen–unfrozen training to build the intelligent defect detection model. Finally, the defect detection model is used to detect five types of welding defects: crack, lack of fusion, lack of penetration, porosity, and slag inclusion; an F1-score of 0.909 and a defect recognition accuracy of 92.5% are achieved. The experimental results verify the effectiveness and superiority of the proposed defect detection method compared to conventional methods. The proposed method also has promotional value for other, similar defect detection applications.
Index Terms—Contrast enhancement conditional generative adversarial network, deep learning, imbalanced data, sensor data processing, transfer learning, weld defect detection.

Manuscript received January 4, 2021; revised February 11, 2021; accepted February 11, 2021. Date of publication February 16, 2021; date of current version April 5, 2021. This work was supported in part by the National Natural Science Foundation of China under Grant 61973248 and Grant 61833013, in part by the Key Project of Shaanxi Key Research and Development Program under Grant 2018ZDXM-GY089, and in part by the Research Program of the Shaanxi Collaborative Innovation Center of Modern Equipment Green Manufacturing under Grant 304-210891704. The associate editor coordinating the review of this article and approving it for publication was Prof. Guiyun Tian. (Corresponding author: Han Liu.) Runyuan Guo, Han Liu, and Guo Xie are with the School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China (e-mail: xianryan@163.com; liuhan@xaut.edu.cn). Youmin Zhang is with the Department of Mechanical, Industrial, and Aerospace Engineering, Concordia University, Montreal, QC H3G 1M8, Canada (e-mail: youmin.zhang@concordia.ca). Digital Object Identifier 10.1109/JSEN.2021.3059860

I. INTRODUCTION

Defects arising from welding operations damage the quality of manufactured products. Therefore, welding defects must be detected during the manufacturing process [1]. Considering the defect detection of a pipeline weld seam as an example, internal defects are generally observed by X-ray nondestructive testing technology [2]. However, existing X-ray nondestructive testing processes are still dominated by manual observation of real-time sensor images, and such observations are highly subjective.
In low-contrast images, defects cannot be accurately detected, and the detection results are easily affected by the visual fatigue of inspectors during mass production, resulting in false detections and missed defects [3]. Therefore, research on intelligent, high-accuracy weld defect detection methods has become popular in the sensor data-driven detection field in recent years.

1558-1748 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

To detect weld defects based on X-ray images, the main process is divided into three steps: weld seam extraction, defect segmentation, and defect detection [4]. Defect detection refers to the detection of defects after extracting their features (the detection ability is evaluated in two respects: first, whether the method can accurately detect defects, and second, whether the specific type of defect can be identified once an image is recognized as containing defects) [5]. Scholars have carried out extensive studies on these three aspects and have achieved limited progress [6], [7]. In these studies, defect detection is mostly achieved by machine learning methods based on artificially designed features. The detection results of such methods are always limited by the quality of the designed features and the quantity of image data. Compared with traditional machine learning methods, deep learning technology can automatically extract deep representative features from sensor data, which makes it unnecessary to design the features manually [8], [9].
Another advantage of deep learning is that networks such as stacked autoencoders (SAEs) and deep belief networks (DBNs) can be trained using unlabeled data [10]. Given the massive acquisition of industrial sensor data, this advantage is conducive to comprehensively learning the information in the sensor data [11]. Therefore, more scholars have attempted to use deep learning models for weld defect detection [12]. We also designed a convolutional neural network (CNN) for the detection of weld defects, and the detection accuracy was improved compared with traditional methods [13]. This is because the convolutional structure can independently extract essential features at the pixel level, and massive data introduce more information, which facilitates a comprehensive characterization of each type of defect [14]. According to current research results, the advantages of deep learning can indeed improve detection. However, such new methods also have problems that must be solved. Complex industrial production today generally follows six-sigma quality management requirements, that is, minimizing the number of defects in products and processes to improve product quality. Therefore, the quantity of defective products is significantly lower than that of intact products [15]. This phenomenon also exists in production involving welding, and the different types of welding defects are often unevenly distributed. The imbalanced distribution of these defect types is called the imbalanced type problem, or the imbalanced data problem, in weld defect detection. This problem may be attributed to technological gaps in the welding process, so it is intrinsic to the defect detection field. Consequently, even when new intelligent-feature and deep learning methods are used for defect detection, the data imbalance persists [16]. Therefore, defect detection from imbalanced data must be further studied.
Research strategies for the imbalanced data problem can be roughly divided into two groups. The first involves research at the data level. This approach mainly changes the distribution of the training dataset through resampling, so that the numbers of the different types of data tend toward balance, and then the traditional detection and classification algorithms are used [17]. The second entails research at the algorithm level. This approach typically modifies the classification algorithm, for example, by setting a high misclassification cost factor to improve the learning ability for the minority class, thus achieving an improved classification effect [18]. In general, algorithm-level methods are more effective for certain specific detection and recognition problems, while data-level methods are preferable in practice because they are applicable to any learning system used for detection and recognition without changing the basic learning strategy [19]. Because research interest in the imbalanced data problem arose late, very few studies have addressed imbalanced data in the field of welding defect detection, among which the most representative works are [20] and [21]. The work in [20] is the first known research to study this problem; it evaluates the effectiveness of a total of 22 resampling methods for processing imbalanced data for welding defect detection at the data level, and concludes that no resampling method dominates the other techniques, regardless of the classifier it is combined with. We believe this is because the resampling methods used in that work all sampled from a local neighborhood without considering the overall distribution of the dataset. If we can sample directly from the distribution of the global data and balance the training dataset, an ideal data-level learning method for imbalanced data may be obtained.
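The two strategy levels can be contrasted in a small NumPy sketch (the class counts are made up for illustration): random oversampling rebalances the data itself, while inverse-frequency class weights leave the data unchanged and reweight the loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced label set: 90 non-defect (0) vs. 10 defect (1) samples.
labels = np.array([0] * 90 + [1] * 10)

# Data-level strategy: random oversampling duplicates minority samples
# until both classes are equally represented.
minority_idx = np.flatnonzero(labels == 1)
extra = rng.choice(minority_idx, size=90 - 10, replace=True)
balanced = np.concatenate([labels, labels[extra]])
print(np.bincount(balanced))  # both classes now have 90 samples

# Algorithm-level strategy: keep the data as-is, but weight the loss so
# that misclassifying the minority class costs more (inverse frequency).
weights = len(labels) / (2 * np.bincount(labels))
print(weights)  # the minority class receives the larger weight
```

Either sketch plugs into a standard classifier unchanged, which is why the paper calls data-level methods the more broadly applicable option.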
The research results of [21] show that deep features extracted by a deep CNN (DCNN) characterize welding defects better than artificially designed features, such as histogram of oriented gradients features. Therefore, to obtain better defect detection results, a DNN with a more advanced structure and stronger feature extraction ability must be used. Meanwhile, although three resampling methods are used in [21] to address the imbalanced data problem, namely random oversampling (ROS), random undersampling (RUS), and the synthetic minority oversampling technique (SMOTE), the possible influence of the different resampling methods on detection results is not discussed in detail. Moreover, the effects of the three resampling methods are necessarily limited because resampling from the global distribution of the data is not considered. To determine an ideal resampling method, we considered the generative adversarial network (GAN), which has been a hot topic in academia [22]. A GAN adopts an internal adversarial mechanism for training and samples directly from the global distribution of the data, which can theoretically achieve a closer approximation to the real data [23]. However, GANs tend to perform well on public datasets, whereas many sensor images collected in industrial scenarios do not have an obvious foreground, which makes it difficult to detect and identify the target. For example, some radiographic defect images have low contrast, which means the defect area, as the image foreground, cannot be highlighted, thus affecting the neural network's ability to extract the defect features. Therefore, industrial image resampling based on GANs needs further study. In addition, it has been proven that the new defect detection method of intelligent features
and deep learning has great advantages. To further improve the accuracy and robustness of defect detection, we introduce transfer learning technology into defect detection [24]. Since the features of a deep neural network (DNN) are extracted layer-wise, the features from the lower layers tend to be general, and only the features from the upper layers are related to specific tasks. A DNN trained with large quantities of data usually has strong feature extraction ability in its lower layers. Therefore, a new DNN based on these lower layers is built, and fine-tuning of this new network is carried out to cater to specific applications, such as defect detection, so that the obtained network has improved feature extraction performance for defects [25]. Hence, to solve the problem of imbalanced data and further improve the accuracy of defect detection, a new defect detection method (contrast enhancement conditional GAN (CECGAN) combined with Xception transfer learning) is proposed in this paper. The CECGAN model is first proposed and then introduced to defect detection as a resampling method for the imbalanced data. By adding external condition information (such as label information) to the basic GAN model, CECGAN is able to guide the model to generate data of the corresponding types, which greatly reduces the impact of the GAN's mode collapse phenomenon [26]. Meanwhile, the contrast enhancement operation is integrated into the structure of the GAN model to eliminate the influence of obscure foreground features in industrial images during feature extraction. The Xception model has a residual structure, and its excellent feature extraction performance has been proven on natural images thanks to its improved depthwise separable convolution operation. Introducing Xception to defect detection as a base learner for transfer learning is of great value for studying its applicability to industrial image data [27].
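The transfer-learning idea above (reuse general lower layers, retrain the task-specific head) can be sketched in Keras. This is a minimal sketch, not the authors' exact configuration: `weights=None` stands in for the ImageNet-pretrained weights only to keep the example self-contained, and the single-Dense head is an assumption; the h = 30 unfrozen layers follow the paper's description.

```python
import tensorflow as tf

# Target network: Xception's convolutional base as the feature extractor
# (the paper loads ImageNet weights; weights=None avoids a download here).
base = tf.keras.applications.Xception(
    weights=None, include_top=False, input_shape=(71, 71, 3), pooling="avg")
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(6, activation="softmax"),  # 5 defect types + non-defect
])

# Stage 1 (frozen): train only the randomly initialized classifier head.
base.trainable = False
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# ... model.fit(...) on the balanced defect data ...

# Stage 2 (unfrozen fine-tuning): unfreeze the top h = 30 layers of the base
# and train them together with the head; the bottom layers stay frozen.
base.trainable = True
for layer in base.layers[:-30]:
    layer.trainable = False
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# ... continue model.fit(...) at a small learning rate ...
```

The 71 × 71 patch size used throughout the paper happens to match the minimum input size Keras documents for Xception, so no resizing is needed in this sketch.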
In this study, considering the weld defect detection of petroleum pipelines as an example, a series of comparative experiments were performed, and the results confirm the effectiveness and superiority of the proposed defect detection method compared to conventional methods. The remainder of this paper is organized as follows. In Section II, the characteristics of the X-ray welding defect data are introduced; the CECGAN resampling method (including CECGAN's structure and training method, as well as the CECGAN-based resampling method) is proposed and detailed; and the basic theories of transfer learning and the Xception model are introduced. In Section III, the proposed welding defect detection method is described in detail. In Section IV, test experiments are presented to verify the effectiveness and superiority of the proposed method. Finally, the conclusions are presented in Section V.

II. OVERVIEW OF RADIOGRAPHIC IMAGE DATASET, CECGAN, AND XCEPTION

A. Radiographic Image Dataset

For the automatic nondestructive detection of welding defects in petroleum pipelines, the detection object is the X-ray welding image obtained by the real-time welding seam imaging system. In this study, we detect five main welding defects in the weld seam based on the X-ray images: crack (CR), lack of fusion (LF), lack of penetration (LP), slag inclusion (SI), and porosity (PO) [28].

Fig. 1. Examples of the five defects and non-defect.

In our previous research on welding defect segmentation, we proposed a clustering algorithm based on ordering points to successfully achieve accurate segmentation of defects of any shape or size in the weld seam [13]. Based on this method, a total of 20,360 X-ray images of size 71 × 71 are obtained, including 4,640 image samples containing defects and 15,720 images without defects; this original welding defect dataset is denoted as Data_original. As shown in Fig.
1, five examples are listed for each defect in Data_original, and ND in the figure represents image data without defects. It is seen that, owing to the influence of the X-ray projection angle and the X-ray quantum quantity, some of the X-ray defect images have relatively low contrast, making the defect part of the weld seam area difficult to identify, which is unfavorable for feature extraction of weld defects [29]. Further observation of the defect characteristics shows that the shape or geometric characteristics of defects of a specific type have some common features. For example, the shapes of CR, LF, and LP tend to be linear, whereas PO and SI are generally round. It should be noted that the class distribution of the welding defect data used is still imbalanced; the numbers in brackets in the figure represent the number of samples in each category.

B. Contrast Enhancement Conditional Generative Adversarial Network

Because GANs can learn the actual global distribution of the data, this work solves the imbalanced data problem in the defect detection field based on GANs. However, both the discriminator network (D network) and the generator network (G network) in a simple GAN are multilayer perceptrons, and their ability to extract image features is limited. Moreover, mode collapse and mode mixing often occur in the training process, which means that the diversity and quality of the generated image data cannot be guaranteed. Therefore, some scholars introduced a category label variable, y, into both the G and D networks of a simple GAN; the resulting conditional generative adversarial network (CGAN) model extends the unsupervised GAN to a supervised learning model, so as to guide the generation of specific categories of defect images and avoid the phenomenon of mode mixing [30].
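The label conditioning described above can be sketched in NumPy (illustrative only; the exact points where CECGAN injects the label are defined by its architecture in Fig. 2). A common CGAN wiring one-hot encodes the label y, concatenates it with the generator's noise vector, and appends it to the discriminator's image as extra constant channels.

```python
import numpy as np

num_classes, noise_dim = 6, 100  # matches c = 6 and p = 100 in Section III
y = 2                            # some defect class index
one_hot = np.eye(num_classes)[y]

# Generator side: the noise vector and the one-hot label are concatenated.
z = np.random.default_rng(0).normal(size=noise_dim)
g_input = np.concatenate([z, one_hot])
print(g_input.shape)             # (106,)

# Discriminator side: the label is broadcast to per-pixel channels and
# stacked onto the (grayscale) 71 x 71 radiographic patch.
image = np.zeros((71, 71, 1))
label_maps = np.broadcast_to(one_hot, (71, 71, num_classes))
d_input = np.concatenate([image, label_maps], axis=-1)
print(d_input.shape)             # (71, 71, 7)
```

With both networks conditioned on y, the trained generator can be asked for a specific defect class, which is what makes CGAN-style resampling controllable.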
On this basis, we propose introducing the contrast enhancement operation into the CGAN and replacing the multilayer perceptrons in the D and G networks with DCNNs to enhance the feature extraction capability. The proposed CECGAN model is shown in Fig. 2.

Fig. 2. Structure and parameter settings of the proposed CECGAN.

At the beginning of the D network, contrast enhancement is performed on both the generated and real data, so that the relatively obscure defect areas in low-contrast images are highlighted. This is conducive to the subsequent convolution operations in the D network extracting defect features more accurately, while not affecting the generation of defect images (without contrast enhancement) in the G network. The resampling method based on the CECGAN model is divided into two stages: CECGAN training and resampling. In the training process of the CECGAN, the objective function follows the formulation presented in [31]. The validity of this objective function was proved via mathematical derivation, and it was shown to help the GAN train effectively. The specific steps are shown in Table I.

C. Xception

To further improve the accuracy of defect detection, we introduce transfer learning technology into defect detection. During the implementation of transfer learning, the network obtained by training with a large-scale dataset is called the base network, and the lower layers of the base network are extracted to build a new network called the target network (these lower layers are used as the feature extractor of the new network). The specific implementation steps for transfer learning are detailed in Section III.
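Returning to the contrast enhancement at the D network's input: the paper's transform follows the sine method of [34], and the formula below is an assumed representative form of a sine-based enhancement, shown only to illustrate the effect of stretching mid-gray intensities while compressing the extremes.

```python
import numpy as np

def sine_contrast_enhance(x):
    """Assumed sine-based contrast enhancement for intensities in [0, 1]:
    an S-shaped curve that pushes mid-gray values apart."""
    return 0.5 * (1.0 + np.sin(np.pi * (x - 0.5)))

x = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
print(sine_contrast_enhance(x))
# 0, 0.5, and 1 are fixed points; 0.25 is pushed down and 0.75 pushed up,
# so a faint defect sitting just above the background becomes more distinct.
```

Because the operation is a fixed, differentiable pointwise map, it can sit in front of the discriminator's first convolution without adding trainable parameters or touching the generator's output path.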
In this study, the Xception model is selected as the feature extractor of the target network in transfer learning because it is not only one of the most advanced CNN structures developed thus far but also occupies less memory and is convenient to deploy. The main internal structure of the model combines the residual structure with the improved depthwise separable convolution (channel adjustment before feature extraction). More information on the Xception model is detailed in [27].

III. WELD DEFECT DETECTION METHODOLOGY

A. Radiographic Image Resampling

To solve the problem of data imbalance and further improve detection accuracy, a new defect detection method combining CECGAN with Xception transfer learning is proposed here. First, we design a CECGAN whose structure and parameter settings are shown in Fig. 2. This CECGAN is used for resampling to balance the data distribution. It should be noted that batch normalization is not used in the output deconvolution layer of the generator or the input convolution layer of the discriminator; this setting prevents the model instability that arises when batch normalization is applied in all layers [32]. The specific parameter settings of the CECGAN model are depicted in Fig. 2. The dimension p is 100, and c is 6 (because there are six types to be detected, of which five are defects and one is the non-defect), and one-hot coding is performed on the type labels. Based on the algorithm shown in Table I, the number of training epochs, TE, is set to 200, and the Adam optimizer is used to train the CECGAN network. The exponential decay rate of the first-order moment estimation, β1, is set to 0.4, while the other parameters are kept at their default values [33]. When the training of the CECGAN is complete, random noise and specific category labels are input to the trained G network to generate the corresponding types of welding defect images, thus completing X-ray data resampling with global scope based on the CECGAN.

TABLE I ALGORITHM 1

B. Transfer Learning Based Defect Detection Model

Based on the balanced data obtained after resampling, a defect detection model is established using transfer learning, and the model is trained by means of the frozen–unfrozen method. The training process is shown in Fig. 3, and the specific steps are as follows.

Fig. 3. Transfer learning and fine-tuning process.

First, the base network is obtained after training with the ImageNet dataset. The first n layers of the base network are copied to the first n layers of the target network, and each layer of the new classifier in the target network is randomly initialized. Then, the weights of the first n layers are frozen, and the new classifier in the target network is trained based on the balanced welding defect data. To further improve the performance of transfer learning, the target network is fine-tuned after the new classifier is trained and has converged. Fine-tuning here refers to unfreezing the top layers of the first n layers while continuing to freeze the bottom layers, so that the new classifier layers and the unfrozen top layers are trained simultaneously. In this manner, the higher-order feature representation of the first n layers is fine-tuned to make it more relevant to specific defect detection tasks. In the training process of the target network in this work, the number of unfrozen layers h is set to 30. After training, the obtained target network is used for defect detection.

C.
Defect Detection and Defect Recognition Process

Welding defect detection requires that we first determine the presence of any defects in the weld seam and then apply pattern recognition to the types of defects. Therefore, two target networks are trained to complete the required detection tasks. The specific detection scheme is shown in Fig. 4. A 71 × 71 pixel window is moved horizontally and vertically over the X-ray images of the pipeline with a step size of 3 pixels. Each image block is extracted and input to the first target network, which judges whether the block contains a defect. Every pixel of an image block judged to be part of a defect is marked as "1", and every pixel judged to be non-defective is marked as "0". As the window moves over the image, each pixel is marked multiple times, and the greater the accumulated value, the more likely the pixel is to represent a defect. When the window movement is complete, the defect is localized by applying a threshold to the accumulated marks, and the defect image is then input to the second target network to identify the specific defect type. To summarize, the overall steps of the proposed defect detection method are presented in Table II.

Fig. 4. Defect detection process.

TABLE II ALGORITHM 2

IV. EXPERIMENTS

This section verifies the effectiveness and superiority of the proposed defect detection method (CECGAN combined with Xception transfer learning) through experiments. The evaluations are divided into two parts: verification of the CECGAN resampling method and verification of the Xception transfer learning method.
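The sliding-window voting of the detection scheme in Section III-C can be sketched as follows. The trained first target network is replaced here by a hypothetical stand-in classifier, and the test image, window placement, and threshold are illustrative only.

```python
import numpy as np

def vote_map(image, classify, win=71, stride=3):
    """Slide a win x win window with the given stride; every pixel of a
    window judged defective receives one vote, and votes accumulate."""
    votes = np.zeros(image.shape, dtype=int)
    h, w = image.shape
    for r in range(0, h - win + 1, stride):
        for c in range(0, w - win + 1, stride):
            if classify(image[r:r + win, c:c + win]):  # 1st target network
                votes[r:r + win, c:c + win] += 1
    return votes  # threshold this map to localize defects

# Toy stand-in classifier: "defective" if the window's mean intensity is high.
img = np.zeros((100, 100))
img[40:60, 40:60] = 1.0  # a bright square plays the role of a defect
votes = vote_map(img, lambda w: w.mean() > 0.05)
print(votes.max(), votes[50, 50], votes[0, 0])
```

Pixels near the "defect" are covered by many positive windows and collect many votes, while corner pixels are covered by only one window, which is exactly the accumulation-then-threshold localization the paper describes.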
All the experiments were conducted on Linux Ubuntu OS, on a computational platform with a 3.20 GHz CPU and a 6 GB GPU. The programs were developed in the integrated development environment Spyder 3.1.4 using Python 3.6.1 and the open-source deep learning framework TensorFlow 2.0.

A. Parameter Setting and Evaluation Criterion

For the verification of CECGAN resampling, we use CECGAN resampling, simple random resampling, and data augmentation resampling in comparative experiments. The most commonly used image data resampling method is simple random resampling: based on the original imbalanced dataset, this method randomly selects samples from a minority class for replication until the numbers of defective and non-defective data are equal. The balanced dataset generated by this method is denoted Data_copy_bal. We also realize data augmentation resampling by flipping the image data and adjusting its contrast. Flipping means randomly taking the horizontal or vertical mirror image of an image, and adjusting the contrast means doubling the contrast of an image. The two augmentation operations are alternated until the numbers of defective and non-defective data are equal. Such operations do not change the geometry and shape features of the defect images. Compared with simple random resampling, the samples obtained by this approach are different, and a balanced dataset is produced, which is recorded as Data_aug_bal. Meanwhile, CECGAN is used for data resampling, with the sine contrast enhancement method used in the experiment [34]; this balanced dataset is denoted Data_CECGAN_bal. Thus, three balanced X-ray welding defect datasets, Data_copy_bal, Data_aug_bal, and Data_CECGAN_bal, are obtained, each with the same number of samples, 31,440, comprising 15,720 defect and 15,720 non-defect samples.
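The two augmentation operations used to build Data_aug_bal can be sketched in NumPy. Note that "doubling the contrast" is interpreted here as stretching intensities about the image mean, which is one plausible reading rather than necessarily the authors' exact operation.

```python
import numpy as np

rng = np.random.default_rng(0)

def flip(img):
    """Randomly mirror the patch horizontally or vertically."""
    return img[:, ::-1] if rng.random() < 0.5 else img[::-1, :]

def double_contrast(img):
    """Assumed reading of 'doubling the contrast': stretch intensities
    about the patch mean by a factor of 2, clipped back to [0, 1]."""
    mean = img.mean()
    return np.clip(mean + 2.0 * (img - mean), 0.0, 1.0)

patch = rng.random((71, 71))  # stand-in for a 71 x 71 radiographic patch
augmented = [flip(patch), double_contrast(patch)]
print([a.shape for a in augmented])  # patch geometry is unchanged
```

Both operations preserve the 71 × 71 geometry and the defect's shape features, which is the property the paper relies on when alternating them to balance the dataset.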
With these three balanced datasets and the original dataset Data_original, defect detection experiments were carried out on each dataset using the proposed Xception transfer learning method. In actual industrial production, the F1 value is the most important indicator for weld defect detection. Therefore, the F1 value is considered one of the evaluation criteria for the detection results. The larger the F1 value, the more defects are correctly detected by the method, indicating that the corresponding dataset is of higher quality and the data resampling method is more effective. Meanwhile, it is meaningful to carry out the subsequent defect type identification only when the F1 value is high; otherwise, even if all of the defects detected in the weld seam can be accurately typed, the problem of missed detection likely remains. The formulas to calculate the F1 value are given in (3)-(5):

recall = TP/(TP + FN), (3)
precision = TP/(TP + FP), (4)
F1 = 2 · recall · precision/(recall + precision). (5)

In the above three equations, TP represents the number of correctly identified defect samples, FN represents the number of defect samples incorrectly identified as non-defects, and FP represents the number of non-defect samples incorrectly identified as defects. It is seen from the formulas that the F1 value is the harmonic mean of recall and precision. If the F1 value is high, the classification based on the corresponding dataset is ideal, and the defect detection ability of the approach is good. For the validation of the Xception transfer learning method, we verified its feature extraction capability, because in the defect detection task, in addition to assessing whether a defect can be detected correctly, the subsequent identification of the detected defect type is also required. The Xception
model is thus a deep image feature extractor after training completion. In this work, the deep stacked autoencoder (DSAE) and DCNN are also used to extract the deep features of the defect images. The extracted deep features are correspondingly recorded as DF_Xception, DF_SAE, and DF_CNN, and all these features are input to a softmax classifier. Then, the performance of each feature extractor is evaluated by the final defect recognition accuracy, denoted as ACC, which is the ratio of accurately classified samples to the total test samples. The structures and parameter settings of the DSAE, the DCNN, and the new softmax classifier in Xception transfer learning are shown in Table III; these network architectures were selected using cross-validation. In addition, the loss functions of the DSAE and DCNN models are all of the sparse categorical cross-entropy type. The Adam algorithm is used for optimization, and its parameters remain at the default settings.

TABLE III PARAMETER SETTINGS FOR THE DSAE AND DCNN MODELS

B. Results and Discussion

1) Resampling Results Using CECGAN: The CECGAN designed in this work is used to resample the X-ray defect images, and the change in loss value during the training of the CECGAN model is shown in Fig. 5. The loss value of the discriminator is the sum of the loss values obtained when the real and generated images are input to the discriminator for training. The yellow and green curves represent the loss values obtained in each epoch when the real and generated images, respectively, are input to the discriminator. The change in the total loss value of the discriminator is shown by the blue curve, and the change in the generator's loss value is shown by the red curve.

Fig. 5. Loss function curves during CECGAN training.

Fig. 6. Defect images generated by CGAN and CECGAN.
By observing the trends of these curves, it can be concluded that the discriminator and generator remained in adversarial states during training, and the curves evolved from violent oscillation to stability after 25 epochs, which indicates that the designed CECGAN was trained effectively. Finally, the training process is considered complete after reaching the scheduled number of training epochs. The trained CECGAN is used to resample the X-ray defect dataset Data_original, and a CGAN is also trained to resample Data_original in the same manner. The resampling results of the two models are shown in Fig. 6 (for the five types of defect samples, one example of each type is randomly selected for display). It can be seen that the image quality generated by the CGAN is not ideal, while the images generated by the CECGAN maintain the shapes and geometric features of each type of defect. This indicates that the sine contrast enhancement operation helps the model learn the defect features more accurately and eliminates the limitation that low-contrast image features are usually not obvious. Based on the CECGAN resampling results, 500 generated images are selected and mixed with 500 real defect images, and two defect inspectors are organized to carry out a Turing test. The first inspector has ten years' working experience, and the second has three years'. The judgment criterion is that an image that cannot clearly represent the defect features is assigned a score of 1, while an image that represents the defect features well is assigned a score of 5. For the real images, the first and second inspectors score 4.98 and 4.95, respectively. For the generated images, the first inspector scores 4.80 and the second scores 4.75.
The results show that although the average score of the generated images is slightly lower than that of the real images, the generated images still achieve a high score, confirming the effectiveness of CECGAN resampling.
2) Defect Detection Results and Discussion: The defect detection models are established on the basis of four different datasets, and the evaluation results of their detection abilities are shown in Table IV.
TABLE IV DEFECT DETECTION RESULTS BASED ON DIFFERENT DATASETS
The models constructed from each dataset are evaluated by a five-fold cross-validation procedure, where AVG represents the arithmetic average of the five F1 values obtained over the five folds. It can be seen that the average F1 value of defect detection based on the original dataset Data_original is 0.812, while the F1 value based on Data_copy_bal is almost identical, which indicates that resampling by simple random replication cannot solve the data imbalance problem in this application. The average F1 value based on Data_aug_bal increases to 0.825, an increase on the order of 10^−2, which is a small increment. This indicates that although operations such as flipping and shifting increase the diversity of the samples, they do not add new information and are therefore not enough to solve the problem of data imbalance. In fact, these data augmentation operations do not change the pixel distributions of the defect images, so the information in the receptive field does not change much and the improvement in detection ability is limited. Defect detection based on Data_CECGAN_bal achieves the highest F1 value of 0.909, which is greatly improved compared with the F1 value of Data_original.
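The AVG column of Table IV can be reproduced mechanically; a short numpy sketch (the per-fold counts below are illustrative, not the paper's actual numbers) shows how a per-fold F1 is computed and then arithmetically averaged over the five folds:

```python
import numpy as np

def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def average_f1(fold_counts):
    """AVG as in Table IV: arithmetic mean of the per-fold F1 values."""
    return float(np.mean([f1_score(*c) for c in fold_counts]))

# Illustrative per-fold (tp, fp, fn) counts for five-fold cross-validation.
folds = [(90, 8, 10), (88, 9, 12), (91, 10, 9), (89, 7, 11), (92, 8, 8)]
avg = average_f1(folds)
```

Equivalently, F1 = 2·tp / (2·tp + fp + fn), so the first fold above gives 180/198 ≈ 0.909.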
In addition, the F1 value in each fold of Data_CECGAN_bal is higher than the corresponding F1 values of the other three datasets, which indicates that the CECGAN resampling method achieves the best defect detection; that is, the balanced dataset obtained by this method enables better differentiation between defects and non-defects. The greatest difference between CECGAN and the other resampling methods is that CECGAN resamples within the global distribution of the defect data. In this manner, the imbalanced data distribution can be balanced, and the real distribution of the data can be learned to generate representative new samples, solving the problem of data imbalance and further improving defect detection. Based on the above experimental results, the validity and superiority of the CECGAN as a resampling method for X-ray defect images are demonstrated.
3) Defect Recognition Results and Discussion: First, the DSAE, DCNN, and Xception transfer learning models are trained according to the network structures and parameter settings shown in Table III. Then, a total of 500 images containing the five types of defects are randomly selected from Data_CECGAN_bal as inputs to the three models (100 images for each type of defect), and the separability of the original data and of the three different defect features, DF_Xception, DF_SAE, and DF_CNN, is shown in two-dimensional space using the t-distributed stochastic neighbor embedding (t-SNE) method [35]. t-SNE is a manifold learning dimension reduction algorithm based on probability modeling, which can effectively visualize high-dimensional data. The dimension reduction results are shown in Fig. 7. Observing the results for the 500 defect images in Fig. 7(a), it can be seen that the 500 defects overlap in two-dimensional space and are difficult to distinguish, which reflects the difficulty of the defect detection task.
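The t-SNE projection used for Fig. 7 can be sketched with scikit-learn; the snippet below uses two synthetic Gaussian clusters as a stand-in for two defect classes (illustrative data only, not the paper's DF_Xception features, and the parameter values are assumptions rather than those used in the paper):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for extracted deep features: 100 samples from two separated
# 64-D Gaussian clusters playing the role of two defect classes.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 64)),
    rng.normal(5.0, 1.0, size=(50, 64)),
])
labels = np.array([0] * 50 + [1] * 50)

# Project the 64-D features to 2-D for visualization, as done for
# DF_SAE, DF_CNN, and DF_Xception in Fig. 7; perplexity must be
# smaller than the number of samples.
embedding = TSNE(n_components=2, perplexity=10, random_state=0,
                 init="pca").fit_transform(features)
```

A scatter plot of `embedding` colored by `labels` would then reproduce the kind of separability comparison shown in Fig. 7(b)-(d).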
This difficulty is determined by the welding process and by the characteristics of the various defects themselves. For example, the shapes of PO and SI defects are usually round, while those of CR, LF, and LP are linear to a certain extent. Different types of defects thus have similar shape characteristics, which makes it difficult to accurately identify the defect type. Fig. 7(b) shows the dimension reduction results for the feature DF_SAE. It is seen that the 500 defects are homogeneously mixed, indicating that their separability is poor. Fig. 7(c) shows the dimension reduction results for the DF_CNN features, whose separability is greatly improved compared with that of DF_SAE: only the LP and CR defects show relatively poor separability, whereas the other three types of defects are separated more accurately. From the dimension reduction results shown in Fig. 7(d), it can be concluded that the feature DF_Xception has the strongest separability among the five types of defects, which indicates that the residual structure and improved depthwise separable convolution of the Xception model can effectively extract the deep characteristics of the welding defects. Moreover, compared with the DSAE and DCNN models, the Xception model has stronger feature extraction capability, and the extracted features better represent the various defects. However, some LF and CR defects still overlap in the two-dimensional space because these defects are very similar in shape. This causes the Xception network to extract similar defect features, making it difficult to effectively distinguish between LF and CR defects. The similarity in defect features is caused by the formation mechanisms of the defects. Therefore, the accurate recognition of defects with similar shape characteristics needs to be studied further in the future.
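The efficiency of the depthwise separable convolution that Xception is built on [27] can be made concrete by counting parameters; a small sketch (ignoring biases) compares a standard k x k convolution with its depthwise + pointwise factorization:

```python
def depthwise_separable_params(k, c_in, c_out):
    """Parameter counts (biases ignored) for a standard k x k convolution
    versus a depthwise separable one: a k x k depthwise convolution per
    input channel followed by a 1 x 1 pointwise convolution, the building
    block of the Xception architecture."""
    standard = k * k * c_in * c_out          # one k x k filter per (in, out) pair
    separable = k * k * c_in + c_in * c_out  # depthwise filters + pointwise mixing
    return standard, separable

# Example layer widths (illustrative, not Xception's actual configuration).
std, sep = depthwise_separable_params(3, 128, 256)
```

For this example the standard convolution needs 294,912 parameters versus 33,920 for the separable version, roughly an 8.7x reduction at the same spatial extent and channel widths.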
The above t-SNE analyses show that applying Xception transfer learning to the feature extraction of welding defects is effective. To further verify this superiority, we first apply the CECGAN method to generate a balanced dataset Data_CECGAN_recognition for defect recognition. Data_CECGAN_recognition contains a total of 12,500 X-ray welding defect images (five types of defects, with 2,500 images per type), of which 10,000 are used as the training dataset and the remaining 2,500 as the test dataset. Then, the DSAE, DCNN, and Xception models are each combined with a softmax classifier to establish a defect detection model, and the test dataset is input for testing. The statistical results for ACC are shown in Table V. As presented in Table V, the Xception transfer learning model achieves the highest defect detection accuracy of 92.5%, followed by the DCNN (82.3%) and the DSAE (61.0%). The accuracy of 92.5% is higher than the accuracy of manual detection and meets the generally accepted accuracy requirements [36].
TABLE V DEFECT RECOGNITION RESULTS BASED ON DIFFERENT MODELS
TABLE VI CONFUSION MATRIX OF FIVE-CLASS CLASSIFICATION FOR DEFECTS
This indicates that the Xception transfer learning model benefits from effective training and an advanced network structure to provide better defect detection results than the DSAE and DCNN; that is, more defects are correctly classified. Therefore, the effectiveness and superiority of the proposed defect detection method compared to other methods are proved. Next, the detection results are compared with the results presented in [15] and [16].
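The ACC of Table V and the per-class recognition rates of Table VI follow directly from a confusion matrix; a short numpy sketch (the 5-class counts below are made up for illustration, not the paper's actual Table VI) shows both computations:

```python
import numpy as np

def accuracy(cm):
    """ACC: correctly classified samples (diagonal) over all test samples."""
    cm = np.asarray(cm, dtype=float)
    return float(np.trace(cm) / cm.sum())

def per_class_recognition(cm):
    """Per-class recognition rate, as reported in a confusion matrix such as
    Table VI: each diagonal entry divided by its row sum."""
    cm = np.asarray(cm, dtype=float)
    return np.diag(cm) / cm.sum(axis=1)

# Illustrative 5-class confusion matrix (rows/columns: CR, LF, LP, PO, SI),
# 500 test samples per class; the counts are hypothetical.
cm = [[460, 20, 10,   5,   5],
      [ 25, 455, 10,  5,   5],
      [ 10, 15, 465,  5,   5],
      [  5,  5,  5, 470,  15],
      [  5,  5,  5,  12, 473]]
```

For this hypothetical matrix the overall accuracy is 2323/2500 ≈ 0.929, and every per-class rate exceeds 0.9, mirroring the structure of the paper's reported results.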
In [15], the optimal combination of preprocessing and classifier achieved an F1 value of 0.833 and a defect recognition accuracy of 94.5%. This F1 value is much lower than that achieved by our approach, while the recognition accuracy is slightly higher than ours. However, it should be noted that the dataset used in [15] contains only 147 samples; such a small dataset may lead to over-fitting in the test results. By contrast, our dataset contains 12,500 samples, and the test results are considered to have good generalization ability. In [16], the authors did not report the F1 value of the detection model. The optimal defect recognition accuracy was as high as 97.2%, which was attributed to the excellent feature extraction ability of the designed deep learning model. However, the dataset used in that study differs from ours. In fact, scholars in different studies have used different datasets, so the reported testing accuracies are, to a certain extent, not comparable [37]. In view of this situation, it is necessary to build and publish a representative public welding defect dataset to facilitate scholarly research. In addition, we show the confusion matrix of the defect recognition results (obtained by the combination of CECGAN and Xception transfer learning) in Table VI. It can be seen that the recognition rates of the model for all five types of defects exceed 90%. This result shows that the Xception model, based on the depthwise separable convolution operation, can directly and effectively extract the pixel-level intrinsic characteristics of the defects in an image and thereby accurately identify all defect types. The highest recognition accuracy is 94.6%, for SI, and the lowest is 91.0%, for LF. The reason is that the shape of LF defects is more diverse than that of the other defects: the aspect ratio is not fixed and the blackness is not uniform, which poses challenges for recognition. Although the boundary of SI defects is irregular, their shape is mostly fixed (spherical), which makes them relatively easy to identify and yields a higher recognition rate.
Fig. 7. t-SNE map of different features. (a) t-SNE map of the defect images in Data_CECGAN_bal; (b) t-SNE map of DF_SAE; (c) t-SNE map of DF_CNN; (d) t-SNE map of DF_Xception.
Based on the above experimental results and discussion, we conclude that the proposed defect detection method (CECGAN combined with Xception transfer learning) solves the data imbalance problem well and realizes more accurate detection of welding defects. The superiority of the detection results is reflected in two aspects. First, the method can accurately distinguish whether an X-ray image contains any welding defects; in the experiments, our method achieves the highest F1-score of 0.909. Second, when a defect is detected, our method can recognize the specific defect type with the highest accuracy, 92.5%, among the compared methods.
V. CONCLUSION
This paper proposes a new GAN model named CECGAN, applicable to the defect detection field, to solve the problem of data imbalance. The main conclusions are as follows:
1) Experimental results demonstrate the effectiveness and superiority of the CECGAN method compared to conventional resampling methods. The CECGAN method solves the data imbalance problem well, while also mitigating the impact of low-contrast defect images on detection.
2) The Xception model is used as the base learner to build the target network for defect detection. The experimental results confirm successful construction of a DNN with stronger feature extraction capabilities than the typically used deep learning models, which further improves the accuracy of defect detection and proves that the transfer
learning technology has good applicability to industrial images.
3) The experiments in this study are all based on real industrial cases of welding defect detection. We believe that the defect detection method proposed in this paper can be readily generalized to other sensor data-based detection scenarios.
4) This paper focuses on defect recognition within defect detection. Next, by considering the mechanisms of the various defects, a hybrid (mechanism- and data-driven) detection model with stronger detection ability could be constructed. In addition, the CECGAN algorithm can be studied further.
ACKNOWLEDGMENT
Runyuan Guo thanks the Chinese National Engineering Research Center for Petroleum and Natural Gas Tubular Goods for their assistance with the defect detection experiments.
REFERENCES
[1] Y. Zhang, X. Gao, D. You, and N. Zhang, "Data-driven detection of laser welding defects based on real-time spectrometer signals," IEEE Sensors J., vol. 19, no. 20, pp. 9364–9373, Oct. 2019.
[2] Y. Zou, D. Du, B. Chang, L. Ji, and J. Pan, "Automatic weld defect detection method based on Kalman filtering for real-time radiographic inspection of spiral pipe," NDT E Int., vol. 72, pp. 1–9, Jun. 2015.
[3] J. Sun, C. Li, X.-J. Wu, V. Palade, and W. Fang, "An effective method of weld defect detection and classification based on machine vision," IEEE Trans. Ind. Informat., vol. 15, no. 12, pp. 6322–6333, Dec. 2019.
[4] F. Duan, S. Yin, P. Song, W. Zhang, C. Zhu, and H. Yokoi, "Automatic welding defect detection of X-ray images by using cascade Adaboost with penalty term," IEEE Access, vol. 7, pp. 125929–125938, 2019.
[5] W. Hou, D. Zhang, Y. Wei, J. Guo, and X. Zhang, "Review on computer aided weld defect detection from radiography images," Appl. Sci., vol. 10, no. 5, p. 1878, Mar. 2020.
[6] Y. Wu et al., "Weld crack detection based on region electromagnetic sensing thermography," IEEE Sensors J., vol. 19, no. 2, pp. 751–762, Jan. 2019.
[7] Z. Long, X. Zhou, X. Zhang, R. Wang, and X. Wu, "Recognition and classification of wire bonding joint via image feature and SVM model," IEEE Trans. Compon., Packag., Manuf. Technol., vol. 9, no. 5, pp. 998–1006, May 2019.
[8] X. Yuan, C. Ou, Y. Wang, C. Yang, and W. Gui, "A novel semisupervised pre-training strategy for deep networks and its application for quality variable prediction in industrial processes," Chem. Eng. Sci., vol. 217, May 2020, Art. no. 115509.
[9] Q. Sun and Z. Ge, "A survey on deep learning for data-driven soft sensors," IEEE Trans. Ind. Informat., early access, Jan. 20, 2021, doi: 10.1109/TII.2021.3053128.
[10] Y. Liu and M. Xie, "Rebooting data-driven soft-sensors in process industries: A review of kernel methods," J. Process Control, vol. 89, pp. 58–73, May 2020.
[11] X. Yuan, Y. Wang, C. Yang, and W. Gui, "Stacked isomorphic autoencoder based soft analyzer and its application to sulfur recovery unit," Inf. Sci., vol. 534, pp. 72–84, Sep. 2020.
[12] N. Boaretto and T. M. Centeno, "Automated detection of welding defects in pipelines from radiographic images DWDI," NDT E Int., vol. 86, pp. 7–13, Mar. 2017.
[13] H. Liu and R. Guo, "Detection and identification of SWAH pipe weld defects based on X-ray image and CNN," Chin. J. Sci. Instrum., vol. 39, no. 4, pp. 247–256, 2018.
[14] X. Yuan, S. Qi, Y. Wang, and H. Xia, "A dynamic CNN for nonlinear dynamic feature learning in soft sensor modeling of industrial process data," Control Eng. Pract., vol. 104, Nov. 2020, Art. no. 104614.
[15] R. M. Palhares, Y. Yuan, and Q. Wang, "Artificial intelligence in industrial systems," IEEE Trans. Ind. Electron., vol. 66, no. 12, pp. 9636–9640, Dec. 2019.
[16] E. Zhang, B. Li, P. Li, and Y. Chen, "A deep learning based printing defect classification method with imbalanced samples," Symmetry, vol. 11, no. 12, p. 1440, Nov. 2019.
[17] M. El-Banna, "A novel approach for classifying imbalance welding data: Mahalanobis genetic algorithm (MGA)," Int. J. Adv. Manuf. Technol., vol. 77, nos. 1–4, pp. 407–425, Mar. 2015.
[18] S. H. Khan, M. Hayat, M. Bennamoun, F. A. Sohel, and R. Togneri, "Cost-sensitive learning of deep feature representations from imbalanced data," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 8, pp. 3573–3587, Aug. 2018.
[19] L. Abdi and S. Hashemi, "To combat multi-class imbalanced problems by means of over-sampling techniques," IEEE Trans. Knowl. Data Eng., vol. 28, no. 1, pp. 238–251, Jan. 2016.
[20] T. W. Liao, "Classification of weld flaws with imbalanced class data," Expert Syst. Appl., vol. 35, no. 3, pp. 1041–1052, Oct. 2008.
[21] W. Hou, Y. Wei, Y. Jin, and C. Zhu, "Deep features based on a DCNN model for classifying imbalanced weld flaw types," Measurement, vol. 131, pp. 482–489, Jan. 2019.
[22] T. Hu, Q. Guo, H. Sun, T.-E. Huang, and J. Lan, "Nontechnical losses detection through coordinated BiWGAN and SVDD," IEEE Trans. Neural Netw. Learn. Syst., early access, Jun. 4, 2020, doi: 10.1109/TNNLS.2020.2994116.
[23] X. Wang and H. Liu, "Data supplement for a soft sensor using a new generative model based on a variational autoencoder and Wasserstein GAN," J. Process Control, vol. 85, pp. 91–99, Jan. 2020.
[24] L. Shao, F. Zhu, and X. Li, "Transfer learning for visual categorization: A survey," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 5, pp. 1019–1034, May 2015.
[25] Y. Zhang, X. Gao, L. He, W. Lu, and R. He, "Objective video quality assessment combining transfer learning with CNN," IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 8, pp. 2716–2730, Aug. 2020.
[26] L. Liu, H. Zhang, X. Xu, Z. Zhang, and S. Yan, "Collocating clothes with generative adversarial networks cosupervised by categories and attributes: A multidiscriminator framework," IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 9, pp. 3540–3554, Sep. 2020.
[27] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 1251–1258.
[28] R. R. D. Silva, M. H. S. Siqueira, M. P. V. D. Souza, J. M. A. Rebello, and L. P. Calôba, "Estimated accuracy of classification of defects detected in welded joints by radiographic tests," NDT E Int., vol. 38, no. 5, pp. 335–343, Jul. 2005.
[29] R. R. D. Silva and D. Mery, "The state of the art of weld seam radiographic testing: Part 1, image processing," Mater. Eval., vol. 65, no. 6, pp. 643–647, 2007.
[30] G. Douzas and F. Bacao, "Effective data generation for imbalanced learning using conditional generative adversarial networks," Expert Syst. Appl., vol. 91, pp. 464–471, Jan. 2018.
[31] I. Goodfellow et al., "Generative adversarial nets," in Proc. Adv. Neural Inf. Process. Syst., 2014, pp. 2672–2680.
[32] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," 2015, arXiv:1502.03167. [Online]. Available: http://arxiv.org/abs/1502.03167
[33] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," 2014, arXiv:1412.6980. [Online]. Available: http://arxiv.org/abs/1412.6980
[34] C. Gong, C. Luo, and D. Yang, "Improved image enhancement algorithm based on sine gray level transformation," Video Eng., vol. 36, no. 13, p. 64, 2012.
[35] L. van der Maaten and G. Hinton, "Visualizing data using t-SNE," J. Mach. Learn. Res., vol. 9, pp. 2579–2605, Nov. 2008.
[36] F. Fucsok, C. Müller, and M. Scharmach, "Reliability of routine radiostudy of the human factor," in Proc. 8th Eur. Conf. Nondestruct. Test., Barcelona, Spain, 2002, pp. 17–21.
[37] Y. Han, J. Fan, and X. Yang, "A structured light vision sensor for on-line weld bead measurement and weld quality inspection," Int. J. Adv. Manuf. Technol., vol. 106, nos. 5–6, pp. 2065–2078, Jan. 2020.