RESEARCH ARTICLE | JUNE 07 2018

Deep Learning Based State Recognition of Substation Switches

Jin Wang

AIP Conf. Proc. 1971, 040041 (2018)
https://doi.org/10.1063/1.5041183
01 November 2023 08:27:55

State Grid Hubei Electric Power Research Institute, Wuhan Hubei 430077 China
wangjin@163.com

Key words: Substation, switches, state recognition, deep learning.

INTRODUCTION

In a substation, high-current circuits are switched on and off under isolator control, so during a power failure or routine inspection operators need to know whether each isolator is closed or open. At present, the status of an isolator is judged by sending staff to observe it on site. However, substations may be far from where staff are stationed, so manual on-site observation is time-consuming and inconvenient. Substations are also widely distributed, so checking the isolator states of several substations at the same time requires a great deal of manpower.

With the continuing construction and upgrading of substation automation, most power grid enterprises have already realized telemetry, remote signaling, remote control and remote adjustment of remote substations, namely the "four remote" functions. To improve labor productivity, increase economic benefits and realize unmanned substations, enterprises now use video image monitoring systems to supplement the "four remote" functions; the industry calls this "the fifth remote": remote viewing. Although some substations have installed video image monitoring systems [1], these systems only display images and perform no image recognition. Specialized staff must still observe and analyze the collected images continuously.
This approach neither reduces labor costs nor detects abnormal conditions and raises warnings in a timely manner.

SUBSTATION SWITCH STATE AUTOMATIC IDENTIFICATION RESEARCH STATUS QUO

At present, in addition to manual monitoring, researchers have proposed collecting information with the SCADA system and combining it with the operating rules of the power system and the logical relationship between telemetry and remote signaling, so that manually developed rules can identify the open or closed state of the isolator automatically [2]. However, the information collected by the SCADA system has some errors such as incomplete data and

Materials Science, Energy Technology and Power Engineering II (MEP2018)
AIP Conf. Proc. 1971, 040041-1–040041-8; https://doi.org/10.1063/1.5041183
Published by AIP Publishing. 978-0-7354-1678-9/$30.00

Abstract. Unlike traditional methods, which recognize the state of substation switches from the operating rules of the electrical power system, this work proposes a novel convolutional neural network-based approach to state recognition of substation switches. Inspired by the theory of transfer learning, we first establish a convolutional neural network model trained on the large-scale image set ILSVRC2012; a restricted Boltzmann machine then replaces the fully connected layer of the convolutional neural network and is trained on our small image dataset of 110kV substation switches to obtain a stronger model. Experiments conducted on our image dataset of 110kV substation switches show that the proposed approach is applicable to substations, where it can reduce running costs and enable truly unattended operation.

DEEP LEARNING MODEL STRUCTURE

Deep learning is a learning method that establishes a hierarchical model structure similar to the human brain: features are extracted from the input data layer by layer, from the bottom to the top, which establishes a mapping from low-level signals to high-level semantics.
[10] These fields are research hot spots in intelligent machine vision for both scholars and industry. Its spatial structure and algorithm are very similar to the neural network model of the animal visual perception system. It does not need to pre-process or reconstruct the original image data, and it avoids the drawback of traditional recognition methods, which require features to be extracted from the image data manually. In essence, deep learning improves the accuracy of classification or prediction by building neural network models with many hidden layers and training them on large amounts of data so that they learn more useful features.

Deep learning models mainly include Deep Belief Networks (DBN) and Convolutional Neural Networks (CNN). A DBN is an unsupervised learning model and does not consider the two-dimensional structure of the image. A CNN is a supervised multi-layer perceptron model designed to recognize two-dimensional shapes; it needs neither image features extracted in advance nor heavy pre-processing of the original data, and it automatically learns a highly nonlinear mapping between inputs and outputs. Therefore, this paper uses a CNN to model well-labeled images of substation switches.

A CNN is a special deep neural network model inspired by the mechanism of the biological receptive field. In the visual nervous system, the receptive field of a neuron is a specific area on the retina, and only stimuli in this area can activate the neuron. The particularity of the CNN shows in two aspects: first, its neurons are not fully connected; second, the connection weights between some neurons in the same layer are shared (i.e., identical).
Its non-fully connected, weight-sharing network structure makes it more similar to biological neural networks and gives it the following advantages:

noise, and it is very difficult to establish effective discriminant rules. Reference [3] proposed, in theory, a smart isolator contact state monitoring system based on fiber grating sensing technology, but in practical applications it depends on IoT technology and must account for factors such as the temperature cross-sensitivity of the fiber grating sensor. In recent years, thanks to great progress in image processing and artificial intelligence, researchers have proposed using image processing technology to monitor and recognize the status of power equipment automatically [4], so as to realize truly unattended operation and real-time event reporting for safe production. These technologies provide strong technical support for achieving "controllable, in control, control" in safe production. Many image processing techniques and models, such as connected-region analysis [5], Hough line detection [6], template matching [7] and boundary closure [8], have been applied to the detection and identification of various electrical equipment, fault types and isolator states, with good results. Recognition methods based on traditional image processing work well in certain application scenarios, but their robustness and generalization ability need further improvement.

If substation switch status recognition is regarded as a classification problem in machine learning, classification models can be used to further improve the robustness and generalization ability of detection methods. Classical machine learning models for image classification include K-Nearest Neighbor, Support Vector Machines, Decision Trees, Neural Networks and so on.
There is not much research on machine learning-based image analysis for substation monitoring. Since 2006, a topic called "deep learning" in the field of machine learning has attracted the attention of academics. Its application in many research fields has achieved great success [9], especially in image recognition. For the safety of power transmission lines, Reference [10] proposed applying a deep learning model to UAV aerial images of transmission lines to detect insulator status automatically. However, because of the complex background and limited resolution of aerial images, further research is needed to reach a practical level. Inspired by this, this paper applies deep learning to recognizing the open and closed states of substation switches from images. We think this method is feasible because: 1) many researchers have shown that deep learning can be used in other image recognition fields; 2) the background of substation monitoring images is relatively simple, and image resolution can be guaranteed; 3) deep learning-based state recognition of substation switch images is divided into a training stage and a recognition stage, and although model training is time-consuming, it is done offline, so the state recognition process can still run efficiently and in real time.

(1) Feature extraction. Each neuron receives its synaptic input from a local receptive field, which forces it to extract local features. Once a feature is extracted, its exact position becomes less important as long as its position relative to the other features is approximately preserved.

(2) Feature mapping. Each computing layer of the network is composed of multiple feature maps, each of which is in the form of a plane.
The neurons in a plane are constrained to share the same set of synaptic weights, which gives translational invariance, reduces the number of free parameters and reduces the complexity of the network model.

(3) Sub-sampling. Each convolutional layer is followed by a computational layer that performs local averaging and subsampling, which reduces the resolution of the feature maps. This operation reduces the sensitivity of the feature map output to translation and other forms of deformation.

ISOLATOR IMAGE STATE RECOGNITION BASED ON CONVOLUTIONAL NEURAL NETWORK

According to the above analysis, this paper uses a convolutional neural network to identify the open and closed states in substation switch images. First, the layered CNN model is constructed and the network parameters are initialized. The weights are then updated iteratively by the BP algorithm with mini-batch gradient descent to realize image recognition of the isolator.

CNN construction

The structure of the convolutional neural network used in this paper is shown in Fig.1.

[Figure 1 layer sequence: Input 28*28; C1 8@24*24 (Conv 5*5, stride 1); P2 8@12*12 (Pooling 2*2, stride 2); C3 16@8*8 (Conv 5*5, stride 1); P4 16@4*4 (Pooling 2*2, stride 2); C5 100@1*1 (Conv 4*4, stride 1); F6 64@1*1 (FC); Output (Logistic, FC)]

FIGURE 1. CNN structure diagram used in this paper

Not counting the input layer, the network has seven layers, structured as follows:

Input layer: The input is a 28 × 28 substation switch image.

C1 layer: convolution layer. The input layer is convolved with eight convolution kernels of size 5 * 5 and stride 1, resulting in eight 24 * 24 feature maps. The C1 layer has 8 filters and a total of 8 * 24 * 24 * (5 * 5 + 1) = 119808 connections with the input layer. Since the weights and offsets of each convolution kernel are shared, the number of parameters to be trained in the C1 layer is only 8 * (5 * 5 + 1) = 208.
P2 layer: pooling layer. This layer downsamples the C1 feature maps to reduce their number of neurons, which improves processing efficiency and helps avoid over-fitting. P2 takes the maximum over 2 * 2 neighborhoods and produces eight 12 * 12 feature maps. Downsampling improves the spatial invariance of the entire CNN.

C3 layer: convolution layer. In this layer, 16 convolution kernels of size 5 * 5 and stride 1 convolve the eight 12 * 12 feature maps of the P2 layer to obtain 16 feature maps of size 8 * 8. Since the P2 layer has multiple feature maps, an incomplete mapping mechanism implements the feature mapping from layer P2 to layer C3. Specifically, the first six feature maps of C3 each take a subset of 4 adjacent feature maps in P2 as input, the next six each take a subset of 6 adjacent feature maps, the next three each take a non-contiguous subset of 6 feature maps, and the last one takes all feature maps in P2. The connections between the C3 and P2 layers are shown in Table 1. Thus, the C3 layer has a total of 6 * 4 + 6 * 6 + 3 * 6 + 1 * 8 = 86 filters, and C3 and P2 have (86 * (5 * 5) + 16) * (8 * 8) = 138624 connections. Due to the shared weights and biases, the number of parameters to be trained is 16 * (5 * 5 + 1) = 416.

TABLE 1. C3 layer and P2 layer connection table of CNN used in this paper.
"0" means no connection, "1" means there is connection C3 P 2 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 0 0 1 1 0 1 1 1 2 1 1 1 0 0 0 1 1 1 0 0 1 1 0 1 1 3 1 1 1 1 0 0 1 1 1 1 0 0 1 1 0 1 4 0 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 5 0 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 6 0 0 0 1 1 1 0 1 1 1 1 1 1 0 1 1 7 0 0 0 0 1 1 0 0 1 1 1 1 1 1 0 1 (1) Due to the fragility of the ReLU function, which cannot be derived at x = 0, here the deformed Leaky-ReLU function of the ReLU function is used: (2) equal 0.01 here. CNN initialization and training The convolutional neural network is essentially an input-to-output mapping, which is essentially the same as a traditional neural network. It simulates the actual system operation results by learning the mapping between input 040041-4 01 November 2023 08:27:55 P4 layer: for the pool layer. Similar to P2, this layer takes a maximum of 2 * 2 neighborhood samples and downsamples 16 of the C3 features to yield 16 4 * 4 features. C5 layer: convolution layer. This layer uses 100 convolution kernels with the size of 4 * 4 to perform convolution operation on the 16 characteristic maps of the P2 layer to obtain 100 1 * 1 feature graphs. There are 100 * 16 = 1600 filters, there are (4 * 4) * 16 + 1) * 100 = 25700 connections, the number of parameters to be trained is 100 * (4 * 4 + 1) = 1700. F6 layer: for the entire connection layer. This layer, like the traditional neural network, is fully connected with the C5 layer by 64 units and calculates the inner product between the input vector and the weight vector with 100 * 64 + 64 = 6464 training parameters. Finally, the output layer connects to a logistic classifier, which implements the binary classification using the Sigmoid activation function. Among them, all convolution layers use a Rectified Linear Unit (ReLU) as an activation function. 
The ReLU function is more consistent with neuroscience, produces sparse activations, is computationally efficient, alleviates the vanishing gradient problem and accelerates training. It is defined in Equation (1).

and output vector pairs. The training algorithm is basically similar to the traditional BP algorithm: parameters are initialized, activation values and errors are computed by forward propagation, errors are propagated from back to front by backpropagation, and the weight matrices are updated. Forward and backward propagation are iterated to reduce the output error. Before training starts, all weights must be initialized with different small random numbers, which prevents the network from saturating because of overly large weights and ensures that it can learn normally.

Initialization

The parameters of the CNN mainly include the weights and offsets of the connections between the layers. To avoid neuron saturation and speed up convergence, each connection weight is drawn from a Gaussian distribution with mean 0 and standard deviation $1/\sqrt{n_{in}}$, where $n_{in}$ is the number of input weights of the neuron in the layer. Each bias is initialized from a Gaussian distribution with mean 0 and standard deviation 1.

Forward propagation calculation

A feature map of a convolution layer may depend on several feature maps of the previous layer. Let $T_{ij}$ indicate the mapping relationship in the connection table between feature map $i$ of layer $l-1$ and feature map $j$ of convolution layer $l$: if there is a connection then $T_{ij} = 1$, otherwise $T_{ij} = 0$. Feature map $x_j^l$ of convolution layer $l$ is computed as:

$$x_j^l = f\Big(\sum_i T_{ij}\,\big(x_i^{l-1} * k_{ij}^l\big) + b_j^l\Big) \qquad (3)$$

In the formula, $b_j^l$ is the offset of feature map $j$, $k_{ij}^l$ is the weight (convolution kernel) connecting feature map $i$ of layer $l-1$ to feature map $j$ of convolution layer $l$, $*$ denotes convolution, and $f$ is the activation function.
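The forward pass through a partially connected convolution layer can be sketched in NumPy as follows. This is a minimal illustration, not the paper's Caffe implementation: the function names, the toy connection table (3 input maps, 2 output maps) and the array shapes are all hypothetical, the sliding-window product is the cross-correlation form used by most CNN libraries, and scaling the Gaussian weights by 1/sqrt(fan-in) is a common choice that is assumed here.

```python
import numpy as np


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: x for x > 0, alpha * x otherwise."""
    return np.where(x > 0, x, alpha * x)


def conv_valid(image, kernel):
    """Unpadded 2-D sliding-window product (cross-correlation)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for r in range(oh):
        for c in range(ow):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out


def conv_layer_forward(inputs, kernels, biases, table):
    """inputs: (I, H, W); kernels: (I, J, k, k); biases: (J,); table: (I, J) of 0/1."""
    n_in, n_out = table.shape
    k = kernels.shape[-1]
    out_h, out_w = inputs.shape[1] - k + 1, inputs.shape[2] - k + 1
    outputs = np.zeros((n_out, out_h, out_w))
    for j in range(n_out):
        for i in range(n_in):
            if table[i, j]:  # only connected input maps contribute
                outputs[j] += conv_valid(inputs[i], kernels[i, j])
        outputs[j] += biases[j]
    return leaky_relu(outputs)


rng = np.random.default_rng(0)
# Toy connection table for illustration only (this is not Table 1).
table = np.array([[1, 0], [1, 1], [0, 1]])
fan_in = table.sum(axis=0) * 5 * 5  # input weights feeding each output map
# Assumed initialization: zero-mean Gaussian weights scaled by 1/sqrt(fan_in).
kernels = rng.normal(0.0, 1.0, (3, 2, 5, 5)) / np.sqrt(fan_in)[None, :, None, None]
biases = rng.normal(0.0, 1.0, 2)  # zero-mean, unit-variance Gaussian biases
inputs = rng.normal(size=(3, 12, 12))
features = conv_layer_forward(inputs, kernels, biases, table)
print(features.shape)
```

Keeping the connection table as an explicit 0/1 matrix makes it easy to swap in a different connectivity pattern without touching the convolution code.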
The pooling layer downsamples the preceding convolution layer, preserving useful information while reducing the amount of data; the feature maps of the pooling layer correspond one-to-one with the feature maps of the preceding convolution layer. Feature map $x_j^l$ of pooling layer $l$ is computed as:

$$x_j^l = \mathrm{down}\big(x_j^{l-1}\big) \qquad (4)$$

In the formula, $\mathrm{down}(\cdot)$ is the downsampling function, which takes the maximum over each pooling neighborhood of the input feature map.

Reverse error propagation calculation

To alleviate network over-fitting, the regularized cross-entropy loss function is used:

$$C = -\frac{1}{n}\sum_x \big[y \ln a + (1-y)\ln(1-a)\big] + \frac{\lambda}{2n}\sum_w w^2 \qquad (5)$$

The first term is the cross-entropy expression, and the second term is the L2 regularization term. n is the number of training samples, the sum runs over all training inputs x, a is the network output for a training sample, y is the target output of the training sample, and $\lambda$ is the regularization parameter. The stochastic gradient descent algorithm estimates the parameters by minimizing the loss function, with gradients obtained by the chain rule. The update of a parameter from step t to t + 1 is:

$$w_{t+1} = w_t - \eta \frac{\partial C}{\partial w_t} \qquad (6)$$

In the formula, $\eta$ is the learning rate. The BP algorithm propagates the error backwards, and the model weights and convolution kernels are updated in reverse order along Output-F6-C5-P4-C3-P2-C1.

RBM-based transfer training

The system energy of the RBM is:

$$E(v, h \mid \theta) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_i \sum_j v_i W_{ij} h_j \qquad (7)$$

Here, the parameters in $\theta = \{W_{ij}, b_j, a_i\}$ represent the weight of the edge between visible unit $i$ and hidden unit $j$, the offset of the hidden unit and the offset of the visible unit, respectively. From the RBM, the probability of a hidden unit being activated is $P(h_j = 1 \mid v) = \sigma\big(b_j + \sum_i v_i W_{ij}\big)$, and the probability of a visible unit being activated is $P(v_i = 1 \mid h) = \sigma\big(a_i + \sum_j W_{ij} h_j\big)$, where $\sigma$ is the sigmoid function.
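The energy and activation probabilities of a binary RBM can be sketched as below, using the standard textbook formulas. The symbol names (`W` for edge weights, `a` for visible offsets, `b` for hidden offsets) and the layer sizes (100 visible and 64 hidden units, chosen to mirror the C5 and F6 widths) are assumptions made for illustration and are not taken from the paper's implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def energy(v, h, W, a, b):
    """E(v, h) = -a.v - b.h - v^T W h (a: visible offsets, b: hidden offsets)."""
    return -(a @ v) - (b @ h) - (v @ W @ h)


def p_hidden_given_visible(v, W, b):
    """Probability that each hidden unit is activated, given the visible units."""
    return sigmoid(b + v @ W)


def p_visible_given_hidden(h, W, a):
    """Probability that each visible unit is activated, given the hidden units."""
    return sigmoid(a + W @ h)


rng = np.random.default_rng(0)
n_visible, n_hidden = 100, 64  # assumed sizes, for illustration only
W = rng.normal(0.0, 0.01, (n_visible, n_hidden))
a = np.zeros(n_visible)
b = np.zeros(n_hidden)

v = rng.integers(0, 2, n_visible).astype(float)   # a random binary visible vector
p_h = p_hidden_given_visible(v, W, b)
h = (rng.random(n_hidden) < p_h).astype(float)    # sample binary hidden states
p_v = p_visible_given_hidden(h, W, a)
print(p_h.shape, p_v.shape, energy(v, h, W, a, b))
```

Because the graph is bipartite, each conditional factorizes over units, which is why a single matrix product gives all activation probabilities at once.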
RBM parameter estimation is based on maximizing the log-likelihood function:

$$\theta^* = \arg\max_\theta L(\theta) \qquad (8)$$

$$L(\theta) = \sum_{t=1}^{T} \log P\big(v^{(t)} \mid \theta\big) \qquad (9)$$

Here, $L(\theta)$ is the log-likelihood function, $v^{(t)}$ represents the t-th training sample and $T$ is the number of training samples. The training was done by updating parameters using the contrastive divergence fast learning method [11].

As mentioned above, a convolutional neural network has a large number of parameters to estimate, so training requires a large number of labeled samples. Lacking sufficient training samples, this paper uses the idea of transfer learning: the proposed CNN model is first trained on the large ILSVRC2012 dataset, and its parameters are then fine-tuned on the small dataset of substation switch images collected for this paper. However, the content of the two datasets differs, which affects the feature extraction ability of the model. After the CNN is trained on the large dataset, the fully connected F6 layer is removed and replaced with a Restricted Boltzmann Machine (RBM) layer; the parameters of the first five layers of the CNN are fixed, and the RBM layer is trained on the small dataset.

The restricted Boltzmann machine is a kind of generative stochastic neural network proposed by Hinton and Sejnowski in 1986. The network consists of a number of visible units and a number of hidden units. Both the visible and hidden variables are binary, taking states in {0,1}, where 1 indicates that the unit is activated and 0 that it is not. The whole network is a bipartite graph: there are no connections among the visible units or among the hidden units. The RBM is an energy-based model whose system energy is given by Equation (7).

EXPERIMENTAL RESULTS AND ANALYSIS

The experiment collected 1390 images of 110kV substation isolators, of which 730 show the isolator in the closed state and 660 in the disconnected state. To improve training and testing efficiency, we manually cropped the regions that mainly contain the switch from the original images, as shown in Fig.2.
The sample images are resized to 28 * 28, converted to gray scale, mean-subtracted and normalized. The convolutional neural network is implemented in Caffe. The training set consists of a random 80% of the closed-state images and 80% of the open-state images, and the remaining 20% form the test set.

(A) Isolator closed state (B) Isolator open state
FIGURE 2. 110kV substation isolator image examples

Training uses mini-batch stochastic gradient descent with a batch size of 25, 40 training epochs, a learning rate of 0.01 and an attenuation coefficient of 0.9. The initialization of weights and offsets is described in the "Initialization" subsection (Section 4.2.1). Training termination is governed by two conditions: it stops when the training samples reach the specified classification result or the specified number of iterations. To prevent overfitting and improve the generalization ability of the model, Dropout with a rate of 50% is applied when training the fully connected layer of the convolutional neural network.

With 80% of the images randomly selected for training and 20% for testing, the final average recognition accuracy reached 91.3%. This shows that the proposed convolutional neural network-based method can identify the state of substation switches with a certain degree of effectiveness.

CONCLUSION

As many substations gradually deploy video surveillance systems, this paper presents a new idea for discriminating the state of substation isolators. Unlike traditional judgments based on the rules of power system operation, this paper takes the image recognition perspective: with 110kV substation switch images as the research object, it studies how to identify the open and closed states of the switch using image recognition based on a convolutional neural network.
Experiments show that the proposed method is feasible to a certain extent and can effectively reduce substation labor costs and support warning linkage and timely fault alarms. The method can also be used to analyze other types of incidents in substations, such as fire prevention and anti-theft, which needs further analysis. However, the method needs further testing in practical application scenarios.

REFERENCES

1. WANG De-cai, QI Jin-wen. Research and Analysis of Status Monitoring of 6kV Switchgear in Substation [J], 2011, 9: 40-41.
2. GE Run-dong, LIU Wen-ying, HAN Xu-shan. A Practical Method for Automatic Identification of Isolator State. Power System Protection and Control [J], 2012, 40 (15): 87-92.
3. Lu Xueping, Pan Liangyu, Hou Junfei, Yue Zhenya. Experimental Research on Intelligent Tool State Contact Monitoring System Based on Fiber Optic Grid Sensing Technology. Electrical Manufacturing [J], 2013, 1: 60-61.
4. Zhang Hao, Wang Wei, Xu Lijie, Qin Huan, Liu Ming. Application of Image Recognition Technology in Power Equipment Monitoring. Power System Protection and Control [J], 2010, 38 (6): 88-91.
5. Yang Jun, Ai Xin, Jia Xiufang, Li Yansong. Intelligent Research on Unmanned Substation Image Recognition. China International Power Conference [C], 2006, 1-5.
6. DING Si-Hai, LIU Yu-Xue, LU Lin-Ji. Research on Switching State Recognition of Electrical Control Cabinet Based on Digital Image Processing. Microcomputer Applications [J], 2013, 30 (5): 39-40.
7. Zhao Yongjun, Zhao Shoutao, Chen Peng Yong. Intelligent Analysis of Operation Status of Substation Electrical Equipment Based on Image Recognition. 2012 China Institute of Electrical Engineering DC transmission and power electronics special committee academic annual [C]. Beijing, 2012.8.
8. Xia Zhihong, Luo Yi, Tu Guangyu.
Automatic Recognition of Rejecting Placement of Relay Shield Based on Visual Information [J], 2005, 33 (4): 40-43.
9. YU Kai, JIA Lei, CHEN Yu-Qiang, XU Wei. Yesterday, today and tomorrow of deep learning. Computer Research and Development [J], 2013, 50 (9): 1799-1804.
10. Zhenbing Zhao, Ning Liu, and Le Wang. "Localization of multiple insulators by orientation angle detection and binary shape prior knowledge." IEEE Transactions on Dielectrics and Electrical Insulation [J], 2015, 22 (6): 3421-3428.
11. Hinton G E. "Training products of experts by minimizing contrastive divergence." Neural Computation (S0899-7667) [J], 2002, 14 (8): 1771-1800.