2009 International Conference on Computer Engineering and Applications
IPCSIT vol.2 (2011) © (2011) IACSIT Press, Singapore
Internal Symmetry Nets for Back Propagation in Edge Detection
Guanzhong Li
School of Computer Science & Engineering, University of New South Wales, Sydney, Australia
glix955@cse.unsw.edu.au
Corresponding author: +612 433 754 098. E-mail address: unswgoldenleo@hotmail.com
Abstract. Neural Networks (NN) have recently been applied to edge detection with increasing frequency. Internal Symmetry Networks are a recently developed class of Cellular Neural Network inspired by the phenomenon of internal symmetry in quantum physics. In this paper, Internal Symmetry Networks are trained with Back Propagation for edge detection.
Keywords. neural network; internal symmetry; back propagation; edge detection; Canny operator
1. Introduction
Traditional approaches to edge detection did not involve Neural Networks (NN); instead, mathematical operators such as Sobel, Roberts, Prewitt, Marr and Canny were considered sufficient [1]. With the development of NN, many studies have applied NN to edge detection [1,2]. Most of these studies are based on Back Propagation (BP); they differ in how BP is applied. Srinivasan, Bhatia and Ong (1994) used a winner-take-all strategy, based on standard BP, to select appropriate weights for training [3]. He and Siyal (1998) set the target as 18 different patterns in a 3×3 template [4]. Zheng and He (2003) summarized four ways to improve standard BP for edge detection: fast processing, selection of the number of hidden nodes, an adaptive learning rate, and the addition of a control map [5]. The focus of these studies is weight selection and target setting. The model of Srinivasan et al. used 8 different weights and set targets inspired by the Sobel and Roberts operators; this method proved better than Sobel and Roberts. Other models performed weight selection during iteration. In this paper, the target is set using the Canny operator [6], because the Canny operator is the most widely accepted contemporary operator in terms of general capability. Meanwhile, a weight-sharing scheme is used by Internal Symmetry Nets (ISN).
ISN are inspired by the phenomenon of internal symmetry in quantum mechanics. Previous work trained feed-forward ISN by TD-learning for the game of Go [14]. Recently, my research group extended ISN to recurrent NN for a variety of image processing tasks [8,9,10,11]. In this paper, only feed-forward ISN is applied, because it is sufficient for the task.
2. Internal Symmetry Nets
The ISN in this paper is a kind of Cellular Neural Network (CNN). Each cell in an array represents a corresponding pixel in an image. Without loss of generality, a square image of size m-by-m, with m = 2t+1, is used to demonstrate ISN. The array can be represented as a lattice Λ of vertices λ = [a,b] with −t ≤ a,b ≤ t. To make the programming easier, Λ̄ denotes the "extended" lattice, which adds a row or column of vertices along each of the four edges of the image, i.e. Λ̄ = {[a,b] : −(t+1) ≤ a,b ≤ t+1}, though this extension makes the training of the border pixels slightly more difficult.
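As an illustration, the extended lattice can be built by padding the image array with one extra row and column on each side. The following Python sketch is a minimal reconstruction under the definitions above (the function name and zero fill value are assumptions, not the paper's code):

    import numpy as np

    def extend_lattice(image):
        # image: m-by-m array with m = 2t+1, conceptually indexed by
        # -t <= a, b <= t. The extended lattice adds one row/column on
        # each side, giving indices -(t+1) <= a, b <= t+1. Border cells
        # are filled with 0 here; they are flagged by a separate
        # "off-edge" input (see Section 3).
        return np.pad(image, 1, mode='constant', constant_values=0.0)

    t = 2
    m = 2 * t + 1                     # m = 5
    image = np.random.rand(m, m)      # grey-scale values in [0, 1]
    extended = extend_lattice(image)  # shape (m+2, m+2)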
Many image-processing tasks are invariant to geometric transformations of the image [13]. Besides shift-invariance (with appropriate allowance for "edge effects"), rotations and reflections are the two main such transformations. The system in this paper is designed so that the network updates are invariant to rotations and reflections. Several different weight-sharing schemes appear in previous work [7]. In this paper, a recently developed weight-sharing scheme known as Internal Symmetry Networks [14], based on group representation theory, is used. Under weight sharing, connections from the current pixel to the neighboring n×n pixels share the same weights; only an RNN, not an FFNN, can guarantee the global relations of each pixel in the image.
The group G of symmetries of a square image is the dihedral group D4 of order 8. This group is generated by two elements r and s, where r represents a (counter-clockwise) rotation of 90° and s represents a reflection in the vertical axis (see Fig. 1.2). The action of D4 on Λ (or Λ̄) is given by

r[a, b] = [−b, a],  s[a, b] = [−a, b]   (1)
To make the explanation and the experiment clear, let the offset components satisfy a, b ∈ {−1, 0, 1}; then M and N denote two different neighborhood structures, given as sets of offset values:

M = {[0,0], [1,0], [0,1], [−1,0], [0,−1]},  N = M ∪ {[1,1], [−1,1], [−1,−1], [1,−1]}

Taking offsets from a particular vertex, M comprises the vertex itself together with the neighboring vertices to its East, North, West and South; N includes M together with the diagonal vertices to the North-East, North-West, South-West and South-East. Assuming the action of G on N (or M) is also given by (1), it is easy to prove that for g ∈ G, λ ∈ Λ and ν ∈ N,

g(λ + ν) = g(λ) + g(ν).
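The action (1) and this closure property are straightforward to verify numerically. The sketch below is illustrative only (the helper names are chosen here, not taken from the paper):

    def r(v):  # 90-degree counter-clockwise rotation: r[a,b] = [-b,a]
        a, b = v
        return (-b, a)

    def s(v):  # reflection in the vertical axis: s[a,b] = [-a,b]
        a, b = v
        return (-a, b)

    def add(u, v):
        return (u[0] + v[0], u[1] + v[1])

    M = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]
    N = M + [(1, 1), (-1, 1), (-1, -1), (1, -1)]

    # Check g(lambda + nu) = g(lambda) + g(nu) for both generators.
    lam = (3, -2)
    for g in (r, s):
        assert all(g(add(lam, nu)) == add(g(lam), g(nu)) for nu in N)
    # N is also closed under the group action: each g maps N onto N.
    assert all(set(map(g, N)) == set(N) for g in (r, s))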
There are five different irreducible types of hidden units. The hidden unit activation for a single cell is then represented as a product of the five different kinds of hidden nodes:

H = T^{i_T} × S^{i_S} × D^{i_D} × C^{i_C} × (F1 × F2)^{i_F}

with the action of G on H given by

g(H) = { g(H_{g[a,b]}) }_{[a,b] ∈ Λ}

More details can be found in [14].
This invariance imposes certain constraints on the weights of the network, which are outlined below.
V_{OH} = [ V_{OT}  V_{OS}  V_{OD}  V_{OC}  V_{OF1}  V_{OF2} ];  V_{HI} = [ V_{TI}  V_{SI}  V_{DI}  V_{CI}  V_{F1I}  V_{F2I} ]

V^{E}_{OI} = V^{N}_{OI} = V^{W}_{OI} = V^{S}_{OI},  V^{NE}_{OI} = V^{NW}_{OI} = V^{SW}_{OI} = V^{SE}_{OI};  V^{E}_{OT} = V^{N}_{OT} = V^{W}_{OT} = V^{S}_{OT},  V^{NE}_{OT} = V^{NW}_{OT} = V^{SW}_{OT} = V^{SE}_{OT}

V^{E}_{TI} = V^{N}_{TI} = V^{W}_{TI} = V^{S}_{TI},  V^{NE}_{TI} = V^{NW}_{TI} = V^{SW}_{TI} = V^{SE}_{TI};  V^{O}_{OF1} = V^{O}_{OF2} = V^{O}_{F1I} = V^{O}_{F2I} = 0

V^{E}_{OF1} = V^{N}_{OF2} = −V^{W}_{OF1} = −V^{S}_{OF2} = V^{E}_{F1I} = V^{N}_{F2I} = −V^{W}_{F1I} = −V^{S}_{F2I};  V^{E}_{OF2} = V^{N}_{OF1} = V^{W}_{OF2} = V^{S}_{OF1} = V^{E}_{F2I} = V^{N}_{F1I} = V^{W}_{F2I} = V^{S}_{F1I} = 0

V^{NE}_{OF1} = −V^{NW}_{OF1} = −V^{SW}_{OF1} = V^{SE}_{OF1},  V^{NE}_{OF2} = V^{NW}_{OF2} = −V^{SW}_{OF2} = −V^{SE}_{OF2};  V^{NE}_{F1I} = −V^{NW}_{F1I} = −V^{SW}_{F1I} = V^{SE}_{F1I},  V^{NE}_{F2I} = V^{NW}_{F2I} = −V^{SW}_{F2I} = −V^{SE}_{F2I}

V^{E}_{OS} = −V^{N}_{OS} = V^{W}_{OS} = −V^{S}_{OS},  V^{NE}_{OD} = −V^{NW}_{OD} = V^{SW}_{OD} = −V^{SE}_{OD};  V^{E}_{SI} = −V^{N}_{SI} = V^{W}_{SI} = −V^{S}_{SI},  V^{NE}_{DI} = −V^{NW}_{DI} = V^{SW}_{DI} = −V^{SE}_{DI}

V^{μ}_{OD} = V^{μ}_{DI} = 0 for μ ∈ {O, E, N, W, S};  V^{ν}_{OS} = V^{ν}_{SI} = 0 for ν ∈ {O, NE, NW, SW, SE};  V^{ν}_{OC} = V^{ν}_{CI} = 0 for ν ∈ {O, E, N, W, S, NE, NW, SW, SE}
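To make these constraints concrete, the sketch below (an assumed illustration, not the paper's implementation) lays the input-to-hidden weights for the T and S components out as 3×3 kernels over the neighborhood N. The T kernel ties all four edge neighbors to one weight and all four diagonal neighbors to another; the S kernel alternates in sign around the edge neighbors and vanishes at O and at the diagonals, as required above:

    import numpy as np

    # Grid layout assumed here: rows run North (top) to South (bottom),
    # columns run West (left) to East (right).
    def t_kernel(w_o, w_edge, w_diag):
        # T component: V^E = V^N = V^W = V^S = w_edge,
        # V^NE = V^NW = V^SW = V^SE = w_diag, V^O = w_o.
        return np.array([[w_diag, w_edge, w_diag],
                         [w_edge, w_o,    w_edge],
                         [w_diag, w_edge, w_diag]])

    def s_kernel(w):
        # S component: V^E = -V^N = V^W = -V^S = w,
        # zero at O and at the four diagonal offsets.
        return np.array([[0.0,  -w, 0.0],
                         [  w, 0.0,   w],
                         [0.0,  -w, 0.0]])

One free parameter per tied class replaces the nine independent weights an unconstrained 3×3 template would need, which is where the saving in computation and storage comes from.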
There are five irreducible representations of D4 (Fig. 1.1). According to group representation theory, combinations of different numbers of these irreducible representations can generate all possible representations [15]. Extending this idea to image segmentation, combinations of these five kinds of hidden nodes can generate all the weights that general BP can generate. The advantage of ISN over general BP is its clear starting structure: the representations form the starting structure, so the training process knows the stable trend of the training directions. General BP ignores the internal symmetries of images and assigns random, differing weights to different neighboring pixels.
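This difference can be checked numerically: convolving with a kernel whose weights are tied as in the T component commutes with rotations of the image, whereas a kernel of untied random weights does not. A minimal sketch (layout and values assumed):

    import numpy as np
    from scipy.signal import convolve2d

    rng = np.random.default_rng(0)
    img = rng.random((9, 9))
    K = np.array([[0.1, 0.2, 0.1],     # T-type: tied edge/diagonal weights
                  [0.2, 0.5, 0.2],
                  [0.1, 0.2, 0.1]])
    R = rng.random((3, 3))             # "general BP": untied random weights

    def conv(x, k):
        return convolve2d(x, k, mode='same')

    # Tied kernel: rotating the input simply rotates the output.
    assert np.allclose(conv(np.rot90(img), K), np.rot90(conv(img, K)))
    # Untied kernel: the same identity almost surely fails.
    print(np.allclose(conv(np.rot90(img), R), np.rot90(conv(img, R))))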
Fig. 1.1. The five irreducible representations of D4.
Fig. 1.2. The dihedral group D4 with generators r, s.
3. Experiments
For black and white images, the network has two inputs per pixel. One input encodes the intensity of the pixel as a grey-scale value between 0 and 1. The other is a dedicated "off-edge" input, which is equal to 0 for inputs inside the actual image and equal to 1 for inputs off the edge of the image (i.e. for vertices in Λ̄ \ Λ). This kind of encoding could in principle be extended to colour images by using four inputs per pixel (three to encode the R,G,B or Y,U,V values, plus the dedicated "off-edge" input).
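A sketch of this two-channel encoding (names and array layout are assumptions):

    import numpy as np

    def encode_inputs(image):
        # image: m-by-m grey-scale array with values in [0, 1].
        # Returns (m+2, m+2, 2): channel 0 holds the intensity (0 off the
        # edge); channel 1 is the "off-edge" flag, 1 only on the padded
        # border vertices of the extended lattice.
        m = image.shape[0]
        x = np.zeros((m + 2, m + 2, 2))
        x[1:-1, 1:-1, 0] = image  # intensities inside the actual image
        x[:, :, 1] = 1.0          # mark everything off-edge first...
        x[1:-1, 1:-1, 1] = 0.0    # ...then clear the flag inside
        return x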
The experiment has two purposes. The first is to test whether an ISN can be trained to perform Canny edge detection. The second is to compare the output with targets set by the Canny operator. In this case, there is only one output unit per pixel, and the aim is for the network to reproduce, as well as possible, the result of a Canny edge detection algorithm applied to the original image. Five training images and five test images were used.
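The paper does not state how the Canny targets were computed; one plausible way to produce them (OpenCV, with assumed thresholds and file name) is:

    import cv2
    import numpy as np

    def canny_target(image_u8, low=100, high=200):
        # image_u8: 8-bit grey-scale image; thresholds are assumptions.
        edges = cv2.Canny(image_u8, low, high)  # 0 or 255 per pixel
        return (edges > 0).astype(np.float32)   # 1 = edge, 0 = non-edge

    img = cv2.imread('train1.png', cv2.IMREAD_GRAYSCALE)  # hypothetical file
    target = canny_target(img)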
A number of combinations of parameters and hidden units were tried. The best results were obtained using cross entropy minimization, with a learning rate of 5×10⁻⁹, momentum of 0.9 and a hidden unit configuration of (4,0,0,0,0). Fig. 2 shows the training and test set error (per pixel) for 600,000 training epochs. Fig. 3 shows the input, target and network output images. Images ending in 1-5 are for training, 6-10 for testing. Between epochs 747,100 and 747,200, the ISN reaches its lowest test error, 0.02314.
Figure 2. Cross entropy training error (dotted) and test error (solid) per pixel, for the Canny edge detection task.
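For reference, the per-pixel cross entropy and a momentum update of the kind described can be sketched as follows (a minimal illustration with assumed names, not the original training code):

    import numpy as np

    LEARNING_RATE = 5e-9
    MOMENTUM = 0.9

    def cross_entropy(output, target, eps=1e-12):
        # Mean per-pixel cross entropy for sigmoid outputs in (0, 1).
        o = np.clip(output, eps, 1.0 - eps)
        return -np.mean(target * np.log(o) + (1 - target) * np.log(1 - o))

    def momentum_step(weights, grad, velocity):
        # Classical momentum: v <- mu*v - eta*grad; w <- w + v.
        velocity = MOMENTUM * velocity - LEARNING_RATE * grad
        return weights + velocity, velocity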
Figure 3. Input (i1-i10), target (t1-t10) and output (o1-o10) images.
Given the low test error and a general viewing of these images, ISN can approximate the Canny operator. Detailed comparisons were then made between the targets and the outputs in terms of Canny's three criteria: 1. good detection; 2. good localization; 3. minimal response.
For criterion 1, the Canny operator found more edges, both correct and wrong. This happens for both the training and the test sets. For example, compared to t4 and t10 respectively, o4 and o10 lose many necessary edges around the subject's hair and chin, but also remove some unimportant lines on the face.
For criterion 2, the Canny operator and ISN are approximately equal: the width of an edge produced by Canny is nearly the same as that of the corresponding edge produced by ISN. Sometimes Canny is better than ISN; for example, the grid behind the girl in t4 is thinner and clearer than in o4. Sometimes the reverse holds: the outline of the girl's right cheek in o4 is thinner than in t4.
For criterion 3, the two appear similar by inspection.
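These visual comparisons could be complemented by simple per-pixel statistics between output and target, for instance (an assumed evaluation sketch, not reported in the paper):

    import numpy as np

    def detection_scores(output, target, threshold=0.5):
        # Binarize the network output and compare with the Canny target:
        # recall relates to criterion 1 (few missed edges),
        # precision to criterion 3 (few spurious responses).
        pred = output > threshold
        true = target > 0.5
        tp = np.sum(pred & true)
        precision = tp / max(np.sum(pred), 1)
        recall = tp / max(np.sum(true), 1)
        return precision, recall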
In this experiment, ISN with BP can approximate the main functions of the Canny operator, and sometimes ISN performs better than the Canny operator.
4. Conclusion
In this paper, a specific NN, the ISN, is trained with BP for edge detection. ISN inherits the main functions of the Canny operator and sometimes performs better.
The interesting point is that, in supervised learning, the output can to some extent be better than the target. This suggests that the components of Canny are not fully compatible with one another, so an NN can serve as a supplement to, or an evaluation tool for, the Canny operator. Unlike other BP methods that use weight selection in the iteration process, ISNs use a weight-sharing scheme to save computation time and storage space.
In future work, the target will not be set directly by the Canny operator. A more detailed analysis of other hybrids with Canny will be carried out, and the target will then be derived from that analysis.
5. References
[1] D. Ziou and S. Tabbone, "Edge Detection Techniques - An Overview", International Journal of Pattern Recognition and Image Analysis, 1998.
[2] L. Spreeuwers, "A neural network edge detector", Nonlinear Image Processing II, SPIE Vol. 1451, pp. 204-215, 1991.
[3] V. Srinivasan, P. Bhatia and S. H. Ong, "Edge Detection Using a Neural Network", Pattern Recognition, Vol. 27, No. 12, pp. 1653-1662, 1994.
[4] Z. He and M. Siyal, "Edge Detection with BP Neural Networks", Proceedings of ICSP, IEEE, 1998.
[5] L. Zheng and X. He, "Edge Detection Based on Modified BP Algorithm of ANN", Pan-Sydney Area Workshop on Visual Information Processing, 2003.
[6] J. Canny, "A Computational Approach to Edge Detection", IEEE Trans. Pattern Analysis and Machine Intelligence 8, pp. 679-698, 1986.
[7] Y. LeCun et al., "Backpropagation Applied to Handwritten Zip Code Recognition", Neural Computation 1(4), pp. 541-551, 1989.
[8] A. Blair and G. Li, "Training of Recurrent Internal Symmetry Networks by Backpropagation", Proceedings of the International Joint Conference on Neural Networks, 2009.
[9] G. Li, "Problem and Strategy: Overfitting in Recurrent Cycles of Internal Symmetry Networks by Back Propagation", IEEE Conference on Computational Intelligence and Natural Computing, 2009.
[10] G. Li, "A New Dynamic Strategy of Recurrent Neural Network", The 8th IEEE International Conference on Cognitive Informatics, 2009.
[11] G. Li, "Recurrent Internal Symmetry by Back Propagation in Wallpaper Image Segmentation", The 10th International Conference on Pattern Recognition and Information Processing, 2009.
[12] G. Li, "Phenomena and Methods: Uncertainty in Internal Symmetry Nets with Back Propagation in Image Processing", International Symposium on Intelligent Ubiquitous Computing and Education, 2009.
[13] M. Sonka, V. Hlavac and R. Boyle, Image Processing, Analysis, and Machine Vision, pp. 62-65, 1998.
[14] A. Blair, "Learning Position Evaluation for Go with Internal Symmetry Networks", Proc. 2008 IEEE Symposium on Computational Intelligence and Games, pp. 199-204, 2008.
[15] B. L. van der Waerden, Group Theory and Quantum Mechanics, Chapter 2, pp. 32-78, 1980.