Kolmogorov Superposition Theorem and its application to
wavelet image decompositions
Pierre-Emmanuel Leni, Yohan D. Fougerolle, and Frédéric Truchetet
Université de Bourgogne, Laboratoire LE2I, UMR CNRS 5158,
12 rue de la fonderie, 71200 Le Creusot, France
ABSTRACT
This paper deals with the decomposition of multivariate functions into sums and compositions of monovariate functions. The global purpose of this work is to find a suitable strategy to express complex multivariate functions using simpler functions that can be analyzed with well-known techniques, instead of developing complex N-dimensional tools. More precisely, most signal processing techniques are applied in 1D or 2D and cannot easily be extended to higher dimensions. We recall that such a decomposition exists in Kolmogorov's superposition theorem. According to this theorem, any multivariate function can be decomposed into two types of univariate functions, called inner and external functions. Inner functions are associated to each dimension and linearly combined to construct a hash function that associates every point of a multidimensional space to a value of the real interval [0, 1]. Every inner combination is the argument of one external function. The external functions associate real values in [0, 1] to the image, by the multivariate function, of the corresponding point of the multidimensional space.
Sprecher, in Ref. 1, has proved that internal functions can be used to construct space filling curves, i.e. there
exists a curve that sweeps the multidimensional space and uniquely matches corresponding values into [0, 1]. Our
goal is to obtain both a new decomposition algorithm for multivariate functions (at least bi-dimensional) and
adaptive space filling curves. Two strategies can be applied. Either we construct fixed internal functions to obtain
space filling curves, which allows us to construct an external function such that their sums and compositions
exactly correspond to the multivariate function; or the internal function is constructed by the algorithm and is
adapted to the multivariate function, providing different space filling curves for different multivariate functions.
We present two of the most recent constructive algorithms of monovariate functions. The first method is
due to Sprecher (Ref. 2 and Ref. 3). We provide additional explanations to the existing algorithm and present
several decomposition results for gray level images. We point out the main drawback of this method: all the
function parameters are fixed, so the univariate functions cannot be modified; in particular, the inner function cannot be modified, and neither can the space filling curve. The number of layers depends on the dimension of the decomposed function. The second algorithm, proposed by Igelnik in Ref. 4, increases the flexibility of the parameters, but only approximates the monovariate functions: the number of layers is variable, and a neural network optimizes the monovariate functions and the weights associated to each layer to ensure convergence to the decomposed
multivariate function.
We have implemented both Sprecher’s and Igelnik’s algorithms and present the results of the decompositions
of gray level images. There are artifacts in the reconstructed images, which leads us to apply the algorithm on
wavelet decomposition images. We detail the reconstruction quality and the quantity of information contained
in Igelnik’s network.
Keywords: Kolmogorov superposition theorem, multidimensional function decomposition, neural network, signal processing, image analysis, wavelets
Further author information:
pierre-emmanuel.leni@u-bourgogne.fr
yohan.fougerolle@u-bourgogne.fr
frederic.truchetet@u-bourgogne.fr
1. INTRODUCTION
In 1900, Hilbert stated 23 mathematical problems. Among them, the 13th conjectures that high order equations cannot be solved by sums and compositions of bivariate functions. 57 years later, Kolmogorov demonstrated his superposition theorem (KST) and proved the existence of monovariate functions such that every multivariate function can be expressed as sums and compositions of monovariate functions. Unfortunately, Kolmogorov did not propose any construction method for these monovariate functions.
The superposition theorem, reformulated and simplified by Sprecher in Ref. 5, can be written as:
Theorem 1.1 (Kolmogorov superposition theorem). Every continuous function defined on the identity
hypercube ([0, 1]^d, noted I^d), f : I^d → R, can be written as sums and compositions of continuous monovariate functions:

$$\begin{cases} f(x_1, \dots, x_d) = \sum_{n=0}^{2d} g_n\big(\xi(x_1 + na, \dots, x_d + na)\big) \\[6pt] \xi(x_1 + na, \dots, x_d + na) = \sum_{i=1}^{d} \lambda_i \psi(x_i + an), \end{cases} \qquad (1)$$
with ψ a continuous function, and λ_i and a constants. ψ is called the inner function and g the external function. The inner function ψ associates every component x_i of the real vector (x_1, ..., x_d) of I^d to a value in [0, 1]. The function ξ associates each vector (x_1, ..., x_d) ∈ I^d to a number y_n in the interval [0, 1]. These numbers y_n are the arguments of the functions g_n, which are summed to obtain the function f.
There are two different parts in this decomposition: the components x_i, i ∈ ⟦1, d⟧, of each dimension are combined into a real number by a hash function (the inner function ξ), which is associated to the corresponding value of f for these coordinates by the external function g. Hecht-Nielsen has shown in Ref. 6 that the monovariate functions constituting the KST can be organized as a one hidden layer neural network.
Figure 1. Representation of the KST as a one hidden layer neural network, from Ref. 3.
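To make the structure of equation (1) concrete, here is a minimal Python sketch that evaluates its right-hand side; the callables psi and g_n, the coefficients lambdas, and the constant a are placeholders for functions whose construction is discussed in the following sections, not part of the original algorithm.

```python
def kst_evaluate(x, g, psi, lambdas, a):
    """Evaluate eq. (1): f(x_1,...,x_d) = sum_{n=0}^{2d} g_n(xi_n), where
    xi_n = sum_i lambda_i * psi(x_i + n*a).  psi and the g_n are assumed
    to be already-constructed monovariate callables."""
    d = len(x)
    total = 0.0
    for n in range(2 * d + 1):                      # 2d + 1 layers
        xi_n = sum(lam * psi(x_i + n * a) for lam, x_i in zip(lambdas, x))
        total += g[n](xi_n)                         # one external function per layer
    return total
```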
Most recent contributions related to the KST address the problem of the construction of the monovariate functions. Sprecher has proposed an algorithm for exact monovariate function construction in Ref. 2 and Ref. 3, which introduces fundamental notions (such as the tilage). See Figure 3 for a representation of an internal function ξ defined by Sprecher in 2D. On the other hand, Igelnik has presented in Ref. 4 an approximating construction that offers flexibility and modification perspectives over the monovariate function construction. Precisely, using Igelnik's approximation network, the image can be represented as a superposition of layers, i.e. a superposition of images with a fixed resolution, depending on the size of the tiles. The size of the inner functions is fixed, and the size of the external functions depends on the number of tiles: images are decomposed into monovariate functions, and the complete size of the decomposition is determined by the precision of the approximation. We study the global size of the decomposition network to determine the compression properties of Igelnik's decomposition, i.e. for an image with a given number of pixels to decompose, how many numbers are required to define Igelnik's network? We show that using large tiles, which reduces the size of Igelnik's network, produces artifacts on the reconstructed images. We propose to apply Igelnik's approximation on wavelet image decompositions to reduce the number of artifacts in the reconstructed images.
The structure of the paper is as follows: we present Sprecher’s algorithm in section 2, and Igelnik’s algorithm
in section 3. In section 4, we present the results of Igelnik’s algorithm applied to the images of a wavelet
decomposition. In the last section, we conclude and consider several research perspectives.
Our contributions are a short review of Sprecher’s algorithm and a synthetic explanation of Igelnik’s algorithm.
We present the results of the application of both algorithms to gray level images. We point out the lack of
flexibility of Sprecher's algorithm and focus our study on Igelnik's network. Considering the reconstruction error of this approximation, we apply Igelnik's network to "simpler" images: wavelet decomposition images. We characterize the reconstruction quality as a function of the tile density.
2. SPRECHER’S ALGORITHM
We present the algorithm proposed by Sprecher in Ref. 2 and Ref. 3, in which he presents exact algorithms for the construction of the internal and external functions, respectively. In Ref. 7, Braun et al. have pointed out that the internal function ψ defined by Sprecher was discontinuous for several values. They provide a new definition that ensures the continuity and monotonicity of the function ψ. The construction of the internal function ψ relies on the decomposition of a real number d_k in a given base γ.
Definition 2.1 (Notations).

• d is the dimension, d ≥ 2.
• m is the number of tilage layers, m ≥ 2d.
• γ is the base of the variables x_i, γ ≥ m + 2.
• a = 1/(γ(γ − 1)) is the translation between two layers of the tilage.
• λ_1 = 1 and, for 2 ≤ i ≤ d, λ_i = Σ_{r=1}^{∞} γ^{−(i−1)(d^r−1)/(d−1)} are the coefficients of the linear combination that is the argument of the function g.
• every decimal number (noted d_k) in [0, 1] with k decimals can be written d_k = Σ_{r=1}^{k} i_r γ^{−r}, and d_k^n = d_k + n Σ_{r=2}^{k} γ^{−r} defines a translated d_k.

Using the d_k defined in definition 2.1, Braun et al. define the function ψ by:

$$\psi_k(d_k) = \begin{cases} d_k & \text{for } k = 1,\\[4pt] \psi_{k-1}\!\left(d_k - \dfrac{i_k}{\gamma^{k}}\right) + \dfrac{i_k}{\gamma^{\frac{d^{k}-1}{d-1}}} & \text{for } k > 1 \text{ and } i_k < \gamma - 1,\\[10pt] \dfrac{1}{2}\left(\psi_k\!\left(d_k - \dfrac{1}{\gamma^{k}}\right) + \psi_{k-1}\!\left(d_k + \dfrac{1}{\gamma^{k}}\right)\right) & \text{for } k > 1 \text{ and } i_k = \gamma - 1. \end{cases} \qquad (2)$$
Function ψ is applied to each component xi of the input value. The ψ(xi ) values are linearly combined with
real numbers λ_i to compute the function ξ. Figure 2 represents the plot of the function ψ on the interval [0, 1] and figure 3 represents the function ξ on the space [0, 1]^2: to every couple in [0, 1]^2, a unique value in the interval [0, 1] is associated.
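As an illustration, here is a minimal Python sketch of the recursion (2) and of the hash function ξ; it assumes each coordinate is given by its k digits i_1, ..., i_k in base γ, and the helper names (psi, xi, lambda_coeff, increment) are ours, not Sprecher's.

```python
def lambda_coeff(i, gamma, d, terms=30):
    """lambda_1 = 1; lambda_i = sum_{r>=1} gamma**(-(i-1)(d^r - 1)/(d - 1))."""
    if i == 1:
        return 1.0
    return sum(gamma ** (-(i - 1) * (d ** r - 1) / (d - 1)) for r in range(1, terms))

def increment(digits, gamma):
    """Add one unit to the last digit of a base-gamma expansion, with carries."""
    digits = list(digits)
    for pos in range(len(digits) - 1, -1, -1):
        digits[pos] += 1
        if digits[pos] < gamma:
            return digits
        digits[pos] = 0
    return [1] + digits  # overflow (d_k reaches 1): boundary case, not handled further

def psi(digits, gamma, d):
    """Eq. (2): psi_k evaluated on d_k = sum_r digits[r-1] * gamma**(-r)."""
    k, i_k = len(digits), digits[-1]
    if k == 1:
        return digits[0] / gamma
    if i_k < gamma - 1:
        # d_k - i_k * gamma**(-k) is d_{k-1}: drop the last digit
        return psi(digits[:-1], gamma, d) + i_k * gamma ** (-(d ** k - 1) / (d - 1))
    # i_k = gamma - 1: average of the two neighbouring evaluations
    left = digits[:-1] + [gamma - 2]            # d_k - gamma**(-k)
    right = increment(digits[:-1], gamma)       # d_k + gamma**(-k), carry propagated
    return 0.5 * (psi(left, gamma, d) + psi(right, gamma, d))

def xi(point_digits, gamma, d):
    """Hash function xi(x_1,...,x_d) = sum_i lambda_i * psi(x_i)."""
    return sum(lambda_coeff(i + 1, gamma, d) * psi(dig, gamma, d)
               for i, dig in enumerate(point_digits))

# example: xi of the 2D point (0.23, 0.57) at precision k = 2, base gamma = 10
print(xi([[2, 3], [5, 7]], gamma=10, d=2))
```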
Sprecher has demonstrated that the images of disjoint intervals I are disjoint intervals ψ(I). This separation property generates intervals that constitute an incomplete tilage of [0, 1]. This tilage is extended to a d-dimensional tilage by taking the cartesian product of the intervals I. In order to cover the entire space, the
tilage is translated several times by a constant a, which produces the different layers of the final tilage. Thus, we
obtain 2d + 1 layers: the original tilage constituted of the disjoint hypercubes having disjoint images through ψ,
and 2d layers translated by a along each dimension. Figure 4 represents a tilage section of a 2D space: 2d + 1 = 5
different superposed tilages can be seen, displaced by a.
Figure 2. Plot of the function ψ for the base γ = 10,
from Ref. 7.
Figure 3. Plot of the hash function ξ for dimension d = 2
and base γ = 10, from Ref. 8.
Figure 4. Section of the tilage for a 2D space and a base
γ = 10 (5 different layers), from Ref. 8.
Figure 5. Function ξ associates each paving block (hypercube) to a unique interval Tk in [0, 1].
For a 2D space, a hypercube is associated with a couple d_{k_r} = (d_{k_r,1}, d_{k_r,2}). Each hypercube S_{k_r}(d_{k_r}) is uniquely associated to an interval T_{k_r}(d_{k_r}) by the function ξ, see Figure 5.
A single set of internal functions ψ and ξ is used for every function f. Only the external functions g_n are adapted to each function to decompose. The external functions g_n are not explicitly defined: an algorithm iteratively constructs approximations g_n^r in three steps, such that their sum converges to the external function g_n. The first step is dedicated to the determination of the precision k_r, the second step to the computation of the internal functions ψ and ξ, and the third step to the partial external functions g_n^r and f_r. Every iteration r is computed at a precision k_r. The constructed approximations g_n^r depend on the approximation error function f_r, which tends to 0 when r increases. Consequently, the more iterations, the better the approximation of the function f by the functions g_n. We invite the interested reader to refer to Ref. 3 for a complete description of the algorithm.
2.1 Results
We apply Sprecher's algorithm to gray level images (considered as bivariate functions of the form f (x, y) =
I(x, y)) and present the reconstruction results. The algorithm constructs images that correspond to layers of
external functions, that are summed to obtain the final image. Each layer is a sum of approximations at a
given accuracy r, obtained at each iteration of the algorithm. Figure 6 represents two layers obtained after
two iterations, and figure 7 shows two reconstructions of the same original image, using layers obtained after one and two iterations of Sprecher's algorithm: few differences exist between the one- and two-iteration approximations, which illustrates that the reconstruction rapidly converges to the original image.
3. IGELNIK’S ALGORITHM
The parameters of Sprecher’s algorithm are fixed and cannot be adjusted: e.g. the number of layers and the
inner function definition. Igelnik’s algorithm provides greater flexibility at the expense of reconstruction error:
Figure 6. Reconstruction after two iterations of the first layer (a) and last layer (b)
Figure 7. (a) Original image. (b) and (c) Reconstruction after one and two iterations, respectively.
the monovariate functions are approximated. The original equation (1) has been adapted to:

$$f(x_1, \dots, x_d) \simeq \sum_{n=1}^{N} a_n g_n\!\left(\sum_{i=1}^{d} \lambda_i \psi_{ni}(x_i)\right) \qquad (3)$$
For a given layer n, d inner functions ψ_{ni} are randomly generated: one per dimension (index i) and per layer (index n), independently from the function f. The convex combination of these internal functions ψ_{ni} with real values λ_i is the argument of the external function g_n, the real numbers λ_i (one per dimension) being chosen linearly independent, strictly positive, and such that Σ_{i=1}^{d} λ_i ≤ 1. Finally, the external functions g_n are constructed. To conclude the layer construction, the functions ψ and g are sampled with M points, which are interpolated by cubic splines. Each layer is weighted by a coefficient a_n, and the layers are summed to approximate the multivariate function f.
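For clarity, a minimal sketch of how equation (3) combines the layers might look as follows; the per-layer data (a_n, g_n, ψ_{n1}, ..., ψ_{nd}) are assumed to be already-built callables, and the data structure chosen here is ours.

```python
def igelnik_approximation(x, layers, lambdas):
    """Eq. (3): f(x) ~ sum_n a_n * g_n( sum_i lambda_i * psi_ni(x_i) ).
    `layers` is a list of (a_n, g_n, [psi_n1, ..., psi_nd]) tuples."""
    total = 0.0
    for a_n, g_n, psi_n in layers:
        t = sum(lam * psi_f(x_i) for lam, x_i, psi_f in zip(lambdas, x, psi_n))
        total += a_n * g_n(t)
    return total
```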
One concept shared with Sprecher's approach is the definition of a tilage, which is defined once and for all at the beginning of the algorithm. The tilage is constituted of hypercubes C_n obtained by the cartesian product of the intervals I_n(j), defined as follows:
Definition 3.1.

$$\forall n \in \llbracket 1, N \rrbracket, \; j \geq -1, \quad I_n(j) = \big[(n-1)\delta + (N+1)j\delta, \; (n-1)\delta + (N+1)j\delta + N\delta\big],$$

where δ is the distance between two intervals I of length Nδ, chosen such that the oscillation of the function f is smaller than 1/N on each hypercube C. The values of j are defined such that the generated intervals I_n(j) intersect the interval [0, 1]. Figure 8 illustrates such a construction of the intervals I.

Figure 8. From Ref. 4, intervals I_1(0) and I_1(1) for N = 4.
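A small Python sketch of how the intervals of Definition 3.1 could be enumerated (the function name and the stopping rule are our own choices):

```python
def intervals(n, N, delta):
    """Intervals I_n(j), j >= -1, from Definition 3.1 that intersect [0, 1]."""
    out, j = [], -1
    while True:
        left = (n - 1) * delta + (N + 1) * j * delta
        right = left + N * delta
        if left > 1:            # interval lies entirely beyond [0, 1]: stop
            break
        if right >= 0:          # keep only intervals intersecting [0, 1]
            out.append((left, right))
        j += 1
    return out

# Figure 8 setting: N = 4; with delta = 0.1 the first two intervals are
# I_1(0) = [0.0, 0.4] and I_1(1) = [0.5, 0.9]
print(intervals(n=1, N=4, delta=0.1))
```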
3.1 Inner functions construction
Each function ψ_{ni} is defined as follows: generate a set of distinct numbers y_{nij}, one per index j, between ∆ and 1 − ∆, 0 < ∆ < 1, such that the oscillation of the interpolating cubic spline of the ψ values on an interval of length δ is lower than ∆. The indices j are given by definition 3.1. The real numbers y_{nij} are sorted, i.e. y_{nij} < y_{nij+1}. The image of the interval I_n(j)
by function ψ is ynij . This discontinuous inner function ψ is sampled by M points, that are interpolated by a
cubic spline. We obtain two sets of points: points located on plateaus over intervals In (j), and points M ′ located
between two intervals In (j) and In (j + 1), that are randomly placed. Points M ′ are optimized during the neural
network construction, using a stochastic approach. Figure 9(a) represents final function ψ on the interval [0, 1],
and figure 9(b) illustrates a construction example of function ψ for two consecutive intervals In (j) and In (j + 1).
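A minimal Python sketch of this construction (the function names are ours, and the random placement of the gap points stands in for the stochastic optimization described in section 3.3):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def build_inner_function(intervals, Delta, rng):
    """Sketch of one inner function psi_ni: one sorted random plateau value
    y_nij per interval I_n(j), one random point in each gap between consecutive
    intervals, all interpolated by a cubic spline."""
    y = np.sort(rng.uniform(Delta, 1 - Delta, size=len(intervals)))  # plateau values
    xs, ys = [], []
    for idx, ((left, right), y_j) in enumerate(zip(intervals, y)):
        xs += [left, right]              # psi is constant on I_n(j)
        ys += [y_j, y_j]
        if idx + 1 < len(intervals):     # one M' point in the gap, placed randomly
            xs.append(rng.uniform(right, intervals[idx + 1][0]))
            ys.append(rng.uniform(y_j, y[idx + 1]))
    return CubicSpline(xs, ys)

rng = np.random.default_rng(0)
psi_11 = build_inner_function([(0.0, 0.4), (0.5, 0.9)], Delta=0.05, rng=rng)
```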
Once the functions ψ_{ni} are constructed, the function ξ_n(x) = Σ_{i=1}^{d} λ_i ψ_{ni}(x_i) can be evaluated. On the hypercubes C_{n,j_1,...,j_d}, the function ξ_n has constant values p_{n,j_1,...,j_d} = Σ_{i=1}^{d} λ_i y_{n,i,j_i}. The random numbers y_{n,i,j_i} are generated such that the values p_{n,j_1,...,j_d} are all different, ∀i ∈ ⟦1, d⟧, ∀n ∈ ⟦1, N⟧, ∀j ∈ ℕ, j ≥ −1.
Figure 9. (a) Example of function ψ sampled by 500 points that are interpolated by a cubic spline. (b) From Ref.4, plot
of ψ. On the intervals In (j) and In (j + 1), ψ has constant values, respectively ynij and ynij+1 .
3.2 External function construction
The function gn is defined as follows:
• For every real number t = p_{n,j_1,...,j_d}, the function g_n(t) is equal to the N-th of the value of the function f at the center of the hypercube C_{n,j_1,...,j_d}, noted b_{n,j_1,...,j_d}, i.e.: g_n(p_{n,j_1,...,j_d}) = (1/N) b_{n,j_1,...,j_d}.

• The definition interval of the function g_n is extended to all t ∈ [0, 1]. Consider two adjacent points A(t_A, g_n(t_A)) and D(t_D, g_n(t_D)), where t_A and t_D are two levels p_{n,j_1,...,j_d}. Two points B and C are placed in the neighborhoods of A and D, respectively. Points B and C are connected with a line of slope r = (g_n(t_C) − g_n(t_B)) / (t_C − t_B). Points A(t_A, g_n(t_A)) and B(t_B, g_n(t_B)) are connected with a nine degree spline s, such that: s(t_A) = g_n(t_A), s(t_B) = g_n(t_B), s′(t_B) = r, and s^(2)(t_B) = s^(3)(t_B) = s^(4)(t_B) = 0. Points C and D are connected with a similar nine degree spline. The connection conditions at points A and D of both nine degree splines give the remaining conditions. Figure 10(a) illustrates this construction, whereas (b) gives a complete overview of the function g_n for a layer.
Remark 1. Points A and D (values of the function f at the centers of the hypercubes) are not regularly spaced on the interval [0, 1], since their abscissas are given by the function ξ and depend on the random values y_{nij} ∈ [0, 1]. The placement of points B and C in the circles centered in A and D must preserve the order of the points A, B, C, D, i.e. the radius of these circles must be smaller than half of the distance between the two points A and D.
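As a rough illustration of the first step of this construction, the anchor values of g_n could be set up as below; for simplicity this sketch interpolates the anchors with a single cubic spline rather than the nine degree splines and connecting lines of the original construction, and the function names are ours.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def build_external_function(levels, center_values, N):
    """Sketch of g_n: at each level t = p_{n,j1,...,jd}, set g_n(t) = b / N, where
    b is the value of f at the center of the corresponding hypercube, then
    extend g_n to [0, 1] by interpolation (here a plain cubic spline)."""
    order = np.argsort(levels)                          # levels are irregularly spaced
    t = np.asarray(levels, dtype=float)[order]
    g = np.asarray(center_values, dtype=float)[order] / N
    return CubicSpline(t, g)
```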
The external function has a noisy shape, which is related to the global sweeping scheme of the image: Sprecher et al. have demonstrated in Ref. 1 that space-filling curves can be defined using the internal functions. The function ξ associates a unique real value to every couple (d_{k_r,1}, d_{k_r,2}) of the multidimensional space [0, 1]^d. Sorting these
real values defines a unique path through the tiles of a layer: the space filling curve. Figure 11 illustrates an
example of such a curve: the pixels are swept without any neighborhood property conservation.
Figure 10. (a) From Ref.4, plot of gn . Points A and D are obtained with function ξ and function f . (b) Example of
function gn for a complete layer of lena decomposition.
Figure 11. Igelnik’s space filling curve.
3.3 Neural network stochastic construction
Igelnik defines parameters during construction that are optimized using a stochastic method (ensemble approach):
the weights an associated to each layer, and the placement of the sampling points M ′ of inner functions ψ that
are located between two consecutive intervals.
To evaluate the network convergence, three sets of points are constituted: a training set DT , a generalization set
DG , and a validation set DV .
N layers are successively built. To add a new layer, K candidate layers are generated with the same plateaus
ynij , which gives K new candidate neural networks. The difference between two candidate layers is the set of
sampling points located between two intervals In (j) and In (j + 1), that are randomly chosen. We keep the layer
from the network with the smallest mean squared error that is evaluated using the generalization set DG . The
weights an are obtained by minimizing the difference between the approximation given by the neural network
and the image of function f for the points of the training set DT . The algorithm is iterated until N layers are
constructed. The validation error of the final neural network is determined using validation set DV .
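A compact sketch of this selection step (the callables and the set representation are assumptions of ours, not Igelnik's API):

```python
import numpy as np

def add_best_layer(build_candidate_layer, network_mse, D_G, K=10):
    """Generate K candidate layers that differ only by the random sampling points
    between intervals, and keep the one giving the smallest mean squared error
    of the resulting network on the generalization set D_G."""
    candidates = [build_candidate_layer() for _ in range(K)]
    errors = [network_mse(layer, D_G) for layer in candidates]
    return candidates[int(np.argmin(errors))]
```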
To determine the coefficients a_n, the difference between f and its approximation f̃ must be minimized:

$$\left\| Q_n a_n - t \right\|, \quad \text{noting } t = \begin{pmatrix} f(x_{1,1}, \dots, x_{d,1}) \\ \vdots \\ f(x_{1,P}, \dots, x_{d,P}) \end{pmatrix}, \qquad (4)$$

with Q_n a matrix of column vectors q_k, k ∈ ⟦0, n⟧, where q_k corresponds to the approximation f̃_k of the k-th layer at the points (x_{1,1}, ..., x_{d,1}), ..., (x_{1,P}, ..., x_{d,P}) of D_T:

$$Q_n = [q_0, q_1, \dots, q_n], \quad \text{with } \forall k \in \llbracket 0, n \rrbracket, \; q_k = \begin{pmatrix} \tilde{f}_k(x_{1,1}, \dots, x_{d,1}) \\ \vdots \\ \tilde{f}_k(x_{1,P}, \dots, x_{d,P}) \end{pmatrix}.$$

An evaluation of the solution a_n = Q_n^{-1} t is proposed by Igelnik in Ref. 9. The coefficient a_l of the column vector (a_0, ..., a_n)^T is the weight associated to layer l, l ∈ ⟦0, n⟧.
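In practice, the minimization (4) is an ordinary least-squares problem; a minimal sketch (our own formulation, not the ensemble formulas of Ref. 9) is:

```python
import numpy as np

def layer_weights(layer_outputs, f_values):
    """Solve min_a ||Q_n a - t||: column k of Q_n holds the k-th layer's
    approximation at the P training points of D_T, and t holds the true
    values of f at those points."""
    Q = np.column_stack(layer_outputs)              # P x (n+1) matrix Q_n
    a, *_ = np.linalg.lstsq(Q, np.asarray(f_values, dtype=float), rcond=None)
    return a                                        # weights (a_0, ..., a_n)
```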
3.4 Results
We have applied Igelnik’s algorithm to gray level images. In this example, the original image is a 100x100 pixels
lena, decomposed into 5 layers, with about 1740 tiles each, that are summed to obtain the final approximation.
Considering the translation of the different layers, it means that only about 6250 different pixels over 10000
are used to construct the network. The reconstruction PSNR is 22.29dB for the approximation using optimized
weights, and 22.20dB without. The result of such an approximation is represented in figure 13, and figure 12 details two layers: the first (N = 1) and the last (N = 5). These results show that the network optimization is not efficient enough: the optimization does not significantly improve the convergence, and the artifacts that can be seen (see figure 13 for example) are located between the intervals I_n, where the internal functions are adapted. In other words, artifacts appear between tiles and cannot be efficiently removed only by optimizing the M′ points and the weights a_n.
Figure 12. Two decomposition layers: (a) First layer. (b) Last layer.
Figure 13. (a) Original image. (b) Igelnik’s approximation for N = 5 and identical weight for every layer. (c) Igelnik’s
approximation for N = 5 and optimized weights an .
4. KST AND WAVELET DECOMPOSITION
Section 3.4 illustrates a reconstruction result and shows the emergence of artifacts when using Igelnik's approach. Igelnik's monovariate functions rely on the size of the tiles, i.e. the number of tiles per layer. The definition of the external functions implies that some pixels of the image are not used to define the network: depending on the tilage, the decomposed image is defined by fewer pixels, which can induce a lossy compression. We therefore study the quality of the reconstruction as a function of the tile size. We propose to apply Igelnik's decomposition to wavelet decomposition images, which are simplified images, especially the high frequency images. Reconstruction artifacts in the high frequency images affect the details of the original image: using adapted tile sizes for the low and high frequency images reduces the reconstruction error and decreases the number of artifacts.
To characterize this compression, the size of Igelnik's network has to be compared to the size of the original image: to decompose a 100x100 pixels image, how many doubles must be stored to define Igelnik's network? Igelnik's network is characterized by the internal functions ψ (one per dimension and layer), the dimensional constants λ_i and the external functions g_n (one per layer). The size of Igelnik's network is the size of the couple consisting of the internal functions (per dimension) and the external function, multiplied by the number of layers. The size of the internal functions ψ is related to the number of sampling points M: most of the sampling points are located on plateaus, i.e. groups of points with the same value can be constituted, which means that the size of an internal function ψ can be reduced to the number of points on each plateau (and its value y_{nij}), plus the values of the points located between two consecutive intervals I_n (about 10% of the M sampling points). Furthermore, the larger the tiles, the smaller M.
The size of external function gn is determined by the number of points that correspond to the centers of tiles,
which is equal to the number of tiles per layer. The size of the tiles can be adjusted in Igelnik’s algorithm, i.e.
the number of tiles of a layer, which also determines how many pixels of the original image are used to construct
a layer.
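As a back-of-the-envelope aid, a rough count of the stored values following this reasoning could look as below; the parameter names and the exact bookkeeping are our assumptions, not figures from the paper.

```python
def network_size(N, d, M, tiles_per_layer, plateaus_per_dim, gap_fraction=0.10):
    """Rough count of stored doubles: per layer and per dimension, one value per
    plateau plus roughly 10% of the M samples lying between intervals (inner
    functions); one value per tile and per layer (external functions); plus the
    d coefficients lambda_i and the N layer weights a_n."""
    inner = N * d * (plateaus_per_dim + gap_fraction * M)
    external = N * tiles_per_layer
    return inner + external + d + N
```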
To decrease the size of Igelnik's network, the size of the external functions g_n has to be reduced, and thus the number of tiles. The size of the edge of a tile is N × δ and the gap between two consecutive intervals is δ. To decrease the number of tiles per layer, δ or the number of layers N has to be increased.
Increasing δ produces larger gaps between consecutive intervals I_n, where the internal function is optimized: the reconstruction error grows.
Increasing N adds new couples of internal and external functions, which augments the complete size of the network, but reduces the distance over which the internal functions are optimized: the reconstruction error decreases.
The result of one level wavelet image decomposition can be represented with 4 sub-images: a low frequencies
image and 3 high frequencies images for the horizontal, diagonal and vertical details. We consider a Haar wavelet decomposition. We apply Igelnik's algorithm on these decomposition images, varying the size of the tilage. The original image is a 200x200 pixels image, so the 4 decomposition sub-images are 100x100 pixels. We use a 5-layer network. We reconstruct the wavelet images and then the original image, to determine the PSNR. The
results are presented in table 1 and figure 14. We give an example of this decomposition in figure 15(a)-(j), which corresponds to 1740 tiles/layer in Igelnik's network. These results show that Igelnik's decomposition produces artifacts in the low frequencies image that remain present in the final reconstruction. They also confirm that the weight optimization is inefficient: with small tiles (one per pixel, first table line), the reconstruction is more accurate without weighted layers, and the improvement brought by weight optimization with fewer than 3090 tiles/layer (right part of the curve) is negligible. To improve the reconstruction, we can combine different tilage sizes to reduce the impact of the Igelnik decomposition artifacts that appear with the largest tiles. We decomposed the low frequencies image using small tiles to obtain 10000 tiles/layer (every pixel is used to construct the network), and 280 tiles/layer for the images of details (only about 1540 pixels are used to construct the network).
The reconstruction PSNR is 25.90dB. Using 1110 tiles/layer for images of details, we increase reconstruction
PSNR to 26.57dB, which is illustrated on figure 15(k)-(o).
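The experiment above might be scripted as in the following sketch; approximate(img, tiles_per_layer) stands for an Igelnik decomposition and reconstruction of one sub-image and is assumed to exist, while the wavelet step uses the PyWavelets Haar transform.

```python
import numpy as np
import pywt  # PyWavelets

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((np.asarray(original, float) - np.asarray(reconstructed, float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def wavelet_igelnik(image, approximate, tiles_low, tiles_high):
    """One-level Haar decomposition; each sub-image is approximated with its own
    tilage density, then the image is rebuilt by the inverse transform."""
    cA, (cH, cV, cD) = pywt.dwt2(image, 'haar')       # low frequencies + 3 detail images
    cA_r = approximate(cA, tiles_low)                 # dense tilage on low frequencies
    cH_r, cV_r, cD_r = (approximate(c, tiles_high) for c in (cH, cV, cD))
    rec = pywt.idwt2((cA_r, (cH_r, cV_r, cD_r)), 'haar')
    return rec, psnr(image, rec)
```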
5. CONCLUSION AND PERSPECTIVES
We have dealt with the problem of the decomposition of multivariate functions into monovariate functions. To construct the monovariate functions, we have presented two algorithms: Sprecher's algorithm, which provides an exact reconstruction by strictly reproducing the organization scheme of the theorem; and Igelnik's algorithm, which approximates the monovariate functions, generating larger reconstruction errors than Sprecher's but improving the flexibility over the parameters and the definition of the monovariate functions. We also showed that Igelnik's method can be adapted to control the size of the monovariate functions, and that the network optimization was not sufficient to remove the artifacts from the reconstructed images.
number of tiles/layer    PSNR (dB), optimized layer weights    PSNR (dB), no layer weights
10000                    25.28                                 52.09
8570                     26.16                                 27.21
6940                     24.72                                 23.41
3090                     23.87                                 23.87
1740                     22.25                                 22.02
1110                     22.62                                 22.58
280                      19.37                                 19.15

Table 1. PSNR of image reconstruction.
Figure 14. PSNR of image reconstruction vs. number of tiles. The dotted line corresponds to optimized weights; the solid line to unweighted layers.
We also showed that changing the tile size leads to only a partial use of the original image information, which can be considered as a compression process: we studied the size of Igelnik's network and concluded that reducing it requires reducing the number of tiles per layer, which increases the reconstruction error.
In section 3.2, we introduced space filling curves. Igelnik's internal functions are not strictly increasing: the function ξ has a constant value over a tile, so the points contained in a tile can be swept in any order, as shown in figure 11. Consequently, two neighbor points are not always associated to consecutive values through the function ξ: a homogeneous area of the image is not transposed to a unique interval in the external function g. Because of the noisy shape of the external function, an indetermination in the abscissa implies a larger indetermination through the external function g. Controlling the image sweeping to avoid edge crossings can reduce the oscillation of the external function, which reduces the impact of location errors through the internal functions.
Moreover, the images of details obtained from the wavelet decomposition have a large number of pixels equal to 0. Controlling the sweep of the image allows all the black pixels to be scanned separately from the non-black pixels, which also simplifies the external functions by introducing an interval equal to zero.
Considering the compression process, another research perspective concerns the compression of the monovariate functions. We only considered a naive compression of the internal functions in our approach. An effective monovariate compression would provide flexibility over the tilage density, which could be increased to improve the reconstruction.
REFERENCES
[1] Sprecher, D. A. and Draghici, S., “Space-filling curves and Kolmogorov superposition-based neural networks,” Neural Networks 15(1), 57–67 (2002).
[2] Sprecher, D. A., “A numerical implementation of Kolmogorov’s superpositions,” Neural Networks 9(5),
765–772 (1996).
Figure 15. (a)(b)(c)(d) Wavelet decomposition of lena (low frequencies image, horizontal, diagonal and vertical image
of details, respectively). (e) Direct reconstruction using (a)(b)(c)(d). (f)(g)(h)(i) Corresponding reconstructions using
Igelnik’s network with 1740 tiles/layer, and (j) the final reconstruction. (k)(l)(m)(n) Reconstruction of (a)(b)(c)(d) with
different tilage densities, and (o) the final reconstruction.
[3] Sprecher, D. A., “A numerical implementation of Kolmogorov’s superpositions II,” Neural Networks 10(3), 447–457 (1997).
[4] Igelnik, B. and Parikh, N., “Kolmogorov’s spline network,” IEEE Transactions on Neural Networks 14(4), 725–733 (2003).
[5] Sprecher, D. A., “An improvement in the superposition theorem of Kolmogorov,” Journal of Mathematical
Analysis and Applications 38, 208–213 (1972).
[6] Hecht-Nielsen, R., “Kolmogorov’s mapping neural network existence theorem,” Proceedings of the IEEE
International Conference on Neural Networks III, New York , 11–13 (1987).
[7] Braun, J. and Griebel, M., “On a constructive proof of Kolmogorov’s superposition theorem,” Constructive
approximation (2007).
[8] Brattka, V., “Du 13-ième problème de Hilbert à la théorie des réseaux de neurones : aspects constructifs du
théorème de superposition de Kolmogorov,” L’héritage de Kolmogorov en mathématiques. Éditions Belin,
Paris. , 241–268 (2004).
[9] Igelnik, B., Pao, Y.-H., LeClair, S. R., and Shen, C. Y., “The ensemble approach to neural-network learning
and generalization,” IEEE Transactions on Neural Networks 10, 19–30 (1999).
[10] Igelnik, B., Tabib-Azar, M., and LeClair, S. R., “A net with complex weights,” IEEE Transactions on
Neural Networks 12, 236–249 (2001).
[11] Köppen, M., “On the training of a Kolmogorov Network,” Lecture Notes in Computer Science, Springer
Berlin 2415, 140 (2002).
[12] Lagunas, M. A., Pérez-Neira, A., Nájar, M., and Pagés, A., “The Kolmogorov Signal Processor,” Lecture
Notes in Computer Science, Springer Berlin 686, 494–512 (1993).
[13] Moon, B., “An explicit solution for the cubic spline interpolation for functions of a single variable,” Applied
Mathematics and Computation 117, 251–255 (2001).