Kolmogorov Superposition Theorem and its application to
wavelet image decompositions
Pierre-Emmanuel Leni, Yohan D. Fougerolle, and Frédéric Truchetet
Université de Bourgogne, Laboratoire LE2I, UMR CNRS 5158,
12 rue de la fonderie, 71200 Le Creusot, France
ABSTRACT
This paper deals with the decomposition of multivariate functions into sums and compositions of monovariate functions. The global purpose of this work is to find a suitable strategy to express complex multivariate functions using simpler functions that can be analyzed with well-known techniques, instead of developing complex N-dimensional tools. More precisely, most signal processing techniques are applied in 1D or 2D and cannot easily be extended to higher dimensions. We recall that such a decomposition exists in Kolmogorov's superposition theorem. According to this theorem, any multivariate function can be decomposed into two types of univariate functions, called inner and external functions. Inner functions are associated to each dimension and linearly combined to construct a hash function that associates every point of a multidimensional space to a value of the real interval [0, 1]. Every inner combination is the argument of one external function. The external functions associate real values in [0, 1] to the image, by the multivariate function, of the corresponding point of the multidimensional space.
Sprecher, in Ref. 1, has proved that internal functions can be used to construct space filling curves, i.e. there
exists a curve that sweeps the multidimensional space and uniquely matches corresponding values into [0, 1]. Our
goal is to obtain both a new decomposition algorithm for multivariate functions (at least bi-dimensional) and
adaptive space filling curves. Two strategies can be applied. Either we construct fixed internal functions to obtain
space filling curves, which allows us to construct an external function such that their sums and compositions
exactly correspond to the multivariate function; or the internal function is constructed by the algorithm and is
adapted to the multivariate function, providing different space filling curves for different multivariate functions.
We present two of the most recent constructive algorithms of monovariate functions. The first method is
due to Sprecher (Ref. 2 and Ref. 3). We provide additional explanations to the existing algorithm and present
several decomposition results for gray level images. We point out the main drawback of this method: all the
function parameters are fixed, so the univariate functions cannot be modified; in particular, the inner function cannot be modified, and neither can the space filling curve. The number of layers depends on the dimension of the decomposed function. The second algorithm, proposed by Igelnik in Ref. 4, increases the flexibility of the parameters, but only approximates the monovariate functions: the number of layers is variable, and a neural network optimizes the monovariate functions and the weights associated to each layer to ensure convergence to the decomposed
multivariate function.
We have implemented both Sprecher’s and Igelnik’s algorithms and present the results of the decompositions
of gray level images. There are artifacts in the reconstructed images, which leads us to apply the algorithm on
wavelet decomposition images. We detail the reconstruction quality and the quantity of information contained
in Igelnik’s network.
Keywords: Kolmogorov superposition theorem, multidimensional function decomposition, neural network, signal processing, image analysis, wavelets
Further author information:
pierre-emmanuel.leni@u-bourgogne.fr
yohan.fougerolle@u-bourgogne.fr
frederic.truchetet@u-bourgogne.fr
1. INTRODUCTION
In 1900, Hilbert stated 23 mathematical problems. Among them, the 13th conjectures that high order equations cannot be solved by sums and compositions of bivariate functions. 57 years later, Kolmogorov demonstrated his superposition theorem (KST) and proved the existence of monovariate functions such that every multivariate function can be expressed as sums and compositions of monovariate functions. Unfortunately, Kolmogorov did not propose any construction method for these monovariate functions.
The superposition theorem, reformulated and simplified by Sprecher in Ref. 5, can be written as:
Theorem 1.1 (Kolmogorov superposition theorem). Every continuous function defined on the identity
hypercube ([0, 1]^d, noted I^d), f : I^d → R, can be written as sums and compositions of continuous monovariate functions:

$$\begin{cases} f(x_1, \dots, x_d) = \sum_{n=0}^{2d} g_n\big(\xi(x_1 + na, \dots, x_d + na)\big) \\[6pt] \xi(x_1 + na, \dots, x_d + na) = \sum_{i=1}^{d} \lambda_i \psi(x_i + an), \end{cases} \qquad (1)$$
with ψ a continuous function, and λ_i and a constants. ψ is called the inner function and g the external function. The inner function ψ associates every component x_i of the real vector (x_1, ..., x_d) of I^d to a value in [0, 1]. The function ξ associates each vector (x_1, ..., x_d) ∈ I^d to a number y_n in the interval [0, 1]. These numbers y_n are the arguments of the functions g_n, which are summed to obtain the function f.
There are two different parts in this decomposition: the components x_i, i ∈ ⟦1, d⟧, of each dimension are combined into a real number by a hash function (the inner function ξ), which is associated to the corresponding value of f for these coordinates by the external function g. Hecht-Nielsen has shown in Ref. 6 that the monovariate functions constituting the KST can be organized as a one hidden layer neural network.
Figure 1. Representation of the KST as a one hidden layer neural network, from Ref. 3.
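To make the structure of equation (1) concrete, here is a minimal Python sketch that evaluates its right-hand side; the callables psi and g_n, the coefficients lambdas, and the constant a are placeholders for functions whose construction is discussed in the following sections, not part of the original algorithm.

```python
def kst_evaluate(x, g, psi, lambdas, a):
    """Evaluate eq. (1): f(x_1,...,x_d) = sum_{n=0}^{2d} g_n(xi_n), where
    xi_n = sum_i lambda_i * psi(x_i + n*a).  psi and the g_n are assumed
    to be already-constructed monovariate callables."""
    d = len(x)
    total = 0.0
    for n in range(2 * d + 1):                      # 2d + 1 layers
        xi_n = sum(lam * psi(x_i + n * a) for lam, x_i in zip(lambdas, x))
        total += g[n](xi_n)                         # one external function per layer
    return total
```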
Most recent contributions related to the KST address the problem of the construction of the monovariate functions. Sprecher has proposed an algorithm for exact monovariate function construction in Ref. 2 and Ref. 3, which introduces fundamental notions (such as the tilage). See Figure 3 for a representation of an internal function ξ defined by Sprecher in 2D. On the other hand, Igelnik has presented in Ref. 4 an approximating construction that offers flexibility and modification perspectives over the monovariate function construction. Precisely, using Igelnik's approximation network, the image can be represented as a superposition of layers, i.e. a superposition of images with a fixed resolution, depending on the size of the tiles. The size of the inner functions is fixed, and the size of the external functions depends on the number of tiles: images are decomposed into monovariate functions, and the complete size of the decomposition is determined by the precision of the approximation. We study the global size of the decomposition network to determine the compression properties of Igelnik's decomposition, i.e. for an image with a given number of pixels to decompose, how many numbers are required to define Igelnik's network? We show that using large tiles, which reduces the size of Igelnik's network, produces artifacts on the reconstructed images. We propose to apply Igelnik's approximation on wavelet image decompositions to reduce the number of artifacts in the reconstructed images.
The structure of the paper is as follows: we present Sprecher’s algorithm in section 2, and Igelnik’s algorithm
in section 3. In section 4, we present the results of Igelnik’s algorithm applied to the images of a wavelet
decomposition. In the last section, we conclude and consider several research perspectives.
Our contributions are a short review of Sprecher’s algorithm and a synthetic explanation of Igelnik’s algorithm.
We present the results of the application of both algorithms to gray level images. We point out the lack of
flexibility of Sprecher's algorithm and focus our study on Igelnik's network. Considering the reconstruction error of this approximation, we apply Igelnik's network to "simpler" images: wavelet decomposition images. We characterize the reconstruction quality as a function of the tile density.
2. SPRECHER’S ALGORITHM
We present the algorithm proposed by Sprecher in Ref. 2 and Ref. 3, in which he presents exact algorithms for the construction of the internal and external functions, respectively. In Ref. 7, Braun et al. have pointed out that the internal function ψ defined by Sprecher was discontinuous for several values. They provide a new definition that ensures the continuity and monotonicity of the function ψ. The construction of the internal function ψ relies on the decomposition of a real number d_k in a given base γ.
Definition 2.1 (Notations).

• d is the dimension, d ≥ 2.
• m is the number of tilage layers, m ≥ 2d.
• γ is the base of the variables x_i, γ ≥ m + 2.
• a = 1/(γ(γ − 1)) is the translation between two layers of the tilage.
• λ_1 = 1 and, for 2 ≤ i ≤ d, λ_i = Σ_{r=1}^{∞} γ^{−(i−1)(d^r−1)/(d−1)} are the coefficients of the linear combination that is the argument of the function g.
• every decimal number (noted d_k) in [0, 1] with k decimals can be written d_k = Σ_{r=1}^{k} i_r γ^{−r}, and d_k^n = d_k + n Σ_{r=2}^{k} γ^{−r} defines a translated d_k.

Using the d_k defined in definition 2.1, Braun et al. define the function ψ by:

$$\psi_k(d_k) = \begin{cases} d_k & \text{for } k = 1,\\[4pt] \psi_{k-1}\!\left(d_k - \dfrac{i_k}{\gamma^{k}}\right) + \dfrac{i_k}{\gamma^{\frac{d^{k}-1}{d-1}}} & \text{for } k > 1 \text{ and } i_k < \gamma - 1,\\[10pt] \dfrac{1}{2}\left(\psi_k\!\left(d_k - \dfrac{1}{\gamma^{k}}\right) + \psi_{k-1}\!\left(d_k + \dfrac{1}{\gamma^{k}}\right)\right) & \text{for } k > 1 \text{ and } i_k = \gamma - 1. \end{cases} \qquad (2)$$
Function ψ is applied to each component xi of the input value. The ψ(xi ) values are linearly combined with
real numbers λ_i to compute the function ξ. Figure 2 represents the plot of the function ψ on the interval [0, 1] and figure 3 represents the function ξ on the space [0, 1]^2: to every couple in [0, 1]^2, a unique value in the interval [0, 1] is associated.
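As an illustration, here is a minimal Python sketch of the recursion (2) and of the hash function ξ; it assumes each coordinate is given by its k digits i_1, ..., i_k in base γ, and the helper names (psi, xi, lambda_coeff, increment) are ours, not Sprecher's.

```python
def lambda_coeff(i, gamma, d, terms=30):
    """lambda_1 = 1; lambda_i = sum_{r>=1} gamma**(-(i-1)(d^r - 1)/(d - 1))."""
    if i == 1:
        return 1.0
    return sum(gamma ** (-(i - 1) * (d ** r - 1) / (d - 1)) for r in range(1, terms))

def increment(digits, gamma):
    """Add one unit to the last digit of a base-gamma expansion, with carries."""
    digits = list(digits)
    for pos in range(len(digits) - 1, -1, -1):
        digits[pos] += 1
        if digits[pos] < gamma:
            return digits
        digits[pos] = 0
    return [1] + digits  # overflow (d_k reaches 1): boundary case, not handled further

def psi(digits, gamma, d):
    """Eq. (2): psi_k evaluated on d_k = sum_r digits[r-1] * gamma**(-r)."""
    k, i_k = len(digits), digits[-1]
    if k == 1:
        return digits[0] / gamma
    if i_k < gamma - 1:
        # d_k - i_k * gamma**(-k) is d_{k-1}: drop the last digit
        return psi(digits[:-1], gamma, d) + i_k * gamma ** (-(d ** k - 1) / (d - 1))
    # i_k = gamma - 1: average of the two neighbouring evaluations
    left = digits[:-1] + [gamma - 2]            # d_k - gamma**(-k)
    right = increment(digits[:-1], gamma)       # d_k + gamma**(-k), carry propagated
    return 0.5 * (psi(left, gamma, d) + psi(right, gamma, d))

def xi(point_digits, gamma, d):
    """Hash function xi(x_1,...,x_d) = sum_i lambda_i * psi(x_i)."""
    return sum(lambda_coeff(i + 1, gamma, d) * psi(dig, gamma, d)
               for i, dig in enumerate(point_digits))

# example: xi of the 2D point (0.23, 0.57) at precision k = 2, base gamma = 10
print(xi([[2, 3], [5, 7]], gamma=10, d=2))
```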
Sprecher has demonstrated that the images of disjoint intervals I are disjoint intervals ψ(I). This separation property generates intervals that constitute an incomplete tilage of [0, 1]. This tilage is extended to a d-dimensional tilage by taking the cartesian product of the intervals I. In order to cover the entire space, the
tilage is translated several times by a constant a, which produces the different layers of the final tilage. Thus, we
obtain 2d + 1 layers: the original tilage constituted of the disjoint hypercubes having disjoint images through ψ,
and 2d layers translated by a along each dimension. Figure 4 represents a tilage section of a 2D space: 2d + 1 = 5
different superposed tilages can be seen, displaced by a.
Figure 2. Plot of the function ψ for the base γ = 10,
from Ref. 7.
Figure 3. Plot of the hash function ξ for dimension d = 2
and base γ = 10, from Ref. 8.
Figure 4. Section of the tilage for a 2D space and a base
γ = 10 (5 different layers), from Ref. 8.
Figure 5. Function ξ associates each paving block (hypercube) to a unique interval Tk in [0, 1].
For a 2D space, a hypercube is associated with a couple d_{k_r} = (d_{k_r,1}, d_{k_r,2}). Each hypercube S_{k_r}(d_{k_r}) is uniquely associated to an interval T_{k_r}(d_{k_r}) by the function ξ, see Figure 5.
A single set of internal functions ψ and ξ is used for every function f. Only the external functions g_n are adapted to each function to decompose. The external functions g_n are not explicitly defined: an algorithm iteratively constructs approximations g_n^r in three steps, such that their sum converges to the external function g_n. The first step is dedicated to the determination of the precision k_r, the second step to the computation of the internal functions ψ and ξ, and the third step to the partial external functions g_n^r and f_r. Every iteration r is computed at a precision k_r. The constructed approximations g_n^r depend on the approximation error function f_r, which tends to 0 when r increases. Consequently, the more iterations, the better the approximation of the function f by the functions g_n. We invite the interested reader to refer to Ref. 3 for a complete description of the algorithm.
2.1 Results
We apply Sprecher's algorithm to gray level images (considered as bivariate functions of the form f (x, y) =
I(x, y)) and present the reconstruction results. The algorithm constructs images that correspond to layers of
external functions, that are summed to obtain the final image. Each layer is a sum of approximations at a
given accuracy r, obtained at each iteration of the algorithm. Figure 6 represents two layers obtained after
two iterations, and figure 7 shows two reconstructions of the same original image, using layers obtained after one and two iterations of Sprecher's algorithm: few differences exist between the one- and two-iteration approximations, which illustrates that the reconstruction rapidly converges to the original image.
3. IGELNIK’S ALGORITHM
The parameters of Sprecher’s algorithm are fixed and cannot be adjusted: e.g. the number of layers and the
inner function definition. Igelnik’s algorithm provides greater flexibility at the expense of reconstruction error:
Figure 6. Reconstruction after two iterations of the first layer (a) and last layer (b)
Figure 7. (a) Original image. (b) and (c) Reconstruction after one and two iterations, respectively.
the monovariate functions are approximated. The original equation (1) has been adapted to:

$$f(x_1, \dots, x_d) \simeq \sum_{n=1}^{N} a_n g_n\!\left(\sum_{i=1}^{d} \lambda_i \psi_{ni}(x_i)\right) \qquad (3)$$
For a given layer n, d inner functions ψ_{ni} are randomly generated: one per dimension (index i) and per layer (index n), independently from the function f. The convex combination of these internal functions ψ_{ni} with real values λ_i is the argument of the external function g_n, the real numbers λ_i (one per dimension) being chosen linearly independent, strictly positive, and such that Σ_{i=1}^{d} λ_i ≤ 1. Finally, the external functions g_n are constructed. To conclude the layer construction, the functions ψ and g are sampled with M points, which are interpolated by cubic splines. Each layer is weighted by a coefficient a_n, and the layers are summed to approximate the multivariate function f.
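For clarity, a minimal sketch of how equation (3) combines the layers might look as follows; the per-layer data (a_n, g_n, ψ_{n1}, ..., ψ_{nd}) are assumed to be already-built callables, and the data structure chosen here is ours.

```python
def igelnik_approximation(x, layers, lambdas):
    """Eq. (3): f(x) ~ sum_n a_n * g_n( sum_i lambda_i * psi_ni(x_i) ).
    `layers` is a list of (a_n, g_n, [psi_n1, ..., psi_nd]) tuples."""
    total = 0.0
    for a_n, g_n, psi_n in layers:
        t = sum(lam * psi_f(x_i) for lam, x_i, psi_f in zip(lambdas, x, psi_n))
        total += a_n * g_n(t)
    return total
```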
One concept shared with Sprecher's approach is the definition of a tilage, which is defined once and for all at the beginning of the algorithm. The tilage is constituted of hypercubes C_n obtained by the cartesian product of the intervals I_n(j), defined as follows:
Definition 3.1.

$$\forall n \in \llbracket 1, N \rrbracket, \; j \geq -1, \quad I_n(j) = \big[(n-1)\delta + (N+1)j\delta, \; (n-1)\delta + (N+1)j\delta + N\delta\big],$$

where δ is the distance between two intervals I of length Nδ, chosen such that the oscillation of the function f is smaller than 1/N on each hypercube C. The values of j are defined such that the generated intervals I_n(j) intersect the interval [0, 1]. Figure 8 illustrates such a construction of the intervals I.

Figure 8. From Ref. 4, intervals I_1(0) and I_1(1) for N = 4.
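A small Python sketch of how the intervals of Definition 3.1 could be enumerated (the function name and the stopping rule are our own choices):

```python
def intervals(n, N, delta):
    """Intervals I_n(j), j >= -1, from Definition 3.1 that intersect [0, 1]."""
    out, j = [], -1
    while True:
        left = (n - 1) * delta + (N + 1) * j * delta
        right = left + N * delta
        if left > 1:            # interval lies entirely beyond [0, 1]: stop
            break
        if right >= 0:          # keep only intervals intersecting [0, 1]
            out.append((left, right))
        j += 1
    return out

# Figure 8 setting: N = 4; with delta = 0.1 the first two intervals are
# I_1(0) = [0.0, 0.4] and I_1(1) = [0.5, 0.9]
print(intervals(n=1, N=4, delta=0.1))
```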
3.1 Inner functions construction
Each function ψ_{ni} is defined as follows: generate a set of distinct numbers y_{nij}, one per index j, between ∆ and 1 − ∆, 0 < ∆ < 1, such that the oscillation of the interpolating cubic spline of the ψ values on an interval of length δ is lower than ∆. The indices j are given by definition 3.1. The real numbers y_{nij} are sorted, i.e. y_{nij} < y_{nij+1}. The image of the interval I_n(j)
by function ψ is ynij . This discontinuous inner function ψ is sampled by M points, that are interpolated by a
cubic spline. We obtain two sets of points: points located on plateaus over intervals In (j), and points M ′ located
between two intervals In (j) and In (j + 1), that are randomly placed. Points M ′ are optimized during the neural
network construction, using a stochastic approach. Figure 9(a) represents final function ψ on the interval [0, 1],
and figure 9(b) illustrates a construction example of function ψ for two consecutive intervals In (j) and In (j + 1).
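A minimal Python sketch of this construction (the function names are ours, and the random placement of the gap points stands in for the stochastic optimization described in section 3.3):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def build_inner_function(intervals, Delta, rng):
    """Sketch of one inner function psi_ni: one sorted random plateau value
    y_nij per interval I_n(j), one random point in each gap between consecutive
    intervals, all interpolated by a cubic spline."""
    y = np.sort(rng.uniform(Delta, 1 - Delta, size=len(intervals)))  # plateau values
    xs, ys = [], []
    for idx, ((left, right), y_j) in enumerate(zip(intervals, y)):
        xs += [left, right]              # psi is constant on I_n(j)
        ys += [y_j, y_j]
        if idx + 1 < len(intervals):     # one M' point in the gap, placed randomly
            xs.append(rng.uniform(right, intervals[idx + 1][0]))
            ys.append(rng.uniform(y_j, y[idx + 1]))
    return CubicSpline(xs, ys)

rng = np.random.default_rng(0)
psi_11 = build_inner_function([(0.0, 0.4), (0.5, 0.9)], Delta=0.05, rng=rng)
```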
Once the functions ψ_{ni} are constructed, the function ξ_n(x) = Σ_{i=1}^{d} λ_i ψ_{ni}(x_i) can be evaluated. On the hypercubes C_{n,j_1,...,j_d}, the function ξ_n has constant values p_{n,j_1,...,j_d} = Σ_{i=1}^{d} λ_i y_{n,i,j_i}. The random numbers y_{n,i,j_i} are generated such that the values p_{n,j_1,...,j_d} are all different, ∀i ∈ ⟦1, d⟧, ∀n ∈ ⟦1, N⟧, ∀j ∈ ℕ, j ≥ −1.
Figure 9. (a) Example of function ψ sampled by 500 points that are interpolated by a cubic spline. (b) From Ref.4, plot
of ψ. On the intervals In (j) and In (j + 1), ψ has constant values, respectively ynij and ynij+1 .
3.2 External function construction
The function gn is defined as follows:
• For every real number t = p_{n,j_1,...,j_d}, the function g_n(t) is equal to the N-th of the value of the function f at the center of the hypercube C_{n,j_1,...,j_d}, noted b_{n,j_1,...,j_d}, i.e.: g_n(p_{n,j_1,...,j_d}) = (1/N) b_{n,j_1,...,j_d}.

• The definition interval of the function g_n is extended to all t ∈ [0, 1]. Consider two adjacent points A(t_A, g_n(t_A)) and D(t_D, g_n(t_D)), where t_A and t_D are two levels p_{n,j_1,...,j_d}. Two points B and C are placed in the neighborhoods of A and D, respectively. Points B and C are connected with a line of slope r = (g_n(t_C) − g_n(t_B)) / (t_C − t_B). Points A(t_A, g_n(t_A)) and B(t_B, g_n(t_B)) are connected with a nine degree spline s, such that: s(t_A) = g_n(t_A), s(t_B) = g_n(t_B), s′(t_B) = r, and s^(2)(t_B) = s^(3)(t_B) = s^(4)(t_B) = 0. Points C and D are connected with a similar nine degree spline. The connection conditions at points A and D of both nine degree splines give the remaining conditions. Figure 10(a) illustrates this construction, whereas (b) gives a complete overview of the function g_n for a layer.
Remark 1. Points A and D (values of the function f at the centers of the hypercubes) are not regularly spaced on the interval [0, 1], since their abscissas are given by the function ξ and depend on the random values y_{nij} ∈ [0, 1]. The placement of points B and C in the circles centered in A and D must preserve the order of the points A, B, C, D, i.e. the radius of these circles must be smaller than half of the distance between the two points A and D.
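As a rough illustration of the first step of this construction, the anchor values of g_n could be set up as below; for simplicity this sketch interpolates the anchors with a single cubic spline rather than the nine degree splines and connecting lines of the original construction, and the function names are ours.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def build_external_function(levels, center_values, N):
    """Sketch of g_n: at each level t = p_{n,j1,...,jd}, set g_n(t) = b / N, where
    b is the value of f at the center of the corresponding hypercube, then
    extend g_n to [0, 1] by interpolation (here a plain cubic spline)."""
    order = np.argsort(levels)                          # levels are irregularly spaced
    t = np.asarray(levels, dtype=float)[order]
    g = np.asarray(center_values, dtype=float)[order] / N
    return CubicSpline(t, g)
```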
The external function has a noisy shape, which is related to the global sweeping scheme of the image: Sprecher et al. have demonstrated in Ref. 1 that space-filling curves can be defined using the internal functions. The function ξ associates a unique real value to every couple (d_{k_r,1}, d_{k_r,2}) of the multidimensional space [0, 1]^d. Sorting these
real values defines a unique path through the tiles of a layer: the space filling curve. Figure 11 illustrates an
example of such a curve: the pixels are swept without any neighborhood property conservation.
Figure 10. (a) From Ref.4, plot of gn . Points A and D are obtained with function ξ and function f . (b) Example of
function gn for a complete layer of lena decomposition.
Figure 11. Igelnik’s space filling curve.
3.3 Neural network stochastic construction
Igelnik defines parameters during construction that are optimized using a stochastic method (ensemble approach):
the weights an associated to each layer, and the placement of the sampling points M ′ of inner functions ψ that
are located between two consecutive intervals.
To evaluate the network convergence, three sets of points are constituted: a training set DT , a generalization set
DG , and a validation set DV .
N layers are successively built. To add a new layer, K candidate layers are generated with the same plateaus
ynij , which gives K new candidate neural networks. The difference between two candidate layers is the set of
sampling points located between two intervals In (j) and In (j + 1), that are randomly chosen. We keep the layer
from the network with the smallest mean squared error that is evaluated using the generalization set DG . The
weights an are obtained by minimizing the difference between the approximation given by the neural network
and the image of function f for the points of the training set DT . The algorithm is iterated until N layers are
constructed. The validation error of the final neural network is determined using validation set DV .
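A compact sketch of this selection step (the callables and the set representation are assumptions of ours, not Igelnik's API):

```python
import numpy as np

def add_best_layer(build_candidate_layer, network_mse, D_G, K=10):
    """Generate K candidate layers that differ only by the random sampling points
    between intervals, and keep the one giving the smallest mean squared error
    of the resulting network on the generalization set D_G."""
    candidates = [build_candidate_layer() for _ in range(K)]
    errors = [network_mse(layer, D_G) for layer in candidates]
    return candidates[int(np.argmin(errors))]
```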
To determine the coefficients a_n, the difference between f and its approximation f̃ must be minimized:

$$\left\| Q_n a_n - t \right\|, \quad \text{noting } t = \begin{pmatrix} f(x_{1,1}, \dots, x_{d,1}) \\ \vdots \\ f(x_{1,P}, \dots, x_{d,P}) \end{pmatrix}, \qquad (4)$$

with Q_n a matrix of column vectors q_k, k ∈ ⟦0, n⟧, where q_k corresponds to the approximation f̃_k of the k-th layer at the points (x_{1,1}, ..., x_{d,1}), ..., (x_{1,P}, ..., x_{d,P}) of D_T:

$$Q_n = [q_0, q_1, \dots, q_n], \quad \text{with } \forall k \in \llbracket 0, n \rrbracket, \; q_k = \begin{pmatrix} \tilde{f}_k(x_{1,1}, \dots, x_{d,1}) \\ \vdots \\ \tilde{f}_k(x_{1,P}, \dots, x_{d,P}) \end{pmatrix}.$$

An evaluation of the solution a_n = Q_n^{-1} t is proposed by Igelnik in Ref. 9. The coefficient a_l of the column vector (a_0, ..., a_n)^T is the weight associated to layer l, l ∈ ⟦0, n⟧.
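In practice, the minimization (4) is an ordinary least-squares problem; a minimal sketch (our own formulation, not the ensemble formulas of Ref. 9) is:

```python
import numpy as np

def layer_weights(layer_outputs, f_values):
    """Solve min_a ||Q_n a - t||: column k of Q_n holds the k-th layer's
    approximation at the P training points of D_T, and t holds the true
    values of f at those points."""
    Q = np.column_stack(layer_outputs)              # P x (n+1) matrix Q_n
    a, *_ = np.linalg.lstsq(Q, np.asarray(f_values, dtype=float), rcond=None)
    return a                                        # weights (a_0, ..., a_n)
```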
3.4 Results
We have applied Igelnik’s algorithm to gray level images. In this example, the original image is a 100x100 pixels
lena, decomposed into 5 layers, with about 1740 tiles each, that are summed to obtain the final approximation.
Considering the translation of the different layers, it means that only about 6250 different pixels over 10000
are used to construct the network. The reconstruction PSNR is 22.29dB for the approximation using optimized
weights, and 22.20dB without. The result of such an approximation is represented in figure 13, and figure 12 details two layers: the first (N = 1) and the last (N = 5). These results show that the network optimization is not efficient enough: the optimization does not significantly improve the convergence, and the artifacts that can be seen (see figure 13 for example) are located between the intervals I_n, where the internal functions are adapted. In other words, artifacts appear between tiles and cannot be efficiently removed only by optimizing the M′ points and the weights a_n.
Figure 12. Two decomposition layers: (a) First layer. (b) Last layer.
Figure 13. (a) Original image. (b) Igelnik’s approximation for N = 5 and identical weight for every layer. (c) Igelnik’s
approximation for N = 5 and optimized weights an .
4. KST AND WAVELET DECOMPOSITION
Section 3.4 illustrates a reconstruction result and shows the emergence of artifacts when using Igelnik's approach. Igelnik's monovariate functions rely on the size of the tiles, i.e. the number of tiles per layer. The definition of the external functions implies that some pixels of the image are not used to define the network: depending on the tilage, the decomposed image is defined by fewer pixels, which can induce a lossy compression. We therefore study the quality of the reconstruction as a function of the tile size. We propose to apply Igelnik's decomposition to wavelet decomposition images, which are simplified images, especially the high frequency images. Reconstruction artifacts in the high frequency images affect the details of the original image: using adapted tile sizes for the low and high frequency images reduces the reconstruction error and decreases the number of artifacts.
To characterize this compression, the size of Igelnik's network has to be compared to the size of the original image: to decompose a 100x100 pixels image, how many doubles must be stored to define Igelnik's network? Igelnik's network is characterized by the internal functions ψ (one per dimension and layer), the dimensional constants λ_i and the external functions g_n (one per layer). The size of Igelnik's network is the size of the couple consisting of the internal functions (per dimension) and the external function, multiplied by the number of layers. The size of the internal functions ψ is related to the number of sampling points M: most of the sampling points are located on plateaus, i.e. groups of points with the same value can be constituted, which means that the size of an internal function ψ can be reduced to the number of points on each plateau (and its value y_{nij}), plus the values of the points located between two consecutive intervals I_n (about 10% of the M sampling points). Furthermore, the larger the tiles, the smaller M.
The size of external function gn is determined by the number of points that correspond to the centers of tiles,
which is equal to the number of tiles per layer. The size of the tiles can be adjusted in Igelnik’s algorithm, i.e.
the number of tiles of a layer, which also determines how many pixels of the original image are used to construct
a layer.
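As a back-of-the-envelope aid, a rough count of the stored values following this reasoning could look as below; the parameter names and the exact bookkeeping are our assumptions, not figures from the paper.

```python
def network_size(N, d, M, tiles_per_layer, plateaus_per_dim, gap_fraction=0.10):
    """Rough count of stored doubles: per layer and per dimension, one value per
    plateau plus roughly 10% of the M samples lying between intervals (inner
    functions); one value per tile and per layer (external functions); plus the
    d coefficients lambda_i and the N layer weights a_n."""
    inner = N * d * (plateaus_per_dim + gap_fraction * M)
    external = N * tiles_per_layer
    return inner + external + d + N
```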
To decrease the size of Igelnik's network, the size of the external functions g_n has to be reduced, and thus the number of tiles. The size of the edge of a tile is N × δ and the gap between two consecutive intervals is δ. To decrease the number of tiles per layer, δ or the number of layers N has to be increased.
Increasing δ produces larger gaps between consecutive intervals I_n, where the internal function is optimized: the reconstruction error grows.
Increasing N adds new couples of internal and external functions, which augments the complete size of the network, but reduces the distance over which the internal functions are optimized: the reconstruction error decreases.
The result of one level wavelet image decomposition can be represented with 4 sub-images: a low frequencies
image and 3 high frequencies images for the horizontal, diagonal and vertical details. We consider a Haar wavelet decomposition. We apply Igelnik's algorithm on these decomposition images, varying the size of the tilage. The original image is a 200x200 pixels image, so the 4 decomposition sub-images are 100x100 pixels. We use a 5-layer network. We reconstruct the wavelet images and then the original image, to determine the PSNR. The
results are presented in table 1 and figure 14. We give an example of this decomposition in figure 15(a)-(j), which corresponds to 1740 tiles/layer in Igelnik's network. These results show that Igelnik's decomposition produces artifacts in the low frequencies image that remain present in the final reconstruction. They also confirm that the weight optimization is inefficient: with small tiles (one per pixel, first table line), the reconstruction is more accurate without weighted layers, and the improvement brought by weight optimization with fewer than 3090 tiles/layer (right part of the curve) is negligible. To improve the reconstruction, we can combine different tilage sizes to reduce the impact of the Igelnik decomposition artifacts that appear with the largest tiles. We decomposed the low frequencies image using small tiles to obtain 10000 tiles/layer (every pixel is used to construct the network), and 280 tiles/layer for the images of details (only about 1540 pixels are used to construct the network).
The reconstruction PSNR is 25.90dB. Using 1110 tiles/layer for images of details, we increase reconstruction
PSNR to 26.57dB, which is illustrated on figure 15(k)-(o).
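The experiment above might be scripted as in the following sketch; approximate(img, tiles_per_layer) stands for an Igelnik decomposition and reconstruction of one sub-image and is assumed to exist, while the wavelet step uses the PyWavelets Haar transform.

```python
import numpy as np
import pywt  # PyWavelets

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((np.asarray(original, float) - np.asarray(reconstructed, float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def wavelet_igelnik(image, approximate, tiles_low, tiles_high):
    """One-level Haar decomposition; each sub-image is approximated with its own
    tilage density, then the image is rebuilt by the inverse transform."""
    cA, (cH, cV, cD) = pywt.dwt2(image, 'haar')       # low frequencies + 3 detail images
    cA_r = approximate(cA, tiles_low)                 # dense tilage on low frequencies
    cH_r, cV_r, cD_r = (approximate(c, tiles_high) for c in (cH, cV, cD))
    rec = pywt.idwt2((cA_r, (cH_r, cV_r, cD_r)), 'haar')
    return rec, psnr(image, rec)
```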
5. CONCLUSION AND PERSPECTIVES
We have dealt with the problem of the decomposition of multivariate functions into monovariate functions. To construct the monovariate functions, we have presented two algorithms: Sprecher's algorithm, which provides an exact reconstruction by strictly reproducing the organization scheme of the theorem; and Igelnik's algorithm, which approximates the monovariate functions, generating larger reconstruction errors than Sprecher's but improving the flexibility over the parameters and the definition of the monovariate functions. We also showed that Igelnik's method can be adapted to control the size of the monovariate functions, and that the network optimization was not sufficient to remove the artifacts from the reconstructed images.
number of tiles/layer    PSNR (dB), optimized layer weights    PSNR (dB), no layer weights
10000                    25.28                                 52.09
8570                     26.16                                 27.21
6940                     24.72                                 23.41
3090                     23.87                                 23.87
1740                     22.25                                 22.02
1110                     22.62                                 22.58
280                      19.37                                 19.15

Table 1. PSNR of image reconstruction.
Figure 14. PSNR of image reconstruction vs. number of tiles. The dotted line corresponds to optimized weights; the solid line to unweighted layers.
We also showed that changing the tile size leads to only a partial use of the original image information, which can be considered as a compression process: we studied the size of Igelnik's network and concluded that reducing it requires reducing the number of tiles per layer, which increases the reconstruction error.
In section 3.2, we introduced space filling curves. Igelnik's internal functions are not strictly increasing: the function ξ has a constant value over a tile, so the points contained in a tile can be swept in any order, as shown in figure 11. Consequently, two neighbor points are not always associated to consecutive values through the function ξ: a homogeneous area of the image is not transposed to a unique interval in the external function g. Because of the noisy shape of the external function, an indetermination in the abscissa implies a larger indetermination through the external function g. Controlling the image sweeping to avoid edge crossings can reduce the oscillation of the external function, which reduces the impact of location errors through the internal functions.
Moreover, the images of details obtained from the wavelet decomposition have a large number of pixels equal to 0. Controlling the sweep of the image allows all the black pixels to be scanned separately from the non-black pixels, which also simplifies the external functions by introducing an interval equal to zero.
Considering the compression process, another research perspective concerns the compression of the monovariate functions. We only considered a naive compression of the internal functions in our approach. An effective monovariate compression would provide flexibility over the tilage density, which could be increased to improve the reconstruction.
REFERENCES
[1] Sprecher, D. A. and Draghici, S., “Space-filling curves and Kolmogorov superposition-based neural networks,” Neural Networks 15(1), 57–67 (2002).
[2] Sprecher, D. A., “A numerical implementation of Kolmogorov’s superpositions,” Neural Networks 9(5),
765–772 (1996).
Figure 15. (a)(b)(c)(d) Wavelet decomposition of lena (low frequencies image, horizontal, diagonal and vertical image
of details, respectively). (e) Direct reconstruction using (a)(b)(c)(d). (f)(g)(h)(i) Corresponding reconstructions using
Igelnik’s network with 1740 tiles/layer, and (j) the final reconstruction. (k)(l)(m)(n) Reconstruction of (a)(b)(c)(d) with
different tilage densities, and (o) the final reconstruction.
[3] Sprecher, D. A., “A numerical implementation of Kolmogorov’s superpositions II,” Neural Networks 10(3), 447–457 (1997).
[4] Igelnik, B. and Parikh, N., “Kolmogorov’s spline network,” IEEE Transactions on Neural Networks 14(4), 725–733 (2003).
[5] Sprecher, D. A., “An improvement in the superposition theorem of Kolmogorov,” Journal of Mathematical
Analysis and Applications 38, 208–213 (1972).
[6] Hecht-Nielsen, R., “Kolmogorov’s mapping neural network existence theorem,” Proceedings of the IEEE
International Conference on Neural Networks III, New York , 11–13 (1987).
[7] Braun, J. and Griebel, M., “On a constructive proof of Kolmogorov’s superposition theorem,” Constructive
approximation (2007).
[8] Brattka, V., “Du 13-ième problème de Hilbert à la théorie des réseaux de neurones : aspects constructifs du
théorème de superposition de Kolmogorov,” L’héritage de Kolmogorov en mathématiques. Éditions Belin,
Paris. , 241–268 (2004).
[9] Igelnik, B., Pao, Y.-H., LeClair, S. R., and Shen, C. Y., “The ensemble approach to neural-network learning
and generalization,” IEEE Transactions on Neural Networks 10, 19–30 (1999).
[10] Igelnik, B., Tabib-Azar, M., and LeClair, S. R., “A net with complex weights,” IEEE Transactions on
Neural Networks 12, 236–249 (2001).
[11] Köppen, M., “On the training of a Kolmogorov Network,” Lecture Notes in Computer Science, Springer
Berlin 2415, 140 (2002).
[12] Lagunas, M. A., Pérez-Neira, A., Nájar, M., and Pagés, A., “The Kolmogorov Signal Processor,” Lecture
Notes in Computer Science, Springer Berlin 686, 494–512 (1993).
[13] Moon, B., “An explicit solution for the cubic spline interpolation for functions of a single variable,” Applied
Mathematics and Computation 117, 251–255 (2001).