Trainable Block Transform
Kyong Hwan Jin
Dept. of Electrical Engineering & Computer Science
Image Processing Lab.
Overview
■ Introduction to block transform coding
■ Deep block transform
■ Trainable block transform
■ Application to autoencoders
■ Summary
Block Transform | Basics
■ Block transform (transform coding)
*Bernd Girod: EE368b Image and Video Compression
■ Variable block-size transform coding in video codecs (HEVC)
Schwarz, Heiko, Thomas Schierl, and Detlev Marpe. "Block structures and parallelism features in HEVC." High Efficiency Video Coding (HEVC). Springer, Cham, 2014. 49-90.
Previous research
■ JPEG encoding: block splitting >> 8x8 DCT on each block (see the sketch below)
https://www.edn.com/baseline-jpeg-compression-juggles-image-quality-and-size/
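A minimal sketch of this encoding step, assuming NumPy and SciPy: split the image into non-overlapping 8x8 blocks and apply an orthonormal 2-D DCT to each. The helper name block_dct and the random test image are illustrative, not from the slides.

```python
import numpy as np
from scipy.fftpack import dct

def block_dct(img, b=8):
    """Apply an orthonormal 2-D DCT to every non-overlapping b x b block."""
    h, w = img.shape
    out = np.empty_like(img, dtype=np.float64)
    for i in range(0, h, b):
        for j in range(0, w, b):
            block = img[i:i+b, j:j+b]
            # separable 2-D DCT-II: transform rows, then columns
            out[i:i+b, j:j+b] = dct(dct(block, norm='ortho', axis=0),
                                    norm='ortho', axis=1)
    return out

img = np.random.rand(64, 64)    # stand-in for a luma channel
coeffs = block_dct(img)         # coeffs[0, 0] is the DC term of the first block
```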
Equivalence between convolution and block transform
■ We derive a block transform from a convolutional layer with stride ≥ 2 and kernel size ≥ stride
Equivalence between convolution and block transform
■ s: stride, k: kernel size (e.g., a 2x2 kernel), x: input discrete signal
■ No overlap occurs when stride == kernel size
>> the convolution becomes a block transform
>> the block transform becomes trainable when we use a trainable convolution layer with the same stride and kernel size (see the sketch below)
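A minimal sketch, assuming PyTorch, of the equivalence stated above: a stride-2 convolution with a 2x2 kernel produces exactly the same output as unfolding the input into non-overlapping 2x2 blocks and multiplying every block by the flattened kernel matrix. The shapes and random weights are illustrative.

```python
import torch
import torch.nn.functional as F

s = k = 2                                   # stride == kernel size -> no overlap
x = torch.randn(1, 1, 8, 8)                 # input signal x
w = torch.randn(4, 1, k, k)                 # 4 output channels, one per basis atom

# 1) Strided convolution
y_conv = F.conv2d(x, w, stride=s)           # shape (1, 4, 4, 4)

# 2) Equivalent block transform: unfold into non-overlapping k x k blocks,
#    then multiply every block by the flattened kernel (transform) matrix
blocks = F.unfold(x, kernel_size=k, stride=s)       # (1, k*k, n_blocks)
T = w.reshape(4, -1)                                # transform matrix, 4 x (k*k)
y_block = (T @ blocks).reshape(1, 4, 4, 4)

print(torch.allclose(y_conv, y_block, atol=1e-6))   # True
```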
Padding-free backpropagation
[Figures: a typical convolution layer vs. the proposed block transform]
Padding-free backpropagation
■ C: a block Toeplitz (convolution) matrix; Z: a matrix producing the zero-padded vector
■ Local gradient for C: the zero rows inserted by Z introduce additional error terms (4 + 2H + 2W of them for one-pixel padding of an H×W input)
Padding-free backpropagation
■ Bb: a transposed block Toeplitz matrix; Ba: a block Toeplitz matrix
■ Local gradient for Ba and Bb: both are non-overlapping convolution / transposed-convolution matrices, so they are full rank, and no additional error comes from a rank-deficient matrix such as Z (see the sketch below)
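A minimal sketch, assuming NumPy, of the full-rank argument: the operator of a non-overlapping k x k convolution is block diagonal with the same kernel matrix repeated, so it is full rank whenever that kernel matrix is invertible, and the backward pass introduces no error of the kind caused by the rank-deficient padding matrix Z. Sizes are illustrative.

```python
import numpy as np

k, n_blocks = 2, 4
rng = np.random.default_rng(0)
K = rng.standard_normal((k * k, k * k))        # one kernel matrix (invertible a.s.)
B = np.kron(np.eye(n_blocks), K)               # non-overlapping conv operator:
                                               # block diagonal, K repeated per block

print(np.linalg.matrix_rank(B) == B.shape[0])  # True: full rank, exact gradients
```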
DC term (zero-frequency) removal
■ In K-SVD, an atom consisting of ones averages the values in a patch >> the DC term
■ In the DCT, a DC (zero-frequency) basis is present
■ In the discrete wavelet transform, further dyadic levels operate on the residuals left after subtracting the scaling-function output >> no DC term is carried to the next level (a sketch follows)
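A minimal sketch, assuming NumPy, of DC-term removal in the block-transform setting: subtract each block's mean (its zero-frequency component) before the transform and keep it separately. The helper remove_dc is illustrative, not the slides' exact procedure.

```python
import numpy as np

def remove_dc(blocks):
    """blocks: (n_blocks, k*k) array; returns zero-mean blocks and the DC terms."""
    dc = blocks.mean(axis=1, keepdims=True)   # per-block zero-frequency component
    return blocks - dc, dc

blocks = np.random.rand(16, 4)                # sixteen flattened 2x2 blocks
zero_mean, dc = remove_dc(blocks)
print(np.allclose(zero_mean.mean(axis=1), 0)) # True: DC removed from every block
```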
Nonlinear activations
■ CLIP
■ Ablation studies
Architecture of deep block transform
Autoencoders
"Autoencoding" is a data compression algorithm where the compression and decompression functions are
1) data-specific, 2) lossy, and 3) learned automatically from examples rather than engineered by a human.
Additionally, in almost all contexts where the term "autoencoder" is used, the compression and
decompression functions are implemented with neural networks.
https://blog.keras.io/building-autoencoders-in-keras.html
Experiments
■ We build autoencoders with the trainable block transform (a PyTorch sketch follows)
■ AE: Cv(st:1, kr:3), M(st:2), Cv(st:1, kr:3), M(st:2), Cv(st:1, kr:3), B(st:2), Cv(st:1, kr:3), B(st:2), Cv(st:1, kr:3, c:1); Cv: convolution, M: max pooling, B: bilinear interpolation
■ BTN-K-S: Cv(st:S, kr:K), Cv(st:S, kr:K), CvT(st:S, kr:K), CvT(st:S, kr:K, c:1); CvT: transposed convolution
■ BTN/DC-2-2: the proposed trainable block transform (K = S = 2, with DC-term removal)
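A minimal sketch, assuming PyTorch, of a BTN-2-2-style autoencoder under the notation above: stride-2 convolutions with 2x2 kernels as trainable block transforms, mirrored by transposed convolutions. The channel widths are illustrative assumptions, not the slides' exact configuration.

```python
import torch.nn as nn

btn_2_2 = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=2, stride=2),            # block transform, level 1
    nn.Conv2d(16, 32, kernel_size=2, stride=2),           # block transform, level 2
    nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2),  # inverse transform, level 2
    nn.ConvTranspose2d(16, 1, kernel_size=2, stride=2),   # inverse transform, level 1
)
```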
Experiments - dataset
■ Study 1
■ 64×64 numerical images with 4 lines of different widths at every boundary; the six generated images were ternary (0, 0.5, 1)
■ L2 loss, 50 epochs, Adam with learning rate 10⁻³
■ Fashion-MNIST
■ L2 loss, batch size 32, 10 epochs, Adam with learning rate 10⁻³
■ Training/validation split: 5:1
■ BSDS500
■ Luma channel only
■ L2 loss, batch size 8, 100 epochs, Adam with learning rate 10⁻³ (a training sketch follows)
■ Training/validation split: 4:1
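A minimal sketch, assuming PyTorch, of the training setup listed above: L2 (MSE) reconstruction loss and Adam with learning rate 10⁻³, shown on a dummy batch at Fashion-MNIST resolution; btn_2_2 is the sketch from the experiments slide.

```python
import torch

model = btn_2_2                                  # autoencoder sketched above
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()

data = torch.rand(32, 1, 28, 28)                 # stand-in batch, Fashion-MNIST size
for epoch in range(10):                          # e.g., 10 epochs for Fashion-MNIST
    opt.zero_grad()
    loss = loss_fn(model(data), data)            # L2 reconstruction of the input
    loss.backward()
    opt.step()
```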
Experiment - Study 1
■ Baselines
Experiment – Fashion-MNIST
Experiment - summary
Experiment – BSDS500
■ LC:8, C:16
Experiment – BSDS500
■ BSDS500 (LC:8)
Conclusion
■ We derive a trainable block transform from a convolutional network with stride > 1 and kernel size equal to the stride
■ Applied to autoencoders, the trainable block transform yields better representations than a standard autoencoder with 3×3 kernels and stride 1
■ With these simple changes, convolutional neural networks become trainable block transforms