Training-Based Super Resolution Enhancement using CUDA
Yang Yi-Ming
Chang Shu-How
Chang Man-Chia
{R99944035,D99922013,R99944018}@ntu.edu.tw
Abstract
Learning-based super resolution can reconstruct
high-resolution (HR) images of high quality. However, a
training-based super-resolution system has high
computational complexity in both time and space. In this
paper, we propose a GPU-based super-resolution system
that uses Super Resolution Through Neighbor Embedding
(SRNE) and a GPU brute-force (BF) KNN search method to
reduce the processing time.
SRNE assumes that the local geometries of image patches
are similar in the two distinct HR and low-resolution (LR)
feature spaces. We use the BF KNN method to search the
training database for similar image patches and reconstruct
the result from the corresponding HR training patches.
Experiments show that our system speeds up computation
by more than 10x.
1. Introduction
Super resolution is the problem of generating a
high-resolution image from one or more low-resolution
images. One of the most popular methods is “Super
Resolution Through Neighbor Embedding” (SRNE) [1]. That
work generates a high-resolution (HR) image from a single
low-resolution (LR) image with the help of a set of
training images from scenes of the same or different types.
In the method proposed by Chang et al., each generated
high-resolution image patch does not depend on only one of
the nearest neighbors in the training set.
SRNE is computationally expensive. For example, it is
very challenging to build the result image within a few
minutes, so the learning-based method cannot run online. In
this paper, we propose a training-based super-resolution
system that uses GPU parallel computing. SRNE is highly
parallel in feature extraction, training data construction,
KNN search, Gram matrix computation, and HR image
reconstruction. For the KNN search, we apply the fast GPU
KNN search of [3], which proposes a CUDA implementation
of the “brute force” KNN search.
We expect GPU computing to improve the performance of
SRNE. This in turn makes it practical to build a large
training patch dataset, which yields high-quality result
images.
The rest of the paper is organized as follows. Sect. 2
discusses related work in super resolution and GPU
computing. Sect. 3 describes our GPU-based SRNE in
detail. Sect. 4 reports the experimental results and
compares the performance of the CPU and GPU versions of
SRNE. Finally, Sect. 5 gives the discussion and conclusion.
2. Related work
Resolution enhancement by smoothing or interpolation is
widely used in image processing. Smoothing applies filters
such as the Gaussian filter or the median filter.
Interpolation is usually performed with bilinear or bicubic
interpolation, both of which are easy to implement.
However, both approaches may introduce artifacts.
Many other methods have been proposed recently. Some
make use of training images [5, 6] or require strong priors
[7]. In each of these methods, a patch of the
high-resolution image is derived from only one nearest
neighbor.
Learning-based super resolution can recover
high-resolution images of high quality. Pu et al. [2] use
random projection trees with manifold learning and a
GPU-based KNN search to accelerate the procedure.
The method we implement [1] uses multiple training
patches to generate each patch of the high-resolution
image. We find the k most similar patches and combine
them into the final high-resolution patch using a specific
weight vector. This property makes generalization over the
training samples possible, and hence fewer training samples
are required. Most important of all, the method can be
highly parallelized on the GPU.
3. System Architecture
3.1 System overview
An overview of the system architecture is shown in Fig. 1.
The architecture has two parts: LLE processing on the left
side and training processing on the right side.
Fig. 1 System flowchart
Fig. 2 shows the system flowchart in detail, with each
stage color-coded. Red stages are computed on the GPU,
light blue stages are the additional functions in the SRNE
method, and brown stages handle the training data.
Fig. 2 System flowchart in detail
At the pre-processing stage, we apply a color transform
and compute a simple patch complexity measure, such as
the patch mean and variance. We believe that patches which
are very smooth, or not evidently different from their
neighboring patches, only add computation if they are
passed through the LLE process. GPU memory is broadly
classified into three categories: global memory, shared
memory, and registers. Global memory can hold large data,
up to the capacity of the video card, which is usually
128 MB to 1 GB on the consumer market. Shared memory, in
contrast, is local memory that can only be used within the
same streaming multiprocessor (SM). We therefore need to
reduce memory usage and use memory more efficiently.
3.2 LLE processing
We first define some variables. Given a low-resolution
input image X_t, we estimate the high-resolution image Y_t
from a training set of one or more low-resolution images
X_s and the corresponding high-resolution images Y_s.
As in LLE, the procedure can be summarized in four steps:
1. Compute the feature vector of the input low-resolution
patch.
2. Find its K nearest neighbors with the KNN search method.
3. Compute the reconstruction weights of the neighbors
that minimize the reconstruction error.
4. Compute the high-resolution embedding patch Y_t^q using
the appropriate high-resolution features of the K nearest
neighbors and the reconstruction weights.
3.3 K-NN search using CUDA
K nearest neighbor (KNN) is a method for classifying
objects based on the closest training examples in feature
space. In our method, we use brute-force KNN search [3] to
find the k nearest LR patches in the training data for each
patch in the query image. Fig. 3 shows an example of the
KNN search problem.
Fig. 3 Example of the KNN search problem for k = 3. The
red cross is the query point; the blue points are the
reference points.
Let R = {r_1, r_2, r_3, …, r_m} be the set of training
patch feature vectors and Q = {q_1, q_2, q_3, …, q_n} be
the set of query image patch feature vectors. Our task is to
find the k nearest neighbors of each query patch under a
given distance. In this project we use the 2-norm distance
as the metric.
In our implementation, we use brute-force (BF) search to
find the KNN. The BF algorithm is as follows:
1. Compute all the distances between q_i and r_j,
j ∈ [1, m].
2. Sort the computed distances.
3. Choose the k training patch feature vectors
corresponding to the k smallest distances.
4. Repeat steps 1 to 3 for all query image patches.
The two main parts of the BF method are the distance
computation and the sorting. Both are highly
parallelizable, which makes the BF method well suited to a
CUDA implementation. The distance computation can be
fully parallelized since the distances between pairs of
points are independent of each other. The sorting can also
be implemented in CUDA: each thread sorts all the
distances computed for a given query patch.
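As an illustration of the BF steps above, the following is a
minimal CPU reference sketch in plain Python. It is not the
CUDA implementation used in this work; the function names
(squared_l2, brute_force_knn) and the toy data are
assumptions made only for this example.

```python
import math


def squared_l2(a, b):
    """Squared 2-norm distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_knn(queries, references, k):
    """For each query, return [(reference_index, distance), ...] of its k nearest references."""
    results = []
    for q in queries:  # step 4: repeat for every query patch
        # Step 1: distance from q to every reference vector.
        dists = [(squared_l2(q, r), j) for j, r in enumerate(references)]
        # Step 2: sort by distance (ties are broken by reference index).
        dists.sort()
        # Step 3: keep the k smallest distances.
        results.append([(j, math.sqrt(d)) for d, j in dists[:k]])
    return results


# Toy 2-D feature vectors (hypothetical data).
references = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
queries = [[0.9, 0.1], [0.0, 1.5]]
neighbors = brute_force_knn(queries, references, k=2)
for query, found in zip(queries, neighbors):
    print(query, "->", found)
```

In a CUDA version, each (query, reference) distance in
step 1 could be assigned to its own thread, and step 2 could
be handled by one thread per query, as described above.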
3.4 Neighbor embedding method
For each patch in the low-resolution image X_t, we first
compute the reconstruction weights of its neighbors in X_s
by minimizing the local reconstruction error. The HR
embedding is then estimated from the training image pairs
by preserving the local geometry. Finally, we overlap the
reconstructed HR patches in the result image.
The neighbor embedding algorithm is as follows:
1. For each patch X_t^q in image X_t:
A. Find the set N_q of K nearest neighbors in X_s,
using the brute-force KNN search [3].
B. Compute the reconstruction weights of the
neighbors that minimize the error of
reconstructing X_t^q.
C. Compute the high-resolution embedding Y_t^q
using the appropriate high-resolution
features of the K nearest neighbors and the
reconstruction weights (see the example in Fig. 4).
2. Construct the target high-resolution image Y_t by
enforcing local compatibility and smoothness
constraints between adjacent patches obtained
in step 1C.
Fig. 4 Neighbor embedding procedure applied to a
low-resolution patch for 3X magnification
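Steps 1B and 1C can be sketched as follows. This is a
minimal plain-Python sketch under stated assumptions, not
the implementation evaluated in this paper. The weights are
obtained in the usual LLE way: build the local Gram matrix G
of the differences between the query and its neighbors,
solve G w = 1, and normalize w so that it sums to one. The
regularization term reg, the function names, and the toy
data are assumptions introduced for the example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def reconstruction_weights(query, neighbors, reg=1e-6):
    """Step 1B: weights that sum to one and minimize the reconstruction error."""
    diffs = [[q - v for q, v in zip(query, nb)] for nb in neighbors]
    k = len(neighbors)
    gram = [[sum(a * b for a, b in zip(diffs[i], diffs[j])) for j in range(k)]
            for i in range(k)]
    # Assumed safeguard: a small diagonal term keeps G invertible.
    trace = sum(gram[i][i] for i in range(k))
    eps = reg * trace if trace > 0 else reg
    for i in range(k):
        gram[i][i] += eps
    w = solve_linear_system(gram, [1.0] * k)
    total = sum(w)
    return [wi / total for wi in w]


def embed_hr_patch(weights, hr_neighbors):
    """Step 1C: weighted sum of the HR features of the K neighbors."""
    dim = len(hr_neighbors[0])
    return [sum(w * p[d] for w, p in zip(weights, hr_neighbors))
            for d in range(dim)]


# Toy data (hypothetical): one LR query, K = 2 neighbors with paired HR features.
lr_query = [1.0, 0.0]
lr_neighbors = [[0.0, 1.0], [2.0, 1.0]]
hr_neighbors = [[10.0, 20.0], [30.0, 40.0]]
weights = reconstruction_weights(lr_query, lr_neighbors)
hr_patch = embed_hr_patch(weights, hr_neighbors)
print(weights, hr_patch)
```

In this toy case the query is equally far from both
neighbors, so the two weights are equal and the HR patch is
the average of the two HR neighbor features.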
4. Results
4.1 Setup
The computer used to produce the results and evaluate
performance has an Intel Duo CPU at 3 GHz and 4 GB of
RAM. The graphics card is an NVIDIA GeForce 9600 GT with
2 GB of memory and 64 stream processors.
4.2 Bilinear vs. SRNE with GPU
In this section, we first compare the output of bilinear
interpolation with that of our GPU-based SRNE method. The
processing times of the CPU and GPU versions are given in
Sect. 4.3.
Fig. 5 shows the bilinear interpolation result and our
result. We apply neighbor embedding to small 3x3 patches
from the low-resolution house image in Fig. 5(a). Further
results are shown in Fig. 6 to Fig. 10.
Fig. 5(a) build
Fig. 5(b) bilinear 2X
Fig. 5(c) SRNE 2X
Fig. 6(a) house
Fig. 6(b) Result(2x)
Fig. 7(a) testing01
Fig. 7(b) Result(2x)
Fig. 8(a) paint
Fig. 8(b) Result(2x)
Fig. 9(a) butterfly
Fig. 9(b) Result(2x)
Fig. 10(a) house01
Fig. 10(b) Result(2x)
4.3 Performance evaluation
In this section, we analyze the CPU and GPU processing
time of SRNE. The scaling factor is 2 in all cases and the
database contains 46370 patches. Table 1 compares the CPU
and GPU versions of SRNE on five test images, whose
original sizes range from 50x50 to 171x120, and lists the
speed-up for each image.
Table 1 CPU and GPU processing times and speed-up
Image      Size     CPU(s)  GPU(s)  Speed up
house      50x50     35.42    1.93     18.37
testing01  124x106  162.99    7.55     21.60
paint      125x137  281.95    9.67     29.15
butterfly  156x100  176.11    8.87     19.85
house01    171x120  364.96   11.42     31.97
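The speed-up column of Table 1 is the ratio of CPU time to
GPU time. The short Python sketch below recomputes it from
the timings listed in the table; the variable names are
assumptions made for the example. Because the listed
timings are rounded to two decimals, the recomputed ratios
can differ from the reported ones in the second decimal
place.

```python
# (image, CPU seconds, GPU seconds, reported speed-up), copied from Table 1.
table1 = [
    ("house", 35.42, 1.93, 18.37),
    ("testing01", 162.99, 7.55, 21.60),
    ("paint", 281.95, 9.67, 29.15),
    ("butterfly", 176.11, 8.87, 19.85),
    ("house01", 364.96, 11.42, 31.97),
]

# Speed-up = CPU time / GPU time.
recomputed = {name: cpu / gpu for name, cpu, gpu, _ in table1}

for name, cpu, gpu, reported in table1:
    print(f"{name:10s} recomputed {recomputed[name]:6.2f}  reported {reported:6.2f}")
```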
5. Conclusion
In this paper, we proposed a GPU-based super-resolution
system using Super Resolution Through Neighbor
Embedding (SRNE) and GPU brute-force (BF) KNN search.
We showed that using CUDA for super resolution gives a
speed-up of roughly 18x to 32x over the CPU-based
implementation (Table 1). This improvement allows us to
obtain higher-quality high-resolution images from more
training images within a short time. In addition, we speed
up the whole procedure by pre-processing the image with a
patch complexity check. Image patches with low complexity
are smooth, so we skip the CUDA process for them and
simply apply bilinear interpolation.
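The patch complexity check can be sketched as follows,
assuming that complexity is measured by the patch variance
and compared against a fixed threshold. The threshold value
of 25.0 and the function names are hypothetical; the
threshold used in our system is not stated here.

```python
def patch_mean_and_variance(patch):
    """Mean and (population) variance of the pixel values of a 2-D patch."""
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return mean, variance


def is_smooth(patch, variance_threshold=25.0):
    """True if the patch is smooth enough to skip the neighbor embedding step."""
    _, variance = patch_mean_and_variance(patch)
    return variance < variance_threshold


# Toy 3x3 patches (hypothetical data).
flat_patch = [[10, 10, 10], [10, 11, 10], [10, 10, 10]]
textured_patch = [[0, 255, 0], [255, 0, 255], [0, 255, 0]]

for name, patch in (("flat", flat_patch), ("textured", textured_patch)):
    route = "bilinear interpolation" if is_smooth(patch) else "neighbor embedding"
    print(name, "->", route)
```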
In future work, we want to accelerate the linear system
solving step with CUDA, since it is currently implemented
only on the CPU.
References
[1] H. Chang, D.Y. Yeung, and Y. Xiong, “Super-Resolution
Through Neighbor Embedding”, CVPR 2004, pp. 275–282,
2004.
[2] J. Pu, J. Zhang, P. Guo, and X. Yuan, “Interactive
Super-resolution through Neighbor Embedding”,
ACCV 2009, pp. 496–505, 2009.
[3] V. Garcia, E. Debreuve, and M. Barlaud, “Fast K
Nearest Neighbor Search using GPU”, CVPR
Workshop on Computer Vision on GPU, June 2008.
[4] V. Garcia, “Suivi d'objets d'intérêt dans une séquence
d'images : des points saillants aux mesures
statistiques”, Ph.D. thesis, Université de Nice - Sophia
Antipolis, Sophia Antipolis, France, December 2008.
[5] W.T. Freeman, T.R. Jones, and E.C. Pasztor,
“Example-based Super-Resolution”, Computer
Graphics and Applications, 22(2):56–65, 2002.
[6] W.T. Freeman and E.C. Pasztor, “Learning Low-level
Vision”, ICCV 1999, volume 2, pp. 1182–1189,
September 1999.
[7] R.R. Schultz and R.L. Stevenson, “A Bayesian
Approach to Image Expansion for Improved
Definition”, IEEE Transactions on Image Processing,
3(3):233–242, 1994.