Super-Resolution of Remotely-Sensed Images Using a
Learning-Based Approach
Isabelle Bégin and Frank P. Ferrie
Abstract
Super-resolution addresses the problem of estimating a high-resolution version of a
low-resolution image. In this research, a learning-based approach for super-resolution
is applied to remotely-sensed images. These images contain fewer sharp edges than
those previously used to test the method, making the problem more difficult. The
approach uses a Markov network to model the relationship between an image/scene
training pair (low-resolution/high-resolution pair). Bayesian belief propagation is used
to obtain the posterior probability of the scene, given a low-resolution input image.
Preliminary results on Landsat-7 images show that the framework works reasonably
well when given a sufficiently large training set.
Construction of the Training Set

The steps involved in the construction of the training set and in the computation of the joint probabilities are the following:
1. Blur and subsample a set of high-resolution images. The set should be large enough to include the widest possible range of intensity structures.
2. Interpolate the images back to the original resolution.
3. Break the images and scenes into center-aligned patches.
4. Apply Principal Components Analysis on the training set.
5. Fit Gaussian mixtures to the joint probabilities P(xk,xj) (neighbouring scene patches
k and j) and P(yk,xk) (image/scene pair), as well as the prior probability of a scene
patch P(xj).
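Steps 1–4 above can be sketched as follows. This is a minimal illustration, not the authors' code: all function names are mine, nearest-neighbour upsampling stands in for the interpolation step, and the Gaussian-mixture fitting of step 5 is omitted.

```python
import numpy as np

def gaussian_kernel(sigma=1.0, radius=2):
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma=1.0):
    """Separable Gaussian blur (simple 'same'-size convolution per axis)."""
    k = gaussian_kernel(sigma)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)

def make_training_pair(scene, sigma=1.0, factor=2):
    """Simulate the low-resolution 'image' for a high-resolution 'scene':
    blur, subsample, then bring it back to the original size (steps 1-2)."""
    low = blur(scene, sigma)[::factor, ::factor]
    # nearest-neighbour upsampling stands in for the interpolation step
    image = np.kron(low, np.ones((factor, factor)))
    return image, scene

def extract_patches(img, size=5, step=5):
    """Break an image into square patches, flattened to vectors (step 3)."""
    h, w = img.shape
    return np.array([img[r:r+size, c:c+size].ravel()
                     for r in range(0, h - size + 1, step)
                     for c in range(0, w - size + 1, step)])

def pca(patches, n_components=10):
    """Project patches onto their leading principal components (step 4)."""
    mean = patches.mean(axis=0)
    centered = patches - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T, mean, vt[:n_components]
```

The same patch extraction and PCA projection are applied to both members of each image/scene pair, so that joint statistics can be gathered in a low-dimensional space.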
Problem
Is it possible to infer a high-resolution version of an image that is given only at low resolution? Approaches stronger than simple interpolation use Bayesian techniques or regularization while assuming prior probabilities or constraints. Furthermore, these
techniques are independent of both the structures present in the image and the
nature of the image itself. Freeman et al. propose that statistical relationships
between pairs of low-resolution and high-resolution images (image/scene pairs) can
be learned. Thus, given a new image to super-resolve (of the same nature as the
images in the training set), it is possible to obtain a higher-resolution image based on
the statistics acquired from the training pairs. The most important features learned by
the method are edges. The algorithm learns what a blurry edge should look like in its
corresponding high-resolution image. Thus, the training data need to contain many
sharp edges. The experiments described in Freeman et al. were done mostly on
such training sets. In the present research, the specific problem of super-resolution
of remotely-sensed images (with few sharp edges) is addressed.
Computation of the Conditional Probabilities
1. Break a new low-resolution (interpolated) image into patches and apply Principal
Components Analysis.
2. For each patch yk, find the n closest patches in the training set. The n corresponding scenes will be used in the calculations.
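Step 2 amounts to a nearest-neighbour search in PCA coefficient space. A minimal sketch, assuming the training patches have already been projected onto their principal components (brute-force L2 search and all names are my own):

```python
import numpy as np

def closest_candidates(query, train_coeffs, n=8):
    """Return the indices of the n training patches closest (in L2 distance)
    to a query patch, both expressed in PCA coefficient space."""
    dists = np.linalg.norm(train_coeffs - query, axis=1)
    return np.argsort(dists)[:n]
```

The scene patches stored alongside the selected training patches become the n candidates xk^l used in the probability calculations below.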
Theory
The statistical dependencies between image/scene training pairs (low-resolution and
high-resolution images) are modeled in a Markov network where each node
describes an image or a scene patch. The topology of the network is shown in the
following diagram. Statistical dependencies are shown by lines connecting nodes.
Scene patches are depicted in white and image patches in red.
Results

Figure 3. (a) Input image (bilinear interpolation); (b) result of the super-resolution algorithm; (c) original full-resolution image; (d) nearest-neighbour interpolation.
A new image was also blurred, subsampled and interpolated to form the input image. The result of the super-resolution algorithm (Figure 3b) approximates the high-resolution image (Figure 3c) reasonably well. Edges are better preserved than in the interpolated version of the input image (Figure 3a). For comparison, nearest-neighbour interpolation is shown in Figure 3d, where the edges appear choppier. However, thin structures (such as road networks) are lost in the super-resolution process. This problem might be caused by a small patch size as well as an incomplete training set.
3. The conditional probabilities for each candidate are computed with the following equations:

P(xk^l | xj^m) = P(xk^l, xj^m) / P(xj^m),    (3)

P(yk | xk^l) = P(yk, xk^l) / P(xk^l),    (4)

where l and m run from 1 to n and index the candidates for patches xk and xj, respectively. P(yk | xk^l) is thus an n×1 vector and P(xk^l | xj^m) is an n×n matrix. The messages M_j^k are n×1 vectors whose elements are set to 1 for the first iteration.

4. The MAP estimate for patch j (xj^MAP) is obtained from equations (1) and (2).
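Equations (3) and (4) can be sketched directly. In this illustration the fitted joint and prior densities are passed in as callables (hypothetical stand-ins for the evaluated Gaussian mixtures; the function name is my own):

```python
import numpy as np

def conditional_tables(cand_k, cand_j, y_k, p_joint_xx, p_joint_yx, p_x):
    """Build the n-by-n matrix P(xk^l | xj^m) of equation (3) and the
    length-n vector P(yk | xk^l) of equation (4) by dividing evaluated
    joint densities by the corresponding priors."""
    n = len(cand_k)
    P_xx = np.empty((n, n))
    for l in range(n):
        for m in range(n):
            P_xx[l, m] = p_joint_xx(cand_k[l], cand_j[m]) / p_x(cand_j[m])
    P_yx = np.array([p_joint_yx(y_k, cand_k[l]) / p_x(cand_k[l])
                     for l in range(n)])
    return P_xx, P_yx
```

One such matrix is built per pair of neighbouring patches, and one such vector per observed image patch; these tables feed the message updates of equations (1) and (2).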
The advantage of this learning-based algorithm compared to other learning methods
is that local support is added in the form of conditional probabilities. In other
learning-based approaches, the closest patch in the training set is chosen
regardless of its neighbours. This Bayesian technique also has the advantage of
learning the prior probabilities instead of assuming them.
Discussion
Many factors will influence the quality of the results.
• This framework works better with training images that contain many sharp edges.
This is not the case for the images used in this experiment. In the examples shown,
no pre-processing steps were done on the images. Results may be improved by first
enhancing the edges in the image.
• Because the framework is statistical, the size and completeness of the training set have a big impact on the result. In this experiment the training set was small (a single 350×350 image) and therefore might not contain the full variety of structures present in a remotely-sensed image.
• The fitting of Gaussian mixture models is also very important. Models were obtained
with the Expectation Maximization algorithm. The number of iterations, the structure
of the covariance and the number of clusters can all modify the computation of the
joint probabilities, which are the core of the framework.
• The size of the patches, the number of principal components and the number of
closest patches considered will also impact the final result.
Figure 1: Markov network for the super-resolution problem
There are two types of connections. First, each scene patch is connected to its
neighbours. Second, a patch in the scene is connected to its corresponding patch in
the low-resolution image. Two nodes not directly connected are assumed to be
statistically independent.
The Markov assumption allows a “message-passing” rule that involves only local
computations. The maximum a posteriori estimate (MAP) for node j is calculated
using the following equation (Freeman et al.):
xj^MAP = argmax_{xj} P(xj) P(yj | xj) ∏_k M_j^k,    (1)

where

M_j^k = max_{xk} P(xk | xj) P(yk | xk) ∏_{l≠j} M̃_k^l.    (2)
Training

The goal of this experiment was to test whether a learning-based approach for super-resolution can be used on remotely-sensed images. These images contain fewer sharp edges than those previously used to test the method. The training set is a pair of blurred and non-blurred Landsat-7 images (Band #3). The 350×350 image was first blurred with a Gaussian filter, subsampled and interpolated back to the original resolution.
Conclusions
This research presented a learning-based method for super-resolution applied to
remotely-sensed images. A Markov network was used to express the statistical
dependencies between a low-resolution/high-resolution training pair. The maximum a
posteriori estimate of the scene was obtained by Bayesian belief propagation. Preliminary results show that such an approach can be used on remotely-sensed images provided that the training set is sufficiently large. Future work will include pre-processing of the images to enhance their edges, as well as the use of a more complete training set.
M̃_k^l is the message from the previous iteration. In principle, this rule should only be used on chains or trees. However, the topology of the Markov network for the super-resolution problem contains loops. It was previously demonstrated that equations (1) and (2) can still be used in this situation; the network is thus assumed to consist of vertical or horizontal chains.
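As a rough numerical illustration of equations (1) and (2), here is a toy max-product sketch on a single chain of scene nodes. All names, the candidate-table layout, and the fixed iteration count are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def map_estimate(P_x, P_yx, P_xx_list, n_iters=5):
    """Toy max-product belief propagation on a chain of J scene nodes.

    P_x      : (J, n) prior P(xj) over n candidates at each node
    P_yx     : (J, n) likelihoods P(yj | xj) per candidate
    P_xx_list: J-1 tables of shape (n, n), entry [l, m] = P(x_{j+1}^l | x_j^m)
    Returns the index of the MAP candidate at each node (equation (1));
    messages follow the update of equation (2), initialised to 1."""
    J, n = P_x.shape
    msg_fwd = np.ones((J, n))   # message into node j from node j-1
    msg_bwd = np.ones((J, n))   # message into node j from node j+1
    for _ in range(n_iters):
        new_fwd, new_bwd = np.ones((J, n)), np.ones((J, n))
        for j in range(J - 1):
            # message j -> j+1: max over xj of P(x_{j+1}|xj) P(yj|xj) M~
            val = P_yx[j] * msg_fwd[j]
            new_fwd[j + 1] = np.max(P_xx_list[j] * val[None, :], axis=1)
        for j in range(J - 1, 0, -1):
            # message j -> j-1, using the same table in the other direction
            val = P_yx[j] * msg_bwd[j]
            new_bwd[j - 1] = np.max(P_xx_list[j - 1] * val[:, None], axis=0)
        msg_fwd, msg_bwd = new_fwd, new_bwd
    belief = P_x * P_yx * msg_fwd * msg_bwd   # equation (1), per node
    return np.argmax(belief, axis=1)
```

On a chain this converges to the exact MAP assignment; on the full grid the same updates are run along vertical and horizontal chains as described above.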
Figure 2: Training set (left: sharp image; right: blurred image)

Reference
Freeman, W.T., Pasztor, E.C. and Carmichael, O.T., "Learning Low-Level Vision",
International Journal of Computer Vision, 40(1), pp. 25-47, 2000.