No-reference image quality assessment using statistical wavelet-packet features
Pattern Recognition Letters
journal homepage: www.elsevier.com
Hadi Hadizadeh^a,∗∗, Ivan V. Bajić^b
^a Quchan University of Advanced Technology
^b School of Engineering Science, Simon Fraser University
ABSTRACT
In this paper an efficient no-reference (NR) image quality assessment (IQA) method is presented based
on the statistical features of subband coefficients in the wavelet-packet domain. The proposed method
is based on the hypothesis that potential distortions may alter the statistical characteristics of natural
un-distorted images. Hence, by characterizing the statistical properties of a given distorted image one
can identify the distortion and its strength in the distorted image. For this purpose, several statistical
features of a given gray-scale image as well as the magnitude of its gradient and its Laplacian are
extracted in the wavelet-packet domain. The extracted features are then mapped to quality scores
within a two-stage quality assessment framework. The proposed method is general-purpose, and is
able to assess the image quality across various distortion categories. Experimental results indicate
that the proposed method achieves high accuracy in image quality prediction as compared to several
prominent and state-of-the-art full-reference and no-reference IQA methods.
© 2016 Elsevier Ltd. All rights reserved.
1. Introduction
With the rapid proliferation of digital images in our daily life,
image quality assessment (IQA) has become very important in
a wide variety of different practical applications such as digital
imaging, image compression, transmission, enhancement, and
restoration. Degradation of digital images is often inevitable
due to image acquisition, transmission, and compression. Because of these processes, various image distortions such as blur,
noise, blocking artifacts, ringing, oversaturation, etc. may be
produced in the obtained images.
The goal of IQA is to measure how much the perceived quality
of a given digital image has been degraded by potential distortions. This is accomplished by assigning a quality score to the
image for representing its perceived quality. The quality score
can be estimated either subjectively by human ratings or objectively through automatic computer algorithms. Since subjective IQA methods rely on human observers, they are not always
readily available, especially in real-time scenarios. They are
also slow and costly. On the other hand, objective IQA methods are readily and routinely available in different applications.
∗∗ Corresponding author. E-mail: hha54@sfu.ca (Hadi Hadizadeh)
They are automatic and fast, and do not have the limitations of
the subjective methods. Therefore, they can easily be utilized
to quantitatively measure the image quality in a wide variety of
applications.
In the past decades, various objective IQA methods have
been developed in the literature (Damera-Venkata et al. (2000);
Chandler and Hemami (2007); Wang et al. (2004); Wang and Li
(2011); Sheikh et al. (2005); Sheikh and Bovik (2006); Zhang
et al. (2011)). Based on the availability of a reference image
(i.e. an original distortion-free image), the objective IQA methods are classified into three classes: full reference (FR) (Wang
et al. (2004), Zhang et al. (2011), Sheikh and Bovik (2006)),
reduced reference (RR) (Xu et al. (2015)), and no reference
(NR) (Moorthy and Bovik (2011), Mittal et al. (2012)). In FR
methods, the original un-distorted image is provided along with
the distorted image whose quality is to be assessed. In RR approaches, some additional information about the original un-distorted image is provided along with the distorted image, either through a separate auxiliary channel or by embedding some information (e.g. a watermark) in the distorted image. In NR (blind)
methods, only the distorted image is provided, and the method
must predict the image quality only based on the given distorted
image without having any knowledge or information about the
original un-distorted image. Hence, designing NR methods is
very challenging as compared to FR and RR methods.
It is widely known that natural distortion-free images possess
specific statistical properties, and distortions may change these
properties. Based on this idea, natural scene statistic (NSS)
models (Simoncelli and Olshausen (2001)) have been developed to capture such statistical properties, and using such models a number of NSS-based FR and NR IQA methods have been
developed (Moorthy and Bovik (2011), Moorthy and Bovik
(2010)).
In this paper, we propose an NSS-based NR IQA method for
gray-scale images, which is capable of assessing the quality
of a distorted image across multiple distortion categories in a
modular manner. In the proposed method, a wavelet packet decomposition (WPD) (Coifman and Wickerhauser (1992)) is first
applied on a given image. A number of statistical features are
then computed from all the obtained subbands of the given image and the magnitude of its gradient and its Laplacian. A two-stage framework for NR image quality assessment proposed in
(Moorthy and Bovik (2010), Moorthy and Bovik (2011)) is then
employed to compute a quality score for the given image using the extracted features. Experimental results on two popular
IQA databases indicate that the image quality scores produced
by the proposed method correlate well with human perception
and that the proposed method is competitive with several FR
IQA methods as well as various state-of-the-art NR IQA methods.
It must be pointed out that there are some existing NSS-based NR IQA methods like DIIVINE (Moorthy and Bovik
(2011)) that use statistical features of wavelet subbands. However, to the best of our knowledge, our proposed method is the
first that uses wavelet packet features for NR IQA. Note that
wavelet packets are an overcomplete generalization of standard
orthonormal wavelets in which, unlike the standard wavelets,
both the low- and high-frequency components of each level of
decomposition are recursively decomposed, thus constructing
a tree structured multiband extension of the standard wavelet
transform. Hence, WPD allows us to capture the statistical
characteristics of a given image more accurately. Moreover,
it is known that standard wavelets are ill-suited to represent oscillatory patterns (i.e. signals with strong stationary highpass
components) such as rapid variations of intensity in complex
textures while wavelet packets have a better ability to represent such patterns (Meyer et al. (2000), Coifman and Wickerhauser (1992)), thus increasing the applicability of the proposed
method to a wider range of natural images.
Another difference between the proposed method and previous wavelet-based methods is that in the proposed method the
statistics are gathered not only from the subband coefficients of
the distorted image, but also from the subband coefficients of
the magnitude of its gradient and its Laplacian. Note that the
gradient and the Laplacian of an image carry important information about the structure of the image, and they are sensitive
to noise and other distortions, and that is the reason for using
them in the proposed method.
The organization of this paper is as follows. In Section 2,
prominent previous works on FR and NR IQA are briefly reviewed. The proposed method is then presented in Section 3.
The experimental results are given in Section 4, followed by
conclusions in Section 5.
2. Related Works
In the literature several FR and NR IQA methods have been
proposed. The most popular and widely-used objective FR IQA
metrics include the peak signal-to-noise ratio (PSNR) and the
mean squared error (MSE). These methods operate directly on
the intensity values of the image, but they do not correlate well
with the subjective fidelity ratings. The reason is that these
methods do not consider any properties of the human visual
system (HVS). On the other hand, there are methods that are
designed based on the HVS properties or attempt to mimic it.
These include the very popular structural similarity (SSIM) index (Wang et al. (2004)), the information fidelity criterion (IFC)
(Sheikh et al. (2005)), and the visual information fidelity (VIF)
metric (Sheikh and Bovik (2006)).
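As a point of reference for the pixel-based metrics mentioned above, the following minimal sketch computes MSE and PSNR using the standard definition PSNR = 10 log10(peak^2 / MSE). The function names, the flat pixel lists, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equally sized pixel sequences."""
    if len(reference) != len(distorted):
        raise ValueError("images must have the same number of pixels")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)


def psnr(reference, distorted, peak=255.0):
    """PSNR in dB; infinite when the two images are identical."""
    err = mse(reference, distorted)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


# Toy 4-pixel example (values chosen arbitrarily).
ref = [52, 55, 61, 59]
dist = [54, 55, 60, 57]
print(mse(ref, dist), psnr(ref, dist))
```

Both functions look only at pixel-wise intensity differences, which is why such metrics ignore the HVS properties discussed above.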
SSIM is based on the hypothesis that the HVS is highly sensitive to the loss of structure in the image. Hence, to measure
the perceived image quality, SSIM measures the structural similarity between a distorted image and its related reference image.
In IFC the information shared between the distorted and reference images is measured and used for IQA in an information-theoretic framework. VIF is an extension of IFC in which the HVS
is modeled as a simple channel that introduces additive noise
in the wavelet domain. Using this model, VIF quantifies the
Shannon information that is shared between the reference and
the distorted images relative to the information contained in the
reference image itself.
The prominent NR IQA methods include BIQI (Moorthy
and Bovik (2010)), DIIVINE (Moorthy and Bovik (2011)),
BLIINDS-II (Saad et al. (2012)), BRISQUE (Mittal et al.
(2012)), and SSEQ (Liu et al. (2014)). BIQI is a two-step
framework for NR IQA, which involves distortion classification and distortion-specific quality assessment, and it uses several NSS features. The DIIVINE index is an extension of BIQI
in which a series of NSS features in the wavelet domain are
used to predict image quality, and it achieves excellent performance. BLIINDS-II extracts NSS features in the block-based DCT domain using a fast single-stage quality assessment
framework. The BRISQUE index provides a low-complexity
NR IQA method in which several features are extracted in the
spatial domain, and it shows very good performance for image
quality prediction. SSEQ utilizes spatial and spectral entropy
features from a distorted image in the block-based DCT domain
to predict the image quality in a two-stage framework. Experimental results showed that SSEQ achieves high accuracy as
compared to several state-of-the-art NR IQA methods.
The abovementioned NR-IQA methods are able to assess
the image quality across various distortion categories, similar
to the method proposed here. However, there are some NR
IQA methods that are distortion-specific and target a certain
distortion category such as compression or blur. The examples
are the methods proposed in (Suthaharan (2009), Meesters and
Martens (2002), Ferzli and Karam (2009)).
3. Proposed Method
In this section, we propose a method to estimate the subjective quality of a given gray-scale distorted image in a no-reference manner. The proposed method consists of three steps.
In the first step, the magnitude of the gradient and the Laplacian
of the given distorted image are computed. These two additional images serve as the first and second derivatives of the
given distorted image, respectively. The gradient and Laplacian
of an image carry important information about the edges and the
structure of the image, and as they are derivative operators, they
are very sensitive to noise and other similar distortions. Hence,
characterizing their statistical properties may help identify distortions better. In the second step, a wavelet-packet decomposition pyramid is applied to the distorted image, as well as its
first and second derivative images, and a number of statistical
features are extracted from all the subband coefficients of each
of the three images. In the third step, the extracted features are
fed to a distortion classifier as well as a number of regression
modules to estimate a quality score for the given distorted image. The details of each step are elaborated in the next sections.
A flowchart of the proposed method is shown in Fig. 1.
Fig. 1. The flowchart of the proposed method. Wavelet-packet decomposition (WPD) is applied on the given distorted image I as well as the magnitude of
its gradient G and its Laplacian L.
3.1. Wavelet-packet decomposition
Consider a distorted gray-scale image I. Our goal is to quantify the subjective quality of I without having a reference image.
For this purpose, the magnitude of the gradient and also the
Laplacian of I are first computed as G and L, respectively. For
computing the gradient information, we used the Scharr gradient operator whose horizontal and vertical components, $G_x$ and
$G_y$, are defined as:
$$ G_x = \frac{1}{16}\begin{bmatrix} 3 & 0 & -3 \\ 10 & 0 & -10 \\ 3 & 0 & -3 \end{bmatrix}, \quad G_y = \frac{1}{16}\begin{bmatrix} 3 & 10 & 3 \\ 0 & 0 & 0 \\ -3 & -10 & -3 \end{bmatrix}. \tag{1} $$
Other possible gradient operators such as Sobel and Prewitt
can also be used here but in our experiments we found that
the Scharr operator provides a better accuracy. The details
are described in Section 4.3. For computing L, we used the
following Laplacian kernel:
$$ \begin{bmatrix} -1 & -1 & -1 \\ -1 & +8 & -1 \\ -1 & -1 & -1 \end{bmatrix}. \tag{2} $$
After computing G and L, a 2D wavelet-packet decomposition (WPD) is applied on I, G, and L up to level N. Let $C_j^t$ be
the j-th subband for $t \in \{I, G, L\}$.
3.2. Statistical feature extraction
Let $D_j^t = |C_j^t|$ and $E_j^t = \log_2(D_j^t)$. We extract the following
statistical features out of $E_j^t$:
$$ m_j^t = \mathrm{mean}(E_j^t), \tag{3} $$
$$ v_j^t = \mathrm{var}(E_j^t), \tag{4} $$
$$ s_j^t = \mathrm{skewness}(E_j^t), \tag{5} $$
$$ k_j^t = \mathrm{kurtosis}(E_j^t). \tag{6} $$
We also compute the entropy of $C_j^t$ as follows:
$$ e_j^t = -\sum\sum h_j^t \log_2 h_j^t, \tag{7} $$
where $h_j^t$ is defined as:
$$ h_j^t = \frac{(C_j^t)^2}{\sum\sum (C_j^t)^2}. \tag{8} $$
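The five per-subband features of Eqs. (3)-(8) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `subband_features` is hypothetical, the small constant `eps` (added so that the logarithm stays finite for zero coefficients) is not part of the text, and the choice of population variance and non-excess kurtosis is a guess at the conventions used.

```python
import numpy as np


def subband_features(C, eps=1e-12):
    """Return [m, v, s, k, e] for one subband C, following Eqs. (3)-(8)."""
    C = np.asarray(C, dtype=float)
    # E = log2(|C|); eps is an assumption that avoids log2(0).
    E = np.log2(np.abs(C) + eps).ravel()
    m = E.mean()
    v = E.var()  # population variance (assumed convention)
    std = np.sqrt(v)
    z = (E - m) / std if std > 0 else np.zeros_like(E)
    s = np.mean(z ** 3)  # skewness
    k = np.mean(z ** 4)  # kurtosis, non-excess (assumed convention)
    # Normalized coefficient energy h, Eq. (8), and its entropy, Eq. (7).
    energy = C.ravel() ** 2
    total = energy.sum()
    h = energy / total if total > 0 else np.zeros_like(energy)
    nz = h > 0  # treat 0 * log2(0) as 0
    e = -np.sum(h[nz] * np.log2(h[nz]))
    return np.array([m, v, s, k, e])


# Toy subband: |C| = 1, 2, 4, 8, so E is approximately 0, 1, 2, 3.
print(subband_features([[1, 2], [4, 8]]))
```

For a subband whose coefficients all have the same magnitude, h is uniform and the entropy equals log2 of the number of coefficients, which gives a quick sanity check.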
Based on the calculated features, we create a feature vector $f^t$
as follows:
$$ f^t = [m_j^t, v_j^t, s_j^t, k_j^t, e_j^t]. \tag{9} $$
Using (9), we obtain $f^I$ for I, $f^G$ for G, and $f^L$ for L. Finally,
we create a single feature vector f as follows:
$$ f = [f^I, f^G, f^L]. \tag{10} $$
We use f for measuring the subjective quality of I as explained
in the next section.
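To make the construction of f concrete, the sketch below builds the three images I, G, and L, decomposes each one, and concatenates per-subband features. Several choices here are assumptions rather than details given in the text: the Haar filter for the wavelet packets, edge replication at image borders, the Euclidean gradient magnitude, the ordering of features inside f, and all function names. The per-subband feature function is passed in as `feat_fn`, and the usage example uses a simple stand-in instead of the features of Eqs. (3)-(8).

```python
import numpy as np

# Scharr components from Eq. (1) and the Laplacian kernel from Eq. (2).
SCHARR_X = np.array([[3, 0, -3], [10, 0, -10], [3, 0, -3]]) / 16.0
SCHARR_Y = SCHARR_X.T
LAPLACIAN = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)


def filter3x3(img, kernel):
    """Correlate img with a 3x3 kernel; borders use edge replication (assumed)."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def haar_split(x):
    """One 2D Haar analysis step (assumed filter): four half-size subbands."""
    x = x[: x.shape[0] // 2 * 2, : x.shape[1] // 2 * 2]  # crop to even size
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return [(a + b + c + d) / 2, (a - b + c - d) / 2,
            (a + b - c - d) / 2, (a - b - c + d) / 2]


def wavelet_packet_subbands(x, levels=2):
    """Collect the subbands of every level 1..levels (4 + 16 for levels=2)."""
    current, collected = [x], []
    for _ in range(levels):
        # Unlike a standard wavelet transform, every subband is split again.
        current = [child for band in current for child in haar_split(band)]
        collected.extend(current)
    return collected


def build_feature_vector(image, feat_fn, levels=2):
    """Concatenate feat_fn over all subbands of I, G, and L, as in Eq. (10)."""
    I = np.asarray(image, dtype=float)
    G = np.hypot(filter3x3(I, SCHARR_X), filter3x3(I, SCHARR_Y))
    L = filter3x3(I, LAPLACIAN)
    parts = [feat_fn(band) for t in (I, G, L)
             for band in wavelet_packet_subbands(t, levels)]
    return np.concatenate(parts)


# Usage with a stand-in feature function that returns 5 values per subband.
rng = np.random.default_rng(0)
demo = build_feature_vector(
    rng.random((16, 16)),
    lambda c: np.array([c.mean(), c.var(), c.min(), c.max(), np.abs(c).sum()]))
print(demo.shape)
```

With two levels and five features per subband, the resulting vector has 3 x 20 x 5 = 300 entries, which matches the feature count given in Section 4.5.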
3.3. Quality Assessment
In order to quantify the subjective quality of a given image I
based on its feature vector f, we employ the 2-stage quality assessment framework proposed in (Moorthy and Bovik (2011)).
This 2-stage framework consists of the following two stages:
(1) distortion identification, and (2) distortion-specific quality
assessment.
Similar to (Moorthy and Bovik (2011), Liu et al. (2014)),
for the distortion identification stage a Support Vector Classification (SVC) is utilized to estimate the probability that the
distorted image is distorted with one of the n distortion classes,
and for the distortion-specific quality assessment stage a Support Vector Regression (SVR) is employed to obtain n regression modules, each of which maps a given feature vector to an
associated quality score. Both SVC and SVR require training
with a set of images with known quality scores as follows.
Given a training set of images with known distortion class
(spanning all the n distortion classes), an SVC classifier is
trained whose inputs are the true class and the feature vector extracted by the proposed method. During the training
procedure, the classifier learns the mapping from the feature space to class label, and once the training is performed,
the trained classifier is able to estimate the distortion class
of a given input image out of its extracted feature vector. In
our approach, the classifier does not produce a hard classification decision. Instead, it produces a set of probability
estimates, which indicate with what probability the input
image belongs to any of the n different distortion classes.
Similarly, a separate regression module (SVR) is trained
for each of the n distortion classes using a set of training images with known quality scores from that distortion class.
These regression modules map the input feature vector to an
associated quality score under the assumption that the input feature vector comes from an image, which is distorted
by that particular distortion. Once trained, each of these
regression modules acts as a distortion-specific assessor of
quality. The training procedure for both SVC and SVRs is
explained in more detail in Section 4.
The proposed procedure for estimating the quality score
of a given distorted image I with feature vector f proceeds
as follows. The feature vector is first fed to the trained SVC,
and an n-dimensional vector of probabilities p is obtained such
that p(i) (i = 1, ..., n) indicates the probability of I being distorted with the i-th distortion class. After that, f is fed to each
of the n SVR modules, and an n-dimensional vector of estimated
qualities q is obtained, where q(i) is the quality score estimated
by the i-th regression module. Using this framework, the quality score estimated by the proposed method for I, denoted by
Q, is computed as follows:
$$ Q = \sum_{i=1}^{n} p(i)\, q(i). \tag{11} $$
As will be discussed in Section 4, in our proposed method,
we train the aforementioned SVC and SVR modules with the
data from the well-known LIVE database using the difference
mean opinion scores (DMOS). Hence, larger values of Q indicate lower subjective quality and vice versa.
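The probability-weighted pooling of Eq. (11) can be illustrated with a few lines of code. In this sketch the function name and the numbers in p and q are made up for illustration; in the actual method they would come from the trained SVC and SVR modules.

```python
def pooled_quality(probabilities, scores):
    """Eq. (11): Q = sum_i p(i) * q(i)."""
    if len(probabilities) != len(scores):
        raise ValueError("p and q must have one entry per distortion class")
    return sum(p * q for p, q in zip(probabilities, scores))


# Hypothetical outputs for n = 3 distortion classes.
p = [0.7, 0.2, 0.1]     # class probabilities from the classifier
q = [30.0, 50.0, 40.0]  # class-specific quality (DMOS-like) estimates
Q = pooled_quality(p, q)
print(Q)
```

Because the classifier output is soft, a misclassification does not discard the other regressors: each class-specific estimate contributes in proportion to its probability.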
4. Experiments and Results
In this section, we evaluate the performance of the proposed
NR IQA method for estimating the image quality and compare
it with various prominent FR- and NR-IQA methods.
4.1. IQA database and experimental setup
For the performance evaluation of the proposed method, we
employed the LIVE IQA database (Sheikh et al. (2014)). This
database contains 29 reference images, each distorted with the
following 5 different types of distortion: white noise (WN),
JPEG and JP2K compression, Gaussian blur (Blur), and fast
Rayleigh fading (FF), yielding 779 distorted images. Each distorted image is provided with a difference mean opinion score
(DMOS), which represents the subjective quality of the image. Smaller DMOS indicates higher subjective quality and
vice versa. Using the popular LIVE IQA database allows us to
perform a fairer comparison with other FR and NR-IQA methods because many of the existing IQA methods utilize LIVE either for training (for NR methods) or testing their performance.
All the images in this database are color, so we converted them
to gray-scale in order to be able to test the proposed method.
We found experimentally that N = 2 levels of wavelet-packet decomposition are sufficient to achieve acceptable results.
As in (Moorthy and Bovik (2011), Liu et al. (2014)), we used
the popular libSVM package for training the SVC and SVR
modules. The parameters for both SVC and SVR modules discussed in Section 3.3 were optimized by the training process as
in (Moorthy and Bovik (2011), Liu et al. (2014)).
We partitioned the LIVE database into training and test sets
such that 80% of the database constitutes the training set and
the remaining 20% makes the test set. The training set was
used to train the SVC and SVR modules, and the test set was
used to evaluate the ability of the proposed method for image
quality prediction. This partitioning scheme was repeated 1000
times in a random fashion to get 1000 random test sets, and the
median of the obtained quality scores across the 1000 random
test sets was considered as the final evaluation result.
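The repeated random partitioning described above can be sketched as follows. The function name, the `evaluate` callback (which is assumed to train on the first subset and return one performance number on the second), and the fixed seed are hypothetical; the sketch simply shuffles whatever items it is given.

```python
import random
import statistics


def median_over_random_splits(items, evaluate, trials=1000,
                              train_fraction=0.8, seed=0):
    """Median of evaluate(train, test) over repeated random 80/20 splits."""
    rng = random.Random(seed)
    items = list(items)
    n_train = int(round(train_fraction * len(items)))
    results = []
    for _ in range(trials):
        shuffled = items[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        results.append(evaluate(train, test))
    return statistics.median(results)


# Stand-in evaluation that only reports the test-set share of the data.
share = median_over_random_splits(
    range(100), lambda train, test: len(test) / (len(train) + len(test)))
print(share)
```

Reporting the median rather than the mean makes the summary less sensitive to a few unusually easy or hard splits.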
4.2. The performance metrics
We compared the proposed NR-IQA method with several FR
and NR-IQA methods introduced in Section 2, for which code
is publicly available. These methods were trained on LIVE, so
they are good candidates for a fair comparison with our proposed method.
For the comparisons, the following three criteria were utilized to measure the prediction monotonicity and accuracy of
the compared methods: (1) the Pearson Linear Correlation Coefficient (PLCC), (2) the Spearman Rank-Order Correlation Coefficient (SROCC), and (3) the Root Mean Square Error (RMSE)
between the predicted DMOS and the actual DMOS provided
by the IQA database. As recommended in (VQEG (2003)), the
SROCC serves as a measure of prediction monotonicity while
PLCC and RMSE serve as measures of prediction accuracy. A
better correlation with human perception means a value close to
zero for RMSE and a value close to one for PLCC and SROCC.
Note that SROCC operates only on the rank of the data, and
it does not consider the relative distance between datapoints.
Therefore, it is generally considered to be a less sensitive measure of correlation, and is typically used only when the number
of datapoints is small (Sheikh et al. (2006)).
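For readers who want to reproduce the three criteria, a dependency-free sketch is given below. The function names are chosen for illustration, and ties in the rank computation are resolved by average ranks, which is a common convention assumed here rather than one stated in the text.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): PLCC of the ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(x, y):
    """Root mean square error between predictions and targets."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# A monotonic but nonlinear relation: perfect SROCC, imperfect PLCC.
predicted = [1, 2, 3, 4, 5]
dmos = [1, 4, 9, 16, 25]
print(pearson(predicted, dmos), spearman(predicted, dmos), rmse(predicted, dmos))
```

The toy data show why SROCC measures monotonicity only: the ranks agree exactly even though the relation between the two score lists is not linear.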
As recommended in (VQEG (2003)), before computing all
the abovementioned metrics, a regression function must be applied on the predicted quality scores to provide a nonlinear
mapping between the predicted scores and the actual DMOS
values provided in the database. For this purpose, similar to
(Moorthy and Bovik (2011), Mittal et al. (2012), Liu et al.
(2014)), we utilized the following logistic function with an
added linear term:
$$ f(x) = \beta_1 \left( \frac{1}{2} - \frac{1}{1 + \exp(\beta_2 (x - \beta_3))} \right) + \beta_4 x + \beta_5, \tag{12} $$
where x denotes the predicted quality score, and $\beta_i$ for i =
1, ..., 5 are determined by least square fitting to the actual
DMOS values provided by the IQA database. Note that SROCC
is independent of the selected regression function as it relies
only on the rank-ordering.

Table 1. Median PLCC across 1000 train-test trials of various IQA methods
for different types of distortions on LIVE. Italicized entries denote NR-IQA
methods while others are FR-IQA methods.
Method      JP2K    JPEG    WN      Blur    FF      All
PSNR        0.8837  0.8515  0.9817  0.8006  0.8939  0.8081
SSIM        0.9601  0.9485  0.9861  0.9537  0.9616  0.9100
VIF         0.9664  0.9478  0.9924  0.9774  0.9698  0.9520
BIQI        0.8414  0.7603  0.9732  0.9118  0.7342  0.7422
DIIVINE     0.9409  0.9097  0.9744  0.9393  0.9128  0.9116
BLIINDS-II  0.9493  0.9505  0.9614  0.9375  0.9079  0.9241
BRISQUE     0.9472  0.9330  0.9883  0.9463  0.9142  0.9365
SSEQ        0.9464  0.9702  0.9806  0.9607  0.9198  0.9383
Proposed    0.9591  0.9720  0.9954  0.9717  0.9345  0.9601
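The logistic mapping of Eq. (12) and the least-squares objective used to fit its parameters can be written as in the sketch below. The function names and the exponent clamp (added only to avoid floating-point overflow) are assumptions; the optimizer that minimizes the objective is not shown.

```python
import math


def logistic_map(x, b1, b2, b3, b4, b5):
    """Eq. (12): logistic function with an added linear term."""
    z = b2 * (x - b3)
    # Clamp the exponent so math.exp cannot overflow for extreme inputs.
    z = max(-700.0, min(700.0, z))
    return b1 * (0.5 - 1.0 / (1.0 + math.exp(z))) + b4 * x + b5


def sum_squared_error(betas, scores, dmos):
    """Least-squares objective to be minimized over the five betas."""
    return sum((logistic_map(x, *betas) - d) ** 2
               for x, d in zip(scores, dmos))


# At x = b3 the logistic term vanishes, leaving only b4 * x + b5.
print(logistic_map(2.0, 10.0, 1.0, 2.0, 3.0, 1.0))
```

Any general-purpose nonlinear least-squares routine can then be applied to `sum_squared_error`, starting from a reasonable initial guess for the five parameters.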
4.3. Cross-validation results on LIVE
The median PLCC, SROCC, and RMSE values across 1000
train-test trials of various FR and NR-IQA methods are tabulated in Tables 1, 2, and 3 for each individual distortion type as
well as across all distortion classes. In these tables, the names
of the NR-IQA methods are italicized.
In order to evaluate statistical significance, a one-sided t-test was conducted with a 95% confidence level between the
SROCC values generated by each of the compared methods
across the 1000 train-test trials. The null hypothesis was that
the mean SROCC value of the method in the row is equal to
the mean SROCC value of the method in the column, and the
alternative hypothesis was that the mean SROCC value of the
row is greater (or less) than the mean SROCC value of the column. The results of the test are shown in Table 4. The entries in
this table indicate which row is statistically superior (’1’), statistically equivalent (’0’), or statistically inferior (’-1’) to which
column.
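A row-versus-column comparison of this kind can be sketched as follows. The exact test variant is not specified in the text, so this sketch assumes an unpaired two-sample t statistic with unequal variances and approximates the one-sided 5% critical value by 1.645, which is reasonable for samples as large as 1000 trials. The function name is hypothetical.

```python
import math
import statistics


def compare_methods(row_values, col_values, critical=1.645):
    """Return 1, 0, or -1 if the row is superior, equivalent, or inferior."""
    n1, n2 = len(row_values), len(col_values)
    m1, m2 = statistics.mean(row_values), statistics.mean(col_values)
    v1 = statistics.variance(row_values)
    v2 = statistics.variance(col_values)
    se = math.sqrt(v1 / n1 + v2 / n2)
    if se == 0:
        # Degenerate case: no variation across trials in either sample.
        return 0 if m1 == m2 else (1 if m1 > m2 else -1)
    t = (m1 - m2) / se
    if t > critical:
        return 1
    if t < -critical:
        return -1
    return 0


# Synthetic SROCC samples for two hypothetical methods.
a = [0.95 + 0.001 * (i % 5) for i in range(1000)]
b = [0.93 + 0.001 * (i % 5) for i in range(1000)]
print(compare_methods(a, b), compare_methods(a, a), compare_methods(b, a))
```

Running the one-sided test in both directions is what produces the three possible outcomes reported for each pair of methods.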
From the data reported in these four tables, it can be seen
that the proposed method achieves a superior accuracy in image quality prediction as compared to other FR and NR-IQA
methods used in this study. In particular, we observe that the
proposed NR-IQA method provides competitive results as compared to the considered FR methods. We also observe that the
proposed method has a very high accuracy on predicting the
image quality of noisy images. This is because we use the gradient and Laplacian information in the proposed method, both
of which are sensitive to noise.
To select a proper gradient operator, the following three
gradient operators were examined on all distortion types in
LIVE: Sobel, Prewitt, and Scharr. The median SROCC
across 1000 train-test trials of the proposed method using each of these three operators was 0.9491, 0.9403, and
0.9521, respectively. Based on these results, we selected the
Scharr gradient operator as it provides a better accuracy.
Table 2. Median SROCC across 1000 train-test trials of various IQA methods for different types of distortions on LIVE. Italicized entries denote NR-IQA methods while others are FR-IQA methods.
Method      JP2K    JPEG    WN      Blur    FF      All
PSNR        0.8837  0.8515  0.9817  0.8006  0.8939  0.8081
SSIM        0.9601  0.9485  0.9861  0.9537  0.9616  0.9100
VIF         0.9664  0.9478  0.9924  0.9774  0.9698  0.9520
BIQI        0.8414  0.7603  0.9732  0.9118  0.7342  0.7422
DIIVINE     0.9409  0.9097  0.9744  0.9393  0.9128  0.9116
BLIINDS-II  0.9493  0.9505  0.9614  0.9375  0.9079  0.9241
BRISQUE     0.9472  0.9330  0.9883  0.9463  0.9142  0.9365
SSEQ        0.9464  0.9702  0.9806  0.9607  0.9198  0.9383
Proposed    0.9467  0.9768  0.9903  0.9695  0.9202  0.9521
4.4. Generalization
As mentioned earlier, we trained our proposed method based
on the LIVE database. However, it is interesting to see how
the proposed method acts on another unseen IQA database. For
this purpose as in (Liu et al. (2014)), we tested the proposed
method on a portion of the TID2008 database (Ponomarenko
et al. (2009)) on the same distortion classes used in the training
stage. The TID2008 database consists of 25 reference images
and 1700 distorted images over 17 distortion classes. Of these
25 reference images only 24 are natural images and the remaining one is a synthetic image. Therefore, we tested the proposed
method only on the 24 natural images over the same distortion
classes that were used in the training stage (JPEG, JP2K, WN,
Blur). This time, however, we used the entire LIVE database for
training the proposed method. The obtained median SROCC
results are shown in Table 5.
From the results reported in Table 5, we observe that the proposed method is always the best or second best among the NR
methods except on Blur. Hence, we can conclude that, in general, the proposed method achieves competitive results as compared to other NR methods used in this study. The reason for
the somewhat poorer performance under Blur is that blurring
diminishes both the gradient and the Laplacian and therefore
reduces their informativeness.
Table 3. Median RMSE across 1000 train-test trials of various IQA methods for different types of distortions on LIVE. Italicized entries denote NR-IQA methods while others are FR-IQA methods.
Method      JP2K     JPEG     WN      Blur    FF       All
PSNR        7.5641   8.3269   3.0741  9.4289  7.3990   9.4973
SSIM        4.5389   5.0771   2.6584  4.6823  4.4855   6.6355
VIF         4.1943   5.0856   1.9608  3.3315  3.9624   4.9180
BIQI        13.7871  17.0133  5.3804  9.6562  15.5515  15.9547
DIIVINE     8.5703   10.6070  5.2137  8.0663  9.6520   9.9347
BLIINDS-II  8.1730   7.7658   6.5009  8.0696  9.7141   9.0473
BRISQUE     8.3625   9.3782   3.5294  7.5636  9.4359   8.3295
SSEQ        7.8285   5.8467   4.3211  6.0027  8.5418   8.0039
Proposed    6.9476   5.6868   2.1010  5.1445  7.8730   6.5584

Table 4. Results of the statistical significance test conducted on the SROCC values of various methods across 1000 train-test trials.
            PSNR  SSIM  VIF  BIQI  DIIVINE  BLIINDS-II  BRISQUE  SSEQ  Proposed
PSNR           0    -1   -1     1       -1          -1       -1    -1        -1
SSIM           1     0   -1     1       -1          -1       -1    -1        -1
VIF            1     1    0     1        1           1        1     1         0
BIQI          -1    -1   -1     0       -1          -1       -1    -1        -1
DIIVINE        1     1   -1     1        0          -1       -1    -1        -1
BLIINDS-II     1     1   -1     1        1           0       -1    -1        -1
BRISQUE        1     1   -1     1        1           1        0     1         0
SSEQ           1     1   -1     1        1           1       -1     0        -1
Proposed       1     1    1     1        1           1        1     1         0

Table 5. Median SROCC across 1000 train-test trials of various IQA methods for different types of distortions on TID2008. Italicized entries denote
NR-IQA methods while others are FR-IQA methods.
Method    JP2K    JPEG    WN      Blur    All
PSNR      0.8248  0.8753  0.9177  0.9335  0.8703
SSIM      0.9603  0.9354  0.8168  0.9598  0.9016
VIF       0.9697  0.9307  0.9136  0.9576  0.9403
DIIVINE   0.9240  0.8660  0.8510  0.8620  0.8890
BRISQUE   0.9037  0.9102  0.8227  0.8742  0.8977
SSEQ      0.8460  0.8661  0.8012  0.8354  0.8501
Proposed  0.9366  0.8702  0.8824  0.8237  0.8962

4.5. Complexity Analysis
In the proposed method, 3 × 5 = 15 features are calculated for each subband. The total number of subbands in
WPD with N levels of decomposition is $\sum_{i=1}^{N} 4^i$. Hence, since
in the proposed method we used only N = 2 levels of decomposition, the total number of extracted features per image
is 15 × 20 = 300. These features are simply concatenated
together to create a single feature vector of length 300.
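The feature count above follows directly from the subband sum, as the short calculation below illustrates. The function names and default arguments are chosen for illustration only.

```python
def num_subbands(levels):
    """Subbands over all WPD levels 1..levels: the sum of 4**i."""
    return sum(4 ** i for i in range(1, levels + 1))


def num_features(levels, images=3, features_per_subband=5):
    """Total feature-vector length for the three images I, G, and L."""
    return images * features_per_subband * num_subbands(levels)


print(num_subbands(2), num_features(2))  # 20 300
```

Adding a third level would raise the subband count to 4 + 16 + 64 = 84 and the feature length to 1260, which shows how quickly the dimensionality grows with N.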
The average processing time of the proposed method on
the LIVE database with 5 distortion types (implemented in
MATLAB without code optimization) on an Intel Core 2
Duo @ 3.33 GHz, with 8 GB RAM was about 0.9 seconds
per image. The average processing time of the feature extraction stage was about 0.5 seconds, and the average processing time for the quality assessment stage was about 0.4
seconds.
5. Conclusions
In this paper, an efficient NR IQA method was presented
for gray-scale images based on characterizing the statistical
properties of a given distorted image in the wavelet-packet
domain. In the proposed method, several statistical features
are first extracted from the subband coefficients of the given
image as well as the magnitude of its gradient and Laplacian. The extracted features are then processed within a twostage quality assessment framework using SVC and SVR modules to produce a quality score, which indicates the subjective
quality of the distorted image. The proposed method is able
to assess the image quality for various distortion types in a
modular manner. Experimental results indicated that the proposed method achieves high image quality prediction accuracy
as compared to several prominent FR and NR IQA methods.
The code of the proposed method is publicly available at
www.sfu.ca/~ibajic/#software.
References
Chandler, D.M., Hemami, S.S., 2007. VSNR: a wavelet-based visual signal-to-noise ratio for natural images. IEEE Trans. Image Process. 16, 2284–2298.
Coifman, R., Wickerhauser, M.V., 1992. Entropy-based algorithms for best
basis selection. IEEE Trans. Inform. Theory 38, 713–718.
Damera-Venkata, N., Kite, T., Geisler, W., Evans, B., Bovik, A., 2000. Image quality assessment based on a degradation model. IEEE Trans. Image
Process. 9, 636–650.
Ferzli, R., Karam, L., 2009. A no-reference objective image sharpness metric
based on the notion of just noticeable blur (JNB). IEEE Trans. Image Process. 18,
717–728.
Liu, L., Liu, B., Huang, H., Bovik, A.C., 2014. No-reference image quality
assessment based on spatial and spectral entropies. Signal Process.: Image
Communication 29, 856–863.
Meesters, L., Martens, J.B., 2002. A single-ended blockiness measure for JPEG-coded images. Signal Processing 82, 369–387.
Meyer, F.G., Averbuch, A.Z., Strömberg, J.O., 2000. Fast adaptive wavelet
packet image compression. IEEE Trans. Image Process. 9, 792–800.
Mittal, A., Moorthy, A.K., Bovik, A.C., 2012. No-reference image quality
assessment in the spatial domain. IEEE Trans. Image Process. 21, 4695–
4708.
Moorthy, A.K., Bovik, A.C., 2010. A two-step framework for constructing
blind image quality indices. IEEE Signal Process. Lett. 17, 513–516.
Moorthy, A.K., Bovik, A.C., 2011. Blind image quality assessment: From
natural scene statistics to perceptual quality. IEEE Trans. Image Process.
20, 3350–3364.
Ponomarenko, N., Lukin, V., Zelensky, A., Egiazarian, K., Carli, M., Battisti,
F., 2009. TID2008 - a database for evaluation of full-reference visual quality
assessment metrics. Adv. Modern Radioelectron. 10, 30–45.
Saad, M., Bovik, A.C., Charrier, C., 2012. Blind image quality assessment:
a natural scene statistics approach in the DCT domain. IEEE Trans. Image
Process. 21, 3339–3352.
Sheikh, H.R., Bovik, A.C., 2006. Image information and visual quality. IEEE
Trans. Image Process. 15, 430–444.
Sheikh, H.R., Bovik, A.C., de Veciana, G., 2005. An information fidelity criterion for image quality assessment using natural scene statistics. IEEE Trans.
Image Process. 14, 2117–2128.
Sheikh, H.R., Sabir, M.F., Bovik, A.C., 2006. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans. Image
Process. 15, 3441–3452.
Sheikh, H.R., Seshadrinathan, K., Moorthy, A.K., Wang, Z., Bovik, A.C., Cormack, L.K., 2014. Image and video quality assessment research at LIVE.
http://live.ece.utexas.edu/research/quality. [Online].
Simoncelli, E.P., Olshausen, B.A., 2001. Natural image statistics and neural
representation. Annual Review of Neuroscience 24, 1193–1216.
Suthaharan, S., 2009. No-reference visually significant blocking artifact metric
for natural scene images. Signal Processing 89, 1647–1652.
VQEG, 2003. Final report from the video quality experts group on the validation of objective models of video quality assessment, phase II.
http://www.vqeg.org. [Online].
Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P., 2004. Image quality
assessment: From error visibility to structural similarity. IEEE Trans. Image
Process. 13, 600–612.
Wang, Z., Li, Q., 2011. Information content weighting for perceptual image
quality assessment. IEEE Trans. Image Process. 20, 1185–1198.
Xu, Y., Liu, D., Quan, Y., Callet, P.L., 2015. Fractal analysis for reduced
reference image quality assessment. IEEE Trans. Image Process. 24, 2098–
2109.
Zhang, L., Zhang, L., Mou, X., Zhang, D., 2011. FSIM: A feature similarity
index for image quality assessment. IEEE Trans. Image Process. 20, 2378–
2386.