International Journal of Engineering Trends and Technology (IJETT) – Volume 22 Number 9-April 2015
Novel Development of Video Coding using SVC
Concepts in IP Scenario
Rahimunisa Nagma, Dr. TC Manjunath, Pavithra G.
MTech, Department of ECE, HKBKCE
Nagawara, Bangalore, Karnataka-560045, India
Abstract: Scalable video coding (SVC) is attractive to video service
providers because a video's bit stream can be adapted at the server to
suit varying network conditions and device characteristics. Adaptation
lowers the video's bit rate, which can be achieved by reducing the frame
rate, reducing the spatial resolution, and/or applying coarser
quantization to the video sequence. In this paper we evaluate the effects
of scalability using the full-reference metrics PSNR and SSIM and the
no-reference or reduced-reference metrics blocking and blurring. We also
compare SVC with earlier video coding standards and highlight its
advantages over them.
Keywords: H.264 SVC, PSNR, SSIM, Blocking, Blurring.
I. INTRODUCTION
Successive video coding standards have demonstrated significant
improvements in compression capability. Scalable Video Coding (SVC) has
been standardized by the Joint Video Team of the ISO/IEC MPEG and the
ITU-T VCEG as an extension of the existing H.264/AVC standard. SVC
enables partial bit streams to be transmitted and decoded, providing
video services with reduced temporal or spatial resolution or reduced
fidelity while retaining a reconstruction quality that is high relative
to the rate of the partial bit streams. Hence SVC provides adaptation of
power, bit rate and format, and also provides graceful degradation in
lossy transmission environments. Relative to the scalable profiles of
previous video coding standards, SVC achieves significant improvements
in coding efficiency along with an increased degree of supported
scalability.
II. OBJECTIVE
The main objective of the proposed dissertation work is to realize
traffic-aware video coding using Scalable Video Coding. The work aims to
achieve scalability of the video stream in three dimensions, namely
spatial, temporal and quantization, by degrading the stream according to
the available internet speed. Once scalability is achieved, the effect of
network transmission and of the scalability options is quantified at the
end user, who is the final receiver of the transmitted video. The results
can be used to select scalability options that satisfy end-user
requirements and quality or bandwidth constraints.
III. PROPOSED WORK
In the proposed work we use a reduced-reference method instead of a
full-reference method to obtain a quantitative measure of video quality
at the end-user side. Scalability of the video stream is achieved in
three dimensions, namely spatial, temporal and quantization, by degrading
the stream according to the available internet speed. We provide
feedback to the network service provider about the perceptual loss
introduced by the wireless transmission network, which greatly simplifies
the assessment strategy. We describe how scalability is achieved by
extending the previous video coding standard, H.264/Advanced Video
Coding (AVC), towards the Scalable Video Coding (SVC) standard. A block
diagram of the proposed work shows how traffic-aware video coding is
carried out using SVC, and separate encoder and decoder block diagrams
are presented with detailed explanations.
Scalability is the ability of a system, network or process to handle a
growing amount of work, or to be enlarged to accommodate that growth. A
video bit stream is called scalable if parts of the stream can be removed
such that the resulting sub-stream forms another valid bit stream that
represents the source content with lower reconstruction quality than the
original bit stream. The most common scalability modes are temporal,
spatial and quality scalability.
1) Temporal scalability: The subsets of the video bit stream have a
reduced frame rate (temporal resolution).
2) Spatial scalability: The subsets of the video bit stream have a
reduced picture size (spatial resolution).
3) Quality scalability: The subsets keep the same spatial and temporal
resolution but have lower fidelity. This is commonly referred to as
signal-to-noise-ratio (SNR) or fidelity scalability.
Figure 1: Types of Scalabilities
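As a rough illustration of the three scalability modes in Figure 1, the sketch below treats a scalable bit stream as a list of packets, each tagged with a temporal, spatial and quality layer index, and extracts a sub-stream by discarding every packet above a chosen operating point. The `Packet` class, its field names and the byte sizes are hypothetical illustrations; they are not the H.264 SVC syntax and not the implementation used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """One coded unit of a scalable stream (hypothetical, simplified)."""
    temporal_id: int  # 0 = lowest frame rate
    spatial_id: int   # 0 = smallest picture size
    quality_id: int   # 0 = coarsest fidelity
    size_bytes: int


def extract_substream(packets, max_temporal, max_spatial, max_quality):
    """Keep only the packets at or below the requested operating point."""
    return [
        p for p in packets
        if p.temporal_id <= max_temporal
        and p.spatial_id <= max_spatial
        and p.quality_id <= max_quality
    ]


def total_bytes(packets):
    """Total payload size of a (sub-)stream in bytes."""
    return sum(p.size_bytes for p in packets)


# Illustrative stream: two layers per dimension, made-up packet sizes.
stream = [
    Packet(t, s, q, size_bytes=100 * (1 + t + s + q))
    for t in range(2) for s in range(2) for q in range(2)
]
base_layer = extract_substream(stream, 0, 0, 0)   # lowest layer in every dimension
half_rate = extract_substream(stream, 0, 1, 1)    # temporal scaling only
```

Because each sub-stream is obtained purely by dropping packets, the adaptation can be done at the server or in the network without re-encoding, which is the property the definition above relies on.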
Figure 2: Proposed Block Diagram
The video sequence is first converted into a YUV sequence, where the YUV
model defines a color space in terms of one luma (Y) and two chrominance
(UV) components. The raw YUV sequence is encoded with the H.264 SVC
encoder, which produces three different bit streams, each scaled down
along one dimension: temporal, spatial or quantization. Because SVC can
adapt to different network conditions and device characteristics, the
encoded scalable bit stream is subjected to network-based video scaling.
The bit stream is then transmitted and fed to the H.264 SVC decoder,
which produces the decoded sequence, called the extracted YUV sequence.
The extracted YUV sequence is compared with the raw original sequence,
and the effect on video quality of degradation in the spatial, temporal
or quantization dimension is assessed. Both full-reference and
reduced-reference video quality measurements are made. Full-reference
measurement requires both the original and the extracted sequence,
whereas reduced-reference measurement requires mostly only the extracted
YUV sequence.
Figure 3: Basic Block diagram of Encoder and Decoder
In the basic block diagram of the encoder and decoder, the source video
is taken from a PC display and encoded using the H.264 SVC encoder to
obtain a binary video bit stream. This bit stream is transmitted over an
HTTP network, which is designed to enable client-server communication.
From the HTTP network, the bit stream is fed to the H.264 SVC decoder,
which decodes it into the extracted video sequence that is shown on the
receiver-side PC display.
A. Encoder part
Figure 4: General H.264 Encoder block diagram
The source video is taken and prediction is carried out. H.264 SVC
involves intra prediction and inter prediction. In intra prediction, a
macroblock is predicted by referring only to the current slice, without
reference to any outside data. For the luma component there are three
choices of intra prediction block size, i.e., 16x16, 8x8 or 4x4, and for
the chroma component there is a single prediction block. An example of a
4x4 block to be predicted is shown in Figure 5.
Figure 5: 4x4 luma block to be predicted
Inter prediction refers to previously coded frames using
motion-compensated prediction. It involves selecting a prediction
region, generating a prediction block, and subtracting this block from
the original block of samples to give a residual, which is then coded and
transmitted. A motion vector is the offset between the position of the
current partition and the prediction region in the reference picture.
Each motion vector is differentially coded with respect to the motion
vectors of neighbouring blocks. Figure 6 shows the macroblock and
sub-macroblock partitions.
Figure 6: Macroblock partitions and sub-macroblock partitions
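The inter prediction steps described above (fetch a prediction block at the motion-vector offset, subtract it to form a residual, and code the motion vector as a difference from its neighbours) can be sketched as follows. This is a minimal illustration under stated assumptions: motion vectors are integer-pel `(dy, dx)` pairs, no bounds checking or sub-pixel interpolation is done, and the motion vector predictor is taken as the component-wise median of three neighbouring vectors. All function names are hypothetical.

```python
from statistics import median


def prediction_block(reference, row, col, mv, size):
    """Copy a size x size block from the reference frame at the offset mv."""
    dy, dx = mv
    return [
        [reference[row + dy + r][col + dx + c] for c in range(size)]
        for r in range(size)
    ]


def residual_block(current, prediction):
    """Sample-wise difference that would be transformed and coded."""
    return [
        [cur - pred for cur, pred in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(current, prediction)
    ]


def motion_vector_difference(mv, neighbour_mvs):
    """Difference between mv and the median of the neighbouring vectors."""
    predicted = (
        median(v[0] for v in neighbour_mvs),
        median(v[1] for v in neighbour_mvs),
    )
    return (mv[0] - predicted[0], mv[1] - predicted[1])


# Toy 6x6 reference frame whose sample value encodes its position.
reference = [[10 * r + c for c in range(6)] for r in range(6)]
current = [[31, 33], [41, 42]]          # 2x2 block located at row 2, col 2
mv = (1, -1)                            # points one row down, one column left
pred = prediction_block(reference, 2, 2, mv, 2)
res = residual_block(current, pred)
mvd = motion_vector_difference(mv, [(1, 0), (0, -1), (2, -1)])
```

When the prediction is good, most residual samples are zero and the motion vector difference is small, so both cost few bits after transform and entropy coding.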
B. Transform and Quantization
The transform and quantization in H.264/SVC are designed to minimize
computational complexity, to avoid encoder-decoder mismatch, and to be
suitable for implementation using limited-precision integer arithmetic.
This is achieved by (i) using an integer core transform, which is carried
out using integer or fixed-point arithmetic, and (ii) integrating the
normalization step of the transform into the quantization process, which
minimizes the number of multiplications required to process a block of
residual data.
The scaling and inverse transform processes carried out by a decoder are
exactly specified, so that every H.264 implementation produces identical
results and mismatch between different transform implementations is
eliminated.
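To make the integer-arithmetic point concrete, the sketch below applies the 4x4 forward core transform of H.264 (the matrix `CF`) to a residual block using only integer additions and multiplications. The `quantize` function is a deliberately simplified uniform quantizer and is an assumption for illustration: the standard instead folds position-dependent normalization factors into integer multiply-and-shift operations, which is not reproduced here.

```python
# Forward 4x4 core transform matrix of H.264 (integer entries only).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def forward_core_transform(block):
    """Compute CF * block * CF^T using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


def quantize(coeffs, step):
    """Simplified uniform quantizer (assumption): truncate |c| / step."""
    def level(c):
        magnitude = abs(c) // step
        return magnitude if c >= 0 else -magnitude
    return [[level(c) for c in row] for row in coeffs]


flat_residual = [[1] * 4 for _ in range(4)]      # constant residual block
coefficients = forward_core_transform(flat_residual)
levels = quantize(coefficients, step=4)
```

For a constant block all the energy ends up in the top-left (DC) coefficient, which is why the quantized coefficients of smooth regions are mostly zero.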
For entropy coding, H.264 SVC uses the CABAC mode. Coding proceeds as
follows.
Binarization: CABAC encodes only binary decisions (1 or 0), so a
non-binary-valued symbol is binarized prior to arithmetic coding.
The following steps are repeated for each bit, or bin, of the binarized
symbol:
1) Selection of context model: A context model is a probability model
that stores the probability of each bin being '1' or '0'. It is
selected for one or more bins from the available models, depending
on the statistics of recently coded data symbols.
2) Arithmetic encoding: Each bin is encoded according to the selected
probability model. There are only two sub-ranges for each bin,
corresponding to the values '0' and '1'.
3) Updating of the selected context model: The model is updated based
on the actual coded value. For example, if the bin value was '0',
the frequency count of '0' is increased.
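The binarize, model and update steps listed above can be illustrated with a small frequency-count model. This is only a sketch of the adaptation idea: real CABAC uses a table-driven probability state machine and a range coder, neither of which is reproduced here. The unary binarization, the class `ContextModel` and the function names are assumptions made for the example, and the arithmetic coder is replaced by the ideal code length of each bin.

```python
import math


def binarize_unary(value):
    """Map a non-negative integer to bins: value ones followed by a zero."""
    return [1] * value + [0]


class ContextModel:
    """Adaptive estimate of P(bin = 0) and P(bin = 1) from frequency counts."""

    def __init__(self):
        self.counts = [1, 1]  # start from equal counts for '0' and '1'

    def probability(self, bin_value):
        return self.counts[bin_value] / (self.counts[0] + self.counts[1])

    def update(self, bin_value):
        # Step 3: increase the frequency count of the value just coded.
        self.counts[bin_value] += 1


def estimated_bits(bins, model):
    """Ideal arithmetic-coding cost in bits, adapting the model per bin."""
    total = 0.0
    for b in bins:
        total += -math.log2(model.probability(b))
        model.update(b)
    return total


model = ContextModel()
for b in [0, 0, 0]:
    model.update(b)
p_zero = model.probability(0)                     # model now favours '0'
skewed_cost = estimated_bits([0] * 20, ContextModel())
```

As the model adapts to a skewed source, each further bin costs well under one bit, which is the gain that context adaptation provides over a fixed code.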
C. Entropy Encoding
Prior to entropy coding, each block of transform coefficients is
converted into a linear array by scanning. The scan order is intended to
group together the non-zero quantized coefficients, called significant
coefficients. For a typical block of a progressive frame, the non-zero
coefficients tend to be clustered around the top-left DC coefficient, and
a zigzag scan order is most efficient in this case. An example of a scan
order for 4x4 blocks of a progressive frame is shown in Figure 7.
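The zigzag scan can be sketched in a few lines. The code below generates a zigzag order for an n x n block by walking the anti-diagonals in alternating directions and then reads a quantized block in that order. The starting direction is one common convention and is an assumption here, as are the function names and the example block.

```python
def zigzag_order(n=4):
    """Return (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):                # s = row + col of one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diagonal.reverse()                # even diagonals run bottom-left to top-right
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Convert a square block into a linear array following the zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Block holding its own raster index, to expose the scan order.
raster = [[4 * r + c for c in range(4)] for r in range(4)]
scan_positions = zigzag_scan(raster)

# Typical quantized block: non-zero values clustered near the DC position.
quantized = [
    [7, 3, 0, 0],
    [2, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
linear = zigzag_scan(quantized)
```

In the scanned array the significant coefficients sit at the front and the zeros form one long trailing run, which the entropy coder can represent compactly.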
Figure 7: Zigzag coding pattern
An H.264 stream consists of a series of coded symbols. There are several
methods for coding these symbols, including:
1) Fixed-length code: The symbols are converted into a binary code of a
specified length.
2) CAVLC (Context-Adaptive Variable Length Coding): Using context
adaptation, different sets of variable-length codes are chosen
depending on the statistics of recently coded coefficients.
3) CABAC (Context-Adaptive Binary Arithmetic Coding): A method of
arithmetic coding in which the probability models are updated based
on previous coding statistics.
The result is compressed H.264 data, which is transmitted over a wireless
HTTP network (or stored) and delivered to the H.264 SVC decoder.
D. Decoder part
The compressed H.264 bit stream is given to the H.264 SVC video decoder,
which decodes each syntax element to extract information such as
quantized transform coefficients and prediction information. This
information is then used to reverse the coding process and recreate the
video sequence.
Figure 8: General H.264 Decoder block diagram
IV. PROJECT IMPLEMENTATION AND SIMULATION REPORT
The dissertation work started with the implementation of the H.264 SVC
encoder and decoder. Subjective and objective measurements of video
quality are then carried out.
1) Subjective measurement of video quality: This is based on, or
influenced by, personal opinion. It does not give a quantitative
measure of quality but describes it in words such as good, better or
best. Subjective quality is influenced by the Human Visual System
(HVS), that is, the eye and the brain. A viewer's opinion of video
quality is also affected by other factors such as state of mind,
viewing environment and visual attention.
2) Objective measurement of video quality: This gives a quantitative
measure of video quality with little complexity and cost compared to
subjective measurement. It comprises full-reference evaluation and
reduced-reference evaluation of video.
Examples of full-reference evaluation methods are:
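1) PSNR: It is the ratio of the peak signal energy to the error energy,
expressed in decibels. For n-bit samples,
$$\mathrm{PSNR}_{dB} = 10 \log_{10} \frac{(2^n - 1)^2}{\mathrm{MSE}} \tag{1}$$
Taking n = 8,
$$\mathrm{PSNR}_{dB} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \tag{2}$$
where MSE is the mean squared error between the original and the decoded
frame. PSNR is simple and fast to calculate, and it is widely used to
compare the quality of compressed and decompressed video images.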
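Equations (1) and (2) translate directly into code. The sketch below computes the MSE between two frames stored as lists of rows and converts it to PSNR in dB. The function names and the tiny example frames are illustrative assumptions. Identical frames have zero MSE, so the sketch returns infinity in that case.

```python
import math


def mean_squared_error(original, decoded):
    """Average squared sample difference between two equally sized frames."""
    squared = [
        (o - d) ** 2
        for o_row, d_row in zip(original, decoded)
        for o, d in zip(o_row, d_row)
    ]
    return sum(squared) / len(squared)


def psnr_db(original, decoded, bit_depth=8):
    """PSNR in dB as in eqn (1); bit_depth = 8 gives eqn (2)."""
    mse = mean_squared_error(original, decoded)
    if mse == 0:
        return math.inf                      # frames are identical
    peak = (1 << bit_depth) - 1              # 2^n - 1, i.e. 255 for n = 8
    return 10 * math.log10(peak ** 2 / mse)


original = [[100, 110], [120, 130]]
decoded = [[101, 109], [121, 129]]           # every sample is off by one
value = psnr_db(original, decoded)
```

With every sample off by one, the MSE is 1 and the PSNR is 10 log10(255^2), about 48.13 dB. Larger errors lower the value.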
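2) SSIM: It measures three components, namely luminance similarity,
contrast similarity and structural similarity, and combines them to
give the result:
$$\mathrm{SSIM}(i) = \frac{(2 m_x m_y + c_1)(2\,\mathrm{cov}_{xy} + c_2)}{(m_x^2 + m_y^2 + c_1)(\mathrm{var}_x + \mathrm{var}_y + c_2)} \tag{3}$$
where m_x and m_y are the means, var_x and var_y the variances and
cov_xy the covariance of the two image windows x and y, and c_1 and
c_2 are small stabilizing constants.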
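A minimal sketch of eqn (3) for a single window is given below. It assumes that each window is passed as a flat list of samples, that population (divide-by-N) statistics are used, and that the constants follow the commonly used choice c1 = (0.01 L)^2 and c2 = (0.03 L)^2 with dynamic range L = 255. The paper does not state these choices, so they are assumptions. A full SSIM implementation would average this value over many local windows of the frame.

```python
def ssim_window(x, y, dynamic_range=255, k1=0.01, k2=0.03):
    """SSIM of two equally sized windows given as flat lists of samples."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1 = (k1 * dynamic_range) ** 2           # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2           # stabilizes the contrast/structure term
    numerator = (2 * mx * my + c1) * (2 * cov_xy + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


window = [10, 20, 30, 40]
same = ssim_window(window, window)               # identical windows
flipped = ssim_window(window, [40, 30, 20, 10])  # same mean, reversed structure
```

Identical windows score 1, while a window with the same mean and variance but reversed structure scores much lower, which is the structural sensitivity that PSNR lacks.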
Examples of reduced-reference evaluation are:
1) Blocking: Blocking appears as square or rectangular distortion areas
in an image. This kind of distortion is likely to be seen at the
boundary between blocks that contain coded coefficients, or at the
boundary of an intra-coded macroblock. Block distortion is likely to
be more significant when the quantization parameter (QP) is higher.
2) Blurring: Blurring is another degradation measure. It increases with
increasing compression, as the contrast between neighboring pixels is
reduced.
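The two descriptions above suggest simple measurements that need only the decoded frame. The sketch below is a hypothetical illustration and not the blocking and blurring metrics used in this work. `blockiness` compares the average luminance jump across block boundaries with the average jump inside blocks, and `neighbour_contrast` reports the average absolute difference between horizontally adjacent pixels (lower values indicate more blur). Only horizontal neighbours are examined and the block size of 4 is an assumption.

```python
def horizontal_differences(frame):
    """Yield (column, |difference|) for each pair of horizontal neighbours."""
    for row in frame:
        for c in range(1, len(row)):
            yield c, abs(row[c] - row[c - 1])


def blockiness(frame, block_size=4):
    """Mean jump across block boundaries minus mean jump inside blocks."""
    edge, inner = [], []
    for c, diff in horizontal_differences(frame):
        (edge if c % block_size == 0 else inner).append(diff)
    if not edge or not inner:
        return 0.0                           # frame too narrow to judge
    return sum(edge) / len(edge) - sum(inner) / len(inner)


def neighbour_contrast(frame):
    """Mean absolute difference between neighbouring pixels (low = blurred)."""
    diffs = [diff for _, diff in horizontal_differences(frame)]
    return sum(diffs) / len(diffs)


blocky = [[10] * 4 + [50] * 4 for _ in range(2)]   # flat blocks, sharp seam
ramp = [list(range(8)) for _ in range(2)]          # smooth gradient, no seam
sharp = [[0, 100, 0, 100]]
blurred = [[40, 60, 40, 60]]                       # same pattern, less contrast
```

A frame made of flat blocks with a visible seam scores high on `blockiness`, a smooth gradient scores zero, and the low-contrast pattern yields a smaller `neighbour_contrast` than the sharp one.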
V. SIMULATION RESULTS
Figure 9: The simulation waveforms of the metrics.
Figure 10: Comparison waveform of SNR versus MSE.
Figure 11: Comparison waveform of SNR versus PSNR.
Figure 9 shows the simulation waveforms of the four metrics, PSNR, SSIM,
blocking and blurring, plotted against the frame number when noise is
present at the decoder side. Because the transmission process is not
simulated, noise is added manually to produce a disturbance at the
decoder side. Figure 10 shows the behavior of the mean squared error
(MSE) as the signal-to-noise ratio (SNR) is varied, and Figure 11 shows
the behavior of the peak signal-to-noise ratio (PSNR) as the SNR is
varied. Tables I to III list the SNR, MSE and PSNR values taken from the
comparison waveforms for three video clips (screw, rhinos and pets).

TABLE I
COMPARISON TABLE FOR SCREW VIDEO CLIP
SNR (dB)    MSE       PSNR (dB)
-20         0.3050    11.8783
-15         0.2916    12.3246
-10         0.2761    12.8705
-5          0.2407    14.2440
0           0.1910    16.5544
5           0.1163    21.5200
10          0.0346    38.6449
15          0.0018    63.1510

TABLE II
COMPARISON TABLE FOR RHINOS VIDEO CLIP
SNR (dB)    MSE       PSNR (dB)
-20         0.4794    7.3523
-15         0.4640    7.6792
-10         0.4365    8.2899
-5          0.3897    9.4290
0           0.3078    11.7833
5           0.1863    16.8024
10          0.0574    28.5847
15          0.0023    61.1243

TABLE III
COMPARISON TABLE FOR PETS VIDEO CLIP
SNR (dB)    MSE       PSNR (dB)
-20         0.4796    7.3479
-15         0.4645    7.6687
-10         0.4355    8.3121
-5          0.3890    9.4415
0           0.3068    11.8617
5           0.1867    16.7808
10          0.0568    28.6786
15          0.0024    60.1923

VI. CONCLUSION AND FUTURE SCOPE
The H.264 SVC encoder and decoder parts are implemented. The two
no-reference metrics, blocking and blurring, are used to determine the
effect on video quality as we progress along the degradation path for
every scalable dimension. The effect on video quality in the presence of
loss is also investigated for each scalable dimension. Our findings
indicate that as the spatial resolution and the quantization decrease,
the impact of loss on video quality decreases, whereas loss in temporally
degraded sequences has a greater impact on video quality.
This work can be used as a reference for selecting suitable dimensions to
maximize video quality when constructing the SVC sequence layers, and to
save as much bandwidth as possible.

ACKNOWLEDGMENT
We wish to acknowledge HKBK College of Engineering for providing the
infrastructure to carry out the process of developing a soft core for
traffic-aware video coding using Scalable Video Coding (SVC).