Proposal

EE5359 – Multimedia Processing
“A performance comparison of fractional-pel interpolation filters in HEVC and
H.264/AVC”
Under the guidance of
Dr.K.R.RAO
EE 5359 - Multimedia Processing Project Updated Report
Submitted By: LOHITH SUBRAMANYA
UTA ID: 1000928742
E-mail: lohith.subramanya@mavs.uta.edu
Date of Submission: 19th February 2014
Spring 2014
List of Acronyms:
AIF: Adaptive Interpolation Filter
ALF: Adaptive Loop Filter
APEC: Adaptive Prediction Error Coding
AVC: Advanced Video Coding
AQMS: Adaptive Quantization Matrix Selection
CSVT: Circuits and Systems for Video Technology
DCT: Discrete Cosine Transform
DCTIF: Discrete Cosine Transform Interpolation Filter
DMVD: Decoder-side Motion Vector Derivation
DSP: Digital Signal Processing
EMS: Extended Macro-block Size
FIR: Finite Impulse Response
HEVC: High Efficiency Video Coding
IBDI: Internal Bit Depth Increasing
ITU-T: International Telecommunication Union – Telecommunication Standardization Sector
JCT-VC: Joint Collaborative Team on Video Coding
JPEG: Joint Photographic Experts Group
KLT: Karhunen-Loève Transform
LTS: Larger Transform Size
MCP: Motion Compensated Prediction
MPEG: Moving Picture Experts Group
MV: Motion Vectors
RDO: Rate Distortion Optimization
SOC: System On Chip
SVN: Subversion
VCEG: Video Coding Experts Group
VCIP: Visual Communications and Image Processing
Objective:
The objective of this project is to compare and analyze the fractional-pel interpolation
filters in HEVC [18] and H.264/AVC [17] based on their frequency responses, complexity,
coding performance and performance gain. BD-PSNR [33] and BD-Bit Rate [33] are the metrics
used to compare the filters of the two standards.
Introduction:
The fractional-pel interpolation filters (6-tap FIR [24] and average) adopted in
H.264/AVC [17] improve motion compensation greatly. Similarly, DCT-based fractional-pel
interpolation filters (7-tap and 8-tap) are adopted in the HEVC [1] [10] standard. This project
examines the differences between these fractional-pel interpolation filters. The derivations of
the fractional-pel interpolation filters in HEVC [1] and H.264/AVC [17] will be described in
detail, and the filters will be compared based on the properties of their frequency responses.
What is H.264 [17]?
It is an industry standard for video compression, the process of converting digital video
into a format that takes up less capacity when it is stored or transmitted. Video compression (or
video coding) is an essential technology for applications such as digital television, DVD-Video,
mobile TV, videoconferencing and internet video streaming. Standardizing video compression
makes it possible for products from different manufacturers (e.g. encoders, decoders and storage
media) to inter-operate. The encoder converts video into a compressed format and the decoder
converts compressed video back into an uncompressed format.
H.264 [17] defines a format (syntax) for compressed video and a method for decoding
this syntax to produce a displayable video sequence. Figure 1 [17] shows the encoding and
decoding processes and highlights the parts that are covered by the H.264 standard.
Figure 1: Block Diagram of H.264 [17]
How does an H.264 [17] codec work?
An H.264 [17] video encoder carries out prediction, transform and encoding processes
(Figure 1) [17] to produce a compressed H.264 [17] bitstream. An H.264 [17] video decoder
carries out the complementary processes of decoding, inverse transform and reconstruction to
produce a decoded video sequence.
What is HEVC [1]?
High Efficiency Video Coding (HEVC) [1] is the current joint video coding
standardization project of the ITU-T Video Coding Experts Group (VCEG) (ITU-T Q.6/SG 16)
and ISO/IEC Moving Picture Experts Group (MPEG) (ISO/IEC JTC 1/SC 29/WG 11).
The Joint Collaborative Team on Video Coding (JCT-VC) [25] has been established to
work on this project.
The Joint Collaborative Team on 3D Video Coding Extension Development (JCT-3V)
[25] has been established to work on 3D video coding extensions of HEVC and other video
coding standards.
The block diagram of HEVC is shown in Figure 2 [6].
Figure 2: Block diagram of HEVC [6]
Why use HEVC?
All video technologies need encoding and decoding to ensure efficient transmission and
storage. HEVC’s better use of bandwidth is designed to enable higher resolutions without
crippling networks and overstuffing storage systems. Since the video and cinema industries are
edging towards 8Kx4K [22] video resolutions with images of more than 8 megapixels, the
new HEVC standard could soon be a widely used technology. However, because numerous
patents have been filed, its use will not be free. The patent filings are shown in Figure 3
[2] and the HEVC encoding efficiency for a sample (Kimono1) video sequence is shown in
Figure 4 [5].
Figure 3: Patent filings of HEVC related patents by the five biggest patent holders [2]
Figure 4: HEVC encoding efficiency in comparison with the previous standards [5]
Features of HEVC [3]:
The JCT-VC is currently evaluating modifications [3] to current coding tools, such as:
ALF: Adaptive Loop Filter
EMS: Extended Macro-block Size
LTS: Larger Transform Size
IBDI: Internal Bit Depth Increasing
AQMS: Adaptive Quantization Matrix Selection
as well as new coding tools such as:
Modified intra prediction
Modified deblocking filter
DMVD: Decoder-side Motion Vector Derivation
The new features [3] proposed to meet the requirements are:
2-D non-separable AIF
Separable AIF
Directional AIF
"Super-macro-block" structure up to 64x64 with additional transforms.
Adaptive prediction error coding (APEC) in spatial and frequency domains
Competition-based scheme for motion vector selection and coding
Mode-dependent KLT [21] for intra coding
It is speculated that these techniques are most beneficial with multi-pass encoding.
FIR Filters [27]:
FIR filters [27] are one of two primary types of digital filters used in Digital Signal Processing
(DSP) applications.
"FIR" means "Finite Impulse Response". If an impulse, that is, a single "1" sample
followed by many "0" samples, is fed into the filter, only zeroes will come out once the "1"
sample has made its way through the delay line of the filter.
An N-Tap FIR filter is shown in figure 5 [28].
Figure 5: N-Tap FIR filter [28]
Here h(k) is the filter coefficient array and x(n-k) is the input data array to the filter.
The number N is the number of taps of the filter; more taps allow a sharper frequency response
at the cost of more computation.
An N-tap FIR filter requires N multiply-accumulate operations per output sample.
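The tap structure described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the direct-form sum y(n) = h(0)x(n) + h(1)x(n-1) + ... + h(N-1)x(n-N+1) with samples before the start of the input taken as zero; the function name and the 3-tap coefficients are hypothetical and are not taken from [27] or [28].

```python
def fir_filter(h, x):
    """Direct-form FIR filter: y[n] = sum over k of h[k] * x[n - k].

    Input samples before the start of x are treated as zero.
    """
    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(len(h)):          # one multiply-accumulate per tap
            if n - k >= 0:
                acc += h[k] * x[n - k]
        y.append(acc)
    return y


if __name__ == "__main__":
    taps = [1, 2, 3]                     # hypothetical 3-tap coefficients
    impulse = [1, 0, 0, 0, 0, 0]         # a single "1" followed by zeros
    # The output repeats the coefficients, then stays at zero: the impulse
    # response is finite, which is what "FIR" refers to.
    print(fir_filter(taps, impulse))
```

Feeding an impulse through the filter returns the coefficient array itself, followed by zeros once the "1" has left the delay line.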
Why use interpolation in video coding?
Motion-compensated prediction (MCP) [8] is the key to the success of the modern video
coding standards, as it removes the temporal redundancy in video signals and reduces the size of
bitstreams significantly. With MCP, the pixels to be coded are predicted from the temporally
neighboring ones, and only the prediction errors and the motion vectors (MV) [8] are
transmitted. However, due to the finite sampling rate, the actual position of the prediction in the
neighboring frames may fall off the sampling grid, where the intensity is unknown. So, the
intensities at the positions in between the integer pixels, called sub-positions, must be
interpolated, and the resolution of the MV [8] is increased accordingly.
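As a small illustration of what the increased MV resolution means in practice, the sketch below assumes that one MV component is stored as an integer in quarter-pel units, as is usual for luma in H.264/AVC and HEVC. The function name is hypothetical.

```python
def split_quarter_pel_mv(mv):
    """Split one MV component, given in quarter-pel units, into
    (integer-pel offset, fractional phase).

    Phase 0 = integer position, 1 = quarter, 2 = half, 3 = three-quarter.
    """
    integer_part = mv >> 2   # arithmetic shift: floor division by 4
    phase = mv & 3           # low two bits select the sub-position
    return integer_part, phase


if __name__ == "__main__":
    for mv in (9, 8, -3):
        print(mv, split_quarter_pel_mv(mv))
```

Only a non-zero phase requires interpolation; the integer part simply selects which integer pixels feed the filter.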
One aspect of HEVC [1] video compression involves interpolation among various pixels
to determine brightness. Figure 6 [9] from the draft standard shows how this process takes place.
Figure 6: Integer pixel (Shaded with upper case letters) and fractional pixel positions
(Non-shaded blocks with lower case letters) for quarter-pel LUMA interpolation [9]
(Credit: ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC))
The interpolation filters used in H.264 [17] are a 6-tap FIR filter for half-pel interpolation and an
averaging filter for quarter-pel interpolation. Similarly, in HEVC [3], an 8-tap DCTIF is used for
half-pel interpolation and a 7-tap DCTIF is used for quarter-pel interpolation.
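The two filter families can be compared side by side on a single row of luma samples. The sketch below is a simplified 1-D illustration, assuming 8-bit samples and the commonly cited luma coefficients: (1, -5, 20, 20, -5, 1)/32 for the H.264/AVC half-pel filter, and (-1, 4, -11, 40, 40, -11, 4, -1)/64 and (-1, 4, -10, 58, 17, -5, 1)/64 for the HEVC half-pel and quarter-pel filters. The function names are hypothetical, and the rounding is reduced to a single stage, whereas the codecs use intermediate precision for 2-D positions.

```python
H264_HALF = (1, -5, 20, 20, -5, 1)              # scaled by 32
HEVC_HALF = (-1, 4, -11, 40, 40, -11, 4, -1)    # scaled by 64
HEVC_QUARTER = (-1, 4, -10, 58, 17, -5, 1)      # scaled by 64


def clip8(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))


def h264_half(row, i):
    """Half-pel sample between row[i] and row[i + 1] (6 taps: i-2 .. i+3)."""
    acc = sum(c * row[i - 2 + k] for k, c in enumerate(H264_HALF))
    return clip8((acc + 16) >> 5)


def h264_quarter(row, i):
    """Quarter-pel sample: rounded average of row[i] and the half-pel sample."""
    return (row[i] + h264_half(row, i) + 1) >> 1


def hevc_half(row, i):
    """Half-pel sample between row[i] and row[i + 1] (8 taps: i-3 .. i+4)."""
    acc = sum(c * row[i - 3 + k] for k, c in enumerate(HEVC_HALF))
    return clip8((acc + 32) >> 6)


def hevc_quarter(row, i):
    """Quarter-pel sample right of row[i] (7 taps: i-3 .. i+3)."""
    acc = sum(c * row[i - 3 + k] for k, c in enumerate(HEVC_QUARTER))
    return clip8((acc + 32) >> 6)


if __name__ == "__main__":
    ramp = [0, 10, 20, 30, 40, 50, 60, 70]      # hypothetical sample row
    print(h264_half(ramp, 3), hevc_half(ramp, 3))
    print(h264_quarter(ramp, 3), hevc_quarter(ramp, 3))
```

Note the structural difference: the H.264/AVC quarter-pel sample depends on an already rounded half-pel sample, while the HEVC quarter-pel sample is computed directly from integer pixels.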
DCT [10] is one of the popular transforms used in video signal processing applications, since
DCT exhibits similar properties to the optimal KLT [21]. The type-II DCT (DCT-II) [10] used in the
image compression standard JPEG [31] is defined by:
Here, x(n) is the input sample sequence and X(k) is its DCT coefficient sequence. Also,
By substituting the forward DCT-II [10] equation into the inverse equation, the
interpolation formula is obtained, which is as follows:
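A sketch of these relations in LaTeX, assuming the DCT meant here is the orthonormally scaled type-II DCT over N samples (the scaling and indexing used in [10] may differ). The fractional offset is written as \alpha, with 0 \le \alpha < 1; this symbol is an assumption of the sketch.

```latex
\begin{align*}
X(k) &= \sqrt{\frac{2}{N}}\, c_k \sum_{n=0}^{N-1} x(n)
        \cos\!\left(\frac{(2n+1)k\pi}{2N}\right), & k &= 0, \dots, N-1 \\
x(n) &= \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} c_k\, X(k)
        \cos\!\left(\frac{(2n+1)k\pi}{2N}\right), & n &= 0, \dots, N-1 \\
c_k  &= \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & k \neq 0 \end{cases} \\[4pt]
x(n+\alpha) &= \frac{2}{N} \sum_{k=0}^{N-1} c_k^{2}
        \cos\!\left(\frac{(2(n+\alpha)+1)k\pi}{2N}\right)
        \sum_{m=0}^{N-1} x(m) \cos\!\left(\frac{(2m+1)k\pi}{2N}\right)
\end{align*}
```

The last line follows by inserting the first line into the second and evaluating the cosine at the non-integer position n + \alpha. The coefficient multiplying each x(m) is the interpolation filter tap for that sample.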
For even tap filters, the equation changes to:
Similarly, for an odd tap filter, the equation is represented by:
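The taps implied by the DCT-based interpolation formula can be evaluated numerically for both even and odd tap counts. The sketch below assumes the orthonormal type-II DCT and places the interpolated position just right of the sample with index (N - 1) // 2; the function name and this indexing are assumptions, not the notation of [10]. The values are unquantized real numbers, so after scaling by 64 they need not coincide with the integer coefficients adopted in HEVC.

```python
import math


def dctif_weights(num_taps, alpha):
    """Real-valued DCT-based interpolation weights.

    Returns w[0..num_taps-1] such that the sample at position p + alpha,
    with p = (num_taps - 1) // 2, is approximated by sum(w[m] * x[m]).
    """
    n = num_taps
    pos = (n - 1) // 2 + alpha
    weights = []
    for m in range(n):
        acc = 0.0
        for k in range(n):
            ck2 = 0.5 if k == 0 else 1.0          # c_k squared
            acc += (ck2
                    * math.cos((2 * m + 1) * k * math.pi / (2 * n))
                    * math.cos((2 * pos + 1) * k * math.pi / (2 * n)))
        weights.append(2.0 * acc / n)
    return weights


if __name__ == "__main__":
    half = dctif_weights(8, 0.5)       # even number of taps, half-pel
    quarter = dctif_weights(7, 0.25)   # odd number of taps, quarter-pel
    print([round(64 * w, 2) for w in half])
    print([round(64 * w, 2) for w in quarter])
```

Three properties are worth checking on the output: the weights sum to one (a flat signal is preserved), an offset of zero returns the integer sample unchanged, and the even-tap half-pel filter is symmetric.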
The filter coefficients for half-pel and quarter-pel filters are:
The filter weights of the corresponding positions in HEVC [1] [10] are:
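A sketch of the horizontal luma positions in LaTeX, assuming the formulation of the HEVC overview [11] and the sample labels of Figure 6, where A_{i,j} are integer pixels and a, b, c are the quarter-, half- and three-quarter-pel positions to the right of A_{0,j}. The array names h and q are labels chosen for this sketch, and the expressions should be checked against [10] and [11].

```latex
\begin{align*}
a_{0,j} &= \Big( \sum_{i=-3}^{3} A_{i,j}\, q[i] \Big) \gg (B - 8) \\
b_{0,j} &= \Big( \sum_{i=-3}^{4} A_{i,j}\, h[i] \Big) \gg (B - 8) \\
c_{0,j} &= \Big( \sum_{i=-2}^{4} A_{i,j}\, q[1-i] \Big) \gg (B - 8)
\end{align*}
% h[-3..4] = (-1, 4, -11, 40, 40, -11, 4, -1)   half-pel, 8 taps
% q[-3..3] = (-1, 4, -10, 58, 17, -5, 1)        quarter-pel, 7 taps
```

The three-quarter position reuses the quarter-pel coefficients in mirrored order, which is why its index runs as 1 - i.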
Here, the constant B = 8 is the bit depth of the reference samples and >> denotes an arithmetic
right shift. The magnitude responses of the half-pel interpolation filters are shown in Figure 7.
Figure 7: Magnitude response graphs of half-pel interpolation filters [10]
Here, the solid graph represents the DCTIF 8-Tap filter response.
Dashed graph represents H.264/AVC filter response.
Dotted graph represents DCTIF 6-Tap filter response.
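Magnitude responses of this kind can be reproduced numerically from the filter coefficients. The sketch below is a minimal illustration using only the Python standard library; it assumes the half-pel coefficients (1, -5, 20, 20, -5, 1)/32 for H.264/AVC and (-1, 4, -11, 40, 40, -11, 4, -1)/64 for the HEVC 8-tap filter, and it does not attempt to reproduce the exact curves or the 6-tap DCTIF of [10]. The function name is hypothetical.

```python
import cmath
import math

H264_HALF = [c / 32 for c in (1, -5, 20, 20, -5, 1)]
HEVC_HALF = [c / 64 for c in (-1, 4, -11, 40, 40, -11, 4, -1)]


def magnitude_response(h, omega):
    """|H(e^{j omega})| of an FIR filter h, omega in radians per sample."""
    return abs(sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h)))


if __name__ == "__main__":
    print("omega/pi   H.264 6-tap   HEVC 8-tap")
    for step in range(11):
        omega = math.pi * step / 10
        print(f"{step / 10:8.1f}   {magnitude_response(H264_HALF, omega):11.4f}"
              f"   {magnitude_response(HEVC_HALF, omega):10.4f}")
```

Both filters have unit gain at zero frequency and a null at omega = pi, as expected for symmetric even-length low-pass filters; the differences of interest lie in how flat the passband stays in between.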
Plan of action:
Step 1: Based on the reference software HM13 [16], change the half-pel interpolation filter
coefficients to those of H.264/AVC [17]. The fractional-pel pixels may be affected as a
result.
Step 2: Based on the reference software HM13 [16], change the half-pel interpolation filter
coefficients to the coefficients of DCTIF [10].
Step 3: In addition to the change in Step 1, change the interpolation methods of the quarter-pel
pixels in the horizontal and vertical directions to those in H.264/AVC [17]. The
fractional-pel pixels may be affected again.
Step 4: In addition to the changes in Step 3, change the interpolation methods of the remaining
four quarter-pel pixels in the diagonal direction to those in H.264/AVC [17].
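Step 1 can be sketched as a change to a coefficient table. The layout below, one 8-tap row per quarter-pel phase with coefficients scaled by 64, is an assumption made for illustration; the actual source file and identifiers must be looked up in HM13 [16]. Under that assumption, the 6-tap H.264/AVC half-pel filter (scaled by 32) is doubled and zero-padded so that it fits the same 8-tap, scale-64 row. The names in the sketch are hypothetical.

```python
# Assumed layout: phase -> 8 luma taps, scaled by 64 (phase 0 = integer pel).
HEVC_LUMA = {
    0: (0, 0, 0, 64, 0, 0, 0, 0),
    1: (-1, 4, -10, 58, 17, -5, 1, 0),
    2: (-1, 4, -11, 40, 40, -11, 4, -1),
    3: (0, 1, -5, 17, 58, -10, 4, -1),
}

H264_HALF_6TAP = (1, -5, 20, 20, -5, 1)   # scaled by 32


def h264_half_as_8tap():
    """Rescale the 6-tap filter from /32 to /64 and pad to 8 taps."""
    return (0,) + tuple(2 * c for c in H264_HALF_6TAP) + (0,)


def step1_table():
    """Copy of the table with only the half-pel row replaced (Step 1)."""
    table = dict(HEVC_LUMA)
    table[2] = h264_half_as_8tap()
    return table


if __name__ == "__main__":
    for phase, taps in step1_table().items():
        print(phase, taps, "sum =", sum(taps))
```

Doubling the coefficients keeps the filter gain unchanged, but the rounding still follows the HEVC pipeline rather than that of H.264/AVC, so results will not be bit-identical to an H.264/AVC decoder.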
Figure 8: Representation of integer, half and quarter pel pixels [20]
Figure 8 shows the representation of the integer and fractional pel arrangement.
The modified filters obtained from Steps 1-4 will be compared on coding performance, frequency
response, performance gain and complexity, and the results will be assessed against those
reported in “A comparison of Fractional-Pel Interpolation Filters in HEVC and H.264/AVC” [10].
References:
1. Fraunhofer Heinrich Hertz Institute - http://hevc.hhi.fraunhofer.de/
2. Open Patents and Standards Platform - http://www.iplytics.com/en/tag/hevc/
3. HEVC Review Site http://telcogroup.ru/files/materialspdf/High_Efficiency_Video_Coding_H265.pdf
4. Overview of HEVC - http://iphome.hhi.de/wiegand/assets/pdfs/2012_12_IEEEHEVC-Overview.pdf
5. Extremetech Blog: http://www.extremetech.com/computing/162027-h-265benchmarked-does-the-next-generation-video-codec-live-up-to-expectations
6. Altera Technologies: http://www.altera.com/technology/systemdesign/articles/2013/tv-studio-system.html
7. I. Richardson, “Real time implementation of H.264 Video Coding”, IEEE International
SOC Conference, PP: 390, Sept. 2008
8. H.265 Blog http://www.h265.net/2010/07/adaptive-interpolation-filter-for-videocoding.html
9. CNET Blog http://news.cnet.com/8301-11386_3-57566116-76/hevc-video-standardfinished-high-end-improvements-coming/
10. H.Lv, et al, “A comparison of fractional-pel interpolation filters in HEVC and H.264/AVC”,
IEEE Conference on Visual Communications and Image Processing (VCIP), PP: 1-6,
Nov. 2012
11. G.J.Sullivan, et al, “Overview of the HEVC Standard”, IEEE Transactions on Circuits
and Systems for Video Technology (CSVT), Vol: 22, No: 12, PP: 1649-1668, Dec. 2012
12. B.Lee, et al, “Performance Comparison of various interpolation methods for color filter
arrays”, IEEE Symposium on Industrial Electronics, Vol: 1, PP: 232-236, Jun. 2001
13. V.Yu and J.Ostermann, “Locally Adaptive Non-Separable Interpolation Filter for
H.264/AVC”, IEEE International Conference on Image Processing, PP: 33-36, Oct. 2006
14. Video Test Sequences: http://trace.eas.asu.edu/yuv/
15. Tortoise SVN Downloadable Software Link: http://tortoisesvn.net/downloads.html
16. HM 13 Software Link:
https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-13.0+RExt-6.0rc1/
17. H.264 Advanced Video Coding Blog https://www.vcodex.com/h264.html
18. G.J.Sullivan, et al, “Standardized Extensions of HEVC”, IEEE Journal of Selected
Topics in Signal Processing, Vol: 7, No: 6, PP: 1001-1016, Dec. 2013
19. K.R.Rao, D.N.Kim and J.J.Hwang, “Video coding standards”, Springer Publications, Jan.
2014: http://www.springer.com/physics/book/978-94-007-6741-6
20. SPIE Digital Library:
http://electronicimaging.spiedigitallibrary.org/article.aspx?articleid=1730243
21. Karhunen-Loeve Transform:
http://en.wikipedia.org/wiki/Karhunen%E2%80%93Lo%C3%A8ve_theorem
22. Sharp 8Kx4K TV: http://www.sound-news.net/index.php/the-novosti/hifi-avnovosti/item/552-sharp-8kx4k-tv
23. Institute of Computer and Communication Engineering:
http://research.ncku.edu.tw/re/articles/e/20071102/2.html
24. FIR Filter: http://en.wikipedia.org/wiki/Finite_impulse_response
25. JCT-VC Document Management System: http://phenix.int-evry.fr/jct/
26. T.Wiegand, et al, “Overview of the H.264/AVC Video Coding Standard”, IEEE
Transactions on Circuits and Systems for Video Technology, Vol: 13, No: 7, PP: 560-576, July 2003
27. Iowegian International DSP Site: http://www.dspguru.com/dsp/faqs/fir/basics
28. N-Tap FIR Filter: http://www.analog.com/static/importedfiles/seminars_webcasts/MixedSignal_Sect6.pdf
29. I.Richardson, “The H.264 Advanced Video Compression Standard”, Wiley Publications,
Aug. 2010: http://www.wiley.com/WileyCDA/WileyTitle/productCd0470516925.html
30. HM 13 Software Reference Manual: http://mpeg.chiariglione.org/standards/mpegh/high-efficiency-video-coding/high-efficiency-video-coding-hevc-encoderdescription
31. JPEG: http://www.jpeg.org/
32. JM 18.6 Software Repository: http://iphome.hhi.de/suehring/tml/download/
33. BD-PSNR and BD-BR:
http://www.mathworks.com/matlabcentral/fileexchange/27798-bjontegaardmetric/content/bjontegaard.m