Current Video Coding Standards

Current Video Coding Standards:
H.264/AVC, Dirac, AVS China and VC-1
K. R. Rao, IEEE Fellow, and Do Nyeon Kim
Dept. of Electrical Engineering
University of Texas at Arlington
Arlington, Texas, USA
rao@uta.edu
Barun Technologies, Corp.
Seoul, South Korea
cooldnk@yahoo.com
Abstract—Video coding standards: H.264/AVC,
DIRAC, AVS China and VC-1 are presented. These are the latest standards, adopted by ITU-T/ISO-IEC, the BBC, the China standards organization and SMPTE, respectively. Besides presenting these standards, research potential as well as projects (at both undergraduate and graduate levels) are emphasized. These are available in the database for research and projects in [18]. Web/ftp sites for accessing standards documents, software, test sequences, conformance bit streams, industry activities, etc. are provided.
HEVC
Keywords- H.264/AVC; Dirac; AVS China; VC-1
I. INTRODUCTION
Residual image data are obtained by taking the pixel-by-pixel differences between the original image and the image reconstructed after lossy compression. For lossless compression, the residuals are compressed separately using an appropriate lossless compression approach [72]. Work has been done on optimizing the codec by reducing the complexity and encoding time, improving the quality, or improving the robustness of the standard using algorithms for error concealment and error correction (Fig. 1).
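The residual-based lossless scheme described above can be illustrated with a minimal sketch. It is not the method of [72]: the uniform quantizer standing in for the lossy codec, the use of zlib for the lossless stage, and all function names are assumptions made only for illustration.

```python
import zlib

def lossy_quantize(pixels, step=8):
    """Stand-in lossy codec (assumption): uniform quantization of 8-bit samples."""
    return [min(255, (p // step) * step + step // 2) for p in pixels]

def split_lossy_and_residual(pixels, step=8):
    """Return the lossy reconstruction and the losslessly packed residual."""
    assert 1 <= step <= 128, "keeps residuals within one signed byte"
    recon = lossy_quantize(pixels, step)
    # Residual = original minus lossy reconstruction, pixel by pixel.
    residual = [p - r for p, r in zip(pixels, recon)]
    # Offset by 128 so every residual fits in an unsigned byte, then deflate.
    packed = zlib.compress(bytes(d + 128 for d in residual))
    return recon, packed

def restore(recon, packed):
    """Add the decoded residual back to recover the original samples exactly."""
    residual = [b - 128 for b in zlib.decompress(packed)]
    return [r + d for r, d in zip(recon, residual)]
```

Because the residual is coded without loss, `restore` returns the original samples regardless of how coarse the lossy stage is; a coarser lossy stage only shifts more of the bit budget into the residual.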
MPEG-4 AVC/H.264 was developed for multimedia applications [1, 3, 5-13, 19]. It adopts advanced coding techniques such as multiple-reference-frame prediction and context-based adaptive binary arithmetic coding (CABAC), and it provides high compression efficiency. It can compress video to 1.5-2 Mbps for standard definition (SD) and 6-8 Mbps for high definition (HD), thereby saving storage space, channel bandwidth and frequency spectrum.
Figure 1. Optimizing the codec in terms of complexity and robustness.
Figure 2. Profiles in H.264/AVC [1].
II. H.264/AVC
A. H.264 intra-frame encoding
H.264 (Figs. 2, 3 and 4) uses adaptive prediction of intra-coded macroblocks to reduce the large number of bits that coding the original input signal directly would require. To encode a block or macroblock in intra-coded mode, a prediction block is formed from previously reconstructed blocks. For the luma samples, the prediction block may be formed for each 4 × 4 subblock, each 8 × 8 block, or for a 16 × 16 macroblock. One mode is selected from a total of 9 prediction modes for each 4 × 4 (similar to Fig. 7) and 8 × 8 luma block, from 4 modes for a 16 × 16 luma block, and from 4 modes for each chroma block. The residual, i.e., the difference between the current block and its best prediction, is processed by the transform and quantization unit and then reconstructed by the inverse operations to serve as the reference for the next macroblock. The quantized coefficients are entropy coded to form the final bit stream.
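The intra prediction step described above can be sketched for a 4 × 4 luma block as follows. Only three of the nine modes (vertical, horizontal and DC) are shown, and the mode is picked by the sum of absolute differences (SAD) rather than by full rate-distortion optimization; both simplifications, and all function names, are assumptions made for illustration.

```python
def predict_4x4(top, left, mode):
    """Build a 4x4 prediction block from reconstructed neighbours.

    top  : the 4 samples directly above the block
    left : the 4 samples directly to the left of the block
    """
    if mode == "vertical":      # copy the row above downwards
        return [list(top) for _ in range(4)]
    if mode == "horizontal":    # copy the left column across
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # rounded mean of the 8 neighbours
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unknown mode: {mode}")

def sad(block, pred):
    """Sum of absolute differences between two 4x4 blocks."""
    return sum(abs(block[r][c] - pred[r][c]) for r in range(4) for c in range(4))

def best_intra_mode(block, top, left, modes=("vertical", "horizontal", "dc")):
    """Return the lowest-SAD mode and the residual passed on to the transform."""
    costs = {m: sad(block, predict_4x4(top, left, m)) for m in modes}
    mode = min(costs, key=costs.get)
    pred = predict_4x4(top, left, mode)
    residual = [[block[r][c] - pred[r][c] for c in range(4)] for r in range(4)]
    return mode, residual
```

A block whose columns continue the row above it is predicted perfectly by the vertical mode, so its residual is all zeros and costs very few bits after transform and quantization.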
Figure 3. Coding structure for H.264/AVC encoder for a macroblock [7].
B. Profiles
Jizhun Profile (base profile or main profile) is defined in
AVS Part 2 and is targeted mainly at digital video applications
like commercial broadcasting and storage media. It has
moderate computational complexity. Jiben Profile (basic
profile or baseline profile) is defined in AVS Part 7 for mobile
applications. Shenzan and Jiaqiang profiles are defined in AVS
Part 2 for video surveillance and multimedia entertainment
respectively.
The best prediction mode(s) are chosen by rate-distortion (R-D) optimization, which minimizes the Lagrangian cost
$$J(s, c, \mathrm{MODE} \mid QP) = D(s, c, \mathrm{MODE} \mid QP) + \lambda_{\mathrm{MODE}}\, R(s, c, \mathrm{MODE} \mid QP) \qquad (1)$$
The distortion D(s, c, MODE | QP) is measured as the sum of squared differences (SSD) between the original block s and the reconstructed block c, QP is the quantization parameter, MODE is the prediction mode, $\lambda_{\mathrm{MODE}}$ is the Lagrange multiplier, and R(s, c, MODE | QP) is the number of bits needed to code the block. The mode(s) with the minimum J(s, c, MODE | QP) are chosen as the prediction mode(s) of the macroblock.
Adaptive ME/MC prediction with seven block sizes for inter-frame prediction is shown in Fig. 5.
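The mode decision of Eq. (1) can be sketched as below. This is a minimal illustration, not the reference encoder: blocks are flattened to lists of samples, the reconstruction and bit count of each candidate mode are assumed to be supplied by the caller, and the Lagrange multiplier `lam` is a free parameter here rather than being derived from QP.

```python
def ssd(orig, recon):
    """Distortion D: sum of squared differences between two sample lists."""
    return sum((o - r) ** 2 for o, r in zip(orig, recon))

def rd_cost(orig, recon, bits, lam):
    """Lagrangian cost J = D + lambda * R for one candidate mode."""
    return ssd(orig, recon) + lam * bits

def choose_mode(orig, candidates, lam):
    """Pick the mode with minimum J.

    candidates maps a mode name to a (reconstructed_block, bits) pair;
    both values are hypothetical inputs produced by trial encoding.
    """
    return min(
        candidates,
        key=lambda m: rd_cost(orig, candidates[m][0], candidates[m][1], lam),
    )
```

A larger multiplier favours cheap modes with some distortion, while a smaller one favours accurate modes that cost more bits, which is the trade-off Eq. (1) expresses.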
III. AVS CHINA [47-53]
A. Standards
AVS Part 1 (System) comprises a set of standards that convert single- or multi-channel audio and video bit streams into a single multiplexed stream for transmission and storage. It also defines the encoding syntax necessary for synchronous de-multiplexing of the audio and video bit streams.
Figure 4. H.264/MPEG-4 AVC decoder block diagram [1].
The AVS system comprises two data streams, the program stream and the transport stream, each with its own applications. AVS Part 1 takes AVS Part 2 or AVS Part 7 video and AVS Part 3 audio as its elementary bit streams [46].
While H.264 specifies only video, it is meaningful to encode and multiplex audio with the video bitstream. Hence this is a viable research area in which the best audio codec can be multiplexed with the latest video codecs such as AVS China, H.264/AVC, VC-1 and Dirac (Fig. 6). The ten parts of AVS China are listed in Table I.
Figure 5. MB and sub-MB partitions for adaptive ME/MC prediction (seven block sizes). The coded blocks with motion vectors are ordered in raster-scan order.
The nine adaptive directional intra prediction modes, including the DC mode, for luminance in AVS China Part 7 are shown in Fig. 7 [53].
C. Inter-frame prediction (Part 7)
Similar to Fig. 5, the seven block sizes in inter-frame adaptive ME/MC prediction are 16×16, 16×8, 8×16, 8×8, 8×4, 4×8 and 4×4, chosen depending on the amount of information present within the macroblock. Motion is predicted up to ¼-pixel accuracy. If half_pixel_mv_flag is 1, it is predicted up to ½-pixel accuracy.
The eight-tap filter F1 = (−1, 4, −12, 41, 41, −12, 4, −1) and the four-tap filter F2 = (−1, 5, 5, −1) are used for horizontal and vertical interpolation, respectively, in the ½-pixel MV search, and averaging (linear interpolation) is used for ¼-pixel accuracy.
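The sub-pixel interpolation described above can be sketched as follows. The filter taps are those given in the text; the normalization by the tap sum (64 for F1, 8 for F2), the round-to-nearest offset, the clipping to 8 bits and the function names are assumptions, since the text does not specify them.

```python
F1 = (-1, 4, -12, 41, 41, -12, 4, -1)  # eight-tap filter, taps sum to 64
F2 = (-1, 5, 5, -1)                    # four-tap filter, taps sum to 8

def clip8(x):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, x))

def half_pel(samples, taps):
    """Interpolate the half-sample position at the centre of `samples`.

    `samples` holds as many integer-position neighbours as there are taps,
    taken along one row (horizontal) or one column (vertical).
    """
    assert len(samples) == len(taps)
    norm = sum(taps)
    acc = sum(s * t for s, t in zip(samples, taps))
    return clip8((acc + norm // 2) // norm)  # assumed rounding and scaling

def quarter_pel(a, b):
    """Quarter-sample value as the rounded average of its two neighbours."""
    return (a + b + 1) >> 1
```

On a flat region every tap sees the same value, so the interpolated sample equals that value; across an edge the negative outer taps sharpen the transition compared with plain averaging.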
Figure 8. Comparison of H.264 and VC-1.
IV. SMPTE VC-1 (WINDOWS MEDIA VIDEO 9)
VC-1 [24-27] is the informal name of the SMPTE 421M video codec standard, which was initially developed by Microsoft as Windows Media Video 9 (WMV-9). WMV-9 supports progressive video and is mainly used for online video services. VC-1 extends WMV-9 and adds features necessary for broadcast services, such as interlace support. It is a supported standard for Blu-ray Discs and Windows Media Video. The high-definition disc format Blu-ray has mandated MPEG-2, H.264 and VC-1 as its video compression formats. VC-1 is compared with H.264 in Fig. 8.
Figure 6. Multiplexing of audio/video and lip sync.
V. DIRAC
Dirac [28-45] is a family of video codecs spanning applications from mobile to UHDTV and film and video post-production. For low-bit-rate applications such as the Internet, Dirac can be thought of as functionally similar to H.264 (Fig. 8), offering similar compression performance. For high-quality compression in production, Dirac is functionally similar to JPEG2000 [54-69]. Dirac is a royalty-free open technology that is simple and low cost. Dirac is a hybrid motion-compensated video codec, whereas Dirac Pro (standardized as SMPTE VC-2) uses only intra-frame coding for professional or production applications.
Figure 7. Nine adaptive directional intra prediction modes
including the DC mode for luminance in AVS-China Part 7
[53].
In the Dirac codec, image motion is tracked and the motion information is used to predict a later frame. A transform is applied to the prediction error between the current frame and the motion-compensated previous frame, and the transform coefficients are quantized and entropy coded (Figs. 9 and 10). Temporal redundancy is removed by motion estimation and motion compensation, and spatial redundancy by the discrete wavelet transform. Dirac uses a flexible and efficient form of entropy coding called arithmetic coding, which packs the bits efficiently into the bit stream [28, 44]. The two-dimensional discrete wavelet transform gives Dirac the flexibility to operate at a range of resolutions, because wavelets operate on the entire picture at once rather than on small areas at a time. In Dirac, the discrete wavelet transform plays the same role as the DCT in MPEG-2 in de-correlating data in a roughly frequency-sensitive way, while having the advantage of preserving fine details better than block-based transforms [42]. An experiment
TABLE I. TEN PARTS OF AVS CHINA STANDARD FAMILY [46]
AVS | Contents
Part 1 | System for broadcasting
Part 2 | SD/HD video
Part 3 | Audio
Part 4 | Conformance test
Part 5 | Reference software
Part 6 | Digital rights management
Part 7 | Mobility video
Part 8 | System over IP
Part 9 | File format
Part 10 | Mobile speech and audio coding
DIRAC
and “Susie” standard-definition (SD) (720×480) are used for evaluation (Figs. 11, 12 and 13). The two methods are very close and comparable in compression, PSNR and SSIM. Also, a significant improvement in encoding time is achieved by Dirac compared to H.264 for all the test sequences [42].
showed the difference in the encoding time taken by Dirac and H.264/MPEG-4 for QCIF, CIF and SD sequences. The simplicity of the Dirac encoder is evident, as its encoding speed was much higher than that of H.264/AVC [42].
Figure 9. Dirac encoder architecture [44, 45].
Figure 11. Compression ratio comparison of Dirac and H.264 for “Miss-America” QCIF sequence [42]. (Plot: compression ratio vs. bitrate in kbit/s at CBR.)
VI. SIMULATION RESULTS
Figure 10. Dirac decoder architecture [42].
The comparison between the performance of H.264 and AVS China was produced by encoding several test sequences at different bit rates and is shown in Figs. 14 through 17. Test sequences at HD (1280×720) and standard-definition (SD) (720×480) resolutions are used for evaluation. The two methods are very close and comparable in peak-to-peak signal-to-noise ratio (PSNR).
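The MSE, PSNR and compression-ratio measures used in these comparisons can be computed as in the following minimal sketch. It assumes 8-bit samples (peak value 255) flattened into lists; the function names are illustrative and are not taken from the software used in [42].

```python
import math

def mse(ref, test):
    """Mean squared error between reference and decoded samples."""
    assert len(ref) == len(test) and len(ref) > 0
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)

def psnr(ref, test, peak=255):
    """PSNR in dB; infinite when the two signals are identical."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)

def compression_ratio(raw_bits, coded_bits):
    """Ratio of uncompressed size to compressed size."""
    return raw_bits / coded_bits
```

Since PSNR is a monotonic function of MSE for a fixed peak value, the PSNR and MSE curves in Figs. 14 through 17 carry the same ranking information in two different scales.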
Figure 12. SSIM comparison of Dirac and H.264 for “Miss-America” QCIF sequence [42]. (Plot: SSIM vs. bitrate in kbit/s at CBR.)
VII. CONCLUSIONS
Objective test methods attempt to quantify the error between a reference and an encoded bitstream. To ensure a fair comparison, each codec must encode at the same bit rate. Since the latest version of Dirac includes a constant bit rate (CBR) mode, the comparison between Dirac and H.264 was produced by encoding several test sequences at different bit rates. By using the CBR mode of H.264, we can ensure that H.264 is encoded at the same bit rate as Dirac.
The video coding standards H.264/AVC, Dirac, AVS China and VC-1 are presented, together with a performance comparison using different test sequences. Their functionalities are summarized in Tables II and III. In general, H.264 performs better than Dirac, AVS China and VC-1, but at the cost of additional complexity.
Objective tests are divided into three sections, namely (i)
compression, (ii) structural similarity index (SSIM), and (iii)
peak-to-peak signal-to-noise ratio (PSNR). The test sequences
“Miss-America” QCIF (176×144), “Stefan” CIF (352×288)
Figure 13. PSNR (peak-to-peak signal-to-noise ratio) comparison of Dirac and H.264 for “Miss-America” QCIF sequence [42].
Figure 16. Bitrate vs. PSNR for Bus – SDTV sequence (720 × 480i).
REFERENCES
H.264/AVC
[1] A. Puri, X. Chen and A. Luthra, “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.
[2] H.264 AVC JM software: http://iphome.hhi.de/suehring/tml/
[3] D. Kumar, P. Shastry and A. Basu, “Overview of the H.264 / AVC”, 8th Texas Instruments Developer Conference India, 30 Nov – 1 Dec 2005, Bangalore.
[4] H.264 encoder and decoder:
http://www.adalta.it/Pages/407/266881_266881.jpg
[5] “H.264 video compression standard”, White paper, Axis
communications.
[6] R. Schäfer, T. Wiegand and H. Schwarz, “The emerging H.264/AVC
standard”, EBU Technical Review, Jan. 2003.
[7] T.Wiegand, et al “Overview of the H.264/AVC video coding standard”,
IEEE Trans. CSVT, vol.13, pp 560–576, July 2003.
[8] M.Fieldler, “Implementation of basic H.264/AVC decoder”, seminar
paper at Chemnitz University of Technology, June 2004
[9] MPEG-4: ISO/IEC JTC1/SC29 14496-10: Information technology –
Coding of audio-visual objects - Part 10: Advanced Video Coding,
ISO/IEC, 2005.
[10] Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec.
H.264/ISO/IEC 14496-10, Mar.2005.
[11] S.K.Kwon, A.Tamhankar and K.R.Rao, “Overview of H.264 / MPEG-4
Part 10” J. Visual Communication and Image Representation, vol. 17,
pp.186–216, April 2006.
[12] D. Marpe, T. Wiegand and G. J. Sullivan, “The H.264/MPEG-4 AVC
standard and its applications”, IEEE Communications Magazine, vol. 44,
pp. 134–143, Aug. 2006.
Figure 14. Bitrate vs. PSNR for Harbour – HDTV sequence (1280 × 720p). AVS Jizhun Profile is a main profile.
Figure 15. Bitrate vs. MSE for Harbour – HDTV sequence (1280 × 720p).
Figure 17. Bitrate vs. MSE for Bus – SDTV sequence (720 × 480i).
[13] T. Wiegand and G. J. Sullivan, “The H.264 video coding standard”,
IEEE Signal Processing Magazine, vol. 24, pp. 148–153, March 2007.
[14] Z. Wang, et al “Image quality assessment: From error visibility to
structural similarity,” IEEE Trans. on Image Processing, vol. 13, pp.
600-612, Apr. 2004. http://www.ece.uwaterloo.ca/~z70wang/
[15] H. Jia and L. Zhang, “Directional diamond search pattern for fast block motion estimation”, IEE Electronics Letters, vol. 39, No. 22, pp. 1581-1583, 30th Oct. 2003.
[16] Video test sequences (YUV 4:2:0):
http://trace.eas.asu.edu/yuv/index.html
[17] Video test sequences ITU601:
http://www.cipr.rpi.edu/resource/sequences/itu601.html
[18] K.R. Rao, Multimedia Processing, Course Website, UT Arlington:
http://ee.uta.edu/Dip/Courses/EE5359/index.html
[19] I. Richardson, H.264 Advanced Video Compression Standard, II Edition,
Hoboken, NJ: Wiley, 2010.
[20] Y.Q. Shi and H. Sun, “Image and video compression for multimedia
engineering”, Boca Raton: CRC Press, II Edition, (Chapter on H. 264),
2008.
[21] B. Furht and S.A. Ahson, “Handbook of mobile broadcasting, DVB-H,
DMB, ISDB-T and MEDIAFLO,” Boca Raton, FL: CRC Press, 2008
(H.264 related chapters).
[23e] http://www.h265.net/ has info on developments in HEVC NGVC –
Next generation video coding.
[23f] JVT KTA Reference Software
http://iphome.hhi.de/suehring/tml/download/KTA
[23g] IEEE Trans. on CSVT, vol. 20, Special section on high efficiency video
coding (several papers), Dec. 2010.
[23h] Z. Ma and A. Segall, “Low resolution decoding for high-efficiency video coding”, IASTED SIP-2011, Dallas, TX, Dec. 2011.
[23i] T. Wiegand, B. Bross, W.-J. Han, J.-R. Ohm, and G. J. Sullivan, WD3: Working Draft 3 of High-Efficiency Video Coding, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T VCEG and ISO/IEC MPEG, Doc. JCTVC-E603, Geneva, CH, March 2011.
[23j] Y. Ye and M. Karczewicz, “Improved H.264 intra coding based on bi-directional intra prediction, directional transform, and adaptive coefficient scanning,” IEEE Int’l Conf. Image Process.’08 (ICIP08), San Diego, U.S.A., Oct. 2008.
[23k] IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 7, Nov.
2011 (several papers on HEVC) Introduction to the Issue on Emerging
Technologies for Video Compression.
[23l] R. Joshi, Y.A. Reznik and M. Karczewicz, “Efficient large size
transforms for high performance video coding”, Proc. SPIE, vol. 7798,
pp.
, San Diego, CA, Aug. 2010.
[23m] “Special issue on emerging research and standards in next generation video coding”, IEEE Trans. CSVT, Tentative publication date (Dec. 2012).
[24] VC-1 Compressed Video Bitstream Format and Decoding Process, SMPTE 421M-2006, SMPTE Standard, 2006.
[25] S. Srinivasan and S. L. Regunathan, “An overview of VC-1,” Proc.
SPIE, vol. 5950, pp. 720–728, 2005.
[26] Microsoft Windows Media:
http://www.microsoft.com/windows/windowsmedia
[27] H. Kalva and J.-B. Lee, The VC-1 and H.264 video compression
standards for broadband video services, Springer, 2008.
MPEG AND H.26X SERIES
[22] <http://en.wikipedia.org/wiki/MPEG>
[22a] V. Vijaykumar and K.R. Rao, “Low complexity H.264 to VC-1
transcoder” J. of Real Time Image processing (under review).
[22b] V.S. Kolkeri, J. H. Lee and K. R. Rao, ”Error concealment techniques in
H.264/AVC for wireless video transmission in mobile networks”,
International Journal in Image Processing, (Under review)
[22c] K.R. Rao, A. Urs and S. Patil, “Comparison of 8 × 8 integer DCTs used
in H.264, AVS-CHINA and VC-1 video codecs”,CMIC 2011, 4-7 Jan.
2011, Chiang Mai, Thailand.
[22d] D. Han et al, “ Low complexity H.264 encoder using machine learning”,
IEEE SPA 2010, PP. 40-43, Poznan, Poland, Sept. 2010.
[22e] K.V.S Swaroop and K.R. Rao, “Performance analysis and comparison of JM 15.1 and Intel IPP H.264 encoder and decoder”, 42nd South Eastern Symp. on System Theory, pp. 371-375, Tyler, TX, March 2010.
[22f] S.-W. Lee and C.-C.J. Kuo, “H.264/AVC entropy decoder complexity analysis and its applications”, J. VCIR, vol. 22, pp. 61-72, Jan. 2011.
[22g] T. Wiegand and G.J. Sullivan, “The picturephone is here. Really,” IEEE Spectrum, vol. 48, pp. 50-54, Sept. 2011.
DIRAC
[28] K. Onthriar, K. K. Loo and Z. Xue, “Performance comparison of emerging Dirac video codec with H.264/AVC,” IEEE Int’l Conf. on Digital Telecommunications, ICDT 2006, vol. 6, Page: 22, Issue: 29-31, Aug. 2006.
[29] T. Davies, “The Dirac Algorithm”:
http://dirac.sourceforge.net/documentation/algorithm/, 2008.
[30] M. Tun and W. A. C. Fernando, “An error-resilient algorithm based on
partitioning of the wavelet transform coefficients for a Dirac video
codec,” IEEE Tenth International Conf. on Information Visualization,
IV’06, pp.615–620, July 2006.
[31] Daubechies wavelet: http://en.wikipedia.org/wiki/Daubechies_wavelet
[32] Daubechies wavelet filter design: http://cnx.org/content/m11159/latest/
[33] Vorbis: http://www.vorbis.com/
[34] T. Borer, “Dirac coding: Tutorial & Implementation”, EBU Networked
Media Exchange seminar, June 2009.
[35] Dirac software and source code: http://diracvideo.org/download/diracresearch/
[36] Dirac video codec – A programmer's guide:
http://dirac.sourceforge.net/documentation/code/programmers_guide/toc
.htm
[37] Dirac Pro: http://www.bbc.co.uk/rd/projects/dirac/diracpro.shtml
[38] T. Davies, “A modified rate-distortion optimization strategy for hybrid
wavelet video coding,” ICASSP 2006,vol.2, pp.909–912, May 2006.
[39] M. Tun, K.K. Loo and J. Cosmas, “Semi-hierarchical motion estimation for the Dirac video codec,” 2008 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, pp. 1–6, March 31-April 2, 2008.
HEVC
[23] G.J. Sullivan and J.-R. Ohm, “Recent developments in standardization of
high efficiency video coding (HEVC),” SPIE Optics + Photonics,
Applications of Digital Image Processing XXXIII, vol. 7798, paper
7798-3, San Diego, CA, Aug. 2010.
[23a] IEEE Trans. on CSVT, vol. 20, Special section on high efficiency video coding (several papers), Dec. 2010.
[23b] M. Karczewicz et al, “A hybrid video coder based on extended macroblock sizes, improved interpolation and flexible motion representation”, IEEE Trans. CSVT, vol. 20, pp. 1698-1708, Dec. 2010.
[23c] S. Jeong et al, “High efficiency video coding for entertainment quality”, ETRI Journal, vol. 33, pp. 145-154, April 2011.
VC-1
[23d] Asai et al., “New video coding scheme optimized for high-resolution video sources”, IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 7, pp. 1290-1297, Nov. 2011.
[40] H. Eeckhaut et al., “Speeding up Dirac’s entropy coder”, Proc. 5th WSEAS Intl. Conf. on Multimedia, Internet and Video Technologies, pp. 120–125, Greece, Aug. 2005.
[41] The Dirac web page and developer support: http://dirac.sourceforge.net
[42] A. Ravi and K.R. Rao, “Performance analysis and comparison of the Dirac video codec with H.264 / MPEG-4 Part 10 AVC,” IJWMIP, vol. 4, No. 4, pp. 635-654, 2011.
[43] BBC Research on Dirac:
http://www.bbc.co.uk/rd/projects/dirac/index.shtml
[44] T. Borer and T. Davies, “Dirac video compression using open
technology,” BBC EBU Technical Review, July 2005.
[45] C. Gargour et al., “A short introduction to wavelets and their
applications,” IEEE Circuits and Systems Magazine, vol. 9, pp. 57–68, II
Quarter, 2009.
[45a] A. Ravi and K.R. Rao, “Performance analysis and comparison of the
Dirac video codec with H.264/ MPEG- 4, Part 10,” for the book
Advances in reasoning-based image processing, analysis and intelligent
systems: Conventional and intelligent paradigms,” 2011.
[45b] A. Urs and K.R. Rao “Multiplexing/de-multiplexing Dirac video with
AAC audio bit stream”, TELSIKS 2011, Nis, Serbia, 5-8 Oct. 2011.
[51] X. Wang et al., “Performance comparison of AVS and H.264/AVC
video coding standards” J. Comput. Sci. & Technol., vol.21, No.3,
pp.310-314, May 2006.
[52] B. Tang et al., “AVS encoder performance and complexity analysis
based on mobile video communication”,WRI International conference
on Communications and Mobile Computing, CMC‘09, vol. 3, pp. 102–
107, 6-8 Jan. 2009.
[53] L. Yu et al., “Overview of AVS video coding standards,” Signal
Processing: Image Communication, vol. 24, pp. 263–276, April 2009.
[53a] D. Sahana and K.R. Rao, “A study on AVS-M standard”, Advanced
Computational Technologies published by the Romanian Academy
Publishing House, 2011.
[53b] S. Swaminathan and K.R. Rao, “Multiplexing and demultiplexing of AVS CHINA video with AAC audio,” TELSIKS 2011, Nis, Serbia, 5-8 Oct. 2011.
JPEG2000
[54] D. T. Lee, “JPEG 2000: Retrospective and new developments,” Proc.
IEEE,vol. 93, pp.32–41, Jan. 2005.
[55] P. Schelkens, A. Skodras and T. Ebrahimi, “The JPEG 2000 suite”,
Hoboken, NJ: Wiley, 2009.
[56] M. S. Zhong and Z. M. Ma, “JPEG 2000 based scalable reconstruction
of image local regions”, IEEE ISIMP 2001, Hong Kong, May 2001.
[57] C. Christopoulous, A. Skodras and T. Ebrahimi, “The JPEG 2000 still
image coding system: An overview,” IEEE Trans. on Consumer
Electronics, vol. 46, pp. 1103–1127, Nov. 2000.
[58] M. D. Adams, “The JPEG-2000 still image compression standard,”
JPEG Tutorial download from http://www.ece.uvic.ca/~mdadams/jasper
(also software)
[59] A. Skodras, C. Christopoulous, and T. Ebrahimi, “JPEG-2000: The
upcoming still image compression standard,” Pattern Recognition
Letters, vol. 25, pp. 1337–1345, 2001.
[60] T. Fukuhara et al, “Motion-JPEG2000 standardization and target
market,” IEEE ICIP, vol. 2, pp. 57–60, 2000.
AVS CHINA
[46] GB/T 20090.1 Information technology - Advanced coding of audio and
video – Part 1: System, Chinese AVS standard.
[47] L. Yu et al., “An Overview of AVS-Video: tools, performance and
complexity”, Visual Communications and Image Processing 2005, Proc.
of SPIE, vol. 5960, pp.596021, July 31, 2006.
[48] L. Yu et al., “An area-efficient VLSI architecture for AVS intra frame
encoder” Visual Communications and Image Processing 2007, Proc. of
SPIE-IS & T Electronic Imaging, SPIE vol. 6508, pp. 650822, Jan. 29,
2007.
[49] W. Gao et al., “AVS – The Chinese next-generation video coding
standard” NAB, Las Vegas, 2004.
[50] J. Wang et al., “An AVS-to-MPEG2 transcoding system” Proceedings of
2004 International Symposium on Intelligent Multimedia, Video and
Speech Processing, Hong Kong, pp. 302-305, Oct. 20-22, 2004.
TABLE II. COMPARISON OF VARIOUS VIDEO COMPRESSION STANDARDS
Algorithmic element | MPEG-4 AVC (H.264) | SMPTE VC-1 (Windows Media Video 9) | Dirac | DiracPRO (SMPTE VC-2) | AVS China Part 2 | AVS China Part 7 (AVS-Mobile)
Intra prediction | 4×4 spatial; 16×16 spatial; I-PCM | Frequency domain coefficient | 4×4 spatial | 4×4 spatial (forward, backward) | 8×8 block based intra prediction | Intra_4×4 (4×4 spatial); direct intra prediction
Picture coding type | Frame; field; picture AFF; MB AFF | Frame; field; picture AFF; MB AFF | Frame | Intra-frame; field (interlace, progressive) | Frame | Frame
Motion compensation block size | 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4 (seven variable sizes) | 16×16, 8×8 | 4×4 | N/A | 16×16, 16×8, 8×16, 8×8 | 16×16, 16×8, 8×16, 8×8, 8×4, 4×8
Motion vector precision | Full pel; half pel; quarter pel | Full pel; half pel; quarter pel | 1/8 pel | N/A | 1/4 pel | 1/4 pel
P frame type | Single reference; multiple reference | Single reference; intensity compensation | Single reference; multiple reference | No P frames | Single and multiple reference (maximum of 2 reference frames) | Single and multiple reference (maximum of 2 reference frames)
B frame type | One reference each way; multiple reference; direct & spatial direct; weighted prediction | One reference each way | One reference each way; multiple reference | No B frames | One reference each way; multiple reference; direct and symmetrical mode | No B frames
In-loop filters | De-blocking | De-blocking; overlap transform | None | None | De-blocking filter | De-blocking filter
Entropy coding | CAVLC, CABAC | Adaptive VLC | Arithmetic coding | Context based adaptive binary arithmetic coding; exponential Golomb coding | 2D variable length coding | Context based adaptive 2D variable length coding
Transform | Main: 4×4 integer DCT; High: 4×4 & 8×8 integer DCTs | 4×4, 8×8, 8×4 & 4×8 integer DCTs | 4×4 wavelet transform | 4×4 wavelet transform | 8×8 integer DCT | 4×4 integer DCT
Other | Quantization scaling matrices | Range reduction; in-stream post-processing control | Quantization scaling matrices | Quantization scaling matrices | Quantization scaling matrices | Quantization scaling matrices
[61] M. Rabbani and R. Joshi, “An overview of the JPEG 2000 still image compression standard,” Signal Processing: Image Communication, vol. 17, pp. 3–48, Jan. 2002.
[62] J. Hunter and M. Wylie, “JPEG2000 Image Compression: A real time processing challenge,” Advanced Imaging, vol. 18, pp. 14–17, April 2003.
[63] D.S. Taubman and M.W. Marcellin, “JPEG 2000: Image compression fundamentals, standards and practice,” Kluwer, 2001.
[64] D. Marpe, V. George, and T. Wiegand, “Performance comparison of intra-only H.264/AVC HP and JPEG 2000 for a set of monochrome ISO/IEC test images,” JVT-M014, pp. 18–22, Oct. 2004.
[65] D. Marpe et al, “Performance evaluation of motion JPEG2000 in comparison with H.264 / operated in intra-coding mode,” Proc. SPIE, vol. 5266, pp. 129–137, Feb. 2004.
[66] P. Topiwala, “Comparative study of JPEG2000 and H.264/AVC FRExt I-frame coding on high definition video sequences,” Proc. SPIE Int’l Symposium, Digital Image Processing, San Diego, Aug. 2005.
[67] P. Topiwala, T. Tran and W. Dai, “Performance comparison of JPEG2000 and H.264/AVC high profile intra-frame coding on HD video sequences,” Proc. SPIE Int’l Symposium, Digital Image Processing, Applications of Digital Image Processing XXIX, vol. 6312, San Diego, Aug. 2006.
JPEG XR (HD photo of Microsoft)
[68] T. Tran, L. Liu and P. Topiwala, “Performance comparison of leading image codecs: H.264/AVC intra, JPEG 2000, and Microsoft HD photo,” Proc. SPIE Int’l Symposium, Applications of Digital Image Processing XXX, vol. 6696, San Diego, Sept. 2007.
[69] JPEG-2000 open source software: http://www.ece.uvic.ca/~mdadams/jasper/
[69a] Z. Liu, L.J. Karam and A.B. Watson, “JPEG2000 Encoding with Perceptual Distortion Control,” IEEE Transactions on Image Processing, vol. 15, no. 7, pp. 1763-1778, July 2006.
[70] JPEG <http://en.wikipedia.org/wiki/JPEG>
[71] MJPEG <http://en.wikipedia.org/wiki/MJPEG>
TABLE III. STANDARD

Standard: H.264/MPEG-4 Part 10
Standardization body: JVT (ISO/IEC & ITU-T)
Main target bitrate: 8 kb/s up to about 150 Mb/s
Main compression technologies:
– Integer DCT
– Adaptive quantization
– Zigzag reordering
– Alternate scan ordering
– Predictive motion compensation
– Bi-directional motion compensation
– Variable block size motion compensation with small block sizes
– Quarter pixel motion compensation
– Motion vector over picture boundaries
– Multiple reference picture motion compensation
– Adaptive intra directional prediction
– In-loop deblocking filter
– Arithmetic coding
– Variable length coding
– Error resilient coding
Main target applications:
– Broadcast over cable, terrestrial and satellite
– Interactive or serial storage on optical and magnetic devices, DVD, etc.
– Conversational services
– Video on demand
– MMS over ISDN, DSL, Ethernet, LAN, wireless and mobile networks
– HDTV
– Digital camera

Standard: HEVC/NGVC
Standardization body: JVT (ISO/IEC & ITU-T)
Main compression technologies (besides those listed under H.264/MPEG-4 Part 10):
(1) RD Picture Decision
(2) RDO_Q
(3) New Offset
(4) Adaptive Interpolation Filter
(5) Block Adaptive Loop Filter (BALF)
(6) Bigger blocks and bigger transforms (32×32 and 64×64)
(7) Multiple Angular Direction Intra Adaptive Prediction
(8) Inter prediction (multiple reference pictures, bi-prediction, weighted prediction)
(9) New MV competition; transform unit block size 4×4 to 64×64 (mode dependent directional transform MDDT and rotational transforms)
Main target applications:
– Same as in H.264/MPEG-4 Part 10 but at lower bit rate and higher compression efficiency

Standard: AVS Part 2
Standardization body: AVS workgroup
Main target bitrate: 1 Mb/s up to about 20 Mb/s
Main compression technologies:
– Interlace handling: picture-level adaptive frame/field coding (PAFF)
– Macroblock-level adaptive frame/field coding (MBAFF)
– Intra prediction: 5 modes for luma and 4 modes for chroma
– Motion compensation: 16×16, 16×8, 8×16, 8×8 block size
– Resolution of MV: 1/4-pel, 4-tap interpolation filter
– Transform: 16-bit-implemented 8×8 integer cosine transform
– Quantization and scaling: scaling only in encoder
– Entropy coding: 2D-VLC and arithmetic coding
– In-loop deblocking filter
– Motion vector prediction
– Adaptive scan
Main target applications:
– HD broadcasting
– High density storage media
– Video surveillance
– Video on demand

Standard: AVS Part 7
Standardization body: AVS workgroup
Main target bitrate: 1 Mb/s up to about 20 Mb/s
Main compression technologies:
– Intra prediction: 9 modes for luma and 3 modes for chroma
– Motion compensation: 16×16, 16×8, 8×16, 8×8, 8×4, 4×8 block size
– Resolution of MV: 1/4-pel
– Transform: 16-bit-implemented 4×4 integer cosine transform
Main target applications:
– Record and local playback on mobile devices
– Multimedia Message Service (MMS)
– Streaming and broadcasting
– Real-time video conversation

TABLE III. STANDARD (CONTINUED)

Standard: Dirac
Standardization body: BBC R&D, Mozilla Public License (MPL)
Main target bitrate: few hundred kbps up to about 15 Mbps
Main compression technologies:
– 4×4 wavelet transform
– Dead-zone quantization and scaling
– Entropy coding: arithmetic coding
– Hierarchical motion estimation
– Intra, inter prediction
– Single and multiple reference P, B frames
– 1/8 pel motion vector precision
– 4×4 overlapped block based motion compensation (OBMC)
– Daubechies wavelet filters
Main target applications:
– Broadcasting
– Live streaming video
– Podcasting
– Peer to peer transfers
– HDTV with SD (standard definition) simulcast capability
– Desktop production
– News links
– Archive storage
– PVRs (personal video recorders)
– Multilevel mezzanine coding

Standard: DiracPRO (SMPTE VC-2)
Standardization body: BBC R&D, SMPTE
Main target bitrate: lossless HD to < 50 Mb/s; compression ratio 20:1
Main compression technologies:
– 4×4 wavelet transform
– Dead-zone quantization and scaling
– Entropy coding: context based adaptive binary arithmetic coding (CABAC), exponential Golomb coding
– Intra-frame only (forward, backward prediction modes also available)
– Frame, field coding (interlaced and progressive)
– Daubechies wavelet filters
Main target applications:
– Professional (high quality, low latency) applications (not for end user distribution)
– Lossless or visually lossless compression for archives
– Mezzanine compression for re-use of existing equipment
– Low delay compression for live video links

Standard: SMPTE VC-1 (WMV-9)
Standardization body: SMPTE 421M
Main target bitrate: 10 kbps – 8 Mbps
Main compression technologies:
– Integer DCT
– Adaptive block size transform: (8×8), (8×4), (4×8) and (4×4)
– Motion estimation for (16×16) and (8×8) blocks
– ½ pixel and ¼ pixel motion vector resolution
– Dead zone and uniform quantization
– Multiple VLCs
– In-loop deblock filtering, fading compensation
Main target applications:
– Media delivery over the Internet
– Broadcast TV
– HD DVD
– Digital projection in theaters, mobile phones
– DVB-T, DVB-S
[73] D. Santa-Cruz and T. Ebrahimi, “ A study of JPEG 2000 still image
coding versus other standards”, Proc X EUSIPCO, vol.2, pp. 673-676,
Sept. 2000.
[74] E.L. Tan and W.S. Gan, “Perceptually tuned subband coder for JPEG,”
J. Real Time Image Process., vol. 6, pp. 101-115, 2011.
JPEG-XR
[75] S. Srinivasan, C. Tu, S. L. Regunathan, G. J. Sullivan, and R. A. Rossi,
“HD Photo: A New Image Coding Technology for Digital
Photography,” Proc. SPIE, vol. 6696 (2007).
[76] MICROSOFT HD PHOTO SPECIFICATION
http://www.microsoft.com/whdc/xps/wmphotoeula.mspx
GENERAL
[72] D.A. Novik, J.C. Tilton and M. Manohar, “Compression through decomposition into browse and residual images”, Space and Earth Science Data Compression Workshop, NASA CP-3191, edited by James C. Tilton, Washington D.C., 1993.
DIGITAL VIDEO
[77] DV <http://en.wikipedia.org/wiki/DV>
[78] Y. Gao, D. Chan and J. Liang, “JPEG-XR optimization with graph-based SOFT quantization”, IEEE ICIP 2011, Brussels, Aug. 2011.
JPEG