Multimedia Processing Lab NH 140

Advisor : Dr. K.R. Rao
Phone : (817) 272-3478
Email : rao@uta.edu
Website: http://www-ee.uta.edu/dip
Multimedia Network
Home Media Ecosystem
A case for importance of research in multimedia
Video Redundancy – An Example
The need for video compression




• Video signal: a sequence of frames (images) related along the temporal dimension
• TV video quality: 704x576 pixels per frame, 12 bpp, 25 frames per second -> about 121 Mbps uncompressed
• Too much data for video transmission or storage
• Increasing importance of multimedia communication
NEED FOR VIDEO COMPRESSION
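The 121 Mbps figure follows directly from the frame size, bit depth and frame rate quoted above. A minimal sketch of that arithmetic (the function name is only illustrative):

```python
def raw_video_bitrate_bps(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


# TV-quality example from the slide: 704x576 pixels, 12 bpp, 25 frames/s
bps = raw_video_bitrate_bps(704, 576, 12, 25)
print(f"{bps / 1_000_000:.2f} Mbps")  # 121.65 Mbps
```

The same function can be reused for other formats by changing the four arguments.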
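Research Focus Areas
[Figure: timeline of video coding standards and their target platforms, with year marks 1992, 1994, 1999, 2003, 2005, 2009 and 2010. Standards shown: MPEG1, MPEG2, H.263, MPEG4, H.264 (2003), SVC, MVC, HVC(?) / H.265(?). Platforms shown: PC, video conferencing, hand PC, mobile phone, mobile TV, mobile, HDTV, Blu-ray DVD.]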
Research: Image, Video, Audio
Image: JPEG, JPEG-LS, LOCO, CALIC; JPEG 2000; JPEG XR–AIC; JBIG1,2; PNG; GIF
Video: MPEG 1,2,4,7,21; H.264, H.265(?), HVC; VP6, VP7, VP8; VC–1 (WMV–9); Wyner Ziv; AVS China part 2; Dirac, Dirac Pro (BBC); Real Networks-RV10
Audio: Dolby True HD; HD-AAC; MP3, MP3 Pro; AAC–SBR; HE–AC3; AVS China part 3; ATSC (E-AC3); WMA; DTS-HD Audio
Video Compression Standards
Standard | Main Applications | Year
JPEG, JPEG2000 | Image | 1992-1999, 2000
JBIG, JBIG2 | Fax | 1995-2000
H.261 | Video Conferencing | 1990
H.262, H.262+ | DTV, SDTV, HDTV | 1995, 2000
H.263, H.263++ | Videophone | 1998, 2000
MPEG-1 | Video CD | 1992
MPEG-2 | DTV, SDTV, HDTV, DVD | 1995
MPEG-4 Part 2 | Interactive video | 2000
MPEG-7 | Multimedia Content description | 2001
MPEG-21 | Multimedia Framework | 2002
H.264/MPEG-4 part 10 | Advanced Video Coding | 2003
Latest Video Codecs
Standard | Main Applications | Year
Dirac (B.B.C.) | Internet streaming to Ultra-high definition TV | 2008
Dirac pro/VC-2 | Studio and professional use | 2009
VC-1 (SMPTE/Microsoft) | Internet streaming to High definition TV | 2006
VC-3 | Compositing, mastering, and multi-generational use | 2006
VP6 (On2 technologies) | Broadcasting | 2003
VP7 | Broadcasting | 2005
VP8 | Broadcasting | 2008
RV10 (Real Networks) | Internet streaming | 2008
AVS China | IP TV, Terrestrial digital TV, Satellite broadcast, Video surveillance | 2005
H.264 Fidelity Range Extensions | Studio editing, Post processing, Digital cinema | 2004
H.264 SVC, MVC | Scalable video coding, panoramic video | 2006-2009
HVC | High Efficiency Video Coding | 2010 ?
Advanced Television Systems Committee (ATSC), www.atsc.org
Digital TV in North America
• A/53B ATSC Standard: Digital television standard, Revision B with amendment (Video: MPEG-2, Audio: AC-3), 2007
• A/153 Digital TV mobile and handheld specifications, 2009 (Video: H.264; Audio: HE AAC v2, ISO/IEC 14496-3)
Advanced Television Systems Committee (ATSC), continued
ATSC Mobile DTV includes a highly robust transmission system based on vestigial
sideband (VSB) modulation coupled with a flexible and extensible IP based
transport, efficient MPEG AVC (H.264) video and HE AAC v2 audio (ISO/IEC
14496-3) coding.
The Candidate Standard consists of eight parts:
• Part 1 – Mobile/Handheld Digital Television System
• Part 2 – RF/Transmission System Characteristics
• Part 3 – Service Multiplex and Transport Subsystem Characteristics
• Part 4 – Announcement
• Part 5 – Presentation Framework
• Part 6 – Service Protection
• Part 7 – Video System Characteristics
• Part 8 – Audio System Characteristics
ATSC Broadcast System
Fig 1. ATSC Broadcast system with TS main and M/H Services [1]
Comparison of various video compression standards
Algorithmic Element | MPEG-2 Video (H.262) | MPEG-4 AVC (H.264) | SMPTE VC-1 (Windows Media Video 9) | Dirac (BBC)
Intra prediction | None: MB encoded DC predictors | 4x4 spatial, 16x16 spatial, I-PCM | Frequency domain coefficient | 4x4 spatial
Picture coding type | Frame, Field, Picture AFF | Frame, Field, Picture AFF, MB AFF | Frame, Field, Picture AFF, MB AFF | Frame
Motion compensation block size | 16×16, 16×8, 8×16 | 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4 | 16×16, 8×8 | 4×4
Motion vector precision | Full pel, Half pel | Full pel, Half pel, Quarter pel | Full pel, Half pel, Quarter pel | 1/8 pel
Algorithmic Element | DiracPRO (BBC) | AVS China Part 2 | AVS China Part 7
Intra prediction | 4x4 spatial (forward, backward) | 8×8 block based intra prediction | Intra_4x4 (4x4 spatial), Direct Intra Prediction
Picture coding type | Intra: Frame, Field (Interlace, Progressive) | Frame | Frame
Motion compensation block size | N/A | 16×16, 16×8, 8×16, 8×8 | 16×16, 16×8, 8×16, 8×8, 8×4, 4×8
Motion vector precision | N/A | 1/4 pel | 1/4 pel
Comparison of various video compression standards
Algorithmic Element | MPEG-2 Video (H.262) | MPEG-4 AVC (H.264) | SMPTE VC-1 (Windows Media Video 9) | Dirac | DiracPRO | AVS Part 2 | AVS Part 7
P frame type | Single reference | Single reference, Multiple reference | Single reference, Intensity compensation | Single reference, Multiple reference | No P frames | Single and multiple reference (maximum of 2 reference frames) | Single and multiple reference (maximum of 2 reference frames)
B frame type | One reference each way | One reference each way, Multiple reference, Direct & spatial direct weighted prediction | One reference each way | One reference each way, Multiple reference | No B frames | One reference each way, Multiple reference, Direct and symmetrical mode | No B frames
In loop filters | None | De-blocking | De-blocking, Overlap transform | None | None | De-blocking filter | De-blocking filter
Comparison of various video compression standards
Algorithmic Element | MPEG-2 Video (H.262) | MPEG-4 AVC (H.264) | SMPTE VC-1 (Windows Media Video 9) | Dirac | DiracPRO | AVS Part 2 | AVS Part 7
Entropy coding | VLC | CAVLC, CABAC | Adaptive VLC | Arithmetic coding | Context based adaptive binary arithmetic coding, Exp-Golomb coding | 2D variable length coding | Context based adaptive 2D variable length coding
Transform | 8×8 DCT | 4×4 integer DCT, 8×8 integer DCT | 4×4 integer DCT, 8×8 integer DCT, 8×4 & 4×8 integer DCT | 4×4 wavelet transform | 4×4 wavelet transform | 8×8 DCT | 4×4 DCT
Other | Quantization scaling matrices are listed for six of the seven standards; VC-1 additionally lists range reduction and in-stream post-processing control.
Standards Comparison
Standard: H.264/MPEG-4 Part 10
Standardization body: JVT (ISO/IEC & ITU-T)
Main Target Bitrate: 8 kb/s up to about 150 Mb/s
Main Compression Technologies:
– Integer DCT
– Adaptive quantization
– Zigzag reordering
– Alternate scan ordering
– Predictive motion compensation
– Bi-directional motion compensation
– Variable block size motion compensation with small block sizes
– Quarter pixel motion compensation
– Motion vector over picture boundaries
– Multiple reference picture motion compensation
– Adaptive intra directional prediction
– In-loop deblocking filter
– Arithmetic coding
– Variable length coding
– Error resilient coding
Main Target Applications:
– Broadcast over cable, terrestrial and satellite
– Interactive or serial storage on optical and magnetic devices, DVD, etc.
– Conversational services
– Video on demand
– MMS over ISDN, DSL, Ethernet, LAN, wireless and mobile networks
– HDTV
– Digital camera
Standards Comparison
Standard: AVS Part 2
Standardization body: AVS workgroup
Main Target Bitrate: 1 Mb/s up to about 20 Mb/s
Main Compression Technologies:
– Interlace handling: Picture-level
adaptive frame/field coding (PAFF)
– Macroblock-level adaptive
frame/field coding (MBAFF)
– Intra prediction: 5 modes for luma
and 4 modes for chroma
– Motion compensation: 16×16,
16×8, 8×16, 8×8 block size
– Resolution of MV: 1/4-pel, 4-tap
interpolation filter
– Transform: 16 bit-implemented
8×8 integer cosine transform
– Quantization and scaling: scaling
only in encoder
– Entropy coding: 2D-VLC and
Arithmetic Coding
– In-loop deblocking filter
– Motion vector prediction
–Adaptive scan
Main Target Applications:
– HD broadcasting
– High density storage media
– Video surveillances
– Video on demand
Standards Comparison
Standard: AVS Part 7
Standardization body: AVS workgroup
Main Target Bitrate: 1 Mb/s up to about 20 Mb/s
Main Compression Technologies:
– Intra prediction: 9 modes for luma and
3 modes for chroma
– Motion compensation: 16×16, 16×8,
8×16, 8×8, 8×4, 4×8 block size
– Resolution of MV: 1/4-pel
– Transform: 16 bit-implemented 4×4
integer cosine transform
– Quantization and scaling: scaling only
in encoder
– Entropy coding: Context based
adaptive 2D variable length coding
– In-loop deblocking filter
Main Target Applications:
– Record and local playback on
mobile devices
– Multimedia Message Service
(MMS)
– Streaming and broadcasting
– Real-time video conversation
Standard: Dirac
Standardization body: BBC R&D; Mozilla Public License (MPL)
Main Target Bitrate: Few hundred kbps up to about 15 Mbps
Main Compression Technologies:
– 4×4 wavelet transform
– Dead-zone quantization and scaling
– Entropy coding: Arithmetic coding
– Hierarchical motion estimation
– Intra, Inter prediction
– Single and multiple reference P, B
frames
– 1/8 pel motion vector precision
– 4×4 overlapped block based motion
compensation (OBMC)
– Daubechies wavelet filters
Main Target Applications:
– Broadcasting
– Live streaming video
– Pod casting
– Peer to peer transfers
– HDTV with SD (standard
definition) simulcast capability
– Desktop production
– News links
– Archive storage
– PVRs (personal video recorders)
– Multilevel Mezzanine coding
Standards Comparison
Standard: DiracPRO (SMPTE VC-2)
Standardization body: BBC R&D, SMPTE
Main Target Bitrate: Lossless HD to < 50 Mb/s; compression ratio 20:1
Main Compression Technologies:
– 4×4 wavelet transform
– Dead-zone quantization and scaling
– Entropy coding: Context based adaptive binary arithmetic coding (CABAC), exponential Golomb coding
– Intra-frame only (forward, backward prediction modes also available)
– Frame, Field coding (Interlaced and progressive)
– Daubechies wavelet filters
Main Target Applications:
– Professional (high quality, low latency) applications (not for end user distribution)
– Lossless or visually lossless compression for archives
– Mezzanine compression for re-use of existing equipment
– Low delay compression for live video links
Standard: SMPTE VC-1 (WMV-9)
Standardization body: SMPTE 421M
Main Target Bitrate: 10 kbps – 8 Mbps
Main Compression Technologies:
– Integer DCT
– Adaptive block size transform: (8×8), (8×4), (4×8) and (4×4)
– Motion estimation for (16×16) and (8×8) blocks
– ½ pixel and ¼ pixel motion vector resolution
– Dead zone and uniform quantization
– Multiple VLCs
– In-loop deblock filtering, fading compensation
Main Target Applications:
– Media delivery over the Internet
– Broadcast TV
– HD DVD
– Digital projection in theaters, mobile phones
– DVB-T, DVB-S
Performance comparison of various
video coding standards
Audio Compression Standards
Standard | Main Applications | Year
Dolby True HD | Lossless audio, Blu-ray Disc players, A/V receivers, and home-theater | 2006
HD-AAC | Soundtrack applications | 1997
MP3 | Handheld devices | 1991
MP3 Pro | Handheld devices | 2001
AAC–SBR | DAB – High quality audio | 2003
HE–AC3 | Satellite or terrestrial audio broadcasting | 2005
AVS China part 3 | Handheld and broadcasting | 2004
AC3 Pro | Satellite or terrestrial audio broadcasting | 2006
E-AC3 | Enhanced AC-3 or Dolby Digital Plus (multiple program streams, multi channel signals beyond 5.1) | 2007
DTS – Digital Theater Systems | DTS – High Definition Audio | 2008
Current Research Activities of MPL
Mobile Applications
• Development of virtual lab platform for mobile software application
• Developing a low complexity video codec for mobile application
Complexity reduction
• Complexity reduction in existing video codecs
• Complexity reduction in existing audio codecs
Improve Robustness
• Error resilience of video streams in a lossy wireless environment
• Error concealment techniques for wireless video transmission
Transcoders
• Video transcoders: VP6 to H.264, H.264 to VC-1, Wyner Ziv to H.264, H.264 to AVS China, H.264 to DIRAC transcoders
Quality Improvement
• Optimizing existing video codecs using perceptual coding techniques
Video/Audio Integration
• AVS China – Audio/Video codec – Multiplex/demultiplex and lip sync
• DIRAC video codec and AAC – Multiplex/demultiplex and lip sync
Virtual lab. Platforms for Mobile SW Applications
Low complexity Codec Applications

Sensor Cam, Pill Cam, Wearable Cam, Disposable Cam, Scan Cam
Transcoding Applications
The transcoding platform handles the high complexity decoding on one side and the high complexity encoding on the other side, so the end devices need only a low complexity encoder and a low complexity decoder.
[Figure: an application scenario for transcoding]
Error Concealment in Lossy Wireless
Environment
[Figure: typical situation of 3G/4G cellular telephony. The original information leaves the source, part of it is lost in the lossy wireless network, and the destination reconstructs the lost information.]
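The slide does not say how the lost information is reconstructed. As a minimal sketch only, the example below assumes the simplest temporal concealment: each lost block is replaced by the co-located block of the previous decoded frame (zero-motion copy). The function name, the block size and the list-of-lists frame layout are assumptions made for illustration.

```python
def conceal_lost_blocks(current, previous, lost_blocks, block=4):
    """Return a copy of `current` in which every lost block is replaced by
    the co-located block of `previous` (zero-motion temporal concealment).

    current, previous: frames as lists of rows of pixel values, same size.
    lost_blocks: iterable of (block_row, block_col) indices that were lost.
    """
    repaired = [row[:] for row in current]  # do not modify the input frame
    for block_row, block_col in lost_blocks:
        for r in range(block_row * block, (block_row + 1) * block):
            for c in range(block_col * block, (block_col + 1) * block):
                repaired[r][c] = previous[r][c]
    return repaired


# Tiny usage example: a 4x4 frame split into 2x2 blocks, top-right block lost
previous_frame = [[1] * 4 for _ in range(4)]
damaged_frame = [[9] * 4 for _ in range(4)]
repaired_frame = conceal_lost_blocks(damaged_frame, previous_frame, [(0, 1)], block=2)
```

Practical decoders usually refine this with motion-compensated or spatial interpolation; the zero-motion copy is only the baseline case.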
Multiplexing of Audio/Video And Lip Sync
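[Figure: the video source and the audio source each pass through an AVS encoder; the compressed video and compressed audio enter the multiplexer, and the encoded stream is sent over the channel. At the receiver the demultiplexer separates the compressed video and compressed audio, each is decoded by an AVS decoder, and the video and audio outputs are presented with lip sync.]
AVS – Audio Video Standard of China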
A quick view on H.264 - Encoder
Profiles in H.264
4:2:2, 4:4:4, up to 12 bit depth
Intra Adaptive Directional Prediction 4x4 in H.264
Intra Adaptive Directional Prediction 8x8 in H.264
Intra Adaptive Directional Prediction 16x16 in H.264
Motion Estimation/Compensation Sizes (H.264)
Sub pixel accuracy for ME/MC (H.264)
[Figure: integer-sample positions (upper-case letters A to U) and fractional-sample positions (lower-case letters a to s and aa to hh) used for quarter-sample luma interpolation]
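A minimal sketch of how the sub-pixel samples in the figure are derived for luma, assuming 8-bit samples: a half-sample value such as b is computed from the six nearest integer samples in its row (E, F, G, H, I, J) with the 6-tap filter (1, -5, 20, 20, -5, 1)/32, and a quarter-sample value such as a is the rounded average of its two neighbours (G and b). The function names are illustrative only.

```python
def clip_u8(value):
    """Clip to the 8-bit sample range 0..255."""
    return max(0, min(255, value))


def half_sample(e, f, g, h, i, j):
    """Half-sample value between g and h using the 6-tap filter
    (1, -5, 20, 20, -5, 1), with rounding and division by 32."""
    return clip_u8((e - 5 * f + 20 * g + 20 * h - 5 * i + j + 16) >> 5)


def quarter_sample(p, q):
    """Quarter-sample value: rounded average of two neighbouring samples."""
    return (p + q + 1) >> 1


# Usage: b lies halfway between G and H; a lies halfway between G and b
E, F, G, H, I, J = 90, 95, 100, 110, 115, 120
b = half_sample(E, F, G, H, I, J)
a = quarter_sample(G, b)
```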
Scanning of transform coefficients (H.264)
(a) Zig-zag scan
0 1 5 6
2 4 7 12
3 8 11 13
9 10 14 15
(b) Alternate scan
0 2 8 12
1 5 9 13
3 6 10 14
4 7 11 15
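The two scan orders can be reproduced with a short sketch. The zig-zag order is generated by walking the anti-diagonals of the block; the alternate scan is written out as a table of (row, column) positions transcribed from the 4x4 pattern on this slide. Function and constant names are illustrative.

```python
def zigzag_positions(n=4):
    """Return (row, col) pairs of an n x n block in zig-zag scan order."""
    positions = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # even diagonals are walked from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left
        positions.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return positions


# Alternate scan for a 4x4 block, as (row, col) pairs in scan order
ALTERNATE_SCAN_4X4 = [
    (0, 0), (1, 0), (0, 1), (2, 0), (3, 0), (1, 1), (2, 1), (3, 1),
    (0, 2), (1, 2), (2, 2), (3, 2), (0, 3), (1, 3), (2, 3), (3, 3),
]


def order_matrix(positions, n=4):
    """Matrix whose entry [row][col] is the scan index of that coefficient."""
    matrix = [[0] * n for _ in range(n)]
    for index, (row, col) in enumerate(positions):
        matrix[row][col] = index
    return matrix


def scan(block, positions):
    """Read the coefficients of a 2-D block into a 1-D list in scan order."""
    return [block[row][col] for row, col in positions]
```

Calling `order_matrix(zigzag_positions())` and `order_matrix(ALTERNATE_SCAN_4X4)` gives the two index patterns of the slide.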
SVC Extensions (H.264)
Future Standards Activities – Bit depth Scalability
• LCD dynamic range – 500:1
• HDR displays: Sharp “Mega-contrast”, LG.Philips – 1,000,000:1, Dolby – 250,000:1
[Figure: HDR video input (10, 12, 14 bits/pixel) feeds a bit depth scalable coder with two outputs: HDR video output (HDR storage/display) and LDR video output (conventional display). Diagram labels: HDR range, 8-bit range, tone mapping.]
Future Standards Activities – 3D Video
Consumer Electronics
auto-stereoscopic display,
10+ views required
Digital Cinema
polarized glasses, 2 views sufficient
3D Video (3DV)/Free View-Point Video (FVV) effort initiated in MPEG. Similar
concept to MPEG-C. Any number of views can be recreated using depth map
in the decoder.
2D video data + depth
• “Paramount Pictures' Beowulf is benefiting from theaters utilizing next-generation 3D technology (grossed approximately $23.4 million of a total domestic gross of over $79.4 million).”
• U2 3D, the first live-action movie to be shot, produced, and screened exclusively with digital 3-D technology
• DreamWorks Animation is joining the digital 3-D wave
• “Studio plans to release all its pics in 3-D starting in 2009.”
Original and compressed Lena
image with different methods
(a) Original Lena
(512×512×24)
(b) AIC: 0.22bpp,
PSNR=28.84dB
(c) JPEG2000: 0.22bpp,
PSNR=29.57dB
Compressed Lena image with
different methods(contd.)
(d) M-AIC: 0.22bpp, PSNR=29.02dB
(e) JPEG: 0.22bpp, PSNR=24.29dB
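The PSNR values quoted for the compressed Lena images follow the usual definition PSNR = 10 log10(peak^2 / MSE). A minimal sketch, assuming 8-bit samples (peak = 255) and images passed as flat lists of pixel values; the function name is illustrative:

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images,
    each given as a flat list of pixel values."""
    if len(original) != len(decoded) or not original:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

For a colour image the function can be applied per component, or to all samples at once, depending on the convention being compared against.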
AVS





• AVS is an integrated set of standards covering system, video, audio and media copyright management.
• AVS-M is the seventh part of the video coding standard developed by the AVS work group of China; it is aimed at mobile systems and devices.
• In AVS-M, a Jiben Profile has been defined which has 9 different levels.
• AVS follows a layered structure for the data, and this representation is seen in the coded bitstream.
• The sequence layer provides an entry point into the coded video. It consists of a set of mandatory and optional downloadable parameters.
AVS-M ENCODER
Block Diagram of AVS-M encoder [34]
AVS-M DECODER
Block Diagram of AVS-M Decoder [34]
AVS-M Analysis
[Figure panels: original sequence and decoded sequence]
Dirac features
• Direct support of multiple picture formats: 4K e-cinema through to quarter common intermediate format (QCIF)
• Supports I-frame only up to long group of picture (GOP) structures
• Direct support of multiple chroma formats, e.g. 4:4:4/4:2:2/4:2:0
• Direct support of multiple bit depths, e.g. 8 bit to 16 bit
• Direct support of interlace via metadata
• Direct support of multiple frame rates from 23.97 fps to 60 fps
• Definable pixel aspect ratios
• Multiple color spaces with metadata
• Definable wavelet depth


Dirac Encoder
Dirac encoder architecture [1]
Dirac Decoder
Dirac decoder architecture [8]
Dirac Results for Miss America
Current Interns & Alumni Network
Current & Recent Grads:
Jay R Padia (M.S) (May 2010)
- Job @ Intel
Att Kruafak (Ph.D)
– job @ Engineer CAT, Thailand
Sangseok Park (Dec 2008) (Ph.D)
– job @ DiaLogic
Aruna Ravi Subramanya
Sahana Devaraju
Tejaswini Purushottam
job@microchip
Krishnan -intern@ FastVDO
Swaroop Suchethan - job@Ericsson
Jennie Abraham - job@Ericsson
Nikshep Patil – intern @ Datamatics
Radhika Veerla (Aug 08) – job@RIM
Theju Jacob (Aug 08) – Ph.D. student
Pooja Agawane (Aug 08) – job@Intel
Leena Agarwal (Dec 07) – job@Intel
Rahul Panchal (May 07) – job@Qualcomm
Harishankar Murugan (May 07)- job@NVidia
Sreejana Sharma (May 07)- job@Intel
Hitesh Yadav (August 06)- job@Intel
Basavaraj S. M. (May 06)- Job@Fast VDO
Rochelle Pereira (Dec 05)- job@NVidia
Sandya Sheshadri (Dec 05) – job@Microsoft
Tarun Bhatia (Dec 05)- job@wirelessventures
Vidhya Vijaykumar job@TI
Current Interns & Alumni Network







Pragnesh Ramolia- job @
Tactel US
Nikshep Patil –job @ Marvell
Semiconductors
Sreya – Intern @ RIM
Shreyanka – Intern @ Intel
Amruta –Intern @ RIM
Tejas –Intern @ RIM
Sadaf – Intern @ Ericsson





Anuradha (Dec 04) –job @
Qualcomm
Shubha Kumbadkone (Dec
04) –job @ Intel
Nandakishore (Aug 04) –job
@ Qualcomm
Phani (May 04) – job @
Qualcomm
Ravi Kumar (May 04) –job @
Qualcomm
References
DIRAC
1. T. Borer and T. Davies, “Dirac video compression using open technology”, BBC EBU Technical Review, July 2005.
2. BBC Research on Dirac: http://www.bbc.co.uk/rd/projects/dirac/index.shtml
3. The Dirac web page: http://dirac.sourceforge.net
4. T. Davies, “The Dirac Algorithm”: http://dirac.sourceforge.net/documentation/algorithm/, 2005.
5. Dirac developer support: Overlapped block-based motion compensation: http://dirac.sourceforge.net/documentation/algorithm/algorithm/toc.htm
6. “Dirac Pro to bolster BBC HD links”: http://www.broadcastnow.co.uk/news/multiplatform/news/dirac-pro-to-bolster-bbc-hd-links/1732462.article
7. Dirac software and source code: http://diracvideo.org/download/dirac-research/
8. Dirac video codec - A programmer's guide: http://dirac.sourceforge.net/documentation/code/programmers_guide/toc.htm
9. Daubechies wavelet: http://en.wikipedia.org/wiki/Daubechies_wavelet
10. Daubechies wavelet filter design: http://cnx.org/content/m11159/latest/
11. Dirac developer support: Wavelet transform: http://dirac.sourceforge.net/documentation/algorithm/algorithm/wlt_transform.xht
12. Dirac developer support: RDO motion estimation metric: http://dirac.sourceforge.net/documentation/algorithm/algorithm/rdo_mot_est.xht
13. A. Ravi and K.R. Rao, “Performance analysis and comparison of the DIRAC video codec with H.264/MPEG-4 part 10 AVC”, IJWMIP, vol. 4, pp. 635-654, 2011.
H.264
1. T. Wiegand et al., “Overview of the H.264/AVC video coding standard”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, pp. 560-576, July 2003.
2. T. Wiegand and G. J. Sullivan, “The H.264 video coding standard”, IEEE Signal Processing Magazine, vol. 24, pp. 148-153, March 2007.
3. D. Marpe, T. Wiegand and G. J. Sullivan, “The H.264/MPEG-4 AVC standard and its applications”, IEEE Communications Magazine, vol. 44, pp. 134-143, Aug. 2006.
4. S.K. Kwon, A. Tamhankar and K.R. Rao, “Overview of H.264 / MPEG-4 Part 10”, J. Visual Communication and Image Representation, vol. 17, pp. 186-216, April 2006.
5. A. Puri, X. Chen and A. Luthra, “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.
6. H.264 AVC JM software: http://iphome.hhi.de/suehring/tml/
7. H.264/MPEG-4 AVC: http://en.wikipedia.org/wiki/H.264
8. M. Fiedler, “Implementation of basic H.264/AVC decoder”, seminar paper at Chemnitz University of Technology, June 2004.
9. H.264 encoder and decoder: http://www.adalta.it/Pages/407/266881_266881.jpg
10. R. Schäfer, T. Wiegand and H. Schwarz, “The emerging H.264/AVC standard”, EBU Technical Review, Jan. 2003.
11. H.264 reference software download: http://iphome.hhi.de/suehring/tml/
12. D. Marpe, T. Wiegand and S. Gordon, “H.264/MPEG4-AVC fidelity range extensions: tools, profiles, performance, and application areas”, IEEE International Conference on Image Processing, vol. 1, pp. I-593-6, 2005.
13. S. Saponara et al., “The JVT advanced video coding standard: complexity and performance analysis on a tool-by-tool basis”, Packet Video Workshop, Nantes, France, April 2003.
VC-1
1. VC-1 technical overview: http://www.microsoft.com/windows/windowsmedia/howto/articles/vc1techoverview.aspx
2. Microsoft Windows Media: http://www.microsoft.com/windows/windowsmedia
3. http://en.wikipedia.org/wiki/VC-1
4. S. Srinivasan et al., “Windows Media Video 9: overview and applications”, Signal Processing: Image Communication, vol. 19, Issue 9, pp. 851-875, Oct. 2004.
5. S. Srinivasan and S. L. Regunathan, “An overview of VC-1”, SPIE / VCIP, vol. 5960, pp. 720-728, July 2005.
AVS
1. AVS Video Expert Group, “Information technology – Advanced coding of audio and video – Part 2: Video (AVS1-P2 JQP FCD 1.0)”, Audio Video Coding Standard Group of China (AVS), Doc. AVS-N1538, Sept. 2008.
2. AVS Video Expert Group, “Information technology – Advanced coding of audio and video – Part 3: Audio”, Audio Video Coding Standard Group of China (AVS), Doc. AVS-N1551, Sept. 2008.
3. L. Yu et al., “Overview of AVS-Video: Tools, performance and complexity”, SPIE VCIP, vol. 5960, pp. 596021-1~596021-12, Beijing, China, July 2005.
4. L. Fan, S. Ma and F. Wu, “Overview of AVS video standard”, IEEE Int’l Conf. on Multimedia and Expo, ICME '04, vol. 1, pp. 423–426, Taipei, Taiwan, June 2004.
5. W. Gao et al., “AVS – The Chinese next-generation video coding standard”, National Association of Broadcasters, Las Vegas, 2004.
6. Special issue on “AVS and its Applications”, Signal Processing: Image Communication, vol. 24, pp. 245-344, April 2009.
7. AVS China software: ftp://159.226.42.57/public/avs_doc/avs_software (need password)
8. AVS working group official website: http://www.avs.org.cn
9. http://www-ee.uta.edu/dip/Courses/EE5351/ISPACSAVS.pdf
10. W. Gao et al., “AVS – the Chinese next-generation video coding standard”, National Association of Broadcasters, Las Vegas, 2004.
11. L. Fan, “Mobile Multimedia Broadcasting Standards”, ISBN: 978-0-387-78263-8, Springer US, 2009.
12. F. Yi et al., “Low-Complexity Tools in AVS Part 7”, J. Comput. Sci. Technol., vol. 21, pp. 345-353, May 2006.
13. L. Yu, S. Chen and J. Wang, “Overview of AVS-video coding standards”, Signal Process.: Image Commun., vol. 24, Issue 4, pp. 247-262, April 2009.
14. W. Gao, “AVS – A project towards to an open and cost efficient Chinese national standard”, ITU-T VICA workshop, ITU Headquarters, Geneva, 22-23 July 2005.
15. Z. Zhang et al., “Improved intra prediction mode-decision method”, Proc. of SPIE, vol. 5960, pp. 59601W-1~59601W-9, Beijing, China, July 2005.
16. Z. Ma et al., “Intra coding of AVS Part 7 video coding standard”, J. Comput. Sci. Technol., vol. 21, Feb. 2006.
17. W. Gao and T. Huang, “AVS Standard - Status and Future Plan”, Workshop on Multimedia New Technologies and Application, Shenzhen, China, Oct. 2007.
18. Y. Cheng et al., “Analysis and application of error concealment tools in AVS-M decoder”, Journal of Zhejiang University – Science A, vol. 7, pp. 54-58, Jan. 2006.
19. M. Liu and Z. Wei, “A fast mode decision algorithm for intra prediction in AVS-M video coding”, ICWAPR 07, vol. 1, pp. 326–331, 2-4 Nov. 2007.
20. Q. Wang et al., “Context-Based 2D-VLC for Video Coding”, IEEE Int’l Conf. on Multimedia and Expo (ICME), vol. 1, pp. 89-92, June 2004.
21. http://vspc.ee.cuhk.edu.hk/~ele5431/AVS.pdf
22. W. Gao, K.N. Ngan and L. Yu, “Special issue on AVS and its applications: Guest editorial”, Signal Process.: Image Commun., vol. 24, Issue 4, pp. 245-344, April 2009.
23. S.W. Ma and W. Gao, “Low Complexity Integer Transform and Adaptive Quantization Optimization”, J. Comput. Sci. Technol., vol. 21, pp. 354-359, May 2006.
24. S. Hu, X. Zhang and Z. Yang, “Efficient Implementation of Interpolation for AVS”, Congress on Image and Signal Processing, 2008, vol. 3, pp. 133–138, 27-30 May 2008.
25. R. Schafer and T. Sikora, “Digital video coding standards and their role in video communications”, Proc. of the IEEE, vol. 83, pp. 907-924, June 1995.
26. A. K. Jain, “Image data compression: A review”, Proc. IEEE, vol. 69, pp. 349-384, March 1981.
JPEG, JPEG-2000, JPEG-XR (XR: Extended range)
1. AIC website: http://www.bilsen.com/aic/
2. T. Wiegand et al., “Overview of the H.264/AVC Video Coding Standard”, IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, pp. 560-576, July 2003.
3. G. Sullivan, P. Topiwala and A. Luthra, “The H.264/AVC Advanced Video Coding Standard: Overview and Introduction to the Fidelity Range Extensions”, SPIE Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp. 53-74, Aug. 2004.
4. I. Richardson, H.264 and MPEG-4 Video Compression: Video Coding for Next-Generation Multimedia, Hoboken, NJ: Wiley, 2003.
5. P. Topiwala, “Comparative study of JPEG2000 and H.264/AVC FRExt I-frame coding on high definition video sequences”, Proc. SPIE Int’l Symposium, Digital Image Processing, vol. , pp. , San Diego, Aug. 2005.
6. P. Topiwala, T. Tran and W. Dai, “Performance comparison of JPEG2000 and H.264/AVC high profile intra-frame coding on HD video sequences”, Proc. SPIE Int’l Symposium, Digital Image Processing, Applications of Digital Image Processing XXIX, vol. 6321, pp. , San Diego, Aug. 2006.
7. T. Tran, L. Liu and P. Topiwala, “Performance comparison of leading image codecs: H.264/AVC intra, JPEG 2000, and Microsoft HD photo”, Proc. SPIE Int’l Symposium, Digital Image Processing, vol. , pp. , San Diego, Sept. 2007.
8. G. J. Sullivan, “ISO/IEC 29199-2 (JpegDI part 2 JPEG XR image coding – Specification)”, ISO/IEC JTC 1/SC 29/WG1 N 4492, Dec. 2007.
9. D. Marpe, T. Wiegand and G. Sullivan, “The H.264/MPEG4 advanced video coding standards and its applications”, IEEE Communications Magazine, vol. 44, pp. 134-143, Aug. 2006.
10. A. Skodras, C. Christopoulos and T. Ebrahimi, “The JPEG2000 still image compression standard”, IEEE Signal Processing Magazine, vol. 18, pp. 36-58, Sept. 2001.
11. D.S. Taubman and M.W. Marcellin, JPEG 2000: Image compression fundamentals, standards and practice, Kluwer Academic Publishers, 2001.
12. W.B. Pennebaker and J.L. Mitchell, JPEG: Still image data compression standard, Kluwer Academic Publishers, 2003.
13. D. Marpe, V. George and T. Wiegand, “Performance comparison of intra-only H.264/AVC HP and JPEG 2000 for a set of monochrome ISO/IEC test images”, JVT-M014, pp. 18-22, Oct. 2004.
14. D. Marpe et al., “Performance evaluation of motion JPEG2000 in comparison with H.264 / operated in intra-coding mode”, Proc. SPIE, vol. 5266, pp. 129-137, Feb. 2004.
15. Z. Xiong et al., “A comparative study of DCT- and wavelet-based image coding”, IEEE Trans. on Circuits and Systems for Video Tech., vol. 9, pp. 692-695, Aug. 1999.
16. H.264/AVC reference software (JM 13.2) website: http://iphome.hhi.de/suehring/tml/download/
17. JPEG reference software website: ftp://ftp.simtel.net/pub/simtelnet/msdos/graphics/jpegsr6.zip
18. Microsoft HD photo specification: http://www.microsoft.com/whdc/xps/wmphotoeula.mspx
19. JPEG2000 latest reference software (Jasper Version 1.900.0) website: http://www.ece.ubc.ca/mdadams/jasper
20. JPEG-LS reference software website: http://www.hpl.hp.com/loco/
21. M.D. Adams, “JasPer software reference manual (Version 1.900.0)”, ISO/IEC JTC 1/SC 29/WG 1 N 2415, Dec. 2007.
22. M.D. Adams and F. Kossentini, “Jasper: A software-based JPEG-2000 codec implementation”, Proc. of IEEE Int. Conf. Image Processing, vol. 2, pp. 53-56, Vancouver, BC, Canada, Oct. 2000.
23. M. J. Weinberger, G. Seroussi and G. Sapiro, “LOCO-I: A low complexity, context-based, lossless image compression algorithm”, Hewlett-Packard Laboratories, Palo Alto, CA.
24. M.J. Weinberger, G. Seroussi and G. Sapiro, “The LOCO-I lossless image compression algorithm: principles and standardization into JPEG-LS”, IEEE Trans. Image Processing, vol. 9, pp. 1309-1324, Aug. 2000.
25. Ibid, “LOCO-I: A low complexity, context-based, lossless image compression algorithm”, Proc. 1996 DCC, pp. 140-149, Snowbird, Utah, Mar. 1996.
26. K. Sayood, “Introduction to Data Compression”, Third Edition, Morgan Kaufmann Publishers, 2006.
27. M. Ghanbari, “Standard Codecs: Image Compression to Advanced Video Coding”, IEE, London, UK, 2003.
28. Z. Wang and A. C. Bovik, “Modern image quality assessment”, Morgan and Claypool Publishers, 2006.
29. Special Issue on JPEG-2000, Signal Processing: Image Communication, vol. 17, pp. 1-144, Jan. 2002.
30. A. Stoica, C. Vertan and C. Fernandez-Maloigne, “Objective and subjective color image quality evaluation for JPEG 2000-compressed images”, IEEE Int’l Symposium on Signals, Circuits and Systems, vol. 1, pp. 137–140, July 2003.
31. J. J. Hwang and S. G. Cho, “Proposal for objective distortion metrics for AIC standardization”, ISO/IEC JTC 1/SC 29/WG 1 N4548, Mar. 2008.
32. H. R. Wu and K. R. Rao, “Digital video image quality and perceptual coding”, Boca Raton, FL: Taylor and Francis, 2006.
33. I. H. Witten, R. M. Neal and J. G. Cleary, “Arithmetic coding for data compression”, Communications of the ACM, vol. 30, pp. 520-540, June 1987.
34. Z. Zhang, R. Veerla and K. R. Rao, “A modified advanced image coding”, Proceedings of CANS’ 2008, Romania, Nov. 8-10, 2008.
35. X. Shang, “Structural similarity based image quality assessment: pooling strategies and applications to image compression and digit recognition”, M.S. Thesis, EE Department, The University of Texas at Arlington, Aug. 2006.
36. A. M. Eskicioglu and P. S. Fisher, “Image quality measures and their performance”, IEEE Trans. on Communications, vol. 43, pp. 2959-2965, Dec. 1995.
37. Test images found in: http://www.hlevkin.com/default.html#testimages
38. Information collected for various topics included in the material: www-ee.uta.edu/dip
39. Y-L. Lee and K-H. Han, “Complexity of the proposed lossless intra for 4:4:4”, (ISO/IEC JTC1/SC29/WG11 and ITU-T SG 16 Q.6) document JVT-Q035, 17-21 Oct. 2005.
40. M. Ouaret, F. Dufaux and T. Ebrahimi, “On comparing JPEG 2000 and intraframe AVC”, SPIE, Applications of Digital Image Processing XXIX, vol. 6312, pp. , Aug. 2006.
41. S-T. Hsiang, “A new subband/wavelet framework for AVC/H.264 intraframe coding and performance comparison with motion-JPEG 2000”, VCIP, Proc. of SPIE-IS&T Electronic Imaging, SPIE vol. 6822, pp. 68220P-1 thru 68220P-12, Jan. 2008.
42. S. Srinivasan et al., “An introduction to the HD photo technical design”, JPEG document wg1n4183, April 2007.
References
Books
1. I. Richardson “The H.264 advanced video
compression standard” Hoboken, NJ:
Wiley, 2010.
4x4 INTDCT in H.264

Vcodex white paper on 4x4 transform and quantization in H.264
http://www.vcodex.com/files/H264_4x4_transform_whitepaper_Apr09.pdf
The description of the normative inverse quantization and transform process is found in the latest standard specification:
http://www.itu.int/rec/T-REC-H.264
Last, the following papers and standardization contributions contain valuable information and insight on the transform and quantization design of H.264/MPEG-4 Part 10 AVC:
1) H. S. Malvar, A. Hallapuro, M. Karczewicz and L. Kerofsky, “Low-Complexity Transform and Quantization in H.264/AVC”, IEEE Trans. on Circ. Sys. on Video Tech., vol. 13, pp. 598-603, July 2003.
2) A. Hallapuro, M. Karczewicz and H. Malvar, “Low Complexity Transform and Quantization – Part I: Basic Implementation”, JVT of ISO/IEC MPEG and ITU-T VCEG, JVT-B038, Feb. 2002.
3) A. Hallapuro, M. Karczewicz and H. Malvar, “Low Complexity Transform and Quantization – Part II: Extensions”, Joint Video Team of ISO/IEC MPEG and ITU-T VCEG, JVT-B039, Feb. 2002.
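As a minimal sketch of the 4x4 integer transform these references describe, the forward core transform can be written as Y = Cf · X · Cf^T with an integer matrix Cf, using only additions, subtractions and shifts in practice. The scaling that makes it approximate an orthonormal DCT is folded into the quantization step and is not shown here. The helper names below are illustrative, and plain Python lists are used instead of a matrix library.

```python
# Forward core transform matrix of the H.264 4x4 integer transform
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(column) for column in zip(*m)]


def forward_core_transform(block):
    """Unscaled 4x4 forward transform: Cf * X * Cf^T (integer arithmetic only)."""
    return matmul(matmul(CF, block), transpose(CF))


# Usage: a flat 4x4 residual block has all its energy in the DC coefficient
flat_block = [[1] * 4 for _ in range(4)]
coefficients = forward_core_transform(flat_block)
```

For the flat block above only the top-left (DC) coefficient is non-zero, which is the behaviour expected of a DCT-like transform.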
LARGE SIZE TRANSFORMS

W.K. Cham, “Simple order-16 integer transform for video coding” IEEE ICIP 2010,
Hong Kong, Sept.2010.

R. Joshi, Y.A. Reznik and M. Karczewicz, “Efficient large size transforms for high-performance video coding”, SPIE Optics + Photonics, vol. 7798, paper 7798-31, San Diego, CA, Aug. 2010.

A.T. Hinds, “Design of high-performance fixed-point transforms using the common factor method”, SPIE Optics + Photonics, vol. 7798, paper 7798-29, San Diego, CA, Aug. 2010.

G.J. Sullivan, “ Standardization of IDCT approximation behavior for video
compression: the history and the new MPEG-C parts 1 and 2 standards”, SPIE
vol. 6696, paper 35, Aug.2007.

I. E. Richardson , “The H.264 Advanced Video Compression Standard”, 2nd Edition, Wiley
publications, 2010.

High efficiency video coding (HEVC)

http://www.h265.net/ has info on developments in HEVC NGVC – Next generation video
coding.
Some of the tools contributing to the gain are:
(1) RD Picture Decision
(2) RDO_Q (from Qualcomm)
(3) MDDT (from Qualcomm)
(4) New Offset (from Qualcomm)
(5) Adaptive Interpolation Filter (from Qualcomm & Nokia)
(6) Block Adaptive Loop Filter (BALF) (from Toshiba)
(7) Bigger Blocks and Bigger transform (32x32 and 64x64) (Qualcomm)
(8) Motion Vector Competition (France Telecom)
(9) Template matching
JVT KTA reference software (KTA: key technical areas)
http://iphome.hhi.de/suehring/tml/download/KTA/
G.J. Sullivan and J.-R. Ohm, “Recent developments in standardization of high efficiency video coding”, Proc. SPIE, vol. 7798, pp. 77980V-1 thru V-7, San Diego, CA, Aug. 2010.



NEW GENERATION VIDEO CODING
(NGVC)
VCEG
(ITU-T)
MPEG
(ISO/IEC)
Joint collaborative team on video coding (JCT-VC)
(15-23 April 2010- first meeting)
Table. 1 [1]
Technical assessment, first JCT-VC meeting, Dresden, Germany, 15-23 April 2010
• All proposed algorithms are based on the traditional MC hybrid (transform-DPCM) coding approach.
• Random Access
• Low Delay
• TMUC (Test Model Under Consideration)
• Coding Units (CU)
• Prediction Units (PU)
• Transform Units (TU)
• Intra prediction: up to 28 angular directions
• ME/MC
• Inter prediction (multiple ref. pictures, bi-prediction, weighted prediction)
• New MV competition
• Transform unit block size 4x4 to 64x64 (mode dependent directional transform (MDDT) and rotational transforms)
ADAPTIVE LOOP FILTER
• JCT-VC: developing a well validated design called TM, leading to HEVC standardization by 2011.
• First version of HEVC is expected probably by end of 2012 or early 2013.

Explore the field of multimedia processing in
MPL @
- Dr. K.R. Rao
(817) 272-3478
rao@uta.edu
NH 140
http://www-ee.uta.edu/dip