Digital Media Dr. Jim Rowan ITEC 2110 Video Part 2

Digital Video Standards
• Even though digital video COULD be
much less complicated... it isn’t
because...
• Backward compatibility requirement
– new equipment must create signals that
can be handled by older equipment
• Originally... TV signals (analog) needed
to be converted to digital format
Digital Video Standards...
• NTSC and PAL are ANALOG
standards
– Each has a screen size and frame (or refresh) rate
– Each defines a number of lines on the screen (which
can easily be used for the Y dimension)
– But what about the other dimension, the X?...
– Each line is a continuous (analog) signal which
has to be converted to digital...
• How do you do that?
– SAMPLE the analog data!
– But directly sampling for each pixel results in a
data stream of 20 Mbytes/second... HUGE!
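The size problem can be checked with simple arithmetic. The sketch below multiplies out an uncompressed data rate. The frame size, frame rate and bytes per pixel are illustrative assumptions (they are not stated on the slides), chosen because they land near the 20 Mbytes/second figure.

```python
def raw_data_rate(width, height, frames_per_second, bytes_per_pixel):
    """Bytes per second for video that is sampled but not compressed."""
    return width * height * frames_per_second * bytes_per_pixel

# Assumed example values: 720 x 480 pixels, 30 frames/second, 2 bytes per pixel
rate = raw_data_rate(720, 480, 30, 2)
print(rate / 1_000_000, "Mbytes/second")  # 20.736 Mbytes/second
```

Changing any one of the four inputs changes the result in direct proportion, which is why screen size and frame rate are the first things to consider.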
Coping with Video Size
• Aside from changing screen size or
frame rate...
• Consider human vision limitations
– Use algebra to compute part of the signal
– Chrominance sub-sampling
• Compression - two versions
– spatial
– temporal
Sampling analog Video
• To reduce the data stream you can consider
human vision again
• Human eyes are less sensitive to color
changes than luminance
• Decision: Take fewer samples for color than
luminance
• Without sub-sampling...
– for each pixel on the screen 4 things will have to
be encoded
• luminance, red, blue, green
Sub-sampling &
understanding human vision
Designers realized that Green contributes the most to intensity,
Red is next and Blue hardly contributes anything to luminance
Based on this, it was decided to use a formula for luminance
• Y = 0.2125R+0.7154G+0.0721B
With this we only have 3 (not 4) data elements to transmit
- Y (luminance)
- Cb (blue chrominance)
- Cr (red chrominance)
The 4th element (green) can be calculated
- results in a 25% data reduction
Calculating the 4th color
component
• Known as the Y’CbCr model for CRTs
Y = 0.2125R+0.7154G+0.0721B
Solve for G (green):
Y - 0.0721B - 0.2125R = 0.7154G
0.7154G = Y - 0.0721B - 0.2125R
G = (Y - 0.0721B - 0.2125R) / 0.7154
There is a use for algebra!
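The algebra can be checked with a short sketch. It uses the coefficients from the slide and, like the slide, works with the R and B values directly. That is a simplification: real Y'CbCr signals carry scaled colour differences rather than raw R and B. The function names are made up for this example.

```python
# Coefficients as given on the slide
KR, KG, KB = 0.2125, 0.7154, 0.0721

def luminance(r, g, b):
    """Y computed from the three colour components."""
    return KR * r + KG * g + KB * b

def recover_green(y, r, b):
    """Solve the luminance formula for G, given Y, R and B."""
    return (y - KB * b - KR * r) / KG

r, g, b = 200, 120, 40
y = luminance(r, g, b)
print(round(recover_green(y, r, b), 6))  # 120.0
```

Green is never transmitted here, yet the receiver gets it back from the other three values.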
Coping with Video Size
• Aside from screen size and frame rate...
• Consider human vision limitations
– Use algebra to compute part of the signal
– Chrominance sub-sampling
• Compression - two versions
– spatial
– temporal
Chrominance sub-sampling
• Humans can’t distinguish changes in color as
well as they can distinguish luminance
changes
– http://en.wikipedia.org/wiki/Chroma_subsampling
• For every 4 pixels
– store the luminance of each one
– only store a proportion of the color info
Chrominance sub-sampling
• Figure: sampling patterns, written as the ratio L:R:B
– 4:4:4
– 4:2:2 CCIR 601 video sampling
– 4:1:1 NTSC DV
– 4:2:0 PAL DV
• notice the inconsistency?
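To see how much each pattern saves, the sketch below counts stored samples for a block 4 pixels wide and 2 rows tall. It assumes the usual J:a:b reading of the notation (not spelled out on the slides): J luminance samples per row, a chroma positions in the first row, and b further chroma positions in the second row. Under that reading 4:2:0 does not mean "no red samples", which may be the inconsistency the slide hints at.

```python
def samples_per_block(j, a, b):
    """Samples stored for a block J pixels wide and 2 rows tall.

    Assumed J:a:b reading: J luminance samples in each row, a chroma
    positions in row 1, b more chroma positions in row 2. Every chroma
    position carries two values (Cb and Cr).
    """
    luma = 2 * j
    chroma = 2 * (a + b)
    return luma + chroma

full = samples_per_block(4, 4, 4)
for name, ratio in [("4:4:4", (4, 4, 4)), ("4:2:2", (4, 2, 2)),
                    ("4:1:1", (4, 1, 1)), ("4:2:0", (4, 2, 0))]:
    n = samples_per_block(*ratio)
    print(name, n, "samples,", round(100 * n / full), "% of 4:4:4")
```

With these assumptions 4:2:2 stores two thirds of the full data, while 4:1:1 and 4:2:0 both store half.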
NTSC & PAL weirdness
(sidebar)
• NTSC & PAL
– Define different screen sizes
– Define different frame rates
– Both have the same aspect ratio of 4:3
– BUT... they each are digitized (through sampling)
to the same number of pixels per line
• The result?
– The pixels are not square
– PAL pixels are slightly wider than they are tall
– NTSC pixels are slightly taller than they are wide
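The pixel shape follows from dividing the picture's aspect ratio by the ratio of the pixel grid. The sketch below assumes the CCIR 601 sampling grids of 720 x 576 (PAL) and 720 x 480 (NTSC). Those numbers are not on the slides, and the result is an idealized figure that ignores details of how much of each line is actually visible.

```python
from fractions import Fraction

def pixel_aspect_ratio(pixels_per_line, lines, display_aspect=Fraction(4, 3)):
    """Width of one pixel divided by its height for a given picture shape."""
    return display_aspect / Fraction(pixels_per_line, lines)

print("PAL: ", pixel_aspect_ratio(720, 576))  # 16/15
print("NTSC:", pixel_aspect_ratio(720, 480))  # 8/9
```

A value above 1 means the pixel is wider than it is tall, a value below 1 means it is taller than it is wide, and exactly 1 would be a square pixel.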
DV and MPEG
• DV and its different forms:
– MiniDV, DVCAM & DVCPRO
– http://en.wikipedia.org/wiki/DV#DVCAM
• DVCAM & DVCPRO
– use the same compression algorithm (5:1)
– use the same data stream (25 Mbits/second)
– use 4:2:2 sampling where DV uses 4:1:1
DV and MPEG
• MPEG-1 originally meant for Video CD
– http://en.wikipedia.org/wiki/Mpeg
– never got very popular
– developed into a family of standards
• MPEG-4 rose from the ashes
– http://en.wikipedia.org/wiki/Mpeg-4
– used for iTunes video
A Computational Irony
• Digital has been touted as a way to create
exact copies while analog (VCR) cannot...
– Analog VCR suffers from generational loss
– Digital doesn’t suffer from generational loss
• BUT only if you use video compression that
is... LOSSLESS
• AND... you guessed it, a lossless video
compression technique is not used because
the lossless ones don’t compress enough
Lossless compression
• Original -> compression routine -> compressed original
• compressed original -> decompress routine -> exact duplicate of the Original
Lossy compression
• Original -> compression routine -> compressed original
• compressed original -> decompress routine -> Changed Original
• Repeat with the Changed Original and you get Changed Original 2
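The difference between the two round trips can be demonstrated in a few lines. This is only a sketch: zlib is a general-purpose lossless compressor, not a video codec, and the rounding function is a made-up stand-in for whatever a real lossy codec throws away.

```python
import zlib

def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with zlib (a lossless method)."""
    return zlib.decompress(zlib.compress(data))

def lossy_round_trip(samples, step=16):
    """Stand-in for a lossy codec: round each 0-255 sample to a multiple of step."""
    return [min(255, (s + step // 2) // step * step) for s in samples]

original = [0, 7, 8, 100, 255]
print(lossless_round_trip(bytes(original)) == bytes(original))  # True
print(lossy_round_trip(original))  # [0, 0, 16, 96, 255]
```

The lossless round trip returns exactly what went in. The lossy one returns something close to the original but not equal to it, and the discarded detail cannot be recovered later.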
The Moral?
• In production, if several people are
working on the same bit of video, make
sure that they all get uncompressed
video to work with.
• Only produce the compressed version
after all the work is complete.
Coping with Video Size
• Aside from screen size and frame rate...
• Consider human vision limitations
– Use algebra to compute part of the signal
– Chrominance sub-sampling
• Compression - two versions
– spatial
– temporal
Coping with Video Size
• Spatial compression
• Individual images can be compressed using the
techniques discussed in the bitmapped section
• Doesn’t result in very much compression for
video
• Doesn’t take into consideration the other
frames that come before or after it
Coping with Video Size
• Aside from screen size and frame rate...
• Consider human vision limitations
– Use algebra to compute part of the signal
– Chrominance sub-sampling
• Compression - two versions
– spatial
– temporal
Temporal Compression 1
• Use the difference between two frames
– naive approach can result in good
compression
– works well for a small amount of movement
– A Tarantino film? not so much...
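A minimal sketch of the differencing idea, using tiny grayscale frames stored as lists of rows. The frame contents and function names are invented for illustration.

```python
def frame_difference(previous, current):
    """Pixel-by-pixel difference between two equally sized grayscale frames."""
    return [[c - p for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]

def apply_difference(previous, difference):
    """Rebuild the current frame from the previous frame plus the difference."""
    return [[p + d for p, d in zip(prev_row, diff_row)]
            for prev_row, diff_row in zip(previous, difference)]

frame1 = [[10, 10, 10, 10],
          [10, 50, 10, 10],
          [10, 10, 10, 10]]
frame2 = [[10, 10, 10, 10],
          [10, 10, 50, 10],
          [10, 10, 10, 10]]

diff = frame_difference(frame1, frame2)
changed = sum(1 for row in diff for d in row if d != 0)
print(changed, "of", len(diff) * len(diff[0]), "pixels changed")  # 2 of 12 pixels changed
print(apply_difference(frame1, diff) == frame2)  # True
```

When little moves, the difference is mostly zeros and compresses well. When most of the picture changes between frames, the difference is as big as the frame itself.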
Temporal Compression 2
• When an object moves
– compute its trajectory vector
– fill in the resulting exposed background
– BUT there’s a problem...
– why isn’t this an easy thing to do?
Temporal Compression 2
• Bitmapped images do not have defined
objects... that’s Vector graphics...
• What to do?
Temporal Compression 2
• Define blocks of 16 x 16 pixels
– called a macroblock
• Compute all possible movements of the block
within a short range
• Compute a vector to define that movement
• Store the movement vectors
• Compress the vectors
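The macroblock steps above can be sketched as a brute-force search. The matching measure (sum of absolute differences), the function names and the small test frames are assumptions for this example. Real codecs use 16 x 16 blocks and smarter search strategies.

```python
def sad(frame_a, frame_b, x1, y1, x2, y2, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(frame_a[y1 + y][x1 + x] - frame_b[y2 + y][x2 + x])
               for y in range(size) for x in range(size))

def find_motion_vector(previous, current, bx, by, size=16, search=2):
    """Try every offset within +/- search pixels and return the best (dx, dy).

    The block at (bx, by) in `current` is compared with the block at
    (bx + dx, by + dy) in `previous`.
    """
    height, width = len(previous), len(previous[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = bx + dx, by + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(current, previous, bx, by, px, py, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2]

def make_frame(square_x, square_y, width=8, height=8, size=4):
    """Black frame with one bright square whose top-left corner is given."""
    return [[100 if square_x <= x < square_x + size
             and square_y <= y < square_y + size else 0
             for x in range(width)] for y in range(height)]

previous = make_frame(1, 2)   # square starts at x=1
current = make_frame(3, 2)    # square has moved 2 pixels to the right
print(find_motion_vector(previous, current, 3, 2, size=4, search=2))  # (-2, 0)
```

The vector (-2, 0) says "this block's pixels can be found 2 pixels to the left in the previous frame", so only the vector needs to be stored instead of the block's pixels.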
More on
Temporal Compression
• Need some place to start from
• Can be forward or backward prediction
• Called KeyFrames
– pick a keyframe
– compute next image from that
– compute next image from that
– What happens when the scene completely changes?
• Pick a new key frame...
• But HOW?
• Requires powerful AI
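Choosing keyframes well is hard, as the slide says. A much cruder heuristic, assumed here purely for illustration and not taken from the slides, is to start a new keyframe whenever most of the pixels differ from the previous frame. The threshold and the tiny frames are made up.

```python
def changed_fraction(previous, current, tolerance=0):
    """Fraction of pixels whose value differs by more than tolerance."""
    total = len(current) * len(current[0])
    changed = sum(1 for prev_row, cur_row in zip(previous, current)
                  for p, c in zip(prev_row, cur_row) if abs(c - p) > tolerance)
    return changed / total

def choose_keyframes(frames, threshold=0.5):
    """Return the indices of frames stored as keyframes (frame 0 always is)."""
    keyframes = [0]
    for i in range(1, len(frames)):
        if changed_fraction(frames[i - 1], frames[i]) > threshold:
            keyframes.append(i)
    return keyframes

scene_a = [[10, 10], [10, 10]]
scene_a2 = [[10, 10], [10, 12]]          # same scene, one pixel changed
scene_b = [[200, 180], [190, 210]]       # completely different scene
print(choose_keyframes([scene_a, scene_a2, scene_b, scene_b]))  # [0, 2]
```

A rule this simple is easily fooled, for example by a fast pan or a flash of light, which is part of why the problem is harder than it looks.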
Video Compression
What does this?
• Coder/Decoder - Codec
– http://en.wikipedia.org/wiki/Video_codec
• encodes and decodes video
– can be symmetric
• it takes as long to compress as decompress
– can be asymmetric
• it takes longer to compress than it does to decompress,
or longer to decompress than to compress
A final worry...
• We have been talking about making
video smaller
• There are a variety of techniques to do
this
• Which to choose?
– It is a tradeoff between how much a technique
compresses and its computational complexity
MPEG-4
• Designed for streams that contain
video, still images, animation, textures
and 3-D models
• Contains methods to divide scenes into
arbitrarily shaped video objects
• The idea is that each object has an
optimal compression technique
• BUT...
MPEG-4
• Dividing a scene into arbitrarily shaped video objects
is non-trivial
– so they drop back to the rectangular object position
• Quicktime and DivX use the rectangular video object
idea
• Forward inter-frame compression
• Backward inter-frame compression
• Using the simpler technique reduces the
computational complexity allowing it to be
implemented on small devices like portable video
players
Other codecs
• Cinepak, Intel Indeo & Sorenson
• All use “vector” quantization
– divides frame into rectangular blocks
– these blocks are called “vectors” but they don’t represent
movement or direction
• Codec uses a collection of these “vectors” called a code book
– contains typical patterns seen in the frames
• textures, patterns, sharp and soft edges
– compares the “vectors” in the frame to the ones in the code book
– if it is close, it uses the code book entry
– (does this explain the patchwork painting of the screen when
the digital signal goes bad?)
“Vector” quantization
• a frame contains indices into the code book
• reconstructs the image from “vectors” in the code
book
– makes decompression very straightforward and efficient
– this makes the implementation of a player very easy
• What about the compression?
“Vector” quantization
• this is an asymmetric codec
• compression takes ~150 times longer
than decompression
• Cinepak and Intel Indeo use temporal
compression and simple differencing
• Sorenson uses motion compensation
similar to the MPEG-4 standard
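The code book idea from the “vector” quantization slides can be sketched as below. The three-entry code book, the 4-value blocks and the squared-difference measure are all invented for illustration. A real codec presumably also has to build a good code book, which this sketch skips.

```python
def distance(block_a, block_b):
    """Squared difference between two blocks of equal length."""
    return sum((a - b) ** 2 for a, b in zip(block_a, block_b))

def encode(blocks, code_book):
    """Compression: search the whole code book for each block (slow)."""
    return [min(range(len(code_book)),
                key=lambda i: distance(block, code_book[i]))
            for block in blocks]

def decode(indices, code_book):
    """Decompression: one table lookup per block (fast)."""
    return [code_book[i] for i in indices]

# Hypothetical code book: flat dark, flat bright, and a striped pattern
code_book = [(0, 0, 0, 0), (255, 255, 255, 255), (0, 255, 0, 255)]
blocks = [(3, 1, 0, 2), (250, 251, 255, 249), (10, 240, 5, 250)]

indices = encode(blocks, code_book)
print(indices)                     # [0, 1, 2]
print(decode(indices, code_book))  # the closest code book entries, not the originals
```

Encoding compares every block against every entry, while decoding is a single lookup per block. That imbalance is what makes the codec asymmetric.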
So... How do codecs vary?
• compression and decompression complexity
– affects the artifacts that are created
– affects the time required to carry them out
– affects the volume of the data stream created
– affects the type and expense of the equipment used
– affects whether or not it can be implemented in
hardware or software
Comparison
Bear in mind that this comparison is not absolute and
will vary from frame to frame but in general...
• MPEG-4
– detail is good (at the sacrifice of speed)
• DV
– detail is good but the file is the biggest
• Sorenson
– loss of detail (see pg 218-219)
• Cinepak
– loss of detail
– smallest file
A word about QuickTime
• All standards so far have defined the data stream... not the file
format
• QT is the de facto standard, designed as a component-based
architectural framework
– allows plugins (components) to be developed by others
• Every QT movie has a “time base”
– records playback speed and current position relative to a time coordinate
system to allow them to be replayed at the right speed on any system
– It keeps the visual synched with the audio
– If the playback speed of the device is not fast enough, QT drops frames
keeping audio synchronization
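Frame dropping falls out naturally if the frame to display is chosen from the movie's clock rather than by counting frames drawn so far. The sketch below is a simplified illustration of that idea, not QuickTime's actual implementation. The function name and timings are hypothetical.

```python
def frame_to_show(elapsed_ms, frames_per_second):
    """Index of the frame that belongs on screen at the given movie time."""
    return elapsed_ms * frames_per_second // 1000

# Hypothetical slow device: it can only redraw every 100 ms for a 30 fps movie
shown = [frame_to_show(ms, 30) for ms in range(0, 500, 100)]
print(shown)  # [0, 3, 6, 9, 12]
```

Frames 1, 2, 4, 5 and so on are never drawn, but the picture stays in step with the audio because both follow the same clock.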
More about QuickTime
Plugins make it flexible so that it can
accommodate new file formats
– comes with a standard set of plugins (components)
• compressor components include
– MPEG-4, Sorenson and Cinepak
• movie controller interface provides uniformity
• transcoder components exist to convert one format to
another format
– supports true streaming and progressive download
Questions?