Chapter 10
Video
Multimedia Systems
Key Points
• The display of moving pictures depends on persistence of vision.
• Uncompressed video requires 26 MB/sec (NTSC) or 31 MB/sec (PAL).
• Digitization may be performed in the camera (e.g. DV) or using a capture card attached to a computer.
• NTSC, PAL and SECAM are analogue video standards. All three use interlaced fields.

Key Points
• CCIR 601 is a standard for digital video. It uses Y'CBCR colour with 4:2:2 chrominance sub-sampling. The data rate is 166 Mbits per sec.
• Video compression can make use of spatial (intra-frame) and temporal (inter-frame) compression. Spatial compression is still-image compression applied to individual frames. Temporal compression is based on frame differences and key frames.
• Motion JPEG applies JPEG compression to each frame. It is usually performed in hardware.
• Cinepak, Intel Indeo and Sorenson are popular software codecs used in multimedia. They are based on vector quantization.

Key Points
• MPEG video is an elaborate codec that combines DCT-based compression of key frames (I-pictures) with forward and backward prediction of intermediate frames (P-pictures and B-pictures) using motion compensation.
• QuickTime is a component-based multimedia architecture providing cross-platform support for video, and incorporating many codecs. It has its own file format that is widely used for distributing video in multimedia.
• Digital video editing is non-linear (like film editing).

Key Points
• Most digital post-production tasks are applications of image manipulation operations to the individual frames of a clip.
• For delivery using current technology, it may be necessary to sacrifice frame size, frame rate, colour depth, and image quality.
• Streamed video is played as soon as it arrives without being stored on disk, so it allows for live transmission and 'video on demand'.

Moving Pictures

All current moving pictures depend on the following phenomena:
– Persistence of vision
• A lag in the eye's response results in 'after-images'
– Fusion frequency
• If a sequence of still images is presented above this frequency, we experience a continuous visual sensation
• It depends on the brightness of the image relative to the viewing environment
• Below this frequency, a flickering effect is perceived
Generate Moving Pictures

Video
– Use video camera to capture a sequence of
frames

Animation
– Generate each frame individually either by
computer or by other means
Digital Video

• A video sequence consists of a number of frames
• Each frame is a single image produced by digitizing the time-varying signal generated by a video camera

Digital Video

Think about the size of uncompressed digital video
– NTSC video format
• Bitmapped images for video frames
– 640 × 480 pixels with 24-bit colour = 0.9 MB/frame
• 30 frames per second
– 0.9 MB/frame × 30 frames/sec = 26 MB/sec
• 60 seconds per minute
– 26 MB/sec × 60 secs/minute = 1,600 MB/minute
This strains current processing, storage and data transmission!
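The arithmetic above can be checked with a few lines of Python (a sketch; the megabyte figures come out slightly below the rounded slide values):

```python
# Back-of-the-envelope check of the uncompressed NTSC figures above.
WIDTH, HEIGHT = 640, 480          # NTSC frame digitized with square pixels
BYTES_PER_PIXEL = 3               # 24-bit colour
FPS = 30                          # nominal NTSC frame rate

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL   # bytes per frame
rate_per_sec = frame_bytes * FPS                 # bytes per second
rate_per_min = rate_per_sec * 60                 # bytes per minute

MB = 1024 * 1024
print(f"{frame_bytes / MB:.2f} MB/frame")   # 0.88 MB/frame
print(f"{rate_per_sec / MB:.1f} MB/s")      # 26.4 MB/s
print(f"{rate_per_min / MB:.0f} MB/min")    # 1582 MB/min
```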
Create Digital Video

Get analog/digital video signal from
– video camera
– video tape recorder (VTR)
– broadcast signal

Digitize analog video & compress it
Digitizing Analog Video

In computer
– Video capture card
 Convert analog to digital & compress
 Can also decompress & convert digital to analog
– Compress through
 Video capture card (hardware codec)
 Software (software codec)
Digitizing Analog Video

In camera
– Digitize and compress using circuitry inside the camera
– Transfer the digitized signal from camera to computer through
• IEEE 1394 interface (FireWire): 400 Mb/sec
• USB: 12 Mb/sec (version 1.1) to 480 Mb/sec (version 2.0)
Digitize in Computer vs. Camera

Digitize in camera
– Advantage
 Digital signals are resistant to corruption when
transmitted down cables and stored on tape
– Disadvantage
• The user has no control over the trade-off between picture quality and data rate (file size)
Analog Video Standards

Worldwide standards:

Standard | Frame/Field Rate (per sec) | Scan Lines (active) | Aspect Ratio | Horizontal/Vertical Frequency | Region
NTSC     | 29.97 / 59.94              | 525 (480)           | 4:3          | 15.734 kHz / 60 Hz            | US, Japan, Taiwan, …
PAL      | 25 / 50                    | 625 (576)           | 4:3          | 15.625 kHz / 50 Hz            | Western Europe, …
SECAM    | 25 / 50                    | 625 (576)           | 4:3          | 15.625 kHz / 50 Hz            | France, …
Display Video on TV
[Figures: cross-section of a CRT; delta-delta shadow-mask CRT (scanned from "Computer Graphics: Principles and Practice")]
Field and Interlace

Transmitting many entire pictures in a second
requires a lot of bandwidth

Field
– Divide each frame into two fields
 One consisting of the odd-numbered lines of each frame,
the other of the even lines

Interlace
– Each frame is built up by interlacing the fields
 PAL
– 50 fields/sec => 25 frames/sec
 NTSC
– 59.94 fields/sec => 29.97 frames/sec
Display Video on Computer

Progressive scanning
– Write all lines of each frame to frame buffer
– Refresh whole screen from frame buffer at
high rate
Display Video
[Figure: a TV displays a stream of interlaced fields (frame i, i+1, i+2, …), while a computer monitor displays whole progressive frames]
Field & Interlace Artifacts
[Figures: a video clip of a flash of light on the water surface — the odd lines of frame i, the even lines of frame i+1, and the two combined, as when analog video is shown on a progressive display]
Prevent Interlace Artifacts

• Average the two fields to construct a single frame
• Discard half the fields and interpolate the remaining fields to construct full frames
• Convert each field into a single frame (reduces the frame rate, but much better!)
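The three de-interlacing approaches can be sketched with NumPy (toy greyscale frames, rows = scan lines; function names are illustrative, not from any library):

```python
import numpy as np

def weave_average(frame_a, frame_b):
    """Average two temporally adjacent frames to soften comb artifacts."""
    return (frame_a.astype(np.float32) + frame_b) / 2

def bob(frame, keep_odd=True):
    """Discard one field and line-double the other to rebuild a full frame."""
    field = frame[1::2] if keep_odd else frame[0::2]
    return np.repeat(field, 2, axis=0)          # each field line used twice

def fields_to_frames(frame):
    """Turn each field into its own frame (halves vertical resolution)."""
    return frame[0::2], frame[1::2]

f = np.arange(16, dtype=np.float32).reshape(8, 2)
print(bob(f).shape)            # (8, 2): full height restored from one field
```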
Types of Analog Video

• Component video
– Three components: Y (luminance), U and V (colour)
– Often used in production and post-production
• Composite video
– Combines the three components into a single signal
– The colour components (U and V) are allocated half the bandwidth of the luminance (Y)
– Often used in transmission
• S-video
– Separates the luminance from the two colour components (two signals in total)
Digital Video Standards

CCIR 601 (Rec. ITU-R BT.601)
– specifies the image format, and coding for
digital television signals
Parameter                             | Value
YUV encoding                          | 4:2:2
Sampling frequency for Y (MHz)        | 13.5
Sampling frequency for U and V (MHz)  | 6.75
No. of samples per line               | 720
No. of levels for Y component         | 220
No. of levels for U, V components     | 225
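The 166 Mbit/s data rate quoted in the key points follows from these parameters. A quick check in Python, assuming the 625-line variant (720 × 576 active samples, 25 frames/sec, 8 bits/sample, 4:2:2 giving two samples per pixel on average):

```python
# Checking the CCIR 601 data rate of roughly 166 Mbit/s quoted earlier.
ACTIVE_W, ACTIVE_H, FPS = 720, 576, 25
BITS_PER_SAMPLE = 8
SAMPLES_PER_PIXEL = 2          # 1 Y + 0.5 Cb + 0.5 Cr under 4:2:2

bits_per_sec = ACTIVE_W * ACTIVE_H * FPS * SAMPLES_PER_PIXEL * BITS_PER_SAMPLE
print(f"{bits_per_sec / 1e6:.0f} Mbit/s")   # 166 Mbit/s
```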
Perplexing NTSC System
[Figure: analog-to-digital conversion of 480 scan lines — digitizing 640 samples per line gives square pixels, while the CCIR 601 standard's 720 samples per line gives non-square pixels]
CCIR 601 Sampling
[Figure: positions of Y samples and CB/CR samples under 4:2:2 sampling (co-sited), 4:2:0 sampling (not co-sited), and 4:1:1 sampling (co-sited)]
Compression & Data Stream Standards

Sampling produces a digital representation
of a video signal

This must be compressed and then
formed into a data stream for transmission

Further standards are needed to specify
the compression algorithm and the format
of the data stream
Compression & Data Stream Standards

• DV standard
– For semi-professional use & news-gathering
• MPEG-2 standard
– For consumer use
– Organized into different profiles and levels
• The most common combination is Main Profile at Main Level (MP@ML)
• Used for digital television broadcasts & DVD video
Introduction to Video Compression

To fit consumers' hardware, video data needs to be compressed twice
– First during capture
– Then again when it is prepared for distribution
Video Compression

Digital video compression algorithms
operate on a sequence of bit-mapped
images
– Spatial compression (intra-frame)
 Compress each individual image in isolation
– Temporal compression (inter-frame)
 Store the differences between sub-sequences of
frames
Spatial Compression

• The compression methods are similar to image compression
– Lossless
• No information loss
• Compression ratios are lower
– Lossy
• Some information loss
• Compression ratios are higher
• Why recompressing video is unavoidable
– The compressors used for capture are not suitable for multimedia delivery
– For post-production
Temporal Compression

• Key frames
– Certain frames in a sequence are designated as key frames
• Difference frames
– Each of the frames between the key frames is replaced by a difference frame
– A difference frame records only the differences between frames
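A minimal sketch of the key-frame/difference-frame idea in Python (pixel-wise subtraction only; real codecs add motion compensation and entropy coding):

```python
import numpy as np

def encode(frames, key_interval=5):
    """Store key frames in full, and only differences for frames in between."""
    encoded, prev = [], None
    for i, f in enumerate(frames):
        if i % key_interval == 0:
            encoded.append(("key", f.copy()))        # stored in full
        else:
            encoded.append(("diff", f - prev))       # differences only
        prev = f
    return encoded

def decode(encoded):
    """Rebuild each frame from the last key frame plus accumulated diffs."""
    frames, prev = [], None
    for kind, data in encoded:
        prev = data if kind == "key" else prev + data
        frames.append(prev.copy())
    return frames

clip = [np.full((4, 4), i, dtype=np.int16) for i in range(10)]
assert all((a == b).all() for a, b in zip(clip, decode(encode(clip))))
```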
Time Required for Compression & Decompression

• Symmetrical
– Compression & decompression of a piece of video take the same time
• Asymmetrical
– Compression & decompression of a piece of video do not take the same time
– Compression generally takes longer
Motion JPEG (MJPEG)

A popular approach to compressing video
during capture

Applying JPEG compression to each frame
(No temporal compression)

Therefore it is called “Motion JPEG”
DV

Compression based on DCT transform

Perform temporal compression (motion
compensation) between two fields of each
frame

Quality is varied dynamically to maintain
constant data rate
MJPEG vs. DV

       | Data Rate | Compression Method                          | Compression Ratio
MJPEG  | 24 Mb/sec | JPEG compression                            | 7:1 (mid-range capture card)
DV     | 25 Mb/sec | DCT-based compression, motion compensation  | 5:1 (4:1:1 sampling)
Software Codecs for Multimedia

Popular software codecs
– MPEG-1
– Cinepak
– Intel Indeo
– Sorenson
Vector Quantization

Encoder: group the source samples into vectors → find the closest code-vector in the code book → output its index
Decoder: look up each index in the code book → unblock the code-vectors to reconstruct the output
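The pipeline can be sketched in Python; the two-sample vectors and the 4-entry codebook here are made up for illustration:

```python
import numpy as np

# Toy vector quantizer: group samples into vectors, transmit only the
# index of the nearest code-vector, decode by codebook lookup.
codebook = np.array([[0, 0], [10, 10], [20, 20], [30, 30]], dtype=float)

def vq_encode(source):
    vectors = source.reshape(-1, 2)                       # group into vectors
    dists = np.linalg.norm(vectors[:, None] - codebook[None], axis=2)
    return dists.argmin(axis=1)                           # codebook indices

def vq_decode(indices):
    return codebook[indices].reshape(-1)                  # lookup + unblock

src = np.array([1.0, 0.0, 11.0, 9.0, 29.0, 31.0])
idx = vq_encode(src)
print(idx)               # [0 1 3]
print(vq_decode(idx))    # [ 0.  0. 10. 10. 30. 30.]
```

The reconstruction is lossy: every source vector is replaced by its nearest codebook entry, which is why compression ratios depend on codebook size.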
MPEG

• Stands for Moving Picture Experts Group (a joint committee of the ISO and the IEC)
• Works on standards for the coding of moving pictures and associated audio
MPEG Family

• MPEG-1
– Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mb/s
• MPEG-2
– Generic coding of moving pictures and associated audio
– For broadcasting & studio work
• MPEG-3
– No longer exists (has been merged into MPEG-2)
• MPEG-4
– Very low bit rate audio-visual (integrated multimedia) coding
MPEG Family

• MPEG-7
– Multimedia content description interface
• MPEG-21
– Vision statement
• To enable transparent & augmented use of multimedia resources across a wide range of networks and devices
– Objectives
• To understand how the elements fit together
• To identify new standards required if gaps in the infrastructure exist
• To accomplish the integration of different standards
MPEG–1 Standard

Defines a data stream syntax and a
decompressor, allowing manufacturers to
develop different compressors

MPEG-1 compression
– Temporal compression based on motion
compensation
– Spatial compression based on quantization &
coding of frequency coefficients produced by
a DCT of the data
MPEG-1 Objectives

• Medium-quality video (VHS-like)
• Bit rate < 1.5 Mb/s
– 1.15 Mb/s for video
– 350 kb/s for audio & additional data
• Asymmetrical application
– Store video & audio on CD-ROM
• Picture format: SIF (Source Input Format)
– 4:2:0 sub-sampled
– Frame size @ frame rate
• 352 × 288 @ 25 Hz
• 352 × 240 @ 30 Hz
Motion Compensation
[Figure: an object moving between frames defines an area of potential change]

• Divide each frame into macroblocks of 16 × 16 pixels
• Predict where the corresponding macroblock will be in the next frame
– Try all possible displacements within a limited range
– Choose the best match
• Construct a difference frame by subtracting each macroblock from its predicted counterpart
– Keep the motion vectors describing the predicted displacement of macroblocks between frames
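An exhaustive block-matching search can be sketched in Python (sum-of-absolute-differences matching; real encoders use faster search strategies, so this is illustrative only):

```python
import numpy as np

def best_motion_vector(ref, cur, bx, by, n=16, search_range=7):
    """For the n x n macroblock of `cur` at (bx, by), try every displacement
    within +/- search_range in `ref` and keep the lowest-SAD match."""
    block = cur[by:by + n, bx:bx + n].astype(np.int32)
    best, best_sad = (0, 0), np.inf
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + n > ref.shape[0] or x + n > ref.shape[1]:
                continue                      # displacement leaves the frame
            cand = ref[y:y + n, x:x + n].astype(np.int32)
            sad = np.abs(block - cand).sum()  # sum of absolute differences
            if sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best, best_sad

ref = np.zeros((64, 64), dtype=np.uint8)
ref[20:36, 20:36] = 255                      # bright square in the reference
cur = np.roll(ref, (2, 3), axis=(0, 1))      # same square shifted by (3, 2)
mv, sad = best_motion_vector(ref, cur, 23, 22)
print(mv, sad)                               # (-3, -2) 0
```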
Picture Types

• I (intra) pictures
– Coded without reference to other pictures
– Low compression rate
• P (predicted) pictures
– Coded using motion-compensated prediction from a past I or P picture
– Higher compression rate than I-pictures
• B (bidirectionally-predicted) pictures
– Coded using bidirectional interpolation between the I or P pictures that precede & follow them
– Highest compression rate
All are compressed using the MPEG version of JPEG compression
Group of Pictures (GOP)

An MPEG sequence in display order:
I01 B02 B03 P04 B05 B06  I11 B12 B13 P14 B15 B16  I21 …

The same sequence in bitstream order (decode order):
I01 P04 B02 B03  I11 B05 B06 P14 B12 B13  I21 B15 B16 …
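The reordering rule — send each I or P anchor before the B-pictures that depend on it — can be sketched in Python:

```python
def to_bitstream_order(display):
    """Every B-picture needs the *following* I or P picture as a reference,
    so that anchor is transmitted first and the preceding Bs follow it."""
    out, pending_b = [], []
    for pic in display:                    # pic labels like "B02" or "I01"
        if pic[0] == "B":
            pending_b.append(pic)          # hold Bs until their anchor is sent
        else:                              # I or P anchor
            out.append(pic)
            out.extend(pending_b)
            pending_b = []
    return out + pending_b

display = ["I01", "B02", "B03", "P04", "B05", "B06",
           "I11", "B12", "B13", "P14", "B15", "B16", "I21"]
print(to_bitstream_order(display))
# ['I01', 'P04', 'B02', 'B03', 'I11', 'B05', 'B06',
#  'P14', 'B12', 'B13', 'I21', 'B15', 'B16']
```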
MPEG-1 Video Compression Techniques

• Motion compensation
• Frequency transform
• Variable-length coding
• Chroma subsampling
• Quantization
• Predictive coding
• Picture interpolation
QuickTime

• Apple, 1991
• Time base, non-linear editing
• Component-based architecture
– Compressor components
• Cinepak, Intel Indeo codecs
– Sequence grabber components
– Movie controller components
– Transcoders
• Translate data between different formats
– Video digitizer components
• Supports MPEG-1, DV, OMF, AVI, OpenDML
Digital Video Editing & Post-production

• Editing
• Compositing
• Reverse shot
– Conversation between two people
Film & Video Editing

• Traditional
• In point and out point
• Timecode
– SMPTE timecode
– Hours, minutes, seconds, frames
• VHS
– Two copying operations produce serious loss of quality
– Constructed linearly
Digital Video Editing

• Random access
• Non-destructive
• Premiere
– Three main windows
• Project, timeline, monitor
• Figs. 10.12-14
– Timelines
• Have several video tracks
– Transitions, Fig. 10.15
– Cuts and transitions
• In a cut, two clips are butted together
• In a transition, two clips overlap
– Image processing is required to construct transitional frames
Digital Video Post-production

• Problems: over- or under-exposure, out of focus, colour casts, digital artifacts
– Fix with image manipulation operations
• Adjust levels, sharpen, blur
– The same correction may be needed for every frame, so the levels can be set for the first frame and the adjustment applied to as many frames as the user specifies
– If the light fades during a sequence, it will be necessary to increase the brightness gradually to compensate
• Apply a suitable correction to selected frames and allow the values at intermediate frames to be interpolated
• Varying parameter values over time
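The interpolation scheme can be sketched in Python (linear interpolation of a correction parameter between key-frame values; the function name is illustrative):

```python
def interpolate_param(keyframes, n_frames):
    """keyframes: {frame_index: value}; returns one value per frame,
    linearly interpolated between key frames, clamped outside them."""
    ks = sorted(keyframes)
    values = []
    for f in range(n_frames):
        if f <= ks[0]:
            values.append(keyframes[ks[0]])
        elif f >= ks[-1]:
            values.append(keyframes[ks[-1]])
        else:
            lo = max(k for k in ks if k <= f)
            hi = next(k for k in ks if k >= f)
            if hi == lo:                       # f is itself a key frame
                values.append(keyframes[lo])
            else:
                t = (f - lo) / (hi - lo)
                values.append((1 - t) * keyframes[lo] + t * keyframes[hi])
    return values

# e.g. gradually brighten a fading 100-frame sequence
brightness = interpolate_param({0: 1.0, 100: 1.5}, 101)
print(brightness[0], brightness[50], brightness[100])   # 1.0 1.25 1.5
```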
Keying

• Selecting transparent areas
• Blue screening
– Chroma keying: any colour
– Alpha channel
– Luma keying: a brightness threshold is used to determine which areas are transparent
• Select explicitly
– Create a mask
– In film and video, a mask is called a matte
– Matte out: removing unwanted elements
– Split-screen effects
– Alpha channel created in another application

Track Matte

• Chroma keying and luma keying
– Colour and brightness change between frames
– Use a sequence of masks as the matte
• Separate video track: track matte
– Track matte
• Painstaking to make by hand
• Can be generated from a single still image by applying simple geometrical transformations over time to create a varying sequence of mattes
Adobe After Effects

• Apply a filter to a clip and vary it over time
• A wide range of controls for the filter's parameters
• Premiere: parameter values are interpolated linearly between key frames
• After Effects: interpolation can use Bezier curves
Preparing Video for Multimedia Delivery

• May need to sacrifice frame size, frame rate, colour depth, image quality
• People sit close to monitors, so a large picture is not necessary
• Higher frame rates are needed to eliminate flicker only if the display is refreshed at the same rate
– Computer monitors are refreshed at a much higher rate from VRAM
• Limiting colours
– Not all codecs support this
Streamed Video & Video Conferencing

• Streamed video
– Delivering a video data stream from a remote server, to be displayed as it arrives
– As opposed to downloading an entire file to disk & playing it from there
– Opens up the possibility of delivering live video on computers
Streamed Video & Video Conferencing

• Video conferencing
– Streamed video isn't restricted to a single transmitter broadcasting to many consumers: any suitably equipped computer can act as both receiver & transmitter
– Users on several machines can communicate visually, taking part in what is usually called a video conference
[Figure: a single transmitter with multiple receivers vs. a conference where all computers are both receivers & transmitters]
Obstacles to Streamed Video

• Bandwidth
– SIF MPEG-1 video requires a bandwidth of 1.86 Mb/sec
– Decent-quality streamed video is restricted to LANs, T1 lines, ADSL & cable modems for now
• Delivery time over the network
– Data must be delivered with the minimum of delay
• Delay may cause independently delivered video & audio streams to lose synchronization
Conventional Delivery of WWW Video

• Embedded video
– Transfer the movie file from a server to the user's machine
– Playback from disk once the entire file has arrived
• Progressive download (HTTP streaming)
– Transfer the movie file to the user's disk
– Start playing as soon as enough of it has arrived
– The file usually remains on the user's disk after playback is completed
– Cannot be used for live video!
[Figure: download progress (0-100%) against time, with playback starting before the download completes]
True Streaming

• Each frame of the stream is played as soon as it arrives over the network
• The video file is never stored on the user's disk
– The length of a streamed movie is limited only by the storage at the server, not by the user's machine
• Suitable for live video & video on demand (VOD)
– The network must be able to deliver the data stream fast enough for playback
– The movie's data rate & quality are restricted to what the network can deliver
State of the Art

• Leading technologies on the Internet
– RealVideo (Real Networks)
– Streaming QuickTime (Apple)
– Media Player (Microsoft)
• Architecture
– RTSP (Real Time Streaming Protocol)
• Controls the playback of video streams
– Several compressed versions of a movie are provided
• The server chooses the appropriate one to fit the speed of the user's connection
Codecs for Video Conferencing

• H.261
– Designed for two-way telecommunication applications over ISDN
– A precursor of MPEG-1
• DCT-based compression with motion compensation
• Does not use B-pictures
• H.263
– Very low bit rate video
• H.263+
– An extension of H.263
H.261

• Real-time constraint
– Video conferencing cannot tolerate long delays without becoming disjointed
– Maximum delay: 150 ms (about 7 frames/sec)
• Bit rate: p × 64 kbps (p = 1 to 30)
• Picture formats
– CIF (Common Intermediate Format)
• Component (size): Y (352 × 288), Cb & Cr (176 × 144)
• Picture rate: 29.97 frames/sec
– QCIF (Quarter CIF)
• Component (size): Y (176 × 144), Cb & Cr (88 × 72)
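Rough arithmetic shows why heavy compression is unavoidable at these rates (assuming uncompressed CIF at 8 bits per sample, with Cb and Cr at quarter resolution as given above):

```python
# Uncompressed CIF vs the maximum H.261 channel of p x 64 kbps (p = 30).
CIF_Y = 352 * 288                  # luminance samples per frame
CIF_C = 2 * (176 * 144)            # Cb + Cr at quarter resolution
BITS = 8
FPS = 29.97

uncompressed = (CIF_Y + CIF_C) * BITS * FPS          # bits per second
max_channel = 30 * 64_000                            # p = 30
print(f"{uncompressed / 1e6:.1f} Mbit/s vs {max_channel / 1e6:.2f} Mbit/s")
print(f"compression needed: about {uncompressed / max_channel:.0f}:1")
```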
H.263

• Very low bit rate video (< 64 kbps)
• Primary target rate is about 27 kbps (V.34 modem)
• Compression techniques
– Chroma sub-sampling 4:2:0
– DCT compression with quantization
– Run-length and variable-length encoding of coefficients
– Motion compensation with forward & backward prediction
– A QCIF picture may be compressed at as low as 3.5 frames/sec