Lecture 24



Reminders: Programming Project 5 (reassigned
Homework Problem 5.48) due today.
Homework 7 due next Tuesday.
Questions?
Thursday, November 17
CS 475 Networks - Lecture 24
1
Outline
Chapter 7 - End-to-End Data
7.1 Presentation Formatting
7.2 Multimedia Data
7.3 Summary
Data Compression
Data compression can allow us to send data at a
faster rate than seemingly supported by the
network. Compression can, for example, allow us
to send a 10 Mbps video stream over a 1 Mbps
link.
Compression algorithms can be categorized as
lossy or lossless. Greater compression ratios can
be achieved with lossy algorithms. The lost data
may not be noticeable in images or audio or video
streams.
Data Compression
Compression and decompression are often time
consuming. Greater throughput can only be
achieved if
x / Bc + x / (r Bn) < x / Bn
where x is the amount of uncompressed data, r is
the compression ratio (so the compressed data
occupies x/r), Bc is the bandwidth at which data
can be run through the compressor/decompressor,
and Bn is the network bandwidth.
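As a quick sanity check, the inequality can be evaluated numerically. The bandwidth figures below are illustrative assumptions, not values from the lecture.

```python
# Worked check of the throughput condition x/Bc + x/(r*Bn) < x/Bn.

def compression_pays_off(Bc, Bn, r):
    """True if compressing before sending beats sending raw.

    Bc: compressor/decompressor bandwidth, Bn: network bandwidth,
    r: compression ratio (compressed data is 1/r the original size).
    The data size x cancels out, so only the rates matter."""
    return 1 / Bc + 1 / (r * Bn) < 1 / Bn

# A fast compressor (100 Mbps) with a 2:1 ratio over a 10 Mbps link wins:
print(compression_pays_off(Bc=100, Bn=10, r=2))   # True
# A slow compressor (5 Mbps) loses even with a better ratio:
print(compression_pays_off(Bc=5, Bn=10, r=4))     # False
```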
Lossless Compression
Run Length Encoding (RLE)
In RLE, consecutive occurrences of a symbol are
replaced with a copy of the symbol and a count.
The string AAABBCDDDD would be encoded as
3A2B1C4D (8 symbols instead of 10, an 80%
compression ratio).
RLE works well on scanned text images, where
8-to-1 ratios can be achieved. It is used to transmit
faxes. If there is not a lot of adjacent identical
data, however, RLE can actually increase the size
of a file.
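A minimal sketch of an RLE encoder (count emitted before the symbol, matching the 3A2B1C4D example above):

```python
def rle_encode(s):
    """Run-length encode a string: each run of identical
    symbols becomes its length followed by the symbol."""
    out = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:   # find the end of the run
            j += 1
        out.append(str(j - i) + s[i])
        i = j
    return "".join(out)

print(rle_encode("AAABBCDDDD"))  # 3A2B1C4D
```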
Lossless Compression
Differential Pulse Code Modulation
In DPCM, a reference symbol is output and then
the difference between each symbol and the
reference is output.
The string AAABBCDDDD would be output as
A0001123333. Differences between 0 and 3 can
be encoded using only 2 bits, so the compression
ratio is 35% (28 bits instead of 80). When the
difference becomes too large, a new reference
symbol is output.
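A sketch of the encoder described above. The 2-bit difference field (maximum difference of 3) is taken from the slide; the handling of an out-of-range difference (emit a fresh reference symbol) is one plausible interpretation.

```python
def dpcm_encode(s):
    """DPCM sketch: emit a reference symbol, then each symbol's
    difference from that reference. When the difference no longer
    fits the assumed 2-bit field (0..3), a new reference is emitted."""
    ref = s[0]
    out = [ref]
    for ch in s:
        d = ord(ch) - ord(ref)
        if not 0 <= d <= 3:        # difference too large: new reference
            ref = ch
            out.append(ref)
            d = 0
        out.append(str(d))
    return "".join(out)

print(dpcm_encode("AAABBCDDDD"))  # A0001123333
```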
Lossless Compression
Differential Pulse Code Modulation
A variation gives delta encoding where each
symbol is encoded as the difference from the
previous one.
AAABBCDDDD would become A001011000.
Encoding each difference using only one bit gives
a compression ratio of 21% (17 bits instead of 80).
Delta encoding may be followed by RLE, since
there may be long strings of 0s. Delta encoding
works well when adjacent pixels in an image are
similar.
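Delta encoding is even simpler to sketch:

```python
def delta_encode(s):
    """Delta encoding: the first symbol verbatim, then each
    symbol's difference from its predecessor."""
    out = [s[0]]
    for prev, cur in zip(s, s[1:]):
        out.append(str(ord(cur) - ord(prev)))
    return "".join(out)

print(delta_encode("AAABBCDDDD"))  # A001011000
```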
Lossless Compression
Dictionary Methods
The UNIX compress command uses a variation
of the Lempel-Ziv (LZ) algorithm, which is a
dictionary-based method.
With this approach a group of symbols (a word) is
replaced by its index into a dictionary. LZ builds
the dictionary adaptively; because the decoder
can rebuild the same dictionary as it decodes, the
dictionary itself need not be transmitted with the
compressed data.
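A convenient way to watch a dictionary-style method at work is Python's zlib module. Note it implements DEFLATE (an LZ77 variant combined with Huffman coding), not the LZW variant used by compress; it serves here only as an illustration of the idea.

```python
import zlib

# Highly repetitive input, which dictionary methods handle well.
data = b"the quick brown fox " * 100
packed = zlib.compress(data)
print(len(data), len(packed))     # the packed form is far smaller

# Decompression recovers the original exactly (lossless).
assert zlib.decompress(packed) == data
```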
Image Compression (GIF)
The Graphics Interchange Format (GIF) uses
8-bit pixel values that index into a palette of up
to 256 24-bit colors. The image is compressed
using LZ, with common sequences of pixels
making up the dictionary.
Patent restrictions on the LZ algorithm spurred
development in 1995 of the Portable Network
Graphics (PNG) format as a replacement format
for images on the web. The patent expired in
2003.
Image Compression (JPEG)
The JPEG (Joint Photographic Experts Group)
image format uses a lossy compression algorithm
that offers better ratios than GIF or PNG. JPEG is
related to MPEG, which is a video data format.
JPEG compression occurs in three phases. The
image is fed through the phases one 8 x 8 block
at a time.
Image Compression (JPEG)
DCT Phase
To simplify the discussion assume that we are
working only with 8-bit grayscale images where 0
is white and 255 is black.
In the discrete cosine transform (DCT) phase an
8 x 8 matrix of pixel values is converted to an 8 x 8
matrix of frequency coefficients. Low frequency
coefficients define gross image features while
high frequency coefficients correspond to the fine
details.
Image Compression (JPEG)
DCT Phase
The DCT and its inverse are defined by

DCT(i, j) = (1/√(2N)) C(i) C(j) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} pixel(x, y) cos[(2x+1)iπ / 2N] cos[(2y+1)jπ / 2N]

pixel(x, y) = (1/√(2N)) Σ_{i=0}^{N−1} Σ_{j=0}^{N−1} C(i) C(j) DCT(i, j) cos[(2x+1)iπ / 2N] cos[(2y+1)jπ / 2N]

where C(i) = 1/√2 if i = 0, and C(i) = 1 if i > 0.
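A naive implementation of the forward transform, computed directly from the definition (O(N^4), which is acceptable for 8 x 8 blocks):

```python
import math

def C(i):
    """Scale factor from the DCT definition."""
    return 1 / math.sqrt(2) if i == 0 else 1.0

def dct2(pixel, N=8):
    """Forward 2-D DCT of an N x N block, straight from the
    formula on the slide (no fast-transform tricks)."""
    out = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (pixel[x][y]
                          * math.cos((2 * x + 1) * i * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * j * math.pi / (2 * N)))
            out[i][j] = C(i) * C(j) * s / math.sqrt(2 * N)
    return out

# A uniform block puts all its energy in the DC coefficient;
# the higher-frequency (AC) coefficients come out (essentially) zero.
flat = [[128.0] * 8 for _ in range(8)]
coeffs = dct2(flat)
print(coeffs[0][0], abs(coeffs[0][1]) < 1e-9)
```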
Image Compression (JPEG)
DCT Phase
The frequency coefficient at (0, 0) is the DC
coefficient; it is (up to a scale factor) the average
of the 64 pixel values. The other coefficients
correspond to higher frequencies and to finer and
finer detail.
The DCT is lossless. The image can be exactly
reconstructed from the coefficients using the
inverse transform. It is in the Quantization Phase
where loss is introduced.
Image Compression (JPEG)
Quantization Phase
To see how quantization works, imagine
quantizing numbers less than 100 with a quantum
of 10, keeping only the tens digit. For example,
the numbers 45, 98, 23, 66 and 7 would become
4, 9, 2, 6 and 0. These numbers can be encoded
using only 4 bits instead of 7.
The quantization operation is equivalent to
dividing each of the original numbers by the
quantum (10) using integer division.
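The truncation example above, in code:

```python
quantum = 10
values = [45, 98, 23, 66, 7]
quantized = [v // quantum for v in values]  # integer division by the quantum
print(quantized)  # [4, 9, 2, 6, 0]
```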
Image Compression (JPEG)
Quantization Phase
Instead of using the same quantum value for all
64 frequency coefficients, JPEG uses quantization
tables. An example table is:

Quantum =
[  3   5   7   9  11  13  15  17 ]
[  5   7   9  11  13  15  17  19 ]
[  7   9  11  13  15  17  19  21 ]
[  9  11  13  15  17  19  21  23 ]
[ 11  13  15  17  19  21  23  25 ]
[ 13  15  17  19  21  23  25  27 ]
[ 15  17  19  21  23  25  27  29 ]
[ 17  19  21  23  25  27  29  31 ]
Image Compression (JPEG)
Quantization Phase
The quantized frequency coefficient values are
computed using:

QuantizedValue(i, j) = IntegerRound(DCT(i, j) / Quantum(i, j))
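A sketch of the quantization step, using the example table from the earlier slide (whose entries follow the pattern 3 + 2(i + j)):

```python
def quantize(dct, quantum):
    """Divide each coefficient by its table entry, rounding to the
    nearest integer (one reading of the slide's IntegerRound;
    Python's round uses round-half-to-even for exact ties)."""
    N = len(dct)
    return [[round(dct[i][j] / quantum[i][j]) for j in range(N)]
            for i in range(N)]

# Example quantization table from the earlier slide.
quantum = [[3 + 2 * (i + j) for j in range(8)] for i in range(8)]

# Larger quanta at high frequencies crush the fine-detail coefficients.
coeffs = [[12.0] * 8 for _ in range(8)]
print(quantize(coeffs, quantum)[0][:4])  # [4, 2, 2, 1]
```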
Image Compression (JPEG)
Encoding Phase
A variant of RLE is used to compress an 8 x 8
block along a zigzag path. The individual
coefficients are encoded using a Huffman code
(short codes are assigned to the most frequently
occurring coefficients).
[Figure: zigzag traversal of the quantized frequency coefficients for RLE.]
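The zigzag order itself is easy to generate by walking the anti-diagonals of the block, alternating direction:

```python
def zigzag_order(N=8):
    """Indices of an N x N block along anti-diagonals, alternating
    direction, as in JPEG's zigzag scan."""
    order = []
    for d in range(2 * N - 1):                       # one anti-diagonal at a time
        cells = [(i, d - i) for i in range(N) if 0 <= d - i < N]
        order.extend(cells if d % 2 else reversed(cells))
    return order

print(zigzag_order(3))
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (1, 2), (2, 1), (2, 2)]
```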
Image Compression (JPEG)
Color Images
A color image can be thought of as being made
up of separate red, green and blue images. This
is known as the RGB representation. JPEG
typically uses the YUV representation (one
luminance component and two chrominance
components).
Each of the Y, U, and V images is processed as
described for a grayscale image.
JPEG is capable of compressing 24-bit images by
an approximately 30-to-1 ratio.
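For illustration, one common RGB-to-YUV conversion (the ITU-R BT.601 weights; the slide does not give the coefficients, so these specific values are an assumption):

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using the BT.601 weights:
    Y is luminance, U and V carry the chrominance."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

# Pure white carries only luminance: Y ~ 255, U and V ~ 0.
print(rgb_to_yuv(255, 255, 255))
```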
Video Compression (MPEG)
Video can be thought of as a succession of still
images or frames. Individual frames can be
compressed using a DCT-based technique as
with JPEG.
By additionally removing the redundant
information that is present in successive frames
compression ratios on the order of 150-to-1 can
be achieved.
Video Compression (MPEG)
Frame Types
MPEG takes a video stream and produces
intrapicture (I), predicted picture (P) and
bidirectional predicted picture (B) frames.
Video Compression (MPEG)
Frame Types
I frames can be considered to be JPEG
compressed frames; they are self-contained. P
and B frames specify differences relative to
reference frames: a P frame relative to the
previous reference, a B frame relative to both the
previous and the following reference.
[Figure: MPEG operates on 16 x 16 macroblocks; the U and V components are downsampled to 8 x 8 blocks.]
Video Compression (MPEG)
Frame Types
A macroblock in a B frame is represented with a
4-tuple: (1) coordinates (x, y) for the macroblock,
(2) a motion vector (xp, yp) relative to the previous
reference frame, (3) a motion vector (xf, yf)
relative to the following reference frame, and (4) a
delta (δ) for each pixel in the macroblock. The
pixel value in the current frame is computed using:

Fc(x, y) = (Fp(x + xp, y + yp) + Fn(x + xf, y + yf)) / 2 + δ(x, y)
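The averaging formula can be sketched directly; the tiny frames and values below are made up purely for illustration:

```python
def b_frame_pixel(Fp, Fn, x, y, xp, yp, xf, yf, delta):
    """Reconstruct one B-frame pixel by averaging the
    motion-compensated pixels from the previous (Fp) and
    next (Fn) reference frames, then adding the delta."""
    return (Fp[x + xp][y + yp] + Fn[x + xf][y + yf]) / 2 + delta

# Tiny 2 x 2 reference frames with made-up pixel values.
Fp = [[10, 20], [30, 40]]
Fn = [[12, 22], [32, 42]]
print(b_frame_pixel(Fp, Fn, 0, 0, 1, 0, 1, 1, 3))  # (30 + 42)/2 + 3 = 39.0
```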
Transmitting MPEG over a Network
The MPEG specification defines the format for a
video stream but does not specify how the stream
is broken into network packets.
An MPEG main profile is a nested structure. At
the top level the video stream is broken into
groups of pictures (GOP) separated by a SeqHdr.
The SeqHdr contains a quantization matrix for the
I frame and one for the B and P frames.
Transmitting MPEG over a Network
[Figure: the nested structure of an MPEG main profile video stream.]
Transmitting MPEG over a Network
A GOP consists of pictures (I, B, and P frames),
and pictures are broken into slices (regions of the
picture). A slice consists of a sequence of
macroblocks. A macroblock is made up of six
8 x 8 blocks (one each for the U and V
components and four for the Y component, which
is 16 x 16).
MPEG allows the frame rate, resolution, mix of
frame types, and quantization to change between
GOPs, allowing picture quality to be traded for
network bandwidth adaptively.
Transmitting MPEG over a Network
By using UDP and breaking the stream at
selected points (macroblock boundaries), we can
limit the loss in picture quality due to a lost
network packet. This is an example of
application-level framing.
The video encoding method may depend on
latency as well as bandwidth. For interactive
videoconferencing (where low latency is
desirable) an encoding using only I and P frames
(I P P P P I) may be preferable to one also using
B frames (I B B B B P B B B B I).
Audio Compression (MP3)
CD quality sound requires sampling at 44.1 kHz
using 16-bit samples. A stereo stream implies a
bit rate of
2 x 44100 x 16 = 1.41 Mbps
Synchronization and error correction require
sending 49 bits for every 16-bit sample, resulting
in
49/16 x 1.41 Mbps = 4.32 Mbps
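The bit-rate arithmetic above, reproduced in code:

```python
# CD-quality stereo audio: 2 channels, 44.1 kHz, 16-bit samples.
channels, rate_hz, bits = 2, 44100, 16
raw = channels * rate_hz * bits
print(raw)  # 1411200 bits/s, i.e. about 1.41 Mbps

# Synchronization and error correction expand each 16-bit sample to 49 bits.
framed = raw * 49 / 16
print(round(framed / 1e6, 2))  # 4.32 (Mbps)
```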
Audio Compression (MP3)
MPEG also defines audio compression formats,
the most popular of which is Layer III (MP3); it
reduces the bandwidth requirement to 128 kbps.
MP3 splits audio into frequency subbands. The
subbands are broken into blocks, transformed
(DCT), quantized and Huffman encoded (all
similar to video compression).
The audio frames are interleaved with video
frames to form a complete MPEG stream.
In-class Exercises
To be submitted as part of Homework 8.

Problems 7.25 and 7.26 on page 629.
Some notes:
- You can create image files in plain (P2 type) PGM format
  using a text editor. See the PGM man page for the
  description.
- cjpeg produces JPEG files from PGM files.
- djpeg produces RAW PBM files from JPEG files. They
  can be converted to plain format using pnmtoplainpnm
  or (depending on which version of the PBM utilities you
  have installed) pnmtopnm -plain