Chapter 7 End-to-End Data
Ohanes Dadian, Danny Luong, Yu Lu
Contents
7.1 Presentation Formatting
7.1.1 Taxonomy
7.1.2 Examples
7.1.3 Markup Languages
7.2 Multimedia Data
7.2.1 Lossless Compression Techniques
7.2.2 Image Representation and Compression
7.2.3 Video Compression
7.2.4 Transmitting MPEG over a Network
7.2.5 Audio Compression
Definition

Data
Presentation Format
  Agreement
  An important aspect
  Data manipulation function
Encoding
  What is it?
  Importance
  Argument marshalling
Decoding
  Argument unmarshalling
7.1 Presentation Formatting
Challenges of Presentation Formatting

Different representation in Floating-Point Numbers
  IEEE standard 754 format
  Nonstandard format
Different register sizes
  16-bit register
  32-bit register
  64-bit register
Different representation in Integers
  Big-Endian
  Little-Endian
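
The integer byte-order problem can be sketched in a few lines of Python. This is only an illustration: the value 0x12345678 is an arbitrary example, and the standard `struct` module stands in for two machines with different byte orders.

```python
import struct

value = 0x12345678  # arbitrary 32-bit example value (an assumption)

big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# A receiver that assumes the wrong byte order silently gets a different number.
misread = struct.unpack("<I", big)[0]
print(hex(misread))  # 0x78563412
```

Both machines hold the same four bytes; only the agreed interpretation makes them the same integer, which is why a presentation format is needed.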
7.1.1 Taxonomy

Data Types

Conversion Strategy

Tags

Stubs
Data Types

Lowest Level - Base Types
  Integers
  Floating-point numbers
  Characters
Medium Level – Flat Types
  Structures
  Arrays
Highest Level – Complex Types
  Those types that are built using pointers
Conversion Strategy

Canonical Intermediate Form
  Sender:
    internal representation → external representation
  Receiver:
    external representation → internal representation
Receiver-Makes-Right
  Sender:
    Does not convert; sends its internal representation directly
  Receiver:
    any representation → internal representation
Combined
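
The two strategies can be contrasted with a small sketch. Everything here is an assumption made for illustration: the canonical form is taken to be a big-endian 32-bit integer, and the one-byte `B`/`L` byte-order tag used for receiver-makes-right is a hypothetical message layout, not the format of any real protocol.

```python
import struct
import sys

def send_canonical(n: int) -> bytes:
    # Sender converts its internal form to the agreed external form.
    return struct.pack(">i", n)

def recv_canonical(msg: bytes) -> int:
    # Receiver converts the external form back to its internal form.
    return struct.unpack(">i", msg)[0]

def send_native(n: int) -> bytes:
    # Receiver-makes-right: the sender does not convert. It only states
    # which byte order it used (hypothetical 1-byte tag).
    tag = b"B" if sys.byteorder == "big" else b"L"
    return tag + struct.pack("=i", n)

def recv_makes_right(msg: bytes) -> int:
    # The receiver must be able to decode every sender representation.
    fmt = ">i" if msg[:1] == b"B" else "<i"
    return struct.unpack(fmt, msg[1:])[0]

print(recv_canonical(send_canonical(-42)))   # -42
print(recv_makes_right(send_native(-42)))    # -42
```

With the canonical form both sides may convert even when they share an architecture; with receiver-makes-right no conversion happens in that case, at the price of the receiver understanding every format.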
Conversion Strategy Debate

Canonical intermediate Form vs. Receiver-Makes-Right
Tags

Tagged Data
  A tag = additional information
  Helps the receiver decode the data
Untagged Data
  How does the receiver know how to decode?
  Think of any object-oriented language as an example
Stubs

On the Client Side
  Marshals the arguments into a message
On the Server Side
  Converts the message back into variables
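
As a rough sketch of what a pair of stubs does, consider a hypothetical remote procedure `add(a, b)`. The function names and the message layout (two big-endian 32-bit integers) are assumptions for illustration; real stubs are normally generated from an interface description.

```python
import struct

def client_stub_marshal(a: int, b: int) -> bytes:
    """Client side: marshal the arguments into a message."""
    return struct.pack(">ii", a, b)

def server_stub_unmarshal(message: bytes) -> tuple:
    """Server side: convert the message back into variables."""
    return struct.unpack(">ii", message)

def add(a: int, b: int) -> int:
    """The hypothetical remote procedure itself."""
    return a + b

message = client_stub_marshal(3, 4)    # what would travel over the network
a, b = server_stub_unmarshal(message)  # recovered on the server
result = add(a, b)
print(len(message), result)            # 8 7
```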
7.1.2 Examples

External Data Representation (XDR)
  Supports the entire C type system except function pointers
  Defines a canonical intermediate form
  Does not use tags
  Uses compiled stubs
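
A minimal sketch of XDR-style encoding, written by hand with `struct` rather than with any XDR library. It relies on three properties of XDR: integers are 32-bit big-endian, variable-length data is preceded by a 4-byte length, and byte strings are zero-padded to a multiple of 4 bytes. The helper names are illustrative.

```python
import struct

def xdr_int(n: int) -> bytes:
    # 32-bit signed integer, big-endian.
    return struct.pack(">i", n)

def xdr_string(s: str) -> bytes:
    # 4-byte length, the bytes, then zero padding to a 4-byte boundary.
    data = s.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding

def xdr_int_array(values) -> bytes:
    # Variable-length array: element count followed by the elements.
    return struct.pack(">I", len(values)) + b"".join(xdr_int(v) for v in values)

print(xdr_int(1).hex())                 # 00000001
print(xdr_string("abcde").hex())        # 000000056162636465000000
print(len(xdr_int_array([1, 2, 3])))    # 16
```

No type information appears in the output, so the receiver has to know in advance that it is reading, say, an integer followed by a string.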
7.1.2 Examples

Abstract Syntax Notation One (ASN.1)
  Supports the C type system except function pointers
  Defines a canonical intermediate form
  Uses type tags
  Uses either interpreted or compiled stubs
Representation
  <tag, length, value>
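
The <tag, length, value> triple can be illustrated with the Basic Encoding Rules (BER) form of an ASN.1 INTEGER, whose tag is 0x02. This is a sketch only: it handles the short length form (fewer than 128 value bytes) and no other types, and the function names are made up for the example.

```python
def ber_integer(n: int) -> bytes:
    # Smallest number of bytes that holds n in two's complement.
    bits = n.bit_length() if n >= 0 else (~n).bit_length()
    length = bits // 8 + 1
    if length > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    value = n.to_bytes(length, "big", signed=True)
    return bytes([0x02, length]) + value  # tag, length, value

def ber_decode_integer(data: bytes) -> int:
    tag, length = data[0], data[1]
    if tag != 0x02:
        raise ValueError("not an INTEGER")
    return int.from_bytes(data[2:2 + length], "big", signed=True)

print(ber_integer(5).hex())     # 020105
print(ber_integer(300).hex())   # 0202012c
print(ber_decode_integer(ber_integer(-129)))  # -129
```

Because every value carries its own tag and length, a receiver can walk through a message without knowing its layout in advance.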
7.1.2 Examples

Network Data Representation (NDR)
  Supports the C type system
  Uses receiver-makes-right
  Uses an architecture tag at the front of each message; individual data items are untagged
  Uses a compiler to generate the necessary stubs
Representation
7.1.3 Markup Languages (XML)

Extensible Markup Language – XML
  What is XML?
  Can be sent as data over the Internet
  Can be a configuration file used in frameworks
XML Schema Document – XSD
  Defines the structure of an XML document
Namespace
  Solves name clashes
  URI – Uniform Resource Identifier (detail in Chapter 9)
  xmlns:emp="http://www.example.com/employee"
  <emp:title>Head Bottle Washer</emp:title>
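
To see how the namespace prefix resolves, here is a small sketch using Python's standard `xml.etree.ElementTree`. The enclosing `emp:employee` element is an assumption added so that the fragment forms a complete document.

```python
import xml.etree.ElementTree as ET

doc = (
    '<emp:employee xmlns:emp="http://www.example.com/employee">'
    "<emp:title>Head Bottle Washer</emp:title>"
    "</emp:employee>"
)

root = ET.fromstring(doc)

# The prefix is only shorthand; the parser stores the full namespace URI.
print(root.tag)  # {http://www.example.com/employee}employee

ns = {"emp": "http://www.example.com/employee"}
title = root.find("emp:title", ns).text
print(title)     # Head Bottle Washer
```

Two documents may both define a `title` element without clashing, as long as each is qualified by a different namespace URI.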
XML--Example
XSD--Example
7.2.1 Lossless Compression Techniques

Data Compression
  What is it?
  Bandwidth vs Throughput
  When to compress?
Data Compression Categories
  Lossy vs Lossless
Lossless Compression Techniques
  Run Length Encoding
  Differential Pulse Code Modulation
  Dictionary-Based Methods
Data Compression

Data Compression: less data for the same message, which increases throughput
Bandwidth vs Throughput
  Bandwidth – Physical
  Throughput – Logical
When to compress?
  Time cost of compressing and decompressing
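
The time-cost question can be made concrete with a back-of-the-envelope check, following the reasoning in the referenced textbook: compression pays off only if compressing plus sending the smaller message takes less time than sending the original. The function and parameter names below are assumptions for illustration, and all rates are in bytes per second.

```python
def compression_pays_off(size_bytes, compress_rate, network_rate, ratio):
    """Return True if compress-then-send beats sending uncompressed.

    compress_rate: rate at which data passes through compressor and decompressor
    network_rate:  network bandwidth
    ratio:         compression ratio (2 means the output is half the size)
    """
    time_plain = size_bytes / network_rate
    time_compressed = size_bytes / compress_rate + size_bytes / (ratio * network_rate)
    return time_compressed < time_plain

# With a ratio of 2, compression must run at more than twice the network rate.
print(compression_pays_off(1_000_000, 3_000_000, 1_000_000, 2))  # True
print(compression_pays_off(1_000_000, 1_500_000, 1_000_000, 2))  # False
```

Rearranged, the condition is compress_rate > ratio / (ratio - 1) × network_rate, so on a fast network a slow compressor can make things worse.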
Data Compression Categories

Two different categories
  Lossy
  Lossless
Pros and Cons of each technique
  Lossless: exact reconstruction
  Lossy: higher compression ratio
Lossless Compression Techniques

Run Length Encoding

Differential Pulse Code Modulation

Dictionary-Based Method
Run Length Encoding (RLE)

One of the simplest methods of data compression
A lossless compression method
Consecutive identical data elements are saved as one element and a count
Simplified Example
  AAAAAAAAAABBBBBOOOOOOOOO
  10A5B9O
Real-world data is binary rather than text
Most useful on data that contains repeated elements
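
The simplified example can be reproduced with a short sketch. It assumes, as the example does, that the symbols themselves are never digits; a real encoder would work on bytes and use a fixed-width count.

```python
from itertools import groupby

def rle_encode(text: str) -> str:
    # Each run of identical symbols becomes "<count><symbol>".
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))

def rle_decode(encoded: str) -> str:
    out = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch          # counts may have several digits
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

original = "A" * 10 + "B" * 5 + "O" * 9
encoded = rle_encode(original)
print(encoded)                          # 10A5B9O
print(rle_decode(encoded) == original)  # True
```

On data without runs the output grows instead of shrinking ("ABC" becomes "1A1B1C"), which is why the method suits data with repeated elements.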
Differential Pulse Code Modulation (DPCM)

Another simple lossless compression algorithm
  Output a reference symbol
  Output the difference between each new symbol and the reference
Example
  AAABBCDDDD
  A0001123333
Works better on digital images
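
A minimal sketch of the example above, using character codes as the symbol values. The function names are illustrative, and the first symbol is taken as the reference.

```python
def dpcm_encode(symbols: str):
    # Output the reference symbol, then each symbol's difference from it.
    reference = symbols[0]
    deltas = [ord(s) - ord(reference) for s in symbols]
    return reference, deltas

def dpcm_decode(reference: str, deltas) -> str:
    return "".join(chr(ord(reference) + d) for d in deltas)

reference, deltas = dpcm_encode("AAABBCDDDD")
print(reference, deltas)               # A [0, 0, 0, 1, 1, 2, 3, 3, 3, 3]
print(dpcm_decode(reference, deltas))  # AAABBCDDDD
```

The gain comes from the differences being small: values in the range 0 to 3 fit in 2 bits each, whereas each original character needs 8.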
Dictionary-Based Method

The last lossless compression algorithm we discuss

The idea is to build a dictionary of common phrases

Replace common phrases with their index in the dictionary

The dictionary constructed during compression is sent along with the compressed data

A lot of research has gone into how to build an efficient dictionary
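
The idea can be shown with a toy sketch that treats each whitespace-separated word as a phrase. This is deliberately naive and is not the Lempel-Ziv family used in practice; it only shows the dictionary and the index stream travelling together.

```python
def dict_compress(text: str):
    dictionary = []   # index -> word, built while compressing
    index_of = {}     # word -> index
    codes = []
    for word in text.split(" "):
        if word not in index_of:
            index_of[word] = len(dictionary)
            dictionary.append(word)
        codes.append(index_of[word])
    return dictionary, codes  # both must be sent to the receiver

def dict_decompress(dictionary, codes) -> str:
    return " ".join(dictionary[c] for c in codes)

dictionary, codes = dict_compress("to be or not to be")
print(dictionary)   # ['to', 'be', 'or', 'not']
print(codes)        # [0, 1, 2, 3, 0, 1]
print(dict_decompress(dictionary, codes))  # to be or not to be
```

The savings depend on how often phrases repeat, since the dictionary itself counts against the compressed size.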
7.2.2 Image Representation and Compression

Focus on GIF & JPEG
  Differences between GIF and JPEG
JPEG Compression Phases
  DCT Phase
  Quantization Phase
  Encoding Phase
Difference between GIF and JPEG

Graphics Interchange Format (GIF)
  RGB color space
  8 bits each dimension = 24 bits total
  Instead of sending 24 bits, reduce to 8 bits first
  2^8 = 256 colors
  If the picture has more than 256 colors, they are cut down to 256
  Good compression ratio, but low quality
Difference between GIF and JPEG

Joint Photographic Experts Group (JPEG)
  Most widely used
  More suited to photographic images
  Transforms RGB color to the YUV color space (lossless)
  The Y component represents brightness (luminance)
  The U and V components represent color (chrominance)
  Human eyes have separate receptors for brightness and color
  By separating brightness and color, we can compress them separately
  Human eyes are less sensitive to color, so we can compress the U and V components more aggressively (lossy)
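
The color-space step can be sketched as follows. The slides do not give coefficients, so the ones below are an assumption: they are the widely used BT.601-style weights for luminance, with U and V as scaled differences from Y.

```python
def rgb_to_yuv(r: float, g: float, b: float):
    # Luminance is a weighted sum; the eye is most sensitive to green.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Chrominance: how far blue and red are from the luminance.
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

print(rgb_to_yuv(255, 255, 255))  # white: Y near 255, U and V near 0
print(rgb_to_yuv(0, 0, 0))        # black: all three are 0
print(rgb_to_yuv(255, 0, 0))      # red: positive V, negative U
```

Any gray pixel has U and V near zero, so all of its information sits in Y. This is what lets the chrominance planes be compressed harder than the luminance plane.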
GIF Compression Phases

No phases
One single step
  Lossless only when the image has at most 256 colors
  Each color is represented by 24 bits
  Reduce it down to 8 bits
  Only 256 colors possible
  If there are more than 256 colors, merge each to the closest available color
Usage
  Sharp-edged images (logos)
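
"Merge to the closest color" can be sketched as a nearest-neighbor lookup in a palette. The four-entry palette and the squared-distance measure in RGB space are assumptions for illustration; a real GIF palette holds up to 256 entries, and how those entries are chosen is a separate problem.

```python
def nearest_palette_index(pixel, palette):
    # Squared Euclidean distance in RGB space (one common choice).
    def dist2(color):
        return sum((a - b) ** 2 for a, b in zip(pixel, color))
    return min(range(len(palette)), key=lambda i: dist2(palette[i]))

# Hypothetical palette: black, white, red, blue.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]

print(nearest_palette_index((250, 10, 10), palette))     # 2 (red)
print(nearest_palette_index((200, 200, 210), palette))   # 1 (white)
```

Each 24-bit pixel is then stored as an 8-bit palette index. A logo with a few flat colors loses nothing, while a photograph with smooth gradients shows visible banding.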
JPEG Compression Phases

3 different phases
  DCT (lossless)
  Quantization (lossy)
  Encoding
JPEG Compression Phases

Discrete Cosine Transform (DCT) Phase
  Closely related to the Fast Fourier Transform (FFT)
  8x8 matrix of pixel values => 8x8 matrix of frequency coefficients
  Helps filter out the least important patterns
  Lossless
Formula
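
A minimal sketch of the DCT phase, assuming the standard two-dimensional DCT used by JPEG on an N x N block with N = 8:

$$DCT(i, j) = \frac{1}{\sqrt{2N}}\, C(i)\, C(j) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} pixel(x, y) \cos\frac{(2x+1)\, i\, \pi}{2N} \cos\frac{(2y+1)\, j\, \pi}{2N}$$

where C(k) = 1/√2 for k = 0 and 1 otherwise. The code below is a direct, unoptimized transcription of that sum.

```python
import math

N = 8  # JPEG works on 8x8 blocks

def dct_2d(block):
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * i * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * j * math.pi / (2 * N)))
            out[i][j] = c(i) * c(j) * total / math.sqrt(2 * N)
    return out

# A perfectly flat block has no spatial variation at all.
flat = [[10] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 80.0
```

For the flat block only the (0, 0) coefficient, the DC term, is non-zero; every higher-frequency coefficient is zero up to rounding error. Smooth image regions behave similarly, which is what the later phases exploit.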
JPEG Compression Phases

Quantization Phase
  The DCT phase only transforms the data into a form in which the less important information is easier to recognize
  The quantization phase drops the insignificant bits of the frequency coefficients
  Truncating information => lossy
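
Dropping insignificant bits can be sketched as dividing each coefficient by a per-position quantum and rounding. The 2x2 block and the quantum values are hypothetical and chosen for readability; JPEG uses 8x8 tables.

```python
def quantize(coeffs, quantum):
    # Larger quantum values discard more precision.
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, quantum)]

def dequantize(quantized, quantum):
    # The decoder can only multiply back; the remainder is gone.
    return [[v * q for v, q in zip(vrow, qrow)]
            for vrow, qrow in zip(quantized, quantum)]

coeffs = [[80.0, 13.0],
          [-7.0, 2.0]]
quantum = [[4, 8],
           [8, 16]]   # hypothetical table: coarser for higher frequencies

q = quantize(coeffs, quantum)
print(q)                        # [[20, 2], [-1, 0]]
print(dequantize(q, quantum))   # [[80, 16], [-8, 0]]
```

The reconstructed block differs from the input (13 became 16, 2 became 0), which is exactly where JPEG loses information. Small high-frequency coefficients collapse to zero and cost almost nothing to encode.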
JPEG Compression Phases

Encoding Phase
  Final phase
  Encodes the quantized frequency coefficients in a compact form
  Coefficients are processed in zigzag sequence
  Further compression can be applied, e.g. Run Length Encoding
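
The zigzag sequence can be generated by walking the anti-diagonals of the block and alternating direction. This is a sketch of the traversal order only, with positions given as (row, column); it does not perform the entropy coding that follows.

```python
def zigzag_order(n: int = 8):
    order = []
    for s in range(2 * n - 1):             # s = row + column of a diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)          # even diagonals run bottom-left to top-right
        order.extend((i, s - i) for i in rows)
    return order

order = zigzag_order(8)
print(order[:6])   # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(order[-1])   # (7, 7)
```

Reading coefficients in this order puts the low-frequency values first and groups the high-frequency ones, which are mostly zero after quantization, into long runs that run length encoding handles well.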
7.2.3 Video Compression

What is a video?
  A succession of frames, or images.
Frames can be compressed using DCT
  Like JPEG
Compression should take into consideration the redundancy between frames.
  Two successive frames likely share a lot of the same information, depending on how much the objects in the scene move.
MPEG

The MPEG format provides standards for multimedia compression.
  Includes compression, decompression, processing and coded representation.
  MPEG 1, MPEG 2, MPEG 4
Named after the group that invented it.
  Moving Picture Experts Group
  Established in 1988 and included experts from ISO and IEC.
How MPEG Works
Frame Types

MPEG compresses frames into three types.
  I frames, P frames and B frames.
I frames
  Self-contained: can be decoded without reference to other frames.
P frames
  Specifies the difference from the previous I frame.
B frames
  Interpolation between the previous and subsequent I or P frames.
Frame Types, contd.
[Figure: 8 x 8 macroblock with its Y, U and V components]
Effectiveness and Performance

MPEG video can be compressed with up to a 90:1 ratio.
Encoding/decoding can be performed in software or hardware.
  Hardware is most prominent for encoding.
  Software is most prominent for decoding.
7.2.4 Transmitting MPEG over a Network

MPEG defines a video stream.
  Complex but modular.
  Three variations.
    MPEG 1, MPEG 2 and MPEG 4.
    Focus on the Main Profile MPEG 2 stream.
      Does not dictate how the stream is stored.
      Used by DVD.
Nested Structure
MPEG Format
Transmitting MPEG over a Network, contd.

Allows for trading picture quality for bandwidth.
Need to packetize the stream.
  Not an issue with TCP.
  For UDP, streams are broken at selected points.
    Macroblock boundaries.
    Application Level Framing.
Need to deal with packet loss.
  Mark frames with drop probability.
Need to deal with application latency constraints.
MPEG Video Compression Algorithm
7.2.5 Audio Compression

MPEG provides an audio compression standard.
  MPEG-1/2 Audio Layer III (MP3).
Example:
  CD-quality audio at a 44.1 kHz sampling rate requires a 1.41 Mbps bitrate.
    4.32 Mbps with synchronization and error correction overhead.
  Some compression is needed for network transfer.
  MP3 compression solves this issue.
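
The two bit rates can be checked with simple arithmetic. The 16-bit sample size and two (stereo) channels are the usual CD parameters. The figure of 49 bits on the medium per 16-bit sample is an assumption taken from the referenced textbook's account of the synchronization and error-correction overhead.

```python
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2  # stereo

raw_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
print(raw_bps / 1e6)            # 1.4112  (about 1.41 Mbps)

ENCODED_BITS_PER_SAMPLE = 49    # assumption: 49 bits stored per 16-bit sample
with_overhead_bps = SAMPLE_RATE_HZ * ENCODED_BITS_PER_SAMPLE * CHANNELS
print(with_overhead_bps / 1e6)  # 4.3218  (about 4.32 Mbps)
```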
MP3 Compression Algorithm
Audio Compression, contd.

Optimization lies in the number of subbands used and how many bits are allocated for each subband.
  Governed by psychoacoustic models.
Why is this the case?
  Take the following for example.
    A male voice requires more bits in the low-frequency subbands.
    A female voice requires more bits in the high-frequency subbands.
  The allocation is adjusted by dynamically changing the quantization tables used for each subband.
Summary

Discussed how application data is encoded in network packets.
  Presentation Formatting.
  Compression.
    Lossy.
    Lossless.
    Provide high quality media.
Q&A
Thank You
References
Peterson, Larry L., and Bruce S. Davie. Computer Networks: A Systems Approach. 5th ed. Amsterdam: Morgan Kaufmann, 2012. Print.
Moving Picture Experts Group. Wikipedia. http://en.wikipedia.org/wiki/Mpeg (accessed February 19, 2013).
Extra Slides
Frame Types, contd.
B Frame Macroblock
  c: Coordinate
  a: Motion vector relative to previous reference frame
  b: Motion vector relative to subsequent reference frame
  d: Delta for each pixel in macroblock
Frame Types, contd.

Goal: Find the corresponding reference pixel using the past and future reference frames.
  coordinate(c): (x, y)
  coordinate(a): (x_p, y_p), relative to the past reference frame F_p
  coordinate(b): (x_f, y_f), relative to the future reference frame F_f

$$F_c(x, y) = \frac{F_p(x + x_p, y + y_p) + F_f(x + x_f, y + y_f)}{2} + d(x, y)$$
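
The reconstruction rule for a B-frame pixel can be sketched directly from the formula: average the two motion-compensated reference pixels and add the delta. The tiny 2x2 frames and the motion vectors below are made-up values, and frames are indexed as `frame[y][x]`.

```python
def reconstruct_pixel(past, future, x, y, past_mv, future_mv, delta):
    xp, yp = past_mv      # motion vector into the past reference frame
    xf, yf = future_mv    # motion vector into the future reference frame
    predicted = (past[y + yp][x + xp] + future[y + yf][x + xf]) / 2
    return predicted + delta

past = [[10, 20],
        [30, 40]]
future = [[50, 60],
          [70, 80]]

# Pixel (0, 0): the past vector points one column right, the future one row down.
value = reconstruct_pixel(past, future, 0, 0, (1, 0), (0, 1), delta=3)
print(value)  # 48.0, i.e. (20 + 70) / 2 + 3
```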
Other Video Encoding Standards

ITU-T has defined the H series of codecs for video, audio, control and multiplexing.
H.261, H.263
  First and second generation for video encoding.
  Targeted for lower speeds.
  Available in 64-kbps increments.
H.264
  New generation codec.
  Part of the MPEG-4 standard.
  One of the most popular codecs for HD video on web and mobile platforms.
    Used by Blu-ray, iTunes, YouTube, Flash, Silverlight and various HDTV broadcasts.