Data Compression - Ryerson University

Data Compression II
(Codecs and Container Formats)
7. Data Compression II - Copyright © Denis Hamelin - Ryerson University
Codecs


A codec is a device or program capable of
performing encoding and decoding on a digital
data stream or signal.
The word codec may be a combination of any of
the following: 'compressor-decompressor',
'coder-decoder', or 'compression/decompression
algorithm'.
Codecs (Usage)


Codecs encode a stream or signal for
transmission, storage or encryption and decode
it for viewing or editing. Codecs are often used
in videoconferencing and streaming media
applications.
An audio coder converts analogue audio
signals into digital signals for transmission or
storage. A receiving device then converts the
digital signals back to analogue using an audio
decoder, for playback.
Codecs



Most codecs are lossy: they discard some
information so that the compressed data can be
made much smaller in size.
There are also lossless codecs, but for most
purposes the slight increase in quality might not
be worth the increase in data size, which is
often considerable.
Codecs are often designed to emphasise certain
aspects of the media to be encoded (motion vs.
color for example).
Codec Compatibility


There are hundreds or even thousands of
codecs ranging from those downloadable for
free to ones costing hundreds of dollars or more.
This can create compatibility and obsolescence
issues.
By contrast, uncompressed PCM audio (44.1 kHz,
16-bit stereo, as represented on an audio CD or in
a .wav or .aiff file) offers more of a persistent
standard across multiple platforms and over
time.
Container Formats

Many multimedia data streams need to contain
both audio and video data, and often some form
of metadata that permits synchronisation of
audio and video. Each of these three streams
may be handled by different programs,
processes, or hardware; but for the multimedia
data stream to be useful in stored or transmitted
form, they must be encapsulated together in a
container format.
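
As a rough illustration of what "encapsulated together" means, the sketch below interleaves two timestamp-sorted packet lists into a single stream, which is the core job of a muxer. The packet layout `(timestamp_ms, stream_id, payload)` and the function name `mux` are assumptions made for this example; real container formats add headers, indexes and metadata on top of this.

```python
import heapq

def mux(video_packets, audio_packets):
    """Interleave two timestamp-sorted packet lists into one stream.

    Each packet is a hypothetical (timestamp_ms, stream_id, payload) tuple.
    Packets with equal timestamps keep video first, because heapq.merge
    prefers items from the earlier input when keys compare equal.
    """
    return list(heapq.merge(video_packets, audio_packets, key=lambda p: p[0]))

# 25 frames/s video (one frame every 40 ms) and audio packets every 23 ms.
video = [(0, "V", "frame0"), (40, "V", "frame1")]
audio = [(0, "A", "chunk0"), (23, "A", "chunk1"), (46, "A", "chunk2")]

for timestamp, stream, payload in mux(video, audio):
    print(timestamp, stream, payload)
```

A player reading the muxed stream in order always finds the audio and video that belong together close to each other, which is what makes synchronised playback practical.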
Container Formats


The widespread notion that AVI is a codec is
incorrect: AVI is a container format, which can
hold streams encoded with many different codecs.
There are other well known alternative
containers such as Ogg, ASF, QuickTime,
RealMedia, Matroska, AIFF, DivX, FLV and
MP4.
Popular Audio Codecs

Most popular audio codecs are lossy; FLAC, the last entry
in this list, is lossless.

Dolby Digital (AC3)

Digital Theater System Coherent Acoustics (DTS)

MP3 (MPEG-1 Audio Layer 3)

Advanced Audio Coding (AAC)

Vorbis

Free Lossless Audio Codec (FLAC)
Popular Video Codecs

The most popular video codecs are all lossy.

Cinepak

MPEG-1 Video (VCD)

MPEG-2 Video (DVD)

MPEG-4 ASP (includes Xvid, FFmpeg and DivX)

MPEG-4 AVC (x264, HD DVD, Blu-ray)

RealVideo

Windows Media Video (usually stored in the ASF container)
Dolby Digital (AC3)


Dolby Digital, or AC-3, is the common version containing up
to six discrete channels of sound, with five channels for
normal-range speakers (20 Hz – 20,000 Hz) (right front,
center, left front, right rear and left rear) and one channel
(20 Hz – 120 Hz) for the subwoofer driven low-frequency
effects. Mono and stereo modes are also supported. AC-3
supports audio sample rates up to 48 kHz.
Batman Returns was the first film to use Dolby Digital
technology when it premiered in theaters in Summer 1992.
Digital Theater System (DTS)


On the consumer level, DTS is the shorthand for the DTS
Coherent Acoustics codec, transportable through S/PDIF
(Sony/Philips Digital Interconnect Format) and used on DVDs,
CD-DAs, LDs and in wave files. This system is the consumer
version of the DTS standard, using a similar codec without
needing separate DTS CD-ROM media.
There are significant technical differences between
commercial/theatrical and home variants: the former is a
traditional ADPCM compression system called APTX-100, and
the latter is a sophisticated hybrid perceptual and
signal-redundancy compressor based on ADPCM.
MP3


MP3 uses a lossy compression algorithm that is designed
to greatly reduce the amount of data required to represent
audio recordings, yet still sound like faithful reproductions
of the original uncompressed audio to most listeners. An
MP3 digital file created using the mid-range bitrate setting
of 128 kbit/s results in a file that is typically about 1/10th the
size of the CD file created from the same audio source.
LAME is a popular open source MP3 encoder.
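
The "about 1/10th" figure can be checked with simple arithmetic. The sketch below assumes the CD parameters quoted earlier in these slides (44.1 kHz, 16 bit, stereo) and ignores container and tag overhead; the function names are illustrative only.

```python
# CD audio parameters as given in the slides: 44.1 kHz, 16 bit, stereo.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

def pcm_bitrate(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def size_bytes(bitrate_bps, seconds):
    """Size of a constant-bitrate stream, ignoring any file overhead."""
    return bitrate_bps * seconds // 8

cd_bps = pcm_bitrate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
mp3_bps = 128_000            # the mid-range MP3 setting from the slide
ratio = cd_bps / mp3_bps     # how many times smaller the MP3 is

print(f"CD bitrate:  {cd_bps} bit/s")
print(f"MP3 bitrate: {mp3_bps} bit/s")
print(f"Ratio:       {ratio:.1f} to 1")
print(f"4-minute track: {size_bytes(cd_bps, 240)} bytes as PCM, "
      f"{size_bytes(mp3_bps, 240)} bytes as MP3")
```

The ratio comes out at roughly 11 to 1, which is consistent with the "about 1/10th" rule of thumb.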
MP3


MP3 provides a representation of sound within a short-term
time/frequency analysis window, by using psychoacoustic
models to discard or reduce precision of components less
audible to human hearing, and recording the remaining
information in an efficient manner. These techniques are
called "Perceptual Coding".
Other techniques such as Huffman Coding (lossless),
variable bitrate encoding and joint stereo encoding are part
of the MP3 codec.
Variable BitRate (VBR) Encoding

Variable BitRate encoding is designed to optimize size and
quality together. Where the music is silent or otherwise
undemanding to encode, it makes sense to drop the bit rate,
because there is little to encode and extra bits would be
wasted. Where the full orchestra and loud percussion join in,
the encoder chooses a higher bitrate appropriate to the
demands. Some parts of the music can be encoded at 128 kbps
(kilobits per second) without audible quality loss, while other
parts get the full 320 kbps to make the best of it.
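
The idea of picking a bitrate per frame can be sketched as follows. This is a toy heuristic, not how a real encoder such as LAME decides: real encoders use a psychoacoustic model, whereas this sketch assumes that frame loudness (RMS) is a usable stand-in for "how demanding" a frame is. The function names and the loudness-to-bitrate mapping are assumptions; only the list of bitrates comes from the MPEG-1 Layer III format.

```python
import math

# Bitrates (kbit/s) that an MPEG-1 Layer III frame may use.
BITRATES_KBPS = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]

def frame_rms(samples):
    """Root-mean-square level of one frame of samples in the range -1..1."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def pick_bitrate(samples):
    """Hypothetical rule: map frame loudness linearly onto the bitrate table."""
    level = min(frame_rms(samples), 1.0)
    index = round(level * (len(BITRATES_KBPS) - 1))
    return BITRATES_KBPS[index]

silence = [0.0] * 1152            # one frame of digital silence
loud = [1.0, -1.0] * 576          # one frame of a full-scale square wave

print(pick_bitrate(silence), "kbit/s for silence")
print(pick_bitrate(loud), "kbit/s for a full-scale signal")
```

The silent frame gets the lowest bitrate and the full-scale frame the highest, which mirrors the behaviour described above.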
Joint Stereo Encoding



Joint stereo is a method to save some bandwidth by encoding
in mono (i.e. only once) those parts of the spectrum for which
the human ear has little directional hearing. These are very
low and very high tones.
The bandwidth is saved by recording a wider sum channel and
a narrower difference channel, where the difference channel
does not contain these spectral components.
This works very well and produces excellent quality at 128
kbit/s for most pieces of music (but not all!).
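
The sum and difference channels mentioned above can be sketched with a simple mid/side transform. This shows only the lossless sum/difference step; the band-limiting of the difference channel, which is where the saving comes from, is not modelled here. Function names are illustrative.

```python
def to_mid_side(left, right):
    """Convert left/right samples to a sum (mid) and difference (side) channel."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Rebuild left/right samples from the mid and side channels."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

left = [4.0, 2.0, -6.0]
right = [2.0, 2.0, -2.0]

mid, side = to_mid_side(left, right)
print("mid: ", mid)
print("side:", side)
print("back:", from_mid_side(mid, side))
```

When the two channels are similar, the side channel stays close to zero and is cheap to encode; where left and right are identical it vanishes entirely.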
Advanced Audio Coding (AAC)


Advanced Audio Coding (AAC) is a standardized, lossy
compression and encoding scheme for digital audio. Designed
to be the successor of the MP3 format, AAC generally
achieves better sound quality than MP3 at the same bitrate,
particularly below 192 kbit/s.
AAC's best known use is as the default audio format of Apple's
iPhone, iPod, iTunes, and the format used for all iTunes Store
audio (with extensions for proprietary digital rights
management).
Advanced Audio Coding (AAC)




AAC is a wideband audio coding algorithm that exploits two
primary coding strategies to dramatically reduce the amount of
data needed to represent high-quality digital audio.
1. Signal components that are perceptually irrelevant are
discarded;
2. Redundancies in the coded audio signal are eliminated.
To apply these strategies, the signal is processed by a modified
discrete cosine transform (MDCT) whose block length is chosen
according to the signal's complexity. The overlapping MDCT
windows help avoid artifacts at block boundaries.
Vorbis Audio Codec


Vorbis is a free and open source, lossy audio codec project
headed by the Xiph.Org Foundation and intended to serve as
a replacement for MP3. It is most commonly used in
conjunction with the Ogg container and is therefore called Ogg
Vorbis.
For many applications, Vorbis has clear advantages over other
lossy audio codecs in that it is patent-free and has free and
open-source implementations and therefore is free to use,
implement, or modify as one sees fit, yet produces smaller
files than most other codecs at equivalent or higher quality.
Free Lossless Audio Codec



FLAC is an audio format using lossless compression. It is the
most popular lossless audio codec.
FLAC achieves compression rates of 30–50% for most music,
with significantly greater compression for voice recordings.
FLAC uses linear prediction to convert the audio samples to a
series of small, uncorrelated numbers (known as the residual),
which are stored efficiently using Golomb-Rice coding
(quotient/remainder pairs using a divisor that is a power of
two: 2, 4, 8, ...). Run-length encoding is used for blocks of
identical samples, such as silent passages.
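
The prediction and Golomb-Rice steps can be sketched as below. This is a minimal illustration under stated assumptions: the predictor is the simplest possible one (each sample is predicted to equal the previous one), and the bit layout (unary quotient written as ones ended by a zero, then the remainder) is chosen for readability and is not claimed to match FLAC's actual bitstream. All function names are illustrative.

```python
def residuals(samples):
    """First-order prediction: predict each sample as the previous one."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def zigzag(n):
    """Map signed residuals to non-negative integers: 0,-1,1,-2,2 -> 0,1,2,3,4."""
    return 2 * n if n >= 0 else -2 * n - 1

def rice_encode(value, k):
    """Encode a non-negative integer with divisor 2**k.

    The quotient is written in unary (ones, ended by a zero),
    followed by the remainder in exactly k binary digits.
    """
    quotient = value >> k
    remainder = value & ((1 << k) - 1)
    remainder_bits = format(remainder, "b").zfill(k) if k else ""
    return "1" * quotient + "0" + remainder_bits

def rice_decode(bits, k):
    """Inverse of rice_encode for a single code word."""
    quotient = bits.index("0")
    remainder = int(bits[quotient + 1:quotient + 1 + k], 2) if k else 0
    return (quotient << k) | remainder

samples = [100, 102, 105, 104]
res = residuals(samples)
print("residuals:", res)
print("codes (k=2):", [rice_encode(zigzag(r), 2) for r in res[1:]])
```

Small residuals produce short code words, which is why good prediction translates directly into a smaller file.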
Cinepak Video Codec

Cinepak is a video codec developed by SuperMatch, a division
of SuperMac Technologies, and released in 1992 as part of
Apple Computer's QuickTime video suite. It was designed to
encode 320x240 resolution video at 1x (150 kbyte/s) CD-ROM
transfer rates. The codec was ported to the Windows platform
in 1993. It was also used on first-generation and some
second-generation CD-ROM game consoles, such as the Atari
Jaguar CD, Sega CD, Sega Saturn, and 3DO.
Cinepak Video Codec

Cinepak is based on vector quantization, which is a
significantly different algorithm from the discrete cosine
transform (DCT) algorithm used by most current codecs (in
particular the MPEG family, as well as JPEG). This permitted
implementation on relatively slow CPUs, but tended to result in
blocky artifacts at low bitrates.
Cinepak Video Codec

Cinepak divides a movie into key (intra-coded) images and
inter-coded images. Each image is divided into a number of horizontal
bands which have individual 256-color palettes transferred in
the key images. Each band is subdivided into 4x4 pixel blocks.
The compressor uses vector quantization to determine the one
or two band palette colors which best match each block and
encodes runs of blocks as either one color byte or two color
bytes plus a 16-bit vector which determines which pixel gets
which color.
MPEG-1 Video Codec

MPEG-1 video was originally designed with a goal of
achieving acceptable video quality at around 1.5 Mbit/s data
rates and 352x240 pixels (29.97 frames per second) / 352x288
pixels (25 frames per second) resolution. While MPEG-1
applications are often low resolution and low bitrate, the
standard allows any resolution up to 4095x4095. One big
disadvantage of MPEG-1 video is that it supports only
progressive pictures. This deficiency helped prompt
development of the more advanced MPEG-2.
MPEG-1 Video Codec


MPEG-1 starts with a relatively low resolution video sequence
of about 352 by 240 pixels at 30 frames/s, but original high
(CD) quality audio. The images are in color, but converted to
YUV space, and the two chrominance channels (U and V) are
decimated further to 176 by 120 pixels.
The basic scheme is to predict motion from frame to frame in
the temporal direction, and then to use DCTs to organize the
redundancy in the spatial directions. The DCTs are done on
8x8 blocks, and the motion prediction is done in the luminance
(Y) channel on 16x16 blocks.
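
Two of the steps above, chroma decimation and the 8x8 DCT followed by quantization, can be sketched in plain Python. This is a minimal sketch under stated assumptions: decimation is done by averaging each 2x2 group, and quantization uses a single uniform step size rather than the quantization matrices a real encoder would apply. Function names are illustrative.

```python
import math

N = 8  # DCT block size used by MPEG-1

def decimate_2x2(plane):
    """Halve a chroma plane in both directions by averaging 2x2 groups."""
    return [
        [(plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, len(plane[0]), 2)]
        for y in range(0, len(plane), 2)
    ]

def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of an 8x8 block (direct formula)."""
    def scale(u):
        return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round."""
    return [[round(c / step) for c in row] for row in coeffs]

# A perfectly flat block: all energy ends up in the single DC coefficient.
flat = [[128] * N for _ in range(N)]
quantized = quantize(dct_2d(flat), 16)
print("DC coefficient:", quantized[0][0])
print("non-zero coefficients:", sum(1 for row in quantized for c in row if c))

chroma = [[0] * 352 for _ in range(240)]
small = decimate_2x2(chroma)
print("decimated size:", len(small[0]), "by", len(small))
```

Smooth image areas behave like the flat block here: after the DCT and quantization almost every coefficient is zero, which is what makes the later entropy coding effective.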
MPEG-1 Video Codec

In other words, given the 16x16 block in the current frame that
you are trying to code, you look for a close match to that block
in a previous or future frame. The DCT coefficients (of either
the actual data, or the difference between this block and the
close match) are quantized. Hopefully, many of the coefficients
will then end up being zero. The results of all of this, which
include the DCT coefficients, the motion vectors, and the
quantization parameters (among other data), are Huffman coded
using fixed tables.
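
The search for a "close match" can be sketched as an exhaustive block-matching search. This is a minimal sketch, assuming the sum of absolute differences (SAD) as the matching cost and a small square search window; the slides do not say which cost or search strategy an encoder uses, and real encoders use faster search patterns. Function names and the default search range are illustrative.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block in cur and a block in ref."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )

def best_motion_vector(cur, ref, cx, cy, size=16, search=4):
    """Try every offset within +/- search pixels and keep the cheapest match.

    Returns (cost, dx, dy), where (dx, dy) points from the block position in
    the current frame to the best matching block in the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(cur, ref, cx, cy, rx, ry, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best

# Reference frame with a unique value per pixel, and a current frame whose
# content is the reference shifted by 1 pixel in x and 2 pixels in y.
W = H = 32
ref = [[y * W + x for x in range(W)] for y in range(H)]
cur = [[ref[min(y + 2, H - 1)][min(x + 1, W - 1)] for x in range(W)] for y in range(H)]

print(best_motion_vector(cur, ref, 8, 8))
```

A cost of zero means the block is predicted perfectly, so only the motion vector needs to be stored; otherwise the difference block goes through the DCT and quantization described above.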
MPEG-1 Video Codec


There are three types of coded frames. There are I or intra
frames. They are simply a frame coded as a still image, not
using any past history. You have to start somewhere.
Then there are P or predicted frames. They are predicted from
the most recently reconstructed I or P frame. Each macroblock
in a P frame can either come with a vector and difference DCT
coefficients for a close match in the last I or P, or it can just be
"intra" coded (like in the I frames) if there was no good match.
MPEG-1 Video Codec

Lastly, there are B or bidirectional frames. They are predicted
from the closest two I or P frames, one in the past and one in
the future. You search for matching blocks in those frames,
and try three different things to see which works best. You try
using the forward vector, the backward vector, and you try
averaging the two blocks from the future and past frames, and
subtracting that from the block being coded. If none of those
work well, you can intra-code the block. B frames complicate
encoding and decoding because the image sequence must be
transmitted/stored out of order so that the future frame is
available to generate the B frames.
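
The out-of-order storage can be illustrated by reordering a sequence of frame types from display order into coding order. This is a minimal sketch: it only moves each reference frame (I or P) ahead of the B frames that depend on it, and the function name is illustrative.

```python
def coding_order(display_order):
    """Reorder frames so each B frame follows both of its reference frames.

    display_order is a string of frame types such as "IBBPBBP".
    Returns a list of (display_index, frame_type) in transmission order.
    """
    out, pending_b = [], []
    for index, kind in enumerate(display_order):
        if kind == "B":
            pending_b.append((index, kind))   # wait for the next I or P frame
        else:
            out.append((index, kind))         # reference frame goes first
            out.extend(pending_b)             # then the B frames it unlocks
            pending_b = []
    out.extend(pending_b)                     # trailing B frames, if any
    return out

order = coding_order("IBBPBBP")
print(" ".join(f"{kind}{index}" for index, kind in order))
```

For the display sequence I0 B1 B2 P3 B4 B5 P6, the stream is stored as I0 P3 B1 B2 P6 B4 B5, so the decoder always has the past and future reference frames before it meets a B frame.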
MPEG-2 Video Codec

MPEG-2 is widely used as the format of digital television
signals that are broadcast by terrestrial (over-the-air),
cable, and direct broadcast satellite TV systems. It also
specifies the format of movies and other programs that
are distributed on DVD and similar disks. As such, TV
stations, TV receivers, DVD players, and other equipment
are often designed to this standard.
MPEG-2 Video Codec

The Video section, part 2 of MPEG-2, is similar to the
previous MPEG-1 standard, but also provides support for
interlaced video, the format used by analog broadcast TV
systems. MPEG-2 video is not optimized for low bit-rates,
especially less than 1 Mbit/s at standard definition
resolutions. However, it outperforms MPEG-1 at 3 Mbit/s
and above. All standards-compliant MPEG-2 Video
decoders are fully capable of playing back MPEG-1 Video
streams.
MPEG-4 Video Codec


MPEG-4 is a collection of methods defining compression of
audio and visual (AV) digital data. It was introduced in late
1998 and designated a standard for a group of audio and
video coding formats and related technology.
MPEG-4 absorbs many of the features of MPEG-1 and
MPEG-2 and other related standards, adding new features
such as (extended) VRML support for 3D rendering,
object-oriented composite files (including audio, video and
VRML objects), support for externally-specified Digital Rights
Management and various types of interactivity.
Parts of MPEG-4 Video Codec


MPEG-4 Part 2 is a video compression technology developed
by MPEG. It is a discrete cosine transform compression
standard, similar to previous standards such as MPEG-1 and
MPEG-2. Several popular codecs including DivX, Xvid and
Nero Digital are implementations of this standard.
H.264 is a standard for video compression, and is equivalent
to MPEG-4 Part 10, or MPEG-4 AVC (for Advanced Video
Coding). As of 2008, it is the latest block-oriented
motion-compensation-based codec standard. x264 is a free
software library for encoding H.264/MPEG-4 AVC video streams.
Lumi Masking

Lumi masking is a technique used by video compression
software, which reduces quality in very bright or very dark
areas of the picture, as quality loss in these areas is less likely
to be visible. It is also known as "psychovisual enhancements"
or "adaptive quantization". The reduction in quality (and
therefore bitrate) in certain areas of the picture caused by
using lumi masking allows more bits to be allocated to the rest
of the video, thus improving overall quality. Lumi masking is
not perfect, however, and in some cases the degradation in
quality it causes is visible.
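
The principle can be sketched as a per-block choice of quantizer. This is a toy illustration only: the brightness thresholds, the quantizer values and the function names are all assumptions made for the example and are not taken from Xvid or any other encoder.

```python
def mean_luma(block):
    """Average luminance (0..255) of a block given as a list of rows."""
    return sum(sum(row) for row in block) / (len(block) * len(block[0]))

def quantizer_for_block(block, base_q=4, extra=2, dark=32, bright=224):
    """Hypothetical rule: use a coarser quantizer for very dark or very
    bright blocks, where quality loss is less likely to be visible."""
    level = mean_luma(block)
    if level < dark or level > bright:
        return base_q + extra
    return base_q

dark_block = [[10] * 16 for _ in range(16)]
mid_block = [[128] * 16 for _ in range(16)]
bright_block = [[240] * 16 for _ in range(16)]

for name, block in [("dark", dark_block), ("mid", mid_block), ("bright", bright_block)]:
    print(name, "block uses quantizer", quantizer_for_block(block))
```

A larger quantizer means coarser coefficients and fewer bits for that block, and the bits saved there can be spent on mid-brightness areas.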
Trellis Quantization


Trellis quantization is an algorithm that can improve data
compression in DCT-based encoding methods. It is used to
optimize residual DCT coefficients after motion estimation in
lossy video compression encoders such as Xvid.
Trellis quantization reduces the size of some DCT coefficients
while recovering others to take their place. Trellis quantization
effectively finds the optimal quantization for each block to
maximize the peak signal-to-noise ratio (PSNR) relative to
bitrate. Its effectiveness varies with the input data and the
compression method.
Xvid Video Codec


Xvid is a video codec library following the MPEG-4 standard.
Xvid supports MPEG-4 Advanced Simple Profile features such
as B-frames, global and quarter-pixel motion compensation,
lumi masking, trellis quantization, and H.263 (another codec),
MPEG and custom quantization matrices.
Xvid is a primary competitor of the DivX Pro Codec but unlike
the latter, it is free software. Xvid encoded files can be written
to a CD or DVD and played in a DivX compatible DVD player.
However, Xvid can optionally encode video with advanced
features that most DivX Certified set-top players do not
support.
FFmpeg




ffmpeg is a command line tool to convert one video file format to
another. It also supports grabbing and encoding in real time from a
TV card.
libavcodec is a library containing all the FFmpeg audio/video
encoders and decoders. Most codecs were developed from scratch
to ensure best performance and high code reusability.
libavformat is a library containing demuxers and muxers
(multiplexing) for audio/video container formats.
These libraries are also used by other software, such as ffdshow.
ffdshow


ffdshow is a media decoder and encoder mainly used for the
fast and high-quality decoding of video in the MPEG-4 ASP
(e.g. encoded with DivX, Xvid or FFmpeg MPEG-4) and AVC
(H.264) formats, but supporting numerous other video and
audio formats as well. It is free software released under the
GPL license, runs on Windows and is implemented as a
DirectShow decoding filter.
The project is now known as ffdshow-tryouts, where bug fixes,
stability fixes, new features, and codec updates continue.
Container formats : AVI

Audio Video Interleave, known by its acronym AVI, is a
multimedia container format introduced by Microsoft in
November 1992 as part of its Video for Windows technology.
AVI files can contain both audio and video data in a file
container that allows synchronous audio-with-video playback.
Like the DVD video format, AVI files support multiple streaming
audio and video, although these features are seldom used.
Container formats : AVI


The AVI container has no native support for modern MPEG-4
features like B-Frames. Hacks are sometimes used to enable
modern MPEG-4 features and subtitles, but these hacks are a
source of playback incompatibilities.
AVI files do not contain pixel aspect ratio information, so
players render all AVI files with square pixels. As a result,
video that was encoded with non-square pixels appears
stretched or squeezed horizontally when the file is played
back. There are other video container formats that allow
non-square pixels.
Container formats : DIVX

In June 2005, DivX, Inc. released its own container format
called DivX Media Format (.divx extension) to succeed the AVI
+ DivX combo. However, this format is basically an enhanced
AVI format (based on the same RIFF structure, for backward
compatibility with existing players and devices) and so far, has
gained no perceivable consumer traction, even where the
DivX codec was once popular (the Xvid codec has instead
become the codec of choice among most of the file-sharing
groups).
Container formats : Matroska


The Matroska Multimedia Container is an open standard free
container format, a file format that can hold an unlimited
number of video, audio, picture or subtitle tracks inside a
single file. It is intended to serve as a universal format for
storing common multimedia content, like movies or TV shows.
Matroska file types are .MKV for video (with subtitles and
audio), .MKA for audio-only files and .MKS for subtitles only.
The most common use of .MKV files is to store HD
video.
Container formats : MP4

MPEG-4 Part 14 is a multimedia container format standard
specified as a part of MPEG-4. It is most commonly used to
store digital audio and digital video streams, especially those
defined by MPEG, but can also be used to store other data
such as subtitles and still images. Like most modern container
formats, MPEG-4 Part 14 allows streaming over the Internet.
The official filename extension for MPEG-4 Part 14 files is
.mp4, thus the container format is often referred to simply as
MP4.
Container formats : FLV

FLV files contain video bit streams which are a variant of the
H.263 video standard, under the name of Sorenson Spark.
Flash Player 8 and newer revisions support the playback of
On2 TrueMotion VP6 video bit streams. On2 VP6 can provide
a higher visual quality than Sorenson Spark, especially when
using lower bit rates. On the other hand it is computationally
more complex and therefore will not run as well on certain
older system configurations. Flash Player 9 Update 3 includes
support for the H.264 video standard (MPEG-4 Part 10, or AVC),
which is even more computationally demanding but offers a
significantly better quality/bitrate ratio.
End of lesson