Entropy & Huffman coding

Chapter 2
Entropy
2.4 Coding (Binary Codes)
Code words
Alphabet (collection of symbols)
FLC: fixed-length code (fixed word length, FWL)
VLC: variable-length code (variable word length, VWL)
Uniquely decodable code

FLC codes: decoding is simpler.
FLC example:
  a1 -> 00
  a2 -> 01
  a3 -> 10
  a4 -> 11

Bit stream 00 11 01 10 10 11 01 decodes as a1 a4 a2 a3 a3 a4 a2.

A bit error is localized: it does not affect the rest of the bit stream.
Transmitted bit stream for a1 a4 a2 a3 a3 a4 a2: 00110110101101
If bit position 6 (a '1') is detected as a '0', only the third codeword is decoded incorrectly (01 -> 00, i.e. a2 -> a1); all other symbols are still decoded correctly.
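Below is a minimal Python sketch (not from the notes; the dictionary and function names are illustrative) that decodes the FLC bit stream above and flips bit position 6, showing that the error stays confined to a single symbol.

flc = {"00": "a1", "01": "a2", "10": "a3", "11": "a4"}

def decode_flc(bits):
    # Fixed word length: cut the stream into 2-bit codewords.
    return [flc[bits[i:i + 2]] for i in range(0, len(bits), 2)]

stream = "00110110101101"                  # a1 a4 a2 a3 a3 a4 a2
corrupted = stream[:5] + "0" + stream[6:]  # bit position 6 (a '1') received as '0'
print(decode_flc(stream))                  # ['a1', 'a4', 'a2', 'a3', 'a3', 'a4', 'a2']
print(decode_flc(corrupted))               # only the third symbol changes (a2 -> a1)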
P.S.: These notes are adapted from K. Sayood, "Introduction to Data Compression," Morgan Kaufmann, 3rd Edition, San Francisco, CA, 2006.
2.4.1 Uniquely Decodable Codes
Ex: Source alphabet (Table 3.1)

  ai    P(ai)   Code 1   Code 2   Code 3   Code 4
  a1    1/2     0        0        0        0
  a2    1/4     0        1        10       01
  a3    1/8     1        00       110      011
  a4    1/8     10       11       111      0111

  Avg. length   1.125    1.25     1.75     1.875   (bits/symbol, VLC)

Avg. length = Σ_{i=1}^{4} P(ai) n(ai)   (bits/symbol),   with Σ_{i=1}^{4} P(ai) = 1.
Serial bit stream (Code 4):
  a4 a2 a1 a2 a3 a2  ->  0111 01 0 01 011 01

The decoder is more complex, and a VLC is highly error sensitive.
Transmitted bit stream for a4 a2 a1 a2 a3 a2: 01110100101101
If bit position 10 (a '0') is detected as a '1', this can affect detection of all the following symbols at the receiver.
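A companion Python sketch (not from the notes; the decoder is written specifically for Code 4, where every codeword starts with exactly one '0') that decodes the stream above and then flips bit position 10, showing how a single error disturbs the parsing of everything that follows.

code4 = {"0": "a1", "01": "a2", "011": "a3", "0111": "a4"}

def decode_code4(bits):
    # In Code 4 every codeword starts with exactly one '0', so a new
    # codeword begins at every '0' in the stream.
    symbols, current = [], ""
    for b in bits:
        if b == "0" and current:
            symbols.append(code4.get(current, "?"))
            current = ""
        current += b
    symbols.append(code4.get(current, "?"))
    return symbols

stream = "01110100101101"                  # a4 a2 a1 a2 a3 a2
corrupted = stream[:9] + "1" + stream[10:] # bit position 10 (a '0') received as '1'
print(decode_code4(stream))                # ['a4', 'a2', 'a1', 'a2', 'a3', 'a2']
print(decode_code4(corrupted))             # parsing after the error goes wrong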
Unique decodability
A given sequence of codewords can be decoded in one and only one way. No ambiguity.

Code 3: {0, 10, 110, 111}   - instantaneous code
Code 4: {0, 01, 011, 0111}  - not an instantaneous code
Shannon entropy:
  Entropy = - Σ_{i=1}^{N} P(ai) log2 P(ai)
  N = number of symbols
  P(ai) = probability of symbol ai, with Σ_{i=1}^{N} P(ai) = 1
Entropy is the theoretical minimum average bit rate.
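A minimal Python sketch of this formula (the helper name is mine), evaluated on the probabilities 1/2, 1/4, 1/8, 1/8 from Table 3.1 above:

import math

def entropy(probabilities):
    # H = -sum p * log2(p), in bits/symbol; terms with p = 0 contribute nothing.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits/symbol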
Table 2.2:  a1 -> 0,  a2 -> 01,  a3 -> 11
Consider the bit stream 011111111111111111 (a '0' followed by 17 'ones').
Uniquely decodable but not instantaneous: the decoder cannot tell whether the stream starts with a1 (0) or a2 (01) until it has counted all the 1s.

Table 2.3:  a1 -> 0,  a2 -> 01,  a3 -> 10
Consider the bit stream "01010101010101010".
Not uniquely decodable.
Unique decodability is a must.
Prefix code: no codeword is a prefix of any other codeword. This guarantees unique decodability.

Unique decodability
Consider two binary codewords 'a' and 'b', with 'a' k bits long, 'b' n bits long, and k < n.
If the first k bits of 'b' are identical to 'a', then 'a' is called a prefix of 'b'. The last n-k bits are called the dangling suffix.

Ex.
  a = 010,  b = 01011   (k = 3, n = 5)
  'a' is a prefix of 'b'; the dangling suffix is '11'.
2.4.2 Prefix Codes
Test for unique decodability: examine the dangling suffixes of codeword pairs in which one codeword is a prefix of the other. If a dangling suffix is itself a codeword, then the code is not uniquely decodable.

Ex. 2.4.1
  Codewords {0, 01, 11}: uniquely decodable.
  (Codeword '0' is a prefix of '01'; the dangling suffix '1' is not a codeword.)
  Codewords {0, 01, 11, 1}: not uniquely decodable (the dangling suffix '1' is now itself a codeword).
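A Python sketch of this dangling-suffix test (my own implementation, not code from the notes; it keeps generating dangling suffixes until no new ones appear):

def is_uniquely_decodable(codewords):
    # Returns False as soon as some dangling suffix is itself a codeword.
    codeset = set(codewords)
    # Dangling suffixes of codeword pairs where one is a prefix of the other.
    suffixes = set()
    for a in codeset:
        for b in codeset:
            if a != b and b.startswith(a):
                suffixes.add(b[len(a):])
    while True:
        new = set()
        for s in suffixes:
            if s in codeset:
                return False                 # dangling suffix is a codeword
            for c in codeset:
                if c.startswith(s):
                    new.add(c[len(s):])      # new dangling suffix
                if s.startswith(c):
                    new.add(s[len(c):])
        new -= suffixes
        new.discard("")
        if not new:
            return True
        suffixes |= new

print(is_uniquely_decodable(["0", "01", "11"]))       # True  (Ex. 2.4.1)
print(is_uniquely_decodable(["0", "01", "11", "1"]))  # False
print(is_uniquely_decodable(["0", "01", "10"]))       # False (Table 2.3)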
2.4.2 Prefix Codes (contd.)
Prefix code: no codeword is a prefix of any other codeword.

A binary code can be drawn as a binary tree (Fig. 2.4) with a root node, internal nodes, and external nodes (leaves).

Code 2 (not uniquely decodable):
  a1 -> 0, a2 -> 1, a3 -> 00, a4 -> 11

Codes 3 and 4 are uniquely decodable.

Code 3:
  a1 -> 0, a2 -> 10, a3 -> 110, a4 -> 111
  All four codewords sit at external nodes (leaves) of the tree.

Code 4:
  a1 -> 0, a2 -> 01, a3 -> 011, a4 -> 0111
  a4 is an external node; a1, a2, a3 are internal nodes.

In a prefix code, codewords are associated only with the external nodes.
For any non-prefix uniquely decodable code, there is always a prefix code with the same codeword lengths.
(Prefix code: no codeword is a prefix of any other codeword. This guarantees unique decodability.)
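A quick Python check of the prefix condition for Codes 3 and 4 (illustrative code, not from the notes):

def is_prefix_code(codewords):
    # No codeword may be a prefix of any other codeword.
    return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

print(is_prefix_code(["0", "10", "110", "111"]))   # Code 3: True
print(is_prefix_code(["0", "01", "011", "0111"]))  # Code 4: False (uniquely decodable, but not a prefix code)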
Chapter 3
Huffman Coding

VLC: prefix codes that are optimum for a given model.
This is the practical code closest to the entropy. If all probabilities are negative integer powers of two, the Huffman code achieves the entropy exactly.
Ex: probabilities 2^-1, 2^-2, 2^-3, 2^-3.

Entropy = - Σ_{i=1}^{N} P(ai) log2 P(ai)
        = minimum theoretical bit rate to code N symbols, with Σ_{i=1}^{N} P(ai) = 1.

Huffman code: a practical VLC that comes very close to the entropy.

3.2 Huffman Coding (optimum prefix code)
1. Symbols that occur more frequently (higher probabilities) have shorter codewords than symbols that occur less frequently.
2. The two symbols that occur least frequently have codewords of the same length.

Average bit rate = Σ_{i=1}^{N} ni Pi   (bits/symbol)
  N  = number of symbols
  ni = codeword length (bits) for symbol i
  Pi = probability of symbol i, with Σ_{i=1}^{N} Pi = 1

3. The codewords of the two lowest-probability symbols differ only in the last bit: if m is their common prefix and * denotes concatenation, the two codewords are m * 0 and m * 1.
   Ex: m = 1101001  ->  codewords 11010010 and 11010011.
Ex. 3.2.1 Design of a Huffman Code

Given:                         Rearranged in decreasing probability (for the VLC):
  ai    P(ai)                    ai    P(ai)
  a1    0.2                      a2    0.4
  a2    0.4                      a1    0.2
  a3    0.2                      a3    0.2
  a4    0.1                      a4    0.1
  a5    0.1                      a5    0.1

Σ_{i=1}^{5} P(ai) = 1

H = Entropy = - Σ_{i=1}^{5} P(ai) log2 P(ai) = 2.122 bits/symbol
  = minimum average theoretical bit rate
(P(ai) is either given or developed experimentally.)

Huffman tree (see Fig. 3.2, p. 46): repeatedly combine the two lowest-probability entries:
  a4 (.1) + a5 (.1)  -> (.2)
  (.2)    + a3 (.2)  -> (.4)
  (.4)    + a1 (.2)  -> (.6)
  (.6)    + a2 (.4)  -> (1.0)

Resulting Huffman code (a prefix code, hence uniquely decodable):
  a2 -> 1
  a1 -> 01
  a3 -> 000
  a4 -> 0010
  a5 -> 0011

Average bit size = [2 * 0.2 + 1 * 0.4 + 3 * 0.2 + 4 * 0.1 + 4 * 0.1] = 2.2 bits/symbol
H = Entropy = - Σ_{i=1}^{5} Pi log2 Pi = 2.122 bits/symbol

Average bit length = Σ_{i=1}^{5} P(ai) n(ai) = 2.2 bits/symbol
Redundancy = 2.2 - 2.122 = 0.078 bits/symbol
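A Python sketch of the construction for Ex. 3.2.1 (my own helper, not code from the notes). Tie-breaking inside heapq differs from the textbook figure, so the individual codewords may not match Fig. 3.2, but any Huffman code for these probabilities has the same average length of 2.2 bits/symbol.

import heapq, math

def huffman_code(probs):
    # probs: dict symbol -> probability; returns dict symbol -> codeword.
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)          # lowest-probability group
        p1, _, c1 = heapq.heappop(heap)          # second lowest
        # Prepend a distinguishing bit to each group; bits from later merges
        # end up at the front of the final codewords.
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}
code = huffman_code(probs)
avg = sum(probs[s] * len(w) for s, w in code.items())
H = -sum(p * math.log2(p) for p in probs.values())
print(code)        # codewords depend on tie-breaking
print(avg, H)      # 2.2 and about 2.122 bits/symbol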
3.2.1 Minimum Variance Huffman Codes (see Fig. 3.3)
Always put the combined letter as high in the list as possible when building the Huffman tree.

Minimum variance Huffman tree and code (Fig. 3.4):
  ai    P(ai)   Code
  a1    0.2     10
  a2    0.4     00
  a3    0.2     11
  a4    0.1     010
  a5    0.1     011

Average bit length = 2.2 bits/symbol.
Buffer design becomes much simpler (see pages 44-45).

The codes of Fig. 3.2 (p. 3-5) and Fig. 3.4 (p. 3-7) give the same average bit length (2.2 bits/symbol), but their variances are different.
VLC -> Buffer -> fixed-bit-rate channel
(The VLC produces a variable bit rate; the buffer feeds the channel at a fixed rate.)

Assume 10,000 symbols/sec, i.e. an average bit rate of 22,000 bits/sec.
Buffer: smooths out the variations in the bit generation rate.

  Huffman code (Fig. 3.2)     Min. variance code (Fig. 3.4)
  a1    01                    a1    10
  a2    1                     a2    00
  a3    000                   a3    11
  a4    0010                  a4    010
  a5    0011                  a5    011

Assume strings of a4's and a5's are transmitted for several seconds (at 10,000 symbols/sec):
  Code from Fig. 3.2: generates 40,000 bps (the buffer must store 18,000 bps above the channel rate).
  Code from Fig. 3.4 (min. variance): generates 30,000 bps (store 8,000 bps).

Assume a string of a2's is transmitted for several seconds:
  Code from Fig. 3.2: generates 10,000 bps (must make up a deficit of 12,000 bps).
  Code from Fig. 3.4: generates 20,000 bps (must make up a deficit of 2,000 bps).

Buffer design is simpler with the minimum variance Huffman code.
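A small Python check (mine, not from the notes) that the two codes above have the same average length but very different variance of codeword length, which is what drives the buffering behaviour just described:

probs = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}
code_fig32 = {"a1": "01", "a2": "1",  "a3": "000", "a4": "0010", "a5": "0011"}
code_fig34 = {"a1": "10", "a2": "00", "a3": "11",  "a4": "010",  "a5": "011"}

for name, code in [("Fig 3.2", code_fig32), ("Fig 3.4 (min var)", code_fig34)]:
    avg = sum(probs[s] * len(w) for s, w in code.items())
    var = sum(probs[s] * (len(w) - avg) ** 2 for s, w in code.items())
    print(name, "avg =", avg, "variance =", round(var, 3))
# Fig 3.2           avg = 2.2 variance = 1.36
# Fig 3.4 (min var) avg = 2.2 variance = 0.16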
Summary of Ex. 3.2.1

Given (and rearranged in decreasing probability for the VLC design):
  ai    P(ai)        Rearranged: a2 (.4), a1 (.2), a3 (.2), a4 (.1), a5 (.1)
  a1    0.2
  a2    0.4
  a3    0.2
  a4    0.1
  a5    0.1

Huffman tree (Fig. 3.2, p. 46):
  Average bit size = [2 * 0.2 + 1 * 0.4 + 3 * 0.2 + 4 * 0.1 + 4 * 0.1] = 2.2 bits/symbol
  Σ P(ai) = 1;  H = Entropy = - Σ P(ai) log2 P(ai) = 2.122 bits/symbol
  Redundancy = 2.2 - 2.122 = 0.078 bits/symbol

Minimum variance Huffman code (MVHC, Fig. 3.4):
  a1 -> 10, a2 -> 00, a3 -> 11, a4 -> 010, a5 -> 011

Average bit rate is 2.2 bits/symbol; at 10,000 symbols/sec the channel carries 22,000 bps. The MVHC makes buffer design easier.
3.2.2 Optimality of Huffman Codes
(VLC)  H(s) = - Σ_i Pi log2 Pi

3.2.3 Length of Huffman Codes
  H(s) ≤ l < H(s) + 1                                   (3.1)
  l    = average code length of the Huffman code
  H(s) = - Σ_i P(ai) log2 P(ai) = entropy (minimum theoretical average bit rate)

In the Huffman tree, symbols with high probabilities get short codewords, and vice versa.
The Huffman code is a prefix code, which guarantees unique decodability.

Tighter bounds (p. 48):
  H(s) ≤ lH < H(s) + Pmax,            when Pmax ≥ 0.5
  H(s) ≤ lH < H(s) + Pmax + 0.086,    when Pmax < 0.5
  Pmax = largest probability of any symbol. (See [80].)

When the alphabet size is small and the P(ai) are skewed, Pmax can be large and Huffman coding becomes inefficient.
Ex: Facsimile (binary images) at 200, 400, 600, or 1000 dpi (dots per inch):
  P(white dot) -> 0.8,  P(black dot) -> 0.2
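A short Python illustration (mine, not from the notes) of why this source is a bad case for plain Huffman coding: the entropy is well below 1 bit/dot, but a binary code cannot use fewer than 1 bit per dot.

import math

p_white, p_black = 0.8, 0.2
H = -(p_white * math.log2(p_white) + p_black * math.log2(p_black))
print(round(H, 3))   # about 0.722 bits/dot, yet any binary code needs >= 1 bit/dot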
3.2.4 Extended Huffman Codes (p. 49)

H(s) ≤ R ≤ H(s) + 1/n                                   (3.7)
  Alphabet of m symbols: (a1, a2, ..., am).
  Group and code n symbols at a time.
  Extended alphabet size = m^n; one codeword for every n symbols.
  R = rate = number of bits/symbol.

Ex. 3.2.3 (p. 49), m = 3:
  a1    0.8     0
  a2    0.02    11
  a3    0.18    10
  Average codeword length = 1.2 bits/symbol, entropy = 0.816 bits/symbol,
  redundancy = 1.2 - 0.816 = 0.384 bits/symbol.

Extended alphabet, n = 2, size = m^n = 3^2 = 9:
  Symbol   Probability   Codeword
  a1 a1    0.64          0
  a1 a2    0.016         10101
  a1 a3    0.144         11
  a2 a1    0.016         101000
  a2 a2    0.0004        10100101
  a2 a3    0.0036        1010011
  a3 a1    0.144         100
  a3 a2    0.0036        10100100
  a3 a3    0.0324        1011
(See Table 3.11, p. 31, for the code.)

Average codeword length of the extended code = 1.7228 bits per pair of symbols,
i.e. 1.7228 / 2 = 0.8614 bits/symbol, so the redundancy drops to 0.8614 - 0.816 = 0.0454 bits/symbol.

m = 3, n = 3: extended alphabet (a1 a1 a1, a1 a1 a2, ..., a3 a3 a3), size = 3^3 = 27.
m = 3, n = 4: extended alphabet size = 3^4 = 81.

By coding blocks of symbols together, the redundancy of Huffman codes can be reduced.
However, the extended alphabet size grows exponentially with n, and Huffman coding becomes impractical.
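A Python sketch of the extension for Ex. 3.2.3 (my own code, reusing the huffman_code() helper sketched earlier for Ex. 3.2.1; it assumes an i.i.d. source so that block probabilities are products of symbol probabilities):

from itertools import product

probs = {"a1": 0.8, "a2": 0.02, "a3": 0.18}
n = 2

# Build the extended alphabet of all n-symbol blocks and their probabilities.
ext_probs = {}
for block in product(probs, repeat=n):
    p = 1.0
    for s in block:
        p *= probs[s]
    ext_probs[block] = p

ext_code = huffman_code(ext_probs)   # helper defined in the Ex. 3.2.1 sketch
avg_per_block = sum(ext_probs[b] * len(w) for b, w in ext_code.items())
print(avg_per_block, avg_per_block / n)   # about 1.7228 bits/pair, 0.8614 bits/symbol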
Variations on Huffman coding:
  Truncated Huffman coding
  Modified Huffman coding
  Adaptive Huffman coding (Section 3.4)
  Non-binary Huffman codes (e.g., ternary code: 0, 1, 2)
3.8.2 Text Compression (p. 74, 2nd Edition)
Using Huffman coding, the file size dropped from 70,000 bytes to 43,000 bytes.
Higher compression can be obtained by exploiting the structure of the text, as discussed in Chapters 5 & 6 (LZ77, LZ78, LZW, etc.).
3.8.3 Audio Compression (p. 75)
CD-quality audio: fs = 44.1 kHz, stereo (two audio channels), 16-bit PCM (2^16 = 65,536 levels),
so the uncompressed rate is 2 * 16 * 44.1 = 1411.2 kbps.
Estimated compressed file size = (entropy) * (number of samples in the file).
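A quick arithmetic check in Python (not from the notes; the 3-minute duration and the 11 bits/sample entropy are purely illustrative assumptions) applying the estimate above:

fs = 44_100            # samples/sec per channel
bits = 16
channels = 2
rate_bps = fs * bits * channels
print(rate_bps / 1000)                 # 1411.2 kbps uncompressed

duration_s = 180                       # assumed 3-minute track
n_samples = fs * channels * duration_s
entropy_bits = 11                      # assumed per-sample entropy (illustrative)
print(entropy_bits * n_samples / 8 / 1e6, "MB (estimated compressed size)")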
Huffman coding programs (p. 74): huff_enc, huff_dec, adap_huff
Lossless audio coders: FLAC, Apple's ALAC (or ALE), Monkey's Audio, MPEG-4 ALS
Entropy

Lossless schemes: JPEG Lossless, JPEG-LS, GIF, PNG, FELICS, JPEG 2000, H.264 Intra, JPEG XR (HD Photo), LOCO

Group of symbols ai, i = 1, 2, ..., N
P(ai) = probability of occurrence of symbol ai, with Σ_{i=1}^{N} P(ai) = 1
(The probability distribution is either given or developed.)

Shannon's fundamental theorem - entropy (p. 22):
  H = - Σ_{i=1}^{N} P(ai) log2 [P(ai)]
    = minimum (theoretical) bit rate, in bits/symbol, at which the group of symbols can be transmitted.

The Huffman code is a VWL code that comes very close to the entropy; it is a practical code.
An entropy coder contributes to compression.
Example (N = 4; ai, i = 1, 2, 3, 4):
  P(a1) = 1/2, P(a2) = 1/8, P(a3) = 1/8, P(a4) = 1/4

H = Entropy = - Σ_{i=1}^{4} P(ai) log2 P(ai)
  = (1/2) * 1 + 2 * (1/8) * 3 + (1/4) * 2
  = (1/2) + (3/4) + (1/2) = 1.75 bits/symbol

Huffman code (from the Huffman tree; uniquely decodable):
  a1 (P = 1/2)  -> 0
  a4 (P = 1/4)  -> 10
  a2 (P = 1/8)  -> 110
  a3 (P = 1/8)  -> 111

All probabilities here are negative integer powers of two, so the average codeword length equals the entropy: 1.75 bits/symbol.
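A short Python check (illustrative, not from the notes) that the code above reaches the entropy when all probabilities are negative integer powers of two:

import math

probs = {"a1": 0.5, "a2": 0.125, "a3": 0.125, "a4": 0.25}
code  = {"a1": "0", "a2": "110", "a3": "111", "a4": "10"}

H   = -sum(p * math.log2(p) for p in probs.values())
avg = sum(probs[s] * len(w) for s, w in code.items())
print(H, avg)    # 1.75 1.75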