Uploaded by Emad Adel

Section 01

Data compression:
Data compression is a reduction in the number of bits needed to represent data.
There are two methods of compressing data:
Lossy compression reduces file size by permanently removing some of the
original data.
It’s typically used when a file can afford to lose some data, and/or if storage space
needs to be drastically ‘freed up’.
It applies to images, video, and audio:
Images: JPEG
Video: MPEG, AVC, HEVC
Audio: MP3, AAC
The file becomes much smaller, but its quality degrades relative to the original.
Lossless compression reduces file size without discarding any of the original data, for example by removing redundancy and unnecessary metadata.
In lossless compression, the file data is restored and rebuilt in its original form
after decompression, enabling the image to take up less space without any
discernible loss in picture quality.
No data is lost and as the process can be reversed, it’s also known as reversible
compression.
It applies to text, images, and audio:
Images: RAW, BMP, PNG
General: ZIP
Audio: WAV, FLAC
The size reduction is smaller than with lossy compression, but the quality does not change.
1- Entropy:
The average amount of information per symbol of a source (the minimum average number of bits per symbol of a code).
$$H = -\sum_{i=1}^{N} P(X_i) \log_2 P(X_i)$$
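The entropy formula can be evaluated numerically with a few lines of Python. This is a minimal sketch; the function name `entropy` and the four-symbol source are illustrative assumptions, not part of the original notes.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)).

    Terms with p == 0 are skipped, since they contribute nothing.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical source with four symbols (probabilities sum to 1).
example_probs = [0.5, 0.25, 0.125, 0.125]
print(entropy(example_probs))  # 1.75
```

All probabilities here are powers of two, so the result is exact: 0.5·1 + 0.25·2 + 2·(0.125·3) = 1.75 bits per symbol.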
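2- Average length (L_avg):
The sum, over all symbols, of each symbol's probability multiplied by the length of its codeword.
$$L_{avg} = \sum_{i=1}^{N} p_i \, l_i$$
where $p_i = P(X_i)$ is the probability of symbol $X_i$ and $l_i$ is the length in bits of its codeword.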
3- Efficiency:
The ratio of the entropy to the average length. It measures how close the binary code comes to the minimum possible average length.
$$\text{Efficiency} = \frac{H}{L_{avg}}$$
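Example:

| Symbol         | Prob.      | FLC | Code 1 | Code 2 | Code 3 | Code 4 |
|----------------|------------|-----|--------|--------|--------|--------|
| A              | P[A]=1/2   | 000 | 1      | 1      | 0      | 00     |
| B              | P[B]=1/4   | 001 | 01     | 10     | 10     | 01     |
| C              | P[C]=1/8   | 010 | 001    | 100    | 110    | 10     |
| D              | P[D]=1/16  | 011 | 0001   | 1000   | 1110   | 11     |
| E              | P[E]=1/16  | 100 | 00001  | 10000  | 1111   | 110    |
| Average Length | H=30/16    | 3   | 31/16  | 31/16  | 30/16  | 33/16  |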
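The average lengths in the example can be checked, and the efficiency of each code computed, with a short script. This is a sketch using the probabilities and codewords from the example; the function and variable names are illustrative assumptions.

```python
from fractions import Fraction
import math

# Probabilities and codewords taken from the example.
probs = {"A": Fraction(1, 2), "B": Fraction(1, 4), "C": Fraction(1, 8),
         "D": Fraction(1, 16), "E": Fraction(1, 16)}

codes = {
    "FLC":    {"A": "000", "B": "001", "C": "010", "D": "011", "E": "100"},
    "Code 1": {"A": "1", "B": "01", "C": "001", "D": "0001", "E": "00001"},
    "Code 2": {"A": "1", "B": "10", "C": "100", "D": "1000", "E": "10000"},
    "Code 3": {"A": "0", "B": "10", "C": "110", "D": "1110", "E": "1111"},
    "Code 4": {"A": "00", "B": "01", "C": "10", "D": "11", "E": "110"},
}

def entropy(probs):
    """H = -sum(p * log2(p)) in bits per symbol."""
    return -sum(float(p) * math.log2(float(p)) for p in probs.values() if p > 0)

def average_length(probs, code):
    """L_avg = sum(p_i * l_i), kept as an exact fraction."""
    return sum(p * len(code[symbol]) for symbol, p in probs.items())

def efficiency(probs, code):
    """Efficiency = H / L_avg."""
    return entropy(probs) / float(average_length(probs, code))

for name, code in codes.items():
    print(name, average_length(probs, code), round(efficiency(probs, code), 3))
```

`Fraction` prints values in lowest terms, so 30/16 appears as 15/8. Code 3 is the only code whose average length equals the entropy, giving an efficiency of 1.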
Exercises:
Test the codewords in each binary code and conclude whether it is uniquely decodable:
{0,01, 11}
{0,01,10}
{0,01,10,1}
{0,1,00,11}
{0,10,110,111}
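One systematic way to run this kind of test is the Sardinas-Patterson procedure, which these notes do not name, so its use here is an assumption. The sketch below repeatedly collects "dangling suffixes" and reports a failure as soon as one of them is itself a codeword. It is demonstrated on Code 2 and Code 4 from the example rather than on the exercise sets.

```python
def dangling(prefixes, words):
    """Non-empty suffixes left when a string in `prefixes` is a proper
    prefix of a string in `words`."""
    return {w[len(p):] for p in prefixes for w in words
            if w != p and w.startswith(p)}

def is_uniquely_decodable(codewords):
    """Sardinas-Patterson test for a finite list of binary codewords."""
    code = set(codewords)
    if len(code) != len(codewords):   # repeated codewords are ambiguous
        return False
    seen = set()
    frontier = dangling(code, code)   # suffixes between pairs of codewords
    while frontier:
        if frontier & code:           # a dangling suffix is a codeword
            return False
        seen |= frontier
        # Extend every new suffix in both directions; keep only unseen ones.
        frontier = (dangling(code, frontier) | dangling(frontier, code)) - seen
    return True

print(is_uniquely_decodable(["1", "10", "100", "1000", "10000"]))  # True
print(is_uniquely_decodable(["00", "01", "10", "11", "110"]))      # False
```

Code 4 fails because the string 110110 can be split either as 110|110 or as 11|01|10.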
Test the codewords in each binary code and conclude whether it is a prefix code:
{1,01,001,0000}
{0,10,110,1011}
{0,10,010,101}
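A code is a prefix code when no codeword is the beginning of another codeword. A minimal sketch of that check follows; the function name is an illustrative assumption, and the demonstration uses Code 3 and Code 2 from the example rather than the exercise sets.

```python
def is_prefix_code(codewords):
    """Return True if no codeword is a prefix of a different codeword.

    Positions (not values) are compared, so a repeated codeword
    also counts as a violation.
    """
    for i, a in enumerate(codewords):
        for j, b in enumerate(codewords):
            if i != j and b.startswith(a):
                return False
    return True

print(is_prefix_code(["0", "10", "110", "1110", "1111"]))   # True
print(is_prefix_code(["1", "10", "100", "1000", "10000"]))  # False
```

Code 2 is rejected because 1 is the start of every other codeword, even though it can still be decoded uniquely.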
Encode the following binary sequences using the run-length coding technique:
0011110001111000
0000011110
111100011100
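Run-length coding replaces each run of identical symbols with the symbol and the length of the run. The notes do not fix an output format, so the sketch below assumes a list of (bit, run length) pairs. The input string is a hypothetical example, not one of the exercise sequences.

```python
from itertools import groupby

def rle_encode(bits):
    """Encode a string as a list of (symbol, run_length) pairs."""
    return [(symbol, len(list(group))) for symbol, group in groupby(bits)]

def rle_decode(pairs):
    """Rebuild the original string from (symbol, run_length) pairs."""
    return "".join(symbol * count for symbol, count in pairs)

encoded = rle_encode("0001101111")
print(encoded)              # [('0', 3), ('1', 2), ('0', 1), ('1', 4)]
print(rle_decode(encoded))  # 0001101111
```

If the first bit is agreed in advance, the run lengths alone (here 3, 2, 1, 4) are enough, because the bits of a binary sequence must alternate between runs.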
Decode the following sequences using the run-length coding technique:
10,20,-5,35,32,54,32,19,3,87
9,12,-4,35,76,112,67,2,19,2
255,8,2,54,32,65,76,255,5,30,1
128,8,2,54,32,65,76,128,5,30,1
Find the codewords and code lengths for the following sources using the Shannon-Fano coding technique:
1- 0.25, 0.2, 0.15, 0.15, 0.10, 0.10, 0.05
2- AADCABCBAB
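Shannon-Fano coding sorts the symbols by decreasing probability, splits the list into two groups whose total probabilities are as close as possible, appends 0 to one group and 1 to the other, and repeats inside each group. The sketch below follows that description under two assumptions: ties are broken by taking the earliest split point, and the upper group receives 0. It is run on the distribution from the example, not on the exercise data.

```python
def shannon_fano(probs):
    """Map each symbol to a codeword; `probs` maps symbols to probabilities."""
    symbols = sorted(probs, key=probs.get, reverse=True)
    codes = {s: "" for s in symbols}

    def split(group):
        if len(group) < 2:
            return
        total = sum(probs[s] for s in group)
        running, best_index, best_diff = 0, 1, None
        # Try every split point and keep the most balanced one.
        for i in range(1, len(group)):
            running += probs[group[i - 1]]
            diff = abs(total - 2 * running)  # |upper sum - lower sum|
            if best_diff is None or diff < best_diff:
                best_index, best_diff = i, diff
        upper, lower = group[:best_index], group[best_index:]
        for s in upper:
            codes[s] += "0"
        for s in lower:
            codes[s] += "1"
        split(upper)
        split(lower)

    split(symbols)
    return codes

probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.0625, "E": 0.0625}
codes = shannon_fano(probs)
print(codes)
print({s: len(c) for s, c in codes.items()})  # code length of each symbol
```

For this distribution the result matches Code 3 in the example (0, 10, 110, 1110, 1111). For a message given as a string, relative symbol frequencies (for instance from `collections.Counter`) could be used as the probabilities.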