Faculty of Information and Engineering Technology
Dr.-Ing. Khaled Shawky Hassan

Source Coding and Compression (COMM 901)
Winter 2013, Final Examination
Jan. 3rd, 2013

Please read carefully before proceeding.
1. The exam is closed book, i.e., no materials are permitted.
2. The duration of this exam is 3 hours (180 mins).
3. Maximum mark: 100 (+3 bonus marks)!
4. Solve as much as you can!
5. Only calculators are permitted for this exam.
6. This exam booklet contains 17 pages including this one, and two extra sheets attached!

Good Luck!

Problem Number    [A]    [B]    Total
Possible Marks    45     58     100
Final Marks

Question 1: Warming-up questions with lossless coding (50 Marks)

A. Consider the alphabet set S = {a, b, c, d, e} with the probabilities P = {0.4, 0.2, 0.2, 0.1, 0.1}, and obtain the following: (12 Marks)
a. A minimum-variance Huffman code.
b. A maximum-variance Huffman code.
c. The average codeword length in both cases and the entropy H(S).
d. Compare the average code length for each of the cases in a) and b) with the Shannon average codeword length. (A checking sketch follows after this question.)

B. Optimal codes for uniform distributions: (10 Marks)
Consider a random variable with M outcomes, and find out:
1. What is the entropy of this source (in terms of M) when the entropy is maximized?
2. Write down the optimality condition for this code, assuming lossless Huffman encoding.
3. Based on the result in B.1, for what values of M will the average codeword length L be equal to the entropy? (Hint: think of a Huffman tree for the case in B.1.)
4. Assuming that M always lies in the interval [2^k, 2^(k+1)], i.e., 2^k ≤ M ≤ 2^(k+1): where in this interval will the redundancy (L − H) be minimal? (Hint: consider adding or subtracting a very small value δ → 0 at either of the interval borders 2^k or 2^(k+1).)

C. Construct the Shannon-Fano and Huffman codes for X = {x1, x2, x3, x4, x5, x6} with the probabilities P = {0.2, 0.2, 0.18, 0.16, 0.14, 0.12}; then find the average code length! (5 Marks)

D. Decoding of LZ78 and LZ77: (8 Marks)
Decode (decompress) the following message with LZ78:
(0, A) (0, B) (1, A) (2, B) (0, R)
and decode (decompress) the following message with LZ77:
(0, 0, a) (1, 1, c) (3, 3, a) (0, 0, b)
(Decoder sketches for checking both answers follow after part E.)

E. Arithmetic decoding for a floating-point tag: (10 Marks)
Using the symbol order in the table below, decode the binary tag 110001011 (Hint: convert it to a float first: 2^-1 + 2^-2 + 0 + …), assuming the message is composed of 4 symbols only. Use these two methods:
1. Concept decoder
2. Simplified decoder

Symbol   Prob.
A        0.8
B        0.02
C        0.18
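The Huffman results in part A can be checked with a minimal Python sketch. It assumes a heap-based construction in which probability ties are broken in favour of nodes created earlier, which corresponds to the minimum-variance convention:

```python
import heapq
from math import log2

def huffman_lengths(probs):
    """Codeword lengths of a Huffman code for the given probabilities.
    The counter breaks probability ties in favour of older nodes,
    which matches the minimum-variance Huffman construction."""
    heap, lengths, counter = [], [0] * len(probs), 0
    for i, p in enumerate(probs):
        heapq.heappush(heap, (p, counter, [i]))
        counter += 1
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:             # every leaf under the merged node gains one bit
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

P = [0.4, 0.2, 0.2, 0.1, 0.1]
L = huffman_lengths(P)
print(L)                                  # codeword lengths
print(sum(p * l for p, l in zip(P, L)))   # average code length
print(-sum(p * log2(p) for p in P))       # entropy H(S)
```

For part D, a sketch of the two decoders, assuming the token conventions used in the question: LZ78 pairs (dictionary index, next symbol) with index 0 denoting the empty phrase, and LZ77 triples (offset back, match length, next symbol):

```python
def lz78_decode(tokens):
    """LZ78: each token (index, symbol) emits dictionary[index] + symbol."""
    dictionary = [""]                 # entry 0 is the empty phrase
    out = []
    for index, symbol in tokens:
        phrase = dictionary[index] + symbol
        dictionary.append(phrase)     # the emitted phrase becomes a new entry
        out.append(phrase)
    return "".join(out)

def lz77_decode(tokens):
    """LZ77: each token (offset, length, symbol) copies 'length' characters
    starting 'offset' back in the decoded buffer, then emits 'symbol'."""
    out = []
    for offset, length, symbol in tokens:
        start = len(out) - offset
        for i in range(length):       # character by character: copies may overlap
            out.append(out[start + i])
        out.append(symbol)
    return "".join(out)

print(lz78_decode([(0, "A"), (0, "B"), (1, "A"), (2, "B"), (0, "R")]))
print(lz77_decode([(0, 0, "a"), (1, 1, "c"), (3, 3, "a"), (0, 0, "b")]))
```

For part E, the simplified decoder can likewise be sketched: find the cumulative interval containing the tag, emit that symbol, rescale the tag into the interval, and repeat for the known message length (here 4 symbols):

```python
def arithmetic_decode(tag, intervals, n_symbols):
    """Simplified arithmetic decoder over half-open cumulative intervals."""
    out = []
    for _ in range(n_symbols):
        for sym, (low, high) in intervals.items():
            if low <= tag < high:
                out.append(sym)
                tag = (tag - low) / (high - low)   # rescale into the chosen interval
                break
    return "".join(out)

tag = 2**-1 + 2**-2 + 2**-6 + 2**-8 + 2**-9        # binary 0.110001011 as a float
intervals = {"A": (0.0, 0.8), "B": (0.8, 0.82), "C": (0.82, 1.0)}
print(arithmetic_decode(tag, intervals, 4))
```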
Question 2: Advanced Lossless Codes and Lossy Concepts (57 Marks)

A. Scalar quantization: (18 Marks)
This problem concerns the scalar quantization of a continuous, real-valued, stationary, memoryless source.
- At each time instant n, the source emits a realization of a real-valued random variable X[n] which has a Gaussian distribution with zero mean, variance σ², and a symmetric probability density, i.e., p(x) = p(−x).
- The quantizer has M representative levels y1, …, yM and M − 1 decision thresholds (boundaries) b1, …, b(M−1).
- The quantizer design rule is Q(X[n]) = yi if b(i−1) < x ≤ bi, where we set b0 = −∞ and bM = ∞.

i. Draw to scale: (4 Marks)
- Two uniform quantizers with any arbitrarily chosen M (or M − 1) levels, where one contains the zero level and the other quantizes just below and above it.
- Also show on your figure the quantization step Δ and the thresholds in each case.
- Name these two quantizers in the blank areas!
* Then differentiate between them with respect to:
1- The number of levels (should be less than M).
2- The number of bits/level.
* Which one has the better granular error? Why?
Quantizer 1: _ _ _ _ _ _ _ _ _ _ _ _
Quantizer 2: _ _ _ _ _ _ _ _ _ _ _

ii. Referring to the problem body, state the following:
1. If X[n] is the signal after the sampler: what is the optimum sampling frequency? Is it also optimum for compression of wide-band signals? What should we do, and what is the drawback? (2 Marks)
2. Suppose we can count all the samples and derive a statistic from them: we have in total N (bounded) sample values, all repeated with equal probability 1/N (here we can assume a periodic signal that is sampled for a sufficiently long time). State clearly what the entropy H(X[n]) is in this case. (2 Marks)
3. If M, as stated before, is the number of output levels of the quantizer, which is always less than N: what do you now expect to happen to the repeated samples at the quantizer? Accordingly, find the entropy of Q(X[n]) in terms of M. (2 Marks)
4. Which is bigger, H(X[n]) or H(Q(X[n])), and why? Prove it mathematically using H(X), H(Q(X)), and the relation between M and N. (2 Marks)
5. Based on the periodicity assumption and the statistics established in 3) and 4) of this part: what is the exact entropy if we use a 2-bit quantizer adapted to the probability density function (PDF) given in the body of the question? (2 Marks)
6. The next figure shows the PDF of the Gaussian signal declared in the body of the problem. The optimum solution for a single-bit quantizer is indicated on the figure as −0.79 and 0.79.
- Suggest roughly where the quantization points will lie for a 2-bit quantizer (i.e., with 4 levels) if we use the Lloyd-Max algorithm (a numerical sketch follows after part C).
- Find the thresholds in this case and show how you select them based on the 4 quantization points. (2 Marks)
7. In this case (the 4-level quantizer defined in the previous item), state exactly the minimum channel capacity, assuming systematic channel coding with code rate k/n = 2/3 (k information bits and codeword length n, with redundancy). (2 Marks)

B. 2-D Voronoi Diagrams and Vector Quantizers (7 Marks)
i. On the next figure (where the round nodes are the quantization values), construct the Voronoi region diagram and show the steps you take to create such regions. (2 Marks)
ii. Show on your figure above: (1.5 Marks)
a. a bounded cell, and
b. an unbounded cell.
iii. Do the next regions represent 2-D Voronoi regions, and why? (1.5 Marks)
iv. How many quantization points are needed to have at least ONE bounded cell? Draw an example and show the bounded polygon. (2 Marks)

C. An input sequence X[n] is given as: (10 Marks: 1, 2, 2, 2, 3)
X[n] = 4, 4.2, 4, 3.5, 2.5, 2.2, 2.5, 2.8, −2.6, −2.3, −2.6, −3, −3.2, −4, −3.5, −4, −4.2
i. Describe the input X[n] as 2-D coordinates, as in the next figure.
ii. Construct a uniform 2-D vector quantizer with 4 code vectors that lie in the X-Y plane (Note: for a uniform quantizer here, even in 2-D, the distances between quantization points are equal).
iii. Find the MSE of your proposed quantizer (as a measure of the mean squared Euclidean distance); a checking sketch follows below.
iv. Find the total number of bits/dimension if a 45° transform is used before quantization.
v. Suggest another (uniform) quantizer with only 1 bit (2 levels), and compute the new MSE.
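The Lloyd-Max points asked for in A.ii.6 can be approximated numerically. This is a minimal sketch, assuming a zero-mean, unit-variance Gaussian approximated by a large random sample, so the sample-based centroid step stands in for the analytic integrals:

```python
import numpy as np

def lloyd_max_gaussian(levels, n_samples=200_000, iters=100, seed=0):
    """Sample-based Lloyd-Max design for a zero-mean, unit-variance Gaussian."""
    rng = np.random.default_rng(seed)
    x = np.sort(rng.standard_normal(n_samples))
    y = np.linspace(-2.0, 2.0, levels)            # crude initial representatives
    for _ in range(iters):
        b = (y[:-1] + y[1:]) / 2                  # thresholds: midpoints (nearest-neighbour rule)
        idx = np.searchsorted(b, x)               # assign each sample to a cell
        y = np.array([x[idx == i].mean() for i in range(levels)])  # centroid rule
    return y, (y[:-1] + y[1:]) / 2

print(lloyd_max_gaussian(2)[0])   # should land near ±0.79, as marked on the figure
print(lloyd_max_gaussian(4))      # rough 2-bit representatives and thresholds
```

For the MSE in C.iii and C.v, a nearest-neighbour helper can be used; the pairing of consecutive samples into 2-D points and the 4-vector codebook below are only hypothetical placeholders for whatever you propose in part ii:

```python
def vq_mse(points, codebook):
    """Mean squared Euclidean distance of a nearest-neighbour vector quantizer."""
    total = 0.0
    for px, py in points:
        total += min((px - cx) ** 2 + (py - cy) ** 2 for cx, cy in codebook)
    return total / len(points)

samples = [4, 4.2, 4, 3.5, 2.5, 2.2, 2.5, 2.8,
           -2.6, -2.3, -2.6, -3, -3.2, -4, -3.5, -4, -4.2]
points = list(zip(samples[0::2], samples[1::2]))  # consecutive pairs (an assumption)
codebook = [(3, 3), (-3, -3), (3, -3), (-3, 3)]   # hypothetical uniform 4-vector codebook
print(vq_mse(points, codebook))
```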
D. Rate-distortion theory (9 Marks: 5 and 4)
1. For a Bernoulli source that generates only 0 and 1, with probability p for 0 and 1 − p for 1, assume a Hamming distortion with probability D.
- Find the information rate-distortion function as a function of H(p) and H(D).
- Assuming p = 1/2, find the R(D) curve as D changes from 0 to 0.5.
2. Assuming an original signal X that is quantized to X′, show analytically and via the information circles (Venn diagram) of H(X) and H(X′) that the rate-distortion function R(D) = I(X; X′) is a non-increasing function of the distortion D.

E. Draw the general transform coding block diagram, in which you have to show both the compression and the decompression blocks. On your figure, show the following: (4 Marks)
- Original domain, transformed domain, quantized samples, ADDITIVE NOISE, and segmentation.

F. For a JPEG encoder, what is the segmentation size (n×n)? (5 Marks)
- By drawing the segment, show how we read the values in the DCT-transformed blocks.
- On the segmented block, show where the important information of the transformed block is allocated.

G. Transformation, DCT, and FFT: (9 Marks)
a. Given the following transformation matrices, DCT and Walsh-Hadamard respectively:

H_DCT-matrix = (1/2) ×
[  1      1      1      1    ]
[ 1.31   0.54  -0.54  -1.31  ]
[  1     -1     -1      1    ]
[ 0.54  -1.31   1.31  -0.54  ]

and

H_Walsh-matrix = (1/2) ×
[ 1   1   1   1 ]
[ 1   1  -1  -1 ]
[ 1  -1  -1   1 ]
[ 1  -1   1  -1 ]

and a vector that needs to be transformed, X = [10; 13; 18; 25]^T, where Y = H·X:

i. Find the transformed Y in both cases, then show how the rotation matrix can compact the energy of the vector X into Y.
ii. Compare the two results by computing the coding gain CG_T! Which one has the better coding gain, and why? (A numerical sketch follows below.)
iii. Show how we can adapt our quantizer after this transformation such that the distortion stays bounded.
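The transforms in part G can be verified numerically. This is a minimal sketch; the matrices are the ones printed above, and the coding gain is computed from the squared coefficients of this single vector, treated as stand-in variance estimates (an assumption, since only one X is given):

```python
import numpy as np

# Transformation matrices as printed in part G.
H_DCT = 0.5 * np.array([[1.00,  1.00,  1.00,  1.00],
                        [1.31,  0.54, -0.54, -1.31],
                        [1.00, -1.00, -1.00,  1.00],
                        [0.54, -1.31,  1.31, -0.54]])

H_WALSH = 0.5 * np.array([[1,  1,  1,  1],
                          [1,  1, -1, -1],
                          [1, -1, -1,  1],
                          [1, -1,  1, -1]])

x = np.array([10.0, 13.0, 18.0, 25.0])

def coding_gain(y):
    """CG_T = arithmetic mean / geometric mean of the coefficient variances,
    approximated here by the squared coefficients of a single vector."""
    v = y ** 2
    return v.mean() / np.prod(v) ** (1 / len(v))

for name, H in [("DCT", H_DCT), ("Walsh-Hadamard", H_WALSH)]:
    y = H @ x
    print(name, "Y =", np.round(y, 3), "CG_T =", round(coding_gain(y), 2))
```

The larger the coding gain, the more of the signal energy the transform packs into few coefficients, which is what lets the quantizer in part iii spend its bits unevenly across the transformed coefficients while keeping the distortion bounded.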