mt 2 sol

ELE 488 F 06 Midterm Test II - Solution

1. The standard quantization table for JPEG is given below as Q, along with the DCT of an 8x8 block, given below as D.

Q =
   16   11   10   16   24   40   51   61
   12   12   14   19   26   58   60   55
   14   13   16   24   40   57   69   56
   14   17   22   29   51   87   80   62
   18   22   37   56   68  109  103   77
   24   35   55   64   81  104  113   92
   49   64   78   87  103  121  120  101
   72   92   95   98  112  100  103   99

D =
 1329   23    3  -20   14    3   -8    6
  -19   56   32  -12    8   -3   10   -7
   -1   85   -3  -33   -2   -5   -6    8
   13   -8  -29  -12    2   -6    2    1
   -1  -10  -18    4   21    0   -4    5
    4    0   -9   27    7   -6    2    3
   -7    6    5    0    2   -5   -1    1
    5    4   -6   -3   -7    2   -1    2

(a) (10 pts) Use the table Q with a multiplier of 2 to quantize D. What are the quantized DCT coefficients?

The actual quantization step is 2 times the corresponding entry of Q. For the first row, 1329/(2x16) = 41.53, which rounds to 42; 23/(2x11) = 1.05, which rounds to 1; 3/(2x10) = 0.15, which rounds to 0; -20/(2x16) = -0.625, which rounds to -1; and so on. Proceeding in this way, the first 4 rows of quantized DCT coefficients are:

   42    1    0   -1    0    0    0    0
   -1    2    1    0    0    0    0    0
    0    3    0   -1    0    0    0    0
    0    0   -1    0    0    0    0    0

The remaining 4 rows are all zero.

(b) (7 pts) What is the mean squared quantization error of the (1,1)-th DCT coefficient?

Indices here count from 0, so the (1,1)-th coefficient is the one in the second row and second column. The question can be interpreted in two different ways:

1. The unquantized coefficient is 56 and the step size is 2x12 = 24, so the quantized index is 2 and the reconstructed value is 2x24 = 48. The squared error is (56 - 48)^2 = 64.
2. The step size is 24, so assuming the error is uniformly distributed over one step, the mean squared quantization error is 24^2/12 = 48.

The two interpretations give different answers, 64 and 48. That the number 48 appears in both calculations is a coincidence.

(c) (8 pts) Describe the run length coding in JPEG using part (a) for illustration.

Reading the quantized coefficients in the zigzag pattern gives
42 1 -1 0 2 0 -1 1 3 0 0 0 0 0 0 0 0 -1 -1
followed by zeros to the end of the block. The DC coefficient 42 is coded differentially against the DC coefficient of the preceding block. Each symbol for the remaining coefficients records the number of zeros before a nonzero quantized coefficient, together with that coefficient. So here we have: no zeros before 1, no zeros before -1, one zero before 2, one zero before -1, no zeros before 1, no zeros before 3, eight zeros before -1, and no zeros before the final -1. The trailing zeros are signaled by an end-of-block symbol.

2. (a) (10 pts) What is a B-frame in MPEG encoding? Why use B-frames?
MPEG frames form a sequence of the form: I B B P B B P B B P B B I . . . I frames are 'intra' frames, each coded by itself. P frames are predicted frames, coded differentially from the previous I or P frame with motion compensation. B frames are bidirectional frames, coded differentially from both the previous and the following P or I frames with motion compensation. The most obvious advantage of using B frames is in handling occlusion: a region that is hidden in the previous reference frame but visible in the following one can still be predicted.

(b) (10 pts) What are motion vectors in MPEG encoding? How are they determined?

For each macroblock (16x16) in a P frame, we look for a 16x16 array of pixels in the previous P or I frame that best matches the given macroblock. The goodness of match is most commonly computed using the sum of absolute pixel differences. The displacement between the macroblock and its best match is the motion vector of that macroblock. Usually the search for the best match is made only over a selected area.

3. Given a triangle with the three vertices at (0,0), (1,1) and (0,1). Suppose f(x,y) = 1 inside this triangle and f(x,y) = 0 outside. For a given θ, the Radon transform pθ(r) is zero outside an interval a(θ) < r < b(θ), and the peak of pθ(r) is at r = c(θ).

(a) (10 pts) Determine the analytic expression for a(θ), b(θ), and c(θ).

Denote by A the point (0,0), B the point (0,1), and C the point (1,1). A point (x,y) projects to r = x cosθ + y sinθ, so A maps to r = 0 for all θ, B maps to r = sinθ, and C maps to r = cosθ + sinθ. So
a = min (0, sinθ, cosθ + sinθ),
b = max (0, sinθ, cosθ + sinθ),
c = the remaining (middle) value, neither a nor b.
See (b) below.

(b) (10 pts) Sketch the three trajectories r = a(θ), r = b(θ), and r = c(θ) in the (r, θ) plane.

First sketch the three curves r = 0, r = sinθ, and r = cosθ + sinθ for 0 < θ < π. It will be seen that
for 0 < θ < π/2, 0 < sinθ < cosθ + sinθ;
for π/2 < θ < 3π/4, 0 < cosθ + sinθ < sinθ;
for 3π/4 < θ < π, cosθ + sinθ < 0 < sinθ.
So
a(θ) = 0 for 0 < θ < 3π/4, a(θ) = cosθ + sinθ for 3π/4 < θ < π.
b(θ) = cosθ + sinθ for 0 < θ < π/2, b(θ) = sinθ for π/2 < θ < π.
c(θ) = sinθ for 0 < θ < π/2, c(θ) = cosθ + sinθ for π/2 < θ < 3π/4, c(θ) = 0 for 3π/4 < θ < π.

(c) (10 pts) Sketch pθ(r) for θ=0, 3π/8, and 5π/8.

pθ(r) is triangular shaped in the (r, pθ) plane.
For θ=0, the three vertices are at (0,0), (0, 1) and (1,0).
For θ=3π/8, the three vertices are at (0,0), (cos(π/8), 1/(√2 cos(π/8))) and (√2 cos(π/8), 0).
For θ=5π/8, the three vertices are at (0,0), (√2 sin(π/8), 1/cos(π/8)) and (cos(π/8), 0).

4. (a) (13 pts) The binary watermark "I" is to be inserted in an 8x8 image to get a watermarked image. The pixel value of the original image is shown below. How do you determine the pixel value of the (1,3)-th element and of the (2,1)-th element? What additional information do you need to accomplish this?

(b) (6 pts) How can the watermark "I" be retrieved from the watermarked image?

(c) (6 pts) What happens when the bottom half of the watermarked image is replaced by an image of size 4x8?

See lecture notes L19.
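
The quantization, zigzag scan and run-length steps of Problem 1 can be checked numerically. The following is a minimal Python sketch, not the course's reference code. The function names quantize, zigzag and run_lengths are made up for illustration, and D is entered with the layout implied by the worked divisions in part (a) (first row 1329, 23, 3, -20, ...). The sketch stops at (zeros before, value) pairs and leaves out the Huffman coding, the special symbol for runs of 16 zeros, and the differential coding of the DC term used in real JPEG.

```python
# Standard JPEG luminance quantization table.
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# DCT block from Problem 1 (assumed layout: first row 1329, 23, 3, -20, ...).
D = [
    [1329, 23, 3, -20, 14, 3, -8, 6],
    [-19, 56, 32, -12, 8, -3, 10, -7],
    [-1, 85, -3, -33, -2, -5, -6, 8],
    [13, -8, -29, -12, 2, -6, 2, 1],
    [-1, -10, -18, 4, 21, 0, -4, 5],
    [4, 0, -9, 27, 7, -6, 2, 3],
    [-7, 6, 5, 0, 2, -5, -1, 1],
    [5, 4, -6, -3, -7, 2, -1, 2],
]


def quantize(block, table, multiplier=2):
    """Divide each coefficient by multiplier * table entry and round."""
    return [
        [round(d / (multiplier * q)) for d, q in zip(d_row, q_row)]
        for d_row, q_row in zip(block, table)
    ]


def zigzag(block):
    """Read a square block along anti-diagonals, alternating direction."""
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Odd diagonals run with the row index increasing, even ones
        # with the column index increasing.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r][c] for r, c in order]


def run_lengths(ac):
    """Return (zeros_before, value) pairs for the nonzero AC coefficients.

    Trailing zeros produce no pair; an encoder would emit end-of-block.
    """
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


quantized = quantize(D, Q)
scan = zigzag(quantized)
pairs = run_lengths(scan[1:])  # scan[0] is the DC coefficient

for row in quantized[:4]:
    print(row)
print("DC:", scan[0])
print("AC (zeros before, value):", pairs)
```

Python's round resolves exact halves to the nearest even integer, which does not matter here because none of the 64 quotients falls exactly on a half.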

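
The support endpoints and peak of Problem 3 can be checked numerically. The sketch below is illustrative only. It assumes the projection convention r = x cosθ + y sinθ, uses a made-up function name support_and_peak, and takes the peak height from the fact that pθ(r) is a triangle whose area equals the area of the original triangle, 1/2.

```python
import math

# Vertices A, B, C of the triangle in Problem 3.
VERTICES = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
AREA = 0.5  # area of the triangle, which equals the integral of p_theta(r)


def support_and_peak(theta):
    """Return (a, c, b, peak_height) of the projection at angle theta."""
    # Project each vertex onto the direction (cos theta, sin theta).
    projections = sorted(
        x * math.cos(theta) + y * math.sin(theta) for x, y in VERTICES
    )
    a, c, b = projections  # smallest, middle, largest
    # A triangular profile with base (b - a) and area AREA has this height.
    peak = 2.0 * AREA / (b - a)
    return a, c, b, peak


for theta in (0.0, 3 * math.pi / 8, 5 * math.pi / 8):
    a, c, b, peak = support_and_peak(theta)
    print(f"theta={theta:.4f}: a={a:.4f}, c={c:.4f}, b={b:.4f}, peak={peak:.4f}")
```

The printed values can be compared with the closed forms in part (c), for example c = cos(π/8) and b = √2 cos(π/8) at θ = 3π/8.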