Tutorial_12_E

Visual and Auditory Systems, Tutorial 12
The Technion - Israel Institute of Technology, Electrical Engineering Faculty

Tutorial 12 – Signal Representation, Visual Communication

Question 1:
$\mathbb{R}^{N \times N}$ is a Hilbert space with the following inner product:

$$\langle a, b \rangle = \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} a(m,n)\, b(m,n)$$

Additionally, there is a collection of $N^2$ matrices $\{\varphi_{k,l}\}$:

$$\varphi_{k,l}(m,n) = C(k) \cos\!\left(\frac{\pi (2m+1) k}{2N}\right) \cdot C(l) \cos\!\left(\frac{\pi (2n+1) l}{2N}\right), \qquad C(k) = \begin{cases} \sqrt{1/N}, & k = 0 \\ \sqrt{2/N}, & k \neq 0 \end{cases}$$

a. Show that in this space the collection of matrices $\{\varphi_{k,l}\}$ is an orthonormal set.

Definition: A picture $a \in \mathbb{R}^{N \times N}$ is given by:

$$a = \sum_{k,l=0}^{N-1} \hat{a}_{k,l}\, \varphi_{k,l}$$

The DCT (Discrete Cosine Transform) of $a$ is $\hat{a}$.

b. Given the following matrix (N = 8):

$$a = \begin{pmatrix}
44 & 38 & 43 & 23 & 18 & 9 & 3 & 10 \\
48 & 49 & 4 & -44 & -57 & -60 & -59 & -57 \\
41 & 44 & -2 & -69 & -73 & -84 & -77 & -68 \\
53 & 45 & -20 & -73 & -74 & -89 & -96 & -78 \\
41 & 51 & 12 & -70 & -84 & -92 & -98 & -85 \\
44 & 42 & 16 & -56 & -71 & -71 & -70 & -79 \\
44 & 41 & 30 & -32 & -60 & -48 & -56 & -56 \\
40 & 49 & 41 & 7 & -3 & -24 & -15 & -15
\end{pmatrix}$$

What is the DCT of the matrix?

c. Assuming the aforementioned matrix describes a picture, we wish to describe it using only $N^2/2$ pixels. To do so we shall choose the most meaningful members (by absolute value) of matrix $a$. What is the average reconstruction error per pixel, in grey levels? What is the reconstruction error when the picture is represented using $N^2/2$ of its DCT coefficients?

d. A pixel in the picture has a diameter $d$. The distance from the viewer to the picture is $D$. What is the connection between the matrix $\varphi_{k,l}$ and the (radial) angular spatial frequency which it represents? Find the values of the following MTF function at the matching frequencies (weight matrix).
$$\mathrm{MTF}(\omega) = C \left( \alpha + \frac{\omega}{\omega_0} \right) e^{-\left( \omega / \omega_0 \right)^{\beta}}, \qquad C = 2.2,\ \alpha = 0.192,\ \beta = 1.1,\ \omega_0 = 8\ [\mathrm{cpd}],\ D = 3072 \cdot d$$

e. Based on the MTF function, how can the DCT coefficients be quantized while causing minimal damage to the quality of the received picture?

Solution:

Section a:
The set $\{\varphi_{k,l}\}$ defines a collection of $N^2$ lattices. Each lattice is a product of a vertical lattice and a horizontal lattice, unlike a directional lattice defined by $\cos(\omega_x x + \omega_y y)$. For N = 8 we receive the following 64 lattices:

[Figure: the 64 basis lattices for N = 8, indexed by k and l]

The inner product of two lattices $\varphi_{k_1,l_1}$, $\varphi_{k_2,l_2}$ is:

$$\begin{aligned}
\langle \varphi_{k_1,l_1}, \varphi_{k_2,l_2} \rangle
&= C(k_1) C(l_1) C(k_2) C(l_2) \sum_{m=0}^{N-1} \sum_{n=0}^{N-1}
\cos\!\left(\tfrac{\pi (m+\frac12) k_1}{N}\right)
\cos\!\left(\tfrac{\pi (n+\frac12) l_1}{N}\right)
\cos\!\left(\tfrac{\pi (m+\frac12) k_2}{N}\right)
\cos\!\left(\tfrac{\pi (n+\frac12) l_2}{N}\right) \\
&= C(k_1) C(l_1) C(k_2) C(l_2)
\underbrace{\sum_{m=0}^{N-1} \cos\!\left(\tfrac{\pi (m+\frac12) k_1}{N}\right) \cos\!\left(\tfrac{\pi (m+\frac12) k_2}{N}\right)}_{\tilde{\delta}(k_1,k_2)}
\cdot
\underbrace{\sum_{n=0}^{N-1} \cos\!\left(\tfrac{\pi (n+\frac12) l_1}{N}\right) \cos\!\left(\tfrac{\pi (n+\frac12) l_2}{N}\right)}_{\tilde{\delta}(l_1,l_2)} \\
&= C(k_1) C(l_1) C(k_2) C(l_2)\, \tilde{\delta}(k_1,k_2)\, \tilde{\delta}(l_1,l_2)
= \delta(k_1 - k_2)\, \delta(l_1 - l_2)
\end{aligned}$$

For:

$$\tilde{\delta}(k_1,k_2) = \begin{cases} N, & k_1 = k_2 = 0 \\ N/2, & k_1 = k_2 \neq 0 \\ 0, & \text{else} \end{cases}$$

Meaning we are looking at an orthonormal set.
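The orthonormality claim can also be checked numerically. The following is a minimal sketch, assuming the standard DCT normalization $C(0) = \sqrt{1/N}$, $C(k \neq 0) = \sqrt{2/N}$ and N = 8; the function names are illustrative only and are not part of the tutorial.

```python
import math

N = 8  # block size used in the tutorial


def C(k):
    """Normalization factor of the DCT basis."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def phi(k, l):
    """Basis matrix phi_{k,l}(m, n) as a list of rows."""
    return [[C(k) * math.cos(math.pi * (2 * m + 1) * k / (2 * N))
             * C(l) * math.cos(math.pi * (2 * n + 1) * l / (2 * N))
             for n in range(N)]
            for m in range(N)]


def inner(a, b):
    """Inner product <a, b> = sum over m, n of a(m, n) * b(m, n)."""
    return sum(a[m][n] * b[m][n] for m in range(N) for n in range(N))


def max_orthonormality_error():
    """Largest deviation of <phi_i, phi_j> from the Kronecker delta."""
    basis = {(k, l): phi(k, l) for k in range(N) for l in range(N)}
    worst = 0.0
    for key1, b1 in basis.items():
        for key2, b2 in basis.items():
            expected = 1.0 if key1 == key2 else 0.0
            worst = max(worst, abs(inner(b1, b2) - expected))
    return worst


if __name__ == "__main__":
    print(max_orthonormality_error())
```

The printed value should be at the level of floating-point round-off, which agrees with the analytic result.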
The function $\tilde{\delta}(k_1,k_2)$ can be found:

$$\begin{aligned}
\tilde{\delta}(k_1,k_2)
&= \sum_{m=0}^{N-1} \cos\!\left(\tfrac{\pi (m+\frac12) k_1}{N}\right) \cos\!\left(\tfrac{\pi (m+\frac12) k_2}{N}\right) \\
&= \frac12 \sum_{m=0}^{N-1} \cos\!\left(\tfrac{\pi (m+\frac12)(k_1+k_2)}{N}\right)
 + \frac12 \sum_{m=0}^{N-1} \cos\!\left(\tfrac{\pi (m+\frac12)(k_1-k_2)}{N}\right) \\
&= \frac12 \operatorname{Re} \sum_{m=0}^{N-1} \exp\!\left(\tfrac{j\pi m (k_1+k_2)}{N} + \tfrac{j\pi (k_1+k_2)}{2N}\right)
 + \frac12 \operatorname{Re} \sum_{m=0}^{N-1} \exp\!\left(\tfrac{j\pi m (k_1-k_2)}{N} + \tfrac{j\pi (k_1-k_2)}{2N}\right) \\
&= \underbrace{\frac12 \operatorname{Re}\left\{ \exp\!\left(\tfrac{j\pi (k_1+k_2)}{2N}\right) \frac{\exp\left(j\pi (k_1+k_2)\right) - 1}{\exp\!\left(\tfrac{j\pi (k_1+k_2)}{N}\right) - 1} \right\}}_{A}
 + \underbrace{\frac12 \operatorname{Re}\left\{ \exp\!\left(\tfrac{j\pi (k_1-k_2)}{2N}\right) \frac{\exp\left(j\pi (k_1-k_2)\right) - 1}{\exp\!\left(\tfrac{j\pi (k_1-k_2)}{N}\right) - 1} \right\}}_{B}
\end{aligned}$$

For expression A we receive:

$$\begin{aligned}
k_1 = k_2 = 0 &: \quad A = N/2 \quad \text{(l'Hôpital's rule)} \\
k_1 + k_2 \neq 0 \text{ even} &: \quad A = 0 \\
k_1 + k_2 \text{ odd} &: \quad
A = \frac12 \operatorname{Re}\left\{ \exp\!\left(\tfrac{j\pi (k_1+k_2)}{2N}\right) \frac{-2}{\exp\!\left(\tfrac{j\pi (k_1+k_2)}{N}\right) - 1} \right\} \\
&\quad\ = -\operatorname{Re}\left\{ \frac{\exp\!\left(\tfrac{j\pi (k_1+k_2)}{2N}\right) \left[ \exp\!\left(-\tfrac{j\pi (k_1+k_2)}{N}\right) - 1 \right]}{\left[ \exp\!\left(\tfrac{j\pi (k_1+k_2)}{N}\right) - 1 \right] \left[ \exp\!\left(-\tfrac{j\pi (k_1+k_2)}{N}\right) - 1 \right]} \right\} \\
&\quad\ = -\operatorname{Re}\left\{ \frac{\exp\!\left(-\tfrac{j\pi (k_1+k_2)}{2N}\right) - \exp\!\left(\tfrac{j\pi (k_1+k_2)}{2N}\right)}{2 - 2\cos\!\left(\tfrac{\pi (k_1+k_2)}{N}\right)} \right\}
 = -\operatorname{Re}\left\{ \frac{-2j \sin\!\left(\tfrac{\pi (k_1+k_2)}{2N}\right)}{2 - 2\cos\!\left(\tfrac{\pi (k_1+k_2)}{N}\right)} \right\} = 0
\end{aligned}$$

Likewise, for expression B we receive:

$$\begin{aligned}
k_1 = k_2 &: \quad B = N/2 \\
k_1 \neq k_2 \ (k_1 - k_2 \text{ even or odd}) &: \quad B = 0
\end{aligned}$$

Summing the two expressions, $\tilde{\delta}(k_1,k_2) = A + B$ equals $N$ for $k_1 = k_2 = 0$, $N/2$ for $k_1 = k_2 \neq 0$, and 0 otherwise.

Conclusions:
1.
The set $\{\varphi_{k,l}\}$ includes $N^2$ linearly independent matrices; therefore it is an orthonormal basis of the $\mathbb{R}^{N \times N}$ space.
2. Representing a picture according to the $\{\varphi_{k,l}\}$ set is equivalent to finding the lattices (harmonics) composing the picture and determining their relative weights.

Section b:
The matrix $a$ is based on the following picture:
In order to find the representation coefficients, the inner products between the picture $a$ and the members of the set $\{\varphi_{k,l}\}$ should be computed. For instance, the DC component of the picture is described by:

$$\hat{a}_{0,0} = \frac{1}{N} \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} a(m,n) = 158$$

And the weight of the $\varphi_{2,4}$ lattice is:

$$\hat{a}_{2,4} = \langle a, \varphi_{2,4} \rangle = \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} a(m,n)\, \varphi_{2,4}(m,n) = 12$$

All the representation coefficients can be calculated in a similar manner, as in the following matrix (the picture shows absolute values):

$$\hat{a} \approx \begin{pmatrix}
158 & 318 & 121 & 23 & 44 & 33 & 6 & 16 \\
7 & 17 & 7 & 12 & 14 & 7 & 4 & 4 \\
166 & 90 & 53 & 8 & 12 & 20 & 2 & 1 \\
25 & 7 & 15 & 3 & 9 & 9 & 13 & 0 \\
57 & 28 & 21 & 1 & 17 & 7 & 3 & 1 \\
14 & 15 & 4 & 6 & 7 & 3 & 7 & 3 \\
27 & 21 & 10 & 2 & 2 & 2 & 3 & 9 \\
2 & 1 & 3 & 15 & 8 & 0 & 8 & 3
\end{pmatrix}$$

As can be seen in picture $a$, there is a sharp horizontal transition from a bright area to a dark area, which is expressed in the $\hat{a}_{0,1}$ coefficient. Additionally, the dominant lattices composing picture $a$ are the horizontal lattices. This is consistent with the horizontal tendency of the grey levels in the picture itself.

Section c:
We shall choose the $N^2/2$ most meaningful members of the matrix $a$, according to absolute value (the rest of the members are set to 0). The average reconstruction error per pixel, in MSE terms, is:

$$\mathrm{MSE}_a = \frac{1}{N^2} \left\| b - a \right\| = 2.8$$

where $b$ is the truncated matrix.
Meaning there is an average error of 2.8 grey levels in each pixel. On the other hand, the reconstruction error for the DCT coefficients is:

$$\mathrm{MSE}_{\hat{a}} = \frac{1}{N^2} \left\| c - a \right\| = 0.4$$

where $c$ is the inverse DCT of the truncated matrix $\hat{a}$.
We received $\mathrm{MSE}_{\hat{a}} < \mathrm{MSE}_a$, meaning most of the picture's energy is described by a small number of DCT coefficients, unlike the grey-level representation. This is an important characteristic of the transform, since it allows describing pictures by a relatively small number of members while still maintaining a small representation error.
For example, the following picture was obtained by dividing a picture into blocks of size N×N, calculating the DCT of each block, zeroing half of the coefficients and performing an inverse transform:
The difference picture between the reconstructed picture and the original picture is given below. As expected, the reconstruction error is concentrated where there are edges (high frequencies) in the picture.

Section d:
We shall find the angular frequency in 1D: the $k$ parameter determines the lattice's number of cycles over N pixels. The cycle length $M$ in pixels is obtained from:

$$\frac{\pi (m + \frac12 + M) k}{N} = \frac{\pi (m + \frac12) k}{N} + 2\pi \ \Rightarrow\ \frac{\pi M k}{N} = 2\pi \ \Rightarrow\ M = \frac{2N}{k}$$

The viewing angle in which N pixels are seen is:

$$\theta = \frac{N \cdot d}{D}\ [\mathrm{rad}] = \frac{N \cdot d}{D} \cdot \frac{180}{\pi}\ [\mathrm{deg}]$$

The number of cycles seen in this angle is $k/2$. Now, the angular frequency describes the number of cycles seen in one degree. Therefore:

$$\omega_k = \frac{k/2}{\theta} = \frac{k\, \pi D}{360\, N \cdot d}\ [\mathrm{cpd}]$$

In 2D:

$$\omega_{k,l} = \sqrt{\omega_k^2 + \omega_l^2} = \frac{\pi D}{360\, N \cdot d} \sqrt{k^2 + l^2}\ [\mathrm{cpd}]$$

Therefore the coefficients $\hat{a}_{k,l}$ describe the contribution of the lattice $\varphi_{k,l}$, with angular frequency $\omega_{k,l}$, to the picture's assembly.
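As a quick numerical illustration of the 2D frequency formula, the sketch below evaluates $\omega_{k,l} = \frac{\pi D}{360 N d}\sqrt{k^2 + l^2}$ for N = 8 and the viewing distance $D = 3072 \cdot d$ given in the question. The function name and argument names are chosen here for illustration.

```python
import math


def angular_frequency(k, l, n_pixels=8, d_over_pixel=3072.0):
    """Angular frequency (cycles per degree) of lattice (k, l).

    d_over_pixel is the ratio D / d between the viewing distance
    and the pixel diameter (3072 in the tutorial question).
    """
    scale = math.pi * d_over_pixel / (360.0 * n_pixels)
    return scale * math.hypot(k, l)


if __name__ == "__main__":
    # Print the 8x8 table of frequencies, one row per k.
    for k in range(8):
        print(" ".join(f"{angular_frequency(k, l):5.1f}" for l in range(8)))
```

The first row grows in steps of about 3.35 cpd, so the highest lattice (k = l = 7) lies a little above 33 cpd.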
The given MTF function is shown graphically:

The suitable angular frequencies:

$$\omega_{k,l} = \frac{\pi D}{360\, N \cdot d} \sqrt{k^2 + l^2} = \begin{pmatrix}
0 & 3.4 & 6.7 & 10.1 & 13.4 & 16.8 & 20.1 & 23.5 \\
3.4 & 4.7 & 7.5 & 10.6 & 13.8 & 17.1 & 20.4 & 23.7 \\
6.7 & 7.5 & 9.5 & 12.1 & 15.0 & 18.0 & 21.2 & 24.4 \\
10.1 & 10.6 & 12.1 & 14.2 & 16.8 & 19.5 & 22.5 & 25.5 \\
13.4 & 13.8 & 15.0 & 16.8 & 19.0 & 21.5 & 24.2 & 27.0 \\
16.8 & 17.1 & 18.0 & 19.5 & 21.5 & 23.7 & 26.2 & 28.8 \\
20.1 & 20.4 & 21.2 & 22.5 & 24.2 & 26.2 & 28.4 & 30.9 \\
23.5 & 23.7 & 24.4 & 25.5 & 27.0 & 28.8 & 30.9 & 33.2
\end{pmatrix}$$

And the matching weight matrix:

$$W = \begin{pmatrix}
0.42 & 0.92 & 0.99 & 0.88 & 0.70 & 0.53 & 0.38 & 0.26 \\
0.92 & 0.98 & 0.98 & 0.85 & 0.68 & 0.51 & 0.37 & 0.26 \\
0.99 & 0.98 & 0.91 & 0.78 & 0.62 & 0.47 & 0.34 & 0.24 \\
0.88 & 0.85 & 0.78 & 0.66 & 0.53 & 0.40 & 0.29 & 0.21 \\
0.70 & 0.68 & 0.62 & 0.53 & 0.43 & 0.33 & 0.24 & 0.17 \\
0.53 & 0.51 & 0.47 & 0.40 & 0.33 & 0.26 & 0.19 & 0.14 \\
0.38 & 0.37 & 0.34 & 0.29 & 0.24 & 0.19 & 0.15 & 0.11 \\
0.26 & 0.26 & 0.24 & 0.21 & 0.17 & 0.14 & 0.11 & 0.08
\end{pmatrix}$$

As expected, low angular frequency lattices are more significant in the picture's assembly than high frequency lattices. Therefore, if we change the value of a DCT coefficient that matches a high frequency (by quantizing it, for example), we will not damage the picture's quality significantly.

Section e:
Since the picture's quality is determined by the human viewer, the MTF function can be used to quantize the DCT coefficients themselves. $q$ is defined as the index of the quantization process (a large index indicates a large quantization step). The weighted quantization steps are:

$$Q_{k,l} = \frac{q}{W_{k,l}}$$

Meaning there is an inverse relation: the smaller the lattice's weight, the larger the quantization step can be.
The quantization is described by:

$$\hat{a}^{q}_{k,l} = \operatorname{round}\!\left( \frac{\hat{a}_{k,l}}{Q_{k,l}} \right)$$

In this manner, by determining a quantization step for each lattice, every DCT coefficient can be quantized separately, achieving minimal damage to the picture's quality as seen by the human viewer.

In conclusion:
The DCT of pictures has a few characteristics:
- The representation coefficients are real.
- The representation is done according to an orthonormal basis.
- The DCT is not the real part of a discrete Fourier transform (DFT). Instead, it describes a Fourier transform of a symmetrical signal obtained by a mirrored replication of the original signal.
- The transform can be computed efficiently (similarly to the FFT algorithm).
- Energy concentration: most of the picture's energy is described by a small number of coefficients.
- Correlation reduction: grey-level values of adjacent pixels are highly correlated, whereas the DCT coefficients are not. This characteristic is very important in the encoding of the coefficients' values (entropy encoding, which will not be covered in this tutorial).
- The DCT coefficients can be linked to the MTF function of the human visual system.

The approach described in this question is a fundamental description of the JPEG algorithm (Joint Photographic Experts Group). In this algorithm the picture is divided into 8x8 blocks and the DCT is performed separately on each block. The DCT coefficients are quantized in a manner similar to the weight matrix W, and only a small part of them is sent to the receiver. The process causes loss of information (it is lossy), since not all the DCT coefficients are sent to the receiver, but the human viewer receives a picture with minor distortion.
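The weighting and quantization steps of sections d and e can be sketched numerically. The sketch assumes the MTF has the form $C(\alpha + \omega/\omega_0)\exp(-(\omega/\omega_0)^\beta)$ with the constants from the question, a form that reproduces the tabulated weights (for example 0.42 at $\omega = 0$). The helper names and the `dequantize` step are illustrative additions, not part of the tutorial.

```python
import math

# Constants from the question (D_OVER_D is the ratio D / d).
C, ALPHA, BETA, OMEGA0 = 2.2, 0.192, 1.1, 8.0
N, D_OVER_D = 8, 3072.0


def mtf(omega):
    """Assumed MTF form: C * (alpha + w/w0) * exp(-(w/w0)**beta)."""
    x = omega / OMEGA0
    return C * (ALPHA + x) * math.exp(-(x ** BETA))


def weight(k, l):
    """Weight W[k][l]: the MTF evaluated at the lattice's angular frequency."""
    omega = math.pi * D_OVER_D / (360.0 * N) * math.hypot(k, l)
    return mtf(omega)


def quantize(coeff, k, l, q):
    """Quantize one DCT coefficient with step Q = q / W[k][l].

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    step = q / weight(k, l)
    return round(coeff / step)


def dequantize(level, k, l, q):
    """Illustrative inverse: map a quantization level back to a coefficient."""
    return level * q / weight(k, l)


if __name__ == "__main__":
    for k in range(N):
        print(" ".join(f"{weight(k, l):4.2f}" for l in range(N)))
```

With a fixed index q, a coefficient at a low-weight (high-frequency) lattice is divided by a larger step, so it is represented more coarsely than a coefficient at a high-weight lattice.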
Question 2:
A TV transmitter produces the following signal, used to describe color information:

$$r(t) = D \cos(\omega_0 t) + E \sin(\omega_0 t) + C$$

a. Express C, E, D using R, G, B (the correction coefficients connecting $R - Y$, $B - Y$ with $V$, $U$ can be omitted), assuming this is a normal color broadcast.
b. Due to a mishap the transmitted signal is $\hat{r}(t) = E \cos(\omega_0 t) + D \sin(\omega_0 t) + C$ (meaning E and D are exchanged). The signals which appear in the picture below are the R, G, B components of the camera that feeds the impaired broadcast. Calculate the R, G, B output received in a normal receiver for each time interval. Which color in the 3D color space (a cube with R, G, B axes) will the received color resemble (proximity to the colors at the cube's corners)?

Solution:
a. $r(t) = D \cos(\omega_0 t) + E \sin(\omega_0 t) + C$
A normal broadcast is according to:

$$r(t) = y + \rho \cos(\omega_0 t - \theta), \qquad \rho = \sqrt{u^2 + v^2}, \qquad \theta = \tan^{-1}\!\left(\frac{v}{u}\right)$$

Using trigonometric identities:

$$r(t) = y + \rho \left[ \cos(\omega_0 t) \cos(\theta) + \sin(\omega_0 t) \sin(\theta) \right]$$

$$C = y, \qquad D = \rho \cos(\theta) = u = R - y, \qquad E = \rho \sin(\theta) = v = B - y$$

In conclusion:

$$\begin{aligned}
C &= y = 0.11B + 0.59G + 0.3R \\
D &= R - y = -0.11B - 0.59G + 0.7R \\
E &= B - y = 0.89B - 0.59G - 0.3R
\end{aligned}$$

b.
Due to the mishap the transmitted signal is $\hat{r}(t) = E \cos(\omega_0 t) + D \sin(\omega_0 t) + C$.
The values the receiver "comprehends" are marked with a hat:

$$\begin{aligned}
\hat{y} &= C \\
\hat{u} &= E = \hat{R} - \hat{y} \ \Rightarrow\ \hat{R} = \hat{u} + \hat{y} = E + C = B \\
\hat{v} &= D = \hat{B} - \hat{y} \ \Rightarrow\ \hat{B} = \hat{y} + \hat{v} = D + C = R \\
\hat{G} &= \frac{1}{0.59} \left( \hat{y} - 0.11 \hat{B} - 0.3 \hat{R} \right) = G + 0.322\,(R - B)
\end{aligned}$$

In conclusion:

$$\hat{R} = B, \qquad \hat{G} = G + 0.322\,(R - B), \qquad \hat{B} = R$$

Graphically:

A summarizing table:

Time            Transmitted   Received
0 < t < T       Red           Blue
T < t < 2T      Blue          Red
2T < t < 3T     Green         Green
3T < t < 4T     Cyan          Yellow
4T < t < 5T     White         White

Question 3:
A Hilbert space $H = L^2[0,1]$ is given with the standard inner product:

$$\langle f, g \rangle = \int_0^1 f(t)\, g^*(t)\, dt, \qquad f, g \in H$$

Given the following vectors: $\varphi_1(t) = t$, $\varphi_2(t) = t^2$
a. Is $\{\varphi_1, \varphi_2\}$ an orthogonal set?
b. What is the biorthonormal set of $\{\varphi_1, \varphi_2\}$?

Solution:
a. We demand $\langle \varphi_1, \varphi_2 \rangle = 0$:

$$\langle \varphi_1, \varphi_2 \rangle = \int_0^1 t \cdot t^2\, dt = \frac14 \neq 0 \ \Rightarrow\ \text{not orthogonal}$$

b. A general approach for finding the biorthonormal set using a Gram matrix:
Given a sequence $\{\varphi_n\} \in H$, a biorthonormal sequence $\{\psi_n\} \in H$ that fulfils the requirement $\langle \psi_n, \varphi_m \rangle = \delta_{mn}$ can be found in the following manner:
1. Find the Gram matrix, defined by $G_{ij} = \langle \varphi_j, \varphi_i \rangle$.
2. Find the inverse Gram matrix $Q = G^{-1}$.
3. The biorthonormal set is given by $\psi_n = \sum_m Q_{mn}\, \varphi_m$.
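The three steps above can be sketched for $\varphi_1 = t$, $\varphi_2 = t^2$ with exact rational arithmetic. This is a minimal illustration that relies on $\langle t^p, t^q \rangle = 1/(p+q+1)$ on $[0,1]$; polynomials are stored as `{power: coefficient}` dictionaries, a representation chosen here for convenience.

```python
from fractions import Fraction


def inner_monomials(p, q):
    """<t^p, t^q> on [0, 1] equals 1 / (p + q + 1)."""
    return Fraction(1, p + q + 1)


def poly_inner(a, b):
    """Inner product on [0, 1] of real polynomials given as {power: coeff}."""
    return sum(ca * cb * inner_monomials(pa, pb)
               for pa, ca in a.items()
               for pb, cb in b.items())


# phi_1(t) = t, phi_2(t) = t^2
phi = [{1: Fraction(1)}, {2: Fraction(1)}]

# Step 1: Gram matrix G[i][j] = <phi_j, phi_i>
G = [[poly_inner(phi[j], phi[i]) for j in range(2)] for i in range(2)]

# Step 2: inverse of the 2x2 Gram matrix
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
Q = [[G[1][1] / det, -G[0][1] / det],
     [-G[1][0] / det, G[0][0] / det]]


# Step 3: psi_n = sum over m of Q[m][n] * phi_m
def biorthonormal(n):
    out = {}
    for m in range(2):
        for power, coeff in phi[m].items():
            out[power] = out.get(power, Fraction(0)) + Q[m][n] * coeff
    return out


psi = [biorthonormal(0), biorthonormal(1)]

if __name__ == "__main__":
    print(Q)
    print(psi)
```

Using `Fraction` avoids round-off, so the biorthonormality condition $\langle \psi_n, \varphi_m \rangle = \delta_{mn}$ can be checked exactly.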
Back to our question, finding the Gram matrix:

$$G = \begin{pmatrix} \langle \varphi_1, \varphi_1 \rangle & \langle \varphi_2, \varphi_1 \rangle \\ \langle \varphi_1, \varphi_2 \rangle & \langle \varphi_2, \varphi_2 \rangle \end{pmatrix}$$

Calculating the inner products:

$$\langle \varphi_1, \varphi_1 \rangle = \frac13, \qquad \langle \varphi_1, \varphi_2 \rangle = \langle \varphi_2, \varphi_1 \rangle = \frac14, \qquad \langle \varphi_2, \varphi_2 \rangle = \frac15$$

The biorthonormal set is found according to $\psi = G^{-1} \varphi$, therefore:

$$G = \begin{pmatrix} \frac13 & \frac14 \\ \frac14 & \frac15 \end{pmatrix} \ \Rightarrow\ G^{-1} = \begin{pmatrix} 48 & -60 \\ -60 & 80 \end{pmatrix}$$

$$\psi = G^{-1} \varphi \ \Rightarrow\ \begin{cases} \psi_1 = 48 \varphi_1 - 60 \varphi_2 \\ \psi_2 = -60 \varphi_1 + 80 \varphi_2 \end{cases}$$

Question 4:
Alternative solution - tutorial 11, using the Gram method:
We need to find the biorthonormal set for $\{f^1_{mn}, f^2_{mn}\}$, given:

$$\begin{aligned}
f^1_{mn} &= g(x - mD) \cos(nWx), & 0 \le n < \infty,\ & -\infty < m < \infty \\
f^2_{mn} &= g(x - mD) \sin(nWx), & 1 \le n < \infty,\ & -\infty < m < \infty
\end{aligned}$$

Solution:
In order to find the Gram matrix, first we calculate the inner products:

$$\langle f^1_{mn}, f^1_{mn} \rangle = \langle f^2_{mn}, f^2_{mn} \rangle = \int_{(m-\frac12)D}^{(m+\frac12)D} \cos^2(nWx)\, dx = \frac{D}{2}$$

$$\langle f^1_{mn}, f^2_{mn} \rangle = \frac{D}{2} \cos(\theta_0) = \frac{D}{4}$$

The rest of the inner products are equal to zero, due to the orthogonality of the cosines and the windows' lack of overlap. Since there is an infinite number of functions in the set $\{f^1_{mn}, f^2_{mn}\}$, the Gram matrix is of infinite size, but it has a separable nature.
A set of sub-matrices $G_{mn}$ can be defined for $m \in \mathbb{Z}$, $n > 0$:

$$G_{mn} = \begin{pmatrix} \langle f^1_{mn}, f^1_{mn} \rangle & \langle f^2_{mn}, f^1_{mn} \rangle \\ \langle f^1_{mn}, f^2_{mn} \rangle & \langle f^2_{mn}, f^2_{mn} \rangle \end{pmatrix} = \frac{D}{4} \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$$

The inverse matrices:

$$Q_{mn} = \left( G^{-1} \right)_{mn} = \frac{4}{3D} \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$$

The biorthonormal set is given by:

$$\left( \psi^1_{mn},\ \psi^2_{mn} \right) = \left( f^1_{mn},\ f^2_{mn} \right) \left( G^{-1} \right)_{mn} = \frac{4}{3D} \left( f^1_{mn},\ f^2_{mn} \right) \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$$

Therefore:

$$\psi^1_{mn} = \frac{4}{3D} \left( 2 f^1_{mn} - f^2_{mn} \right)$$

Likewise:

$$\psi^2_{mn} = \frac{4}{3D} \left( 2 f^2_{mn} - f^1_{mn} \right)$$

The explicit trigonometric forms follow by substituting the definitions of $f^1_{mn}$ and $f^2_{mn}$ from tutorial 11.
The only case left to be handled is $m \in \mathbb{Z}$, $n = 0$:

$$G_{m0} = \langle f^1_{m0}, f^1_{m0} \rangle = D \ \Rightarrow\ \psi^1_{m0} = \frac{g(x - mD)}{D}$$

Question 5 – Gabor functions:
Responses from three different cells in the visual system, $r_1$, $r_2$, $r_3$, were measured in a physiological experiment. For simplicity we shall assume a 1D light excitation, as a function of $x$ alone. The following results were obtained ($B_0 > B_1$):
Excitation $E_1(x) = B_0 + B_1 \sin(2\pi \cdot 10 x)$ caused a reaction only from cell $r_1$.
Excitation $E_2(x) = B_0 + B_1 \sin(2\pi \cdot 20 x)$ caused a reaction only from cell $r_2$.
Excitation $E_3(x) = B_0 + B_1 \cos(2\pi \cdot 10 x)$ caused a reaction only from cell $r_3$.
It is assumed that the cells respond to the presented signal according to the real or imaginary part of an inner product with Gabor functions $f_{mn}(x)$:

$$r_i = \operatorname{Re}\left\{ \int_{-\infty}^{\infty} E(x)\, f_{mn}(x)\, dx \right\} \quad \text{or} \quad r_i = \operatorname{Im}\left\{ \int_{-\infty}^{\infty} E(x)\, f_{mn}(x)\, dx \right\}$$

where $B_0$ is omitted prior to performing the inner product. The Gabor functions in this question are of the form $f_{mn}(x) = g(x - mD)\, e^{jnWx}$, where $W = 2\pi$, $D = 1$, and $g(x)$ is a normalized square envelope of width $D$, meaning $g(x) = 1$ when $|x| \le 0.5$, otherwise $g(x) = 0$.
a.
According to the results, characterize as much as you can the specific Gabor function related to each of the cells ($m$ and/or $n$), and specify which part (real or imaginary) the cell reacts to.
b. Give an example of an excitation which will cause a reaction from all three cells. Is there a signal that can excite only two of the cells without affecting the third one? If there is, give an example. If not, explain why.
c. A light point approximated by $E_4(x) = B_1 \delta(x - vt)$ is moved along the X axis. Given that the movement velocity is $v = 1$, the results are:
$r_1$ reacts only in the time frame $-2.5 < t < -1.5$.
$r_2$ reacts only in the time frame $4.5 < t < 5.5$.
$r_3$ reacts only in the time frame $4.5 < t < 5.5$.
Complete the information known regarding the identification of the Gabor functions involved.
d. Sketch the reaction of cell $r_2$ as a function of the time $t$, in response to the signal in section c. What is the name in physiology of the reaction you received?
e. Qualitatively describe the range of sensitivity of the three cells in the space-frequency domain.

Solution:
a. The excitations expressed using the given Gabor functions ($B_0$ is omitted):

$$E_1(x) = B_1 \sin(2\pi \cdot 10 x) = B_1 \sum_{m=-\infty}^{\infty} g(x - m) \sin(2\pi \cdot 10 x) = B_1 \sum_{m=-\infty}^{\infty} g(x - m)\, \frac{e^{j 2\pi \cdot 10 x} - e^{-j 2\pi \cdot 10 x}}{2j}$$

Therefore, in the case of an excitation $E_1(x)$, the cells which will react are those with $n = 10$, imaginary part:

$$r_1 = \operatorname{Im}\{ a_{m,10} \}$$

Likewise for $E_2(x)$, $E_3(x)$:

$$r_2 = \operatorname{Im}\{ a_{m,20} \}, \qquad r_3 = \operatorname{Re}\{ a_{m,10} \}$$

Since the excitations are periodic in the space domain, $m$ cannot be determined!
b. Since the Gabor representation is linear, the reaction to a sum of excitations is the sum of the reactions, meaning the excitation $E = E_1 + E_2 + E_3$ will cause a response in all three cells.
c.
There is also an excitation which will cause a reaction in $r_1$ and $r_2$ without affecting $r_3$ (for example $E_1 + E_2$, by the same linearity argument).
According to the location of the light point in the time frames where each cell reacted, $m$ can now be determined:

$$r_i = \begin{Bmatrix} \operatorname{Re} \\ \operatorname{Im} \end{Bmatrix} \left\{ \int_{-\infty}^{\infty} B_1 \delta(x - vt)\, f_{mn}(x)\, dx \right\} = B_1\, g(vt - mD) \begin{Bmatrix} \operatorname{Re} \\ \operatorname{Im} \end{Bmatrix} \left\{ e^{jnWvt} \right\}$$

The reaction is different from zero only for $|t - m| < \frac12$, meaning:

$$m - \tfrac12 < t < m + \tfrac12$$

Therefore:

$$r_1: m = -2, \qquad r_2: m = 5, \qquad r_3: m = 0$$

d.

$$r_2(t) = \operatorname{Im}\left\{ \int_{-\infty}^{\infty} B_1 \delta(x - t)\, g(x - 5)\, e^{j 20 W x}\, dx \right\} = \begin{cases} B_1 \sin(2\pi \cdot 20 t), & 4.5 < t < 5.5 \\ 0, & \text{else} \end{cases}$$

20 sine cycles of uniform amplitude. This is actually the receptive field of cell $r_2$!
e. In the space-frequency domain: along the space axis, the function is limited to bands of width $D$. The envelope of every band is uniform in the $x$ direction and sinc-like in the frequency ($\omega$) direction. Every choice of an effective cutoff in the frequency domain (for example, 3 dB) will result in rectangular ranges:

Last update – January 2011
