Video processing for multimedia systems
G. de Haan, technische universiteit eindhoven

Schedule lectures 5P530
Weeks 1 to 4: Basics (Ch 2, 3), Filtering (Ch 4), Video Enhancement (Ch 5), Picture-Rate Conversion (Ch 7/9)
Weeks 5 to 8: De-interlacing (Ch 8), Questions, Motion Estimation (Ch 10), Object Detection (Ch 11)

Motion Estimation
• Is there any motion?
• How fast?
• In which direction? (displacement components Dx, Dy)

Application dependency of ME
• Scan rate conversion (true-motion vectors)
  • Picture rate conversion
  • De-interlacing
• Video compression (low prediction error)
  • MPEG
  • H.263
True-motion vectors are usually more consistent than coding vectors. Consistency has some, but no dominant, relevance for coding efficiency.

Motion estimation and coding
[Block diagram: a motion-compensated prediction (picture delay, motion compensation, ME) is subtracted from the input; the prediction error goes to the output]
Image compression: accuracy demands decrease with increasing frequency (DCT transform + quantization).

Pixel-recursive ME methods

Pixel (pel) recursive ME
Earliest methods, many variants.
Algorithm: determine the gradient of the displaced frame difference (DFD), and update the vector in the direction of decreasing DFD.
[Figure: $DFD^2$ versus displacement D, with its gradient $dDFD^2/dD$ and the iterations i, i+1, i+2, i+3]

Pel-recursive ME
1) $\vec{D}_i = \vec{D}_{i-1} - \vec{u}$
2) $\vec{u} \propto \frac{d}{d\vec{D}}\, DFD^2(\vec{x}, \vec{D}_{i-1}, n)$
3) $DFD(\vec{x}, \vec{D}_{i-1}, n) = F(\vec{x}, n) - F(\vec{x} - \vec{D}_{i-1}, n-1)$
4) $\vec{u} \propto DFD(\vec{x}, \vec{D}_{i-1}, n)\, \frac{d}{d\vec{x}} F(\vec{x} - \vec{D}_{i-1}, n-1)$

Pel-recursive ME: the use of predictions
[Figure: the current pixel with a spatial causal prediction and temporal predictions along the time axis]

Why not so popular anymore?
• Pel-recursive estimators require fairly complex calculations for every pixel in the image
• As soon as applications that required real-time motion estimation became practical, complexity reduction of the estimator was crucial
  • Primarily coding, later also scan conversion
• For coding, one vector per pixel is not attractive

Block-matching ME methods: full search

Block-matching: find the corresponding block in image n-1
[Figure: the current block in image n and the corresponding block inside the search area in image n-1]

Finding block similarity
[Figure: the current block compared over displacements (Dx, Dy) within the search area]

Formal definitions
Luminance value in the previous picture, shifted over candidate vector C:
$F(\vec{x} - \vec{C}, n-1)$
A block matcher optimizes a function, Cost, varying C:
$\epsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \mathrm{Cost}\big(F(\vec{x}, n), F(\vec{x} - \vec{C}, n-1)\big)$
The candidate vector for which the error is minimal is assumed to be the displacement vector $\vec{D}(\vec{x}, n)$.

Normalised cross-correlation
$\epsilon(\vec{C}, \vec{X}, n) = \dfrac{\sum_{\vec{x} \in B(\vec{X})} F(\vec{x}, n) \cdot F(\vec{x} - \vec{C}, n-1)}{\sqrt{\sum_{\vec{x} \in B(\vec{X})} F^2(\vec{x}, n) \cdot \sum_{\vec{x} \in B(\vec{X})} F^2(\vec{x} - \vec{C}, n-1)}}$
• favourable performance
• rather high operations count

Summed Square Error
$\epsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \big(F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big)^2$
• good performance
• acceptable operations count

Summed Absolute Difference
$\epsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \big|F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big|$
• still good performance
• favourable operations count

Significantly different pixels
$\epsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} T\big(\big|F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big|\big)$
with $T(a) = 1$ if $a > \text{threshold}$, and $T(a) = 0$ if $a \le \text{threshold}$
• rather poor performance
• favourable operations count, reduced register size compared to SAD

Alternative match criteria (listed from highest to lowest complexity)
• Correlation (NCCF) of pixels in the two blocks
• Mean Square Error (MSE) between pixels in the blocks
• Mean Absolute Difference (MAD) between pixels in the blocks
• Number of significantly different pixels (NSD) in the two blocks

Comparison of match criteria
[Figure: results for MSE, SAD and NSDP]

Operations count of full search block matching
• CCIR signal: 720x288x50 (pixels/s)
• Search window for realistic velocities: 64x48 (HxV in pixels) = 3000 possible vectors, assuming integer vector accuracy
• Matching error (SAD) calculation only: approximately $1 \times 10^{11}$ (ops/s)

Block-matching: efficient search techniques

Finding block similarity
[Figure: the current block and displacements (Dx, Dy) within the search area]

Sub-sampled search
[Figure: the current block and a sub-sampled set of displacements (Dx, Dy) within the search area]

Sub-sampled full search
[Figure: search area in the (Dx, Dy) plane, searched in steps 1 and 2]

3-step search (Koga et al., 1981)
[Figure: search positions in the (Dx, Dy) plane]

One-at-a-time search (Srinivasan & Rao, 1985)
[Figure: search positions in the (Dx, Dy) plane]
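As a reference point for the efficient search techniques, the SAD criterion and the exhaustive (full) search from the slides above can be sketched in a few lines of plain Python. This is a minimal illustration, not the course software: the function names `sad`, `full_search` and `make_pair`, the list-of-lists picture format, the ±4 pixel range and the synthetic test pair are all assumptions made for the example.

```python
import random


def sad(cur, prev, bx, by, cx, cy, bs=8):
    """SAD for the bs x bs block with top-left corner (bx, by):
    sum over the block of |F(x, n) - F(x - C, n-1)|, with C = (cx, cy).
    The caller must keep the displaced block inside the picture."""
    total = 0
    for y in range(by, by + bs):
        for x in range(bx, bx + bs):
            total += abs(cur[y][x] - prev[y - cy][x - cx])
    return total


def full_search(cur, prev, bx, by, rng=4, bs=8):
    """Test every integer candidate with |cx|, |cy| <= rng and return
    (best_vector, best_sad). The zero vector is tested first and ties
    keep the earlier candidate."""
    best, best_err = (0, 0), sad(cur, prev, bx, by, 0, 0, bs)
    for cy in range(-rng, rng + 1):
        for cx in range(-rng, rng + 1):
            err = sad(cur, prev, bx, by, cx, cy, bs)
            if err < best_err:
                best, best_err = (cx, cy), err
    return best, best_err


def make_pair(size=32, dx=3, dy=-2, seed=1):
    """Hypothetical test data: a random texture that moves by (dx, dy)
    from picture n-1 to picture n (wrap-around at the borders)."""
    r = random.Random(seed)
    prev = [[r.randrange(256) for _ in range(size)] for _ in range(size)]
    cur = [[prev[(y - dy) % size][(x - dx) % size] for x in range(size)]
           for y in range(size)]
    return cur, prev


if __name__ == "__main__":
    cur, prev = make_pair()
    print(full_search(cur, prev, bx=12, by=12))
```

The operations count on the slide follows the same structure as the two nested loops: 720 x 288 x 50 is about $1.04 \times 10^{7}$ pixels/s, each pixel takes part in about 3000 candidate tests, and if each pixel difference is assumed to cost roughly three elementary operations (subtract, absolute value, accumulate), the total is on the order of $10^{11}$ ops/s.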
Successive approximation may become necessary
[Figure: contour plot of the error plane over (Dx, Dy), with search steps 1, 2, 3 leading from the start x0 to the minimum xmin]

Prevention of trap in local minimum
[Figure: contour plot of the error plane with start points X0(a) to X0(d) and the minima Xmin(a) to Xmin(d) reached from them]

Reality is even more complicated…

And sometimes there is no unique solution…

Comparison of search techniques
[Figure: results for FS, LogS and OTS]

Pixel sub-sampling in match function

Intermediate conclusion
• Efficient search techniques can highly reduce the operations count of a block matching motion estimator, but increase the risk of getting trapped in a local minimum of the error function
• Methods to prevent the disadvantages of efficient search increase complexity again.
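The trade-off stated in the intermediate conclusion can be made concrete with a sketch of a three-step search. The code below is a minimal illustration in plain Python under stated assumptions: the names `sad` and `three_step_search`, the list-of-lists picture format and the step sizes 4, 2, 1 (range ±7 pixels) are chosen for the example and are not taken from the slides.

```python
def sad(cur, prev, bx, by, cx, cy, bs=8):
    """Sum over the block of |F(x, n) - F(x - C, n-1)|, with C = (cx, cy).
    The caller must keep the displaced block inside the picture."""
    total = 0
    for y in range(by, by + bs):
        for x in range(bx, bx + bs):
            total += abs(cur[y][x] - prev[y - cy][x - cx])
    return total


def three_step_search(cur, prev, bx, by, first_step=4, bs=8):
    """Test the 8 neighbours around the current best vector at a coarse
    step, move to the best one, halve the step and repeat.
    Returns (best_vector, best_sad, number_of_candidates_tested)."""
    best, best_err = (0, 0), sad(cur, prev, bx, by, 0, 0, bs)
    tested = 1
    step = first_step
    while step >= 1:
        cx0, cy0 = best  # centre is fixed during one step
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue  # centre was already tested
                err = sad(cur, prev, bx, by, cx0 + ox, cy0 + oy, bs)
                tested += 1
                if err < best_err:
                    best, best_err = (cx0 + ox, cy0 + oy), err
        step //= 2
    return best, best_err, tested
```

With these step sizes the search tests 25 candidates where a full search over the same ±7 range would test 15 x 15 = 225. The price is visible in the code: every later step only looks around the winner of the previous, coarser step, so a wrong coarse decision on a multi-modal error plane cannot be undone.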
Pixel sub-sampling of match error criterion
[Figure: the current block and displacements (Dx, Dy) within the search area]

Pixel sub-sampling in match error criterion
[Figure: pixel sub-sampling patterns, with labels 1, 4, 2, 4]

Block sub-sampling
[Figure: candidate vector from the current block in picture n into the search area in picture n-1; axes H-position, V-position and picture number]

Interpolate missing motion vectors
[Figure: the current block with its neighbours Up, Le, Ri and Lo]
1: Current Dx = median{Le_x, (Up_x + Lo_x)/2, Ri_x}
2: Current Dy = median{Le_y, (Up_y + Lo_y)/2, Ri_y}
Use the vector median to prevent new vectors.

Summary cost reduction block matchers
• Simple match criterion
• Efficient search strategy
• Pixel sub-sampling in match criterion
  • a factor of four is usually feasible with little influence on the performance
• Block sub-sampling
  • only valid if motion field is smooth

Vectors and object velocity

Full search block matching motion vectors
[Figure: vector field]

True motion versus best match
Poor relation between vectors and velocities
[Figure: SAD error planes for three blocks: the number seven, the arm and the scarf]
• Seven: clear minimum
• Arm: no clear minimum
• Scarf: multiple minima
SAD: $\epsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \big|F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big|$
C is the motion vector, F the image grey value, B an 8x8 block, x the pixel position, n the picture number.

Block-matching: true-motion estimation

What is wrong with block matching?
• Blocks are not unique
• Optimization is an ill-posed problem
• Testing for best match gives too many solutions
• Solutions:
  • Introduce bias, e.g. towards consistent vectors (test better)
  • Post-processing, e.g. eliminating outliers (test again)
  • Pre-selection of likely candidates (test less)

Introduce bias: test better
Minimal match error gives no unique solution:
$\epsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \big|F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big|$
An improved criterion takes into account that vectors are consistent within objects and over time:
$\epsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \big|F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big| + P_s(\vec{C}) + P_t(\vec{C})$
$P_s$ and $P_t$ are penalties depending on the spatial and temporal consistency of the candidate vector.
PROBLEM: consistency is only known after completion… Usually an iterative approach is required.

Post-processing: test again

Post processing to improve vector consistency (Reuter, 1988)
[Figure: block grid with H-positions x-4X to x+4X and V-positions y-2Y to y+3Y]
$\vec{D}_o(\vec{X}) = F_p\big(\vec{D}(\vec{X}_k)\big),\ k \in \text{Neighbourhood}$

The effect of post-filtering (5x3 blocks)
[Figure: original, average-filtered and median-filtered vector fields]

Pre-selection: test less

Hierarchical block matching (Thoma & Bierling, 1989)
[Figure: a coarse estimation on the down-sampled picture at the highest level initialises medium-size update vectors on the down-sampled picture at the intermediate level, which in turn initialise small-size update vectors on the original picture]

Hierarchical block matching
[Figure: vector fields from hierarchical search and from full search]

Pre-selection in Fourier domain: Phase Plane Correlation
• PPC is a two-step hierarchical motion estimator
  1) Select the largest correlation peaks in the Fourier domain using blocks larger than 64x64
  2) Test SAD only for these vectors on small blocks, here 8x8, in the spatial domain
• Algorithm originally proposed by Graham Thomas, and applied in professional studio scan converters

Time recursive block matching (Ninomiya, 1982)
[Figure: candidate grid with Cx and Cy from -6 to +6 in steps of 2]
Test SAD only for these vectors, centred around the result vector of the previous picture.

ST-recursive candidate selection (after the break)

3-D Recursive Search block-matching

3-Dimensional Recursive Search (3DRS)
Assumptions:
1. Objects are LARGER than blocks
2. Objects have INERTIA
Candidate set
• Spatial candidates
• Temporal candidates
• Updated candidates

3-D RS: How to start?
Single random update sufficient!
[Figure: in the (Dx, Dy) plane, the spatial prediction candidates, the temporal prediction candidate and a noise vector update]

Chosen candidates
[Figure: blocks marked by chosen candidate type: spatial, temporal, update]

Performance

Operations Count
[Bar chart: operations count per method. Off-scale bars: FS 2000, H3 1500, Pel-Rec 1000. Bars in the order shown: PPC, 4-St, 3-St, OTS, H2, 3-D RS, with values 125, 100, 75, 68, 22, 10]

Performance of a true-motion estimator: Smoothness

Vector field smoothness
[Bar chart: smoothness for 4-St, 3-St, FS, OTS, H2, PPC and 3-D RS; bar values read from the chart: 4.3, 0.8, 0.2, 0.3, 0.3, 0.9, 0.5]

Performance testing of true-motion estimator: M2SE
[Figure: ME and MC between pictures n-1, n and n+1]
$M2SE(n) = \sum_{\vec{x}} \big(F(\vec{x}, n) - F_{mc}(\vec{x}, n)\big)^2$
$F_{mc}(\vec{x}, n) = \frac{1}{2}\big(F(\vec{x} - \vec{D}(\vec{x}), n-1) + F(\vec{x} + \vec{D}(\vec{x}), n+1)\big)$
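The M2SE measure defined above compares picture n with the average of its two motion-compensated neighbours. The following is a minimal sketch in plain Python under simplifying assumptions that are not in the slides: a single global vector instead of a vector per block, wrap-around at the picture borders, and normalisation by the number of pixels. The names `m2se` and `shift` are hypothetical.

```python
def shift(img, dx, dy):
    """Hypothetical helper: move picture content by (dx, dy) with
    wrap-around, i.e. out(x) = img(x - D)."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def m2se(prev, cur, nxt, vec):
    """Squared error between F(x, n) and
    F_mc(x, n) = 0.5 * (F(x - D, n-1) + F(x + D, n+1)).
    Assumptions: one global vector D = vec, wrap-around borders,
    result divided by the pixel count."""
    h, w = len(cur), len(cur[0])
    dx, dy = vec
    total = 0.0
    for y in range(h):
        for x in range(w):
            back = prev[(y - dy) % h][(x - dx) % w]  # F(x - D, n-1)
            fwd = nxt[(y + dy) % h][(x + dx) % w]    # F(x + D, n+1)
            f_mc = 0.5 * (back + fwd)
            total += (cur[y][x] - f_mc) ** 2
    return total / (h * w)
```

For a sequence built with `cur = shift(prev, 2, 1)` and `nxt = shift(cur, 2, 1)`, the true vector (2, 1) gives an error of zero, while a wrong vector such as (0, 0) gives a positive error. Because picture n itself is never used for estimating D between n-1 and n+1, the measure rewards vectors that describe the real motion rather than vectors that merely minimise a match error.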
M2SE score of ME-methods
[Bar chart: M2SE per method. Bars in the order shown: 4-St, OTS, 3-St, H2, FS, H3, PPC, 3-DRS, with values 244, 196, 189, 137, 120, 112, 101, 106]

Comparison of best vector fields
[Figure: Phase Plane Correlation motion vectors versus 3-D Recursive Search BM motion vectors]

MC up-conversion: relevance of true-motion vectors
[Figure: interpolated image using full search motion vectors versus interpolated image using 3D-RS motion vectors]
In contrast with coding, for scan rate conversion true motion is an absolute must. RATHER SMOOTH THAN ACCURATE!!

Simplifications 1) Reduced candidate set

With 8 predictions and 1 update: 9 candidates
[Figure: block grid (H-pos x-2X to x+2X, V-pos y-Y to y+2Y) showing the current block, spatial candidates Sa to Sd from blocks in the current field and temporal candidates Ta to Td from blocks in the previous field]

3DRS, 4 candidates are enough (including 1 update)
[Figure: the same block grid with spatial candidates Sa and Sb from the current field and one temporal candidate T from the previous field]

Y-estimator, advantage for pipe-lining
[Figure: the same block grid with spatial candidates Sa and Sb and one temporal candidate T, in the Y-shaped arrangement]

Effect of candidate reduction
[Figure: three vector fields, in the order shown: M2SE 21.5 with S 2.8, M2SE 26.0 with S 1.7, M2SE 23.3 with S 2.6]

Block diagram of Y-estimator: simple hardware
[Block diagram: a prediction memory and an update generator (mod p count, look-up table, update U(X,n)) feed the best vector selection, which uses the current and previous picture and outputs D(X,n) and D(x,n)]

Simplifications 2) Reduced resolution for ME

ME with reduced resolution compared to application input
[Block diagram: the video signal is down-scaled, motion estimation on the reduced video gives D(x,n), and the up-scaled motion vectors go to the application (de-interlacing, PRC, etc.) that produces the output]

Sophistications

Iterating more than once on an image pair
[Chart: effect of iterations, once versus 10 times, on the 1st image; M2SE and 100 x smoothness plotted against the number of iterations, 1 to 10]
Remark 1: If estimating in the output domain (100 Hz): 2 iterations on video and 4 iterations on film material!
Remark 2: Effect mainly shows in the 1st image after a scene change:
• 1 iteration, 10th frame: M2SE: 29, Smoothness: 2.8
• 10 iterations, 10th frame: M2SE: 28, Smoothness: 3.5

Block erosion

Block diagram of Y-estimator: simple hardware
[Block diagram: as before, with a block erosion stage after the best vector selection that converts D(X,n) into D(x,n)]

Block erosion
[Figure: no BE and 1, 2 and 3 step BE; each block C is split into sub-blocks V1 to V4, and each sub-block vector is a median involving C and two of the neighbouring blocks U, D, L, R]

The effect of block erosion

Advanced scanning

3-Dimensional Recursive Search (3DRS)
[Figure: normal scan, meandering scan and reverse scan]

Parametric motion models

Global motion estimation
• Simple parametric motion model:
  $D_x(\vec{x}, n) = p_1(n) + p_3(n)\,x + p_5(n)\,y + \ldots$
  $D_y(\vec{x}, n) = p_2(n) + p_4(n)\,y + p_6(n)\,x + \ldots$
• p1 and p2 describe pan and tilt
• p3 and p4 describe zoom
• p5 and p6 describe rotation

Sample vector field to calculate model parameters
A motion model with 4 parameters can be calculated from any 2 independent sample vectors. So, in total, 18 models can be estimated from these 9 vectors.

Derive robust background model from sample vectors
Take the median of all estimated parameters to eliminate outliers:
$p_1 = \mathrm{median}\{p_1^{(1)}, p_1^{(2)}, p_1^{(3)}, \ldots, p_1^{(18)}\}$
$p_2 = \mathrm{median}\{p_2^{(1)}, p_2^{(2)}, p_2^{(3)}, \ldots, p_2^{(18)}\}$
$p_3 = \mathrm{median}\{p_3^{(1)}, p_3^{(2)}, p_3^{(3)}, \ldots, p_3^{(18)}\}$
$p_4 = \mathrm{median}\{p_4^{(1)}, p_4^{(2)}, p_4^{(3)}, \ldots, p_4^{(18)}\}$

Extra candidate from parametric motion model (SAA4992)
[Block diagram: the estimator with prediction memory, update vector generator (mod p counter, look-up table, update U(X,n)), best vector selection and block erosion on the current and previous picture; in addition, a microprocessor calculates the parameters p1, p2, .. from sample vectors, and local candidates calculated from them are offered to the best vector selection; outputs D(X,n) and D(x,n)]

Effect of extra candidates from parametric model
[Figure: vector field without parametric candidate versus with parametric candidate]
Clearly, the effect depends on the settings of the candidate's penalty!

Block-hopping

Chosen candidates
[Figure: blocks marked by chosen candidate type: spatial, temporal, update]
• In many cases the spatial prediction (SP) is good.
• Save calculations on average by checking the other candidates only if the SP error is above a threshold Th.

Block-hopping
[Figure: blocks for which all SADs are calculated; grey blocks are skipped]

Block hopping: optimal resource usage
[Block diagram: the SAD of SP is calculated from the vector memory and compared with Th; multiplexers either assign SP directly or calculate all SADs and assign the best D; the threshold is adapted from the calculated resource usage per field]

Motion estimation and occlusion

The basic block matching concept
[Figure: reference block of 8 x 8 pixels in picture n, candidate vector into the search area in picture n-1; axes H-position, V-position and picture number]

How to estimate motion in occlusion areas?
[Figure: pictures n-1 and n; information not available in the previous picture]

Ambiguities due to uncovering
[Figure: position versus time for pictures n-1 and n]
Preference for FG-vector in uncovered areas?

How to estimate motion in occlusion areas?
[Figure: pictures n-1 and n; information not available in the next picture, information not available in the previous picture]

Motion estimation problem in occlusion areas
• Observations:
  • Foreground: matches always, i.e. in the previous and in the next picture
  • Background:
    • In case of covering, all background will match in the previous picture
    • In case of uncovering, all background will match in the next picture
• Conclusion: switch between "forward" and "backward" motion estimation to prevent ambiguities

Solution: in covering areas "forward" estimation
[Figure: reference block of 8 x 8 pixels, candidate vector and search area between pictures n-1 and n]

Solution: in uncovering areas "backward" estimation
[Figure: reference block of 8 x 8 pixels, candidate vector and search area between pictures n-1 and n]

Unambiguous motion vectors for original images
Look for correspondences in BOTH neighbouring images, select the prediction with the highest correlation.
[Figure: position versus time for pictures n-1, n and n+1, with forward and backward matches]

Comparison 2 frame and 3 frame motion estimation
[Figure: vector fields from 2 frame ME and 3 frame ME]
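The idea of looking for correspondences in both neighbouring pictures can be sketched as follows. This is a minimal illustration in plain Python, not the method of the slides in detail: it uses the lowest SAD as a stand-in for "highest correlation", and the names `block_sad`, `three_frame_match` and `make_uncovering_demo`, the picture format and the ±3 pixel range are assumptions made for the example.

```python
import random


def block_sad(cur, ref, bx, by, ox, oy, bs=8):
    """SAD between the block at (bx, by) in `cur` and the block at
    (bx + ox, by + oy) in the reference picture `ref`."""
    return sum(abs(cur[y][x] - ref[y + oy][x + ox])
               for y in range(by, by + bs)
               for x in range(bx, bx + bs))


def three_frame_match(prev, cur, nxt, bx, by, rng=3, bs=8):
    """For every candidate C, match against F(x - C, n-1) and against
    F(x + C, n+1), and keep the overall lowest SAD.
    Returns (sad, vector, picture_used)."""
    best = None
    for cy in range(-rng, rng + 1):
        for cx in range(-rng, rng + 1):
            for name, ref, sign in (("previous", prev, -1), ("next", nxt, 1)):
                err = block_sad(cur, ref, bx, by, sign * cx, sign * cy, bs)
                if best is None or err < best[0]:
                    best = (err, (cx, cy), name)
    return best


def make_uncovering_demo(size=24, dx=2, dy=-1, seed=3):
    """Hypothetical test data for an uncovered area: the content of
    picture n moves on by (dx, dy) into picture n+1, but picture n-1
    holds unrelated content at that place."""
    r = random.Random(seed)

    def rand_img():
        return [[r.randrange(256) for _ in range(size)] for _ in range(size)]

    cur = rand_img()
    prev = rand_img()
    nxt = [[cur[(y - dy) % size][(x - dx) % size] for x in range(size)]
           for y in range(size)]
    return prev, cur, nxt
```

In the synthetic uncovering case the previous picture offers no good match for any candidate, whereas the next picture matches exactly at the true vector, so the selection falls on the next picture. A two-frame estimator restricted to pictures n-1 and n would have to pick the least bad of many poor matches.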
Global motion estimation

Projection based global motion estimation
• Algorithm:
  • Accumulate luminance over all lines
  • Accumulate luminance over all columns
  • Determine global H- and V-motion based on these projections
[Demo: Samsung ME]

Projection based global motion estimation
• Global motion: minimum SAD of the projections of the current and previous image
[Figure: projections F(i,k) and F(i,k+1) over index i, with the global ME shift between them]

Success and failure of the projection based global ME

Conclusions
• Motion estimators for scan rate conversion differ from estimators for coding, due to the additional true-motion constraint
• True motion results from constraints like spatial and temporal consistency
  • 3 options: better criterion, post-processing, pre-selection
• Pre-selection options
  • Hierarchical approach (e.g. Phase Plane Correlation)
  • Recursive approach (3-D RS)

Conclusions
• Picture rate conversion requires very consistent but not necessarily very accurate motion vectors (integer resolution sufficient); the range should be at least +/-16 pixels
• De-interlacing requires very accurate motion vectors (at least 1/4 pixel). For larger vectors the accuracy is less important

Prepare yourself for the exam…
• Last week:
  • Chapter 8
• Today:
  • Chapter 10 (not: object based ME)
• I recommend you read the text
  • Book available at Pt9:24
• And try the exercises in the book:
  • Chapter 8
  • Chapter 10, skip 10.6
  • You have to download VidProc (w3.ics.ele.tue.nl/~dehaan/ )
    • Send me e-mail for password (G.d.Haan@tue.nl)

Questions. Part 6. Motion estimation
1. A full-search block-matcher uses blocks of 8x8 pixels and a search range of 7x5 blocks. How many candidate vectors have to be evaluated per block?
2. A 3-D recursive-search block-matcher uses blocks of 8x8 pixels and a search range of 7x5 blocks. How many candidate vectors have to be evaluated per block in case the true-motion vector is (Dx,Dy) = (4,3)?
3. Increase the number of iterations using 3DRS block-matching and evaluate the effect on the M2SE and smoothness.
4. Analyze the effect of a parametric motion model in 3DRS (choose a suitable test sequence!).
5. Try some of the available motion estimation algorithms of the software (4 frames will do).
   1. How do they compare in M2SE and smoothness?
   2. What is the effect of vector-field post-processing on these two quality metrics?
   3. What is the effect of the match criterion and the number of images used in it?
   4. How would you rate the algorithms by (subjectively) evaluating the vector field?