Mode Decision and Fast Motion Estimation in H.264
K.-C. Yang

Qionghai Dai, Dongdong Zhu and Rong Ding, "FAST MODE DECISION FOR INTER PREDICTION IN H.264," ICIP 2004
Zhi Zhou and Ming-Ting Sun, "FAST MACROBLOCK INTER MODE DECISION AND MOTION ESTIMATION FOR H.264/MPEG-4 AVC," ICIP 2004

Introduction
• Mode decision
  – 16*16, 16*8, 8*16, 8*8, 8*4, 4*8, 4*4
• Motion estimation strategies
  – Full Search (FS)
  – Three Step Search (TSS)
  – Diamond Search (DS)
  – ?

FAST MODE DECISION FOR INTER PREDICTION IN H.264
Qionghai Dai, Dongdong Zhu and Rong Ding
Tsinghua University, China
ICIP 2004

Outline
• Framework
• Down Sampling
• Complex Motion
• Edge Detector
• Motion Estimation Strategy
• Simulation Results

Framework
[Diagram: the original image is down-sampled; the down-sampled current image and reference image are pre-encoded to obtain the mode and motion vector; these are then used to encode the original image against its reference image to produce the output.]

Down Sampling(1)
• 2:1 down sampling
  – 16x16 → 8x8, 16x8 → 8x4, 8x16 → 4x8, 8x8 → 4x4
  – 8x4 → ?, 4x8 → ?, 4x4 → ? (Complex motion inside)

Down Sampling(2)
• ME and mode decision on the down-sampled image
  – DIRECT, 8x8, 8x4, 4x8, 4x4 in the down-sampled image
• Remember the top 2 best modes

  Down-sampled image    Candidate modes in the real image
  DIRECT                DIRECT, 16x16
  8x8                   8x8, 16x16
  8x4                   8x4, 16x8
  4x8                   4x8, 8x16
  4x4                   Complex motion

Complex Motion in Smaller Blocks
• Complex motion
  – Edge detection on the original image
    • Use Sobel operator
    • Horizontal edges
    • Vertical edges
    • Others

Edge Detector(1)
• Edge vector
  – $dx_{i,j}$: response of the Sobel kernel [-1 0 1; -2 0 2; -1 0 1]
  – $dy_{i,j}$: response of the Sobel kernel [-1 -2 -1; 0 0 0; 1 2 1]
  – $Amp_{i,j} = |dx_{i,j}| + |dy_{i,j}|$
  – $Ang_{i,j} = (180°/\pi) \arctan(dy_{i,j}/dx_{i,j})$
  – $Histogram(k) = \sum_{(i,j) \in SET(k)} Amp_{i,j}$
• SET(k) = all pels s.t. $Ang_{i,j} \in a_k$
  – a1 = [-22.25°, 22.25°] (horizontal)
  – a2 = (-90°, -67.5°) ∪ (67.5°, 90°) (vertical)
  – a3 = otherwise
[Figure: angle sectors from -90° to 90°, with a1 around 0°, a2 next to ±90°, and a3 in between.]

Edge Detector(2)
(1: horizontal edges, 2: vertical edges, 3: others)
• Histogram(1) > 2 × Histogram(2) and Histogram(1) > 2 × Histogram(3)
  – 8x4 mode
• Histogram(2) > 2 × Histogram(1) and Histogram(2) > 2 × Histogram(3)
  – 4x8 mode
• Otherwise
  – 4x4 and 8x8 mode

Motion Estimation Strategy
• MVFAST motion estimation
  1. Initial motion vectors
     • S1: (0, 0), MVs of its neighbors
     • S2: scaled MVs obtained from the pre-encoding process
  2. If the min SAD for a MV in S2 is smaller than T1, STOP
     • T1 can be the number of pels of the examined block type
  3. ELSE, let L be the maximum amplitude of the MVs in S1
     • L ≤ TH1 (small motion)
       – Small diamond pattern
     • TH1 < L ≤ TH2 (medium motion)
       – Large diamond pattern
     • L > TH2 (large motion)
       – Use the MV with minimum SAD from the initial MVs
       – Small diamond is then used

Simulation Results
• Compare with
  – JM61 with MVFAST algorithm
    • Config 1. All 7 inter block modes
    • Config 2. 16x16 inter block only

  Sequence   Max PSNR Change (dB)     Avg Time Saving (%)
             Config 1   Config 2      Config 1   Config 2
  Stefan     -0.01      0.91          42         2
  Foreman    -0.18      0.73          47         -1
  Paris      -0.18      1.01          53         0

Simulation Results
[Figure: results for the Stefan sequence.]

FAST MACROBLOCK INTER MODE DECISION AND MOTION ESTIMATION FOR H.264/MPEG-4 AVC
Zhi Zhou and Ming-Ting Sun
Department of Electrical Engineering, University of Washington, Seattle
ICIP 2004

Outline
• Diversity-Based Fast Block Motion Estimation
• Fast Variable Block-Size Motion Estimation Algorithms Based on Merge And Split Procedures for H.264/MPEG-4 AVC
• Framework of Fast Mode Decision And ME
• Simulation Results
• Conclusion

Related Work
• DIVERSITY-BASED FAST BLOCK MOTION ESTIMATION, ICME 2003
• FAST VARIABLE BLOCK-SIZE MOTION ESTIMATION ALGORITHMS BASED ON MERGE AND SPLIT PROCEDURES FOR H.264/MPEG-4 AVC, ISCAS 2004

Diversity-Based Fast Block Motion Estimation
[Figure: labeling pattern
  a b a b
  c d c d
  a b a b
  c d c d ]
• Adaptive diversity search strategy (ADSS)
  1. Perform DS with {a} and {d}, and denote the two obtained MVs v1 and v2.
  2. If (v1 == v2), return v1.
  3. If (|v1-v2| < TH), use DS for {b} and {c} to obtain v3 and v4. Else use TSS for {b} and {c}.
  4. Check v1, v2, v3, and v4. Return the MV with minimum SAD.
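The four ADSS steps can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes {a}, {b}, {c}, {d} are four pixel subsets of the block, each with its own matching cost; it measures |v1-v2| as the larger absolute component difference (the slides do not name the norm); and it substitutes a simple small-diamond descent for the full DS. The names `adss`, `cost_subset` and `cost_full` are hypothetical.

```python
def small_diamond_search(cost, start=(0, 0), max_iter=64):
    """Simplified DS (assumption): move to the best of the 4 neighbours
    until no neighbour has a strictly lower cost."""
    best, best_cost = start, cost(start)
    for _ in range(max_iter):
        neighbours = [(best[0] + dx, best[1] + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        cand = min(neighbours, key=cost)
        cand_cost = cost(cand)
        if cand_cost >= best_cost:
            break
        best, best_cost = cand, cand_cost
    return best


def three_step_search(cost, start=(0, 0), step=4):
    """Classic TSS: test the 3x3 grid at steps 4, 2, 1 around the current best."""
    best = start
    while step >= 1:
        candidates = [(best[0] + dx * step, best[1] + dy * step)
                      for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
        best = min(candidates, key=cost)
        step //= 2
    return best


def adss(cost_subset, cost_full, th=2):
    """cost_subset(name, mv): cost of mv on pixel subset 'a'..'d' (hypothetical).
    cost_full(mv): cost of mv on the whole block (hypothetical)."""
    # Step 1: DS on subsets {a} and {d}.
    v1 = small_diamond_search(lambda mv: cost_subset("a", mv))
    v2 = small_diamond_search(lambda mv: cost_subset("d", mv))
    # Step 2: agreement ends the search early.
    if v1 == v2:
        return v1
    # Step 3: close results keep DS, diverging results switch to TSS.
    diff = max(abs(v1[0] - v2[0]), abs(v1[1] - v2[1]))  # norm is an assumption
    search = small_diamond_search if diff < th else three_step_search
    v3 = search(lambda mv: cost_subset("b", mv))
    v4 = search(lambda mv: cost_subset("c", mv))
    # Step 4: pick the candidate with the lowest full-block cost.
    return min((v1, v2, v3, v4), key=cost_full)
```

In a real encoder the two cost callables would compute SAD between the current block and the displaced reference block; here they are left abstract so the control flow of the four steps stays visible.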
• CIF_Foreman
  – 68%: v1 and v2 are the same
  – 96% of them are final MVs
• TH = 2

  Sequence   ADSS                         DS                           FS
             MSE/pel   Search pels/MB     MSE/pel   Search pels/MB     MSE/pel   Search pels/MB
  Akiyo      3.93      7.35               3.94      12.28              3.92      869.3
  Foreman    36.46     13.89              33.68     16.34              31.01     869.3
  Tennis     121.5     14.40              132.9     17.18              101.5     869.3

Fast Variable Block-Size Motion Estimation Algorithms Based on Merge And Split Procedures for H.264/MPEG-4 AVC
• Motion vector merging and splitting
  – Merging (blocks A, B, C, D into E, F, G, H, I)
    • $MV_E = (MV_A + MV_B)/2$
    • $MV_F = (MV_C + MV_D)/2$
    • $MV_G = (MV_A + MV_C)/2$
    • $MV_H = (MV_B + MV_D)/2$
    • $MV_I = (MV_E + MV_F + MV_G + MV_H)/2$
  – Splitting (block A into B, C, D, E, then F, G, H, I)
    • $MV_B = MV_C = MV_D = MV_E = MV_A$
    • $MV_F = (MV_B + MV_D)/2$
    • $MV_G = (MV_B + MV_E)/2$
    • ...
• ADSS for initial blocks
• Small Diamond Pattern for refinement
• Compare with actual MV

  Sequence          16*16, 16*8, 8*16      8*4, 4*8               4*4
                    Error=0   Error=3      Error=0   Error=3      Error=0   Error=3
  QCIF_Foreman      49.44%    96.62%       72.25%    97.30%       66.06%    98.20%
  CIF_Coast Guard   49.04%    97.83%       67.49%    97.00%       62.82%    96.90%
  CIF_Mobile        48.58%    93.52%       68.60%    94.51%       60.07%    93.71%

Framework of Fast Mode Decision And ME
[Diagram: starting from the predicted MV, ME is run at 16x16 and 8*8; a "good" result ends the search early (Done). Otherwise Merge+ME covers 16*16, 16*8 and 8*16, and Splitting+ME covers 4*4, 4*8 and 8*4.]

Framework of Fast Mode Decision And ME (flowchart)
1. Start from the prediction MV (PMV). Is the motion cost of 16*16 at (0, 0) or PMV < TH16x16?
   – Yes: Best mode = 16*16, MV = (0, 0) or PMV. End. (Exit A: 0 modes)
   – No: 8*8 ME by ADSS.
2. Are the four motion costs of 8*8 < TH8*8?
   – Yes: Best mode = 8*8, MV = MV by ADSS. End. (Exit B: 1 mode)
   – No: 8*16 and 16*8 ME by MV merging.
3. Compare the motion cost of 8*16, 16*8 with that of 8*8.
   – Otherwise (not both larger than 8*8): 16*16 ME by MV merging; choose the best one from 16*16, 16*8 and 8*16. End. (Exit C: 4 modes)
   – Both are larger than 8*8: 8*8 sub-MB mode decision.
4. 8*8 sub-MB mode decision: 8*4 and 4*8 ME by MV splitting. Compare the motion cost of 8*4, 4*8 with that of 8*8.
   – Both are larger than 8*8: Best mode = 8*8. (Exit D: 5 modes)
   – Otherwise: 4*4 ME by MV splitting; choose the best one from 4*4, 8*4 and 4*8. (Exit E: 6 modes)
5. Four 8*8 sub-MBs done? If yes, End; if no, repeat step 4 for the next sub-MB.

Threshold
• TH16*16 = (min cost of previous 20 16*16 blocks) + 600
• TH8*8 = (min cost of previous 20 8*8 blocks) + 150

Simulation Results
• LCM-FFS: Low-Complexity Mode JM with Fast Full-Search
• LCM-MSS: Low-Complexity Mode JM with Merge-Split Search
[Figure: simulation results, with labels C, D and E.]

Conclusion
• A new method for mode decision and ME
  – Block splitting and merging
  – Diversity-based fast ME
  – Thresholds
  – Compare with JM with FFS
    • Saves about ½ of the computation
    • Quality degradation of 0.1 to 0.4 dB
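
The adaptive thresholds from the Threshold slide can be sketched in Python as a sliding-window minimum plus a fixed offset. This is an illustration under assumptions, not the authors' code: the class name `AdaptiveThreshold` is hypothetical, "cost" is taken to be the motion cost of each finished block of the given size, and the fallback to the bare offset before any history exists is an assumption the slides do not state.

```python
from collections import deque


class AdaptiveThreshold:
    """Threshold = (min cost of the previous `window` blocks) + offset."""

    def __init__(self, offset, window=20):
        self.offset = offset
        # deque(maxlen=...) silently drops the oldest cost once full.
        self.costs = deque(maxlen=window)

    def update(self, cost):
        """Record the motion cost of a block that has just been coded."""
        self.costs.append(cost)

    def value(self):
        """Current threshold; with no history, use the offset alone (assumption)."""
        if not self.costs:
            return self.offset
        return min(self.costs) + self.offset


# Offsets taken from the slide: 600 for 16*16 blocks, 150 for 8*8 blocks.
th_16x16 = AdaptiveThreshold(offset=600)
th_8x8 = AdaptiveThreshold(offset=150)
```

An encoder loop would call `update()` after each block and compare the next block's motion cost against `value()` to decide whether to stop at that mode.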