Relation Between Video Bitrate And Frame Size In Arbitrary Downsizing Transcoding

HAIWEI SUN, YAP-PENG TAN and YONGQING LIANG
School of Electrical & Electronic Engineering
Nanyang Technological University
Singapore
pg01533119@ntu.edu.sg

Abstract: - Video downsizing transcoding is a common technique for adapting the bitrate or spatial/temporal resolution of a compressed video to suit different transmission bandwidths or receiving devices. In arbitrary downsizing video transcoding [1], the transcoder can adjust the size of a pre-coded video to meet the requirements of different transmission channels or receiving devices, but there is no model that describes the relation between bitrate and frame size. In this paper, we propose a new method to model the relation between bitrate and frame size under low-bitrate conditions. With this method, we can re-estimate the bitrate for a required frame size, or select a suitable frame size for a given target bitrate, while maintaining good video quality. Experimental results are presented to demonstrate the effectiveness of the proposed method.

Key-Words: - Arbitrary downsizing transcoding, bitrate, frame size

1 Introduction
Video transcoding is a common technique for adapting the bitrate or spatial/temporal resolution of a compressed video to suit different transmission bandwidths or receiving devices. The arbitrary downsizing video transcoding technique [1], which can downsize a pre-coded video by an arbitrary factor, gives us more choices in adjusting the spatial resolution of a pre-compressed video according to the available channel bandwidth. Given a target bitrate, however, it is not clear which reduced size provides the best video quality. Moreover, if users want to select the display size, the bitrate required for that size is also unclear. Thus, we focus on modelling the relation between the frame size of the transcoded video and its required bitrate.
The relation between bitrate and frame size supports two applications:
1) If the available channel bandwidth is insufficient, we can select a reduced frame size for the transcoded video to meet the available bandwidth.
2) If users want to change the frame size during video transmission, we can estimate a suitable bitrate for transcoding the video.

In this paper, we focus on developing an efficient model of the relation between video bitrate and frame size. Given a precoded video with known frame size and bitrate, the model estimates the bitrate required at a new frame size while maintaining the video quality (distortion) of the precoded video. To simplify the estimation, the same mean quantization parameter Q for all macroblocks is used in both the precoded video and the downsizing-transcoded video. Since we are most interested in downsizing transcoding at low bitrates, we set Q to 32 and the frame rate to 10 frames per second to simulate the low-bitrate case in our experiments. Six standard test videos, "Tempete", "Foreman", "Stefan", "Mobile & Calendar (MC)", "Mother & Daughter (MD)", and "News", are used in our experiments.

2 Video Bitrate And Frame Size
The total number of bits allocated for a frame can be estimated as the sum of the bits used to code the motion vectors ($R_{MV}$), the frame residue ($R_r$) and the overhead ($R_H$) [3]:

$$R = R_{MV} + R_r + R_H \tag{1}$$

where $R_H$ in our experiments is treated as the bits allocated to express layer information, including the bits allocated for the picture layer, the macroblock layer, the block layer, etc. [6]. To estimate the relation between bitrate and frame size, we consider $R_{MV}$, $R_r$ and $R_H$ separately.
2.1 Bitrate For Motion Vector
To relate the motion vector bitrate to the frame size, we note that [3,4,5] present a method to estimate the motion vector rate, in bits per pixel, as follows:

$$R_M = \frac{1}{B^2}\log_2\frac{4e^2B}{\delta^2} + \frac{1}{B^2}\log_2\left(\sigma_V^2 \ln\frac{1}{\tilde{\rho}_a}\right) \tag{2}$$

where $\sigma_V^2$ and $\tilde{\rho}_a$ are the average variance and the average correlation coefficient of the motion vectors for one frame, $\delta$ is the accuracy of the motion vectors, and $B$ is the size of the motion-compensated block for the frame residue. The total number of bits for motion vectors $R_{MV}$ within a frame is then the total number of pixels times the motion vector rate:

$$R_{MV} = S \times A \times R_M \tag{3}$$

where $S$ is the total number of macroblocks in a frame and $A$ is the total number of pixels in a macroblock (i.e., $A = 16^2$). Thus, $S \times A$ is the total number of pixels in one frame.

After further analysis, we ignore the second part on the right of Equation (2) and simplify the expression for $R_{MV}$. The reason is as follows. If we fix $B$ at 8 for the block size and $\delta$ at 0.5 for half-pixel resolution, the first part on the right of Equation (2) becomes approximately

$$\frac{1}{B^2}\log_2\frac{4e^2B}{\delta^2} \approx \frac{1}{64}\log_2 947 \tag{4}$$

while the second part in (2) becomes

$$\frac{1}{B^2}\log_2\left(\sigma_V^2 \ln\frac{1}{\tilde{\rho}_a}\right) \approx \frac{1}{64}\log_2\left(0.002\,\sigma_V^2\right) \tag{5}$$

where $\tilde{\rho}_a$ is set to 0.998 [3]. We calculate $\sigma_V^2$ by measuring the variance of the x and y components of the block (8 × 8 pixels) motion vectors and averaging the two variances [3]. If the value of $0.002\,\sigma_V^2$ in (5) were comparable with 947 in (4), we could not ignore the second part. However, $\sigma_V^2$ is not large enough for that. For example, the values of $\sigma_V^2$ calculated from "Foreman" at different frame sizes are:

| Frame size | Variance |
|---|---|
| 336×272 | 35.1 |
| 304×240 | 31.7 |
| 256×208 | 25.7 |
| 224×176 | 21.3 |
| 176×144 | 15.2 |
| 128×96 | 7.9 |

Table 1. Variance for motion vector in "Foreman"

Table 1 shows that the variance is no more than 40 in "Foreman", so the value of $0.002\,\sigma_V^2$ is less than 0.08, which is far less than 947. Moreover, "Foreman" contains a lot of motion compared with the other videos; in other words, the variance of the motion vectors in the other videos should not be much larger than in "Foreman". Thus, we ignore the second part in (2) compared with the first part. Using the same block size $B$ and accuracy $\delta$, Equation (2) simplifies to

$$R_{MV} = S \times A \times \frac{1}{B^2}\log_2\frac{4e^2B}{\delta^2} \propto S \tag{6}$$

From (6), we obtain the total number of bits for motion vectors in a frame of the precoded video:

$$R_{MV}^{pre} \propto S_{pre} \tag{7}$$

where $S_{pre}$ is the total number of macroblocks in a frame of the precoded video. Combining (6) and (7), we obtain an expression for the total number of bits allocated for motion vectors in a frame:

$$R_{MV} = R_{MV}^{pre}\,\frac{S}{S_{pre}} \tag{8}$$

Assuming the size of the precoded video is CIF, we compare the results estimated from Equation (8) with the results obtained from the full-search method. The results are shown below.

| Frame size | Foreman | News | MD |
|---|---|---|---|
| 336×272 | 5363.06 | 1812.50 | 2194.28 |
| 304×240 | 4419.66 | 1399.90 | 1666.84 |
| 256×208 | 3326.04 | 1048.16 | 1191.12 |
| 224×176 | 2510.74 | 737.04 | 860.26 |
| 176×144 | 1629.34 | 482.12 | 508.34 |
| 128×96 | 761.74 | 224.78 | 222.48 |

Table 2. Bitrate from full search for motion vector (bits per frame)

The differences for motion vectors are as follows.

| Frame size | Foreman | News | MD |
|---|---|---|---|
| 336×272 | 35.48 | 49.97 | -24.02 |
| 304×240 | 166.55 | -7.17 | -104.08 |
| 256×208 | 222.02 | 21.25 | -101.34 |
| 224×176 | 212.57 | -23.27 | -96.66 |
| 176×144 | 151.95 | -6.65 | -106.82 |
| 128×96 | 45.43 | -12.20 | -75.78 |

Table 3. Difference between bitrate from full search and bitrate from (8) for motion vector (bits per frame)

From Table 3, all absolute differences are less than 300 bits for the motion vectors in a frame, so the overall difference is acceptable. Thus, we apply (8) to estimate the bits for coding the motion vectors.
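The proportional scaling in Equation (8) can be checked against the "Foreman" columns of Tables 2 and 3. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, macroblocks are assumed to be 16 × 16 pixels, CIF is taken as 352 × 288, and the Table 3 entries are assumed to be the full-search value minus the Equation (8) estimate, as the caption suggests. The precoded motion vector bit count is not reported in the paper; the sketch back-calculates it from each row to see whether all rows agree.

```python
# Consistency check of Eq. (8), R_MV = R_MV_pre * S / S_pre, on the Foreman data.
MB_SIZE = 16  # assumed macroblock edge in pixels, so A = 16 * 16

# frame size -> (full-search MV bits per frame from Table 2,
#                difference from Table 3, assumed full search minus Eq. (8))
FOREMAN = {
    (336, 272): (5363.06, 35.48),
    (304, 240): (4419.66, 166.55),
    (256, 208): (3326.04, 222.02),
    (224, 176): (2510.74, 212.57),
    (176, 144): (1629.34, 151.95),
    (128, 96): (761.74, 45.43),
}


def macroblocks(width, height):
    """Number of macroblocks S in a frame of the given size."""
    return (width // MB_SIZE) * (height // MB_SIZE)


S_PRE = macroblocks(352, 288)  # CIF precoded video


def scale_mv_bits(r_mv_pre, s, s_pre=S_PRE):
    """Eq. (8): scale precoded motion vector bits by the macroblock ratio."""
    return r_mv_pre * s / s_pre


def implied_precoded_bits():
    """Back-calculate R_MV_pre from every row (not a value reported in the paper)."""
    implied = {}
    for (width, height), (full_search, difference) in FOREMAN.items():
        estimate = full_search - difference  # Eq. (8) estimate for this row
        implied[(width, height)] = estimate * S_PRE / macroblocks(width, height)
    return implied


if __name__ == "__main__":
    for size, bits in implied_precoded_bits().items():
        print(size, macroblocks(*size), round(bits, 2))
```

Under these assumptions every row implies roughly the same precoded value (about 5909.6 bits per frame), which is what a purely proportional model predicts.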
2.2 Bitrate For Frame Residue
With the assumption that the DCT coefficients of the frame residue are approximately uncorrelated and Laplacian distributed [7], papers [3,4] present an expression for the frame residue rate, in bits per pixel:

$$H(Q) = \begin{cases} H_1(Q) = \dfrac{1}{2}\log_2\left(2e^2\dfrac{\sigma^2}{Q^2}\right), & \text{if } \dfrac{\sigma^2}{Q^2} > \dfrac{1}{2e} \\[2ex] H_2(Q) = \dfrac{e}{\ln 2}\,\dfrac{\sigma^2}{Q^2}, & \text{if } \dfrac{\sigma^2}{Q^2} \le \dfrac{1}{2e} \end{cases} \tag{9}$$

The total number of bits for the frame residue ($R_r$) within a frame is then the total number of pixels times the frame residue rate. We select $H_2(Q)$ for the low-bitrate case [5] and obtain an expression for $R_r$:

$$R_r = S \times A \times H_2(Q) = S \times A \times \frac{e}{\ln 2}\,\frac{\sigma^2}{Q^2} \tag{10}$$

where $\sigma^2$ is the average variance for a frame in the transcoded video, $S$ is the total number of macroblocks within a frame in the transcoded video, and $A$ is the total number of pixels in a macroblock. However, Equation (9) is obtained before zig-zag scanning and run-length coding, so (10) overestimates the total number of bits for the frame residue in a frame. We therefore revise Equation (10), keeping $\sigma^2/Q^2$ as the relative part of the frame residue rate and absorbing the remaining factors into a constant $K$:

$$R_r = S \times K \times \frac{\sigma^2}{Q^2} \tag{11}$$

From (11), we obtain the total number of bits for the frame residue in a frame of the precoded video:

$$R_r^{pre} = S_{pre} \times K \times \frac{\sigma_{pre}^2}{Q^2} \tag{12}$$

where $S_{pre}$ is the total number of macroblocks within a frame in the precoded video and $\sigma_{pre}^2$ is the average variance within a frame in the precoded video. Combining (11) and (12), we obtain an expression for the total number of bits allocated for the frame residue in a frame:

$$R_r = R_r^{pre}\,\frac{S}{S_{pre}}\,\frac{\sigma^2}{\sigma_{pre}^2} \tag{13}$$

Assuming the size of the precoded video is CIF, we compare the results estimated from Equation (13) with the results obtained from the full-search method. The results are shown below.

| Frame size | Foreman | News | MD |
|---|---|---|---|
| 336×272 | 2968.08 | 2564.42 | 1272.46 |
| 304×240 | 2635.40 | 2283.50 | 1130.24 |
| 256×208 | 2329.08 | 1953.94 | 966.74 |
| 224×176 | 1992.24 | 1580.26 | 810.08 |
| 176×144 | 1480.10 | 1048.04 | 561.20 |
| 128×96 | 965.48 | 542.64 | 307.76 |

Table 4. Bitrate from full search for frame residue (bits per frame)

The differences for the frame residue are as follows.

| Frame size | Foreman | News | MD |
|---|---|---|---|
| 336×272 | -17.60 | 27.45 | 97.98 |
| 304×240 | 58.08 | 148.05 | 200.27 |
| 256×208 | 41.71 | 212.56 | 287.92 |
| 224×176 | -7.98 | 275.79 | 291.31 |
| 176×144 | -43.32 | 221.11 | 195.27 |
| 128×96 | -86.75 | 135.02 | 113.49 |

Table 5. Difference between bitrate from full search and bitrate from (13) for frame residue (bits per frame)

From Table 5, the absolute difference for most values (90%) is less than 300 bits for the frame residue in a frame, so the overall difference is acceptable. Thus, we apply Equation (13) to estimate the bits coded for the frame residue.

2.3 Bitrate For Overhead
Since most overhead bits are allocated to the macroblock and block layers [6], the overhead depends mostly on the total number of macroblocks in a frame. We therefore model the overhead as a function of the frame size ($S$). To further simplify this model, we assume that the average number of overhead bits per macroblock ($R_h$) does not change much after downsizing. The overhead in a frame is then

$$R_H = S \times R_h \tag{14}$$

From (14), we obtain the total number of bits for overhead within a frame in the precoded video:

$$R_H^{pre} = S_{pre} \times R_h \tag{15}$$

where $S_{pre}$ is the total number of macroblocks within a frame in the precoded video and $R_H^{pre}$ is the average number of bits for overhead within a frame in the precoded video.
Combining (14) and (15), we obtain an expression for the total number of bits allocated for overhead in a frame:

$$R_H = R_H^{pre}\,\frac{S}{S_{pre}} \tag{16}$$

2.4 Bitrate And Frame Size
Combining (8), (13) and (16), we obtain the overall bitrate ($R$) model for the relation between video bitrate and frame size:

$$R = \frac{S}{S_{pre}}\left(R_{MV}^{pre} + R_r^{pre}\,\frac{\sigma^2}{\sigma_{pre}^2} + R_H^{pre}\right) \tag{17}$$

A simple linear model, obtained by connecting the precoded bitrate and zero bitrate, is used for comparison with our proposed method (17). Here, the simple linear model is taken as the relation between bitrate and frame size that would be assumed without the investigation in this paper. The bitrate ($R_l$) in the linear model can be expressed as follows:

$$R_l = R^{pre}\,\frac{S}{S_{pre}} \tag{18}$$

where $R^{pre}$ is the bitrate of the precoded video.

3 Experimental Results
Assuming the size of the precoded video is CIF, we compare the results estimated from Equation (17) with the results obtained from the simple linear model. The results are shown in Tables 6 and 7 and Figure 1.

| Number of MBs | Stefan | News |
|---|---|---|
| 357 | 24173.73 | 6133.89 |
| 285 | 20414.17 | 5122.62 |
| 208 | 16384.07 | 4168.61 |
| 154 | 12993.10 | 3209.82 |
| 99 | 8438.99 | 2180.23 |
| 48 | 3939.30 | 1130.09 |

Table 6. Overall bitrate from full search (bits per frame)

| Number of MBs | Proposed (17): Stefan | Proposed (17): News | Linear (18): Stefan | Linear (18): News |
|---|---|---|---|---|
| 357 | 202.05 | -72.57 | 951.88 | 95.50 |
| 285 | 314.18 | 78.66 | -355.90 | -302.06 |
| 208 | 90.32 | 19.31 | -1745.05 | -650.44 |
| 154 | -204.51 | 142.86 | -2154.60 | -605.02 |
| 99 | -212.44 | 51.04 | -1471.38 | -505.72 |
| 48 | -66.20 | 14.05 | -561.06 | -318.21 |

Table 7. Difference between overall bitrate from full search and bitrate from proposed method (17) and linear method (18) (bits per frame)

From Table 7, the overall absolute difference for a frame is smaller with our proposed method (17) than with the simple linear model (18). In other words, the proposed method gives a more accurate model of the relation between bitrate and frame size.

[Figure 1: bitrate (vertical axis, 0 to 30000) plotted against number of MBs (horizontal axis, 0 to 450) for "Tempete", "Stefan" and "News", with curves for full search, the proposed method (-pro) and the linear model (-l).]

Figure 1. Overall bitrate from full search, proposed method (17) and linear method (18)

4 Conclusion
We have developed a model that estimates the relationship between bitrate and video frame size in arbitrary downsizing transcoding. Using this model, we can choose a bitrate according to the frame size required by users, or select a suitable frame size for a given bitrate, while maintaining good video quality.

References:
[1] L. P. Chau, Y. Q. Liang, and Y. P. Tan, "Motion Vector Re-Estimation for Fractional-Scale Video Transcoding", International Conference on Information Technology: Coding and Computing, pp. 212-215, April 2001.
[2] G. Shen, B. Zeng, Y.-Q. Zhang, and M. L. Liou, "Transcoder with Arbitrarily Resizing Capability", IEEE International Symposium on Circuits and Systems, pp. 25-28, May 2001.
[3] J. Ribas-Corbera and D. L. Neuhoff, "Optimizing Block Size in Motion-Compensated Video Coding", Journal of Electronic Imaging, pp. 155-165, 2001.
[4] J. Ribas-Corbera and D. L. Neuhoff, "Optimal Bit Allocations for Lossless Video Codecs: Motion Vectors vs. Difference Frames", International Conference on Image Processing, IEEE Proceedings, Vol. 3, 1995.
[5] C.-W. Lin, T.-J. Liou, and Y.-C. Chen, "Dynamic Rate Control in Multipoint Video Transcoding", International Symposium on Circuits and Systems, IEEE Proceedings, pp. II-17-II-20, 2000.
[6] ITU Telecommunication Standardization Sector, "Video Coding for Low Bit Rate Communication", pp. 9-18, 1996.
[7] J. Ribas-Corbera and S. Lei, "Rate Control in DCT Video Coding for Low-Delay Communications", IEEE Trans. on Circuits and Systems for Video Technology, Vol. 9, pp. 172-185, May 1999.