In Arbitrary Downsizing Transcoding

advertisement
Relation Between Video Bitrate And Frame Size
In Arbitrary Downsizing Transcoding
HAIWEI SUN, YAPPENG TAN and YONGQING LIANG
School of Electrical & Electronic Engineering
Nanyang Technological University
Singapore
pg01533119@ntu.edu.sg
Abstract: - Video downsizing transcoding is a common technique for adapting the bitrate or spatial/temporal
resolution of a compressed video to suit different transmission bandwidths or receiving devices. In arbitrary
downsizing video transcoding [1], although the transcoder can adjust the size of a pre-coded video to meet the
requirements of different transmission channels or receiving devices, there is however no model to illustrate the
relation between bitrate and frame size. In this paper, we propose a new method to model the relation between
bitrate and frame size under low bitrate condition. With this new method, we can re-estimate the bitrate
according to required frame size or select a suitable frame size for a given target bitrate, while maintaining good
video quality. Experimental results are presented to demonstrate the effectiveness of the proposed video
transcoding methods.
Key-Words: - Arbitrary downsizing transcoding, bitrate, frame size
1 Introduction
Video transcoding is a common technique for
adapting the bitrate or spatial/temporal resolution of a
compressed video to suit different transmission
bandwidths or receiving devices. By using arbitrary
downsizing video transcoding technique [1] which
can downsize a pre-coded video by an arbitrary
factor, we have more choices in adjusting the spatial
resolution of a pre-compressed video according to the
available channel bandwidth. Given a target bitrate,
however, it is not clear that which educed size can
provide optimal video quality. Moreover, if users
want to select the display size, the bitrate what is
required for the size is also not clear. Thus, we focus
on modelling the relation between the frame size of
the transcoded video and its required bitrate. With the
relation between bitrate and frame size, we can apply
it into two applications as follows:
1) If the available channel bandwidth is
insufficient, we can select a reduced frame size or the
transcoded video to meet the available bandwidth.
2) If users want to change frame size during
video transmission, we can estimate a suitable bitrate
what is required to transcode the video.
In this paper, we focus on developing an efficient
model to build the relation between video bitrate and
frame sizes. In order to estimate the relation between
bitrate and frame size, we want to build a model to
estimate the required bitrate given a precoded video
with its frame size and bitrate, while maintaining
video quality or distortion from precoded video. To
simplify the estimation of the relation, the same mean
quantization parameter Q for all macroblocks is used
in both precoded video and downsizing transcoded
video. Since we care more for downsizing
transcoding under low bitrate, we set Q as 32 and
frame rate as 10 to stimulate low bitrate case in our
experiments. Six standard test videos, ``Tempete",
``Foreman", ``Stefan", ``Mobile & Calendar (MC)",
``Mother & Daughter (MD)", and ``News" are used
in our experiments.
2 Video Bitrate And Frame Size
The total number of bits allocated for a frame can be
estimated by the total number of bits used to code
motion vectors (RMV), frame residue (Rr), overhead
(RH) as follows [3].
(1)
R  R MV  Rr  RH
where parameter RH in our experiments is treated as
bits allocated to express layer information, including
bits allocated for picture layer, bits allocated for
macroblock layer, bits allocated for block layer, etc
[6]. To estimate the relation between bitrate and
frame size, we consider RMV, Rr, RH separately.
2.1 Bitrate For Motion Vector
Considering the relation between bitrate for motion
vector and frame size, we notice that papers [3,4,5]
present methods to estimate motion vector rate, or
bits/pixel for motion vector as follows:
RM
 4 e2 B  1
1 

  2 log  2 ln  
 2 log2 
2 
V
~
2

a
   B
B
1
where parameters

2
V
(2)
~ are the average
and a
variance and average correlation coefficient of
motion vectors for one frame. Parameter  is the
accuracy of motion vectors. Parameter B is the size of
motion compensated block for frame residue. So the
total number of bits for motion vectors RMV within a
frame can be expressed as the total number of pixels
times the motion vector rate. Thus, we obtain an
expression for RMV as:
(3)
R MV  S  A  R M
where parameter S is the total number of
macroblocks in a frame. Parameter A is the total
number of pixels in a macroblock (i.e., A=162). Thus,
S×A stands for the total number of pixels in one
frame. Moreover, after a further analysis, we ignore
the second part on the right of the equation (2) and
simplify the expression for RMV. The reason is
illustrated as below. As for the first part on the right
of the equation (2), if we fix B as 8 for the size of
block,  as 0.5 for half pixel resolution, we obtain a
approximation of the first part:
 4 e2 B  1


log2 947
2 log2 
2  
   64
B
1

 2  1  1
2
0.002  V
log
2 log2  V ln ~   
2



a  64
B
R MV
R

~ is set as 0.998 [3]. We calculate the value of
where a
2
 V by measuring the variance for the x and y
components of the block (8 × 8 pixels) motion
vectors and average the two variance [3]. If the value
of 0.002  V2 in (5) is comparable with 947 in (4), we
should not ignore the second part. However,
parameter  V2 is not large enough. For example, we
calculate the the value of  V2 from ``Foreman". The
results from different frame size are:
frame size
variance
35.1
336×272
31.7
304×240
25.7
256×208
21.3
224×176
15.2
176×144
7.9
128×96
Table 1. Variance for motion vector in "Foreman"
Table 1 show that the variance is no more than 40
in ``Foreman". So the value of 0.002  V2 is less than
0.08 which is far less than 947. Moreover, video
``Foreman" includes a lot of motions compared to
other videos. In other word, the variance of motion
vectors from other videos should not be much larger
than ``Foreman". Thus, we ignored the second part
compared with the first part in (2). We simplify
Equation (2) to (6) by using the same block size B and
(6)
pre
MV
 S pre 
(7)
where, parameter Spre is the total number of
macroblocks in a frame in the precoded video.
Combining (6) and (7), we obtain an expression
for the total number of bits allocated for motion
vectors in a frame:
S
pre
(8)
R MV  R MV 
S
pre
Assuming the the size of the precoded video is CIF,
we compare the results estimated from Equation (8)
with the results obtained from Fullsearch method.
Results are shown as follows.
frame size
(5)
 1
 4 e2 B  

 S  A 2 log2 
2    S
   
 B
From (6), we obtain the total number of bits for
motion vectors in a frame in the precoded video:
(4)
while the second part in (2):
1
resolution  :
bitrate from fullsearch for motion vector
foreman
news
md
336×272
5363.06
1812.50
2194.28
304×240
4419.66
1399.90
1666.84
256×208
3326.04
1048.16
1191.12
224×176
2510.74
737.04
860.26
176×144
1629.34
482.12
508.34
761.74
224.78
222.48
128×96
Table2. Bitrate from fullsearch for motion vector
The difference of the results for motion vectors is as
follows.
frame size
bitrate from (8) for motion vector
foreman
news
md
336×272
35.48
49.97
-24.02
304×240
166.55
-7.17
-104.08
256×208
222.02
21.25
-101.34
224×176
212.57
-23.27
-96.66
176×144
151.95
-6.65
-106.82
45.43
-12.20
-75.78
128×96
Table3. Difference between bitrate from fullsearch and bitrate
from (8) for motion vector
From Table 3, we find all values of absolute
difference is less than 300 bits for motion vector in a
frame. So the overall difference can be acceptable.
Thus, we apply (8) to estimate bits for coding the
motion vectors.
2.2 Bitrate For Frame Residue
With the assumption that DCT coefficients of frame
residue are approximately uncorrelated and are
Laplacian distributed [7], papers [3,4] present an
expression of frame residue rate, or bits/pixel for
frame residue:


Q 
 H 1
H Q  

 H 2 Q 

2

1
2 2  
log2  e 2 
2
Q

2
e 
ln 2 Q2
if

if

2
Q
2

2
Q
2

1
2e
(9)
1
2e
So the total number of bits for frame residue (Rr )
within a frame can be expressed as the total number
of pixels times the frame residue rate. We select
H2(Q) for low bitrate case [5]. We obtain an
expression for Rr:
e 
 S  A  H 2 Q   S  A 
ln 2 Q2
bitrate from fullsearch for motion vector
frame size
foreman
news
md
336×272
2968.08
2564.42
1272.46
304×240
2635.40
2283.50
1130.24
256×208
2329.08
1953.94
966.74
224×176
1992.24
1580.26
810.08
176×144
1480.10
1048.04
561.20
965.48
542.64
307.76
128×96
Table4. Bitrate from fullsearch for frame residue
The difference of the results for frame residue is as
follows.
2
R
r

where parameter
2
(10)
is the average variance for a
frame in the transcoded video. Parameter S is the total
number of macroblocks within a frame in the
transcoded video. Parameter A is the total number of
pixels in a macroblock. However, Equation (9) is
obtained before Zig-Zag scan and run length coding.
This will make the estimation for frame residue rate
higher when (10) is used to estimate the total number
of bits for frame residue in a frame. So we revise
2
Equation (10), by keeping 
in (10) as a relative
Q
2
part for frame residue rate, to:
R  SK
r
(11)
2
From (11), we obtain the total number of bits for
frame residue in a frame in the precoded video:
R
r
 S pre

K
2
pre
2
Q
(12)
where, parameter Spre is the total number of
macroblocks within a frame in the precoded video.
Parameter  2pre is the average variance within a
frame in the precoded video. Combining (11) and
(12), we obtain an expression for the total number of
bits allocated for frame residue in a frame as follows:
 S
 S
2

R R
r
pre
r

2
pre
foreman
news
md
336×272
-17.60
27.45
97.98
304×240
58.08
148.05
200.27
256×208
41.71
212.56
287.92
224×176
-7.98
275.79
291.31
176×144
-43.32
221.11
195.27
-86.75
135.02
113.49
128×96
Table5. Difference between bitrate from fullsearch and bitrate
from (13) for frame residue
From Table 5, the absolute difference for most
value (90%) is less than 300 bits for frame residue
in a frame. The overall difference can be acceptable.
Thus, we apply Equation (13) to estimate bits coded
for frame residue.
2
Q
pre
bitrate from (8) for motion vector
frame size
(13)
pre
Assuming the the size of the precoded video is
CIF, we compare the results estimated from Equation
(13) with the results obtained from Fullsearch
method. Results are shown as follows.
2.3
Bitrate For Overhead
Since most bits for overhead is allocated for
macroblock and block layer [6], the total number of
macroblocks in a frame will affect the overhead
mostly. So we use the function of frame size (S) to
build the model for the relation for overhead and
frame size. To further simplify this model, we
assume the average number of bits for overhead per
macroblock (Rh) will not change much after
downsizing. Then we set an expression of overhead
in a frame:
S h
(14)
H
From (14), we obtain the total number of bits for
overhead within a frame in the precoded video:
R
R
pre
H
R
 S pre  Rh
(15)
where, parameter Spre is the total number of
macroblocks within a frame in the precoded video.
R
pre
Parameter
is the average number of bits for
H
overhead within a frame in the precoded video.
Combining (14) and (15), we obtain an expression
for the total number of bits allocated for overhead in
a frame:
R
2.4
H

R
pre
H
S

S
(16)
pre
Bitrate And Frame Size
Combining (8), (13) and (16), we obtain the overall
bitrate (R) model for the relation between video
bitrate and frame size:
2

S  pre
pre
pre
(17)
R


S pre  R MV Rr 2pre R H 


From Table 7, we find the overall absolute
difference from our proposed method (17) is smaller
than the overall absolute difference from simple
linear model (18) for a frame. In other word, using
our proposed method will get a more accurate model
to build the relation between bitrate and frame sizes.
30000
pre
Assuming the size of the precoded video is CIF,
we compare the results estimated from Equation (17)
with the results obtained from the simple linear
model. Results are shown in Table 6,7 and Figure 1.
number of
MBs
Overall bitrate from
fullsearch
stefan
news
357
24173.73
6133.89
285
20414.17
5122.62
208
16384.07
4168.61
154
12993.10
3209.82
99
8438.99
2180.23
48
3939.30
1130.09
Table6. Overall bitrate from fullsearch
number of
MBs
proposed (17)
te m p e te - p r o
s te fa n - p r o
n e w s -p ro
te m p e te - l
s te fa n - l
n e w s -l
15000
Linear (18)
stefan
news
stefan
news
357
202.05
-72.57
951.88
95.50
285
314.18
78.66
-355.90
-302.06
208
90.32
19.31
-1745.05
-650.44
154
-204.51
142.86
-2154.60
-605.02
99
-212.44
51.04
-1471.38
-505.72
48
-66.20
14.05
-561.06
-318.21
Table7. Difference between overall bitrate from fullsearch and
bitrate from proposed method (17) and linear method (18)
10000
5000
bitrate
Simple linear model by connecting precoded bitrate
and zero bitrate is used to make a comparison with
our proposed method (17). Here, the simple linear
model is assumed to be the model of relation between
bitrate and frame sizes before our investigation in this
paper.
The bitrate (Rl) in the linear model can be expressed
as follow.
S
(18)
Rl  R 
S
s te fa n
news
25000
20000
3 Experimental Results
te m p e te
n u m b e r o f MB
0
0
50
100
150
200
250
300
350
400
450
Figure1. Overall bitrate from fullsearch, proposed method (17)
and linear method (18)
4 Conclusion
We develop a model that can estimate the
relationship between bitrate and video frame sizes in
arbitrary downsizing transcoding. Using this model,
we can choose a bitrate according to the frame size
required by users or we can select a suitable frame
size for a given bitrate, while maintaining a good
video quality.
References:
[1] L. P. Chau, Y. Q. Liang, and Y. P. Tan, ``Motion
Vector Re-Estimation for Fractional-Scale Video
Transcoding'', International Conference on
Information Technology: Coding and Computing,
pp. 212-215, April 2001.
[2] G. Shen, B. Zeng, Y.-Q. Zhang and M.L. Liou,
``Transcoder with arbitrarily resizing capability'',
IEEE International Symposium on Circuits and
Systems, pp. 25-28, May 2001.
[3] Jordi Ribas-Corbera, and David L. Neuhoff,
``Optimizing Block Size in Motion-Compensated
Video Coding'', Journal of Electronic Imaging,
pp. 155-165, 2001.
[4] Jordi Ribas Corbera, and David L. Neuhoff,
``Optimal Bit Allocations for Lossless Video
Codecs: Motion Vectors VS. Difference Frames'',
International Conference on Image Processing,
IEEE Proceedings, VOL. 3, 1995.
[5] Chia-Wen Lin and Te-Jen Liou, and Yung-Chang
Chen, ``Dynamic Rate Control in Multipoint
Video Transcoding'', International Sysmposium
on Circuit and System, IEEE Proceedings, pp.
II-17-II-20, 2000.
[6]
ITU-Telecommunications
Standardization
Sector, ``Video Coding for Low Bit Rate
Communication'', pp. 9-18, 1996.
[7] Jordi Ribas-Corbera, and Shawmin Lei, ``Rate
Control in DCT Video Coding for Low-Delay
Communcations'', IEEE Trans. on Circuits and
Systems for Video Technology, VOL. 9, pp.
172-185, May 1999.
Download