Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG
(ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6)
20th Meeting: Klagenfurt, Austria, 15–21 July, 2006

Document: JVT-T207
Filename: JVT-T207.doc

Title: Common Test Conditions for Multiview Video Coding
Status: Output Document from JVT
Purpose: Test Condition
Author(s) or Contact(s):
    Yeping Su
    Two Independence Way, Suite 300
    Princeton, NJ 08536, USA
    Tel: +1 609 987 7319
    Email: yeping.su@thomson.net

    Anthony Vetro
    Mitsubishi Electric Research Labs
    201 Broadway, 8th Floor
    Cambridge, MA 02474 USA
    Tel: +1-617-621-7591
    Email: avetro@merl.com

    Aljoscha Smolic
    Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut (HHI)
    Einsteinufer 37, 10587 Berlin
    Tel: +49 30 31002-232
    Email: smolic@hhi.de
Source: AHG on MVC
_____________________________

Reference Software:
All proposals will be compared with the JSVM-based reference [JVT-Txxx], unless otherwise specified in the CE description.

Objective Measurements:
The average PSNR over all frames in all views must be reported.

Test Sequences:
The test sequences can be downloaded from the following sites. If a username and password are not given, anonymous login can be used to access the site. Validated camera parameters for all test data are available on the same site.

MERL                ftp://ftp.merl.com/pub/avetro/mvc-testseq
HHI                 https://www.3dtv-research.org/3dav_CfP_FhG_HHI/
KDDI                ftp://ftp.ne.jp/KDDI/multiview
Microsoft Research  http://www.research.microsoft.com/vision/ImageBasedRealities/3DVideoDownload/
                    A conversion program for bmp to yuv can be downloaded at http://www.ldv.ei.tum.de/page50
Tanimoto Lab        http://www.tanimoto.nuee.nagoya-u.ac.jp/
                    usr: mpegguest  pwd: ftvdata

File: 533575059 Page: 1 Date Saved: 2016-03-06

Table 1 describes the properties of the various test data sets. These data sets vary in the number of cameras/views, the arrangement of the cameras, and the distance between cameras, as well as in image size and frame rate.
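The objective measurement required above can be illustrated with a minimal Python sketch. The document does not define how the average is formed, so this sketch assumes 8-bit samples and an arithmetic mean of per-frame PSNR values over all frames of all views. The function names `frame_psnr` and `average_psnr` are hypothetical and are not part of the reference software.

```python
import math

def frame_psnr(ref, rec, peak=255):
    """PSNR in dB between two equally sized sequences of 8-bit samples."""
    if len(ref) != len(rec) or not ref:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    # Identical frames have zero error; report infinity rather than divide by zero.
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)

def average_psnr(views_ref, views_rec):
    """Arithmetic mean of per-frame PSNR over all frames of all views.

    Each argument is a list of views; each view is a list of frames;
    each frame is a flat sequence of samples (for example the luma plane).
    Averaging per-frame PSNR is an assumption of this sketch.
    """
    values = [
        frame_psnr(f_ref, f_rec)
        for v_ref, v_rec in zip(views_ref, views_rec)
        for f_ref, f_rec in zip(v_ref, v_rec)
    ]
    return sum(values) / len(values)

# Tiny synthetic example: two views with one 16-sample frame each.
original = [[bytes([100] * 16)], [bytes([100] * 16)]]
decoded = [[bytes([101] * 16)], [bytes([102] * 16)]]
mean_psnr = average_psnr(original, decoded)
```

Whether chroma is reported separately, and whether the mean is taken over PSNR or over MSE, should follow the CE description where one exists.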
All sequences are provided in YUV 4:2:0 planar format, except for the Microsoft Research data, which are available in BMP. The program available at the TU Munich web page listed above shall be used for conversion to YUV format.

Note that the camera parameters provided for the Race1 data set may not meet the highest accuracy requirements (see WG11/m12546 for details). Nevertheless, it was decided to keep Race1 in the test data set.

Table 1. Test Data Sets

Data Set                          Sequences       Image Property¹                  Camera Arrangement
MERL                              Ballroom, Exit  640x480, 25fps (rectified)       8 cameras with 20cm spacing; 1D/parallel
HHI                               Uli             1024x768, 25fps (non-rectified)  8 cameras with 20cm spacing; 1D/parallel convergent
KDDI                              Race1           640x480, 30fps (non-rectified)   8 cameras with 20cm spacing; 1D/parallel
KDDI                              Flamenco2       640x480, 30fps (non-rectified)   5 cameras with 20cm spacing; 2D/parallel (Cross)
Microsoft                         Breakdancers    1024x768, 15fps (non-rectified)  8 cameras with 20cm spacing; 1D/arc
Nagoya University / Tanimoto Lab  Rena            640x480, 30fps (rectified)       100 cameras with 5cm spacing; 1D/parallel
Nagoya University / Tanimoto Lab  Akko&Kayo       640x480, 30fps (non-rectified)   100 cameras with 5cm horizontal and 20cm vertical spacing; 2D array

Coding Scenarios and Conditions:
The coding conditions are provided in Table 2. The bit-rates are given as the average per camera view; the total rate follows by multiplying by the number of camera views. Alternatively, fixed QP settings may be used as provided in Table 3. These settings, when used for the reference method, will result in the bit-rates targeted in Table 2.

If a single measure is desired for coding efficiency comparisons, the Bjontegaard measure [1], which calculates average PSNR and bit-rate differences between RD curves, may be used. The PSNR and bit-rate values are assumed to be obtained for BasisQP = 22, 27, 32, 37. This set of BasisQP values covers the range of fidelity for all test sequences specified in Table 3.

¹ Rectified means that the images are properly registered by applying a homography matrix.
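The Bjontegaard measure mentioned above can be sketched as follows. This is a minimal illustration of the average PSNR difference only, assuming the commonly described procedure: a third-order polynomial is fitted to PSNR as a function of log10(bit-rate) through the four RD points of each curve, and the difference of the two integrals over the overlapping rate interval is divided by the interval width. The function names and the rate/PSNR values in the example are made up for illustration. Document VCEG-M33 [1] remains the authoritative description.

```python
import math

def fit_cubic(xs, ys):
    """Coefficients (constant term first) of the cubic through four points."""
    n = 4
    # Augmented Vandermonde matrix [1, x, x^2, x^3 | y], solved by Gauss-Jordan.
    a = [[x ** k for k in range(n)] + [y] for x, y in zip(xs, ys)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

def integrate(coeffs, lo, hi):
    """Definite integral of the polynomial over [lo, hi]."""
    def antiderivative(x):
        return sum(c * x ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))
    return antiderivative(hi) - antiderivative(lo)

def bd_psnr(rates_ref, psnr_ref, rates_test, psnr_test):
    """Average PSNR difference (test minus reference) in dB."""
    if not (len(rates_ref) == len(psnr_ref) == len(rates_test) == len(psnr_test) == 4):
        raise ValueError("four rate/PSNR points per curve are expected")
    xr = [math.log10(r) for r in rates_ref]
    xt = [math.log10(r) for r in rates_test]
    lo = max(min(xr), min(xt))
    hi = min(max(xr), max(xt))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in bit-rate")
    # Shift the log-rate axis so the interval starts at 0 (better conditioning).
    cr = fit_cubic([x - lo for x in xr], psnr_ref)
    ct = fit_cubic([x - lo for x in xt], psnr_test)
    width = hi - lo
    return (integrate(ct, 0.0, width) - integrate(cr, 0.0, width)) / width

# Made-up RD points (kbps, dB); the proposal is exactly 1 dB above the anchor.
rates = [256.0, 384.0, 512.0, 768.0]
psnr_anchor = [32.0, 34.1, 35.6, 37.2]
psnr_proposal = [p + 1.0 for p in psnr_anchor]
gain = bd_psnr(rates, psnr_anchor, rates, psnr_proposal)
```

A positive result indicates that the tested curve lies above the reference on average. The corresponding average bit-rate difference is obtained analogously by exchanging the roles of the two axes and is not shown here.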
Table 2. Coding Conditions

Test Sequence              Temporal Random Access²  Bit-rates [average kbps/camera]
Ballroom                   0.5 sec                  256    384    512
Exit                       0.5 sec                  192    256    384
Uli                        0.5 sec                  768   1536   2048
Race1                      0.5 sec                  384    512    768
Flamenco2                  0.5 sec                  256    384    512
Breakdancers               1.0 sec                  256    512   1024
Rena [16 center views]     0.5 sec                  128    256    512
Akko&Kayo [3v*5h views]    0.5 sec                  192    384    768

Table 3a. Fixed QP Settings

Test Sequence    BasisQP
Ballroom         34    31    29
Exit             31    29    26
Uli              36    30    28
Race1            28    26    24
Flamenco2        34    30    28
Breakdancers     31    26    22
Rena             33    28    23
Akko&Kayo        36    29    24

Table 3b. Delta QP Values

Parameter           Delta QP Value
DeltaLayer0Quant    0
DeltaLayer1Quant    3
DeltaLayer2Quant    4
DeltaLayer3Quant    5
DeltaLayer4Quant    6
DeltaLayer5Quant    7

Only a subset of cameras will be used from the Rena and the Akko&Kayo sequences, since processing all 100 views would require too much effort. To benefit from the high spatial density, the 16 center views will be used from Rena, and a 2D array of size 3x5 (vertical x horizontal) will be selected from Akko&Kayo. A link on the ftp site listed above points to exactly these views to ensure that the right subset is used.

Reference
[1] G. Bjontegaard, “Calculation of average PSNR differences between RD-Curves”, document VCEG-M33, 13th VCEG meeting, Austin, TX, Mar’01.

² Temporal random access is imposed for fair comparison with the AVC anchors. No constraints are placed on view random access, to allow for flexibility in different proposals. We are aware that this could affect coding efficiency comparisons.
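The fixed QP settings of Tables 3a and 3b can be combined as in the following minimal Python sketch. It assumes that the QP of layer N is obtained by adding DeltaLayerNQuant to BasisQP and clipping the result to the H.264/AVC range of 0 to 51. Both the additive rule and the clipping are assumptions of this sketch, not statements taken from the reference software, and the function name `layer_qps` is hypothetical.

```python
# Values copied from Table 3b; the list index is N in DeltaLayerNQuant.
DELTA_LAYER_QUANT = [0, 3, 4, 5, 6, 7]

def layer_qps(basis_qp, deltas=DELTA_LAYER_QUANT, qp_min=0, qp_max=51):
    """QP per layer, ASSUMING QP = BasisQP + DeltaLayerNQuant, clipped to [qp_min, qp_max]."""
    return [min(qp_max, max(qp_min, basis_qp + d)) for d in deltas]

# BasisQP 34 is the first Ballroom entry in Table 3a.
ballroom_qps = layer_qps(34)
```

With BasisQP = 34 this yields one QP per DeltaLayerNQuant entry, starting at 34 for layer 0. The exact derivation used by the encoder should be checked against the reference software configuration.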