RECEIVER DRIVEN LAYERED MULTICAST WITH LAYER-AWARE FORWARD ERROR CORRECTION Cornelius Hellge1, Thomas Schierl1, and Thomas Wiegand1,2 {hellge|schierl|wiegand}@hhi.fraunhofer.de 1 Fraunhofer Institute for Telecommunications Heinrich-Hertz-Institut (HHI) Einsteinufer 37, D-10587 Berlin, Germany ABSTRACT A wide range of different capabilities and connection qualities typically characterizes receivers of mobile television services. Receiver driven layered multicast (RDLM) offers an efficient way for providing different capabilities over such a broadcast channel. Scalable video coding (SVC) allows for the transmission of multiple video qualities within one media stream. Using SVC generates a video bit stream with various inter layer dependencies due to references between the layers. This work proposes a layeraware forward error correction (L-FEC) approach in combination with SVC. L-FEC increases robustness of the more important layers by generating protection across layers following existing dependencies of the media stream. The L-FEC is integrated as an extension of a Raptor FEC implementation in a DVB-H broadcast system. It is shown by experimental results that L-FEC outperforms traditional UEP protection schemes. Index Terms— DVB-H, SVC, Raptor, RDLM, Layered Transmission 1. INTRODUCTION Digital video broadcasting for handhelds (DVB-H) [1] seems to become a popular solution for mobile broadcast. Due to the variety of different device capabilities, e.g. different display resolutions or computational power, transmitting only one video quality could be problematic due to extra computation like downscaling or transcoding at the battery-powered mobile devices. An approach similar to receiver driven layered multicast (RDLM) [2] in combination with the recently adopted SVC extension of H.264/AVC [3] offers an efficient way to provide multiple video signals over a broadcast channel as shown in previous work [4]. In such a mobile RDLM scenario, a client joining the broadcast service only requests the scalable layers, which provide either a signal that the device is capable or chooses to process. The transmission of multiple video signals using SVC is much more efficient in terms of bit-rate compared to simulcast transmission [5]. A mobile broadcast channel typically suffers from burst errors. With the additional delay constraints of the streaming service, the reliable transmission is still a big challenge in the mobile broadcast scenario. Since broadcast services only provide a unidirectional downlink channel, a possible solution to increase coverage is to use an additional forward error correction (FEC) at the link or application layer. FEC is applied in DVB-H at the link layer with the optional multi protocol encapsulation FEC (MPE- 2 Image Communications Group Technical University of Berlin Einsteinufer 17, D-10587 Berlin, Germany FEC) and using Raptor coding [7] as application layer FEC. Although MPE-FEC is intended to be used for streaming services and application layer FEC for file download, the application layer FEC offers more flexibility for media aware protection. Possible approaches are unequal error protection (UEP) [8], priority encoding transmission (PET) [9] or dependency aware UEP (DAUEP) as proposed in [10]. The aforementioned approaches do not take the existence of layers in the video stream and their multiple dimensions of dependencies into account. E.g., SVC allows up to three different scalability dimensions within one bit stream. Scalability in SVC can be applied to temporal, spatial and quality dimension. The proposed layer-aware FEC (L-FEC) approach generates redundancy symbols following existing dependencies in the scalable dimensions, i.e., redundancy symbols of layers of lower priority can be used to correct symbols of layers of higher priority. The proposed transmission scheme uses RDLM in combination with SVC to serve different device capabilities. Furthermore, we use the L-FEC as extension of the Raptor code defined in DVB-H as proposed in [14]. Simulation results show an increase in reliability of an SVC transmission in a mobile RDLM scenario. The rest of the paper is organized as follows. Section 2 gives a very brief overview of the SVC standard. Section 3 introduces the Raptor L-FEC defined in DVB-H and in section 4 we outline the proposed layer-aware FEC extension and show a simple example. In section 5 we apply the L-FEC to the Raptor defined in DVB-H and in 6 we show selected simulation results and conclude in section 7 with a summary. 2. SCALABLE VIDEO CODING The SVC design, which is an extension of the H.264/AVC video coding standard, can be classified as a layered video codec. An SVC bit-stream can be structured so that devices with different capabilities can decode parts of it that have a quality very similar to the case when the bit-stream for each device would be a singlelayer H.264/AVC bit-stream. In SVC, the hybrid video coding approach of motion-compensated transform coding of H.264/AVC is extended in a way that a wide range of spatio-temporal and quality scalability is achieved. The base layer (BL) is an H.264/AVC compliant bit-stream that ensures backwardcompatibility for existing receivers. The temporal scaling functionality of SVC for high delay configurations is typically based on a temporal decomposition using hierarchical bi-predictive pictures. The spatial scalability is achieved by different encoder loops with an over-sampled pyramid for each resolution. For details of SVC, see [3][6]. Using layered multicast, typically the redundancy symbols are generated separately for each layer. Due to the dependencies within the SVC bit-stream, lower priority layers and the associated redundancy symbols cannot be used without successfully decoded higher priority layers. I.e., if a higher priority layer is lost, even if it is fully received, all dependent layers including related redundancy become useless. The idea of L-FEC is, to follow media coding dependencies in the media stream for the generation of across layer protection. Using the proposed approach, protection of less important layers can be jointly used with protection of more important layers for recovering the SSs of all participating layers. A dependency path (DP) contains all referenced layers for decoding a particular frame in order of importance. Using the LFEC, all redundancy symbols in the same DP can be jointly used for error correction. Figure 1 sketches the L-FEC approach for one scalable dimension, e.g., temporal, spatial or SNR scalability. FEC L-1 FEC L-1 FEC L-1 ... ... ... ... ... ... ... Dependency Layer 2 FEC 2 Dependency Layer 1 FEC 1 FEC 1 FEC 1 Dependency Layer 0 FEC 0 FEC 0 FEC 0 Time = 0 FEC 2 Time = t FEC 2 ... ... D3 1 D2 1 ... D1 1 FEC Base layer 010 FEC 110 3 ... ... Base layer ... D2 1 D2 1 n sio en 3) m D i (D ... ... ... Base layer ... D1 2 D1 1 D1 1 FEC Base layer 100 FEC 200 Base layer ... Dimension 1 (D1) Fig.2: L-FEC generation over multiple dimensions In each dimension, several layers lDi are present where the arrows represent the dependencies. In this example, each layer depends on all layers of higher priority of the same dimension and partially on the layers of other dimensions. All redundancy symbols FEC lD1lD2lD3 are generated over all depending layers. The redundancy symbols within a particular dependency path can be jointly used for correcting all source symbols of that path. The BL, e.g., is included in all FEC symbols. Hence, there are multiple paths where redundancy symbols can be jointly used for correcting errors in the BL. The L-FEC approach allows for the joint use of symbols within multiple dimensions. I.e. for a single dimension, FEC 200, FEC 100, and FEC 000 can be jointly used for decoding source symbols in dimension D1 up to the maximum layer used for FEC generation. In the multidimensional case, FEC 111, FEC 110, FEC 010, FEC 100, and FEC 000 can be jointly used for correcting errors up to the maximum layer used for L-FEC generation. In Fig.3 we present an encoding example of the L-FEC using a very simple systematic FEC code, where redundancy symbols are generated by simple XOR combinations of the source symbols. Dependency Layer B L-FEC Extension Layer A iA0 iA1 iA2 iB0 iB1 iB2 0 0 1 1 1 0 IB=(110) XOR 1 à CB=(11011) pB1 1 pB0 Time = 2t iA0 iA1 iA2 Fig.1: Layer Aware FEC generation over one dimension The depicted SVC media bit-stream consists of L layers where the arrows show the dependencies between the layers. The redundancy symbols FEC 0 of the highest priority layer l=0 are typically generated given by the FEC coding technique T. Redundancy symbols of the enhancement layer l=x are calculated incorporating SSs of all layers l ≤ x. I.e., FEC 1 symbols are generated over SSs of layer l=0 and layer l=1. Furthermore, FEC 2 symbols are generated over SSs of layer l=0, layer l=1 and layer l=2 and so on up to FEC L-1, which is generated over SSs of itself and all layers of higher priority. Using the L-FEC approach the redundancy symbols of different layers, but same DP, can be FEC 111 D1 1 D2 2 FEC D2 1 020 Base layer FEC Base layer 000 4. LAYER-AWARE FEC Dependency Layer L-1 ... Raptor codes as first introduced in [11] belong to the category of Rateless or Fountain codes. Such a type of a FEC code can produce a theoretically infinite number of encoding symbols (ESs) from a limited number of source symbols (SSs) with a linear complexity. The receiver can recover the original data by an inverse encoding process after receiving an amount of ESs only slightly larger than the number of SSs. The Raptor code adopted by DVB-H [7] is a systematic code based on the concatenation of a Luby-Transform (LT-) Code [12], and an additional pre-code. The pre-code produces intermediate symbols, which are used as input symbols of the LT-Code. For systematic behavior, the pre-code is designed so that the output of the LT-Code contains the SSs. The ESs are generated by XORing randomly selected intermediate symbols following a given distribution. More details about the systematic Raptor design can be found in [7][12][14]. jointly used for error correction. Note that the number of redundancy symbols remains constant. This approach can also be extended to multiple dimensions of dependencies as present in SVC. Fig.2 depicts the L-FEC generation for three scalable dimensions D1…D3 corresponding to the temporal, spatial or SNR scalability in SVC. Dimension 2 (D2) 3. RAPTOR CODE IN DVB-H Dependency Layer A 0 0 IA=(001) 1 XOR 0 pA0 1 à pA1 CA=(00101) Fig.3: Exemplary encoding using L-FEC We assume two layers A and B with SSs IA and IB. Layer B depends on layer A. Each layer has k SSs and p=n-k redundancy symbols where k=3 and p=2 with symbol size t=1 Bit. Layer A is protected by typical FEC technique. I.e. the parity bits pA0 and pA1 are calculated by XOR combinations of the SSs. The code word CA xor RB=(11011) 0 Dependency Layer B Dependency Layer A 0 1 1 1 0 iB0 iB1 iB2 iA0 iA1 iA2 IA=(001) IB=(110) rA0 rA1 rA2 rA3 rA4 ? ? ? 0 1 RA=(???01) In this section, we present selected results for transmission of QVGA and VGA resolution using SVC over a DVB-H channel. A Gilbert-Elliot (GE) model is used as statistical model for simulation of burst losses on the DVB-H channel as used in [13]. The transmission blocks (TB) of the wireless channel are of size 186 bytes. The mean error burst length is about 100 TBs. We simulated two different sequences. The CIRCLE sequence has 25 fps and a length of 1297 frames and the SOCCER sequence has 30 fps and a length of 1794 frames. For encoding we used the SVC reference software JSVM8.8 with a H.264 BL at QVGA resolution and a spatial enhancement layer (EL) at VGA, a groupof-picture (GOP) size of 16 and random access point at each second GOP. In case, a VGA receiver does not receive the spatial EL, we calculated the PSNR value of an up-scaled VGA resolution and we use freeze frame error concealment in case the BL is lost. The BL is protected with a standard Raptor, whereas the spatial EL protection uses the L-FEC approach. Fig.5 sketches the different protection types for exemplary two layers, i.e. the L-FEC of the EL also incorporates the source symbols of the BL. ? ? ? IA=(???) iA0 iA1 iA2 Fig.4: Exemplary decoding using L-FEC Note, if layer A can be corrected by itself, the additional connections introduced by the L-FEC to layer A can be removed by the use of the SSs of layer A. In such a case, layer B can be corrected following standard FEC using only the redundancy symbols of layer B. We define the code rate (CR) as the number of source symbols k over the number of related output symbols n including all redundancy symbols p. Using the L-FEC, the CR of layer B remains constant with CRB=kB/nB. Contrary to that, the CR of layer A decreases due to the additional protection by the L-FEC in layer B. If there is no error in layer B, the minimum CRmin of layer A using the L-FEC decreases to CRmin = ka/(na+pb), because all redundancy symbols of layer B can be used for layer A. Therefore, using the L-FEC, the CR of layer A is between ka/na ≥ CRa ≥ ka/(na+pb) depending on the number of received symbols in layer A and B. I.e., for a layered transmission, the CR can never become higher or the protection can never become lower by using the L-FEC than with standard FEC. 2 Layer xor 1.Enh. Layer FEC 1 BaseLayer FEC 0 L-FEC approach 1.Enh. Layer BaseLayer L-FEC 1 FEC 0 Fig.5: Standard FEC generation and L-FEC approach The ESs are computed over the amount of data between the random access points. Tab.1 and Tab.2 show the CR of the standard FEC as well as the CRmin of the L-FEC and the operation points of the SVC stream. Since CRmin is estimated by adding the redundant bit-rate of the EL to the redundant bit-rate of the BL. Tab.1: Code-rate (CR) and operation points for SOCCER H.264 Base layer (QVGA) Spatial Enh. Layer (VGA) Source Bit-rate 352 kbps 685 kbps PSNR VGA 31.70 dB up-scaled 34.85 dB CR FEC/UEP 0.62/0.50 ≈CRmin FEC/UEP 0.36/0.35 0.62/0.70 0.62/0.70 Tab.2: Code-rate (CR) and operation points for CIRCLE 5. L-FEC WITH SYSTEMATIC RAPTOR CODES In order to apply the idea of L-FEC to the systematic Raptor FEC, the encoding process has to be modified for dependency layers l > 0 following the L-FEC procedure shown in section 4. The standard LT encoding only covers the SSs of the actual layer. To extend it following the L-FEC approach and to keep the code rate constant, the encoding process has to be extended to all referenced layers. Standard FEC 2 Layer rB0 rB1 rB2 rB3 rB4 1 1 0 1 1 6. SIMULATION RESULTS Spatial Scalability L-FEC Extension Layer A rA0 rA1 rA2 rA3 rA4 ? ? ? 0 1 I.e. the XORing process covers all layers of higher importance. To keep the systematic behavior of the extended code, the pre-code has to be modified as well. Further details about the required modification can be found in [14]. In the simulations in section 6, we use such an extended Raptor code, which provides systematic behavior and layer-aware protection, similar to the example given in section 4. The resulting ESs of lower priority layers are generated incorporating the SSs of higher priority layers, and the first symbols correspond to the original SSs of the encoded layer. Spatial Scalability is generated by concatenating the SSs with the redundancy symbols. Layer B is protected by the L-FEC, i.e. the redundancy symbols pB0 and pB1 of layer B are generated by additionally XORing the source symbols of layer A. Finally, the redundancy bits are concatenated with the SSs of layer B to the code word CB. With the L-FEC, the redundancy symbols of layer B now also protect layer A. It is obvious, that the bit-rate of the encoded stream remains equal to the standard FEC generation. Furthermore, each codeword is transmitted over an error prone channel. A client receives the code words RA and RB as depicted in Fig.4. We assume, that the code word of layer A is affected by three errors, which overburdens the error correction capabilities of the applied protection of layer A. I.e. using standard FEC, layer A could not be decoded. Layer B is received without any error. Therefore, its SSs can be recovered. But the video stream cannot be decoded due to the missing references of layer A. Using the L-FEC, the received redundancy bits of layer B can be used in combination with the received symbols of layer A for a combined error correction. Since there are enough received symbols in both layers, the SSs of both layers can be successfully recovered. H.264 Base layer (QVGA) Spatial Enh. Layer (VGA) Source Bit-rate 222 kbps 275 kbps PSNR VGA 33.26 dB up-scaled 36.84 dB CR FEC/UEP 0.60/0.51 ≈CRmin FEC/UEP 0.40/0.40 0.60/0.70 0.60/0.70 Fig.6 and Fig.7 show the results of the simulations, where the Y-axis shows the mean received video quality in terms of PSNR at different TB loss rates. We compare two settings using different CR distributions FEC, UEP as shown in Tab.1 and Tab.2. All settings have the same transmission bit-rate of 840kbps for CIRCLE and 1833kbps for SOCCER sequence incorporating header overhead. Each setting is simulated using standard FEC encoding (Normal) and the L-FEC (Extended) in the spatial EL. We focus on the VGA receiver, due to a QVGA client only receiving the BL would show similar performance compared to standard FEC. However, the results in [14] show, that with additional reception of the VGA stream, even a QVGA receiver would profit by the use of the L-FEC. SOCCER VGA Receiver 1833 kbps - Burst length: 100 TB 35 7. CONCLUSION AND SUMMARY In this work, we propose a layer-aware forward error correction (LFEC) approach. L-FEC generates redundancy symbols incorporating layered structures in modern media codes like SVC. SVC is used to transmit two different resolutions QVGA and VGA at the same time. We applied the L-FEC approach to a Raptor code. The L-FEC approach enhances the protection capability of the spatial enhancement layer without increasing the bit-rate. Simulation results in an RDLM-like scenario (DVB-H) show that the proposed approach outperforms a standard UEP scheme. In such a scenario, the L-FEC can never show a weaker performance than standard FEC. Therefore, we recommend the use of the LFEC for the transmission of layered media, e.g. SVC. 34 8. REFERENCES 33 Mean PSNR [dB] 32 31 30 29 FECNormal FECExtended 28 UEPNormal 27 26 UEPExtended 0 0.05 0.1 0.15 0.2 0.25 Mean TB loss rate 0.3 0.35 0.4 Fig.6: SOCCER sequence using a VGA receiver CIRCLE VGA Receiver 840 kbps - Burst length: 100 TB 37 36 35 Mean PSNR [dB] 34 33 32 31 30 FECNormal 29 FECExtended UEPNormal 28 27 UEPExtended 0 0.05 0.1 0.15 0.2 0.25 Mean TB loss rate 0.3 0.35 0.4 Fig.7: CIRCLE sequence using a VGA receiver Using a VGA receiver, the L-FEC approach shows a gain in PSNR for all settings and sequences. I.e. the additional protection of the L-FEC reduces losses in the base layer. In contrast to standard FEC, using the L-FEC, the FEC and UEP scheme show a similar performance at lower loss rates. First at higher loss rates, the settings with the highest protection for the BL show the best performance. The quality difference between the Normal and Extended approaches decreases with the increase in difference between CR of the layers, this is due to the gain is caused by the redundancy in the EL. If there is less redundancy in the EL, the observed gain decreases as can also be observed by the difference between the CRmin value and the CR value as shown in Tab.1 and Tab.2. [1] ETSI, “Digital video broadcasting (DVB); transmission system for handheld terminals (DVB-H),” European Standard EN 302 304, V1.1.1, November 2004. [2] S. McCanne, V. Jacobson, and M. Vetterli, “Receiver-driven layered multicast,” ACM SIGCOMM’96, Aug. 1996. [3] ITU-T and ISO/IEC JTC 1, “Advanced video coding for generic audiovisual services, ITU-T Recommendation H.264 and ISO/IEC 14496-10 (AVC),” Version 8, July 2007. [4] T. Schierl, C. Hellge, S. Mirta, K. Grüneberg, and T. Wiegand, “Using H.264/AVC-based Scalable Video Coding (SVC) for Real Time Streaming in Wireless IP Networks,” IEEE ISCAS, May 2007. [5] H. Schwarz, and T. Wiegand, “Further Results for SVC coding: QVGA to VGA spatial scalability in Mobile TV,” DVB, TM-AVC, TM-AVC0351, Dec. 2007. [6] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the Scalable Video Coding Extension of the H.264/AVC Standard,” IEEE Trans. on Circuits and Systems for Video Technology, Special Issue on Scalable Video Coding, Sept. 2007. [7] ETSI, “Digital Video Broadcasting (DVB); IP Datacast over DVB-H: Content Delivery Protocols,” TS 102 472 V1.2.1, Dec. 2006. [8] B. Girod, U. Horn, and B. Belzer, “Scalable Video Coding With Multiscale Motion Compensation And Unequal Error Protection,” in Proc. International Symposium on Multimedia Communications and Video Coding, New York, Oct. 1995. [9] A. Albanese, J. Blomer, J. Edmonds, M. Luby, and M. Sudan, ”Priority encoding transmission,” IEEE Trans. on Information Theory, 1996. [10] A. Bouabdallah, and J. Lacan, “Dependency-aware unequal erasure protection codes,” Journal of Zhejiang University Science A, Volume 7 (Suppl. 1): 27-33, April 2006. [11] A. Shokrollahi, ”Raptor codes,” IEEE Trans. on Information Theory, Vol. 52, No. 6, June 2006. [12] M. Luby, T. Gasiba, , T. Stockhammer, and M. Watson, ”Reliable Multimedia Download Delivery in Cellular Broadcast Networks,” IEEE Trans. on Broadcasting, March 2007. [13] M. Hannuksela, V. Vadakital and S. Jumislo-Pyykk, ”Comparison of Error Protection Methods for Audio-Video Broadcast over DVB-H”, EURASIP Journal on Advances in Signal Processing, 2007. [14] C. Hellge, T. Schierl, and T. Wiegand, “Mobile TV using scalable video coding and layer-aware forward error correction,” IEEE ICME, June 2008.