1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 Signal Processing: Image Communication 16 (2001) 745}762 Forward error correction (FEC) codes based multiple description coding for internet video streaming and multicast Rohit Puri *, K. Ramchandran , K.W. Lee, V. Bharghavan EECS Dept, UC Berkeley, USA ECE Dept, Univ. IL, Urbana, USA Abstract In this work, we describe the application of generalized multiple description (MD) source coding architected through the usage of forward error correction (FEC) codes to achieve robust and e$cient Internet video streaming and multicast consistent with the existing Internet architecture. We highlight the synergistic interaction between the robust source coding and packetization strategy and an e$cient as well as responsive, TCP-friendly congestion control protocol linear increase multiplicative decrease with history (LIMD/H) to achieve robust streaming over the Internet. The proposed source packetization scheme transforms a loss sensitive layered video bitstream into a robust packet stream so as to provide graceful resilience to network packet drops. The proposed congestion control mechanism provides low variation in transmission rate in steady state and at the same time is reactive and provably TCP friendly. Furthermore, exploiting the fundamental property of a multicast tree, namely, the observation that the multicast tree is a degraded channel, we propose an elegant application level architecture for e$cient video multicast. We exploit the amenability of the MD coding strategy to fast transcoding. Our extensive simulation studies for the streaming problem and preliminary results for the multicast case, bear out the e$cacy of the proposed setup. 2001 Elsevier Science B.V. All rights reserved. Keywords: Robust multimedia streaming; Application-transport layer interaction; Multimedia multicast 1. Introduction In the current deployment of the Internet, network routers are typically oblivious to the structure or content of the packets and treat all packets equally. While this simple network design has enabled the Internet to scale to its present size, it adversely impacts multimedia #ows. Typical multi* Corresponding author. E-mail addresses: rpuri@eecs.berkeley.edu (R. Puri), kannanr@eecs.berkeley.edu (K. Ramchandran), kwlee@timely. crhc.uiuc.edu (K.W. Lee), bharghav@timely.crhc.uiuc.edu (V. Bharghavan). media streams are highly structured, i.e., characterized by a hierarchy of importance layers (e.g., I, P versus B frames in MPEG) and hence are fundamentally mismatched to the existing switching strategies. Furthermore, emerging multimedia compression standards for image/video coding are evolving towards a multiresolution (MR) or layered representation of the coded bitstreams. For example, there is a strong push in the next-generation image and video compression standards, JPEG-2000 and MPEG-4, respectively, to support scalability. Multiple description (MD) source coding has recently emerged as an attractive framework for 0923-5965/01/$ - see front matter 2001 Elsevier Science B.V. All rights reserved. PII: S 0 9 2 3 - 5 9 6 5 ( 0 1 ) 0 0 0 0 5 - 4 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 746 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 robust transmission over unreliable channels. The essential idea is to generate multiple independent descriptions (N in number) of the source such that each description independently describes the source with a certain desired "delity; when more than one description is available, they can be synergistically combined to enhance the quality. Note that while the MR bitstream is sensitive to the position of losses in the bitstream (e.g., a loss early on in the bitstream can render the rest of the bitstream useless to the decoder) the MD stream is insensitive to them and thus has the desired feature that the delivered quality is dependent only on the fraction of packets delivered reliably. While most of the initial work in this area has focussed on the special case of N"2 descriptions [6,30], there has been recent interest in the N'2 case also [8,16,30,20]. For example, [20] presents an e$cient algorithm that transforms an MR-based prioritized bitstream into a non-prioritized MD packet stream for which the delivered quality is a function of only the number of packets delivered, and is thus well suited to Internet multimedia transmission. The proposed strategy uses forward error correction (FEC) channel codes and incorporates the priority encoding transmission (PET) [1] philosophy in formulating a systematic, fast and end-to-end ratedistortion optimized algorithm. Details of this are presented in Section 2. The `smart end-hosts/simple networka philosophy which has enabled the Internet to scale to its present size, entrusts to the end user, the responsibility of maintaining transmission rates that are fair to other connections. Two extreme transport services that have emerged under this model are the reliable, network-fair transmission control protocol (TCP) [27] and the unreliable `best e!orta service called the user datagram protocol (UDP) [18]. TCP is not well suited for multimedia transmission for multiple reasons. First, the feature of reliability that it o!ers is unnecessary for multimedia applications and furthermore obtained at the expense of large round trip delays since reliability is attained through repeated retransmissions (ARQ: automatic repeat request). Typical multimedia applications, being loss-tolerant and delay sensitive, are therefore not well matched to TCP. Secondly, TCP employs a conventional endto-end congestion control paradigm based on the linear increase multiplicative decrease (LIMD) philosophy. The source continuously probes the network by incrementing its transmission rate (linear increase) when it encounters no loss. However, upon loss, it throttles down its transmission rate by a large scaling factor (multiplicative decrease). The LIMD approach reacts aggressively to packet loss which enables quick response to the onset of congestion; however it leads to undesirably large variations in the delivered transmission rate even in steady state. Such a congestion control philosophy can have an adverse impact on multimedia #ows since for such applications, `smoothnessa in delivered quality is an important consideration from the end user's point of view. The design of multimedia protocols thus dictates that transmission rate variations be small, while also requiring the congestion control policy to be network-friendly, namely, that it should allow other TCP #ows to co-exist and not grab their fair share of the bandwidth. In today's Internet, where most of the tra$c is TCP based, what is thus required is a congestion control paradigm that builds on top of UDP and possesses the property of `TCP-friendlinessa. Equation-based rate control has been recently proposed [5] as an attractive option for multimedia Internet transmission. It has the desirable feature of delivering maximally smooth, TCP-fair transmission rate. However, it is characterized by a relatively slow responsiveness to network dynamics. In this work we present a new congestion control algorithm, based on linear increase and graded multiplicative decrease, that can be conceptualized as generalizing both the LIMD control and equation based rate control approaches. We dub our approach `LIMD with historya or LIMD/H since it uses history of packet loss to adapt the control parameters. Recent work [24,25,28] on multimedia streaming has highlighted the bene"ts of joint design of the source coding schemes and the transport layer protocols. Our end-to-end system design subscribes Subjectively, it has been found that large deviation in delivered picture quality leads to a bad user experience. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 747 Fig. 1. Block diagram of the end-to-end video transmission system. prioritized MR bitstream into an N-packet unprioritized MD packet stream using e$cient erasure channel codes. Each description in the MD stream occupies an entire network packet, thus the terms `descriptiona and `packeta are used interchangeably. 2.1. Transcoding mechanism Fig. 2. Progressive bitstream from the source coder partitioned into N layers or quality levels. to this inter-layer interaction philosophy. While the individual components are novel in their own right, a key aspect of our approach is their e$cient and synergistic coordination in an integrated architecture. See Fig. 1 for the end-to-end block diagram of our streaming system. The remainder of this paper is organized as follows. Section 2 describes the FEC based robust source coding (MD-FEC) scheme. Section 3 presents the congestion control algorithm and the interaction between the application and the transport layer. In Section 4, we evaluate the performance of our system using the ns-2 network simulator [15]. Section 5 views the source coding scheme as a `transcoding agenta describing its applicability to a multicast scenario and presents preliminary results for the same. Section 6 concludes the paper and o!ers directions for future work. 2. Forward error correction codes based multiple description coding In this section, we brie#y describe the mechanics of the packetization strategy that converts the The quality pro"le re#ects the target quality (or equivalently distortion d: lower distortion implies higher quality and vice versa) when any k out of N descriptions are received. We will use the notation d(k) to describe the quality pro"le where the ith entry in d(k) represents the target quality when i descriptions are received. Given N and d(k) and a progressive bitstream, the stream is marked at N positions (see Fig. 2) which correspond to the attainment of the distortion levels d(k) and is thus partitioned into N sections or resolution layers. The goal is to enable the ith layer to be decodable when i or more Bitstreams which have the property of successive re"nability, i.e., new blocks of data re"ne the previous blocks of data are referred to as multiresolution or scalable bitstreams. Bitstreams with a large re"nement block size are typically referred to as layered bitstreams while bitstreams for which the re"nement block size is, theoretically, of the order of a bit are referred to as `bit-progressive or embeddeda bitstreams as in the case of the 2-D SPIHT image coder [23]. While the analysis presented in this section applies directly to the latter category of bitstreams, it can be easily generalized to the case of layered bitstreams such as those that can be generated by H.263#/MPEG video encoders. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 748 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 Fig. 3. N-description generalized MD codes using forward error correction codes. descriptions arrive at the receiver (i.e., when the number of erasures does not exceed N!i). This can be attained using the Reed Solomon family of erasure-correction block codes (An (n,k,d) block code is de"ned by a length n code with k user symbols and a minimum distance of d, i.e it can correct (d!1) erasures. Reed Solomon block codes have the property of maximum distance (d"n!k#1), i.e. the whole data can be recovered from any k out of n symbols.), which are characterized by the optimal code parameters (N, i, N!i#1) and can correct any (N!i) erasures out of N descriptions. We split the ith layer into i equal parts, and apply the (N, i, N!i#1) Reed Solomon code to get the contribution from the ith layer or section to each of the N descriptions. The contributions from each of the N levels or sections are then concatenated to form the N descriptions (see Fig. 3). This packetization strategy thus provides the property that the more the number of packets received, the better the received quality. We have so far shown the mechanics of our transcoding algorithm that transforms an MR bitstream into an MD packet stream but have not yet addressed the question of how to optimize the rate partitions in the framework. The answer to this of course depends on the properties of the channel at hand. We address this question in the following section. 2.2. Optimization problem statement In this section, we formulate the optimization problem. We assume that our model is characterized by q(i), i"!1, 2, N!1, denoting the probability that i#1 out of N packets are delivered to the destination. We coin the term transmission proxle to refer to the above channel state information. From operational rate-distortion theory, it follows that distortion is a one-to-one function (D(r)) of the rate r. Hence determining the quality pro"le d(k) of order N corresponds to "nding the rate partition R"R , 2, R of the bitstream (see ,\ Fig. 2). De"ne the expected distortion ED, ,\ ED"q E# q D(R ), (1) \ H H H where E equals the source variance, the distortion encountered when the source is represented by zero bits. When the codes as outlined in the above section are used, the total rate used R equals R R R " N R 1 # R !R R !R N#2# ,\ ,\ N. 2 N (2) 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 749 Problem statement. Given the number of packets N, each packet of size ¸ (i.e. a total rate budget" N¸), an embedded bitstream with rate}distortion curve D(r), and the transmission pro"le q(i), "nd R that minimizes ED subject to R )RH, (3) R R )R )2)R )R , (4) ,\ ,\ R !R "k (i#1), k *0, i"1, 2, N!1. G G\ G G (5) The constraints (4) and (5) are, respectively, referred to as the `embeddinga and the `channel coding constraints. The former arises due to the physical nature of the bitstream, i.e., a greater number of source layers would use a greater amount of rate (see Fig. 2). The latter arises from the fact that in order to apply an (N, i, N!i#1) Reed Solomon code to the ith layer, the ith layer should contain a number of source symbols that is an integral multiple of i. The solution to this problem based on Lagrangian optimization has been discussed in detail in [20] and is outside the scope of this paper. To summarize, we thus presented a non-prioritizing encoding strategy that is ideally suited for multimedia transmission over the Internet. In the following section, we deal with the second aspect of the problem, namely, designing of a TCP friendly congestion control protocol given that the target application is multimedia transmission. Fig. 4. (a) LIMD: "0.5, (b) LIMD/H: "small. Dotted values represent the average connection throughput. 3.1. Motivation for LIMD/H algorithm for adapting the transmission rate of a connection to match the available bandwidth whether the target application is data-oriented [27] or multimedia [2,22]. Brie#y, LIMD periodically adapts the sending rate of a connection by gently increasing the rate by a linear constant upon observing no packet losses (in order to probe for additional bandwidth), and aggressively decreasing the sending rate by a large multiplicative constant upon observing packet losses in order to alleviate congestion (see Fig. 4). The trade-o!s of LIMD are quite well known. While the LIMD approach has been shown to be robust and asymptotically convergent to fairness [7], it also reacts identically and aggressively to all packet losses, both congestion induced and noncongestion induced. When there is actual congestion, LIMD throttles the sending rate su$ciently so that congestion can be cleared out quickly. But when the network bandwidth is invariant, periodic packet losses due to channel probing still induce Congestion control algorithms that are deployed in the current Internet typically employ the linear increase multiplicative decrease (LIMD) paradigm For TCP, the increase factor is 1 packet/RTT and the decrease factor is set to a half. 3. Congestion control In the target environment, a congestion control algorithm should provide a smooth variation of transmission rate so as to ensure lesser variation in delivered multimedia quality. In addition it should possess features such as quick response to network dynamics, e$ciency, fairness and TCP-friendliness [5]. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 750 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 large rate #uctuations, which is undesirable especially given the target application. An ideal end-toend congestion control algorithm would be that which comprises of a loss di!erentiation mechanism that classi"es packet losses as being congestion/non-congestion induced and reacts instantly to the former by adapting to the new network capacity while throttling gracefully to the latter case. The congestion control algorithm we present in this paper called LIMD/H (LIMD with history) is an augmentation of the basic LIMD approach and tries to achieve the behavior of an ideal congestion control algorithm. It features a simple loss di!erentiation mechanism to predict the cause of packet loss and react accordingly. The LIMD/H congestion control algorithm has the following salient attributes: 1. LIMD/H uses the history of packet losses in order to distinguish between congestion-induced and non-congestion-induced packet losses. 2. LIMD/H reacts gently to non-congestion-induced losses in order to keep the sending rate variation to minimum, but reacts quickly to the onset of the congestion. 3. In the steady state it delivers the same average rate as the LIMD control (with "1 and "0.5) implemented in TCP would, and therefore is TCP-friendly. 3.2. Framework for congestion control In our framework, the evolution of the transmission rate occurs over discrete time periods called epochs. At the end of each epoch, the receiver computes the number of received packets during that epoch and then sends a congestion feedback message to the sender (via acknowledgements) containing this number. The loss fraction 0(f(1 can thus be calculated at the sender. Upon receiving the loss feedback, the sender executes the LIMD/H congestion control algorithm to adjust its sending rate. 3.3. LIMD/H congestion control We "rst present the rate evolution equations for LIMD congestion control. Let r denote the sending rate, f denote the loss fraction, denotes the linear increase constant, and denotes the multiplicative decrease constant. Using LIMD: If f"0, rQr#. If f'0, rQr(1!). LIMD employs aggressive rate throttling for any packet loss, congestion related or probe related. Ideally, we want to reduce the sending rate aggressively upon congestion-induced loss and adapt gracefully to probe-induced loss. The goal therefore is to set the value of the rate-throttling factor based on the cause of packet loss. An intuitive heuristic to predict congestion is the following: if the packet losses observed during an epoch are caused by congestion but the sender does not throttle aggressively enough, then the loss will persist; on the other hand, if the packet loss is a probe loss, then so long as the sender throttles enough to account for the loss, there will be no loss in the next epoch. Taking these factors into account, we propose the LIMD/H algorithm that keeps track of a very simple history parameter, h. The h variable is initially set to 1, is doubled if there is a packet loss in an epoch, and reset to 1 if there is not. Thus, the h variable is a very simple mechanism to capture the history of packet loss in the previous epochs. Based on the value of the h variable, LIMD/H exercises a graded multiplicative decrease upon packet loss: If f"0, rQr# and hQ1. If f'0, rQr(1!h) and hQ2h. In case of LIMD/H, we typically set the multiplicative decrease constant to be small values, e.g. 0.1, in order to achieve small variations of sending rate when the available bandwidth is invariant. LIMD/H thus throttles its transmission rate gently when there was no packet loss in the previous epoch, and progressively more aggressively when previous epochs have also experienced packet loss (see Fig. 4). Of course, the LIMD/H algorithm presented above is a very simple augmentation to LIMD. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 A detailed treatment of this algorithm is presented in [9,12]. Also in [12], it was shown that the increase/decrease parameters of LIMD/H can be tuned to achieve approximately the same steady state throughput as the LIMD congestion control (with "1 and "0.5) of TCP would. Essentially, to reduce the variation in the transmission rate, typically small values of are used in LIMD/H. Hence to ensure same average rate as say TCP, needs to be made correspondingly smaller. This approach to congestion control contrasts that proposed in [5,19] where the TCP throughput equation has been used to perform rate based congestion control. Equation-based congestion control, as it is known, measures the round-trip time and observed packet loss probability to compute the nearly constant TCP-friendly rate for a #ow. The advantage of such an approach lies in the fact that it can provide very smooth rate control, avoiding abrupt changes in sending rate, which can be well matched to multimedia applications such as video streaming over the network } to this extent, its properties are similar to LIMD/H. However, it is relatively slow to respond to network dynamics. Another drawback of equation-based congestion control is that its success depends on accurate measurements of the connection round-trip time and the packet loss probability; the latter is very di$cult to calibrate accurately in a responsive manner since it requires large observation intervals. A quantitative comparison of these approaches, is however, outside the scope of this paper [9,17]. 3.4. Coordination between application and transport layer To summarize, so far, we have advocated a state-of-the-art algorithm that o!ers a robust source encoding option and an e$cient algorithm at the transport layer (LIMD/H) that o!ers smooth transmission rate variations that are amenable to multimedia transmission. A key aspect of the end-to-end system is the synergistic interaction between the application and the transport layer which highlights the value sharing information between the application and the transport layer. This interaction is e!ected by the 751 e$cient transfer of the channel state information from the transport layer to the application layer (see Fig. 1). The congestion feedback from the receiver is available at the transport layer and can be used to infer the channel state and hence the transmission pro"le which can then be operated on by MD-FEC algorithm to deduce the optimal encoding strategy. While a wealth of literature is available on sophisticated tra$c models for the Internet, capturing the dynamics of the Internet channel has proved to be a di$cult task in general. Driven by our goal of designing a simple but e!ective interface for transfer of channel state information, we resort to the proven histogram based adaptive estimation principles that blend long-term averaging and shortterm updates, that have been deployed with great success in various research areas (ranging from control systems to adaptive "ltering to arithmetic coding). Our approach for channel estimation is measurement-based and simply involves collecting statistics available from reverse channel feedback and then conveying them to the application layer (MD-FEC channel pro"le). This is achieved by maintaining a pro"le of the relative frequency f (t) G of occurrence of each rate sample r(i) (the number of packets successfully received by the receiver). For the actual implementation, we maintain a histogram of the rate samples r(i). For updating the histogram upon availability of new feedback information, on receiving new feedback information, we add the `newa information to the `olda information in a weighted fashion where the weight for the old information has the connotation of a `forgetting factora which enables faster adaptation to network dynamics. The histogram is interpreted as the transmission pro"le by the MD-FEC transcoder. Based on our experiments, we "xed the forgetting factor at 0.875 which worked reasonably well for all situations. To summarize, our current mechanisms incur very low complexity and despite their simple nature perform well in various simulation scenarios that we have considered. Since our architecture is modular in design, it can easily accommodate sophisticated channel modeling machineries also which could form a part of possible future work. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 752 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 4. Simulation results In this section, we present simulation-based performance results of the proposed video transmission scheme. The ns-2 network simulator was used to implement the LIMD/H congestion control algorithm as a part of the transport layer protocol. For our simulations, we used a parameterized version of the `Footballa video sequence as encoded by the state-of-the-art embedded video encoder, namely the 3-D SPIHTencoder [10]. The frame size is 352;240 pixels for the luminance with a subsampling factor of 2 in each of the spatial dimensions for each of the two chrominance components. Each GOP comprises of 16 frames. The transmission speed chosen was 1 GOP per epoch, where each epoch has a duration equal to 650 ms, which corresponds to approximately 26.7 frames/s. The details of the performance of the individual components namely the MD-FEC and the LIMD/H algorithm are presented in [9,20], respectively, wherein the two respective components were shown to signi"cantly outperform state-of-the-art reference systems. (For example, the MD-FEC algorithm was shown to outperform reference systems [8,26] by rate savings of the order of 30% while delivered identical image qualities). In this paper, we present the performance of the end-toend system as a whole especially highlighting the e!ectiveness of the simple interaction between the application and the transport layer. Our reference transmission systems represent a gradual evolution from the dumb MR scheme with no robustness at all to the proposed MD-FEC scheme. They are: 1. Multiresolution (MR) source encoder, 2. Constant FEC or equal error protection (EEP), 3. Fixed unequal error protection (FUEP). MR source encoder corresponds to no robustness against packet loss. EEP is the case when di!erent layers of the prioritized bit stream are assigned the same level of error protection. FUEP corresponds to the case when di!erent layers of the prioritized bit stream are assigned unequal error protection that is xxed, and therefore does not adapt to the varying network conditions. An example of an FUEP system is the priority encod- ing transmission (PET) [1] scheme applied to encoding of MPEG video [13]. The performance of the video transmission systems was compared in terms of the measured delivered quality at the receiver (or peak signal-tonoise ratio (PSNR) in dB) versus time (s). We provide performance results of the candidate robustness mechanisms in a variety of network scenarios ranging from single hop to multihop topologies to varying degrees of random loss probabilities to sudden variations in network capacity. The random loss model captures the commonly accepted connection model for a #ow that shares a large network with a large number of other #ows. The scenario where the network capacity suddenly changes illustrates the behavior of the system for networks with heavy dynamics such as router/link failure, or upon the occurrence of major network events such as the NASA path"nder webcast. 4.1. Comparison of the source robustness mechanisms In this section, we focus on the performance of various robustness mechanisms with LIMD/H congestion control algorithm in a simple network topology shown in Fig. 5. In all the cases, we tested the performance of each system when there is random loss in the network channel, and when there is a sudden congestion due to the introduction of 800 Kbps CBR on cbrPsink during 400}500 s period. The bottleneck link has a bandwidth of nearly 2.5 Mbps. The performance of the MR encoder is shown in Fig. 6. As expected the MR encoder exhibits extreme fragility and delivers wildly #uctuating quality even in the situation of no random loss, the Fig. 5. A One-hop Network Topology. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 753 Fig. 6. Performance of MR: (a) in steady state and upon the introduction of a CBR #ow, (b) 0.01% random loss on the bottleneck link. Fig. 7. Performance of Equal Error Protection (EEP) in a one hop topology network with no random loss depicted in Fig. 5 for varying degrees of redundancy: (a) 10% redundancy, (b) 20% redundancy. amplitude of the #uctuations being more than 5 dB (Fig. 6). In the situation of random loss, the performance is even worse. Fig. 7 depicts the performance of the EEP scheme for the case of variation in network capacity. As expected, as the strength of the channel codes used is increased, smoother albeit lower quality is delivered to the destination. In this case, the quality pro"le exhibits a cliw e!ect as can be seen for the case corresponding to 10% redundancy via channel codes. In Fig. 8 we observe the performance of the EEP scheme for situations of random loss. Once again, the cliw ewect that is inherent in channel codes is depicted. The FUEP scheme assigns pre-speci"ed protection to pre-speci"ed source resolution layers. For instance, in [13], the priority encoding transmission scheme [1] was used for MPEG encoding. Here the I-frames were encoded with priority 60%, P-frames with priority 85% and B-frames with priority 95%. (Protecting a resolution layer with 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 754 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 Fig. 8. Performance of Equal Error Protection (EEP) in a one hop topology network depicted in Fig. 5 for varying degrees of redundancy and random loss: (a) 10% redundancy, 0.01% random loss, (b) 20% redundancy, 0.01% random loss, (c) 10% redundancy, 0.1% random loss, (d) 20% redundancy, 0.1% random loss. priority x% means that the receiver can recover it from x% of the encoded packets. In channel coding jargon, x is referred to as the code rate.) As in [13], in FUEP, we partitioned the source into three layers. Figs. 9 and 10 compare the performance of the FUEP scheme and the MD-FEC scheme. We observe that, in general, the MD-FEC scheme delivers a superior quality in comparison with the two chosen FUEP schemes. Moreover, the quality delivered by MD-FEC is smoother than that delivered by its counterparts. The LIMD/H trans- mission rate evolution for the four scenarios of no loss/CBR #ow, 0.01% random loss, 0.1% random loss and 1.0% random loss is plotted in Fig. 11. We conclude this section with the following remark. A realistic network scenario would comprise of all the simulation scenarios presented above. The MD-FEC/LIMD-H approach owing to its robustness and fast adaptivity properties and its reasonable performance in all the simulation scenarios described above is well suited for such varying network situations. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 755 Fig. 9. Performance comparison between MD-FEC scheme and "xed unequal error protection (FUEP) schemes with rates (60%, 80%, 95%) and (40%, 70%, 95%), respectively, for varying degrees of random loss: (a) MD-FEC, no loss, (b) MD-FEC, 0.01% loss, (c) FUEP (60%, 80%, 95%), no loss, (d) FUEP (60%, 80%, 95%), 0.01% loss, (e) FUEP (40%, 70%, 95%), no loss, (f) FUEP (40%, 70%, 95%), 0.01% loss. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 756 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 Fig. 10. Performance comparison between MD-FEC scheme and "xed unequal error protection (FUEP) schemes with rates (60%, 80%, 95%) and (40%, 70%, 95%), respectively, for varying degrees of random loss: (a) MD-FEC, 0.1% loss, (b)MD-FEC, 1% loss, (c) FUEP (60%, 80%, 95%), 0.1% loss, (d) FUEP (60%, 80%, 95%), 1% loss, (e) FUEP (40%, 70%, 95%), 0.1% loss, (f) FUEP (40%, 70%, 95%), 1% loss. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 757 Fig. 11. LIMD/H transmission rate evolution for the one hop network topology: (a) no loss, CBR #ow, (b) 0.01% random loss, (c) 0.1% random loss, (d) 1% random loss. 4.2. Performance in a large network In this section, we present results for the performance of the robustness mechanisms in a complex multi-hop network topology. We only present the results with FUEP and MD-FEC due to space constraints. Fig. 12 shows the topology of the network. There is a single video stream (vsPvr) and 7 TCP connections (SiPRi) in the network. The link capacity and the one-way delay is annotated on the graph. All the links without annotation were set to 40 Mbps and 20 ms. Fig. 13 shows the performance results of the video stream when encoded by the FUEP and MD-FEC methods. We observe that in general the MD-FEC scheme delivers a smoother and marginally higher end-to-end quality than the FUEP scheme. Although there are occasional spikes in the delivered quality, in general the delivered quality is less jittery than that delivered by the FUEP scheme. Moreover, MD-FEC emerges faster from the interval of congestion than the FUEP scheme once again highlighting its adaptivity features. To summarize, MD-FEC scheme responds reasonably 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 758 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 well to varying network dynamics and is more adaptive than either of the reference systems. 5. Application of MD-FEC to multicast We describe the use of the MD-FEC technique to achieve robust, e$cient and scalable multicasting of real-time multimedia such as video over the Internet. Multicasting multimedia data over the Internet is especially challenging due to the hetero- Fig. 12. A Multi-hop Network Topology with a single video #ow. Links without annotation have a bandwidth of 40 Mbps and a delay of 20 ms. geneous nature of the Internet environment, wherein di!erent receivers can potentially have diverse characteristics in terms of processing capability, and the connection quality in terms of bandwidths and loss rate. Any source based approach su!ers from the problem of inter-receiver fairness, namely given a heterogeneous set of receivers, `optimala source encoding for a particular subset of them can potentially starve others if the degree of heterogeneity is high. An alternative is receiver oriented approaches using layered multicast technology [4,14,29]. Such an approach assumes a layered source that transmits di!erent layers of multimedia stream to distinct multicast groups and each receiver subscribes to the multicast groups depending on its characteristics. [4,29] generalize this idea to include layered forward error correction codes as well. Such approaches however su!er from a number of drawbacks. In layered multicasting, receivers perform `join-groupa experiments subscribe to extra layers in order to probe for more bandwidth. When they su!er losses and hence quality degradation they leave the highest layer (the least important layer) that they are joined. Such actions, when done independently, can potentially lead to destructive interference between receivers that share common links and can result in #uctuations in received data quality. Moreover, in practice join}leave operations are associated with signi"cant Fig. 13. Multi-hop network performance: (a) MD-FEC, (b) FUEP with code rates 60%, 80%, 95% . 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 759 Fig. 14. (a) Network Topology for a network of 16 nodes (link bandwidths are in Mb), (b) Organization of the multicast tree into groups. The thick lines represent the bottleneck links. process overhead at the end hosts in addition to the video stream resynchronization problems at the receiver. Recently there has been work on inference of multicast network topologies based only on endhost measurements [21]. The idea here is to identify sets of receivers that share common bottlenecks in the multicast tree and hence have similar and correlated loss patterns and to group them together (group formation protocol (GFP), see Fig. 14). This approach ends up arranging the whole multicast tree as a hierarchy where groups downstream necessarily have worse reception than groups upstream of them and di!erent groups are separated by bottleneck links. Assuming such an architecture we propose a scheme that could employ the protocol described in [11], wherein a multicast session receiver upstream of a bottleneck link acts as a transcoding agent for the receiver group immediately below it. If the receiver group downstream conveys its channel state to the receiver acting as the transcoding agent, then that receiver can reencode its received bitstream using the MD-FEC encoder to optimally match the channel state below. A multicast channel thus would be transformed to a succession of point-to-point channels. More signi"cantly, the Internet is naturally evolving towards an ISP domain based architecture. In such an architecture the di!erent domain are connected by bottleneck links (e.g. the trans Atlantic cable that connects the ISP on the East Coast in North America with that of Europe). In such a situation, MD-FEC has a natural position at domain edges so that it can optimally re-encode multimedia bitstreams to match them to the domain downstream, i.e. MD-FEC can thus play the role of an application level proxy [3]. 5.1. Results In this section we present some of the preliminary simulation based results for the proposed scheme. The Network Simulator (ns-2) [15] was used as the simulation platform and the video sequence considered was the Football sequence. We present results for the network topology in Fig. 14(a). The reference system is a pure source based approach wherein the source encodes for the average transmission pro"le (the average transmission pro"le is obtained by taking the mean of the transmission pro"les of all the receivers) for all receivers. For the proposed system, the transcoders (see Fig. 14) for groups 2, 3 and 4 were placed, respectively, at nodes 1, 2 and 10, respectively, and the transmission pro"le for each group was passed to the corresponding transcoding agent. In Fig. 15 we plot the quality delivered to a typical receiver in each of the smaller groups for both cases. In this particular simulation scenario we observe a signi"cant quality di!erence (about 7 dB) delivered to receiver 1 and about 4 dB to receiver 12 thus illustrating the gains due to transcoding. In our simulations we also observed that for greater heterogeneity the di!erence was even more noticeable. We thus presented a novel paradigm for multimedia multicast. The proposed scheme ideally 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 760 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 Fig. 15. Quality delivered in dB at various receivers in Fig. 14: (a) Receiver 1, (b) Receiver 4, (c) Receiver 7, (d) Receiver 12. bounds the performance of any multicasting scheme that uses the same source encoder and can serve as a benchmark for comparing di!erent multicasting schemes. 6. Conclusions and future work In recent years, multiresolution coding has become a very popular paradigm for image/video source coding. However, the current `best e!orta network model of the Internet is not well suited for the transmission of MR bitstreams. Furthermore, the predominantly-used Linear Increase Multipli- cative Decrease congestion control paradigm is mismatched to multimedia transmission. In order to address these issues, we presented two components: (i) the MD-FEC transcoding algorithm that converts MR streams optimally to non-prioritized MD streams that are able to better tolerate variations in the frequency and relative position of packet loss; (ii) the LIMD/H congestion control algorithm that is better matched to multimedia transmission. Also, we presented a simple and e$cient mechanism to coordinate the congestion control and source coding algorithms closely in order to best use the varying network resources introduced "rst in [12,19]. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 For the multicast problem we presented a novel paradigm where proposed scheme ideally bounds the performance of any multicasting scheme that uses the same source encoder and can serve as a benchmark for comparing di!erent multicasting schemes. Future work includes re"ning the channel state estimation procedure possibly through usage of appropriate tra$c models and also the development of practical protocols that approximate the multicast scheme discussed above. We also propose to delve into the issues such as the e!ect of application level bu!ering and error concealment mechanisms on the end-to-end delivered quality. References [1] A. Albanese, J. Blomer, J. Edmonds, M. Luby, M. Sudan, Priority encoded transmission, IEEE Trans. Inform. Theory 42 (6) (November 1996) 1737}1744. [2] S. Cen, C. Pu, J. Walpole, Flow and congestion control for internet streaming applications, in: Proceedings of Multimedia Computing and Networking, January 1998. [3] Y. Chawathe, S. McCanne, E. A. Brewer, RMX: Reliable multicast for heterogeneous networks, in: IEEE INFOCOM, Tel-Aviv, March 2000. [4] P.A. Chou, A.E. Mohr, A. Wang, S. Mehrotra, FEC and pseudo-ARQ for receiver-driven layered multicast of audio and video, in: Proceedings of the DCC, Snowbird, UT, March 2000. [5] S. Floyd, M. Handley, J. Padhye, J. Widmer, Equation based congestion control for unicast applications, in: Proceedings of the ACM SIGCOMM, Stockholm, Sweden, August 2000. [6] V.K. Goyal, J. Kovacevic, R. Arean, M. Vetterli, Multiple description transform coding of images, in: Proceedings of the ICIP, October, Chicago, IL, 1998. [7] D.-M. Chiu, R. Jain, Analysis of the increase and decrease algorithms for congestion avoidance in computer networks, Comput. Networks ISDN Systems 17 (1) (June 1989) 1}14. [8] W. Jiang, A. Ortega, Multiple description coding via polyphase transform and selective quantization, in: Proc. SPIE Visual Communications and Image Processing Conference (VCIP), January, San Jose, CA 1999. [9] T. Kim, S. Lu, V. Bharghavan, Improving congestion control performance through loss di!erentiation, in: Proceedings of the International Conference on Computers and Comunication Networks, October, Boston, MA, 1999. [10] B.J. Kim, Z. Xiong, W.A. Pearlman, Very low bit-rate embedded video coding with 3D set partitioning in hierarchical trees (3D SPIHT), in: IEEE Trans. Circuits Systems Video Tech., October 1997, submitted for publication. 761 [11] I. Kouvelas, V. Hardman, J. Crowcroft, Network adaptive continuous media applications through self organized transcoding, NOSSDAV, Cambridge, UK, July 1998. [12] Kang-Won Lee, R. Puri, Tae-Eun Kim, K. Ramchandran, V. Bharghavan, An integrated source coding and congestion control framework for video streaming in the internet, in: IEEE INFOCOM, Tel-Aviv, March 2000, to appear. [13] C. Leicher, Hierarchical encoding of MPEG sequences using priority encoding transmission, Tech. Report, TR94-058, ICSI, November 1994. [14] S.R. McCanne, Scalable compression and transmission of internet multicast video, Ph.D Thesis, University of California, Berkeley, Berkeley, CA, 1996. [15] S. McCanne, S. Floyd, The LBL/UCB network simulator, Lawrence Berkeley Laboratory, University of California, Berkeley (http://www-nrg.ee.lbl.gov/ns). [16] A.E. Mohr, E.A. Riskin, R.E. Ladner, Unequal loss protection: Graceful degradation of image quality over packet erasure channels through forward error correction, in: Proceedings of the DCC, Snowbird, UT, March 1999. [17] T. Nandagopal, K.W. Lee, J.R. Li, V. Bharghavan, Scalable service di!erentiation using purely end-to-end mechanisms: Features and limitations, in: International Workshop on Quality of Service, June, Pittsburgh, PA, 2000. [18] J. Postel, User datagram protocol, RFC 768, 1980. [19] R. Puri, K.W. Lee, K. Ramchandran, V. Bharghavan, Application of FEC based multiple description coding for internet video streaming and multicast, in: Proceedings of the Packet Video Workshop, May, Cagliari, Italy, 2000. [20] R. Puri, K. Ramchandran, Multiple description source coding through forward error correction codes, in: 33rd Asilomar Conference on Signals, Systems and Computers, Paci"c Grove, CA, October 1999. [21] S. Ratnasamy, S. McCanne, Scaling end-to-end multicast transports with a topologically-sensitive group formation protocol, ICNP, Toronto, October 1999. [22] R. Rejaie, M. Handley, D. Estrin, RAP: An end-to-end rate based congestion control mechanism for realtime streams in the internet, in: Proceedings of the IEEE INFOCOM, March, New York, NY 1999. [23] A. Said, W.A. Pearlman, A new fast and e$cient image codec based on set partitioning in hierarchical trees, IEEE Trans. Circuits and Systems for Video Technology, 6 (June 1996) 243}250. [24] S. Servetto, Compression and reliable transmission of digital image and video signals, Ph.D Thesis, Dept. of Comp. Sc., University of Illinois, Urbana Champaign, May 1999. [25] S. Servetto, K. Nahrstedt, Video streaming over the public internet: multiple description codes and adaptive transport protocols, in: Proceedings of the ICIP Kobe, Japan, October 1999. [26] S. Servetto, K. Ramchandran, V. Vaishampayan, K. Nahrstedt, Multiple description wavelet based image coding, IEEE Trans. Image Process., to appear. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 762 1 2 3 4 5 6 7 8 R. Puri et al. / Signal Processing: Image Communication 16 (2001) 745}762 [27] W. Stevens, TCP slow start, congestion avoidance, fast retransmit, Fast Recovery Algorithms, RFC 2001, January 1997. [28] W.-T. Tan, A. Zakhor, Real-time internet video using error resilient scalable compression and TCP-friendly transport protocol, IEEE Trans. Multimedia 1 (2) (June 1999) 172}186. [29] W.-T. Tan, A. Zakhor, Multicast transmission of scalable video using receiver driven hierarchical FEC, in: Proceedings of the Packet Video Workshop, New York, April 1999. [30] V. Vaishampayan, Design of multiple description scalar quantizers, IEEE Trans. Inform. Theory 3 (May 1993) 821}834. 9 10 11 12 13 14 15 16 17