Summary of ‘Multicast and Unicast Real-Time Video Streaming Over Wireless LANs’

Lee, Kang Eui

Packet-Erasure Model of IEEE 802.11 LAN

The IEEE 802.11 MAC/logical link control (LLC) and physical (PHY) layers correspond to the two lowest layers of the open systems interconnection (OSI) reference model, i.e., the data link and physical layers. In practice, applications do not have direct access to the physical (or even MAC) layer. Furthermore, most successful wireless networks adopt IP as the network layer, which simplifies their integration into the Internet. In this scenario, user applications see the wireless channel as an IP packet channel with erasures, much like wired Ethernet. Therefore, in designing our algorithms (which run at the application layer), we model the wireless network channel as a packet-erasure channel at the network layer. The simplest model of a packet-erasure channel assumes that the erasures are independent and identically distributed (i.i.d.) with probability Pe.

Coding for Packet-Erasure Channels

To communicate reliably over packet-erasure channels, some form of error control is necessary. Two classes of communication protocols are used in practice to communicate data over packet networks: synchronous and asynchronous. Asynchronous communication protocols, such as automatic repeat request (ARQ), are reliable but have unbounded delay. ARQ operates by dividing the data into packets and appending a special error check sequence to each packet for error detection. The receiver decides whether a transmission error occurred by recalculating the check sequence. For each intact data packet received on the forward channel, the receiver sends back an acknowledgment. Thus, ARQ requires a two-way communication channel. While this model works very well for data communication, it is not suitable for multimedia streams with hard latency constraints.
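The i.i.d. erasure model and the retransmission behavior of ARQ described above can be illustrated with a minimal sketch. The function names `arq_attempts` and `prob_more_than` are hypothetical, and the loss-free, instantaneous acknowledgment is a simplifying assumption rather than part of the summarized paper. Under these assumptions the number of transmissions per packet is geometrically distributed, so no finite number of attempts is guaranteed to suffice.

```python
import random


def arq_attempts(pe, rng):
    """Count transmissions until one packet crosses the erasure channel.

    Assumes i.i.d. erasures with probability pe < 1 and a loss-free,
    instantaneous acknowledgment (both are simplifying assumptions).
    """
    attempts = 1
    while rng.random() < pe:  # packet erased, so the sender retransmits
        attempts += 1
    return attempts


def prob_more_than(m, pe):
    """Probability that more than m transmissions are needed (all m erased)."""
    return pe ** m


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [arq_attempts(0.2, rng) for _ in range(20000)]
    # Theory for a geometric distribution: mean = 1 / (1 - pe)
    print("mean attempts:", sum(samples) / len(samples))
    print("P(more than 5 attempts):", prob_more_than(5, 0.2))
```

For any playback deadline expressed as m transmission slots, `prob_more_than(m, pe)` is small but never zero when pe > 0, which is one way to read the "unbounded delay" remark.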
The maximum delay of the ARQ mechanism is unbounded, and in multimedia applications it is usually preferable and, in the case of live streaming, necessary to interpolate late-arriving or missing information rather than insert a delay in the stream playback.

In synchronous protocols, the data are transmitted with a bounded delay but generally not in a channel-adaptive manner. To provide some measure of reliability, forward error correction (FEC) coding is employed. FEC codes are applied to a group of source data packets and protect the data against channel erasures by introducing parity packets. No feedback channel is required. If the number of erased packets does not exceed the decoding threshold of the FEC code, the original data can be recovered perfectly. However, FEC techniques cannot guarantee that the receiver receives all the packets without error.

Note that existing Internet streaming media servers and clients are based on a partially synchronous version of the ARQ protocol. These applications maintain a record of the approximate round-trip time for a packet and its acknowledgment, and use this information to determine at the server whether a packet is likely to arrive at the destination before its deadline. In this way, the unbounded delay of ARQ protocols can be avoided. However, even with this change, ARQ-based protocols still require a small overall packet loss rate and low round-trip latency to achieve an acceptably small probability of stream transmission failure.

Reed–Solomon (RS) codes

RS codes are described by two numbers (n,k), where n is the length of the codeword and k is the number of data symbols in the codeword. Each symbol is drawn from a finite field of $2^s$ elements, where s is the number of bits represented by each symbol. The codeword length n is at most $2^s - 1$ symbols. RS codes can be used to correct errors, erasures, or both.
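The FEC recoverability condition described above can be made concrete with a short numerical sketch. It assumes i.i.d. packet erasures with probability pe and an (n, k) erasure code that recovers a group whenever at most n - k of its n packets are lost, which is the property RS codes provide. The function name `fec_recovery_probability` is hypothetical.

```python
from math import comb


def fec_recovery_probability(n, k, pe):
    """Probability that a group of n packets (k data, n - k parity) decodes.

    Assumes i.i.d. erasures with probability pe and a code that tolerates
    up to n - k lost packets per group.
    """
    return sum(
        comb(n, j) * pe ** j * (1 - pe) ** (n - j)  # exactly j packets erased
        for j in range(n - k + 1)
    )


if __name__ == "__main__":
    for n in (8, 10, 12):
        print(n, fec_recovery_probability(n, 8, 0.1))
```

With k fixed, adding parity packets raises the recovery probability, but for pe > 0 it stays below 1, which matches the remark that FEC alone cannot guarantee delivery.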
Particularly efficient decoding algorithms based on Vandermonde matrices [16] exist if only erasures are to be corrected. In this case, each parity symbol can replace any one missing data symbol. This means that we can recover the original codeword, and hence the original data, if at least k of the n symbols are received.

STREAMING VIDEO OVER WIRELESS LAN: A SINGLE USER CASE

A. MDFEC

MDFEC is a transcoding mechanism that converts a prioritized multiresolution bitstream (see Fig. 1) into a nonprioritized multiple description bitstream (see Fig. 2) using efficient FEC codes.

Fig. 1. A scalable bitstream partitioned into N layers. The i-th layer can be further decomposed into i parts for channel coding.

Fig. 2. Conversion of the prioritized scalable bitstream into an unprioritized one through unequal channel codes. Each packet offers an unprioritized equivalent description of the source.

Let d be an N-dimensional distortion vector (also called the distortion profile), where $d_k$ reflects the distortion attained when k out of N packets are received. The progressive bitstream is marked at N different positions (that form N resolution layers), which correspond to achieving the distortion levels $d_k$, as shown in Fig. 1. The i-th resolution layer is split into i equal parts and an (N, i) RS code is applied to it to form the packets, as shown in Fig. 2. Since every packet contains information from all the resolution layers, the packets are of equal priority. The RS code ensures that the i-th resolution layer can be decoded on reception of at least i packets. Since the distortion-rate function D(r) for a source is a one-to-one function of the rate r, finding the N-dimensional distortion vector d corresponds to finding the rate partition $R = (R_1, R_2, R_3, \ldots, R_N)$ of the multiresolution bitstream. We note that, in this framework, the decoded quality at the receiver is strictly a function of how many packets are received and not of which packets are received.

B.
Hybrid ARQ

The MDFEC method is an attractive solution, but it requires progressive video input. Here we propose a way to combine the ARQ and FEC error control methods to improve the performance of unicast communication of single-resolution video over packet-erasure channels. Hybrid ARQ schemes have been extensively studied in the literature for various communication channels. The acknowledgment has to share the same channel with the data, and consequently too many ACKs can have a significant effect on the throughput. The proposed Hybrid ARQ scheme attempts to address this issue by reducing the number of ACKs. Our scheme is inspired by a similar idea proposed in [17] for rate-compatible punctured convolutional (RCPC) codes. The idea is illustrated in Fig. 3 (bottom).

Fig. 3. FEC, ARQ, and Hybrid ARQ coding schemes.

We start by splitting our multimedia data into “packet groups” consisting of k packets each, and then, for each packet group, appending (n-k) RS parity packets to the group as in the FEC coding scheme described above. However, unlike in the pure FEC scheme, we initially send only the first k data packets to the receiver. The transmitter then starts sending parity packets until one of the following two events occurs: either an acknowledgment from the receiver arrives, or the deadline for the transmission is reached. Once at least k packets are received intact, the receiver sends an acknowledgment. Once the acknowledgment is received, the transmitter continues with the next k data packets. One significant advantage of this algorithm is that it does not break down even when acknowledgments are lost. Instead, the transmitter simply assumes that more parity is needed.

The Hybrid ARQ scheme is a general algorithm and can be adjusted to fit specific cases as appropriate.
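The sender behavior described above can be sketched as a small simulation of one packet group. This is an illustration under stated assumptions, not the implementation from the summarized paper. Erasures are i.i.d. with probability pe, the deadline is modeled as at most n packets per group, the acknowledgment arrives instantly, and the receiver repeats its acknowledgment after every further intact packet once it holds k. The name `harq_send_group` and the `ack_loss` parameter are hypothetical.

```python
import random


def harq_send_group(k, n, pe, rng, ack_loss=0.0):
    """Simulate Hybrid ARQ for one group of k data and n - k parity packets.

    Returns (packets_sent, decoded). Packets 1..k are data and the rest are
    parity; any k intact packets are assumed sufficient to decode (RS property).
    """
    received = 0
    for sent in range(1, n + 1):
        if rng.random() >= pe:  # this packet arrived intact
            received += 1
            # The receiver acknowledges once it holds k packets; the ACK
            # itself may be lost, in which case the sender keeps sending parity.
            if received >= k and rng.random() >= ack_loss:
                return sent, True
    # Deadline reached (all n packets sent) without an ACK at the sender.
    return n, received >= k


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [harq_send_group(8, 12, 0.1, rng) for _ in range(10000)]
    print("average packets sent:", sum(s for s, _ in runs) / len(runs))
    print("decoded fraction:", sum(ok for _, ok in runs) / len(runs))
```

On a clean channel the sender stops after exactly k packets, so parity costs nothing when it is not needed. When every acknowledgment is lost, the sender simply exhausts the parity and the receiver still decodes, which is the robustness property noted above.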
For instance, although interleaving is not described above, it is used in practice to improve bandwidth utilization while an ACK travels from receiver to sender (i.e., after all the data packets of the current group have been sent, data/parity packets from other groups are interleaved with the parity packets of the current group).

C. Hybrid ARQ Throughput Analysis

To compare the throughput of the proposed Hybrid ARQ scheme with the conventional ARQ and FEC schemes described above, we assume a memoryless packet-erasure channel with only a single user, with $P_e$ being the probability of packet erasure. The throughput is defined as the ratio of the time (in the channel) occupied by data packets to the average time it actually takes to send them. Additionally, we only count those k data packets that have transmission time less than some desired time n. Let $I_{en}$ denote a random variable that represents the number of packet erasures in a group of n packets. Assuming an (n,k) RS channel code, the throughput of the FEC coding scheme is equal to

$$T_{FEC} = \frac{k}{n}\Pr(I_{en} \le n-k) = \frac{k}{n}\sum_{j=0}^{n-k}\frac{n!}{j!\,(n-j)!}\,P_e^{\,j}(1-P_e)^{n-j}$$

To estimate the throughput of an ARQ scheme used in the 802.11 link layer, we assume that $l_d$ is the transmission time (in the channel) of a data packet and $l_a$ is the transmission time of an acknowledgment, and we assume that no erasures occur on the return channel. Let E denote a random variable representing the total number of packets sent in a successful transmission of k data packets. The throughput of the ARQ scheme is equal to

$$T_{ARQ} = \sum_{e=k}^{n}\Pr(E=e)\,\frac{k\,l_d}{e\,l_d + e\,l_a}, \quad \text{where} \quad \Pr(E=e) = \frac{(e-1)!}{(e-k)!\,(k-1)!}\,(1-P_e)^{k}P_e^{\,e-k}$$

Assuming, as before, no erasures in the return channel and independent erasures in the forward channel, the throughput of the described HARQ system can be found as

$$T_{HARQ} = \sum_{e=k}^{n}\Pr(E=e)\,\frac{k\,l_d}{e\,l_d + l_a}$$

The throughput for different coding schemes is shown in Fig. 4.

Fig. 4.
Throughput for FEC, ARQ, and HARQ schemes

STREAMING VIDEO OVER WIRELESS LAN: A MULTIPLE USER CASE

It should be noted that ARQ-based schemes are less appropriate for the multicast case for two reasons: ACK explosions and the requirement to retransmit different packets to all users. For significant packet loss rates, each user will require frequent packet replacement, and different users are most likely to require different packets. To respond to requests by multiple users, we may have to resend a significant fraction of the original data even for small loss rates.

However, for small multicast networks, the hybrid ARQ scheme can alleviate the problem of sending different correction packets to each user. Because each parity packet can replace any missing data packet, there is no need for each user to identify which packet is missing. Instead, each user can simply transmit an acknowledgment when it receives enough parity to decode the transmitted data. When acknowledgment packets have been received from all known multicast users, the transmitter can move on to the next packet group. In this work, however, our approach is to use progressive video coding as described below.

A. Problem Formulation for Wireless Video Multicast

When there is only a single client, it is clear that any coding scheme should seek to maximize the received user quality given a total rate constraint and a transmission profile. Since the scheme that maximizes the received quality for one client may not be the optimal one for other clients (since different clients will have different channel profiles and rate constraints), in the multicast scenario, it is desirable to maximize some composite delivered quality criterion, given the total rate constraint and the transmission profile.
It is difficult to arrive at an overall quality criterion for the multi-user case, and any number of schemes, such as a weighted averaging scheme or simply designing for the worst-case receiver, could suit a particular application scenario. Of course, the weighted averaging scheme can be easily mapped to the single user case and can be directly solved by the MDFEC algorithm. In this paper, we instead focus on a maximal regret criterion. We propose that the optimal coding scheme is the one that minimizes the following criterion:

$$\Delta(R) = \max_i \left( E[d_i(R)] - E[d_i]_{\min} \right)$$

where R is the rate partition, $E[d_i]_{\min}$ is the minimum expected distortion for the i-th client, achieved by using the optimal coding scheme when it is the only client, and $E[d_i(R)]$ is the expected distortion for the particular coding scheme being used. Such an overall quality criterion is fair in the sense that it minimizes the maximum penalty that any client suffers.

B. Proposed Solution

We assume that all the clients have the same total rate constraint $R_{tot}$. For any rate partition R, the total rate $R_t$ used is

$$R_t = \frac{R_1}{1}N + \frac{R_2 - R_1}{2}N + \frac{R_3 - R_2}{3}N + \cdots + \frac{R_N - R_{N-1}}{N}N = \sum_{j=1}^{N}\alpha_j R_j,$$

where $\alpha_j = \frac{N}{j(j+1)}$ for $j = 1, 2, \ldots, N-1$ and $\alpha_N = 1$.

Now for each client, we find the minimum expected distortion $E[d_i]_{\min}$. This optimization problem is formulated as minimizing $E[d_i(R)]$ over the rate partition R subject to

$$R_t \le R_{tot} \qquad (1) \text{ (resource constraint)}$$

$$R_1 \le R_2 \le R_3 \le \cdots \le R_N \qquad (2) \text{ (embedding constraint)}$$

where

$$E[d_i(R)] = q_0^i E + \sum_{j=1}^{N} q_j^i D(R_j)$$

and where $q_j^i = q_j^i(N)$ is the probability of the i-th client receiving j out of N packets, D(r) is the source rate-distortion function, E is the source variance, and R is the rate partition.
We state the problem as follows:

$$\min_R E[d_i(R)] \quad \text{subject to (1) and (2)}$$

or generally,

$$\min_R \Delta(R) \quad \text{subject to (1) and (2)}$$

Assuming that the rate-distortion curve is convex (not always true in practice for operational rate-distortion curves), the expected distortion is also convex in the rate partition, as a weighted sum of convex functions with positive weights is again convex. Hence, $\Delta_i(R) = E[d_i(R)] - E[d_i]_{\min}$ is also convex in the rate partition. Consequently, $\Delta(R)$ is convex in the rate partition, as the maximum (supremum) of any set of convex functions is convex. Further, the constraints on R, i.e., (1) and (2), are also convex constraints, and hence the problem of finding the minimax regret is one of convex optimization.

• Two client case: since $\Delta_i(R)$ is convex in R and the minimum value of $\Delta_i(R)$ is 0, $\Delta_1(R) = \Delta_2(R)$ for the optimal rate partition. So, for the particular case of two clients, the algorithm finds the rate partition at which both users are equally satisfied or equally disappointed. In this sense, the algorithm is fair to both users.

• For the rate partition that optimizes the overall quality criterion for more than 2 users, we will again have $\Delta_l(R) = \Delta_j(R)$, where l and j are the two users for whom the $\Delta$ versus rate partition curves (surfaces) intersect at the highest point relative to the intersection points of any pair of curves, and this will be the maximum disappointment or least satisfaction for any client. Thus, the problem with more than two users can be solved by analyzing the users pairwise.
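The minimax regret formulation above can be explored numerically with a coarse grid search. The sketch below is an illustration only and does not reproduce the optimization procedure of the summarized paper. It assumes i.i.d. packet erasures per client (so the receive probabilities are binomial), a Gaussian-style distortion-rate function D(r) = variance · 2^(-2r), and rates restricted to multiples of `step`. All function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb


def alphas(N):
    """Coefficients alpha_j = N / (j (j + 1)) for j < N, and alpha_N = 1."""
    return [N / (j * (j + 1)) for j in range(1, N)] + [1.0]


def total_rate(R):
    """Total rate R_t = sum_j alpha_j R_j used by the rate partition R."""
    return sum(a * r for a, r in zip(alphas(len(R)), R))


def receive_probs(N, pe):
    """q[j] = P(exactly j of N packets arrive), assuming i.i.d. erasures."""
    return [comb(N, j) * (1 - pe) ** j * pe ** (N - j) for j in range(N + 1)]


def expected_distortion(R, q, variance=1.0):
    """E[d(R)] = q_0 * variance + sum_j q_j D(R_j), with an assumed D(r)."""
    def D(r):
        return variance * 2.0 ** (-2.0 * r)  # assumed distortion-rate model
    return q[0] * variance + sum(q[j] * D(R[j - 1]) for j in range(1, len(R) + 1))


def feasible_partitions(N, r_tot, step):
    """Grid of partitions satisfying constraints (1) and (2)."""
    levels = [i * step for i in range(int(round(r_tot / step)) + 1)]
    # combinations_with_replacement over sorted levels yields R_1 <= ... <= R_N
    for R in combinations_with_replacement(levels, N):
        if total_rate(R) <= r_tot + 1e-9:
            yield R


def minimax_regret(N, r_tot, loss_rates, step=0.25):
    """Return (R, Delta(R), per-client minima) minimizing the maximum regret."""
    parts = list(feasible_partitions(N, r_tot, step))
    qs = [receive_probs(N, pe) for pe in loss_rates]
    best_alone = [min(expected_distortion(R, q) for R in parts) for q in qs]

    def regret(R):
        return max(expected_distortion(R, q) - b for q, b in zip(qs, best_alone))

    R_star = min(parts, key=regret)
    return R_star, regret(R_star), best_alone


if __name__ == "__main__":
    R, delta, best = minimax_regret(N=3, r_tot=4.0, loss_rates=[0.05, 0.4])
    print("rate partition:", R)
    print("maximum regret:", delta)
    print("single-client minima:", best)
```

With a single client the regret of the selected partition is zero by construction. With several clients the grid search returns the feasible partition whose worst-case penalty is smallest. Because the grid is coarse, the two regrets need not be exactly equal at the returned point, as they would be at the continuous optimum in the two-client case.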