Integrated Error Management in MoD Services

Pål Halvorsen, Thomas Plagemann, and Vera Goebel
University of Oslo, UniK - Center for Technology at Kjeller, Norway
INFOCOM 2001, Anchorage, April 2001
©2001 Pål Halvorsen

Overview
• Application Scenario
• Integrated Error Management
  – Basic idea
  – Code requirements
  – Possible solutions
  – Evaluation
• Conclusions

Application Scenario
Media-on-Demand (MoD) server: applicable in applications like News- or Video-on-Demand provided by city-wide cable or pay-per-view companies.
[Figure: multimedia storage server and network]
Retrieval is the bottleneck. Some important factors:
• Memory management
• Communication protocol processing
• Error management
Project goals: optimize performance within a single server:
• Reduce resource requirements
• Maximize number of clients

Traditional Error Management
[Figure: data path with FEC Encoder and FEC Decoder]

Integrated Error Management
[Figure: data path with FEC Encoder and FEC Decoder]

Correcting Code Requirements
• Erasure (known errors) and error (unknown errors) correcting
• Decoding throughput (recovery performance)
• Amount of redundant data
• Applicable within a single file
• Cost of the decoding function should depend on the corruption rate

Possible Schemes
• Error and erasure correcting codes
  – Bad performance (< 1.7 Mbps)
  – Not dependent on corruption rate (< 3.5 Mbps without any corruption)
• Erasure correcting codes
  – Adequate performance (> 6 Mbps)
  – Corrects only erasures
• Combined schemes
  – Erasure correcting / Error correcting
    • Capable of recovering from most errors
    • Even more redundant data
    • Performance still a problem (< 1.7 Mbps)
  – Erasure correcting / Error detection
    • Adequate performance (> 6 Mbps)
    • Capable of recovering from most errors
    • (Almost) no additional redundant data
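The last combined scheme (erasure correcting / error detection) can be illustrated with a small sketch. This is not the authors' implementation: a 16-bit ones'-complement checksum of the kind UDP uses stands in for the error detector, and a single XOR parity packet stands in for the erasure code, so only one missing packet per group can be repaired. All function names are hypothetical.

```python
from functools import reduce


def internet_checksum(data):
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def xor_parity(packets):
    """XOR equal-length packets byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*packets))


def encode(data_packets):
    """Return (payload, checksum) pairs for the data plus one parity packet."""
    group = list(data_packets) + [xor_parity(data_packets)]
    return [(payload, internet_checksum(payload)) for payload in group]


def decode(received):
    """Recover the data packets if at most one packet is lost or corrupted.

    `received` holds (payload, checksum) pairs, or None for a lost packet.
    A checksum mismatch is treated like a loss: the error becomes an erasure
    whose position is known.
    """
    payloads = []
    for entry in received:
        if entry is not None and internet_checksum(entry[0]) == entry[1]:
            payloads.append(entry[0])
        else:
            payloads.append(None)
    missing = [i for i, payload in enumerate(payloads) if payload is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can repair only one erasure")
    if missing:
        survivors = [payload for payload in payloads if payload is not None]
        payloads[missing[0]] = xor_parity(survivors)
    return payloads[:-1]  # drop the parity packet


data = [b"abcd", b"efgh", b"ijkl"]
sent = encode(data)
damaged = list(sent)
damaged[1] = (b"eXgh", sent[1][1])  # payload corrupted in transit
recovered = decode(damaged)
```

The point of the design is that the detector is cheap and the erasure decoder only runs when something is actually missing, which is why the cost tracks the corruption rate.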

Prototype Design
• Combination of the traditional UDP checksum and an erasure correcting code (Cauchy-based Reed-Solomon Erasure)
• Disk array with 8 disks, i.e., 7 information disks and 1 parity disk:
  – Write operations a la RAID level 4/5
  – Read operations a la RAID level 0
  – One codeword, contained in one or more stripes, is read in one read operation
  – Each striping unit consists of multiple symbols, each transmitted as a UDP packet

Codeword Size and Amount of Redundancy
Requirements:
• codeword_size = n × stripe_size, n ∈ Z
• Recover from one disk failure and (most) network errors
Our current approach:
– Using a codeword of 256 symbols with 32 redundant symbols
  • recovers from the failure of one out of 8 disks
  • corrects most errors in the network according to our error model, i.e., any 32 out of 256 packets

Packet (Symbol) Sizes - I
• Correcting code throughput suggests a large symbol size
[Figure: region marked "Low throughput"]

Packet (Symbol) Sizes - II
• Correcting code (startup) delay suggests a small symbol size
[Figure: regions marked "Low throughput" and "High startup delay"]

Results and Observations – I
Experimental Setup
• Assuming coding performance is the only bottleneck
• Transmitted a 225 MB file (corresponding to a 5-minute playout at 6 Mbps)
• Used a worst-case loss scenario
• Used several different machines

Results and Observations – II
Server Side
• No extra disk space needed
• No additional time needed to retrieve parity data
• No overhead managing retransmissions
• Increased buffer space and bandwidth requirement of 12.5% for storing and transmitting redundant data
• Encoding throughput limitation of ~3 Mbps (Intel, 166 MHz) to ~25 Mbps (Intel, 933 MHz) is eliminated by retrieving parity data from disk
  – The server can therefore support more concurrent users

Results and Observations – III
Client Side
• Increased buffer requirement for decoding the received data (2 MB, 3 MB, 5 MB, and 9 MB using 1 KB, 2 KB, 4 KB, and 8 KB packets, respectively)
• Additional CPU requirements for recovering lost/corrupted data
• An additional (worst-case) startup delay of between ~0.1 s (Intel, 933 MHz) and ~4.5 s (Intel, 166 MHz) for decoding the first block can be experienced
• In-time decoding gives a hiccup-free presentation

Conclusion
• The INSTANCE project aims at optimizing the data retrieval in an MoD server
• Integrated Error Management
• Future work
• More information: www.unik.no/~paalh/instance
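
The numbers quoted on the prototype and results slides can be cross-checked with a few lines of arithmetic. The sketch below only restates figures from the slides; the variable names are illustrative, and 1 MB is assumed to mean 10^6 bytes.

```python
# Share of each codeword that is redundant: 32 of 256 symbols.
codeword_symbols = 256
redundant_symbols = 32
code_redundancy = redundant_symbols / codeword_symbols

# Share of the disk array that holds parity: 1 of 8 disks.
disks = 8
parity_disks = 1
disk_redundancy = parity_disks / disks

# Size of the test file: 5 minutes of playout at 6 Mbps.
playout_seconds = 5 * 60
rate_bits_per_second = 6_000_000
file_megabytes = playout_seconds * rate_bits_per_second / 8 / 1_000_000

print(f"codeword redundancy: {code_redundancy:.1%}")
print(f"disk array redundancy: {disk_redundancy:.1%}")
print(f"test file size: {file_megabytes:.0f} MB")
```

Both redundancy shares come out at 12.5%, which is consistent with the parity disk supplying the redundant symbols of each codeword and with the 12.5% buffer and bandwidth figure on the server-side slide.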