A DSP-Based Platform for Wireless Video Compression
Patrick Murphy, Vinay Bharadwaj, Erik Welsh & J. Patrick Frantz
Rice University
November 18, 2002

Motivations & Requirements
• Wireless video communication
    • Demand expected to explode
    • Much more challenging than voice
• Standards not yet established
    • Need modular hardware & software
• Testbed should approximate real system
    • Similar hardware
    • Portable code

Testbed Description
• Source coding on Texas Instruments DSP
    • Similar to processors in most wireless devices
    • Mature software development environment
• Bluetooth wireless interface
    • Datarate comparable to future mobile systems
• Decoding performed on separate PC

Testbed Hardware
• Custom hardware
    • Inspired by TI Imaging Developers Kit
    • Support for lower resolution imaging
    • Integrates wireless interface & host processor
• Daughtercard to TI DSP Starter Kits
    • Adheres to DSK interface specification
    • Backward and forward compatible with DSKs

Testbed Hardware
[Figure: Custom daughtercard on C6416 DSP board]

Hardware Details: Processor
• Axis Etrax 100LX
    • 32-bit 100 MIPS RISC processor
• Runs full Linux operating system
    • Uses 2.4.x kernels and glibc
• Open source Bluetooth stack
    • Originally developed by Axis
    • Now available under GPL
• Multichip Module
    • Integrated flash, SDRAM and Ethernet interface

Hardware Details: Camera
• Omnivision CMOS imager
    • 16-bit digital interface
    • Supports YUV 4:2:2 and raw RGB
• On-board image preprocessor
    • Color space conversion
    • White balance
• Configuration by I2C
    • Easily controlled by Etrax processor

Hardware Details: Bluetooth
• Support for variety of Bluetooth hardware
    • Up to 1 Mbps serial connection
    • Standard UART interface
    • SiliconWave and Ericsson modules tested
• Flexible application-level interface
    • Simple (virtual) serial port
    • TCP/IP via PPP
    • Custom protocols possible

Hardware Details: DSPs
• Daughtercard adheres to TI spec
    • Compatible with C62x, C67x & C64x boards
    • Possible future use with C5x systems
• Minimal resource requirements
    • Uses just one external interrupt
    • All transfers via external memory interface
    • Allows additional daughtercards to be used

Software Tools
• DSP software
    • Development in TI Code Composer
    • Compatible with CC 1.2x and 2.x
    • Only requires DSK tools
    • Uses dsplib and imglib
    • No third-party libraries or extensions required
• Etrax software
    • Standard Linux development in C/C++
    • GNU tools (gcc, gdb, etc.)

Image Capture Dataflow
• Each pixel is read from the camera
• Daughtercard buffers one line
• EDMA interrupted for each line
• Buffer emptied by EDMA burst read
• EDMA buffers full frame
• DSP interrupted for full frame
• Frame used only if DSP is ready

Transmission Dataflow
• DSP writes coded data to memory
• EDMA constantly writes to EMIF
• Etrax on daughtercard reads data
• Data packaged and transmitted

Testbed Software
• Generic framework for dataflow
    • EDMA setup optimized for input & output
• Two coding schemes implemented
    • MPEG-4
        • Advanced video compression
        • Designed for low-datarate applications
    • JPEG2000
        • Advanced still image compression
        • Extremely good compression vs. quality

MPEG-4 Encoder
• Structure based on open source x86 code
    • Heavily optimized for C6x processors
    • Extensive use of TI assembly routines
• Implements MPEG-4 Simple Profile
    • Operates on 8 x 8 macroblocks
    • Only one previous frame required
• Minimizes internal data memory requirements
• Core uses 57 KB of program memory

MPEG-4 Encoder Performance
• Depends heavily on cache configuration

Encoder Performance on C6711
L2 Cache Size         0KB    16KB    32KB    48KB    64KB
SRAM Size             64KB   48KB    32KB    16KB    0KB
Cache Associativity   -      1-way   2-way   3-way   4-way
Frames/second         11.9   14.5    17.6    18.5    19.2

JPEG2000 Image Compression
• New version of popular JPEG standard
    • Continuous tone still image compression
• Wavelet based algorithm
    • More computationally intensive than JPEG’s DCT
    • Much better compression ratios
• Useful features
    • Lossless and lossy options
    • Random access - no inter-frame dependence
    • Good control over quality vs. compression ratio

JPEG2000 Encoder
• Based on JasPer codec
    • Open source reference implementation
    • Project of Image Power & U of British Columbia
• Very modular and abstracted design
    • Easy addition of features and formats
    • Great for PC execution
    • Bad for embedded DSP execution
• DSP optimizations
    • Reduce function calls in core code
    • Eliminate some extraneous format flexibilities

JPEG2000 Encoder Results
• Operation on 32 x 32 pixel tiles
    • Constrained by DSP’s internal data memory
• Total code size around 300KB
    • More internal program memory boosts performance

Encoder Performance (frames/sec)
          CIF     QCIF
C6711     0.17    0.46
C6416     6.2     16.5

Future Work: Hardware
• Interface to new C55x DSP systems
    • Dependent on availability from TI
• Further integrate components
    • Design single board system
    • Migrate wireless processing to DSP
        • TI’s OMAP platform is good candidate
• Support full-duplex operation
    • Add audio support
    • Integrate video display

Future Work: Software
• Implement MPEG-4 decoder
    • Relatively easy compared to encoder
• Further optimize JPEG2000 encoder
    • Reduce number of data structures
    • Minimize function calls
    • Migrate core to optimized assembly routines
    • Increase tile size to reduce DMA transfers
• Investigate better error protection
    • Supplement or replace Bluetooth error control

Conclusions
• Modular & standards-based testbed
• Realistic platform
    • Resembles capabilities of future mobile devices
• It works!
    • Hardware manufactured and tested
    • Software framework validated
    • Two compression standards implemented
• Ready for future research

Questions
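
Backup: Image Capture Dataflow Sketch
The capture path described in the deck (one line buffered at a time, a full frame assembled, and the frame used only if the DSP is ready) can be modeled in portable C roughly as follows. This is a minimal host-side sketch, not the testbed's code: the QCIF YUV 4:2:2 geometry, the `capture_state` structure, and all function names are assumptions made for illustration, and a `memcpy` stands in for the EDMA burst read. It also ignores that a real system would have to keep the frame being encoded from being overwritten.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed geometry: QCIF (176 x 144) in YUV 4:2:2, i.e. 2 bytes per pixel. */
enum { LINE_BYTES = 176 * 2, LINES_PER_FRAME = 144 };

/* Hypothetical capture state; none of these names come from the testbed. */
typedef struct {
    unsigned char frame[LINES_PER_FRAME][LINE_BYTES]; /* frame assembled line by line */
    unsigned next_line;       /* index of the next line to be filled */
    bool encoder_busy;        /* true while the encoder is working on a frame */
    unsigned frames_accepted; /* frames handed to the encoder */
    unsigned frames_dropped;  /* frames skipped because the encoder was busy */
} capture_state;

/* Models the per-line event: the daughtercard's one-line buffer is drained
 * into the frame buffer (a memcpy stands in for the EDMA burst read).
 * Returns true when this line completed a full frame. */
bool capture_on_line(capture_state *s, const unsigned char *line)
{
    memcpy(s->frame[s->next_line], line, LINE_BYTES);
    s->next_line++;
    if (s->next_line < LINES_PER_FRAME)
        return false;
    s->next_line = 0; /* wrap around for the next frame */
    return true;
}

/* Models the full-frame interrupt: the frame is used only if the encoder
 * is idle; otherwise it is counted as dropped. Returns true if accepted. */
bool capture_on_frame(capture_state *s)
{
    if (s->encoder_busy) {
        s->frames_dropped++;
        return false;
    }
    s->encoder_busy = true;
    s->frames_accepted++;
    return true;
}

/* Called when the encoder finishes a frame and can accept another. */
void capture_encoder_done(capture_state *s)
{
    s->encoder_busy = false;
}
```

A caller would feed `capture_on_line` once per line event, call `capture_on_frame` whenever it returns true, and call `capture_encoder_done` when encoding finishes. Dropping frames while the encoder is busy lets the camera run at its own rate without stalling on a slow encode.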
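
Backup: Link Budget Arithmetic
To put the Bluetooth link and the reported frame rates in perspective, the raw video bitrate and the compression ratio needed to fit the link can be estimated with the small helper below. It is a sketch under stated assumptions: the function names are hypothetical, 16 bits per pixel corresponds to the camera's YUV 4:2:2 output, QCIF is taken as 176 x 144, and the 1 Mbps serial rate is treated as an upper bound on usable throughput.

```c
/* Raw (uncompressed) video bitrate in bits per second for frames of
 * width x height pixels at bits_per_pixel, sent at frames_per_second. */
double raw_bitrate_bps(unsigned width, unsigned height,
                       unsigned bits_per_pixel, double frames_per_second)
{
    return (double)width * height * bits_per_pixel * frames_per_second;
}

/* Compression ratio needed to fit a raw stream of raw_bps into a link
 * that carries link_bps (both in bits per second). */
double required_compression_ratio(double raw_bps, double link_bps)
{
    return raw_bps / link_bps;
}
```

For example, QCIF at the 16.5 frames/sec reported for the C6416 gives `raw_bitrate_bps(176, 144, 16, 16.5)`, about 6.7 Mbps of raw video, so `required_compression_ratio` against a 1 Mbps link comes out at roughly 6.7:1. Protocol overhead and error control would push the real requirement higher.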