Design Methodology for the SVENm Multimedia Engine

Florian H. Seitner³, Josef Meser¹, Gerold Schedelberger², Andreas Wasserbauer², Michael Bleyer³, Margrit Gelautz³, Markus Schutti², Ralf M. Schreier¹, Premysl Vaclavik¹, Gerald Krottendorfer¹, Günther Truhlar², Thomas Bauernfeind⁴, Philipp Beham²

Contact
Florian Seitner, seitner@ims.tuwien.ac.at
Gerald Krottendorfer, gkrottendorfer@odmsemi.com
Markus Schutti, Markus.Schutti@infineon.com
Thomas Bauernfeind, bfnd@riic.at

Interactive Media Systems Group, Institute of Software Technology, Vienna University of Technology

Display Content Controller (DCC)

Video Input
• RAW RGB data
• ITU-BT.656 / 601 YCbCr stream
• Compressed stream mode (e.g. MPEG TS)

Video Output
• MIPI DPI / DBI 2.0, RGB data
• ITU-BT.656, YCbCr data
• TVout Composite
• Double-buffering with V-sync support

Video Pre- / Postprocessing
• Windowing / Scaling / Mirroring / Rotation
• Format / Color conversion
• Overlay with alpha blending

2D / 3D Graphics

LCD Mode

Generic HW Generation based on
• Set of Camera / Display Interfaces
• Set of Imaging / Graphics Features
• Set of Performance Settings
• Type / Number of Bus Interfaces
• Bus behavior

[Figure (DCC): design flow and block diagram. Blocks: Camera Interface (BT.656, Window, Scaler, Output Formatter), Image Processing, Graphic Accelerator (Decoder, Drawing Operations), Display Interface (Pixel-Bit Conversion, Display Functionality, LCD Mode), DMA Controller (DMA Control, RX FIFO / TX FIFO), Register Set, SVENm Bus System, Camera, Display.]

CHILI Vector Processor

CHILI Design
• CHILI Core with 32-bit / 4 Slots / 8 SIMD
• High performance for signal processing and control code
• Compiler-friendly instruction set
• Fully programmable (C / Assembler)
• C compiler (LLVM, GCC) and instruction set simulator available

CHILI Processor Features
• Separate instruction and data path
• 16-bit SIMD operands
• 64 32-bit general-purpose registers
• 128-bit core memory interface
• 64 KB instruction cache

CHILI System
• 64 KB data SRAM (core memory)
• 64-channel data load and store DMA controller
• 1.92 GMAC 16-bit operations (@ 240 MHz)

[Figure (CHILI Structure): block diagram. Blocks: CHILI Core, Fetch Unit, ICACHE, Core Memory, Data Memory Subsystem (DMS IF, Local Arbitration), DMA Controller, Peripheral Port IF, Control / halt, System Bus (64-bit master ports, slave port), Main Memory (DRAM).]

Partition Assessment Tool (PAT)

PAT is a simulator which provides a fast way to
• Describe video coding algorithms and HW architectures with a high-level architecture description
• Estimate the run-time behavior of these architectures
  – Can I achieve video encoding and decoding requirements on my current hardware?
  – What HW is required to handle my specifications?
  – What is the optimal algorithm partitioning?
• Explore new system designs in a flexible way

[Figure: High-Level Architecture Definition. Labels: Algorithm, HW/SW Mapping, Abstract Architecture, Processor 1 to Processor 3, Core RAM, DRAM.]

Design Methodology
1. Define an architecture with a High-Level Architecture Description.
2. Collect profiling data, measurements and expertise on the platform components and software.
3. Estimate the system behavior with PAT, based on the information available from Step 2.
4. Design goals reached? If yes, the design is finished.
5. If not, adapt the hardware architecture and go back to Step 2.

SVENm IC

SVENm Multimedia Engine
• Video / multimedia companion
• Targets H.264 encoding / decoding at SD resolution (PAL / NTSC D1)

[Figure (SVENm IC): CHILI Core with Fetch Unit and SLOT 0 to SLOT 3.]

[Figure (SVENm Structure): block diagram. Blocks: Application Processor (ARM926 with I-Cache, D-Cache, I-TCM, D-TCM), Chili-VSP Multimedia Processor (2 x CHILI with I-Cache, Core Memory, Data Memory Subsystem, DMA Controller), Graphic Processor (DCC: Display Content Controller with Display IF, Camera IF, DMA Controller, RX / TX FIFO), mDDR Controller, Peripherals, Test / Debug / Control Interfaces. Function labels: OS Processing, Audio, Stream Mux, TS-Demux, Video Processor, Video Encode, Video Decode, OSD, 2D/3D Graphics Acceleration, LCD Controller, TV Out, Camera IN.]

SVENm Board

[Figure (SVENm Floorplan): ARM926 + 64 kByte, Chili 4-Slot + 64 kByte.]

[Figures (SVENm Board), (Evaluation Board): Digital video interfaces, Analog video interfaces, Audio interfaces, SD card slot, Serial interface.]

Conclusions

The Partition Assessment Tool (PAT) can estimate the run-time behavior and performance of a multi-core decoder system. Architectures can be evaluated before the decoder architecture is actually built. In addition, PAT allows software design explorations that support the partitioning of the decoder software. Based on these results, the requirements of the decoding and display tasks in terms of computational complexity, data bandwidth and latency can be taken into account during the development of such a decoder platform. The resulting architecture, the SVENm multimedia engine, is capable of decoding H.264 baseline at D1@30 Hz (3.2 Mb/s).
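
The five-step loop listed under Design Methodology can be illustrated with a small Python sketch. It is not PAT and does not use PAT's interface: the Architecture class, the function names, the per-macroblock cycle cost and the "add one core" adaptation rule are all assumptions made for illustration. Only the 240 MHz clock and the D1 at 30 Hz target come from the poster; the 720 x 480 frame size is the usual NTSC D1 value and is assumed here.

```python
from dataclasses import dataclass


@dataclass
class Architecture:
    """Hypothetical stand-in for a high-level architecture description (step 1)."""
    cores: int
    clock_hz: float


def macroblocks_per_second(width, height, fps):
    """Number of 16x16 macroblocks the decoder must process per second."""
    return (width // 16) * (height // 16) * fps


def estimate_utilisation(arch, cycles_per_mb, mb_rate):
    """Toy replacement for the PAT estimate (step 3).

    Returns the fraction of the available cycle budget that is used,
    assuming the workload splits perfectly across all cores.
    """
    return cycles_per_mb * mb_rate / (arch.cores * arch.clock_hz)


def design_loop(arch, cycles_per_mb, mb_rate, max_iterations=8):
    """Steps 3 to 5: estimate, check the design goal, adapt, repeat."""
    for _ in range(max_iterations):
        load = estimate_utilisation(arch, cycles_per_mb, mb_rate)
        if load <= 1.0:
            # Step 4: design goal (real-time decoding) reached.
            return arch, load
        # Step 5: adapt the hardware; adding a core is an assumed rule.
        arch = Architecture(arch.cores + 1, arch.clock_hz)
    raise RuntimeError("design goals not reached")


if __name__ == "__main__":
    mb_rate = macroblocks_per_second(720, 480, 30)  # assumed NTSC D1 frame size
    # Step 2: 15_000 cycles per macroblock is a made-up profiling figure.
    final_arch, load = design_loop(Architecture(1, 240e6), 15_000, mb_rate)
    print(final_arch, f"utilisation={load:.3f}")
```

A real PAT run would replace estimate_utilisation with a simulation that also models data bandwidth and latency, which this sketch ignores.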

Acknowledgment

This work has been supported by the Austrian Federal Ministry of Transport, Innovation, and Technology under the FIT-IT project VENDOR (Project nr. 812429).

Project partners
¹ ON DEMAND Microelectronics
² DICE: Danube Integrated Circuit Engineering
³ Vienna University of Technology, Institute of Software Technology and Interactive Systems
⁴ Johannes Kepler Universität Linz, Research Institute for Integrated Circuits