Profiling_based

Profiling-Based Hardware/Software Co-Exploration for the Design of Video Coding Architectures
Heiko Hübert and Benno Stabernack
Lu Hao

Contents
1. Background
2. MEMTRACE profiler
3. Software/Hardware Optimization
4. Conclusion

Background: profiling
• Profiling is used to understand the runtime behavior of applications.
• Efficient profiling approaches:
  • Software profiling: sampling, instrumentation. Flexible, but with high overhead.
  • Hardware profiling: performance counters. Inexpensive, but more rigid and possibly not universally available.
  • Hybrid: combinations of the above. These hold great potential, since they combine the advantages of both without the drawbacks.

An example of hardware profiling
PC: Performance Counter

Background: system analysis
Why do we need profiling?
• It is very important to adapt the system to the application in order to find an efficient solution.
• Video coding

MEMTRACE profiler
• MEMTRACE delivers cycle-accurate profiling results at the C function level.
• The results include clock cycles, various memory access statistics, and optionally an energy consumption estimate for reduced instruction set computer (RISC)-based processors.
• The focus is on memory access analysis, since for data-intensive applications this aspect has high potential for increasing system efficiency.

MEMTRACE profiling toolflow
MEMTRACE: Initialization
MEMTRACE: Performance Analysis
MEMTRACE: Post Processing
MEMTRACE backend

MEMTRACE: Profiling data acquisition
• init()
  • Initializes the profiler.
  • Creates a list of all functions and global variables.
• nextInstruction()
  • Checks whether program execution has moved from one function to another.
  • If so, the cycle count of the previous function is recalculated and the call count of the new function is incremented.
• memoryAccess()
  • Determines whether a load or a store access was performed and which bit width (8, 16, or 32 bits) was used.
• busActivity()
  • Identifies the bus status (idle cycle, core access, or DMA access) and increments the appropriate counter of the current function.
• cacheMiss()
  • Called each time a cache miss occurs.
• finish()
  • Called when the instruction set simulator (ISS) terminates the simulation.

Processor model generator
Interconnection
What can we do with the results of the MEMTRACE profiler?

Software/Hardware Optimization

System partitioning
• Computationally intensive functions are well suited for hardware acceleration in a coprocessor.
• Control-intensive functions are better suited for software implementation on ASIPs (application-specific instruction set processors).

Software Optimization
• Loop unrolling
• For computationally intensive parts, arithmetic optimizations or SIMD instructions can be applied, if such instructions are available in the processor.
• Video applications

Hardware Optimization
• Memory subsystem optimizations
  • External memory
  • Cache (cache misses)
    • The data areas with the most cache misses and the smallest size should be stored in on-chip memory.
  • SRAM
• Instruction set architecture optimizations
  • Frequently used instructions should be considered as targets for optimization during processor architecture development.

Conclusion
• Profiling and system analysis
• MEMTRACE architecture
  • Initialization
  • Performance analysis
  • Post processing
• Hardware/software optimization
  • Software
  • Hardware

And questions?
Lu Hao

References
[1] H. Hübert and B. Stabernack,
"Profiling-based hardware/software co-exploration for the design of video coding architectures," IEEE Transactions on Circuits and Systems for Video Technology, pp. 1680-1691, 2009.
[2] STMicroelectronics, Nomadik STn8820 Mobile Multimedia Application Processor, data brief, Feb. 2008. [Online]. Available: www.st.com
[3] Broadcom, BCM2820 Low Power, High Performance Application Processor, product brief, Sep. 2006. [Online]. Available: www.broadcom.com
[4] G. de Micheli and L. Benini, Networks on Chips. San Francisco, CA: Morgan Kaufmann, 2006.
[5] H. Hübert, "MEMTRACE: A memory, performance and energy profiler targeting RISC-based embedded systems for data-intensive applications," Ph.D. dissertation, Dept. Elect. Eng. Comput. Sci., Tech. Univ. Berlin, Germany, 2009. [Online]. Available: http://opus.kobv.de/tuberlin/volltexte/2009/2261
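
Sketch: profiling data acquisition callbacks
The callbacks named on the profiling data acquisition slides (init(), nextInstruction(), memoryAccess(), busActivity(), cacheMiss(), finish()) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the MEMTRACE implementation: the class names, the argument lists, the symbol table format, and the example trace are all hypothetical. In particular, this naive version increments the call count on every change of function, so a return into the caller is also counted.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FunctionStats:
    """Per-function counters (field names are illustrative assumptions)."""
    calls: int = 0
    cycles: int = 0
    cache_misses: int = 0
    accesses: dict = field(default_factory=lambda: defaultdict(int))
    bus: dict = field(default_factory=lambda: defaultdict(int))


class FunctionProfiler:
    """Toy stand-in for the callback interface listed in the slides."""

    def init(self, symbols):
        # symbols: hypothetical map of function name -> (start, end) address
        self.symbols = symbols
        self.stats = {name: FunctionStats() for name in symbols}
        self.current = None
        self.entered_at = 0

    def _lookup(self, pc):
        for name, (start, end) in self.symbols.items():
            if start <= pc < end:
                return name
        return None

    def _close(self, cycle):
        # Credit the cycles spent since entry to the function being left.
        if self.current is not None:
            self.stats[self.current].cycles += cycle - self.entered_at

    def nextInstruction(self, pc, cycle):
        name = self._lookup(pc)
        if name != self.current:  # execution moved to another function
            self._close(cycle)
            self.current, self.entered_at = name, cycle
            if name is not None:
                self.stats[name].calls += 1  # naive: returns count too

    def memoryAccess(self, is_store, bits):
        if self.current is not None:
            kind = "store" if is_store else "load"
            self.stats[self.current].accesses[(kind, bits)] += 1

    def busActivity(self, status):
        # status: "idle", "core", or "dma" (labels assumed for this sketch)
        if self.current is not None:
            self.stats[self.current].bus[status] += 1

    def cacheMiss(self):
        if self.current is not None:
            self.stats[self.current].cache_misses += 1

    def finish(self, cycle):
        self._close(cycle)
        self.current = None
        return self.stats


# Made-up trace: main runs, enters idct, then execution returns to main.
profiler = FunctionProfiler()
profiler.init({"main": (0, 100), "idct": (100, 200)})
profiler.nextInstruction(0, 0)
profiler.nextInstruction(4, 1)
profiler.nextInstruction(100, 2)
profiler.memoryAccess(False, 32)
profiler.cacheMiss()
profiler.busActivity("core")
profiler.nextInstruction(104, 5)
profiler.nextInstruction(8, 7)
stats = profiler.finish(10)
```

In a real tool the simulator would drive these callbacks once per instruction or bus cycle, and telling calls apart from returns would need extra logic that is left out here.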

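Sketch: choosing data areas for on-chip memory
The memory subsystem guideline from the Hardware Optimization slide (data areas with the most cache misses and the smallest size belong in on-chip memory) can be read as a ranking by cache misses per byte. The following is a minimal sketch of that reading, assuming a simple greedy selection under a capacity limit. The function name, the tuple layout, and all numbers are made up for illustration and are not taken from the paper.

```python
def pick_on_chip_areas(areas, capacity_bytes):
    """Greedily pick data areas by cache misses per byte.

    areas: list of (name, size_bytes, cache_misses) with size_bytes > 0.
    Returns the names chosen, in ranking order. Greedy selection is a
    heuristic and is not guaranteed to be the optimal packing.
    """
    ranked = sorted(areas, key=lambda a: a[2] / a[1], reverse=True)
    chosen, used = [], 0
    for name, size, misses in ranked:
        if misses > 0 and used + size <= capacity_bytes:
            chosen.append(name)
            used += size
    return chosen


# Hypothetical profiling result: (data area, size in bytes, cache misses)
example_areas = [
    ("frame_buffer", 65536, 9000),
    ("quant_table", 256, 1200),
    ("vlc_lut", 2048, 3000),
    ("stack", 4096, 500),
]
selected = pick_on_chip_areas(example_areas, capacity_bytes=4096)
```

With these made-up numbers the small, frequently missed tables are selected first, while the large frame buffer stays in external memory even though it has the most misses in absolute terms.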