Heterogeneous Multi-processor

A. Classic Design Flow

Software design uses programming models to abstract the hardware architecture.

Figure 1 Classic design flow

To produce efficient code, the software needs to be adapted to the target architecture by using specific libraries, such as system libraries for the different hardware components or specific memory mappings for the different CPU and memory architectures. This scheme introduces discontinuities in the software design, i.e. the software compiler ignores the processor architecture (e.g. interrupts or specific I/Os).

Figure 2 Execution

The combination of the platform with the software code produces an executable model that emulates the execution of the final system, including the hardware and software architecture. This executable model allows simulation of the software with detailed hardware–software interaction, software debugging, and eventually performance measurement.

B. Ideal Design Flow

The ideal scheme would produce efficient software code starting from a high-level program that uses generic communication primitives, such as send/recv. In an ideal design flow, software generation targeting a specific architecture consists of partitioning and mapping, final application software code generation, and hardware-dependent software (HdS) code generation.

Figure 3 Ideal design flow

The HdS is made of the lower software layers, which may incorporate an operating system (OS), communication management, and a hardware abstraction layer that allows the OS functions to access the hardware resources of the platform. Unfortunately, we are still missing such an ideal generic flow, able to efficiently map high-level programs onto heterogeneous MPSoC architectures. In addition, the validation and debugging of HdS remains the main bottleneck in MPSoC design, because each processor subsystem requires a specific HdS implementation to be efficient.

1. Software oriented
The software-oriented approaches use a software model, in the form of a run-time library, to model the interaction with the hardware. The application can be written in a high-level language. The software stack construction consists of compiling this code and linking the results with the run-time libraries. The library is defined separately for each processor and can be very sophisticated.

2. Hardware oriented
The hardware-oriented approach executes the final software on a virtual platform. It corresponds to classic hardware–software co-simulation models using instruction-set simulators (ISS). These techniques require that all the software and hardware are fully specified.

3. Electronic system level (ESL) design-oriented
The ESL-oriented approaches use high-level application programming interfaces (APIs) to abstract the hardware–software interfaces. This approach enables the automatic generation of a virtual prototype from a system-level model, but the HdS software layer is generated in one step, which generally implies the use of predefined communication schemes.

Summary of Heterogeneous Multi-Processing

A heterogeneous system provides diverse instruction set architectures, which allow entire computing environments (operating systems and application bases) to be incorporated into an MP system. The best that homogeneous systems can offer is to extend the power of a single computing environment, which does not extend the fundamental instruction set (and hence the software application base) of the system. It is commonly accepted that software is currently the gating factor in computer system evolution: software technology is lagging far behind hardware technology. To address this issue, the heterogeneous multiprocessing system encourages hardware to be ported in addition to software.
This hardware porting creates exciting new avenues for computer system expansion.

An object-oriented approach was used in defining the HMW. The information-hiding nature of an object-based approach prevents the underlying code and data differences of heterogeneous processing elements from interfering with system-level design, so the object-oriented approach unifies the heterogeneous system. Processes are heterogeneous and capable of running under different operating systems, with dynamic mobility supported. Heterogeneous, mobile processes enhance flexibility and performance. A set of system primitives was defined to assist in the management of objects and processes. The primitives are presented as model constructs for language extension.

Consider multiprocessor platforms in which only programmable processors are used as processing components and in which they communicate data only through distributed memory units.

A "programmer friendly" development environment is sure to produce better software faster than a hostile system which gives up debugging information begrudgingly. The complexity of multiprocessor design, coupled with the added problems of heterogeneous systems such as data format differences, demands development tools integrated into the computer at the system level.

Heterogeneous MP Scheduling

In simple MP platforms the processors are self-scheduled, which means that there is no need for a global scheduler component in the platform. The processors can be connected by a crossbar switch (CBS), a P2P network, or a shared bus (ShB). Heterogeneous multi-cores are typically composed of small and big cores and use a global scheduler. The global scheduler parses each service request and, accordingly, interacts with the rest of the platform through the Cloud Database (DB). Placing different core types on a single die has the potential to improve energy efficiency without sacrificing significant performance.
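As a rough illustration of the global scheduling idea described above, the following Python sketch assigns each task to a big or a small core with a simple greedy rule. It is only a sketch under stated assumptions: the `Core` and `Task` types, the `latency_sensitive` hint, and the relative `speed` and `power` figures are hypothetical names and values chosen for the example, not part of any platform discussed here.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    kind: str                # "big" or "small"
    speed: float             # assumed relative throughput (work units per time unit)
    power: float             # assumed relative power draw while busy
    busy_until: float = 0.0  # time at which the core becomes free


@dataclass
class Task:
    name: str
    work: float              # abstract work units
    latency_sensitive: bool  # hypothetical hint supplied with the request


def schedule(tasks, cores):
    """Greedy global scheduler (illustrative policy only).

    Latency-sensitive tasks prefer big cores, all other tasks prefer
    small cores. Within the preferred type, the core that becomes free
    first is chosen. If no core of the preferred type exists, any core
    may be used.
    """
    plan = []
    for task in tasks:
        preferred = "big" if task.latency_sensitive else "small"
        candidates = [c for c in cores if c.kind == preferred] or cores
        core = min(candidates, key=lambda c: c.busy_until)
        start = core.busy_until
        runtime = task.work / core.speed
        core.busy_until = start + runtime
        energy = runtime * core.power
        plan.append((task.name, core.name, start, core.busy_until, energy))
    return plan


if __name__ == "__main__":
    demo_cores = [Core("big0", "big", 2.0, 4.0), Core("small0", "small", 1.0, 1.0)]
    demo_tasks = [Task("ui", 4.0, True), Task("sync", 3.0, False)]
    for entry in schedule(demo_tasks, demo_cores):
        print(entry)
```

Each entry in the returned plan records the task, the chosen core, the start and end times, and an energy estimate (runtime multiplied by the assumed power figure). A real policy would need measured workload characteristics rather than a single boolean hint.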
However, the success of heterogeneous multi-cores depends directly on how well the scheduling policy maps workloads to the best core type (big or small). Incorrect scheduling decisions can unnecessarily degrade performance and waste energy and power.

Advantages and Disadvantages of Heterogeneous Multi-Processors

Advantage                                     Disadvantage
Increased reliability                         Increased software complexity
Increased survivability                       Difficult system test and failure diagnosis
Increased processing power                    Unique expertise for design/development required
Increased responsiveness                      Bus arbitration required
Higher degree of modularity
System expandability in smaller increments

Table 1

Today’s Use

1. XBOX 360

Figure 4 Xbox

Specifications
• 3 CPU cores
  – 4-way SIMD vector units
  – 8-way 1MB L2 cache (3.2 GHz)
  – 2-way SMT
  – In-order, 2 instructions/cycle
• ATI GPU with embedded EDRAM
• 3D graphics units
• 512-Mbyte DRAM main memory
• Big performance increase over last generation
• Support for high-definition video
• Extremely high pixel fill rate (goal: 100+ million pixels/s)
• Flexible to suit dynamic range of games
• Balanced hardware, homogeneous resources
• Programmability (easy to program)

2. Cell Processor

Each Cell chip has:
• One PowerPC core
• 8 compute cores
• On-chip memory controller
• On-chip I/O
• On-chip network to connect them all

Figure 5 Cell processor

Heterogeneous computing platforms can be found in every domain of computing, from high-end servers and high-performance computing machines all the way down to low-power embedded devices, including mobile phones and tablets.

Multiprocessor Objectives
• Multiprocessor systems
• Why build multiprocessor systems
• Problems
• Solutions
• Heterogeneity