Energy-Efficient System Virtualization for Mobile and Embedded Systems
Final Review
2014/01/21

Outline
Project overview
big-LITTLE core architecture
Models
Resource-Guided scheduling
◦ Experimental results
Conclusion

What Has Been Done
First half of the year
◦ Energy-efficient task scheduling for per-core DVFS architecture
  Offline energy-efficient task scheduling
  Online energy-efficient task scheduling
Second half of the year
◦ Energy-efficient task scheduling for big-LITTLE core architecture

Goal of Our big-LITTLE Aware Scheduling
Derive an energy-efficient scheduler for the big-LITTLE core architecture that
◦ Satisfies the resource requirement of each task.
◦ Minimizes the average power consumption.

Big-LITTLE Core Architecture
Developed by ARM in 2011.
Combines two kinds of architecturally compatible processors with different power and performance characteristics.
Three different types
◦ 1st: Cluster migration
◦ 2nd: CPU migration / In-Kernel Switcher
◦ 3rd: Heterogeneous Multi-Processing (HMP)

Type 1: Cluster Migration
Only one cluster is active at a time: either the big cores or the LITTLE cores are used, never both simultaneously.

Type 2: CPU Migration
Logical CPU: a pair consisting of one big core and one LITTLE core.
Only one of the two cores in a pair is powered up and processing tasks at a time.

Type 3: HMP
All the big and LITTLE cores can be used at the same time.

Building the Power Model
Measure the average power consumption of the big and LITTLE cores at different core frequencies and under different CPU loads.
Platform: ODROID-XU
◦ 1st type, cluster migration.
◦ Cortex™-A15 and Cortex™-A7.
◦ Per-cluster DVFS.
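
A minimal sketch of how such measurements might be used, assuming (as the deck's power model does) that a core's power is its full-load power at the current frequency scaled by its load. The table name `FULL_LOAD_POWER_W`, the function names, and every number below are hypothetical placeholders, not measurements from the ODROID-XU.

```python
# Hypothetical full-load power table in watts:
# (cluster, frequency in MHz) -> average power at 100% load.
# These values are placeholders, NOT measured data.
FULL_LOAD_POWER_W = {
    ("big", 800): 0.60,
    ("big", 1600): 1.80,
    ("LITTLE", 500): 0.10,
    ("LITTLE", 1200): 0.35,
}


def core_power(cluster, freq_mhz, load):
    """Estimate one core's average power: full-load power scaled by load in [0, 1]."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0 and 1")
    return FULL_LOAD_POWER_W[(cluster, freq_mhz)] * load


def interval_power(big_freq, big_loads, little_freq, little_loads):
    """Sum the per-core estimates over both clusters.

    With per-cluster DVFS, every core in a cluster shares one frequency.
    An inactive cluster is represented by an empty list of loads.
    """
    big = sum(core_power("big", big_freq, load) for load in big_loads)
    little = sum(core_power("LITTLE", little_freq, load) for load in little_loads)
    return big + little
```

On a cluster-migration platform only one cluster runs at a time, so one of the two load lists would normally be empty, for example `interval_power(800, [], 500, [0.4, 0.2])`.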
Average Power Consumption

Power Model
The power consumption $P_t$ of an interval $t$ is:
\[ P_t = \sum_{i=1}^{n_b} P^b_{i,t} + \sum_{j=1}^{n_L} P^L_{j,t} \]
\[ P^b_{n,t} = E^b_f \times \mathit{loading}_{n,t}, \qquad P^L_{n,t} = E^L_f \times \mathit{loading}_{n,t} \]
$n_b$ and $n_L$: the numbers of big and LITTLE cores.
$P^b_{i,t}$ and $P^L_{j,t}$: the power consumption of big core $i$ and LITTLE core $j$ during interval $t$.
$E^b_f$ and $E^L_f$: the power consumption of a big and a LITTLE core at frequency $f$ under 100% load.
$\mathit{loading}_{n,t}$: the load of the $n$-th core in interval $t$.

Task Model
For every $\mathit{Task}_i$ in a scheduling interval, we define:
\[ \mathit{resource}_i = \mathit{CoreFreq}_i \times \mathit{loading}_i \]
◦ $\mathit{loading}_i$: the percentage of time that $\mathit{Task}_i$ is running on a core in a period of time.
◦ $\mathit{CoreFreq}_i$: the current frequency of the core cluster this task is running on.

Task Model (Cont.)
We also define the minimum resource required for $\mathit{Task}_i$ as:
\[ \mathit{res\_req}_i = \mathit{QoSFreq}_i \times \mathit{QoSLoading}_i \]
◦ $\mathit{QoSFreq}_i$: the minimum core frequency that satisfies the QoS requirement of $\mathit{Task}_i$.
◦ $\mathit{QoSLoading}_i$: the CPU load of $\mathit{Task}_i$ when the core frequency is $\mathit{QoSFreq}_i$.

Objective
Find a scheduling plan for an interval $t$ according to the task loadings, such that $P_t$ is minimized while the task resource requirements are satisfied.
◦ Recall that
\[ P_t = \sum_{i=1}^{n_b} P^b_{i,t} + \sum_{j=1}^{n_L} P^L_{j,t} \]

Resource-Guided Scheduling
Consists of three phases:
◦ TaskInfo phase
◦ LittleCore phase
◦ bigCore phase
Runs every scheduling interval and makes decisions for the next interval.

TaskInfo Phase
Gathers the loading information of each task.
Collects the load and current frequency of each core.

LittleCore Phase
If the LITTLE core cluster is powered off, skip this phase.
Compares $\mathit{resource}_i$ of each $\mathit{Task}_i$ in the LITTLE core cluster with its $\mathit{res\_req}_i$.
◦ If any task gets fewer resources than its minimum requirement, adjust the core settings.

Core Adjustment
First, try to provide more resources to tasks by increasing the LITTLE core frequency.
If increasing the frequency cannot provide enough resources, add one LITTLE core.
If adding a core still cannot provide enough resources, migrate the task to the big core cluster.

bigCore Phase
If the big core cluster is powered off, skip this phase.
Compares resource_i of each Task_i in the big core cluster with its res_req_i.
◦ Similar to the LittleCore phase, but without the step that migrates tasks to more powerful cores.
If the requirements of every task are satisfied, try to migrate tasks back to the LITTLE cluster.
◦ By estimating whether the task requirements can be satisfied on LITTLE cores.

Flowchart

Platform
Platform: ODROID-XU
◦ 1st type, cluster migration
◦ Cortex™-A15 and Cortex™-A7
◦ Per-cluster DVFS

Benchmark
Three applications
◦ TTpod: MP3 player
◦ Candy Crush: Game
◦ Chrome: Web browser

Benchmark
Application   QoS requirement
TTpod         Play music without interrupts
Candy Crush   At least 24 FPS during gameplay
Chrome        Jump to next page within one second after clicking a link

Simulation
Simulate the execution of the three applications separately, and measure their average power consumption on the three types of big-LITTLE core architecture.
Compare the average power consumption with that of Linaro's strategies.

Simulation Results
Average power consumption per application:
Scheduler        Model  TTpod  Candy Crush  Chrome
Resource-Guided  I      0.019  0.371        0.916
Resource-Guided  II     0.019  0.371        0.916
Resource-Guided  III    0.019  0.371        0.916
Linaro           I      0.019  1.49         1.88
Linaro           II     0.019  1.49         1.73
Linaro           III    0.019  1.49         1.73
Linaro's strategies increase the core frequency when they encounter high CPU load, and eventually use the highest big-core frequency to run Candy Crush and Chrome.
Our method uses LITTLE cores for Candy Crush and big cores at a lower frequency for Chrome, thus reducing power consumption while maintaining the QoS.

Experiment
Execute the three applications together, and measure the average power consumption during execution.
Scenario
◦ A user first starts TTpod to play some music.
◦ A minute later, the user starts to play the game, Candy Crush, while keeping the music playing.
◦ After playing the game for three minutes, the user finishes the game and opens Chrome to search for a solution to a certain stage of Candy Crush.

Results – Linaro's Strategy

Results – Resource-Guided

Experimental Summary
Average power consumption
◦ Linaro's strategy: 0.843 W
◦ Resource-guided: 0.089 W
The main reason is that some applications over-use resources.
◦ For example: Candy Crush
◦ Linaro's strategy is unaware of this condition, and thus keeps increasing the core frequency.

Conclusion
Built an energy-efficient scheduler for the big-LITTLE core architecture that satisfies the resource requirement of each task and minimizes the energy consumption.
Proposed a scheduling policy that decides the resource use of the tasks dynamically.
The experimental results demonstrate that our resource-guided scheduling method is more power-efficient than Linaro's scheduling strategies.

Realizing Power-aware big-LITTLE Scheduler on ICL Hypervisor
2014/01/21

Project Goal
Enable power-aware big-LITTLE scheduler design on the ICL Hypervisor.
Target hardware platform
◦ ODROID-XU+"E" with power meter

Further Details
Schedule vCPUs across host CPUs.
The hypervisor scheduler may not interfere with the guest OS process scheduler.
Two conditions:
◦ The guest OS has a big-LITTLE-aware scheduler
  The hypervisor assigns host CPUs to the guest according to the guest OS requirements.
◦ The guest OS does not have a big-LITTLE-aware scheduler
  The hypervisor scheduler assigns vCPUs according to the guest's condition.

Preliminary Thought
Schedule vCPUs onto host CPUs.
The guest OS should provide information to the hypervisor
◦ Information such as "x big and y little" or "vCore 2 requires at least z processing speed".
◦ What information can the ICL hypervisor get from the guest OS?

VM Introspection
Since our scheduler needs QoS information from tasks, we need to know which applications a guest OS is running.
◦ To maintain QoS and save energy.
Thus we need the hypervisor to support VM introspection.
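
To illustrate why per-application QoS information matters, here is a minimal sketch of the resource check and the LITTLE-cluster adjustment order described in the first presentation (raise the frequency, then add a core, then migrate to the big cluster). It is one possible reading, not the authors' implementation: the names `Task`, `resource`, `res_req`, and `adjust_little` are hypothetical, the sketch takes one step per scheduling interval, and it treats "increasing the frequency cannot provide enough resources" as "the cluster is already at its highest frequency".

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    loading: float      # fraction of the interval the task was running, 0..1
    qos_freq: int       # minimum frequency (MHz) that meets the QoS requirement
    qos_loading: float  # CPU load of the task when running at qos_freq


def resource(task, core_freq):
    """Resource currently received: cluster frequency times task loading."""
    return core_freq * task.loading


def res_req(task):
    """Minimum resource required: QoS frequency times load at that frequency."""
    return task.qos_freq * task.qos_loading


def adjust_little(tasks, freq, freqs, cores, max_cores):
    """Pick the next action for the LITTLE cluster (assumed one step per interval).

    Returns a tuple (action, new_frequency, new_core_count).
    """
    starved = [t for t in tasks if resource(t, freq) < res_req(t)]
    if not starved:
        return ("keep", freq, cores)
    higher = [f for f in freqs if f > freq]
    if higher:
        # Step 1: raise the cluster frequency to the next available level.
        return ("raise_freq", min(higher), cores)
    if cores < max_cores:
        # Step 2: frequency is already at its maximum, so add one LITTLE core.
        return ("add_core", freq, cores + 1)
    # Step 3: nothing left to try on LITTLE, so migrate to the big cluster.
    return ("migrate_to_big", freq, cores)
```

Without `qos_freq` and `qos_loading` for each application, `res_req` cannot be computed, which is why the hypervisor would need to learn what the guest OS is running.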