with Scott Arnold & Ryan Nuzzaci

Power Optimization for Embedded System Idle Time in the Presence of Periodic Interrupt Services
Gang Zeng, Hiroyuki Tomiyama, and Hiroaki Takada

Dependability
▪ Processor utilization is below 100%, even when tasks run for their worst-case execution time (WCET)
▪ During idle states, a real-time OS maintains a periodic interrupt to keep the system synchronized
  ▪ e.g., uC/OS-II, eCos, and Linux all require a 10ms clock tick to generate the system clock

Reduce power usage
▪ Processors switch to low-power modes to save energy
▪ Many processors provide multiple power-saving modes
▪ Dynamic Power Management (DPM) tries to apply the optimal low-power mode
▪ The optimal low-power mode is determined by the duration of the system idle state

SA-1100 (high-performance, low power)
Operational Modes
▪ Run mode: normal operation, full functionality, high power usage
▪ Idle mode: CPU clock stopped, peripheral clocks still enabled
▪ Sleep mode: CPU and peripheral clocks stopped

M16C (low-end, low-power)
▪ Same modes, different names
▪ Lower power and shorter time for transitions between modes

SA-1100 is not suitable for the application considered in this paper
▪ This is despite its sleep mode drawing less power than the M16C's
▪ The transition time for returning to run mode is too long, so it is inefficient for short interrupt periods
▪ Normal interrupt service requests from on-chip interrupts don't work in sleep mode

Dynamic voltage/frequency scaling (DVFS)
▪ Reduces voltage and frequency while still meeting the deadline constraint
▪ Commonly accomplished with DC-DC converters and a phase-locked loop (PLL)
▪ The M16C is unique among low-end processors for its DVFS capability
  ▪ Small time overhead for a frequency change
  ▪ By contrast, high-performance processors typically require transition times ranging from 189µs to 3.3ms
▪ Higher frequency corresponds to higher power in idle mode
▪ Lower frequency has disadvantages
  ▪ Longer execution times for interrupt service routines (ISRs)
  ▪ May result in higher total energy usage
▪ We need to find the optimal low-power frequency to save energy

Most DPM schemes range from stochastic to predictive
▪ Assume fixed power
▪ Objective: determine which power mode the system should remain in

Predictive approach
▪ An on-chip timer interrupt is commonly employed in embedded systems to reactivate normal operation
▪ The on-chip clock cannot be disabled in this case

DPM vs. DVFS
▪ DPM saves power during long idle times
▪ DVFS saves power during short slack times
▪ DVFS assumes periodic tasks with known WCET
▪ Slack time cannot be reclaimed completely

Assumptions of Case Studies
▪ Alterable voltage
▪ Multiple low-power modes
▪ Modifiable clock frequency
▪ Real-time operating system (RTOS)
To simplify calculation we assume
▪ Static mode transition power
▪ Fixed voltage/frequency transitions

Two different case studies
▪ The periodic interrupt cannot be disabled
▪ The periodic interrupt can be disabled for a specific duration

Large DVFS time overhead
▪ Static approach adopted (speed set only once, at the beginning of idle)
▪ On first entering a power-saving mode, the processor goes to the lowest possible frequency
▪ Use equation 1 to calculate power usage

Negligible DVFS time overhead
▪ Dynamic approach adopted (two DVFS modes)
▪ Full speed is assumed at the beginning of each ISR
▪ The processor slows down to low speed before entering each power mode
▪ Use equation 3 to calculate power usage
▪ Because Iidle is linearly related to M (speed), only a limited number of speeds are applicable in this way

Assuming Known WCET
▪ The idle state duration is known in this case
▪ Thus, disable the clock for this interval
Problems
▪ Tracing the original clock signal to keep system time synchronized
▪ An additional tick timer keeps track of time without powering the peripherals
▪ This approach is hardware dependent

Platform
OAKS 16-mini with Renesas M16C/26 processor
▪ 20MHz max, adjustable with divider
▪ 64K ROM, 2K RAM
▪ Custom ISA, 106 instructions, 39/106 single cycle
Power Stats (from datasheet)
▪ 16mA @ 3.3V, 20MHz
▪ 1.8uA in wait mode
▪ 0.7uA in stop mode (static)
▪ Cannot change supply voltage

Software
RTOS – TOPPERS/JSP kernel
▪ Consistent with the uITRON4.0 standard
▪ Easy-to-read and reconstructible source code
▪ Easily ported to other targets
▪ Low RAM usage
▪ Simulation environment on Windows or Linux
▪ Free

Testing
▪ DMM for current measurement
▪ Oscilloscope for voltage waveform
▪ Voltage and current acquired separately
▪ Configurable clock tick set to the default 1ms period, with an execution time of 12us @ 20MHz

Normal and wait avg. current
Table 1. Measured normal and wait mode average current under different speed settings (voltage = 3V)

Selectable speed      Normal mode: Irm (mA)   Wait mode: Iim (mA)
(1/M full speed)
20MHz (1/1)           10.04                   1.30
10MHz (1/2)           6.35                    1.26
5MHz (1/4)            4.35                    1.24
2.5MHz (1/8)          3.24                    1.23
1.25MHz (1/16)        2.45                    1.22
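The trade-off between a slower wait-mode clock and a longer ISR can be explored with a small duty-cycle model built on the Table 1 currents. The sketch below is not the paper's equation 1 or equation 3: it assumes that the ISR takes M times longer at 1/M speed and that mode and speed transitions cost nothing, and all function names are made up for illustration. Under those assumptions it selects the same optimal speeds as the results overview (10MHz for 1ms/12us, 5MHz for 1ms/7us, 1.25MHz for 10ms/12us), although its absolute currents come out lower than the calculated column of Table 2, presumably because the transition overhead is left out.

```python
# Table 1 (3V supply): divider M -> (run-mode Irm, wait-mode Iim), both in mA.
TABLE1_MA = {
    1: (10.04, 1.30),
    2: (6.35, 1.26),
    4: (4.35, 1.24),
    8: (3.24, 1.23),
    16: (2.45, 1.22),
}


def _weighted_current(run_ma, wait_ma, busy_us, period_us):
    """Time-weighted average of run and wait current over one tick period."""
    if busy_us > period_us:
        raise ValueError("ISR does not fit in one period at this speed")
    return (run_ma * busy_us + wait_ma * (period_us - busy_us)) / period_us


def static_current(m, period_us, isr_us_full_speed):
    """Average current (mA) when the ISR and wait mode both use speed 1/m.

    Assumptions: ISR time scales by m; transition overhead is ignored.
    """
    run_ma, wait_ma = TABLE1_MA[m]
    return _weighted_current(run_ma, wait_ma, m * isr_us_full_speed, period_us)


def dynamic_current(isr_m, idle_m, period_us, isr_us_full_speed):
    """Average current (mA) with the ISR at 1/isr_m and wait mode at 1/idle_m.

    Assumption: changing speed costs no time and no energy.
    """
    run_ma = TABLE1_MA[isr_m][0]
    wait_ma = TABLE1_MA[idle_m][1]
    return _weighted_current(run_ma, wait_ma, isr_m * isr_us_full_speed, period_us)


def best_static_divider(period_us, isr_us_full_speed):
    """Divider with the lowest modeled current among those that fit the period."""
    feasible = [m for m in TABLE1_MA if m * isr_us_full_speed <= period_us]
    return min(feasible, key=lambda m: static_current(m, period_us, isr_us_full_speed))


if __name__ == "__main__":
    for period_us, isr_us in [(1000, 12), (1000, 7), (10000, 12), (100000, 12)]:
        m = best_static_divider(period_us, isr_us)
        print(period_us, isr_us, "-> 1/%d full speed" % m)
```

In this model the dynamic variant (ISR at full speed, wait mode at 1/16) always draws less than the best static setting, because it pairs the shortest busy time with the lowest wait-mode current; how much of that gain survives on real hardware depends on the transition costs the sketch ignores.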
Avg. current vs. exec. time with 1ms ISR period
[Figure 5. Calculated results for 1ms interrupt period: average current vs. execution time and speed selection. x-axis: execution time of interrupt service routine (us); y-axis: average current (mA); one curve each for 1/1, 1/2, 1/4, 1/8, and 1/16 full speed.]

ISR Period = 1ms, ISR Exec. Time = 12us
ISR Period = 1ms, ISR Exec. Time = 7us
ISR Period = 10ms, ISR Exec. Time = 12us

Table 2. Comparison of measured and calculated average current with Tp=1ms, Th=12us
Idle state average current (mA) under periodic interrupt service (voltage=3V, period=1ms, Th=12us)

Selected speed        Measured current   Calculated current
(1/M full speed)
20MHz (1/1)           1.47               1.472
10MHz (1/2)           1.45               1.451
5MHz (1/4)            1.47               1.461
2.5MHz (1/8)          1.50               1.498
1.25MHz (1/16)        1.57               1.534

We change the interrupt period to 10ms and perform the above calculations and measurements again. The corresponding results are given in Table 4. As can be seen, the optimal speed is 1.25MHz (1/16 full speed).

Periodic interrupt results overview

ISR Period   ISR Exec. Time   Optimal Speed
1ms          12us             10MHz (1/2)
1ms          7us              5MHz (1/4)
10ms         12us             1.25MHz (1/16)
100ms        12us             1.25MHz (1/16)

Dynamic approach
▪ DVFS overhead negligible with M16C
▪ Increase speed at the beginning of the ISR
▪ Decrease speed on exit to idle state

Static vs. Dynamic Approach (1ms ISR period)
[Figure: average current (mA) vs. execution time of interrupt service routine (us). Static curves: 1/1, 1/2, 1/4, 1/8, and 1/16 full speed. Dynamic curves: 1/1 + 1/16, 1/2 + 1/16, 1/4 + 1/16, 1/8 + 1/16, and 1/16 + 1/16.]

“Cycle-conserving DVS for EDF scheduler” [2] in conjunction with the proposed DPM algorithm are implemented in the TOPPERS/JSP kernel. The experiment test set is presented in Table 5, and corresponding energy results for one minute running are summarized in Table 6.
As can be seen, while DVFS can achieve significant power savings compared with full speed running, the proposed configurable clock tick for idle state power management can further reduce the energy by 23% on average compared with normal DVFS without any idle state processing. An experiment is also conducted to verify the capability of keeping system time synchronized. We implement the configurable clock tick and the original clock tick in the RTOS, respectively, and then let them run the above DVFS experiments for 30 minutes. Finally, we compare their system time after running. The results show no difference between the two implementations, which indicates the configurable clock tick can trace the original clock tick precisely even if the clock tick is disabled during idle time.

▪ The periodic interrupt clock tick is disabled when in idle mode
▪ Idle mode can last up to 3000ms, compared to 1-10ms
▪ ISR execution time increases because there is now more overhead for clock syncing

Table 5. Experiment task set

Task set   Period (ms)   WCET (ms)   Actual ET (ms)
Task 1     500-2000      130         28-130
Task 2     500-3000      245         38-245

Power Results: DVFS and Idle State Power Management

▪ For the case where the periodic interrupt cannot be disabled, the authors proposed static and dynamic methods to achieve minimal power consumption
▪ The dynamic approach can reduce the average power by 4.3%-11% compared to the static approach for systems whose clock tick interrupt cannot be disabled
▪ For the case where the periodic interrupt can be disabled, they proposed a configurable clock tick that saves power by keeping the processor in low-power mode for a longer time
▪ A power reduction of 23% can be achieved for systems with configurable clock tick interrupts

▪ Trivial experiments
▪ Graphs difficult to read (could have used color)
▪ Inaccurate measurement tools
▪ Could have expanded to test other ISAs, OSs, architectures, platforms, etc.
▪ Could include a cost/benefit analysis
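The bookkeeping behind a configurable clock tick can be illustrated with a minimal sketch. It is not the TOPPERS/JSP implementation: the class and function names are hypothetical, and it assumes the kernel knows each task's next release time in ticks. The idea shown is only that the tick counter is advanced by the number of skipped ticks on wake-up, so system time matches what a tick-by-tick clock would report.

```python
from typing import Sequence


def ticks_to_skip(now_tick: int, release_ticks: Sequence[int]) -> int:
    """Whole ticks the periodic interrupt can stay disabled.

    release_ticks holds the next release time of every task, in ticks.
    """
    if not release_ticks:
        return 0  # assumption: with nothing scheduled, keep the normal tick
    # A task that is already due (release <= now) means no ticks can be skipped.
    return max(0, min(release_ticks) - now_tick)


class SystemTime:
    """Hypothetical stand-in for the kernel's tick counter."""

    def __init__(self, tick_ms: int = 1) -> None:
        self.tick_ms = tick_ms
        self.ticks = 0

    def on_tick(self) -> None:
        # Normal path: each periodic interrupt advances time by one tick.
        self.ticks += 1

    def idle_until_next_release(self, release_ticks: Sequence[int]) -> int:
        # Real hardware would program a one-shot timer and enter a low-power
        # mode here; this sketch models only the time bookkeeping.
        skipped = ticks_to_skip(self.ticks, release_ticks)
        self.ticks += skipped  # catch up as if every skipped tick had fired
        return skipped

    def now_ms(self) -> int:
        return self.ticks * self.tick_ms
```

For example, a clock at tick 100 with task releases at ticks 600 and 350 can skip 250 ticks and resume at tick 350, the same value a clock driven by 350 individual interrupts would hold.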