3430 PM framework architecture

Wireless Terminals
Software Devt.
OMAP3430 Linux
Power Management
16 Apr 2008
Version 1
TI Internal Use Only
Too big to fully describe, but let's try
– Digital Television with Linux - Architecture and Opportunities
from Bas Engel cancelled.
– Will fill in and expand on yesterday’s demo on OMAP3 power
– Give an overview of architecture which was created with
some typical data flows.
• HW overview
• Software overview
– Ask any questions as we go please
– Might not make it through all slides, but hopefully will touch on
all points in the data flow and any questions
Technical Showcase
CE Linux Forum / Embedded Linux Conference 2008
Power Management on OMAP3
Richard Woodruff / Texas Instruments
What is demonstrated
Linux running PowerTop while being measured. The system
goes to 0 volts in between system ticks. Full SOC context
is lost and restored as dictated by data flow.
• Linux CpuIdle + menu governor drive 7 idle/c-states
• Linux CpuFreq + ondemand governor drive 4 operating/p-states. A 5th overdrive operating point is available
• Linux Dynamic tick is implemented (CONFIG_NO_HZ)
• Traditional suspend/resume is implemented
Hardware Information
Texas Instruments OMAP3430
ARM CortexA8 + TI IVA2.2 + IMG SGX530 + TI ISP + more…
http://focus.ti.com/pdfs/wtbu/swpu114g.pdf
How Linux was improved
TI provides an open source reference implementation
including TRM and boards on current Linux kernels.
• Where available, standard Linux components have been used for
the implementation
• TI has added a resource framework for multi-level
resource evaluation to include multiple multi-state
power domain support, new constraint support, power
aggressive drivers, and more…
Patch Availability
http://www.omapzoom.org/wiki/tiki-index.php
http://linux.omap.com/pub/kernel/3430sdp/
Main generic points
• Idea of partial activity.
– Functional clock can run dumping out data
– Other flow path parts of the chip can be disabled and not accessible while your F-Clock runs
– Software does part of the work and hardware dependencies do the rest
• System will restore context only on the data flow/control path. There is no need to
restore the whole system.
• For this to happen drivers need to:
– Aggressively get and release resources
• This is NOT suspend/resume. This is on the activity path. The device is _still_ accessible. You are just
enabling/disabling around an access or groups of accesses.
• They must follow TRM guidelines and do get/put. Don’t leave status around.
– Structured so it can handle side effects of possible hardware register loss at clock disable
time.
– Prepared to restore context on clock enable.
– Prepared to enable proper wake up sources specific to function
– If they have an operational restriction, they need to raise it as a constraint
• If I’m an audio device with a fixed external clock, I know my FIFO size and its drain rate. I know at
the driver my acceptable latency. I need to tell the system so it doesn’t cut power for too long.
• Need for some kind of shared resource (vote taking) manager to take the inputs and
do the right things.
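As a rough illustration of the audio example above, the sketch below models how a driver might derive its acceptable latency from its FIFO size and drain rate, and how a vote-taking manager could combine requests by honoring the tightest one. The names `LatencyVotes` and `fifo_drain_time_us`, and all numbers, are hypothetical and are not part of the TI framework.

```python
class LatencyVotes:
    """Toy vote-taking manager: the tightest (smallest) latency request wins."""

    def __init__(self):
        self._votes = {}  # requester name -> max tolerated latency (microseconds)

    def set_constraint(self, who, max_latency_us):
        self._votes[who] = max_latency_us

    def release_constraint(self, who):
        self._votes.pop(who, None)

    def effective_limit_us(self):
        # No votes means nobody restricts how long a wakeup may take.
        return min(self._votes.values()) if self._votes else None


def fifo_drain_time_us(fifo_bytes, bytes_per_second):
    """Time until a full FIFO runs empty at a fixed drain rate."""
    return fifo_bytes * 1_000_000 // bytes_per_second


votes = LatencyVotes()
# Hypothetical audio device: 1024-byte FIFO drained at 192000 bytes/s.
audio_limit_us = fifo_drain_time_us(1024, 192_000)
votes.set_constraint("audio", audio_limit_us)
votes.set_constraint("uart", 20_000)
print(votes.effective_limit_us())
```

In this model the audio vote is the tightest, so it bounds the wakeup latency of whatever sleep state is chosen; once the audio driver releases its vote, the next tightest request takes over.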
Why bother / Few Driving Considerations
• End goal
– Must maintain uW chip consumptions in idle modes
– Must increase performance but keep similar power envelope as batteries
are scaling at same rate and are costly
• New factor
– New process 65nm, 45nm, 32nm… with ~30x or more leakage increase
– Higher active power consumption with fast DPLLs (100sMHz to GHz)
• Design needs
– Must support OFF mode 0 volts (full context loss) to support uW idle use
cases
– Must be able to wake from idle modes with low latency.
– As little user-space intervention as possible. The system should go to the lowest power
state automatically.
What is an OMAP34xx (from public TRM)
OMAP3430 Power Partitioning
HW PM Features in OMAP3430
• Multi Level Clock Gating
• Module -> Clock Domain -> DPLL
• HW Organized for Power Saving
• Voltage Domains -> Power Domains -> Clock Domains -> Clocks
• 16 Power domains
• Two independently scalable voltage domains VDD1 & VDD2
• Dynamic Voltage and Frequency Scaling (DVFS)
• Dynamically changing OPP to suit execution requirements
• Standby Leakage Management (SLM)
• Individual power domain & memory RET & OFF support
• System wide RET & OFF support
• SmartReflex™
• Dynamically adjusting VDD1 & VDD2 OPP voltages to suit the silicon
characteristics, temperature, voltage.
• Power IC Support
• Dedicated HS I2C interfaces for controlling SMPS, LDO etc.
How 3430 Linux PM SW Exploits HW Features
• Control of power resources
– Clocks, power domains, voltage domains, memories and power IC resources
• Inactive state power saving in OS Idle via CPUIdle and suspend/resume
– Automatic choice between multiple idle C-states by menu governor
– Manual sleep through traditional Linux suspend/resume
– System wide sleep states to OFF and Retention mode
• Device driver interfaces to control power resources
• Constraints & constraints evaluation
• OPP Control
– Operating point control through Dynamic Voltage and Frequency Scaling (DVFS)
and SmartReflex™
• Active state dynamic load prediction and policy management via CPUFreq
– Ondemand governor to control OPP/P-States
• Support for DSS Low Power Refresh mode
OMAP 3430 Linux PM Reference Arch.
Key 3430 Linux PM SW Knobs
• Aggressive resource usage control
• MPU Retention
• MPU OFF
• CORE Retention
• Core Off & Off Mode Support
• Other Power Domain Off / Retention
• DVFS
• SmartReflex™
• HW Supervised Clock domain transitions
• Display Low Power Refresh
Baseport contents
• Functionality available
– PRCM APIs
– Clock framework
– Shared Resource Framework
– Resource tracking
– SmartReflex™ driver
– Power IC driver
– Constraints framework
– CPUIdle framework
– CPUFreq framework
– Display Low Power Refresh
– OFF Mode Support
– Suspend/Resume Support
PRCM API Layer 1 & 2
• Abstraction of register programming sequences
• Layer 1 provides basic functionalities
– Enabling/disabling clocks
– Setting power domain states etc.
– Controlling DPLLs
– Configuring DPLL/clock dividers
– Setting clock source
• Layer 2 provides advanced functionalities
– Doing the frequency change part of OPP change
– Doing the voltage change part of OPP change
– Setting the low power mode for the chip
• Used by higher layers
• Files
– 2.6_kernel\arch\arm\mach-omap2\prcm_34xx.c
– 2.6_kernel\arch\arm\mach-omap2\prcm-regs.h
– 2.6_kernel\include\asm-arm\arch-omap\prcm.h
Clock Framework
• Centralized control for all clock related functionalities
– Status (Enable/Disable)
– Rate
– Dependencies (Clock Tree)
• Reference counting to avoid potential sharing conflicts
• Re-entrant code to handle simultaneous requestors
• Two Layers
– Generic layer common for all OMAP platforms
– OMAP3430 specific clock framework layer
• Hierarchical
– Clocks represented as hierarchical tree taking into account clock dependencies (parent,
children)
• Standard open source solution (mainlined)
• Used by device drivers
• Internally uses the PRCM API layers 1 & 2
• Files
– 2.6_kernel\arch\arm\plat-omap\clock.c
– 2.6_kernel\arch\arm\mach-omap2\clock_34xx.c
– 2.6_kernel\arch\arm\mach-omap2\clock_34xx.h
– 2.6_kernel\include\asm-arm\arch-omap\clock.h
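The reference counting and parent/child dependencies described for the clock framework can be modeled in a few lines. This is only a user-space sketch of the idea: the `Clock` class and the clock names are hypothetical, and the real framework is kernel C code that also programs the hardware through the PRCM layers.

```python
class Clock:
    """Toy clock node: enabling a clock first enables its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.usecount = 0  # number of outstanding enable requests

    def enable(self):
        if self.usecount == 0 and self.parent is not None:
            self.parent.enable()  # the parent must run before the child can
        self.usecount += 1

    def disable(self):
        if self.usecount == 0:
            raise RuntimeError(f"unbalanced disable of {self.name}")
        self.usecount -= 1
        if self.usecount == 0 and self.parent is not None:
            self.parent.disable()  # last user gone, release the parent

    @property
    def running(self):
        return self.usecount > 0


dpll = Clock("dpll")
fclk_a = Clock("fclk_a", parent=dpll)
fclk_b = Clock("fclk_b", parent=dpll)

fclk_a.enable()
fclk_b.enable()
fclk_a.disable()
print(dpll.running)   # the DPLL is still needed by fclk_b
fclk_b.disable()
print(dpll.running)   # no users left, the DPLL can stop
```

The use count is what prevents one driver's disable from cutting a clock that another driver still depends on.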
Shared Resources Framework
• Centralized control for shared resources such as power domains, DPLLs and
memory resources
• Integrated with clock framework to automatically control power domains (except
mpu, core and neon) based on clock requests
• Reference counting
– Avoids potential sharing conflicts
– Ensures supply of highest requested power domain/memory state
• Re-entrant code
– Handles multiple simultaneous requests to a resource
• Synchronous calls
• Capability to get
– Current state of the resource
– Notifications on resource state change
• Generic framework that can be extended to control any type of resource. Is currently
used to model constraints
• Files
– 2.6_kernel\arch\arm\mach-omap2\resource_34xx.c
– 2.6_kernel\arch\arm\mach-omap2\resource_34xx.h
– 2.6_kernel\include\asm-arm\arch-omap\resource.h
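A minimal sketch of the "highest requested state wins" rule and of state-change notifications follows. The `SharedResource` class, the level encoding, and the resource name are assumptions made for illustration; they do not mirror the actual API in resource_34xx.c.

```python
# Hypothetical level ordering: a higher number keeps more of the domain alive.
OFF, RET, ON = 0, 1, 2


class SharedResource:
    def __init__(self, name):
        self.name = name
        self._requests = {}   # user -> requested level
        self._listeners = []  # callbacks invoked on a level change
        self.level = OFF

    def register_notifier(self, callback):
        self._listeners.append(callback)

    def request(self, user, level):
        self._requests[user] = level
        self._reevaluate()

    def release(self, user):
        self._requests.pop(user, None)
        self._reevaluate()

    def _reevaluate(self):
        # The highest outstanding request wins; with no users the level is OFF.
        target = max(self._requests.values(), default=OFF)
        if target != self.level:
            old, self.level = self.level, target
            for callback in self._listeners:
                callback(self.name, old, target)


events = []
dss = SharedResource("dss_pwrdm")
dss.register_notifier(lambda name, old, new: events.append((name, old, new)))
dss.request("display", ON)
dss.request("video", RET)   # lower than the current level, so nothing changes
dss.release("display")      # falls back to the next highest request
print(dss.level, events)
```

Listeners are called only when the effective level actually changes, which keeps notification traffic proportional to real transitions rather than to the number of requests.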
Resources Tracking Layer
• Tracking layer integrated into respective frameworks : clock
framework, shared resource framework
• Tracks resources in use and resource users
– List of resources in use (and state)
– List of users (drivers) of a particular resource
SmartReflex™ Driver
• SmartReflex™ driver allows for auto-compensation of VDD1 and
VDD2 voltages (around the voltages specified by current OPP)
by analyzing the silicon characteristics, temperature, voltage etc
• The driver configures the Voltage Processor and Voltage
Controller
• The driver interfaces to the Triton2 Power IC via the SR I2C bus
• Files
– 2.6_kernel\arch\arm\mach-omap2\smartreflex.c
– 2.6_kernel\arch\arm\mach-omap2\smartreflex.h
• SW configuration
– To enable Smart Reflex
• echo -n 1 > /sys/power/sr_vdd1_autocomp (for VDD1)
• echo -n 1 > /sys/power/sr_vdd2_autocomp (for VDD2)
– To disable Smart Reflex
• echo -n 0 > /sys/power/sr_vdd1_autocomp (for VDD1)
• echo -n 0 > /sys/power/sr_vdd2_autocomp (for VDD2)
Power IC Driver
• The Power IC driver provides APIs for configuring
Triton2.
• Drivers call twl4030 APIs to enable/disable their LDOs.
• Triton specific configuration (enabling smart reflex, etc)
is done during bootup in twl4030_power driver.
• Files
– 2.6_kernel/drivers/i2c/chips/twl4030_power.c
– 2.6_kernel/include/asm-arm/arch-omap/power_companion.h
CPUIdle Framework
• The following C states are supported in Cpuidle driver:
– C0 – System executing code
– C1 – MPU WFI + Core active + No tick suppression
– C2 – MPU WFI + Core active + Tick suppression
– C3 – MPU CSWR + Core active + Tick suppression
– C4 – MPU OFF + Core active + Tick suppression
– C5 – MPU RET + Core RET + Tick suppression
– C6 – MPU OFF + Core RET + Tick suppression
– C7 – MPU OFF + Core OFF + Tick suppression
• Menu governor adapted to support OMAP3 specific requirements. Takes the following into account to decide
the target sleep state:
– Next timer expiry in the system
– Comparing target residency with available sleep time
– Comparing exit latency with system wide latency constraints
– Checking for activity in the core domain
• Dynamic tick based on support in kernel.
• Files
– 2.6_kernel/arch/arm/mach-omap2/pm_34xx.c
– 2.6_kernel/arch/arm/mach-omap2/sleep_34xx.S
– 2.6_kernel/arch/arm/plat-omap/timer32k.c
• SW configuration and information
– echo -n <state_number> > /sys/power/cpuidle_deepest_state
• Locks the deepest system state at <state_number>. ex.: If state_number = 3, the system does not go to CORE RET and CORE OFF.
– cat /proc/pm_prepwstst
• Displays the previous power state of MPU, CORE, IVA and clocks in the system.
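The four checks the adapted menu governor takes into account can be sketched as a simple filter over the C-state table. The exit latency and target residency figures below are made-up illustrative values, not measured OMAP3430 numbers, and `select_state` is a hypothetical helper rather than the governor's real code.

```python
from collections import namedtuple

# exit_latency_us and target_residency_us are illustrative values only.
CState = namedtuple(
    "CState", "name exit_latency_us target_residency_us needs_core_idle"
)

C_STATES = [
    CState("C1", 10, 10, False),         # MPU WFI, core active
    CState("C2", 50, 50, False),         # as C1, with tick suppression
    CState("C3", 1500, 1800, False),     # MPU CSWR, core active
    CState("C4", 2500, 4000, False),     # MPU OFF, core active
    CState("C5", 3000, 5000, True),      # MPU RET + core RET
    CState("C6", 10000, 15000, True),    # MPU OFF + core RET
    CState("C7", 30000, 300000, True),   # MPU OFF + core OFF
]


def select_state(next_timer_us, latency_limit_us, core_busy, deepest=7):
    """Pick the deepest state that passes every check; fall back to C1."""
    chosen = C_STATES[0]
    for index, state in enumerate(C_STATES, start=1):
        if index > deepest:
            break  # mirrors a cap such as cpuidle_deepest_state
        if state.target_residency_us > next_timer_us:
            continue  # would not sleep long enough to pay off
        if state.exit_latency_us > latency_limit_us:
            continue  # wakeup would violate a latency constraint
        if state.needs_core_idle and core_busy:
            continue  # the core domain still has activity
        chosen = state
    return chosen.name


print(select_state(next_timer_us=20_000, latency_limit_us=100_000, core_busy=False))
print(select_state(next_timer_us=20_000, latency_limit_us=100_000, core_busy=True))
```

With the sample table, activity in the core domain rules out the states that take CORE to RET or OFF, so the second call settles on a shallower state than the first.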
CPUIdle Framework (contd.)
• The state of power domains other than MPU and CORE
– Handled outside the idle loop by the shared resource framework
– Based on the requests of the clocks in the power domains
• NEON and PER domains are controlled inside the idle loop since their states
are closely coupled with MPU and CORE states respectively.
• The list of c-states to be supported needs to be refined based on the power
savings and latency of each state.
• Cpuidle framework has these components:
– CPUidle governor: Entity which decides the target C state of the system.
– CPUidle driver : Entity which populates the C states supported by the system and
implements functions to transition to the C states.
– Generic Cpuidle framework: Every time the idle loop is called, this framework calls
the current governor to decide the target C state of the system. It then calls the
current driver to transition to the C state selected by the governor.
Suspend/ Resume
• Every driver implements suspend/resume after
registration with LDM.
• Drivers release clocks and save context in suspend call
and restore these when resume is called.
• Drivers which have already released their clocks and
have saved their context need not do anything in their
suspend call.
• Files
– 2.6_kernel/arch/arm/mach-omap2/pm_34xx.c
– 2.6_kernel/arch/arm/mach-omap2/prcm_34xx.c
– 2.6_kernel/arch/arm/mach-omap2/sleep_34xx.S
OFF Mode Support
• All registers in the power domain are reset when the
power domain goes to OFF.
• OFF mode could introduce considerable latency for
wakeup.
• The system can enter chip off through two paths:
– Idle loop
– Suspend/Resume
• Driver adaptations for OFF mode support are discussed
in later slides
Constraints Framework
• Constraints are stored in the shared resource framework.
• Type of constraint can be obtained by specifying the name of the constraint.
• There are three types of constraints that are implemented:
– Interrupt latency constraints
• Constraints to control system wide acceptable latency (the target C state that is
programmed in idle thread)
• Internally uses the latency constraint APIs available in the kernel
– Power Domain latency constraints
• Constraints to control transition of a particular power domain
– Frequency/OPP constraints
• Constraints to specify a Frequency/OPP at which the driver can operate
• Files
– 2.6_kernel\arch\arm\mach-omap2\resource_34xx.c
– 2.6_kernel\arch\arm\mach-omap2\resource_34xx.h
– 2.6_kernel\include\asm-arm\arch-omap\resource.h
DVFS
• DVFS is implemented using the constraints framework
• Drivers/Bridge express constraints on OPP and/or on ARM and DSP
frequency using the constraints framework
• The constraints framework evaluates all the constraints available before
setting the target OPP/Frequency
• If any pre/post notification functions are registered, the constraint framework
takes care of calling them before and after the OPP/Frequency change
• In addition to the exported DVFS APIs for VDD2 OPP change, the high VDD2 OPPs
are pegged to high VDD1 OPPs (since a high CPU frequency often indicates
high L3 bandwidth usage)
• SW configuration
– To change the OPP of VDD1, use
• echo -n <opp_number> > /sys/power/vdd1_opp_value
• opp_number can be 1, 2, 3, 4 or 5. Default is 3 (when cpufreq is disabled)
– To change the OPP of VDD2, use
• echo -n <opp_number> > /sys/power/vdd2_opp_value
• opp_number can be 2 or 3. Default is 3 (when cpufreq is disabled)
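The evaluation of OPP constraints with pre/post notifications might look roughly like the model below. It assumes each constraint is a minimum OPP and that the highest minimum wins; the `OppDomain` class and its methods are hypothetical, and the fallback to the lowest valid OPP when no constraints remain is a simplification of this sketch.

```python
class OppDomain:
    """Toy voltage domain: drivers vote for a minimum OPP, the highest vote wins."""

    def __init__(self, name, valid_opps, default_opp):
        self.name = name
        self.valid_opps = tuple(valid_opps)
        self.current_opp = default_opp
        self._min_opp = {}     # requester -> lowest OPP it can operate at
        self._notifiers = []   # callbacks called as callback(phase, old, new)

    def register_notifier(self, callback):
        self._notifiers.append(callback)

    def set_constraint(self, who, opp):
        if opp not in self.valid_opps:
            raise ValueError(f"OPP {opp} is not valid for {self.name}")
        self._min_opp[who] = opp
        self._apply()

    def remove_constraint(self, who):
        self._min_opp.pop(who, None)
        self._apply()

    def _apply(self):
        # Evaluate every outstanding constraint before touching the OPP.
        target = max(self._min_opp.values(), default=min(self.valid_opps))
        if target == self.current_opp:
            return
        old = self.current_opp
        for callback in self._notifiers:
            callback("pre", old, target)    # e.g. quiesce DMA before the change
        self.current_opp = target
        for callback in self._notifiers:
            callback("post", old, target)   # e.g. reprogram dividers afterwards


log = []
vdd1 = OppDomain("vdd1", valid_opps=(1, 2, 3, 4, 5), default_opp=3)
vdd1.register_notifier(lambda phase, old, new: log.append((phase, old, new)))
vdd1.set_constraint("camera", 4)
vdd1.set_constraint("bridge", 2)    # lower than the camera's vote, no change
vdd1.remove_constraint("camera")
print(vdd1.current_opp, log)
```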
CPUFreq Framework
• Consists of three major blocks
– CPUFreq core
– CPUFreq Driver
– CPUFreq governor
• TI uses the kernel space governor ondemand
• CPUFreq Core
– The CPUFreq driver registers with the core by populating a
cpufreq_driver structure with the APIs implemented by the
driver
• CPUFreq Driver
– Interfaces with shared resource framework to implement an OPP change
• CPUFreq governor
– The ondemand governor sets the CPU speed depending on the current
usage
Display Low Power Refresh
• LPR – OSIdle with LCD ON. Display in VGA mode, 60fps
• DSS logic and pixel clocks are reduced
• Available DSS FIFOs are merged for greater FIFO depth
• CORE is configured to transition to RET when DSS is inactive
• SW configuration
– To enable Low Power Refresh mode
• echo -n "enable" > /sys/class/display_control/omap_disp_control/lpr
– To disable Low Power Refresh mode
• echo -n "disable" > /sys/class/display_control/omap_disp_control/lpr
Device Driver PM support
Device Driver Power Responsibilities
• Device drivers need to aggressively manage
request/release of clocks through clock framework
• Device drivers should configure optimal power
settings
• Device drivers need to register with the LDM and
implement suspend() and resume() calls.
• Device drivers can specify constraints when
required
• Device drivers need to implement context
save/restore and interact with the appropriate
framework for the same.
Aggressive Clock Management
• Device drivers are expected to control their clocks on a request basis so that we will be
able to hit low power states in idle loop whenever possible.
• If the drivers are not transaction based, there can be an inactivity timer for the driver
to cut off clocks after a period of inactivity.
• Examples of changes being made to drivers:
– Drivers which are transaction based can control clocks based on activity. ex.: i2c driver
enables clocks when there are pending requests and disables them when there are no
pending requests.
– Camera – Clocks will be enabled as long as the driver is required (i.e. camera is running).
– Display – The fbdev inactivity timer (which is tied to user activity) can be used to turn off
display clock
– MMC – Clocks are controlled on a per command basis.
– GPTimer – Clocks are controlled as per requirement, i.e. when a timer is in use, they will be
enabled and will be disabled when they are not in use
– UART – Console clocks are cut in the idle loop (before putting core domain to retention) and
other UART clocks could be controlled on a need basis.
– USB – Clocks can be controlled as per requirement (only when transfers are going on).
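For drivers that are not transaction based, the inactivity timer idea can be sketched as follows. Time is passed in explicitly to keep the model deterministic; the `GatedDevice` class and the 500 ms timeout are hypothetical and stand in for whatever timer mechanism a real driver would use.

```python
class GatedDevice:
    """Toy driver that cuts its clock after a period without accesses."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms
        self.clock_on = False
        self.last_access_ms = 0

    def access(self, now_ms):
        # Any access (re)enables the clock and restarts the inactivity window.
        self.clock_on = True
        self.last_access_ms = now_ms

    def poll(self, now_ms):
        # Called periodically; gates the clock once the window has elapsed.
        if self.clock_on and now_ms - self.last_access_ms >= self.timeout_ms:
            self.clock_on = False
        return self.clock_on


fb = GatedDevice(timeout_ms=500)
fb.access(now_ms=0)
print(fb.poll(now_ms=499))   # still inside the inactivity window
print(fb.poll(now_ms=500))   # window elapsed, clock is gated
```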
Device Driver Specific Power Settings
• After enabling clocks, the module must configure sysconfig
register with the following settings:
– AutoIdle
– SmartIdle
– SmartStandby
– Clock activity – set it to FCLK ON ICLK OFF when clocks are enabled, and
change it to FCLK OFF ICLK OFF just before clocks are disabled
– WKEN bit
• In addition any other additional wakeup registers/ power
registers of the module need to be configured
– usually mentioned in the power management section of the module in
the TRM.
• The clock domains are always configured in hardware
supervised mode so that interface clock is automatically
controlled by the hardware.
Impact due to DVFS
• Validation requirement for DVFS
– Determine lowest OPP at which a module can function
– Validate that DVFS can happen when a functionality is ongoing
• If a device driver requires a minimum CPU/L3
frequency for its operations, it can set a constraint on
OPP or on MPU/IVA frequency using the constraints
framework.
• A device driver can register for pre/post notifications on
VDD1 or VDD2 DVFS change.
OFF mode impact
• The following questions need to be addressed for OFF
mode through idle loop:
– When should the drivers save and restore context?
– What is the context to be saved and restored?
– During driver operation, when can a power domain go to off
mode?
– How will the power management framework decide when to
put a power domain to OFF?
Context Save & Restore
• When all drivers in a power domain release their clocks, the
power domain can go to RET or OFF state.
– The shared resource framework programs the power domain to target
state depending on the latency constraints in the system.
• Drivers can follow any of the following methods to save and
restore their context:
– Always save/ always restore – drivers which do not have lots of registers
can always save context and restore context because it will not cause a lot
of overhead.
– Early save/ restore on demand – In this method, drivers save context
every time they release their clocks but restore it only if the power
domain has actually gone to off after saving – this makes sense for drivers
which have a large restore time with save time being minimal
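The "early save / restore on demand" method can be modeled as below. The sketch assumes the driver can ask how many times its power domain has been OFF and compares that count against the value recorded at save time; this counter mechanism, and the `PowerDomain` and `Driver` classes, are assumptions for illustration rather than the framework's documented interface.

```python
class PowerDomain:
    """Toy power domain that counts OFF transitions and resets registers."""

    def __init__(self):
        self.off_count = 0
        self.drivers = []

    def go_off(self):
        self.off_count += 1
        for drv in self.drivers:
            drv.hw_regs = {"ctrl": 0}  # hardware registers return to reset values


class Driver:
    def __init__(self, domain):
        self.domain = domain
        domain.drivers.append(self)
        self.hw_regs = {"ctrl": 0}     # stands in for the module's registers
        self.saved = None
        self.off_count_at_save = None
        self.restores = 0

    def release_clocks(self):
        # Early save: always snapshot context before the clocks go away.
        self.saved = dict(self.hw_regs)
        self.off_count_at_save = self.domain.off_count

    def request_clocks(self):
        # Restore on demand: only if the domain actually went OFF after the save.
        if self.saved is not None and self.domain.off_count != self.off_count_at_save:
            self.hw_regs = dict(self.saved)
            self.restores += 1


dom = PowerDomain()
drv = Driver(dom)
drv.request_clocks()
drv.hw_regs["ctrl"] = 0x5
drv.release_clocks()      # context saved, domain never went OFF
drv.request_clocks()      # nothing to restore
drv.release_clocks()
dom.go_off()              # register contents are lost here
drv.request_clocks()      # context is restored
print(drv.hw_regs, drv.restores)
```

Skipping the restore when the domain only reached RET is what makes this method attractive for drivers whose restore is expensive but whose save is cheap.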
How to Prevent Domain Power OFF?
• Drivers prevent OFF mode by specifying latency constraints.
– Power domain latency constraint can be used to control the
transitions (OFF/RET) of a particular power domain
• In some cases a driver does not know when to express a power domain
latency constraint, either because it is not aware of the current system
level use case or because it does not know when its API will be called
next by an upper layer stack.
• For such cases, an inactivity timer is provided (for CORE and PER
domains) that will delay the transition of a power domain to OFF
– Interrupt latency constraint will be used to prevent a system wide low
power state
• Drivers can also prevent OFF mode by keeping their clocks
active (in which case OFF is not attempted).
Debugging
• To check the count of MPU/CORE low power transitions in CPUIdle
– cat /proc/pm_prepwstst
• To check if voltage transitions are happening in chip RET/OFF cases
– Confirm that LED D5 on OMAP3430 SDP is blinking
• To confirm frequency change during DVFS
– Observe "BogoMIPS" by executing
cat /proc/cpuinfo
• To check clocks active in the system
– cat /proc/omap_clocks
• To modify the FB inactivity timeout value
– echo -n <timeout_value> > /sys/power/fb_timeout_value
• To disable the framebuffer inactivity timer
– echo -n 0 > /sys/power/fb_timeout_value
• To disable hw supervised clock domain transitions in menuconfig
– Power Management Options -> Omap power management options ->
OMAP3430 enable H/W supervised transition for clock domains
Debugging (contd.)
• To disable MPU/CORE OFF transitions in menuconfig
– To disable CORE OFF
• Go to Power Management Options -> OMAP power management options
• Unselect the option “Enable CORE OFF in cpuidle”
– To disable MPU OFF:
• Go to Power Management Options -> OMAP power management options
• Unselect the option “Enable MPU OFF in suspend/resume and cpuidle”
• To enable CPUFreq with ondemand governor
– In menuconfig change the following
• Enable CPU Frequency scaling -> CPU Frequency scaling
• Enable CPU Frequency scaling -> CPU Frequency scaling -> ‘ondemand’ cpufreq policy
governor
• Change Power Management options -> OMAP Power management options ->Default
VDD1 OPP Level to MPU-125/DSP-90MHz
• Change Power Management options -> OMAP Power management options ->Default
VDD2 OPP level to L3 83 MHz
– After bootup, execute the following command. This line can also be added to the
/etc/init.d/rcS file
• echo "ondemand" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor