Transcript Slide 1

PM Requirements for 2006
Overview and highlights
Mark Gross [email protected]
April 13th, 2006
Introduction
• I am the chair of the CELF PM working group.
• CELF is a collection of companies involved with Linux-based CE products
that want to help Linux be more useful for CE applications.
• The current requirements have the following goals:
– Represent CE needs to the linux-pm community
– Influence new development
• The document has input from ARM, Intel, MontaVista, Motorola,
Nokia, Philips, Sony and some input from the linux-pm list.
• http://tree.celinuxforum.org/CelfPubWiki/CELFPmRequirements2006?action=AttachFile&do=get&target=CELF_PM_requierments2006.pdf
– Linked to at the top of the CELF public PM Wiki
http://tree.celinuxforum.org/CelfPubWiki/CELFPmRequirements2006
Next steps for CELF PM WG
• Take input from PM-Summit
• Coordinate with existing efforts supporting items
in the requirement document and see what we can
do to help
• Pick up some of the new ideas and implement /
submit them to the upstream kernel
• Grow and maintain an open dialog with the linux-pm
community.
Requirements document overview
• The PDF requirements document is derived from the
public Wiki page.
– It includes a subset of the Wiki items that the CELF PMWG feels
need attention or look interesting.
• The requirements are partitioned into the following
categories: Interface, Platform Throttling, Process / OS
Throttling, Lower power kernel processing, Sleep state
support, System Load Prediction, and Measurement and
Benchmarking.
• Some of the requirements are calls for investigations and
more data, e.g. System Load Prediction and
Measurement and Benchmarking.
Highest priority items for CELF
• The rest of this presentation will be an open
discussion and walk through of the high
priority items called out in the document.
• I’ll just keep going as time permits ;)
– Or, we can blow through it all as quickly as
possible…
Tick-less idle
• CE devices really need this badly.
• CE platforms issue special instructions to put
the SOC power domains into low power modes.
Every timer interrupt then wastes a lot of
power waking these domains up just to increment
jiffies.
– Think cell phone waiting for a call or for the user to press a
button. The thing is very idle and timer ticks prevent use
of SOC power saving features, which burns battery.
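A rough sketch of the arithmetic behind this slide, written in Python for brevity rather than as kernel code. The tick rate (250 per second) and the pending timer expiry times are made-up assumptions; the point is only that a periodic tick wakes the SOC on every tick, while a tick-less idle wakes it only when a timer is actually due.

```python
def periodic_wakeups(idle_seconds, hz):
    """Wakeups caused by a fixed-rate tick during an idle window."""
    return int(idle_seconds * hz)


def tickless_wakeups(idle_seconds, timer_expiries):
    """Wakeups with the tick stopped: one per timer due inside the window."""
    return sum(1 for t in timer_expiries if 0 < t <= idle_seconds)


idle_window_s = 10.0              # assumed: phone sits idle for 10 seconds
tick_hz = 250                     # assumed tick rate
pending_timers_s = [2.5, 7.0, 30.0]  # assumed expiry times, seconds from now

print("periodic tick wakeups:", periodic_wakeups(idle_window_s, tick_hz))
print("tick-less wakeups:    ", tickless_wakeups(idle_window_s, pending_timers_s))
```

Under these assumed numbers the periodic tick wakes the power domains thousands of times to do nothing but count, while the tick-less case wakes them only for the two timers that expire inside the window.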
Good enough interface for
controlling PM knobs
• Today there is no arch-independent / common method
for controlling anything other than core frequency.
– I believe this is because of the implicit influence of ACPI on
assumptions made by the CPUFREQ infrastructure.
• CE archs are constantly rolling their own platform control
drivers, doing some custom thing or using an OSV-only
solution (DPM)
– This is sub-optimal.
– The CE-platform developer has no choice today: to do
power management they need to use non-mainline solutions.
• How can we design a platform independent interface to
sometimes arch specific controls?
– Do we provide some extensibility to the cpufreq_driver structure?
Design Goals for PM frameworks
1. Enable policy drivers / interfaces for setting platform dependent
control variables.
2. Extensible WRT platform specific control variables.
3. Policy drivers need to be “arch-independent” as far as the kernel
build is concerned, but:
• Policy drivers that control platform specific parameters not available at
load time should do no harm.
• Perhaps use compile time and kbuild to control the policy driver
options that are built to avoid building such drivers?
4. No implicit constraints on policy algorithms or logic.
5. Keep policy implementation in user space, or policy drivers.
6. Platform constraints on control variables are enforced by arch code.
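One way to picture goals 1, 3 and 6 together, as a toy Python model and not a proposed kernel interface. Every name here (ArchControls, simple_policy, core_mhz, bus_mhz) and every limit is hypothetical. The policy only names control variables; the arch side ignores variables the platform does not have and clamps the rest to its own limits.

```python
class ArchControls:
    """Hypothetical arch-side table of control variables and their limits."""

    def __init__(self, limits):
        # limits: {variable name: (min, max)}; only variables this platform has.
        self._limits = dict(limits)
        self.values = {}

    def set(self, name, value):
        """Apply a policy request; return the value actually set, or None."""
        if name not in self._limits:
            return None  # not available on this platform: ignore, do no harm
        low, high = self._limits[name]
        self.values[name] = max(low, min(high, value))  # arch enforces limits
        return self.values[name]


def simple_policy(controls, on_battery):
    """Hypothetical arch-independent policy: it only names variables."""
    controls.set("core_mhz", 100 if on_battery else 1000)
    controls.set("bus_mhz", 50 if on_battery else 133)


# Made-up platform: it has a core clock knob but no bus clock knob.
platform = ArchControls({"core_mhz": (104, 416)})
simple_policy(platform, on_battery=False)
print(platform.values)
```

The same policy runs unchanged on a platform without a bus knob, and its out-of-range core request is clamped by the platform table rather than by the policy.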
Struct cpufreq_driver needs to
export core Voltage
• Most CE platforms can change both frequency and
voltage, i.e. K(OS_state, freq, volt) → (freq, volt)
• CPUFREQ implicitly assumes that voltage can be
computed from frequency, i.e. K(OS_state, freq) → freq
– This interface constraint locks out entire classes of
power management opportunities for CE products
without using custom patches.
– This interface constraint is likely the reason for the
limited numbers of CPUFREQ governor options today.
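A small Python illustration of a policy of the form K(OS_state, freq, volt) → (freq, volt). The operating point table, the OS state names and the function names are all assumptions made up for this sketch. The power estimate uses the usual first-order rule that dynamic power scales with frequency times voltage squared.

```python
# Hypothetical (MHz, volts) operating points; note two entries share 200 MHz.
OPERATING_POINTS = [(100, 0.9), (200, 1.0), (200, 1.1), (400, 1.3)]


def relative_power(freq_mhz, volt):
    """First-order dynamic power estimate: proportional to f * V^2."""
    return freq_mhz * volt ** 2


def k(os_state, freq_mhz, volt):
    """Toy policy K(OS_state, freq, volt) -> (freq, volt)."""
    if os_state == "idle":
        # Cheapest point in the table.
        return min(OPERATING_POINTS, key=lambda p: relative_power(*p))
    if os_state == "busy":
        # Highest frequency, and the lowest voltage offered at that frequency.
        return max(OPERATING_POINTS, key=lambda p: (p[0], -p[1]))
    return (freq_mhz, volt)  # any other state: leave the pair unchanged


print(k("idle", 400, 1.3))
print(k("busy", 100, 0.9))
```

With a frequency-only interface, the choice between the two 200 MHz entries would have to be made somewhere below the policy, which is the constraint this slide is pointing at.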
Reducing timer tick overhead
• For CE devices much of the default kernel
periodic processing is not helpful and burns
power.
• Efforts to quantify or measure the work done on
timer ticks in a running system would be very
helpful.
• Efforts to trim down the work done on timers for
CE platforms are also of high value.
High latency Idle state machine
• CE platforms tend to have multiple low power idle states, where different
components on the SOC die are powered off or put in an ultra-low power state.
• The problem with these ultra-low power states is that they have significant
latency.
• The problem is not simply choosing which idle function to call from the
scheduler. It’s doing so in a sane manner without hosing the system.
• Consider having a set of 5 idle functions you could call, each one with a
different latency, sorted in order of increasing latency: “i1, i2, … i5”, where i5
has a 3 sec latency.
– How could such idle functions be used without sacrificing the usability of the
system?
– You can’t simply call i5 from idle every time. Rather, you would likely set up a
one-shot call to i5, after which the next call to idle needs to consider the
lower latency idle functions first.
– Further, how could one export the idle function policy controls to user space?
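A minimal Python sketch of the one-shot idea, only to make the questions above concrete. The class and method names are hypothetical, and the latencies for i1 to i4 are invented; only the 3 second latency for i5 comes from the slide. The selector picks the deepest state whose latency fits the current tolerance, but it uses the deepest state only when it has been explicitly armed, and arming lasts for one call.

```python
# (name, wake-up latency in seconds), sorted by increasing latency.
# Only the 3 s figure for i5 is from the slide; the rest are assumed.
IDLE_STATES = [("i1", 0.0001), ("i2", 0.001), ("i3", 0.05), ("i4", 0.5), ("i5", 3.0)]


class IdleSelector:
    """Hypothetical selector: deepest state is one-shot and must be armed."""

    def __init__(self, states):
        self.states = sorted(states, key=lambda s: s[1])
        self._deepest_armed = False

    def arm_deepest(self):
        """Allow the deepest state for the next idle call only."""
        self._deepest_armed = True

    def pick(self, max_latency_s):
        """Return the name of the idle state to enter."""
        allowed = [s for s in self.states if s[1] <= max_latency_s]
        if not allowed:
            return self.states[0][0]  # nothing fits: use the shallowest state
        choice = allowed[-1]
        if choice == self.states[-1]:
            if self._deepest_armed:
                self._deepest_armed = False  # one-shot: disarm after use
            elif len(allowed) > 1:
                choice = allowed[-2]  # not armed: next-deepest state instead
        return choice[0]


selector = IdleSelector(IDLE_STATES)
print(selector.pick(0.01))   # tight tolerance
print(selector.pick(5.0))    # loose tolerance, deepest not armed
selector.arm_deepest()
print(selector.pick(5.0))    # armed: deepest state, once
print(selector.pick(5.0))    # back to lower latency states
```

Who gets to call something like arm_deepest(), and how the latency tolerance reaches the kernel from user space, are exactly the open questions on this slide.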
CPUFREQ Governor “limiting”
• Using some CPUFREQ governors can lead to issues with media
playback.
• CE devices would like to see maximum platform throttling subject to
the constraint that real-time playback of media does not drop frames.
• There needs to be a mechanism for implementing this type of
constraint on the platform throttling.
• One idea to implement this capability is to provide a deadline
governor that takes a periodic deadline and a regular heartbeat as input from
an mplayer type of application.
– As the application finishes with a frame of compressed media it heartbeats
the governor.
– If the heartbeat comes too close to the deadline, then throttle up the
platform, otherwise throttle back.
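A toy Python model of the deadline governor idea, not an existing CPUFREQ governor. The class name, the frequency steps, the 40 ms frame period and the 10 ms safety margin are all assumptions. Each heartbeat is compared against the deadline for that frame: too little slack steps the frequency up, otherwise it steps down.

```python
class DeadlineGovernor:
    """Hypothetical governor driven by per-frame heartbeats."""

    def __init__(self, freqs_mhz, period_s, margin_s, start_s=0.0):
        self.freqs = sorted(freqs_mhz)
        self.index = len(self.freqs) - 1      # start at full speed
        self.period_s = period_s
        self.margin_s = margin_s
        self.next_deadline_s = start_s + period_s

    def heartbeat(self, now_s):
        """Application finished a frame at now_s; return the new frequency."""
        slack = self.next_deadline_s - now_s
        self.next_deadline_s += self.period_s  # deadline for the next frame
        if slack < self.margin_s:
            self.index = min(self.index + 1, len(self.freqs) - 1)  # throttle up
        else:
            self.index = max(self.index - 1, 0)                    # throttle back
        return self.freqs[self.index]


# Assumed numbers: 25 frames per second, 10 ms safety margin.
gov = DeadlineGovernor([100, 200, 400], period_s=0.04, margin_s=0.01)
for finished_at in (0.010, 0.050, 0.115, 0.130):
    print(finished_at, gov.heartbeat(finished_at))
```

Stepping one level per heartbeat is a deliberately simple choice; a real governor would likely need hysteresis so that playback does not oscillate between two frequencies.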
BACK UP
in case there is time for more
• The following are items worth further
investigation.
Measurement and Benchmarking
• We are missing well defined / published processes
and techniques for measuring the effectiveness of
PM techniques. We could use some type of
project to provide these.
• We need work load and benchmark scenarios for
evaluating and driving PM feature development.
– “Coffee shop” workload scenarios
– Airplane workload scenarios
– Cell phone workload scenarios
– PDA workload scenarios
– Set-top box workload scenarios
PM / scheduler coupling
• Throttling work loads is a reasonable approach to
power management.
• Many user scenarios call for closer interaction
between the scheduler and power management.
– Running cron jobs while on battery isn’t cool.
– Asymmetric suspend / resume support for selected
tasks.
• Think of running a large cross compile or transcoding mp3s to
oggs while tethered: suspend the system, go to the coffee shop,
use the system for a while without burning battery on the
background processing, and have the processing continue
seamlessly when re-tethered.
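The coffee shop scenario reduces to a very small rule, sketched here in Python. The Task type, the background flag and the task names are hypothetical; a real implementation would live in or next to the scheduler. Tasks marked as background are simply not runnable while on battery and become runnable again when external power returns.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    background: bool = False  # hypothetical flag: defer this work on battery


def runnable(tasks, on_battery):
    """Names of tasks allowed to run; background work waits for external power."""
    return [t.name for t in tasks if not (on_battery and t.background)]


tasks = [Task("ui"), Task("media_player"), Task("cross_compile", background=True)]
print("on battery:", runnable(tasks, on_battery=True))
print("tethered:  ", runnable(tasks, on_battery=False))
```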
Suspend resume latency and VM
thrash
• Today’s suspend / resume processing is not very
predictable in terms of the time it takes for the
system to return to a useable state, capable of
playing back media.
• One problem is that the predictability is very poor.
• Another problem is that even after the UI is up and mostly
responsive, the ability to play back media is
delayed by the thrashing of the VM refreshing
pages that were freed on the suspend operation.
Need to avoid implicit ACPI
assumptions in API’s
• CPUFREQ is guilty of this in that it assumes core
voltage is a function of frequency, which is not
true for most platforms without ACPI.
• It’s been getting better. Swsusp no longer uses
“echo s4 > /proc/acpi/sleep”
• Architecture maintainers need to export more
platform throttling interfaces before this problem
sorts itself out.