Power Management - Welcome to iLab!
Download
Report
Transcript Power Management - Welcome to iLab!
Power Management
2/11/2004
1
Outline:
• Introduction to the topic
• Part 1: Vertigo
–
–
–
–
–
–
How can power consumption by a processor be reduced?
What is DVS?
Performance setting algorithms
Policy stack
Evaluation
Conclusion
• Part 2: Cooperative I/O
– Components of Cooperative I/O
– Low power modes
– Cooperative File operations
– Cooperative Updates & DDT
– Prototype Implementation
– Experiments & Results
– Conclusion
2/11/2004
2
2/11/2004
3
What was some of the common things among these
devices you just saw?
2/11/2004
4
Power Management
The goal of power management is to combine
high performance with low power
consumption.
2/11/2004
5
How can we better manage power ?
• Develop longer lasting batteries.
• Reduce the power consumed by processor/CPU
• Reduce the power consumed by the network
interface and user interface and during I/O.
• Application driven power management which
involves developing energy aware applications.
2/11/2004
6
Part 1: Vertigo
2/11/2004
7
Reduce the power consumed by the processor
3 possible solutions to the problem
– By taking advantage of the highly variable performance
requirements of the application.
– By Dynamic Voltage Scaling(DVS) involves reducing the clock
frequency and corresponding operating voltage of the CPU.
– By reducing the static power consumption, which results from
leakage current in CMOS devices. Using techniques such as
Adaptive reverse Body Biasing(ABB).
2/11/2004
8
• DVS(Dynamic Voltage Scaling) exploits the fact that the peak
frequency of a processor implemented in CMOS is proportional to the
supply voltage, while the amount of dynamic energy required for a
given workload is proportional to the square of the processor’s supply
voltage.
– In simple “running the processor slower means that the voltage level can
be lowered, yielding a quadratic reduction in energy consumption, at the
cost of increased run-time”
– So the key is to reduce the processor’s performance level so as to be able
to exactly meet the software deadlines not to run the processor at its peak
level at all times.
• Similarly, static power consumption due to leakage can be
significantly reduced if processor does not operate at its peak
performance level.
2/11/2004
9
Dynamic Voltage Scaling
• Simple approach to do voltage scaling is based on the
usage model(not on the workload). This approach does not
need any OS support.
• Another approach is to build the power management policy
into the processor’s firmware. This approach also does not
require any modifications to the OS.
• The approach used in Vertigo is to build power
management policy as hooks to the already existing OS.
This helps it to get access to a richer set of data for making
performance perdictions.
2/11/2004
10
Vertigo[1]
Design objectives of the Vertigo system:
• To be minimally intrusive into the host operating system.
So as to make it autonomous from the other units of the
kernel.
• To automatically set the processor performance, without
any application specific involvement.
• To be able to run it in passive mode in order to monitor the
system.
2/11/2004
11
Vertigo Architecture
• Vertigo is implemented as a set
of kernel patches that hook into
the Linux kernel to monitor the
program execution and control
the speed and voltage levels of
the processor.
• Vertigo co-exists with the
existing scheduler, system calls
and power manager.
• It uses a composition of
performance setting algorithms,
all specializing in diffirent
kinds of run-time situations.
2/11/2004
12
Performance Setting Algorithms
They can be broadly divided into two categories
1. Those that use information about task deadlines in realtime kernels to guide performance-setting decisions of the
processor.
2. Those that seek to derive deadlines automatically by either
monitoring the past utilization of the processor(interval
based techniques) or based on semantic task and event
classification.
Note: Vertigo’s performance setting algorithms fall into the 2nd category.
2/11/2004
13
Vertigo’s Performance Setting Algorithms
• The multiple performane setting algorithms are co-ordinated to
find the best estimate for the necessary performance level.
• Hence there is a decision hierarchy, were the top level
algorithms have a right to override the lower levels.
• Vertigo has three levels of decision hierarchy
1. Top level - This algorithm quantifies the performance requirements
of an interactive application.
2. Middle level- An application specific layer, where DVS aware
applications can submit their information.
3. Bottom level- The algorithm tries to estimate future utilization
based on past information.
Note: The focus of the paper is on the top level and bottom level algorithms.
2/11/2004
14
How to quantify Work?
The full speed equivalent(fse) work estimate is computed by
Workfse= ni=1 tipi
(Eq 1)
where i refers to one of the n different performance levels during a given interval with the
corresponding amount of non-idle time spent at that performance level(ti) in seconds
and frequencies(pi) specified in fraction of peak performance
The equation basically states that the workload’s runtime is linearly related to the inverse of
the processor’s performance level.
2/11/2004
15
Perspective-based Algorithm
• It is the algorithm used at the lowest level of the policy stack.
• It is called perspective-based, since it computes the performance
predictions from the perspective of each task and uses the combined
result to control the performance setting of the processor.
• Vertigo computes each task’s observed period for recomputing pertask exponentially decaying averages of the work done during the
task’s period and of its estimated deadlines.
• Performance level is got by
Perf= WorkEst/ Deadline
(Eq 4)
• To find the task’s period, the algorithm tracks the time from when the
task starts executing, through points when it is preemepted and
eventually runs out of work, until the next time it is rescheduled.
2/11/2004
16
Task Period
At point c the Workfse is computed for the period between a and c.
2/11/2004
17
WorkEst and Deadline
WorkEstnew= (k* WorkEstold + Workfse)/ (k+1)
(Eq 2)
Deadlinenew= (k*Deadlineold+Workfse+Idle)/(k+1)
(Eq 3)
Note: k is a constant whose value is choosen as k=3.
2/11/2004
18
Advantage: By keeping track of work and deadline predictions seperately
the performance predictions are weighted over the length of the
interval over which the work estimates are measured.
Disadvantage: If a new non-interactive, CPU-bound task that utilizes the
CPU without being preempted, there is significant latency incured in
responding to the load.
This is overcome by placing a limit over the non-preempted duration over
which the work estimate is computed.
They have choosen a value of 100ms after which is a task is not
preempted the work estimate is recomputed.
2/11/2004
19
Top level Algorithm(1)
• The middle level- DVS-Aware Applications can submit their
requirements to this layer. But it has not been currently implemented in
Vertigo.
• The top level- The algorithm automatically quantifies the performance
requirements of interactive applications and ensures that software
deadlines are met, so that user experience does not suffer.
• An interactive episode is started by the user through a GUI event.
• This event gets handled by the GUI controller and the appropriate task
handler.
• Therefore by monitoring the appropriate system calls, Vertigo can
detect interactive episodes .
• The GUI controller, the task handler and all other tasks initiated by
these are marked as interactive tasks.
• An interactive episode ends when there are no more interactive tasks to
perform.
2/11/2004
20
Top level Algorithm(2)
2/11/2004
•
Skip threshold: It is set to a value of 5ms.
•
Panic threshold: If the interactive episode
does not end by this point the processor
performance is increased to the
maximum. It is set to 50ms.
•
Perception threshold: If an episode is
longer than both the skip and the
perception threshold then the
performance requirement is set to 100%
•
Perception threshold describes a cut-off
point under which events appear to
happen instantaneously for the user.
•
The perception threshold depends upon
the task and user but 50ms is commonly
used.
21
Top level Algorithm(2)
• The predictions for the interactive episodes are as exponentially
decaying averages of the correct settings of the past interactive
episodes.
• If the panic threshold was reached during an episode, the moving
threshold is rescaled so that the last performance gets incoporated with
a higher weight. (k=1 instead of k=3)
• The performance requirements of the interactive episodes shorter than
the perception threshold is calculated as
Perf= Workfse/ Perception Threshold
2/11/2004
(Eq 5)
22
Policy Stack and Work Tracking in Vertigo
• The policy stack keeps track of the various commands and
performance level requests from each policy and uses this information
to combine them into a singel global performance level decision when
needed.
• The policies can be triggered by any event in the system and they may
submit new performance requests at any time.
• The algorithms use processor utilization history to compute
performance level, hence they need to be able to track the work done
over an interval.
• This is done is by the policies in Vertigo by making use of structure
called “Vertigo_work”.
• This not only simplifies the policy code but also simplifies the porting
of Vertigo.
2/11/2004
23
Evaluation
• Sony Vaio PCG-CIVN notebook computer
• Transmeta Crusoe 5600 processor running at 300MHz to
600MHz with 100MHz steps.
• This particular model was used because it had its own
power management done at the firmware level which could
be turned on or off. Thus it was useful in evaluating
Vertigo.
• LongRun is the power management system of Crusoe.
• The OS used Mandrake 7.2 with modified Linux 2.4.4ac18 kernel
2/11/2004
24
Table 3: Application level statics about the plaympeg benchmark
2/11/2004
25
2/11/2004
26
Figure 8: Fraction of the time spent in different interactive workloads
2/11/2004
27
Conclusion
• Plus
– The idea for using more than one performance setting algorithm in a
policy stack
– The Linux kernel itself wasn’t modified.
– The applications need not be energy-aware
– The system showed good results in most situations
• Minus
– It would have been interesting to see the actual energy savings.
– The amount of energy saved by optimising the energy consumption of the
processor alone may not be significant
– The applications need to be energy-aware to make use of the middle level
of the performance stack.
• The ideas used in Vertigo can be used along with other power
management methods for overall improvement.
2/11/2004
28
Part 2
Cooperative I/O- A Novel I/O
Semantics for Energy Aware
Applications
2/11/2004
29
Cooperative I/O- A Novel I/O Semantics for Energy
Aware Applications[2]
• Integrates the application layer into the dynamic power management
of devices.
• Energy-aware applications make use of the new interface operating
system to do cooperative I/O.
• The basic idea is for the OS to batch deferable requests in order to
create long idle periods during which switching to low-power mode
pays off. Ex: Auto-save function of the text editor.
• The proposed model consists of three major parts
– New co-operative file operations that have time-out and cancel flags as
additional parameters.
– The update policy of the OS is re-designed to save energy.
– The OS controls the hard disk modes by using a simple algorithm called
“Device-dependent time-out” or DDT.
2/11/2004
30
Components of Coop-I/O
2/11/2004
31
Low power modes(1)
• The OS may reduce the power consumption of a device by switching
between the device’s operation modes.
• The ATA standard, defines 4 operation modes for the hard disk
– Active mode: The hard disk is reading,writing,seeking or spinning
– Idle mode: The disk is rotating and the interface is active but the head is in
the parking position.
– Standby mode: The disk spindle motor is off, but the interface is still alive
– Sleep mode: Both the spindle motor and the interface are off. To reactivate
we need the reset command.
• The device is “shut down” if it is in the Standby or the Sleep mode.
2/11/2004
32
Low power modes(2)
• A non-negligible amount of time and power is required to enter and
leave these modes.
• Break-even time is defined as the minimum interval between 2 disk
operations for which the switching will pay-off.
• The break-even time is dependent on the characteristics of the disk.
• Stand-by time: If the interval of idleness is longer than the break-even
time then it is called the stand-by period.
2/11/2004
33
Cooperative File Operations(1)
• These operations provide a way for applications to exert influence on
the way the hard-disk operates.
• Three cooperative variants to the commonly used file operations:
– open_coop()
– read_coop()
– write_coop()
• In addition to the standard parameters these operations have 2
additional parameters, the time-out and the cancel flag.
• The time out value specifies when the operation is initiated at the latest
and the cancel flag specifies if or not the particular operation can be
aborted.
• In case of legacy applications their open(), read() and write() map on
to the corresponding cooperative variants but with time-out and cancel
flag set to zero and inactive respectively.
2/11/2004
34
Cooperative Read operation
• Copies of the disk blocks are kept in a cache in main memory, called
the block buffers, to reduce the number of slow disk opertions.
• Modified block buffers are marked dirty and the “dirty buffer life
span” determines the time when these buffers must be written back to
the disk.
• Cooperative read:
– Checks to see if the block is in the cache, if yes, retrieves it.
– If not, checks if the disk is in the active or Idle mode, if yes, then the block
is retrieved immediately.
– Else, it waits until the disk spins again or the time-out has elapsed.
– If the time-out has elapsed then it checks the cancel flag to determine
whether to restart the disk or abort the read operation.
2/11/2004
35
Cooperative Write operation
• If the block is cached, the update task defers write operation until the
dirty buffer life span has elapsed.
• If the block isn’t cached, then the write may induce a read operation,
so that the block can be read into the main memory before it can be
modified.
• The situation gets more complicated when the write operation needs to
read an uncached block after several modifications to the cached
block.
• The problem overcome using the early commit/abort strategy
2/11/2004
36
Early Commit/Abort Strategy(1)
• Decides to commit or abort as soon as the first modification to a block
buffer is going to take place.
• Assume, we want to modify the buffer and the drive is in standby mode,
then we can distinguish 3 situations:
– The drive is activated by another request before the time-out for our request is
reached.
– If at the end of our time-out there are no other dirty buffers to the drive,
depending on the cancel flag we can either commit or abort the write operation.
– If there is another dirty buffer for the drive. In which case we wait until old
requests time-out elapses and the drive is started.
• A write to a block is delayed only as long as the drive is in standby mode
and there are no dirty buffers for that drive.
2/11/2004
37
Early Commit/Abort Strategy(2)
Therefore since a write operations first buffer modification involves
committing the operation, a write can be committed even if the disk is
not running.
Also an abortable write will not be aborted after the time-out even if the
disk is in standby mode. This is case when a read follows a commited
write
–
–
–
–
–
–
2/11/2004
The disk is in standby and there exists dirty buffers for the disk
The write operations has to modify several disk blocks.
The write operation is commited after the first modification.
If a subsequent block is not cached, it has to be read from the disk.
This action will be deferred until time-out has elapsed.
We have to spin the disk then to read the block as the complete operation
is already committed.
38
Cooperative Open Operation
• An open operation can cause a read or a write operation if the file has
to be created or truncated.
• For the open_coop(), the read and write induced by it are mapped to
their corresponding cooperative counterparts.
Usage of the cooperative operations:
• Since the coop operations can be delayed the I/O must be handled by a
separate thread in the program so that data is cached by it for the main
thread.
• The main thread thus reads and writes data to the buffer as if there is
no cooperative I/O.
2/11/2004
39
Cooperative Updates
• Cooperative Update tries to optimize the caching mechanism with
respect to power consumption.
• The update process usually takes place when one of the following
conditions are met
–
–
–
–
An explicit sync() command forces a write back of the buffers to disk
The dirty buffer life span has elapsed
A certain percentage of block buffers are dirty
System needs main memory and has to do a write back to reclaim memory
• The problem with the above policy is 2 updates can be spaced with a
time period that is shorter than the break-even time.
• Solution: Try to batch write requests so that device can spend more
time in low power modes.
• The policy is called “Drive specific cooperative updates”
2/11/2004
40
Drive specific cooperative updates
It consists of 4 strategies:
• Write back all buffers instead of only the oldest ones
• Update cooperatively: The OS tries to combine other disk access such
as read requests and writing back dirty buffers whose “dirty buffer life
span” hasn’t yet elapsed.
• Update each drive seperately: It may help to increase the update
interval for a singel drive, also balance the I/O load, yet the same time
it will not compromise on file system consistency.
– But this would require us to maintain the oldest dirty buffer for each
individual drive along with a list of all the dirty buffers that belong to that
drive.
• Update on shutdown: If the drive is planning to shutdown write back
all the dirty buffers belonging to that drive.
2/11/2004
41
Device dependent time-out(DDT) policy
The policy states:
Shut down the hard disk if
tla + tbe =< t
where
t : The current time
tla : The time of the last hard disk access
tbe : The break even time
Thus the disk drive has to keep track of tla and tbe
2/11/2004
42
Prototype implementation
• They have implemented the Coop-I/O concept in the Linux kernel
version 2.4.10.
• They have also made the following modifications
– Cooperative file operations: In additions to the already mentioned file
operations a function, wait_for_drive() is provided to block them when the
drive is not active.
– The VFS and Ext2 file system have been modified to support the drivespecific cooperative update policy: By mapping device numbers to drives,
also now the need for cooperative updates is checked every time a one
reads or writes to a drive.
– The IDE driver has been enhanced by power mode control for disk
drivers, which includes the DDT algorithm. DDT algorithm is
implemented as a timer based function that is called every 1sec.
2/11/2004
43
Experiments and Results
• Experiments were run on a
– personal computer running Linux kernel
– with a IBM Travelstar 15GN hard disk.
– Power measurements where taken using 4 analog-to-digital convertors
• Testing a cooperative Audio player
– the application was modified to make it energy aware
• Two audio files were played, with the same length but different
compression ratios.
– The two audio files were used, both alone and also companied by a mail
reader.
• All the tests were run for 534 seconds.
2/11/2004
44
The system was tested under 4 strategies:
• Cooperative: Use DDT and new buffer cache and update mechanism.
Read_coop() has a delay time equivalent to the play time for one semibuffer.
• ECU(Energy aware Caching and Update) : Same as above but instead
of read_coop() function a standard read() was used.
• DDT only: Use DDT with the original buffer cache and update
mechanism
• None: Do not use any power saving mechanism at all
• Uncooperative Oracle policy : Values for this are derived from running
the unmodified Linux. This represents the theoritical lower bound.
2/11/2004
45
Figure 4 & Table 1- compares the 4 strategies
2/11/2004
46
Figure 5- compares the 4 energy saving strategies(playing “Toccata”
without mail)
2/11/2004
47
Figure 6- compares interaction between 2 independent tasks,
“Toccata” and Mail cooperative.
2/11/2004
48
Reads and Writes with varying average period length
•
•
•
Multiple tasks periodically read or write data
The energy consumption was measured over 1000seconds
The read and write processes have varying period lengths and idle times
2/11/2004
49
Fig 9: Varying the number of cooperative processes
The energy consumption steadily declines with the increasing proportion of
cooperative processes.
2/11/2004
50
Better way to show the results
2/11/2004
51
Conclusion
• Plus
– Addition of even a single cooperative application shows reduction in the
energy consumed by the disk.
• Minus
– The Coop-I/O’s experimental results wasn’t compared against any of the
benchmarks or other already existing low power I/O systems.
– The paper makes a claim of 50% power reduction which is not sufficiently
supported or shown through experiments.
– Requires modification to the user applications, requiring the applications
to be energy aware.
– There are no comparisions between the various strategies with respect to
the amount of time the disk is active, idle and in standby.
– Also the idea of coop-I/O has not been tested on any of the other
commercial avialable hard disks.
2/11/2004
52
Something to think about
Is power management a good idea?
2/11/2004
53
Reference:
[1]. Vertigo: Automatic Performance Setting for Linux, Krisztian Flautner,
Trevor Mudge. Proceeding of the 5th symposium on operating system
design and implementation, 2002.
[2]. Cooperative I/O- A Novel I/O semantics for energy aware application,
Andreas Weissel, Bjorn Beutel, Frank Bellosa. Proceeding of the 5th
symposium on operating system design and implementation, 2002.
[3]. K.Flautner, S.Reinhardt, t.Mudge. Automatic performance setting for
dynamic voltage scaling. Proceedings of the International Conference
on Mobile Computing and Networking, July 2001.
[4]. K.Flautner, S.Reinhardt, t.Mudge and R.Uhlig. Thread level
parallelism and interactive performance if desktop applications.
Proceeding of Conference on architectural support for programming
languages and operating systems. November 2000.
2/11/2004
54