Slides: Scheduling for reduced cpu energy


Scheduling for
Reduced CPU Energy
M. Weiser, B. Welch, A. Demers,
and S. Shenker
Introduction


The Energy Saving of A Typical Laptop Computer
Backlight and display & Disk
Turning off after a period of no use
CPU
Simple power-down-when-idle techniques
Another way?
Opportunities
Dynamically varying chip speed & energy
consumption
Cooperation of the operating system scheduler

When to use full power & when not
CPU Energy

Q = C * V
E = Q * V
Charge = Capacitance * Volts
Energy = Charge * Volts
When charging a capacitor
E = 1/2 C V^2
=> Energy spent charging a gate is
proportional to the square of the voltage
Reducing CPU Energy




Empirically, lower-voltage circuits have longer settling
times
SGS-Thomson 8051 CMOS microcontroller
6 MHz @ 5 V
4.5 MHz @ 3.3 V
3 MHz @ 2.2 V
=> max clock rate is proportional to voltage
The relationship is close to linear for an interesting range
of voltages: 5 V to 2.2 V
Executing the same # cycles at a lower voltage and a
slower clock speed results in a net power savings
Adjust the CPU speed+voltage in response to
scheduling demands
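As a rough check of the "close to linear" claim, the three rated points quoted above can be compared directly. This is a minimal sketch: the function names are made up for illustration, and the energy figure assumes only the E proportional to V^2 relation from the earlier slide.

```python
# Rated (voltage in volts, max clock in MHz) pairs quoted on the slide.
RATED_POINTS = [(5.0, 6.0), (3.3, 4.5), (2.2, 3.0)]

def mhz_per_volt(points):
    """Return the clock-to-voltage ratio for each rated point."""
    return [mhz / volts for volts, mhz in points]

def relative_energy_per_cycle(volts, full_volts=5.0):
    """Energy per cycle relative to full voltage, assuming E ~ V^2."""
    return (volts / full_volts) ** 2

ratios = mhz_per_volt(RATED_POINTS)
for (volts, mhz), ratio in zip(RATED_POINTS, ratios):
    print(f"{volts:.1f} V: {mhz:.1f} MHz, {ratio:.2f} MHz/V, "
          f"energy/cycle x{relative_energy_per_cycle(volts):.2f}")
```

The ratios are similar but not identical across the range, which matches "close to linear" rather than exactly linear.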
An Energy Metric for CPUs

MIPJ: Millions of instructions per joule
= MIPS / watts
Unaffected by changes in clock speed alone
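A small sketch of why MIPJ does not move with clock speed alone. It assumes the common CMOS dynamic power model P = C * V^2 * f, which ignores static leakage. The capacitance and instructions-per-cycle values are hypothetical placeholders, not figures from the slides.

```python
def dynamic_power_watts(capacitance_farads, volts, clock_hz):
    """Assumed CMOS dynamic power model: P = C * V^2 * f."""
    return capacitance_farads * volts ** 2 * clock_hz

def mipj(mips, watts):
    """Millions of instructions per joule = MIPS / watts."""
    return mips / watts

C_EFF = 1e-9            # hypothetical effective switched capacitance (farads)
INSTR_PER_CYCLE = 1.0   # hypothetical: one instruction per clock cycle

def mipj_at(volts, clock_hz):
    """MIPJ for a chip running at the given voltage and clock rate."""
    mips = INSTR_PER_CYCLE * clock_hz / 1e6
    return mipj(mips, dynamic_power_watts(C_EFF, volts, clock_hz))

full = mipj_at(5.0, 6e6)               # full speed, full voltage
slow_same_voltage = mipj_at(5.0, 3e6)  # half clock, same voltage
slow_half_voltage = mipj_at(2.5, 3e6)  # half clock, half voltage

print(full, slow_same_voltage, slow_half_voltage)
```

Under this model, halving the clock halves both MIPS and watts, so MIPJ is unchanged. It only improves when the voltage drops as well.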
Opportunity For Quadratic Energy Savings
As the clock speed is reduced by a factor of n, energy per
cycle can be reduced by a factor of n^2
Three methods to achieve this



Voltage reduction
Reversible logic
Adiabatic switching
An Energy Metric for CPUs cont’d




Voltage Reduction
E/clock is directly proportional to V^2
Lower-voltage, slower-clock chip; less energy per
cycle
Reducing The Energy Consumption







The same # of cycles but lower voltage
Ex: a task with 100ms deadline
Method 1
50ms - full speed;
50ms - idle
Method 2
100ms - half speed at half voltage
Energy consumption, Method 1 : Method 2 = 4 : 1
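The 4:1 figure follows from the V^2 relation, as this short sketch shows. It works in normalized units (full speed = 1, full voltage = 1) and assumes idle time costs no energy. The cycles-per-millisecond constant is a hypothetical normalization.

```python
def task_energy(cycles, relative_voltage):
    """Relative energy: each cycle costs V^2 (full voltage = 1)."""
    return cycles * relative_voltage ** 2

DEADLINE_MS = 100
FULL_SPEED_CYCLES_PER_MS = 1.0   # hypothetical normalization

# Method 1: 50 ms at full speed and full voltage, then 50 ms idle (assumed free).
cycles = 50 * FULL_SPEED_CYCLES_PER_MS
method1 = task_energy(cycles, 1.0)

# Method 2: the same cycles spread over 100 ms at half speed and half voltage.
half_speed_cycles = DEADLINE_MS * 0.5 * FULL_SPEED_CYCLES_PER_MS
method2 = task_energy(half_speed_cycles, 0.5)

print(method1 / method2)
```

Both methods execute the same number of cycles and meet the 100 ms deadline. Only the voltage per cycle differs.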
Approach of This Paper

Energy Saving Technique
The fine grain control of CPU clock speed
Running slower and at reduced voltage
Evaluation
Using trace-driven simulation
Goal


To evaluate the energy savings
To measure the effect of running too slow
Trace Data


From the UNIX scheduler
Time stamp: microsecond
Sleep events: wait on hard, soft events
Hard events: disk wait, page fault
Soft events: keystroke, awaiting network packets
Soft idle can be eliminated by rescheduling
Hard idle is mandated by a wait on a device
Workloads
S/W dev., documentation, e-mail, simulation, ...
Typing, scrolling
Assumptions







Simulation
Soft events belong to idle periods
No reordering of trace data events
Using no energy when idle
Taking no time to switch speeds
Periods longer than 30 seconds with greater than 90% idleness
are not considered
Lower bound to practical speed:
Relative speed <=> Voltage
1.0 <=> 5 V
0.66 <=> 3.3 V
0.44 <=> 2.2 V
0.2 <=> 1.0 V
Scheduling Algorithms

OPT (unbounded-delay perfect-future)






Taking the entire trace
Stretching all the run times to fill all the idle times
Imaginary batch job with perfect knowledge
Impractical & undesirable
Bad response time
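A minimal sketch of the OPT idea: sum all run time and all idle time over the whole trace, then pick the single speed that stretches the work across both. The trace format, function names, and the choice to treat every idle period as usable are assumptions for illustration, not the authors' simulator.

```python
def opt_speed(trace, min_speed=0.44):
    """Single speed that stretches all run time over the whole trace.

    trace: list of (kind, ms) pairs with kind in {"run", "idle"},
    where ms is measured at full speed.
    """
    run = sum(ms for kind, ms in trace if kind == "run")
    idle = sum(ms for kind, ms in trace if kind == "idle")
    if run == 0:
        return min_speed
    speed = run / (run + idle)
    return min(1.0, max(min_speed, speed))   # respect the practical lower bound

def opt_relative_energy(trace, min_speed=0.44):
    """Energy relative to running the same cycles at full speed (E ~ speed^2)."""
    return opt_speed(trace, min_speed) ** 2

trace = [("run", 10), ("idle", 30), ("run", 20), ("idle", 40)]
print(opt_speed(trace, min_speed=0.2), opt_relative_energy(trace, min_speed=0.2))
```

Because work from early in the trace may be finished arbitrarily late, this gives a lower bound on energy while ignoring response time entirely.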
FUTURE (bounded-delay limited-future)




Taking the future trace of a small window
Window sizes: 1 ms to 400 s
Impractical but desirable
Good response time on a window of 10 to 50 ms
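FUTURE can be sketched the same way, but one window at a time: each window gets the slowest speed that still finishes its own work before the window ends. The window tuples and function names are hypothetical, and the sketch assumes only soft idle time can absorb stretched work.

```python
def future_speeds(windows, min_speed=0.44):
    """Per-window speeds for a FUTURE-style policy.

    windows: list of (run_ms, soft_idle_ms) pairs measured at full speed.
    Work is never delayed past the end of its own window.
    """
    speeds = []
    for run_ms, soft_idle_ms in windows:
        if run_ms == 0:
            speeds.append(min_speed)          # nothing to run in this window
            continue
        speed = run_ms / (run_ms + soft_idle_ms)
        speeds.append(min(1.0, max(min_speed, speed)))
    return speeds

def relative_energy(windows, speeds):
    """Energy relative to running every window at full speed (E ~ speed^2)."""
    total_run = sum(run for run, _ in windows)
    if total_run == 0:
        return 0.0
    used = sum(run * s ** 2 for (run, _), s in zip(windows, speeds))
    return used / total_run

windows = [(10, 10), (20, 0), (5, 15)]
speeds = future_speeds(windows, min_speed=0.2)
print(speeds, relative_energy(windows, speeds))
```

A fully busy window forces full speed, so savings come only from windows that contain soft idle time.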
Scheduling Algorithms cont’d

PAST (bounded-delay limited-past)








Looking a fixed window into the past
Assuming the next window will be like the previous one
Examine % busy during the previous interval and adjust
speed for the next interval
Excess cycles can build up if speed (+voltage) is set too
low => Penalty metric
Excess Cycle Penalty
At each interval, count up left over cycles that
accumulated because you ran too slow
Switch to full speed if there were more excess cycles than
idle time in the previous interval
Hard idle (page fault, disk request) cannot be squeezed
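The PAST rules above can be sketched as a small simulation loop. The interval format, the busy thresholds (0.7 and 0.5), and the 0.2 speed step are illustrative assumptions, not values stated on the slides. Only the "switch to full speed when excess cycles exceed idle time" rule and the hard-idle restriction come from the text.

```python
def past_schedule(intervals, min_speed=0.44,
                  busy_high=0.7, busy_low=0.5, step=0.2):
    """Simulate a PAST-style policy over a list of intervals.

    intervals: (run, soft_idle, hard_idle) tuples, each in ms of
    full-speed time observed in the trace for one interval.
    Returns (speeds, excess_log, relative_energy).
    """
    speed, excess = 1.0, 0.0
    speeds, excess_log = [], []
    energy = total_run = 0.0
    for run, soft_idle, hard_idle in intervals:
        speeds.append(speed)
        total_run += run
        demand = run + excess                 # new work plus leftover work
        capacity = speed * (run + soft_idle)  # hard idle cannot be squeezed
        done = min(demand, capacity)
        excess = demand - done                # cycles left over: ran too slow
        excess_log.append(excess)
        energy += done * speed ** 2           # energy per cycle ~ speed^2

        idle = soft_idle + hard_idle
        busy = demand / (demand + idle) if demand + idle > 0 else 0.0
        if excess > idle:
            speed = 1.0                       # penalty rule: go to full speed
        elif busy > busy_high:
            speed += step
        elif busy < busy_low:
            speed -= step
        speed = min(1.0, max(min_speed, speed))
    energy += excess                          # finish leftovers at full speed
    relative = energy / total_run if total_run else 0.0
    return speeds, excess_log, relative

speeds, excess_log, relative = past_schedule(
    [(2, 8, 0), (2, 8, 0), (9, 1, 0), (9, 1, 0)])
print(speeds, excess_log, relative)
```

In this example two quiet intervals pull the speed down, the following busy interval then leaves work unfinished, and the excess-cycle rule pushes the speed back to full.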
Trace Driven Simulation

Trace Points









Sched: context switch away a process
Idle on: enter the idle loop
Idle off: leave idle loop to run a process
Fork: create a new process
Exec: overlay a new process with another program
Exit: process termination
Sleep: wait on an event
Wakeup: notify a sleeping process
Traces


Short runs during specific tasks, editing etc.
Long runs of several hours
Evaluation: The Results of Three Algorithms
Evaluation: Minimal Voltage & The Excess Cycles
Frequency: count of excess-cycle values <= the x-value but > the previous x-value
Excess cycles: time to run unfinished instructions at full speed
Lower minimum voltage => more cases where excess cycles build up
=> they accumulate over longer intervals => the peak extends to the right
Evaluation: Interval Length & Excess Cycles
Peak in excess cycles shifts right as interval length increases
Longer scheduling interval => more excess cycles built up
Evaluation: Different Minimum Voltage Limits
2.2 V is almost as good as 1.0 V
Relative savings for different minimum voltages
Evaluation: Changing The Interval Length
A longer adjustment period results in more savings
Evaluation: Average Excess Cycles [1]
Lower min. voltage => more excess cycles
Longer intervals => accumulate more excess cycles
Energy savings are a function of the interval size
Evaluation: Average Excess Cycles [2]
Discussion & Future Work


Feedback source other than idle time
To classify jobs into
Background, periodic, and foreground
Sched. order: periodic, foreground, background
No Reordering vs. Reordering
Unless there is a large job mix, reordering is not significant.
I/O Wait Model: Hard/Soft
The reasoning seems valid, but it would be good to verify
Conclusions





Preliminary Results On CPU Scheduling To Reduce
CPU Energy Usage
Scheduling jobs at different clock rates.
Trace Driven Simulation
OPT / FUTURE / PAST
PAST with a 50ms window



2.2 V to 5.0 V: this range provides good savings
with moderate penalty
Power savings up to 50% (3.3V), 70% (2.2V)
The Tortoise Is More Efficient Than The Hare.