Slides - NC State University

Critical Power Slope: Understanding the Runtime Effects of Frequency Scaling
Akihiko Miyoshi, Charles Lefurgy, Eric Van Hensbergen, Ram Rajamony, Raj Rajkumar
Motivation

Power management algorithms implicitly assume that lower performance points are more energy efficient than higher points

This paper shows that this assumption is not always valid

It also helps decide which operating points of a processor should be considered by a power management algorithm
Outline

Motivation
• $E_{f_{low}} < E_{f_{high}}$: not always true
• How do we choose which operating points to use?
[Figure: power (Watts) versus time t at two operating points, with the Eactive and Eidle regions marked]
Evaluation of frequency scaling, clock throttling and dynamic voltage scaling on three existing processors
Analytical model: Critical Power Slope
Analysis on voltage scaling systems
Conclusion
Techniques of Power Management

Frequency scaling
• Processor clock is reduced
• Processor consumes less energy at the expense of reduced performance
Clock throttling
• Clock runs at original frequency
• Clock signal is gated/disabled for some cycles at regular intervals
Dynamic voltage scaling
• Reduces power consumed by lowering the operating voltage
• Advantageous because E ∝ V²
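As a rough illustration of why the quadratic dependence matters, the sketch below compares the energy for a fixed amount of work at two supply voltages. The function name and the example voltages are hypothetical, and the model uses nothing beyond the E ∝ V² relationship stated above (leakage and other effects are ignored).

```python
def relative_energy(v_new: float, v_ref: float) -> float:
    """Energy at v_new relative to v_ref for the same work, assuming E is proportional to V^2."""
    if v_new <= 0 or v_ref <= 0:
        raise ValueError("voltages must be positive")
    return (v_new / v_ref) ** 2


# Hypothetical voltages, not taken from any of the measured systems.
ratio = relative_energy(1.2, 1.5)
print(f"relative energy: {ratio:.2f}")
```

Under this assumption, a 20% drop in voltage gives roughly a 36% drop in energy for the same work.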
Linux on Pentium

Dell Inspiron 8000 laptop with an 850 MHz PIII processor and 512 MB of RAM, running Linux 2.4.6
Processor runs at 8 different performance states: 100%, 87.5%, 75%, 62.5%, 50%, 37.5%, 25%, 12.5%
Effect is evaluated by throttling the clock
The following micro benchmarks were considered:
• Access to register
• L1 cache (read)
• L1 cache (write)
• Access to memory (read)
• Access to memory (write)
• Disk read
Micro benchmark performance - Pentium
Power usage in idle mode - Pentium
• Linux scheduler puts the processor into C1 or C2 sleep state
• Idle state power is considered to be a constant
Power measurements at different performance states - Pentium
• Simple benchmark which exercises the CPU while changing the performance state from 100% to 12.5%
• As performance is lowered, system power usage decreases linearly
Energy consumption

Energy required to complete the benchmark = Eactive + Eidle
Compare the energy used to execute the same load over the same time interval at different operating points
• The time interval does not end when the active period ends, since the system is kept on until the next request arrives
Idle time = time to run the benchmark at the lowest performance state − time to run the benchmark at the given operating point
Idle power is known, hence Eidle can be calculated
Eactive + Eidle decreases slightly as the performance state increases
The benchmarks suggest we should run this system at the highest performance state possible
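The accounting described on this slide can be sketched as follows. The helper name and the numbers are illustrative only, not measurements; the fixed interval is taken to be the run time at the lowest performance state, and idle power is treated as constant.

```python
def total_energy(p_active_w, t_active_s, p_idle_w, interval_s):
    """Return Eactive + Eidle in joules over a fixed interval."""
    if t_active_s > interval_s:
        raise ValueError("the work does not fit in the interval")
    e_active = p_active_w * t_active_s
    e_idle = p_idle_w * (interval_s - t_active_s)  # idle time = interval - active time
    return e_active + e_idle


# Illustrative values: the interval equals the run time at the lowest state.
interval = 8.0
e_low = total_energy(15.0, 8.0, 12.0, interval)   # busy for the whole interval
e_high = total_energy(30.0, 1.0, 12.0, interval)  # finishes early, then idles
print(e_low, e_high)
```

With these made-up inputs the faster state draws more power while active but ends up using slightly less energy overall, which is the pattern the slide reports for the Pentium system.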
Linux on PowerPC

PowerPC 405GP microprocessor with 8 KB of D cache, 16 KB of I cache and 32 MB of RAM, running Linux 2.4.0
Frequency of the processor and the processor local bus (PLB) can be changed, directly affecting memory speed
PowerPC: Power measurements
PowerPC: Energy consumption
• Total energy = Eactive + Eidle
• Eactive = Ecpu + ESDRAM + Eother
• By lowering the frequency, the total energy used by the system decreases
• Results are contrary to those of the Pentium-based system
Characterization of the two systems

Bimodal behavior – system will either be in active or idle mode
Performance ∝ frequency
Pidle will be considered constant for all frequencies

Consider CPU intensive workload W, lowest frequency fmin



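At fmin, utilization of the system is 1 and W takes $T_{f_{min}}$ units of time to complete:
$E_{f_{min}} = T_{f_{min}} P_{f_{min}}$   (eq. 1)

At frequency f (f > fmin):
$E_f = T_{f_{min}} \frac{f_{min}}{f} P_f + T_{f_{min}} \left(1 - \frac{f_{min}}{f}\right) P_{idle}$   (eq. 2)
($E_f = E_{active} + E_{idle}$: the first term is Eactive, the second is Eidle)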
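Eq. 1 and eq. 2 translate directly into code. The sketch below is one way to write them; the function and parameter names are assumptions made for illustration, and the sample numbers are invented rather than measured.

```python
def energy_at_fmin(t_fmin, p_fmin):
    """Eq. 1: the system is busy for the whole interval at f_min."""
    return t_fmin * p_fmin


def energy_at_f(t_fmin, f_min, f, p_f, p_idle):
    """Eq. 2: active for T_fmin * f_min / f, idle for the rest of T_fmin."""
    if f < f_min:
        raise ValueError("f must be at least f_min")
    busy_fraction = f_min / f  # follows from performance being proportional to frequency
    e_active = t_fmin * busy_fraction * p_f
    e_idle = t_fmin * (1.0 - busy_fraction) * p_idle
    return e_active + e_idle


# Invented example: 10 s at 100 MHz and 5 W, or 200 MHz at 9 W with 2 W idle.
e_min = energy_at_fmin(10.0, 5.0)
e_f = energy_at_f(10.0, 100.0, 200.0, 9.0, 2.0)
print(e_min, e_f)
```

Setting f equal to f_min makes the idle term vanish, so eq. 2 reduces to eq. 1 as expected.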
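Critical Power Slope
• Power grows linearly with frequency and is constant in the idle state (from the graph):
$P_f = P_{f_{min}} + m (f - f_{min})$
• Substituting $P_f$ in eq. 2:
$E_f = T_{f_{min}} \frac{f_{min}}{f} \left(P_{f_{min}} + m (f - f_{min})\right) + T_{f_{min}} \left(1 - \frac{f_{min}}{f}\right) P_{idle}$   (eq. 3)
• There should be a slope m at which energy usage at all frequencies is equal: the critical power slope $m_{critical}$
• Equating eq. 1 and eq. 3 we get
$m_{critical} = \frac{P_{f_{min}} - P_{idle}}{f_{min}}$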
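The algebra behind the last step can be written out as follows. This is a sketch that uses only the symbols defined on these slides; it assumes f > f_min, so that the common factor (f − f_min) can be cancelled, and it multiplies both sides by f / T_fmin in the third line.

```latex
\begin{align*}
E_f &= E_{f_{min}} \\
T_{f_{min}} \frac{f_{min}}{f}\bigl(P_{f_{min}} + m (f - f_{min})\bigr)
  + T_{f_{min}}\Bigl(1 - \frac{f_{min}}{f}\Bigr) P_{idle}
  &= T_{f_{min}} P_{f_{min}} \\
f_{min} P_{f_{min}} + m f_{min} (f - f_{min}) + (f - f_{min}) P_{idle}
  &= f P_{f_{min}} \\
m f_{min} (f - f_{min}) &= (f - f_{min}) (P_{f_{min}} - P_{idle}) \\
m_{critical} &= \frac{P_{f_{min}} - P_{idle}}{f_{min}}
\end{align*}
```

The result does not depend on f, which is why a single slope makes the energy equal at every frequency.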
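Implications of CPS

If $m < m_{critical}$: energy efficient to run at higher freq. ($E_f < E_{f_{min}}$)
Pentium:
$m_{critical} = \frac{15\,\mathrm{W} - 12\,\mathrm{W}}{848\,\mathrm{MHz} \times 12.5\%} = 0.028$ W/MHz
$m = \frac{30\,\mathrm{W} - 15\,\mathrm{W}}{848\,\mathrm{MHz} \times (1 - 12.5\%)} = 0.020$ W/MHz

If $m > m_{critical}$: energy efficient to run at lower freq. ($E_f > E_{f_{min}}$)
PowerPC:
$m_{critical} = \frac{2.27\,\mathrm{W} - 2.02\,\mathrm{W}}{66\,\mathrm{MHz}} = 0.0038$ W/MHz
$m = \frac{3.13\,\mathrm{W} - 2.27\,\mathrm{W}}{266\,\mathrm{MHz} - 66\,\mathrm{MHz}} = 0.0043$ W/MHz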
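The two comparisons on this slide can be reproduced with a few lines of arithmetic. The sketch below uses the power and frequency figures as read from the slide; the function names are hypothetical, and the Pentium's lowest operating point is assumed to be 12.5% of 848 MHz.

```python
def critical_slope(p_fmin_w, p_idle_w, f_min_mhz):
    """m_critical = (P_fmin - P_idle) / f_min, in W/MHz."""
    return (p_fmin_w - p_idle_w) / f_min_mhz


def power_slope(p_f_w, p_fmin_w, f_mhz, f_min_mhz):
    """m = (P_f - P_fmin) / (f - f_min), in W/MHz."""
    return (p_f_w - p_fmin_w) / (f_mhz - f_min_mhz)


def better_direction(m, m_critical):
    """Apply the CPS rule: which frequency direction uses less energy."""
    if m < m_critical:
        return "higher"
    if m > m_critical:
        return "lower"
    return "either"


# Pentium: throttled clock, lowest state assumed to be 12.5% of 848 MHz.
pentium_fmin = 848.0 * 0.125
pentium_mc = critical_slope(15.0, 12.0, pentium_fmin)
pentium_m = power_slope(30.0, 15.0, 848.0, pentium_fmin)

# PowerPC 405GP: 66 MHz up to 266 MHz.
ppc_mc = critical_slope(2.27, 2.02, 66.0)
ppc_m = power_slope(3.13, 2.27, 266.0, 66.0)

print(f"Pentium: m={pentium_m:.3f}, m_critical={pentium_mc:.3f}, "
      f"run {better_direction(pentium_m, pentium_mc)}")
print(f"PowerPC: m={ppc_m:.4f}, m_critical={ppc_mc:.4f}, "
      f"run {better_direction(ppc_m, ppc_mc)}")
```

Rounded, the slopes match the values quoted on the slide, and the rule points to the higher frequency on the Pentium and the lower one on the PowerPC.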
CPS for voltage scaling system

Non-linear power savings: P ∝ V²
Look at every operating point at frequency $f_x$:
$m_{critical}^{f_x} = \frac{P_{f_x} - P_{idle}^{f_x}}{f_x}$
If $m^{f_x} < m_{critical}^{f_x}$: energy efficient at higher frequency than $f_x$
If $m^{f_x} > m_{critical}^{f_x}$: energy efficient at lower frequency than $f_x$
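Analysis on SA-1100

A StrongARM processor (SA-1100) is considered
Above 74 MHz: $m^{f_x} > m_{critical}^{f_x}$
At 74 MHz: $m_{critical}^{74\,\mathrm{MHz}} = \frac{121\,\mathrm{mW} - 46\,\mathrm{mW}}{74\,\mathrm{MHz}} \approx 0.001$ W/MHz (about 1.01 mW/MHz)
Below 74 MHz: $m^{74\,\mathrm{MHz}} = \frac{121\,\mathrm{mW} - 106\,\mathrm{mW}}{74\,\mathrm{MHz} - 59\,\mathrm{MHz}} = 0.001$ W/MHz (1.00 mW/MHz), so $m^{f_x} < m_{critical}^{f_x}$
• Energy inefficient below 74 MHz!
• No incentive to operate between 74 MHz and 59 MHz using voltage scaling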
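The 74 MHz versus 59 MHz comparison can be checked numerically with the figures on this slide. The sketch below assumes the 46 mW idle power applies after the work finishes and that performance scales with frequency; the helper name is hypothetical, and the interval length is normalised to 1.

```python
def energy_over_interval(t_low, f_low, f, p_f, p_idle):
    """Eq. 2 with f_low in the role of f_min: active energy plus idle energy."""
    busy_fraction = f_low / f
    return t_low * busy_fraction * p_f + t_low * (1.0 - busy_fraction) * p_idle


# Figures from the slide, in mW and MHz.
P_74, P_59, P_IDLE = 121.0, 106.0, 46.0
F_74, F_59 = 74.0, 59.0

m_critical_74 = (P_74 - P_IDLE) / F_74  # mW/MHz
m_74 = (P_74 - P_59) / (F_74 - F_59)    # mW/MHz, slope of the 59-74 MHz segment

# Same work, measured over the time it takes at 59 MHz (normalised to 1).
e_59 = energy_over_interval(1.0, F_59, F_59, P_59, P_IDLE)
e_74 = energy_over_interval(1.0, F_59, F_74, P_74, P_IDLE)

print(f"m={m_74:.4f}, m_critical={m_critical_74:.4f} (mW/MHz)")
print(f"E at 59 MHz={e_59:.2f}, E at 74 MHz={e_74:.2f} (mW x interval)")
```

The two slopes agree to three decimal places in W/MHz, so the margin is small, but the slope of the segment sits just below the critical value and the energy at 74 MHz comes out marginally lower than at 59 MHz.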
Critical Power Slope in Realistic Workload
• Static page requests on a web server
• Apache 1.3, Pentium-based laptop
• At 100% performance: 1500 requests/sec
• At 62.5% performance: 700 requests/sec
• Energy increases linearly as request rate increases
• More energy efficient to run at higher performance
• Consistent with previous Pentium system analysis
Conclusion

This paper shows that the assumption that lower performance points are more energy efficient than higher performance points is not always valid

This paper helps decide which operating point to choose in a power management scheme
Questions?