Slides - NC State University
Critical Power Slope: Understanding the Runtime Effects of Frequency Scaling
Akihiko Miyoshi, Charles Lefurgy, Eric Van Hensbergen, Ram Rajamony, Raj Rajkumar
Motivation
Power management algorithms implicitly assume that
lower performance points are more energy efficient than
higher points
This paper shows that this assumption is not always
valid
Also helps decide which operating points of a processor
should be considered by a power management
algorithm
Outline
Motivation
$E_{f_{low}} < E_{f_{high}}$: not always true
How do we choose which operating points to use?
[Figure: power (Watts) versus time t at two operating points, with the Eactive and Eidle regions marked]
Evaluation of frequency scaling, clock throttling and
dynamic voltage scaling on three existing processors
Analytical model: Critical Power Slope
Analysis on voltage scaling systems
Conclusion
Techniques of Power Management
Frequency scaling
Processor clock is reduced
Processor consumes less energy at the expense of reduced
performance
Clock throttling
Clock runs at original frequency
Clock signal is gated/disabled for some cycles at regular intervals
Dynamic voltage scaling
Reduces power consumed by lowering the operating voltage
Advantageous because E ∝ V²
Linux on Pentium
Dell Inspiron 8000 laptop with an 850 MHz PIII processor and 512 MB of
RAM, running Linux 2.4.6
Processor runs at 8 different performance states
100% 87.5% 75% 62.5% 50% 37.5% 25% 12.5%
Effect is evaluated by throttling the clock
The following micro benchmarks were considered
Access to register
L1 cache (read)
L1 cache (write)
Access to memory (read)
Access to memory (write)
Disk Read
Micro benchmark performance - Pentium
Power usage in idle mode - Pentium
• Linux scheduler puts the processor into C1 or C2 sleep state
• Idle state power is considered to be a constant
Power measurements at different performance states - Pentium
• Simple benchmark that exercises the CPU while changing the performance state from 100% to 12.5%
• As performance is lowered, system power usage decreases linearly
Energy consumption
Energy required to complete the benchmark = Eactive + Eidle
Compare the energy used to execute the same load over the same time interval at different operating points
The time interval does not end when the active phase ends, since the system is kept on until the next request arrives
Idle time = Time to run the benchmark at the lowest performance state – Time to run the benchmark at a particular operating point
Idle power is known, hence Eidle can be calculated
Eactive + Eidle decreases slightly as performance state increases
The benchmarks suggest we should run this system at the highest performance state possible
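The energy accounting described on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' measurement code: the function name `total_energy` is hypothetical, the power values (30 W, 15 W, 12 W idle) are the Pentium figures quoted later in the deck, and the run times (1 s at 100%, 8 s at 12.5%) are assumed purely for illustration.

```python
def total_energy(t_active, t_interval, p_active, p_idle):
    """Energy in joules over a fixed interval: active phase plus idle remainder."""
    if t_active > t_interval:
        raise ValueError("active time exceeds the comparison interval")
    e_active = p_active * t_active               # Eactive
    e_idle = p_idle * (t_interval - t_active)    # Eidle, with constant idle power
    return e_active + e_idle


# Assumed run times: the interval is the run time at the lowest state.
T_INTERVAL = 8.0  # seconds at the 12.5% state (hypothetical)

e_high = total_energy(t_active=1.0, t_interval=T_INTERVAL, p_active=30.0, p_idle=12.0)
e_low = total_energy(t_active=8.0, t_interval=T_INTERVAL, p_active=15.0, p_idle=12.0)
```

With these assumed numbers the highest state uses 114 J and the lowest state 120 J, which matches the direction of the result reported on the slide.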
Linux on PowerPC
PowerPC 405GP microprocessor, 8KB of D cache, 16KB of I cache, 32MB RAM with Linux 2.4.0
Frequency of the processor and processor local bus (PLB) can be changed, directly affecting memory speed
PowerPC: Power measurements
PowerPC: Energy consumption
• Total energy = Eactive + Eidle
• Eactive = Ecpu + ESDRAM + Eother
• By lowering frequency, total energy used by the system decreases
• Results contrary to the Pentium-based system
Characterization of the two systems
Bimodal behavior – system will either be in active or idle mode
Performance ∝ frequency
Pidle will be considered constant for all frequencies
Consider CPU intensive workload W, lowest frequency fmin
At fmin utilization of the system is 1 and W takes Tfmin units of time to complete
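$E_{f_{min}} = T_{f_{min}} \cdot P_{f_{min}}$  (eq. 1)
At frequency f (f > fmin)
$E_f = T_{f_{min}} \frac{f_{min}}{f} P_f + T_{f_{min}} \left(1 - \frac{f_{min}}{f}\right) P_{idle}$  (eq. 2)
($E_f$ = Eactive + Eidle: the first term is Eactive, the second is Eidle)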
Critical Power Slope
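• Power increases linearly with frequency and is constant in the idle state (from the graph):
$P_f = P_{f_{min}} + m (f - f_{min})$
• Substituting $P_f$ in eq. 2:
$E_f = T_{f_{min}} \frac{f_{min}}{f} \left(P_{f_{min}} + m (f - f_{min})\right) + T_{f_{min}} \left(1 - \frac{f_{min}}{f}\right) P_{idle}$  (eq. 3)
• There should be a slope m at which energy usage at all frequencies is equal: the critical power slope $m_{critical}$
• Equating eq. 1 and eq. 3 we get
$m_{critical} = \frac{P_{f_{min}} - P_{idle}}{f_{min}}$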
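The step from eq. 1 and eq. 3 to the critical slope can be written out as follows. This is a sketch of the algebra in the slide's own symbols, assuming $f > f_{min}$ so that the common factor $(f - f_{min})$ can be cancelled; the second line multiplies both sides by $f / T_{f_{min}}$.

```latex
\begin{align*}
T_{f_{min}} P_{f_{min}}
  &= T_{f_{min}} \frac{f_{min}}{f}\bigl(P_{f_{min}} + m (f - f_{min})\bigr)
   + T_{f_{min}} \Bigl(1 - \frac{f_{min}}{f}\Bigr) P_{idle} \\
f\, P_{f_{min}}
  &= f_{min} P_{f_{min}} + m f_{min} (f - f_{min}) + (f - f_{min}) P_{idle} \\
(f - f_{min})\, P_{f_{min}}
  &= m f_{min} (f - f_{min}) + (f - f_{min}) P_{idle} \\
m_{critical} &= \frac{P_{f_{min}} - P_{idle}}{f_{min}}
\end{align*}
```

The result does not depend on f, which is why a single slope makes the energy equal at every frequency.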
Implications of CPS
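If $m < m_{critical}$: energy efficient to run at higher freq. ($E_f < E_{f_{min}}$)
Pentium:
$m_{critical} = \frac{15\,W - 12\,W}{848\,MHz \times 12.5\%} = .028$
$m = \frac{30\,W - 15\,W}{848\,MHz \times (1 - 12.5\%)} = .020$
If $m > m_{critical}$: energy efficient to run at lower freq. ($E_f > E_{f_{min}}$)
PowerPC:
$m_{critical} = \frac{2.27\,W - 2.02\,W}{66\,MHz} = .0038$
$m = \frac{3.13\,W - 2.27\,W}{266\,MHz - 66\,MHz} = .0043$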
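The two comparisons on this slide can be reproduced with a short script. It is a sketch under the slide's linear power model; the function names are hypothetical, and the power and frequency figures are the ones quoted on the slide (slopes in W/MHz).

```python
def critical_slope(p_fmin, p_idle, f_min):
    """m_critical = (P_fmin - P_idle) / f_min."""
    return (p_fmin - p_idle) / f_min


def measured_slope(p_fmax, p_fmin, f_max, f_min):
    """Slope m of the linear power model between f_min and f_max."""
    return (p_fmax - p_fmin) / (f_max - f_min)


def preferred_direction(m, m_crit):
    """Return which end of the frequency range uses less energy."""
    return "higher" if m < m_crit else "lower"


# Pentium: 848 MHz maximum, lowest state at 12.5% of that; 30 W, 15 W, 12 W idle.
pentium_fmax = 848.0
pentium_fmin = pentium_fmax * 0.125
pentium_mcrit = critical_slope(15.0, 12.0, pentium_fmin)
pentium_m = measured_slope(30.0, 15.0, pentium_fmax, pentium_fmin)

# PowerPC 405GP: 66 MHz to 266 MHz; 3.13 W, 2.27 W, 2.02 W idle.
ppc_mcrit = critical_slope(2.27, 2.02, 66.0)
ppc_m = measured_slope(3.13, 2.27, 266.0, 66.0)

pentium_choice = preferred_direction(pentium_m, pentium_mcrit)
ppc_choice = preferred_direction(ppc_m, ppc_mcrit)
```

Rounded, the slopes come out as .020 against .028 for the Pentium and .0043 against .0038 for the PowerPC, so the script selects the higher frequency for the Pentium and the lower frequency for the PowerPC.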
CPS for voltage scaling system
Non-linear power savings: P ∝ V²
Look at every operating point at frequency fx
$m^{f_x}_{critical} = \frac{P_{f_x} - P_{idle}}{f_x}$
If $m^{f_x} < m^{f_x}_{critical}$: energy efficient at higher frequency than fx
If $m^{f_x} > m^{f_x}_{critical}$: energy efficient at lower frequency than fx
Analysis on SA-1100
A StrongARM processor (SA-1100) is considered
Above 74MHz: $m^{f_x} > m^{f_x}_{critical}$
At 74MHz:
$m^{74MHz}_{critical} = \frac{121\,mW - 46\,mW}{74\,MHz} = 0.001$
$m^{74MHz} = \frac{121\,mW - 106\,mW}{74\,MHz - 59\,MHz} = 0.001$
Below 74MHz: $m^{f_x} < m^{f_x}_{critical}$
Energy inefficient below 74MHz!
No incentive to operate between 74MHz and 59 MHz
using voltage scaling
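Both slopes at 74 MHz round to 0.001, so the comparison is easier to see at full precision. The sketch below recomputes them in mW/MHz from the figures on the slide (121 mW at 74 MHz, 106 mW at 59 MHz, 46 mW idle) and cross-checks the conclusion with a direct energy comparison. The function names and the unit amount of work are assumptions made for illustration.

```python
def critical_slope(p_fx, p_idle, fx):
    """m_critical at fx = (P_fx - P_idle) / fx."""
    return (p_fx - p_idle) / fx


def slope_below(p_fx, fx, p_lower, f_lower):
    """Power slope between fx and the next lower operating point."""
    return (p_fx - p_lower) / (fx - f_lower)


def energy(p_active, p_idle, f, f_ref, t_ref=1.0):
    """Energy for work that takes t_ref at f_ref, run at f >= f_ref, then idle."""
    t_active = t_ref * f_ref / f
    return p_active * t_active + p_idle * (t_ref - t_active)


P_IDLE = 46.0  # mW
m_crit_74 = critical_slope(121.0, P_IDLE, 74.0)   # mW/MHz
m_74 = slope_below(121.0, 74.0, 106.0, 59.0)      # mW/MHz

# Same work, same interval: run at 59 MHz, or at 74 MHz and idle afterwards.
e_59 = energy(106.0, P_IDLE, 59.0, 59.0)
e_74 = energy(121.0, P_IDLE, 74.0, 59.0)
```

Here `m_74` is 1.0 mW/MHz and `m_crit_74` is about 1.014 mW/MHz, so the slope is just below critical and `e_74` comes out slightly lower than `e_59`, in line with the slide's conclusion.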
Critical Power slope in Realistic workload
• Static page requests on a web server
• Apache 1.3, Pentium-based laptop
• At 100% performance – 1500 requests/sec
• At 62.5% performance – 700 requests/sec
• Energy increases linearly as request rate increases
• More energy efficient to run at higher performance
• Consistent with previous Pentium system analysis
Conclusion
This paper shows that the assumption that lower
performance points are more energy efficient
than higher performance points is not always valid
This paper helps decide which operating
point to choose in a power management
scheme
Questions?