Name Center for Applied Scientific Computing month day, 2001
DVSleak: Combining Leakage Reduction and Voltage Scaling in Feedback EDF Scheduling
Yifan Zhu, Frank Mueller
North Carolina State University
Center for Efficient, Secure and Reliable Computing
Background
Dyn. Voltage scaling (DVS): lowers dyn. power
Dynamic power was dominating
— Power ~ $p_t C_L V_{dd}^2 f_{clk} + I_{leak} V_{dd} + P_{short}$
Leakage becoming dominant
Sleep: lowers leakage (static) power
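The power model above can be sketched in a few lines to show why DVS only attacks the first term. This is a minimal illustration; the function names and all numeric values are hypothetical and not taken from the slides.

```python
def dynamic_power(p_t, c_load, v_dd, f_clk):
    """Switching power: p_t * C_L * Vdd^2 * f_clk."""
    return p_t * c_load * v_dd ** 2 * f_clk


def static_power(i_leak, v_dd):
    """Leakage power: I_leak * Vdd (independent of the clock frequency)."""
    return i_leak * v_dd


def total_power(p_t, c_load, v_dd, f_clk, i_leak, p_short=0.0):
    """Sum of the three terms of the model: dynamic + leakage + short-circuit."""
    return dynamic_power(p_t, c_load, v_dd, f_clk) + static_power(i_leak, v_dd) + p_short


if __name__ == "__main__":
    # Hypothetical numbers: halving Vdd quarters the dynamic term,
    # but only halves the leakage term.
    for v in (1.0, 0.5):
        print(v, dynamic_power(0.5, 1e-9, v, 1e8), static_power(0.01, v))
```

Lowering voltage and frequency shrinks the dynamic term quickly, while the leakage term persists as long as the processor is powered, which is what the sleep state targets.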
Real-Time Systems
Hard real-time systems
— periodic, preemptive, independent tasks [Liu, Layland]
– w/ known worst-case execution time (WCET)
— jobs: periodically released instances of a task
— WCET: measured at the max. freq., w/o DVS
— most practical system: U << 1
Earliest-deadline-first (EDF) scheduling
— $\sum_{i} C_i / P_i \le 1$, $C_i$ = WCET, $P_i$ = period
— $\sum_{i=1}^{n} C_i / P_i \le \alpha$, $\alpha = f_{act} / f_{max}$ ($0 < \alpha \le 1$): DVS scaling factor
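As a small worked example of the two EDF conditions above, the sketch below computes the WCET utilization of a task set and checks it against a scaling factor. The task set and the function names are illustrative assumptions.

```python
def utilization(tasks):
    """tasks: list of (wcet, period) pairs, with WCET measured at f_max."""
    return sum(c / p for c, p in tasks)


def edf_schedulable(tasks, alpha=1.0):
    """EDF test at scaling factor alpha = f_act / f_max: sum(C_i / P_i) <= alpha."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return utilization(tasks) <= alpha


def min_scaling_factor(tasks):
    """Smallest uniform alpha that keeps the task set schedulable under EDF."""
    u = utilization(tasks)
    if u > 1.0:
        raise ValueError("task set is not schedulable even at f_max")
    return u


if __name__ == "__main__":
    tasks = [(1, 4), (2, 8), (1, 10)]  # hypothetical (WCET, period) pairs
    print(utilization(tasks), edf_schedulable(tasks, alpha=0.75))
```

With U well below 1, the gap between U and 1 is the static slack that a uniform frequency reduction can consume.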
Motivation
Embedded systems with limited power supply
DVS for real-time system
— trade-off: energy saving vs. timing requirements
— lower CPU voltage/frequency: tasks take longer to complete
Task workloads change dynamically
— WCET overestimates actual execution time
— wide variation of execution times
–Longest vs. shortest times
Sleeping: 1-2 orders of magnitude less power
— DVS below threshold: more energy than sleeping
— Long idle: more energy than sleeping
— But wakeup overhead (cold misses in cache)
Motivation
Real-world examples:
— graphics: 78% of WCET [Wegener/Mueller]
— defense: 87%; automotive: 74%
— benchmarks: 30-89%; image recognition: 85% [Wolf]
Prior DVS algorithms: lack adaptability to dynamic workloads
[Figure: normalized energy vs. WCET utilization (0.1 to 1) for Look-ahead DVS [Pillai/Shin], with c = 50% WCET and c in [20% WCET, 80% WCET]]
Contribution
A feedback-based framework for dynamic workloads
[LCTES’02, RTAS’04, LCTES’05]
New: A hybrid sleep+DVS scheme, 2 observations:
1. Limit to DVS: use sleep below a certain threshold
2. Trade-off idle vs. sleep depends on length of inactivity
3. Feedback helps in these decisions
Simulation experiments
Comparison with prior work
Related Work
Dynamic Voltage Scaling
— General purpose DVS: Weiser, Govil, Pering, Grunwald
— Real-time DVS: Lee, Pillai, Aydin
— Optimality of DVS: Ishihara, Qu, Lorch, Xie, Saewong
Feedback Real-time Scheduling
— Stankovic, Lu, Varma, Poellabauer, Minerick
Leakage-aware DVS scheduling
— Lee, Quan, Jejurikar ’04/’05, Zhang
— We compare with Jejurikar’05 (closest related, best scheme)
Feedback-DVS Framework
V/f selector:
— error: $c_i$ vs. $C_i^A$
— (V,f) = func(error)
Maximum EDF schedule:
— determines slack in EDF schedule
— assumes: c = WCET
Fig. Feedback-DVS Framework
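The feedback loop can be illustrated with a simple proportional update of the estimate $C_i^A$ from the observed execution time $c_i$. The slides do not give the controller, so this sketch makes several assumptions: the error is taken as $c_i - C_i^A$, the gain `k_p` is hypothetical, and a plain proportional rule stands in for whatever func(error) the framework actually uses.

```python
def update_estimate(c_a, actual, wcet, k_p=0.5):
    """Move the estimate C_A toward the observed execution time.

    c_a    -- current estimate of the first task portion
    actual -- measured execution time of the last job (c_i)
    wcet   -- worst-case execution time, an upper bound for the estimate
    k_p    -- proportional gain (hypothetical value)
    """
    error = actual - c_a              # assumed sign convention
    new_estimate = c_a + k_p * error
    return min(max(new_estimate, 0.0), wcet)  # keep the estimate in [0, WCET]


if __name__ == "__main__":
    c_a = 8.0
    for _ in range(5):                # constant actual time of 4.0 units
        c_a = update_estimate(c_a, actual=4.0, wcet=10.0)
        print(c_a)
```

With a constant actual execution time, the estimate converges geometrically toward it; a larger gain reacts faster but tracks noise more closely.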
Voltage-Frequency Selector
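$\alpha = f / f_{max}$: $\frac{1}{\alpha} \frac{C_k}{P_k} + \sum_{i \in \{1,...,n\} \setminus \{k\}} \frac{C_i}{P_i} \le 1$
Task splitting with WCET: $C_i = C_i^A + C_i^B$
— $C_i^A$ at freq. $\alpha$ ($0 < \alpha \le 100\%$); $C_i^B$ at max. freq.
— More aggressive:
– $\alpha$ < uniform frequency w/o splitting
— Objective:
– T finishes within the 1st portion: lower energy consumption
$\alpha = C_i^A / (C_i^A + slack)$
Still guaranteed to meet deadline (proof in prior paper)
[Figure: frequency f (up to 100%) over time t, with portions Ta and Tb for $C_i^A$ / $C_i^B$]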
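The task-splitting rule can be checked numerically. The sketch below computes the scaling factor for the first portion and compares it with the uniform factor that would be needed without splitting. Function names and numbers are hypothetical, and the time units are arbitrary.

```python
def split_alpha(c_a, slack):
    """Scaling factor for the first portion: alpha = C_A / (C_A + slack)."""
    if c_a <= 0 or slack < 0:
        raise ValueError("need C_A > 0 and slack >= 0")
    return c_a / (c_a + slack)


def uniform_alpha(wcet, slack):
    """Scaling factor if the whole WCET were stretched over WCET + slack."""
    return wcet / (wcet + slack)


def worst_case_time(wcet, c_a, slack):
    """Time needed if the job runs its full WCET: C_A at alpha, C_B at f_max."""
    c_b = wcet - c_a
    return c_a / split_alpha(c_a, slack) + c_b


if __name__ == "__main__":
    wcet, c_a, slack = 10.0, 4.0, 4.0
    print(split_alpha(c_a, slack))            # lower than the uniform factor
    print(uniform_alpha(wcet, slack))
    print(worst_case_time(wcet, c_a, slack))  # equals wcet + slack
```

In this example the worst-case completion time is exactly WCET plus slack, so the split schedule uses no more time than the uniform one while running the expected portion at a lower frequency.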
Extension to Leakage-aware DVS
Power ~ $p_t C_L V_{dd}^2 f_{clk} + I_{leak} V_{dd} + P_{short}$
Dynamic power does NOT dominate anymore!
Static power exceeds dynamic power when the voltage is reduced
below a threshold value, the critical speed
— Voltage below threshold is no longer energy efficient
— Sleeping may be better
But need to consider wakeup overhead
— Mostly due to cache refill
— Calculated statically based on time to refill reused lines
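One way to see why a critical speed exists is to look at energy per unit of work. The sketch below uses a deliberately simplified model that is an assumption, not the slides' power model: voltage scales linearly with the normalized speed s, so dynamic power scales as s^3, and static power is constant while the processor is awake. The critical speed is then taken as the speed that minimizes energy per cycle, a definition common in the leakage-aware DVS literature.

```python
def energy_per_cycle(s, p_dyn_max, p_static):
    """Energy per unit of work at normalized speed s (0 < s <= 1).

    Power is p_dyn_max * s^3 + p_static; the work takes 1/s time units.
    """
    return (p_dyn_max * s ** 3 + p_static) / s


def critical_speed(p_dyn_max, p_static):
    """Speed minimizing energy_per_cycle under the simplified model.

    Setting the derivative of p_dyn_max * s^2 + p_static / s to zero
    gives s = (p_static / (2 * p_dyn_max)) ** (1/3), capped at 1.
    """
    return min(1.0, (p_static / (2.0 * p_dyn_max)) ** (1.0 / 3.0))


if __name__ == "__main__":
    s_crit = critical_speed(1.0, 0.25)   # hypothetical power values
    for s in (0.4, s_crit, 0.6):
        print(round(s, 3), energy_per_cycle(s, 1.0, 0.25))
```

Running slower than the critical speed stretches execution so much that the accumulated static energy outweighs the dynamic savings.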
Speed Reduction vs. Task Delaying
[Figure: two timelines for task T over time t: speed reduction vs. delaying the start time]
Why delay the start time of a task?
— To maximize the CPU sleeping time
Delay Dispatching a Task
[Figure: timelines of T1, T2, T3 over time t with idle slots idle1 and idle2, sleep intervals, WCET and CB markers: (i) considered schedule, (ii) no delay, (iii) delay]
1. If idle1 + idle2 > tth before DVS but < tth afterwards: no DVS
2. If idle1 + idle2 < tth: no delay
3. If idle1 < CB: no delay
4. Otherwise: delay
(tth: threshold for sleep)
Still guaranteed to meet deadline (proof in paper)
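The four rules above can be read as a small decision function. The sketch below is one possible interpretation, not the authors' implementation: it assumes the idle slots are known both before and after applying DVS, and that rules 2 to 4 are evaluated on whichever schedule rule 1 selects. All parameter names are hypothetical.

```python
def dispatch_decision(idle1, idle2, idle1_dvs, idle2_dvs, t_th, c_b):
    """Return (use_dvs, delay) following the four rules.

    idle1, idle2         -- idle slots before applying DVS
    idle1_dvs, idle2_dvs -- the same slots after applying DVS
    t_th                 -- threshold for sleep
    c_b                  -- second task portion, run at max. frequency
    """
    use_dvs = True
    if idle1 + idle2 > t_th and idle1_dvs + idle2_dvs < t_th:
        use_dvs = False                        # rule 1: keep the idle time long enough to sleep
    else:
        idle1, idle2 = idle1_dvs, idle2_dvs    # continue with the DVS schedule
    if idle1 + idle2 < t_th:
        return use_dvs, False                  # rule 2: combined idle too short to sleep
    if idle1 < c_b:
        return use_dvs, False                  # rule 3: no delay
    return use_dvs, True                       # rule 4: delay the task


if __name__ == "__main__":
    print(dispatch_decision(3, 4, 1, 1, t_th=5, c_b=2))
    print(dispatch_decision(3, 6, 3, 4, t_th=5, c_b=2))
```

Delaying merges the two idle slots into one interval, which is what makes the sleep transition worthwhile.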
Scaling below the Critical Speed
Pure DVS: should never scale frequency below critical speed
DVS combined with sleeping:
— sleep if idle slot > threshold tth
— If idle slot is too short (< tth), scale below critical speed
– No other work to do (in contrast to non real-time)
– Lower frequency/voltage: power savings
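The choice between sleeping and scaling below the critical speed can be sketched with a break-even threshold. The slides do not say how tth is derived; the formula below (wakeup energy divided by the power saved while asleep) is a common break-even assumption, and all names and numbers are hypothetical.

```python
def break_even_time(e_wakeup, p_idle, p_sleep):
    """Idle length above which sleeping saves energy (assumed derivation of t_th).

    Sleeping for t costs p_sleep * t + e_wakeup; staying idle costs p_idle * t.
    """
    if p_idle <= p_sleep:
        raise ValueError("sleep power must be below idle power")
    return e_wakeup / (p_idle - p_sleep)


def idle_policy(idle_slot, t_th):
    """Sleep for long idle slots, otherwise scale below the critical speed.

    Ties (idle_slot == t_th) are resolved toward sleeping, an arbitrary choice.
    """
    return "sleep" if idle_slot >= t_th else "scale_below_critical"


if __name__ == "__main__":
    t_th = break_even_time(e_wakeup=10.0, p_idle=2.0, p_sleep=0.1)
    print(t_th, idle_policy(6.0, t_th), idle_policy(1.0, t_th))
```

For slots shorter than the threshold the processor has no other work to run, so stretching the current job at a lower voltage still saves power relative to idling at a higher one.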
Experimental Framework
Scheduling simulator
— Accurately reflects energy trends [Zhu’05] (PPC405LP)
Use the same power model as [Jejurikar’04]
— Critical speed, wakeup cost
Assume four discrete frequency levels:
— 25%, 50%, 75%, 100% of fmax
Compare energy in hyperperiod (const. amount of work) for
— Pure Feedback-DVS
— DVS+sleep: Feedback-DVS w/ sleep policy (no delay policy)
— DSR-DP: dyn. procrastination+slack reclamation [Jejurikar’05]
— DVSleak: feedback-DVS w/ sleep & delay now/later policies
— Lower-bound schedule: best frequency + sleep for max. idle
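Two details of this setup lend themselves to a short sketch: mapping a continuous scaling factor onto the four discrete frequency levels, and computing the hyperperiod over which energy is compared. Rounding up to the next level is an assumption made here so that deadlines stay safe; the slides do not state the rounding rule, and the helper names are hypothetical.

```python
from functools import reduce
from math import gcd

LEVELS = (0.25, 0.50, 0.75, 1.00)  # fractions of f_max, as in the setup above


def pick_level(alpha, levels=LEVELS):
    """Smallest discrete level that is at least alpha (round up, never down)."""
    for level in sorted(levels):
        if level >= alpha:
            return level
    raise ValueError("requested scaling factor exceeds f_max")


def hyperperiod(periods):
    """Least common multiple of integer task periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


if __name__ == "__main__":
    print(pick_level(0.6), hyperperiod([4, 8, 10]))
```

Comparing energy over one hyperperiod keeps the amount of released work identical across schemes, since the job release pattern repeats after that interval.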
3 Tasks, Const. Execution, 25% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1 to 0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, and Lower Bound]
Significant savings w/ sleep, more for low utilizations
DVSleak: delay has most impact for medium to high utilizations
— Close to lower bound
3 Tasks, Const. Execution, 75% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1 to 0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, and Lower Bound]
All schemes: resilient to actual/WCET ratio
DVSleak never worse than other schemes, savings:
— 50% over pure, 20% over DVS+sleep, 8.5% over DSR-DP
3 Tasks, Var. Execution (pat1), 75% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1 to 0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, and Lower Bound]
DVSleak: more resilient to fluctuating exec. times (energy unchanged); feedback helps!
All others: 5-10% more energy consumption than for const. exec.
10 Tasks, Const. Execution, 25% WCET
[Figure: energy consumption (mJ) vs. WCET utilization (0.1 to 0.9) for Pure DVS, DVS+sleep, DSR-DP, DVSleak, and Lower Bound]
More tasks: 5-10% higher energy cost (switching)
DVSleak still best of all (~ same margin)
Length of Task Periods
U=60%, E normalized to hyperperiod task set 2, c=50% WCET
Harmonic (1) vs. non-harmonic (2):
— 10-27% more energy for non-harmonic: cannot fold jobs released at the same time, more uncertainty
Longer (2) vs. shorter (3) periods for non-harmonic:
— 2-28% more energy for shorter periods: more job releases, less sleep time
— DVSleak ~ 15% lower energy than DSR-DP
Feedback more important for shorter periods
Conclusion
DVSleak: Novel Feedback DVS + leakage (sleep), benefits for
— fluctuating execution times
— shorter task periods: can scale below critical speed
— medium utilizations (most common): sleep policy by itself enough for high/low utilizations (always sleep/never sleep)
DVSleak energy savings over other schemes:
— avg. 50% over DVS-only
— avg. 20% over DVS+sleep
— avg. 8.5% over [Jejurikar’05]
— Sleep now/later important when actual exec. << WCET
Prior: Evaluation on a real embedded platform
— $V^2 f$ model works for OS scheduling
Future Work
Implementation on IBM PPC 405LP test board
Has been used for DVS experiments
— Oscilloscope, data acquisition card for voltage / current
Assessing sleep modes
1. Clock suspend: same power, all still up
2. Suspend: 1/10 power, SDRAM up
3. Hibernate: N/A (SDRAM to NVRAM)
4. Standby: N/A (APM over I2C)
Need faster resume (reactivating devices is slow: low-power modes)