lemieux-ccece2008 - University of British Columbia

SoC Energy Savings =
Reduce + Reuse + Recycle
Guy Lemieux, Mehdi Alimadadi,
Samad Sheikhaei, Shahriar Mirabbasi
University of British Columbia, Canada
Patrick Palmer
University of Cambridge, UK
SoC Energy Savings: The 3 R’s
• Reduce energy
– Today, already a common strategy
• Reuse energy, recycle energy
– Tomorrow, we need to do more of this!
• How? Case study of a 660MHz DC-DC power converter
The Problem
• Chip design
– Fixed throughput goal
– High performance → high power
$P \propto \alpha \cdot f \cdot C \cdot V^2$
• Power reduction
– Reduce C
– Reduce f → lower throughput → add parallelism → increase C
– Reduce f and V → lower throughput → add parallelism → decrease P
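The trade-off in the last two bullets can be checked with a little arithmetic. The sketch below assumes only the proportionality of dynamic power to α·f·C·V², in relative units. The factor-of-two parallelism and the 0.7 voltage scaling are illustrative assumptions, not values from the slides.

```python
def relative_power(alpha=1.0, f=1.0, c=1.0, v=1.0):
    """Relative dynamic power, P ~ alpha * f * C * V^2 (arbitrary units)."""
    return alpha * f * c * v ** 2


baseline = relative_power()

# Halve f and add a second parallel unit (C doubles), same V:
# throughput is preserved, but power is unchanged.
parallel_same_v = relative_power(f=0.5, c=2.0)

# Halve f, double C, and also lower V. The 0.7 factor is hypothetical.
parallel_low_v = relative_power(f=0.5, c=2.0, v=0.7)

print(baseline, parallel_same_v, parallel_low_v)
```

Because V enters as a square, only the case that also lowers the voltage ends up below the baseline.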
Solutions 1
• Problem
$P \propto \alpha \cdot f \cdot C \cdot V^2$
• Standard CMOS design tricks
– Resize transistors for power (not delay)
– Reduce signal transitions (switching activity α)
• Glitching (non-functional switching)
• Un-needed functional switching (clock gating, data gating)
– Low-Vdd → mixed voltage islands
– Low-Vt → recover performance lost due to low-Vdd
– Multi-Vt, multi-Vdd → lower leakage, mixed voltage islands
Solutions 2
• Problem
$P \propto \alpha \cdot f \cdot C \cdot V^2$
• Circuit-level tricks to reduce energy
– Pass transistor logic
– Adiabatic logic
– Supply stacking
– Etc…
Power Summary
• Problem
$P \propto \alpha \cdot f \cdot C \cdot V^2$
• Standard solutions
– “Bag of tricks”
• Mantra: Reduce, Reuse, Recycle
– Energy reuse and recycling is new!
– …a new trick for the bag?
– …or a whole new bag of tricks?
Energy Reuse and Recycling
• Reuse
– Charge used in one part of circuit
– Moved to and reused in another part of circuit
• No regulation, e.g., LC-resonator
• Recycling
– Charge used in one part of circuit
• Not all the energy in the charge was needed
– Re-regulated and delivered to another part of circuit
• Captures “unused” energy headroom
• Delivers to where it can be used
• Not perpetual motion → conversion losses, limited headroom
Energy Reuse and Recycling
• Recycling
– Capture under-utilized charge
– Re-regulate, deliver elsewhere
• Needed technology
– On-chip dc-dc converters / voltage regulators
• Step-up and step-down
• High efficiency
• All on-chip
• Low area
• Small inductors
Switch Mode Power Supply
• CMOS inverter
– Power switches
– Vgate is PWM with duty cycle D to control output
• L, C is a low-pass filter
– Vout = Vdd * D (step-down, or buck, converter)
[Figure: buck converter schematic; labels: Vdd, Vin, Vgate, Vinv, S, D, L, IL, C, R, Vout]
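As a rough illustration of the two relations on this slide, the sketch below computes the ideal (lossless) buck output Vout = Vdd * D and the corner frequency of the L-C low-pass filter. The function names are hypothetical. The 2.2V input, 4.38nH inductor and 1.1nF capacitor are the values quoted on the chip slides later in the deck, and the 45% duty cycle is an arbitrary example.

```python
import math


def buck_vout(vdd, duty):
    """Ideal (lossless) buck converter output: Vout = Vdd * D."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty cycle must be between 0 and 1")
    return vdd * duty


def lc_corner_hz(l_henry, c_farad):
    """Corner frequency of an L-C low-pass filter: 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))


vout = buck_vout(2.2, 0.45)          # ideal output for a 45% duty cycle
f0 = lc_corner_hz(4.38e-9, 1.1e-9)   # filter corner for the chip's L and C

print(f"ideal Vout = {vout:.2f} V, filter corner = {f0 / 1e6:.1f} MHz")
```

With these values the filter corner falls well below the 660MHz switching frequency, which is what lets the L-C network average the PWM waveform. A real converter delivers less than the ideal Vout because of conduction and switching losses.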
CMOS Switch Mode Power Supplies
• Large Mp, Mn transistors for low on-resistance
– Large input capacitance on gate terminals
– Requires strong transistors to drive gate
– Front-end drive chain
• Series of inverters, tapered in size
[Figure: CMOS buck converter with front-end drive chain; labels: CLK in, PWM, Vclk, Cclk, Mp, Mn, Lf, Cf, Rload, Vout]
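A tapered drive chain of this kind can be sized by working backward from the final stage. The sketch below is a minimal illustration: the function name is hypothetical, and the taper factor of 4, the three stages and the 240 µm final width are illustrative assumptions rather than a statement of the authors' design procedure.

```python
def tapered_chain_widths(final_width_um, taper=4.0, stages=3):
    """Stage widths of a tapered inverter chain, smallest stage first.

    Each stage is `taper` times wider than the one before it, and the
    last stage has width `final_width_um`.
    """
    if taper <= 1.0 or stages < 1:
        raise ValueError("need taper > 1 and at least one stage")
    return [final_width_um / taper ** k for k in range(stages - 1, -1, -1)]


widths = tapered_chain_widths(240.0)
total_width = sum(widths)  # rough proxy for the chain's switched capacitance

print(widths, total_width)
```

The summed width gives a first-order feel for how much gate capacitance the chain itself must switch every cycle, which is the energy cost the following slides try to shrink.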
Fully Integrated CMOS Power Supplies
• LC ~ 1/F²
– Operate at high F → shrink L, C on-chip
– High F → high power in front-end drive chain
– Front-end drive chain
• How to shrink its energy use?
[Figure: CMOS buck converter with front-end drive chain; labels: CLK in, PWM, Vclk, Cclk, Mp, Mn, Lf, Cf, Rload, Vout]
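The tension on this slide can be put in numbers. The sketch below assumes the filter corner is held at a fixed fraction of the switching frequency F, so the required L·C product falls as 1/F², and it models drive power as C·V²·F. The function names, the 0.1 corner ratio and the 10 pF gate capacitance are hypothetical.

```python
import math


def lc_product(f_switch_hz, corner_ratio=0.1):
    """L*C product that places the filter corner at corner_ratio * F."""
    f0 = corner_ratio * f_switch_hz
    return 1.0 / (2.0 * math.pi * f0) ** 2


def gate_drive_power(c_gate_farad, v_swing, f_switch_hz):
    """Power to charge and discharge a gate capacitance every cycle: C * V^2 * F."""
    return c_gate_farad * v_swing ** 2 * f_switch_hz


# Compare a 660 MHz design against a 66 MHz one (10x lower F).
lc_ratio = lc_product(660e6) / lc_product(66e6)
power_ratio = gate_drive_power(10e-12, 2.2, 660e6) / gate_drive_power(10e-12, 2.2, 66e6)

print(f"LC product ratio: {lc_ratio:.3f}, drive power ratio: {power_ratio:.1f}")
```

Under these assumptions a tenfold increase in F shrinks the L·C product a hundredfold but raises drive-chain power tenfold for the same gate capacitance and swing.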
Fully Integrated CMOS Power Supplies
• Problem
– Front-end drive chain uses too much power (at high F)
• Solution
1. Reduce – a) separate Mp, Mn chains, b) low-swing
2. Reuse – stack drive chains for Mp, Mn
3. Recycle – after stacking drive chains, deliver excess energy to the load in a regulated fashion
1. Reduce Energy
• 1a) Independent Mp, Mn drive chains enable ZVS (zero-voltage switching)
– ZVS: with both Mp and Mn off, the inductor charges/drains Cx
• 1b) Apply low-swing
[Figure: separate low-swing drive chains for Mp and Mn; labels: Vdd, Low-swing]
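The ZVS interval can be estimated to first order by treating the inductor current as constant while both switches are off, so the time to move the node capacitance Cx through a voltage ΔV is t = Cx·ΔV / IL. The sketch below applies that approximation. The function name is hypothetical, and the 5 pF node capacitance and 50 mA inductor current are assumed values, not figures from the slides.

```python
def zvs_dead_time(c_x_farad, delta_v, i_l_amp):
    """Time for a roughly constant inductor current to swing Cx by delta_v."""
    if i_l_amp <= 0.0:
        raise ValueError("inductor current must be positive")
    return c_x_farad * delta_v / i_l_amp


t_dead = zvs_dead_time(5e-12, 2.2, 0.050)  # hypothetical Cx and IL, 2.2 V swing
period = 1.0 / 660e6                       # one switching period at 660 MHz

print(f"dead time = {t_dead * 1e12:.0f} ps, "
      f"{100 * t_dead / period:.1f}% of the switching period")
```

The estimate shows why independent drive chains matter: the gap between turning one switch off and the other on has to be set separately from the PWM edges, and at this frequency it is a noticeable fraction of the period.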
2. Reuse Energy
• 2) Stack Mp, Mn drive chains
– Low-swing = half-swing
– Regulate Vdd/2 ??
[Figure: stacked Mp and Mn drive chains sharing a Vdd/2 node; labels: Vdd, Vdd/2, Low-swing]
3. Recycle Energy
• 3) Excess front-end energy sent to load
– Mp drive chain ~3x bigger, more energy than Mn
– Linear regulator: 2 diode drops ≈ Vdd/2
[Figure: circuit diagram; labels: Vdd, Vdd/2]
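The charge bookkeeping behind stacking and recycling can be sketched as follows, assuming ideal capacitive loads, a mid node held exactly at Vdd/2, and no diode or conversion losses. The function name and the capacitance values are hypothetical. The 3:1 ratio follows the "~3x bigger" statement above, and 2.2V is the supply quoted on the chip slides.

```python
def stacked_chain_budget(c_top, c_bottom, vdd):
    """Per-cycle charge and energy accounting for two stacked drive chains.

    The top chain swings between Vdd and Vdd/2, the bottom chain between
    Vdd/2 and GND. Charge leaving the top chain arrives at the Vdd/2 node,
    where the bottom chain reuses part of it.
    """
    v_mid = vdd / 2.0
    q_top = c_top * (vdd - v_mid)   # charge drawn from Vdd by the top chain
    q_bottom = c_bottom * v_mid     # charge the bottom chain takes from Vdd/2
    q_excess = q_top - q_bottom     # charge left over at the Vdd/2 node
    return {
        "energy_from_supply": q_top * vdd,
        "energy_reused_by_bottom": q_bottom * v_mid,
        "energy_available_to_recycle": q_excess * v_mid,
    }


budget = stacked_chain_budget(c_top=3e-12, c_bottom=1e-12, vdd=2.2)
recyclable_fraction = (budget["energy_available_to_recycle"]
                       / budget["energy_from_supply"])

print(budget, recyclable_fraction)
```

In this idealized model about one third of the energy drawn from the supply is still available at the Vdd/2 node after the bottom chain has taken its share. Only part of that reaches the load in practice because the regulating path is lossy.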
Simulated Results
• Efficiency boost from recycling
[Figure: simulated efficiency (%) vs. duty cycle (%) for four designs: (i) full swing, without diodes; (ii) low swing, without diodes; (iii) low swing, with diodes; (iv) low swing, with diodes, with mimicked low Vt]
Chip: 660MHz DC-DC Power Converter
Reuses & Recycles some of its own Energy
[Chip photo: approx. 1.2 × 2.8 mm²]
Chip: Technology Highlights
• dc-dc buck converter, recycles own energy
– 180nm CMOS
– 660MHz to reduce LC area
• 2.5 mm² layout area, inductor-dominated
– 2.2V input, 0.75-1.0V output, 40-55mA
– Simulation
• No recycling: 28% efficient
• With recycling: 43% efficient
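To put the two simulated efficiencies in perspective, the sketch below computes the input power needed for the same output power at 28% and at 43% efficiency. The function name and the 40 mW output power are hypothetical. The relative saving does not depend on the output power chosen.

```python
def input_power(p_out_w, efficiency):
    """Input power needed to deliver p_out_w at a given conversion efficiency."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return p_out_w / efficiency


p_out = 0.040                                # hypothetical 40 mW load
p_in_reference = input_power(p_out, 0.28)    # no recycling
p_in_recycling = input_power(p_out, 0.43)    # with recycling
saving = 1.0 - p_in_recycling / p_in_reference

print(f"{p_in_reference * 1e3:.1f} mW -> {p_in_recycling * 1e3:.1f} mW "
      f"({100 * saving:.0f}% less input power)")
```

Moving from 28% to 43% efficiency cuts the power drawn from the supply by roughly a third for the same load.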
Chip: Schematic
[Figure: chip schematic. Transistor dimensions (W/L) are in µm. Recoverable labels and values:]
• Supplies: VDD1 = 2.2V, VDD2 = 2.2V, GND
• Top chain of inverters (Vpmos-in to Vpmos), Wp/Lp and Wn/Ln per stage: 135/0.18 and 45/0.18; 540/0.18 and 180/0.18; 2160/0.18 and 720/0.18
• Bottom chain of inverters (Vnmos-in to Vnmos), Wp/Lp and Wn/Ln per stage: 45/0.18 and 15/0.18; 180/0.18 and 60/0.18; 720/0.18 and 240/0.18
• Power switches: Mp 12000/0.18 (Cgate = 3Cm/20), Mn 4000/0.18 (Cgate = Cm/20)
• Node Vm with Cm = 125pF; current labels 3Im/2, Im/2, Im, ILf
• Diode-connected NMOS, each 4000/0.18; diode labels D1, D2
• On-chip output filter: LF = 4.38nH, CF = 1.1nF; other nodes: Vinv, CX, Vout, Load
• Resistors: R1 = 2.2kΩ, R2 = R1, R3 = 50Ω, R4 = 50Ω
• Other devices: NMOS 4896/3, NMOS 40800/3 (off-chip)
Chip: Measured Results
[Figure: measured Vout (V, axis 0.6 to 1.1) and efficiency (%, axis 20 to 45) vs. duty cycle (%, axis 40 to 70); standard error bars, measured from 10 chips]
Chip: Summary
• Chip Lessons
– High-frequency dc-dc conversion works!
– Gives us confidence that simulation results are accurate
• Chip research – mostly $-limited
– Need $$ area for these chip designs
• Reference design – no energy recycling
• Modified design – energy recycling
• High currents & parasitics prevent “sharing” just 1 inductor layout
– 180nm is wrong technology
• Need multi-Vt transistors
• Need higher frequency & even smaller inductor
• 90nm $$, 65nm $$$$
90nm Chip: Recycle Back-end Clock Energy
• Merge 3GHz clock driver & dc-dc converter
• Benefits
– Shared driver chain
– Cclk added to SMPS
• Red path
– NMOS drains Cclk → wastes charge!
• Blue path
– Delay NMOS turn-on (ZVS) → recycles clock energy!
[Figure: merged clock driver and dc-dc converter with red and blue current paths; labels: CLK in, Vclk, Cclk, Lf, Cf, Rload, Vout]
90nm Chip: Recycle Back-end Clock Energy
• High-speed ZVS delay circuit for Mn
– Delay rising edge of Vn
• Recycles 50% of clock energy (sent to load)
[ISSCC 2007]
[Figure: ZVS delay circuit schematic; labels: Vdd, GND, M1, M2, M3, M4, Mp, Mn, Vp, Vn, Vm, Vclk, Cclk, Lf, Cf, Rload, Vout, nodes 1 to 4]
Future Work
• Need to combine ideas
– 180nm chip: reduce, reuse, recycle front-end drive chain energy
– 90nm chip: recycle back-end clock load energy
• On-chip regulators lead to power savings
– Energy recycling: “free” power supply
– On-chip voltages
• Low-voltage islands
• Dynamic Voltage and Frequency Scaling
• Adaptive body bias / dynamic Vt adjustment
• On-chip regulators lead to new ideas
– New work: low-power 4GHz clock driver inspired by boost converter
Future Work
• Observation
– CMOS logic stores energy in capacitors, then discharges it to GND
– This is wasteful
• Question…
– Can we make CMOS more efficient, e.g. by recycling the energy through a dc-dc converter?
• A new dynamic logic family that uses inductors to drain precharged output nodes instead of pulldown NMOS?
Please remember to
reduce + reuse + recycle.
Thank you.
Merging 180nm and 90nm Designs
[Figure: merged schematic with a ZVS circuit for Mp, a ZVS circuit for Mn, and a charge recycling circuit; labels: Mpu, Mnu, Mnuz, Mpd, Mnd, Mp, Mn, Vpmos-in, Vpmos, Vnmos-in, Vnmos, Vm, Cm, Vinv, LF, ILf, CF, RL, Vout]