
POWER REDUCTION IN A 32-BIT ADDER USING PARALLELISM AND REDUCED SUPPLY VOLTAGE
Clint Patterson
ELEC-6270
April 24, 2009
Why?


Power reduction is a critical goal in modern chip design.
Power can be reduced by combining parallelism with a reduced supply voltage.
 Reduced VDD means individual components run with more delay, ideally N times the original delay when N components operate in parallel.
 Parallel operation is necessary to maintain throughput.
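The trade-off above can be sketched numerically. This is a minimal sketch, assuming the standard first-order model of dynamic power, P = C * Vdd^2 * f, which the slides do not state explicitly. The function name and the example voltages are illustrative, and the overhead of the extra clocking and routing logic is ignored.

```python
def relative_dynamic_power(n_units, vdd_new, vdd_ref):
    """Dynamic power of n_units parallel copies, relative to one copy at vdd_ref.

    Assumes P = C * Vdd^2 * f per copy (first-order model, not from the slides).
    Each copy is clocked at f/n_units, so total throughput is unchanged.
    """
    capacitance_factor = n_units          # n copies -> n times the switched capacitance
    frequency_factor = 1.0 / n_units      # each copy runs at f/n
    voltage_factor = (vdd_new / vdd_ref) ** 2
    return capacitance_factor * frequency_factor * voltage_factor


# Illustrative values only: two adders at 0.75 V versus one adder at 1.5 V.
ratio = relative_dynamic_power(n_units=2, vdd_new=0.75, vdd_ref=1.5)
print(f"relative dynamic power: {ratio:.2f}")
```

In this model the extra capacitance and the lower clock rate cancel, so the saving comes entirely from the squared voltage term. The savings reported later in the talk come from simulation, not from this model.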
Objectives




Design and verify a 32-bit synchronous adder
circuit in VHDL
Design and verify a parallel 32-bit synchronous
adder circuit (N=2) in VHDL
Determine voltage used for power analysis
Determine power savings
Standard Block Design


64-bit input vector (2 x 32-bit)
33-bit output (32-bit sum + 1 carry bit)
1-cycle delay for result
Component Verification - Adder




…1010 + …0101 = …1111 (C = 0)
…1011 + …0101 = …0000 (C = 1)
…0001 + …1111 = …0000 (C = 1)
…0000 + …1111 = …1111 (C = 0)
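The four test vectors above can be cross-checked with a small software model. This is a hedged sketch, not the VHDL testbench from the talk: `add_with_carry` is a hypothetical helper, and because the upper bits are elided on the slide, only the visible low four bits are checked, treated as a 4-bit addition.

```python
def add_with_carry(a, b, width):
    """Add two unsigned width-bit values; return (sum, carry_out)."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask)
    return total & mask, total >> width


# Low nibbles of the four slide vectors: (a, b, expected sum, expected carry).
# Assumption: the visible 4 bits are checked as a 4-bit addition.
vectors = [
    (0b1010, 0b0101, 0b1111, 0),
    (0b1011, 0b0101, 0b0000, 1),
    (0b0001, 0b1111, 0b0000, 1),
    (0b0000, 0b1111, 0b1111, 0),
]

results = [add_with_carry(a, b, width=4) for a, b, _, _ in vectors]
for (a, b, exp_sum, exp_carry), (s, c) in zip(vectors, results):
    status = "ok" if (s, c) == (exp_sum, exp_carry) else "MISMATCH"
    print(f"{a:04b} + {b:04b} = {s:04b} (C = {c}) {status}")
```

The same helper works for the full design by passing `width=32`.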
Standard Design Verification



Vector 1 clocked in at cycle 1, result at cycle 2
Vector 2 clocked in at cycle 3, result at cycle 4
Vector 3 clocked in at cycle 4, result at cycle 5
Low-Power Design


Same external I/O as standard design
Each adder uses a divided-down clock of F/2
2-cycle delay for result
Component Verification – MPC


Takes in a clock of frequency F and outputs a divided-down clock of frequency F/2
 Clock In: 100 MHz
 Clock Out: 50 MHz
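The divide-by-two behaviour can be illustrated with a behavioural model. This is a sketch under the assumption that the divider acts like a toggle flip-flop on the rising edge of the input clock. The slides do not show the actual implementation, and both function names are hypothetical.

```python
def divide_by_two(clock_in):
    """Model a toggle flip-flop: the output flips on every rising edge of clock_in.

    clock_in is a list of 0/1 samples; the return value has the same length.
    """
    out = 0
    previous = 0
    clock_out = []
    for level in clock_in:
        if previous == 0 and level == 1:   # rising edge of the input clock
            out ^= 1
        clock_out.append(out)
        previous = level
    return clock_out


def count_rising_edges(samples):
    """Count 0 -> 1 transitions, treating the level before the first sample as 0."""
    return sum(1 for prev, cur in zip([0] + samples[:-1], samples) if prev == 0 and cur == 1)


clock_in = [1, 0] * 8                  # eight input clock cycles, two samples per cycle
clock_out = divide_by_two(clock_in)
print(count_rising_edges(clock_in), count_rising_edges(clock_out))
```

Eight input cycles produce four output cycles, matching the 100 MHz to 50 MHz ratio on the slide.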
Low-Power Design Verification




Vector 1 clocked in at cycle 1, result at cycle 3
Vector 2 clocked in at cycle 3, result at cycle 5
Vector 3 clocked in at cycle 4, result at cycle 6
Vector 4 clocked in at cycle 5, result at cycle 7
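The timing listed above is consistent with a simple latency model. The sketch below assumes that inputs are steered to one of N adders by cycle parity and that each result appears N cycles after its inputs are clocked in. The steering rule and the function name are assumptions, since the block diagram is not part of this transcript.

```python
def result_cycles(input_cycles, n_adders):
    """Map each input cycle to the cycle at which its sum appears.

    Assumption: inputs are steered to adder (cycle % n_adders), and each adder
    runs on a clock of F/n_adders, so a result appears n_adders cycles later.
    """
    schedule = []
    for cycle in input_cycles:
        lane = cycle % n_adders
        schedule.append((cycle, lane, cycle + n_adders))
    return schedule


low_power = result_cycles([1, 3, 4, 5], n_adders=2)
for cycle_in, lane, cycle_out in low_power:
    print(f"in at cycle {cycle_in} -> adder {lane} -> result at cycle {cycle_out}")
```

Calling `result_cycles([1, 3, 4], n_adders=1)` gives the 1-cycle latency of the standard design.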
Next Steps


VHDL models were optimized in Leonardo Spectrum (Level 3) and converted to Verilog
Verilog files were converted to .myrutmod and then analyzed in Powersim
 Null results for low-power model (0.0000000…)
 Segmentation faults
Must use Design Architect / Eldo for analysis
 Use Verilog files
 Must determine appropriate voltages for simulation / analysis for meaningful comparisons
Delay Calculation

Supply voltages determined according to the formula:
F = k*(Vdd - Vt)/Vdd
 165 MHz = k*(1.8V - 0.38V)/1.8V, so k ≈ 209.2 MHz




1.5V (~156MHz) and 0.9V (~120MHz) for standard supply voltages
0.75V (~100MHz), 0.65V (~85MHz), and 0.5V (~50MHz) for low-power supply voltages
Run both models at 100 MHz
 Simple periods (10 ns)
 Much slack for 1.5V supply (~3-4ns)
 Slack for low-power model also, since each adder only needs 50MHz for the 2x delay
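The frequency estimates on this slide can be reproduced directly from the formula. The sketch below uses only the constants given above (Vt = 0.38 V, 165 MHz at 1.8 V); the function and variable names are illustrative.

```python
VT = 0.38            # threshold voltage in volts, from the slide
F_REF_MHZ = 165.0    # reference frequency at VDD_REF, from the slide
VDD_REF = 1.8        # reference supply voltage in volts


def max_frequency_mhz(vdd, k):
    """Estimated maximum frequency from F = k * (Vdd - Vt) / Vdd."""
    return k * (vdd - VT) / vdd


# Solve the reference point for k.
k = F_REF_MHZ * VDD_REF / (VDD_REF - VT)

estimates = {vdd: max_frequency_mhz(vdd, k) for vdd in (1.5, 0.9, 0.75, 0.65, 0.5)}
print(f"k = {k:.1f} MHz")
for vdd, freq in estimates.items():
    print(f"{vdd:.2f} V -> {freq:.0f} MHz")
```

The computed values land within a few MHz of the rounded figures quoted on the slide.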

Observations

1.5V and 0.9V for the standard model both give verified results
0.9V shows slightly more delay than 1.5V
 See next slide for a comparison of the more exaggerated 1.8V and 0.75V results for the standard design


Simulation at 0.5V gives unreasonably low power for the LP model
EZ-Wave won’t load, so correct performance can’t be guaranteed; this figure is assumed not to be legitimate.
 It is assumed that at 0.5V the circuit doesn’t produce results fast enough, so the outputs stop switching and dynamic power is eliminated.
 Delay calculation falsely inflated because the alpha-power law was not used; use the 0.65V / 0.75V simulations, since they are still above the 50MHz requirement
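The effect of ignoring the alpha-power law can be illustrated with a short comparison. This is a sketch, assuming the usual alpha-power form F = k * (Vdd - Vt)^alpha / Vdd, calibrated to the same 165 MHz at 1.8 V reference point. The exponent 1.3 is a hypothetical value chosen only for illustration, not a figure measured in this work.

```python
VT = 0.38            # threshold voltage in volts, from the delay calculation
VDD_REF = 1.8        # reference supply voltage in volts
F_REF_MHZ = 165.0    # reference frequency at VDD_REF


def alpha_power_frequency_mhz(vdd, alpha):
    """Frequency from F = k * (Vdd - Vt)**alpha / Vdd, calibrated at the reference point.

    alpha = 1 reproduces the linear formula used in the delay calculation.
    """
    k = F_REF_MHZ * VDD_REF / (VDD_REF - VT) ** alpha
    return k * (vdd - VT) ** alpha / vdd


ALPHA = 1.3   # hypothetical exponent, chosen only for illustration

for vdd in (0.75, 0.65, 0.5):
    linear = alpha_power_frequency_mhz(vdd, alpha=1.0)
    curved = alpha_power_frequency_mhz(vdd, alpha=ALPHA)
    print(f"{vdd:.2f} V: linear {linear:.0f} MHz, alpha-power {curved:.0f} MHz")
```

For any exponent above 1, the estimate at voltages below the calibration point comes out lower than the linear one, which is consistent with the 0.5V case falling short of the 50MHz requirement.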

Eldo Verification

Verification results for 1.8V and 0.75V with similar vectors
 0.75V too slow for verified results
 1.8V gives too much slack
Power Reduction


Admittedly, some power savings are seen in the low-power cases simply because so much slack is allowed for the 1.5V supply.
Good power reduction is still seen for the LP model at 0.75V/0.65V when compared to the 0.9V standard supply
 ~63% and 75%, respectively
Conclusions

Very notable power reduction can be seen through using component parallelism and reduced supply voltage.
 16% more area (~700 gates instead of ~600)
Verification / analysis is a critical part of design
 Understanding of available design tools is key
Future Work


Measure power reduction for a range of parallel
scale implementations (N = 3, 4, 5…) to determine
optimal implementation for savings.
Determine exact formula for delay vs. voltage
through experimental results
Questions?