POWER REDUCTION IN A 32-BIT ADDER USING PARALLELISM AND REDUCED SUPPLY VOLTAGE
Clint Patterson
ELEC-6270
April 24, 2009
Why?
Power reduction is a critical design goal in modern
chip design.
Power reduction can be implemented through the
use of parallelism and reduced supply voltage
Reduced VDD means individual components function with more delay
(ideally N times the delay for N parallel components).
Parallel operation is necessary to maintain throughput.
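The trade-off above can be illustrated with the usual first-order dynamic power model, P = C * VDD^2 * f. The following is a minimal Python sketch of that model only; the function names, the `overhead` parameter, and the normalized numbers are illustrative assumptions and are not taken from this project's measurements.

```python
def dynamic_power(c_eff, vdd, freq):
    """First-order CMOS dynamic power: P = C * VDD^2 * f."""
    return c_eff * vdd ** 2 * freq


def parallel_power(c_eff, vdd_low, freq, n=2, overhead=1.0):
    """Power of n copies, each clocked at freq/n, at a reduced supply.

    `overhead` is a hypothetical factor that scales capacitance to
    account for extra multiplexing and clock-division logic.
    """
    return n * dynamic_power(c_eff * overhead, vdd_low, freq / n)


# Normalized, illustrative values only (not results from the slides).
baseline = dynamic_power(c_eff=1.0, vdd=1.0, freq=100e6)
halved = parallel_power(c_eff=1.0, vdd_low=0.5, freq=100e6, n=2)
ratio = halved / baseline  # total switching rate is unchanged, so only VDD^2 scales
```

In this idealized model the frequency terms cancel, so the saving comes entirely from the squared supply voltage; any real overhead factor above 1.0 reduces it.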
Objectives
Design and verify a 32-bit synchronous adder
circuit in VHDL
Design and verify a parallel 32-bit synchronous
adder circuit (N=2) in VHDL
Determine voltage used for power analysis
Determine power savings
Standard Block Design
64 bit input vector (2x32 bit)
33 bit output (32 bit + 1)
1 cycle delay for result
Component Verification - Adder
…1010 + …0101 = …1111 (C = 0)
…1011 + …0101 = …0000 (C = 1)
…0001 + …1111 = …0000 (C = 1)
…0000 + …1111 = …1111 (C = 0)
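The four checks above can be reproduced with a small behavioral model. This is a Python sketch rather than the VHDL used in the project, and `add_with_carry` is a hypothetical helper. Because the upper operand bits are elided on the slide, the sketch checks only the visible low-order bits at a width of 4, where these four patterns happen to give the same sums and carries as listed.

```python
def add_with_carry(a, b, width=32):
    """Return (sum, carry_out) of a width-bit unsigned addition."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask)
    return total & mask, total >> width


# (a, b, expected sum, expected carry): visible low-order bits only.
VISIBLE_VECTORS = [
    (0b1010, 0b0101, 0b1111, 0),
    (0b1011, 0b0101, 0b0000, 1),
    (0b0001, 0b1111, 0b0000, 1),
    (0b0000, 0b1111, 0b1111, 0),
]

for a, b, expected_sum, expected_carry in VISIBLE_VECTORS:
    assert add_with_carry(a, b, width=4) == (expected_sum, expected_carry)
```

The default width of 32 gives the 33-bit result (32-bit sum plus carry) described for the standard block.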
Standard Design Verification
Vector 1 clocked in at cycle 1, result at cycle 2
Vector 2 clocked in at cycle 3, result at cycle 4
Vector 3 clocked in at cycle 4, result at cycle 5
Low-Power Design
Same external I/O as standard design
Each adder uses a divided down clock of F/2
2 cycle delay for result
Component Verification – MPC
Takes in clock of frequency F and outputs a divided down clock with frequency F/2
Clock In – 100 MHz
Clock Out – 50 MHz
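The divide-by-two behavior described above can be sketched as a toggle on each rising input edge. This is a Python illustration under that assumption, not the project's VHDL component; `divide_by_two` and `rising_edges` are hypothetical names, and the clock is modeled as a list of sampled 0/1 levels.

```python
def rising_edges(samples):
    """Count 0 -> 1 transitions, assuming the signal starts low."""
    count, prev = 0, 0
    for level in samples:
        if prev == 0 and level == 1:
            count += 1
        prev = level
    return count


def divide_by_two(clock_in):
    """Toggle the output on every rising edge of the input clock."""
    out, state, prev = [], 0, 0
    for level in clock_in:
        if prev == 0 and level == 1:
            state ^= 1  # flip the output once per input period
        out.append(state)
        prev = level
    return out


clock_in = [0, 1] * 4            # four input periods
clock_out = divide_by_two(clock_in)
```

Over the same window the output has half as many rising edges as the input, which matches the 100 MHz in and 50 MHz out figures.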
Low-Power Design Verification
Vector 1 clocked in at cycle 1, result at cycle 3
Vector 2 clocked in at cycle 3, result at cycle 5
Vector 3 clocked in at cycle 4, result at cycle 6
Vector 4 clocked in at cycle 5, result at cycle 7
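The cycle numbers above follow from a fixed latency of 1 cycle for the standard design and 2 cycles for the low-power design. The Python sketch below reproduces both schedules. The lane assignment (`cycle % 2`) is an assumption about how vectors alternate between the two adders and is not stated on the slides.

```python
def result_schedule(input_cycles, latency):
    """Map each input cycle to the cycle at which its result appears."""
    return {cycle: cycle + latency for cycle in input_cycles}


def assign_lanes(input_cycles, n=2):
    """Assumed dispatch: the adder used is chosen by cycle parity."""
    return {cycle: cycle % n for cycle in input_cycles}


standard = result_schedule([1, 3, 4], latency=1)
low_power = result_schedule([1, 3, 4, 5], latency=2)
lanes = assign_lanes([1, 3, 4, 5])

for cycle, done in low_power.items():
    print(f"in at cycle {cycle} (adder {lanes[cycle]}), result at cycle {done}")
```

Under the assumed parity dispatch, two vectors sent to the same adder are always at least two cycles apart, so each adder can run from the F/2 clock while the design still accepts one vector per cycle.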
Next Steps
VHDL models were optimized in Leonardo Spectrum
(Level 3) and converted to Verilog
Verilog files were converted to .myrutmod and then
analyzed in Powersim
Null Results for Low Power Model (0.0000000…)
Segmentation Faults
Must Use Design Architect / Eldo for analysis
Use Verilog Files
Must determine appropriate voltages for Simulation /
Analysis for meaningful comparisons.
Delay Calculation
Supply voltages determined according to formula :
F = k*(Vdd - Vt)/Vdd
165MHz = k*(1.8V - 0.38V)/1.8V, so k = 209.2 MHz
1.5V (~156MHz), 0.9V (~120MHz) for standard supply
0.75V (~100MHz), 0.65V (~85MHz), and 0.5V
(~50MHz) for low-power supply voltages
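The arithmetic behind these figures can be checked with a short Python sketch of the formula as given, F = k*(Vdd - Vt)/Vdd, using Vt = 0.38V and the 165MHz reference at 1.8V from the slide. The function names are hypothetical.

```python
VT = 0.38            # threshold voltage in volts (from the slide)
VDD_REF = 1.8        # reference supply in volts
F_REF_MHZ = 165.0    # reference frequency at VDD_REF


def fit_k(f_ref, vdd_ref, vt=VT):
    """Solve F = k * (Vdd - Vt) / Vdd for k at the reference point."""
    return f_ref * vdd_ref / (vdd_ref - vt)


def max_frequency(vdd, k, vt=VT):
    """Estimated maximum frequency at supply vdd, same units as k."""
    return k * (vdd - vt) / vdd


K_MHZ = fit_k(F_REF_MHZ, VDD_REF)
for vdd in (1.5, 0.9, 0.75, 0.65, 0.5):
    print(f"{vdd:.2f} V -> {max_frequency(vdd, K_MHZ):.1f} MHz")
```

The computed values land close to the rounded frequencies listed above. This linear model is only the approximation used for selecting voltages.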
Run both models at 100 MHz
Simple periods
Much slack for 1.5V supply (~3-4ns)
The low-power model also has slack, since each adder only needs to run at
50MHz (2x delay)
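The slack figures can be estimated as the required clock period minus the minimum period the circuit supports. The Python sketch below uses the rounded frequencies from the delay calculation. `slack_ns` is a hypothetical helper, and pairing the 0.75V (~100MHz) estimate with the 50MHz requirement is an illustrative choice.

```python
def slack_ns(f_max_mhz, f_required_mhz):
    """Timing slack in ns: required period minus minimum achievable period."""
    return 1e3 / f_required_mhz - 1e3 / f_max_mhz


# Standard model at 1.5V (~156MHz capability) clocked at 100MHz.
standard_slack = slack_ns(156.0, 100.0)

# Low-power model at 0.75V (~100MHz capability); each adder needs only 50MHz.
low_power_slack = slack_ns(100.0, 50.0)

print(f"standard: {standard_slack:.2f} ns, low-power: {low_power_slack:.2f} ns")
```

The standard-model value falls in the ~3-4ns range quoted for the 1.5V supply.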
Observations
1.5V and 0.9V for standard model both give verified
results
0.9V shows slightly more delay than 1.5V
See next slide for comparison of more exaggerated 1.8V
and 0.75V results for standard design
Simulation for 0.5V gives unreasonably low power for LP
model
EZ-Wave won't load, so correct operation can't be confirmed;
this figure is assumed not to be legitimate.
It is assumed that the 0.5V circuit doesn't produce results fast enough,
thus eliminating dynamic power.
The delay calculation is falsely inflated because the alpha-power law was not used;
use the 0.65V / 0.75V simulations, since they are still above the 50MHz requirement
Eldo Verification
Verification results for 1.8V and 0.75V with similar vectors
0.75V too slow for verified results
1.8V gives too much slack
Power Reduction
Admittedly, some power savings are seen in the low-power cases simply because
so much slack is allowed for the 1.5V supply.
Good power reduction is still seen for the LP model at
0.75V/0.65V when comparing to 0.9V standard supply
~63% and 75%, respectively
Conclusions
Notable power reduction can be achieved by using
component parallelism and a reduced supply voltage.
16% more area (~700 gates instead of ~600)
Verification / Analysis is a critical part of design
Understanding of available design tools is key
Future Work
Measure power reduction for a range of parallel
scale implementations (N = 3, 4, 5…) to determine
optimal implementation for savings.
Determine exact formula for delay vs. voltage
through experimental results
Questions?