Low Power SRAM
VLSI Final Presentation
Stephen Durant
Ryan Kruba
Matt Restivo
Voravit Vorapitat
High Level Architecture
[Diagram: ADDR, 5:32 block enable decoder, SRAM block, OUT]
Block I/O
[Diagram: READ, WRITE, ADDR, DATA, and BLK ENABLE inputs; SRAM block; OUT]
Output Buffering
[Diagram: SRAM block, sense amp, BLK_EN0 to BLK_EN3, 2:4 decoder on ADDR 13:12, ADDR 14, Out]
Block Level Architecture
[Diagram: BLK_EN, CLK, Pulse Gen 1, Precharge, 6:64 decoder on ADDR, SRAM block]
[Diagram: BLK_EN, Pulse Gen 1, Delay, Pulse Gen 2, four sense amps (SA), Write]
Input Gating
[Diagram: READ, WRITE, ADDR, DATA, register, buffer, ADDR 14:13, SRAM block x8]
Word Line Pulse
[Diagram: WL, WL Pulse]
Pulse the WL to reduce the drop in bit line voltage during a read
Size the inverters to create the minimum WL pulse length
The minimum WL pulse length sits just above the point where the sense amp can no longer execute a read
Sense Amp Enabling
Sense amp is enabled after the WL pulse to maximize differential current
Wordline pulse generator clocks a second pulse generator to ensure proper SA timing
SAE signal and precharge signal are separate to allow outputs to hold to the end of the clock cycle
Sense Amp
[Schematic: sense amp with VDD, SAE, BL, and BLB terminals]
Size the three NMOS transistors to control:
Bit line voltage drop
Delay
Gate Length Vs. Bit Line Voltage Drop
Using a 5 V Vdd and allowing OutB to drop to 4 V min
[Plot: bit line voltage drop (0 to 2) vs. length (0 to 7)]
Delay from SAE to Out
[Plot: time (ns, 0 to 2.5) vs. length (0 to 7)]
•From 50% SAE high to 50% Out low
•Same parameters as bit line voltage graph
Memory Partitioning
32 blocks * 256 rows * 128 columns
Balance between idle block power savings and peripheral circuitry
Resulting block aspect ratio relatively square to limit maximum WL/BL capacitances
WL partitioning and four words/row to reduce power
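The partitioning figures above imply a capacity and an address width that can be checked with plain arithmetic. The sketch below is only an illustration: the constant names are invented here, and the split of address bits into block, row, and word fields is an assumption about how the numbers combine, not a statement of the actual address map.

```python
# Figures taken from the slide.
BLOCKS = 32
ROWS_PER_BLOCK = 256
COLS_PER_BLOCK = 128
WORDS_PER_ROW = 4


def log2_exact(n: int) -> int:
    """Return log2(n) for a power of two (hypothetical helper)."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1


# Derived quantities.
total_bits = BLOCKS * ROWS_PER_BLOCK * COLS_PER_BLOCK
bits_per_word = COLS_PER_BLOCK // WORDS_PER_ROW
total_words = total_bits // bits_per_word

# Assumed field split: block select + row select + word-in-row select.
address_bits = (
    log2_exact(BLOCKS) + log2_exact(ROWS_PER_BLOCK) + log2_exact(WORDS_PER_ROW)
)

print(total_bits, bits_per_word, total_words, address_bits)
# prints: 1048576 32 32768 15
```

Under these assumptions the array holds 1 Mbit as 32K words of 32 bits, and 15 address bits would match ADDR 14 being the highest address bit shown in the architecture diagrams.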
Simulation Model
Multiple voltage sources to accurately measure energy: wordline, active column, inactive column, and peripheral
$E_{total} = E_{WL} + 32E_{act} + 96E_{inact} + E_{peripheral}$
[Diagram: simulation schematic; labels: Bitcell, Dummy Cells, Sense Amp, VSS]
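The energy sum on this slide combines one measurement per voltage source, weighted by how many columns of each kind sit on the selected row (32 active plus 96 inactive gives the 128 columns of a block). A minimal sketch of that bookkeeping follows; the function name and the per-source values are hypothetical and are not simulation results from the presentation.

```python
def total_energy(e_wl, e_act, e_inact, e_peripheral, n_act=32, n_inact=96):
    """E_total = E_WL + 32*E_act + 96*E_inact + E_peripheral (slide formula).

    All arguments are energies in the same unit (for example pJ).
    """
    return e_wl + n_act * e_act + n_inact * e_inact + e_peripheral


# Hypothetical per-source energies in pJ, for illustration only.
example = total_energy(e_wl=1.0, e_act=0.5, e_inact=0.1, e_peripheral=2.0)
print(f"E_total = {example:.1f} pJ")
```

Measuring each source separately makes it visible which term dominates, which is harder to see from a single supply current.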
Low Power Techniques
Optimal Signal Order for Energy
Goal: Making WL pulse as short as possible.
Read
SAE must be asserted only after WL pulse ends.
Write
WL pulse must start after BL or BLB is completely discharged.
[Timing diagram: CLK, BL, WL, and SAE over the sequence Write ’0’, Read, Write ’1’, Read]
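The two ordering rules above can be written as simple checks on event times taken from a simulation waveform. This is a sketch only: the class and field names are hypothetical, and the times in the usage lines are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReadTiming:
    wl_rise: float   # time the WL pulse starts
    wl_fall: float   # time the WL pulse ends
    sae_rise: float  # time SAE is asserted


@dataclass
class WriteTiming:
    bl_discharged: float  # time BL or BLB has completely discharged
    wl_rise: float        # time the WL pulse starts
    wl_fall: float        # time the WL pulse ends


def read_order_ok(t: ReadTiming) -> bool:
    """Read rule: SAE is asserted only after the WL pulse ends."""
    return t.wl_rise < t.wl_fall <= t.sae_rise


def write_order_ok(t: WriteTiming) -> bool:
    """Write rule: the WL pulse starts after BL or BLB has discharged."""
    return t.bl_discharged <= t.wl_rise < t.wl_fall


# Made-up event times in ns.
print(read_order_ok(ReadTiming(wl_rise=1.0, wl_fall=2.0, sae_rise=2.2)))
print(write_order_ok(WriteTiming(bl_discharged=0.8, wl_rise=1.0, wl_fall=2.0)))
```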
Lower Vdd
$E = C_{eff} V_{dd}^2$ (rail to rail)
- Expected quadratic energy reduction
$E = C_{eff} V_{dd} \Delta V$ (BL/BLB during read)
- $\Delta V$ should scale down but may not scale as fast as $V_{dd}$, so we expect between linear and quadratic energy reduction.
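To make the "between linear and quadratic" expectation concrete, the sketch below evaluates both energy formulas when Vdd is halved. The capacitance, supply voltages, and bit line swings are hypothetical values chosen only to show the two limiting cases.

```python
def rail_to_rail_energy(c_eff, vdd):
    """E = C_eff * Vdd^2 for a node that swings rail to rail."""
    return c_eff * vdd ** 2


def bitline_read_energy(c_eff, vdd, delta_v):
    """E = C_eff * Vdd * dV for a bit line that drops by dV during a read."""
    return c_eff * vdd * delta_v


C_EFF = 1.0e-12  # hypothetical 1 pF effective capacitance

# Rail-to-rail node: halving Vdd quarters the energy (quadratic).
rail_ratio = rail_to_rail_energy(C_EFF, 1.5) / rail_to_rail_energy(C_EFF, 3.0)

# Bit line, limiting case 1: dV does not scale at all (linear reduction).
fixed_dv_ratio = (
    bitline_read_energy(C_EFF, 1.5, 0.2) / bitline_read_energy(C_EFF, 3.0, 0.2)
)

# Bit line, limiting case 2: dV scales with Vdd (quadratic reduction).
scaled_dv_ratio = (
    bitline_read_energy(C_EFF, 1.5, 0.1) / bitline_read_energy(C_EFF, 3.0, 0.2)
)

print(rail_ratio, fixed_dv_ratio, scaled_dv_ratio)
```

A real design would land somewhere between the last two ratios, depending on how much the bit line swing shrinks with the supply.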
Simulation Result for 1 bit
Energy vs Vdd for 1 bit read/write
[Plot: energy (pJ, 0 to 6) vs. Vdd (V, 0 to 3); series: WL, Write, Read, DRead]
Note: The Read/Write/DRead shown here is BL energy only
Energy vs Vdd for 32 bit read/write
[Plot: energy (pJ, 0 to 250) vs. Vdd (V, 0 to 3); series: TotalWrite, TotalRead, Total Average; annotation: "How far should we go?"]
$E_{Read32} = E_{WL} + 32E_{Read} + 96E_{DRead}$
$E_{Write32} = E_{WL} + 32E_{Write} + 96E_{DRead}$
Clock Gating
Try to reduce the capacitance that high-activity signals have to drive.
Example: the WL pulse has to drive 256 2-input NAND gates!
Level 0
EffLoad=256
Level 1
EffLoad=128+2
Even Further
Level 2
EffLoad=64+4
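The three EffLoad figures above follow a pattern: each gating level halves the number of NAND gates the WL pulse reaches directly and doubles the number of gating gates it has to drive. The sketch below reproduces the three slide values. The function name is hypothetical, and treating the pattern as a general formula for other levels is an assumption, not something the slides state.

```python
def effective_load(level: int, rows: int = 256) -> int:
    """Gate inputs driven by the WL pulse at a given clock gating level.

    Level 0 drives every row NAND directly. At level k >= 1 the pulse is
    assumed to drive rows / 2**k row NANDs plus 2**k gating gates.
    """
    if level == 0:
        return rows
    groups = 2 ** level
    return rows // groups + groups


for level in range(3):
    print(level, effective_load(level))
# prints:
# 0 256
# 1 130
# 2 68
```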
Simulation Result
Energy vs Clock Gating Level
[Plot: energy (pJ, 0 to 25) vs. level of clock gating (0 to 3.5)]
Some notes about clock gating
It acts like a decoder. In our design we chose level 2 clock gating for the WL pulse, so we no longer need an 8 to 256 decoder; we just need the 6 to 64.
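The decoder saving described above can be checked with a short calculation: if each gating level absorbs one address bit, level 2 gating on 256 rows leaves a 6-input decoder. The sketch assumes exactly that one-bit-per-level relationship, and the function name is hypothetical.

```python
def decoder_size(rows: int, gating_level: int) -> tuple[int, int]:
    """Return (inputs, outputs) of the row decoder left after clock gating.

    Assumes rows is a power of two and each gating level removes one
    address bit from the decoder.
    """
    full_inputs = (rows - 1).bit_length()
    inputs = full_inputs - gating_level
    return inputs, 2 ** inputs


print(decoder_size(256, 0))  # no gating
print(decoder_size(256, 2))  # level 2 gating, as chosen in the design
```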