Sunil's presentation - Texas A&M University


A Variation-tolerant Subthreshold Design Approach
Nikhil Jayakumar
Sunil P. Khatri
Texas A&M University,
College Station, TX
Motivation
- In recent times, chip power consumption has become a significant hurdle
- Higher power consumption leads to
  - Shortened battery life
  - Higher on-chip temperatures, which reduce the operating life of the chip
- There is a large and growing class of applications where power reduction, not speed, is paramount
- Such applications are ideal candidates for sub-threshold logic
Sub-threshold Leakage
$$I_{ds} = I_{D0} \, \frac{W}{L} \, e^{\frac{V_{gs} - V_T - V_{off}}{n v_t}} \left( 1 - e^{-\frac{V_{ds}}{v_t}} \right)$$
- As supply voltage scales down, the VT of the devices is scaled down as well
  - A lower VT results in exponentially higher leakage
  - Leakage power is becoming comparable with dynamic power
- A larger VT would reduce leakage but increase delay
  - We can turn this dilemma into an opportunity!
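The exponential dependence of leakage on VT can be illustrated numerically. The sketch below evaluates the standard sub-threshold expression I_ds = I_D0 (W/L) exp((Vgs - VT - Voff)/(n vt)) (1 - exp(-Vds/vt)). The parameter values (I_D0, W/L, Voff, n, temperature, and the two VT values) are illustrative assumptions, not values taken from the slides.

```python
import math

K_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, in V/K


def thermal_voltage(temp_kelvin=300.0):
    """Thermal voltage vt = kT/q in volts."""
    return K_OVER_Q * temp_kelvin


def subthreshold_current(vgs, vds, vt_threshold, *, i_d0=1e-7, w_over_l=1.0,
                         v_off=0.0, n=1.5, temp_kelvin=300.0):
    """Sub-threshold drain current; all default parameters are assumed."""
    vt = thermal_voltage(temp_kelvin)
    gate_term = math.exp((vgs - vt_threshold - v_off) / (n * vt))
    drain_term = 1.0 - math.exp(-vds / vt)
    return i_d0 * w_over_l * gate_term * drain_term


# Leakage (Vgs = 0) for two hypothetical threshold voltages 100 mV apart.
leak_high_vt = subthreshold_current(0.0, 1.0, 0.30)
leak_low_vt = subthreshold_current(0.0, 1.0, 0.20)
leakage_ratio = leak_low_vt / leak_high_vt
print(f"Lowering VT by 100 mV raises leakage by {leakage_ratio:.1f}x")
```

Under these assumptions the ratio depends only on the VT difference, n, and temperature: it equals exp(0.1 / (n vt)), which is roughly 13x for n = 1.5 at 300 K.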
The Opportunity
         Traditional Ckt                    Sub-threshold Ckt (Vb = 0V)  Sub-threshold Ckt (Vb = VDD)
Process  Delay (ps)  Power (W)  P-D-P (J)   Delay    Power     P-D-P     Delay    Power     P-D-P
bsim70   14.157      4.08E-05   5.82E-07    17.01X   308.82X   18.50X    9.93X    141.10X   14.43X
bsim100  17.118      6.39E-05   1.08E-06    24.60X   497.54X   20.08X    12.00X   100.96X   8.20X

Sub-threshold columns are ratios relative to the traditional circuit: delay increases, while power and P-D-P decrease.
- Performed simulations for 2 different processes on a 21-stage ring oscillator
  - Impressive power reduction (100X – 500X)
  - PDP improves by as much as 20X
- Delay penalty can be reduced by several means
  - Applying forward body bias
  - Decreasing VT
  - Circuit approaches
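The slides do not state how the P-D-P ratios were obtained. One plausible reading, treated here as an assumption, is that the P-D-P gain is approximately the power gain divided by the delay penalty. The sketch below checks that reading against the ratios reported for the ring oscillator.

```python
# (delay penalty, power gain, reported P-D-P gain), transcribed from the slide.
results = {
    ("bsim70", "Vb = 0V"): (17.01, 308.82, 18.50),
    ("bsim70", "Vb = VDD"): (9.93, 141.10, 14.43),
    ("bsim100", "Vb = 0V"): (24.60, 497.54, 20.08),
    ("bsim100", "Vb = VDD"): (12.00, 100.96, 8.20),
}


def pdp_gain(delay_penalty, power_gain):
    """Assumed relation: power drops by power_gain, delay grows by delay_penalty."""
    return power_gain / delay_penalty


estimates = {key: pdp_gain(d, p) for key, (d, p, _) in results.items()}

for key, (_, _, reported) in results.items():
    error = abs(estimates[key] - reported) / reported
    print(key, f"estimated {estimates[key]:.2f}X, reported {reported:.2f}X, "
               f"difference {error:.1%}")
```

The estimates agree with the reported values to within a few percent, which is consistent with the assumed relation but does not confirm that it is how the authors computed the figures.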
The Opportunity
Delay, power and P-D-P are ratios relative to the traditional circuit.

bsim70
VT (V)   Delay    Power     P-D-P
0.18     16.15X   167.52X   10.41X
0.17     14.88X   151.99X   10.09X
0.16     13.78X   137.73X   9.95X
0.15     13.15X   124.59X   8.86X
0.14     12.43X   112.73X   9.40X
0.13     12.32X   101.85X   8.02X

bsim100
VT (V)   Delay    Power     P-D-P
0.27     23.32X   479.85X   20.60X
0.25     22.43X   464.33X   20.16X
0.23     21.02X   444.23X   20.05X
0.21     18.69X   400.89X   20.27X
0.19     18.42X   366.28X   18.98X
0.17     17.51X   323.26X   17.98X
- We also performed experiments with lower VT values
  - Delays improved with decreasing VT values, as expected
  - PDP remained high
  - Power gains decreased with decreasing VT values
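The two trends stated above can be checked directly against the VT sweep data. The sketch below is only a consistency check on the numbers transcribed from the slide; the helper name is hypothetical.

```python
# (VT, delay penalty, power gain), transcribed from the slide, highest VT first.
sweeps = {
    "bsim70": [(0.18, 16.15, 167.52), (0.17, 14.88, 151.99),
               (0.16, 13.78, 137.73), (0.15, 13.15, 124.59),
               (0.14, 12.43, 112.73), (0.13, 12.32, 101.85)],
    "bsim100": [(0.27, 23.32, 479.85), (0.25, 22.43, 464.33),
                (0.23, 21.02, 444.23), (0.21, 18.69, 400.89),
                (0.19, 18.42, 366.28), (0.17, 17.51, 323.26)],
}


def strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


trends = {
    process: {
        "delay_penalty_falls": strictly_decreasing([d for _, d, _ in rows]),
        "power_gain_falls": strictly_decreasing([p for _, _, p in rows]),
    }
    for process, rows in sweeps.items()
}
print(trends)
```

For both processes the delay penalty and the power gain fall monotonically as VT is lowered, matching the bullets above.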
Sub-threshold Logic
- Advantages
  - Circuits get faster at higher temperature, hence a reduced need for expensive cooling techniques
  - Device transconductance is an exponential function of Vgs, which results in a high ratio of on to off current. Hence noise margins are near-ideal
- Disadvantages
  - Ids exhibits an exponential dependence on temperature
  - Ids also has a strong dependency on process variations (such as VT variations)
  - Ids is small
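The on/off ratio and the temperature sensitivity both follow from the exponential form of the sub-threshold current, I_ds ∝ exp((Vgs - VT - Voff)/(n vt)) (1 - exp(-Vds/vt)). A worked estimate, assuming the on and off states see the same Vds and VT, and taking n = 1.5 and T = 300 K as illustrative values that are not given in the slides:

```latex
\[
\frac{I_{on}}{I_{off}}
  = \frac{I_{ds}(V_{gs} = V_{DD})}{I_{ds}(V_{gs} = 0)}
  = e^{\frac{V_{DD}}{n v_t}},
\qquad
v_t = \frac{kT}{q} \approx 25.9\ \mathrm{mV} \text{ at } 300\ \mathrm{K}
\]
% Example with V_DD = 0.2 V and the assumed n = 1.5:
\[
\frac{V_{DD}}{n v_t} \approx \frac{0.2}{1.5 \times 0.0259} \approx 5.15,
\qquad
\frac{I_{on}}{I_{off}} \approx e^{5.15} \approx 170
\]
```

Because vt is proportional to T and appears in the exponent, the same expression shows why Ids is so sensitive to temperature and to small shifts in VT.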
Previous Approaches
- Paul et al. (2001) reported a sub-threshold multiplier
  - Compensation of Ids over P/T variations
- Tschanz et al. (2002) discuss a dynamic body bias technique to make a design tolerant to process variations
  - Applied in the context of regular CMOS technologies
  - Circuit delay matched to critical delay (hard to determine)
  - Matching is performed for the entire design monolithically
- In contrast to these, we:
  - Compensate sub-threshold delay over P/V/T variations
  - Apply our compensation to a network-of-PLA design
    - Critical delay is trivially determined
  - Perform compensation separately for clusters of spatially nearby PLAs
Our Solution
- We propose a technique that uses self-adjusting body bias to phase-lock the circuit delay to a beat clock
- Use a network of dynamic NOR-NOR PLAs to implement circuits
  - Regular, area and delay efficient approach
  - PLAs partitioned into clusters of 1000 PLAs each
  - All PLAs in a cluster share bulkn node
  - A representative PLA in the cluster is chosen to phase lock the delay of the cluster to the beat clock
  - Beat clock period determines circuit speed
    - bulkn voltage modulated via charge pump
    - If the delay is too high, a forward body bias is applied to speed up the PLA, and vice versa
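The feedback idea can be sketched as a toy behavioral loop. This is not the authors' circuit: the linear delay model, the `sensitivity` and `pump_step` parameters, and the function name are all hypothetical, chosen only to show how comparing the PLA delay against the beat clock period can drive the body bias toward a lock point.

```python
def simulate_phase_lock(beat_period, delay_at_zero_bias,
                        sensitivity=0.5, pump_step=0.01,
                        max_bias=0.4, cycles=200):
    """Toy bang-bang loop: raise the bias when the PLA is slow, lower it otherwise.

    Assumed delay model (hypothetical): delay falls linearly with forward bias.
    Returns a list of (bias, delay) pairs, one per beat clock cycle.
    """
    bias = 0.0
    trace = []
    for _ in range(cycles):
        delay = delay_at_zero_bias * (1.0 - sensitivity * bias)
        trace.append((bias, delay))
        if delay > beat_period:
            # "completion" lags the beat clock: pump bulkn toward forward bias.
            bias = min(max_bias, bias + pump_step)
        else:
            # "completion" leads the beat clock: relax bulkn back toward zero bias.
            bias = max(0.0, bias - pump_step)
    return trace


trace = simulate_phase_lock(beat_period=1.0, delay_at_zero_bias=1.1)
final_bias, final_delay = trace[-1]
print(f"bias settles near {final_bias:.2f}, delay near {final_delay:.3f}")
```

In this sketch a PLA that is 10% too slow at zero bias settles into a small dither around the bias that makes its delay match the beat period, while a PLA that is already fast enough stays at zero bias.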
Dynamic NOR-NOR PLA
[Figure: dynamic NOR-NOR PLA schematic, with labels Inputs, Outputs, clk and completion]
- We use precharged NOR-NOR PLAs as the structure of choice
  - Wordlines run horizontally
  - Inputs / their complements and outputs run vertically
  - Several PLAs in a cluster share a common bulkn node
  - Each PLA has a “completion” signal that switches low after all the outputs switch
The Charge Pump
[Figure: charge pump schematic, with labels pullup and pulldown]
- PLA “completion” signal lags beat clock
  - bulkn node gets forward biased
- PLA “completion” signal leads beat clock
  - bulkn goes back to zero bias
Effectiveness of the Approach
- We simulated a single PLA over 0 to 100 °C. We also applied VT variations (10%) and VDD variations (10%)
  - The light region shows the variations in delay over all the corners
  - The red region shows the delays with the self-adjusting body-bias circuit
Example Showing Phase Locking
[Figure: body bias response to supply changes; annotations: "VDD changed from 0.2 to 0.22V", "VDD changed from 0.22 to 0.18V"]
- This figure shows how the body bias (and hence the delay of the PLA) changes with changes in VDD
  - Note PLA delay remains relatively constant
  - The adjustment is very quick (within a clock cycle)
Summary
- Sub-threshold circuit design is promising due to extremely low power consumption
  - 100 – 500X power reduction, 10 – 25X speed penalty
  - Appealing for a widening class of applications
  - However, it is inherently not tolerant to PVT variations
- Our approach dynamically compensates for PVT variations
  - Lock the delay of a representative PLA in a cluster to a beat clock
  - Use a charge pump which modulates the bulkn bias voltage
  - Dramatic reduction in sensitivity to PVT variations
    - This can help achieve a significant yield improvement
Thank you!!