FPGA Power Reduction Using Configurable Dual-Vdd


Trace-Based Framework for Concurrent Development of Process and FPGA Architecture Considering Process Variation and Reliability
1Lerong Cheng, 1Yan Lin, 1Lei He, and 2Yu Cao
1EE Department, UCLA
2EE Department, ASU
Address comments to [email protected]
Outline
 Introduction
   Review of existing work
   Process models
 Concurrent development of process and architecture
   Power and delay
   Process variation
 Concurrent development for reliability
   Device aging
   Permanent soft error rate (SER)
   Interaction between process variation and reliability
 Conclusion
Review of Previous Work
[Figure: Ptrace flow. Blocks: Application Benchmark Set and Input Vector; FPGA Architecture; VPR and Psim; Trace (circuit element usage, switching activity, short circuit power ratio, critical path structure); Circuit Level Delay, Power and Area; Ptrace; Chip Level Delay, Power, and Area]
 Device and architecture co-optimization
   Power and delay [Cheng DAC’05]
   Process variation [Wong ICCAD’05]
   Soft error rate [Lin, ICCAD’07]
Limitation of Ptrace
 Ptrace requires a stable SPICE model that covers all process corners
   A SPICE model is not available at the early stage of process development
 Circuit simulation for all process corners is time-consuming
   The accuracy of circuit simulation is not needed for quick architecture evaluation
 Does not handle realistic variation
   Non-Gaussian variation sources
   Spatial correlation
 Does not handle device aging
Extended Ptrace (Ptrace2)
[Figure: Ptrace2 flow]
Input: Process parameters
PTrace2: Transistor Electrical Characteristics; Circuit Level Power and Delay Estimation; Trace (Circuit Element Statistics, Critical Path Structure, Switching Activity); Chip Level Power and Delay Estimation; Variation Analysis; Reliability
Output: Chip Level (Leakage Power, Dynamic Power, Delay); Process Variation (Power Distribution, Delay Distribution); Reliability (Device Aging, Soft Error Rate)
Early-Stage Circuit Modeling
 ITRS MASTAR4 model [ITRS MASTAR4 2005]
   Inputs: Lgate, Tox, Nbulk, Xjext, W, Racc, T, Vdd
   Outputs: Ioff, Ion, Igon, Igoff, Cg, Cdiff
Circuit Level and Chip Level Power and Delay
 Circuit level power and delay
   Inverter
   Pass transistor driven by an inverter
 Chip level power and delay
   Similar to the original Ptrace [Cheng DAC’05, Wong ICCAD’05]
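One way to read the trace-based chip-level estimate is as a weighted sum: per-element circuit-level numbers are multiplied by the usage counts and switching activities recorded in the trace, and delay is accumulated along the critical path structure. The sketch below assumes that form. The `ElementModel` and `TraceEntry` types, the element names, and every numeric value are hypothetical placeholders, not data or code from Ptrace2.

```python
from dataclasses import dataclass


@dataclass
class ElementModel:
    """Circuit-level numbers for one element type (hypothetical values)."""
    leakage_w: float        # static power per instance, in W
    switch_energy_j: float  # energy per output transition, in J
    delay_s: float          # propagation delay, in s


@dataclass
class TraceEntry:
    """Trace statistics for one element type (hypothetical values)."""
    count: int              # used instances on the chip
    activity: float         # average transitions per clock cycle
    on_critical_path: int   # instances that lie on the critical path


def chip_estimate(models, trace, clock_hz):
    """Return (leakage W, dynamic W, critical-path delay s) as weighted sums."""
    leakage = sum(models[k].leakage_w * t.count for k, t in trace.items())
    dynamic = sum(
        models[k].switch_energy_j * t.activity * t.count * clock_hz
        for k, t in trace.items()
    )
    delay = sum(models[k].delay_s * t.on_critical_path for k, t in trace.items())
    return leakage, dynamic, delay


# Assumed example inputs, for illustration only.
models = {
    "lut": ElementModel(1e-6, 1e-14, 2e-10),
    "switch": ElementModel(2e-7, 5e-15, 1e-10),
}
trace = {
    "lut": TraceEntry(count=1000, activity=0.2, on_critical_path=5),
    "switch": TraceEntry(count=10000, activity=0.1, on_critical_path=20),
}
leakage_w, dynamic_w, delay_s = chip_estimate(models, trace, clock_hz=1e8)
```

Because the trace is collected once per benchmark and architecture, only the `models` dictionary would need to be recomputed when the process parameters change, which is what makes this style of estimate fast enough for exploring many device settings.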
Experimental Setting
 20 MCNC benchmarks
   Assume all 20 MCNC benchmarks are placed on the same chip
 ITRS high performance 32nm technology (HP32)
 Architecture
   Cluster size N=6
   LUT size K=7
   Wire segment length W=4
 Device
   Vdd=1.0, 1.05, 1.1 V
   Lgate=31, 32, 33 nm
 Baseline ITRS HP32
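The device settings listed above form a 3 x 3 grid of (Vdd, Lgate) points. A minimal sketch of an exhaustive search over that grid for the smallest energy-delay product follows. The `evaluate` callback stands in for a full chip-level estimate, and `toy_evaluate` is a hypothetical placeholder with no physical meaning; it exists only so the loop can run.

```python
from itertools import product

VDD_VALUES = (1.0, 1.05, 1.1)   # V, from the experimental setting
LGATE_VALUES = (31, 32, 33)     # nm, from the experimental setting


def search_min_ed(evaluate):
    """Return (ed, vdd, lgate) for the grid point with the smallest ED."""
    best = None
    for vdd, lgate in product(VDD_VALUES, LGATE_VALUES):
        power_w, delay_ns = evaluate(vdd, lgate)
        ed = power_w * delay_ns * delay_ns  # energy (nJ) times delay (ns)
        if best is None or ed < best[0]:
            best = (ed, vdd, lgate)
    return best


def toy_evaluate(vdd, lgate):
    """Hypothetical placeholder, NOT a device model: returns (power W, delay ns)."""
    power_w = vdd ** 2 * (32.0 / lgate)
    delay_ns = (lgate / 32.0) / vdd
    return power_w, delay_ns


grid_size = len(list(product(VDD_VALUES, LGATE_VALUES)))
best_ed, best_vdd, best_lgate = search_min_ed(toy_evaluate)
```

With only nine points, exhaustive enumeration is sufficient; a larger search space would call for a smarter strategy, which the slides do not discuss.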
Delay and Power Tradeoff
 3.1X energy span and 1.3X delay span within search space
Power and Delay Optimization
Device    Power (W)   Delay (ns)   Energy (nJ)   ED (nJ·ns)
HP32      1.19        3.90         1.88          22.6
Min-ED    0.77        4.55         3.50          15.9 (-29.4%)
 Device tuning reduces energy delay product by 29.4%
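The energy and energy-delay (ED) columns follow from power and delay: energy is power times delay (W x ns gives nJ), and ED is energy times delay. The short check below applies this to the Min-ED row only; the function names are illustrative.

```python
def energy_nj(power_w, delay_ns):
    """Energy per cycle in nJ: watts times nanoseconds."""
    return power_w * delay_ns


def energy_delay(power_w, delay_ns):
    """Energy-delay product in nJ*ns."""
    return energy_nj(power_w, delay_ns) * delay_ns


# Min-ED row from the table: 0.77 W and 4.55 ns.
min_ed_energy = energy_nj(0.77, 4.55)
min_ed_product = energy_delay(0.77, 4.55)
print(round(min_ed_energy, 2), round(min_ed_product, 1))  # 3.5 15.9
```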
Experimental Setting
 Variation sources
   Doping density Nbulk
     3σg=5% of nominal value, 3σr=3% of nominal value
   Gate channel length Lgate
     3σg=0.8nm, 3σr=0.6nm
 Simulation
   M=10,000 sample Monte Carlo simulation
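A minimal sketch of how such a Monte Carlo run could be organized is shown below. It rests on several assumptions that the slides do not state: σg is treated as one offset shared by every element on a chip, σr as an independent offset per element, both are drawn from Gaussian distributions, the nominal Lgate is taken as 32 nm, and Nbulk is normalized to 1.0. The `toy_delay` metric is a hypothetical stand-in for the chip-level estimate.

```python
import random
import statistics


def sample_parameter(nominal, three_sigma_g, three_sigma_r, n_elements, rng):
    """One chip sample: a shared global offset plus a random offset per element."""
    global_offset = rng.gauss(0.0, three_sigma_g / 3.0)
    return [
        nominal + global_offset + rng.gauss(0.0, three_sigma_r / 3.0)
        for _ in range(n_elements)
    ]


def monte_carlo(metric, n_samples, rng, n_elements=8):
    """Return (mean, standard deviation) of metric over n_samples chips."""
    results = []
    for _ in range(n_samples):
        lgate_nm = sample_parameter(32.0, 0.8, 0.6, n_elements, rng)
        nbulk = sample_parameter(1.0, 0.05, 0.03, n_elements, rng)
        results.append(metric(lgate_nm, nbulk))
    return statistics.mean(results), statistics.stdev(results)


def toy_delay(lgate_nm, nbulk):
    """Hypothetical metric: path delay proportional to the summed gate lengths."""
    return sum(lgate_nm) / 32.0


rng = random.Random(0)  # fixed seed so the run is repeatable
mean_delay, sigma_delay = monte_carlo(toy_delay, 10_000, rng)
```

Under these assumptions the shared component adds linearly across the eight elements while the per-element component partially averages out, so the global term dominates the spread of the toy metric.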
Power and Delay Distribution
Power and Delay Variation
          Leakage (mW)          Delay (ns)
Device    µ      σ              µ       σ
HP32      942    334            3.91    0.119
Min-ED    340    45 (-87%)      4.55    0.159 (+34%)
 The Min-ED device setting significantly reduces leakage variation with a small increase in delay variation
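The percentages in the table compare the Min-ED standard deviation against the HP32 one. The sketch below reproduces them and also computes σ/µ, a normalized spread that the slides do not report but that follows directly from the tabulated values. The helper names are illustrative.

```python
def percent_change(baseline, value):
    """Relative change of value with respect to baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def coefficient_of_variation(mean, sigma):
    """Spread normalized by the mean (sigma / mu)."""
    return sigma / mean


# (mean, sigma) pairs copied from the table.
leakage_mw = {"HP32": (942.0, 334.0), "Min-ED": (340.0, 45.0)}
delay_ns = {"HP32": (3.91, 0.119), "Min-ED": (4.55, 0.159)}

leak_sigma_change = percent_change(leakage_mw["HP32"][1], leakage_mw["Min-ED"][1])
delay_sigma_change = percent_change(delay_ns["HP32"][1], delay_ns["Min-ED"][1])
leak_cv = {k: coefficient_of_variation(m, s) for k, (m, s) in leakage_mw.items()}

print(round(leak_sigma_change), round(delay_sigma_change))  # -87 34
```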
NBTI and HCI
 Negative-bias-temperature-instability (NBTI) increases the threshold voltage of PMOS [Wang DAC’06]
 Hot-carrier-injection (HCI) increases the threshold voltage of NMOS [Wang CICC’07]
   Inputs: Lgate, Tox, Nbulk, Xjext, W, Racc, T, Vdd
   Outputs: ΔVth(NBTI), ΔVth(HCI)
Vth Increase Caused by NBTI and HCI
 Vth increase is the most significant in the first year
 Device burn-in can be applied to reduce the impact of device
aging
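The slides do not give the aging equations. A common simplification models the NBTI threshold shift as a power law in stress time, ΔVth proportional to t^n with n near 1/6. Assuming that form (it is an assumption here, not the model of the cited work), the sketch below shows why most of the shift happens early and why burn-in leaves less degradation for the field lifetime.

```python
def shift_fraction(t_years, horizon_years=10.0, exponent=1.0 / 6.0):
    """Fraction of the horizon-year Vth shift reached after t_years (assumed power law)."""
    return (t_years / horizon_years) ** exponent


def remaining_after_burn_in(burn_in_years, horizon_years=10.0, exponent=1.0 / 6.0):
    """Fraction of the total shift still to come once burn-in is finished."""
    return 1.0 - shift_fraction(burn_in_years, horizon_years, exponent)


first_year = shift_fraction(1.0)            # about 0.68 of the 10-year shift
left_after_one_year = remaining_after_burn_in(1.0)
```

Under this assumed exponent, roughly two thirds of the 10-year shift is reached after the first year, which is consistent with the qualitative statement above. The burn-in duration used here is an arbitrary example.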
Impact of Device Burn-in
          W/O Burn-in                                        W/ Burn-in
          Current           10 years                         Current           10 years
Device    P (mW)   D (ns)   P (mW)         D (ns)            P (mW)   D (ns)   P (mW)         D (ns)
HP32      854      3.90     640 (-25.1%)   4.23 (+8.5%)      711      4.01     627 (-10.0%)   4.25 (+5.5%)
Min-ED    328      4.55     311 (-5.2%)    4.64 (+2.0%)      317      4.59     309 (-1.9%)    4.65 (+1.1%)
 The high performance device setting (HP32) is more sensitive to device aging
 Device aging leads to 8.5% delay degradation after 10 years
 Device burn-in reduces delay degradation from 8.5% to 5.5% after 10 years
Permanent Soft Error Rate
 Single-event upset (SEU) due to cosmic rays or high energy particles may affect configuration SRAMs in FPGAs and result in permanent soft error
   Inputs: Lgate, Tox, Nbulk, Xjext, W, Racc, T, Vdd
   Outputs: SER
SER under Different Device Settings
Device    SER (FIT)
HP32      362.45
Min-ED    368.25 (+1.6%)
 SER is similar for both device settings
Impact of Device Aging on Power and Delay Variation
          σ Leakage (mW)             σ Delay (ns)
Device    Current   10 years         Current   10 years
HP32      334       116 (-65.2%)     0.119     0.121 (+1.67%)
Min-ED    45.0      30.3 (-32.7%)    0.159     0.159 (+0.16%)
 Device aging significantly reduces leakage variation and slightly increases delay variation
Impact of Device Aging and Process Variation on SER
                  Current     10 years   Variation
SRAM SER (FIT)    2.914E-5    +0.3%      -0.18% ~ +0.17%
 Neither device aging nor process variation has significant impact on permanent SER
Conclusion
 A trace-based framework has been developed to enable concurrent development of process and FPGA architecture
 Device tuning reduces the energy delay product by 29.4%
 Applying device burn-in reduces delay degradation from 8.5% to 5.5% within 10 years
 Device aging significantly reduces leakage variation but has almost negligible impact on delay variation
 Neither device aging nor process variation has significant impact on permanent SER