Research Projects Overview

Simultaneous Power and Thermal Integrity Driven Via Stapling in 3D ICs
Hao Yu, Joanna Ho and Lei He
Electrical Engineering Dept.
UCLA
Partially supported by NSF and UC-MICRO fund from Intel
New Solution for High-performance Integration


2D SoC has limited device density and interconnect
performance (delay)
Potential solution: 3D Integration


Fabrication Technologies: Chip-level Wafer Bonding or Die-level Silicon
Epitaxial Growth
Extra challenges: thermal integrity and power integrity
Thermal Challenge in 3D ICs

Inter-layer dielectrics are poor thermal conductors

the temperature of each die increases along third dimension, where
the heat sink is on the top
[Figure: die temperatures on a scale from 40 °C to 160 °C]

High temperature affects interconnect and device
reliability and brings variations to timing

Vertical vias are good thermal conductors

They can be used as thermal vias to remove the heat from each die
Power Delivery Challenge in 3D ICs

The voltage bounce is significant in P/G planes at the
bottom due to resonance

Large voltage bounce affects the performance of I/Os

Vertical vias can minimize the returned current path
and hence loop inductance

They can be used as power vias to reduce the voltage bounce for
each P/G plane
Via Planning Problem in 3D IC

Motivation

Staple vias from the top heat-sink to the bottom P/G planes

remove heat in silicon die and reduce voltage bounce in package plane

Too many? -> signal routing congestion
 Too few? -> reliability by current density

Previous work (thermal via planning)





Iterative via planning during placement [Goplen-Sapatnekar:ISPD’05]
Alternating-direction via planning during routing [Zhang-Cong:ICCAD’05]
Both use steady-state thermal analysis and ignore time-variant thermal power
Both ignore that vertical vias can also be designed to reduce the voltage bounce in the power supply
Primary contributions of our work


Formulate a levelized via stapling to simultaneously minimize both
temperature hotspot and voltage bounce
Develop an efficient sensitivity-driven optimization with use of
structured and parameterized macromodel
Outline

Modeling and Problem Formulation

Integrity Analysis and Sensitivity based
Optimization

Experimental Results

Conclusions
Electric and Thermal Duality

Thermal system          Electrical system
Temperature             Voltage: state variables x(t)
Thermal power           Input current sources u(t)
Thermal conductance     Electrical conductance G
Thermal capacitance     Electrical capacitance C
Both electric and thermal systems can be described in
MNA (modified nodal analysis)
Time domain:
$$G\,x(t) + C\,\frac{dx(t)}{dt} = B\,u(t), \qquad y(t) = L^T x(t)$$
Frequency domain:
$$(G + sC)\,x(s) = B\,u(s), \qquad y(s) = L^T x(s)$$
B and L are multi-input/output port matrices
y is the selected output response
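As a minimal sketch of the duality, the MNA form can be written down for a tiny thermal network and solved like a circuit. The 3-tile chain, the conductance and capacitance values, and the function name `response` are all illustrative assumptions, not data from the slides.

```python
import numpy as np

# Hypothetical 3-tile thermal chain: tile 0 touches the heat sink (ambient),
# neighbouring tiles are linked by thermal conductances. Values are illustrative.
g_sink, g_link = 2.0, 1.0      # thermal conductance, W/K
c_tile = 0.5                   # thermal capacitance per tile, J/K

G = np.array([[g_sink + g_link, -g_link,      0.0],
              [-g_link,          2 * g_link, -g_link],
              [0.0,             -g_link,      g_link]])
C = c_tile * np.eye(3)
B = np.array([[0.0], [0.0], [1.0]])   # thermal power injected at tile 2
L = np.array([[0.0], [0.0], [1.0]])   # observe the temperature of tile 2

def response(s, u=1.0):
    """Solve (G + sC) x(s) = B u(s) and return y(s) = L^T x(s)."""
    x = np.linalg.solve(G + s * C, B * u)
    return (L.T @ x).item()

# s = 0 gives the steady-state temperature rise for 1 W at tile 2.
t_rise_dc = response(0.0)
```

With these numbers the heat flows through 1/1 + 1/1 + 1/2 K/W in series, so `t_rise_dc` is 2.5 K, exactly as a resistor chain would give for a 1 A source.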
Two Distributed Networks for 3D IC




All device/dielectric layers and power planes are
discretized into tiles
A distributed electrical RLC model for power/ground
plane
A distributed thermal RC model for device/dielectric
layer
Each via is modeled by a RC pair
Thermal Model and Analysis

Steady-state thermal model and analysis




Tiles connected by thermal resistance
Heat sources modeled as time-invariant current sources
Steady-state temperature can be obtained by directly solving a
time-invariant linear equation
Transient thermal model and analysis



Tiles connected by thermal resistance and capacitance
Heat sources modeled as time-variant current sources
Transient temperature can be obtained by directly solving a time-variant linear equation
Need of Transient Thermal Modeling
Time-variant workload and dynamic power management introduce temporal and spatial thermal power variation

Thermal power is the runtime average of cycle-accurate power over the thermal time constant
Thermal power decides temperature

[Figure: power versus CPU cycles, showing cycle-accurate power, transient thermal power, and maximum thermal power, with time scales marked ns, ms, and s]

Steady-state analysis needs to assume a maximum thermal power simultaneously for all regions

But this rarely happens, so it can result in over-design
Direct transient analysis is accurate but time-consuming

This calls for more accurate yet efficient transient thermal modeling during design automation
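The averaging step can be illustrated with a short sketch. The bursty workload, the window length, and the function name `thermal_power` are hypothetical; the window stands in for the thermal time constant.

```python
def thermal_power(cycle_power, window):
    """Running average of cycle-accurate power over the last `window` samples."""
    averaged = []
    acc = 0.0
    for i, p in enumerate(cycle_power):
        acc += p
        if i >= window:
            acc -= cycle_power[i - window]   # drop the sample leaving the window
        averaged.append(acc / min(i + 1, window))
    return averaged

# Hypothetical bursty workload: 10 W for 2 cycles out of every 8, else 1 W.
cycle_power = [10.0 if (k % 8) < 2 else 1.0 for k in range(64)]
avg = thermal_power(cycle_power, window=8)

peak_cycle_power = max(cycle_power)   # what a worst-case steady-state run assumes
settled_thermal_power = avg[-1]       # what the tile actually averages to
```

Once the window is full, the averaged power settles at 3.25 W while the cycle-accurate peak is 10 W, which is the gap that makes a maximum-power steady-state assumption pessimistic.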
Need of Simultaneous Thermal/Power Co-Design

Temperature hotspots usually distribute differently
from voltage bounce



A thermal integrity map tends to result in a uniform via stapling
pattern
A power integrity map tends to result in a biased via stapling pattern
in center
Considering thermal and power integrity separately
may also lead to over-design
Problem Formulation
Via Stapling

A levelized via stapling is used
• Each level has a different via density Di

Minimize via number under thermal/power integrity constraint

[Figure: levelized stapling pattern with via densities D0, D1, D2]

Di      levelized via density
ni      via number at different level
Vmax    power integrity constraint
Tmax    thermal integrity constraint
Dmax    congestion from signal via
Dmin    current density constraints

It can be efficiently solved by a sensitivity-based optimization

The sensitivity is calculated from a structured and parameterized macromodel
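Read together, the symbols above suggest a constrained minimization of roughly the following shape. This is a sketch assembled from the symbol list, not the authors' exact statement; in particular, treating the objective as the sum of the per-level via numbers n_i, and V and T as the worst-case voltage bounce and temperature, are assumptions.

```latex
\begin{aligned}
\min_{D_0,\dots,D_K} \quad & \sum_{i} n_i \\
\text{s.t.} \quad & V \le V_{max} && \text{(power integrity)} \\
& T \le T_{max} && \text{(thermal integrity)} \\
& D_{min} \le D_i \le D_{max} && \text{(current density and signal-via congestion, every level } i\text{)}
\end{aligned}
```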
Parameterized System Equation

The levelized stapling pattern is described by an adjacency matrix X

[Figure: example stapling pattern on nodes numbered 1 to 8 and the corresponding adjacency matrix X, with the entry X(2,6) called out]

Via conductance gi and capacitance ci are both proportional to the
area Di or density (Di/a) (a is unit via area)

Both Di and Xi are parametrically added into the nominal MNA equation
$$\Big[G_0 + sC_0 + \sum_{i=1}^{K} D_i\,(g_i + s\,c_i)\Big]\,x(D,s) = B\,u(s)$$
$$y(D,s) = L^T x(D,s)$$
where $g_i = g_0 X_i$ and $c_i = c_0 X_i$
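A minimal sketch of how such a parameterized matrix could be assembled follows. The 4-node network, the numeric values, and the helper names `via_stamp` and `system_matrix` are hypothetical, and modeling each pattern X_i as a conductance-style stamp (+1 on the diagonal, -1 off the diagonal for every stapled node pair) is an assumption about how the adjacency pattern enters the MNA matrix.

```python
import numpy as np

def via_stamp(n_nodes, pairs):
    """Pattern X_i: conductance-style stamp for vias joining each (a, b) pair."""
    X = np.zeros((n_nodes, n_nodes))
    for a, b in pairs:
        X[a, a] += 1.0
        X[b, b] += 1.0
        X[a, b] -= 1.0
        X[b, a] -= 1.0
    return X

def system_matrix(G0, C0, D, X, g0, c0, s):
    """G0 + s*C0 + sum_i D_i (g_i + s*c_i), with g_i = g0*X_i and c_i = c0*X_i."""
    A = G0 + s * C0
    for D_i, X_i in zip(D, X):
        A = A + D_i * (g0 + s * c0) * X_i
    return A

# Hypothetical example: nodes 0, 1 on the upper die, nodes 2, 3 on the lower one.
G0 = np.diag([2.0, 0.0, 0.0, 0.0]) + via_stamp(4, [(0, 1), (2, 3)])
C0 = 0.5 * np.eye(4)
X = [via_stamp(4, [(0, 2)]), via_stamp(4, [(1, 3)])]   # level 0 and level 1
D = [2.0, 3.0]                                          # via density per level
A_dc = system_matrix(G0, C0, D, X, g0=0.1, c0=0.01, s=0.0)
```

Because every level enters linearly through D_i, changing a density only rescales a fixed pattern; the nominal matrices G0 and C0 never have to be rebuilt.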
Separation of Nominal and Sensitivity


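Expand state variables x(D1,…,DK,s) by Taylor expansion w.r.t. Di [Li-Pileggi:ICCAD’05]
$$x(D,s) = \sum_{i}\ \sum_{i_1,\dots,i_K} x^{(i_1 \dots i_K)}_{1,\dots,K}(s)\,(\delta D_1)^{i_1}\cdots(\delta D_K)^{i_K}$$

Construct a new state variable from nominal values and sensitivities
$$x_{ap} = \big[\,x^{(0)}_{0},\ x^{(1)}_{1},\ \dots,\ x^{(1)}_{K},\ x^{(2)}_{1,1},\ \dots,\ x^{(2)}_{K,K}\,\big]$$

Expanded system is reorganized into a lower-triangular-block system
$$(G_{ap} + sC_{ap})\,x_{ap} = B_{ap}\,u(s), \qquad y_{ap} = L_{ap}^T\,x_{ap}$$
$$G_{ap} = \begin{bmatrix}
G_0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
D_1 g_1 & G_0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
\vdots & & & \ddots & & & & & \vdots \\
D_K g_K & 0 & 0 & \cdots & G_0 & 0 & 0 & \cdots & 0 \\
0 & D_1 g_1 & 0 & \cdots & 0 & G_0 & 0 & \cdots & 0 \\
0 & D_2 g_2 & D_1 g_1 & \cdots & 0 & 0 & G_0 & \cdots & 0 \\
\vdots & & & & & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & D_K g_K & 0 & 0 & \cdots & G_0
\end{bmatrix}$$
$C_{ap}$ has similar structure

Since system size is enlarged, we can reduce it by model reduction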
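The practical payoff of the lower-triangular block form is that the nominal solution and the first-order terms all come from solves with the same G0. The following sketch shows this at DC (s = 0) for K = 2, keeping only first-order blocks; the 3-node network, the densities, and the helper `via_stamp` are illustrative assumptions, and `np.linalg.inv` merely stands in for a single reusable factorization.

```python
import numpy as np

def via_stamp(n, a, b):
    """Conductance-style pattern for a via between nodes a and b (assumed form)."""
    X = np.zeros((n, n))
    X[a, a] = X[b, b] = 1.0
    X[a, b] = X[b, a] = -1.0
    return X

n = 3
G0 = np.array([[3.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]])
g = [via_stamp(n, 0, 1), via_stamp(n, 1, 2)]
D = [0.01, 0.02]
Bu = np.array([0.0, 0.0, 1.0])

# Block forward substitution: every solve reuses the same factor of G0.
G0_inv = np.linalg.inv(G0)
x0 = G0_inv @ Bu                                            # nominal block
x1 = [-G0_inv @ (D_i * g_i @ x0) for D_i, g_i in zip(D, g)]  # first-order blocks

# The same vectors satisfy the assembled lower-triangular block system.
Z = np.zeros((n, n))
G_ap = np.block([[G0, Z, Z],
                 [D[0] * g[0], G0, Z],
                 [D[1] * g[1], Z, G0]])
x_ap = np.concatenate([x0] + x1)
B_ap = np.concatenate([Bu, np.zeros(2 * n)])
residual = np.abs(G_ap @ x_ap - B_ap).max()

# First-order estimate versus a direct solve of the perturbed system.
x_first_order = x0 + x1[0] + x1[1]
x_exact = np.linalg.solve(G0 + D[0] * g[0] + D[1] * g[1], Bu)
```

For these small densities the first-order estimate agrees with the direct solve to within about 5e-4, while the nominal solution alone is off by about 3e-2.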
Macromodel by Model Reduction
[Figure: projection of a large system onto a small but dense reduced system]

Model reduction can reduce model size and preserve accuracy by matching moments of inputs [Odabasioglu-Celik-Pileggi:TCAD’98]
The projection above is non-structured, and will mix the nominal values and their sensitivities together again
This can be solved by a structure-preserving reduction [Yu-Tan-He:BMAS’05, Yu-Shi-He:DAC’06]
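A flat (non-structured) moment-matching projection can be sketched in a few lines. This is a generic Krylov-subspace congruence projection in the spirit of the cited work, not the authors' code; the 30-node RC chain, the order q = 4, and the function names are illustrative assumptions.

```python
import numpy as np

def krylov_basis(G, C, B, q):
    """Orthonormal basis of span{R, AR, ..., A^(q-1)R}, A = -G^-1 C, R = G^-1 B."""
    cols = [np.linalg.solve(G, B)]
    for _ in range(q - 1):
        cols.append(-np.linalg.solve(G, C @ cols[-1]))
    V, _ = np.linalg.qr(np.hstack(cols))
    return V

def project(G, C, B, L, V):
    """Congruence projection of the MNA matrices onto the columns of V."""
    return V.T @ G @ V, V.T @ C @ V, V.T @ B, V.T @ L

def transfer(G, C, B, L, s):
    """H(s) = L^T (G + sC)^-1 B for a single input and a single output."""
    return (L.T @ np.linalg.solve(G + s * C, B)).item()

# Hypothetical RC chain: node 0 tied to ground, input and output at the far end.
n = 30
G = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
G[-1, -1] = 1.0
C = 0.01 * np.eye(n)
B = np.zeros((n, 1))
B[-1, 0] = 1.0
L = B.copy()

V = krylov_basis(G, C, B, q=4)
Gr, Cr, Br, Lr = project(G, C, B, L, V)
h_full = transfer(G, C, B, L, 0.01)
h_red = transfer(Gr, Cr, Br, Lr, 0.01)
```

The 30-state system is replaced by a dense 4-state one whose response matches at DC and stays close at low frequency. Applied to the augmented system, a flat V like this would blend nominal and sensitivity rows, which is the issue the structured projection addresses.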
Structured Projection (I)


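Block-diagonally partition the flat projection matrix according to the size of the nominal state variable and the sensitivities
$$V = \begin{bmatrix} V_0 \\ V_1 \\ \vdots \\ V_K \\ V_{K+1} \\ \vdots \\ V_{K^2} \end{bmatrix}
\ \Rightarrow\
\tilde V = \mathrm{diag}\big(V_0,\ V_1,\ \dots,\ V_K,\ V_{K+1},\ \dots,\ V_{K^2}\big)$$

Structured projection can result in a reduced system with preserved structure
Nominal values and sensitivities are still separated after reduction
There is only one LU-factorization of the reduced G0 in diagonal
$$\tilde x_{ap} = \big[\,\tilde x^{(0)}_{0},\ \tilde x^{(1)}_{1},\ \dots,\ \tilde x^{(1)}_{K},\ \tilde x^{(2)}_{1,1},\ \dots,\ \tilde x^{(2)}_{K,K}\,\big]$$
$$\tilde G_{ap} = \begin{bmatrix}
\tilde G_0 & 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
A_1 \tilde g_1 & \tilde G_0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
\vdots & & & \ddots & & & & & \vdots \\
A_K \tilde g_K & 0 & 0 & \cdots & \tilde G_0 & 0 & 0 & \cdots & 0 \\
0 & A_1 \tilde g_1 & 0 & \cdots & 0 & \tilde G_0 & 0 & \cdots & 0 \\
0 & A_2 \tilde g_2 & A_1 \tilde g_1 & \cdots & 0 & 0 & \tilde G_0 & \cdots & 0 \\
\vdots & & & & & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & A_K \tilde g_K & 0 & 0 & \cdots & \tilde G_0
\end{bmatrix}$$
$\tilde C_{ap}$ has similar structure as $\tilde G_{ap}$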
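The effect of the block-diagonal partition can be checked numerically on a small case. The sketch below uses one nominal block and one first-order block; the 4-node network, the via pattern, the density D1, and taking C_ap as the identity when generating the second basis vector are all simplifying assumptions made only for illustration.

```python
import numpy as np

n, q = 4, 2
G0 = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
G0[0, 0], G0[-1, -1] = 3.0, 1.0

g1 = np.zeros((n, n))            # assumed via pattern between nodes 0 and 3
g1[0, 0] = g1[3, 3] = 1.0
g1[0, 3] = g1[3, 0] = -1.0
D1 = 0.1

Z = np.zeros((n, n))
G_ap = np.block([[G0, Z], [D1 * g1, G0]])     # lower-triangular block system
B_ap = np.zeros((2 * n, 1))
B_ap[n - 1, 0] = 1.0

# Flat basis from two Krylov-style vectors of the augmented system.
r0 = np.linalg.solve(G_ap, B_ap)
r1 = np.linalg.solve(G_ap, r0)
V_flat, _ = np.linalg.qr(np.hstack([r0, r1]))

# Structured basis: split the rows by block, re-orthonormalize each part,
# and place the parts on the block diagonal.
V0, _ = np.linalg.qr(V_flat[:n])
V1, _ = np.linalg.qr(V_flat[n:])
V_struct = np.block([[V0, np.zeros((n, q))],
                     [np.zeros((n, q)), V1]])

G_flat = V_flat.T @ G_ap @ V_flat        # q x q, nominal and sensitivity mixed
G_struct = V_struct.T @ G_ap @ V_struct  # 2q x 2q, block structure kept
upper_right = G_struct[:q, q:]
```

`upper_right` is zero, so the reduced nominal block can still be solved first and the reduced sensitivity block afterwards. The cost is a reduced model twice the size of the flat one, and the structured basis still contains the flat one.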
Time-domain Analysis

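Nominal response and sensitivity can be solved separately and efficiently with backward Euler (BE) in time-domain
$$\Big(\tilde G_{ap} + \frac{1}{h}\tilde C_{ap}\Big)\,\tilde x_{ap}(t) = \frac{1}{h}\tilde C_{ap}\,\tilde x_{ap}(t-h) + \tilde B_{ap}\,u(t)$$
$$\tilde y_{ap}(t) = \tilde L_{ap}^T\,\tilde x_{ap}(t)$$

Direct sensitivity calculation, first-order:
$$S_i = \frac{\partial f}{\partial A_i}
= \sum_{k=1}^{K}\int_{t_s}^{t_e}\frac{\partial y_k}{\partial A_i}\,dt
= \sum_{k=1}^{K}\int_{t_s}^{t_e} L_k^T\,\frac{\partial x}{\partial A_i}\,dt
= \sum_{k=1}^{K}\int_{t_s}^{t_e} L_k^T\,\tilde x_i^{(1)}\,dt$$

Generated sensitivities can be used in any gradient-based optimization
We call this method SP-MACRO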
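A small sketch of the time stepping: backward Euler is applied to an augmented system holding one nominal block and one first-order block, and the integrated sensitivity is compared with a finite difference. No model reduction is applied here. The 3-tile network, the via pattern, the step size, and the names `backward_euler` and `integrated_output` are illustrative assumptions, and the sensitivity block is taken per unit density around D = 0.

```python
import numpy as np

def backward_euler(G, C, B, u, h, steps):
    """March (G + C/h) x(t) = (C/h) x(t-h) + B u(t) from x(0) = 0."""
    A_inv = np.linalg.inv(G + C / h)    # stands in for one reusable factorization
    x = np.zeros(G.shape[0])
    trace = [x]
    for k in range(1, steps + 1):
        x = A_inv @ (C @ x / h + B * u(k * h))
        trace.append(x)
    return np.array(trace)

n = 3
G0 = np.array([[3.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]])
C0 = 0.5 * np.eye(n)
B = np.array([0.0, 0.0, 1.0])
X = np.array([[1.0, 0.0, -1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 1.0]])  # via 0 <-> 2
g1, c1 = 1.0 * X, 0.05 * X

Z = np.zeros((n, n))
G_ap = np.block([[G0, Z], [g1, G0]])
C_ap = np.block([[C0, Z], [c1, C0]])
B_ap = np.concatenate([B, np.zeros(n)])

h, steps = 0.1, 100
def step_input(t):
    return 1.0

trace = backward_euler(G_ap, C_ap, B_ap, step_input, h, steps)
y_nominal = trace[:, 2]            # temperature of tile 2
y_sens = trace[:, n + 2]           # its first-order sensitivity to the via density
S = h * y_sens[1:].sum()           # sensitivity integrated over the time window

def integrated_output(D):
    """Same integral, recomputed directly for a perturbed density D."""
    tr = backward_euler(G0 + D * g1, C0 + D * c1, B, step_input, h, steps)
    return h * tr[1:, 2].sum()

eps = 1e-5
S_fd = (integrated_output(eps) - integrated_output(-eps)) / (2 * eps)
```

One pass over the augmented system yields both the waveform and its sensitivity, whereas the finite-difference check needs two extra full transient runs per parameter. `S` is negative: adding via density between tiles 0 and 2 lowers the temperature of tile 2.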
Sensitivity based Optimization

[Figure: via optimization flow, a loop over: calculate T/V nominal + sensitivity, check integrity constraints, update density vector]

Structured and parameterized reduction provides an efficient calculation of both nominal value and sensitivity

The via density vector D can be efficiently updated during each iteration: D^(iter+1) is obtained from D^(iter) by a step along S
S is the normalized sensitivity according to both temperature and voltage (T/V) sensitivities
Further speedup: adjoint Lagrangian method similar to [Visweswariah-Conn-Haring:TCAD’00]
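The loop can be sketched as follows. Everything specific here is an assumption made for illustration: the closed-form toy model in `evaluate` (standing in for the macromodel), the way the two sensitivities are weighted and normalized, the fixed step, and the bounds. It is a greedy feasibility search, so it does not by itself guarantee a minimal via count.

```python
def evaluate(D):
    """Toy stand-in for the macromodel: returns T, V and their gradients."""
    den_t = 1.0 + 2.0 * D[0] + D[1]
    den_v = 1.0 + D[0] + 3.0 * D[1]
    T = 40.0 + 60.0 / den_t
    V = 0.1 + 0.5 / den_v
    dT = [-120.0 / den_t ** 2, -60.0 / den_t ** 2]
    dV = [-0.5 / den_v ** 2, -1.5 / den_v ** 2]
    return T, V, dT, dV

def optimize(D, D_min, D_max, T_max, V_max, step=0.05, max_iter=500):
    """Raise densities along the normalized T/V sensitivity until feasible."""
    for it in range(max_iter):
        T, V, dT, dV = evaluate(D)              # nominal values + sensitivities
        if T <= T_max and V <= V_max:           # check integrity constraints
            return D, it
        w_t = max(0.0, T / T_max - 1.0)         # weight by relative violation
        w_v = max(0.0, V / V_max - 1.0)
        S = [-(w_t * dt / T_max + w_v * dv / V_max) for dt, dv in zip(dT, dV)]
        norm = max(abs(s) for s in S) or 1.0
        D = [min(D_max, max(D_min, d + step * s / norm))   # update density vector
             for d, s in zip(D, S)]
    return D, max_iter

D_opt, iterations = optimize([0.1, 0.1], D_min=0.1, D_max=5.0,
                             T_max=52.0, V_max=0.2)
T_opt, V_opt, _, _ = evaluate(D_opt)
```

Each iteration needs only one call to `evaluate`, which is why a cheap model that returns nominal values and sensitivities together matters for the overall runtime.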
Experiment Settings

A modest 3D stacking

layer          size                 material     number   mesh
heat-sink      2cm x 2cm x 1mm      copper       1        RC
device-layer   1cm x 1cm x 4um      silicon      2        RC
inter-layer    1cm x 1cm x 1um      dielectric   2        RC
P/G plane      2cm x 2cm x 10um     copper       2        RLC

material     Sigma            Epsilon   Mu    Kappa_r     Kappa_c
Silicon      NA               NA        NA    100 W/mK    1.75x10^6 J/m^3K
Copper       59.6x10^6 S/m    NA        NA    400 W/mK    3.55x10^6 J/m^3K
Dielectric   NA               3.3       1.0   50 W/mK     NA
Accuracy of Reduced Macromodel

Transient voltage responses of exact and MACRO models at ports 1
and 5 in one P/G plane with step-response input

The responses of the macromodels are visually identical to those of the exact models, with more than 100x speedup
Temperature/Voltage Reduction during OPT

Both temperature and voltage bounce (T/V) decrease with each iteration
The allocated vias result in a design meeting the targeted temperature of 52 °C and voltage bounce of 0.2 V
Steady-state vs. Transient
                              Steady-state                Tran by SP-MACRO
Total tile#   Level vector    Solve dc (s)   Total via    Redu Ckt (s)   Solve BE (s)   Total via   Saving ratio
620           0,1             4.06           176877       0.01           0.12           156154      11%
2140          0,1,2           26.37          187422       0.13           0.17           166971      11%
7900          0,1,2,3         167.9          235484       1.22           0.86           206482      12%
27740         0,1,2,3,4       1243.7         239379       5.12           1.07           21184       12%
55680         0,1,2,3,4,5     NA             NA           15.87          3.65           216732      NA

Transient thermal analysis reduces the via count by 11.5% on average compared to steady-state thermal analysis

SP-MACRO yields an efficient transient analysis that reduces runtime by 155X compared to the direct steady-state analysis
Sequential vs. Simultaneous

Simultaneous optimization reduces the via count by 34% on average compared to sequential optimization
Total tile#   620      2140     7900     27740    55680
Seq.          176877   187422   235484   239379   NA
Sim.          118020   127651   140433   143718   144998
Change        -32%     -32%     -36%     -37%     NA

Comparison of via distribution at different levels for ckt (27740)

Level   P/G-only   Thermal-only   Sim.
0       76832      /              67058
1       3410       1157           811
2       1901       43567          2500
3       876        4007           2808
4       /          79432          70541
Conclusions

Vertical vias play a critical role in 3D IC design

A simultaneous thermal and power integrity driven via planning

It reduces the via number by 34% on average compared to a sequential design

A structured and parameterized macromodel can be
efficiently employed during the design optimization

This method can be further extended
 3D signal and P/G routing
 Performance driven 3D design