Research Projects Overview - University of California, Los
Off-chip Decoupling Capacitor Allocation
for Chip Package Co-Design
Hao Yu
Berkeley Design Automation
Chunta Chu and Lei He
EE Department
UCLA
The work was performed at UCLA and was partially supported by
NSF and UC-MICRO
Decap Allocation for Clean Power Delivery
Chip-package co-design
requires a noise-free off-chip
power delivery system (PDS)
Modeling inductance is a must
Decoupling capacitors
(decaps) are allocated on
chip-package interface to
satisfy power integrity
It is a challenging task to
find a fast yet accurate
decap allocation for a large-scale design
How can the large and complex
physical-level layout be considered during system-level design?
Physical Level Challenge
[Figure: Module 1 and Module 2]
Finite parasitic impedance affects circuit
functionality at the chip-package interface
Supply voltage drop and electromagnetic (EM) coupling
Distributed post-layout model burdens the system-level
power integrity analysis and design
Millions of nodes and terminals with dense inductances
The Need of Macromodeling
Representing a large and
complex power delivery
system blindly leads to
expensive design cycles
A compact representation
by macromodeling is
needed
Existing decap allocation
methods with macromodeling
[Zheng:CICC’04, Chen:ISPD’06]
Generate PDS macromodel
Apply simulated annealing to
add/remove one decap to a
legal position
Cannot efficiently handle a
large-scale design
Limitations of Existing Macromodeling
[Figure: projection yields a small but dense macromodel. How to use it?]
Macromodeling algorithms [PVL, PACT, PRIMA] are
of limited use for a large-scale PDS
1. Become ineffective when the number of terminals is large
2. Do not provide the sensitivity information
3. Destroy the structure of state matrix
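The third limitation can be illustrated with a small experiment. The sketch below is not PRIMA itself but a minimal single-input Krylov projection in the same spirit, applied to a toy RC ladder; the ladder, its sizes, and the helper names `prima_like_basis` and `density` are assumptions made only for illustration. It shows that a sparse, tridiagonal conductance matrix becomes a much denser matrix after projection, so the layout structure is no longer visible in the reduced model.

```python
import numpy as np

def prima_like_basis(G, C, B, q):
    """Orthonormal basis of the order-q Krylov subspace of (G^-1 C, G^-1 B).

    Single-input case only; a toy stand-in for a PRIMA-style projection.
    """
    basis = []
    r = np.linalg.solve(G, B)
    for _ in range(q):
        for v in basis:                  # modified Gram-Schmidt against earlier vectors
            r = r - (v @ r) * v
        r = r / np.linalg.norm(r)
        basis.append(r)
        r = np.linalg.solve(G, C @ r)    # next Krylov direction
    return np.column_stack(basis)

def density(M, tol=1e-12):
    """Fraction of entries whose magnitude exceeds tol."""
    return float(np.mean(np.abs(M) > tol))

n, q = 200, 8                                            # hypothetical sizes
G = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # toy ladder conductances (tridiagonal)
C = np.diag(1.0 + np.arange(n) / n)                      # toy node capacitances
B = np.zeros(n)
B[0] = 1.0                                               # one current-source terminal

V = prima_like_basis(G, C, B, q)
G_reduced = V.T @ G @ V                                  # congruence projection
full_density = density(G)
reduced_density = density(G_reduced)
print(f"original density {full_density:.3f}, reduced density {reduced_density:.3f}")
```

The reduced matrix is small, but the block and sparsity pattern of the original layout cannot be read from it, which is the motivation for a structure-preserving reduction.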
Our Decap Problem Formulation
A multiple-ring-based
problem formulation
Represent decap solution
by combination of multi-level
templates
Constrain by the noise integral at I/O
instead of the noise amplitude used in
[Chen:ISPD’06]
Optimization Method
Each step inserts a template
with a given decap type based
on sensitivity instead of
simulated-annealing
The key is to efficiently calculate sensitivity from macromodel
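To make the difference between the two noise metrics concrete, here is a minimal sketch. The slides do not give the exact definition of the noise integral, so the code assumes one plausible form: the time integral of the part of the noise waveform that exceeds a margin. The function names, the margin value, and both waveforms are hypothetical.

```python
import numpy as np

def noise_amplitude(v):
    """Peak magnitude of a noise waveform."""
    return float(np.max(np.abs(v)))

def noise_integral(t, v, margin):
    """Area of the waveform above `margin` (assumed definition), by the trapezoidal rule."""
    excess = np.clip(np.abs(v) - margin, 0.0, None)
    return float(np.sum(0.5 * (excess[1:] + excess[:-1]) * np.diff(t)))

t = np.linspace(0.0, 10.0, 1001)
spike = 0.20 * np.exp(-((t - 5.0) / 0.1) ** 2)   # tall but very narrow glitch
bump = 0.15 * np.exp(-((t - 5.0) / 2.0) ** 2)    # lower but long-lasting droop
margin = 0.10

amp_spike, amp_bump = noise_amplitude(spike), noise_amplitude(bump)
int_spike = noise_integral(t, spike, margin)
int_bump = noise_integral(t, bump, margin)
print(amp_spike, amp_bump, int_spike, int_bump)
```

Under this assumed definition the amplitude metric ranks the narrow spike as the worse waveform, while the integral metric ranks the long droop as worse, which is why the two constraints can lead to different decap budgets.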
TBS2: Macromodeling for PDS
Principal Terminal Selection
Parameterization
Capture the essential input/output behavior
Compute performance sensitivities from the layout modifications
Structured Simulation
Sparsely arrange couplings (sparsity), leverage diverse physical
domains (latency) and analyze at block-levels (hierarchy)
A structured and parameterized macromodel connects layout with system
TBS2 (1) Principal Terminal Selection
The input signals (J = B x I) are temporally correlated
Described by a correlation matrix C (N x N)
Correlated terminals [b0 b1 b2] can be simplified by
using principal component analysis (PCA)
Select K principal terminals by the K-means method
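The selection step above can be sketched as follows. This is an illustrative reading of the slide, not the TBS2 implementation: it builds the N x N correlation matrix from synthetic terminal waveforms, uses the leading eigenvectors (PCA) as per-terminal features, clusters them with a small hand-written K-means, and keeps the terminal closest to each centroid. The function name `select_principal_terminals`, the farthest-point initialization, and the synthetic data are all assumptions.

```python
import numpy as np

def select_principal_terminals(signals, k, n_iter=20):
    """Pick k representative terminals from an (N, T) array of terminal waveforms."""
    corr = np.corrcoef(signals)                        # N x N correlation matrix C
    eigval, eigvec = np.linalg.eigh(corr)              # eigenvalues in ascending order
    top = np.argsort(eigval)[::-1][:k]                 # k dominant components
    feats = eigvec[:, top] * np.sqrt(np.maximum(eigval[top], 0.0))

    # Deterministic farthest-point initialization of the k centroids.
    centers = [feats[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(feats - c, axis=1) for c in centers], axis=0)
        centers.append(feats[int(np.argmax(d))])
    centers = np.array(centers)

    # Plain Lloyd iterations (K-means).
    for _ in range(n_iter):
        dist = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = feats[labels == j].mean(axis=0)

    # The terminal nearest to each centroid represents its cluster.
    dist = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
    return sorted(int(np.argmin(dist[:, j])) for j in range(k))

# Synthetic example: terminals 0-2 follow one source, terminals 3-5 another.
rng = np.random.default_rng(0)
T = 500
src_a, src_b = rng.standard_normal((2, T))
signals = np.vstack(
    [src_a + 0.05 * rng.standard_normal(T) for _ in range(3)]
    + [src_b + 0.05 * rng.standard_normal(T) for _ in range(3)]
)
selected = select_principal_terminals(signals, k=2)
print(selected)
```

With two strongly correlated groups, one terminal from each group is kept, so the macromodel needs two ports instead of six.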
TBS2 (2) Parameterization
Decaps can be described parametrically by
the sizing vector (D) for M2 types of decaps and the
topological matrix (X) for M1 levels of rings
[Figure: example ring template with positions numbered 1 to 8 and an entry X(2,6) of the topological matrix]
A total of M1 x M2 types of parameterized templates,
described by a parameterized state matrix in the s-domain
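One common way to write such a parameterized state matrix is a first-order expansion around the nominal design. The form below is an assumption for illustration, since the slide does not show the expression: G_0 and C_0 are the nominal conductance and capacitance matrices, G_k and C_k are the stamps contributed by template k, and \delta_k is the amount of template k that is inserted.

```latex
% Assumed first-order parameterization of the PDS state matrix.
% K = M_1 \times M_2 is the number of template types.
\[
  Y(s, \delta) \;=\; \bigl(G_0 + s\,C_0\bigr)
  \;+\; \sum_{k=1}^{K} \delta_k \,\bigl(G_k + s\,C_k\bigr),
  \qquad K = M_1 \times M_2 .
\]
```

In this form every template enters linearly, which is what allows the sensitivity of the response with respect to each \delta_k to be computed from the same macromodel.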
TBS2 (3) Structured Macromodeling
[Figure: structured projection of the parameterized state matrix. The matrix is built from MxM blocks, with nominal blocks G0 and sensitivity blocks DG1, DG2, ..., DGK, ..., DGN, and is sparse and block-triangular. It yields block-wise nominal and sensitivity voltage responses in the time domain.]
Details can be found in TBS1 [Yu:DAC’06] and [Yu:ISLPED’06]
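The reason a block-triangular arrangement gives both the nominal response and its sensitivity in one solve can be seen on a tiny DC example. The sketch below is a simplification under stated assumptions: a single parameter, a resistive-only system, and hypothetical matrices `G0` and `dG`. It is not the TBS2 algorithm, which works on the reduced dynamic model.

```python
import numpy as np

def nominal_and_sensitivity(G0, dG, b):
    """Solve [[G0, 0], [dG, G0]] [x0; x1] = [b; 0].

    x0 is the nominal response and x1 = d x / d delta at delta = 0
    for the perturbed system (G0 + delta * dG) x = b.
    """
    m = G0.shape[0]
    big = np.block([[G0, np.zeros((m, m))],
                    [dG, G0]])                  # block lower-triangular system
    rhs = np.concatenate([b, np.zeros(m)])
    sol = np.linalg.solve(big, rhs)
    return sol[:m], sol[m:]

# Hypothetical 4-node example: the parameter adds conductance at node 0.
G0 = 3.0 * np.eye(4) - np.eye(4, k=1) - np.eye(4, k=-1)
dG = np.diag([1.0, 0.0, 0.0, 0.0])
b = np.array([1.0, 0.0, 0.0, 0.0])

x0, x1 = nominal_and_sensitivity(G0, dG, b)

# Cross-check the sensitivity against a finite difference.
delta = 1e-6
x_pert = np.linalg.solve(G0 + delta * dG, b)
fd = (x_pert - x0) / delta
print(x0, x1, fd)
```

The second block row states dG x0 + G0 x1 = 0, which is the derivative of (G0 + delta dG) x = b with respect to delta at delta = 0. The nominal block G0 is reused for every parameter, so additional templates add block rows without coupling them to each other.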
Improved Accuracy By TBS2 Reduction
A non-uniform RLC mesh is reduced to an 80th-order
model using both TBS2 and PRIMA
TBS2 matches more poles than PRIMA with respect to the principal terminals
TBS2 improves waveform accuracy in both the frequency
and time domains
Our Decap Algorithm Overview
1. Apply TBS2 just once to generate a structured
and parameterized macromodel
2. Calculate the block-level nominal noise at each
terminal and its sensitivity w.r.t. the partitioned
template
3. Check whether the noise integral satisfies the constraints
4. Allocate decaps for each block according to the
sensitivity in a greedy fashion
[Flowchart: TBS2 -> Calculate nominal + sensitivity -> Check constraints -> Update template]
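The loop formed by steps 2 to 4 can be sketched as below. The macromodel is replaced by a hypothetical analytic stand-in (`toy_macromodel`) so that the example runs on its own; the exponential noise model, the effectiveness matrix `effect`, the costs, and the limit are all invented for illustration and do not come from the slides.

```python
import numpy as np

def toy_macromodel(counts, base, effect):
    """Hypothetical stand-in for the macromodel evaluation in step 2.

    Returns the noise integral at each terminal and its sensitivity
    to adding one more unit of each template.
    """
    noise = base * np.exp(-effect @ counts)
    sens = -effect * noise[:, None]          # d noise_i / d counts_j
    return noise, sens

def greedy_allocate(base, effect, cost, limit, max_steps=100):
    """Insert one template per step until every terminal meets the limit."""
    counts = np.zeros(effect.shape[1])
    for _ in range(max_steps):
        noise, sens = toy_macromodel(counts, base, effect)
        if np.all(noise <= limit):           # step 3: constraint check
            break
        worst = int(np.argmax(noise - limit))          # most violated terminal
        best = int(np.argmin(sens[worst] / cost))      # largest reduction per unit cost
        counts[best] += 1                    # step 4: greedy template update
    return counts, toy_macromodel(counts, base, effect)[0]

base = np.array([1.0, 0.8])                  # nominal noise integral per terminal
effect = np.array([[0.5, 0.1],
                   [0.1, 0.4]])              # how strongly each template helps each terminal
cost = np.array([1.0, 1.0])
limit = 0.3

counts, noise = greedy_allocate(base, effect, cost, limit)
print(counts, noise)
```

Each step needs only one evaluation of nominal noise and sensitivity, in contrast to simulated annealing, which evaluates many trial moves that are later rejected.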
Reduced Runtime and Cost of Decap Allocation
Comparing three methods:
1) Simulated annealing with noise amplitude (SA-NA) [Chen:ISPD’06];
2) Multiple-ring with noise amplitude [this paper];
3) Multiple-ring with noise integral (MRA-NI) [this paper]
MRA-NI is up to 97X faster than SA-NA due to the structured and parameterized macromodel from TBS2
MRA-NI reduces decap cost by up to 16% due to a more accurate integrity
metric, the noise integral
Conclusions
1. Macromodel connects the system-level
design with the physical-level layout
TBS2: structured and parameterized
macromodel
Provides fast yet accurate computational
prototyping for large/complex systems
Solves an integrity-driven decap allocation for
chip-package co-design
2. Such a block-wise macromodel and
optimization can be applied to other layout
optimization problems