Performance Analysis and Technology of 3D ICs
Krishna Saraswat
Shukri Souri
Kaustav Banerjee
Pawan Kapur
Department of Electrical Engineering
Stanford University
Stanford, CA 94305
Funding sources: DARPA, MARCO
Stanford University
Krishna Saraswat
Outline
• Why 3-D ICs?
– Limits of Cu/low-k technology
• 3D IC performance simulation
• 3-D technologies
– Seeding crystallization of amorphous Si
– Processed wafer bonding
• Thermal simulations
Introduction: Interconnect Delay Is Increasing
• Chip size is continually increasing due to increasing complexity
• Device performance is improving but interconnect delay is increasing
• Chip sizes today are wire-pitch limited: size is determined by the amount of wiring required
Mark Bohr, IEDM Proceedings, 1995
Cu Resistivity: Effect of Line Width Scaling
• Effect of Cu diffusion barrier
– Barriers have higher resistivity
– Barriers can’t be scaled below a minimum thickness
• Effect of electron scattering
– Reduced mobility as dimensions decrease
• Effect of higher frequencies
– Carriers confined to outer skin, increasing resistivity
Problem is worse than anticipated in the ITRS 1999 roadmap
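To give a feel for the high-frequency bullet, the classical skin depth, delta = sqrt(rho / (pi f mu0)), can be evaluated for copper. This is only a minimal sketch and is not taken from the slides: the function name and the chosen frequencies are illustrative assumptions, and the resistivity is the bulk Cu value quoted in the deck's microprocessor example (1.673e-6 Ω-cm).

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space, H/m

def skin_depth_m(resistivity_ohm_m, frequency_hz):
    """Classical skin depth of a non-magnetic good conductor, in metres."""
    return math.sqrt(resistivity_ohm_m / (math.pi * frequency_hz * MU_0))

# Bulk Cu resistivity, 1.673e-6 ohm-cm converted to ohm-m.
# No barrier and no surface scattering are included (assumption).
RHO_CU_OHM_M = 1.673e-8

if __name__ == "__main__":
    # Illustrative frequencies; 3 GHz matches the deck's microprocessor example.
    for f_ghz in (1, 3, 10):
        depth_um = skin_depth_m(RHO_CU_OHM_M, f_ghz * 1e9) * 1e6
        print(f"{f_ghz:>2} GHz: skin depth = {depth_um:.2f} um")
```

Once the skin depth becomes comparable to the wire cross-section, current crowds toward the surface and the effective resistance rises above its DC value.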
Cu Resistivity: Barrier Deposition Technology
ITRS 1999 line width (nm)
Global    Local
525       250
280       133
95        48
[Figure: Cu resistivity for barriers deposited by Atomic Layer Deposition (ALD), ionized PVD, and collimated PVD]
• 5 nm barrier assumed at the thinnest spot
• No scattering assumed, i.e., bulk resistivity
Interconnect dimensions scaled according to ITRS 1999
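The barrier effect can be sketched with simple geometry: if the barrier occupies part of the trench and carries negligible current, the effective resistivity of the line scales with the ratio of drawn area to Cu area. The sketch below is an illustration under stated assumptions, not the model used for the slide: it assumes a uniform 5 nm barrier on both sidewalls and the trench bottom (closest to the conformal ALD case), a hypothetical aspect ratio of 2, and bulk Cu resistivity with no scattering.

```python
def effective_resistivity(rho_cu, width_nm, barrier_nm, aspect_ratio=2.0):
    """Effective line resistivity when a non-conducting barrier lines the trench.

    Assumes (hypothetically) a uniform barrier on both sidewalls and the
    bottom, and a line height of aspect_ratio * width.
    """
    height_nm = aspect_ratio * width_nm
    cu_width = width_nm - 2.0 * barrier_nm   # barrier on both sidewalls
    cu_height = height_nm - barrier_nm       # barrier on the trench bottom
    if cu_width <= 0 or cu_height <= 0:
        raise ValueError("barrier fills the trench; no Cu cross-section left")
    return rho_cu * (width_nm * height_nm) / (cu_width * cu_height)

if __name__ == "__main__":
    rho_bulk = 1.673e-6  # ohm-cm, bulk Cu
    # Line widths listed on the slide (ITRS 1999), in nm.
    for w in (525, 250, 280, 133, 95, 48):
        rho_eff = effective_resistivity(rho_bulk, w, barrier_nm=5.0)
        print(f"w = {w:>3} nm: rho_eff = {rho_eff:.3e} ohm-cm "
              f"({rho_eff / rho_bulk:.2f}x bulk)")
```

Because the barrier thickness does not scale with the line, its share of the cross-section grows as the width shrinks, which is why the narrowest lines are penalized most.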
Cu Resistivity: Effect of Electron Scattering
[Figure: Cu resistivity with diffuse electron scattering (lower mobility) vs. elastic scattering, at 373 K and 273 K]
• No barrier assumed
• Diffuse electron scattering increases resistivity
• Lowering temperature has a big effect
Fraction of chip area used by repeaters
[Figure: fraction of chip area used by repeaters (0 to 30%) vs. technology generation (50 to 250 nm), for Rent’s exponents p = 0.600, 0.625, 0.650, 0.675, 0.700]
As much as 27% of the chip area at the 50 nm node is likely to
be occupied by repeaters.
3D ICs with Multiple Active Si Layers
Motivation
• Performance of ICs is limited by the R, L, and C of interconnects
• Interconnect length, and therefore R, L, and C, can be minimized by stacking active Si layers
• The number of horizontal interconnects can be minimized by using vertical interconnects
• Integration of disparate technologies is possible, e.g., memory and logic, optical I/O, etc.
[Figure: cross-section of a 3D IC with stacked active Si layers: T1 (logic) with metal levels M’1 and M’2, T2 (memory, analog) with metal levels M1 to M4, and a top layer for repeaters or optical I/O devices, connected by vias and VILIC]
[Figure: gate delay and interconnect delay vs. technology generation (0.1 to 1 µm) for 1, 2, and 3 active Si layers]
Chip Size
Device Size Limited
• Memory: SRAM, DRAM
Wire Pitch Limited
• Logic, e.g., µ-Processors
[Figure: layouts showing PMOS and NMOS devices]
Rent’s Rule
[Figure: a block of N gates]
T = k N^p
T = # of I/O terminals
N = # of gates
k = avg. I/O’s per gate
p = Rent’s exponent
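Rent's rule as stated above is a one-line power law. The sketch below simply evaluates it; the function name is illustrative, and the parameter values k = 4.0 and p = 0.6 are borrowed from the deck's microprocessor example rather than being general constants.

```python
def rent_terminals(n_gates, k, p):
    """Rent's rule: number of I/O terminals T = k * N**p."""
    return k * n_gates ** p

if __name__ == "__main__":
    # k and p taken from the microprocessor example in these slides.
    for n in (1e3, 1e6, 180e6):
        t = rent_terminals(n, k=4.0, p=0.6)
        print(f"N = {n:.0e} gates -> T ~ {t:,.0f} terminals")
```

Because p < 1, the terminal count grows more slowly than the gate count: doubling N multiplies T by 2**p, not by 2.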
Determination of Wire-length Distribution
• Conservation of I/O’s
T_A + T_B + T_C = T_{A-to-B} + T_{A-to-C} + T_{B-to-C} + T_{ABC}
T_{A-to-B} = T_A + T_B - T_{AB}
T_{B-to-C} = T_B + T_C - T_{BC}
• Values of T within a block or collection of blocks are calculated using Rent’s rule, e.g.,
T_A = k (N_A)^p
T_{ABC} = k (N_A + N_B + N_C)^p
• Recursive use of Rent’s rule gives the wire-length distribution for the whole chip
[Figure: Block A with N_A gates, Block B, Block C]
Ref: Davis & Meindl, IEEE TED, March 1998
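The conservation relations on this slide can be turned into a small calculation. The sketch below is an illustration, not the cited authors' code: function names and block sizes are hypothetical, and the A-to-C count is obtained simply by solving the slide's conservation equation for the one remaining unknown.

```python
def rent_terminals(n_gates, k, p):
    """Rent's rule: T = k * N**p."""
    return k * n_gates ** p

def terminals_between(n_x, n_y, k, p):
    """T_{X-to-Y} = T_X + T_Y - T_{XY}, as written on the slide."""
    return (rent_terminals(n_x, k, p) + rent_terminals(n_y, k, p)
            - rent_terminals(n_x + n_y, k, p))

def terminals_a_to_c(n_a, n_b, n_c, k, p):
    """T_{A-to-C}, solved from T_A + T_B + T_C =
    T_{A-to-B} + T_{A-to-C} + T_{B-to-C} + T_{ABC}."""
    t_sum = sum(rent_terminals(n, k, p) for n in (n_a, n_b, n_c))
    t_abc = rent_terminals(n_a + n_b + n_c, k, p)
    return (t_sum - t_abc
            - terminals_between(n_a, n_b, k, p)
            - terminals_between(n_b, n_c, k, p))

if __name__ == "__main__":
    # Hypothetical block sizes; k and p from the deck's microprocessor example.
    n_a, n_b, n_c, k, p = 100, 200, 100, 4.0, 0.6
    print("A-to-B:", terminals_between(n_a, n_b, k, p))
    print("B-to-C:", terminals_between(n_b, n_c, k, p))
    print("A-to-C:", terminals_a_to_c(n_a, n_b, n_c, k, p))
```

Repeating this for every choice of block sizes and separations is what builds up the chip-level wire-length distribution in the recursive approach the slide describes.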
Inter-Layer Connections for 3-D: 2 Layers
[Figure: a single layer of N gates with T I/O ports, split into two layers of N/2 gates each with T1 and T2 I/O ports]
• A fraction of the I/O ports T1 and T2 is used for inter-layer connections, Tint
• Assume I/O port conservation:
T = T1 + T2 - Tint
• Use Rent’s rule, T = k N^p, to solve for Tint (p assumed constant)
k = avg. I/O’s per gate
N = no. of gates
p = Rent’s exponent
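Solving the port-conservation equation for Tint is a short calculation: with N/2 gates per layer, T1 = T2 = k (N/2)^p, so Tint = T1 + T2 - T = k N^p (2^(1-p) - 1). The sketch below evaluates this under the slide's assumptions (equal split, same k and p on both layers); the function name is illustrative, and the numbers come from the deck's microprocessor example.

```python
def interlayer_terminals(n_gates, k, p):
    """Tint = T1 + T2 - T for N gates split equally over two layers."""
    t_total = k * n_gates ** p           # T for all N gates in one layer
    t_layer = k * (n_gates / 2.0) ** p   # T1 = T2 for N/2 gates per layer
    return 2.0 * t_layer - t_total

if __name__ == "__main__":
    n, k, p = 180e6, 4.0, 0.6
    t_int = interlayer_terminals(n, k, p)
    t_tot = k * n ** p
    print(f"Tint ~ {t_int:,.0f} ({t_int / t_tot:.1%} of single-layer T)")
```

The ratio Tint / T depends only on p, so a larger Rent's exponent implies relatively fewer inter-layer connections for a two-layer split.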
Wire-length Distribution of 3-D IC
Microprocessor example from NTRS 50 nm node:
Number of gates: 180 million
Minimum feature size: 50 nm
Number of wiring levels: 9
Metal resistivity, copper: 1.673e-6 Ω-cm
Dielectric constant, polymer: εr = 2.5
[Figure: blocks 1 to 5 arranged in a single layer and in 2 layers; horizontal interconnect is replaced by vertical interconnect]
[Figure: wire-length distribution vs. interconnect length, l (gate pitches, 1 to 1000), for 2D and 3D, with local, semiglobal, and global regions marked (L_Local, L_Semi-global)]
Vertical inter-layer connections reduce metal wiring requirement
Chip Area Estimation
• Placement of a wire in a tier is determined by some constraint, e.g., maximum allowed RC delay
• Wiring area = wire pitch x total length
A_req = p_loc L_tot_loc + p_semi L_tot_semi + p_glob L_tot_glob
      = A_loc + A_semi + A_glob
• L_tot calculated from wire-length distribution
A_chip = (A_loc + A_semi + A_glob) / (# of metal layers)
[Figure: a 3-tier wiring network with local, semiglobal, and global tiers]
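The area estimate on this slide is a weighted sum divided by the number of metal layers. The sketch below is a direct transcription of those two formulas; the function names, the dictionary layout, and all numbers in the usage example are hypothetical placeholders, not data from the slides.

```python
def required_wiring_area(pitch, total_length):
    """A_req = sum over tiers of (wire pitch x total wire length)."""
    return sum(pitch[tier] * total_length[tier] for tier in pitch)

def chip_area(pitch, total_length, n_metal_layers):
    """A_chip = (A_loc + A_semi + A_glob) / (# of metal layers)."""
    return required_wiring_area(pitch, total_length) / n_metal_layers

if __name__ == "__main__":
    # Hypothetical values: pitches in cm, total wire lengths in cm.
    pitch = {"local": 1.0e-5, "semiglobal": 2.0e-5, "global": 4.0e-5}
    length = {"local": 4.0e6, "semiglobal": 1.0e6, "global": 2.0e5}
    print(f"A_chip = {chip_area(pitch, length, n_metal_layers=9):.2f} cm^2")
```

In this form it is easy to see the two levers a 3-D design pulls: shorter total wire length per tier, and the option of reducing upper-tier pitch at constant delay.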
2 Active Layer Results
• Upper-tier pitches are reduced for constant chip frequency, fc
• Less wiring needed
• Almost 50% reduction in chip area
[Figure: 1 layer (2-D) vs. 2 layers (3-D) as a function of normalized semi-global pitch; marked areas: 2-D (1 layer) 7.9 cm², 3-D (2 layers) 4.0 cm²]
3-D Wire-Length Distribution
Symmetric Interconnects:
Comparable inter- and intra-device-layer connectivity
Asymmetric Interconnects:
Negligible inter-device-layer connectivity
Ref: Rahman & Reif (MIT)
N: Number of logic gates, f.o.: fan-out, k and p: Rent’s parameters,
Nz: Number of device layers
More vertical interconnects required
Microprocessor Application
PHYSICAL PARAMETER                     VALUE
Number of gates, N                     180 million
Rent’s exponent, p                     0.6
Rent’s coefficient, k                  4.0
Minimum feature size, F                50 nm
Max number of wiring levels, nmax      9
Operating frequency                    3 GHz
Metal resistivity, copper              1.673e-6 Ω-cm
Dielectric constant, polymer           εr = 2.5
Wiring efficiency factor               0.4
More than 2 active layers
[Figure: normalized interconnect delay (0.65 to 1.0) vs. no. of active layers (1 to 5)]
Delay of Scaled 2D and 3D ICs
• Moving repeaters to upper active tiers reduces interconnect delay by 9%.
• 3D (2 Si layers) shows significant delay reduction (64%).
• Increasing the number of metal levels in 3D improves interconnect delay by another 40%.
• Increasing the number of Si layers to 5 further improves interconnect delay.
[Figure: interconnect delay (log scale, 0.001 to 1.0) vs. technology generation (50 to 250 nm), with typical gate delay for comparison; curves: 2D IC with repeaters; 3D IC, constant metal layers; 3D IC, 2X metal layers; 3D IC, 2X metal layers, 5 Si layers]
Simulations assumed a state-of-the-art chip at each technology node, with data from the NTRS
3D Approaches
• Wafer Bonding (MIT)
• Epitaxial Lateral Overgrowth (Purdue)
• Seeding crystallization of a-Si (Stanford)
[Figure: cross-section of a 3D IC: T1 (logic) with metal levels M’1 and M’2, T2 (memory or analog) with metal levels M1 to M4, and a top layer for repeaters or optical I/O devices, connected by vias and VILIC]
Statistical Variations in Poly-TFT Properties
Conventional Poly-TFT
[Figure: poly-TFT cross-section (gate, gate oxide, source, channel, drain, substrate) with grains in the channel; annotations: deposited gate dielectric; smooth interface (crystallized a-Si); crystallized using lasers, RTA, or long furnace anneals; grain size 0.3-0.5 µm]
[Figure: mobility plot]
Effect of Grain Boundaries
• As channel length approaches grain size, statistical variation increases
• Elimination of grain boundaries should reduce this variation
Ge Seeded Lateral Crystallization
[Figure: process sequence: (1) seeding, with Ge seeds, a-Si, SiO2, and substrate; (2) grain growth by lateral crystallization; (3) MOSFET fabrication, with gate, gate oxide, source, channel, and drain on the substrate]
[Figure: single-grain 0.1 µm NMOS]
Concept:
– Locally induce nucleation
– Grow laterally, inhibiting additional nucleation
– Build MOSFET in a single grain
Single Grain Transistors in Ge Induced Crystallized Si
[Figure: ID-VG of 0.1 µm NMOS]
[Figure: mobility (0 to 300) vs. drawn channel length, L (µm, 0 to 20); curves labeled Control, Seed, and SGT]
Ni Seeded Lateral Crystallization
[Figure: NMOS structure with labels: Ni seed, SiGe gate, SiO2, a-Si, crystallized Si, substrate; Tmax = 450°C]
• Transistor is initially fabricated in a-Si
• Ni seeding for simultaneous crystallization and dopant activation
• Low thermal budget (≤ 450°C)
• Devices could be fabricated on top of a metal line
Thermal Behavior in 3D ICs
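Power Dissipation for 2D
[Figure: a) chip cross-section (passivation, gate, silicon of thickness tSi, package of thickness tPkg, heat sink) showing heat flow, with temperatures TDie, TPkg, and Tsink; b) equivalent circuit with current I, resistances RSi and RPkg, and node voltages VDie, VPackage, and Vsink]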
• Energy is dissipated during transistor operation
• Heat is conducted through the low-thermal-conductivity dielectric, the silicon substrate, and the packaging to the heat sink
• A 1-D model is assumed to calculate die temperature
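The 1-D model can be sketched as thermal resistances in series between the die and the heat sink, mirroring the RSi and RPkg elements of the equivalent circuit: T_die = T_sink + P (R_Si + R_Pkg), with each slab resistance given by thickness / (conductivity x area). The code below is only an illustration under that assumption; the function names and every number in the usage example (power, thicknesses, conductivities, area) are hypothetical and not taken from the slides.

```python
def slab_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """1-D thermal resistance of a uniform slab, in K/W."""
    return thickness_m / (conductivity_w_per_mk * area_m2)

def die_temperature(power_w, t_sink_c, resistances_k_per_w):
    """Die temperature for heat flowing through series resistances to the sink."""
    return t_sink_c + power_w * sum(resistances_k_per_w)

if __name__ == "__main__":
    area = 1.0e-4                                # 1 cm^2 die (assumed)
    r_si = slab_resistance(500e-6, 150.0, area)  # 500 um Si, ~150 W/m-K (assumed)
    r_pkg = 0.5                                  # package + sink, K/W (assumed)
    t_die = die_temperature(power_w=100.0, t_sink_c=25.0,
                            resistances_k_per_w=[r_si, r_pkg])
    print(f"R_Si = {r_si:.3f} K/W, T_die = {t_die:.1f} C")
```

Adding a second active layer in this picture means adding its dielectric and bonding layers to the series path of the upper devices, which is why the stacking arrangement matters for die temperature.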
3D Examples for Thermal Study
[Figure: cross-sections of the two 3D stacks studied, showing active layers T1 and T2, their metal levels, and bulk Si]
• Case A: Heat dissipation is confined to one surface
• Case B: Heat dissipation is possible from 2 surfaces
Die Temperature Simulation
Attainable die temperatures for 2-D and 3-D ICs at the NTRS-based
50 nm node using advanced heat-sinking technologies that would
reduce the normalized thermal resistance, R
3D ICs: Implications for Circuit Design
• Critical Path Layout: By vertical stacking, the distance between logic blocks on
the critical path can be reduced to improve circuit performance.
• Integration of disparate technologies is easier
• Microprocessor Design: on-chip caches on the second active layer will reduce
distance from the logic and computational blocks.
• RF and Mixed Signal ICs: Substrate isolation between the digital and RF/analog
components can be improved by dividing them among separate active layers, which is ideal for system-on-a-chip design.
• Optical I/O can be integrated in the top layer
• Repeaters: Chip area can be saved by placing repeaters (~ 10,000 for high
performance circuits) on the higher active layers.
• Physical Design and Synthesis: Due to a non-planar target graph (upon which the
circuit graph is embedded), placement and routing algorithms, and hence
synthesis algorithms and architectural choices, need to be suitably modified.
Summary
• Cu/low-k alone will not solve the problems of interconnects.
• Modeling of interconnect delay shows significant improvement by
transitioning from 2-D to 3-D ICs.
• Seeding and lateral crystallization of amorphous Si is a promising
technique to implement 3-D ICs.
• Thermal dissipation in 3-D ICs may require innovative packaging
solutions.