ppt - UCSD VLSI CAD Laboratory
ECE260B – CSE241A
Winter 2005
Introduction and ASIC Flow
Instructor: Bao Liu
Website: http://vlsicad.ucsd.edu/courses/ece260b-w05
ECE 260B – CSE 241A Intro and ASIC Flow .1
Slides courtesy of Prof. Andrew B. Kahng
http://vlsicad.ucsd.edu
Why not a Silicon Compiler?
Ideal: Silicon Compiler; simple; no human interaction
Reality: Design methodology; complex; lots of human interaction
Spec/Matlab/VHDL
synthesis
verification
?
placement
routing
Circuit on Silicon
Teams in a Design Process
VLSI designers
CAD developers
Process people
VLSI designers
Spec/Matlab/VHDL
CAD developers
Testing team
synthesis
verification
?
placement
routing
Testing team
Circuit on Silicon
Process people
Class Objectives
Learn about ASIC implementation flow: Verilog → GDSII
Semi-custom implementation of CMOS digital circuits, and
optimization with respect to different constraints: area, speed, power,
reliability, cost
Understand impact of constraints, tradeoffs, technology scaling
Get some feel for each phase of the implementation flow
Learn about building blocks: wires, gates, memories
Prepare for future design experiences
Get some feel for industry-standard design tools, libraries
- Will mostly use Cadence BuildGates and SOC Encounter, and Artisan
TSMC 0.18/0.13um libraries
Synthesize small cores from RTL into GDSII
Outline
Introduction
Technology Evolution
Silicon Complexity
System Complexity
Design Flows
Traditional
State of the Art
- Design Metrics
- Design Closure
Technology Evolution: Cost and Integration Drivers
Moore’s Law is about cost
Pentium 4 die shot:
Increased integration, decreased cost → more possibilities for semiconductor-based products
2.2cm
Slide courtesy of Mary Jane Irwin, PSU
Sense of Scale (Scaling)
What fits on a VLSI chip today?
State-of-the-art logic chip
- 20mm on a side (400mm²)
- 0.13µm drawn gate length (2 λ)
- 0.5µm wire pitch (8 λ)
- 8-level metal
- 20mm = 40,000 wire pitches = 320,000 λ
For comparison
32b RISC processor
- 8K λ x 16K λ
SRAM
- about 32 λ x 32 λ per bit
- 8K x 16K is 128Kb, 16KB
DRAM
- 8 λ x 16 λ per bit
- 8K x 16K is 1Mb, 128KB
[Figure labels: 64b FP processor, 32b RISC processor]
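The numbers on this slide follow from simple arithmetic, which the Python sketch below redoes. It assumes, as the slide states, that one wire pitch is 8 lambda and that the comparison block is 8K by 16K lambda; the constant and function names are illustrative only.

```python
# Sketch of the slide's arithmetic; all names here are illustrative.
DIE_SIDE_MM = 20.0
WIRE_PITCH_UM = 0.5
WIRE_PITCH_LAMBDA = 8                      # slide: one wire pitch is 8 lambda
BLOCK_LAMBDA = (8 * 1024, 16 * 1024)       # 8K x 16K lambda footprint


def die_side():
    """Return the die side in wire pitches and in lambda."""
    pitches = DIE_SIDE_MM * 1000 / WIRE_PITCH_UM
    return pitches, pitches * WIRE_PITCH_LAMBDA


def bits_in_block(cell_w_lambda, cell_h_lambda, block=BLOCK_LAMBDA):
    """Number of memory cells of the given size that tile the block."""
    return (block[0] // cell_w_lambda) * (block[1] // cell_h_lambda)


pitches, side_lambda = die_side()
sram_bits = bits_in_block(32, 32)          # SRAM cell: about 32 x 32 lambda
dram_bits = bits_in_block(8, 16)           # DRAM cell: 8 x 16 lambda

print(f"die side: {pitches:.0f} wire pitches, {side_lambda:.0f} lambda")
print(f"SRAM: {sram_bits // 1024} Kb = {sram_bits // 8 // 1024} KB")
print(f"DRAM: {dram_bits // 1024} Kb = {dram_bits // 8 // 1024} KB")
```

The same block area holds eight times more DRAM than SRAM simply because the DRAM cell is eight times smaller.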
Slide courtesy of Ken Yang, UCLA
MOS Transistor Scaling (1974 to present)
[Figure: feature-size scaling trend, S = 0.7 (0.5x per 2 nodes); curves labeled poly pitch and metal pitch, typical MPU/ASIC and typical DRAM]
Decreased transistor/feature sizes
Increased variability (tox, BEOL, DFM, SEU, etc.)
Short channel effect, leakage power
Source: 2001 ITRS - Exec. Summary, ORTC Figure
HP / LOP / LSTP Device Roadmaps
Parameter     Type   99     01     03     05     07     10     13     16
Vdd           MPU    1.5    1.2    1.0    0.9    0.7    0.6    0.5    0.4
              LOP    1.3    1.2    1.1    1.0    0.9    0.8    0.7    0.6
              LSTP   1.3    1.2    1.2    1.2    1.1    1.0    0.9    0.9
Vth (V)       MPU    0.21   0.19   0.13   0.09   0.05   0.021  0.003  0.003
              LOP    0.34   0.34   0.36   0.33   0.29   0.29   0.25   0.22
              LSTP   0.51   0.51   0.53   0.54   0.52   0.49   0.45   0.45
Ion (uA/um)   MPU    1041   926    967    924    1091   1250   1492   1507
              LOP    636    600    600    600    700    700    800    900
              LSTP   300    300    400    400    500    500    600    800
CV/I (ps)     MPU    2.00   1.63   1.16   0.86   0.66   0.39   0.23   0.16
              LOP    3.50   2.55   2.02   1.58   1.14   0.85   0.56   0.35
              LSTP   4.21   4.61   2.96   2.51   1.81   1.43   0.91   0.57
Ioff (uA/um)  MPU    0.00   0.01   0.07   0.30   1.00   3      7      10
              LOP    1e-4   1e-4   1e-4   3e-4   7e-4   1e-3   3e-3   1e-2
              LSTP   1e-6   1e-6   1e-6   1e-6   1e-6   3e-6   7e-6   1e-5
SEMATECH Prototype BEOL stack, 2000
Wire
Global (up to 5)
Via
Passivation
Dielectric
Etch Stop Layer
Dielectric Capping Layer
Copper Conductor with
Barrier/Nucleation Layer
Intermediate (up to 4)
Local (2)
Pre Metal Dielectric
Tungsten Contact Plug
Reverse-scaled global interconnects
Growing interconnect complexity
Performance critical global interconnects
Intel 130nm BEOL Stack
Intel 6LM 130nm process with
vias shown (connecting layers)
Aspect ratio = thickness / minimum width
Interconnect Capacitance: Parallel Plate Model
ILD = interlevel dielectric
L
W
T
HILD
SiO2
Substrate
Bottom plate of
cap can be
another metal
layer
C_int = ε_ox * (W * L / t_ox), where t_ox is the dielectric thickness (H_ILD in the figure)
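A minimal numeric sketch of the parallel-plate formula follows. The wire dimensions in the example call are hypothetical, chosen only to give a sense of magnitude, and the relative permittivity of 3.9 assumes an SiO2 dielectric.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def parallel_plate_cap(width_m, length_m, t_ox_m, eps_r=3.9):
    """Parallel-plate capacitance C = eps_ox * W * L / t_ox, in farads.

    eps_r = 3.9 assumes SiO2; fringing and lateral coupling are ignored.
    """
    return eps_r * EPS0 * width_m * length_m / t_ox_m


# Hypothetical wire: 0.5 um wide, 1 mm long, 0.5 um above the plate below it.
c_wire = parallel_plate_cap(0.5e-6, 1e-3, 0.5e-6)
print(f"C_int = {c_wire * 1e15:.1f} fF")
```

Because the model ignores fringing fields, it underestimates the capacitance of narrow wires, which is the subject of the next slide.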
Line Dimensions and Fringing Capacitance
Lateral cap
w
S
Capacitive coupling
Crosstalk effect
Signal integrity
Interconnect Evolution and Modeling Needs
Before 1990, wires were thick and wide while devices were big and slow
- Large wiring capacitances and device resistances
- Wiring resistance << device resistance
- Model wires as capacitances only
In the 1990s, scaling (by scale factor S) led to smaller and faster devices and smaller, more resistive wires
- Reverse scaling of properties of wires
- RC models became necessary
In the 2000s, frequencies are high enough that inductance
has become a major component of total impedance
Evolving Interconnects Affect Timing
Interconnect capacitance > gate input capacitance
Better prediction
Interconnect resistance no longer ignorable
Better modeling: distributed R(L)C network, AWE, etc.
Effective capacitance < total load capacitance
Interconnect delay > gate delay for sub-micron technologies
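To illustrate why a distributed model matters once wire resistance is no longer negligible, the sketch below computes the Elmore delay of a wire split into N equal RC segments. A single lumped segment gives R*C, while a finely distributed line approaches R*C/2. The function name and the unit values are illustrative assumptions, and driver resistance and load capacitance are left out.

```python
def elmore_delay_rc_ladder(r_total, c_total, segments):
    """Elmore delay at the far end of a wire modeled as N series RC sections.

    Each section has resistance R/N followed by capacitance C/N to ground.
    The delay is the sum over capacitors of (upstream resistance * capacitance).
    """
    r_seg = r_total / segments
    c_seg = c_total / segments
    delay = 0.0
    upstream_r = 0.0
    for _ in range(segments):
        upstream_r += r_seg          # resistance between the driver and this node
        delay += upstream_r * c_seg
    return delay


for n in (1, 2, 10, 1000):
    print(n, elmore_delay_rc_ladder(1.0, 1.0, n))
```

In closed form the sum is R*C*(N+1)/(2N), so the lumped model overestimates the wire's own delay by up to a factor of two.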
Sub-Wavelength Optical Lithography
What are implications of this picture?
•Slide courtesy of Numerical Technologies, Inc.
…Complexity of Photomasks
How many wafers, on average, are printed with a mask set?
Summary of Technology Scaling
Scaling of 0.7x every three (two?) years
- .25u, 1997, 5LM
- .18u, 1999, 6LM
- .13u, 2002, 7LM
- .10u, 2005, 7LM
- .07u, 2008, 8LM
- .05u, 2011, 9LM
Interconnect delay dominates system performance
consumes up to 70% of clock cycle
Cross coupling capacitance is dominating
cross capacitance → 100%, ground capacitance → 0%
ground capacitance is 90% in .18u
huge signal integrity implications (e.g., guardbands in static
analysis approaches)
Multiple clock cycles required to cross chip
whether 3 or 15 not as important as fact of “multiple” > 1
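As a quick check of the 0.7x-per-generation rule, the sketch below applies the factor repeatedly from 0.25 um and prints the result next to the node names listed on the slide. The function name is illustrative; the roadmap names are rounded marketing values, so only approximate agreement should be expected.

```python
def scaled_nodes(start_um, factor=0.7, generations=5):
    """Feature sizes obtained by applying the scale factor once per generation."""
    nodes = [start_um]
    for _ in range(generations):
        nodes.append(nodes[-1] * factor)
    return nodes


slide_nodes = [0.25, 0.18, 0.13, 0.10, 0.07, 0.05]   # node names from the slide
ideal_nodes = scaled_nodes(0.25)

for ideal, named in zip(ideal_nodes, slide_nodes):
    print(f"ideal {ideal:.3f} um   slide {named:.2f} um")
```

Two generations give 0.7 * 0.7 = 0.49, which is the "0.5x per 2 nodes" rule quoted on the transistor scaling slide.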
New Materials Implications
Lower dielectric permittivity
reduces total capacitance
doesn’t change cross-coupled / grounded capacitance
proportions
Copper metallization
reduces RC delay
avoids electromigration (factor of 4-5 ?)
thinner deposition reduces cross cap
Multiple layers of routing
enabled by planarization; 10% extra cost per layer
reverse-scaled top-level interconnects
relative routing pitch may increase
room for shielding
Technical Issues
Manufacturability (chip can't be built)
antenna rules
minimum area rules for stacked vias
CMP (chemical mechanical polishing) area fill rules
layout corrections for optical proximity effects in subwavelength
lithography; associated verification issues
Signal integrity (failure to meet timing targets)
crosstalk induced errors
timing dependence on crosstalk
IR drop on power supplies
Reliability (design failures in the field)
electromigration on power supplies
hot electron effects on devices
wire self heat effects on clocks and signals
Noise
Analog design concerns are due to physical noise sources
- because of discreteness of electronic charge and stochastic nature of electronic transport processes
- examples: thermal noise, flicker noise, shot noise
Digital circuits, due to large, abrupt voltage swings, create deterministic noise that is several orders of magnitude higher than stochastic physical noise
- digital circuits are still prevalent because they are inherently immune to noise
Technology scaling and performance demands make noisiness of digital circuits a big problem
Courtesy Hormoz/Muddu, ASIC99
Silicon Complexity Challenges
Silicon Complexity = impact of process scaling, new materials, new
device/interconnect architectures
Non-ideal scaling (leakage, power management, circuit/device
innovation, current delivery)
Coupled high-frequency devices and interconnects (signal integrity
analysis and management)
Manufacturing variability (library characterization, analog and digital
circuit performance, error-tolerant design, layout reusability, static
performance verification methodology/tools)
Scaling of global interconnect performance (communication,
synchronization)
Decreased reliability (soft error uncertainty, gate insulator tunneling and
breakdown, joule heating and electromigration)
Complexity of manufacturing handoff (reticle enhancement and mask
writing/inspection flow, manufacturing NRE cost)
If you don’t know a term, ask…
In a PDA…
Reference Design: personal digital assistant (PDA)
Composed of CPU, DSP, peripheral I/O, and memory
Required Performance for Multi-Media Processing
[Figure: required performance in GOPS (log axis, 0.01 to 100) by application area. Video: JPEG, MPEG1 extraction, MPEG2 extraction (MP/ML, MP/HL), MPEG4, compression. Audio and voice: MPEG, Dolby-AC3, word recognition, sentence translation, voice auto translation. Graphics: 2D graphics, 3D graphics (10Mpps, 100Mpps). Communication: FAX, modem, VoIP modem, SW defined radio. Recognition: voice print recognition, face recognition, moving picture recognition.]
GOPS: Giga Operations Per Second
…Implemented With an SoC
0.18um / 400MHz / 470mW (typical)
MM application: MP3, JPEG, simple moving picture
Available time: 6-10Hr
[Figure: SoC block diagram]
- Processor area: CPU, I-cache 32KB, D-cache 32KB, 6.5MTrs., max 400MHz
- Peripheral area (4 – 48MHz): PWM, RTC, I2C, USB, OST, KEY, MMC, GPIO, I2S, UART, AC97, FICP, SSP, sound, DMA controller
- Data transfer area (100MHz): MEM Cnt., LCD Cnt.
- Other blocks: CPG, PWR
- External: SDRAM 64MB, Flash 32MB, LCD
If the PDA must have 200h standby time with a 120g battery… ?
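One way to start on the standby question is an energy budget. The sketch below assumes, purely for illustration, that the battery charge that sustains the typical 470mW for the quoted 6 to 10 hours must instead last 200 hours; it says nothing about the energy density of a 120g battery, which the slide leaves open. Function names are hypothetical.

```python
def energy_wh(power_w, hours):
    """Energy drawn at constant power, in watt-hours."""
    return power_w * hours


def standby_budget_mw(active_power_w, active_hours, standby_hours):
    """Average power (mW) that drains the same energy over the standby time."""
    return energy_wh(active_power_w, active_hours) / standby_hours * 1000


low = standby_budget_mw(0.47, 6, 200)     # battery sized for 6 h of active use
high = standby_budget_mw(0.47, 10, 200)   # battery sized for 10 h of active use
print(f"standby budget: {low:.1f} to {high:.1f} mW")
```

Under these assumptions the average standby power has to be roughly 20 to 30 times lower than the typical active power, which is why leakage and power gating matter for such a design.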
System Complexity Challenges
System Complexity = exponentially increasing transistor counts, with
increased diversity (mixed-signal SOC, …)
Reuse (hierarchical design support, heterogeneous SOC integration,
reuse of verification/test/IP)
Verification and test (specification capture, design for verifiability,
verification reuse, system-level and software verification, AMS self-test,
noise-delay fault tests, test reuse)
Cost-driven design optimization (manufacturing cost modeling and
analysis, quality metrics, die-package co-optimization, …)
Embedded software design (platform-based system design
methodologies, software verification/analysis, codesign w/HW)
Reliable implementation platforms (predictable chip implementation onto
multiple fabrics, higher-level handoff)
Design process management (team size / geog distribution, data mgmt,
collaborative design, process improvement)
Levels of VLSI Design in a Traditional Flow
Specification: what the system (or component) is supposed to do
Architecture: high-level design of component
- state defined
- logic partitioned into major blocks
Logic: gates, flip-flops, and the connections between them
Circuit: transistor circuits to realize logic elements
Device: behavior of individual circuit elements
Layout: geometry used to define and connect circuit elements
Process: steps used to define circuit elements
Flow: Architecture Design → High Level Synthesis → RTL → Synthesis → Verification → Placement → Routing → Extraction and Timing Verification → GDSII → Manufacturing
Design Principles (Traditional)
Partition the problem (hierarchical design)
Different abstraction levels: RTL, gate-level, switch-level,
transistor-level
Orthogonalize concerns
Abstraction vs. implementation
Logic vs. timing
Constrain the design space to simplify the design
process
Balance between design complexity and performance
E.g., standard-cell methodology
VLSI Design Flow Evolution
Expanding in two directions
- System-on-Chip (SoC) Design
- Design for Manufacturability (DFM)
More design metrics
- Area
- Timing
- Power
- Signal Integrity
- Reliability
Tighter Integration
- Design closure
- RTL/GDSII sign-off re-defined
Flow: Architecture Design → High Level Synthesis → RTL → Synthesis → Verification → Placement → Routing → Extraction and Timing Verification → GDSII → Manufacturing
Design Procedure and Tools
Behavior modeling
- Matlab/C/VHDL
Logic synthesis
- DesignCompiler, BuildGates, …
Verification of synthesis
- Formal Verification (Verplex)
- Static timing analysis (PrimeTime)
Place and route
- Astro, SOCE, …
Verification of layout
- DRC, ERC, LVS (Calibre)
- Extraction (SignalStorm)
- Delay Calculation (CeltIC)
- Simulation (SPICE)
Flow: Architecture Design → High Level Synthesis → RTL → Synthesis → Verification → Placement → Routing → Extraction and Timing Verification → GDSII → DFM → Manufacturing
Design Principles (State of the Art)
Integrate the problem (design closure)
Balance design metrics
Back-annotation, predictability
Area/timing/power/signal integrity/reliability
Explore the design space
Balance between design complexity and performance
Platform-based SoC design
Design Methodologies (+ business models)
Full-Custom (high effort, leading-edge performance, high-volume)
Semi-Custom (strong infrastructure, economical in lower volumes)
ASIC (Application-Specific Integrated Circuit)
Standard Cell/Gate Array/Via Programmable/Structured ASIC
FPGA
Special
Analog (custom layout, I/Os and sense amps)
Mixed-Signal / RF (unique to each process, no scaling)
System-on-Chip (System-in-Package)
Various components: IP blocks, ASIC, FPGA, memory, uP, RF, etc.
Define implementation platform, hardware-software co-design
Performance vs. complexity
Flow
[Figure: library and design flow diagram. Wire model: 3-D RLC modeling tool (r, s, m; layers; layout rules), parasitic extraction library, C-model. Standard cell library: device model, cell schematic entry, cell layout entry, characterization, synthesis library (timing/power/area), place & route library (ports). Design flow: behavioral model, behavioral synthesis, RTL Verilog model, structural Verilog model, floorplan, block layout P&R, global layout P&R. Checks: functional, static timing, DRC/ERC/LVS, static/dynamic timing w/extract, power/area, scan/testability, clock routing/analysis.]
Slide courtesy of Mary Jane Irwin, PSU
Traditional Taxonomy
Front End
- Behavioral Level Design
- Logic Design and Simulation
- Logic Synthesis
- Logic Partitioning
- Simulation
- Design Verification
- Timing Verification
- Test Generation
Back End
- Die Planning
- Floorplanning
- IO Pad Placement
- Power/Ground Stripes, Rings Routing
- Global Placement
- Detail Placement
- Clock Tree Synthesis and Routing
- Global Routing
- Detail Routing
- Extraction and Delay Calc.
- Timing Verification
- LVS, DRC, ERC
Generic Flow Steps
Library preparation
- Library data preparation
- Design data preparation
Logic design
- Specification to RTL
- RTL simulation
- Hierarchical floorplanning
- Synthesis
- Formal verification
- Gate level simulation
- Static timing analysis
Physical design
- Physical floorplanning
- Place and route
- RC extraction
- Formal verification
- Physical verification
- Release to manufacturing
Design for test
Engineering change order
Library and Design Data
Models and technology data required to execute
the design flow
Power, timing: ALF, DCL, OLA, .lib, STAMP
Layout: LEF, DEF, GDSII
Delays and path timing, parasitics: SDF, GCF,
SDC, DSPF, RSPF, SPEF, SPICE
Layout rules: Dracula, Calibre “deck”
Architecture Design
Platform-based SoC Design
Platform is a library of design resources
Helps design space exploration
Meet in the middle
Embedded system
Application space
Application instance
Platform
specification
Hardware-software co-design
System platform
Platform
design-space
exploration
Platform instance
Architecture space
Figure courtesy of Alberto Sangiovanni-Vincentelli, UCB
High-Level Synthesis (Behavior → RTL)
Compilation
- of the input specification language to the internal representation
Parallelism extraction
- usually via data flow analysis techniques
Scheduling
- Assignment of each operation to a time slot corresponding to a clock cycle or time interval
Resource allocation
- Selection of the types of hardware components and the number for each type to be included in the final implementation
Module binding
- Assignment of operation to the allocated hardware components
Controller synthesis
- Design of control style and clocking scheme
…
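The scheduling step can be sketched with as-soon-as-possible (ASAP) scheduling, one of the simplest textbook schemes. The code below is only an illustration: it assumes an acyclic data flow graph, single-cycle operations, and unlimited hardware, and the operation names in the example are hypothetical.

```python
def asap_schedule(deps):
    """Assign each operation the earliest clock cycle (1-based).

    deps maps an operation name to the list of operations whose results
    it consumes. Assumes an acyclic graph and unit-latency operations.
    """
    schedule = {}

    def cycle_of(op):
        if op not in schedule:
            preds = deps.get(op, [])
            # One cycle after the latest predecessor; cycle 1 if there is none.
            schedule[op] = 1 + max((cycle_of(p) for p in preds), default=0)
        return schedule[op]

    for op in deps:
        cycle_of(op)
    return schedule


# Hypothetical behavior: y = (a * b) + (c * d) - e
dfg = {"mul1": [], "mul2": [], "add": ["mul1", "mul2"], "sub": ["add"]}
schedule = asap_schedule(dfg)
print(schedule)

# A crude allocation hint: multipliers needed in the busiest cycle.
mults_per_cycle = {}
for op, cycle in schedule.items():
    if op.startswith("mul"):
        mults_per_cycle[cycle] = mults_per_cycle.get(cycle, 0) + 1
print("multipliers needed:", max(mults_per_cycle.values()))
```

Here both multiplications land in cycle 1, so ASAP asks for two multipliers; a resource-constrained scheduler could trade one extra cycle for a single multiplier, which is the kind of tradeoff resource allocation and module binding resolve.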
Architecture Level Floorplanning
Defines the basic chip layout architecture
Define the standard cell rows and I/O placement locations
Place RAMs and other macros
Separate gate array, memory, analog, RF blocks
Define power distribution structures such as rings and stripes
Allow space for clock, major buses, etc.
Rules of thumb for cell density are used to initially
calculate design size
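A rule-of-thumb size estimate of this kind can be sketched as below. The 70% utilization default and the example areas are hypothetical placeholders, not values from the course or from any particular library.

```python
import math


def estimate_core_side_um(cell_area_um2, utilization=0.7, macro_area_um2=0.0):
    """Side of a square core that fits the cells at the given row utilization.

    utilization is the fraction of standard-cell row area occupied by cells;
    macros (RAMs, analog blocks) are added at their full area.
    """
    core_area = cell_area_um2 / utilization + macro_area_um2
    return math.sqrt(core_area)


# Hypothetical design: 0.7 mm^2 of standard cells plus a 0.21 mm^2 RAM macro.
side = estimate_core_side_um(700_000.0, utilization=0.7, macro_area_um2=210_000.0)
print(f"estimated core side: {side:.0f} um")
```

The estimate ignores I/O pads, power rings, and routing channels, so it only gives a starting point that the floorplanner refines.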
Logic Synthesis
Conversion of RTL to
gate-level netlist
Targeted to a foundry-specific library
Can be performed hierarchically (block by block)
Timing-driven
Clock information
Primary input arrival times, primary output required times
Input driving cells, output loading
False paths, multi-cycle paths
Interconnect delay may be calculated based on a
“wireload model” which uses fanout to estimate delay
Clock parameters (insertion delay, skew, jitter, etc.) are
assumed to be attainable later in place and route
Formal Verification
RTL description and gate level netlist are compared to
verify functional equivalence, thereby verifying the
synthesis results
Formal methods
Graph isomorphism
Binary Decision Diagram (BDD)
Emerging technology that supplements the more
traditional gate-level simulation approach
FV also performed after place-and-route (if gate netlist
changes)
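The idea of functional equivalence can be shown on a toy example. The sketch below does not use graph isomorphism or BDDs; it swaps in exhaustive enumeration of all input combinations, which is only feasible for a handful of inputs. The two multiplexer models are hypothetical stand-ins for an RTL description and a synthesized netlist.

```python
from itertools import product


def equivalent(f, g, num_inputs):
    """Compare two Boolean functions on every input combination.

    Returns (True, None) if they agree everywhere, otherwise
    (False, counterexample) with the first mismatching input tuple.
    """
    for bits in product((False, True), repeat=num_inputs):
        if f(*bits) != g(*bits):
            return False, bits
    return True, None


def rtl_mux(sel, a, b):
    """Behavioral model: 2-to-1 multiplexer."""
    return a if sel else b


def gate_mux(sel, a, b):
    """Gate-level model: AND-OR implementation of the same multiplexer."""
    return (sel and a) or ((not sel) and b)


def buggy_mux(sel, a, b):
    """Faulty netlist: the inverter on sel is missing."""
    return (sel and a) or b


print(equivalent(rtl_mux, gate_mux, 3))
print(equivalent(rtl_mux, buggy_mux, 3))
```

A counterexample such as the one returned for the faulty netlist plays the role of the mismatch report a formal tool would give, without any test vectors having been written.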
RTL Simulation
RTL code, written in Verilog, VHDL or a combination of
both, is simulated to verify functional correctness
Testbenches apply input stimulus to the design
Several methods are used to verify the outputs
Self-checking testbenches automatically verify output
correctness and report mismatches
Results can be stored in a file and compared to previous results
Waveform displays can be used to interactively verify the outputs
Gate-Level Simulation
Covers both functionality and timing
Cell timing is included in the simulation models and
interconnect delay is passed from the synthesis run
Worst case PVT conditions are used to analyze for setup
violations, and best case PVT conditions are used to analyze
for hold violations
Correctness is only as good as the test vectors used
Especially critical for non-synchronous designs, verification of
false path and multi-cycle path constraints
PVT = Process, Voltage, Temperature
Static Timing Analysis
Verifies that design operates at desired frequency
Implicitly assumes correct timing constraints (!), e.g., boundary conditions
Timing constraints are similar to those used by logic synthesis
As with gate-level simulation, both best- and worst-case analysis is
performed
Typically performed on full-chip (not block) basis
Verifies setup and hold times at FF inputs; can also check timing from
and to PI’s and PO’s; can also check point-to-point delay values (with
blocking of pins, etc.)
May require modified constraints for inter-block issues: multiple clock
domains, multi-cycle paths, etc.
For compatibility with timing-driven layout flow, helps to have simple /
single set of constraints
Other issues: incremental analysis, …
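The setup and hold checks at a flip-flop input reduce to two slack computations, sketched below. This is a simplified single-clock illustration: the function names and the numbers in the example are hypothetical, and skew is taken as capture clock arrival minus launch clock arrival.

```python
def setup_slack(clock_period, max_arrival, t_setup, skew=0.0):
    """Setup slack using the latest (worst-case) data arrival time.

    Data launched at time 0 must settle t_setup before the next capture
    edge, which occurs at clock_period + skew. Negative slack is a violation.
    """
    return clock_period + skew - t_setup - max_arrival


def hold_slack(min_arrival, t_hold, skew=0.0):
    """Hold slack using the earliest (best-case) data arrival time.

    New data must not arrive until t_hold after the capture edge of the
    same cycle, which occurs at time skew. Negative slack is a violation.
    """
    return min_arrival - t_hold - skew


# Hypothetical path, times in ns: 400 MHz clock, slow path 2.0, fast path 0.05.
print(f"setup slack: {setup_slack(2.5, 2.0, 0.1):+.2f} ns")
print(f"hold slack:  {hold_slack(0.05, 0.08):+.2f} ns")
```

In this example the path meets setup but violates hold, which shows why both the worst-case and the best-case analysis are run.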
Block-Level Physical Floorplanning
Reconcile logical and physical hierarchies
Cells that are interconnected want to be close together
Take advantage of RTL hierarchy
Generate a physical hierarchy
RTL hierarchy = best physical hierarchy?
Often bundled within the same cockpit as the place and
route tool
Give placement some initial clues to reduce complexity
Place and Route
Automatically place the standard cells
Generate clock trees
Add any remaining power bus connections
Route clock lines
Route signal interconnects
Design rule checks on the routes and cell placements
Timing driven tools
Require timing constraints and analysis algorithms similar to those
used during the static timing analysis step
RC(L) Extraction
Calculate resistance and capacitance (and inductance) of
interconnects
Based on placement of cells
Routing segments
Calculate capacitive (inductive) effects of adjacent segments
Extract capacitance between metal segments
RC(L) data transferred back to
Static timing analysis (back annotation)
Gate level simulation
Replaces wire load model used in synthesis
Drive delay calculation, signal integrity analysis (crosstalk, other
noise), static timing
Q: How do parasitics and noise affect performance?
Physical Verification
DRC – Design Rule Check
- Spacing, min dimension rules
LVS – Layout Versus Schematic
- Verifies that layout and netlist are equivalent at the transistor level
ERC – Electrical Rule Check
- Dangling nets, floating nodes
GDSII (Stream Format)
Final merge of layout, routing and placement data for mask
production
Release to Manufacturing
Final edits to the layout are made
- Metal fill and metal stress relief rules are checked
- Manufacturing information such as scribe lanes, seal rings, mask shop data, part numbers, logos and pin 1 identification information for assembly is also added
- Cadence’s Virtuoso is used for custom-manual edits of the mask layers
DRC and LVS are run to verify the correctness of the modified database
‘Tapeout’ documentation is prepared prior to release of the GDSII to the foundry
- Pad location information is prepared, typically in a spreadsheet
Manufacturing steps
- generation of masks
- silicon processing
- wafer testing
- assembly and packaging
- manufacturing test
A More Detailed Design Flow
Design Specs
Lib.+CWLM
Lib.+CWLM
Fnl. Design
Synthesis
Floor-plan & PG
Placement
Physical re-synth
Clock distribution
Route, scan re-order
Timing analysis, IPO
Fnl., pwr., SI ECO
Reqmts.
ERC, DRC, LVS
Tape-out
A. Khan, Simplex/Altius
Constraints
• Architectural optimization (timing)
• Inter-group buses, bandwidth
• Clock, SI, test; validation
• Floorplanning and custom WLM
• Power distribution (Internal, I/O)
• I/O driver, padring design
• Board-level timing, SI
• Row definitions
• Placement of cells
• Congestion analysis
• Placement-based re-synthesis
• Noise minimization, isolation
• Clock distribution
• Full routing
• Scan stitching, re-ordering
• Full RC back-annotation
• Hierarchical timing, electrical and
SI analysis and IPO/ECO
More Design Metrics and Techniques
Design metrics
- Area: cell area, wirelength
- Timing: gate, interconnect
- Power: dynamic, static, leakage
- Signal Integrity: crosstalk (capacitive, inductive), supply voltage drop (IR drop, LdI/dt)
- Reliability: variation (Vdd, thermal, process variation (tox, BEOL)), electromigration, hot electron effect (SEU)
Techniques
- Cost minimization: synthesis (technology mapping); placement, routing
- Performance optimization: logic transformation, transistor sizing; buffering, re-routing
- Power minimization: gating (sleep transistors), variant Vdd; process optimization; dual-Vth
- Signal Integrity: sizing, net ordering, shielding; P/G design, placement, synthesis
- Reliability: statistical design optimization; design margin
Design Flow Evolution (ITRS-2003)
[Figure: design flow in three eras, Past (250–180nm), Present (130–90nm), and Future (65nm –). Each runs from system design and system model (SPEC, functional, performance, testability verification, HW/SW optimization) through RTL and SW, synthesis with timing analysis and placement/logic optimization, place/wire, and EQ check to MASKS. The past flow exchanges files; the present and future flows use a cockpit and auto-pilot with optimize and analyze loops (logic, place, wire, circuit, HW/SW; timing, power, noise, test, mfg., other) around a shared data model repository.]
Multiple design files are converged into one efficient Data Model
Disk accesses are eliminated in critical methodology loops
Verification of function, performance, testability and other design criteria all move to earlier, higher levels of abstraction followed by
- Equivalence checking
- Assertion-driven design optimizations
Industry standard interfaces for data access and control
Incremental modular tools for optimization and analysis
Design Convergence Drivers and Approaches
Wireload Model
Helps delay estimation at synthesis stage
- Gate delay = f(input slew, load cap)
- Wire cap = f’(fanout number)
Empirical
- Different for each technology, library, tool, design, and design stage
- Statistical (from library), custom (multiple iterations), structural (look at adjacent nets) …
Large deviation remains
- Routing obstacles (hard IP blocks, macros, etc.)
- Routing algorithms/implementations (timing driven, net ordering, details)
[Figures: wire cap versus #pins; % estimation error by design]
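A wireload model of the "wire cap = f'(fanout)" kind can be sketched as a small lookup table with linear extrapolation beyond the last entry. The table values, units, and names below are invented for illustration and do not come from any real library.

```python
# Hypothetical wireload model: estimated net length per fanout, in um.
WLM = {
    "cap_per_um": 0.0002,                         # pF per um of wire (made up)
    "fanout_length": {1: 20.0, 2: 45.0, 3: 70.0, 4: 100.0},
    "slope": 30.0,                                # extra um per fanout past the table
}


def wire_length(fanout, wlm=WLM):
    """Estimated net length for a given fanout, before any placement exists."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    table = wlm["fanout_length"]
    if fanout in table:
        return table[fanout]
    max_fanout = max(table)
    # Beyond the table, extend linearly from the last entry.
    return table[max_fanout] + wlm["slope"] * (fanout - max_fanout)


def wire_cap(fanout, wlm=WLM):
    """Estimated wire capacitance in pF, to be added to the gate's load cap."""
    return wire_length(fanout, wlm) * wlm["cap_per_um"]


for fo in (1, 4, 6):
    print(fo, wire_length(fo), round(wire_cap(fo), 4))
```

Every net with the same fanout gets the same estimate regardless of where its pins end up, which is one source of the large deviation the slide mentions.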
Interconnect Statistics
Local Interconnect: S_Local = S_Technology
Global Interconnect: S_Global = S_Die
What are some implications?
Rent’s Rule
Power law distribution
N = G^p
- N: number of nets
- G: number of gates
- p: Rent exponent, between 0 and 1
Foundation of statistical interconnect prediction
Empirical, unclear theoretical root
[Figure: log-log plot of lg N versus lg G]
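Because the relation is a power law, the Rent exponent is the slope of a straight line on the log-log plot. The sketch below fits that slope by least squares. It also fits a multiplicative constant k, which is an added assumption: the slide's form corresponds to k = 1. The sample data is synthetic.

```python
import math


def fit_rent(samples):
    """Least-squares fit of lg N = p * lg G + lg k.

    samples is a list of (gates, nets) pairs from partitions of a design.
    Returns (p, k); k is an extra fitting constant not shown on the slide.
    """
    xs = [math.log10(g) for g, _ in samples]
    ys = [math.log10(n) for _, n in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx
    k = 10 ** (mean_y - p * mean_x)
    return p, k


# Synthetic data generated from N = G^0.6, so the fit should recover p = 0.6.
samples = [(g, g ** 0.6) for g in (10, 100, 1000, 10000)]
p, k = fit_rent(samples)
print(f"p = {p:.3f}, k = {k:.3f}")
```

On real designs the points scatter around the line, and the fitted exponent depends on how the partitions were chosen.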
Constructive Interconnect Prediction
Statistical models have their limitations
Critical paths and the law of small numbers
Statistics properties, e.g., average wirelength
Extreme statistics properties, e.g., critical path length
Implementation details
Routing congestion, e.g., horizontal effect
Timing optimization, e.g., layer assignment
Via blockage, pin accessibility, wrong way routing, etc.
Predict by construction (physical synthesis)
try a fast (global) router
Scheffer and Nequist, Proc. ACM SLIP 2000, pp. 139-144
Goal: Design Convergence
What must converge?
logic, timing, power, SI, reliability in a physical embedding
support front-end signoff with a predictable back-end
Achieve Convergence through Predictability
correct by construction (“assume, then enforce”)
- constraints and assumptions passed downstream; not much goes
upstream
- ignores concerns via guardbanding
- separates concerns as able (e.g., FE logic/timing vs. BE spatial
embedding)
construct by correction (“tight loops”)
- logic-layout unification; synthesis-analysis unification, concurrent
optimization
elimination of concerns
- reduced degrees of freedom, pre-emptive design techniques
- e.g., power distribution, layer assignment / repeater rules
“Physical Prototyping Philosophy”
[Figure: RTL (functionality known) → Gates → Physical Prototype (timing / routability known) → Floorplan / Placement → Routing]
Prototype delivers accurate physical data
Levels of accuracy
- Placement-acknowledgeable synthesis (PKS)
- Including global route
- Post-detailed-route (In-Place Optimization, i.e., IPO)
Hierarchical timing budgeting:
- Chip-level CTS, top-level route and IPO, power analysis and grid design
- Block-level synthesis, placement, IPO, routing
“Handoff with enough physical information to ensure correct results”
M. Courtoy, Silicon Perspective
Coarse Placement Drives Partitioning, Coarse
Routing Drives Pin Assignment / Timing Opt
[Figure: physical prototype partitioned into Block 1, Block 2, and Block 3, with block-level pin assignments and block-level timing budgets]
Full-chip prototype results in optimal pin placement
- Results in narrower channels and reduced die size
- Reduces the routing congestion
- Improves the chip timing
Accurate timing budgets result in predictable timing convergence
M. Courtoy, Silicon Perspective
Cool Pictures of the Pieces…
[Figure panels: Full Chip Power Planning; Power IR Drop Analysis; Full Chip Physical Prototype (Place, Detailed Trial Route, RC Extraction, Delay Calc / STA, IPO); Partition; Block-Level Optimization; Timing Closure; Hierarchical Clock Tree Synthesis with skews of 100ps, 150ps, 130ps, 50ps, 120ps, and 50ps]
“Tape Out Every Day”
M. Courtoy, Silicon Perspective