Physical Design

Download Report

Transcript Physical Design

Physical Design
• Outline
–
–
–
–
What is Physical Design
Design Methods
Design Styles
Analysis and Verification
• Goal
– Understand physical design topics
What is Physical Design?
• Mapping logic to physical implementation
– implementation
» components
» component locations
» component wiring
» geometrical shapes
– examples
» TTL chips on a PC board
» single FPGA
» custom CMOS chip
ai2.1 ai2.2
Implementation Methods
• Implementation methods
– integrated circuits
» programmable arrays - e.g. ROM, FPGA
» full custom fabrication
– hybrid integrated circuits
» thin film - built-in resistors
» thick film - ceramics
» silicon-on-silicon - multi-chip modules
– circuit boards
» discrete wiring - wire-wrap
» printed circuits
• Design rules
– topology and geometry constraints
– imposed by physics and manufacturing
– example: wires must be > 2 microns wide
Design Methods
• Full custom design
– no constraints - output is geometry
» highest-volume, highest performance designs
– requires some handcrafted design
» 5-10 transistors/day for custom layout
– use to design cells for other methods
– primary CAD tools
» layout editor, plotter
• Cell-based design
–
–
–
–
compose design using a library of cells
at board-level, cells are chips
cell = single gate up to microprocessor
primary CAD tools
» partitioning
» placement and routing
Design Methods
• Symbolic design
– reduce problem to topology
– let tools determine geometry (following design rules)
– can reuse same topology when design rules change
» e.g. shrink wires from 2 microns to 1 micron
– used mostly to design cells
• Procedural design
– “cells” are programs
– module generation - ROMs, RAMs, PLAs
– silicon compilation - module assembly from HLL
• Analysis and verification
– design rule checking - geometry widths, spacings ok?
– circuit extraction - geometry => circuit
– interconnect verification - circuit A == circuit B?
Design Styles
• Gate array design
– FPGAs are a form of gate array
• Standard cell design
• General cell design
• Full custom design
– still used for analog circuits
Design
Styles
Implementation
Methods
Gate Array
Standard Cell
General Cell
Programmable Arrays
Custom
Custom
Cell
Symbolic
Design Methods
Procedural
Gate Array Design
• Array of prefabricated gates/transistors
• Map cell-based design onto gates
• Wire up gates
– in routing channels between gates
– over top of gates (sea of gates)
– predefined wiring patterns to convert transistors to gates
• CAD problems
– placement of gates/transistors onto fixed sites
– global and local wire routing in fixed space
Standard Cell Design
• Design circuit using standard cells
– cells are small numbers of gates, latches, etc.
• Technology mapping selects cells
• Place and wire them
– cells placed in rows
» all cells same height, different widths
– wiring between rows - channels
• CAD problems
– cell placement - row and location within row
– wiring in channels
– minimize area, delay
General Cell Design
• Generalization of standard cells
• Cells can be large, irregularly shaped
– standard cells, RAMs, ROMs, datapaths, etc.
• Used in large designs
– e.g. Pentium has datapaths, RAM, ROM, standard cells, etc.
• CAD problems
– placement and routing of arbitrary shapes is difficult
Cache
RAM
Decode
PLA
Datapath
uCode
ROM
Std. Cells
Analysis and Verification
• Analysis
– circuit extraction
» determine circuit from geometry
» compute circuit parameters from geometry
» resistance, capacitance, transistor sizes
– feed back to logic design, place & route
• Verification
– design rules
» geometry rules - e.g. widths, spacings
» electrical rules - e.g. no floating gate inputs
– interconnect
» compare designed and extracted circuit
» pin-point difference if there is one
– catch human and CAD tool bugs
Placement
Outline
– What is Placement?
– Why Placement?
– Placement Algorithms
Goal
– Understand placement problem
– Understand placement algorithms
What is Placement?
• Determination of component locations
– cells in standard cells
– ICs on PC board
» pins on ICs
– gates in gate array
– modules in chip floorplan
A
C
B
• Goal
E
D
– minimize chip/board area
» minimize routing area
» minimize unused area
– minimize wiring length
» minimize length of critical paths
– evenly distribute power dissipation
– minimize crosstalk
B
C
D
E
A
Why Placement?
• Hand placement impractical
– 1k-1000k cells in standard cell array
• Multi-goal optimization
–
–
–
–
–
placement area
routing area
delay
power
symmetry - analog circuits
• Generation of routing specification
– placement algorithms generate information for router
» routing channels in slicing structure
Objective Functions
• Minimize placement cost according to function
– want fast but reasonably accurate metrics
– area
» total area taken by components and estimated wiring
– wire length (!= delay)
» minimum spanning tree
» Steiner tree
» half-perimeter of bounding box
– wiring congestion
» track density along channels
» cutset size
density:
avg. 2.2
peak/avg 1.82
1
2
4
3
1
Cluster Growth Placement
• Add components to initial placement
– select components strongly connected to placed ones
– place near strongly connected components
• Algorithm
seedComponents = select components for seed
currentPlacement = PLACE(seedComponents, currentPlacement)
while all components not placed do
selectedComponent = SELECT(currentPlacement)
currentPlacement = PLACE(selectedComponent, currentPlacement)
endloop
– results not that good
» often used for initial starting position for other algorithms
– O(n2) complexity for n components
Placed
Components
Unplaced
Components
Quadratic Placement
•
•
Let (xi ,yi )  Coordinate s of the center of cell i
wij
 Weight of the net between cell i and cell j
x, y
 Solution v ectors
Cost of the net between cell i and cell j
1
•
 wij ( xi  x j )2  ( yi  y j )2 
2
1 T
1 T
T
T
Total cost C(x )  x Qx  d x x  y Qy  d y y  const
2
2
C (x, y )
C ( x, y )
Minimum cost :
 0,
0
xi
yi
Analytic Placement
•
Analytical Placement Framework:
repeat
Solve the convex quadratic program
Spread the cells
until the cells are evenly distributed
Simulated Annealing
• Problem
– “greedy” approaches only accept “downhill” moves that improve
cost function
– can get stuck in local minima far from global minimum
• Solution: Simulated Annealing
–
–
–
–
jump out of local minima
analogy to real annealing
heat allows atoms to jump out of local minima
cool slowly so atoms jump enough to get close to global minimum
Simulated Annealing
• Randomly move components
– score decreases => accept move
– score increases => accept move with some probability
» common function: e-deltaS/t
» probability decreases over time
• “Cool” system
– initial “temperature” t high enough to randomize system
– decrease temperature - the “annealing schedule”
– do a bunch of moves between temperature changes
Simulated Annealing Algorithm
t = t0
currentPlacement = randomInitialPlacement
currentScore = SCORE(currentPlacement)
until freezing point is reached do
until equilibrium at current t is reached do
selectedComponents = SELECT(atRandom)
trialPlacement = MOVE(selectedComponents, atRandom)
trialScore = SCORE(trialPlacement)
deltaS = trialScore - currentScore
if deltaS < 0 then currentScore = trialScore
else
r = uniformrandom(0,1)
if r < e-deltaS/t then
currentScore = trialScore
currentPlacement = trialPlacement
endif
endif
endloop
t = a*t
endloop
Move Functions for Placement
• Pair Swap
– cells may be of different size
– results in cell overlap
» penalize in cost function
» final postprocess to remove
overlaps
• Single Cell
– often causes cell overlap
– more general
• Cluster
– move connected components
– avoids a lot of unnecessary moves
Simulated Annealing Control
• Annealing Schedule
– cool fast to minimize CPU time
– cool slowly to get close to global optimum
» maintain equilibrium at each temperature
» exponentially slowly to get global optimum
– usually a = 0.95
– a = e-0.7t is closer to optimal
– at each temperature do “enough” moves to reach equilibrium
– stop when nothing moves
• Cost Function
– weighted sum of objectives
– score = w1*WireLength + w2*ChipArea + w3*CellOverlapArea
+ w4*ChipAspectRatio
– change weights to emphasize different goals
» “good” weights found by trial-and-error
» can also change weights as annealing runs
Simulated Annealing Improvements
• Minimizing Move Rejection
– at high temperature nearly all moves are accepted
» system is randomized
– at low temperature nearly all moves are rejected
» most CPU time is wasted
• Approaches for Placement
– range limiters - limit distance of moves
» far away moves unlikely at lower temperatures
– only generate acceptable moves
» moves that improve cost function
» partitioning example: move cells with positive gain
» must keep around lots of information
» cost per move is higher, but less wasted effort
Simulated Annealing Issues
• Advantages
–
–
–
–
–
general-purpose optimization approach
finds global optimum
easily trade speed for quality
can incorporate many cost functions
can incorporate complex move functions
» take advantage of problem structure
• Disadvantages
– constraints complicate move and cost functions
» e.g. preplaced blocks, fixed wire lengths
– CPU hog
• Implementations
– Timberwolf, Cadence, Avant! - standard cell placement
– ACACIA - analog circuit transistor placement
Routing
Outline
–
–
–
–
–
What is Routing?
Why Routing?
Routing Algorithms Overview
Global Routing
Detail Routing
Goal
– Understand routing problem
– Understand overview of routing
algorithms
– Understand shortest path algorithms
What is Routing?
• Determination of component wiring
– assignment of wires to routing areas
» restricted routing problems
» e.g. routing channel
– detailed wiring within areas
» layer assignment
» vias
B
C
D
E
A
• Goal
– minimize routing area
» minimize wiring length
» minimize channel height
– minimize lengths of critical paths
– 100% completion in allocated space
» fixed wiring areas
» e.g. gate arrays, FPGAs
– minimize crosstalk
B
C
D
E
A
Why Routing?
• Hand routing impractical
– 1k-100k nets in large chip designs
– prone to error
– example
» IBM TCM has 26 routing layers
» how to use layers?
• Multi-goal optimization
–
–
–
–
–
routing area
delay
cross-talk
clock and power routing
manufacturing yield
Objective Functions
• Minimize routing cost according to function
– want fast but reasonably accurate metrics
– wiring area
» channel area = channel density * channel length
– wire length
» minimum spanning tree
» Steiner tree
» half-perimeter of bounding box
– wiring congestion
» track density along channels
L
peak density 4
average 2.2
peak/avg 1.82
area = h*L
h
1
2
4
3
1
Types of Routers
• Global routers
– function
» determine routing areas
» assign net to routing areas
» minimize global routing area, path lengths
– consider congestion, approximate path length
• Detail routers
– goal
» route actual wires
» minimize routing area, path lengths
– general-purpose - maze, line probe
– restricted - channel, switchbox, river routers
• Specialized
– power, clock routers
Example: Global Routing
B
C
D
B
C
D
E
B
A
E
C
D
B
A
C
D
E
A
E
A
Example: Detailed Routing
1
1
2
3 4 2
2
3
2
4
2 1
3
3
Shortest Path and Multi-Commodity
• Find shortest path between two vertices in
weighted graph
–
–
–
–
graph with edge weights
weight is distance, congestion, etc.
search graph from source to destination
backtrace to source once destination is found
• Multi-commodity flow
– Each net is a commodity
– Simultaneously route all commodities
S
1
3
3
4
2
3
1
1
2
5
1
10
D
Maze Routing
• Place wires and components on grid
– grid size = wire width + spacing
– works best if all wire widths and spacings are equal
– grid squares occupied
• Distance metric
– each square has 4 neighbors unit cost away
• Search graph
– node for each square
– unit-cost edge to 4 neighboring nodes
1
1
1
1
1 1
1
1
w+s
Maze Routing
• Shortest path search (Lee-Moore)
– maintain leaf nodes of expansion tree
– at each step add unexplored neighbors to list
» with accumulated cost
– stop when target vertex reached
» backtrace for route
3
3 2
3 2 1
3 2
3
• Guaranteed to find shortest path
– large memory requirements - the grid
– long search time - number of vertices marked
– local optimization - only routing one net at a time
16 15 14 13 12 11 10
16 15 14 13
16 15 14
8
9
16 15 14 13 12 11 10
13 12 11
16 15 14 10
16 15
9
9 8 7 6
5
7 6 5 4
8 7 6 5
8 7
9 8
8 7
8 7 6 5
5
4
3
2
1
2
3
4
4
3
2
1
0
1
2
3
3
2
1
0
1
2
3
3
2 3
1 2 3
2 3
3
Algorithm
add source node S into list
costS = 0
for all nodes i on list
if i == destination D
pathcost = costD
break
for j=N,S,E,W neighbors that are not visited
and not occupied
add node j to list
costj = costi+1
mark i as visited
remove node i from list
while i != S do
mark i as path
i = min cost N,S,E,W visited neighbor
Maze Routing
• Avoiding blockage
– wire at a time => can block wires
A
A
B
A
A
B
B
A
B
• Solutions
– rip-up and re-route
» remove wire(s) causing blockage
» route blocked wire
» route ripped up wires
– shove-aside
» add dummy grid lines
» squeeze wire through
A
B
Line Probe Routing
• Gridless - store list of lines and obstructions
– sorted lists of vertical and horizontal lines
• Algorithm
– from source and destination project 4 horizontal/vertical rays (probes)
– if probes intersect, done - route wires from nodes to intersection
– if probes blocked, choose escape point and send new probes
» a point just past obstruction
E
E
B
A
E
E
Line Probe Routing
• Advantages
– small memory requirements - no grid
» store sorted lists of vertical and horizontal segments
– fast
» binary search of segments for blockage and escape points
» # probes << # grid points
– higher precision coordinates - no grid
• Disadvantages
– serial wire at a time - blockages
» similar rip-up and shove-aside approaches to cope
– basic algorithm may not find route when one exists
» need to try more escape points
» degenerates to grid search
Channel Routing
• Channel
– terminals on two sides of rectangle are fixed
– horizontal wires on layer1 - trunks, which run in tracks
– vertical wires on layer2 - branches, which run in columns
• Routing Problem
– minimize number of tracks used => minimizes channel height
– minimize wire lengths, vias
– all wires routed at same time - better overall optimization
A
B
A
A
track
via
B
A
dogleg
branch
trunk
Channel Routing
• Greedy algorithm
– route column by column rather than track by track
– scan left-right by column
– at each track apply heuristics to bring new nets to an
existing trunk, and move nets between tracks
» greedy - at each column try to minimize tracks and
minimize distance to terminals
– can often use exhaustive search
» number of tracks in a column is small - < 100
– good results, but lots of vias
» lots of track to track movement for each net
Algorithm
for each column
for each terminal bring in branch
connect to trunk if it exists
create trunk at first empty track
otherwise create new track
move between tracks using heuristics
collapse nets to fewer trunks with spare branch space
reduce distance between net trunks
move nets towards next terminal
connect multiple trunks for a net at end of channel
A
B
B
A
Pattern Channel Routers
• Slide window along channel
– recognize patterns of terminals and occupiedGet more
optimal results
– doglegs
• Challenge
– developing rule set
A
C
D
B
Problem:
Vertical constraint between A and B.
C and D occupy surrounding tracks
and columns
Solution:
Shift contacts down and to right,
doglegs on each layer to reach them
Issues with DFM
• Due to shrinking feature size, more design rules
are required
• Details routing becomes more tedious,
considering various manufacturability rules
• Should we consider DFM while doing routing, or
leave DFM to the next level?