Transcript ppt

Logic synthesis flow
• Technology independent mapping
– Two level or multilevel optimization to optimize a coarse metric
related to area/delay
• Technology dependent mapping
– More “concrete” synthesis: map logic function to a given library,
typically with characterized areas and delays
– Represent circuit by a subject graph
– Represent library cells by a set of pattern graphs
– (These graphs are always directed acyclic graphs (DAG’s) for
combinational circuits)
– Find optimal mapping of pattern graphs to the subject graph to
optimize a cost function
Example library and its pattern graphs
Nodes:
2NAND
NOT
AOI21
(6)
INV
(2)
NAND2
(4)
NAND3
(6)
AOI22
(8)
XOR2
(12)
NAND4
(8)
or
Cost of using the gate, taken here
as the number of transistors
(input permutations not shown)
Example circuit
Subject graph
Two possible mappings
NAND2
INV
NAND4
NAND3
Cost = 2Cost(NAND2)+2Cost(INV)
+Cost(NAND4)+Cost (NAND3)
= 2  4 + 2  2 + 8 + 6 = 26
NAND2
AOI21
Cost = Cost(NAND2)+Cost(INV)
+Cost(AOI21)+2Cost (NAND3)
= 4 + 2 + 6 + 2  6 = 24
INV
NAND3
NAND3
Tree mapping
• Applicable to “fanout-free” regions and provably optimal
• Simple objective: minimize area
– Cost is fanout-independent
– Things are more tricky when costs are fanout-dependent
• Basic idea
– Traverse from inputs to outputs
– At each node
• Enumerate all mappings of the node and find cost
• Choose lowest cost solution based on node cost and sum of input
costs
– Optimal substructure property: any optimal solution can be formed
by using the optimal solutions at previous nodes
Tree mapping: an example
Mapped as NAND3
Cost = Cost(NAND3)+Cost(fanins)
= 6+0+4 = 10
Mapped as NAND2
Cost = Cost(NAND2)+Cost(fanins)
= 4+6+4 = 14
Cost = 4
Cost = 2+4=6
Cost = min(10,14) = 10
Cost = 4
Cost = 2+10=12
Cost = min(4+12+0,6+6+4+0,8+0+4+0) = 12
NAND2
NAND3
NAND4
Finer points
• Handling inversions better
–
–
–
–
–
Introduce a buffer element (two inverters) with cost 0 into library
Replace inverter-free edges in subject graph by a pair of inverters
Make appropriate (similar) minor changes to pattern graphs
Map as before
Advantage: allows inversions to be considered; any inversions not used
are replaced by buffers with cost zero during mapping
• Mapping DAGs
– DAG mapping is NP-complete for area objectives
– Intuition: at multi-fanout points, contradictory optimal choices can be
made; can only be resolved by logic duplication
– Heuristic: decompose DAG into a forest of trees and map each optimally
Circuit decomposition
• Quality of mapping depends on decomposition of circuit
– 2NAND/NOT and AND/OR/NOT decompositions used most often
– Number of such decompositions is huge!
• Clever way of considering all decompositions proposed by
Lehman, Watanabe, Grodstein and Harkness, IEEE
Transactions on CAD 8/1997, pp. 813-834
Other objective functions
• Minimizing area considered – seen to be easy
• Minimizing delay is more complex, since delay is fanoutdependent
• Basic procedure for tree-mapping
– Traverse graph as before
– Store several best solutions at each node, parameterized by load
capacitance values (i.e., store best solution for several load caps)
– This also allows mapper to use various power levels available for
each cell in the library
Other objective functions (contd.)
• Minimize Area subject to Delay  Dspec
• Basic dynamic programming procedure
–
–
–
–
Traverse tree as before
Find all (Area,Delay) choices at each node
(For fanout-dependent delay, parameterize delay by fanout cap)
Remove provably suboptimal (Area,Delay) subsolutions
• If Area1  Area2 and Delay1  Delay2 (clearly one inequality must be
strict!), then any optimal solution will prefer (Area1,Delay1) to
(Area2,Delay2)
This point is provably worst
• Can prune the latter from the list
than this one
Delay
• Pictorially
Area
Final note on synthesis
• Deep submicron technologies (< 0.25 micron or so):
– Wire capacitances are significant
– Interconnect significantly affects circuit performance
• Traditional wire-load models
– Fanout load depends on # fanouts, calculated statistically
– Invalid in deep submicron since fanout RC’s depend on locations and not
just number of fanouts
• Physical synthesis
– Concurrently perform placement and synthesis
– Common approach
• Create “physical prototype” – coarse placement
• Perform synthesis according to this placement; refine placement as you go
along