slides (PowerPoint) - Newcastle University
Download
Report
Transcript slides (PowerPoint) - Newcastle University
Efficient Automatic Resolution
of Encoding Conflicts Using
STG Unfoldings
Victor Khomenko
School of Computing Science,
Newcastle University, UK
Asynchronous circuits
The traditional synchronous (clocked) designs
lack flexibility to cope with contemporary
design technology challenges
Asynchronous circuits – no clocks:
Low power consumption and EMI
Tolerant of voltage, temperature and
manufacturing process variations
Modularity – no problems with the clock skew
and related subtle issues
[ITRS’05]: 22% of designs will be driven by ‘handshake
clocking’ in 2013, and 40% in 2020
Hard to synthesize efficient circuits
The theory is not sufficiently developed
Limited tool support
2
Motivation
•
Resolution of encoding conflicts is one of
the most difficult tasks in logic synthesis
•
The quality of the resulting circuit (in terms
of area and latency) depends to a large
extent on the way the encoding conflicts
were resolved
3
Example: VME Bus Controller
Data Transceiver
Bus
dsr
dtack
VME Bus
Controller
lds
ldtack
dsr+
lds+
dtackd-
ldsdsr-
ldtackdtack+
Device
d
ldtack+
d+
4
Example: Encoding Conflict
dtack-
00100
10000
dsr+
00000
lds+
ldtack-
ldtack-
dtack01100
lds01110
01000
ldtack-
dsr+
10010
11000
ldsldsdtackdsr+
01010 11010 M’’
ldtack+
M’
11010
d+
ddsr01111
dtack+
11111
11011
5
State Graphs vs. Unfoldings
State Graphs:
Relatively easy theory
Many efficient algorithms
Not visual
State space explosion problem
6
State Graphs vs. Unfoldings
Unfoldings:
Alleviate the state space explosion problem
More visual than state graphs
Proven efficient for model checking
Quite complicated theory
Not sufficiently investigated
Relatively few algorithms
7
Example: Encoding Conflict
e8
e10
dtack- dsr+
e1
e2
e3
e4
dsr+
lds+
ldtack+
d+
Code(conf’)=10110
e5
e6
dtack+ dsr-
Code(conf’’)=10110
e7
e12
lds+
dlds-
ldtack-
e9
e11
8
Example: Resolving the conflict
dtack-
dsr+
csc+
lds+
d-
lds-
ldtack-
ldtack+
csc-
dsr-
dtack+
d+
9
Example: Resolving the conflict
dtack-
001000
ldtack-
dsr+
000000
ldtack-
lds011100
d-
010000
100001
lds+
ldtack-
dtack011000
csc+
100000
dsr+
100101
110000
ldtack+
ldsldsdtackdsr+
110100 M’’ M’ 110101
010100
d+
csc011110
011111
dsr111111
dtack+
110111
10
Example: Resulting Circuit
Data Transceiver
Bus
d
dtack
dsr
csc
Device
lds
ldtack
11
Core map
•
•
•
Cores often overlap
High-density areas are good candidates for
signal insertion
Analogy with topographic maps
Core1
Core2
A1
A2
A3
Core3
12
Transformations
•
•
•
Need to check the validity of the
transformation: safeness, bisimulation, not
delaying inputs, preserving semi-modularity
The validity should be checked before the
transformation is performed, i.e. on the original
prefix (to avoid backtracking)
Perform the transformation directly on the
prefix to avoid re-unfolding
Re-unfolding is time-consuming
Good for visualization (re-unfolding can
dramatically change the look of the prefix)
Can transfer unresolved encoding conflicts
between the iterations of the algorithm
13
Sequential pre-insertion
Preserves safeness
Preserves traces
Can introduce deadlocks: check that the new transition
never ‘steals’ tokens from any other enabled transition
state property – can be checked on the prefix
Easy to check that inputs are not delayed
Can violate semi-modularity
state property – can be checked on the prefix
14
Sequential post-insertion
Preserves safeness
Yields a bisimular PN
Preserves semi-modularity
Easy to check (conservatively) that inputs are not
delayed
15
Concurrent insertion
Can introduce non-safeness and/or deadlocks
the correctness can be checked on the prefix
Easy to check that inputs are not delayed
Preserves semi-modularity
16
Equivalent insertions
Equivalence is easy to check
Fewer transformations to consider
Can convert to ‘canonical form’, e.g.
pre-insertions
No need to check validity – postinsertions are always valid
17
Commutative insertions
Two transition insertions commute if they can be
performed in any order
• concurrent insertions commute with any other
insertions
• pre-insertions commute with post-insertions
• two pre/post-insertions commute iff they split
different transitions or the sets of split off
places do not overlap
A valid insertion remains valid if another valid
commutative insertion is applied first, i.e. the
validity needs to be checked only once
18
Outline of the algorithm
1.
2.
3.
4.
Compute unfolding prefix of the STG
Compute conflict cores and terminate if none
Compute the set of valid transition insertions
Find a subset of these insertions such that
no two of them are
SAT
– non-commuting, or
Incremental
– concurrent, or
– in auto-conflict, or
SAT
– one of them can trigger the other
consistent assignment of signs is possible
some encoding conflicts are resolved
a cost function is minimised
5. Apply the insertions to the STG and the prefix and
goto 2
19
Cost function
Parameterised by the user; takes into account:
• the number of unresolved CSC and USC cores
• the delay introduced by the insertion
• the number of syntactic triggers of all noninput signals
• the number of inserted transitions of a signal
• the number of signals which are not locked
with the newly inserted signal
20
Case study 1: VME bus controller
e8
e10
dtack- dsr+
e1
e2
e3
e4
dsr+
lds+
ldtack+
d+
•
•
•
•
e5
e6
dtack+ dsr-
e7
e12
lds+
dlds-
ldtack-
e9
e11
Fully sequential solution
Fully concurrent solution
Solution with two set and two reset transitions
Explored design space: 17 different solutions
21
Case study 2: 8-way sequencer
•
•
Fully concurrent solution with 3 signals (csc1 is
set and reset twice)
Petrify needs 4 signals
22
Experimental results
•
•
•
Compared with Petrify and the ILP approach of
[CC06]
Small benchmarks:
always better in terms of inserted signals
8.5-8.8% smaller area
Jordi Cortadella was impressed
Scalable benchmarks (the tool of [CC06] was
not available):
almost the same area and number of signals
runtime and memory consumption are better
by orders of magnitude
some intractable for Petrify benchmarks
were easily solved
23
Thank you!
Any questions?
24