Power Estimation - Elwin Chandra Monie

Download Report

Transcript Power Estimation - Elwin Chandra Monie

Power Estimation
Dr. Elwin Chandra Monie
Abstraction, Complexity, Accuracy
Abstraction level
Algorithm
Computing resources
Analysis accuracy
Least
Worst
Most
Best
Software and system
Hardware behavior
Register transfer
Logic
Circuit
Device
2
Spice Simulation
• Circuit/device level analysis
• Circuit modeled as network of transistors, capacitors, resistors
and voltage/current sources.
• Node current equations using Kirchhoff’s current law.
• Average and instantaneous power computed from supply
voltage and device current.
• Analysis is accurate but expensive
• Used to characterize parts of a larger circuit.
3
Gate-Level Power Analysis
• Pre-simulation analysis:
• Partition circuit into channel connected gate
components.
• Determine node capacitances from layout analysis
(accurate) or from wire-load model* (approximate).
• Determine dynamic and static power from Spice for
each gate.
• Determine gate delays using Spice or Elmore delay
model.
* Wire-load model estimates capacitance of a net by its pin-count.
See Yeap, p. 39.
4
Gate-Level Power Analysis (Cont.)
• Run discrete-event (event-driven) logic
simulation with a set of input vectors.
• Monitor the toggle count of each net and obtain
capacitive power dissipation:
Pcap = Σ Ck V 2 f
all nodes k
• Where:
• Ck is the total node capacitance being switched, as
determined by the simulator.
• V is the supply voltage.
• f is the clock frequency, i.e., the number of vectors applied
per unit time
5
Gate-Level Power Analysis (Cont.)
• Monitor dynamic energy events at the input of each gate
and obtain internal switching power dissipation:
Pint
= Σ
Σ
E(g,e) F(g,e)
gates g events e
• Where
• E(g,e) = energy of event e of gate g, pre-computed
from Spice.
• F(g,e) = occurrence frequency of the event e at
gate g, observed by logic simulation.
6
Gate-Level Power Analysis (Cont.)
• Monitor the static power dissipation state of each
gate and obtain the static power dissipation:
Pstat = Σ
gates g
Σ P(g,s) T(g,s)/ T
states s
• Where
• P(g,s) = static power dissipation of gate g for state s,
obtained from Spice.
• T(g,s) = duration of state s at gate g, obtained from logic
simulation.
• T = vector period.
7
Gate-Level Power Analysis
• Sum up all three components of power:
P = Pcap + Pint + Pstat
8
Switching Frequency
Number of transitions per unit time:
T
=
N(t)
───
t
For a continuous signal:
T
=
lim
t→∞
N(t)
───
t
T is defined as transition density.
9
Static Signal Probabilities
• Observe signal for interval t 0 + t 1
• Signal is 1 for duration t 1
• Signal is 0 for duration t 0
• Signal probabilities:
• p 1 = t 1/(t 0 + t 1)
• p 0 = t 0/(t 0 + t 1) = 1 – p 1
10
Static Transition Probabilities
• Transition probabilities:
• T 01 = p 0 Prob{signal is 1 | signal was 0} = p 0 p1
• T 10 = p 1 Prob{signal is 0 | signal was 1} = p 1 p 0
• T = T 01 + T 10 = 2 p 0 p 1 = 2 p 1 (1 – p 1)
• Transition density: T = 2 p 1 (1 – p 1)
11
Static Transition Frequency
f = p1(1 – p1)
0.25
0.2
0.1
0.0
0
0.25
0.5
p1
0.75
1.0
12
Inaccuracy in Transition Density
p1 = 0.5
T = 1.0
1/fck
p1 = 0.5
T = 4/6
p1 = 0.5
T = 1/6
Observe that the formula, T = 2 p1 (1 – p1), is not correct.
13
Inaccuracy in Transition Density
p1 = 0.5
T = 1.0
1/fck
p1 = 0.5
T = 4/6
p1 = 0.5
T = 1/6
Observe that the formula, T = 2 p1 (1 – p1), is not correct.
14
Cause for Error and Correction
• Probability of transition is not independent of the present
state of the signal.
• Determine probability p 01 of a 0→1 transition.
• Recognize p 01 ≠ p 0 × p 1
• We obtain p 1 = (1 – p 1)p 01 + p 1 p 11
p 01
p 1 = ─────────
1 – p 11 + p 01
15
Correction (Cont.)
• Since p 11 + p 10 = 1, i.e., given that the signal was
previously 1, its present value can be either 1 or 0.
• Therefore,
p 01
p 1 = ──────
p 10 + p 01
This uniquely gives signal probability as a function of
transition probabilities.
16
Transition and Signal Probabilities
p01 = p10 = 0.5
p1 = 0.5
1/fck
p01 = p10 = 1/3
p1 = 0.5
p01 = p10 = 1/6
p1 = 0.5
17
Probabilities: p0, p1, p00, p01, p10, p11
•
•
•
p 01 + p 00 =1
p 11 + p 10 = 1
p0=1–p1
p 01
p 1 = ───────
p 10 + p 01
18
Transition Density
• T
= 2 p 1 (1 – p 1) = p 0 p 01 + p 1 p 10
= 2 p 10 p 01 / (p 10 + p 01)
= 2 p 1 p 10 = 2 p 0 p 01
19
Power Calculation
• Power can be estimated if transition density is known for
all signals.
• Calculation of transition density requires
• Signal probabilities
• Transition densities for primary inputs; computed from
vector statistics
20
Signal Probabilities
x1
x1 x2
x2
x1
x1 + x2 – x1x2
x2
x1
1 - x1
21
Signal Probabilities
x1
x2
x3
X1
0
0
0
0
1
1
1
1
X2
0
0
1
1
0
0
1
1
0.5
x1 x2
0.25
0.5
0.625
0.5
X3
0
1
0
1
0
1
0
1
Y
1
0
1
0
1
0
1
1
y = 1 - (1 - x1x2) x3
= 1 - x3 + x1x2x3
= 0.625
Ref: K. P. Parker and E. J. McCluskey,
“Probabilistic Treatment of General
Combinational Networks,” IEEE Trans.
on Computers, vol. C-24, no. 6, pp. 668670, June 1975.
22
Correlated Signal Probabilities
x1
0.5
x1 x2
0.5
0.25
0.625?
x2
X1
0
0
1
1
X2
0
1
0
1
Y
1
0
1
1
y = 1 - (1 - x1x2) x2
= 1 – x2 + x1x2x2
= 1 – x2 + x1x2
= 0.75 (correct value)
23
Correlated Signal Probabilities
x1
x2
X1
0
0
1
1
0.5
x1 + x2 – x1x2
0.5
0.75
X2
0
1
0
1
Y
0
1
0
1
0.375?
y = (x1 + x2 – x1x2) x2
= x1x2 + x2x2 – x1x2x2
= x1x2 + x2 – x1x2
= x2
= 0.5 (correct value)
24
Observation
• Numerical computation of signal probabilities is
accurate for fanout-free circuits.
25
Remedies
• Use Shannon’s expansion theorem to compute signal
probabilities.
• Use Boolean difference formula to compute transition
densities.
26
Shannon’s Expansion Theorem
• C. E. Shannon, “A Symbolic Analysis of Relay
and Switching Circuits,” Trans. AIEE, vol. 57, pp.
713-723, 1938.
• Consider:
• Boolean variables, X1, X2, . . . , Xn
• Boolean function, F(X1, X2, . . . , Xn)
• Then F = Xi F(Xi=1) + Xi’ F(Xi=0)
• Where
• Xi’ is complement of X1
• Cofactors, F(Xi=j) = F(X1, X2, . . , Xi=j, . . , Xn), j = 0 or 1
27
Expansion About Two Inputs
•
F = XiXj F(Xi=1, Xj=1) + XiXj’ F(Xi=1, Xj=0)
+ Xi’Xj F(Xi=0, Xj=1)
+ Xi’Xj’ F(Xi=0, Xj=0)
• In general, a Boolean function can be expanded about any
number of input variables.
• Expansion about k variables will have 2k terms.
28
Correlated Signal Probabilities
X1
X1 X2
Y = X1 X2 + X2’
X2
X1
0
0
1
1
X2
0
1
0
1
Y
1
0
1
1
Shannon expansion about the
reconverging input, X2:
Y = X2 Y(X2 = 1) + X2’ Y(X2 = 0)
= X2 (X1) + X2’ (1)
29
Correlated Signals
• When the output function is expanded about all
reconverging input variables,
• All cofactors correspond to fanout-free circuits.
• Signal probabilities for cofactor outputs can be calculated
without error.
• A weighted sum of cofactor probabilities gives the correct
probability of the output.
• For two reconverging inputs:
f = xixj f(Xi=1, Xj=1) + xi(1-xj) f(Xi=1, Xj=0)
+ (1-xi)xj f(Xi=0, Xj=1) + (1-xi)(1-xj) f(Xi=0, Xj=0)
30
Correlated Signal Probabilities
X1
X1 X2
Y = X1 X2 + X2’
X2
X1
0
0
1
1
X2
0
1
0
1
Y
1
0
1
1
Shannon expansion about the
reconverging input, X2:
Y = X2 Y(X2=1) + X2’ Y(X2=0)
= X2 (X1) + X2’ (1)
y = x2 (0.5) + (1-x2) (1)
= 0.5 (0.5) + (1-0.5) (1)
= 0.75
31
Example
0.5
0.5
1
0
0.5
Reconv.
signal
Supergate
0.25
0.5
0.0
0.0 0.5
1.0
Point of
reconv.
0.5
1.0
0.375
Signal probability for supergate output
= 0.5 Prob{rec. signal = 1} + 1.0 Prob{rec. signal = 0}
= 0.5 × 0.5 + 1.0 × 0.5 = 0.75
S. C. Seth and V. D. Agrawal, “A New Model for Computation of
Probabilistic Testability in Combinational Circuits,” Integration, the VLSI
Journal, vol. 7, no. 1, pp. 49-75, April 1989.
32
Probability Calculation Algorithm
• Partition circuit into supergates.
• Definition: A supergate is a circuit partition with a single output
such that all fanouts that reconverge at the output are contained
within the supergate.
• Identify reconverging and non-reconverging inputs
of each supergate.
• Compute signal probabilities from PI to PO:
• For a supergate whose input probabilities are known
• Enumerate reconverging input states
• For each input state do gate by gate probability
computation
• Sum up corresponding signal probabilities,
weighted by state probabilities
33
Calculating Transition Density
..
..
.
1
x1, T1
xn, Tn
Boolean
function
y, T(Y) = ?
n
34
Boolean Difference
Boolean diff(Y, Xi) =
∂Y
── = Y(Xi=1) ⊕ Y(Xi=0)
∂Xi
• Boolean diff(Y, Xi) = 1 means that a path is sensitized from input Xi
to output Y.
• Prob(Boolean diff(Y, Xi) = 1) is the probability of transmitting a toggle
from Xi to Y.
• Probability of Boolean difference is determined from the probabilities
of cofactors of Y with respect to Xi.
F. F. Sellers, M. Y. Hsiao and L. W. Bearnson, “Analyzing Errors with
the Boolean Difference,” IEEE Trans. on Computers, vol. C-17, no. 7,
pp. 676-683, July 1968.
35
Transition Density
n
T(y) = Σ T(Xi) Prob(Boolean diff(Y, Xi) = 1)
i=1
F. Najm, “Transition Density: A New Measure of Activity in Digital
Circuits,” IEEE Trans. CAD, vol. 12, pp. 310-323, Feb. 1993.
36
Power Computation
• For each primary input, determine signal probability and
transition density for given vectors.
• For each internal node and primary output Y, find the
transition density T(Y), using supergate partitioning and
the Boolean difference formula.
• Compute power,
Σ
0.5CY V2 T(Y)
all Y
where CY is the capacitance of node Y and V is supply
voltage.
P=
37
Transition Density and Power
X1
X2
X3
0.2, 1
0.3, 2
0.06, 0.7
0.436, 3.24
Ci
Y
0.4, 3
CY
Transition density
Signal probability
Power = 0.5 V 2 (0.7Ci + 3.24CY)
38
Prob. Method vs. Logic Sim.
Probability method
Logic Simulation
Circuit
No. of
gates
Av. density
CPU s*
Av. density
CPU s*
Error
%
C432
160
3.46
0.52
3.39
63
+2.1
C499
202
11.36
0.58
8.57
241
+29.8
C880
383
2.78
1.06
3.25
132
-14.5
C1355
346
4.19
1.39
6.18
408
-32.2
C1908
880
2.97
2.00
5.01
464
-40.7
C2670
1193
3.50
3.45
4.00
619
-12.5
C3540
1669
4.47
3.77
4.49
1082
-0.4
C5315
2307
3.52
6.41
4.79
1616
-26.5
C6288
2406
25.10
5.67
34.17
31057
-26.5
3.83
9.85
5.08
2713
-24.2
C7552 3512
* CONVEX c240
39
Problem 1
For equiprobable inputs analyze the 0→1 transition probabilities of all
gates in the two implementations of a four-input AND gate shown
below. Assuming that the gates have zero delays, which
implementation will consume less average dynamic power?
A
B
C
D
E
F
Chain structure
G
A
B
C
D
E
G
F
Tree structure
40
Problem 1 Solution
Given the primary input probabilities, P(A) = P(B) = P(C) = P(D) = 0.5,
signal and transition (0→1) probabilities are as follows:
Signal
name
Chain
Tree
Prob(sig.= 1) Prob(0→1) Prob(sig.=1)
Prob(0→1)
E
0.2500
0.1875
0.2500
0.1875
F
0.1250
0.1094
0.2500
0.1875
G
0.0625
0.0586
0.0625
0.0586
Total
transitions/vector
0.3555
0.4336
The tree implementation consumes 100×(0.4336 – 0.3555)/0.3555 = 22%
more average dynamic power. This advantage of the chain structure may
be somewhat reduced because of glitches caused by unbalanced path
delays.
41
Problem 2
Assume that the two-input AND gates in Problem 1 each has one unit
of delay. Find input vector pairs for each implementation that will
consume the peak dynamic power. Which implementation consumes
less peak dynamic power?
A
B
C
D
E
F
Chain structure
G
A
B
C
D
E
G
F
Tree structure
42
Problem 2 Solution
For the chain structure, a vector pair {A B C D} = {1110},{1011} will
produce four gate transitions as shown below.
A
E
F
B
C
D
G
A=11
B=10
E=10
C=11
F=10
D=01
G=00
0
1
2
3
Time units
43
Problem 2 Solution (Cont.)
The tree structure has balanced delay paths. So it cannot make more
than 3 gate transitions. A vector pair {ABCD} = {1111},{1010} will
produce three transitions as shown below.
A
E
B
G
C
D
F
A=11
Therefore, just counting the gate
transitions, we find that the chain
consumes 100(4 – 3)/3 = 33%
higher peak power than the tree.
B=10
E=10
C=11
D=10
F=10
G=10
Time units
0
1
2
3
44