DLsp07-m12-CombTiming-v3 - FAMU


FAMU-FSU
College of Engineering
EEL 3705 / 3705L
Digital Logic Design
Fall 2006
Instructor: Dr. Michael Frank
Module #12:
Combinational Logic Cost & Timing Analysis
(Thanks to Dr. Perry for some slides)
Combinational Logic – Cost Analysis

In practice, accurately calculating the manufacturing
cost of a given design may be very complicated…

Some factors:

General type of logic technology used:
    Custom VLSI vs. ASIC vs. FPGA vs. old-school TTL/MSI
Many precise details of the specific technology used
Nonlinear effects of wiring cost:
    Average wire length tends to grow as # of devices increases
Nonlinear effects of die area on IC yield:
    May be ameliorated by fault-tolerant architectural techniques

Still, a simple, first-order, back-of-the-envelope estimate
of circuit cost can be obtained by modeling cost as linear
in the number of gates or transistors used.
Number of Transistors per Logic Gate

Example for a typical, simple static CMOS VLSI technology:

NOT gate (inverter): 2 transistors (2T)
Buffer: 4T
NAND/NOR: 4T
AND/OR: 6T
XOR/XNOR: 8-10T (8 if the complement of the input signal is already available)
Cost/Transistor Figures Today

Typical ballpark figures for a leading-edge, high-performance CPU with plenty of cache today (2007):

Cost per IC: ~$500
Number of transistors: ~1 billion

Thus, the average cost per transistor is:

$500 / 10⁹ T = $5×10⁻⁷/T, i.e. on the order of $10⁻⁷/T = 10⁻⁵ ¢/T = .00001 ¢/T
(the round figure .00001 ¢/T is the one used in the examples that follow)

Note this includes an amortized share of the cost due to wiring, yield considerations, etc.
Cost Estimation Example

Estimate the cost of a simple 64-bit ripple-carry adder in a modern VLSI process.

Half-adder = AND + XOR = 6 + 10 T = 16 T
Full adder = 2 HAs + OR = 2×16 + 6 = 38 T
64 full adders = 64 × 38 T = 2,432 T
2,432 T × .00001 ¢/T = .02432 ¢

Thus, a simple 64-bit adder costs about two one-hundredths of a cent to manufacture.

This may also be further optimized
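
The arithmetic of this estimate can be reproduced with a short script. This is only a sketch of the first-order linear cost model: the gate transistor counts and the per-transistor cost are the ballpark figures from the slides, XOR is taken at its 10T upper bound, and the function names are made up for illustration.

```python
# Transistor counts per gate, taken from the static-CMOS figures in the slides
# (XOR uses the 10T upper bound, as in the half-adder line).
TRANSISTORS = {"AND": 6, "OR": 6, "XOR": 10}
CENTS_PER_TRANSISTOR = 1e-5  # ballpark figure from the cost-per-transistor slide


def half_adder_transistors():
    """Half-adder = one AND gate plus one XOR gate."""
    return TRANSISTORS["AND"] + TRANSISTORS["XOR"]


def full_adder_transistors():
    """Full adder = two half-adders plus one OR gate."""
    return 2 * half_adder_transistors() + TRANSISTORS["OR"]


def ripple_carry_adder_transistors(bits):
    """An n-bit ripple-carry adder is modeled as n full adders."""
    return bits * full_adder_transistors()


def cost_in_cents(transistor_count):
    """Linear cost model: cost grows in proportion to transistor count."""
    return transistor_count * CENTS_PER_TRANSISTOR


total = ripple_carry_adder_transistors(64)
print(total, "transistors,", round(cost_in_cents(total), 5), "cents")
```

Changing `bits` or the XOR transistor count shows how sensitive the estimate is to those assumptions.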
Cost-performance Analysis

An important figure of merit for many digital
systems is their cost-performance, meaning
performance (operations performed per unit of time)
per unit of manufacturing cost



A.k.a. cost-efficiency, hardware efficiency
Can be measured in e.g. ops/sec/$
Example: Suppose the 64-bit adder of the previous slide can be clocked at 1 GHz. What is its cost-performance, in terms of 64-bit add operations?

(10⁹ ops/sec/adder) / (0.024 ¢/adder) × (100 ¢/$)
= 4.2 × 10¹² ops/sec/$
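
As a rough check of this figure, the same calculation can be written out in code. The sketch assumes one add completes per clock cycle and uses the rounded adder cost of 0.024 ¢; the function name is illustrative.

```python
def cost_performance(ops_per_sec, cost_cents):
    """Return performance per dollar (ops/sec/$) for a unit costing cost_cents."""
    cost_dollars = cost_cents / 100.0
    return ops_per_sec / cost_dollars


adder_ops_per_sec = 1e9    # assumes one 64-bit add per cycle at 1 GHz
adder_cost_cents = 0.024   # rounded adder cost from the previous slide

cp = cost_performance(adder_ops_per_sec, adder_cost_cents)
print(f"{cp:.2e} ops/sec/$")
```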
Power-performance Analysis

As power consumption becomes a dominant limiting factor
on performance, power-performance (performance per unit
power dissipated) becomes increasingly important.



A.k.a. (computational) energy efficiency
Measured in ops/sec/Watt, or ops/Joule
Example: Suppose each logic gate in the previous example consumes 1 fJ = 6,241 eV on each clock cycle. What then is its power-performance for 64-bit adds?

Number of logic gates in adder design: 5 × 64 = 320
Energy dissipated per 64-bit add operation:
    320 × 6,241 eV ≈ 2 MeV = 3.2 × 10⁻¹³ J
Power-performance:
    1/(3.2 × 10⁻¹³ J) = 3.125 × 10¹² ops/Joule = 3.125 × 10¹² ops/sec/Watt
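
The energy figures can be checked numerically. This sketch assumes, as the example does, that every gate dissipates 1 fJ per cycle and that each full adder contains 5 gates (two 2-gate half-adders plus an OR); the electron-volt conversion constant is an approximate value.

```python
EV_IN_JOULES = 1.602e-19      # approximate value of one electron-volt in joules
GATES_PER_FULL_ADDER = 5      # 2 half-adders (2 gates each) + 1 OR gate
ENERGY_PER_GATE_J = 1e-15     # 1 fJ per gate per cycle (the slide's assumption)

gates = GATES_PER_FULL_ADDER * 64
energy_per_add_j = gates * ENERGY_PER_GATE_J
energy_per_add_ev = energy_per_add_j / EV_IN_JOULES

# One op per energy_per_add_j joules; ops/Joule is the same unit as ops/sec/Watt.
ops_per_joule = 1.0 / energy_per_add_j

print(gates, "gates")
print(f"{energy_per_add_ev:.3e} eV per add")
print(f"{ops_per_joule:.4e} ops/Joule")
```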
Cost vs. Power Example

Suppose that, using the technology of the previous
examples, I wish to design a massively parallel 3D
graphics processing unit (GPU) for a handheld
videogame unit. In this GPU, most of the cost and
power budget goes to 64-bit add ops. But it must
cost no more than $50, and dissipate no more than
10 Watts of power. Which is the major limiting
factor on performance: Hardware cost, or power?

Cost-limited performance on 64-bit add operations:
    (4.2×10¹² ops/s/$) × ($50) = 210 T adds/sec

Power-limited performance on 64-bit add operations:
    (3.125×10¹² ops/s/W) × (10 W) = 31.25 T adds/sec

Power is by far the dominant limiting factor!
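
The comparison amounts to taking the smaller of two budget-limited throughputs. A minimal sketch, using the efficiency figures from the two previous examples and an illustrative helper name:

```python
def limited_throughput(efficiency, budget):
    """Throughput allowed by a budget, given efficiency in ops/sec per budget unit."""
    return efficiency * budget


cost_limit = limited_throughput(4.2e12, 50.0)      # (ops/s/$) x ($)
power_limit = limited_throughput(3.125e12, 10.0)   # (ops/s/W) x (W)

# The tighter of the two limits determines achievable performance.
bottleneck = "power" if power_limit < cost_limit else "cost"
achievable = min(cost_limit, power_limit)

print(f"cost-limited:  {cost_limit:.3e} adds/sec")
print(f"power-limited: {power_limit:.3e} adds/sec")
print("limiting factor:", bottleneck)
```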
Timing Analysis for Combinational Logic
Delay Time
Def: Time required for output signal Y to change due to a change in input signal X
[Figure: block F(x) with input X changing at t=0 and output Y changing at t=0]
Up to now, we have assumed this delay time to be 0 seconds.
Delay Time
In a “real” circuit, it will take tp seconds for Y to change due to a change in X
[Figure: block F(x) with input X changing at t=0 and output Y changing at t=tp]
tp is known as the propagation delay time
Timing Diagram
We use a timing diagram to graphically represent this delay
[Figure: timing diagram; X switches between 0 and 1 at t=0, Y switches at t=tp; both plotted against time in seconds]
Horizontal axis = time axis
Vertical axis = Logical level axis (Logic One or Logic Zero)
Timing Diagram
We see that a change in X at t=0 causes a change in Y at t=tp
[Figure: timing diagram; X changes at t=0 and t=T, Y changes at t=tp and t=T+tp]
Timing Diagram
We also see that a change in X at t=T causes another change in Y at t=T+tp
[Figure: timing diagram; X changes at t=0 and t=T, Y changes at t=tp and t=T+tp]
We see that logic circuit F causes a delay of tp seconds in the signal
Simple Example – NOT Gate
Let tp = 2 ns
where ns = nanosecond = 1×10⁻⁹ seconds
[Figure: NOT gate with input X and output Y; timing diagram in which Y changes 2 ns after X, time axis in ns]
Simple Example – 2 NOT Gates
Let tp = 2 ns
[Figure: two NOT gates in series, X to Z to Y, each with a 2 ns delay, 4 ns overall; timing diagram of X, Z and Y for t = 0 to 8 ns]
Total Delay = 2ns + 2ns = 4ns
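
The way delays add along a chain of gates can be sketched as a simple shift of transition times. This treats each gate as a pure delay element, which is an idealization, and the toggle times of X are hypothetical values chosen for illustration.

```python
def propagate(edge_times_ns, gate_delays_ns):
    """Shift every input transition time by the summed delay of a gate chain."""
    total_delay = sum(gate_delays_ns)
    return [t + total_delay for t in edge_times_ns]


x_edges = [0, 4]                        # hypothetical times (ns) at which X toggles
z_edges = propagate(x_edges, [2])       # Z: after the first NOT gate (tp = 2 ns)
y_edges = propagate(x_edges, [2, 2])    # Y: after both NOT gates

print("X:", x_edges)
print("Z:", z_edges)
print("Y:", y_edges)
```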
Simple Example – 2 Not Gates
Notes:
Time axis is shared among signals
Logic levels (1 or 0) are implied, not shown
[Figure: timing diagram of X, Z and Y on a shared time axis, t = 0 to 8 ns]
Simple Example – 2 Not Gates
Sometimes dashed vertical lines are added to aid reading the diagram
[Figure: timing diagram of X, Z and Y with dashed vertical lines spaced 2 ns apart, t = 0 to 8 ns]
Where does this delay come from?
Circuit Delay
All electrical circuits have intrinsic resistance (R) and capacitance (C).
[Figure: circuit with resistance R and capacitance C]
Let’s analyze a simple RC circuit
Circuit Delay – Simple RC Circuit
[Figure: RC circuit with input Vin(t), resistor R, capacitor C and output Vout(t); plot of Vout (0 to 1) versus time (0 to 7) for a step at Vin]
$V_{out}(t) = V_{dd}\left(1 - e^{-t/\tau}\right)$
$\tau = RC$ = time constant
Note:
$t_x = 0.69\,\tau \;\Rightarrow\; V_{out}(t_x) = 0.5\,V_{dd}$
$t_x = 2.3\,\tau \;\Rightarrow\; V_{out}(t_x) = 0.9\,V_{dd}$
$t_x = 4.6\,\tau \;\Rightarrow\; V_{out}(t_x) = 0.99\,V_{dd}$
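
The three multiples of the time constant follow from solving the step response for t. A small sketch, assuming an ideal step input from 0 to Vdd and an initially discharged capacitor (function names are illustrative):

```python
import math


def rc_step_response(t, r, c, vdd=1.0):
    """Vout(t) = Vdd * (1 - exp(-t / (R*C))) for a 0-to-Vdd step at t = 0."""
    tau = r * c
    return vdd * (1.0 - math.exp(-t / tau))


def time_to_reach(fraction, r, c):
    """Solve Vout(t) = fraction * Vdd for t, giving t = -R*C * ln(1 - fraction)."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -r * c * math.log(1.0 - fraction)


# With R*C = 1 the printed times are directly in units of the time constant.
for fraction in (0.5, 0.9, 0.99):
    print(fraction, round(time_to_reach(fraction, r=1.0, c=1.0), 2))
```

Scaling R or C scales every delay by the same factor, which is why the delays are usually quoted as multiples of RC.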
Circuit Delay – Example
[Figure: RC circuit with input Vin(t), resistor R, capacitor C and output Vout(t); plot of Vout (0 to 1) versus time (0 to 7)]
Let R = 1 ohm, C = 1 F, so that RC = 1 second
Time delay is 0.7 s or 700 ms for 0.5 Vdd
Time delay is 2.3 s for 0.9 Vdd
Time delay is 4.6 s for 0.99 Vdd
How do we relate this to logic diagrams?
Def: tplh
tplh = low-to-high propagation delay time
This is the time required for the output to rise from 0 V to ½ Vdd
[Figure: plot of Vout versus time with tplh marked where the curve reaches the 0.5 level]
Def: tphl
tphl = high-to-low propagation delay time
This is the time required for the output to fall from Vdd to ½ Vdd
[Figure: plot of Vout versus time with tphl marked where the curve reaches the 0.5 level]
Def: tp (propagation delay time)
Let’s define tp = propagation delay time as
$t_p = \frac{1}{2}\left(t_{plh} + t_{phl}\right)$
This will be the “average” delay through the circuit
Gate Delay – Simple RC Model
Ideal gate with tp = 0 delay, followed by an RC network
[Figure: ideal gate with an RC network (R, C) between Vin(t) and Vout(t), shown next to an equivalent gate with tp = tp_not]
Equivalent model with gate delay of tp_not
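Gate Delay – Example
We indicate tp on the gate
[Figure: NOT gate labeled tp_not = 5ns with input X and output Y; timing diagram with time marks at 0 and 25ns on X and at 0, 5ns and 30ns on Y]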
Combinational Logic Delay
F  a, b, c, d   D  AB  B  C
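[Figure: gate-level circuit with inputs A, B, C, D and output Y; every gate is labeled 5ns; the longest and shortest delay paths are marked]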
This circuit has multiple delay paths
A-Y = 5ns+5ns+5ns=15ns
B-Y = 5ns+5ns+5ns+5ns=20ns
C-Y = 5ns+5ns+5ns=15ns
D-Y = 5ns
Longest delay = 20ns
Shortest delay = 5ns
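
Because every gate here has the same delay, each path delay is just the number of gates on the path times 5 ns. A minimal sketch using the gate counts implied by the sums above (the dictionary layout is an illustrative choice):

```python
GATE_DELAY_NS = 5

# Number of gates between each input and Y, read off the path sums above.
gates_on_path = {"A-Y": 3, "B-Y": 4, "C-Y": 3, "D-Y": 1}

path_delay_ns = {path: n * GATE_DELAY_NS for path, n in gates_on_path.items()}
longest = max(path_delay_ns.values())
shortest = min(path_delay_ns.values())

for path, delay in path_delay_ns.items():
    print(f"{path} = {delay}ns")
print("longest:", longest, "ns; shortest:", shortest, "ns")
```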

Combinational Logic Delay
F  a, b, c, d   D  AB  B  C
[Figure: gate-level circuit with inputs A, B, C, D and output Y; every gate is labeled 5ns; the longest and shortest delay paths are marked]
We’ll use the longest delay to represent the logic function F.
Let’s call it Tcl for time, combinational logic
Longest delay = 20ns
Combinational Logic (CL) Cloud Model
[Figure: the gate-level circuit (inputs A, B, C, D; output Y; gates labeled 5ns) drawn next to a single block F with delay tcl]
Tcl = 20ns
F  a, b, c, d   D  AB  B  C

Logic Simulators
Used to simulate the output response of a logic circuit.
Logic Simulations
Three primary types

Circuit simulator (e.g. PSPICE)
    “Exact” delay for each gate
    Most accurate timing analysis
    Very slow compared to other types

Functional Simulation (e.g. Quartus)
    Assumes one unit delay for each gate
    Very fast compared to other types
    Most inaccurate timing analysis

Timing Simulation (e.g. Quartus)
    Assumes “average” tp delay for each gate
    Not the fastest or slowest timing analysis
    Provides “pretty good” timing analysis
TPS Quizzes
Timing Quiz 1
Calculate all delay paths through the circuit shown below
[Figure: circuit with inputs A, B, C, D and output Y; gates labeled 5ns, 2ns, 5ns, 8ns and 10ns]
What are the shortest and longest delays?
Solution: Calculate all delay paths through the circuit shown below
[Figure: circuit with inputs A, B, C, D and output Y; gates labeled 5ns, 2ns, 5ns, 8ns and 10ns]
This circuit has multiple delay paths
A-Y = 5ns+5ns+10ns=20ns
B-Y = 2ns+5ns+5ns+10ns=22ns
B-Y = 8ns+5ns+10ns=23ns
C-Y = 8ns+5ns+10ns=23ns
D-Y = 10ns
Shortest path=10ns
Longest path=23ns
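
Enumerating paths by hand gets tedious for larger circuits, so the same bookkeeping can be done recursively over a netlist. The netlist below is hypothetical: it was inferred from the path sums in this solution (the gate types and names such as `g1` are not in the original), so treat it as an illustration of the method rather than the actual schematic.

```python
# Hypothetical netlist inferred from the listed path sums; gate types are unknown.
# Each gate maps to (delay in ns, list of fan-in signals).
NETLIST = {
    "g_not": (2, ["B"]),
    "g1": (5, ["A", "g_not"]),
    "g2": (8, ["B", "C"]),
    "g3": (5, ["g1", "g2"]),
    "Y": (10, ["g3", "D"]),
}


def all_path_delays(node):
    """Return (primary_input, delay_ns) for every path from an input to node."""
    if node not in NETLIST:
        # A name that is not a gate is a primary input with no delay yet.
        return [(node, 0)]
    delay, fanin = NETLIST[node]
    return [(src, d + delay) for sig in fanin for src, d in all_path_delays(sig)]


paths = sorted(all_path_delays("Y"))
longest = max(d for _, d in paths)
shortest = min(d for _, d in paths)

for src, d in paths:
    print(f"{src}-Y = {d}ns")
print("shortest:", shortest, "ns; longest:", longest, "ns")
```

Full path enumeration grows quickly with circuit size; it is used here only because it mirrors the hand calculation.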
Timing Quiz 2
Given the circuit below, find
(a) Expression for the logic function
(b) Longest delay in original circuit
[Figure: circuit with inputs A, B, C and output Y; gates labeled 5ns, 7ns, 7ns and 2ns]
Solution: Given the circuit below, find
(a) Original logic function
(b) Longest delay in original circuit
[Figure: circuit with inputs A, B, C and output Y; gates labeled 5ns, 7ns, 7ns and 2ns]
$Y = \overline{(AC) + \overline{(B + C)} + \overline{C}}$
Longest Delay = 7ns+7ns = 14ns
Timing Quiz 3
Given the circuit below,
(a) Using Boolean Algebra, minimize the logic function
(b) Longest delay in minimized circuit
Delay times are
NOT gates= 2ns; AND,OR gates= 5ns
NAND, NOR gates= 7ns; XOR gates: 10ns
XNOR gates: 12ns
[Figure: circuit with inputs A, B, C and output Y; gates labeled 5ns, 7ns, 7ns and 2ns]
Solution: Given the circuit below, find
(a) Minimized logic function
(b) Longest delay in minimized circuit
Delay times are
NOT gates= 2ns; AND,OR gates= 5ns
NAND, NOR gates= 7ns; XOR gates: 10ns
XNOR gates: 12ns
[Figure: circuit with inputs A, B, C and output Y; gates labeled 5ns, 7ns, 7ns and 2ns]
You can show
$Y = \overline{A}C$
Solution: Given the circuit below, find
(a) Minimized logic function
(b) Longest delay in minimized circuit
Delay times are
NOT gates= 2ns; AND,OR gates= 5ns
NAND, NOR gates= 7ns; XOR gates: 10ns
XNOR gates: 12ns
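$Y = \overline{A}C$
[Figure: minimized circuit; A passes through a NOT gate (2ns) and is then combined with C in an AND gate (5ns) to produce Y]
Longest delay is 7ns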
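Solution Expanded
$Y = \overline{(AC) + \overline{(B + C)} + \overline{C}}$
$Y = \overline{AC}\,(B + C)\,C = \overline{AC}\,(BC + C)$
$\;\;\; = \overline{AC}\,C = (\overline{A} + \overline{C})\,C = \overline{A}C$
$Y = \overline{A}C$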
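
A minimization like this can be sanity-checked by comparing truth tables. The sketch below assumes the original circuit is a 3-input NOR of (A AND C), NOR(B, C) and NOT C, which is a reading of the circuit consistent with the gate delays and the minimized result but is not confirmed by the slide text itself.

```python
from itertools import product


def original(a, b, c):
    """Assumed original circuit: NOR of (A AND C), NOR(B, C) and NOT C."""
    nor_bc = not (b or c)
    return not ((a and c) or nor_bc or (not c))


def minimized(a, b, c):
    """Minimized form: (NOT A) AND C."""
    return (not a) and c


# One row per input combination: (A, B, C, original output, minimized output).
rows = [(a, b, c, original(a, b, c), minimized(a, b, c))
        for a, b, c in product([False, True], repeat=3)]
equivalent = all(o == m for *_, o, m in rows)

for a, b, c, o, m in rows:
    print(int(a), int(b), int(c), "|", int(o), int(m))
print("equivalent:", equivalent)
```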
Given the circuit below,
(a) Using a Truth Table and a K-map, minimize the logic function
[Figure: circuit with inputs A, B, C and output Y; gates labeled 5ns, 7ns, 7ns and 2ns]
Solution
Do it yourself!