Transcript Lecture1
Synchronous Digital Design
Methodology and Guidelines
Digital System Design
Synchronous Design
• All flip-flops clocked by one common clock
• Reset only used for initialization
• Races and hazards are no problem
Why synchronous design?
• Hazard
– Problems due to timing that cannot be observed
from functional analysis
Timing Hazard
• Static hazard: possibility of a brief signal
value change when the signal was expected
to be stable, due to timing (glitch)
• Dynamic hazard: possibility of multiple
output transitions caused by a single input
transition due to multiple signal paths with
different delays
Static Hazard
[Figure: logic circuit with inputs I0, I1 and S, internal nodes I2, I3 and I4, and output Y]
[Figure: timing diagrams over time steps 1 to 10. Ideal transition (no delays): Y stays stable. Realistic transition, if d is the delay of each gate: Y shows a glitch]
Analyzing Static Hazards using
Karnaugh maps
[Figure: logic circuit with inputs I0, I1 and S, internal nodes I2, I3 and I4, and output Y]
Karnaugh map for Y:
S \ I1I0 | 00 | 01 | 11 | 10
    0    |  0 |  1 |  1 |  0
    1    |  0 |  0 |  1 |  1
[Figure: logic circuit without hazard, with the same Karnaugh map]
• A static hazard can occur when changing a single
input variable causes a jump from one prime
implicant to another
• Solution: include an additional prime implicant
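The glitch and its fix can be reproduced with a small unit-delay simulation. This is only a sketch: it assumes the circuit is the 2:1 multiplexer Y = S'.I0 + S.I1 suggested by the Karnaugh map, that every gate has the same delay of one time step, and that I0 = I1 = 1 while S falls. The function and variable names are illustrative and do not come from the slides.

```python
def simulate(with_redundant_term, steps=8):
    """Return Y sampled at each time step, with one gate delay per step."""
    i0 = i1 = 1                                          # both data inputs held at 1
    s_wave = [1 if t < 2 else 0 for t in range(steps)]   # S falls at t = 2
    # Steady state for S = 1: inverter output low, S.I1 high, Y high.
    not_s, and_i0, and_i1, y = 0, 0, 1, 1
    # Optional extra prime implicant I1.I0 (constant here, since I0 = I1 = 1).
    cover = (i0 & i1) if with_redundant_term else 0
    trace = []
    for s in s_wave:
        trace.append(y)
        # All gates update together from the previous step's values.
        not_s, and_i0, and_i1, y = (
            1 - s,
            not_s & i0,
            s & i1,
            and_i0 | and_i1 | cover,
        )
    return trace


print(simulate(False))  # [1, 1, 1, 1, 0, 1, 1, 1]  one-step glitch on Y
print(simulate(True))   # [1, 1, 1, 1, 1, 1, 1, 1]  glitch removed
```

The glitch appears because the S.I1 term turns off one gate delay before the S'.I0 term turns on; the extra term I1.I0 holds Y high during that gap.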
Eliminating hazards using FlipFlops
[Figure: the logic circuit combined with D flip-flops clocked by Clk; timing diagram of Clk, the inputs I0, I1 and S, the internal nodes I2, I3 and I4, and the output Y over clock cycles 1 to 10]
Synchronous Design
• Three things must be ensured by the
designer:
– Minimize and determine clock skew
– Account for flip-flop setup and hold times
– Reliably synchronize asynchronous inputs
Timing Analysis
[Figure: timing diagram of CLOCK, Q and D showing the flip-flop propagation delay, the combinational path delay, the slack, and the setup and hold times]
• Setup time margin > 0
• Hold time margin > 0
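The two margin conditions can be written out as a short check. This is a sketch using the usual textbook relations for a register-to-register path with no clock skew; the parameter names and the numbers are assumptions chosen for illustration, not values from the slides.

```python
def setup_margin(t_clk, t_cq_max, t_comb_max, t_setup):
    """Time left before the setup window of the capturing flip-flop opens."""
    return t_clk - (t_cq_max + t_comb_max + t_setup)


def hold_margin(t_cq_min, t_comb_min, t_hold):
    """Time by which new data arrives after the hold window has closed."""
    return (t_cq_min + t_comb_min) - t_hold


# Hypothetical numbers in ns: 10 ns clock, 2 ns clock-to-Q, 6 ns logic, 1 ns setup.
print(setup_margin(t_clk=10.0, t_cq_max=2.0, t_comb_max=6.0, t_setup=1.0))  # 1.0
# Hypothetical fastest path: 1 ns clock-to-Q, 0.5 ns logic, 0.5 ns hold.
print(hold_margin(t_cq_min=1.0, t_comb_min=0.5, t_hold=0.5))                # 1.0
```

Both results must be positive for the path to meet timing. Note that the setup check uses the slowest path and the hold check uses the fastest one.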
Clock skew
[Figure: two cascaded flip-flops, IN to Q1 to Q2, the first clocked by CLK and the second by CLK2; timing diagram of CLK, IN, CLK2 and Q2]
Example
• Determine the maximum frequency of the
following circuit with and without skew
[Figure: example circuit built around a D flip-flop]
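Since the circuit values are not in the transcript, the sketch below only shows how such a calculation is usually set up. It assumes a single register-to-register path and the worst case, in which skew makes the capturing clock edge arrive early and so adds to the minimum period. All delay values are hypothetical.

```python
def max_frequency_mhz(t_cq_ns, t_comb_ns, t_setup_ns, t_skew_ns=0.0):
    """Maximum clock frequency in MHz for one register-to-register path."""
    t_min_ns = t_cq_ns + t_comb_ns + t_setup_ns + t_skew_ns  # minimum clock period
    return 1000.0 / t_min_ns                                 # 1/ns = 1000 MHz


# Hypothetical delays: 2 ns clock-to-Q, 6 ns combinational logic, 1 ns setup.
print(round(max_frequency_mhz(2.0, 6.0, 1.0), 1))       # 111.1 (no skew)
print(round(max_frequency_mhz(2.0, 6.0, 1.0, 1.0), 1))  # 100.0 (1 ns skew)
```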
Clock Jitter
Clock Gating
• Clock gating disables the clock, using a clken
signal, to reduce power consumption
• Gating the clock as shown below is wrong; use a
synchronous load (enable) signal instead
[Figure: flip-flop whose clock input is derived from CLK and EN through a logic gate]
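A behavioural sketch of the recommended alternative: the flip-flop is clocked on every edge, and the enable only decides whether it loads D or keeps its old value. The class and method names are illustrative, not from the slides.

```python
class EnableFlipFlop:
    """D flip-flop with a synchronous load (enable) input."""

    def __init__(self):
        self.q = 0

    def clock_edge(self, d, en):
        # The clock is never gated; EN selects between loading D and holding Q.
        if en:
            self.q = d
        return self.q


ff = EnableFlipFlop()
print([ff.clock_edge(d, en) for d, en in [(1, 0), (1, 1), (0, 0), (0, 1)]])
# [0, 1, 1, 0]
```

Because EN is sampled only at the clock edge, a glitch on EN between edges cannot create an extra clock pulse.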
Asynchronous Inputs
It is impossible to guarantee setup and hold timing
constraints on inputs synchronized with a clock
unrelated to the system clock
[Figure: a flip-flop clocked by CLK (system clock) samples ASYNCIN and produces SYNCIN for the synchronous system; timing diagram of CLK, ASYNCIN and SYNCIN]
Asynchronous inputs
• Synchronize only in one place
[Figure: ASYNCIN sampled by two separate flip-flops on CLK (system clock), producing SYNCIN1 and SYNCIN2 for the synchronous system]
Metastability
• Metastability is a phenomenon that may occur if the setup
and hold time requirements of the FF are not met, causing
the output to settle to an unknown value after an
unspecified time.
Reliable synchronizer design
Example
• Design a synchronizer that synchronizes
two inputs async1 and async2 generated
with a 50 MHz clock CLK1, to a system
with a 33 MHz clock CLK2 totally
independent of CLK1. Draw appropriate
timing diagrams.
Mean-time between failures
MTBF(tr) = exp(tr / τ) / (T0 · f · a)
f: frequency of flip-flop clock
a: number of asynchronous input changes per second in
flip-flop input
T0, τ: constants depending on flip-flop electrical
characteristics
Assume a 10 MHz clock, ts = 20 ns, T0 = 0.4 s, τ = 1.5 ns
and that the asynchronous input can change 100,000 times
per second; then
tr = 1/f – ts = 80 ns (time available for metastability to resolve)
MTBF(80 ns) = exp(80/1.5) / (0.4 × 10^7 × 10^5) ≈ 3.6×10^11 s
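The worked example can be checked numerically. The sketch below uses the formula and values given above; only the function and variable names are invented here.

```python
import math


def mtbf_seconds(t_r, tau, t0, f, a):
    """MTBF(t_r) = exp(t_r / tau) / (T0 * f * a), all quantities in SI units."""
    return math.exp(t_r / tau) / (t0 * f * a)


f = 10e6           # 10 MHz flip-flop clock
t_s = 20e-9        # ts = 20 ns
t_r = 1 / f - t_s  # 80 ns
mtbf = mtbf_seconds(t_r, tau=1.5e-9, t0=0.4, f=f, a=100e3)
print(f"{mtbf:.1e} s")  # 3.6e+11 s
```

Because tr sits in the exponent, small changes in the available time change the MTBF by orders of magnitude.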
Cascaded synchronizer
Synchronizing bus transfers
• Do not put a dual f/f synchronizer on every bit
of the bus; this only increases the chances
of metastability
• Synchronize the control signals and
read the input when it is safe to do so
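[Figure: VALID and DATA enter the synchronous system, which returns ACK]
[Figure: timing diagram over clock cycles 1 to 10: DATA goes invalid, valid, invalid around the VALID and ACK handshake]
[Figure: timing diagram of VALID_ASYNC, DATA_ASYNC (invalid, valid, invalid), Clock, VALID_SYNC1, VALID_SYNC2, DATA_SYNC (invalid, valid, invalid) and ACK over clock cycles 1 to 10]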
Synchronization circuit
[Figure: synchronization circuit, with timing diagram of VALID_ASYNC, DATA_ASYNC, Clock, VALID_SYNC1, VALID_SYNC2, DATA_SYNC and ACK over clock cycles 1 to 10]
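The receiver side of this scheme can be sketched behaviourally. The signal names follow the timing diagram; the exact capture and ACK rules below are assumptions, since the circuit itself is not in the transcript. Only VALID passes through two flip-flops; DATA is read once, when VALID_SYNC2 is first seen high. Metastability itself is not modelled.

```python
def receive(valid_async, data_async):
    """Clock the receiver once per sample; return (DATA_SYNC, ACK per cycle)."""
    sync1 = sync2 = ack = 0
    data_sync = None
    acks = []
    for valid, data in zip(valid_async, data_async):
        # Capture DATA in the first cycle in which VALID_SYNC2 is seen high.
        # By then the sender has held DATA stable for at least two clock edges.
        if sync2 and not ack:
            data_sync = data
        # Register updates: right-hand sides use the values before the clock edge.
        sync1, sync2, ack = valid, sync1, sync2
        acks.append(ack)
    return data_sync, acks


valid = [0, 1, 1, 1, 1, 0, 0, 0]
data = [None, 0xA5, 0xA5, 0xA5, 0xA5, None, None, None]
print(receive(valid, data))  # (165, [0, 0, 0, 1, 1, 1, 1, 0])
```

The sender is assumed to keep VALID and DATA asserted until it sees ACK, which is why a single synchronized control signal is enough for the whole bus.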
FIFO Synchronizer basic concept
• On burst transfers, the receiver
cannot afford to wait for the signal VALID
to settle.
• Solution: A dual-port RAM FIFO
• Problem: How do we synchronize
the counters?
[Figure: DATA_ASYNC is written into a dual-port RAM under VALID_ASYNC and CLK1 (independent) and read out as DATA_SYNC by the synchronous system on CLK2 (system clock), with one counter on each side; timing diagram of a burst valid0 to valid3 over clock cycles 1 to 10]
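The slides leave the counter question open. One widely used answer, not stated in the slides, is to pass the counters between clock domains in Gray code, so that only one bit changes per increment and a sample taken mid-transition is off by at most one count. A minimal sketch of the encoding, with an illustrative function name:

```python
def to_gray(n):
    """Binary-to-Gray conversion: adjacent counts differ in exactly one bit."""
    return n ^ (n >> 1)


codes = [to_gray(n) for n in range(16)]  # 4-bit counter
print([format(c, "04b") for c in codes[:4]])  # ['0000', '0001', '0011', '0010']
```

The one-bit property also holds at the wrap-around from 15 back to 0, which matters for a free-running FIFO pointer.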
Summary
• In order to avoid hazards and races, synchronous
design is used
• In synchronous design a single common clock is
used and reset is only used for initialization
• The only considerations in synchronous design are
the flip-flop setup and hold times, clock skew and
asynchronous input synchronization
• Asynchronous inputs are commonly synchronized
using 2 flip-flops clocked with the synchronous
system clock
• Synchronization should only be done in one place
• In bus transfers, synchronize only the control
signals or use a FIFO
Design trade-offs
Common design trade-offs
• Performance
– Latency
– Throughput
– Delay (timing)
• Area
– Gates (ASIC)
– Flip-flops/LUTs (FPGA)
• Power consumption
– Dynamic
– Static
– Leakage
Design for Speed
• Design for High Throughput
– Definition: High data rate, acceptable latency
– Technique: Pipelining
• Design for Low Latency
– Definition: Output available as soon as possible
– Technique: Parallelism, Removal of pipelining
• Design for Timing
– Definition: High clock speed, low delay between
registers
– Technique: Add intermediate registers
Example 1: Design for low latency
(parallelism)
• X=a+b+c+d
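[Figure: two adder structures for x = a+b+c+d with the critical path marked: a chain of adders, ((a+b)+c)+d, and a balanced tree, (a+b)+(c+d)]
Adder chain:
– Delay = 3*add
– Latency = 1 cycle
– Throughput = X bits/clock
Adder tree:
– Delay = 2*add
– Latency = 1 cycle
– Throughput = X bits/clock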
Example 1: Design for delay
• X=a+b+c+d
[Figure: adder structure for x = a+b+c+d with pipeline registers (REG) inserted between the adders]
– Delay = 1*add + Reg
– Latency = 2 cycles
– Throughput = X bits/clock
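A cycle-by-cycle sketch of the pipelined version shows where the two-cycle latency comes from while throughput stays at one result per clock. The register placement (one stage after a+b and c+d, one at the output) is an assumption about the figure, and the names are illustrative.

```python
def pipelined_sum(samples):
    """Return the output register value after each clock edge."""
    reg_ab = reg_cd = reg_x = None
    outputs = []
    for a, b, c, d in samples:
        # Clock edge: stage 2 adds the stage-1 values stored on the previous edge.
        reg_x = None if reg_ab is None else reg_ab + reg_cd
        # Stage 1 registers the two partial sums of the current inputs.
        reg_ab, reg_cd = a + b, c + d
        outputs.append(reg_x)
    return outputs


print(pipelined_sum([(1, 2, 3, 4), (5, 6, 7, 8), (0, 0, 0, 0)]))
# [None, 10, 26]
```

The first result appears on the second edge (latency of 2 cycles); after that a new result appears on every edge.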
Example 2: Design for delay
x=0;
for (i=0; i<4; i++)
x+= a[i]*b;
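[Figure: a multiplier (a[i] × b) feeds an adder and a register (REG) that accumulates x; the critical path runs through the multiplier and the adder]
– Delay: 1*Mul + 1*Add
– Latency: 4 cycles
– Throughput: X bits/4 cycles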
Example 2: Design for latency
Example 2: Design for throughput
Design for Area
• Resource (logic) sharing
• Rolling up the pipeline
Resource Sharing
• Y = C1*X[0] + C2*X[1] + C3*X[2]
• Is it possible to perform all multiplications
with a single multiplier?
• Is it possible to perform all additions with a
single accumulator?
Resource Sharing
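[Figure: shared datapath consisting of a register (REG), a multiplier (X), a register (REG), an adder (+) and a register (REG)]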
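Both questions can be answered with yes, at the cost of time. The sketch below models one multiplier and one accumulator reused over three clock cycles. It is a behavioural illustration with invented names and test values, not the circuit from the slide.

```python
def shared_mac(coeffs, xs):
    """Compute sum(C[i] * X[i]) with one multiply and one add per clock cycle."""
    acc = 0
    cycles = 0
    for c, x in zip(coeffs, xs):
        acc += c * x  # the single multiplier and the accumulator are reused
        cycles += 1
    return acc, cycles


print(shared_mac([2, 3, 4], [5, 6, 7]))  # (56, 3)
```

Area drops from three multipliers and two adders to one of each, while the result now takes three cycles instead of one.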
Design for low-power
• Power components:
• Dynamic power consumption (switching):
power consumed by charging and
discharging parasitic capacitances on gates
and wires
• Static power consumption: power
consumed when no switching occurs
• Leakage current power consumption:
Design for power
• Clock Gating
• Dual-edge triggered Flip-Flops
• Lowering core voltage
Clock Gating
• Clock gating disables the clock, using a clken
signal, to reduce power consumption
• Gating the clock as shown below is wrong; use a
synchronous load (enable) signal or a global clock
multiplexer (if available) instead
[Figure: flip-flop whose clock input is derived from CLK and EN through a logic gate]
Dual-Edge Triggered Flip-Flops
[Figure: timing diagrams of Clock and Q over cycles 1 to 5 for a single-edge triggered FF and for a dual-edge triggered FF (same data rate)]
• Dual-edge triggered flip-flops should only be
used if available in the target technology
• Otherwise, redundant flip-flops and gating will be
used to emulate the desired functionality
Lowering core voltage
• Only reduce core voltage within acceptable
limits (5 to 10%)
• Power consumption in a simple resistor is
proportional to the square of the voltage
• Keep in mind that performance will degrade
too
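The effect of the square law is easy to quantify. The sketch below assumes power scales exactly with the square of the core voltage and ignores the accompanying loss of speed; the function name is illustrative.

```python
def relative_power(voltage_scale):
    """Power relative to nominal when the voltage is scaled, assuming P ~ V^2."""
    return voltage_scale ** 2


for reduction in (0.05, 0.10):
    saving = 1 - relative_power(1 - reduction)
    print(f"{reduction:.0%} lower voltage -> {saving:.2%} lower power")
# 5% lower voltage -> 9.75% lower power
# 10% lower voltage -> 19.00% lower power
```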
Review questions/problems
• Pipelining will make your circuit
– A. smaller
– B. exhibit lower latency
– C. consume less power
– D. exhibit higher throughput
• Parallelism creates a
– A. latency/throughput trade-off
– B. performance/area trade-off
– C. area/power consumption trade-off
– D. performance/power consumption trade-off
• Pipeline the following datapath for a three-cycle latency so that you
get the maximum operating frequency. What is the maximum
operating frequency?
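[Figure: datapath from input to output through combinational blocks, labelled in the transcript as COMB1 (5 ns), COMB2 (3 ns), COMB4 (2 ns), COMB3 (4 ns) and COMB1 (1 ns)]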