Timing Error Tolerance in Pipeline Based Core Designs

Stefanos Valadimas
University of Athens
Dept. of Informatics &
Telecommunications
+302107275346
[email protected]
Angela Arapoyanni
University of Athens
Dept. of Informatics &
Telecommunications
+302107275314
[email protected]
Motivation
• The huge number of paths in ICs, the increased path delay deviations and
manufacturing defects may result in timing errors that are hard to detect in
high-frequency, high-device-count ICs. Thus, defective ICs may pass fabrication tests.
• Crosstalk and power supply disturbances are known causes of timing errors.
• Aging mechanisms such as Negative Bias Temperature Instability (NBTI) and Hot
Carrier Injection (HCI) become important and increase the timing error rate.
• Modern systems running at multiple frequency and voltage levels may suffer
from timing errors.
Concurrent testing techniques for error detection and correction
are becoming mandatory in order to achieve acceptable levels of
error robustness and meet reliability requirements.
Outline
• The Razor technique
• The proposed error detection and correction
mechanisms
• Comparisons and experimental results
• Conclusions
The Razor Architecture
• Additional cost: 1 Latch, 1 XOR, and 1 MUX per Flip-Flop
The Time Dilation Technique
[Figure: TD Register built from TD Flip-Flops. Labeled elements: CLK, Logic Stages Sj and Sj+1, MUX (inputs 0/1, output M), Main Flip-Flop (D, Q), XOR comparator (Error_L), OR (Error_Rj), Error Flip-Flop (Memory, Mem_CLK).]
• Additional cost: 1 XOR and 1 MUX per Flip-Flop
• The multiplexer with the feedback acts as a latch
and captures the delayed valid data
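The latch-like role of the multiplexer can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the authors' circuit: the class name `TDFlipFlop`, its three methods, and the exact wiring (the XOR comparing line M against Q, the Main Flip-Flop reloading from M in the recovery cycle) are hypothetical. Only the element names (Main Flip-Flop, MUX, M, XOR, Error_L) come from the slide.

```python
class TDFlipFlop:
    """Behavioural sketch of one Time Dilation flip-flop (assumed wiring)."""

    def __init__(self) -> None:
        self.q = 0        # output Q of the Main Flip-Flop
        self.m = 0        # value on line M (MUX output)
        self.error_l = 0  # local error flag produced by the XOR comparator

    def on_clock_edge(self, d: int) -> None:
        """The Main Flip-Flop samples whatever is on D at the clock edge."""
        self.q = d
        self.m = d
        self.error_l = 0

    def on_late_data(self, d_late: int) -> int:
        """A delayed response of the logic stage arrives after the edge.

        Assumption: the MUX with feedback holds the late (valid) value on M,
        and the XOR flags a mismatch between M and the stored Q.
        """
        self.m = d_late
        self.error_l = self.m ^ self.q
        return self.error_l

    def recover(self) -> None:
        """Recovery cycle: the Main Flip-Flop is reloaded from the held value."""
        self.q = self.m
        self.error_l = 0


# Usage: the flip-flop captures a stale 0, the valid 1 arrives late.
ff = TDFlipFlop()
ff.on_clock_edge(0)
flag = ff.on_late_data(1)   # mismatch is flagged
ff.recover()                # Q now holds the delayed valid data
```

The model keeps no extra storage element besides the value held on M, which mirrors the slide's point that the cost is one XOR and one MUX per flip-flop.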
Scan Version
[Figure (a): TDS Register built from TDS Flip-Flops. Labeled elements: CLK, Logic Stages Sj and Sj+1, MUX-A and MUX-B (inputs 0/1), Scan_IN, Scan_EN, M, Main Flip-Flop (D, Q), Scan Flip-Flop, MUX-Latch, XOR (Error_L), OR (Error_Rj), Error Capture Circuitry (inputs Error_R1 ... Error_Rj ... Error_Rm; Memory, Mem_CLK).]
[Figure (b): Error Capture Circuitry. Labeled elements: Error_R1 ... Error_Rj ... Error_Rm, OR gates, Error, Error Flip-Flop, Delay, Mem_CLK, Scan_EN, Memory.]
• When Scan_EN is "high", the TDS Flip-Flop operates like a Scan Flip-Flop.
The Freezing TDS Flip-flop
[Figure: Freezing TDS Flip-Flop. Labeled elements: CLK, MUX-A and MUX-B (inputs 0/1), Scan_EN, Scan_IN, D, M, Memory, Main Flip-Flop, Q.]
• The Freezing TDS Flip-Flop is placed at the end of fast paths.
• Delay buffers are inserted only in fast paths that intersect with
time-critical paths.
• The Q output of the Main Flip-Flop drives MUX-B instead of the M line.
The Error Detection And Correction Technique
[Figure (a): Register(j) built from EDC Flip-Flops. Labeled elements: CLK, Logic Stages Sj and Sj+1, Main Flip-Flop (D, Q), XOR 1, XOR 2, F, Error_F, Latch (D, R, CK), Cmp, Pulse, Reset, OR (Error_Rj).]
[Figure (b): Pulse generation. Labeled elements: CLK, XOR, AND, Pulse.]
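The bit-flipping idea behind the EDC Flip-Flop can be shown with one-bit arithmetic: if the stored bit differs from the valid late-arriving bit, the correct value is simply the complement of the stored bit. The sketch below assumes, hypothetically, that one XOR acts as the comparator and a second XOR applies the correction. The function name `edc_correct` and its signature are illustrative and not taken from the slides.

```python
from typing import Tuple


def edc_correct(q: int, d_late: int) -> Tuple[int, int]:
    """Return (corrected_bit, error_flag) for a single flip-flop.

    q      : bit captured by the Main Flip-Flop at the clock edge
    d_late : valid bit that arrived after the clock edge (assumed observable)
    """
    error = q ^ d_late    # comparator XOR: 1 when the captured bit is wrong
    corrected = q ^ error # correcting XOR: flips q only when an error is flagged
    return corrected, error


# Exhaustive check over the four one-bit cases.
cases = [(q, d) for q in (0, 1) for d in (0, 1)]
results = [edc_correct(q, d) for q, d in cases]
```

Because a bit has only two values, flipping on error always yields the late valid value, which is why no second storage element holding the correct data is needed in this simplified view.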
Clock Gating Signal Generation
[Figure: core-level clock gating. Labeled elements: SYS_CLK, Reset Release Unit, Error_R1 ... Error_Rj ... Error_Rm, OR (Error), Error Flip-Flop (D, CL), Delay Element Block, AND, CLK to the core.]
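A cycle-level sketch of the gating idea follows: the per-register error signals are ORed into a single Error signal, and a flagged error suppresses the next clock pulse to the core so the pipeline can recover in one cycle. This is a simplified model under assumptions. The function names are hypothetical, the Reset Release Unit and Delay Element Block are ignored, and no new error is assumed to be raised during a gated cycle.

```python
from typing import List, Sequence


def core_error(register_errors: Sequence[int]) -> int:
    """OR of Error_R1 ... Error_Rm for one clock cycle."""
    return int(any(register_errors))


def gate_clock(errors_by_cycle: List[Sequence[int]]) -> List[int]:
    """Return one entry per cycle: 1 = clock delivered to the core, 0 = gated."""
    enables = []
    error_ff = 0  # state of the Error Flip-Flop
    for register_errors in errors_by_cycle:
        # AND gate: the clock pulse passes only if no error is pending.
        enables.append(0 if error_ff else 1)
        # The Error Flip-Flop captures the OR output; after a gated cycle it clears.
        error_ff = 0 if error_ff else core_error(register_errors)
    return enables


# Usage: register 2 of 3 reports a timing error in cycle 1.
trace = [[0, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0]]
enables = gate_clock(trace)   # the pulse of cycle 2 is suppressed
```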
Simulated waveforms of the Time
Dilation technique
[Figure: simulated waveforms. Annotations: Error Injection, Erroneous Data, Error Detection, Memory State, Extended Memory State, Error Correction, Correct Data.]
• 32-bit MIPS R2000 RISC microprocessor core, using UMC 90 nm
technology and the standard cells of Faraday Technology
Simulation Results of the Error
Detection and Correction technique
MIPS comparison results

Design            Power (mW) @ 125 MHz    Area (μm²)
Original design   5.96                    224116
Razor             7.88                    258238
Time Dilation     6.87                    255020
EDC               6.24                    228649
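The relative overheads implied by these figures can be worked out directly. The short script below is a worked calculation rather than part of the original slides. It assumes the values read, in order, as Original design, Razor, Time Dilation and EDC, and the helper `overhead_percent` is an illustrative name.

```python
# (power in mW at 125 MHz, area in square micrometres), taken from the table
designs = {
    "Original design": (5.96, 224116),
    "Razor": (7.88, 258238),
    "Time Dilation": (6.87, 255020),
    "EDC": (6.24, 228649),
}


def overhead_percent(value: float, baseline: float) -> float:
    """Relative increase over the baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


base_power, base_area = designs["Original design"]
overheads = {
    name: (overhead_percent(power, base_power), overhead_percent(area, base_area))
    for name, (power, area) in designs.items()
    if name != "Original design"
}

for name, (power_oh, area_oh) in overheads.items():
    print(f"{name}: power +{power_oh:.1f}%, area +{area_oh:.1f}%")
```

Relative to the original design this gives roughly +32.2% power and +15.2% area for Razor, +15.3% and +13.8% for Time Dilation, and +4.7% and +2.0% for EDC.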
Comparison
• In the proposed techniques, 92 Flip-Flops are replaced by the error
detection Flip-Flops. With Time Dilation, the remaining Flip-Flops are replaced
by Freezing TDS Flip-Flops, while with the EDC technique the remaining
Flip-Flops are left unaltered.
• The Time Dilation approach requires less silicon area and power than Razor,
since no additional memory elements are required.
• The EDC technique requires even less silicon area and power because:
  - A flip-flop is replaced only if it is at the end (output) of a
    time-critical path (slow path) in a logic stage.
  - Only the fast side paths of a time-critical path must
    comply with the minimum delay constraint.
Conclusions
• We presented two timing error tolerance techniques
for enhanced reliability in flip-flop based nanometer
technology cores.
• The first is based on a new scan flip-flop.
• The second is based on a new bit-flipping flip-flop.
• The proposed approaches are characterized by:
  - multiple error detection capability
  - single clock cycle for pipeline recovery
  - low silicon area requirements
  - reduced power consumption