SiLab presentation on Reliable Computing
Combinational Logic Soft Error Analysis and Protection
Ali Ahmadi
May 2008
Outline
• Introduction
• Soft error analysis of combinational logic
• Techniques for protecting combinational
logic against soft errors
Why Combinational Logic?
• With the increase in the amount of combinational logic per chip,
the increase in operating frequency, and the decrease in logic
gate dimensions, the contribution of the combinational logic soft
error rate to the MTTF is increasing.
• Several studies have been and are being conducted to design
circuits that are more robust against these soft errors [1].
• According to [3, 4], the logic soft error rate (logic SER) will
equal the unprotected memory soft error rate in 2010.
• Little attention has been paid to increasing the robustness
of combinational logic elements.
Soft error analysis
• Soft errors arise from the interaction of alpha particles and
cosmic neutrons with silicon. The electron-hole pairs
generated by this interaction induce current pulses at transistor
junctions, which can result in a logical fault.
• Standard cells are not equally sensitive, and not all transistors
within a standard cell contribute equally to the logic SER of a
gate.
• In combinational logic, a particle strike causes a transient
error, manifesting itself as a glitch propagating to the primary
outputs or the next level of flip-flops.
Soft error analysis
The n-transistor directly connected to the output has
the highest probability of inducing a transient, for two reasons:
1- higher mobility of electrons
2- location of the transistor
The failure rate values in the table are rounded values based on critical
charge simulations using models calibrated with data from alpha and
neutron SER measurements on memories.
[1]
Circuit-level protection against soft errors
• Transient probability depends heavily on the input
combination, which influences the amount of sensitive area
and the total drive strength of the transistors driving the output
node.
• Use the gate multiplication method
[1]
How to Handle Soft Errors
• Error detection and retry
– using concurrent error detection (CED)
– If an error is detected, the system recovers
through rollback and retry thereby preventing
a failure.
• Error masking (TMR)
– Real-time systems
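A minimal Python sketch of detection and retry, assuming duplicate-and-compare as the concurrent error detection scheme (the slides do not say which CED scheme is meant). `FlakyAdder` and `run_with_retry` are hypothetical names introduced for illustration, and the injected bit flip stands in for a transient.

```python
class FlakyAdder:
    """Hypothetical unit: adds two integers, with one injected transient."""

    def __init__(self, faulty_call):
        self.calls = 0
        self.faulty_call = faulty_call  # which call gets the bit flip

    def __call__(self, inputs):
        self.calls += 1
        a, b = inputs
        result = a + b
        if self.calls == self.faulty_call:
            result ^= 1  # transient: flip the least significant bit
        return result


def run_with_retry(compute, inputs, max_retries=3):
    """Duplicate-and-compare detection; on a mismatch, re-run from the same inputs."""
    for attempt in range(1, max_retries + 1):
        first = compute(inputs)
        second = compute(inputs)
        if first == second:       # no error detected, accept the result
            return first, attempt
        # mismatch: roll back to the saved inputs and retry
    raise RuntimeError("error persisted after all retries")


adder = FlakyAdder(faulty_call=1)
value, attempts = run_with_retry(adder, (2, 3))
print(value, attempts)
```

The retry costs time only when an error is detected, which is why this approach suits systems that can tolerate an occasional delay, whereas masking suits real-time systems.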
Is the soft error latched?
If a soft error occurs at an internal node of a logic
circuit, there are three factors that determine whether
it will be latched and result in an error:
1) The rate at which an SE of sufficient strength to
propagate to a latch occurs at a node
2) The probability that there exists a functionally
sensitized path from the node to a latch
3) The probability that the SE is captured in a latch
[3]
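The three factors can be combined in a small Python sketch. It assumes the factors are independent, so the latched error rate of a node is their product; this is one plausible reading of the list above, not something the slides state. The circuit `out = (a AND b) OR c`, the node name `n1`, and the function names are hypothetical. Factor 2 is computed by flipping the node for every input vector and checking whether the output changes.

```python
from itertools import product


def circuit(a, b, c, flip_n1=False):
    """Hypothetical circuit: out = (a AND b) OR c, with internal node n1 = a AND b."""
    n1 = a & b
    if flip_n1:
        n1 ^= 1  # model a soft error at the internal node
    return n1 | c


def sensitization_probability():
    """Fraction of input vectors for which a flip at n1 reaches the output."""
    vectors = list(product((0, 1), repeat=3))
    propagated = sum(circuit(*v) != circuit(*v, flip_n1=True) for v in vectors)
    return propagated / len(vectors)


def latched_error_rate(strike_rate, p_sensitized, p_latched):
    """Product of the three factors, assuming they are independent."""
    return strike_rate * p_sensitized * p_latched


p_sens = sensitization_probability()
print(p_sens)  # 0.5: the flip is masked whenever c = 1
print(latched_error_rate(strike_rate=2.0, p_sensitized=p_sens, p_latched=0.25))
```

The strike rate and latching probability passed in the last line are placeholder numbers, not measured values.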
Techniques for protection
• A well-known circuit-level design approach is triple
modular redundancy (TMR).
– More than 200% overhead
– The voter itself is sensitive to soft errors
• Partial error masking
– Cluster sharing reduction
– Dominant value reduction
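As an illustration of TMR, the Python sketch below votes bitwise over three copies of a module. The names `majority` and `tmr`, the example module, and the `fault_mask` argument used to inject bit flips are assumptions made for this example. The voter is modeled as fault-free, which is exactly the weakness noted above.

```python
def majority(a, b, c):
    """Bitwise 2-out-of-3 majority vote."""
    return (a & b) | (a & c) | (b & c)


def tmr(module, x, fault_mask=(0, 0, 0)):
    """Evaluate three copies of `module`; XOR a fault mask into each copy's output, then vote."""
    outputs = [module(x) ^ mask for mask in fault_mask]
    return majority(*outputs)


def times_three(x):
    """Hypothetical 8-bit module under protection."""
    return (x * 3) & 0xFF


print(tmr(times_three, 5))                    # 15, fault-free
print(tmr(times_three, 5, (0, 0b100, 0)))     # 15, single faulty copy is outvoted
print(tmr(times_three, 5, (0b100, 0b100, 0))) # 11, two copies fail on the same bit
```

Three copies plus a voter is where the overhead of more than 200% comes from, which motivates masking only part of the logic.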
Partial error masking (Cluster sharing reduction)
• The soft error susceptibility of certain nodes in the logic
circuit can be orders of magnitude higher than that of
the other nodes in the design.
• Nodes with low observability and controllability are
clustered together.
• The clusters are removed from two out of the three
copies of the TMR design.
[2]
Partial error masking (Cluster sharing reduction)
Highlighted gates clustered.
[2]
Partial error masking (Dominant value reduction)
• Differentiates between the logic 0 and logic 1 soft
error susceptibility of a primary output. The idea is to
identify such outputs and replace triplication by
duplication in such instances.
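A hedged Python sketch of why duplication can be enough when one error direction dominates. The slides do not say which merging gate is used, so the AND/OR choice here is an assumption: AND-ing two copies masks a single erroneous 1 (correct value 0), and OR-ing masks a single erroneous 0 (correct value 1). All function names are hypothetical.

```python
def and_merge(copy_a, copy_b):
    """Masks a single copy that wrongly outputs 1 when the correct value is 0."""
    return copy_a & copy_b


def or_merge(copy_a, copy_b):
    """Masks a single copy that wrongly outputs 0 when the correct value is 1."""
    return copy_a | copy_b


def masked(merge, correct, faulty_copy):
    """Flip the output of one copy and check whether the merged output is still correct."""
    copies = [correct, correct]
    copies[faulty_copy] ^= 1
    return merge(*copies) == correct


for merge in (and_merge, or_merge):
    for correct in (0, 1):
        print(merge.__name__, "correct =", correct,
              "masked =", masked(merge, correct, faulty_copy=0))
```

Each merging gate protects only one direction, so the scheme pays off only for outputs where the unprotected direction contributes little to the failure rate.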
Partial error masking (Dominant value reduction)
The original failure rate is the sum of the logic 0 and logic 1
failure rates for a single copy of the circuit.
[2]
Partial error masking
Combine Cluster sharing reduction and Dominant value reduction
[2]
Simulation results of Partial error masking
[2]
New technique for Soft error Protection
• The failure rate of a gate depends on:
– Type of gate
– Input vector of gate
• The contributions of all gates in the circuit are affected by
the input vectors
• Disadvantage of the previous techniques
– They did not consider the combination of input vectors
Input stimuli methods
• Statistical Input Vector (SIV): uses only one (statistical)
input vector
• All Possible Input Vectors (AIV)
• Random Input Vectors (RIV) - This method uses a certain
number of random input vectors.
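The AIV and RIV methods can be compared on a toy case. In the Python sketch below, a NAND2 gate is driven by `a AND b` and `c`; the per-input-vector failure rates in `NAND2_FAILURE_RATE` are invented placeholder numbers in arbitrary units, and the circuit and function names are hypothetical. AIV averages the gate's failure rate over every primary input vector, while RIV averages over a random sample.

```python
import random
from itertools import product

# Placeholder failure rates per NAND2 input vector (arbitrary units, not measured data).
NAND2_FAILURE_RATE = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0, (1, 1): 4.0}


def gate_inputs(a, b, c):
    """Hypothetical circuit: the NAND2 under study is driven by (a AND b) and c."""
    return (a & b, c)


def aiv_failure_rate():
    """All Possible Input Vectors: exact average over all 2^3 primary input vectors."""
    vectors = list(product((0, 1), repeat=3))
    return sum(NAND2_FAILURE_RATE[gate_inputs(*v)] for v in vectors) / len(vectors)


def riv_failure_rate(num_vectors, seed=0):
    """Random Input Vectors: average over a fixed number of random vectors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_vectors):
        v = tuple(rng.randint(0, 1) for _ in range(3))
        total += NAND2_FAILURE_RATE[gate_inputs(*v)]
    return total / num_vectors


print(aiv_failure_rate())      # 1.875
print(riv_failure_rate(4000))  # close to the AIV value, depends on the seed
```

AIV is exact but grows exponentially with the number of primary inputs; RIV trades accuracy for a fixed simulation budget.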
Circuit with 50% SER reduction by duplicating only 3 gates
[1]
References
[1] A. K. Nieuwland, S. Jasarevic, G. Jerin, "Combinational logic soft error analysis
and protection", 12th IEEE International On-Line Testing Symposium (IOLTS'06).
[2] K. Mohanram, N. A. Touba, "Partial error masking to reduce soft error failure rate in
logic", 18th IEEE International Symposium on Defect and Fault Tolerance in VLSI
Systems (DFT'03).
[3] W. Wang, H. Gong, "Edge triggered pulse latch design with delayed latching edge
for radiation hardened application", IEEE Trans. on Nuclear Science, vol. 51, no. 6,
pp. 3626-3630, Dec. 2004.
[4] R. Baumann, "The impact of technology scaling on soft error rate performance and
limits to the efficacy of error correction", IEDM, pp. 329-332, 2002.