CMP116: Hardware Systems Testing


CMP238: Design and Test of VLSI Systems
Marcelo Lubaszewski
Lecture 2 - Test
PPGC - UFRGS
2005/I
Lecture 2 - Fault Modeling
• Defects, Errors, and Faults
• Why model faults?
• Some real defects in VLSI and PCB
• Common fault models
• Stuck-at faults
– Single stuck-at faults
– Fault equivalence
– Fault dominance and checkpoint theorem
– Classes of stuck-at faults and multiple faults
• Other Common Faults
• Faults in FPGAs
Defects, Faults, and Errors
• Defect: unintended difference between the
implemented HW and its intended design
– May or may not cause a system failure
• Fault: representation of a defect at the abstracted
function level
Defect: imperfections in the HW
Fault: imperfections in the function
Defects, Faults, and Errors
• Error: Manifestation of a fault that results in
incorrect circuit (system) outputs or states
– Caused by faults
• Failure: Deviation of a circuit or system from its
specified behavior
– Fails to do what it should do
– Caused by an error
• Defect --> Fault --> Error --> Failure
Some Real Defects in Chips
• Processing defects
– Missing contact windows
– Parasitic transistors
– Oxide breakdown
– ...
• Material defects
– Bulk defects (cracks, crystal imperfections)
– Surface impurities (ion migration)
– ...
• Time-dependent defects
– Dielectric breakdown
– Electromigration
– ...
• Packaging defects
– Contact degradation
– Seal leaks
– ...
Ref.: M. J. Howes and D. V. Morgan, Reliability and Degradation: Semiconductor Devices and Circuits, Wiley, 1981.
Example
• Defect: a short to ground
• Fault: signal b stuck at logic 0
[Figure: gate with inputs a and b and output z, annotated with logic values; b is shorted to ground]
• Error: z has the wrong value if a = b = 1
• But, if a = 0, fault exists, but no error!
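The defect/fault/error chain in this example can be replayed in a few lines. This is a minimal sketch that assumes the gate in the figure is a 2-input AND (the drawing is not available in this transcript); `and_gate` and `b_stuck_at` are illustrative names, not part of the lecture.

```python
from itertools import product

def and_gate(a, b, b_stuck_at=None):
    """2-input AND (assumed gate type); optionally force input b to a stuck value."""
    if b_stuck_at is not None:
        b = b_stuck_at
    return a & b

# Input vectors for which the faulty output differs from the good output.
error_vectors = [
    (a, b)
    for a, b in product((0, 1), repeat=2)
    if and_gate(a, b) != and_gate(a, b, b_stuck_at=0)
]
print(error_vectors)
```

Under this assumption only the vector a = b = 1 turns the fault into an error; for the other three vectors the faulty and fault-free outputs agree.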
Why Model Faults?
• Real defects (often mechanical) too numerous and
often not analyzable
• A fault model identifies targets for testing
– Model faults most likely to occur
• Fault model limits the scope of test generation
– Create tests only for the modeled faults
• A fault model makes analysis possible
– Associate specific defects with specific test patterns
• Effectiveness measurable by experiments
– Fault coverage can be computed for specific test patterns
to reflect their effectiveness
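Fault coverage is the fraction of modeled faults that a test set detects. The sketch below computes it by brute-force fault simulation on a hypothetical circuit z = (a AND b) OR c; the circuit, the line names, and the helper names `simulate` and `fault_coverage` are assumptions made for illustration.

```python
def simulate(a, b, c, fault=None):
    """z = (a AND b) OR c; fault is (line_name, stuck_value) or None."""
    def line(name, value):
        # Replace the value by the stuck value if this line is the fault site.
        return fault[1] if fault and fault[0] == name else value

    a, b, c = line("a", a), line("b", b), line("c", c)
    d = line("d", a & b)          # internal line
    return line("z", d | c)

# 5 lines x 2 stuck values = 10 single stuck-at faults.
FAULTS = [(name, value) for name in "abcdz" for value in (0, 1)]

def fault_coverage(tests):
    """Fraction of FAULTS detected by at least one test vector."""
    detected = {
        f for f in FAULTS for t in tests
        if simulate(*t) != simulate(*t, fault=f)
    }
    return len(detected) / len(FAULTS)

print(fault_coverage([(1, 1, 0)]))
print(fault_coverage([(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]))
```

A single vector covers only part of the fault list here, while the four vectors in the second call together detect every modeled fault.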
Common Fault Models
• Single stuck-at faults
• Transistor open and short faults
• Memory faults
• PLA faults (stuck-at, cross-point, bridging)
• Functional faults (processors)
• Delay faults (transition, path)
• Analog faults
Single Stuck-at Fault
• Three properties define a single stuck-at fault
– Only one line is faulty
– The faulty line is permanently set to 0 or 1
– The fault can be at an input or output of a gate
• Example: NAND gate has 3 fault sites (inputs a and b, output z) and 6 single
stuck-at faults (s-a-0 and s-a-1 at each site)
[Figure: NAND gate with inputs a, b and output z, fault sites marked]
[Figure: the NAND gate with an s-a-0 fault; lines are labeled with the good circuit value followed by the faulty circuit value in parentheses, e.g. 1 (0)]
Test vector for a s-a-0 fault
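The six single stuck-at faults of the NAND gate and the vectors that detect each of them can be listed exhaustively. A minimal sketch, in which the function name `nand` and the `(site, stuck_value)` fault encoding are illustrative choices:

```python
from itertools import product

def nand(a, b, fault=None):
    """2-input NAND; fault is (site, stuck_value) with site in 'a', 'b', 'z'."""
    if fault and fault[0] == "a":
        a = fault[1]
    if fault and fault[0] == "b":
        b = fault[1]
    z = 1 - (a & b)
    if fault and fault[0] == "z":
        z = fault[1]
    return z

# 3 fault sites x 2 stuck values = 6 single stuck-at faults.
faults = [(site, value) for site in "abz" for value in (0, 1)]

# A vector is a test for a fault if good and faulty outputs differ.
tests = {
    f: [vec for vec in product((0, 1), repeat=2)
        if nand(*vec) != nand(*vec, fault=f)]
    for f in faults
}
for (site, value), vectors in tests.items():
    print(f"{site} s-a-{value}: detected by {vectors}")
```

For a fault on an input, a test has to drive that input to the opposite of the stuck value and set the other input to 1 so that the difference reaches z.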
Single Stuck-at Fault
Example: XOR circuit has 12 fault sites and 24 single stuck-at faults
[Figure: XOR circuit with lines a to k and output z, annotated with good circuit values]
[Figure: the same XOR circuit with line h s-a-0; affected lines show the good circuit value followed by the faulty circuit value in parentheses, e.g. 1(0) and 0(1)]
Test vector for h s-a-0 fault
Fault Equivalence
• Fault equivalence: Two faults f1 and f2 are
equivalent if all tests that detect f1 also detect f2, and vice versa.
• If faults f1 and f2 are equivalent then the
corresponding faulty functions are identical.
• Fault collapsing: All single faults of a logic circuit
can be divided into disjoint equivalence subsets,
where all faults in a subset are mutually
equivalent. A collapsed fault set contains one
fault from each equivalence subset.
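Because equivalent faults produce identical faulty functions, the equivalence classes of a single gate can be found by grouping its faults by faulty truth table. The sketch below does this for 2-input gates whose output is directly observable; `GATES`, `faulty_table`, and `equivalence_classes` are illustrative names.

```python
from collections import defaultdict
from itertools import product

GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
}

def faulty_table(gate, site, value):
    """Truth table (as a tuple) of a 2-input gate with one stuck-at fault."""
    table = []
    for a, b in product((0, 1), repeat=2):
        if site == "a":
            a = value
        if site == "b":
            b = value
        z = GATES[gate](a, b)
        table.append(value if site == "z" else z)
    return tuple(table)

def equivalence_classes(gate):
    """Group the 6 single stuck-at faults by identical faulty function."""
    classes = defaultdict(list)
    for site, value in product("abz", (0, 1)):
        classes[faulty_table(gate, site, value)].append(f"{site} sa{value}")
    return sorted(classes.values())

for name in GATES:
    print(name, equivalence_classes(name))
```

Each gate ends up with four classes, so a collapsed fault set for one 2-input gate keeps 4 of its 6 faults.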
Equivalence Rules
[Figure: equivalence rules for sa0 and sa1 faults on the inputs and outputs of AND, OR, NAND, NOR, NOT, WIRE, and FANOUT elements]
Equivalence Example
[Figure: example circuit with 15 fault sites, each marked sa0 and sa1; faults removed by equivalence collapsing are highlighted]
Collapse ratio = 16/30 = 0.533
Equivalence Example
[Figure: example circuit with 16 fault sites, each marked sa0 and sa1; faults in red are removed by equivalence collapsing]
Collapse ratio = 20/32 = 0.625
Fault Dominance
• If all tests of some fault F1 detect another fault F2,
then F2 is said to dominate F1.
• Dominance fault collapsing: If fault F2 dominates
F1, then F2 is removed from the fault list.
• When dominance fault collapsing is used, it is
sufficient to consider only the input faults of Boolean
gates. See the next example.
Dominance Example
[Figure: 3-input gate with fault F1 (s-a-1) on an input and fault F2 (s-a-1) on the output]
All tests of F2: 110, 101, 001, 000, 100, 010, 011
Only test of F1: one of these vectors (highlighted in the figure)
[Figure: the same gate with three s-a-1 faults and one s-a-0 fault marked]
A dominance collapsed fault set
Dominance Example
[Figure: example circuits with the sa0 and sa1 faults that remain after collapsing marked on each line]
Faults in red removed by equivalence collapsing
Checkpoints
• Primary inputs and fanout branches of a combinational
circuit are called checkpoints.
Total fault sites = 16
Checkpoints (marked in the figure) = 10
Checkpoint theorem: A test set that detects all single
(multiple) stuck-at faults on all checkpoints of a
combinational circuit, also detects all single
(multiple) stuck-at faults in that circuit.
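Checkpoints can be read off a netlist mechanically: collect the primary inputs and, for every stem that drives more than one gate input, one branch per driven input. The netlist below is hypothetical and much smaller than the circuit in the slide; `checkpoints` is an illustrative helper.

```python
from collections import Counter

# Hypothetical netlist: gate output -> list of input signals.
netlist = {"g1": ["a", "b"], "g2": ["b", "c"], "z": ["g1", "g2"]}

def checkpoints(netlist):
    """Return (primary inputs, fanout branches) of a combinational netlist."""
    uses = Counter(sig for ins in netlist.values() for sig in ins)
    # Primary inputs are signals that no gate drives.
    primary_inputs = sorted(s for s in uses if s not in netlist)
    # A stem with fanout > 1 contributes one branch per gate input it drives.
    branches = [(s, g) for g, ins in netlist.items() for s in ins if uses[s] > 1]
    return primary_inputs, branches

pis, branches = checkpoints(netlist)
nets = set(netlist) | {s for ins in netlist.values() for s in ins}
print("total fault sites:", len(nets) + len(branches))
print("checkpoints:", len(pis) + len(branches))
```

In this small example only signal b fans out, so the checkpoints are the three primary inputs plus the two branches of b.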
Why Fault Collapsing?
• Memory and CPU time savings
– Eases the burden of test generation and fault
simulation in testing
Multiple Stuck-at Faults
• A multiple stuck-at fault means that any set of
lines is stuck-at some combination of (0,1)
values.
• The total number of single and multiple stuck-at
faults in a circuit with k single fault sites is 3^k - 1.
• A single fault test can fail to detect the target
fault if another fault is also present; however,
such masking of one fault by another is rare.
• Statistically, single fault tests cover a very large
number of multiple faults.
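The fault count follows from each of the k sites being in one of three states (fault-free, s-a-0, s-a-1), minus the single combination in which every site is fault-free, which gives 3^k - 1. The sketch below checks the formula by enumeration; `count_faults` and `enumerate_faults` are illustrative names.

```python
from itertools import product

def count_faults(k):
    """Number of single and multiple stuck-at faults for k fault sites."""
    return 3**k - 1

def enumerate_faults(sites):
    """Yield every fault as a dict {site: stuck_value}; skip the fault-free case."""
    states = (None, 0, 1)  # None means the site is fault-free
    for combo in product(states, repeat=len(sites)):
        fault = {s: v for s, v in zip(sites, combo) if v is not None}
        if fault:
            yield fault

sites = "abz"  # the three fault sites of a 2-input gate
all_faults = list(enumerate_faults(sites))
singles = [f for f in all_faults if len(f) == 1]
print(len(all_faults), count_faults(len(sites)), len(singles))
```

Even for three sites, the 6 single faults are a small part of the 26 total, which is why test generation normally targets single faults only.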
Why Single Stuck-at Fault Model?
• Complexity is greatly reduced.
– Many different physical defects may be modeled by the same
logical single stuck-at fault.
• Single stuck-at fault is technology independent
– Can be applied to TTL, ECL, CMOS, etc.
• Single stuck-at fault is design style independent
– Gate Arrays, Standard Cell, Custom VLSI
• Even when single stuck-at fault does not accurately
model some physical defects, the tests derived for logic
faults are still valid for most defects.
• Single stuck-at tests cover a large percentage of multiple
stuck-at faults.
Bridging Faults
• Two or more normally distinct points (lines) are
shorted together
– Logic effect depends on technology
– Wired-AND for TTL
– Wired-OR for ECL
– CMOS?
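The two wired-logic cases can be expressed as a small model of a pair of shorted lines. This is a sketch of the logic-level abstraction only; the function name `bridge` and the model labels are illustrative, and the CMOS case is left open as in the slide.

```python
def bridge(x, y, model):
    """Values seen on two shorted lines driven to x and y, under a wired-logic model."""
    if model == "wired-AND":
        v = x & y
    elif model == "wired-OR":
        v = x | y
    else:
        raise ValueError(f"unknown model: {model}")
    return v, v  # both lines carry the same resolved value

for model in ("wired-AND", "wired-OR"):
    for x, y in ((0, 0), (0, 1), (1, 0), (1, 1)):
        print(model, (x, y), "->", bridge(x, y, model))
```

In both models the bridge changes a line value only when the two lines are driven to opposite values, so a test must set them to opposite values and observe the line that gets overridden.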
Transistor (Switch) Faults
• MOS transistor is considered an ideal switch
and two types of faults are modeled:
• Stuck-open -- a single transistor is permanently
stuck in the open state.
• Stuck-short -- a single transistor is permanently
shorted irrespective of its gate voltage.
• Detection of a stuck-open fault requires two
vectors.
• Detection of a stuck-short fault requires the
measurement of quiescent current (IDDQ).
CMOS Transistor Stuck-Short
• Transistor stuck-on may cause ambiguous logic level
– Depends on the relative impedances of the pull-up and pull-down networks
• When the input is low, both P and N transistors are
conducting, causing increased quiescent current,
called an IDDQ fault.
CMOS Transistor Stuck-Open
• Transistor stuck-open may cause a floating
output.
Functional Faults
• Fault effects modeled at a higher level than
logic for function modules, such as
– Decoders
– Multiplexers
– Adders
– Counters
– RAMs
– ROMs
Functional Faults of Decoder
• f(Li/Lj): Instead of line Li, line Lj is selected
• f(Li/Li+Lj): In addition to Li, Lj is selected
• f(Li/0): None of the lines is selected
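The three decoder fault types can be modeled directly on the set of selected lines. In this sketch the tuple encoding of a fault, `(kind, i)` or `(kind, i, j)`, and the kind labels are assumptions made for illustration.

```python
def decoder(address, n_lines, fault=None):
    """One-hot decoder output; fault affects only the access to line fault[1]."""
    selected = {address}
    if fault and address == fault[1]:
        kind = fault[0]
        if kind == "instead":    # f(Li/Lj): Lj is selected instead of Li
            selected = {fault[2]}
        elif kind == "also":     # f(Li/Li+Lj): Lj is selected in addition to Li
            selected = {address, fault[2]}
        elif kind == "none":     # f(Li/0): no line is selected
            selected = set()
    return [1 if k in selected else 0 for k in range(n_lines)]

print(decoder(1, 4))                       # fault-free
print(decoder(1, 4, ("instead", 1, 3)))
print(decoder(1, 4, ("also", 1, 3)))
print(decoder(1, 4, ("none", 1)))
```

Addresses other than the faulty one behave normally, so a functional test has to exercise every address and check every output line.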
Memory Faults
• Parametric Faults
– Output Levels
– Power Consumption
– Noise Margin
– Data Retention Time
• Functional Faults
– Stuck Faults in Address Register, Data Register,
and Address Decoder
– Cell Stuck Faults
– Adjacent Cell Coupling Faults
– Pattern-Sensitive Faults
Memory Faults
• Pattern-sensitive faults: the presence of a
faulty signal depends on the signal values of
the nearby points
– Most common in DRAMs
• Adjacent cell coupling faults
– Pattern sensitivity between a pair of cells
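A coupling fault between a pair of cells can be sketched as a memory in which a write to one cell disturbs another. The model below assumes one particular flavor, where a 0-to-1 write on the aggressor cell inverts the victim cell; this choice, the class `Memory`, and the `coupling` parameter are illustrative and not taken from the lecture.

```python
class Memory:
    """Bit-addressable memory with an optional (aggressor, victim) coupling fault."""

    def __init__(self, size, coupling=None):
        self.cells = [0] * size
        self.coupling = coupling

    def write(self, addr, value):
        rising = self.cells[addr] == 0 and value == 1
        self.cells[addr] = value
        if self.coupling and addr == self.coupling[0] and rising:
            # Assumed fault behavior: the victim flips on the aggressor's 0->1 write.
            self.cells[self.coupling[1]] ^= 1

    def read(self, addr):
        return self.cells[addr]

good, faulty = Memory(4), Memory(4, coupling=(1, 2))
for mem in (good, faulty):
    mem.write(2, 0)   # set the victim to a known value
    mem.write(1, 1)   # 0->1 transition on the aggressor
print(good.read(2), faulty.read(2))
```

The fault shows up only when the victim is read after the aggressor transition, which is why memory tests order their reads and writes across cells rather than testing each cell in isolation.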
PLA Faults
• Stuck Faults
• Crosspoint Faults
– Extra/Missing Transistors
• Bridging Faults
• Break Faults
Missing Crosspoint Faults in PLA
• Missing crosspoint in AND-array
– Growth fault
• Missing crosspoint in OR-array
– Disappearance fault
Extra Crosspoint Faults in PLA
• Extra crosspoint in AND-array
– Shrinkage or disappearance fault
• Extra crosspoint in OR-array
– Appearance fault
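The AND-array fault names describe what happens to the set of minterms a product term covers. The sketch below represents a product term as a set of (input index, required value) pairs, one pair per crosspoint; this representation and the name `onset` are assumptions, and the OR-array is not modeled.

```python
from itertools import product

def onset(literals, n):
    """Minterms covered by a product term given as (input_index, required_value) pairs."""
    return {
        x for x in product((0, 1), repeat=n)
        if all(x[i] == v for i, v in literals)
    }

term = {(0, 1), (1, 1)}            # x0 AND x1, in a PLA with 3 inputs
growth = term - {(1, 1)}           # missing crosspoint: literal x1 is lost
shrinkage = term | {(2, 1)}        # extra crosspoint: literal x2 is added
disappearance = term | {(1, 0)}    # extra crosspoint on the complement line of x1

for name, t in [("good", term), ("growth", growth),
                ("shrinkage", shrinkage), ("disappearance", disappearance)]:
    print(name, sorted(onset(t, 3)))
```

A missing literal enlarges the covered set, an extra literal reduces it, and an extra literal that contradicts an existing one makes the term cover nothing.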
Gate-Delay Fault
• Slow to rise, slow to fall
– x is slow to rise when channel resistance R1 is
abnormally high
Gate-Delay Fault
• Disadvantage:
– Delay faults resulting from the sum of several
small incremental delay defects may not be
detected.
Path-Delay Fault
• Propagation delay of the path exceeds the
clock interval.
• The number of paths grows exponentially
with the number of gates.
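The growth in path count can be illustrated by counting input-to-output paths in a circuit graph. The sketch below uses a hypothetical worst-case style structure in which every stage splits into two gates that reconverge; `count_paths` and `diamond_chain` are illustrative names.

```python
from functools import lru_cache

def count_paths(fanout, sources, sinks):
    """Count source-to-sink paths in a DAG given as node -> list of successors."""
    sinks = set(sinks)

    @lru_cache(maxsize=None)
    def paths_from(node):
        if node in sinks:
            return 1
        return sum(paths_from(nxt) for nxt in fanout.get(node, []))

    return sum(paths_from(s) for s in sources)

def diamond_chain(stages):
    """Hypothetical circuit: each stage fans out to two gates that reconverge."""
    fanout = {}
    for i in range(stages):
        fanout[f"n{i}"] = [f"u{i}", f"v{i}"]
        fanout[f"u{i}"] = [f"n{i + 1}"]
        fanout[f"v{i}"] = [f"n{i + 1}"]
    return fanout, ["n0"], [f"n{stages}"]

for stages in (3, 10, 20):
    print(stages, count_paths(*diamond_chain(stages)))
```

Here the number of nodes grows linearly with the number of stages while the number of paths doubles with each stage, so enumerating all path-delay faults quickly becomes impractical for such structures.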
State Transition Graph
• Each state transition is associated with a 4-tuple:
– source state, input, output, destination state
Single State Transition Fault Model
• A fault causes a single state transition to a
wrong destination state.
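A single state transition fault can be modeled by changing the destination state of one 4-tuple in the state transition graph. The two-state Mealy machine below is hypothetical; `run`, `good`, and `faulty` are illustrative names.

```python
def run(stg, start, inputs):
    """Apply an input sequence to a Mealy machine and return the output sequence."""
    state, outputs = start, []
    for x in inputs:
        out, state = stg[(state, x)]
        outputs.append(out)
    return outputs

# Hypothetical machine: (source state, input) -> (output, destination state)
good = {
    ("A", 0): (0, "A"), ("A", 1): (0, "B"),
    ("B", 0): (0, "A"), ("B", 1): (1, "B"),
}

# Single state transition fault: the transition (A, 1) goes to A instead of B.
faulty = dict(good)
faulty[("A", 1)] = (0, "A")

print(run(good, "A", [1]), run(faulty, "A", [1]))
print(run(good, "A", [1, 1]), run(faulty, "A", [1, 1]))
```

The output of the faulty transition itself is unchanged, so the test sequence needs a further input that distinguishes the wrong destination state from the correct one.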
Faults in FPGAs
FPGA building blocks:
[Figure: Virtex (Xilinx) FPGA building blocks: LUT (inputs F1 to F4), flip-flop (ff), BlockRAM, and configuration memory cells (M); an SEU (bit flip) hits a configuration memory cell]
• Permanent faults: the same ASIC models apply
• But for transients ...
Effect of Transients in SRAM-based FPGAs
CLB combinational logic: ~0.5% of the FPGA sensitive area
• Possible bit flip
• Transient effect
• Corrected at the next load
Effect of Transients in SRAM-based FPGAs
CLB flip-flops: ~0.5% of the FPGA sensitive area
• Bit flip
• Transient effect
• Corrected at the next load
Effect of Transients in SRAM-based FPGAs
CLB LUTs: ~8% of the FPGA sensitive area
• Bit flip
• Permanent effect
• Corrected by reconfiguration
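The permanent effect of an upset in a LUT can be sketched with a 4-input LUT whose function is stored in 16 configuration bits. This is a behavioral model only; the class `Lut4`, its methods, and the bit ordering are assumptions and do not describe the actual Virtex configuration format.

```python
class Lut4:
    """4-input LUT whose function is held in 16 configuration memory bits."""

    def __init__(self, config_bits):
        self.golden = list(config_bits)   # copy of the bitstream, used to reconfigure
        self.bits = list(config_bits)

    def read(self, f1, f2, f3, f4):
        return self.bits[f1 | (f2 << 1) | (f3 << 2) | (f4 << 3)]

    def seu(self, index):
        self.bits[index] ^= 1             # single-event upset: one bit flips

    def reconfigure(self):
        self.bits = list(self.golden)     # reload the original configuration

lut = Lut4([0] * 15 + [1])                # configured as a 4-input AND
before = lut.read(1, 1, 1, 1)
lut.seu(15)
during = [lut.read(1, 1, 1, 1) for _ in range(3)]  # wrong on every evaluation
lut.reconfigure()
after = lut.read(1, 1, 1, 1)
print(before, during, after)
```

Unlike an upset in a user flip-flop, which is overwritten at the next load, the flipped configuration bit keeps changing the implemented function until the configuration is rewritten.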
Effect of Transients in SRAM-based FPGAs
Routing and CLB customization: ~91.0% of the FPGA sensitive area
• Short or open circuit
• Corrected by reconfiguration
Summary
• Fault models are analyzable approximations of defects and
are essential for a test methodology.
• For digital logic, the single stuck-at fault model offers the best
advantage of tools and experience.
• Many other faults (bridging, stuck-open, and multiple stuck-at) are largely covered by stuck-at fault tests.
• Stuck-short and delay faults and technology-dependent
faults require special tests.
• Memory and analog circuits need other specialized fault
models and tests.
• Transient faults may have permanent effects in FPGAs.