p141_buchner_s - NASA Office of Logic Design

Single Event Effects Testing of the
Atmel IEEE1355 Protocol Chip
Stephen Buchner¹, Mark Walter², Moses McCall³ and Christian Poivey⁴
¹QSS Group Inc., Lanham MD 20772
²Orbital Sciences Corp, Dulles VA 99999
³NASA-GSFC, Greenbelt MD 20771
⁴SGT-Inc, Greenbelt MD 20771
Buchner
1
2004 MAPLD: 141
What is IEEE 1355?
• IEEE 1355 specifies the physical media and low level protocols
for a family of serial interconnect systems.
• The speeds and media range from 10 Mbps to 1 Gbps in both
copper and optical technologies and are scalable.
• Data are transmitted between nodes via packets. Each packet
consists of a header, a data section, and a CRC section (to flag
errors).
• The header contains information concerning the destination
node, data format, packet length, etc.
• The protocol is based on the “Seven-Layer Open System
Interconnect Model”.
• Routers determine paths taken by packets through the internet.
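The packet structure described above (header, data section, CRC) can be sketched as follows. This is a minimal illustration, not the IEEE 1355 wire format: the three-byte header layout, the function names `build_packet` and `check_packet`, and the use of CRC-32 from Python's `zlib` are all assumptions made for the example.

```python
import zlib

def build_packet(destination: int, payload: bytes) -> bytes:
    """Assemble header + data + CRC. The layout is illustrative only."""
    # Hypothetical 3-byte header: destination node, then 2-byte payload length.
    header = bytes([destination]) + len(payload).to_bytes(2, "big")
    body = header + payload
    # CRC-32 stands in for whatever check the real link uses (assumption).
    crc = zlib.crc32(body).to_bytes(4, "big")
    return body + crc

def check_packet(packet: bytes) -> bool:
    """Return True when the trailing CRC matches the header and data."""
    body, crc = packet[:-4], packet[-4:]
    return zlib.crc32(body).to_bytes(4, "big") == crc

packet = build_packet(destination=2, payload=b"\xa5")
corrupted = bytearray(packet)
corrupted[3] ^= 0x01  # flip one bit in the data section
```

Here `check_packet(packet)` accepts the intact packet, while the copy with a single flipped bit is rejected, which is the role the CRC section plays in flagging errors.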
IEEE 1355 Used in Solar Dynamics
Observatory (SDO)
Atmel ASICs located on NIC cards
[Block diagram of the SDO data network. Labels: EVE, AIA, HMI, RF, KA card, PSE, ACS, µP, four NICs, an IEEE 1355 link at 150 Mbps, and a 1553 bus.]
Implementation
• Transmission protocols are controlled by an ASIC manufactured
by Atmel.
• The ASIC is implemented in a 0.6 µm three-level metal CMOS
“Sea of Gates” technology (MG1RT).
• The ASIC contains logic elements, registers, memory and a
phase-locked loop (PLL).
• The device is a TSS901E Atmel chip capable of running three
channels. It is mounted on a 4LINKS board that can be plugged
into a computer slot.
Radiation Effects
• Ionizing particles in space produce both total ionizing dose (TID)
degradation and single event effects (SEEs) in electronic
circuits.
• TID causes a gradual degradation in performance manifested
through increased leakage currents, slower operation and
eventually functional failure.
• SEEs can take many forms. We are concerned with single event
latchup (SEL) and single event upset (SEU). SEL may lead to
destructive failure. An SEU that halts operation is termed a
single event functional interrupt (SEFI).
Previous Radiation Testing of
Atmel Chip
• No parametric degradation up to a TID level of 40 krad(Si).
• SEL threshold exceeds 120 MeV·cm²/mg.
• SEU testing covered individual latches and memories only. It revealed a
relatively low threshold that depended on supply voltage (lowest
LETth = 12 MeV·cm²/mg).
• The low LET threshold implies possible proton sensitivity.
• No SEU data for the chip configured as an IEEE 1355 protocol control
circuit. Therefore, the presence and consequences of SEFIs are not known.
Also, the PLL was not tested and it could exhibit a frequency dependence.
Hardware for SEE Testing
[Figure: test setup. Computer A and Computer B are connected by three cables. One NIC, mounted on an extender card, carries the ASIC (DUT) exposed to the ion beam; the other ASIC is not exposed to the ion beam.]
Hardware for Testing
[Photograph: Atmel ASIC on the extender card.]
Software for SEE Testing
• Step 1. Designate Master and Slave computers.
• Step 2. Start Master before Slave.
• Step 3. Master in each channel sends a “flow control character”
to the Slave, requesting the Slave to send one byte of data
back.
• Step 4. The Slave generates a packet consisting of a “Header”
containing flow control characters followed by one byte of data
(A5). Parity bits are added to flag any errors that may arise in
the header or data parts of the packet.
• Step 5. Packet is transmitted from Slave to Master.
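The exchange in Steps 3 to 5 can be sketched in a few lines. This is a simplified model under stated assumptions: a single odd-parity bit protects the data byte, and the names `slave_reply` and `master_check` are hypothetical. It does not reproduce the actual IEEE 1355 character encoding or the test software used.

```python
DATA_BYTE = 0xA5  # the one-byte payload named in Step 4

def odd_parity_bit(value: int) -> int:
    """Return the bit that makes the total number of 1s odd."""
    return 0 if bin(value).count("1") % 2 == 1 else 1

def slave_reply() -> tuple:
    """Slave side: answer a request with the data byte and its parity bit."""
    return DATA_BYTE, odd_parity_bit(DATA_BYTE)

def master_check(byte: int, parity: int) -> str:
    """Master side: classify the received byte."""
    if odd_parity_bit(byte) != parity:
        return "parity error"
    if byte != DATA_BYTE:
        return "data error"
    return "ok"

byte, parity = slave_reply()
clean = master_check(byte, parity)
upset = master_check(byte ^ 0x08, parity)    # one bit flipped in transit
double = master_check(byte ^ 0x18, parity)   # two bits flipped in transit
```

A single flipped bit is caught by the parity check. A double flip passes parity, but because the Master knows the expected byte is A5 it can still flag the mismatch.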
Irradiation Conditions
Ion    LET (MeV·cm²/mg)
Cu     20.7
Kr     29.3
Xe     53.9
Au     87.4
• DUT configured to be Master and Slave.
• Frequencies: 6 MHz, 80 MHz, 100 MHz, 140 MHz.
• Supply voltage = 5.0 V.
Expected Errors
• Errors may occur in the packets, in either the header or data
parts. Such errors are flagged by the extra parity bits used by the
CRC. Errors in a packet can occur while the packet is
temporarily stored in either of the two FIFOs – one for
transmission and one for reception.
• Errors may occur in the registers containing data used to
configure the network. There are 96 such registers of which 40
can be read because they are fixed and only used once when
communications are first established. The remaining 56 are
dynamic and errors in those registers can cause SEFIs.
• Errors in the PLL can cause a loss of synchronization that
results in a SEFI.
Results
• SEFIs were observed at all LETs. However, recovery required only a
software restart and not a full power cycle.
• No latchup observed.
• Error rate independent of frequency.
• All errors appeared only in the Master independent of whether
the DUT was configured to be the Master or the Slave. This is
because only the Master detected errors while the Slave acted
as a “dumb” terminal.
Results
• Communications were halted when one, two or three links were
dropped.
• Only 40 of the 96 registers could be monitored for SEUs. The
SEU threshold for those registers was greater than 29.3
MeV·cm²/mg.
• The LET threshold for link drops and link errors is below
20 MeV·cm²/mg. The lower LET threshold is most likely due to
errors in the PLL.
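The slides do not state how the cross-sections were computed. Under the standard definition (number of events divided by particle fluence, normalized here per link or per register), a minimal sketch looks like this. The function name `cross_section` and the run numbers are hypothetical and are not data from this test.

```python
def cross_section(events: int, fluence: float, units: int = 1) -> float:
    """Cross-section in cm^2 per unit: events / (fluence * units).

    fluence is in ions/cm^2; units is the number of links or registers.
    """
    if fluence <= 0 or units <= 0:
        raise ValueError("fluence and units must be positive")
    return events / (fluence * units)

# Hypothetical run: 30 link drops on 3 links after 1e7 ions/cm^2.
sigma_link = cross_section(events=30, fluence=1.0e7, units=3)
```

With these made-up numbers the result is 1e-6 cm²/link. Repeating the calculation for each ion gives one point per LET value.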
Results
[Figure: Master configuration. Cross-section (cm²/link), log scale from 1.E-10 to 1.E-03, versus LET (MeV·cm²/mg) from 0 to 100. Series: link errors and link drops at 80, 100 and 140 MHz.]
Results
[Figure: Master configuration. Cross-section (cm²/register), log scale from 1.E-10 to 1.E-03, versus LET (MeV·cm²/mg) from 0 to 100. Series: register errors at 80, 100 and 140 MHz.]
Summary and Conclusions
• Results of SEE testing of the Atmel ASIC 3-channel chip:
– Not sensitive to SEL.
– Upsets take the form of either link drops or register errors.
– Error rate is not sensitive to frequency.
– SEFIs require a software restart and not a power reset.
– Errors have a low LET threshold, which means that the error
rate must be calculated and methods implemented to
immediately initiate a software restart.
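One possible way to initiate a software restart promptly is a watchdog that restarts the link when no valid packet has arrived within a timeout. The sketch below is only an illustration of that idea. The class name `LinkWatchdog`, the timeout value and the restart callback are all hypothetical, and the slides do not say which method was adopted.

```python
import time

class LinkWatchdog:
    """Trigger a restart callback when no valid packet arrives in time."""

    def __init__(self, timeout_s, restart, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.restart = restart      # hypothetical software-restart hook
        self.clock = clock          # injectable so the logic can be tested
        self.last_ok = clock()
        self.restarts = 0

    def packet_ok(self):
        """Call whenever a packet passes its error checks."""
        self.last_ok = self.clock()

    def poll(self):
        """Restart the link if it has been silent longer than the timeout."""
        if self.clock() - self.last_ok > self.timeout_s:
            self.restart()
            self.restarts += 1
            self.last_ok = self.clock()
            return True
        return False

# Demonstration with a fake clock instead of real time.
fake_time = [0.0]
restart_log = []
watchdog = LinkWatchdog(0.5, restart=lambda: restart_log.append(fake_time[0]),
                        clock=lambda: fake_time[0])
fake_time[0] = 0.4
first = watchdog.poll()    # still within the timeout
fake_time[0] = 1.0
second = watchdog.poll()   # silent for 1.0 s, restart triggered
```

In the demonstration the first poll does nothing and the second triggers one restart. In a real system the timeout would be chosen from the calculated error rate and the acceptable outage time.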