Status of Maraton problem
Niko Neufeld
Calo Commissioning meeting 17-07-07
The problem
• When using the Ethernet access to the RCM
Niko Neufeld
CERN, PH
Setup
[Diagram; labels: Rectifier, Maraton, RCM, USB, Ethernet, Crate 1 or LV 1, Crate 2 or LV 2]
Software Stacks
• PVSS script --> OPC Client (PVSS00opc) --> OPC Server --> SNMP/UDP/IP/Ethernet
• MuHsE --> USB
• web browser --> HTTP/Ethernet
The Problem
• When using PVSS panels, a command to the Maraton will "grey" the button; the panel is then blocked
• When this happens, the following can cure the problem:
1. kill the OPC server (restart the PVSS panel) - sometimes
2. kill the OPC server and re-register it - sometimes
3. reset (= power-cycle) the RCM - sometimes (the Inner Tracker is the only place where this does not seem to work)
4. reset the RCM and power-cycle the rectifier - always (this is done only by the Inner Tracker)
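The four cures form an escalation ladder, from least to most intrusive. A minimal sketch of that idea follows; the function `escalate`, the health check and the stubbed actions are all hypothetical, and nothing here talks to a real OPC server or RCM.

```python
from typing import Callable, List, Optional, Tuple

# A cure is a (name, action) pair; the actions are placeholders (assumptions).
Cure = Tuple[str, Callable[[], None]]


def escalate(cures: List[Cure], is_healthy: Callable[[], bool]) -> Optional[str]:
    """Apply cures in order; return the name of the first one that helps.

    Returns None if the panel is still blocked after every cure.
    """
    for name, apply in cures:
        apply()
        if is_healthy():
            return name
    return None


if __name__ == "__main__":
    # Stand-in actions that only print; real ones would be site-specific.
    ladder = [
        ("kill OPC server", lambda: print("1. kill OPC server")),
        ("kill and re-register OPC server", lambda: print("2. re-register")),
        ("reset RCM", lambda: print("3. reset RCM")),
        ("reset RCM and power-cycle rectifier", lambda: print("4. full reset")),
    ]
    print(escalate(ladder, is_healthy=lambda: False))
```

Recording which rung finally worked is what makes the "sometimes" entries above useful for diagnosis.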
Facts
• In the cases where 3.) or 4.) is needed, the web interface still works and shows that the last command has been ignored
– --> the RCM has not crashed
• In the cases where 3.) or 4.) is needed, the MuHsE tool shows read errors and it is impossible to send commands
– --> the command processor loop (or whatever it is) in the RCM is stuck
• When the MuHsE tool alone is used, no problem has ever shown up anywhere
• In the Calo, at least, the problem has been observed on many RCMs in uncorrelated circumstances
Who has seen / has not seen the problem
• The Calorimeter
– frequently in the Pit
– never in the test setup in 157
• The RICH
– occasionally in the Pit
• The IT
– frequently in their test setup in Bat 13
• IT/CO
– cannot reproduce this on their (single RCM) test stand
• The other LHC experiments
– but it is not clear how often this has actually been used
Sad facts
• Until today I have been unable to reproduce the problem
– torture script in 156
– leaving the voltage on for a long time
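The torture script itself is not shown in the slides. As a generic illustration only, a stress loop of this kind might look like the sketch below; `torture`, `send_command` and the command strings are hypothetical, and the transport is a stub, not the real PVSS/OPC path.

```python
import time
from typing import Callable, List, Tuple


def torture(send_command: Callable[[str], bool],
            commands: List[str],
            cycles: int,
            pause_s: float = 0.0) -> Tuple[int, List[Tuple[int, str]]]:
    """Send every command `cycles` times.

    Returns (number of commands sent, list of (cycle, command) failures).
    `send_command` must return True when the command was accepted.
    """
    sent = 0
    failures: List[Tuple[int, str]] = []
    for cycle in range(cycles):
        for cmd in commands:
            sent += 1
            if not send_command(cmd):
                failures.append((cycle, cmd))
            if pause_s:
                time.sleep(pause_s)
    return sent, failures


if __name__ == "__main__":
    # Stub transport that accepts everything (assumption, for illustration).
    sent, failures = torture(lambda cmd: True, ["on", "off"], cycles=3)
    print(sent, failures)
```

Keeping the cycle index with each failure makes it possible to line failures up with timestamps in other logs.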
Speculation / Further tests
• Does the size of the system matter?
– probably not so much: CALO >>> IT (lab), yet both see the problem frequently
• Does the OPC configuration matter?
– not clear - should test this
• Does the LAN matter?
– IT uses the CERN general purpose network (not so clean)
– CALO uses the LHCb network on a CALO private VLAN - should be reasonably clean
– --> could try a totally private network (between computer and RCMs only)
Checklist for the future
• When the problem occurs, please
– note the time, the last action you tried, the name of the RCM, and what was needed to cure it; if the cure is not always the same, also copy the OPC configuration
– if you can, inform me (Niko); in that case do not try to cure it yet
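One way to keep such reports uniform is a small record type with one field per checklist item. This is only a sketch: the class `MaratonIncident`, its field names and the JSON output format are assumptions, not an agreed format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class MaratonIncident:
    """One occurrence of the blocked-panel problem (field names are illustrative)."""
    rcm_name: str          # name of the affected RCM
    last_action: str       # last action tried before the panel blocked
    cure: str              # what was needed to cure it
    time: datetime = field(default_factory=datetime.now)
    opc_config: Optional[str] = None  # copy of the OPC configuration, if relevant

    def to_json(self) -> str:
        """Serialise the record as a single JSON line."""
        record = asdict(self)
        record["time"] = self.time.isoformat(timespec="seconds")
        return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    incident = MaratonIncident(
        rcm_name="rcm-example",            # hypothetical name
        last_action="switch channel on",   # hypothetical action
        cure="reset RCM",
    )
    print(incident.to_json())
```

One JSON line per incident can be appended to a shared file and later compared across subdetectors.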