Rice Workshop LV-Maraton 20081218 - CMS

Rice Workshop
Wiener Maraton LV
This document covers the LV system from the AC supplied by CMS
through the Maratons and to the individual PCrates and CSC JBs
Prepared by
Fred Borcherding, Sasha Golyash,
Petr Levchenko
7-Jan-2009
Rice Workshop Maraton LV
1
Maraton OPFC Modules
• Three Problems with OPFC modules
– 1 Dramatic failure of front panel switch
– 2 Switching off/on module too quickly causes failure
– 3 No remote control or monitoring
• Problem 1 >> The front panel switch has been a problem – the quick connect to the switch becomes resistive
– The front panel rocker switch on the module serves several functions
• It is the circuit breaker for the power OUTPUT > note that it does not interrupt the input
AC at its source
• It flashes in different patterns to indicate the status of the module
– Among them – DC off => module switched off, DC on => module switched on and operating properly
– The front panel switch is connected to the input power via a quick connect
• This can become resistive over time
• We see this problem but ECAL does not >> we run near the module's maximum power, while ECAL runs at just above 1/2
– CMS Electronics Group (Magnus & Sharp) along with CSC have been working on
getting a solution
• Working on getting an agreement with Wiener to replace the quick connect with soldered
connection
• Possibly just solder existing connections in place
– We have no schedule on when this would be done
– Petr and I have been and will continue working on this problem
Maraton OPFC Modules
• Problem 2 >> The Soft Start can fail and damage module
– Turning the power OFF and then back ON too quickly can cause the
failure
– This is a feature of the slow ramp-down inside the module
• When the module is switched off the output voltage is slowly ramped down
over seconds
• If the module is switched back on before the output voltage has settled to zero,
the soft start is bypassed
• The module tries to immediately restore the full 385VDC output voltage
• This can cause damage inside the module.
– For now the fix is administrative
• Only experts cycle switches > with proper time interval observed
• Shifters are instructed (and procedures state) to switch modules ON only if they are found OFF > never cycle them
– No agreement with Wiener on a hardware solution to this problem (see
problem 3) >> this problem will be pursued through CMS
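The administrative fix above amounts to enforcing a minimum OFF time before a module is switched back ON. A minimal sketch of such a guard follows. The `PowerCycleGuard` class and the 30-second interval are assumptions made for illustration (the slides only say the ramp-down takes seconds), and this is not part of any Wiener or DCS software.

```python
import time


class PowerCycleGuard:
    """Refuses a switch-ON request until a minimum OFF time has elapsed."""

    def __init__(self, min_off_seconds=30.0, clock=time.monotonic):
        # min_off_seconds is an assumed placeholder, not a Wiener specification.
        self.min_off_seconds = min_off_seconds
        self._clock = clock
        self._switched_off_at = None

    def record_switch_off(self):
        """Remember when the module was switched off."""
        self._switched_off_at = self._clock()

    def may_switch_on(self):
        """Return True if enough time has passed since the last switch-off."""
        if self._switched_off_at is None:
            # No recorded switch-off: assumed here to have been off long enough.
            return True
        elapsed = self._clock() - self._switched_off_at
        return elapsed >= self.min_off_seconds
```

The clock is injectable so the rule can be exercised without waiting; a real interval would have to come from Wiener.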
Maraton OPFC Modules
• Problem 3 >> No remote control or monitoring on module
– The module as is has no external monitoring or control capabilities
– But a connection to its internal processor is installed inside the module case
• This connection is used by the vendor to monitor and control the module for testing before
and after repair
• This connection is also used to program the processor
• This is an RS232 protocol connection
– CMS is in discussions with Wiener to move this connector to the outside of the
module case
• Modules would be sent back to Wiener a few at a time, modified and returned
• CMS would be responsible for the software to communicate with the module and bridge
across to DCS
• Envision one PC for CSC, and one for ECAL, to communicate with modules and with DCS
– External control & monitoring system would have to be robust
• Do NOT want to interfere with processor operation >> remember the program can be changed and corrupted via this link
• Need to have the system robust: a bad command to a single module takes out 18 CSC.
– We have no agreement and of course no schedule for this upgrade >> we will
continue to pursue this through CMS Electronics
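One way to limit the risk of a bad command reaching a module is to forward only commands on a read-only allow-list. The sketch below illustrates the idea. The command names are hypothetical, since the actual RS232 command set is not described here, and the functions do not talk to any hardware.

```python
# Hypothetical read-only command names; the real Wiener command set is not given here.
READ_ONLY_COMMANDS = frozenset({"READ_STATUS", "READ_VOLTAGE", "READ_CURRENT"})


def is_allowed(command):
    """Return True only for commands on the read-only allow-list."""
    return command.strip().upper() in READ_ONLY_COMMANDS


def filter_commands(commands):
    """Split requested commands into (forwarded, rejected) lists."""
    forwarded, rejected = [], []
    for command in commands:
        if is_allowed(command):
            forwarded.append(command)
        else:
            rejected.append(command)
    return forwarded, rejected
```

Anything not explicitly listed is rejected, so a new or mistyped command cannot reach the processor by default.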
Maraton
• Two Problems (concerns) with LV System
• Problem 1 >> Problems with LV distribution system downstream of
Maratons
– A problem with loose and improperly crimped connections was discovered at the LV output of a Maraton
– Power was turned off and a manual inspection was made of all
connections and needed repairs made before power was restored to system
– Subsequently the connections were systematically rechecked manually
and with temperature sensors, plus the individual current outputs were
analyzed for the system looking for resistive channels.
• Future >
– Repeat systematic check manually, with temperature sensors, and looking
at current upon re-start in 2009
– Repeat for affected parts of the system whenever an intervention is made
– Repeat checks monthly until beam
– Repeat as deemed necessary from then on
• Sasha will continue leading in this effort
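The slides do not say how the current outputs were analyzed. One possible approach, sketched below, is to compare each channel against the median of channels with similar loads and flag large deviations for a manual check. The channel names, readings, and the 15% tolerance are assumptions for illustration only.

```python
from statistics import median


def flag_outlier_channels(currents, tolerance=0.15):
    """Return names of channels whose current deviates from the group median
    by more than `tolerance` (expressed as a fraction of the median)."""
    reference = median(currents.values())
    if reference == 0:
        # No meaningful reference to compare against.
        return []
    return sorted(
        name
        for name, amps in currents.items()
        if abs(amps - reference) / abs(reference) > tolerance
    )


# Hypothetical readings in amps for four channels with similar loads.
readings = {"ch1": 10.0, "ch2": 10.2, "ch3": 9.9, "ch4": 7.5}
suspects = flag_outlier_channels(readings)
```

The comparison is only meaningful within a group of channels expected to draw similar current, and a flagged channel is a candidate for inspection rather than a confirmed fault.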
Maraton
• Problem 2 >> Checking and reconciling documentation against the as-built system
– Are the PCrates and JBs connected to the Maraton channels we think they are? And does DCS control and monitor the hardware modules it thinks it does?
– We started with design documentation of connectivity
– Added rack names (e.g. X2A41c) and CSC names (e.g. VME+2/6) to the documentation
– Added DCS names to documentation
– Then systematically controlled each PCrate and JB to check the combined document (did not do YE+-3, but that is almost trivial)
– Found discrepancies >>
• One in Maraton to PCrate routing >> being investigated / fixed by Sasha
• One in OPFC to Maraton routing >> being investigated / fixed by Petr
• Recommend that on re-startup after shutdown (and fix completion) this test be re-done >> takes about one 8 hour shift
• Testing will be led by me with shifters' help, fixes will be made by Sasha, Petr, or Valerii as appropriate.
• Document at:
– http://cms-emu-slicetest.web.cern.ch/cms-emuslicetest/904/Documentation/LVPS/Maraton_output_terminals(3).htm
– http://twiki.cern.ch/twiki/bin/view/CMS/CSCelectronics#LV
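The documentation check described above is essentially a comparison of two mappings: what the document says each Maraton channel feeds, and what responds when that channel is controlled. A minimal sketch, assuming hypothetical channel keys such as "M1/ch1" and a plain-dictionary representation of both mappings:

```python
def find_discrepancies(documented, observed):
    """Compare documented and observed channel-to-load mappings.

    Returns a dict of channel -> (documented load, observed load) for every
    channel where the two differ or where one side has no entry.
    """
    mismatches = {}
    for channel in sorted(set(documented) | set(observed)):
        expected = documented.get(channel)
        actual = observed.get(channel)
        if expected != actual:
            mismatches[channel] = (expected, actual)
    return mismatches


# Hypothetical example: channel keys and the second load name are made up.
documented = {"M1/ch1": "VME+2/6", "M1/ch2": "VME+2/5"}
observed = {"M1/ch1": "VME+2/6", "M1/ch2": "VME+2/4"}
report = find_discrepancies(documented, observed)
```

Channels present on only one side show up with `None` on the other, which covers loads that are missing from the document as well as misrouted ones.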
CANBus problems with Maratons
Also see DCS discussions for DCS monitoring and control considerations for this problem
• Observed but not yet diagnosed
– CANBus PC crashes – could be symptom or cause
• Note LV stays ON and hardware protections stay in place
• But DCS monitoring and software protections are lost
– Reboot computer
– Cannot communicate with Maratons
– Cycle power to Maraton(s) >> Then communication can be restored
– Works BUT have to restart and reinitialize electronics for multiple crates plus 9 chambers per crate. Also cannot be done on the fly but requires time between runs.
• Is this problem in the CANBus or the interface
– Cannot isolate the problem between these two
– Recently have been in communication with CMS people about CANBus
• CMS has a CANBus analyzer that we might use to
– Isolate the problem between the bus and interface
– Analyze our bus looking for problems and to optimize its operation
– Will start the process of borrowing, learning, and using this in January
• Petr will provide this support
– When we are confident that CANBus and hardware are optimum, further operation of the system should uncover only any existing server problems.
• Note that there is also a similar CANBus communication problem with the PCrate PCBs >> this may or may NOT be related.
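Because the LV stays ON while DCS monitoring is lost, the loss itself is easy to miss. A minimal sketch of a staleness check on monitoring timestamps follows. The crate names, the 60-second limit, and the idea of keeping a last-update time per crate are assumptions for illustration, not a description of the existing DCS.

```python
def stale_crates(last_update, now, max_age_seconds=60.0):
    """Return crates whose most recent reading is older than max_age_seconds.

    last_update maps a crate name to the time (in seconds) of its last reading;
    now is the current time on the same clock.
    """
    return sorted(
        crate
        for crate, stamp in last_update.items()
        if now - stamp > max_age_seconds
    )


# Hypothetical example: crate "B" has not reported for 90 seconds.
silent = stale_crates({"A": 100.0, "B": 30.0}, now=120.0)
```

A check like this only detects that communication has stopped; it does not say whether the bus, the interface, or the server is at fault.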
Maraton Inventory from Peter
• How many do we have? How many spares do we have? CSC LV system spare status:
Module name | # installed | # hot spares | # received | # sent | # units | status
1. Maraton, PS | 36 | 3 | 40 | 1 | 0 | OK
2. Maraton, bin | 36 | 2 | 38 | 0 | 0 | OK
3. OPFC | 36 | 4 | 40 | 0 | 0 | OK
4. OPFC Crate | 6 | 2 | 8 | 0 | 0 | OK
• Where are the spares? -> In ISR/USC/S4
• What level of engineering effort is needed to maintain the element? -> Study situation in ISR/904/P5 and implement solution in EMU
Summary
• The Maraton LV system has some problems that need to be addressed.
– OPFC, Maratons, and LV distribution cabling
– Some of the work can be done before power is again available.
– Most work needs to be done as part of the LV startup of the system
– Other work needs interfacing with CMS and Wiener persons.
• Do not have a good estimate of the time required
– Prior to startup: 1 week
– During startup: 2 weeks
– After startup: Unknown
• This work may be disruptive of ‘normal’ running so will need to be scheduled