CMW status - LHC commissioning


P.Charrue – LBCM 14 Sept 2010
For the CMW team

Current CMW issues (3)

CMW medium- and long-term plans

How to report issues to the Controls Group
4th May 2010
P.Charrue - LBCM

Description :
 Java clients (XPOC project) are blocked and no longer receive data from
the devices

Cause :
 Socket blocking situation in the JacORB CORBA library (part of the CMW
infrastructure) – known bug in JacORB

Occurrence :
 Once for the XPOC client
 Often for the Logging infrastructure

Immediate cure :
 Restart the client application as the blocking situation cannot be
resolved

CMW proposal :
 Today: We provide a callback to the client application which detects such a
blocking situation and lets the application take action (mail, SMS, alarm, restart, log, …)
 In 2 weeks: We will deliver a patch to the external JacORB library to solve
this blocking situation; the patch is currently being tested.
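The slides do not show the interface of the callback mentioned above. As a rough illustration of the idea only, the sketch below is a generic watchdog that invokes a user-supplied action when no data has arrived for longer than a timeout. The class `StallWatchdog` and all of its methods are hypothetical names and are not part of CMW or JacORB.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical watchdog (not CMW/JacORB API): reports when data stops arriving. */
final class StallWatchdog {
    private final long timeoutNanos;
    private final LongSupplier clock;   // injected so the logic can be tested without sleeping
    private final Runnable onStall;     // the application's action: mail, alarm, restart, log, ...
    private final AtomicLong lastActivity;
    private final AtomicBoolean reported = new AtomicBoolean(false);

    StallWatchdog(long timeoutNanos, LongSupplier clock, Runnable onStall) {
        this.timeoutNanos = timeoutNanos;
        this.clock = clock;
        this.onStall = onStall;
        this.lastActivity = new AtomicLong(clock.getAsLong());
    }

    /** Call this from the data-reception path every time an update arrives. */
    void recordActivity() {
        lastActivity.set(clock.getAsLong());
        reported.set(false);            // re-arm the watchdog after recovery
    }

    /** Runs the stall action once per stall, when the idle time exceeds the timeout. */
    void check() {
        long idle = clock.getAsLong() - lastActivity.get();
        if (idle > timeoutNanos && reported.compareAndSet(false, true)) {
            onStall.run();
        }
    }

    /**
     * Runs check() periodically on a daemon thread. If onStall throws, the
     * scheduler cancels further checks, so the action should catch its own errors.
     */
    ScheduledExecutorService startTimer(long periodMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "stall-watchdog");
            t.setDaemon(true);
            return t;
        });
        timer.scheduleAtFixedRate(this::check, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

In an application this would be created with `System::nanoTime` as the clock, `recordActivity()` would be called from the subscription listener, and `startTimer(...)` would be called once at start-up. The check runs on its own thread, so it still fires when the receiving thread itself is stuck.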

Description :
 CMW Proxy is blocked due to slow-consuming clients

Cause :
 ‘Slow clients’ subscribed to the Proxy do not consume the data quickly enough
and block many notification threads in the Proxy, resulting in a complete blocking
of the Proxy

Occurrence :
 BBQ, Hump Buster

Immediate cure :
 Kill the ‘slow client’ application as the blocking situation cannot be resolved
automatically

CMW proposal :
 A new version of the Proxy has been developed that handles slow clients
correctly (by reserving processing resources for every subscribed client) and
minimizes the impact of slow consumers on well-behaved clients
 Currently being tested for the CMW-Proxy-BQ
 When the tests are completed, the upgraded Proxy will be deployed in close
collaboration with Operations (end of this week)
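The slides do not describe how the new Proxy reserves resources per client. One common way to isolate slow consumers, shown here only as a sketch under that assumption, is to give each subscriber its own bounded buffer and never block the publisher: when a slow client's buffer is full, its oldest update is dropped. The class `PerClientBuffers` and the drop-oldest policy are illustrative choices, not the CMW Proxy implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical per-subscriber buffering (not the CMW Proxy code). */
final class PerClientBuffers<T> {
    private final int capacity;                           // reserved slots per client
    private final Map<String, Deque<T>> queues = new LinkedHashMap<>();

    PerClientBuffers(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    /** Registers a client with its own empty buffer. */
    synchronized void subscribe(String clientId) {
        queues.putIfAbsent(clientId, new ArrayDeque<>());
    }

    /**
     * Delivers an update to every client's buffer without ever waiting.
     * A full buffer loses its oldest entry (assumed policy).
     * Returns how many old updates were dropped by this call.
     */
    synchronized int publish(T update) {
        int dropped = 0;
        for (Deque<T> queue : queues.values()) {
            if (queue.size() == capacity) {
                queue.pollFirst();                        // only the slow client loses data
                dropped++;
            }
            queue.addLast(update);
        }
        return dropped;
    }

    /** Returns the next pending update for a client, or null if there is none. */
    synchronized T poll(String clientId) {
        Deque<T> queue = queues.get(clientId);
        return queue == null ? null : queue.pollFirst();
    }
}
```

With this layout a client that stops polling only fills and overwrites its own buffer. The publishing path and the other clients are unaffected.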

Description :
 Client/Server communication is lost inside the JAVA client application: busy
CMW notification thread inside the JAVA client prevents any subsequent
communication (idle socket in FIN_WAIT1 left in the FrontEnd)

Cause :
 The Java client CMW thread responsible for socket operations is too busy with
data processing and therefore cannot cleanly close the communication

Occurrence :
 Collimators

Immediate cure :
 Restart the JAVA application as the blocking situation cannot be resolved

CMW proposal :
 Get more data from blocked JAVA application to confirm our hypothesis
 Organise code review with the authors of these JAVA clients to understand why
the communication threads are blocked
 Help the developers of the Java Clients to move to JAPC (as this issue is solved
using JAPC)
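The slides do not say how JAPC avoids this problem. Independently of JAPC, a general way to keep a communication thread responsive is to hand each update to a separate worker and return from the callback at once. The sketch below illustrates that pattern; `OffloadingListener` and `onUpdate` are hypothetical names, not CMW or JAPC API.

```java
import java.util.concurrent.Executor;
import java.util.function.Consumer;

/** Hypothetical listener wrapper (not CMW/JAPC API). */
final class OffloadingListener<T> {
    private final Executor workers;        // e.g. a dedicated thread pool owned by the application
    private final Consumer<T> processor;   // the potentially slow data processing

    OffloadingListener(Executor workers, Consumer<T> processor) {
        this.workers = workers;
        this.processor = processor;
    }

    /**
     * Meant to be called on the communication thread. It only submits the work,
     * so with a pool-backed executor the socket-handling thread is free again
     * as soon as the task is queued.
     */
    void onUpdate(T value) {
        workers.execute(() -> processor.accept(value));
    }
}
```

A typical choice for `workers` would be `Executors.newSingleThreadExecutor()`, which preserves update order. Its queue is unbounded, so a processor that is permanently slower than the update rate still needs a bound or a drop policy.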

Medium term plans
 Deploy Proxies with support for slow clients
 Deploy the patched JacORB library to solve the Java client blocking
situation
 Push the usage of JAPC to avoid the loss of communication in
certain Java client applications

Long term plans
 The CMW team is currently preparing a complete technical review of
the Communication Infrastructure
▪ Several clients have already been interviewed
▪ The issues of the present infrastructure have been captured and prioritised along
with the new functionality requested
▪ Several solutions have been evaluated
▪ External middleware experts have been contacted to help us confirm our choices
 The actual review will take place in October 2010
 https://wikis.cern.ch/display/MW/CMW+Review

From the e-logbook, a simple right-click on an entry will
create a JIRA issue
 Each JIRA issue is then assigned and is closely followed-up
 http://issues/browse/APS
 PS and SPS operators are making good use of this

From your browser, go to http://issues and fill in a new
JIRA issue

As a last resort:
 Avoid direct email to individuals (they might be on vacation, not
reading their mail, sick, on leave, …)
 Instead opt for the support mailing lists (e.g. [email protected], [email protected], …)