CMW status - LHC commissioning
P.Charrue – LBCM 14 Sept 2010
For the CMW team
Current CMW issues (3)
CMW middle and long term plans
How to report issues to the Controls Group
4th May 2010
P.Charrue - LBCM
Description :
JAVA clients blocked (XPOC project) and not getting data anymore from
the devices
Cause :
Socket blocking situation in the JacORB CORBA library (part of the CMW
infrastructure) – known bug in JacORB
Occurrence :
Once for the XPOC client
Often for the Logging infrastructure
Immediate cure :
Restart the client application as the blocking situation cannot be
resolved
CMW proposal :
Today: We provide a callback to the client application which detects such a
blocking situation and can take action (mail, sms, alarm, restart, log, …)
In 2 weeks: We will deliver a patch to the external JacORB library to solve
this blocking situation; the patch is currently being tested.
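As an illustration of the callback idea above, here is a minimal Java sketch of a client-side watchdog. It is not the CMW callback itself: the class name StallWatchdog, its methods and the timeout are hypothetical, and it assumes the client can record each data arrival and be checked periodically.

```java
import java.util.function.LongSupplier;

// Hypothetical helper for illustration; not part of the CMW or JacORB API.
final class StallWatchdog {
    private final long timeoutMillis;
    private final LongSupplier clock;   // supplies the current time in milliseconds
    private final Runnable onStall;     // client action: mail, sms, alarm, restart, log
    private long lastUpdateMillis;
    private boolean stalled;

    StallWatchdog(long timeoutMillis, LongSupplier clock, Runnable onStall) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
        this.onStall = onStall;
        this.lastUpdateMillis = clock.getAsLong();
    }

    // Call from the subscription handler each time data arrives.
    synchronized void dataReceived() {
        lastUpdateMillis = clock.getAsLong();
        stalled = false;
    }

    // Call periodically. Runs the callback once per stall and
    // returns true while no data has arrived within the timeout.
    synchronized boolean check() {
        boolean nowStalled = clock.getAsLong() - lastUpdateMillis > timeoutMillis;
        if (nowStalled && !stalled) {
            onStall.run();
        }
        stalled = nowStalled;
        return nowStalled;
    }
}
```

In a real client the clock could be System::currentTimeMillis and check() could be scheduled on a ScheduledExecutorService. The check must run on a thread other than the one that may block.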
Description :
CMW Proxy is blocked due to slow-consuming clients
Cause :
‘Slow clients’ subscribed to the Proxy are not consuming the data quickly enough
and block many notification threads (in the Proxy), resulting in a complete blocking
of the Proxy
Occurrence :
BBQ, Hump Buster
Immediate cure :
Kill the ‘slow client’ application as the blocking situation cannot be resolved
automatically
CMW proposal :
A new version of the Proxy has been developed that correctly handles slow
clients (by reserving processing resources for every subscribed client) and
minimizes the impact of slow consumers on the well-behaving clients
Currently being tested for the CMW-Proxy-BQ
When the tests are completed, the upgraded Proxy will be deployed in close
collaboration with Operations, by the end of this week
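One way to picture "reserving processing resources for every subscribed client" is a bounded queue per client, so that a slow consumer can only fill its own queue. The Java sketch below is an assumption for illustration, not the actual Proxy design: the class names ClientChannel and FanOut and the drop-oldest policy are hypothetical.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical per-client buffer; not the real CMW Proxy implementation.
final class ClientChannel {
    private final BlockingQueue<String> queue;
    private long dropped;

    ClientChannel(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Never blocks the publisher: when the queue is full,
    // the oldest update is discarded to make room (assumed policy).
    synchronized void publish(String update) {
        while (!queue.offer(update)) {
            if (queue.poll() != null) {
                dropped++;
            }
        }
    }

    // Called by the client side; returns null when nothing is pending.
    String poll() {
        return queue.poll();
    }

    synchronized long droppedCount() {
        return dropped;
    }
}

// Hypothetical fan-out: each subscriber gets its own bounded channel.
final class FanOut {
    private final List<ClientChannel> clients = new CopyOnWriteArrayList<>();

    ClientChannel subscribe(int capacity) {
        ClientChannel channel = new ClientChannel(capacity);
        clients.add(channel);
        return channel;
    }

    // Delivering to one client never waits on another client.
    void broadcast(String update) {
        for (ClientChannel channel : clients) {
            channel.publish(update);
        }
    }
}
```

With this layout a client that never reads loses its own oldest updates, while a client that keeps up receives every update.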
Description :
Client/Server communication is lost inside the JAVA client application: busy
CMW notification thread inside the JAVA client prevents any subsequent
communication (idle socket in FIN_WAIT1 left in the FrontEnd)
Cause :
The JAVA client CMW thread responsible for the socket operation is too busy with
data processing and therefore cannot cleanly close the communication
Occurrence :
Collimators
Immediate cure :
Restart the JAVA application as the blocking situation cannot be resolved
CMW proposal :
Get more data from blocked JAVA application to confirm our hypothesis
Organise code review with the authors of these JAVA clients to understand why
the communication threads are blocked
Help the developers of the Java Clients to move to JAPC (as this issue is solved
using JAPC)
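A common way to keep a notification thread free is to hand the data to a separate worker thread and return at once. The Java sketch below shows that general pattern only. It makes no claim about how JAPC solves the issue, and the class name OffloadingHandler and its methods are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical handler for illustration; not part of CMW or JAPC.
final class OffloadingHandler<T> {
    // Single worker thread: updates are processed in arrival order.
    // Its internal queue is unbounded; a real client may want a bound.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Consumer<T> processor;

    OffloadingHandler(Consumer<T> processor) {
        this.processor = processor;
    }

    // Called on the middleware notification thread.
    // Only enqueues the work, so the calling thread returns immediately.
    void onUpdate(T value) {
        worker.execute(() -> processor.accept(value));
    }

    // Stops accepting updates and waits for pending ones to finish.
    // Returns false on timeout or interruption.
    boolean shutdownAndAwait(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The heavy data processing then runs in the worker, and the thread that owns the socket stays available for communication and for a clean close.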
Medium term plans
Deploy Proxies with support for slow clients
Deploy the patched JacORB library to solve the JAVA client blocking
situation
Push the usage of JAPC to avoid the loss of communication from
certain JAVA client applications
Long term plans
The CMW team is currently preparing a complete technical review of
the Communication Infrastructure
▪ Several clients have already been interviewed
▪ The issues of the present infrastructure have been captured and prioritised along
with the new functionality requested
▪ Several solutions have been evaluated
▪ External middleware experts have been contacted to help us confirm our choices
The actual review will take place in October 2010
https://wikis.cern.ch/display/MW/CMW+Review
From the e-logbook, a simple right-click on an entry will
create a JIRA issue
Each JIRA issue is then assigned and closely followed up
http://issues/browse/APS
PS and SPS operators are making good use of this
From your browser, go to http://issues and fill in a new
JIRA issue
As a last resort:
Avoid direct email to individuals (they might be on vacation, not
reading their mail, sick, on leave, ...)
Instead opt for the support mailing lists (e.g. [email protected], [email protected], …)