Middleware renovation – technical overview 16th - Indico
Wojciech Sliwinski BE-CO-IN
for the Middleware team:
Felix Ehm, Kris Kostro, Joel Lauener,
Radoslaw Orecki, Ilia Yastrebov, [Andrzej Dworak]
Special thanks to: Vito Baggiolini and Pierre Charrue
Agenda
Context & Motivation for Renovation
Middleware Review process
Technical evaluation of the transport layer
Changes in the MW Architecture in LS1
Conclusions
16th April 2013
Wojciech Sliwinski, Middleware Renovation: Technical Overview
Agenda
Context & Motivation for Renovation
MW Mandate & Scope
Standard set of MW solutions
Centrally managed services
Track & optimize runtime parameters
Well defined feedback channel for users
Provide support & follow-up issues
Control System layers: GUI Applications, Control Logic, Middleware
Scope: CERN Accelerator Complex
Operational 24*7*365
Must be Reliable & High Quality
73’000 HW devices, 3’150 servers
Used in all equipment groups (4 departments: BE, EN, GS, TE)
CMW in the Controls System
[Diagram: three-tier controls architecture]
PRESENTATION TIER (operator consoles, fixed displays; general purpose network)
CMW client (C++/Java), JAPC: GUIs, LabView, RADE
JMS client (Java): GUIs
TCP/IP communication services
MIDDLE TIER (application servers, SCADA servers, file servers; CERN Gigabit Ethernet technical network)
CMW client (Java), JAPC: Logging, LSA, InCA, SIS
CMW client/server (C++/Java): Proxy, DIP, AlarmMon, AQ
JMS client (Java): servers for Logging, InCA, SIS
TCP/IP communication services
RESOURCE TIER (VME front ends, WorldFIP front ends, PLCs, timing generation, RT Lynx/OS)
CMW server (C++): FESA, FGC, GM
CMW server (C++): PVSS (Cryo, Vacuum)
Field connections: direct I/O, FIP/IO, optical fibers, PROFIBUS, WorldFIP segment (1, 2.5 Mbit/s)
Equipment: beam position monitors, beam loss monitors, beam interlocks, RF systems, etc.; quench protection agents, power converters functions generators, cryo temperature sensors; actuators and sensors for cryogenics, vacuum, etc.
LHC MACHINE
LHC MACHINE
Motivations for MW Renovation
Current CORBA-based CMW-RDA
Integrated in the Control system
Used to operate all CERN accelerators
Provides widely accepted Device/Property model
> 10 years old
Why review & upgrade the MW?
CORBA was chosen 15 years ago
Technical limitations of the CORBA-based transport
Functional limitations of the current CMW-RDA
Codebase with a long history: difficult to maintain, needs an architecture review
Major issue of long-term support & future evolution
Evolution of technology over the last 10 years: HW, OS, middleware, 3rd party libraries
Human factor: less & less CORBA expertise on the market
Technical limitations of CORBA transport
Became legacy, not actively supported → maintenance issue
Shrinking community, slow response time
omniORB (C++) – 1 developer/maintainer, last release mid-2011
JacORB (Java) – few developers, small community
Major technical limitations
Lack of fully asynchronous processing channel
Blocking communication → infamous JacORB blocking issue
Lack of low-level control of IO resources (sockets, request queues)
Development issues
Difficult to extend the wire protocol → backward compatibility issue
Complex, error-prone API
Heavy memory usage
Summary: Why change CORBA?
CORBA was chosen 15 years ago
Not actively maintained → big risk for the MW project
Better solutions exist on the market
Invest in future solution rather than maintaining old one
Functional limitations of CMW-RDA
Several pending operational issues
Difficult (or hardly possible) to resolve with current library
Any major change very difficult to introduce
○ Technical Stops & Xmas breaks too short for massive deployment
○ High risk → major impact on front-end frameworks and applications
No protection against ’slow/bad’ client applications
Misbehaving application may destabilise front-end server
Affects reliability of the subscription channel
Workaround: introduction of Proxy
Poor scalability when many clients subscribed
Stability issues observed when >200 clients subscribed (even for Proxy)
Threading model doesn’t scale well with many clients
Missing support for priority clients (e.g. SIS, PM, InCA, Logging)
Non-critical clients (e.g. GUIs) have the same communication priority
+ others …
Summary: Why change CMW-RDA?
With current CORBA-based middleware we can’t solve
the pending operational issues
We can’t provide better scalability & reliability
CMW-RDA is difficult to evolve & extend
Agenda
Middleware Review process
Middleware Renovation process
MW Renovation = MW Review + MW Upgrade
MW Review aims to provide the most appropriate technical solution satisfying the
user requirements
MW Upgrade establishes the plan & strategy for introduction of the new MW
Objective: LS1 is the unique opportunity for a major MW upgrade
Middleware Review Process
Gathering of user feedback and requirements (2010-11)
Review of communication and serialization libraries (2011-12)
Prototyping using selected communication products (2012)
Design & impl. of new RDA3: Data, Client & Server (2012-13)
Testing & validation of core MW infrastructure (summer’13)
Upgrade of all dependent MW libraries & services (2013-14)
○ JAPC, Directory Service, Proxy, DIP Gateway
Review of user requirements
2010-11 – series of interviews with major users
Lars Jensen, Stephen Jackson (BI)
Andy Butterworth, Frode Weierud, Roman Sorokoletov (RF)
Brice Copy, Clara Gaspar (DIP, DIM)
Frederic Bernard, Herve Milcent, Alexander Egorov (PVSS)
Alexey Dubrovskiy (CTF), Kris Kostro (DIP gateways)
Marine Gourber-Pace, Nicolas Hoibian (Logging)
Nicolas De Metz-Noblat (Front-Ends), Alastair Bland (Infrastructure)
Michel Arruat (FESA), Stephen Page (FGC)
Niall Stapley, Mark Buttner, Marek Misiowiec (LASER & DIAMON)
Nicolas Magnin, Christophe Chanavat (ABT)
Stephane Deghaye, Jakub Wozniak (InCA, SIS)
Vito Baggiolini, Roman Gorbonosov (JAPC & DA systems)
+ regular feedback from OP
+ internal team input
http://wikis/display/MW/Interviews+with+Experts
New RDA3: Accepted requirements
New requirement
General
Java & C++ API, Win (64-bit) & Linux (SLC5 32-bit & SLC6 64-bit)
Accelerator Device Model (i.e. Device/Property)
Get, Set, Async-Get, Async-Set, Subscribe
Early detection of communication failures
Improve error reporting in all the layers: client, server, gateways
Admin interface & runtime diagnostics & statistics
Data support
Data object: primitives, n-dim arrays, data structures
Subscription mechanism
Subscription behaviour is the same regardless of the server's condition (active, down)
Several client subscription policies (default: continuous)
Provide subscription notification ordering
First-Update enforced via CMW on server-side
○ Provide callback to front-end framework for the server-side Get
Drop support for on-change flag
Standardise use of subscription filters and update flags (e.g. immediate update)
Add a header to acquired Data carrying common metadata (e.g. acq. stamp, cycle name)
All loss of data (dropped updates) must be notified to clients
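The First-Update and dropped-update requirements above can be illustrated with a minimal sketch. It is not the RDA3 design: the class names (`Update`, `ClientQueue`, `Subscription`), the drop-oldest policy and the queue capacity are all assumptions made for illustration. The idea shown is that each client has a bounded queue, the current value is delivered first, and every delivered update reports how many updates were lost before it.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Update:
    value: object
    first_update: bool = False  # True for the initial server-side Get result
    dropped_before: int = 0     # updates lost since the previous delivery


@dataclass
class ClientQueue:
    """Bounded per-client queue (hypothetical drop-oldest policy)."""
    capacity: int
    _items: deque = field(default_factory=deque)
    _dropped: int = 0

    def push(self, update: Update) -> None:
        if len(self._items) >= self.capacity:
            self._items.popleft()  # discard the oldest pending update
            self._dropped += 1     # remember the loss so it can be reported
        self._items.append(update)

    def pop(self):
        if not self._items:
            return None
        update = self._items.popleft()
        # Report the losses to the client, then reset the counter.
        update.dropped_before, self._dropped = self._dropped, 0
        return update


class Subscription:
    def __init__(self, get_current_value, capacity=2):
        self.queue = ClientQueue(capacity)
        # First-Update: the current value is queued before any later change.
        self.queue.push(Update(get_current_value(), first_update=True))

    def notify(self, value) -> None:
        self.queue.push(Update(value))
```

In this sketch a slow client that misses updates still learns about the loss from `dropped_before` on the next update it receives, which is one possible way to satisfy the "all loss of data must be notified" requirement.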
New RDA3: Accepted requirements
New requirement
Client side
RDA3 client API connects with both: RDA2 (old) & RDA3 (new) servers
Efficient mechanism for: connection, disconnection & reconnection
Must be able to recover from any interruption of communication with the server
○ Server restarts, IP address change, rename/move of a device to another server
Improved semantics of Array Calls, i.e. handling of individual parameters
Enhanced diagnostics & collection of statistics
Server side
Policies for discarding notifications, i.e. deal with overflows and ’bad clients’
○ Instrument with counters & timings to diagnose the delivery of notifications
Prioritisation of Get/Set requests for high-priority clients
Server-side subscription tree fully managed by CMW
○ Server does not need to manage client subscriptions any more
Manage the client connections, e.g. forced disconnect of a client
Client lifetime callbacks (i.e. connected, disconnected)
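As an illustration of the "prioritisation of Get/Set requests for high-priority clients" requirement, the sketch below uses a priority queue in which high-priority requests are served before normal ones and requests of equal priority keep their arrival order. The two priority levels, the `RequestQueue` class and the client names in the usage note are hypothetical; the slides do not say how RDA3 implements this.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower number is served first.
HIGH, NORMAL = 0, 1


class RequestQueue:
    """Server-side queue of pending Get/Set requests, ordered by priority."""

    def __init__(self):
        self._heap = []
        # Monotonic sequence number keeps FIFO order within one priority.
        self._seq = itertools.count()

    def submit(self, priority, client, operation):
        heapq.heappush(self._heap, (priority, next(self._seq), client, operation))

    def next_request(self):
        """Return the next (client, operation) pair, or None if idle."""
        if not self._heap:
            return None
        _, _, client, operation = heapq.heappop(self._heap)
        return client, operation
```

With requests submitted in the order GUI (normal), SIS (high), second GUI (normal), InCA (high), this queue serves SIS and InCA first and then the two GUI requests in their arrival order.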
New RDA3: Accepted requirements
New requirement
Server side (cont.)
Client discovery for diagnostic purposes (i.e. connected clients with payload)
Enhanced diagnostics & collection of statistics
Ongoing discussions (not accepted yet)
Prioritisation of subscription notifications for high-priority clients
Technical notes
Invest in asynchronous & non-blocking communication
Prefer 0-copy & lock-free data structures, message queues
http://wikis/display/MW/Design+of+New+RDA
New RDA3: Summary of requirements
Unchanged
Device/Property model
Set of basic operations (Get, Set, Subscribe)
Fixes & improvements
Subscription mechanism
Connection management
Diagnostics & statistics
New functionality
Policies for subscription management (client & server)
Client priorities
Server-side subscription tree
Extended Data support
Standardise First-Update concept
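The unchanged part of the requirements, the Device/Property model with Get, Set and Subscribe, can be pictured with a small in-memory sketch. It is not the RDA API: the `DeviceServer` class, its method signatures and the device and property names in the usage note are assumptions, and no network transport is involved.

```python
from collections import defaultdict


class DeviceServer:
    """Toy in-memory model: values are addressed by (device, property)."""

    def __init__(self):
        self._values = {}
        self._listeners = defaultdict(list)

    def set(self, device, prop, value):
        self._values[(device, prop)] = value
        # Every subscriber of this device/property is notified of the change.
        for callback in self._listeners[(device, prop)]:
            callback(device, prop, value)

    def get(self, device, prop):
        return self._values[(device, prop)]

    def subscribe(self, device, prop, callback):
        self._listeners[(device, prop)].append(callback)
```

For example, after `server.subscribe("BPM1", "Position", handler)`, a later `server.set("BPM1", "Position", 2.0)` calls `handler("BPM1", "Position", 2.0)`, while `server.get("BPM1", "Position")` returns the latest value.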
Agenda
Technical evaluation of the transport layer
Middleware transport requirements
[Diagram: requirements pyramid, from top to bottom]
Desirable
Lightweight
Friendly API, documentation
Mandatory
Request/reply & pub/sub patterns
Asynchronous
Performance & Scalability
Stability, Maturity & Longevity
Active community
Fundamental
Open source license
C++/Java
Linux/Windows
Over TCP/IP LAN
Evaluation process –> our criteria
Appearance
• Creators: specification, documentation
• Users: forums, bug reports
• Internet
Simple usage
• Download: licensing
• Compile: Linux & gcc
• Run examples
Testing
• Communication patterns
• Performance
• Exceptional situations
• QoS
• Configuration
CRITERIA
• API, look & feel, documentation
• Resources, binary size, memory
• Community, maturity
• Communication patterns
• QoS
• Performance
Andrzej Dworak, ICALEPCS 2011
Evaluated middleware products
All opinions are based only on our knowledge and evaluation. Each of the
products, depending on the requirements, may constitute a good solution.
CoreDX
OpenAMQ
RTI DDS
Qpid
ZeroMQ
OpenSpliceDDS
RabbitMQ
YAMI
Ice
omniORB
JacORB
MQTT RSMB
Thrift
Mosquitto
Andrzej Dworak, ICALEPCS 2011
Products comparison (according to the criteria)
Criteria: Sync, async & msg patterns; QoS; Dependencies & memory footprint; Performance; Look & feel, API, docs; Community & maturity
[Table: the per-criterion marks did not survive extraction; only the total scores remain]
Product   Score
ZeroMQ    6
Ice       5
YAMI4     4
RTI       3
Qpid      3
CORBA     2
Thrift    2
Andrzej Dworak, ICALEPCS 2011
Conclusions
Several good middleware solutions available
The choice is dictated by the most critical requirements
Not an easy choice: performance matters, but so do ease of use, community, …
Prototyping was done with the most promising candidates:
ZeroMQ, Ice & YAMI
Finally we decided to choose ZeroMQ (http://www.zeromq.org/)
Asynchronous & non-blocking communication
0-copy & lock-free data structures, message queues
Nice API, good documentation & active community
New RDA3 Java – Sync Get round-trip time
[Plot: Sync Get round-trip (1kB message payload). Y axis: round-trip (ms), 0 to 18. X axis: number of clients, 0 to 1000. Series: max, average.]
Test setup: 1kB message payload, cs-ccr-* machines, 1 server host & 10 client hosts
New RDA3 Java – subscription notification latency
[Plot: Subscription notification latency (1kB message payload). Y axis: latency (ms), 0 to 250. X axis: number of clients, 0 to 1000. Series: min, max, average.]
Test setup: 1kB message payload, cs-ccr-* machines, 1 server host & 10 client hosts
New RDA3 Java – subscription notification latency
[Plot: Subscription notification latency (a closer look). Y axis: latency (ms), 0 to 6. X axis: number of clients, 0 to 200. Series: min, max, average.]
Test setup: 1kB message payload, cs-ccr-* machines, 1 server host & 10 client hosts
Agenda
Changes in the MW Architecture in LS1
Current MW Architecture
[Diagram; legend: User written, Middleware, Central services]
Clients: Java Control Programs; VB, Excel, LabView; C++ Programs
Client-side middleware: JAPC API, Passerelle C++, RDA Client API (C++/Java), Device/Property Model
CMW Infrastructure: CORBA-IIOP
Central services: Administration console, Configuration Database (CCDB), Directory Service, RBAC A1 Service
Server-side middleware: RDA Server API (C++/Java), Device/Property Model
Servers (each with CMW integration): Virtual Devices (Java), FESA Server, FGC Server, PS-GM Server, PVSS Gateway, More Servers
Physical Devices (BI, BT, CRYO, COLL, QPS, PC, RF, VAC, …)
Changes in MW Architecture in LS1
[Diagram; legend: User written, Middleware, Central services, Upgrade in LS1]
Clients: Java Control Programs; VB, Excel, LabView; C++ Programs
Client-side middleware: JAPC API, Passerelle C++, RDA Client API (C++/Java), Device/Property Model
CMW Infrastructure: ZeroMQ
Central services: Administration console, Configuration Database (CCDB), Directory Service, RBAC A1 Service
Server-side middleware: RDA Server API (C++/Java), Device/Property Model
Servers (each with CMW integration): Virtual Devices (Java), FESA Server, FGC Server, PS-GM Server, PVSS Gateway, More Servers
Physical Devices (BI, BT, CRYO, COLL, QPS, PC, RF, VAC, …)
LS1: Changes in RDA
New major version: RDA3 (June’13 – alpha version)
Public API NOT backward compatible
New protocol, new architecture, new design
Same Device/Property model & Get/Set/Subscribe calls
Announcement via accsoft-java-announce list
Required Actions for RDA Users
For Java: Use new version of JAPC (API unchanged)
For Java: New JAPC will support communication with RDA2 & RDA3 servers
For C++: Upgrade user code to new RDA3 API
For C++: RDA3 will support communication with RDA2 & RDA3 servers
Consequences if NO action is taken (staying with old RDA2)
NOT possible to communicate with new RDA3 servers (FESA3, FGC, etc.)
LS1: Changes in JAPC
New major JAPC version: upgrade for RDA3 (September’13)
Public API backward compatible
Possible API extensions, but always compatible
Announcement via accsoft-java-announce list
Required Actions for JAPC Users
Update JAPC jars (via CommonBuild)
Re-release your product (via CommonBuild)
New JAPC will support communication with RDA2 & RDA3 servers
Middleware Team
Wojtek Sliwinski (Lead) 100% – Directory, RDA, Proxy, RBAC
Felix Ehm 30% – JMS, Log/Tracing, Feedback/Metrics
Joel Lauener 90% – CMW Admin, Directory, RDA, GM, DIP Gw.
Kris Kostro 20% – DIP Gateways, RDA3
Wojtek Buczak 30% – JAPC Core
Ilia Yastrebov 100% – RDA, RBAC, Passerelle, Proxy, Log
Radoslaw Orecki 100% – Directory, RDA3
Support: [email protected], [email protected]
Docs: http://wikis/display/MW
Conclusions
We have to replace CORBA with a new solution
We want to resolve the pending operational issues
We collected updated user requirements
A new product (ZeroMQ) was chosen to replace CORBA
We will minimize changes to client SW during LS1