EPICS Asia 2004 Tokai, Japan December 8-10

EPICS Asia 2004
Tokai, Japan
December 8-10
http://www-linac.kek.jp/epicsasia2004
and later
http://www-acc.kek.jp/WWW-ACC-exp/EPICS_Gr/EPICSAsia2004
Where is Tokai?
Site Map
Projects
• JPARC
3.5 Billion; 7 years; ready in 2008
• Shanghai Light Source
150 Million; 4 years; ready in 2009
(does not include salaries)
• LCLS
• SNS – complete April 2006
• Pohang – XFEL 80 Million; 2009
Shanghai Map
SNS Now
SNS Control Room
General Impressions
• RTEMS is an accepted system
• LINUX is widely used
• EPICS knowledge “ramp up” time is an
issue
• RDB support around EPICS is a hot topic
• “Micro-EPICS” is being looked at by many
• 3.14 OSI is widely appreciated
Things which got my notice
• SNS Archiver
• SNS Network analysis
• KGB black-box serial
(and GPIB) support
• Argonne Virtual Linac
“Epics on a CD”
• Argonne Training Series
• rdbCORE collaboration
• New Java CA
• Enhanced Access Control
• Asyn + stream device
• Timing system collaboration (TIMO is everywhere)
• VDCT collaboration
• EDM progress
• 3.15 and 4.0 Plans
Archiver Issues Attacked on Four Fronts

Archiver Engine
» New LANL archive engine is
now running (Kasemir)

Archive Retrieval Tools
» New Java-based client-server
archive viewer has been
deployed. (Chevtsov)
» Collaboration with JLab,
LCLS and Cosylab

Archive Monitoring Tools
have been deployed
» Averaging ~3GB/day

Work has started on
Archiver data management
tools.
Archiver Status

SNS: Exclusively using the new toolset,
» 25 archive engines running,
» accumulated over 100GB in only about 6 months.
BESSY: Successfully added new "Index" files to 200GB of data.
Only known problem with Engine:
» ChannelAccess 3.14.6 sometimes hangs on shutdown when there are
disconnected channels.
» Hasn't happened for weeks, supposed to be solved in the not yet available next
release of base.
Development of Java Viewer is out of the woods:
» No crashes, more features than ever
» multiple axes, show original samples, indicate disconnected/invalid, ...
Biggest hassle:
» Data management (backups, moving older data to other computers, ...) isn't
automatic, possibly never will be, and requires one person for about 8h each
week.
Biggest user complaint:
» CGIExport, the data access via any web browser, is gone. Though inferior to the
Java Viewer, users who remember it miss the zero-install CGIExport.
» Java Viewer offers more.
» Java Webstart is almost as convenient as CGIExport, yet requires installation of
viewer.
ICS – Software Engineering Group
Network Tools are Integrated with
EPICS
Network traffic integrated with EPICS via devSnmp device support*
-- data is now displayed via EDM
-- data is now stored in the EPICS archiver
*SNMP Driver from LANL
VxWorks Network Stack Flow
[Diagram: data path from the user application down to the network chip]
• User Application: WRS Standard Socket Interface (UDP/TCP/Raw), read() and write()
• Socket Send Buffer/Queue and Socket Receive Buffer/Queue
• TCP Fragment Reassembly Queue
• IP: IP Send Queue (50 packets), IP Receive Queue (50 packets), IP Fragment Reassembly Queue
• ARP Receive Queue (50 packets)
• tNetTask (taskPriority = 50)
• Network Stack Memory Pools: Data Clusters (330), Sys Clusters (140)
• Driver Memory Pools: numTds (64), numRds (32), loanBufs (16)
• DMA to and from the Chip (dec21x40)
CISCO’s World
Physical layer
Protocols we deal with in EPICS
• UDP port 5065
– CA beacons (“I am here” Heartbeat)
– Used to re-establish CA TCP virtual circuits
– CA beacons do not expect any replies
– The CA Beacon Daemon is listening on UDP port 5065.
• A.K.A. caRepeater
• UDP port 5064
– CA search message
– A response is expected within some timeout interval
• TCP port 5064
– CA server establishes a virtual circuit on port 5064
• NFS
– UDP port 111 (the RPC portmapper, used to locate the NFS service)
• Loading up IOC application
• Running autosave/restore
• Re-directing IOC files to boot server
• NTP
– UDP port 123
• Keep IOCs' time in sync
• At SNS we should see this about every 10 seconds in our current configuration.
• RSH
– TCP port 514
• Remote login support
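The port list above lends itself to a small lookup table for labelling captured traffic. The following is a minimal sketch, not a tool from the talk: the names KNOWN_PORTS, classify and summarize are hypothetical, and the input is assumed to be (transport, destination port) pairs exported from a packet capture.

```python
from collections import Counter

# Hypothetical lookup table built from the ports listed on the slide.
KNOWN_PORTS = {
    ("udp", 5065): "CA beacon (caRepeater)",
    ("udp", 5064): "CA search",
    ("tcp", 5064): "CA virtual circuit",
    ("udp", 111): "NFS (portmapper)",
    ("udp", 123): "NTP",
    ("tcp", 514): "RSH",  # rsh is conventionally carried over TCP
}


def classify(transport, port):
    """Return a protocol label for a (transport, port) pair, or "other"."""
    return KNOWN_PORTS.get((transport.lower(), port), "other")


def summarize(packets):
    """Count packets per protocol label; packets is an iterable of (transport, port)."""
    return dict(Counter(classify(t, p) for t, p in packets))


if __name__ == "__main__":
    capture = [("udp", 5065), ("udp", 5065), ("tcp", 5064), ("udp", 123), ("tcp", 80)]
    for label, count in summarize(capture).items():
        print(f"{label}: {count}")
```

A per-label count like this is one simple way to see at a glance whether beacons, searches or NFS dominate a capture.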
Network Traffic Analysis
Ethereal Packet Analysis Timeline
PowerOn IOC
T - 0 sec
Bring NIC online
T + 3 sec
EPICS neighbors come
T + 3.02 sec
Network Traffic Analysis
Ethereal Packet Analysis Timeline
(Cont’d)
Warning!!
NFS is heavy
Why is a retransmit necessary, hmmm?
Network Traffic Analysis
(Annotated)
Load vxWorks
NIC restart
Startup.cmd
Load EPICS
Heavy NFS
Still loading EPICS
More NFS
iocInit is ready
EPICS is running
AutoSave/Restore
Heavy NFS
Normal Work
Reboot IOC
Results/Conclusions
• The Network Analysis allows tuning of the network stack from
a priori information as well as empirical data collected from the
real environment.
• We have discovered some devices on our network that have
improper configurations and hence cause unnecessary traffic.
• We have discovered that NFS is really a heavy hitter and that
autosave/restore request files should be stored in one location.
• We have discovered that IGMP snooping must be supported on
the CISCO edge switches to contain Allen Bradley Control Logix
PLC multicast traffic. Multicast traffic should be contained in
general.
– We moved from the CISCO 3500 series to the CISCO 2950 series
• CISCO 3500 series only supported CGMP snooping
• We learned that sometimes IOC application errors are the main
cause of Network Stack Exhaustion and/or failure.
• We have added an “open-source” network sniffer (Ethereal) to
our EPICS Network trouble-shooting ToolKit.
• We have built the Network diagnostics show routines from
WRS into our IOCs’ common support library.
Some more explanations:
• This microIOC should be a black box for installation:
– with a built-in EPICS database
– already with preconfigured records
– everything must be very user friendly, with wizards, in a
plug&play manner
• And made of standard components:
– an Ethernet 10/100 MBit connector
– an onboard linux/RTEMS processor
– a Web server for configuration and viewing
– Off-the-shelf parts to replace
• No moving parts (fan, disk) to break in the first place
Enter microIOC
small embedded computer interfacing different devices
Small
Ethernet
Fanless & Diskless
Ultra Low Power
PoE – Power Over Ethernet
Various IO
Analog & Digital IO
Motor & Control
Implementation Details
• dual Ethernet ports allow microIOCs and devices to be
separated from the rest of the control system
• available with Linux and RTEMS operating
systems and on request with vxWorks
– Giving reasonable performance and realtime
• database can be persisted in flash, avoiding
problems due to network failures
• hardware components of the microIOC are of
high quality and have long life times
– PS has 500,000 h MTBF (about 57 years)
• by design, mechanical parts such as hard disks
and fans are avoided
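As a quick check on the power-supply figure, the MTBF can be converted from hours to years. This is a minimal sketch assuming continuous operation and an average year of 365.25 days; the function name is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365.25  # average year, continuous operation assumed


def mtbf_years(mtbf_hours):
    """Convert an MTBF given in hours to years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


if __name__ == "__main__":
    print(round(mtbf_years(500_000), 1))  # 57.0
```

MTBF is a statistical average over a population of units, so it should not be read as the expected lifetime of any single power supply.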
The Main User Features Are:
• completely stand-alone, no VME/PCI or boot PC
necessary
• plug&play: configure IP (DHCP), connect cables and it
works
• simple configuration through Web server, built-in EPICS
db
– VDCT preconfigured db file for standard devices: PLCs, vacuum,
timing, motor control and monochromators
– a simple wizard to configure record names and constants
• installed EDM, Java and Web-based panels for display
and setting
• monitor system health
• upgrade management
• professional support and replacement contract as option
• lower price than a comparable VME system
Argonne Training Aids
• Vlinac – EPICS on a CD demo app
Installs on Windows, Mac, Linux, Solaris
• Lecture Series –
– Online
– Grouped for different audiences
– PowerPoint + streaming video
Available at:
http://www.aps.anl.gov/aod/bcda/epicsgettingstarted
rdbCORE
• SNS used “proactive” RDB support
(to produce EPICS stuff; what they’d like)
• Argonne used “reactive” RDB support
(to analyze EPICS stuff; what really exists)
• Collaboration to identify commonalities
– Not ORACLE dependent
– Not site specific
– Extendable
First Step – identify common
needs
[Diagram: the SNS RDB and the APS RDB (IRMIS) overlap in a shared rdbCore. Labels on the diagram: IRMIS/PV crawler; JERI; VDCT, vi, scripts, ...; xml; MPS, ...; template substitution values (read only); .db files; XAL; IOCBoot/IOCcore; XAL Applications]
Current Efforts
Plans are still developing … but as of today
…
– First tables of rdbCore
• PV database (every field of every record)
• Installed device database
– Control Flow/Housing/Power
• Cable database
– First Tools
• ‘Controls Framework’ extension of XAL access rdbCore
• st.cmd crawler to populate PV database
• PV Viewer
• “vcct” – Visual Connection Configuration Tool
– View relationships between installed devices
– Cable Editor/Viewer
Primary Tables
• Process Variable Table (of rdbCore)
– Custom record definitions (and even modified record definitions) are
recognized
– 100% self-populated by “st.cmd crawler” that interprets dbLoadRecords &
dbLoadDatabase lines
• Need a plan to accommodate other CA servers
– “extensions” to rdbCore can be added to reference client use of all PVs
• Crawl through MEDM, ALH, Archiver config files
– “Generic SQL” which can generate Oracle or MySQL tables
– Contains an entry for each Process Variable (record.field) name loaded into
an IOC
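The idea of a crawler that interprets dbLoadRecords and dbLoadDatabase lines can be sketched as follows. This is an illustration only, not the rdbCore crawler itself: the function names, the regular expression and the sample startup script are assumptions, and a real crawler would go on to expand each loaded .db file into its record.field names.

```python
import re

# Matches lines such as dbLoadRecords("db/x.db", "P=a,R=b") or dbLoadDatabase("dbd/x.dbd").
LOAD_RE = re.compile(
    r'^\s*(dbLoadRecords|dbLoadDatabase)\s*\(?\s*"([^"]+)"\s*(?:,\s*"([^"]*)")?'
)


def parse_macros(macro_string):
    """Turn "P=SNS:,R=Vac1" into {"P": "SNS:", "R": "Vac1"}."""
    macros = {}
    for pair in macro_string.split(","):
        if "=" in pair:
            key, value = pair.split("=", 1)
            macros[key.strip()] = value.strip()
    return macros


def crawl_st_cmd(text):
    """Return one dict per database load command found in a startup script."""
    loads = []
    for line in text.splitlines():
        if line.strip().startswith("#"):
            continue  # skip commented-out lines
        match = LOAD_RE.match(line)
        if match:
            command, path, macros = match.groups()
            loads.append(
                {"command": command, "file": path, "macros": parse_macros(macros or "")}
            )
    return loads


if __name__ == "__main__":
    # Hypothetical startup script used only for demonstration.
    sample = '''
dbLoadDatabase("dbd/example.dbd")
dbLoadRecords("db/vac.db", "P=SNS:,R=Vac1")
#dbLoadRecords("db/old.db", "P=X")
iocInit
'''
    for entry in crawl_st_cmd(sample):
        print(entry)
```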
Primary Tables
• Installed Devices Table (of rdbCore)
– Contains an entry for every replaceable component installed in the
control system.
– Each device is fully described by the following hierarchies:
• Control parent – What is it connected to?
• Housing parent – What is it housed in?
• Power parent – What is it powered by?
– 40-70% self-populated by EPICS business rules (INP/OUT fields,
configDevice(), dbior, etc)
• Cable Table (of rdbCore)
– Contains an entry for every cable installed in the control system
– Uses ports on “installed devices” as source and destination
Primary Tables
• PV Table, Installed Device Table, Cable
Table provide numerous relationships for
advanced queries
– What PVs will be affected by a particular device failure?
– What PVs will be affected if this cable is disconnected?
– What set of devices could cause a particular set of PVs to all be INVALID?
• And with “extended” tables …
– What applications (MEDM displays, scripts, XAL apps, etc) will be affected
if this device is powered off?
– What applications (MEDM displays, scripts, XAL apps, etc) will be affected
if this breaker trips?
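A query such as "what PVs are affected if this breaker trips?" amounts to walking the power-parent hierarchy and collecting the PVs attached to every device downstream. The sketch below uses an in-memory SQLite database with a deliberately simplified, hypothetical schema; the table and column names and the sample rows are assumptions, not the actual rdbCore tables.

```python
import sqlite3

# Hypothetical, simplified schema: each device has at most one power parent,
# and each PV belongs to one device.
SCHEMA_AND_DATA = """
CREATE TABLE device (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    power_parent INTEGER REFERENCES device(id)
);
CREATE TABLE pv (
    name TEXT PRIMARY KEY,
    device_id INTEGER REFERENCES device(id)
);
INSERT INTO device VALUES (1, 'breaker-A', NULL);
INSERT INTO device VALUES (2, 'rack-1-psu', 1);
INSERT INTO device VALUES (3, 'ioc-vac-1', 2);
INSERT INTO device VALUES (4, 'ioc-mag-1', NULL);
INSERT INTO pv VALUES ('VAC:Gauge1.VAL', 3);
INSERT INTO pv VALUES ('VAC:Pump1.VAL', 3);
INSERT INTO pv VALUES ('MAG:Quad1.VAL', 4);
"""

# Recursive query: start at one device and follow power_parent links downward.
AFFECTED_PVS = """
WITH RECURSIVE downstream(id) AS (
    SELECT ?
    UNION
    SELECT device.id FROM device JOIN downstream ON device.power_parent = downstream.id
)
SELECT pv.name FROM pv JOIN downstream ON pv.device_id = downstream.id
ORDER BY pv.name
"""


def build_demo_db():
    """Create the in-memory demonstration database."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA_AND_DATA)
    return db


def pvs_affected_by(db, device_id):
    """List PVs on the given device or on anything it powers, directly or indirectly."""
    return [row[0] for row in db.execute(AFFECTED_PVS, (device_id,))]


if __name__ == "__main__":
    demo = build_demo_db()
    print(pvs_affected_by(demo, 1))
```

The same pattern would apply to the control and housing hierarchies by following a different parent column.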
Primary Tools - VCCT –
Control/Housing/Power
JCA
• KGB (Cosylab)
• Not based on JNI
• Version 4.11 of CA
• Ken Evans gave high praise
• Simple to change from old to new JCA
• Leader/Follower design pattern
• JCA “lite” is possible (one thread only)
Enhanced Access Control
• Desire to “reserve” group of PVs for
exclusive write
• Many ideas put forward
• Leading candidate seems to be an
enhanced gateway process which uses
extra connections to dynamically manage
access control
ASYN and Stream Device
• Marty Kraimer’s asyn is THE way to do
serial I/O drivers
• Dirk Zimoch’s Stream Device is THE way
to implement new devices
• They have formed a mutual admiration
society
Timing Collaboration
• Timo Korhonen is everywhere.
• Argonne/SLS/Diamond EVG/EVR design
is now ubiquitous
– Shanghai
– Pohang
– JPARC
VDCT status
• Latest SNS-funded work is done
• VDCT is now THE tool of choice
EDM Status
• Converters from MEDM and EDD/DM now
work well and are as complete as possible
• One big drawback is non-resizability
• SNS is all EDM
Epics Futures
• 3.14.7 just announced
(“more stable than 3.14.6” – Marty)
• 3.15 features and schedule announced
• 4.0 features and schedule announced
• Emphasis on scheduled delivery
How Did We Get Here
• There was a series of EPICS 2010 meetings that were
organized to develop a grand plan and secure funding
• A large list of capabilities and technologies was
collected; however, we were not able to generate any
serious funding.
• The control group at Argonne and I met to discuss
these issues and determine how we could start
moving.
• A long list of interesting items was produced.
• It was reduced to compelling IOC Core issues.
• That list and plan is now to be presented for
comment/support/participation.
Feature/Plan for 3.15
• compile the DBD into the database
– Andrew – Jan-March 05
– Include the dbd file into the db file
• Device support for online add
– Andrew – March-Sept 05
– Add calls before and after addresses change
– Support addition of new hardware during operation
– Add calls to support removal of hardware
• Online Add
– Andrew – March-Sept 05
– Add new record instances during operation
– Connects to existing hardware
– Connects to existing records (or they become ca links)
• Remove Annoying Things (REALLY ANNOYING THAT IS)
• Jan 01-06 release
Plan for 4.00
• CA V4 Functional Specification
– Func. Spec. Team - Jan-March 05
• CA V4 Design
– Small Design Team – March-June 05
• CA V4 Implementation
– Small Implementation Team – July – July 06
• Extensible Links
– Andrew – July – January 06
• Database V4 Functional
– Func. Spec. Team – Jan- March 05
• Database V4 Design
– Design Team – March – June 05
• Database Access V4 Implementation
– Small Implementation Team – July – July 06
Features in 4.00
• Everything we have now and…..
• Arbitrary Strings
• Array support to include: subarrays, frequency, offset, dimensions
• Aggregate Data - a set of channels treated as one variable; get/put in one IOC
• Client priorities
• Monitor options to include: on another value, rate, rate if changed, %change,
dev
• Transaction support – multiple actions with acknowledge in a single transaction
• Group metadata into types and request independently: time, graphics, alarm,
ctrl, statistical
• Redundant Clients to a server
• Redundant Name Server, Aggregate data source, metadata source
• Redundant IOCs to a PLC
• Opaque or complex data support
• Provide simple data records that can build complex records
• History data requests??
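To make the proposed "%change" monitor option concrete, here is one possible reading of it as a filter on a stream of values. The slide does not define the semantics, so this is purely an assumption for illustration: an update is reported only when the value has moved by at least a given percentage relative to the last reported value, and the function name is hypothetical.

```python
def percent_change_filter(samples, threshold_percent):
    """Yield only samples that differ from the last reported one by >= threshold_percent."""
    last_sent = None
    for value in samples:
        if last_sent is None:
            changed = True  # always report the first sample
        elif last_sent == 0:
            changed = value != 0  # relative change is undefined at zero
        else:
            changed = abs(value - last_sent) / abs(last_sent) * 100.0 >= threshold_percent
        if changed:
            last_sent = value
            yield value


if __name__ == "__main__":
    readings = [100, 101, 104, 106, 120, 121]
    print(list(percent_change_filter(readings, 5)))  # [100, 106, 120]
```

Comparing against the last reported value, rather than the previous sample, keeps slow drifts from being suppressed indefinitely.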
Everything We Have Now
• Performance must remain (or improve)
• Name resolution
• Get/Monitor/Put/Put w/ Callback/Put
completion
• Events for notification: value, archive,
alarm
• Conversion to native type
• Automatic reconnect
• All Current Metadata Supported
Features that are not included in the plan
• Regular meetings with key collaborators
• Fund living expenses for short term collaborators
• Linux real-time performance enhancements/evaluation
• Improve control package for the database
• Record management
• Operating System support
• Device/driver verification
• Platform support
• Solution support for devices
• Error handling / logging
• Secure Channel Access
• Ease of use – à la LabVIEW
• Solicit annoyances from the community
• Tools extensions –
• Framework for tightly coupled applications
• Relationship between EPICS and Access Grid
• Include standard functions in most used utilities
(edm/medm, alh, stripTool, archiver, save/restore, warm reboot, pvgateway,
Nameserver, consistent configuration across these tools, high level api (XAL?))
Features that are not included in the plan
• VDCT
• IOC applications IDE (esp. 3.14)
• rdbCore
• Collaboration support
• Enhance web site
• Enhance training
• Enhance documentation
• Improve release testing
• Centralize device support
• Coordinate release distribution
• Consistent format for contributed modules
• Maintain rec Ref Manual
• Organize regular training
• Keep current on CPUs/BSPs
• Develop EPICS primer
• Coordinate collaboration meetings
• Develop automatic test suites
• Quality contributed modules
• Support and enhance popular extensions
• Provide exhaustive cross-platform testing and development hardware
• Searchable database of supported devices and record types
Conclusions
• This is an attempt to restart the development effort in
IOC core
• APS has dedicated some limited resources to this
effort. More support is needed.
• Many items are not included in this list
– Many have contributed as part of their project development
– Our ability to provide new direction and improved tools
depends on projects contributing on a continuing basis
EPICS – Today’s Control
System of Choice
… but what about tomorrow?
Ned Arnold
12/10/2004
The Collaboration Grows …
[Timeline figure, axis labelled 1990, 1995, 2000, 2005]
• LANL develops Ground Test Accelerator Control System (GTACS)
• ANL APS & LANL begin collaboration on EPICS
• Sites shown joining along the timeline: CEBAF, KEK, KECK, BESSY, LBL, IPNS, SLAC, DESY, FNAL D0, SLS, SNS, IPNS DAQ, RIA, Gemini, CLS, DIAMOND, Australian Synchrotron
• 2003: > 150 Licensees
• May 2004 Meeting: 100+ Attendees, 34 Institutions, 75+ Presentations
• Still the “Control System of Choice”
Reflection on the Collaboration
• Pros
– Synergy – yields many good ideas
– Common requirements have a single solution
– An extremely beneficial and effective collaboration
– Large installed base improves robustness
• Cons
– No one is in charge – tasks cannot be delegated or assigned
– Contributors do what they want to do [i.e. the un-fun yet
important work is often left undone]
– When money gets tight, there is less effort donated to the
community good
The Cons are Starting to be Felt
• No one’s in charge …
– In light of advancing technology, some of EPICS’ core facilities
could be substantially enhanced
• More good ideas than resources, no common direction
• Contributors do what they want to do …
– Many of the things required to enhance EPICS user support are
not voluntarily addressed by the collaboration
• Consistent documentation
• Consistent method in packaging/testing contributions
• Money getting tight …
– Construction Funds: Running out … not much on the horizon
– Operations Funds: Already being used …being pinched
Looking to the Future
• What has been attempted?
– EPICS 2010 Meetings (three in the last two
years)
• Good discussions
• The seed has been sown …
• An attempt was made to move forward …
• It is now time to fertilize, water, and make
something grow
Looking to the Future
• What is Needed?
– “A plan”
– “A plan” that encourages the Pros and
mitigates the Cons
– “A plan” that receives wide acceptance and
support
• … which then becomes “The Plan”
– Implementation of “The Plan”
– Demonstration that we can deliver on “The
Plan”, which will build additional momentum
for further plans
Vision … (i.e. A Plan)
• Establish and promote a direction for the
future
• Aggressively solicit resources from
collaborating institutions
• Make something happen
Establish a direction …
• Divide the numerous topics into four
categories
– EPICS Core
• iocCore
• Channel Access
• SNL
– Core Tools/Extensions
• Display Manager, ALH, Archiver, StripTool, Gateway
• VDCT
• Other? (rdbCore?)
– Collaboration Support Issues
– Ideas and R&D for EPICS 5.0 and beyond