No Slide Title - icalepcs 2005
Key Topics
NIF Overview
Distributed Component Architecture
Distribution Complexity
Connection Management
Diagnostic Framework
Shot Automation
Commissioning Tools
Software Framework Evolution
NIF Facility flyover
NIF is a stadium-sized facility that will contain a 192-beam, 1.8-megajoule, 500-terawatt ultraviolet laser system
and a 10-meter-diameter target chamber with room for nearly 100 diagnostics
Symmetry & nomenclature
National Ignition Facility
“Quads” and “Bundles” are the basic
building blocks of the NIF
The Integrated Computer Control System (ICCS)
for NIF is based on a scalable software framework
Object-oriented Ada95, Java, CORBA
Ada implements control system semantics
Java implements GUI layer and COTS integration
CORBA provides transparent language binding and distributed
communication using TCP/IP transport
60,000 control points, 140,000 CORBA objects, 750 computers
Framework Approach
[Figure: the Framework Toolkit (Architecture & Environment, Application Patterns, Customizable Software Components) is used to build the NIF Control System. Labels in the control system diagram: Operators; Supervisory Controls; Subsystems; Bundle Managers; Automatic Controls; Common Services (Configuration, Logging, Archive, etc.); Front-End Processors; "Plug-in" Interconnections; Distribution Mechanism; LRU actuators, sensors, devices.]
ICCS Distributed Component Architecture
[Figure: Framework Server, GUI Navigator, Status & Control Supervisor, Shot Supervisor, and FEP (device) processes, each containing a framework layer (FWL), connected through CORBA.]
A framework layer (FWL) resides in each process
Process Kind                    Process Count    CORBA Objects
Framework Server                           10              338
Status & Control Supervisor               219            3,736
Shot Supervisor                           249            3,864
Front-End Processors (FEP)                700          134,916
GUI Navigator                              50              250
Totals                                  1,228          143,104
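The process and CORBA object counts listed above can be cross-checked with a few lines of arithmetic. The short Python sketch below simply sums the per-kind figures from the slide and reports the share of objects hosted by front-end processors; the dictionary layout and variable names are illustrative and not part of ICCS.

```python
# Per-kind (process count, CORBA object count) figures as listed on the slide.
counts = {
    "Framework Server": (10, 338),
    "Status & Control Supervisor": (219, 3736),
    "Shot Supervisor": (249, 3864),
    "Front-End Processors (FEP)": (700, 134916),
    "GUI Navigator": (50, 250),
}

total_processes = sum(processes for processes, _ in counts.values())
total_objects = sum(objects for _, objects in counts.values())

# Fraction of all CORBA objects that live in front-end processors.
fep_share = counts["Front-End Processors (FEP)"][1] / total_objects

print(total_processes, total_objects)   # 1228 143104
print(f"FEP share of objects: {fep_share:.1%}")
```

The sums agree with the Totals row, and the large majority of CORBA objects sit in the front-end processors, which is why bounding object populations per bundle matters for scaling.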
Common Application Process Architecture
[Figure: a single application process. Connection Objects and Application Objects sit on top of the framework services (labels: Object Factory, System Manager, Status Monitor, Message Log, Shot, Machine Config, Alerts, Events, Reservations, Archive, History, Name Service) and the Framework Agents (Startup/Shutdown, Heartbeat, Diagnostic Agent). Communication uses CORBA over TCP, plus UDP.]
• Common startup and shutdown protocol
• Behavior of application processes is completely data driven
• Service distribution encapsulated by Framework Service APIs
Distribution Complexity
Layered client-server computer architecture
340 distinct CORBA interfaces
700 Front-End Processors (PowerPC, x86, Sun) interface to
various sensors, actuators, instruments
50 server class computers (Sun) host supervisor software and
framework servers
14 console PCs host Java GUIs in the control room
Bundle-based hardware partitioning eliminates scaling risk by
bounding CORBA object populations and TCP connections
CORBA provides transparent language binding (Java, Ada95) and
location independent inter-process communication
Policies define interface de-coupling mechanisms and common
exception pattern
ICCS employs a component-based communication
architecture with connection management
Decoupled inter-process communication reduces deadlock
potential
Object reconnection allows transparent process restarts
Subscription management restores all subscription services
Process heartbeats verify process health
Status health heartbeats provide positive status health feedback
Timed remote invocations protect clients from problematic services
ICCS provides fault resilience – degraded operation in the
presence of server failure and recovery upon server restoration
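The heartbeat and timed-invocation ideas above can be illustrated with a small Python sketch. This is not ICCS code (ICCS is implemented in Ada95 and Java over CORBA); `HeartbeatMonitor`, `timed_invoke`, and the process names are hypothetical, and a thread pool stands in for a remote call. It only shows the general pattern: a monitor flags processes whose heartbeat is overdue, and a client bounds how long it waits on a service.

```python
import concurrent.futures
import time


class HeartbeatMonitor:
    """Records the last heartbeat time per process and reports overdue ones."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}

    def beat(self, process_name, now=None):
        # 'now' is injectable so the logic can be exercised without sleeping.
        self.last_seen[process_name] = time.monotonic() if now is None else now

    def stale(self, now=None):
        now = time.monotonic() if now is None else now
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


def timed_invoke(fn, timeout_s, *args):
    """Call fn in a worker thread; raise TimeoutError if no reply arrives in time."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        raise TimeoutError(f"no reply within {timeout_s} s") from None
    finally:
        # Do not block the caller on a hung service; the worker is abandoned.
        pool.shutdown(wait=False)


monitor = HeartbeatMonitor(timeout_s=2.0)
monitor.beat("fep_a", now=0.0)
monitor.beat("fep_b", now=3.0)
print(monitor.stale(now=4.0))  # ['fep_a']
```

In this sketch a caller that receives `TimeoutError` can mark the service as degraded and retry after the monitor sees heartbeats again, which mirrors the degraded-operation and recovery behavior described on the slide.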
Diagnostic Architecture Framework
“Out-of-CORBA-band” probes (UDP)
System Level Diagnostics
Network, I/O, memory, CPU
SNMP based
Process Level Diagnostics
Diagnostic Agent embedded into ICCS process architecture
Custom diagnostic objects register with the Diagnostic Agent
Remotely activated and displayed
Stored for off-line analysis
Supports framework and application diagnostic classes
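The registration and remote-activation pattern above can be sketched as follows. This is an assumption-laden illustration in Python, not the ICCS Diagnostic Agent: the class name, the JSON message format, and the `list`/`activate`/`sample` commands are all invented for the example. The slide only states that custom diagnostic objects register with an agent embedded in each process and that probes travel over UDP, outside the CORBA channel.

```python
import json


class DiagnosticAgent:
    """Holds registered diagnostic probes and answers probe datagrams."""

    def __init__(self):
        self._probes = {}     # probe name -> zero-argument sampling function
        self._active = set()  # probes that have been remotely activated

    def register(self, name, sample_fn):
        self._probes[name] = sample_fn

    def handle_probe(self, datagram: bytes) -> bytes:
        """Decode one request datagram and return the encoded reply."""
        request = json.loads(datagram.decode("utf-8"))
        command, name = request.get("command"), request.get("name")
        if command == "list":
            reply = {"probes": sorted(self._probes)}
        elif name not in self._probes:
            reply = {"error": f"unknown probe: {name}"}
        elif command == "activate":
            self._active.add(name)
            reply = {"active": sorted(self._active)}
        elif command == "sample":
            if name in self._active:
                reply = {"name": name, "value": self._probes[name]()}
            else:
                reply = {"error": f"probe not active: {name}"}
        else:
            reply = {"error": f"unknown command: {command}"}
        return json.dumps(reply).encode("utf-8")


def serve_once(agent, sock):
    """Answer a single request on an already-bound UDP socket."""
    data, addr = sock.recvfrom(4096)
    sock.sendto(agent.handle_probe(data), addr)


agent = DiagnosticAgent()
agent.register("queue_depth", lambda: 7)  # hypothetical application diagnostic
```

Keeping the request handling separate from the socket code means the same agent logic can be driven from a UDP loop in a running process or called directly when analyzing stored requests off-line.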
Shot Automation Framework
Requirements derived from NIF Early Light experience
Intensive, year-long design and development effort
Achieves NIF scale through bundle partitioning
State machine guides operators through 10 well-defined shot cycle states
Shot model describes (in data) subsystem activities and dependencies
A workflow engine executes the shot model
Calculated participation based on laser beam destination and diagnostic use
(only control components that are in the beam path)
Support error recovery
Operators given Pause/Play, Stop, Manual Step, Retry, and Resume
Automation semantics
Can de-participate non-performing non-critical components/segments
Shot model editor provides flexibility to define different NIF operating
scenarios
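A data-described shot model executed by a workflow engine, as outlined above, might look roughly like the Python sketch below. Everything in it is hypothetical: the step names, the `component`, `needs`, and `critical` fields, and the `run_shot_model` function are invented for illustration and do not reflect the actual ICCS shot model or its ten shot cycle states. It shows three of the ideas from the slide: dependencies expressed in data, participation limited to components in use for the shot, and de-participation of a failing non-critical component. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical shot model: each step names its component, its prerequisite
# steps, and whether a failure must halt the sequence.
SHOT_MODEL = {
    "align_beam":    {"component": "laser_b1", "needs": [],             "critical": True},
    "align_beam_b2": {"component": "laser_b2", "needs": [],             "critical": True},
    "charge_power":  {"component": "power_b1", "needs": ["align_beam"], "critical": True},
    "arm_diag_x":    {"component": "diag_x",   "needs": ["align_beam"], "critical": False},
    "fire":          {"component": "laser_b1",
                      "needs": ["charge_power", "arm_diag_x"],          "critical": True},
}


def run_shot_model(model, participants, execute):
    """Run steps in dependency order.

    Returns (completed, deparicipated, halted_at); halted_at is None when the
    sequence ran to the end, otherwise the critical step that failed.
    """
    graph = {name: spec["needs"] for name, spec in model.items()}
    completed, dropped = [], []
    for name in TopologicalSorter(graph).static_order():
        spec = model[name]
        if spec["component"] not in participants:
            continue                         # component not used for this shot
        if execute(name):
            completed.append(name)
        elif spec["critical"]:
            return completed, dropped, name  # halt; operator may retry or stop
        else:
            dropped.append(name)             # de-participate and carry on
    return completed, dropped, None


participants = {"laser_b1", "power_b1", "diag_x"}
result = run_shot_model(SHOT_MODEL, participants, lambda step: step != "arm_diag_x")
print(result)
```

Because the model is plain data, a different operating scenario is a different dictionary rather than new engine code, which is the flexibility the shot model editor is described as providing.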
Commissioning Tools facilitate NIF build-out
Tools that enable efficient calibration/qualification of laser
components
Framework utilizes existing device layer CORBA interfaces and
separates displays from algorithms via autonomous threads
Contain complex algorithms that send commands to collections of
devices and aggregate/process data
Results stored in a configuration database for use in integrated shot
operations
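The separation of displays from algorithms via autonomous threads can be sketched in Python as below. This is only an illustration under stated assumptions: the device names, the `calibration_worker` and `run_tool` functions, and the use of a mean as the aggregate are hypothetical, and plain callables stand in for the device-layer CORBA interfaces. The algorithm runs in its own thread, polls a collection of devices, and posts progress and results to a queue that the display side consumes.

```python
import queue
import statistics
import threading


def calibration_worker(devices, updates):
    """Algorithm thread: read every device, then post an aggregated result."""
    readings = {}
    for name, read_fn in devices.items():
        readings[name] = read_fn()
        updates.put(("progress", name))
    summary = {"mean": statistics.mean(readings.values()), "readings": readings}
    updates.put(("result", summary))


def run_tool(devices):
    """Display side: consume updates until the result arrives."""
    updates = queue.Queue()
    worker = threading.Thread(
        target=calibration_worker, args=(devices, updates), daemon=True
    )
    worker.start()
    shown = []
    while True:
        kind, payload = updates.get(timeout=5)
        shown.append(kind)  # a real display would redraw here
        if kind == "result":
            worker.join()
            return payload, shown


# Hypothetical devices returning fixed readings.
devices = {"sensor_a": lambda: 1.0, "sensor_b": lambda: 3.0}
summary, shown = run_tool(devices)
print(summary["mean"], shown)
```

The display loop never calls a device directly, so a slow device delays the algorithm thread without freezing the display; the returned summary is the kind of value a tool would then store in a configuration database.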
ICCS Framework Evolution
Developed over the past 4+ years
Iterative process – requirements, design, and refinement
April and Sept 2001 releases of Framework 1.0 and 2.0 contained common
services and templates
April 2002 release of Framework 3.0 satisfied significant portions of
connection management and fault tolerant performance requirements
Application layers built on Framework 3.x supported NIF Early Light
activation and experimental campaigns through summer of 2004
Sept 2003 release of Framework 4.0 contained migration to new versions of
COTS (OS, Database, ORBs, compilers, CM systems)
Nov 2004 release of Framework 4.1 contained Diagnostic, Shot Automation,
and Commissioning Tool Frameworks
Jan 2006 release of Framework 5.0 contains additional commissioning tools
and refinements to Alert, Reservation, and Shot Automation
ICCS is positioned for June 2006 multi-bundle
milestone with framework release 5.0
Connection Management minimizes impact of process failures
Distribution complexity is mitigated through hardware partitioning
and de-coupled interface design patterns
Model-driven shot automation provides flexibility to operate the NIF
in different ways
Highly data-driven architecture allows modification of run-time
behavior without new software releases
Run-time diagnostics for collecting and evaluating system
performance
Commissioning tools help meet rapid laser construction schedule
Rigorous manual and automated formal software testing program
removes >95% of software defects before deployment
Status of the use of Large-Scale CORBA Distributed Software Framework for
NIF Controls
ICALEPCS 2005
October 10-14, 2005
R. Carey, R. Bettenhausen, C. Estes, J. Fisher, J. Krammen, L. Lagin,
A. Ludwigsen, D. Mathisen, J. Matone, R. Patterson, C. Reynolds,
R. Sanchez, E. Stout, J. Tappero, P. VanArsdall
National Ignition Facility
Lawrence Livermore National Laboratory
The work was performed under the auspices of the U.S. Department of Energy
National Nuclear Security Administration by the University of California
Lawrence Livermore National Laboratory under Contract No. W-7405-ENG-48.
Test suite characterized detailed effects of
CORBA failure modes
[Figure: two test configurations connected over the network. (1) Java and Ada clients and servers (objects) on CORBA (Visibroker, OIS), TCP/IP transport, Solaris OS. (2) Ada clients and servers (objects) on CORBA (OIS), TCP/IP transport, VxWorks OS.]
Failures under different socket conditions:
Server fails before/after initial client connection
Client fails after server connection
Failures during request processing:
Server fails during processing
Client fails during a request
Client registers callback; client fails, restarts, and re-registers. Server attempts client call-back
Client sends request to server. Server hangs
CORBA failure modes are handled by
Connection Management Framework
Plug-and-play component architecture is
designed to scale up
[Figure: components attached to a Software Distribution Bus (CORBA over network). ICCS Core Servers: Configuration, Archive, History, Message Log, Alerts, Events, Reservations, Shot Set Up. Supervisor Consoles: System Manager, Supervisor Control Units, Graphical User Interfaces. Front End Processors: Device Control, Device Reservation, Device Emulation, Status Monitor, Controller Interface, Device History.]