(My) Vision of where we are going
WP4 workshop, 10/12/2002
Olof Bärring
Olof Bärring, WP4 workshop 10/12/2002 - n° 1
Outline
• Fabric automation architecture
• Today and tomorrow subsystem by subsystem
• Important steps towards tomorrow
• How to use this workshop
Automation Architecture
[Diagram: Authorisation, Monitoring repository, Job scheduling, Fault Tolerance, PC, Configuration Database, RPM Software repository, Node profile]
Today
[Diagram: automation architecture overview, same components as on the Automation Architecture slide]
Configuration: Today
[Diagram: HLDL template, CDB API, Configuration Database, PAN (Abra-cadabra, Hocus-pocus), Node profile, PC]
Configuration: … and tomorrow (February)
[Diagram: HLDL template, CDB API, Configuration Database, PAN (Abra-cadabra, Hocus-pocus), Node profile, Cache, NVA API, PC]
[Diagram: automation architecture overview, same components as on the Automation Architecture slide]
Monitoring: Today
[Diagram: Sensors, Agent, Transport, Repository server, FlatFile DB, Repository API, Monitoring repository, PC]
Monitoring: … and tomorrow (March)
[Diagram: Sensors, Agent, Transport, Repository server, SQL DB, Repository API, Alarm GUI, Monitoring repository, Node profile, Monitoring Component, PC]
[Diagram: automation architecture overview, same components as on the Automation Architecture slide]
Fault tolerance: Today
[Diagram: Rule Editor, Rule (XML), Monitoring API, Correlation Engine, Fault Tolerance, Actuators, PC]
Fault tolerance: … and tomorrow (summer???)
[Diagram: Rule Editor, Rule (HLDL), CDB, Node profile, Monitoring API, Correlation Engine, Fault Tolerance, Job scheduling (???), FT Component, Actuators, PC]
[Diagram: automation architecture overview, same components as on the Automation Architecture slide]
Installation & Maintenance: Today
[Diagram: LCFGng, .def/.h, Components, RPM, RPM Software repository, PC]
Installation & Maintenance: … and tomorrow (April)
[Diagram: RPM, pkgt??, rpmt, SPM, Cache, Node profile, cdispd, SPM Component, Components, RPM Software repository, PC]
[Diagram: automation architecture overview, same components as on the Automation Architecture slide]
Resource mgmt: Today
[Diagram: Scheduler, Admin API, Job scheduling, Runtime Control, PBS]
Resource mgmt: … and tomorrow (spring??)
[Diagram: Scheduler, Admin API, Job scheduling, Runtime Control, Monitoring??, RMS Component, PBS, LSF, Node profile]
Important steps towards tomorrow
• WP4 subsystems need to be
  - Configured
  - Monitored
  - Repaired
Important steps towards tomorrow (1)
• Each WP4 subsystem needs to be configured
  - Identify what potentially needs to be (re)configured
  - Define your configuration parameters and how they should fit into the global schema
  - Write HLDL templates for your subsystem (learn from the exercises this week)
  - Write “configuration components” that call the NVA API and generate configuration files for your services
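The last step above can be sketched as follows. This is only a minimal illustration in Python, not the WP4 implementation: the node profile is modelled as a nested dictionary, `get_value` is a hypothetical stand-in for an NVA API lookup, and the `/software/myservice` path and its parameter names are invented for the example.

```python
# Hypothetical node profile. In WP4 this would be the profile compiled
# from HLDL templates and read through the NVA API.
NODE_PROFILE = {
    "software": {
        "myservice": {"port": 8080, "log_level": "info", "workers": 4},
    },
}


def get_value(profile, path):
    """Stand-in for an NVA API lookup: resolve a '/'-separated path."""
    node = profile
    for key in path.strip("/").split("/"):
        node = node[key]
    return node


def render_config(profile, base_path):
    """Generate 'key = value' configuration file content for one service."""
    params = get_value(profile, base_path)
    lines = [f"{key} = {params[key]}" for key in sorted(params)]
    return "\n".join(lines) + "\n"


def configure(profile, base_path, target_file):
    """The 'configuration component': write the generated file to disk."""
    content = render_config(profile, base_path)
    with open(target_file, "w", encoding="utf-8") as handle:
        handle.write(content)
    return content
```

Keeping the rendering separate from the file write makes the component easy to test without touching the node's real configuration files.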
Important steps towards tomorrow (2)
• Each WP4 subsystem needs to be monitored
  - Identify what can go wrong → define the subsystem health metrics
  - Implement a sensor that measures the health metrics
  - Configure the monitoring subsystem to become aware of your sensor/metrics. How is this done best?
    - Keep the configuration together with your subsystem and “link” it to the monitoring?
    - Or should the configuration be added directly to the monitoring HLDL template?
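As an illustration of the sensor step, here is a minimal Python sketch. It assumes a health metric of "fraction of disk space in use" and a simple dictionary as the sample record; neither the metric name nor the record layout comes from the WP4 monitoring sensor API, which is not shown in these slides.

```python
import shutil
import time


def disk_usage_fraction(path="/tmp"):
    """Health metric: fraction of the filesystem holding `path` in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def sample(metric_name, measure, node="localhost"):
    """Take one measurement and wrap it in a record an agent could forward."""
    return {
        "node": node,
        "metric": metric_name,
        "value": measure(),
        "timestamp": time.time(),
    }
```

A call such as `sample("tmp_usage", disk_usage_fraction)` would produce one record per measurement; how such records are registered with the monitoring subsystem is exactly the open question raised on this slide.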
Important steps towards tomorrow (3)
• Each WP4 subsystem needs to recover from unhealthy states
  - Determine how to recover the subsystem from the identified set of unhealthy states
  - Implement actuator scripts that perform the repairs
  - Define the rule that links your health metrics to your recovery actuator
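The link between health metric, rule and actuator can be sketched in a few lines of Python. This is an assumption-laden illustration, not the WP4 rule format or correlation engine: the `Rule` class, the `daemon_alive` metric and the `restart_daemon` actuator are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    metric: str                             # name of the health metric
    is_unhealthy: Callable[[float], bool]   # condition on the metric value
    actuator: Callable[[], str]             # repair action to launch


def restart_daemon():
    """Hypothetical actuator; a real one would run a repair script."""
    return "restarted daemon"


def evaluate(rules: List[Rule], metrics: Dict[str, float]) -> List[str]:
    """Run the actuator of every rule whose metric is in an unhealthy state."""
    actions = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and rule.is_unhealthy(value):
            actions.append(rule.actuator())
    return actions


# One rule: if the daemon is reported dead (value 0), restart it.
RULES = [Rule("daemon_alive", lambda v: v == 0, restart_daemon)]
```

With these definitions, `evaluate(RULES, {"daemon_alive": 0})` launches the restart actuator, while a healthy or missing metric launches nothing.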
Open issues (to think over during the exercises)
• Who launches the recovery actuator scripts? I see two cases:
  - Repairs that do not involve a configuration change, e.g. restart daemon; clean /tmp
  - Repairs that do involve a configuration change, e.g. service client reconfiguration when a central service falls over
• Desired state versus actual state duality: how to enter a reference to the desired state in an FT rule?
  - FT rules naturally reference actual state through monitoring metrics. How can the same be done for the desired state?
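To make the duality concrete, the Python sketch below compares a desired state with an actual state. It is one possible way to picture the problem, not an answer to the open issue: it assumes the desired state can be read from the node profile and the actual state from monitoring metrics, both modelled here as plain dictionaries with invented keys.

```python
def find_drift(desired, actual):
    """Return {key: (desired, actual)} for every key whose values differ."""
    return {
        key: (want, actual.get(key))
        for key, want in desired.items()
        if actual.get(key) != want
    }


# Hypothetical example values.
desired_state = {"daemon_running": True, "service_host": "server1"}
actual_state = {"daemon_running": False, "service_host": "server1"}
drift = find_drift(desired_state, actual_state)
```

Here `drift` contains only `daemon_running`, the one entry where the actual state departs from the desired one. An FT rule written against such a comparison would reference both states rather than a hard-coded threshold.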
How to use this workshop
• The objective of this workshop is to facilitate the integration of WP4 subsystems. It is not for you to show how nicely your software is written and how well your task has been working, so
  - Teachers: please make sure to focus your exercises on important interfaces and how to use them. Leave out cool features unless time allows for them
  - Participants: use the exercises to understand how to interface your subsystem. Identify potential problems with the interfaces from your subsystem's point of view. Save those issues for the Friday brainstorming session
• In the brainstorm on Friday: discuss possible open issues or problems identified during the hands-on exercises.