DIRAC3_Project_31012007
DIRAC3 organization
A.Tsaregorodtsev,
CPPM, Marseille
31 January 2007, Barcelona
1
Outline
Structure of the DIRAC code
Release procedure
Installation procedure
2
Code structure
3
Structuring the DIRAC3 code
In order to proceed with the implementation of the new DIRAC3 functionality, we should decide on the structure of the code
The code structure is required in several
environments
Installation environment
Development environment
CVS repository
Parts that can be common should be common
The python code base should be structured following
the functional decomposition
4
Installation – high level view
DIRACROOT
  doc
  scripts
  etc
  python
  lib
  bin
DIRACROOT is the root directory of the DIRAC installation, e.g. /opt/dirac
doc – contains release notes, compiled Epydoc code documentation, user manuals, etc.
scripts – various command-line tools (Python or shell), typically included in the PATH
python – all the DIRAC Python code goes here (see below). This is the single DIRAC-defined PYTHONPATH element
5
Installation – high level view (2)
lib – contains all the (platform-dependent) binary libraries shipped with the DIRAC distribution, included in the LD_LIBRARY_PATH
Compiled Python modules (sqlite, pyOpenSSL, Classad, etc.)
LCG libraries
The lib/python directory will also contain the Python interpreter modules
6
Installation – high level view (3)
bin – contains all the (platform-dependent) binary executables
Python interpreter
LCG commands
Runit commands
7
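As a rough illustration of the layout on the preceding slides, this is the kind of environment a DIRAC installation would define. A sketch only, not actual installer code; the DIRACROOT value and the helper function are illustrative:

```python
# Sketch only: environment a DIRAC installation rooted at DIRACROOT would
# define, following the directory layout above (paths are illustrative).
import os

DIRACROOT = "/opt/dirac"

def prepend(var, path):
    """Prepend a path to a colon-separated environment variable."""
    old = os.environ.get(var, "")
    os.environ[var] = path + (":" + old if old else "")

prepend("PATH", os.path.join(DIRACROOT, "bin"))             # executables, Python interpreter
prepend("PATH", os.path.join(DIRACROOT, "scripts"))         # command-line tools
prepend("PYTHONPATH", os.path.join(DIRACROOT, "python"))    # the single DIRAC element
prepend("LD_LIBRARY_PATH", os.path.join(DIRACROOT, "lib"))  # shipped binary libraries
```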
CVS repository structure
CVSROOT
  doc
  scripts
  etc
  python
  source
doc – contains release notes, manuals, docs which
are not automatically generated
scripts – the same as in the Installation environment
etc – contains sample configuration files which will be
edited during the installation
python – the same as in the Installation environment
source – contains the CMT-ified C/C++ sources
8
Python code structure
python
  contrib
  WorkflowLib
  DIRAC
High level decomposition
contrib – third party Python modules
• ApiMon, sqlite, etc
WorkflowLib – contains the standard definitions of the DIRAC workflow components – workflows, steps, modules
DIRAC – the main package of the DIRAC Python code base
9
WorkflowLib
WorkflowLib
  Workflows
  Steps
  Modules
Workflows – the definitions of the workflows which are currently stored in the Production Repository
Currently a mixture of templates and production definitions, with no versioning
Modules – e.g. GaudiApplication, SoftwareInstallation
Workflow and Step definitions are XML files; Modules are Python modules
Another possibility is to keep the WorkflowLib at the top level of the CVS repository
Not necessary to include it in the DIRAC distribution in many cases
10
DIRAC python code structure
DIRAC
  Core
    Utilities
    DISET
    Logger
    Workflow
    DataAccess
      Storage
      ReplicaManager
      FileCatalog
  Subsystems
    WorkloadMgmt
    DataMgmt
    ProductionMgmt
    VObox
    Information
  Interface
    API
    DIRAC-shell
    Web
11
DIRAC python code structure
Upper level – divide the code into Core utilities and major Subsystems
Subsystems are usually installed on separate machines
This facilitates building separate distributions if really necessary
The Information subsystem includes Configuration, Accounting, Monitoring, Activity, Bookkeeping
These could be considered subsystems as well and go to the upper level
12
DIRAC python code structure
<Subsystem>
  Services
  Agents
  DB
  Clients
Subsystems have mandatory subdirectories to allow for a common service and agent invocation method
Services – DISET service handlers
Agents – agent modules
Subsystem code contains both the service and the client part
Usually developed by the same person
Not easy to make a client-only distribution
Other directories are also possible at this level as needed by the package developers
13
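To illustrate what the mandatory layout buys: a common invocation method can locate service handlers and agent modules purely by convention. A minimal sketch, assuming the DIRAC.<Subsystem>.Services / Agents module paths from the slide; the function names are ours, not DIRAC code:

```python
# Minimal sketch of convention-based lookup enabled by the mandatory
# subdirectories; module paths follow the slide, function names are assumed.
import importlib

def load_service_handler(subsystem, name):
    """Import the DISET service handler DIRAC.<Subsystem>.Services.<Name>Handler."""
    return importlib.import_module("DIRAC.%s.Services.%sHandler" % (subsystem, name))

def load_agent(subsystem, name):
    """Import the agent module DIRAC.<Subsystem>.Agents.<Name>."""
    return importlib.import_module("DIRAC.%s.Agents.%s" % (subsystem, name))

# e.g. load_agent("WorkloadMgmt", "PilotMonitor")  # hypothetical agent name
```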
Development environment
For the development of the Python code it is mandatory to create an installation environment with the part being developed checked out directly from the CVS repository
To allow easy check-in of the code, docs, etc.
This is the case for the test WMS, for example
Therefore, the Python code structure should be identical in the CVS repository and in the installation/development environment
It is less clear how to do this for the C/C++ code
Does each code update need recompilation, a distribution rebuild and a reinstallation?
14
Release procedure
15
Release procedure
We should distinguish development and production
releases
Production (unlike development) release includes:
C/C++ code compilation for different platforms
Generation of the Epydoc code documentation
Confirmation by the DIRAC users (GANGA)
Installation in the LHCb release area
Making a development release should not take more than a few minutes
Frequent releases are necessary during the development
phase in order to ship the distribution to various
environments for the tests
• LCG, pilots, VO-boxes
16
Release procedure step by step
The upcoming release is announced by the release
manager/project coordinator
The DIRAC developers commit their code to be included to CVS
By default the CVS HEAD revision will be taken
If the above is not suitable, some packages can be tagged
for the release and the tag communicated to the release
manager
Release notes are provided by the developers to the
release manager
All the codes are collected in one place by the
release manager and tagged
The tag follows the LHCb versioning convention vXXXrXXX
All the files are tagged with the same tag
17
Release procedure step by step (2)
The C/C++ code is compiled using the CMT build
system
For the reference architecture
For other architectures in use
The binaries are collected in the bin and lib directories to be
included into the distribution
• Alternatively, the binaries can be provided by the binary package developers instead of being recompiled by the release manager
Release notes are compiled by the release manager
and added to the doc directory
Release notes will be included into the release
announcement
The Epydoc code documentation is generated and
added to the doc directory
Also provided on the DIRAC web page
18
Release procedure step by step (3)
The distribution tar file is built by collecting the
prepared directories in a directory structure which
corresponds to the installation directory structure
Needs a dedicated tool (script)
Points for discussion:
• Single distribution for multiple platforms (as it is now), or a separate distribution per platform?
• Single distribution of the whole DIRAC system, or per-subsystem distributions?
The new release is installed on the Test system
Services, pilot distributions
The tests are done with real production jobs for the
WMS and DMS part
This includes tests of various clients and tools, e.g. GANGA
client
19
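The dedicated distribution-building tool mentioned on the previous slide could be as simple as the following sketch. The directory names come from the installation layout; the script and file naming are assumptions:

```python
# Sketch of a distribution-building script: collect the prepared directories
# into a tar file mirroring the installation tree. Naming is illustrative.
import os
import tarfile

def build_distribution(version, parts=("doc", "scripts", "etc", "python", "lib", "bin")):
    """Pack the prepared directories into DIRAC-<version>.tar.gz."""
    tarName = "DIRAC-%s.tar.gz" % version
    tar = tarfile.open(tarName, "w:gz")
    for part in parts:
        if os.path.isdir(part):   # skip parts absent from this build
            tar.add(part)         # stored under the same relative path
    tar.close()
    return tarName

# build_distribution("v1r0")  # would produce DIRAC-v1r0.tar.gz
```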
Release procedure step by step (4)
After the tests are done the release is
installed in the CERN/LHCb release area
The release is announced to the DIRAC
developers and other relevant mailing lists
The various DIRAC installations are upgraded to the new release
This is not always necessary
20
Bug fix and development releases
Light releases which skip a number of steps
Usually no recompilation of binaries
Not needed for all the platforms
No Epydoc regeneration
No installation in the release area
Used to fix simple bugs
Can be used in production ( services, pilots, clients ) to
quickly patch the system
• Traceability of changes is retained via the release notes
Used for the tests in the environments where
distributions are necessary
E.g. pilot agents or job wrappers
The versioning convention is vXXXrXXXpXXX
Who can build it? Any developer?
21
Versioning convention
DIRAC is a service-based system
Usually the clients and services stay compatible across multiple releases
Service interface incompatibilities must be carefully reflected in the release versions
Each release with changes in the service interfaces that can cause failures of previously released clients must be assigned a new major version; otherwise only a new minor version is assigned
A v12r11 service and a v12r1 client are guaranteed to be compatible
22
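The rule can be spelled out as a small check. A sketch only; the helper names are ours, not DIRAC code, and the parsing also accepts the vXXXrXXXpXXX patch convention of the bug-fix releases:

```python
# Sketch of the versioning rule: same major version guarantees client/service
# compatibility. Helper names are ours; the vXrY(pZ) convention is DIRAC's.
import re

def parse_version(v):
    """Split 'v12r1' or 'v12r1p3' into (major, minor, patch)."""
    m = re.match(r"v(\d+)r(\d+)(?:p(\d+))?$", v)
    if not m:
        raise ValueError("Bad version string: %s" % v)
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)

def compatible(serviceVersion, clientVersion):
    """Clients and services with the same major version are compatible."""
    return parse_version(serviceVersion)[0] == parse_version(clientVersion)[0]

print(compatible("v12r11", "v12r1"))  # True  - the example above
print(compatible("v13r0", "v12r11"))  # False - major version changed
```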
Single vs multiple distributions
Single distribution:
Pros:
• Easy to build and install
• Easy to ensure compatibility of different components
Cons:
• More software than necessary is installed
Multiple per subsystem distributions
Pros:
• No unnecessary software installed
• Can help spot unnecessary dependencies between the packages
Cons:
• More difficult to build; more build tools to maintain
23
Single vs multiple distributions (2)
Notes:
The distribution size is not the issue:
• 12 MB now; +14 MB if the Python interpreter is included
Even if multiple distributions are built, they must all carry the same versions; otherwise we will have to maintain a mapping of compatibilities between different distributions
The proposal is to build a single distribution
24
Installation procedure
25
Installation procedure
Single script installation procedure – this should be
retained
Practically just untarring of one or more distribution files
Choice of the binary platform
Automatic or manual?
Choice of the DIRAC version should be allowed with
a default provided
Checking for the Application Software should be
separated
Only necessary in a specific pilot agent environment
Simple automatic setup should be provided
Complemented by the additional configuration details
Subsequent updates maintain the defined configuration
26
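A minimal sketch of the single-script installation idea: choose a version and platform, then practically just untar the distribution under DIRACROOT. The URL, file naming scheme and platform probe below are assumptions for illustration:

```python
# Sketch of a single-script installation: fetch the distribution tar file
# and unpack it under DIRACROOT. URL and naming scheme are assumptions.
import os
import platform
import tarfile
import urllib.request

def install_dirac(diracRoot, version="v1r0", plat=None):
    plat = plat or platform.machine()   # automatic platform choice; could be manual
    url = "http://example.org/dirac/DIRAC-%s-%s.tar.gz" % (version, plat)
    localTar, _ = urllib.request.urlretrieve(url)
    os.makedirs(diracRoot, exist_ok=True)
    tar = tarfile.open(localTar)
    tar.extractall(diracRoot)           # practically just untarring
    tar.close()
```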
Shipping the python interpreter
There are several indications that the latest Python versions behave better
E.g., in a multithreaded environment
Just more efficient, many bugs fixed
Having a definite version of Python can reduce the risk of obscure errors
We can also start using more advanced features of the language which are not yet present in 2.2, for example
We are already starting to use it with the services in a well-controlled environment
Better efficiency, stability; some problems have gone away
There is an interest in shipping Python together with the DIRAC distribution
27
Shipping the python interpreter(2)
The Python interpreter tar file weighs ~14 MB compressed – the overhead is not large
Some components that we do not use can still be scrapped
We will have to compile Python distributions for the various platforms
This hopefully does not happen often
The version of Python to be shipped should be the same as the one used by the AA
Recommendation of the DIRAC Review
I think we should do that
28
DIRAC3 roadmap
29
(R)Evolution?
It is important to have a functional, even if not complete, system as soon as possible
Necessary new developments will go immediately into the new code base
Starting testing as soon as possible is extremely important
Should we be overly purist?
Putting in place the new code structure (if agreed) can be quick
Reshuffling the code will take longer, at different paces for different components
30
How should we proceed:
evolutionary approach
Put in place the new CVS repository and define its high-level structure
Review the service interfaces and fix them if necessary
Migrate the service code to the new structure
Non-DISET services migrate to the DISET framework
Develop the release tools
CMT packages, release building tools
Compile and start using the new release on a dedicated host
One of the retired lxgateXX
One month from now
Upgrade the code to the new conventions and new functionality in this new working chain of releases
With this approach we will have a running system, though without perfect code, by June
31
How should we proceed:
revolutionary approach
Put in place the new CVS repository and define its high-level structure
Define in detail the new coding conventions, practices, frameworks
Fix them in the corresponding documents for reference
Start component-by-component migration to the new rules and frameworks
Packages are migrated only if they comply with the new rules
After this migration is done, start developing the new required functionality
After the functionality is in place, start integration and testing of the whole system
With this approach we may have perfect code, but no running system, by June
32
How should we proceed?
Let us discuss
We have to see who will be available to carry out the work
New developments
Support of the ongoing activities
New activities, e.g. Pit-Castor transfers, PVS tests, etc.
Learning to work with the LCG resources
• Transfers, gLite, new SRM, etc, etc, etc.
We have to see what is absolutely mandatory and what can be put off for a few months
33
Task list
Put in place the new CVS repository
Define the high-level directories and packages
Define the CMT part and migrate all the
C/C++ code to the new build system
Define the release tools
Define the installation scripts
Document all the procedures
34
Agent Framework
35
Agents
By agents we understand active software components which carry out well-defined tasks in a recurrent way by sending requests to various services
Agents usually run as daemon processes
The current implementation of the agent framework is very simple
The agent container is filled with various agent modules which are found in predefined places and whose composition is defined by the configuration parameters
The agent container provides a continuous execution loop with a predefined frequency, invoking the modules one by one
36
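For reference, the current simple scheme amounts to something like this sketch; all names here are illustrative, not the actual DIRAC code:

```python
# Sketch of the current simple framework: a container holds agent modules
# (their composition set by configuration) and invokes them one by one in
# a fixed-frequency loop. All names here are illustrative.
import time

class AgentContainer:
    def __init__(self, pollingTime=120):
        self.pollingTime = pollingTime   # predefined frequency, in seconds
        self.modules = []                # filled from the configuration parameters

    def addModule(self, module):
        self.modules.append(module)

    def run(self):
        while True:                      # continuous execution loop
            for module in self.modules:  # invoke the modules one by one
                module.execute()
            time.sleep(self.pollingTime)
```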
Agents: what can be done better
Agents of different kinds run in parallel, each with its own invocation frequency
The Agent container is not really used as such; rather, all the agents run separately
Separate control of each agent is necessary
Starting, stopping, monitoring
This is better done with an external tool, e.g. runit
We can drop the Agent container and instead make a base agent class with the functionality common to all the agents
37
AgentBase class
Common invocation methods
initialize(), execute(), finalize()
Providing a common execution loop
Providing standard reporting to an Agent monitoring service
Message passing mechanism
Jabber? Will not be usable in the grid environment
Secure Bulletin board?
38
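A minimal sketch of such an AgentBase class; only the initialize()/execute()/finalize() hooks, the common loop and the monitoring report come from the slide, the rest is assumed for illustration:

```python
# Sketch of the proposed AgentBase: common hooks, a common execution loop,
# and a stub for reporting to an Agent monitoring service.
import time

class AgentBase:
    def __init__(self, pollingTime=120):
        self.pollingTime = pollingTime

    def initialize(self):
        """One-time setup, overridden by concrete agents."""
        return True

    def execute(self):
        """One cycle of the agent's recurrent task, overridden by concrete agents."""
        raise NotImplementedError

    def finalize(self):
        """Cleanup on shutdown, overridden if needed."""
        return True

    def report(self, status):
        """Stub: standard reporting to an Agent monitoring service."""
        pass

    def run(self):
        """The common execution loop shared by all agents."""
        self.initialize()
        try:
            while True:
                self.report("executing")
                self.execute()
                time.sleep(self.pollingTime)
        finally:
            self.finalize()
```

With each agent running its own loop, an external tool like runit can start, stop and monitor the agents individually, as argued on the previous slide.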
Agent executor
A substitute for the dirac-agent script
Finding agents in a number of predefined locations – AGENT_PATH
Passing command-line options to the Agent as configuration parameters
39
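A sketch of what this executor could look like; the AGENT_PATH lookup follows the slide, while the option handling and the run() entry point are assumptions:

```python
# Sketch of the agent executor replacing dirac-agent: find the named agent
# in the AGENT_PATH locations and pass the command-line options along as
# configuration parameters. The run(options) entry point is an assumption.
import os
import sys
import importlib

def find_agent(name):
    """Search the AGENT_PATH locations for the named agent module."""
    for location in os.environ.get("AGENT_PATH", "").split(":"):
        if location and os.path.isfile(os.path.join(location, name + ".py")):
            sys.path.insert(0, location)
            return importlib.import_module(name)
    raise ImportError("Agent %s not found on AGENT_PATH" % name)

if __name__ == "__main__":
    agent = find_agent(sys.argv[1])
    agent.run(sys.argv[2:])   # command-line options become configuration parameters
```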