DIRAC3_Project_31012007 - Indico


DIRAC3 organization
A. Tsaregorodtsev, CPPM, Marseille
31 January 2007, Barcelona
Outline

- Structure of the DIRAC code
- Release procedure
- Installation procedure
Code structure
Structuring the DIRAC3 code

- In order to proceed with the implementation of the new DIRAC3 functionality, we should decide on the structure of the code
- The code structure is needed in several environments:
  • Installation environment
  • Development environment
  • CVS repository
- Parts that can be common should be common
- The python code base should be structured following the functional decomposition
Installation – high level view
(Diagram: CVS and the installation tree under DIRACROOT with the subdirectories doc, scripts, etc, python, lib, bin)

- DIRACROOT is the root directory of the DIRAC installation, e.g. /opt/dirac
- doc – contains release notes, compiled Epydoc code documentation, user manuals, etc
- scripts – various command-line tools (python or shell), typically included into the PATH
- python – all the DIRAC python code goes here (see below); this is the single DIRAC-defined PYTHONPATH element
Installation – high level view (2)

- lib – contains all the (platform-dependent) binary libraries shipped with the DIRAC distribution, included into the LD_LIBRARY_PATH
  • Compiled python modules (sqlite, pyOpenSSL, Classad, etc)
  • LCG libraries
- the lib/python directory will also contain the Python interpreter modules
Installation – high level view (3)

- bin – contains all the binary (platform-dependent) executables
  • Python interpreter
  • LCG commands
  • Runit commands
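A minimal sketch of how the layout on the last three slides could be mapped onto the environment variables mentioned there (PATH, PYTHONPATH, LD_LIBRARY_PATH); the helper name and the /opt/dirac default are illustrative assumptions, not an existing DIRAC tool:

```python
import os

def setup_dirac_environment(dirac_root="/opt/dirac"):
    """Extend the process environment according to the DIRAC installation
    layout: scripts and bin go to PATH, python to PYTHONPATH, lib to
    LD_LIBRARY_PATH (illustrative sketch only)."""
    def prepend(variable, path):
        current = os.environ.get(variable, "")
        os.environ[variable] = path + (os.pathsep + current if current else "")

    prepend("PATH", os.path.join(dirac_root, "bin"))
    prepend("PATH", os.path.join(dirac_root, "scripts"))
    # The single DIRAC-defined PYTHONPATH element
    prepend("PYTHONPATH", os.path.join(dirac_root, "python"))
    # Platform-dependent binary libraries shipped with the distribution
    prepend("LD_LIBRARY_PATH", os.path.join(dirac_root, "lib"))

setup_dirac_environment()
```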
CVS repository structure
(Diagram: CVS repository tree – CVSROOT with the subdirectories doc, scripts, etc, python, source)

- doc – contains release notes, manuals and other documentation which is not automatically generated
- scripts – the same as in the installation environment
- etc – contains sample configuration files which will be edited during the installation
- python – the same as in the installation environment
- source – contains CMT'fied C/C++ sources
Python code structure
(Diagram: python/ with the subdirectories contrib, WorkflowLib, DIRAC)

High level decomposition:
- contrib – third-party python modules
  • ApiMon, sqlite, etc
- WorkflowLib – contains the standard definitions of the DIRAC workflow components – workflows, steps, modules
- DIRAC – the main package of the DIRAC python code base
WorkflowLib
(Diagram: WorkflowLib/ with the subdirectories Workflows, Steps, Modules)

- These are the definitions of the workflows which are currently stored in the Production Repository
  • Currently a mixture of templates and production definitions with no versioning
- Workflow and Step definitions are XML files; Modules are python modules
  • E.g. GaudiApplication, SoftwareInstallation
- Another possibility is to keep the WorkflowLib at the top level of the CVS repository
  • Not necessary to include it in the DIRAC distribution in many cases
DIRAC python code structure
(Diagram: the DIRAC package tree – Core with Utilities, DISET, Logger, Workflow, DataAccess, Storage, ReplicaManager, FileCatalog; and the Subsystems WorkloadMgmt, DataMgmt, ProductionMgmt, VObox, Information, Interface with API, DIRAC-shell, Web)
DIRAC python code structure

- Upper level – divide the code into Core utilities and major Subsystems
- Subsystems are usually installed on separate machines
  • This facilitates building separate distributions if really necessary
- The Information subsystem includes Configuration, Accounting, Monitoring, Activity, Bookkeeping
  • Those can be considered subsystems as well and go to the upper level
DIRAC python code structure
(Diagram: <Subsystem>/ with the subdirectories Services, Agents, DB, Clients)

- Subsystems have mandatory subdirectories to allow for a common service and agent invocation method
  • Services – DISET service handlers
  • Agents – agent modules
- Subsystem code contains both the service and the client part
  • Usually developed by the same person
  • Not easy to make a client distribution
- Other directories are also possible on this level as needed by the package developers
Development environment

- For the development of the python code it is mandatory to create an installation environment with the part being developed checked out directly from the CVS repository
  • To allow easy check-in of the code, docs, etc
  • This is the case of the test WMS, for example
- Therefore, the python code structure should be identical in CVS and in the installation/development environment
- It is less clear how to handle the C/C++ code
  • Does each code update need recompilation, a distribution rebuild and reinstallation?
Release procedure
Release procedure

- We should distinguish development and production releases
- A production (unlike a development) release includes:
  • C/C++ code compilation for different platforms
  • Generation of the Epydoc code documentation
  • Confirmation by the DIRAC users (GANGA)
  • Installation in the LHCb release area
- Making a development release should not take more than a few minutes
  • Frequent releases are necessary during the development phase in order to ship the distribution to various environments for tests (LCG, pilots, VO-boxes)
Release procedure step by step

- The upcoming release is announced by the release manager/project coordinator
- The DIRAC developers commit the code to be included in the release to CVS
  • By default the HEAD CVS revision will be taken
  • If that is not suitable, individual packages can be tagged for the release and the tag communicated to the release manager
- Release notes are provided by the developers to the release manager
- All the code is collected in one place by the release manager and tagged
  • The tag follows the LHCb versioning convention vXXXrXXX
  • All the files are tagged with the same tag
Release procedure step by step (2)

- The C/C++ code is compiled using the CMT build system
  • For the reference architecture
  • For the other architectures in use
  • The binaries are collected in the bin and lib directories to be included into the distribution (alternatively, the binaries can be provided by the binary package developers instead of being recompiled by the release manager)
- Release notes are compiled by the release manager and added to the doc directory
  • Release notes will be included into the release announcement
- The Epydoc code documentation is generated and added to the doc directory
  • Also provided on the DIRAC web page
Release procedure step by step (3)

- The distribution tar file is built by collecting the prepared directories into a directory structure which corresponds to the installation directory structure (see the sketch after this list)
  • Needs a dedicated tool (script)
  • Points for discussion:
    - Single distribution for multiple platforms (as it is now) or a separate distribution per platform?
    - Single distribution of the whole DIRAC system or per-subsystem distributions?
- The new release is installed on the test system
  • Services, pilot distributions
- The tests are done with real production jobs for the WMS and DMS parts
  • This includes tests of various clients and tools, e.g. the GANGA client
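A minimal sketch of the dedicated tool mentioned in the first bullet: collect the prepared directories into a tar file that mirrors the installation layout. The function name, directory list and output naming are assumptions for illustration:

```python
import os
import tarfile

def build_distribution(release_dir, version, output_dir="."):
    """Pack the prepared release directories into one distribution tar file
    whose internal layout mirrors the installation directory structure."""
    parts = ["doc", "scripts", "etc", "python", "lib", "bin"]
    tar_name = os.path.join(output_dir, "DIRAC-%s.tar.gz" % version)
    tar = tarfile.open(tar_name, "w:gz")
    for part in parts:
        path = os.path.join(release_dir, part)
        if os.path.isdir(path):
            # Store each directory under its installation-relative name
            tar.add(path, arcname=part)
    tar.close()
    return tar_name

# Hypothetical usage: build_distribution("/build/DIRAC-release", "v1r0")
```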
Release procedure step by step (4)

- After the tests are done, the release is installed in the CERN/LHCb release area
- The release is announced to the DIRAC developers and other relevant mailing lists
- The various DIRAC installations are upgraded to the new release
  • This is not always necessary
Bug fix and development releases

- Light releases which skip a number of steps
  • Usually no recompilation of binaries, and not needed for all the platforms
  • No Epydoc regeneration
  • No installation in the release area
- Used to fix simple bugs
  • Can be used in production (services, pilots, clients) to quickly patch the system
  • Traceability of changes is retained via the release notes
- Used for tests in the environments where distributions are necessary
  • E.g. pilot agents or job wrappers
- The versioning convention is vXXXrXXXpXXX
- Who can build such a release? Any developer?
Versioning convention

- DIRAC is a service-based system
  • Usually the clients and services stay compatible across multiple releases
- Service interface incompatibilities must be carefully reflected in the release versions
- Each release with changes in the service interfaces that can cause failures of previously released clients must be assigned a new major version; otherwise only a new minor version is assigned (see the sketch below)
  • E.g. a v12r11 service and a v12r1 client are guaranteed to be compatible
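A small sketch of the vXXXrXXX(pXXX) convention and the compatibility rule above, with illustrative helper names:

```python
import re

VERSION_RE = re.compile(r"^v(\d+)r(\d+)(?:p(\d+))?$")

def parse_version(tag):
    """Split a vXXXrXXX or vXXXrXXXpXXX tag into (major, minor, patch)."""
    match = VERSION_RE.match(tag)
    if not match:
        raise ValueError("Not a DIRAC version tag: %s" % tag)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)

def compatible(client_tag, service_tag):
    """Clients and services with the same major version are assumed compatible;
    an interface-breaking release must bump the major version."""
    return parse_version(client_tag)[0] == parse_version(service_tag)[0]

# e.g. a v12r1 client against a v12r11 service
assert compatible("v12r1", "v12r11")
```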
Single vs multiple distributions

- Single distribution
  • Pros: easy to build and install; easy to ensure compatibility of the different components
  • Cons: more software than necessary is installed
- Multiple per-subsystem distributions
  • Pros: no unnecessary software installed; can help spotting unnecessary dependencies between the packages
  • Cons: more difficult to build; more build tools to maintain
Single vs multiple distributions (2)

- Notes:
  • The distribution size is not an issue: 12 MB now, +14 MB if the Python interpreter is included
  • Even if multiple distributions are built, they must all carry the same version, otherwise we would have to maintain a mapping of compatibilities between different distributions
- The proposal is to build a single distribution
Installation procedure
Installation procedure

- Single-script installation procedure – this should be retained (see the sketch after this list)
  • Practically just untarring one or more distribution files
- Choice of the binary platform
  • Automatic or manual?
- Choice of the DIRAC version should be allowed, with a default provided
- Checking for the Application Software should be separated
  • Only necessary in a specific pilot agent environment
- A simple automatic setup should be provided
  • Complemented by the additional configuration details
  • Subsequent updates maintain the defined configuration
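A minimal sketch of the single-script installation idea: untar the distribution into DIRACROOT, choosing the binary platform automatically unless one is given. All names, the platform string format and the default paths are illustrative assumptions:

```python
import os
import platform
import tarfile

def install_dirac(distribution_tar, dirac_root="/opt/dirac", binary_platform=None):
    """Unpack a DIRAC distribution tar file into DIRACROOT (sketch only)."""
    if binary_platform is None:
        # Automatic choice of the binary platform
        binary_platform = "%s_%s" % (platform.system(), platform.machine())
    if not os.path.isdir(dirac_root):
        os.makedirs(dirac_root)
    tar = tarfile.open(distribution_tar, "r:gz")
    tar.extractall(dirac_root)   # practically just untarring the distribution
    tar.close()
    return binary_platform

# Hypothetical usage: install_dirac("DIRAC-v1r0.tar.gz")
```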
Shipping the python interpreter

- There are several indications that the latest python versions behave better
  • E.g. in a multithreaded environment
  • Simply more efficient, many bugs fixed
- Having a definite version of python can reduce the risk of obscure errors
- We could also start using more advanced features of the language which are not yet present in 2.2, for example
- We are already starting to use it with the services in a well controlled environment
  • Better efficiency, stability, some problems have gone away
- There is an interest in shipping python together with the DIRAC distribution
Shipping the python interpreter (2)

- The python interpreter tar file weighs ~14 MB compressed – the overhead is not large
  • Some components that we do not use can still be stripped out
- We will have to compile python distributions for the various platforms
  • This hopefully does not happen often
- The version of python to be shipped should be the same as the one used by the AA
  • Recommendation of the DIRAC Review
- I think we should do that
DIRAC3 roadmap
(R)Evolution?

- It is important to have a functional, even if not complete, system as soon as possible
  • New necessary developments will go immediately into the new code base
  • Starting testing as soon as possible is extremely important
  • Should we be overly purist?
- Putting in place the new code structure (if agreed) can be quick
- Reshuffling the code will take longer, with different paces for different components
How should we proceed: evolutionary approach

- Put in place the new CVS repository and define its high-level structure
- Review the service interfaces and fix them if necessary
- Migrate the services code to the new structure
  • Non-DISET services migrate to the DISET framework
- Develop the release tools
  • CMT packages, release building tools
- Compile and start using the new release on a dedicated host
  • One of the retired lxgateXX
  • One month from now
- Upgrade the code to the new conventions and new functionality within this new working chain of releases
- With this approach we will have a running system, though not perfect code, by June
How should we proceed: revolutionary approach

- Put in place the new CVS repository and define its high-level structure
- Define in detail the new coding conventions, practices and frameworks
  • Fix them in the corresponding documents for reference
- Start a component-by-component migration to the new rules and frameworks
  • Migrate packages only if they comply with the new rules
- After this migration is done, start developing the new required functionality
- After the functionality is in place, start integration and testing of the whole system
- With this approach we can have perfect code but no running system by June
How should we proceed? Let us discuss

- We have to see who will be available to carry out the work
  • New developments
  • Support of the ongoing activities
  • New activities, e.g. Pit-Castor transfers, PVS tests, etc
  • Learning to work with the LCG resources: transfers, gLite, the new SRM, etc
- We have to see what is absolutely mandatory and what can be put off for a few months
Task list

- Put in place the new CVS repository
- Define the high-level directories and packages
- Define the CMT part and migrate all the C/C++ code to the new build system
- Define the release tools
- Define the installation scripts
- Document all the procedures
Agent Framework
Agents

- By agents we understand active software components which carry out well-defined tasks in a recurrent way by sending requests to various services
  • Agents usually run as daemon processes
- The current implementation of the agent framework is very simple (see the sketch after this list)
  • The Agent container is filled with various agent modules, which are found in predefined places and whose composition is defined by the configuration parameters
  • The Agent container provides a continuous execution loop with a predefined frequency, invoking the modules one by one
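A minimal sketch of the container loop just described (modules selected by configuration and invoked one by one at a fixed frequency); class and method names are illustrative, not the actual DIRAC API:

```python
import time

class AgentContainer(object):
    """Illustrative sketch of the current, very simple agent framework:
    a container filled with agent modules and a continuous execution loop."""

    def __init__(self, polling_time=120):
        self.polling_time = polling_time   # predefined invocation frequency (seconds)
        self.modules = []                  # agent modules chosen via configuration

    def add_module(self, module):
        self.modules.append(module)

    def run(self):
        # Continuous execution loop invoking the modules one by one
        while True:
            for module in self.modules:
                module.execute()
            time.sleep(self.polling_time)
```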
Agents: what can be done better

- Agents of different kinds run in parallel, each with its own invocation frequency
  • The Agent container is not really used as such; rather, all the agents run separately
- Separate control of each agent is necessary
  • Starting, stopping, monitoring
  • This is better done with an external tool, e.g. runit
- We can drop the Agent container and instead make a base agent class with the functionality common to all the agents
AgentBase class

- Common invocation methods
  • initialize(), execute(), finalize()
- Provides a common execution loop (see the sketch below)
- Provides standard reporting to an Agent monitoring service
- Message passing mechanism
  • Jabber? Will not be usable in the grid environment
  • Secure bulletin board?
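A minimal sketch of such a base class, providing the common initialize()/execute()/finalize() methods and the shared execution loop; the reporting method is only a placeholder for the Agent monitoring service, whose interface is not defined here:

```python
import time

class AgentBase(object):
    """Illustrative base class holding the functionality common to all agents."""

    def __init__(self, name, polling_time=120):
        self.name = name
        self.polling_time = polling_time

    def initialize(self):
        """Prepare the agent (configuration, connections, ...)."""
        return True

    def execute(self):
        """One work cycle; to be overridden by each concrete agent."""
        raise NotImplementedError

    def finalize(self):
        """Clean up before the agent stops."""
        return True

    def report(self, status):
        """Placeholder for standard reporting to an Agent monitoring service."""
        print("[%s] %s" % (self.name, status))

    def run(self):
        # Common execution loop provided to all agents
        self.initialize()
        try:
            while True:
                self.report("cycle start")
                self.execute()
                time.sleep(self.polling_time)
        finally:
            self.finalize()
```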
Agent executor

- A replacement for the dirac-agent script
- Finds agents in a number of predefined locations – AGENT_PATH
- Passes command-line options to the Agent as configuration parameters (see the sketch below)
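A minimal sketch of the executor: search the directories listed in AGENT_PATH for the named agent module and hand the command-line options to it as configuration parameters. The key=value option format and the assumption that a module defines an Agent class are illustrative:

```python
import os
import sys

def find_agent(agent_name, agent_path):
    """Search the predefined locations (AGENT_PATH) for the agent module."""
    for directory in agent_path.split(os.pathsep):
        if os.path.isfile(os.path.join(directory, agent_name + ".py")):
            sys.path.insert(0, directory)
            return __import__(agent_name)
    raise ImportError("Agent %s not found in AGENT_PATH" % agent_name)

if __name__ == "__main__":
    # Illustrative usage: agent-executor <AgentName> key=value key=value ...
    agent_name = sys.argv[1]
    # Command-line options are passed to the Agent as configuration parameters
    options = dict(option.split("=", 1) for option in sys.argv[2:])
    module = find_agent(agent_name, os.environ.get("AGENT_PATH", "."))
    agent = module.Agent(**options)   # assumes the module defines an Agent class
    agent.run()
```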