DIRAC Job submission
A.Tsaregorodtsev,
CPPM, Marseille
LHCb-ATLAS GANGA Workshop, 21 April 2004
DIRAC Job structure
Job consists of one or more steps; it can be split into subjobs
corresponding to steps.
Step consists of one or more modules; it is indivisible with respect
to job scheduling.
Module is the smallest unit of execution:
  Standard modules are used in production;
  User-defined modules can be used in analysis.
Production Job
  Gauss Step
    SoftwareInstallation module
    GaussApplication module
    BookkeepingUpdate module
  Boole+Brunel Step
    SoftwareInstallation module
    BooleApplication module
    BrunelApplication module
    BookkeepingUpdate module
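The three-level layout can be sketched in a few lines of Python. This is only an illustration: the dataclasses below are hypothetical stand-ins and not the DIRAC Job, Step and Module classes; only the step and module names come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Module:
    """Smallest unit of execution (hypothetical stand-in, not the DIRAC class)."""
    name: str

    def execute(self) -> str:
        return f"ran {self.name}"


@dataclass
class Step:
    """Ordered container of modules; scheduled as a whole."""
    name: str
    modules: List[Module] = field(default_factory=list)

    def execute(self) -> List[str]:
        return [module.execute() for module in self.modules]


@dataclass
class Job:
    """Ordered container of steps."""
    steps: List[Step] = field(default_factory=list)

    def execute(self) -> List[str]:
        log = []
        for step in self.steps:
            log.extend(step.execute())
        return log


def build_production_job() -> Job:
    """Assemble the two-step production job shown on the slide."""
    gauss = Step("Gauss", [Module("SoftwareInstallation"),
                           Module("GaussApplication"),
                           Module("BookkeepingUpdate")])
    boole_brunel = Step("Boole+Brunel", [Module("SoftwareInstallation"),
                                         Module("BooleApplication"),
                                         Module("BrunelApplication"),
                                         Module("BookkeepingUpdate")])
    return Job([gauss, boole_brunel])
```

Calling build_production_job().execute() walks the steps in order and, within each step, the modules in order.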
Job class diagram
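[Figure: class diagram. A Job contains n Steps and a Step contains n Modules; Job, Step and Module each carry n TypedParameters. Module types shown: Shell, Scriptlet, Script, Application. An ApplicationFactory is also shown.]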
Job object structure
The three-level structure is the result of extensive experimentation:
  Very complex workflows are needed for production jobs;
  Can be reduced to a single-step, single-module job for analysis.
The simple Job class structure allows for easy formalization:
  Example: a simple XML schema for the persistent representation.
For
Job object contract (subset)
Job('job.xml') - constructor from an XML file
addParameter()
addOption() - just a special type of parameter
addStep()
getPackages() - packages required by all the steps
inputData()
outputData()
execute() - recursively executes all the steps and modules
toXML() - save as XML file
toJDL() - generate a JDL file suitable for submission
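To illustrate what a persistent XML representation of such a job might involve, here is a minimal round-trip sketch. The element and attribute names (job, parameter, step, module) are invented for this example; the actual DIRAC schema is not shown in the slides.

```python
import xml.etree.ElementTree as ET


def job_to_xml(job: dict) -> str:
    """Serialize a nested job description to an XML string.

    Element and attribute names are assumptions made for illustration.
    """
    root = ET.Element("job")
    for name, value in job.get("parameters", {}).items():
        ET.SubElement(root, "parameter", name=name, value=value)
    for step in job.get("steps", []):
        step_el = ET.SubElement(root, "step", name=step["name"])
        for module in step.get("modules", []):
            ET.SubElement(step_el, "module", name=module)
    return ET.tostring(root, encoding="unicode")


def job_from_xml(text: str) -> dict:
    """Rebuild the nested job description from the XML string."""
    root = ET.fromstring(text)
    return {
        "parameters": {p.get("name"): p.get("value")
                       for p in root.findall("parameter")},
        "steps": [
            {"name": s.get("name"),
             "modules": [m.get("name") for m in s.findall("module")]}
            for s in root.findall("step")
        ],
    }
```

A job that survives job_from_xml(job_to_xml(job)) unchanged can be stored, shipped and re-instantiated, which is the role the constructor from an XML file and toXML() play in the contract above.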
Job API example
# Module: run DaVinci; the version is taken from a Job-level parameter
module = Module()
module.addParameter('NAME', 'PARAM', 'DaVinci')
module.addParameter('TYPE', 'PARAM', 'APPLICATION')
module.addParameter('VERSION', 'PARAM', '@{DAVINCI_VERSION}')
opt = '={DATAFILE="@{INPUT_FILE}" TYP="ROOT"}'
module.addOption('EventSelector', 'Input', opt)

# Step: holds the module and defines the input file
step = Step()
step.addModule(module)
step.addParameter('INPUT_FILE', 'INPUTDATA',
                  'lfn:/lhcb/production/DST/xxxx.dst')

# Job: holds the step and supplies the value for the version placeholder
job = Job()
job.addStep(step)
job.addParameter('DAVINCI_VERSION', 'PARAM', 'v12r3')
job.addParameter('MaxCPUTime', 'JDL', '100000')
job.addHeader('Author', 'atsareg')

# Persistent (XML) and submission (JDL) representations
xml_string = job.toXML()
jdl_string = job.toJDL()
Steps
Main purpose: container of modules
  Executes modules;
  Handles module failures;
  Reports job progress.
Steps are indivisible from the Workload Management perspective.
Steps can depend on one another:
  A Job object can represent a DAG of arbitrary complexity.
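One way to turn step dependencies into an execution order is a topological sort. The sketch below uses the standard-library graphlib module (Python 3.9 or later); the dictionary layout is an assumption for this example, not the representation DIRAC uses.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(dependencies: dict) -> list:
    """Return step names in an order that respects the dependencies.

    `dependencies` maps each step name to the set of steps it depends on.
    graphlib.CycleError is raised if the graph is not a DAG.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

For the production job shown earlier, the Boole+Brunel step would depend on the Gauss step, so Gauss is ordered first.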
Module types
The actual work is done in Modules:
Shell
• Any shell script either preinstalled or provided as a
Module parameter;
Scriptlet
• Python code executed within the job process
Script
• Python class with an execute() method – Functor;
• Either shipped with the job or preinstalled.
Application
• More involved Python script for invoking a (Gaudi)
application
• Preinstalled as part of the DIRAC software
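The difference between the Script and Scriptlet module types can be sketched as follows. Both pieces are hypothetical illustrations: the class name, the helper function and the way results are returned are assumptions, not DIRAC code.

```python
class CountEvents:
    """Script-style module: a class with an execute() method (a functor)."""

    def __init__(self, events):
        self.events = events

    def execute(self):
        return len(self.events)


def run_scriptlet(code: str, parameters: dict) -> dict:
    """Scriptlet-style module: Python source executed inside the job process.

    The code sees the parameters as variables, and every name it defines
    is returned. Handing results back this way is an assumption.
    """
    namespace = dict(parameters)
    exec(code, {}, namespace)
    return namespace
```

A Script can be shipped with the job as a file containing such a class, whereas a Scriptlet is just a string of code carried in the job description.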
Module parameters
Modules have access to the parameters of the containing Step and to
the parameters of the Job;
Module parameters can be defined in terms of the Step or Job parameters:
  Allows very complex workflows to be defined and reused (as templates
  with parameter placeholders);
  Only a few parameters need to be defined at the Job level to
  instantiate the workflow.
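Placeholder substitution of this kind can be sketched with a regular expression. The @{NAME} syntax is taken from the Job API example; the lookup order (Step parameters before Job parameters) and the choice to leave unknown names untouched are assumptions of this sketch.

```python
import re

_PLACEHOLDER = re.compile(r"@\{(\w+)\}")


def resolve(value: str, *scopes: dict) -> str:
    """Replace each @{NAME} using the first scope that defines NAME.

    Scopes are searched in the order given, e.g. step parameters before
    job parameters. Unknown names are left as they are, so a template
    can stay partially defined. Self-referencing parameters are not
    guarded against in this sketch.
    """
    def substitute(match):
        name = match.group(1)
        for scope in scopes:
            if name in scope:
                # The value may itself contain placeholders.
                return resolve(scope[name], *scopes)
        return match.group(0)

    return _PLACEHOLDER.sub(substitute, value)
```

With this in place, a workflow template only needs the Job-level values (for example DAVINCI_VERSION) to become a fully defined job.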
Job further enhancements
Possibility to generate complete job steering code from
interconnected modules
Advantages:
  Job object execution does not depend on the local DIRAC software
  installation;
  More flexibility when connecting steps: complex objects can be
  passed from step to step.
Already used inside the DIRAC Production Console:
  Generates a simpler XML job representation to submit to the
  DIRAC WMS.
Workflow definition
A workflow is an object of the same Job class:
  Possibly not fully defined.
The DIRAC Console provides a graphical interface to assemble
workflows of any complexity from standard building blocks (modules):
  Very useful for production workflow definitions;
  May be too powerful (for the moment) for a user job.
Job Submission: UI
First, the DIRAC UI should be installed:
Single-script installation;
Requires only a local Python interpreter;
Will be installed on lxplus for common use.
Job submission CLI:
dirac-job-submit <jdl>;
• Returns jobID (number);
dirac-job-status <jobID>
• Returns python dictionary;
dirac-job-get-output <jobID>
• Retrieves OutputSandbox to the local directory
The Python API is provided as well
Job Submission: API
The job submission API is intended for use in user scripts or UI
applications (GANGA).
Example with the simplified Job definition API:
from Dirac import *
dirac = Dirac()
job = Job()
job.setApplication('DaVinci', 'v12r11')
job.setOption('DV_Pi0Calibr.opts')
job.setInputSandbox(['DV_Pi0Calibr.opts', 'Application_DaVinci_v12r11/lib'])
job.setInputData(['/lhcb/production/DC04/v2/DST/00000743_00008447_9.dst',
                  '/lhcb/production/DC04/v2/DST/00000743_00008479_9.dst'])
job.setOutputSandbox(['DVHistos.root', 'DVNtuples.root'])
jobid = dirac.submit(job)
print("Job ID = %s" % jobid)
Example JDL file
The JDL file is generated behind the scenes and sent to the DIRAC WMS.
Similar to LCG JDL:
Requirements = ( member("Lyon_HPSS",other.LocalSE) ) &&
               ( member("CERN_Castor",other.LocalSE) );
Executable = "$LHCBPRODROOT/DIRAC/scripts/jobexec";
Arguments = "jobDescription.xml";
JobName = "DaVinci_1";
SoftwarePackages = { "DaVinci.v12r11" };
JobType = "user";
Owner = "ibelyaev";
InputSandbox = { "lib.tar.gz", "jobDescription.xml", "DV_Pi0Calibr.opts" };
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = { "std.out", "std.err", "pi0calibr.hbook", "pi0histos.hbook" };
InputData = { "LFN:/lhcb/production/DC04/v2/DST/00000743_00008447_9.dst",
              "LFN:/lhcb/production/DC04/v2/DST/00000743_00008479_9.dst" };
OutputData = { "/lhcb/test/DaVinci/v1r0/LOG/DaVinci_v12r11.alog" };
parameters = [ STEPS = "1"; STEP_1_NAME = "0_0_1" ];
ProductionId = "00000000";
JobId = 1347
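How such a file might be produced from a job description can be sketched with a small renderer. This is not the toJDL() implementation: it handles only strings, integers and lists of strings, and leaves out expressions such as Requirements and nested records such as parameters.

```python
def to_jdl(attributes: dict) -> str:
    """Render a flat dict as JDL-like text.

    Strings are quoted, integers are written bare and lists become
    { ... } sets. Anything more complex is outside this sketch.
    """
    def render(value):
        if isinstance(value, (list, tuple)):
            return "{ " + ", ".join(render(item) for item in value) + " }"
        if isinstance(value, int):
            return str(value)
        return '"' + str(value) + '"'

    return "\n".join(f"{key} = {render(value)};"
                     for key, value in attributes.items())
```

Each attribute ends up on its own line, terminated by a semicolon, in the order in which the dictionary lists them.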
Job execution sequence
Remarks
The handler can be the same for both WMS-like and direct job
submission.
The Job wrapper does not depend on the CE back-end:
• Depends on the VO execution environment:
  VO software;
  VO services (DBs, monitoring, etc.).
[Figure: job execution sequence. Elements shown: User interface, Job JDL, DIRAC WMS, Site Agent, Job wrapper, CE handler, CE, Batch system, worker nodes (WN) each running a Job wrapper, and the VO A, VO B and VO C execution environments.]
Job Wrapper
The wrapper script is what is submitted to the CE and actually
executed on the Worker Node:
  Automatically generated by the WMS (Agent);
  Sets up the environment for the job in place.
Wraps the executable provided by the user:
  Brings in input data;
  Brings in the InputSandbox;
  Invokes the user analysis executable;
  Reports the job status;
  Uploads the resulting output data.
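The sequence of wrapper actions listed above can be sketched as a single function. Everything here is hypothetical: the services object, its method names and the status strings are invented to make the order of operations concrete, and the stand-in class only records calls.

```python
class RecordingServices:
    """Stand-in that records calls instead of touching storage or a batch system."""

    def __init__(self, exit_code=0):
        self.calls = []
        self.exit_code = exit_code

    def report(self, status):
        self.calls.append(("report", status))

    def fetch_input_data(self, files):
        self.calls.append(("fetch_input_data", tuple(files)))

    def fetch_input_sandbox(self, files):
        self.calls.append(("fetch_input_sandbox", tuple(files)))

    def run_executable(self, executable):
        self.calls.append(("run_executable", executable))
        return self.exit_code

    def upload_output_data(self, files):
        self.calls.append(("upload_output_data", tuple(files)))


def run_wrapped(job, services):
    """Run the wrapper stages in order and report the final status."""
    services.report("setting up")
    services.fetch_input_data(job["input_data"])
    services.fetch_input_sandbox(job["input_sandbox"])
    services.report("running")
    exit_code = services.run_executable(job["executable"])
    if exit_code == 0:
        # In this sketch, output is uploaded only after a successful run.
        services.upload_output_data(job["output_data"])
        status = "done"
    else:
        status = "failed"
    services.report(status)
    return status
```

Because the function talks only to the services object, the same sequence can run unchanged against different storage, catalog or monitoring implementations.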
Job Wrapper
The wrapper uses external (standard) services via plug-ins:
Configuration Service
File Catalog
• Plug-ins for BK, Alien, LFC catalogs
Data moving with Replica Manager
• Plug-ins for gridftp, xxxtp, file, rfio accessible storage
Job status/parameters reporting
• Plug-ins for DIRAC monitoring, MonALISA
• E.g. providing an R-GMA plug-in would be easy
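A plug-in mechanism of this kind is often built around a registry keyed by service kind and back-end name. The sketch below is one possible shape, not the DIRAC mechanism; the registry, the plug-in classes and the host name are all hypothetical, and the plug-ins only build URLs without transferring anything.

```python
class PluginRegistry:
    """Map a (service kind, back-end name) pair to a plug-in class."""

    def __init__(self):
        self._plugins = {}

    def register(self, kind, name):
        """Class decorator that records the plug-in under (kind, name)."""
        def decorator(cls):
            self._plugins[(kind, name)] = cls
            return cls
        return decorator

    def create(self, kind, name, *args, **kwargs):
        """Instantiate the plug-in registered under (kind, name)."""
        try:
            cls = self._plugins[(kind, name)]
        except KeyError:
            raise LookupError(f"no {kind} plug-in named {name!r}") from None
        return cls(*args, **kwargs)


registry = PluginRegistry()


@registry.register("storage", "file")
class FileStorage:
    """Hypothetical plug-in for storage reachable as a local path."""

    def url(self, path):
        return "file://" + path


@registry.register("storage", "gridftp")
class GridFtpStorage:
    """Hypothetical plug-in; it only builds a URL and performs no transfer."""

    def __init__(self, host):
        self.host = host

    def url(self, path):
        return f"gsiftp://{self.host}{path}"
```

Supporting a further back-end then amounts to registering one more class, which is the sense in which adding a new plug-in would be easy.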
Job Wrapper
The same wrappers are used on all the DIRAC CE back-ends:
  If necessary, back-end-specific wrappers can be generated by the
  specific CE clients.
Other VO execution environments can be supported by providing
VO-specific plug-ins.
DIRAC CE
CE interface (back-end):
submitJob(wrapper,jdl,batchname=None)
• Returns localJobID
getJobStatus(localJobID)
killJob(localJobID)
getDynamicInfo()
• Info on numbers of queuing/running jobs
Other useful functions ( missing ):
getTimeLeft(localJobID)
getLimits:
• Scratch, memory, spool, etc
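The four back-end calls can be written down as an abstract base class with a toy implementation. The method names and the submitJob signature come from the slide; the return values, the status strings and the in-memory back-end are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
import itertools


class ComputingElement(ABC):
    """Back-end interface with the four calls listed on the slide."""

    @abstractmethod
    def submitJob(self, wrapper, jdl, batchname=None):
        """Submit a wrapper and return a localJobID."""

    @abstractmethod
    def getJobStatus(self, localJobID):
        """Return the status of a previously submitted job."""

    @abstractmethod
    def killJob(self, localJobID):
        """Stop the job."""

    @abstractmethod
    def getDynamicInfo(self):
        """Return the numbers of queuing and running jobs."""


class InMemoryCE(ComputingElement):
    """Toy back-end that keeps jobs in a dict instead of a batch system."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def submitJob(self, wrapper, jdl, batchname=None):
        local_id = next(self._ids)
        self._jobs[local_id] = "queued"
        return local_id

    def getJobStatus(self, localJobID):
        return self._jobs[localJobID]

    def killJob(self, localJobID):
        self._jobs[localJobID] = "killed"

    def getDynamicInfo(self):
        states = list(self._jobs.values())
        return {"queued": states.count("queued"),
                "running": states.count("running")}
```

A real back-end would implement the same four methods on top of the commands of its batch system.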
DIRAC CE
Implementation
Special class used by the agents running close to the actual CE;
Can easily be transformed into a client-server pair for remote job
submission:
• Example: ComputingElementEDG was done this way.
• No changes to the Agent code
[Figure: two configurations. Local: Agent, CE, Batch System. Remote: Agent, CE client, CE server, Batch System.]
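Why the Agent code needs no changes can be shown with a small proxy sketch. All classes here are hypothetical; in particular, the transport between client and server is reduced to a direct function call so that the example stays runnable.

```python
class LocalCE:
    """Minimal CE stand-in offering two of the interface calls."""

    def __init__(self):
        self._jobs = {}

    def submitJob(self, wrapper, jdl, batchname=None):
        local_id = len(self._jobs) + 1
        self._jobs[local_id] = "queued"
        return local_id

    def getJobStatus(self, localJobID):
        return self._jobs[localJobID]


class CEServer:
    """Runs next to the batch system and dispatches requests to a real CE."""

    def __init__(self, ce):
        self._ce = ce

    def handle(self, method, args):
        return getattr(self._ce, method)(*args)


class CEClient:
    """Offers the CE method names and forwards each call through `send`.

    `send(method, args)` stands for whatever transport is used; here it
    is a plain function call, which is an assumption of this sketch.
    """

    def __init__(self, send):
        self._send = send

    def submitJob(self, wrapper, jdl, batchname=None):
        return self._send("submitJob", (wrapper, jdl, batchname))

    def getJobStatus(self, localJobID):
        return self._send("getJobStatus", (localJobID,))


def agent_submit(ce, wrapper, jdl):
    """Agent-side code: identical for a local CE and for a CE client."""
    local_id = ce.submitJob(wrapper, jdl)
    return local_id, ce.getJobStatus(local_id)
```

Since CEClient exposes the same method names as the CE itself, agent_submit cannot tell whether it is talking to a local object or to a remote server.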