Transcript tutorial5

Grid Resource Allocation Management
(GRAM)
• GRAM lets the user access the grid in order to run, terminate,
and monitor jobs remotely. A job request is sent to the gatekeeper
of the target machine. The gatekeeper creates a job manager to handle
the job.
Image source: A Resource Management Architecture for Metacomputing Systems
• A job is a binary executable or command to be run on a remote machine. The
remote machine must have the Globus Toolkit installed.
• The server that manages client requests is called the “gatekeeper”. The
gatekeeper’s job is:
– to perform mutual authentication of the user and the resource,
– to determine a local user name for the remote user,
– to start a job manager, which executes as that local user and actually handles
the request.
• A job manager is responsible for creating the actual processes requested
by the user.
• Job submission modes: a job can be submitted in either batch or non-batch mode.
– Batch jobs: A job ID is returned when the job is submitted. The output is retrieved
afterwards using the job ID. This mode is useful for processing-intensive applications.
– Non-batch jobs: The client waits for the remote gatekeeper throughout the
whole run and then receives the output.
• The GRAM reporter is responsible for storing information about the scheduler’s
structure and state in MDS.
• Authenticate to the system with the following command:
– grid-proxy-init
• Your proxy certificate is created under the /tmp directory:
– ls -al /tmp
Resource Specification Language
(RSL)
• RSL provides a common language for describing jobs and
resources. It lets the user construct complex resource and runtime
environment descriptions from attribute-value pairs.
• Example:
& (executable = /bin/ls (* <-- that is an unquoted literal *))
  (arguments = /grid/users)
  (stdout = output.stdout)
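The attribute-value structure of the example above can be sketched in Java. This is only an illustration of how such a string is assembled, not part of the Globus Toolkit: the class name RslBuilder and its build method are hypothetical, and the sketch assumes Java 8 or later and values that need no quoting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RslBuilder {
    // Joins attribute-value pairs into a conjunction request: &(a=v)(b=w)...
    // Hypothetical helper; values are assumed to be unquoted literals.
    public static String build(Map<String, String> attributes) {
        StringBuilder rsl = new StringBuilder("&");
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            rsl.append('(').append(e.getKey()).append('=').append(e.getValue()).append(')');
        }
        return rsl.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the attributes in insertion order.
        Map<String, String> job = new LinkedHashMap<>();
        job.put("executable", "/bin/ls");
        job.put("arguments", "/grid/users");
        job.put("stdout", "output.stdout");
        // Prints: &(executable=/bin/ls)(arguments=/grid/users)(stdout=output.stdout)
        System.out.println(build(job));
    }
}
```

The printed string has the same form as the RSL string passed to globusrun on the command line.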
• Before sending the request, create a directory under /grid/users and
name it after yourself, for example:
mkdir /grid/users/tugba
• Then run the following command:
globusrun -r concorde03.mcs.surrey.ac.uk '&(executable=/bin/ls)(arguments=/grid/users)(stdout=output.stdout)'
• You will see the following output:
GRAM Job submission successful
GLOBUS_GRAM_PROTOCOL_JOB_STATE_ACTIVE
GLOBUS_GRAM_PROTOCOL_JOB_STATE_DONE
• The output.stdout file is saved under your $HOME directory. Inspect its
contents.
• Question: Why isn’t it in your current directory?
• You can also save the RSL string to a file, which should have the
extension .rsl. Save the following RSL string as test.rsl under
/grid/users/<your_name>:
& (executable = /bin/ls (* <-- that is an unquoted literal *))
  (arguments = /grid/users)
  (stdout = output.stdout)
• Then run the following command:
globusrun -r concorde02.mcs.surrey.ac.uk -f test.rsl
• You will see the following output:
GRAM Job submission successful
GLOBUS_GRAM_PROTOCOL_JOB_STATE_ACTIVE
GLOBUS_GRAM_PROTOCOL_JOB_STATE_DONE
• Exercise 1: Execute a Java program on a remote machine. The Java
program writes its first argument (for example, “Hello World”) to the command line.
• Create the file HelloWorld.java under $HOME/gt3/samples:
public class HelloWorld {
    public static void main(String[] args) {
        // Print three blank lines, then the first command-line argument.
        System.out.println("\n\n\n" + args[0]);
    }
}
• Compile the Java program:
javac HelloWorld.java
• Execute the job on the remote machine:
globusrun -r concorde02.mcs.surrey.ac.uk '&(rsl_substitution=(OUTPUT_PATH /user/csckmst/css1tt/gt3/samples)
(JAVA_PATH /opt/SUNWjava/j2sdk1.4.1_01/bin))
(executable=$(JAVA_PATH)/java)(directory=$(OUTPUT_PATH))
(stdout=outputjava.stdout)(arguments=HelloWorld Hello)'
“rsl_substitution” lets the user define named values as attribute-value pairs. These names can then be referenced
elsewhere in the RSL string as $(NAME). The RSL parser replaces each reference with its value before the job is
executed. “rsl_substitution” is preferred because it improves the readability of the RSL string and
avoids repeating long declarations. “directory” is the job’s working directory.
The output file will be saved under $HOME/gt3/samples.
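To illustrate what the substitution step amounts to, here is a minimal Java sketch that expands $(NAME) references by plain text replacement. It is not the Globus RSL parser, which applies further rules; the class RslSubstitutionSketch and its expand method are hypothetical names, and Java 8 or later is assumed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RslSubstitutionSketch {
    // Replaces every $(NAME) reference with the value defined for NAME.
    // Simplified illustration only; the real parser is not reproduced here.
    public static String expand(String rsl, Map<String, String> definitions) {
        String result = rsl;
        for (Map.Entry<String, String> d : definitions.entrySet()) {
            // String.replace performs literal (non-regex) replacement.
            result = result.replace("$(" + d.getKey() + ")", d.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> defs = new LinkedHashMap<>();
        defs.put("OUTPUT_PATH", "/user/csckmst/css1tt/gt3/samples");
        defs.put("JAVA_PATH", "/opt/SUNWjava/j2sdk1.4.1_01/bin");
        String body = "(executable=$(JAVA_PATH)/java)(directory=$(OUTPUT_PATH))";
        // Prints the body with both references replaced by their paths.
        System.out.println(expand(body, defs));
    }
}
```

With the definitions from the command above, the executable attribute expands to /opt/SUNWjava/j2sdk1.4.1_01/bin/java and the directory attribute to /user/csckmst/css1tt/gt3/samples.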
Exercise: Save the RSL part of the request to a file and run the command
using that file.
The DUROC component of the Globus Toolkit allows resources to be co-allocated for
a single job. This means that multiple resources can be allocated simultaneously
for a single job, whose subjobs are then executed in parallel.
Aim: Run two different jobs in parallel. Each job lists the contents of a specific
directory “count” times and writes the output to the specified file. Create
an RSL file called “duroctest.rsl” containing the following lines:
+
(&(resourceManagerContact="concorde02.mcs.surrey.ac.uk")
  (executable="/bin/ls")
  (count=20)
  (arguments=/dev)
  (environment=(GLOBUS_DUROC_SUBJOB_INDEX 0))
  (label="subjob 0")
  (stderr="durocex.err")
  (stdout="duroc1.out")
)
(&(resourceManagerContact=concorde03.mcs.surrey.ac.uk)
  (executable="/bin/ls")
  (arguments=/sbin)
  (count=40)
  (environment=(GLOBUS_DUROC_SUBJOB_INDEX 1))
  (label="subjob 1")
  (stderr=durocex1.err)
  (stdout=duroc2.out)
)
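The multi-request above follows a regular pattern: a leading “+” followed by one parenthesized conjunction per subjob. The following Java sketch generates a string of that shape, which may help when extending the file to more machines. The class DurocRslSketch and its methods are hypothetical, the stderr attribute is omitted for brevity, and Java 8 or later is assumed.

```java
import java.util.ArrayList;
import java.util.List;

public class DurocRslSketch {
    // Builds one subjob: a conjunction tagged with its DUROC subjob index.
    // The output file naming (duroc1.out, duroc2.out, ...) mirrors the example file.
    public static String subjob(int index, String contact, String directory, int count) {
        return "(&(resourceManagerContact=\"" + contact + "\")"
             + "(executable=\"/bin/ls\")"
             + "(count=" + count + ")"
             + "(arguments=" + directory + ")"
             + "(environment=(GLOBUS_DUROC_SUBJOB_INDEX " + index + "))"
             + "(label=\"subjob " + index + "\")"
             + "(stdout=\"duroc" + (index + 1) + ".out\"))";
    }

    // A multi-request is "+" followed by the subjob conjunctions.
    public static String multiRequest(List<String> subjobs) {
        return "+" + String.join("", subjobs);
    }

    public static void main(String[] args) {
        List<String> subjobs = new ArrayList<>();
        subjobs.add(subjob(0, "concorde02.mcs.surrey.ac.uk", "/dev", 20));
        subjobs.add(subjob(1, "concorde03.mcs.surrey.ac.uk", "/sbin", 40));
        System.out.println(multiRequest(subjobs));
    }
}
```

Redirecting the program’s output to a .rsl file gives a file that can be passed to globusrun with -f.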
Run the following command:
globusrun -f duroctest.rsl
While it is running, run “ps -u css1tt” (replace css1tt with your login name) on the remote
machines. You will see that your processes are running. The output will
look like this:
14203 pts/0 00:00:00 tcsh
23556 ?     00:00:00 globus-job-mana
23573 ?     00:00:00 globus-job-mana
23609 ?     00:00:00 ls <defunct>
23640 ?     00:00:00 ls <defunct>
23671 ?     00:00:00 ls <defunct>
23702 ?     00:00:00 ls <defunct>
23733 ?     00:00:00 ls <defunct>
23749 ?     00:00:00 globus-gass-cac
Exercise: Write a Java program that counts from 1 to a given number, repeats the
count 10 times, and writes the results to the screen. Then run
this application on at least three machines in parallel. The upper limits must be
greater than 10000. Direct the output to a file under
$HOME/gt3/samples/RSLExample.
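One possible starting point for the counting program is sketched below. It is only one reading of the exercise: the class name CountSketch, the space-separated output format, and the default limit of 10001 are assumptions, and the RSL needed to run it on several machines is left to the reader.

```java
public class CountSketch {
    // Returns the numbers 1..limit as one space-separated line.
    public static String countLine(int limit) {
        StringBuilder line = new StringBuilder();
        for (int i = 1; i <= limit; i++) {
            if (i > 1) {
                line.append(' ');
            }
            line.append(i);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // The limit comes from the first argument; 10001 is an assumed default.
        int limit = args.length > 0 ? Integer.parseInt(args[0]) : 10001;
        // Repeat the count 10 times, one line per pass.
        for (int pass = 0; pass < 10; pass++) {
            System.out.println(countLine(limit));
        }
    }
}
```

When the program is submitted through RSL, the limit would be supplied through the arguments attribute and the output captured with the stdout attribute.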
• Important RSL attributes:
– arguments: command-line arguments passed to the executable
– count: number of executions of the executable
– directory: the directory the job manager uses as the job’s working directory
– environment: the environment variables required to execute the job
– stderr: remote file in which to store the job’s standard error
– stdin: remote file to be used as the job’s standard input
– stdout: remote file in which to store the job’s standard output
Global Access to Secondary Storage
(GASS) and GridFTP
– GridFTP: GridFTP is a high-performance, secure protocol
based on the Internet Engineering Task Force’s FTP standards.
It uses the GSI (Grid Security Infrastructure) for
authentication and adds extensions to the FTP protocol for
parallel data transfer, partial file transfer, and third-party
(server-to-server) data transfer.
– Data Replication: Keeping multiple copies of data improves access in
a distributed environment. When data is requested, the replica closest to
the access point is selected. There are two
data replication technologies: the ‘replica catalog’ and the ‘replica
management tool’.
– GASS: GASS allows applications to operate on remote files.
GASS also provides cache management.
• Run the GASS server:
globus-gass-server &
The output will be a URL.
• Copy HelloWorld.java to /grid/users/<your_name>.
• Open another terminal on another machine, such as
concorde02, and run:
grid-proxy-init
• Copy HelloWorld.java from concorde02 to
concorde03:
globus-url-copy \
https://concorde02.mcs.surrey.ac.uk:34344/grid/users/tugba/HelloWorld.java \
file://concorde03.mcs.surrey.ac.uk/grid/users/tugba/HelloWorld.java
• “globus-url-copy” can be used for local file transfers
as well. In such cases, it acts like the “cp” command in
Unix.
globus-url-copy \
file://concorde02.mcs.surrey.ac.uk/grid/users/tugba/HelloWorld.java \
file://concorde02.mcs.surrey.ac.uk/user/csckmst/css1tt/gt3/HelloWorld.java
• GridFTP also allows you to open multiple streams
between two machines when transferring
files, which allows faster transfers between
the machines. The number of streams is
specified with the “-p” argument.
globus-url-copy -p 3 \
https://concorde02.mcs.surrey.ac.uk:34344/grid/users/tugba/HelloWorld.java \
file://concorde03.mcs.surrey.ac.uk/grid/users/tugba/HelloWorld.java
References
• A Resource Management Architecture for Metacomputing Systems,
K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W.
Smith, and S. Tuecke; Proc. IPPS/SPDP '98 Workshop on Job
Scheduling Strategies for Parallel Processing, 1998.
• IBM, Grid job submission using the Java CoG Kit (prepared by Vladimir Silva):
http://www-106.ibm.com/developerworks/java/library/wsgridcog.html
• The Globus Resource Specification Language RSL v1.0 and
commands for running jobs:
http://www-fp.globus.org/gram/rsl_spec1.html
http://www.globus.org/v1.1/programs/globusrun.html
http://www.globus.org/toolkit/docs/3.0/gram/rsl-schema.html