
UNICORE
Introduction to the Intel Client
and a look behind the scenes…
Grid Summer School, July 28, 2004
Ralf Ratering
Intel
Parallel and Distributed Solutions Division (PDSD)
Outline




Getting started with the UNICORE client
Constructing jobs in the client
Integrated application support
A real-world application
-2-
The Intel UNICORE Client




Graphical interface to UNICORE Grids
Platform-independent Java application
Open Source available from UNICORE Forum
Functionality:
– Job preparation, monitoring and control
– Complex workflows
– File management
– Certificate handling
– Integrated application support
-3-
History of UNICORE Client Versions
Timeline 1997 to 2004 (milestones in order):
– Early prototypes developed in UNICORE project
– First stable version 3.0
– Enhanced functionality: version 4
– Final version from Grip project: 5.0 Build 4
– Now: UNICORE 5.1, OpenSource project at unicore.sourceforge.net
-4-
Starting the Client

Prerequisites: Java ≥ 1.4.2
UNICORE configuration directory <.unicore> in your HOME directory

Automatically creates an empty keystore and imports trusted certificates from „cert“ directory
Define password for your unicore keystore file (.unicore/keystore)
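The keystore step above can be pictured with the standard java.security.KeyStore API. This is only a minimal sketch: the class name, the in-memory byte array and the use of the JVM default keystore type are assumptions for illustration, not the client's actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

// Hypothetical helper, not part of the UNICORE client.
final class EmptyKeystoreSketch {

    /** Creates an empty keystore and serializes it, protected by the given password. */
    static byte[] createEmptyKeystore(char[] password) {
        try {
            KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
            ks.load(null, password); // a null stream initializes an empty keystore
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ks.store(out, password);
            return out.toByteArray();
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException("Could not create keystore", e);
        }
    }

    /** Reloads a serialized keystore and returns the number of entries it holds. */
    static int countEntries(byte[] data, char[] password) {
        try {
            KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
            ks.load(new ByteArrayInputStream(data), password);
            return ks.size();
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException("Could not read keystore", e);
        }
    }

    public static void main(String[] args) {
        char[] password = "change-me".toCharArray();
        byte[] data = createEmptyKeystore(password);
        System.out.println("Entries in new keystore: " + countEntries(data, password));
    }
}
```

In a real setup the bytes would be written to a file such as .unicore/keystore, and trusted certificates would be added with KeyStore.setCertificateEntry before storing.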
-5-
Getting a Test Certificate
Certificate signing request (CSR)
Information will be used to generate a test certificate for you.

„Import test certificates“ from „Settings->Keystore Editor“
CA web service endpoint
-6-
Certificate Web Services



Low Security Model for Test Grid Access
Certificates are imported automatically into Client
Currently implemented at Research Center Jülich:
– Add an identity verification step on server side
[Diagram: CLIENT and SERVER (Certificate Service) exchange: Request; Certificate Signing Request; Trusted Certificates; User Certificate; Test CA Certificate]
-7-
Ready to go? „Hello Grid World!“
UNICORE Site == Gateway
Typically represents a computing center
Virtual Site == Network Job Supervisor
Typically represents target system
1. Execute a simple script on the UNICORE Test Grid
2. Get back standard output and standard error
DEMO
-8-
Behind the Scenes: Authentication
Client (User Certificate): send user certificate; trust gateway certificate issuer?
Gateway (Gateway Certificate): send gateway certificate; trust user certificate issuer?
Result: establish SSL connection
-9-
Behind the Scenes: Authorization
[Diagram labels: Typical UNICORE User with Certificates 1 to 5 and Logins A to E; Test Grid User; Client; Gateway; AJO with User Certificate; check "AJO Certificate == SSL Certificate?"; UUDB: User Certificate to User Login; NJS; IDB; TSI]
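The authorization step can be sketched as a lookup table from certificate to login, guarded by the "AJO Certificate == SSL Certificate?" check. This is only a toy model: certificates are represented by plain string identifiers, and the class and method names are hypothetical rather than the real UUDB interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model of a user database (UUDB); all names here are assumptions.
final class UudbSketch {

    private final Map<String, String> loginByCertificate = new HashMap<>();

    /** Registers a mapping from a certificate identifier to a local login. */
    void addUser(String certificateId, String login) {
        loginByCertificate.put(certificateId, login);
    }

    /**
     * Returns the login for a job, or empty if the certificate that signed the AJO
     * differs from the SSL peer certificate or is not registered.
     */
    Optional<String> authorize(String ajoCertificateId, String sslCertificateId) {
        if (!ajoCertificateId.equals(sslCertificateId)) {
            return Optional.empty(); // AJO Certificate == SSL Certificate? -> no
        }
        return Optional.ofNullable(loginByCertificate.get(ajoCertificateId));
    }

    public static void main(String[] args) {
        UudbSketch uudb = new UudbSketch();
        uudb.addUser("certificate-1", "loginA");
        System.out.println(uudb.authorize("certificate-1", "certificate-1"));
        System.out.println(uudb.authorize("certificate-1", "certificate-2"));
    }
}
```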
- 10 -
Behind the Scenes:
Creation & Submission
CLIENT: Script Container, Abstract Job Object
SERVER:
1. IncarnateFiles: create file with script contents
2. MakePortfolio: wrap file in portfolio
3. ExecuteScriptTask: execute portfolio as script
Script_HelloWorld1234...
Job Directory (USpace): a temporary directory at the target system where the job will be executed
- 11 -
Monitoring the Job Status
Successful: job has finished successfully
Not successful: job has finished, but a task failed
Executing: Parts of a job are running or queued
Running: Task is running
Queued: Task is queued at a batch sub system
Pending: Task is waiting for a predecessor to finish
Killed: Task has been killed manually
Held: Task has been held manually
Ready: Task is ready to be processed by NJS
Never run: Task was never executed
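For tooling built around the job monitor, the status list above could be captured as an enumeration. The sketch below is an assumption for illustration; the enum and its lookup method are hypothetical names and do not reflect the client's internal types.

```java
// Hypothetical enumeration of the status labels listed above.
enum JobStatusSketch {
    SUCCESSFUL("Successful", "job has finished successfully"),
    NOT_SUCCESSFUL("Not successful", "job has finished, but a task failed"),
    EXECUTING("Executing", "parts of a job are running or queued"),
    RUNNING("Running", "task is running"),
    QUEUED("Queued", "task is queued at a batch sub system"),
    PENDING("Pending", "task is waiting for a predecessor to finish"),
    KILLED("Killed", "task has been killed manually"),
    HELD("Held", "task has been held manually"),
    READY("Ready", "task is ready to be processed by NJS"),
    NEVER_RUN("Never run", "task was never executed");

    final String label;
    final String meaning;

    JobStatusSketch(String label, String meaning) {
        this.label = label;
        this.meaning = meaning;
    }

    /** Looks up a status by its displayed label, ignoring case and surrounding blanks. */
    static JobStatusSketch fromLabel(String label) {
        for (JobStatusSketch s : values()) {
            if (s.label.equalsIgnoreCase(label.trim())) {
                return s;
            }
        }
        throw new IllegalArgumentException("Unknown status: " + label);
    }

    public static void main(String[] args) {
        for (JobStatusSketch s : values()) {
            System.out.println(s.label + ": " + s.meaning);
        }
    }
}
```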
- 12 -
The Primes Example
ArrBreakKey.java

public void breakKey() {
    try {
        BufferedReader br = new BufferedReader(new FileReader("primes.txt"));
        while (true) {
            inputLine = br.readLine();
            st = new StringTokenizer(inputLine," ");
            val = new BigInteger(st.nextToken());
            if ( (N.mod(val).compareTo(BigInteger.ZERO)) == 0) {
                p = val;
                q = N.divide(val);
                return;
            }
        }
    } catch (NullPointerException e) {
        System.out.println("Done!");
    } catch (IOException e)
    {
        System.err.println("IO Error:" + e);
    }
    ...
    p = BigInteger.ZERO;
    q = BigInteger.ZERO;
}

Primes.txt

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79
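The slide shows only a fragment of ArrBreakKey.java. As a self-contained illustration of the same trial-division idea, the sketch below takes the candidate primes as a list instead of reading primes.txt; the class name, method signature and sample modulus are assumptions, not the original source.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Illustrative re-implementation of the trial-division idea; names are hypothetical.
final class TrialDivisionSketch {

    /**
     * Returns {p, q} with p * q = n for the first listed prime p that divides n,
     * or {0, 0} if none of the listed primes divides n.
     */
    static BigInteger[] breakKey(BigInteger n, List<BigInteger> primes) {
        for (BigInteger val : primes) {
            if (n.mod(val).signum() == 0) {
                return new BigInteger[] { val, n.divide(val) };
            }
        }
        return new BigInteger[] { BigInteger.ZERO, BigInteger.ZERO };
    }

    public static void main(String[] args) {
        int[] small = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79 };
        List<BigInteger> primes = new ArrayList<>();
        for (int s : small) {
            primes.add(BigInteger.valueOf(s));
        }
        BigInteger n = BigInteger.valueOf(4757); // 67 * 71
        BigInteger[] pq = breakKey(n, primes);
        System.out.println("p = " + pq[0] + ", q = " + pq[1]); // prints "p = 67, q = 71"
    }
}
```

Unlike the fragment on the slide, this version ends the loop when the list is exhausted rather than relying on a NullPointerException at end of file.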
- 13 -
Demo 1: "Gridify" the Primes Example
1. Import java file (ArrBreakKey.java from CLIENT to SERVER)
2. Compile java file (ArrBreakKey.java to ArrBreakKey.class)
3. Execute class file in Job Directory (USpace)
4. Get result in stdout/stderr
DEMO
- 14 -
Behind the Scenes: Software Resources
CLIENT
Command Task: executes a software resource, or command (a binary that will be imported into the job directory)
SERVER
Incarnation Database (IDB): Application Resources contain system specific information, absolute paths, libraries, environment variables, etc.
APPLICATION javac 1.4
Description „Java Compiler“
INVOCATION [
/usr/local/java/bin/javac
]
END
- 15 -
Behind the Scenes: Fetching Outcome
SERVER: Job Directory (USpace)
ArrBreakKey.java
2. Compile java file: stdout, stderr
ArrBreakKey.class
3. Execute class file: stdout, stderr
Fetch Outcome
CLIENT: Session Directory
Configurable in User Defaults: Paths->Scratch Directory
stdout, stderr of both tasks
Files Directory
- 16 -
Demo 2: Steer the Lattice Boltzmann Simulation
CLIENT: Plugin Task with Sample Panel, Editor, Export Panel, Control Panel
SERVER: Job Directory with Lattice-Boltzmann Simulation Code
– reads: input file, control file
– writes: sample.gif, output.gif
DEMO
- 17 -
Behind the Scenes: Plug-in Concept

Add your own functionality to the client!
– Heavily used in research projects all over the world
– More than 20 plug-ins already exist



No changes to basic client software needed
Plug-ins are written in Java
Distribution as signed jar archives
- 18 -
Using 3rd Party Plug-ins



Get plug-in jar file from web-site, email, CD-ROM, etc.
Store it in client‘s plug-in directory
Client will check plug-in signature
Import plug-in certificates
from the actions menu in
the keystore editor
1. Is one certificate in the chain a trusted entry in the keystore?
   no: REJECT
   yes: continue with 2.
2. Is the signing certificate a trusted entry in the keystore?
   yes: LOAD
   no: continue with 3.
3. Add signing certificate to keystore?
   yes: LOAD
   no: REJECT
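The plug-in signature check can be read as a three-question decision. The sketch below encodes one reading of the flowchart with boolean inputs; the class, enum and parameter names are hypothetical, and the real client works on actual certificate chains rather than flags.

```java
// One reading of the plug-in trust flowchart; all names are assumptions.
final class PluginTrustSketch {

    enum Decision { LOAD, REJECT }

    /**
     * @param chainHasTrustedEntry a certificate in the chain is a trusted keystore entry
     * @param signerTrusted        the signing certificate itself is a trusted keystore entry
     * @param userAddsSigner       the user agrees to add the signing certificate to the keystore
     */
    static Decision decide(boolean chainHasTrustedEntry, boolean signerTrusted, boolean userAddsSigner) {
        if (!chainHasTrustedEntry) {
            return Decision.REJECT; // no trust anchor anywhere in the chain
        }
        if (signerTrusted) {
            return Decision.LOAD;   // signer already trusted
        }
        return userAddsSigner ? Decision.LOAD : Decision.REJECT;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, false, true));  // prints "LOAD"
        System.out.println(decide(false, true, true));  // prints "REJECT"
    }
}
```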
- 19 -
Task Plug-ins



Add a new type of task to the client GUI
New task can be integrated into complex jobs
Application support: CPMD, Fluent, Gaussian, etc.
Add task
item
Settings
item
Icon
Plugin
info
- 20 -
A Task Plug-in: CPMD

Workflow for Car–Parrinello molecular dynamics code
Input: conf_file1
Wavefunction
Optimization
Geometry
Optimization
Output: stdout stderr
RESTART.1, LATEST, ...
further optimization
?
Input: conf_file2 RESTART
Other ...
MD Run
re-iterate
?
further evaluation
Visualization
- 21 -
A Task Plug-in: CPMD
CPMD Plug-In Task used in UNICORE workflows
CPMD wizard assists in setting up the input parameters
- 22 -
A Task Plug-in: CPMD

Visualize results
- 23 -
Supporting an application at a site


Install the application itself
Add entry to the Incarnation Database (IDB)
APPLICATION CPMD 3.4.1
Description „Car Parrinello Molecular Dynamics Code“
INVOCATION [
export JOBTYPE=8E8;
/usr/mpi/bin/mpiexec -p IAPAR -n $UC_PROCESSORS
/usr/local/bin/cpmd.x
$CPMD_FILE $PP_LIBRARY
]
- 24 -
Extension Plug-ins


Add any other functionality
Resource Broker, Interactive Access, etc.
JPA toolbar
Settings
item
Extensions
menu
Virtual site
toolbar
Plugin info
- 25 -
An Extension Plug-in: Resource Broker



Specify resource requests in your job
Submit it to a broker site
Get back offers from broker
- 26 -
Existing Plug-Ins (incomplete)













CPMD (FZ Jülich)
Gaussian (ICM Warsaw)
Amber (ICM Warsaw)
Visualizer (ICM Warsaw)
SQL Database Access (ICM Warsaw)
PDB Search (ICM Warsaw)
Nastran (University of Karlsruhe)
Fluent (University of Karlsruhe)
Star-CD (University of Karlsruhe)
Dyna 3D (T-Systems Germany)
Local Weather Model (DWD)
POV-Ray (Pallas GmbH)
...

Resource Broker (University of Manchester)
Interactive Access (Parallab Norway)
Billing (T-Systems Germany)
Application Coupling (IDRIS France)
Plugin Installer (ICM Warsaw)
Auto Update (Pallas GmbH)
...
- 27 -
CLIENT
Using File Tasks
Local
Home
Root
USpace
Temp
Storage Server
SERVER 2
SERVER 1
Spool
Home
USpace
Root
Temp
Storage Server
- 28 -
How to specify resource requests?




Tasks can have resource sets containing requests
If no resource set is attached, default resources are used
Resource sets can be edited, loaded and saved
If a resource request does not match the resources available at a site, the client displays an error
Resource Set 1
Resource Set 2
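The matching of a resource request against a site could look roughly like the sketch below. It is a simplified model: resources are reduced to named integer limits, and the class name, resource names and error texts are assumptions rather than the client's actual resource model.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of checking a resource set against a site; names are hypothetical.
final class ResourceCheckSketch {

    /** Returns one message per request that the site cannot satisfy (empty list = match). */
    static List<String> check(Map<String, Integer> request, Map<String, Integer> available) {
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, Integer> e : request.entrySet()) {
            Integer limit = available.get(e.getKey());
            if (limit == null) {
                errors.add(e.getKey() + ": not offered by this site");
            } else if (e.getValue() > limit) {
                errors.add(e.getKey() + ": requested " + e.getValue() + ", site offers at most " + limit);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, Integer> site = new LinkedHashMap<>();
        site.put("processors", 4);
        site.put("memoryMB", 2048);

        Map<String, Integer> request = new LinkedHashMap<>();
        request.put("processors", 8);
        request.put("memoryMB", 1024);

        for (String error : check(request, site)) {
            System.out.println(error);
        }
    }
}
```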
- 29 -
Demo 3: Run a multi site job
1. Use the primes example
2. Compile the source file on one virtual site
3. Transfer the resulting class file to a sub job running at a different virtual site
4. Execute the class file in the sub job
DEMO
- 30 -
Behind the Scenes: Authorization
[Diagram labels: Client sends AJO with User Certificate; Site A with Gateway, NJS and UUDB (User Certificate to User Login); SubAJO with User Certificate passed on to Site B with Gateway, NJS and UUDB; check "SSL Certificate == Trusted NJS?"]
- 31 -
Complex Workflow: Control Tasks
Do Repeat Loop
If Then Else
Do N Loop
Hold Task
- 32 -
Demo 4: Test the return code in a loop
import java.util.Random;

public class Application {
    public static void main(String[] args) {
        Random rnd = new Random(System.currentTimeMillis());
        double random = rnd.nextDouble();
        System.out.println("RANDOM: " + random);
        int exitCode = (int)(5*random);
        System.out.println("EXIT CODE: " + exitCode);
        System.exit(exitCode);
    }
}
Repeat execution until it fails with exit code 2!
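What the repeat loop does in this demo can be simulated locally. The sketch below is an assumption for illustration: it reuses the exit-code formula from the demo application but runs everything in one JVM, with a hypothetical iteration limit as a safety net.

```java
import java.util.Random;

// Local simulation of "repeat until exit code 2"; not how the NJS evaluates loops.
final class RepeatUntilExitCodeSketch {

    /** Same computation as the demo application: an exit code between 0 and 4. */
    static int runOnce(Random rnd) {
        return (int) (5 * rnd.nextDouble());
    }

    /**
     * Repeats runs until one returns stopCode.
     * Returns the number of runs needed, or -1 if maxIterations was reached first.
     */
    static int repeatUntil(Random rnd, int stopCode, int maxIterations) {
        for (int i = 1; i <= maxIterations; i++) {
            if (runOnce(rnd) == stopCode) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int runs = repeatUntil(new Random(), 2, 1000);
        System.out.println("Stopped after " + runs + " run(s)");
    }
}
```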
DEMO
- 33 -
Behind the Scenes: Ignore Failure


UNICORE jobs stop execution when a task fails
Sometimes task failure is acceptable
– If and DoRepeat conditions
– Tasks that try to use restart files
– Whenever you do not care about task success

Set „Ignore Failure“ flag on Task (Right Mouse Click in Dependency Editor)
- 34 -
Loops: Accessing the iteration counter

Iteration variable: $UC_ITERATION_COUNTS

Lives on server side
Supported in

– Script Tasks
– File Tasks
– Re-direction of stdout/stderr

Nested loops: iteration numbers are separated by „_“, e.g. „2_3“

Caution: counter will not be propagated to sub jobs
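The counter format for nested loops can be illustrated with a few lines of Java. The real substitution happens on the server side; the sketch below only mimics it, and the class name, the file name and the outer-loop-first ordering are assumptions.

```java
// Mimics the $UC_ITERATION_COUNTS format locally; names are hypothetical.
final class IterationCounterSketch {

    static final String VARIABLE = "$UC_ITERATION_COUNTS";

    /** Joins the iteration numbers of nested loops with "_" (assumed order: outermost first). */
    static String counterValue(int... iterations) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < iterations.length; i++) {
            if (i > 0) {
                sb.append('_');
            }
            sb.append(iterations[i]);
        }
        return sb.toString();
    }

    /** Replaces every occurrence of the variable in the given text. */
    static String substitute(String text, int... iterations) {
        return text.replace(VARIABLE, counterValue(iterations));
    }

    public static void main(String[] args) {
        // prints "result_2_3.out"
        System.out.println(substitute("result_$UC_ITERATION_COUNTS.out", 2, 3));
    }
}
```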
- 35 -
Integrated Application Example: POV-Ray
Display
CLIENT
Scene Description
#include "colors.inc"
#include "shapes.inc"
camera {
location <50.0, 55.0, -75.0>
direction z
}
plane {y, 0.0 texture {pigment {RichBlue }}}
object { WineGlass translate -x*12.15}
light_source { <10.0,50.0,35.0> colour White }
...
SERVER
Command
Line
Parameters
Input Files
Output Image
Job Directory (USpace)
Include
Files
Demo Image from Pov-Ray Distribution
Libraries
POV-Ray
Application
Remote File System (XSpace)
- 36 -
Demo 5: Hold and release a job
1. Render Background Image
2. Hold Job to check Image
3. Manually Resume Job Execution
4. Render Final Image
DEMO
Demo Images from Pov-Ray Distribution
- 37 -
Job Monitor Actions
– Get new status for a site, job or task
– Remove job from server. Deletes local and remote temporary directories
– Get stdout, stderr and exported files of a job
– Kill job
– Hold job execution
– Resume a job that was held by a „Hold Job“ action or a Hold task
– Copy a job from the job monitor. The job can be pasted into the job preparation tree and re-run, e.g. with different parameters
– Show dependencies of job
– Show resources for task
- 38 -
Caching Resource Information

Client works on cached resource information
– UNICORE Sites, Virtual Sites, available resources

Resource cache will be updated on...
– ... startup
– ... refresh on „Job Monitoring“ tree node

Client uses cached information in offline mode
- 39 -
Accessing other UNICORE Sites
Job Monitor Root: performing a „Refresh“ on this node will reload UNICORE Sites
UNICORE Sites: will be read from an XML file, which can be a URL on the web
Virtual Sites: are configured at the UNICORE Site
- 40 -
Configuration: Using Different Identities
Key entries: Who am I?
Using different identities
- 41 -
Browsing Remote File Systems

Remote File Chooser
– Used in Script Task, Command Task, for File Imports, Exports, etc.
Select virtual site or „Local“
Preemptive file chooser mode will enhance performance on fast file systems
- 42 -
The Client Log


„clientlog.txt“ or „clientlog.xml“
Used by developers to figure out problems
User Defaults->Paths:
User Defaults->Logging Settings:
INFO should be fine
Use PLAIN
Enable under Windows, when no console is used
- 43 -
Starting the client re-visited

client.jar in lib directory
– start with .exe (Windows) or run script (Unix/Linux)
– or: „java -jar client.jar“

Command line options
– Choose an alternative configuration directory:
 -Dcom.pallas.unicore.configpath=<mypath>
– Enable the security manager:
 -Dcom.pallas.unicore.security.manager
– Enable SOCKS proxy:
 -DsocksProxyHost="socks-proxy.isw.intel.com"
 -DsocksProxyPort="1080"
- 44 -
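Options passed with -D on the command line of the previous slide become Java system properties. How the client evaluates them internally is not shown, so the following is only a sketch under assumptions: a hypothetical helper that prefers com.pallas.unicore.configpath and otherwise falls back to .unicore in the user's home directory.

```java
import java.io.File;
import java.util.Properties;

// Hypothetical helper showing how a -D option could be evaluated.
final class ConfigPathSketch {

    static final String CONFIG_PATH_PROPERTY = "com.pallas.unicore.configpath";

    /** Returns the configured directory, or <user.home>/.unicore if the option is not set. */
    static File resolveConfigDir(Properties props) {
        String override = props.getProperty(CONFIG_PATH_PROPERTY);
        if (override != null && !override.isEmpty()) {
            return new File(override);
        }
        return new File(props.getProperty("user.home"), ".unicore");
    }

    public static void main(String[] args) {
        // Try: java -Dcom.pallas.unicore.configpath=/tmp/myconfig ConfigPathSketch
        System.out.println(resolveConfigDir(System.getProperties()));
    }
}
```

The SOCKS settings need no such code: socksProxyHost and socksProxyPort are standard Java networking properties that the JVM picks up by itself.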
A real world Enterprise application:
UNICORE inside Intel

Software testing at Parallel and
Distributed Solutions Division (PDSD)
– Windows TSI port on server side
– Complex existing testing environment
Systems: INNL system, CGSL system, KSL system, ...
MPI implementations: MPICH, MPICH2, ...
Versions: version x, version y, ...
Test suites: PMB, Intel Test Suite, NPB, ...
1. Build with parameters
2. Run with parameters
3. Get result files
- 45 -
Intel PDSD Grid
Nizhny Novgorod, Russia
Champaign, Illinois
4 Node
Xeon™ Cluster
4 Node
Xeon™ Cluster
4 Node
Xeon™ Cluster
Cologne, Germany
2 Node
Xeon™ Cluster
4 x Itanium® 2

UNICORE makes testing different versions
on distributed systems a lot easier
- 46 -
Lessons learned…

Security is negligible within intranet
– Systems are protected by firewall

Firewalls in the Intranet are a problem
– Administrators have to open ports for every new NJS to the Gateways

Users come and go
– Managing user database and logins too complex

Solutions
– Open port range in firewalls
– All testers use the same user certificate!!!
- 47 -
Summary

Intel UNICORE Client offers an intuitive user interface to UNICORE Grids

Client can be downloaded as Open Source at unicore.sourceforge.net

Client functionality can be extended through plug-in interface
- 48 -