Transcript Parallelism

KARMA with ProActive Parallel Suite
12/01/2009
Air France, Sophia Antipolis
Solutions and Services for Accelerating your Applications
ProActive Parallel Suite
Speed up applications by distributing them over a set of resources
The result of 9 years of R&D at INRIA
KARMA performance constraints
• KARMA process computation needs refer to:
  - HEAVY PROCESS (batch, long running)
  - LIGHT PROCESS (interactive, short running)
• Processes are expressed according to the following features:
  - Triggering mode (start and kill)
  - Parallelization capability (functional / technical)
  - Re-run ability (fault tolerance / recovery)
  - Performance (elapsed time)
  - Type (interactive, scheduled, event, exceptional, administrative)
  - Data volume (light, medium, heavy)
• Manage resource consumption
  - Availability
  - Resource optimization (CPU, Core, Memory)
  - Strategy / Policy (resource arbitration and task request prioritization)
  - Fault and error management
• Open system
  - Interoperability (upstream and downstream)
  - API (language, completeness, reliability)
  - Tools (friendliness, Eclipse integration ability, ...)
• Parallelization
  - Multiple processes capability (functional)
  - Parallelization programming capability (technical)
ProActive Technology Integration
• Simple and seamless integration
  - Installation: one directory
  - Deployment: one XML file (GCM Deployment)
  - Possible automated agent-based deployment
• Support for common standards
  - Connection/Acquisition: SSH, RSH, LSF, PBS, SGE,…
  - Communication: RMI, RMI/SSH, HTTP, Ibis,…
  - Export any activity as a Web Service
  - J2EE (in and out)
  - Validated on common JDKs 1.5, 1.6 and 1.7
  - Activities are monitored through standard JMX
  - Compliant with common monitoring tools (TIBCO Hawk)
ProActive Scheduling
Managed environment for distributed computation
Scheduler / Resource Manager Overview
• Multi-platform Graphical Client (RCP)
• File-based or LDAP authentication
• Static Workflow Job Scheduling, Native and Java tasks, Retry on Error, Priority Policy, Configuration Scripts,…
• Dynamic and Static node sources, Resource Selection by script, Monitoring and Control GUI,…
• ProActive Deployment capabilities: Desktops, Clusters, ProActive P2P,…
Parallelism with ProActive Scheduling
• Computations are defined as Jobs made of Tasks
• Functional parallelism (//)
  - Bag of Tasks: Task 1(input 1), Task 2(input 2), …, Task N(input N)
  - Tasks Flow: Task 1(input 1) produces res1, Task 2(input 2) produces res2, then Task 3(res1,res2)
• Technical parallelism (//)
  - ProActive application: Task 1(input 1)
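The two functional patterns on this slide can be sketched in a few lines. This is only an illustration built on the standard java.util.concurrent library, not on the ProActive Scheduler API; the class ParallelismSketch, the method names and the squaring task body are all assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

// Illustrative only: plain java.util.concurrent, not the ProActive Scheduler API.
final class ParallelismSketch {

    // Hypothetical task body: squares its input.
    static int compute(int input) {
        return input * input;
    }

    // Bag of Tasks: one independent task per input, no dependencies between them.
    static List<Integer> bagOfTasks(List<Integer> inputs, ExecutorService pool) {
        List<CompletableFuture<Integer>> futures = inputs.stream()
                .map(in -> CompletableFuture.supplyAsync(() -> compute(in), pool))
                .collect(Collectors.toList());
        // Wait for every task and collect the results in input order.
        return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }

    // Tasks Flow: Task 3 runs only once Task 1 and Task 2 have produced res1 and res2.
    static int tasksFlow(int input1, int input2, ExecutorService pool) {
        CompletableFuture<Integer> res1 = CompletableFuture.supplyAsync(() -> compute(input1), pool);
        CompletableFuture<Integer> res2 = CompletableFuture.supplyAsync(() -> compute(input2), pool);
        // Task 3(res1, res2): here simply the sum of both results.
        return res1.thenCombine(res2, Integer::sum).join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(bagOfTasks(List.of(1, 2, 3), pool)); // prints [1, 4, 9]
            System.out.println(tasksFlow(3, 4, pool));              // prints 25
        } finally {
            pool.shutdown();
        }
    }
}
```

In a bag of tasks the scheduler is free to run every task as soon as a resource is available, whereas in a task flow the dependency on res1 and res2 fixes a partial order.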
Defining Tasks
Tasks can be defined in Java or as Native Applications
• Dynamic Task creation through Java API
    // Create a Java task and name it
    JavaTask aTask = new JavaTask();
    aTask.setName("task 1");
    // Re-run settings: rerun count, and where to restart after an error
    aTask.setRerunnable(2);
    aTask.setRestartOnError(RestartMode.ELSEWHERE);
    // Named argument passed to the executable class
    aTask.addArgument("foo", Boolean.TRUE);
    aTask.setExecClassName("com.activeeon.Compute");
    // Script executed before the task itself
    aTask.setPreScript("/path/to/script_file");
• Can attach pre/post scripts to a task
  - Any JSR-223 supported language (JS, Python, Ruby,…)
  - Used for configuration and data transfer
• Data are handled separately
Defining Jobs
Jobs can be defined in Java or XML
• Dynamic job creation through Java API
    // Create a task-flow job and set its name and priority
    TaskFlowJob job = new TaskFlowJob();
    job.setName("job_name");
    job.setPriority(JobPriority.NORMAL);
    // Cancel the whole job if one of its tasks fails
    job.setCancelOnError(true);
    // cmd: the native command line, defined elsewhere
    job.addTask(new NativeTask(cmd));
  - Makes it easy to develop tools that generate and submit jobs (specialized, automated)
• Static description in XML descriptor
  - Can be defined by any non-specialized user
Executing Jobs
• Submission through…
  - Java API
  - Graphical interface
  - Command line
• Recovery is fully configurable, with a User/System level distinction
  1. Host failure: the task is automatically re-dispatched with the needed input
     - Number of retries bounded by the administrator
  2. Task failure: can be either…
     - Automatically re-dispatched with the needed input (number of retries bounded by the user)
     - Aborted (Task): the exception is transmitted as the task result
     - Aborted (Job): the job is fully canceled
• The Scheduler itself is recoverable from its underlying database
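The task-failure options above can be made concrete with a small retry loop. This is a minimal sketch under stated assumptions, not ProActive Scheduler code: RecoverySketch, OnFailure, TaskResult and JobCancelledException are hypothetical names, and the loop runs locally instead of re-dispatching to another host.

```java
import java.util.concurrent.Callable;

// Illustrative only: a hypothetical retry loop, not ProActive Scheduler code.
final class RecoverySketch {

    // What to do once the retry budget is exhausted (hypothetical names).
    enum OnFailure { ABORT_TASK, CANCEL_JOB }

    // Outcome of one task: either a value, or the exception kept as the task result.
    static final class TaskResult<T> {
        final T value;
        final Exception error;
        final int attempts;

        TaskResult(T value, Exception error, int attempts) {
            this.value = value;
            this.error = error;
            this.attempts = attempts;
        }

        boolean failed() {
            return error != null;
        }
    }

    // Signals that the whole job is canceled because one task failed.
    static final class JobCancelledException extends RuntimeException {
        JobCancelledException(Throwable cause) {
            super("job cancelled", cause);
        }
    }

    // Runs the task once, then re-runs it at most maxRetries times on failure.
    static <T> TaskResult<T> run(Callable<T> task, int maxRetries, OnFailure policy) {
        Exception last = null;
        int attempts = 0;
        while (attempts <= maxRetries) {
            attempts++;
            try {
                return new TaskResult<>(task.call(), null, attempts);
            } catch (Exception e) {
                last = e; // remember the failure; retry if the budget allows
            }
        }
        if (policy == OnFailure.CANCEL_JOB) {
            throw new JobCancelledException(last);
        }
        // ABORT_TASK: the exception becomes the task result.
        return new TaskResult<>(null, last, attempts);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        TaskResult<String> r = run(() -> {
            if (++calls[0] < 3) throw new IllegalStateException("node lost");
            return "done";
        }, 2, OnFailure.ABORT_TASK);
        System.out.println(r.value + " after " + r.attempts + " attempts"); // prints "done after 3 attempts"
    }
}
```

The retry bound is a plain parameter here; on the slide it is set by the user for task failures and by the administrator for host failures.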
Monitoring Jobs
• System and ProActive level monitoring
  - Through standard JMX
  - Provides a generic tool for monitoring and reporting
  - Compliant with common monitoring tools
• Scheduler level monitoring
  - Ongoing porting from MVC to JMX
Controlling Jobs
• Fine-grained control through GUI and API
  - Job level (User)
  - Scheduler level (Admin)
Specializing the Scheduler
Business Level specialization and automation
• ActiveEon provides business-specific interfaces for the Scheduler
  - Bio-technology (Gold, MapRead)
  - Finance
  - MatLab/Scilab
• User-level hooks to (very) easily specialize the Scheduler
  - Plug-in for the RCP client (e.g. ActiveEon’s Gold module)
  - Result preview mechanism
    - Results can be previewed in the scheduler GUI (textual/graphical)
    - Any external application can be attached to a task for result display
ProActive Resource Manager
• Relies on ProActive Programming features
  - GCM deployment standard
  - Agent-based deployment
• Support for many LAN / WAN / Grid / Cloud standards…
[Figure: straightforward upgrade path across Multi-CPU/Core, Cluster, Desktop Grid, Clusters / Grid, Clouds / RAS]
• Deals with resource heterogeneity through Selection Scripts
  - User defined: no need for system reconfiguration
  - Any JSR-223 supported language (JS, Python, Ruby,…)
  - Can rely on a dynamic decision or on a static host description
  - Optimized (probabilistic selection of candidates)
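The idea behind selection scripts can be illustrated as a user-supplied predicate evaluated against each candidate node. The sketch below is an assumption-laden model, not the ProActive Resource Manager API: SelectionSketch, Node and its fields are hypothetical, the script is a Java lambda rather than a JSR-223 script, and the probabilistic optimization mentioned above is not modelled.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Illustrative only: a hypothetical model of script-based node selection,
// not the ProActive Resource Manager API.
final class SelectionSketch {

    // Hypothetical node: a static description (os) plus a value probed at selection time.
    static final class Node {
        final String name;
        final String os;
        final int freeMemoryMb;

        Node(String name, String os, int freeMemoryMb) {
            this.name = name;
            this.os = os;
            this.freeMemoryMb = freeMemoryMb;
        }
    }

    // Returns the names of the nodes accepted by the user-supplied selection script.
    static List<String> select(List<Node> nodes, Predicate<Node> script) {
        return nodes.stream()
                .filter(script)
                .map(n -> n.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
                new Node("desktop-1", "windows", 4096),
                new Node("cluster-1", "linux", 1024),
                new Node("cluster-2", "linux", 8192));
        // Static criterion (OS) combined with a dynamic one (free memory).
        Predicate<Node> script = n -> n.os.equals("linux") && n.freeMemoryMb >= 2048;
        System.out.println(select(nodes, script)); // prints [cluster-2]
    }
}
```

Because the criterion lives in the script supplied with the task, adding a new kind of resource needs no change to the resource manager configuration, which is the point made by "no need for system reconfiguration".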
Thank you for your attention
Demo