Message Passing Framework for Globally Interconnected Clusters


A-JUMP: Message Passing Framework for Globally Interconnected Clusters
Mr. Sajjad Asghar
Advance Scientific Computing Group, National Centre for Physics, Islamabad, Pakistan.
Research Papers
- Mehnaz Hafeez, Sajjad Asghar, Usman A. Malik, Adeel-ur-Rehman, and Naveed Riaz, "Message Passing Framework for Globally Interconnected Clusters", CHEP 2010, Taipei, Taiwan (accepted).
- Sajjad Asghar, Mehnaz Hafeez, Usman A. Malik, Adeel-ur-Rehman, and Naveed Riaz, "A Framework for Parallel Code Execution using Java", IEEE International Conference TENCON 2010, Fukuoka, Japan (accepted).
- Mehnaz Hafeez, Sajjad Asghar, Usman A. Malik, Adeel-ur-Rehman, and Naveed Riaz, "Survey of MPI Implementations delimited by Java", IEEE International Conference TENCON 2010, Fukuoka, Japan (accepted).
- Adeel-ur-Rehman, Sajjad Asghar, Usman A. Malik, Mehnaz Hafeez, and Naveed Riaz, "Multi-lingual Binding of MPI Code on A-JUMP", IEEE International Conference on Frontiers of Information Technology, Islamabad, Pakistan (submitted).
- Sajjad Asghar, Mehnaz Hafeez, Usman A. Malik, Adeel-ur-Rehman, and Naveed Riaz, "A-JUMP, Architecture for Java Universal Message Passing", IEEE International Conference on Frontiers of Information Technology, Islamabad, Pakistan (submitted).

Layout
- Introduction
- Related Work
- A-JUMP Framework
- Message Passing for Interconnected Clusters
- Performance Analysis
- Future Work & Conclusion
- References

Message Passing Interface
- De facto standard; defines interfaces for C/C++ and Fortran.
- The programmer defines the work and data distribution, and how and when communication is performed, according to the underlying algorithm.
- Synchronization is implicit (it can also be user-defined).
- Supports distributed-memory architectures.

Message Passing Interface Features
1. Standardization
2. Portability
3. Performance
4. Efficient implementations
5. Availability
6. Functionality
7. Scalability

The MPI Forum identified some critical shortcomings of existing message passing systems in areas such as complex data layouts or support for modularity and safe communication.

Java and MPI
- Java is an excellent basis for implementing a message passing framework:
  - Java supports a multi-paradigm communications environment.
  - Java bytecode can be executed securely on many platforms.
- It allows the integration of many technologies used in existing software development.
- Java provides many desired features for building a wide range of applications: standalone, client-server, semantic web, enterprise, mobile, graphical, or distributed.
- Thus Java is an excellent basis for achieving productivity in parallel applications.

Architecture for Java Universal Message Passing (A-JUMP) Framework
- A-JUMP is an HPC framework built for distributed-memory architectures to facilitate explicit parallelism.
- A-JUMP is written purely in Java.
- A-JUMP is based on the mpiJava 1.2 specification.
- A-JUMP's backbone is called the High Performance Computing Bus (HPC Bus).

HPC Bus of A-JUMP
- The communication backbone of A-JUMP, based on Apache ActiveMQ.
- Handles inter-process communication, monitoring, and propagation of Registry information.
- Provides interoperability between hardware resources and communication protocols.
- Ensures that any changes in communication protocols and network topologies remain transparent to end users.
- Ensures loosely coupled services and components.
- Segregates the application and communication layers.

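Since the HPC Bus is based on Apache ActiveMQ, message passing on the bus boils down to JMS messaging. The following is a minimal, hypothetical sketch (not A-JUMP source code) of sending a message asynchronously through an ActiveMQ broker; the broker URL and queue name are illustrative assumptions.

```java
// Minimal sketch of asynchronous messaging over ActiveMQ, the middleware
// the HPC Bus is based on. Queue name "ajump.node.1" is hypothetical.
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class BusSendExample {
    public static void main(String[] args) throws JMSException {
        ConnectionFactory factory =
            new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();

        // Non-transacted, auto-acknowledged session.
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Destination dest = session.createQueue("ajump.node.1");

        // The sender returns immediately; the broker buffers the message
        // until the receiving process consumes it (asynchronous mode).
        MessageProducer producer = session.createProducer(dest);
        producer.send(session.createTextMessage("payload for rank 1"));

        connection.close();
    }
}
```

Because the application only talks to the broker, swapping the transport or topology underneath the bus leaves this code untouched, which is the transparency the slides describe.
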
HPC Bus of A-JUMP
[Figure: HPC Bus connecting clients, including a C++ client]

Architecture for Java Universal Message Passing (A-JUMP) Framework
A-JUMP is designed to provide the following features for writing parallel applications:
- Message passing for interconnected clusters
- Multiple languages
- Interoperability
- Portability
- Heterogeneity of the resource environment
- Network topology independence
- Multiple network protocol support
- Communication mode autonomy (i.e. asynchronous communication)
- Different communication libraries

A-JUMP Component Diagram
[Figure: A-JUMP component diagram]

Grid Computing and MPI

| GRID | MPI |
|------|-----|
| Dynamic resources; can add/delete dynamically | Static resources; static configuration via host file |
| Voluminous data exchange over the Internet | Data exchange within a cluster |
| Designed for Internet computing | Designed for cluster computing |
| Heterogeneous resource environment | Homogeneous resource environment |

Grid Computing and MPI (continued)

| GRID | MPI |
|------|-----|
| Specific to certain scientific applications | Generic; not specific to certain scientific applications |
| Independent of the base operating system | Code execution is operating-system dependent |
| Supports different communication protocol paradigms | No support for different communication protocol paradigms |
| High network latencies due to complex layout | Low network latencies for message passing |

A-JUMP, Message Passing for Interconnected Clusters
- It supports the dynamic behaviour of interconnected clusters with MPI functionality.
- It is written in pure Java and does not incorporate any native code or libraries.
- It provides an asynchronous mode of communication for both message passing and code distribution over the clusters.
- It provides sub-clustering of resources for different jobs, to utilize free resources efficiently.
- It is designed for both Internet and cluster computing.

A-JUMP, Message Passing for Interconnected Clusters
[Figure: message passing across interconnected clusters]

HPC Bus (interconnected clusters)
[Figure: HPC Bus spanning interconnected clusters, including a C++ client]

A-JUMP, Message Passing for Interconnected Clusters
- It segregates the communication layer from the application code.
- This makes it easy to adopt new communication technologies without changing the application code.
- It supports both homogeneous and heterogeneous clusters for parallel code execution.
- A set of easy-to-use APIs is part of the framework.

Layered Architecture
[Figure: layered architecture of A-JUMP]

A-JUMP Framework Components
- Machine Registry
- Dynamic Resource Discovery
- Dynamic Resource Allocation
- Message Security
- Job Scheduler
- Monitoring
- Code Migrator/Execution
- API Classes

Machine Registry
- The Machine Registry provides a list of the machines that are dynamically added to or removed from the framework.
- To become part of the framework, every worker machine has to run the Code Migrator/Execution service.
- The Code Migrator/Execution service sends a request to the Machine Registry for a unique rank.
- The Machine Registry assigns unique ranks to the newly added machines.
- The Machine Registry is also responsible for informing the Monitoring module about newly added resources.
- In the future, it will also record static information about each registered machine, i.e. the number of CPUs and the number of cores per CPU.

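As a sketch of this registration step, a worker could request its rank from the Machine Registry using JMS request/reply over ActiveMQ. The queue name, host, and message format below are assumptions for illustration, not A-JUMP's actual protocol.

```java
// Hypothetical sketch: a worker asks the Machine Registry for a unique rank.
import java.net.InetAddress;
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class RankRequestExample {
    public static void main(String[] args) throws Exception {
        QueueConnectionFactory factory =
            new ActiveMQConnectionFactory("tcp://registry-host:61616");
        QueueConnection con = factory.createQueueConnection();
        con.start();
        QueueSession session = con.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);

        // QueueRequestor blocks until the registry replies on a temporary queue.
        QueueRequestor requestor =
            new QueueRequestor(session, session.createQueue("ajump.registry"));
        TextMessage request = session.createTextMessage(
            "REGISTER " + InetAddress.getLocalHost().getHostName());
        TextMessage reply = (TextMessage) requestor.request(request);

        int rank = Integer.parseInt(reply.getText()); // unique rank for this worker
        System.out.println("Assigned rank: " + rank);
        con.close();
    }
}
```
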
Dynamic Resource Discovery (DRD)
- It keeps track of the resources that are part of the A-JUMP clusters (the system size).
- It uses information from the Machine Registry to index the available machines registered with the Registry.
- DRD fetches the state of the machines from Monitoring; the states are defined as free and busy.

Dynamic Resource Allocation (DRA)
- DRA finds the free resources available on the A-JUMP clusters.
- The client sends a request for the number of free processors needed to run the job.
- DRA searches for free resources in the index maintained by DRD and Monitoring.
- It creates, from the available machines, a list that can define a sub-cluster, and returns this list to the client APIs.

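A minimal sketch of how DRA might carve a sub-cluster out of the DRD index; the data structures and names here are assumptions for illustration, not A-JUMP source code.

```java
// Illustrative sketch of sub-cluster selection from a DRD-style index.
import java.util.*;

enum State { FREE, BUSY }

public class SubClusterExample {
    // DRD index: rank -> current state, fed by the Machine Registry/Monitoring.
    private final Map<Integer, State> index = new HashMap<Integer, State>();

    /** Returns the ranks of a sub-cluster, or an empty list if not enough
     *  free machines are available (the job then stays queued). */
    public List<Integer> allocate(int required) {
        List<Integer> subCluster = new ArrayList<Integer>();
        for (Map.Entry<Integer, State> e : index.entrySet()) {
            if (e.getValue() == State.FREE) {
                subCluster.add(e.getKey());
                if (subCluster.size() == required) {
                    return subCluster;  // enough free machines found
                }
            }
        }
        return Collections.emptyList(); // required resources unavailable
    }
}
```
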
Message Security
- It provides the facility to encrypt outgoing messages.
- The current implementation is based on a simple symmetric encryption technique.
- It requires further performance analysis.

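The slides only say "simple symmetric encryption", so as an illustration here is a minimal sketch using the JDK's built-in AES support; the choice of AES and the key handling are assumptions, not A-JUMP's documented technique.

```java
// Sketch of simple symmetric encryption of an outgoing message with JDK AES.
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class MessageSecurityExample {
    public static void main(String[] args) throws Exception {
        // Both ends must share this key; key distribution is out of scope here.
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();

        // Note: "AES" alone defaults to ECB/PKCS5Padding; a real deployment
        // would pick an explicit, stronger mode.
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] encrypted = cipher.doFinal("outgoing message".getBytes("UTF-8"));

        cipher.init(Cipher.DECRYPT_MODE, key);
        System.out.println(new String(cipher.doFinal(encrypted), "UTF-8"));
    }
}
```
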
Job Scheduler (JS)
- JS distributes incoming jobs to the cluster.
- Jobs are sent to the JS by the client APIs through ActiveMQ.
- It uses a queuing mechanism that keeps incoming jobs queued until free resources are available for execution.
- If the number of available resources is less than the number required, the job is not submitted. Otherwise, a sub-cluster is created by JS according to the requirement.
- Free machines remain available for new job requests.
- P2P-MPI is the only other implementation that provides sub-clustering.

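An illustrative sketch of this queuing behaviour, reusing the hypothetical allocate() from the DRA sketch above; again, a sketch under assumptions rather than A-JUMP source code.

```java
// Sketch of the Job Scheduler's queuing mechanism: jobs wait in a FIFO
// queue until DRA can supply the required free machines.
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

public class JobSchedulerExample {
    static class Job { int requiredCpus; /* plus code, arguments, ... */ }

    private final Queue<Job> pending = new ArrayDeque<Job>();
    private final SubClusterExample dra = new SubClusterExample(); // DRA sketch

    public void submit(Job job) {
        pending.add(job);
        dispatch();
    }

    /** Dispatch queued jobs while free resources suffice; otherwise the job
     *  stays queued and the free machines remain available for other requests. */
    private void dispatch() {
        while (!pending.isEmpty()) {
            Job next = pending.peek();
            List<Integer> subCluster = dra.allocate(next.requiredCpus);
            if (subCluster.isEmpty()) return;   // not enough free resources yet
            pending.remove();
            // ship the job to the sub-cluster via ActiveMQ (omitted)
        }
    }
}
```
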
Monitoring
- The Job Scheduler gets its information from the Monitoring component.
- Monitoring records dynamic information about the computational resources that are part of the framework.
- It records the number of busy and free CPUs, helping the Job Scheduler find the best match for the task at hand.
- Monitoring maintains two distinct states for the machines available in the cluster: free and busy.
- As soon as a machine starts job execution, it informs Monitoring to change its state from free to busy.
- Monitoring is also responsible for setting the state of the machine back to free as soon as the job finishes.

Code Migrator/Execution
- Code Migrator/Execution distributes and executes a task over the network.
- A built-in communication API is part of this component; it enables the tasks to perform collective and point-to-point communication.
- Both communication between the tasks and distribution of code are done asynchronously through ActiveMQ.
- As soon as a resource becomes free, the job is dispatched by the Job Scheduler to the designated machine.
- The Code Migrator/Execution on the destination machine accepts the code and starts executing it.
- The underlying code distribution takes place through Code Migrator/Execution; the distribution itself is done through ActiveMQ.
- The streamed output is redirected to the user machine.
- The basic send() and receive() are also provided with the help of communication libraries written purely in Java.

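A hypothetical sketch of the receiving side of code distribution, with a class file shipped as a JMS BytesMessage; the queue name, the "className" property, and launching a fresh JVM are all illustrative assumptions rather than A-JUMP's actual mechanism.

```java
// Sketch: a worker consumes a class file from ActiveMQ and executes it.
import java.io.FileOutputStream;
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class CodeAdaptorExample {
    public static void main(String[] args) throws Exception {
        Connection con = new ActiveMQConnectionFactory("tcp://localhost:61616")
                .createConnection();
        con.start();
        Session session = con.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer =
                session.createConsumer(session.createQueue("ajump.code.rank1"));

        BytesMessage msg = (BytesMessage) consumer.receive(); // blocks for a job
        byte[] classBytes = new byte[(int) msg.getBodyLength()];
        msg.readBytes(classBytes);

        // Write the received bytecode to disk under its class name.
        String className = msg.getStringProperty("className");
        FileOutputStream out = new FileOutputStream(className + ".class");
        out.write(classBytes);
        out.close();

        // Run the task; its streamed output would be redirected to the user.
        Runtime.getRuntime().exec(new String[] { "java", className });
    }
}
```
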
Code Migrator/Execution
[Figure: Code Migrator/Execution with the Code Adaptor]

Communication APIs
The framework offers an easy-to-use set of communication APIs for writing parallel code. These APIs are close to the MPI specification for Java (mpiJava 1.2).

API classes:
- Sub-cluster grouping (MPI_COMM_WORLD)
- MPI.Init
- MPI.Finalize
- MPI.COMM_WORLD.Rank
- MPI.COMM_WORLD.Size
- Send
- Receive
- Broadcast
- Gather
- Scatter
- Allgather
- Reduce

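Since these APIs follow the mpiJava 1.2 specification closely, a parallel program would look much like an mpiJava program. The sketch below uses mpiJava-style calls; exact A-JUMP class names may differ (e.g. Receive vs. mpiJava's Recv).

```java
// Minimal parallel "hello world" in the mpiJava 1.2 style the A-JUMP API follows.
import mpi.MPI;

public class Hello {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);                       // join the sub-cluster (MPI_COMM_WORLD)
        int rank = MPI.COMM_WORLD.Rank();     // unique rank of this process
        int size = MPI.COMM_WORLD.Size();     // number of processes in the sub-cluster

        if (rank == 0) {
            // rank 0 sends one integer to every other rank
            for (int dest = 1; dest < size; dest++) {
                int[] payload = { dest };
                MPI.COMM_WORLD.Send(payload, 0, 1, MPI.INT, dest, 99);
            }
        } else {
            int[] buf = new int[1];
            MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.INT, 0, 99);
            System.out.println("Rank " + rank + " of " + size + " received " + buf[0]);
        }
        MPI.Finalize();
    }
}
```
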
Workflow of A-JUMP for Interconnected Clusters
[Figure: workflow of A-JUMP for interconnected clusters]

The figure illustrates the relationships in A-JUMP and the information shared between the different components. It describes, step by step, the flow of a program running in parallel on different clusters. There are three main entities:

- Clients: a client is a process looking for computing resources.
- Code Adaptors: a code adaptor is a process offering computing resources. Each machine in a cluster must run a Code Adaptor to receive the incoming program files. The role of a Code Adaptor or a client is not fixed; a machine may serve as both a Code Adaptor and a client.
- Dynamic Resource Discovery/Dynamic Resource Allocation (DRD/DRA): a process that allocates free resources to the client for job submission. DRD/DRA maintains the index of resources and creates sub-clusters based on free computing resources.

Workflow of A-JUMP for Multiple Clusters

1. Registration of a new machine: each new resource in the cluster is assigned a unique rank by the Machine Registry. Once the new resource is registered, this information is propagated to DRD/DRA. The Registry also forwards updates when resources are added, updated, or deleted, and sends status changes to Monitoring.
2. Resource information: Monitoring receives the status information of the machines (free or busy) from the Machine Registry.
3. Request for a number of free processes: the client sends a request for the number of free processes required to run the job.
4. Get free resources: DRD/DRA searches the free resources in its index. If the resources required for sub-clustering are not available, DRD/DRA sends a message to the client about the unavailability of resources.
5. Create sub-cluster: DRD/DRA creates a sub-cluster of free processors; these machines may reside on different clusters over the Internet or on a single cluster.
6. Return sub-cluster: DRD/DRA returns the sub-cluster information for the given number of free processes to the client.
7. Submit job by the client APIs: the job is submitted to the specific machines identified for sub-clustering, and the Code Adaptors start execution of the submitted job.
8. Set status free or busy: before a machine starts execution, it informs Monitoring and its status changes from free to busy. As soon as the submitted job finishes, a request to change the status from busy to free is sent back to the Machine Registry. The Code Adaptors are also responsible for sending the results back to the client.

Performance Analysis
Communication performance measurement:
- Ping Pong Latency Test
- Network Throughput Measurement

Code speed-up performance analysis:
- Embarrassingly Parallel (EP) Benchmark
- JGF Crypt Benchmark

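For reference, the ping-pong test measures one-way latency as half of the round-trip time between two ranks. Below is a sketch in the mpiJava style used above, not the actual benchmark code; the 1 KB message size is an arbitrary example.

```java
// Sketch of a ping-pong latency test between ranks 0 and 1.
import mpi.MPI;

public class PingPong {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        byte[] buf = new byte[1024];             // message size under test

        if (rank == 0) {
            long start = System.nanoTime();
            MPI.COMM_WORLD.Send(buf, 0, buf.length, MPI.BYTE, 1, 0);
            MPI.COMM_WORLD.Recv(buf, 0, buf.length, MPI.BYTE, 1, 0);
            long roundTrip = System.nanoTime() - start;
            // one-way latency is half of the measured round trip
            System.out.println("latency (ms): " + roundTrip / 2 / 1e6);
        } else if (rank == 1) {
            MPI.COMM_WORLD.Recv(buf, 0, buf.length, MPI.BYTE, 0, 0);
            MPI.COMM_WORLD.Send(buf, 0, buf.length, MPI.BYTE, 0, 0);
        }
        MPI.Finalize();
    }
}
```
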
P2P-MPI
S. Genaud and C. Rattanapoka, “P2P-MPI: A Peer-to-Peer Framework for Robust Execution of Message Passing Parallel Programs,” Journal of Grid Computing, 5(1):27-42, 2007.

Strengths:
- Pure Java
- Supports distributed memory models
- Supports heterogeneous environments

Weaknesses:
- No support for high-speed networks
- Slow performance

Test Environment
- Results are collected in an environment that simulates globally interconnected clusters.
- The hardware resources involved in the tests comprise 9 single-core machines divided into two clusters.
- The clusters are connected with 1 Gbps bandwidth.
- The machines run Windows Server 2003 (SP2) and Windows XP (SP3).
- The TCP window size is left at its default on all machines.
- No optimization is done at the OS or hardware level.

Ping Pong Latency Test
[Figure: time (ms) vs. data size (bytes)]

Network Throughput Measurement
[Figure: throughput (Mbps) vs. data size (1 byte to 512 KB) for A-JUMP (interconnected clusters) and P2P-MPI]

Code Speed-Up Performance Analysis (NAS EP)
[Figure: execution time (sec) vs. number of CPUs (2-8) for A-JUMP (interconnected clusters) and P2P-MPI]

Code Speed-Up Performance Analysis (NAS EP)
[Figure: performance gain (%) of A-JUMP vs. number of CPUs (2-8); plotted gains: 25.24%, 22.28%, 8.33%, and 6.44%]

Code Speed-Up Performance Analysis (JGF Crypt Class A)
[Figure: execution time (sec) vs. number of CPUs (2-8) for A-JUMP (interconnected clusters) and P2P-MPI]

Code Speed-Up Performance Analysis (JGF Crypt Class B)
[Figure: execution time (sec) vs. number of CPUs (2-8) for A-JUMP (interconnected clusters) and P2P-MPI]

Conclusion & Future Work
- A-JUMP for interconnected clusters is loosely coupled with its communication layer.
- It supports sub-clustering.
- It provides dynamic resource management.
- It carries multiple communication protocols.
- It supports asynchronous message passing.
- It supports heterogeneous environments.
- Performance analysis results of A-JUMP are better than P2P-MPI for small message sizes, and promising and comparable with P2P-MPI for large message sizes.

Conclusion & Future Work
- The communication library will be improved for message sizes >= 256 KB.
- Support for different security techniques will be added.
- Another area of focus will be improving network latencies and lessening communication overheads.
- Multilingual support for interconnected clusters will be added.
- More NAS and JGF benchmarks will be run.

References

Ian Foster, “The Anatomy of the Grid: Enabling Scalable Virtual Organizations,” in Proc. 7th International Euro-Par
Conference on Parallel Processing, Manchester, 2001, pp. 1–4.
LCG homepage [Online]. Available: http://lcg.web.cern.ch/lcg.
EGEE homepage [Online]. Available: http://www.eu-egee.org.
S. Haug, F. Ould-Saada, K. Pajchel, and A. L. Read, “Data management for the world’s largest machine,” in Applied
Parallel Computing. State of the Art in Scientific Computing, vol. 4699, Springer Berlin / Heidelberg, 2007, pp. 480-488.
P. Eerola et al., “The NorduGrid production grid infrastructure, status and plans,” in IEEE/ACM Grid, Nov. 2003, pp. 158–
165.
TeraGrid Project homepage [Online]. Available: http://www.teragrid.org
F. Cappello et al., “Grid’5000: a large scale and highly reconfigurable experimental grid testbed.” In Proc. 6th IEEE/ACM
International Workshop on Grid Computing - 2005, pp. 99-106.
J. Dongarra, D. Gannon, G. Fox, and K. Kennedy. The Impact of Multicore on Computational Science Software. CTWatch
Quarterly, 2007 [Online]. Available: http://www.ctwatch.org/quarterly/articles/2007/02/the-impact-of-multicore-on-computational-science-software.
S. Genaud and C. Rattanapoka, “P2P-MPI: A Peer-to-Peer Framework for Robust Execution of Message Passing Parallel
Programs,” Journal of Grid Computing, 5(1):27-42, 2007.
S. Schulz, W. Blochinger, and M. Poths, “A Network Substrate for Peer-to-Peer Grid Computing beyond Embarrassingly
Parallel Applications,” in Proc. WRI International Conference on Communications and Mobile Computing – 2009, Volume
03, pp. 60-68.
M. Humphrey and M. R. Thompson, “Security Implications of Typical Grid Usage Scenarios,” Cluster Computing, vol. 5,
Issue-03, 2002, pp. 257-264.
P. Phinjaroenphan, S. Bevinakoppa, and P. Zeephongsekul, “A Method for Estimating the Execution Time of a Parallel
Task on a Grid Node,” in Proc. European Grid Conference, Amsterdam, 2005, pp. 226-236.
A. Nelisse, J. Maassen, T. Kielmann, and H. Bal. “CCJ: Object-Based Message Passing and Collective Communication in
Java,” Concurrency and Computation: Practice & Experience, 15(3-5):341-369, 2003.
J. Al-Jaroodi, N. Mohamed, H. Jiang, and D. Swanson, “JOPI: a Java object-passing interface: Research Articles,”
Concurrency and Computation: Practice & Experience, vol. 17, Issues 7-8, pp. 775-795, 2005.
P. Martin, L. M. Silva, and J. G. Silva, “A Java Interface to WMPI”. In 5th European PVM/MPI Users’ Group Meeting,
EuroPVM/MPI’98, Liverpool, UK, Lecture Notes in Computer Science, vol. 1497, pp. 121-128. Springer, 1998.
S. Mintchev and V. Getov, “Towards Portable Message Passing in Java: Binding MPI,” in 4th European PVM/MPI Users’
Group Meeting, EuroPVM/MPI’97, Krakow, Poland, Lecture Notes in Computer Science, vol. 1332, pp. 135–142. Springer,
1997.
G. Judd, M. Clement, and Q. Snell, “DOGMA: Distributed Object Group Metacomputing Architecture,” Concurrency and
Computation: Practice and Experience, 10(11-13):977-983, 1998.
B. Pugh and J. Spacco, “MPJava: High-Performance Message Passing in Java using Java.nio,” in Proc. 16th Intl.
Workshop on Languages and Compilers for Parallel Computing (LCPC'03), Lecture Notes in Computer Science, vol. 2958,
pp. 323-339, College Station, TX, USA, 2003.
B. Y. Zhang, G. W. Yang, and W.-M. Zheng, “Jcluster: an Efficient Java Parallel Environment on a Large-scale
Heterogeneous Cluster,” Concurrency and Computation: Practice and Experience, 18(12):1541-1557, 2006.
A. Kaminsky, “Parallel Java: A Unified API for Shared Memory and Cluster Parallel Programming in 100% Java” in Proc. 9th
Intl. Workshop on Java and Components for Parallelism, Distribution and Concurrency (IWJacPDC'07), pp. 196a (8 pages),
Long Beach, CA, USA, 2007.
M. Baker, B. Carpenter, G. Fox, S. Ko, and S. Lim, “mpi-Java: an Object-Oriented Java Interface to MPI,” in 1st Int.
Workshop on Java for Parallel and Distributed Computing (IPPS/SPDP 1999 Workshop), San Juan, Puerto Rico, Lecture
Notes in Computer Science, volume 1586, pp. 748–762. Springer, 1999.
A. Shafi, B. Carpenter, and M. Baker, “Nested Parallelism for Multi-core HPC Systems using Java,” Journal of Parallel and
Distributed Computing, 69(6):532-545, 2009.
M. Bornemann, R. V. v. Nieuwpoort, and T. Kielmann, “MPJ/Ibis: A Flexible and Efficient Message Passing Platform for
Java,” in Proc. 12th European PVM/MPI Users’ Group Meeting (EuroPVM/MPI’05), Lecture Notes in Computer Science,
vol. 3666, pp. 217-224, Sorrento, Italy, 2005.
S. Bang and J. Ahn, “Implementation and Performance Evaluation of Socket and RMI based Java Message Passing
Systems,” in Proc. 5th Intl. Conf. on Software Engineering Research, Management and Applications (SERA'07), pp. 153-159, Busan, Korea, 2007.
G. L. Taboada, J. Tourino, and R. Doallo, “F-MPJ: scalable Java message-passing communications on parallel systems,”
Journal of Supercomputing, 2009.
V. Getov , G. v. Laszewski , M. Philippsen , and I. Foster, “Multiparadigm communications in Java for grid computing,”
Communications of the ACM, Volume. 44, Issue 10, 2001, pp. 118-125.
H. Schildt, “Java Beginner’s Guide”, 4th ed., Tata McGraw-Hill, 2007.
D. Balkanski , M. Trams , and W. Rehm , “Heterogeneous Computing With MPICH/Madeleine and PACX MPI: a Critical
Comparison,” Technische Universität Chemnitz, 2003.
S. Asghar, M. Hafeez, U. A. Malik, A. Rehman, and N. Riaz, “A-JUMP, A Universal Parallel Framework,” unpublished.
MPI: A Message Passing Interface Standard. Message Passing Interface Forum [Online]. Available: http://www.mpi-forum.org/docs/mpi-11-html/mpi-report.html.
G. L. Taboada, J. Tourino, and R. Doallo, “Java for High Performance Computing: Assessment of current research &
practice,” in Proc. 7th International Conference on Principles and Practice of Programming in Java, Calgary, Alberta,
Canada, 2009. pp. 30-39.
G.L. Taboada, J. Tourino, and R. Doallo, “Performance Analysis of Java Message-Passing Libraries on Fast Ethernet,
Myrinet and SCI Clusters,” in Proc. 5th IEEE International Conference on Cluster Computing (Cluster'03), Hong Kong,
China, 2003, pp. 118-126.
S. Asghar, M. Hafeez, U. A. Malik, A. Rehman, and N. Riaz, “Survey of MPI Implementations delimited by Java for HPC,”
unpublished.
R. V. v. Nieuwpoort et al., “Ibis: an Efficient Java based Grid Programming Environment,” Concurrency and Computation:
Practice and Experience, 17(7-8):1079-1107, 2005.
W. Gropp, E. Lusk, N. Doss, and A. Skjellum, “High-performance, portable implementation of the MPI Message Passing Interface Standard,” Parallel Computing, 22(6):789-828, Sept. 1996.
MPICH2 [Online]. Available: http://www.mcs.anl.gov/research/projects/mpich2.
MPICH-G2 [Online]. Available: http://www3.niu.edu/mpi.
N. Karonis, B. Toonen, and I. Foster, “MPICH-G2: A Grid-Enabled Implementation of the Message Passing
Interface,” Journal of Parallel and Distributed Computing, 2003.
T. Kielmann , R. F. H. Hofman , H. E. Bal , A. Plaat , and R. A. F. Bhoedjang, “MagPIe: MPI's collective communication
operations for clustered wide area systems,” ACM SIGPLAN Notices, vol. 34, Issue-08, 1999, pp. 131-140.
L. Chen, C. Wang, F. C. M. Lau, and R. K. K. Ma, “A Grid Middleware for Distributed Java Computing with MPI Binding and
Process Migration Supports”, Journal of Computer Science and Technology, vol. 18, Issue-04, 2003, pp. 505-514.
V. Welch, F. Siebenlist, I. Foster, J. Bresnahan, K. Czajkowski, J. Gawor, C. Kesselman, S. Meder, L. Pearlman, and S.
Tuecke, “Security for Grid Services,” Twelfth International Symposium on High Performance Distributed Computing (HPDC-12), IEEE Press, 2003.
Globus Toolkit Homepage [Online]. Available: http://www.globus.org/toolkit.
PACX-MPI [Online]. Available: http://www.hlrs.de/organization/av/amt/research/pacx-mpi
Version 1.2 of MPI [Online]. Available: http://www.mpi-forum.org/docs/mpi-20-html/node28.htm.
O. Aumage, and G. Mercier, "MPICH/Madeleine: a True Multi-Protocol MPI for High Performance Networks," vol. 1,
pp.10051a, 15th International Parallel and Distributed Processing Symposium (IPDPS'01), 2001.
W. Gropp and E. Lusk, “MPICH working note: The second-generation adi for MPICH implementation of MPI,” 1996.
O. Aumage, “Heterogeneous multi-cluster networking with the madeleine III communication library,” in 16th IEEE
International Parallel and Distributed Processing Symposium (IPDPS '02 (IPPS & SPDP)), p. 85, Washington-Brussels-Tokyo, Apr. 2002.
O. Aumage, L. Bougé, A. Denis, J.-F. Méhaut, G. Mercier, R. Namyst, and L. Prylli, “A portable and efficient
communication library for high-performance cluster computing,” in IEEE Intl. Conf. on Cluster Computing (Cluster 2000), pp. 78-87, Technische Universität Chemnitz, Saxony, Germany, Nov. 2000.
ActiveMQ homepage [Online]. Available: http://activemq.apache.org.