HPC across Heterogeneous Resources
Sathish Vadhiyar
Motivation
MPI assumes a global communicator through which all
processes can communicate with each other.
Hence all nodes on which an MPI application is
started must be reachable from all other nodes.
This is not always possible
Due to the shortage of IP address space.
Security concerns – large MPP sites and Beowulf
clusters place only one master node in the public IP
address space.
Grand challenge applications require the use
of many MPPs.
PACX-MPI
PArallel Computer eXtension
MPI on a cluster of MPPs
Initially developed to connect a Cray Y-MP with an
Intel Paragon
PACX-MPI
CFD or crash simulations of automobiles –
one MPP is not enough
Initial application – a flow solver run
across the Pittsburgh Supercomputing Center,
Sandia National Laboratories, and the High
Performance Computing Center Stuttgart
PACX sits between the application and MPI
PACX-MPI
On each MPP, 2 extra nodes run daemons
that handle communication between MPPs,
compression and decompression of data, and
communication with local nodes
Daemon nodes are implemented as additional local
MPI processes
Communication among processes internal to an
MPP goes through the vendor MPI on the local
MPP network.
Communication between MPPs goes through the
daemons via the Internet or a specialized network.
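The routing decision described above can be sketched in a few lines of Python. This is a hedged illustration, not PACX-MPI's actual implementation: `pacx_send`, `DAEMON_OUT`, and `vendor_mpi_send` are all hypothetical names standing in for the real layer's internals.

```python
# Sketch (all names hypothetical) of the per-send routing decision in
# a PACX-style layer: destinations on the same MPP go straight through
# the vendor MPI, while remote destinations are handed to the local
# outgoing daemon, which compresses and forwards them over the Internet.

DAEMON_OUT = 0  # local MPI rank of the outgoing-communication daemon

def pacx_send(payload, dest_global, my_mpp_ranks, vendor_mpi_send):
    if dest_global in my_mpp_ranks:
        # fast path: vendor MPI over the MPP's internal network
        vendor_mpi_send(payload, dest=dest_global)
    else:
        # slow path: wrap the message with its global destination and
        # hand it to the daemon for forwarding to the remote MPP
        vendor_mpi_send((dest_global, payload), dest=DAEMON_OUT)
```

The key design point is that application processes never open wide-area connections themselves; only the two daemon processes per MPP do.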
PACX MPI Architecture
PACX-MPI – Point-to-point communication (Node 6 → Node 2)
PACX-MPI – Broadcast communication (Root: Node 6)
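The broadcast pattern shown in the figure above can be sketched as a two-phase scheme. This is an assumption-laden illustration (`pacx_bcast`, `local_bcast`, and `wan_send` are hypothetical names), not the library's real code: the root's MPP broadcasts locally via vendor MPI, and one copy crosses the wide-area link per remote MPP, where the receiving daemon re-broadcasts locally.

```python
# Hedged sketch of a PACX-style two-phase broadcast (names hypothetical).
def pacx_bcast(payload, root_mpp, all_mpps, local_bcast, wan_send):
    # phase 1: vendor-MPI broadcast inside the root's own MPP
    local_bcast(root_mpp, payload)
    # phase 2: exactly one inter-MPP message per remote MPP, followed
    # by a local re-broadcast performed by that MPP's daemon
    for mpp in all_mpps:
        if mpp == root_mpp:
            continue
        wan_send(root_mpp, mpp, payload)
        local_bcast(mpp, payload)
```

Sending one message per MPP rather than one per process keeps wide-area traffic independent of the number of processes on each machine.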
Data Conversion
If the sender and receiver are on two
separate, heterogeneous MPPs, the
sender converts its data to XDR (eXternal
Data Representation) format
The receiver converts the data from XDR format
to its own native data representation
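The conversion step above can be sketched with Python's `struct` module. XDR defines big-endian encodings for its basic types, which `struct` expresses with the `>` prefix; this is a minimal sketch covering only 32-bit integers and 64-bit doubles, not the full XDR standard, and the function names are my own.

```python
# Minimal XDR-style conversion sketch (covers only 32-bit ints 'i'
# and 64-bit doubles 'd'; real XDR also handles strings, arrays, etc.)
import struct

def to_xdr(values, fmt):
    """Sender side: encode native values as big-endian (XDR) bytes."""
    return struct.pack(">" + fmt, *values)

def from_xdr(data, fmt):
    """Receiver side: decode XDR bytes into the local representation."""
    return struct.unpack(">" + fmt, data)

# round trip: one int and one double survive the wire format unchanged
wire = to_xdr([42, 3.5], "id")
assert from_xdr(wire, "id") == (42, 3.5)
```

Because both sides agree on the big-endian wire format, a little-endian sender and a big-endian receiver decode the same bytes correctly.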
PACX-MPI: Results
Between the T3Es at PSC and SDSC
URANUS application
A Navier-Stokes solver – an iterative application based on
convergence
Simulation of a re-entry vehicle
Between PSC and the University of Stuttgart
Closely coupled application due to the frequent
communication required by the convergence test.
Had to be modified for the metacomputing setting – the
application was made more asynchronous at the cost of slower
convergence
P3T-DSMC – Monte Carlo particle tracking
More amenable to metacomputing because of its high
computation-to-communication ratio
Latency effects are hidden as larger numbers of particles are
considered.
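The latency-hiding claim above can be made concrete with a back-of-the-envelope model. The numbers below are purely illustrative, not measured values: per-step time is compute time plus a fixed wide-area latency, so as the particle count grows, the latency's share of each step shrinks toward zero.

```python
# Illustrative model (hypothetical numbers) of why a high
# computation-to-communication ratio hides wide-area latency.
def step_time(n_particles, us_per_particle=1.0, wan_latency_us=50_000):
    """Return (total step time in us, fraction of it spent on latency)."""
    compute = n_particles * us_per_particle
    total = compute + wan_latency_us
    return total, wan_latency_us / total

# with few particles the WAN latency dominates the step ...
_, frac_small = step_time(10_000)
# ... with many particles it becomes a negligible fraction
_, frac_large = step_time(10_000_000)
```

Under these assumed constants, latency accounts for over 80% of a 10,000-particle step but under 1% of a 10,000,000-particle step.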
Other Related Projects
PLUS
MPICH-G
PVMPI
MPI-Connect
References
Edgar Gabriel, Michael Resch, Thomas Beisel, Rainer Keller:
'Distributed computing in a heterogeneous computing environment',
EuroPVM/MPI'98, Liverpool, UK, 1998.
Thomas Beisel, Edgar Gabriel, Michael Resch: 'An Extension to MPI
for Distributed Computing on MPPs', in Marian Bubak, Jack Dongarra,
Jerzy Wasniewski (Eds.): 'Recent Advances in Parallel Virtual
Machine and Message Passing Interface', Lecture Notes in Computer
Science, Springer, 1997, 75-83.
Matthias A. Brune, Graham E. Fagg, Michael M. Resch:
'Message-passing environments for metacomputing', FGCS,
Volume 15, 1999, 699-712.
'PVMPI: An Integration of PVM and MPI Systems',
http://www.netlib.org/utk/papers/pvmpi/paper.html
I. Foster, N. Karonis: 'A Grid-Enabled MPI: Message Passing in
Heterogeneous Distributed Computing Systems', Proc. 1998
SC Conference, November 1998.