HPC across Heterogeneous Resources

Sathish Vadhiyar
Motivation
 MPI assumes a global communicator through which all processes can communicate with each other.
 Hence all nodes on which an MPI application is started must be accessible to all other nodes.
 This is not always possible:
   Due to the shortage of IP address space.
   Security concerns – large MPP sites and Beowulf clusters have only one master node in the public IP address space.
 Grand challenge applications require the use of many MPPs.
PACX-MPI
 Parallel Computer eXtension
 MPI on a cluster of MPPs
 Initially developed to connect a Cray Y-MP with an Intel Paragon
PACX-MPI
 CFD or crash simulations of automobiles – one MPP is not enough
 Initial application – a flow solver run across the Pittsburgh Supercomputing Center, Sandia National Laboratories, and the High Performance Computing Center Stuttgart
 PACX sits between the application and MPI.
PACX-MPI
 On each MPP, 2 extra nodes run daemons that handle communication between MPPs, compression and decompression of data, and communication with the local nodes
 Daemon nodes are implemented as additional local MPI processes
 Communication among processes internal to an MPP goes through the vendor MPI on the local MPP network.
 Communication between MPPs goes through the daemons via the Internet or a specialized network (see the sketch below).
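A minimal sketch in C of that routing decision, assuming (for illustration only, not the actual PACX-MPI source) that the two daemons occupy local ranks 0 and 1 and that each MPP hosts a contiguous block of global ranks; the helper names sketch_send and global_to_local are made up for this example:

/*
 * Illustration only: route a send through the vendor MPI when the destination
 * is on the same MPP, or hand it to the outgoing daemon when it is remote.
 * The rank layout (daemons at local ranks 0 and 1, application ranks after
 * them) and all helper names are assumptions, not the PACX-MPI implementation.
 */
#include <mpi.h>

enum { OUT_DAEMON = 0, IN_DAEMON = 1, NUM_DAEMONS = 2 };

/* Assumed mapping: this MPP hosts global ranks [first_global, first_global + n_local). */
static int global_to_local(int global_rank, int first_global, int n_local)
{
    if (global_rank >= first_global && global_rank < first_global + n_local)
        return global_rank - first_global + NUM_DAEMONS;   /* skip the daemon ranks */
    return -1;                                             /* destination is on another MPP */
}

static int sketch_send(const void *buf, int count, MPI_Datatype type,
                       int global_dest, int tag, MPI_Comm local_comm,
                       int first_global, int n_local)
{
    int local_dest = global_to_local(global_dest, first_global, n_local);

    if (local_dest >= 0)
        /* intra-MPP: fast path over the vendor MPI and the local network */
        return MPI_Send(buf, count, type, local_dest, tag, local_comm);

    /* inter-MPP: give the message to the out-daemon, which compresses it and
     * forwards it over the Internet/WAN to the in-daemon of the remote MPP */
    return MPI_Send(buf, count, type, OUT_DAEMON, tag, local_comm);
}

In the point-to-point case shown next (Node 6 -> Node 2), a message between two MPPs takes the second branch: sender to out-daemon via the vendor MPI, out-daemon to the remote in-daemon over the wide-area link, and in-daemon to the destination node via the vendor MPI of that MPP.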
PACX-MPI Architecture
PACX-MPI – Point-to-point communication (Node 6 -> Node 2)
PACX-MPI – Broadcast communication (Root: Node 6)
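The broadcast case can be sketched the same way. The scheme below (the root broadcasts locally with the vendor MPI and hands one copy to its out-daemon; on each remote MPP the in-daemon re-broadcasts locally) is an assumed illustration of the broadcast pattern named in the slide above, not the actual PACX-MPI code; local_comm is assumed to contain the daemons at ranks 0 and 1.

/*
 * Assumed scheme for a metacomputer-wide broadcast rooted on this MPP:
 *   1. ordinary vendor-MPI broadcast inside the MPP;
 *   2. the root hands one copy to the out-daemon, which ships it to the
 *      in-daemon of every remote MPP; each in-daemon then re-broadcasts
 *      locally with itself as root.
 * Not the PACX-MPI implementation; names and rank layout are illustrative.
 */
#include <mpi.h>

enum { OUT_DAEMON = 0 };

static int sketch_bcast_from_local_root(void *buf, int count, MPI_Datatype type,
                                        int local_root, int tag,
                                        MPI_Comm local_comm, int i_am_root)
{
    /* Step 1: local part of the broadcast over the vendor MPI. */
    int rc = MPI_Bcast(buf, count, type, local_root, local_comm);
    if (rc != MPI_SUCCESS)
        return rc;

    /* Step 2: remote part; only the root forwards a copy to the out-daemon. */
    if (i_am_root)
        rc = MPI_Send(buf, count, type, OUT_DAEMON, tag, local_comm);
    return rc;
}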
Data Conversion
 If the sender and receiver are on two separate, heterogeneous MPPs, the sender converts its data to XDR (eXternal Data Representation) format
 The receiver converts the data from XDR format to its own data representation (see the sketch below)
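A minimal round trip through the standard SunRPC XDR routines shows what such a conversion looks like. This is generic XDR usage, not PACX-MPI's own conversion layer; on a modern Linux system the header may come from libtirpc rather than the libc <rpc/xdr.h>.

/* Generic XDR round trip (sender encodes, receiver decodes); illustrates the
 * conversion step only, not PACX-MPI code. Compile with -ltirpc if your libc
 * no longer ships the SunRPC headers. */
#include <rpc/xdr.h>
#include <stdio.h>

int main(void)
{
    double value = 3.141592653589793;   /* sender's native representation */
    char   wire[64];                    /* buffer that would travel between MPPs */
    double received = 0.0;
    XDR    enc, dec;

    /* Sender side: native representation -> XDR (big-endian IEEE 754 on the wire) */
    xdrmem_create(&enc, wire, (u_int)sizeof wire, XDR_ENCODE);
    if (!xdr_double(&enc, &value)) { fprintf(stderr, "encode failed\n"); return 1; }
    xdr_destroy(&enc);

    /* Receiver side: XDR -> the receiver's own native representation */
    xdrmem_create(&dec, wire, (u_int)sizeof wire, XDR_DECODE);
    if (!xdr_double(&dec, &received)) { fprintf(stderr, "decode failed\n"); return 1; }
    xdr_destroy(&dec);

    printf("received %.15f\n", received);
    return 0;
}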
PACX-MPI: Results
 Between the T3Es at PSC and SDSC
 URANUS application
   A Navier-Stokes solver – an iterative application based on convergence
   Simulation of a reentry vehicle
   Run between PSC and the University of Stuttgart
   A closely coupled application because of the frequent communication needed for convergence.
   Had to be modified for the metacomputing setting – the application was made more asynchronous, compromising on convergence
 P3TDSMC – a Monte Carlo code for particle tracking
   More amenable to metacomputing because of its high computation-to-communication ratio
   Latency effects are hidden when larger numbers of particles are considered.
Other Related Projects
 PLUS
 MPICH-G
 PVMPI
 MPI-Connect
References
 Edgar Gabriel, Michael Resch, Thomas Beisel, Rainer Keller: 'Distributed computing in a heterogenous computing environment', Proc. EuroPVM/MPI'98, Liverpool, UK, 1998.
 Thomas Beisel, Edgar Gabriel, Michael Resch: 'An Extension to MPI for Distributed Computing on MPPs', in Marian Bubak, Jack Dongarra, Jerzy Wasniewski (Eds.), 'Recent Advances in Parallel Virtual Machine and Message Passing Interface', Lecture Notes in Computer Science, Springer, 1997, pp. 75-83.
 Matthias A. Brune, Graham E. Fagg, Michael M. Resch: 'Message-passing environments for metacomputing', Future Generation Computer Systems (FGCS), Volume 15, 1999, pp. 699-712.
 PVMPI: An Integration of PVM and MPI Systems. http://www.netlib.org/utk/papers/pvmpi/paper.html
 I. Foster, N. Karonis: 'A Grid-Enabled MPI: Message Passing in Heterogeneous Distributed Computing Systems', Proc. 1998 SC Conference, November 1998.