Parallel Computing


Bryan Carpenter
University of Portsmouth
Give a general idea of what my research interests are, and where they may lead.
Adopt an “autobiographical” style, so some material is very old (but bears on later developments!)
After a couple of years as a postdoc physicist, I joined Tony Hey’s new Concurrent Computation Group in Southampton.
Initially writing physics applications in occam, to demonstrate transputer-based parallel architectures (EU projects Supernode, PUMA, …)
occam tooling for writing parallel scientific codes was poor.
• Wanted MIMD capabilities of occam to be merged with data-parallel concepts of (say) ICL DAP Fortran.
• Compiler development seemed too hard. Instead, we developed communication libraries to support data-parallel applications.
• Eventually came to believe that – even with language support for data-parallelism – libraries would be key.
Circa 1993, the High Performance Fortran and Message Passing Interface standards for parallel computing were established.
• MPI – message-passing library for MIMD computers (“clusters”).
• HPF – data-parallel Fortran extensions (for MIMD or SIMD).
John Merlin had been working on a preprocessor for data-parallel extensions to Fortran 90; I had been working on data-parallel communication libraries (for occam).
With Tony Hey and Mark Baker, we obtained EPSRC + JISC funding to develop a system called SHPF.
• http://www.vcpc.univie.ac.at/information/software/shpf/
• Southampton HPF or Subset HPF?
HPF in:

!hpf$ processors q(2, 3)
!hpf$ template s(4, 100)
!hpf$ distribute s(block, cyclic(12)) onto q
      integer a(80)
      real b(80)
!hpf$ align a(j) with s(*, 15 + j)
!hpf$ align b(:) with a(:)
      double precision res
      do i = 1, 80
         a(i) = i
         b(i) = 1.0 / i
      enddo
      res = dot_product(a, b)
      print "(f12.6)", res
      end

[Diagram: the HPF source is translated by the ADAPT preprocessor into Fortran 90 with calls to the ADLIB interface, which is implemented by the C++ Adlib kernel on top of MPI.]
Fortran 90 with ADLIB calls out:

[...]
INTEGER :: K_MA (0:13)
INTEGER, DIMENSION (0:1) :: K_PQ
INTEGER, DIMENSION (0:35) :: I_A
REAL, DIMENSION (0:35) :: R_B
INTEGER :: I_J
DOUBLE PRECISION :: D_RES
[...]
K_MA = (/65, 0, K_HQ, 0, 80, 1, 36, 15, 80, 1, 100, 12, 4, 1/)
DO 10 I_I = 1, 80
   IF (K_PQ(1) == MOD((I_I+14)/12,3)) THEN
      K_V0 = MOD(I_I+14,12)+(I_I+14)/36*12
      I_A (K_V0) = I_I
   ENDIF
   IF (K_PQ(1) == MOD((I_I+14)/12,3)) THEN
      K_V0 = MOD(I_I+14,12)+(I_I+14)/36*12
      R_B (K_V0) = 1.0/I_I
   ENDIF
10 CONTINUE
CALL AD_DOT_PRODUCT_IR (T_V0, I_A, K_MA, R_B, K_MA)
D_RES = T_V0
IF (K_P0 == 0) THEN
   PRINT "(f12.6)", D_RES
ENDIF
[...]
END

[Diagram: this translated code is compiled by an ordinary Fortran 90 compiler to give the parallel programme.]
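The index arithmetic in the generated loop above can be checked in isolation. Here is a plain-Python sketch (the function and its names are mine, not part of SHPF) of the owner/local-offset computation for an array element a(i) aligned at template column 15 + i, where that template dimension is distributed cyclic(12) over a process dimension of extent 3:

```python
# Ownership arithmetic for the SHPF example above (illustrative sketch).

def owner_and_offset(i, blk=12, nprocs=3, align_shift=15):
    """Return (process coordinate, local offset) for global element a(i)."""
    t = i + align_shift - 1          # 0-based template column, = i + 14
    owner = (t // blk) % nprocs      # cf. MOD((I_I+14)/12, 3) in the loop
    # Offset within the block, plus one block per completed cycle:
    local = t % blk + (t // (blk * nprocs)) * blk
    # cf. MOD(I_I+14, 12) + (I_I+14)/36*12
    return owner, local

# a(1) sits at template column 16, i.e. block 1 -> process coordinate 1:
assert owner_and_offset(1) == (1, 3)
# No local offset exceeds 35, matching the declaration I_A(0:35):
assert max(owner_and_offset(i)[1] for i in range(1, 81)) == 35
```

The two commented correspondences show that the preprocessor's MOD expressions are just this owner/offset computation inlined into the loop body.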
In 1996 I moved to Syracuse University, NY, to work with Geoffrey Fox on the Parallel Compiler Runtime Consortium project – a DARPA project involving several US sites.
Syracuse + Peking University were developing another HPF compiler.
• http://www.hpjava.org/pcrc/
[Diagram: four interfaces – SHPF Fortran 90, PCRC Fortran 77, ad++, and PCRC Java – sit on the C++ Adlib kernel, which runs over MPI. The kernel provides distributed array descriptors and “distributed control”; process groups and distributed index ranges; and collective communication operations on distributed arrays – remap, shifts, reductions, gather-scatter, F90 intrinsics, …]
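To make “collective communication operations on distributed arrays” concrete, here is a single-process Python simulation of one such operation, a cyclic shift over a block-distributed array. The decomposition and function names are illustrative only, not the real Adlib API:

```python
# Sketch of an Adlib-style collective "shift" on a block-distributed array,
# simulated in one process (names are illustrative, not the Adlib API).

def block_decompose(xs, nprocs):
    """Split a global array into contiguous per-process blocks."""
    step = (len(xs) + nprocs - 1) // nprocs
    return [xs[p * step:(p + 1) * step] for p in range(nprocs)]

def shift(segments, amount):
    """Cyclically shift the distributed array by `amount` positions.
    In a real implementation each process sends its departing boundary
    elements to a neighbour; here we model the result of that exchange."""
    flat = [x for seg in segments for x in seg]
    shifted = flat[-amount:] + flat[:-amount]
    # Redistribute the shifted global array over the same blocks.
    out, pos = [], 0
    for seg in segments:
        out.append(shifted[pos:pos + len(seg)])
        pos += len(seg)
    return out

segs = block_decompose(list(range(8)), 4)    # [[0,1],[2,3],[4,5],[6,7]]
assert shift(segs, 1) == [[7, 0], [1, 2], [3, 4], [5, 6]]
```

The point of the kernel is that user code (or generated code) calls one collective function per pattern; the neighbour exchanges implied by the data distribution happen inside the library.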
By 1997 it seemed clear HPF would fail – the programming model was too inflexible, and optimising compilers too hard.
We proposed HPspmd as a simpler and more flexible data-parallel language model.
• Essentially, the box on the LHS of the Adlib kernel diagram (previous slide) goes into the language syntax; all communication functions (RHS box) are explicitly called from user code.
Java was new and interesting, and looked simpler than C++.
• Fox and others believed Java might replace Fortran for scientific computing (Java Grande).
An NSF proposal for the HPspmd model identified Java as the base language for the initial binding.
• It took about 5 years to produce the HPJava development kit (compiler and libraries).
• http://www.hpjava.org/
Procs2 p = new Procs2(P, P) ;
on(p) {
   Range x = new BlockRange(N, p.dim(0)) ;
   Range y = new BlockRange(N, p.dim(1)) ;
   float [[-,-]] a = new float [[x, y]],
                 b = new float [[x, y]],
                 c = new float [[x, y]] ;
   ...
   overall(i = x for :)
      overall(j = y for :)
         c [i, j] = a [i, j] + b [i, j] ;
}
do {
   Adlib.writeHalo(a) ;
   overall(i = x for 1 : N - 2)
      overall(j = y for 1 + (i` + parity) % 2 : N - 2 : 2) {
         float newA = 0.25F * (a [i - 1, j] + a [i + 1, j] +
                               a [i, j - 1] + a [i, j + 1]) ;
         r [i, j] = Math.abs(newA - a [i, j]) ;
         a [i, j] = newA ;
      }
   parity = 1 - parity ;
} while(parity == 1 || Adlib.maxval(r) > EPS) ;
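The sweep structure of the HPJava example above can be mimicked serially. This Python sketch uses the same stride-2 red-black colouring (the grid size, boundary values, and EPS are my own choices for illustration):

```python
# Serial sketch of the red-black relaxation sweep from the HPJava example.

def red_black_sweep(a, parity):
    """One half-sweep: update interior points whose colour matches
    `parity`; return the largest change seen (cf. Adlib.maxval(r))."""
    n = len(a)
    max_change = 0.0
    for i in range(1, n - 1):
        # Same stride-2 start as `1 + (i` + parity) % 2` in HPJava.
        for j in range(1 + (i + parity) % 2, n - 1, 2):
            new_a = 0.25 * (a[i - 1][j] + a[i + 1][j] +
                            a[i][j - 1] + a[i][j + 1])
            max_change = max(max_change, abs(new_a - a[i][j]))
            a[i][j] = new_a
    return max_change

# Relax a 6x6 grid: boundary fixed at 1.0, interior starts at 0.0.
N, EPS = 6, 1e-6
a = [[1.0 if i in (0, N - 1) or j in (0, N - 1) else 0.0
      for j in range(N)] for i in range(N)]
parity, r = 0, EPS + 1
while parity == 1 or r > EPS:       # cf. the do/while in HPJava
    r = red_black_sweep(a, parity)
    parity = 1 - parity
# The solution of Laplace's equation with boundary 1.0 is 1.0 everywhere.
assert all(abs(a[i][j] - 1.0) < 1e-3 for i in range(N) for j in range(N))
```

What the serial version cannot show is the role of Adlib.writeHalo: in the distributed code, each half-sweep must first refresh ghost regions so that a[i-1,j] etc. are valid at partition boundaries.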
mpiJava – a Java wrapper to native implementations of MPI.
• http://www.hpjava.org/mpiJava.html
MPJ Express – a newer implementation of MPI-like functions in Java (work with Mark Baker and Aamir Shafi – Portsmouth, Reading, NUST).
• http://www.mpj-express.org
Proposal by Mark Baker, Graham Megson, and me.
Basic run-time system provides a common interface to assorted multicore architectures.
HPspmd-like higher-level programming model; C/C++ rather than Java.
 Still unfunded…
Gadget-2 is an MPI code for cosmological N-body and hydrodynamics simulations.
• In the MPJ Express work, we made a Java version of this code, as a demo.
Recently revisited the Gadget code – now analysed as a potential demo of HPspmd.
• New communication patterns of general interest: Collective Asynchronous Remote Invocation / Collective Asynchronous Remote Access.