N Tropy: A Framework for
Knowledge Discovery in a
Virtual Universe
Harnessing the Power of Parallel Grid
Resources for Astrophysical Data Analysis
Jeffrey P. Gardner
Pittsburgh Supercomputing Center
University of Pittsburgh
Carnegie Mellon University
Mining the Universe can be
(Computationally) Expensive
Computational Astrophysics:
The size of simulations is frequently limited by the
inability to process simulation output (no
parallel group finders).
Current MPPs have thousands of processors… this
is already too large for serial processing.
The next generation of MPPs will have hundreds of
thousands of processors!
Mining the Universe can be
(Computationally) Expensive
Observational Astronomy:
Paradigm shift in astronomy: sky surveys
Observers now generate ~1 TB of data per night
With Virtual Observatories, one can pool data from
multiple catalogs.
Computational requirements are growing at a
faster rate than computational power.
There will be some problems that would be
impossible without parallel machines.
There will be many problems for which throughput
can be substantially enhanced by parallel
machines.
The Challenge of Parallel Data
Analysis
Parallel programs are hard to write!
Parallel world is dominated by simulations:
Code is often reused for many years by many people
Therefore, you can afford to spend lots of time writing the
code.
Data Analysis does not work this way:
Steep learning curve for parallel programming
Lengthy development time
Rapidly changing scientific inquiries
Less code reuse
Data Mining paradigm mandates rapid software
development!
Tightly-Coupled Parallelism
(what this talk is about)
Data and computational domains overlap
Computational elements must communicate
with one another
Examples:
Group finding
N-Point correlation functions
New object classification
Density estimation
Solution(?):
N Tropy
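To see why these computations are tightly coupled, consider the simplest
case, a 2-point pair count: every point must be tested against points
that may live in another processor's domain. A purely illustrative
serial version (not NTropy code):

#include <math.h>
#include <stddef.h>

/* Illustrative only: count pairs with separation in [rMin, rMax).
 * In parallel, the neighbors of a point may live on another
 * processor, which is what makes the computation tightly coupled. */
size_t PairCount(const double (*p)[3], size_t n, double rMin, double rMax) {
    size_t i, j, nPairs = 0;
    for (i = 0; i < n; ++i) {
        for (j = i + 1; j < n; ++j) {
            double dx = p[i][0] - p[j][0];
            double dy = p[i][1] - p[j][1];
            double dz = p[i][2] - p[j][2];
            double r = sqrt(dx*dx + dy*dy + dz*dz);
            if (r >= rMin && r < rMax)
                ++nPairs;
        }
    }
    return nPairs;
}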
The Goal
GOAL: Minimize development time for parallel
applications.
GOAL: Enable scientists with no parallel
programming background (or time to learn) to
still implement their algorithms in parallel.
GOAL: Provide seamless scalability from single
processor machines to MPPs…potentially even
several MPPs in a computational Grid.
GOAL: Do not restrict inquiry space.
Methodology
Limited Data Structures:
Astronomy deals with point-like data in an N-dimensional parameter
space.
The most efficient methods on this kind of data use trees.
Limited Methods:
Analysis methods perform a limited number of fundamental
operations on these data structures.
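As a concrete sketch of the shared data structure, a node of such a
tree might carry a bounding box, a particle range, and child links. The
field names below are invented for illustration; the talk does not show
NTropy's actual node layout:

#define NDIM 3

/* Hypothetical tree node for point-like data in an N-dimensional
 * parameter space. */
typedef struct kdNode {
    double bndMin[NDIM];    /* lower corner of the bounding box */
    double bndMax[NDIM];    /* upper corner of the bounding box */
    int    iLower, iUpper;  /* index range of particles in this node */
    int    iLeft, iRight;   /* child node indices; -1 marks a bucket */
} KDNODE;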
N Tropy Design
PKDGRAV already provides a number of
advanced services
PKDGRAV benefits to keep:
Flexible client-server scheduling architecture
Portability
Threads respond to service requests issued by master.
To do a new task, simply add a new service.
Interprocessor communication occurs by high-level
requests to “Machine-Dependent Layer” (MDL) which is
rewritten to take advantage of each parallel architecture.
Advanced interprocessor data caching
Fewer than 1 in 100,000 off-PE requests actually results in
communication.
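A hedged sketch of the service pattern (every identifier below is
invented for illustration; the talk describes the mechanism, not the
API): the master broadcasts a service ID plus an input struct, and each
thread dispatches to the matching handler on its local data.

typedef void (*ServiceFn)(void *vIn, void *vOut);

struct Service {
    int id;        /* service ID broadcast by the master */
    ServiceFn fn;  /* handler run on every thread */
};

/* Adding a new task means writing one handler that works on the
 * thread's local data and registering it; no new communication code. */
static void srvDensity(void *vIn, void *vOut) {
    /* ... compute densities over this thread's local particles ... */
    (void)vIn;
    (void)vOut;
}

static const struct Service services[] = {
    { 100 /* hypothetical SRV_DENSITY */, srvDensity },
};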
N Tropy New Features
Dynamic load balancing (available now)
Workload and processor domain boundaries can be
dynamically reallocated as computation
progresses.
Data pre-fetching?? (To be implemented)
Predict and request off-PE data that will be needed for
upcoming tree nodes.
mdlSlurp(): Prefetch a big block of data to a special block
of local memory
Intelligent prediction: Investigate active learning
algorithms to prefetch off-PE data.
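Since mdlSlurp() is only named in this talk, its signature below is an
assumption, sketched to show the intent: before traversing a remote
subtree, pull a contiguous block of its nodes into a local staging
buffer with one bulk request instead of many individual cache misses.

typedef struct mdlContext *MDL;                          /* assumed handle type */
void mdlSlurp(MDL mdl, int cid, int idRemote,
              int iStart, int nItems, void *pLocalBuf);  /* assumed signature */

/* Hypothetical helper: stage nNodes tree nodes owned by thread
 * idRemote, starting at node index iRootNode, into a local buffer. */
void PrefetchSubtree(MDL mdl, int idRemote, int iRootNode, int nNodes,
                     void *stage) {
    mdlSlurp(mdl, /* cache id */ 0, idRemote, iRootNode, nNodes, stage);
}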
Performance
[Scaling plot] 10 million particles, spatial 3-point correlation
function, 3–4 Mpc. (SDSS DR1 takes less than 1 minute with perfect
load balancing.)
PHASE 1 Performance
[Scaling plot] 10 million particles, projected 3-point correlation
function, 0–3 Mpc.
NTropy Conceptual Schematic
[Schematic] Framework ("black box") components: domain decomposition/
tree building, parallel I/O, dynamic workload management, collectives,
tree traversal, tree services, the computational steering layer, and a
possible VO web service layer (WSDL? SOAP?), callable at least from
Python.
User-supplied components (C, C++, Python, Fortran?): tree and particle
data, serial I/O routines, tree traversal routines, and serial
collective staging and processing routines.
N Tropy Design
Use only Parallel Management Layer (pst) and
Interprocessor Communication Layer (mdl) of
PKDGRAV.
Rewrite everything else from scratch
PKDGRAV Functional Layout
Computational Steering Layer: executes on the master processor.
Parallel Client/Server Layer (pst): coordinates execution and data
distribution among processors.
Serial Layer (Gravity Calculator, Hydro Calculator): executes on all
processors.
Interprocessor Communication Layer (mdl): passes data between
processors.
N Tropy Functional Layout
Key: layers completely rewritten, layers retained from PKDGRAV, and
"user"-supplied methods.
Executed on the master thread only:
Application steering script (ntropy_* functions)
NTropy Computational Steering API
PKDGRAV Parallel Client/Server Layer
Executed on all threads:
Thread services layer
Application "tasks" (serial code, ntp_* functions), including tree node
and particle navigation methods (e.g. acquire node/particle, next
node/particle, parent node, child node)
MDL interprocessor communication layer
Using Ntropy (example: N-body)
Computational Steering Layer (nbody.c)
#include <stdio.h>
#include "ntropy.h"   /* Mandatory */
#include "nbody.h"    /* My application-specific stuff */

int main(int argc, char **argv) {
    NTROPY ntropy;              /* Mandatory */
    struct nbodyStruct nbody;   /* My app-specific struct */
    FILE *fpLogFile;
    int nThreads;

    /* Start ntropy. Only the master thread will ever return from this function.
     * The slave threads will return to "main_ch" in ntropy.c. */
    ntropy = ntropy_Initialize(argc, argv);

    /* Process command line arguments */
    /* Add nbody-specific command line arguments */
    nbodyAddParams(ntropy, &nbody);
    /* Read in command line arguments */
    ntropy_ParseParams(ntropy, argc, argv);
    /* Now that the command-line parameters have been parsed, we can
     * examine them. */
    nbodyProcParams(ntropy, &nbody);
    /* Open the log file and write the header */
    fpLogFile = nbodyLogParams(ntropy, &nbody);
Using Ntropy (example: N-body)
Computational Steering Layer (nbody.c)
    /* Start threads by calling ntropy_ThreadStart. This starts NTropy
     * computational services on all compute threads, pushes the data in the
     * Global and Param structs to all threads, and runs the function
     * tskLocalInit on all threads. tskLocalInit is the constructor for your
     * thread environment, and is convenient if you want to set up any local
     * structs. You can also provide tskLocalFinish, which is the destructor
     * for your thread environment. */
    nThreads = ntropy_ThreadStart(ntropy, &(nbody.global), sizeof(nbody.global),
                                  &(nbody.param), sizeof(nbody.param),
                                  (*tskLocalInit), (*tskLocalFinish));

And in task_fof.c:

#include <stdlib.h>   /* for malloc/free */
#include "ntropy.h"

void *tskLocalInit(NTP ntp) {
    struct nbodyLocalStruct *local;  /* A struct that I invent that will store
                                      * all "thread-global" variables that I
                                      * will need for this computation */
    local = (struct nbodyLocalStruct *)malloc(sizeof(struct nbodyLocalStruct));
    return (void *)local;
}

void tskLocalFinish(NTP ntp) {
    struct nbodyLocalStruct *local = ntp_GetLocal(ntp);
    free(local);
}
Using Ntropy (example: N-body)
Computational Steering Layer (nbody.c)
N-body structs:
/* This struct has everything that I read in from the command line */
struct nbodyParamStruct {
    double dOmega0;
    int nSteps;
    …
};

/* This struct has everything that I don't read in from the command line,
 * but still want all threads to know */
struct nbodyGlobalStruct {
    double dRedshift;
    double dExpansionFactor;
    …
};

/* This struct has everything that I want to store locally in each
 * compute thread. */
struct nbodyLocalStruct {
    int nParticleInteractions;
    int nCellInteractions;
    …
};
Using Ntropy (example: N-body)
Computational Steering Layer (nbody.c)
Main N-body loop:
    nbody.iStep = 0;
    nbody.iOut = 0;
    while (nbody.iStep == 0 || nbody.iOut < nbodySteps(&nbody)) {
        /* Build the tree */
        ntropy_BuildTree(ntropy, (*tskCellAccum), (*tskBucketAccum));
        /* Do the gravity walk using dynamic load balancing */
        ntropy_Dynamic(ntropy, (*tskGravityInit), (*tskGravityFinish),
                       (*tskGravity));
        /* Or, if you don't want dynamic load balancing for this function,
         * use ntropy_Static() */
        /* Write results if needed
         * (nbodyWriteOutput checks if it needs to output. If so, it calls
         * ntropy_WriteParticles(), then increments nbody.iOut.) */
        nbodyWriteOutput(ntropy, &nbody);
        ++nbody.iStep;
    }
/* Finish */
ntropy_Finish(ntropy);
Using Ntropy (example: N-body)
Task Layer (nbody_task.c)
void tskGravity(NTP ntp, NS *pNodeSpecStart) {
    /* Get all the structs that I will need */
    struct nbodyParamStruct *param = ntp_GetParam(ntp);
    struct nbodyGlobalStruct *global = ntp_GetGlobal(ntp);
    struct nbodyLocalStruct *local = ntp_GetLocal(ntp);
    NS nodeSpecBucket;  /* Handle for the current bucket node */
    NP nodePtrBucket;   /* Pointer to the data of the current bucket */
    NS nodeSpecDone;    /* The handle of the node that is "next" for
                         * pNodeSpecStart. When we walk to this node, we
                         * will be done. */

    ntpNS_Next(ntp, pNodeSpecStart, &nodeSpecDone);
    ntpNS_Copy(ntp, pNodeSpecStart, &nodeSpecBucket);
    /* Find each bucket that is in pNodeSpecStart's domain and do a tree
     * walk for it */
    while (ntpNS_Compare(ntp, nodeSpecBucket, nodeSpecDone)) {
        /* Descend until we reach a bucket */
        ACQUIRE_NODEPTR(ntp, nodePtrBucket, nodeSpecBucket);
        while (ntpNP_Type(ntp, nodePtrBucket) != NTP_NODE_BUCKET) {
            ntpNS_ReplaceLower(ntp, &nodeSpecBucket, nodePtrBucket);
            ACQUIRE_NODEPTR(ntp, nodePtrBucket, nodeSpecBucket);
        }
        /* Now nodeSpecBucket and nodePtrBucket are pointing at the next
         * bucket to process. */
        MyWalkBucket(ntp, nodeSpecBucket, nodePtrBucket);
        /* Advance to the successor of this bucket */
        ntpNS_ReplaceNext(ntp, &nodeSpecBucket, nodePtrBucket);
    }
}
Using Ntropy (example: N-body)
Task Layer (nbody_task.c)
void MyWalkBucket(NTP ntp, NS nodeSpecBucket, NP nodePtrBucket) {
    NS nodeSpec;  /* Handle for the node currently being looked at */
    NP nodePtr;   /* Pointer to the data of the node currently being looked at */
    int iWalkResult, iInteractResult;

    ntpNS_PstStart(ntp, &nodeSpec);  /* Get the handle of the root of the PST */
    do {
        ACQUIRE_NODEPTR(ntp, nodePtr, nodeSpec);
        iInteractResult = MyCellCellInteract(ntp, nodeSpecBucket, nodePtrBucket,
                                             nodePtr, nodeSpec);
        if (iInteractResult == MY_CELL_OPEN)
            iWalkResult = ntpNS_ReplaceLower(ntp, &nodeSpec, nodePtr);
        else  /* iInteractResult == MY_CELL_NEXT */
            iWalkResult = ntpNS_ReplaceNext(ntp, &nodeSpec, nodePtr);
    } while (iWalkResult != NTP_NULL_PST);
}
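MyCellCellInteract is the user-supplied piece that decides whether to
open a cell or accept it and move on. A minimal sketch with a standard
Barnes-Hut style opening-angle test; the ntpNP_Center() and
ntpNP_Size() accessors are assumed here, not taken from the talk:

#define MY_THETA2 (0.7 * 0.7)   /* opening-angle criterion, squared */

/* User-supplied cell-cell decision. ntpNP_Center() and ntpNP_Size()
 * are assumed accessor names for this sketch. */
int MyCellCellInteract(NTP ntp, NS nodeSpecBucket, NP nodePtrBucket,
                       NP nodePtr, NS nodeSpec) {
    double bucketCtr[3], cellCtr[3], dx, r2 = 0.0, cellSize;
    int d;

    ntpNP_Center(ntp, nodePtrBucket, bucketCtr);
    ntpNP_Center(ntp, nodePtr, cellCtr);
    cellSize = ntpNP_Size(ntp, nodePtr);
    for (d = 0; d < 3; ++d) {
        dx = bucketCtr[d] - cellCtr[d];
        r2 += dx * dx;
    }
    /* If the cell subtends too large an angle as seen from the bucket,
     * descend into it. */
    if (cellSize * cellSize > MY_THETA2 * r2)
        return MY_CELL_OPEN;
    /* ... otherwise accumulate the far-field interaction here ... */
    return MY_CELL_NEXT;
}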
Further NTropy Features
Custom cell and particle data
Selectable cache types
Read-only
Combiner
Explicit control over cache starting and
stopping
Cache statistics
10 user-selectable timers and 4
“automatic” timers
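A hedged sketch of how these controls might be exercised from a task.
The talk lists the features (cache types, explicit start/stop, timers)
but not the calls, so every identifier below is an assumption:

/* All names here are assumed for illustration. */
void tskDensity(NTP ntp, NS *pNodeSpecStart) {
    ntp_TimerStart(ntp, 0);              /* user-selectable timer 0 */

    /* Read-only cache: remote cells are fetched but never written back. */
    ntp_CacheStart(ntp, NTP_CACHE_READONLY);
    /* ... tree walk over pNodeSpecStart's domain ... */
    ntp_CacheStop(ntp);                  /* flush and record cache statistics */

    ntp_TimerStop(ntp, 0);
    (void)pNodeSpecStart;
}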
Further NTropy Features
Custom command-line and parameter
file options
Automatic reduction variables
Range of collectives (AllGather, AllToAll,
AllReduce)
I/O primitives (TIPSY “array” and “vector” files) as well as flexible
user-defined I/O.
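A sketch of the collective call shape, for illustration only: the talk
lists AllGather, AllToAll, and AllReduce but does not show NTropy's
signatures, so ntp_AllReduce() and NTP_SUM below are assumptions.

/* Sum the per-thread interaction counters into one global total. */
void ReportInteractions(NTP ntp) {
    struct nbodyLocalStruct *local = ntp_GetLocal(ntp);
    double nLocal = (double)local->nParticleInteractions;
    double nTotal;

    ntp_AllReduce(ntp, &nLocal, &nTotal, 1, NTP_SUM);  /* assumed call */
}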
Conclusions
Most data analysis in astronomy is done
using trees as the fundamental data
structure.
Most operations on these tree structures
are functionally identical.
Based on our studies so far, it appears
feasible to construct a general-purpose
parallel framework that users can
rapidly customize to their needs.