
Component Infrastructure of CQoS and Its Application in Scientific Computations

Li Li (1), Boyana Norris (1), Lois Curfman McInnes (1), Kevin Huck (2), Joseph Kenny (3), Meng-Shiou Wu (4)

(1) Argonne National Laboratory, Argonne, IL
(2) University of Oregon
(3) Sandia National Laboratories, California
(4) Ames Laboratory

CCA meeting, Jan. 2009
Outline
 Motivation
 CQoS introduction
 Database component design
 Application examples
 Ongoing and future work
Overall Goals
 Automate the configuration and runtime adaptation of high-performance component applications through the so-called Computational Quality of Service (CQoS) infrastructure
– Instrumentation of component interfaces
– Performance data gathering
– Performance analysis
– Adaptive algorithm support
 Motivating application examples
– Quantum chemistry challenges: How, during runtime, can we make the best choices for reliability, accuracy, and performance of interoperable QC components?
• When several QC components provide the same functionality, what criteria should be employed to select one implementation for a particular application instance and computational environment?
• How do we incorporate the most appropriate externally developed components (e.g., which algorithms to employ from numerical optimization components)?
Motivating Application Examples (cont.)
 Overall simulation times for nonlinear (time-dependent) PDE-based models often depend to a large extent on the robustness and efficiency of sparse linear solvers
– Properties of the linear system change during runtime
– No single method is best because of the complexity of long-running applications
 Efficient parallel structured adaptive mesh refinement (SAMR) applications depend on load-balancing algorithms [1]
– Computational resources are dynamically concentrated in areas that need high accuracy
– Application and computer state change at runtime
– Dynamic resource allocation requires that the workload partitioning algorithm be selected at runtime according to state changes

[1] J. Steensland and J. Ray, "A Partitioner-Centric Model for SAMR Partitioning Trade-Off Optimization: Part I," International Journal of High Performance Computing Applications, 2005, 19(4):409-422.
Outline
 Motivation
 CQoS introduction
 Database component design
 Application examples
 Ongoing and future work
[Figure: CQoS architecture]
 CQoS Analysis Infrastructure: performance monitoring, problem/solution characterization, and performance model building. An instrumented component application feeds application cases into performance databases (historical and runtime), which support interactive analysis and model building; the scientist can analyze the data interactively.
 CQoS Control Infrastructure: interpretation and execution of control laws to modify an application's behavior. A control system (parameter changes and component substitution) consults a substitution assertion database; the scientist can provide decisions on substitution and reparameterization. The CQoS-enabled component application draws on a component substitution set (Component A, Component B, Component C).
Database Needs for Scientific Application Adaptation
 Performance analysis of candidate solvers/algorithms
– Large number of performance runs
– Store, manage, and search performance data
 Store and manage hardware, compiler, and application metadata
– Information essential to algorithm selection, e.g., system configurations, problem properties, application states
 Optimal algorithm determination
– Input data (or problem features)
– Algorithmic parameters
– Performance models (or hints)
Database Needs for Scientific Application Adaptation (cont.)
 Database use cases (see the query sketch below):
– Store historical performance data and application metadata
– Facilitate offline performance analysis
– Match the current application state against historical data through DB queries during runtime
– Search for the optimal algorithm w.r.t. the current application state
– Retrieve the settings associated with the optimal algorithm so it can be applied immediately to the application during runtime
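As a rough illustration of the runtime use cases above, the sketch below queries the performance database for the best historical solver configuration for a problem of similar size. It uses the DB port and executeQuery() from the interface slide near the end of this deck; the SQL text and the table/column names (solver_perf, nnzeros, solver_options, wall_time) are illustrative assumptions, since the actual schema follows PerfDMF and the PERI metadata formats.

/* Hedged sketch only: table and column names are hypothetical. */
#include <stdio.h>

int findBestSolverConfig(DB *perfDB, long nnzeros, Outcome &result) {
  perfDB->setConnectionInfo("dbname = perfdb");
  perfDB->connect();

  /* Look for trials whose problem size is within 10% of the current one,
     and take the configuration with the smallest wall-clock time. */
  char query[512];
  sprintf(query,
          "SELECT trial_id, solver_options, wall_time FROM solver_perf "
          "WHERE nnzeros BETWEEN %ld AND %ld ORDER BY wall_time LIMIT 1",
          (long)(0.9 * nnzeros), (long)(1.1 * nnzeros));

  int status = perfDB->executeQuery(query, result);
  perfDB->disconnect();
  return status;
}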
Outline
 Motivation
 CQoS introduction
 Database component design
 Application examples
 Ongoing and future work
CQoS Database Component Design
 Designed C++ and SIDL interfaces for CQoS database management
 Implemented prototype database management components
– Description and software: http://wiki.mcs.anl.gov/cqos/index.php/CQoS_database_components_version_0.0.0
– Based on the PerfDMF performance data format and PERI metadata formats
– Comparator interface and corresponding component for searching and matching parameter sets
CQoS Database Component Design (cont.)
[Figure: a Perf. Database component (performance data: query/store), a Meta-Database component (metadata: query/store), a Perf. Comparator component (performance data: compare/match), and a Meta-Comparator component (metadata: compare/match), all connected to an Adaptive Heuristic component.]
Fig. 1. Connecting database and comparator components to an adaptive heuristic component. There can be multiple database and comparator components that deal with different data types.
Use of DB interfaces in the 2D driven-cavity example

/* Instantiate parameter 1: the "splits" metric of the matrix. */
ierr = ComputeQuantity(matrix, "icmk", "splits", &res, &flg); CHKERRQ(ierr);
MatrixProperty param1("splits", "matrix_meta", res.i);

/* Instantiate parameter 2: the number of nonzeros in the matrix. */
ierr = ComputeQuantity(matrix, "structure", "nnzeros", &res, &flg); CHKERRQ(ierr);
MatrixProperty param2("nnzeros", "matrix_meta", res.i);

/**** Store the matrix property set into the database. ****/
int myRank;
ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &myRank); CHKERRQ(ierr);
if (myRank == 0) {
  int localID;
  int trialID;
  string conninfo("dbname = perfdb");

  /* Create a runtime database manager; it connects to a PostgreSQL
     database through the DB interfaces. */
  RunTimeRecord *R = RunTimeRecord::instance();
  R->Connect2DB(conninfo);
  trialID = R->getTrialID();
  localID = R->getCurEvtID(cflStr);  /* cflStr: event-name string defined elsewhere in the driver */

  /* Instantiate a parameter set and add parameters 1 and 2 to it. */
  PropertySet aSet;
  aSet.addAParameter(&param1);
  aSet.addAParameter(&param2);

  /* Store the parameter set into the database. */
  R->loadParameterSet(trialID, localID, aSet);
}
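Note that in this example the matrix properties are computed on all processes, but only rank 0 connects to the database and stores the property set, so each parameter set is recorded once rather than once per process.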
CQoS Performance and Metadata
 Performance (general)
– Historical performance data from different instances of the same application or related applications:
• Obtained through source instrumentation, e.g., TAU (U. Oregon)
• Binary instrumentation, e.g., HPCToolkit (Rice U.)
 Ideally, for each application execution, the metadata should provide enough information to reproduce a particular application instance. Examples (see the sketch below):
– Input data (reduced representations)
• E.g., molecule characteristics, matrix properties
– Algorithmic parameters
• E.g., convergence level, maximum number of iterations
– System parameters
• Compilers, hardware
– Domain-specific metadata
• Provided by the scientist/algorithm developer
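As a rough illustration only, metadata of this kind could be recorded with the same property and parameter-set classes used in the driven-cavity example above. The property names, the "app_meta"/"system_meta" labels, and the reuse of MatrixProperty for non-matrix metadata are assumptions made for the sketch; dedicated property classes for molecule or system metadata would be the natural extension.

/* Hedged sketch: record algorithmic and system metadata as a parameter set. */
MatrixProperty maxIters("max_iterations", "app_meta", 50);    /* algorithmic parameter */
MatrixProperty numNodes("num_nodes", "system_meta", 64);      /* machine configuration */
MatrixProperty threads("threads_per_node", "system_meta", 4); /* machine configuration */

PropertySet metaSet;
metaSet.addAParameter(&maxIters);
metaSet.addAParameter(&numNodes);
metaSet.addAParameter(&threads);

/* Store alongside the trial's performance data, as in the earlier example. */
RunTimeRecord *rec = RunTimeRecord::instance();
rec->Connect2DB("dbname = perfdb");
rec->loadParameterSet(rec->getTrialID(), /* event ID */ 0, metaSet);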
Outline
 Motivation
 CCA and CQoS introduction
 Database component design
 Application examples
 Ongoing and future work
Example: CQoS in Quantum Chemistry
 Initial focus: parallel application configuration of QC applications so that they can run effectively on various high-performance machines
– Eliminate guesswork and trial-and-error configuration
 Future work: more sophisticated analysis to configure algorithmic parameters for particular molecular targets, calculation approaches, and hardware environments
Interactions of the Quantum Chemistry Components with the Database and Comparator CQoS Components
[Figure: interaction diagram of the QC components with the database and comparator components]
CQoS/QC Component Wiring
[Figure: CQoS/QC component wiring diagram]
CQoS Component Usage in Quantum Chemistry
 CQoS database usage
– Application metadata
• Molecule characteristics: atom types, topology, moments of inertia
• Algorithm parameters: tunable parameters, convergence level
– System parameters
• Compilers
• Machine info, e.g., number of nodes, threads per node, network
– Historical performance data
• Execution times, etc.
• Obtained through source instrumentation, e.g., TAU
• Can guide the configuration of related new simulations
 CQoS comparator components (see the sketch below)
– Compare sets of parameters within the performance database
– Quantum chemistry applications can match the current application state against historical data through database queries during runtime
– Use metadata to guide parameter selection and application configuration
• Match molecule similarity, basis set similarity, electronic correlation approach, etc.
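As one hedged example of such matching, the sketch below asks a comparator whether a historical trial's molecule metadata is close enough to the current calculation. It assumes C++-style calls on the Comparator port shown later in this deck; the parameter names ("num_atoms", "moment_of_inertia") and the tolerance values are illustrative, not part of the released components.

/* Hedged sketch: compare current molecule metadata against a stored trial. */
bool similarEnough(Comparator *cmp,
                   ParameterSet &current, ParameterSet &historical) {
  cmp->setLHS(current);                            /* metadata of the run being configured  */
  cmp->setRHS(historical);                         /* metadata stored from a previous trial */
  cmp->setToleranceAt("num_atoms", 0.0);           /* require an identical atom count       */
  cmp->setToleranceAt("moment_of_inertia", 0.05);  /* allow a small difference              */
  return cmp->doCompare();                         /* true if all named parameters agree within tolerance */
}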
Ongoing and Future Work (Incomplete List)
 Integration of ongoing efforts in
– Performance tools: common interfaces and data representation (leveraging PerfExplorer, TAU performance interfaces, PERI tools, and other efforts)
 Support training experiment design
– To perform an empirical search for selecting the optimal solver components/parameters
 Incorporate more offline performance analysis capabilities (machine learning, statistical analysis, etc.)
 Apply to more problem domains, implementing extensions as necessary
Acknowledgements to Collaborators
 TAU Performance Tools group, University of Oregon
 Victor Eijkhout, The University of Texas at Austin
 CCA Forum members
 Funding:
– Department of Energy (DOE) Mathematical, Information, and Computational Sciences (MICS) program
– DOE Scientific Discovery through Advanced Computing (SciDAC) program
– National Science Foundation
Main Database Component Interfaces
interface DB extends gov.cca.Port {
  bool connect();
  bool disconnect();
  bool isClosed();
  void setConnectionInfo(in string info);
  string getConnectionInfo();
  int executeQuery(in string commd, out Outcome res);

  /* Store a parameter / a set of parameters into the DB. */
  void storeParameter(in int trialID, in int iterNo, in Parameter aParam);
  void storeParameterSet(in int trialID, in int iterNo, in ParameterSet aParamSet);

  /* Retrieve a parameter / parameter set value. */
  void getParameter(in int trialID, in int iterNo, inout Parameter aParam);
  void getParameterSet(in int trialID, in int iterNo, inout ParameterSet aParamSet);

  /* Retrieve trials whose parameter set values lie within [lower, upper]. */
  int getMatchingTrialsBetween(in ParameterSet lower, in ParameterSet upper, out Outcome trialIDs);
  /* Retrieve trials whose parameter set values lie within [lower-epsilons, lower+epsilons]. */
  int getMatchingTrials(in ParameterSet lower, in vector epsilons, out Outcome trialIDs);
}

interface Comparator extends gov.cca.Port {
  /* Comparison operations between parameter sets. */
  void setLHS(in ParameterSet lefthand);
  void setRHS(in ParameterSet righthand);
  ParameterSet getLHS();
  ParameterSet getRHS();
  int getDimension();
  Parameter getLHSParameterAt(in string paraName);
  Parameter getRHSParameterAt(in string paraName);
  void setToleranceAt(in string name, in double epsilon);
  double getToleranceAt(in string name);
  void setRelationAt(in string name, in int aRelation);
  int getRelationAt(in string name);
  bool doCompare();
}
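A hedged usage sketch of these ports is shown below. It assumes C++-style calls generated from the SIDL (the exact binding types, in particular Outcome and the epsilons vector, may differ) and uses placeholder trial and iteration IDs.

/* Hedged sketch: find historical trials whose parameter sets lie within an
   epsilon band around the current one, then retrieve one recorded set. */
void retrieveSimilarTrials(DB *db, ParameterSet &current, vector &epsilons) {
  db->setConnectionInfo("dbname = perfdb");
  db->connect();

  Outcome trialIDs;
  int numMatches = db->getMatchingTrials(current, epsilons, trialIDs);

  if (numMatches > 0) {
    /* Retrieve the full parameter set recorded for a matching trial
       (the trial and iteration IDs here are placeholders). */
    ParameterSet recorded;
    db->getParameterSet(/* trialID = */ 1, /* iterNo = */ 0, recorded);
  }
  db->disconnect();
}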
Database Component Usage – Example 1: 2D Driven Cavity Flow [1]
 Linear solver: GMRES(30); vary only the fill level of the ILU preconditioner
 Adaptive heuristic based on:
– Matrix properties (which change during runtime), computed with Anamod (Eijkhout, http://sourceforge.net/projects/salsa/)

[1] T. S. Coffey, C. T. Kelley, and D. E. Keyes, "Pseudo-transient continuation and differential algebraic equations," SIAM J. Sci. Comput., 25:553-569, 2003.
How Are the Database Components Used?
 During runtime, the driver (e.g., a linear solver proxy component) evaluates important matrix properties and matches them against historical data in the MetaDB through the PropertyComparator interfaces.
 Linear solver performance data is retrieved and compared for the current matrix properties; this is accomplished by the PerfComparator component.
 The linear solver parameters that yielded the best performance, in this case the fill level of the ILU preconditioner, are returned to the driver.
 The driver adapts accordingly and continues execution (see the sketch below).
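Putting the steps above together, a hedged sketch of the adaptation step follows. The helper queryBestFillLevel() is hypothetical and stands in for the MetaComparator/PerfComparator interaction described above; res and flg are the same out-arguments used in the earlier driven-cavity code, and the PETSc option call is one example of how the driver could apply the returned fill level (the exact call depends on the PETSc version in use).

/* 1. Evaluate a matrix property of the current Jacobian (as in the earlier example). */
ierr = ComputeQuantity(matrix, "structure", "nnzeros", &res, &flg); CHKERRQ(ierr);
MatrixProperty nnz("nnzeros", "matrix_meta", res.i);

PropertySet currentProps;
currentProps.addAParameter(&nnz);

/* 2. Match the current properties against historical metadata and retrieve the
      ILU fill level that gave the best GMRES(30) performance for similar matrices.
      queryBestFillLevel() is a hypothetical stand-in for the comparator queries. */
int bestFill = queryBestFillLevel(currentProps);

/* 3. Reconfigure the ILU preconditioner and continue execution. */
char levels[8];
sprintf(levels, "%d", bestFill);
ierr = PetscOptionsSetValue("-pc_factor_levels", levels); CHKERRQ(ierr);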