
CQoS Update
Li Li, Boyana Norris, Lois Curfman McInnes
Argonne National Laboratory
Kevin Huck
University of Oregon
Outline
• Interfaces and components for
  – Performance database management
  – CQoS parameters
• PerfExplorer and ongoing work on CQoS analysis components
[Architecture figure: An instrumented component application feeds performance databases (historical & runtime). The Analysis Infrastructure covers performance monitoring, problem/solution characterization, and performance model building, including interactive analysis and model building (the scientist can analyze data interactively). The Control Infrastructure covers interpretation and execution of control laws to modify an application's behavior: a control system for parameter changes and component substitution, backed by a substitution assertion database (the scientist can provide decisions on substitution and reparameterization). Together these drive a CQoS-enabled component application with a component substitution set (Components A, B, C).]
Outline
• Motivation
• Introduction to components for high-performance computing and computational quality of service (CQoS) architecture
• Database component design
• Application examples
• Ongoing and future work
CQoS Database Component Design
• Designed SIDL interfaces for CQoS database management
• Implemented prototype database management components
  – Description and software: http://wiki.mcs.anl.gov/cqos/index.php/CQoS_database_components_version_0.0.0
  – Based on PerfDMF performance data format and PERI metadata formats
  – Comparator interface and corresponding component for searching and matching parameter sets (see the sketch below)
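
The concrete interface specification is the SIDL description at the wiki link above; purely as an illustration of the database/comparator split, a C++ sketch might look like the following (all class and method names here are hypothetical, not the published interfaces):

// Hypothetical C++ rendering of the CQoS database/comparator roles.
// The real interfaces are specified in SIDL (see the wiki link above);
// names and signatures here are illustrative only.
#include <map>
#include <string>
#include <vector>

// A parameter set: name/value pairs, e.g. "ilu_fill_level" -> "2".
using ParameterSet = std::map<std::string, std::string>;

// Stores and retrieves per-run records (performance data or metadata).
class Database {
public:
  virtual ~Database() = default;
  virtual void store(int runId, const ParameterSet& record) = 0;
  virtual ParameterSet query(int runId) const = 0;
};

// Searches and matches parameter sets, e.g. "which stored run is most
// similar to the current one?"
class Comparator {
public:
  virtual ~Comparator() = default;
  virtual double distance(const ParameterSet& a,
                          const ParameterSet& b) const = 0;
  virtual int findClosest(const ParameterSet& current,
                          const std::vector<int>& candidateRuns,
                          const Database& db) const = 0;
};
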
CQoS Database Component Design
[Fig. 1: Database and comparator components connected to an adaptive heuristic component. The Adaptive Heuristic component is connected to a Perf. Database (performance data: query/store), a Meta-Database (metadata: query/store), a Perf. Comparator (performance data: compare/match), and a Meta-Comparator (metadata: compare/match). There can be multiple database and comparator components that deal with different data types.]
CQoS Performance and Metadata
• Performance (general)
  – Historical performance data from different instances of the same application or related applications:
    • Obtained through source instrumentation, e.g., TAU (U. Oregon)
    • Or through binary instrumentation, e.g., HPCToolkit (Rice U.)
• Ideally, for each application execution, the metadata should provide enough information to reproduce a particular application instance (see the sketch after this list). Examples:
  – Input data (reduced representations)
    • Matrix properties, condition number
  – Algorithmic parameters
    • Convergence tolerance, CFL number, maximum number of iterations
  – System parameters
    • Compilers, hardware
  – Domain-specific
    • Provided by the scientist/algorithm developer
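
As a hypothetical illustration of such a record, the categories above might be bundled as follows (field names are examples taken from the bullets; the actual schema follows the PerfDMF and PERI formats and is richer than this):

// Hypothetical metadata record for one application execution. Field
// names mirror the example categories above; the real schema follows
// the PerfDMF and PERI metadata formats.
#include <map>
#include <string>

struct ExecutionMetadata {
  // Input data (reduced representations)
  double conditionNumberEstimate = 0.0;
  bool   symmetric = false;

  // Algorithmic parameters
  double convergenceTolerance = 1.0e-6;
  double cflNumber = 0.5;
  int    maxIterations = 1000;

  // System parameters
  std::string compiler;   // e.g. compiler name and version
  std::string hardware;   // e.g. platform or machine description

  // Domain-specific entries provided by the scientist/algorithm developer
  std::map<std::string, std::string> domainSpecific;
};
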
Outline
• Motivation
• Introduction to components for high-performance computing and computational quality of service (CQoS) architecture
• Database component design
• Application examples
• Ongoing and future work
Database Component Application – Example 1: 2D Driven Cavity Flow [1]
• Linear solver: GMRES(30); vary only the fill level of the ILU preconditioner
• Adaptive heuristic based on:
  – Matrix properties (which change during runtime) computed with Anamod (Eijkhout, http://sourceforge.net/projects/salsa/)

[1] T. S. Coffey, C. T. Kelley, and D. E. Keyes, "Pseudo-transient continuation and differential algebraic equations," SIAM J. Sci. Comput., 25:553–569, 2003.
How Do the Database Components Work?
• During runtime, the driver (e.g., a linear solver proxy component) evaluates important matrix properties and matches them against historical data in the MetaDB through the PropertyComparator interfaces.
• Linear solver performance data is retrieved and compared given the current matrix properties. This is accomplished by the PerfComparator component.
• The linear solver parameters resulting in the best performance, in this case the fill level of the ILU preconditioner, are returned to the driver.
• The driver adapts accordingly to continue execution (see the sketch below).
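
A minimal C++ sketch of that loop, with stand-in functions in place of the real components (every name below is hypothetical; only the control flow follows the steps above):

// Sketch of the runtime adaptation loop for Example 1. All functions
// below are stand-ins for the real CQoS components; only the control
// flow follows the steps described above.
#include <map>
#include <string>

using MatrixProperties = std::map<std::string, double>;

// Stand-in for Anamod-style property computation on the current matrix.
MatrixProperties computeMatrixProperties() {
  return {{"diagonal_dominance", 0.8}, {"condition_estimate", 1.0e4}};
}

// Stand-in for the PropertyComparator + MetaDB match: find the stored
// run whose matrix properties best match the current ones.
int matchHistoricalRun(const MatrixProperties& /*current*/) { return 42; }

// Stand-in for the PerfComparator + performance database: return the
// ILU fill level that performed best for the matched run.
int bestFillLevel(int /*runId*/) { return 2; }

// Stand-in for reconfiguring the GMRES(30)/ILU(k) linear solver.
void setIluFillLevel(int /*k*/) {}

int main() {
  MatrixProperties props = computeMatrixProperties();  // evaluate properties
  int runId = matchHistoricalRun(props);               // match against MetaDB
  int fill  = bestFillLevel(runId);                    // best historical choice
  setIluFillLevel(fill);                               // driver adapts
  return 0;
}
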
Example 2: Parallel Mesh Partitioning in Combustion Simulations [1]
• J. Ray et al. (Sandia) have developed a CCA toolkit for flame simulations using structured adaptive mesh refinement (SAMR). No single partitioner is optimal; thus, CQoS support for choosing an efficient meta-partitioner and an appropriate configuration for a given mesh is desirable.
• Meta-partitioner related information includes (see the data-structure sketch below):
  – Algorithm (i.e., partitioner) settings
    • E.g., actual_levels, good_enough, smoothing, maxNRLoadImbalance
  – Problem (mesh) characterization
    • E.g., number of levels, amount of refined area per level
  – Performance metrics
    • E.g., synchronization cost statistics, data migration cost statistics
[1] J. Steensland and J. Ray, "A Partitioner-Centric Model for SAMR Partitioning Trade-Off Optimization: Part I," International Journal of High Performance Computing Applications, 2005, 19(4):409–422.
Meta-Partitioner Example [1]

[1] Johan Steensland and Jaideep Ray, "A Partitioner-Centric Model for SAMR Partitioning Trade-Off Optimization: Part I," Proceedings of the 4th Annual Symposium of the Los Alamos Computer Science Institute (LACSI04), 2004.
Database Components for SAMR Partitioner
How Do the Database Components Work?
• The CharacterizationComparator component matches the current AMR grid characterization against historical data in the MetaDB to find and extract the most similar previously encountered state.
• For the returned state, we choose an appropriate rule that matches the state to an optimal partitioner.
  – How are the rules constructed? (see the sketch below)
    • The performance of various partitioners is compared for a given grid characterization.
    • The performance comparison takes into account the offset among different metrics.
    • A rule is created to map the grid state to the best-performing partitioner setting.
    • These steps are accomplished through the PerformanceComparator and AlgorithmComparator components.
• The rule maps the current grid state to an optimal partitioner.
• The main driver adapts to the new partitioner to continue the simulation.
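
A rough C++ sketch of rule construction and lookup under these steps (all names, and the matching on a single grid-state key, are simplifications for illustration, not the actual CQoS implementation):

// Sketch of rule construction and lookup for the meta-partitioner.
// All names and the one-dimensional state matching are simplifications;
// only the logic mirrors the steps described above.
#include <cstdlib>
#include <limits>
#include <map>
#include <string>
#include <vector>

struct GridState { int numLevels = 0; double refinedFraction = 0.0; };

struct Observation {                  // one historical run
  GridState   state;
  std::string partitionerConfig;      // e.g. a partitioner name plus settings
  double      weightedCost = 0.0;     // sync + migration costs, offset-adjusted
};

// Rule construction: for each recorded grid state, keep the partitioner
// configuration with the lowest (offset-adjusted) cost.
std::map<int, std::string> buildRules(const std::vector<Observation>& history) {
  std::map<int, std::string> best;    // keyed by number of levels, for brevity
  std::map<int, double> bestCost;
  for (const Observation& o : history) {
    const int key = o.state.numLevels;
    auto it = bestCost.find(key);
    if (it == bestCost.end() || o.weightedCost < it->second) {
      bestCost[key] = o.weightedCost;
      best[key] = o.partitionerConfig;
    }
  }
  return best;
}

// Runtime lookup: match the current grid state to the nearest recorded
// state and return the partitioner configuration its rule prescribes.
std::string selectPartitioner(const GridState& current,
                              const std::map<int, std::string>& rules) {
  std::string choice = "default";
  int bestDiff = std::numeric_limits<int>::max();
  for (const auto& kv : rules) {
    const int diff = std::abs(kv.first - current.numLevels);
    if (diff < bestDiff) { bestDiff = diff; choice = kv.second; }
  }
  return choice;
}
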
Ongoing and Future Work (Incomplete List)
• Validate current algorithm/solver selection strategies with application experiments
• Incorporate more offline performance analysis capabilities (machine learning, statistical analysis, etc.)
• Introduce a lightweight runtime database to avoid the overhead of accessing SQL databases (the SQL database should only be accessed at the beginning and after the end of the main computation)
• Apply to more problem domains, implementing extensions as necessary
• Integrate ongoing efforts in
  – Performance tools: common interfaces and data representation (leverage PERI tools, PerfExplorer, TAU performance interfaces, and other efforts)
  – Numerical components: emerging common interfaces (e.g., TOPS solver interfaces) increase the choice of solution method → automated composition and adaptation strategies
Acknowledgements to Collaborators
• Victor Eijkhout, The University of Texas at Austin
• Jaideep Ray, Sandia National Laboratories
• Henrik Johansson, Uppsala University, Department of Information Technology, Sweden
Thank you!