Transcript slides

Monitoring Infrastructure for
Grid Scientific Workflows
Bartosz Baliś, Marian Bubak
Institute of Computer Science
and ACC CYFRONET AGH
Kraków, Poland
WORKS08, Austin, Texas, November 17th, 2008
Outline
• Challenges in Monitoring of Grid Scientific Workflows
• GEMINI infrastructure
• Event model for workflow execution monitoring
• On-line workflow monitoring support
• Information model for recording workflow executions
Motivation
• Monitoring of Grid Scientific Workflows is important in many scenarios
 On-line & off-line performance analysis, dynamic resource
reconfiguration, on-line steering, performance optimization,
provenance tracking, experiment mining, experiment repetition, …
• Consumers of monitoring data: humans
(provenance) and processes
• On-line & off-line scenarios
• Historic records: provenance, retrospective
analysis (enhancement of next executions)
Grid Scientific Workflows
• Traditional scientific applications
 Parallel
 Homogeneous
 Tightly coupled
• Scientific workflows
 Distributed
 Heterogeneous
 Loosely Coupled
 Legacy applications often in the backends
 Grid environment
•  Challenges for monitoring arise
Challenges
• Monitoring infrastructure that conceals workflow heterogeneity
 Event subscription and instrumentation requests
• Standardized event model for Grid workflow execution
 Currently events tightly coupled to workflow environments
• On-line monitoring support
 Existing Grid information systems not suitable for fast notification-based discovery
• Monitoring information model to record executions
GEMINI: monitoring infrastructure
• Monitors: query & subscription engine, event caching, services
• Sensors: lightweight collectors of events
• Mutators: manipulation of monitored entities (e.g. dynamic instrumentation)
• Standardized, abstract interfaces for subscription and instrumentation
• Complex Event Processing: subscription management via continuous
querying
• Event representation
 XML: self-describing and extensible, but poor performance
 Google protocol buffers: under investigation
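For illustration, a self-describing XML event of this kind might be assembled as in the sketch below (Python; element names such as workflowId/taskId and the extra payload fields are assumptions, not the actual GEMINI schema):

# Illustrative sketch only: a self-describing XML monitoring event.
# Element and attribute names are assumptions, not the actual GEMINI schema.
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

def make_event(event_type, workflow_id, task_id, **attrs):
    """Build a workflow-execution event as a small XML document."""
    event = ET.Element("event", type=event_type)
    ET.SubElement(event, "timestamp").text = datetime.now(timezone.utc).isoformat()
    ET.SubElement(event, "workflowId").text = workflow_id
    ET.SubElement(event, "taskId").text = task_id
    for name, value in attrs.items():        # extensible: arbitrary extra payload
        ET.SubElement(event, name).text = str(value)
    return ET.tostring(event, encoding="unicode")

print(make_event("started.wfTask", "Wf_1234", "task_7", host="node01.site-a"))

The same fixed-header-plus-extensible-payload structure is what would map onto a Protocol Buffers message, which is what makes protobuf attractive performance-wise.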
Outline
• Event model for workflow execution monitoring
• On-line workflow monitoring support
• Information model for recording workflow executions
Workflow execution events
• Motivation: capture commonly used monitoring
measurements concerning workflow execution
• Attempts to standardize monitoring events exist, but they are oriented towards resource monitoring
 GGF DAMED ‘Top N’
 GGF NMWG Network Performance Characteristics
• Typically monitoring systems introduce a single
event type for application events
Workflow Execution Events – taxonomy
[Figure: taxonomy tree of workflow execution events, with event branches such as invoking, invoked, started, finished, failed, suspended, resumed, status.changed, progress, rate, availability, dataRead, dataWrite, over the levels software.execution, workflow, application, wfTask, process, codeRegion, call]
• Extension of GGF DAMED ‘Top N’ events
• Extensible hierarchy; example extensions:
 Loop entered – started.codeRegion.loop
 MPI app invocation – invoking.application.MPI
 MPI calls – started.codeRegion.call.MPISend
 Application-specific events
• Events for initiators and performers
 Invoking, invoked; started, finished
• Events for various execution levels
 Workflow, task, code region, data operations
• Events for various execution states
 Failed, suspended, resumed, …
• Events for execution metrics
 Progress, rate
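A minimal sketch of how the dot-separated hierarchy could be exploited for subscriptions, assuming prefix semantics (subscribing to started.codeRegion also covers started.codeRegion.loop); the matching rule itself is an assumption, not necessarily GEMINI's:

# Sketch of prefix matching over hierarchical event type names.
# The matching semantics are assumed, not taken from the GEMINI specification.
def matches(subscription: str, event_type: str) -> bool:
    """A subscription matches an event type that equals it or refines it."""
    sub_parts = subscription.split(".")
    evt_parts = event_type.split(".")
    return evt_parts[:len(sub_parts)] == sub_parts

assert matches("started.codeRegion", "started.codeRegion.loop")
assert matches("started.codeRegion", "started.codeRegion.call.MPISend")
assert not matches("started.codeRegion", "started.workflow")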
Outline
• Event model for workflow execution monitoring
• On-line workflow monitoring support
• Information model for recording workflow executions
On-line Monitoring of Grid Workflows
• Motivation
 Reaction to time-varying resource availability and application
demands
 Up-to-date execution status
• Typical scenario: ‘subscribe to all execution events related to
workflow Wf_1234’
 Distributed producers, not known a priori
• Prerequisite: automatic resource discovery of workflow
components
 New producers are automatically discovered and transparently
receive appropriate active subscription requests
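A minimal sketch of such a subscription expressed as a filter over incoming events (the field names are hypothetical, not GEMINI's actual interface):

# Sketch: a subscription as a simple filter over incoming events.
# Field names (workflowId, type) are assumptions for illustration only.
from dataclasses import dataclass

@dataclass
class Subscription:
    workflow_id: str
    type_prefix: str = ""    # empty prefix = all execution events

    def accepts(self, event: dict) -> bool:
        return (event.get("workflowId") == self.workflow_id
                and event.get("type", "").startswith(self.type_prefix))

sub = Subscription(workflow_id="Wf_1234")   # 'all execution events related to Wf_1234'
print(sub.accepts({"workflowId": "Wf_1234", "type": "started.wfTask"}))  # True
print(sub.accepts({"workflowId": "Wf_9999", "type": "started.wfTask"}))  # False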
Resource discovery in workflow monitoring
• Challenge: complex execution life cycle of a Grid workflow
 Abstract workflows: mapping of tasks to resources at runtime
 Many services involved: enactment engines, resource brokers, schedulers,
queue managers, execution managers, …
 No single place to subscribe for notifications about new workflow
components
 Discovery for monitoring must proceed bottom-up: (1) local discovery, (2)
global advertisement, (3) global discovery
• Problem: current Grid information services are not suitable
 Oriented towards query performance
 Slow propagation of resource status changes
 Example: average delay from event occurrence to notification in EGEE
infrastructure ~ 200 seconds (Berger et al., Analysis of Overhead and
Waiting Times in the EGEE production Grid)
Resource discovery: solution
• What kind of resource discovery is required?
 Identity-based, not attribute-based
 Full-blown information service functionality not needed
 Just simple, efficient key-value store
• Solution: a DHT infrastructure federated with the monitoring
infrastructure to store shared state of monitoring services
 Key = workflow identifier
 Value = producer record (Monitoring service URL, etc.)
 Multiple values (= producers) can be registered
• Efficient key-value stores
 OpenDHT
 Amazon Dynamo: efficiency, high availability, scalability. Lack of
strong data consistency (‘eventual consistency’)

Avg get/put delay ~ 15/30ms; 99th percentile ~ 200/300ms (Decandia et al. Dynamo: Amazon’s Highly
Available Key-value Store)
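A minimal sketch of the intended usage, with an in-memory dictionary standing in for the DHT (OpenDHT or Dynamo in a real deployment) and an assumed layout of the producer record:

# Sketch: the DHT used as a multi-value key-value registry of producers.
# An in-memory dict stands in for OpenDHT / Dynamo; record fields are assumed.
from collections import defaultdict

class ProducerRegistry:
    def __init__(self):
        self._store = defaultdict(list)       # workflow id -> list of producer records

    def put(self, workflow_id: str, producer_record: dict) -> None:
        self._store[workflow_id].append(producer_record)

    def get(self, workflow_id: str) -> list:
        return list(self._store[workflow_id])

registry = ProducerRegistry()
registry.put("Wf_1234", {"monitorUrl": "https://site-a.example.org/gemini"})
registry.put("Wf_1234", {"monitorUrl": "https://site-b.example.org/gemini"})
print(registry.get("Wf_1234"))   # both producers registered under a single key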
Monitoring + DHT (simplified architecture)
[Figure: simplified architecture – a Workflow Enactment Engine executes a workflow plan whose Workflow Tasks run at Sites A, B and C; a Monitor at each site collects monitoring events for its tasks and is registered in the DHT; a Consumer issues subscribe(wf1) and receives the monitoring events]
DHT-based scenario
[Sequence diagram with participants C0 : Client, M0 : Monitor, B0 : DHT, M1 : Monitor, M2 : Monitor. Messages, in order: put(wf1,m1); subscribe(wf1); get(wf1) returning m1; subscribe(wf1,C0); push(archiveEvents); push(event); put(wf1,m2); get(wf1) returning m1,m2; subscribe(wf1,C0); push(archiveEvents); push(event); push(event)]
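The flow above can be mimicked roughly as in the following sketch; who triggers the second get and how the active subscription is re-forwarded is simplified here and assumed, not taken verbatim from the scenario:

# Sketch of the DHT-based scenario: producers register, a consumer subscribes,
# and every discovered producer receives the active subscription and pushes events.
# The coordination details are simplifying assumptions.
from collections import defaultdict

dht = defaultdict(list)            # workflow id -> monitor ids (stand-in for the DHT)
subscriptions = defaultdict(list)  # workflow id -> consumer ids
notified = set()                   # (monitor, consumer) pairs already handled

def put(wf, monitor):              # a new producer registers itself
    dht[wf].append(monitor)
    refresh(wf)                    # new producers pick up active subscriptions

def subscribe(wf, consumer):       # a consumer subscribes to a workflow
    subscriptions[wf].append(consumer)
    refresh(wf)

def refresh(wf):                   # forward subscriptions to all known producers
    for monitor in dht[wf]:        # get(wf)
        for consumer in subscriptions[wf]:
            if (monitor, consumer) not in notified:
                notified.add((monitor, consumer))
                print(f"{monitor}: push(archiveEvents) -> {consumer}")
                print(f"{monitor}: push(event) -> {consumer}")

put("wf1", "M1")
subscribe("wf1", "C0")   # C0 starts receiving events from M1
put("wf1", "M2")         # M2 joins later and transparently receives the subscription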
Evaluation
• Goal:
 Measure performance & scalability
 Comparison with centralized approach
• Main characteristic measured:
 Delay between the occurrence of a new workflow component and the beginning of data transfer, for different workloads
• Two methodologies:
 Queuing Network models with multiple classes, analytical
solution
 Simulation models (CSIM simulation package)
1st methodology: Queuing Networks
• Solved analytically
[Figure: (a) DHT solution QN model – customers arriving and leaving; clients C1..Cn and monitors M1..Mm; subscribe, register, put/get on the DHT, begin transfer; stations modelled as queue+server (load-independent or load-dependent) and delay stations. (b) Centralized solution QN model – monitors M1..Mm register (“I’m producer”) with a central service CR; clients C1..Cn poll, get the main producer, and subscribe]
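For the analytical solution, standard open queuing-network formulas can be applied per station; below is a minimal sketch under product-form assumptions with load-independent queueing stations and delay stations only (the load-dependent stations from the figure are omitted), and placeholder service demands rather than the measured ones:

# Sketch: open queuing network solved with operational laws.
# demands[k] = total service demand (seconds) of a request at station k;
# the numbers are placeholders, not the measured service demands.
def open_qn_response_time(arrival_rate, demands, delay_stations=()):
    """Return (total response time, per-station utilization) for an open QN."""
    total, utilization = 0.0, {}
    for station, demand in demands.items():
        u = arrival_rate * demand
        utilization[station] = u
        if station in delay_stations:        # delay (infinite-server) station
            total += demand
        else:                                # load-independent queueing station
            if u >= 1.0:
                raise ValueError(f"station {station} saturated (U={u:.2f})")
            total += demand / (1.0 - u)
    return total, utilization

demands = {"monitor": 0.020, "dht_put": 0.030, "dht_get": 0.015}  # seconds (assumed)
for lam in (0.3, 5.0, 10.0):                 # job arrivals per second (from the slides)
    r, u = open_qn_response_time(lam, demands)
    print(f"lambda={lam:>4} /s  ->  R={r*1000:.1f} ms, max U={max(u.values()):.2f}")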
2nd methodology: discrete-event simulation
• CSIM simulation
package
Input parameters for models
• Workload intensity
 Measured in job arrivals per second
 Taken from EGEE: 3,000 to 100,000 jobs per day
 Large scale production infrastructure
 Assumed range: from 0.3 to 10 job arrivals per second
• Service demands
 Monitors and Coordinator: prototypes built and measured
 DHT: from available reports on large-scale deployments
 OpenDHT, Amazon Dynamo
Service demand matrices
Results (centralized model)
Results (DHT model)
Scalability comparison: centralized vs. DHT
Conclusion: the DHT solution scales as expected, but the centralized solution can still handle relatively large workloads before saturation
Outline
• Event model for workflow execution monitoring
• On-line workflow monitoring support
• Information model for recording workflow executions
Information model for wf execution records
• Motivation: need for structured information about past experiments
executed as scientific workflows in e-Science environments
 Provenance querying
 Mining over past experiments
 Experiment repetition
 Execution optimization based on history
• State of the art
 Monitoring information models do exist but for resource monitoring
(GLUE), not execution monitoring
 Provenance models are not sufficient
 Repositories for performance data are oriented towards event traces or
simple performance-oriented information
• Experiment Information (ExpInfo) model
 Ontologies used to describe the model and represent the records
ExpInfo model
A simplified example with particular domain ontologies
[Class diagram: a generic in silico experiment ontology (Experiment with name, executedBy, sourceFile, id, version, time, duration; ExecutionStage; Computation with operationName, cpuUsage, memoryUsage, usedInstance; DataAccess with query; DataEntity with storageMechanism, time, format, size, creator, endpoint; ExperimentResult with obtainedResult; inputData/outputData associations; GridObject, GridObjectInstance, GridObjectImplementation with technology) linked to ViroLab domain ontologies – a computation ontology (NucleotideSequenceSubtyping with subtypedNs, region, usedRuleSet, resultSubtype; NucleotideSequenceAlignment with alignedNs, alignmentRegion, resultMutation; NewDrugRanking / DrugRanking with resultRanking, testedMutation) and the DRS data ontology (VirusNucleotideSequence with meaning, name; VirusNucleotideMutation)]
ExpInfo model: set of ontologies
• General experiment information
 Purpose, execution stages, input/output data sets
• Provenance information
 Who, where, why, data dependencies
• Performance information
 Duration of computation stages, scheduling, queueing,
performance metrics (possible)
• Resource information
 Physical resources (hosts, containers) used in the
computation
• Connection with domain ontologies
 Data sets with Data ontology
 Execution stages with Application ontology
Aggregation of data to information
• From monitoring events to ExpInfo records
• Standardized process described by aggregation rules and
derivation rules
• Aggregation rules specify how to instantiate individuals
 Ontology classes associated with aggregation rules through object
properties
• Derivation rules specify how to compute attributes, including
object properties = associations between individuals
 Attributes are associated with derivation rules via annotations
• The Semantic Aggregator collects workflow execution events and produces ExpInfo records according to the aggregation and derivation rules
Aggregation rules
ExperimentAggregation:
eventTypes = started.workflow, finished.workflow
instantiatedClass = http://www.virolab.org/onto/expprotos/Experiment
ecidCoherency = 1
ComputationAggregation:
eventTypes = invoking.wfTask, invoked.wfTask
instantiatedClass = http://www.virolab.org/onto/expprotos/Computation
acidCoherency = 2
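A rough sketch of how such a rule could drive instantiation is shown below; the rule fields mirror the slide, while the pairing logic and the event fields are assumptions:

# Sketch: applying an aggregation rule to a stream of workflow events.
# Rule fields mirror the slide; the pairing logic and event fields are assumed.
RULE = {
    "name": "ComputationAggregation",
    "eventTypes": ("invoking.wfTask", "invoked.wfTask"),
    "instantiatedClass": "http://www.virolab.org/onto/expprotos/Computation",
}

def aggregate(events, rule):
    """Pair the opening/closing event types of a rule into one individual each."""
    open_type, close_type = rule["eventTypes"]
    pending, individuals = {}, []
    for ev in events:
        key = ev["taskId"]
        if ev["type"] == open_type:
            pending[key] = ev
        elif ev["type"] == close_type and key in pending:
            start = pending.pop(key)
            individuals.append({
                "class": rule["instantiatedClass"],
                "taskId": key,
                "start": start["time"],
                "end": ev["time"],
            })
    return individuals

events = [
    {"type": "invoking.wfTask", "taskId": "t1", "time": 10.0},
    {"type": "invoked.wfTask",  "taskId": "t1", "time": 12.5},
]
print(aggregate(events, RULE))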
Derivation rules
The simplest case – an XML element mapped directly to a
functional property:
<Derivation ID="OwnerLoginDeriv">
<element>MonitoringData/experimentStarted/ownerLogin</element>
</ext-ns:Derivation>
More complex case: which XML elements are needed and how to
compute an attribute:
<Derivation ID="DurationDeriv">
<delegate>cyfronet.gs.aggregator.delegates.ExpPlugin</delegate>
<delegateParam>software.execution.started.application/time</delegateParam>
<delegateParam>software.execution.finished.application/time</delegateParam>
</Derivation>
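Evaluating the simple element-to-property case could look like the sketch below; the XML layout and the example value are illustrative only, and the delegate mechanism of the complex case is omitted:

# Sketch: evaluating the simple 'element mapped to property' derivation rule.
# The XML layout and the example value are illustrative, not the real schema.
import xml.etree.ElementTree as ET

MONITORING_DATA = """
<MonitoringData>
  <experimentStarted>
    <ownerLogin>jdoe</ownerLogin>
  </experimentStarted>
</MonitoringData>
"""

def derive_element(xml_text: str, element_path: str):
    """Return the text of the element addressed by a slash-separated path."""
    root = ET.fromstring(xml_text)
    # The rule path starts at the document root, so strip the root tag itself.
    relative = element_path.split("/", 1)[1]
    node = root.find(relative)
    return node.text if node is not None else None

print(derive_element(MONITORING_DATA, "MonitoringData/experimentStarted/ownerLogin"))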
Applications
• Coordinated Traffic
Management
 Executed within K-Wf Grid
infrastructure for workflows
 Workflows with legacy
backends
 Instrumentation & tracing
• Drug Resistance application
 Executed within the ViroLab virtual laboratory for infectious diseases (virolab.cyfronet.pl)
 Recording executions,
provenance querying, visual
ontology-based querying
based on ExpInfo model
Conclusion
• Several monitoring challenges specific to Grid
scientific workflows
• Standardized taxonomy for workflow execution
events
• DHT infrastructure to improve performance of
resource discovery and enable on-line monitoring
• Information model for recording workflow
executions
Future Work
• Enhancement of event & information models
 Work-in-progress, requires extensive review of existing
systems to enhance event taxonomy, event data structures
and information model
• Model enhancement & validation
 Performance of large-scale deployment
 Classification of workflows w.r.t. generated workloads
 (Preliminary study: S. Ostermann, R. Prodan, and T. Fahringer. A Trace-Based
Investigation of the Characteristics of Grid Workflows)
• Information model for workflow status
 Similar to resource status in information systems