Recording Actor Provenance in Scientific Workflows
Ian Wootten, Shrija Rajbhandari, Omer Rana
[email protected]
Cardiff University, UK
What?
• Provenance is concerned with process
• This may or may not be documented
• Data Provenance – The process which leads to a particular piece of data
• Actor Provenance – The process which leads to a particular actor state
  • How an actor (client or service) arrived at a particular state during an interaction (for stateless actors)
What? Actor Provenance
[Diagram: an Enactment Engine and two Services, with points labelled A1, A2, B1 and B2]
Actor State Assertions: Asserting the state of an actor at a particular time during an interaction.
Interaction Assertions: Asserting the contents of a message by an actor sending or receiving it.
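The two kinds of assertion can be pictured as two small record types. The following is a minimal sketch only: the class names, field names and the `assertions_for` helper are assumptions made for illustration and are not taken from the slides or from any provenance system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Union


@dataclass(frozen=True)
class InteractionAssertion:
    """An actor's claim about the contents of a message it sent or received."""
    interaction_id: str  # hypothetical key identifying one message exchange
    asserter: str        # the actor making the assertion
    role: str            # "sender" or "receiver"
    message: str         # message contents as seen by the asserter


@dataclass(frozen=True)
class ActorStateAssertion:
    """An actor's claim about its own state at one time during an interaction."""
    interaction_id: str
    asserter: str
    recorded_at: datetime
    state: Dict[str, Any] = field(default_factory=dict)


Assertion = Union[InteractionAssertion, ActorStateAssertion]


def assertions_for(interaction_id: str, assertions: List[Assertion]) -> List[Assertion]:
    """Return every assertion, of either kind, made about one interaction."""
    return [a for a in assertions if a.interaction_id == interaction_id]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    recorded = [
        InteractionAssertion("i-1", "service-A", "receiver", "<invoke/>"),
        ActorStateAssertion("i-1", "service-A", now, {"memory_kb": 512}),
    ]
    for assertion in assertions_for("i-1", recorded):
        print(assertion)
```

Sharing an interaction identifier between both record types is one possible way of letting state assertions be read in the context of the messages exchanged.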
Metrics for Actor State Assertion
• Static
  • No variation in value over actor lifetime
  • Per Node - Node identity, Operating system
  • Per Actor - Actor identity, Name, Owner, Version
• Dynamic
  • Variation in value over actor lifetime
  • Per Node - Memory usage, Network traffic
  • Per Actor - Execution Time, Availability
• Instrumented
  • Actor is ‘Instrumented’ at Key Points in its Execution
  • Description of internal data flow
  • E.g. German Aerospace Center (DLR)
    • Completion states for action events and file transfers
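The three metric classes differ mainly in where the value comes from. A minimal sketch, assuming only the Python standard library: the function and class names, the `transfer_and_sum` actor and the event names are hypothetical, and node-level dynamic metrics such as memory usage are left out because they would normally come from a monitoring tool.

```python
import platform
import time
from typing import Any, Callable, Dict, List, Tuple


def static_node_metrics() -> Dict[str, str]:
    """Static, per-node values: these do not vary over the actor's lifetime."""
    return {"node": platform.node(), "os": platform.system()}


def timed(actor: Callable[..., Any], *args: Any) -> Tuple[Any, float]:
    """Dynamic, per-actor value: execution time of one call, in milliseconds."""
    start = time.perf_counter()
    result = actor(*args)
    return result, (time.perf_counter() - start) * 1000.0


class InstrumentationLog:
    """Instrumented values: the actor itself reports at key points."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Any]] = []

    def mark(self, point: str, value: Any = None) -> None:
        self.events.append((point, value))


def transfer_and_sum(values: List[int], log: InstrumentationLog) -> int:
    """Hypothetical actor, instrumented at the start and end of its work."""
    log.mark("transfer.started", len(values))
    total = sum(values)
    log.mark("transfer.completed", total)
    return total


if __name__ == "__main__":
    log = InstrumentationLog()
    result, elapsed_ms = timed(transfer_and_sum, [1, 2, 3], log)
    print(static_node_metrics(), result, elapsed_ms, log.events)
```

Static and dynamic values can be gathered from outside the actor, whereas the instrumented values require the actor's own code to call `mark`.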
How? Actor Provenance
[Diagram: an Enactment Engine and two Services; labels: Instrumented Output, Monitor Output, M1, M2, B1, B2]
Instrumented Actor: Service information obtained from instrumented points within an actor.
Monitoring Sources: Service information derived from the hosting platform via monitoring sources (e.g. Ganglia).
Why? Standalone and Combined Value
• Standalone State Assertion Value
  • Actor Selection
  • Performance
    • Evaluation of Past / Prediction of Future
  • Resource Allocation
  • Actor administrator allocates resources according to performance metrics
• Combined Value - Putting Assertions into Context
  • Interaction – Through Actor State Assertions
  • Determining the likely cause of error / results
  • Understanding what an actor is doing
  • Actor – Through Interaction Assertions
  • Understanding performance pattern observations
  • Understanding instrumented metric observations
How? Actor Provenance Registry
Attempt to provide a mechanism to specify and record actor state assertions for any application
Generic Mechanism Problems
No Knowledge of Potential Resources
Monitoring sources, containers
No Direct Knowledge of Implementation
Instrumented Data Capture
How? Actor Provenance Registry
Resource and Rule Registration
Resource – Monitoring Tool
Rule - User defined instructions
Indirectly from Resources
Coordinator polls resources for information
Times of interest – Service Invocation, Request
Directly from actor
Collection of Instrumented data
Representation?
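The registration and polling steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the registry's actual interface: the class and method names are hypothetical, a resource is modelled as a plain callable standing in for a monitoring tool, and a rule is a Python function standing in for a user-defined instruction.

```python
from typing import Any, Callable, Dict, List, Tuple

Resource = Callable[[], Dict[str, Any]]  # stand-in for a monitoring tool
Rule = Callable[[Dict[str, Any]], Any]   # stand-in for a user-defined instruction


class ActorProvenanceRegistry:
    """Hypothetical registry holding resources, rules and recorded assertions."""

    def __init__(self) -> None:
        self._resources: Dict[str, Resource] = {}
        self._rules: Dict[str, Tuple[str, Rule]] = {}
        self.assertions: List[Dict[str, Any]] = []

    def register_resource(self, name: str, resource: Resource) -> None:
        self._resources[name] = resource

    def register_rule(self, name: str, resource_name: str, rule: Rule) -> None:
        if resource_name not in self._resources:
            raise KeyError(f"unknown resource: {resource_name}")
        self._rules[name] = (resource_name, rule)

    def poll(self, event: str) -> None:
        """Coordinator step: poll every resource at a time of interest,
        then apply each rule to the snapshot of its resource."""
        snapshots = {name: res() for name, res in self._resources.items()}
        for rule_name, (resource_name, rule) in self._rules.items():
            self.assertions.append(
                {"event": event, "rule": rule_name,
                 "value": rule(snapshots[resource_name])}
            )

    def record_instrumented(self, point: str, value: Any) -> None:
        """Direct path: the actor submits instrumented data itself."""
        self.assertions.append({"event": "instrumented", "rule": point, "value": value})


if __name__ == "__main__":
    registry = ActorProvenanceRegistry()
    registry.register_resource("node", lambda: {"mem_free_kb": 2048})
    registry.register_rule("low_memory", "node", lambda d: d["mem_free_kb"] < 4096)
    registry.poll("service_invocation")
    registry.record_instrumented("file_transfer", "completed")
    print(registry.assertions)
```

Keeping rules separate from resources reflects the two generic-mechanism problems: the registry needs no built-in knowledge of which monitoring sources exist, and instrumented data arrives through a separate direct call.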
How? Actor Provenance Registry
• Integration with PReP [Groth et al.]
[Diagram: Client and Service (Invoke / Result), each with a Local Store, a Registry and Monitoring Sources, and a Provenance Store; arrows labelled Record Provenance and Record Actor Provenance]
Data Mining Prototype
Record assertions using the registry during invocation of a data modelling service
Service takes incoming data sets and generates a model based upon them
Uses Quantitative Structure-Activity Relationship (QSAR) to attempt to correlate biological activity with a chemical compound
Larger data set = longer run time
Performance Evaluation
[Plot: Invocation Time (ms), axis 0 to 40000, against Size of Data Set (KB), axis 0 to 400; series: No rules, 1 rule, 5 rules]
Conclusions / Future Work
Actor Provenance data is important
Without it, we don’t get the full picture
Prototype shows that it can be done
Room for improvement
Interface to Monitoring System
Caching of results
No inclusion of ‘instrumented’ actor capture
Requires service provider adoption to work
Prototype Configuration
Single machine holding client, service and registry
Rules executed on invocation of service
XQuery
Invocations performed 100 times on datasets between 30 KB and 340 KB in size
Coordinator records rule results to a local file store
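A measurement of this kind could be set up roughly as below. This is a sketch under assumptions, not the prototype's code: the rules here are plain Python callables standing in for the XQuery rules, `model_service` is a hypothetical placeholder for the data modelling service, and the function name `mean_invocation_ms` is invented for illustration.

```python
import time
from statistics import mean
from typing import Any, Callable, List, Sequence


def mean_invocation_ms(
    service: Callable[[bytes], Any],
    payload: bytes,
    rules: Sequence[Callable[[bytes], Any]],
    repeats: int = 100,
) -> float:
    """Mean time in milliseconds of one invocation, including the rules
    that are executed when the service is invoked."""
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        for rule in rules:
            rule(payload)
        service(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def model_service(data: bytes) -> int:
    """Hypothetical stand-in for the data modelling service."""
    return sum(data)


if __name__ == "__main__":
    payload = bytes(30 * 1024)  # a 30 KB data set
    for rule_count in (0, 1, 5):
        rules = [len] * rule_count  # trivial placeholder rules
        print(rule_count, "rules:", mean_invocation_ms(model_service, payload, rules, repeats=10))
```

Repeating the call for several payload sizes and rule counts would give the kind of series compared in the performance evaluation.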