Temporal Verification in Grid/Scientific Workflows
Xiao Liu
CITR - Centre for Information Technology Research
Swinburne University of Technology, Australia
[email protected]
Content
Grid/Scientific Workflows
Temporal QOS Framework
Setting Temporal Constraints in Scientific Workflows
SwinDeW-G Grid Workflow Management System
Additional Information
Research areas in Workflow Technology Program
Data Mining Techniques in Workflow area
Optimization Algorithms in Workflow area
Grid/Scientific Workflow
Grid Workflow Management System
A type of workflow management system aiming at supporting
large-scale sophisticated scientific and business processes
in complex e-science and e-business applications, by
facilitating the resource sharing and computing power of
underlying grid infrastructure.
Scientific Workflow Management System
A type of workflow management system aiming at supporting
complex scientific processes in many e-science applications
such as climate modelling, astronomy data processing. It
may or may not be built upon grid infrastructure. Can be
cluster or P2P.
How Are Grids Used
Utility computing
High-performance computing
Collaborative design
Financial modeling
E-Business
High-energy physics
Drug discovery
Data center automation
Life sciences
E-Science
Natural language processing & Data Mining
Collaborative data-sharing
From www.gridbus.org
An Example Grid Application
From www.gridbus.org
Grid Architecture
From www.gridbus.org
Grid Workflow Engine
From www.gridbus.org
Where Are We
Grid/Scientific Workflows
Temporal QOS Framework
Setting Temporal Constraints in Scientific Workflows
SwinDeW-G Grid Workflow Management System
Additional Information
Research areas in Workflow Technology Program
Data Mining Techniques in Workflow area
Optimization Algorithms in Workflow area
Temporal Verification
In reality, complex scientific and business processes are normally
time constrained. Hence, time constraints are often set when
they are modelled as grid workflow specifications.
Temporal constraints mainly include: upper bound, lower bound
and fixed-time
Upper bound constraint
Lower bound constraint
Fixed-time constraint
Temporal verification is used to identify any temporal violations
so that we can handle them in time.
Temporal QOS Framework
Constraint Setting
Setting temporal constraints according to temporal QOS Specifications
Checkpoint Selection
Selecting necessary and sufficient checkpoints to conduct temporal
verification
Temporal Verification
Verifying the consistency states at selected checkpoints
Temporal Consistency: SC (Strong Consistency), WC (Weak
Consistency), WI (Weak Inconsistency), SI (Strong Inconsistency)
Temporal Adjustment
Handling temporal violations
Where Are We
Grid/Scientific Workflows
Temporal QOS Framework
Setting Temporal Constraints in Scientific Workflows
SwinDeW-G Grid Workflow Management System
Additional Information
Research areas in Workflow Technology Program
Data Mining Techniques in Workflow area
Optimization Algorithms in Workflow area
Setting Temporal Constraints
Problem Statement
In scientific workflow systems, temporal consistency is critical to ensure
the timely completion of workflow instances. To monitor and guarantee
the correctness of temporal consistency, temporal constraints are often
set and then verified. However, most current work adopts user specified
temporal constraints without considering system performance, and hence
may result in frequent temporal violations that deteriorate the overall
workflow execution effectiveness.
Granularity of temporal constraints
Coarse-grained constraints refer to those assigned to the entire
workflow or workflow segments.
Fine-grained constraints refer to those assigned to individual activities.
A Motivating Example
This workflow segment contains 12 activities, which are modeled
by an SPN (Stochastic Petri Net) with additional graphic notations.
For simplicity, we denote these activities as X1 to X12. The
workflow process structure is composed of four SPN based
building blocks, i.e. a choice block for data collection from two
radars at different locations (activities X1 to X4), a compound
block of parallelism and iteration for data updating and
preprocessing (activities X6 to X10), and two sequence blocks for
data transferring (activities X5, and X11 to X12).
Two Basic Requirements
Temporal constraints should be well balanced between user
requirements and system performance.
It is common for clients to suggest coarse-grained temporal
constraints based on their own interests, with limited knowledge
of the actual performance of workflow systems. Therefore, user
specified constraints are prone to cause frequent temporal
violations.
Temporal constraints should facilitate both overall coarse-grained
control and local fine-grained control.
Both coarse-grained temporal constraints and fine-grained temporal
constraints should be supported. However, note that coarse-grained
temporal constraints and fine-grained temporal constraints are not in a
simple relationship of linear culmination and decomposition. Meanwhile, it
is impractical to set fine-grained temporal constraints manually for a large
number of activities in scientific workflows.
A Probabilistic Strategy
Probability based temporal consistency
A novel probability based temporal consistency, which utilises the weighted
joint distribution of workflow activity durations, is proposed to facilitate
setting temporal constraints.
Two assumptions on activity durations
Assumption 1: The distribution of activity durations can be obtained from
workflow system logs. Without loss of generality, we assume all
activity durations follow the normal distribution model, which can be
denoted as N(µ, σ²).
Assumption 2: The activity durations are independent of each other.
Exception handling of assumptions: Activities that violate the assumptions
can be treated with normal transformation and correlation analysis, or
alternatively ignored when calculating the joint distribution and added up
afterwards.
Weighted Joint Normal Distribution
Joint normal distribution
If there are n independent variables Xi ~ N(µi, σi²) and n real numbers θi,
where n is a finite natural number, then the joint distribution of these
variables can be obtained with the following formula:
Z = θ1X1 + θ2X2 + ... + θnXn ~ N(Σ θiµi, Σ θi²σi²), summing over i = 1..n
Weighted joint normal distribution
For a scientific workflow process SW which consists of n activities, we
denote the activity duration distribution of activity ai as N(µi, σi²) with
(1≤i≤n). Then the weighted joint distribution is defined as:
N(µsw, σsw²) = N(Σ wiµi, Σ wi²σi²), summing over i = 1..n
where wi stands for the weight of activity ai, which denotes the choice
probability or iteration times associated with the workflow path that ai
belongs to.
Probabilistic Specification of Activity Durations
Maximum Duration, Mean Duration, Minimum Duration
The 3σ rule states that any sample drawn from a
normal distribution model has a probability of
99.73% of falling into the range [µ-3σ, µ+3σ], which is
a symmetric interval of 3 standard deviations around
the mean. Accordingly, in our strategy, we have
the following specification of activity durations:
Maximum Duration D(ai) = µ+3σ
Mean Duration M(ai) = µ
Minimum Duration d(ai) = µ-3σ
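The specification above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names and the sample activity (mean 100 s, standard deviation 10 s) are hypothetical, and only the Python standard library is used.

```python
from statistics import NormalDist


def duration_spec(mu: float, sigma: float) -> dict:
    """Return the maximum, mean and minimum duration under the 3-sigma rule."""
    return {"max": mu + 3 * sigma, "mean": mu, "min": mu - 3 * sigma}


def coverage(k: float = 3.0) -> float:
    """Probability that a normal sample falls within k standard deviations of the mean."""
    std = NormalDist()  # standard normal N(0, 1)
    return std.cdf(k) - std.cdf(-k)


# Hypothetical activity with mean 100 s and standard deviation 10 s.
spec = duration_spec(mu=100.0, sigma=10.0)
print(spec)
print(f"coverage within 3 sigma: {coverage():.4%}")
```

The coverage does not depend on µ or σ, which is why the same [µ-3σ, µ+3σ] interval can be applied to every activity.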
Probability based Temporal Consistency
Setting Strategy
Step1: Weighted Joint Normal Distribution
Here, to illustrate and facilitate the calculation of the weighted
joint distribution, we analyse basic SPN based building
blocks, i.e. sequence, iteration, parallelism and choice. These
four building blocks consist of basic control flow patterns and
are widely used in workflow modelling and structure analysis.
Most workflow process models can be easily built by their
compositions, and similarly for the weighted joint distribution
of most workflow processes.
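As a rough illustration of this step, the sketch below computes a weighted joint distribution for a small made-up segment. It assumes the weighted joint distribution takes the form N(Σ wiµi, Σ wi²σi²), i.e. the standard result for a linear combination of independent normal variables with the weights used as coefficients. The activity values, the weights and the function name are hypothetical, and the parallelism block is deliberately left out because the slides do not state how its branches are combined.

```python
import math
from typing import List, Tuple

# (weight w_i, mean mu_i, standard deviation sigma_i) for one activity
Activity = Tuple[float, float, float]


def weighted_joint(activities: List[Activity]) -> Tuple[float, float]:
    """Return (mu, sigma) of N(sum w_i*mu_i, sum w_i^2*sigma_i^2)."""
    mu = sum(w * m for w, m, _ in activities)
    var = sum((w * s) ** 2 for w, _, s in activities)
    return mu, math.sqrt(var)


# Hypothetical segment: one sequence activity, a two-way choice (0.7 / 0.3),
# and one activity that is iterated twice.
segment = [
    (1.0, 100.0, 10.0),  # sequence: weight 1
    (0.7, 200.0, 20.0),  # choice branch taken with probability 0.7
    (0.3, 300.0, 30.0),  # choice branch taken with probability 0.3
    (2.0, 50.0, 5.0),    # iteration: executed 2 times
]
mu_sw, sigma_sw = weighted_joint(segment)
print(f"segment ~ N({mu_sw:.1f}, {sigma_sw:.2f}^2)")
```

Composing larger processes then amounts to concatenating the weighted activity lists of their building blocks.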
Step2: Setting Coarse-grained Constraints
User: I want the process to be completed in 48 hours.
Service provider: Let me check the probability.
The negotiation process
Step2: Setting Coarse-grained Constraints
Service provider: Sir, it's 70%, do you agree?
User: That's not good, how about 52 hours?
Adjust the constraint
Step2: Setting Coarse-grained Constraints
Service provider: Then, it increases to 85%.
User: Err… how long will it take if I want to have 90%?
Adjust the probability
Step2: Setting Coarse-grained Constraints
Service provider: It will take us 54 hours.
User: Ok, that's the deal! Let's do it!
Negotiation result
Step2: Setting Coarse-grained Constraints
Service provider: Ok! But, sir, I need to remind you that
this is only a guarantee in a statistical sense. If we cannot
make it, please blame the stupid guy who invented the
strategy!
Service provider: Sorry, statistically, no predictions can
be 100% sure!
Step3: Setting Fine-grained Constraints
Setting fine-grained constraints for individual activities
Assume the probability gained from the last step is θ%, which
corresponds to a normal percentile of λ. Then the fine-grained
constraint for each individual activity ai is (µi + λσi).
For example, if the coarse-grained temporal constraint is
of 90% consistency, that is a normal percentile of 1.28, then
the fine-grained constraint for activity ai with a distribution of
N(µi, σi²) is (µi + 1.28σi).
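The rule above translates directly into a few lines of code. This is a sketch under stated assumptions rather than the authors' implementation: the function name and the two sample activities are hypothetical, and the percentile λ is obtained from the standard normal inverse CDF in the Python standard library.

```python
from statistics import NormalDist
from typing import List, Tuple


def fine_grained_constraints(
    probability: float, activities: List[Tuple[float, float]]
) -> List[float]:
    """Return mu_i + lambda * sigma_i for each (mu_i, sigma_i) pair."""
    # Normal percentile lambda that corresponds to the agreed probability.
    lam = NormalDist().inv_cdf(probability)
    return [mu + lam * sigma for mu, sigma in activities]


lam_90 = NormalDist().inv_cdf(0.90)  # about 1.28, as quoted on the slide
activities = [(100.0, 10.0), (200.0, 20.0)]  # hypothetical (mean, std dev) pairs
constraints = fine_grained_constraints(0.90, activities)
print(f"lambda = {lam_90:.2f}, constraints = {[round(c, 1) for c in constraints]}")
```

Because every activity uses the same λ, no manual per-activity setting is needed once the coarse-grained probability has been agreed.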
Evaluation--Specification
Setting Results: Coarse-grained Constraint
Negotiation for coarse-grained constraint
WS ~ N(6210, 218²)
Constraint    Probability
6300s         66%
6360s         75%
6390s         79%
6400s         81%
U(WS) = 6400s, λ = 0.87
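The negotiation figures above can be reproduced from the normal CDF. The sketch below is illustrative only: it takes WS ~ N(6210, 218²) from the slide, while the function names are hypothetical. Computed values may differ from the slide in the last digit depending on rounding.

```python
from statistics import NormalDist

# WS ~ N(6210, 218^2), durations in seconds (values taken from the slide).
ws = NormalDist(mu=6210, sigma=218)


def consistency_probability(constraint: float) -> float:
    """Probability that the workflow segment finishes within the constraint."""
    return ws.cdf(constraint)


def constraint_for(probability: float) -> float:
    """Upper bound constraint that is met with the requested probability."""
    return ws.inv_cdf(probability)


for c in (6300, 6360, 6390, 6400):
    print(f"{c}s -> {consistency_probability(c):.1%}")

# Normal percentile of the agreed constraint U(WS) = 6400s.
lam = (6400 - ws.mean) / ws.stdev
print(f"lambda = {lam:.2f}")
```

`constraint_for` covers the other direction of the negotiation, where the user names a probability and the provider answers with a duration.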
Setting Results: Fine-grained Constraint
Where Are We
Grid/Scientific Workflows
Temporal QOS Framework
Setting Temporal Constraints in Scientific Workflows
SwinDeW-G Grid Workflow Management System
Additional Information
Research areas in CITR Workflow Technology Program
Data Mining Techniques in Workflow area
Optimization Algorithms in Workflow area
SwinDeW-G Grid Workflow System
SwinDeW-G stands for Swinburne Decentralised
Workflow for Grid.
SwinDeW-G is a peer-to-peer based scientific grid workflow
system running on the SwinGrid (Swinburne service Grid)
platform. SwinGrid includes the Swinburne CITR (Centre for
Information Technology Research) Node, the Swinburne ESR
(Enterprise Systems Research laboratory) Node, the Swinburne
Astrophysics Supercomputer Node, and the Beihang
CROWN (China R&D environment Over Wide-area
Network) Node in China. These nodes run Linux and GT4
(Globus Toolkit) or the CROWN grid toolkit 2.5; CROWN
is an extension of GT4 with more middleware, and is hence
compatible with GT4.
Where Are We
Grid/Scientific Workflows
Temporal QOS Framework
Setting Temporal Constraints in Scientific Workflows
SwinDeW-G Grid Workflow Management System
Additional Information
Research areas in CITR Workflow Technology Program
Data Mining Techniques in Workflow area
Optimization Algorithms in Workflow area
Research Areas in WT
http://www.swin.edu.au/ict/research/citr/wt/research.php
Peer-to-peer based, service oriented and grid workflows
SwinDeW-A: SwinDeW with agent enhanced negotiation
SwinDeW-B: SwinDeW incorporating BPEL4WS (past)
SwinDeW-G: peer-to-peer based service grid workflow system
SwinDeW-S: SwinDeW incorporating Web services (past)
SwinDeW-V: temporal constraint verification in grid workflows
SwinDeW: peer-to-peer based decentralised workflow system (past)
Service-oriented computing
SwinGrid - a Swinburne Service Grid Platform which connects Swinburne
CITR nodes and Swinburne Supercomputer with external nodes
nationally and internationally, forming a Grid computing environment.
Recent Publications in WT
http://www.ict.swin.edu.au/personal/yyang/Publications.html
X. Liu, J. Chen and Y. Yang, A Probabilistic Strategy for Setting Temporal Constraints in Scientific
Workflows, Proc. 6th International Conference on Business Process Management (BPM2008),
Sept. 2008 Milan, Italy.
K. Ren, X. Liu, J. Chen, N. Xiao, J. Song, W. Zhang, A QSQL-based efficient Planning Algorithm
for fully-automated Service Composition in Dynamic Service Environments, Proc. of IEEE
International Conference on Services Computing (SCC2008), Honolulu, Hawaii, USA, July 2008.
J. Chen and Y. Yang, A Taxonomy of Grid Workflow Verification and Validation. Concurrency and
Computation: Practice and Experience, Wiley, 20(4):347-360, 2008.
J. Chen and Y. Yang, Adaptive Selection of Necessary and Sufficient Checkpoints for Dynamic
Verification of Temporal Constraints in Grid Workflow Systems. ACM Transactions on Autonomous
and Adaptive Systems, 2(2):Article6, June 2007.
Q. He, J. Yan, R. Kowalczyk, H. Jin, Y. Yang, Lifetime Service Level Agreement Management with
Autonomous Agents for Services Provision. Information Sciences, Elsevier, to appear.
K. Liu, J. Chen, Y. Yang and H. Jin, A Throughput Maximisation Strategy for Scheduling
Transaction Intensive Workflows on SwinDeW-G. Concurrency and Computation: Practice and
Experience, Wiley, to appear.
J. Yan, Y. Yang and G. K. Raikundalia. SwinDeW - A Peer-to-peer based Decentralized Workflow
Management System. IEEE Transactions on Systems, Man and Cybernetics, Part A, 36(5):922-935, 2006.
Where Are We
Grid/Scientific Workflows
Temporal QOS Framework
Setting Temporal Constraints in Scientific Workflows
SwinDeW-G Grid Workflow Management System
Additional Information
Research areas in CITR Workflow Technology Program
Data Mining Techniques in Workflow area
Optimization Algorithms in Workflow area
Data Mining Techniques in Workflow area
Process Mining Overview
1) basic performance metrics
2) process model
3) organizational model
4) social network
5) performance characteristics
6) auditing/security (If … then …)
[Figure: example process model with the activities Start, Register order, Prepare shipment, (Re)send bill, Ship goods, Contact customer, Receive payment, Archive order, End]
From www.processmining.org
Process Mining
1. Process Discovery
2. Conformance testing
3. Log based verification
[Figure: example process models, including an order handling process (Register order, Prepare shipment, Ship goods, (Re)send bill, Receive payment, Contact customer, Archive order), a purchasing process (purchase requisition, purchase order, goods receipt, invoice verification, payment), and a computer purchase process (Decide To Buy Computer, Choose Model, Choose Operating System, Set Up And Connect, Work Hard)]
From www.processmining.org
ProM Framework
From www.processmining.org
Other Workflow Mining Topics
Successful Termination Prediction.
To choose, from a given set of potential activities, the one that in past
executions most frequently led to a desired final
configuration.
Identification of Critical Activities.
To discover those activities that can be considered critical in the sense that they
are scheduled by the system in every successful execution.
Failure/Success Characterization.
By analysing past experience, a workflow administrator may be interested in
knowing which discriminant factors characterize the failure or the success of
executions.
Workflow Optimization.
The information collected in the logs of the system can be profitably used to
reason about the “optimality” of workflow executions.
Workflow Performance Related Analysis and Prediction
Time series mining is used in the prediction of activity durations, the setting
of temporal constraints and dynamic temporal verification
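Two of the topics above, successful termination prediction and identification of critical activities, can be sketched as simple frequency counts over an execution log. The log format, the activity names and both function names below are hypothetical, and the published mining techniques are considerably more elaborate than this illustration.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical log format: (activities executed, whether the run succeeded).
Log = List[Tuple[List[str], bool]]


def success_rates(log: Log, candidates: Set[str]) -> Dict[str, float]:
    """For each candidate activity, the fraction of runs containing it that succeeded."""
    seen, succeeded = Counter(), Counter()
    for activities, ok in log:
        for a in candidates.intersection(activities):
            seen[a] += 1
            succeeded[a] += ok  # True counts as 1, False as 0
    return {a: succeeded[a] / seen[a] for a in seen}


def critical_activities(log: Log) -> Set[str]:
    """Activities that appear in every successful run."""
    successful = [set(acts) for acts, ok in log if ok]
    return set.intersection(*successful) if successful else set()


log: Log = [
    (["register", "ship_express", "bill"], True),
    (["register", "ship_standard", "bill"], False),
    (["register", "ship_express", "bill"], True),
    (["register", "ship_standard", "bill"], True),
]
rates = success_rates(log, {"ship_express", "ship_standard"})
best = max(rates, key=rates.get)  # candidate most often followed by success
critical = critical_activities(log)
print(best, sorted(critical))
```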
References on Workflow Mining
G. Greco, A. Guzzo, G. Manco and D. Sacca, Mining and Reasoning on
Workflows, IEEE Trans. on Knowledge and Data Engineering, Vol. 17, No.
4, pp.519-534, APRIL 2005.
W.M.P. van der Aalst, B.F. van Dongen, J. Herbst, L. Maruster, G. Schimm,
and A.J.M.M. Weijters, Workflow Mining: A Survey of Issues and
Approaches. Data and Knowledge Engineering, Vol. 47, No. 2, pp.237-267,
2003.
A.K.A. de Medeiros, W.M.P. van der Aalst, and A.J.M.M. Weijters, Workflow
Mining: Current Status and Future Directions, CoopIS 2003, volume 2888 of
Lecture Notes in Computer Science, pages 389-406. Springer-Verlag,
Berlin, 2003.
W.M.P. van der Aalst, H.T. de Beer, and B.F. van Dongen, Process Mining
and Verification of Properties: An Approach based on Temporal Logic,
CoopIS 2005, volume 3760 of Lecture Notes in Computer Science, pages
130-147. Springer-Verlag, Berlin, 2005.
Where Are We
Grid/Scientific Workflows
Temporal QOS Framework
Setting Temporal Constraints in Scientific Workflows
SwinDeW-G Grid Workflow Management System
Additional Information
Research areas in CITR Workflow Technology Program
Data Mining Techniques in Workflow area
Optimization Algorithms in Workflow area
Grid Resource Management System
[Figure labels: User/Application; Higher-Level Services; Resource Broker; Grid Middleware; Core Grid Infrastructure Services (Information Services, Monitoring Services, Security Services); Grid Resource Manager (x3); Local Resource Management (PBS, LSF, …); Resource (x3)]
From http://www.coregrid.net
Grid Workflow Scheduling
[Figure labels: Grid User; Grid-Scheduler; three local Schedulers, each with a Schedule over time and a Job-Queue; Machine 1, Machine 2, Machine 3]
From http://www.coregrid.net
A taxonomy of Grid workflow scheduling algorithms
GA based Scheduling
Fundamentals for GA based Scheduling
1. Encoding/Decoding
2. Genetic Operators: Crossover, Mutation and Selection
3. Fitness Evaluation Function
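The three fundamentals listed above can be seen together in a toy scheduler. This is a minimal sketch, not a production algorithm: the task durations are hypothetical, the fitness is assumed to be the makespan, and tournament selection, one-point crossover and elitism are illustrative choices that the slides do not prescribe.

```python
import random
from typing import List


def makespan(assignment: List[int], durations: List[float], machines: int) -> float:
    """Decode a chromosome (task index -> machine index) and return its makespan."""
    load = [0.0] * machines
    for task, machine in enumerate(assignment):
        load[machine] += durations[task]
    return max(load)


def ga_schedule(durations: List[float], machines: int, pop_size: int = 30,
                generations: int = 100, mutation_rate: float = 0.1,
                seed: int = 0) -> List[int]:
    rng = random.Random(seed)
    n = len(durations)
    # Encoding: one gene per task, holding the machine it is assigned to.
    population = [[rng.randrange(machines) for _ in range(n)] for _ in range(pop_size)]

    def fitness(ch: List[int]) -> float:
        return makespan(ch, durations, machines)  # lower is better

    def select() -> List[int]:
        # Tournament selection between two random individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    best = min(population, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: keep the best chromosome found so far
        while len(children) < pop_size:
            p1, p2 = select(), select()
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            for i in range(n):  # mutation: reassign a task to a random machine
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(machines)
            children.append(child)
        population = children
        best = min(population, key=fitness)
    return best


durations = [4.0, 7.0, 3.0, 5.0, 6.0, 2.0, 8.0, 1.0]  # hypothetical task durations
schedule = ga_schedule(durations, machines=3)
print(schedule, makespan(schedule, durations, 3))
```

A real workflow scheduler would also have to respect task dependencies and data transfer costs, which this flat task list ignores.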
Others
Simulated Annealing
Ant Colony
Workflow Rescheduling
When QOS constraints are violated, how to handle those
violations by rescheduling the current task list to compensate for,
e.g., time or budget deficits.
Summary
Grid/Scientific Workflows
Temporal Verification and Temporal Adjustment to
Support Temporal QOS Framework
Workflow Mining (more than process mining)
Optimization Algorithms for Workflow Scheduling and
Rescheduling
Useful Links
www.swinflow.org
Our work on temporal verification in scientific/grid workflows
http://is.tm.tue.nl/staff/wvdaalst/
Home page of Prof. Wil van der Aalst, Workflow Research
http://www.buyya.com/
Home page of Dr. Rajkumar Buyya, Grid Research
http://www.cs.ucr.edu/~eamonn/
Home page of Eamonn Keogh, Time Series Mining
http://www.cs.ucr.edu/~eamonn/time_series_data/
UCR Time Series Database
The End
Any questions or comments?