Transcript Slide 1

A Probabilistic QoS Model and
Computation Framework for Web
Services-Based Workflows
San-Yih Hwang, H. Wang, J. Srivastava
National Sun Yat-sen U., Taiwan
Univ. of Minnesota, USA
1
Overview





Introduction
QoS metrics and modeling
WS-Workflow QoS Computation
Performance evaluation
Conclusions
2
Web Service based workflows



Web Service: a modular and self-described application that
uses Web technologies to interact with other services.
Workflow: a process by which a series of activities are
executed in a specific sequence.
WS-Workflow (composite web service): a workflow of
activities, each of which is wrapped as a web service.


e.g., a “Travel Planner” may aggregate multiple Web services
for flight booking, travel insurance, accommodation booking,
car rental, etc.
Quality of Service (QoS): non-functional measures of a
service, which often decides the satisfaction of a user
toward the service .
3
Web service QoS
Category
Instance-level QoS Metrics
1. Performance
Response time: time elapsed
from the submission of a request
to the time the response is
received
Throughput: the number of
instances completed per time
unit
2. Resources
Cost: the amount of money paid
for executing an instance
Memory/CPU/ bandwidth
3. Dependability
Reliability: The probability that
the service can be successfully
completed.
Availability: The probability that
a service can be successfully
invoked.
TTR: time to repair
4. Fidelity
Reputation rating:, usually
measured as a scalar value (e.g.,
1, 2, 3, 4, 5, the higher the better)
5. Transactional
properties
Service Classes
Metrics
System-level QoS Metrics
ACID properties
Commit protocol
(e.g., 2PC)
6. Security
Confidentiality
Nonrepudiation
Encryption
4
The problem


The goal is to compute the QoS measures of a WSworkflow from those of its constituent web services.
Four instance_level QoS metrics are considered:





Response time: the time elapsed from the submission of a request to
the time the response is received .
Cost: amount of money paid.
Reliability: the probability that the service can be successfully
delivered.
Fidelity: reputation rating.
How do we represent a QoS measure of a web service?


A single value, used by most of the previous researches
A probability distribution, adopted by us
5
The problem (Cont.)

W (parallel)
Why not using a single
value for a QoS measure of
a web service?


Instance-level QoS measures
are inherently probabilistic.
Choosing a single value for a
QoS measure (e.g., average
case) may yield incorrect
result.
A1
A2



Response time of
A1: N(10, 10)
Response time of
A2: N(10, 10)
Average response
time of W is NOT 10.
6
The problem (Cont.)

W (conditional)
Why not using a normal
distribution for modeling each
QoS measure of a Web
service?


Some QoS measure may not
follow normal distribution.
Even if the QoS measures of all
activities follow normal
distribution, the QoS of a WSworkflow may not follow normal
distribution, e.g., parallel,
conditional selection
0.5 A1
0.5


A2
Response time of
A1: N(10, 5)
Response time of
A2: N(20, 5)
7
Probabilistic Modeling of WS QoS


A QoS measure of a web service is modeled as a
discrete random variable.
Probability Mass Function (PMF)

Let the sample space of X be Dom(X), then
 f X ( x)  1
f ( x)  P( X  x)
X

xDom ( X )
e.g. Suppose the probabilities of a web service w being
completed in one, two, and three days, are 0.2, 0.6, and
0.2, respectively. The PMF of its response time is:



fresponse_time(w)(1)=0.2
fresponse_time(w) (2)=0.6
fresponse_time(w) (3)=0.2
8
WS-workflow QoS framework
Web services
WS invocation
log
WS-workflow
QoS Framework
WS selection
WS-workflow
enactment
invokes
user
WS-workflow
QoS Model
WS-workflow QoS
Computation
WS SLA
spec
WS-workflow QoS
Monitoring
WS-workflow QoS
Objective Spec
owner
9
Computing QoS Values of WS Compositions

A structured workflow can be constructed
recursively by the following 5 basic
constructs.





Sequential
Parallel
Conditional
fault-tolerant
Loop
10
Sequential
a1
Cost:
Time:
Reliability:
…
a2
an
n
C ( w)   C (ai )
i 1
n
T ( w)   T (ai )
i 1
n
R( w)   R(ai )
i 1
Fidelity
n
F ( w)   wi F (ai )
i 1
11
Parallel
a1, a2, …, an are executed concurrently.
a1
a2
…
P
Pe
an
n
C ( w)   C (ai )
i 1
n
R( w)   R(ai )
i 1
T ( w)  Max T ( a i )
i
n
F ( w)   wi F (ai )
i 1
12
Conditional
Only one of the activities is executed.
a1
p1
,
a2
pn
…
p2
C
Ce
an
C ( w)  CS (C (ai ), pi )
T ( w)  CS (T (ai ), pi )
R( w)  CS ( R(ai ), pi )
F ( w)  CS ( F (ai ), pi )
1i  n
1i  n
1i  n
1i  n
13
fault-tolerant
All the activities are executed concurrently, but only
one of them need to be finished.
a1
a2
Fe
…
F
an
C (w) 
 C (a )
1i  n
i
T ( w)  Min T ( ai )
i
n
R( w)  1   (1  R(ai ))
i 1
F ( w)  CS ( F (ai ), p(ai ))
1i  n
p(ai )
is the probability that ai is finished first.
14
Loop
a
a
LC
0.3
C
Ce
0.7


a
a
We model the number of iterations as a discrete
random variable, e.g. 1 time with probability 0.3,
and 2 times with probability 0.7.
It can be converted to a equivalent conditional
structure.
15
Operations of discrete random variables

Basic operations:






Sum
Multiplication
MAX
MIN
Conditional selection
Let X, Y be independent random variables:


Dom(X)={x1, x2, …, xm} , PMF: fX(x)
Dom(Y)={y1, y2, …, yn} , PMF: fY(y)
16
Z=X+Y
Dom( Z )  {z | z  x  y, x  Dom( X ), y  Dom(Y )}
PMF: f Z ( z) 
f
X ( x) f Y ( y )
xDom ( X ), yDom (Y ), z  x  y
Dom(X)={1, 2}, fX(1)=0.3,
fX(2)=0.7
Dom(Y)={10, 20}, fY(10)=0.4,
fY(20)=0.6
Dom(Z)={11, 12, 21, 22}
fz(11)=0.3*0.4=0.12, fz(12)=0.7*0.4=0.28
fz(21)=0.3*0.6=0.18, fz(12)=0.7*0.6=0.42
Note: Multiplication is similar, except that z is the
product of some x and y.
17
Z=Max(X, Y)
Dom(Z) = Dom(X)Dom(Y)


f X ( z )   fY ( y )
if z  Dom (X) andz  Dom (Y)

y z

yDom (Y )

f X ( z)  
fY ( z )   f X ( x)
if z  Dom (Y) andz  Dom (X)
x z

xDom ( X )

 f X ( z )   fY ( y )  fY ( z )   f X ( x) if z  Dom (X) andz  Dom (Y)
y z
x z

yDom (Y )
xDom ( X )
Note: MIN operation is very similar to MAX.
18
Conditional selection
Z  CS ( X i , pi ): exclusive selection of a random
1 i  n
variable according to the associated probabilities.
pi : the probability that Xi is selected.
Dom ( Z )   Dom ( X i )
1i  k
f Z (Z  z) 
p  f
i
zDom ( X i )
Xi
( z),
z  Dom(Z )
19
An example WS-workflow
HDD Proc. 1
Ce
C
HDD Proc. 2
S
HDD procurement - conditional
P
Pe
Assembly
Test
CPU Proc.
Parts procurement
Parallel
CD-ROM Proc
Email Notification
Y
OK?
Shipping
N
Fix&Test
Adjustment-loop
Fe
F
E
Phone Notification
Customer notification – fault tolerant
20
Tree representation of a workflow



A structured workflow can be represented by a tree.
A composite activity can be substituted by an equivalent atomic activity.
Use bottom-up method to compute the Workflow QoS.
Workflow
Sequential
Parts Proc.
Parallel
HDD Proc.
Conditional
HDD
Proc. 1
Assembly
CPU Proc.
Test
CD-ROM
Proc.
Adjustment
Loop
Fix&Test
Notification
Fault tolerant
Shipping
Email
Phone
HDD
Proc. 2
21
Sample space reduction



When combining the random variables, the sample
space size of the resultant variable may increase
dramatically.
To keep the sample space at a reasonable size after
some operation, we have to group the elements in
the sample space. Each group is represented by
one value.
Find an optimal grouping scheme which
minimizes the error.
22
The optimal solution

Dynamic programming:

Let e(i, j, k) be the optimal aggregate error of partitioning
xi, xi+1, …, xj into k subsequences.

Recursive function:
e(i, j, k ) 
min (e(i, a, b)  e(a  1, j, k  b))
i  a  j ,1b  k
if j-i+1>k and k>1
e(i, j, k) = 0 if j-i+1=k
e(i, j, 1) = error(i, j).

Time complexity: O(s3m2), where s is the number of
elements in the original domain and m is the number of
elements in the domain after reduction.
23
Heuristic method

Algorithm:





Find the pair of adjacent elements in the
domain which has least error when merged.
Replace the two elements by a new element.
Repeat until the reasonable size is reached.
Priority queue can be used to find the pair
of samples with least error in O(logs) time.
Time complexity: O(slogs)
24
Performance evaluations
Sample space size reduction



Performance metric: cumulative distribution function (CDF). The
CDF of Z, denoted as FZ(z), is defined as Pr(Zz)
The following figure shows the difference of CDFs of each method
(DP and greedy methods ) and the theoretical value when the size
of PMF is reduced from 400 to 30.
Mean error:

DP: 0.001494, Greedy: 0.002136
0.007
dynamic programming
0.006
difference on F(z)

greedy
0.005
0.004
0.003
0.002
0.001
0
100
120
140
160
180
200
Z
220
240
260
25
Response time accuracy

Computation of response time of the experimental WSworkflow “PC order fulfillment”.
activity
Mean
Standard
deviation
minimum
maximum
Probability
HDD Proc. 1
2
1
1
4
0.8
HDD Proc. 2
3
1.5
1
5
0.2
CPU Proc.
2
0.8
1
4
N/A
CD-ROM Proc.
2
0.7
1
4
N/A
Assembly
2
0.5
1
3
N/A
Test
1
0.2
0.5
2
N/A
Fix&Test
0.5
0.2
0.1
1
N/A
Shipping
1
0.2
0.5
2
N/A
Email notification
0.6
0.2
0.2
1
N/A
Phone notification
0.5
0.2
0.1
1
N/A
26
Response time accuracy


The following figure show the difference between the
CDF of the greedy method and simulation result.
Maximum error: 0.008
The greedy method is a thousand times faster than the
simulation.
0.009
cumulative probability error

0.008
0.007
0.006
0.005
0.004
0.003
0.002
0.001
0
5
6
7
8
Time
9
10
11
27
Cost accuracy
0.006
Standard
dev
iati
on
minimum
maximum
0.005
HDD Proc. 1
100
6
80
120
HDD Proc. 2
110
10
80
130
CPU Proc.
150
5
130
160
CD-ROM Proc.
80
10
60
100
Assembly
20
3
10
30
Test
15
5
10
20
Fix&Test
10
3
5
20
Shipping
25
4
15
40
Email notification
1
0.5
3
5
Phone notification
2
1
1
3
CDF error
Mean
Activity
0.004
0.003
0.002
0.001
0
360
380
400
Cost
420
440
28
Fidelity accuracy
2
3
4
5
1
0.01
HDD Proc. 1
0.006644
0.064938
0.259001
0.410416
0.259001
HDD Proc. 2
0.122595
0.233026
0.288758
0.233026
0.122595
CPU Proc.
0.027104
0.111310
0.259135
0.343315
0.259135
0.009
0.008
0.153398
0.221476
0.250254
0.221476
0.153398
Assembly
0
0.001302
0.157605
0.683489
0.157605
CDF Error
CD-ROM
Proc.
0.007
0.006
0.005
0.004
0.003
0.002
Test
0
0.001302
0.157605
0.683489
0.157605
Fix&Test
0.006200
0.196167
0.595267
0.196167
0.006200
Shipping
0.022031
0.139256
0.349728
0.349728
0.139256
Email
notification
0
0
0.369433
0.629267
0.001300
Phone
notification
0
0
0.047833
0.904333
0.047833
0.001
0
2.8
3
3.2
3.4
3.6 3.8
Fidelity
4
4.2
4.4
29
Related work



Estimating the QoS measures of a WSworkflow by assuming a single QoS value for
each web service. [Menace02] [Cardoso02]
[Zeng03].
Estimating the QoS measures of a WSworkflow using simulation
[Gillmann02][Cardoso02]
Project management (CPM/PERT) for
estimating response time
30
Summary


We propose a probability-based WSworkflow QoS model and its
computation framework.
The computation framework is
efficient and accurate.
31
Ongoing work

Online QoS monitoring

Online QoS estimation for a WS-workflow
instance


Pre-computation is needed for efficiency.
Defining and ensuring QoS objective


A QoS objective is a 5-tuple. E.g., (“PC order
fulfillment”, “response time”, “no larger”, “7 days”,
90%)
Define a checkpoiont for each atomic web service.
32
Ongoing work

Web service selection


Each activity has a set of candidate web services
that provide the same function but different QoS
measures.
Choose a web service for each activity such that
some constraints are satisfied and the goal is
optimized. Both constraints and the goal are
specified in a probabilistic manner

E.g., The probability that the entire WS-workflow can be
completed in 10 days is at least 90%.
33