Efficient Autoscaling in the Cloud using Predictive Models for

Transcript Efficient Autoscaling in the Cloud using Predictive Models for

1
EFFICIENT AUTOSCALING IN
THE CLOUD USING
PREDICTIVE MODELS FOR
WORKLOAD FORECASTING
Roy, N., A. Dubey, and A. Gokhale
4th IEEE International Conference on Cloud Computing (Cloud 2011)
Agenda
2







Introduction
Related Work
Challenge
Solution
Evaluation
Conclusion
Comment
Introduction (1/4)
3

Typically customers maintain SLAs with service
providers for the QoS properties.
 Failure
to comply with satisfying these QoS metrics
leads to a major loss of revenue in the form of
decreased user base

Catering to the SLA while still keeping costs
low is challenging for such enterprise systems
Introduction (2/4) – naïve method
4
Introduction (3/4)
5

A problem with such a resource allocation scheme
 Is
the chance of thrashing where due to frequent
variation of workload, machines can be added and
released on every sample

A desirable solution would require an ability to
predict the incoming workload on the system and
allocate resources a priori
Introduction (4/4)
6

autoscaling the resources in a cloud environment is
not an easy and straightforward task.
 (i)
overheads related to state transition when number
of resources are changed
 (ii) ability to accurately predict future workload
 (iii) compute the right number of resources required
for the expected increase or decrease in workload.
Related Work (1/2)
7

(1) Heuristics-based virtual machine allocation and
migration

Urgaonkar et. al. [2], 2008



VM, dynamic provisioning, Queueing model
Only a single VM can be run in a host
Wood et. al. [3], 2007


VM, dynamic migration
define a unique metric based on the consumption data of the three
resources to make the migration decision


CPU, network and memory
Cunha et. al. [4], 2007


Queueing model
Pricing Model that gives rewards for throughput to be within SLA
limits and penalty for throughput going above
Related Work (2/2)
8

(2) Autonomic management of virtual computing
environment using control-theoretic approaches:

Wang et. al. [6]

A load balancing controller


Moreno et. al. [7]


An architecture for elastic management of cluster-based services
Waheed et. al. [8]


VMs are all load-balanced and the response time of the
applications in all the VMs are the same
Reactive algorithm to allocate resources to a cluster farm
Yang et. al [9]

Profiled based approach
Challenge (1/4)
9

Challenge 1: Workload Forecasting
 Correctness
of prediction
 Releasing resources is easy, but..
 Acquiring resources
 Make
a call on the cloud API which starts the acquisition
process
 The machine will be needed to boot up with the specified
image
 The application need to be started
Challenge (2/4)
10

Challenge 2: Identify Resource Requirement for
Incoming Load
 The
required number of resources is a function of
 the
number of customers
 the nature of the application
 the type of calls that each customer makes on the
application
Challenge (3/4)
11

Challenge 3: Resource Allocation while Optimizing
Multiple Cost Factors
 To
optimize resource usage and/or minimize idle
resources
 define
a time interval and change resources as many times
as possible as workload changes.
 In
the limit, this interval could be made infinitesimally
small and resources are changed continuously in
accordance with the change in load
Challenge (4/4)
12
 Obviously,
 the

such as scheme is not possible
overhead in allocating a resource
scaling up or down resources also involves cost and needs to be
optimized
Solution (1/9)
13

Control theory offers a promising methodology to
address the challenges
Solution (2/9)
14



1. For every future time step, it computes the cost of
selecting each possible resource allocation
2. To compute the cost of a particular allocation, it
uses Algorithm 1 to compute the estimated response
time for that particular machine configuration
3. Once the response time is calculated, it is used to
calculate the cost of the allocation which is a
combination of
how far the estimated response time is from the SLA
bounds (SLA violation)
 cost of leasing additional machines
 and also a cost of re-configuration

Solution (3/9)
15

A. Workload Prediction
 Authors
used a second order autoregressive moving
average method (ARMA) filter for the workload
 ( t  1)   ( t )   ( t  1)  (1  (    ))  ( t  2 )
 The
value for the variables β and γ are given by the
values 0.8 and 0.15
ARMA
16

Autoregressive Model (AR)
a
model depends on the level of the lagged
observations
 For example, if we observe a high realisation of GDP
we would expect that the GDP in the next few periods
are high as well
p
x
t
c
 x
i 1
i
t i

t
ARMA
17

Moving Average Model (MA)
 model
that the observations of a random variable at
time t are not only affected by the shock at time t, but
also the shocks of prior periods
 Ex. if we observe a negative shock to the economy, say,
9/11, then we would expect that the negative effect
affects the economy also for the near future.
q
x
t
  
 
i 1
i
t i

t
ARMA
18

Autoregressive Moving Average Model
 combine
both models we get a ARMA(p,q) model
x
 ARMA
t
 c t
p
 x
i 1
i
q
t i

i 1
i

t i
models are widely used for prediction of
economic and industrial time series
Solution (4/9)
19
Solution (5/9)
20

B. Performance Model
 The
next challenge we resolve is identifying resource
requirements for the predicted workload
 The workload used in this work is the number of users
currently in the system.
 It also depends
 In
upon what each user does.
prior work [20] we have used Customer Behavior
Modeling Graphs (CBMG) (?) to model the overall
behavior of customers
Solution (6/9)
21

A CBMG is built from a log of previous customer
behavior and computes the probability of a typical
user to visit each page
 Using
this information, we can calculate the number
of visits to a single page from the total number of
customers in the system.
 The number of visits to each page helps in calculating
the average load on each page.
Solution (7/9)
22
Solution (8/9)
23


C. Optimizing Resource Provisioning
The intuition is to identify the right number of time
intervals
 Our
solution works on look-ahead optimization
 iteratively solves
from t0
an optimization problem, Costopt, starting
Solution (9/9)
24

The next challenge is the choice of the look-ahead
period.
A
small look-ahead period will neglect trends
 A very large period will increase computational
complexity

The actual algorithm is not described here
 because
the implementation requires recursive data
structures
 is difficult to describe in the limited space available.
Evaluation (1/7)
25

Cost Function
26
Evaluation (2/7)
Just in time Resource Allocation

the weights on each component of the cost
function is the same
27
Evaluation (3/7) -- Resource Usage
under Different Cost Priorities

1) SLA violation against Resource Cost
 The
ratio of SLA penalty to machine cost is varied from
4 : 1 to 1 : 13
28
Evaluation (4/7) -- Resource Usage
under Different Cost Priorities
29
Evaluation (5/7) -- Resource Usage
under Different Cost Priorities
30
Evaluation (6/7) -- Resource Usage
under Different Cost Priorities

2) Including the Cost of Reconfiguration
31
Evaluation (7/7) -- Resource Usage
under Different Cost Priorities
Conclusion
32

this paper describes a look-ahead resource
allocation algorithm based on model predictive
control
 predicts
future workload
 adjusts resources allocated to users ahead-of-time
Comments
33




The detail of the model in the paper is too simple
I cannot understand why the authors did these
evaluations
The paper use control theory and it seems to have
a good prediction of workload
Something in 3 challenges
 the
overhead of allocating resources
 the prediction interval
 the costs of SLA violation, reconfiguring machine….

Efficient Autoscaling in the Cloud using Predictive Models for

Transcript Efficient Autoscaling in the Cloud using Predictive Models for

Directory