Web Server Request Scheduling
Mingwei Gong
Department of Computer Science
University of Calgary
November 16, 2004
Outline
Introduction and Background
Quantifying the Properties of SRPT Policy
Evaluating the Sensitivity to the Arrival
Process and Job Size Distribution
Evaluating Hybrid SRPT
Introduction
User-perceived Web response time is
composed of several components:
Transmission delay, propagation delay in
network
Delays caused by TCP protocol effects (e.g.,
handshaking, slow start, packet loss, retransmissions)
Queueing delays at the Web server itself,
which may be servicing 100’s or 1000’s of
concurrent requests
Queueing delays at busy routers
Our focus in this work: Web request
scheduling
Scheduling Policies
FCFS: First Come First Serve
PS: Processor Sharing
SRPT: Shortest Remaining Processing Time
LAS: Least Attained Service
FSP: Fair Sojourn Protocol
FCFS
FCFS: First Come First Serve
typical policy for single shared resource
(“unfair”)
e.g., drive-thru restaurant
e.g., routers
(Figure: jobs arriving to a single queue, served one at a time in FCFS order)
PS
PS: Processor Sharing
time-sharing a resource amongst M jobs
each job gets 1/M of the resources (equal, “fair”)
e.g., CPU; VM; multi-tasking; Apache Web server
(Figure: M jobs sharing the PS server, each receiving 1/M of the service rate)
SRPT
SRPT: Shortest Remaining Processing Time
pre-emptive version of Shortest Job First (SJF)
give resources to job that will complete quickest
e.g., express lanes in a grocery store (almost)
(Figure: jobs arriving to an SRPT server; the job with the shortest remaining processing time is served first)
SRPT (cont’d)
SRPT is well known to be optimal in terms of
mean response time among all work-conserving
scheduling policies.
However, it is rarely deployed in practice,
mainly due to two concerns:
Unfairness: large jobs may be penalized
Job sizes are not known in advance
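To make the policy concrete, here is a minimal, illustrative event-driven sketch of SRPT on a single server of unit service rate; it is not the simulator used in this work, and the function name and job representation are assumptions.

```python
import heapq

def srpt_schedule(jobs):
    """Illustrative single-server SRPT simulation (unit service rate).

    jobs: list of (arrival_time, size) tuples.
    Returns {job_index: response_time}.
    """
    events = sorted((a, s, i) for i, (a, s) in enumerate(jobs))
    heap = []                      # (remaining_size, job_index)
    response = {}
    t, k = 0.0, 0                  # current time, next arrival to admit
    while k < len(events) or heap:
        if not heap:               # server idle: jump to the next arrival
            t = max(t, events[k][0])
        while k < len(events) and events[k][0] <= t:
            a, s, i = events[k]
            heapq.heappush(heap, (s, i))
            k += 1
        rem, i = heapq.heappop(heap)                 # shortest remaining time
        next_arr = events[k][0] if k < len(events) else float("inf")
        run = min(rem, next_arr - t)                 # serve until done or preempted
        t += run
        if run < rem:
            heapq.heappush(heap, (rem - run, i))     # preempted by a new arrival
        else:
            response[i] = t - jobs[i][0]
    return response
```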
FSP
FSP: Fair Sojourn Protocol
order the jobs according to the PS policy
give full resources to the job with the earliest
PS completion time
Eric J. Friedman and Shane G. Henderson, "Fairness and Efficiency in Web Server Protocols", Proceedings of ACM SIGMETRICS 2003, pp. 229-237.
LAS
LAS: Least Attained Service scheduling
At any time, LAS gives service to a job that has
received the least amount of service.
A size based policy that favors short jobs without
knowing their sizes in advance
Rai, I. A., Urvoy-Keller, G., and Biersack, E. W., "Analysis of LAS Scheduling for Job Size Distributions with High Variance", Proceedings of ACM SIGMETRICS 2003, San Diego, June 2003.
Rai, I. A., Urvoy-Keller, G., Vernon, M., and Biersack, E. W., "Performance Models for LAS-based Scheduling Disciplines in a Packet Switched Network", Proceedings of ACM SIGMETRICS 2004.
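As a small illustration of the LAS selection rule (not taken from the cited papers), a hypothetical helper for choosing which jobs to serve might look like this; the "attained" field name is an assumption.

```python
def las_pick(jobs):
    """LAS selection rule: serve the job(s) with the least attained service.

    jobs: list of dicts, each with an 'attained' field (service received so far).
    Ties share the server equally, which is why a list is returned.
    """
    if not jobs:
        return []
    least = min(job["attained"] for job in jobs)
    return [job for job in jobs if job["attained"] == least]
```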
Quantifying the Properties of
SRPT Policy
Related Work
Theoretical work:
SRPT is provably optimal in terms of mean
response time (“classical” results)
Practical work:
CMU: prototype implementation in Apache Web
server. The results are consistent with
theoretical work.
Related Work (Cont’d)
Harchol-Balter et al. show theoretical results:
For the largest jobs, the slowdown asymptotically
converges to the same value for any preemptive
work-conserving scheduling policies (i.e., for these
jobs, SRPT, or even LRPT, is no worse than PS)
For sufficiently large jobs, the slowdown under
SRPT is only marginally worse than under PS, by at
most a factor of 1 + ε, for small ε > 0.
[M. Harchol-Balter, K. Sigman, and A. Wierman, "Asymptotic Convergence of Scheduling Policies with Respect to Slowdown", Proceedings of IFIP Performance 2002, Rome, Italy, September 2002]
A Pictorial View
(Figure: slowdown (y-axis, from 1 to ∞) vs. job size (x-axis, from 0 to ∞). PS slowdown is flat at 1/(1−ρ); SRPT slowdown lies below it for most job sizes, rises above it in a "crossover region" (the "mystery hump") for large jobs, and exhibits "asymptotic convergence" to the PS value for the very largest jobs.)
Research Questions
Do these properties hold in practice for
empirical Web server workloads? (e.g.,
general arrival processes, service time
distributions)
What does “sufficiently large” mean?
Is the crossover effect observable?
If so, for what range of job sizes?
Is PS (the “gold standard”) really “fair”?
Overview of Research Methodology
Trace-driven simulation of simple Web server
Empirical Web server workload trace
(WorldCup’98) for main experiments
Probe-based sampling methodology
Estimate job response time distributions for
different job sizes, load levels, and scheduling policies
Performance Metrics
Number of jobs in the system
Slowdown:
The slowdown of a job is its observed response
time divided by the ideal response time if it were
the only job in the system
Ranges between 1 and ∞
Lower is better
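Written out, with T_response(j) the observed response time of job j and T_ideal(j) its response time when alone in the system:

```latex
\mathrm{Slowdown}(j) = \frac{T_{\mathrm{response}}(j)}{T_{\mathrm{ideal}}(j)} \;\ge\; 1
```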
Empirical Web Server Workload
1998 WorldCup: Internet Traffic Archive:
Item                             Value
Trace Duration                   861 sec
Total Requests                   1,000,000
Unique Documents                 5,549
Total Transferred Bytes          3.3 GB
Smallest Transfer Size (bytes)   4
Largest Transfer Size (bytes)    2,891,887
Median Transfer Size (bytes)     889
Mean Transfer Size (bytes)       3,498
Standard Deviation (bytes)       18,815
Probe-Based Sampling Algorithm
(Figure: a probe job is injected into the simulated system, its slowdown is recorded as one sample, and the process is repeated N times to estimate the slowdown distribution.)
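A minimal sketch of this probe-based sampling methodology, assuming a hypothetical simulate_policy helper that returns the probe's response time under the policy being studied; the names and parameters are illustrative, not the actual simulator.

```python
import random

def probe_sampling(trace, simulate_policy, probe_size, n_samples,
                   service_rate=1.0, seed=0):
    """Estimate a probe job's slowdown distribution via repeated sampling.

    trace: list of (arrival_time, size) background requests, sorted by time.
    simulate_policy: hypothetical helper(jobs, probe_index) -> probe response
                     time under the scheduling policy being studied (PS, SRPT, ...).
    """
    rng = random.Random(seed)
    start, end = trace[0][0], trace[-1][0]
    ideal = probe_size / service_rate            # response time alone in the system
    samples = []
    for _ in range(n_samples):
        t_probe = rng.uniform(start, end)        # random injection instant
        jobs = sorted(trace + [(t_probe, probe_size)])
        probe_index = jobs.index((t_probe, probe_size))
        response = simulate_policy(jobs, probe_index)
        samples.append(response / ideal)         # slowdown = observed / ideal
    return samples
```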
Example Results for 3 KB Probe Job
(Figure: slowdown distributions at loads of 50%, 80%, and 95%)
Example Results for 100 KB Probe Job
(Figure: slowdown distributions at loads of 50%, 80%, and 95%)
Example Results for 10 MB Probe Job
(Figure: slowdown distributions at loads of 50%, 80%, and 95%)
Statistical Summary of Results
Two Aspects of Unfairness
Endogenous unfairness (SRPT):
Caused by an intrinsic property of a job, such as
its size. This aspect of unfairness is invariant.
Exogenous unfairness (PS):
Caused by external conditions, such as the number
of other jobs in the system, their sizes, and their
arrival times.
PS is “fair”
Sort of!
Observations for PS
Exogenous unfairness dominant
Observations for SRPT
Endogenous unfairness dominant
Asymptotic Convergence?
Yes!
(Figure: linear-scale and log-scale plots illustrating the crossover effect at load = 95%, with job-size axis ticks at 3M, 3.5M, and 4M.)
Crossover Effect?
Yes!
Summary and Conclusions
Trace-driven simulation of Web server scheduling
strategies, using a probe-based sampling
methodology (probe jobs) to estimate response
time (slowdown) distributions
Confirms asymptotic convergence of the slowdown
metric for the largest jobs
Confirms the existence of the “cross-over effect”
for some job sizes under SRPT
Provides new insights into SRPT and PS
Two types of unfairness: endogenous vs. exogenous
PS is not really a “gold standard” for fairness!
Evaluating the Sensitivity to
Arrival Process and Job Size
Distribution
Research Questions
What is the impact of different arrival
processes and job size distributions?
How is the crossover region affected?
Does it depend on the arrival process and the
service time distribution? If so, how?
Effect of Request Arrival Process
Using fixed size transfers
3 KB in this experiment
Changing the Hurst parameter
from 0.50 to 0.90
Marginal Distribution of Number of Jobs in the
System (load ρ = 0.80)
(Figure panels: Hurst parameter 0.5, 0.7, 0.9)
Marginal Distribution of Number of Jobs in the
System (load ρ = 0.95)
(Figure panels: Hurst parameter 0.5, 0.7, 0.9)
Mean Performance under Load 0.80
Mean Performance under Load 0.95
Sensitivity to Arrival Process
A bursty arrival process (e.g., self-similar
traffic, with Hurst parameter H > 0.5) makes
things worse for both PS and SRPT policies
A bursty arrival process has greater impact
on the performance of PS than on SRPT
Effect of Heavy-tailed Job Size
Distribution
Using Deterministic Arrival Process
Adjusting the heavy-tail (Pareto) parameter
from 1.0 to 2.0
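One common way to generate such heavy-tailed job sizes is inverse-transform sampling from a Pareto distribution; this sketch assumes a plain (unbounded) Pareto with an illustrative minimum size, which may differ from the exact distribution used in these experiments.

```python
import random

def pareto_size(alpha, x_min=1.0, rng=random):
    """Draw one job size from a Pareto(alpha, x_min) distribution.

    Inverse-transform sampling: if U ~ Uniform(0, 1], then
    x_min * U**(-1/alpha) is Pareto-distributed with shape alpha;
    smaller alpha (closer to 1.0) means a heavier tail.
    """
    u = 1.0 - rng.random()               # in (0, 1], avoids division by zero
    return x_min * u ** (-1.0 / alpha)

# Example: 10,000 job sizes for a heavy-tailed workload with alpha = 1.2
sizes = [pareto_size(1.2) for _ in range(10000)]
```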
Sensitivity to Job Size Distribution
SRPT loves heavy-tailed distributions:
the heavier the tail the better!
For all Pareto parameter values and all system
loads considered, SRPT provides better
performance than PS with respect to mean
slowdown and standard deviation of slowdown
The Crossover Effect Revisited
The crossover region tends to get smaller as
the burstiness of the arrival process
increases
PS performs much worse under bursty arrival
process
SRPT can still manage a relatively good
performance
Evaluating Hybrid SRPT
Research Questions
Efficiency and fairness: how can we achieve both?
Can we do better? If so, how?
PS, FSP and SRPT
K-SRPT
A multi-threaded version of SRPT that allows up to K
jobs (those with the K smallest remaining processing
times) to be in service concurrently (like PS), with the
same fixed aggregate service rate. Additional jobs
(if any) wait in the queue. Like SRPT, it is preemptive.
share = min(J, K)
If J > K, the K jobs with the smallest remaining times each receive 1/share
Else, each of the J jobs receives 1/share
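A minimal sketch of this service-share rule, assuming jobs are represented as (job_id, remaining_size) pairs; the function and field names are illustrative.

```python
def ksrpt_rates(jobs, K, capacity=1.0):
    """Service rates under K-SRPT.

    The min(J, K) jobs with the smallest remaining processing times share
    the fixed aggregate capacity equally; all other jobs wait in the queue.
    jobs: list of (job_id, remaining_size) pairs currently in the system.
    Returns {job_id: service_rate}.
    """
    J = len(jobs)
    share = min(J, K)
    if share == 0:
        return {}
    in_service = sorted(jobs, key=lambda j: j[1])[:share]  # K smallest RPTs
    rate = capacity / share
    return {job_id: rate for job_id, _ in in_service}
```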
Simulation Results for K-SRPT
Slowdown Profile Plot for K-SRPT
Jobs in System for K-SRPT
T-SRPT
Whether the system is "busy" depends on a
threshold T for the number of jobs (J) in the system:
If J > T, then use SRPT
Else, use PS.
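A minimal sketch of the switching rule, under the same illustrative (job_id, remaining_size) representation used above.

```python
def tsrpt_rates(jobs, T, capacity=1.0):
    """Service rates under T-SRPT: SRPT when the system is 'busy'
    (more than T jobs present), PS otherwise.

    jobs: list of (job_id, remaining_size) pairs currently in the system.
    Returns {job_id: service_rate}.
    """
    J = len(jobs)
    if J == 0:
        return {}
    if J > T:                                    # busy: pure SRPT
        shortest = min(jobs, key=lambda j: j[1])
        return {shortest[0]: capacity}
    return {job_id: capacity / J for job_id, _ in jobs}   # otherwise: PS
```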
Simulation Results for T-SRPT
Slowdown Profile Plot for T-SRPT
Jobs in System for T-SRPT
DT-SRPT
Double-Threshold SRPT uses two thresholds:
A high threshold T_high at which the policy
switches from PS to SRPT.
A low threshold T_low at which it switches back
from SRPT to PS.
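A minimal sketch of the mode-switching (hysteresis) logic; the exact comparisons at the thresholds are an assumption.

```python
def dtsrpt_mode(J, mode, t_low, t_high):
    """Next scheduling mode under DT-SRPT (hysteresis between PS and SRPT).

    J: current number of jobs in the system; mode: current mode ('PS' or 'SRPT').
    Switch PS -> SRPT when J exceeds t_high; switch back SRPT -> PS only
    when J drops below t_low, so the mode changes far less often than
    under a single-threshold policy.
    """
    if mode == "PS" and J > t_high:
        return "SRPT"
    if mode == "SRPT" and J < t_low:
        return "PS"
    return mode
```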
Simulation Results for DT-SRPT
Slowdown Profile Plot for DT-SRPT
Jobs in System for DT-SRPT
Summary of Simulation Results for
Hybrid SRPT Scheduling Policies
Scheduling Policy   Mean Slowdown   SRPT State   State Changes
K-SRPT-2            1.338           N/A          N/A
K-SRPT-10           4.774           N/A          N/A
T-SRPT-2            1.389           74.5%        310,236
T-SRPT-10           3.295           38.5%        480,084
DT-SRPT-2-10        2.505           60.6%        6,554
DT-SRPT-10-30       9.572           22.8%        2,542
Summary and Conclusions
Proposes two novel Web server scheduling
policies, each of which is a parameterizable
variant of SRPT
Hybrid SRPT provides performance similar to FSP,
with a simpler implementation.
The DT-SRPT policy looks promising.
For Details...
Mingwei Gong and Carey Williamson, "Quantifying the Properties of SRPT Scheduling", Proceedings of MASCOTS 2003, pp. 126-135.
Mingwei Gong, "Quantifying Unfairness in Web Server Scheduling", M.Sc. Thesis, University of Calgary, July 2003.
Mingwei Gong and Carey Williamson, "Simulation Evaluation of Hybrid SRPT Scheduling Policy", Proceedings of MASCOTS 2004, pp. 355-363.
Thank you for your
attention!!