Application Performance in the QLinux Multimedia Operating System
Download
Report
Transcript Application Performance in the QLinux Multimedia Operating System
Application Performance
in the QLinux
Multimedia Operating
System
Jun Wang
Overview
H-SFQ CPU/Packet Scheduler
Cello Disk Scheduler
Lazy Receiver Processing (LRP)
Related Works
Open Discussions
H-SFQ CPU/Packet Scheduler
Requirements for New CPU
Scheduler
Example 1 Consider a MPEG decoder and some
conventional processes running together. The CPU
scheduler is the standard time sharing scheduler.
The MPEG decoder has a soft real-time demand
which is 25 frames per seconds. When the number
of conventional processes increases, what is the
problem?
Complaint:
The video
refreshes too
slow!
Requirements for New CPU
Scheduler
Example 2 Consider some conventional processes
and some hard real-time processes such as control
system of aircraft running together in a real-time OS.
The CPU scheduler is Earliest Deadline First (EDF).
When the number of hard real-time processes
increases, what is the problem?
Complaint:
I can’t input
anything into my
editor!
Requirements for New CPU
Scheduler
Problem Analysis
1. A conventional CPU scheduler can not adapt
multiple classes of applications simultaneously.
2. A conventional CPU scheduler can not isolate
different classes of applications so that increasing
the load of one class of applications does not
influence other classes of applications.
H-SFQ CPU Scheduler
—— A Sample Hierarchy
H-SFQ CPU Scheduler
—— Features
Multiple level scheduling structure
♠class-independent scheduler for non-leaf node & classspecific scheduler for leaf node
Support for multiple service classes
♠solving the simultaneous occurrence problem
Predictable resource allocation
♠solving the isolation problem
Proper accounting of resource usage
♠accounting of CPU usage by virtual time
SFQ CPU Scheduler
—— Algorithm Description(1)
Example: Consider two threads A and B with
weights 1 and 2,respectively,which become
runnable at time t=0. Thread A will run 80ms
to finish, and Thread B will run 110ms to
finish. But both of them will block at some
time. Let the scheduling quantum for each
thread be 10ms.
SFQ CPU Scheduler
—— Algorithm Description(2)
SFQ CPU Scheduler
—— Properties
SFQ achieves fair allocation of CPU regardless
of variation in available processing bandwidth
SFQ does not require the length of the quantum
to be known a priori
SFQ provides bounds on maximum delay
incurred and minimum throughput achieved by
the threads
H-SFQ CPU Scheduler
—— implementation
Weight
Start-tag
Finish-tag
H-SFQ CPU Scheduler
—— Evaluation(1)
H-SFQ CPU Scheduler
—— Evaluation(2)
Open Discussions
— Section One
How to convert different
user requirements into
weights in H-SFQ?
HLS: another hierarchical
framework for soft Real-time
schedulers
♣Multiple kinds of
schedulers in non-leaf
nodes
♣Support both relative
allocations and absolute
allocations
Cello Disk Scheduler
Requirements for New Disk
Scheduler
Conventional Disk
schedulers experience the
same simultaneous
occurrence problems and
isolation problem as the
conventional CPU
schedulers
Cello Disk Scheduler
—— Architectural Principles
Cello Disk Scheduler
—— Features
Multiple level scheduling structure
Class-independent scheduler combines weight
and disk service time into its decision of
scheduling.
Class-specific scheduler steal the slack time of
soft real-time requests for interactive requests as
much as possible
Cello Disk Scheduler
—— Implementation Ideas(1)
Example: There are 3 classes disk requests with equal weights in the system,
one is soft real-time class with 20ms deadline for every request and 6ms service
time, one is interactive class with 3 ms service time, the last is throughputintensive class. The following figure shows the contents of pending queues and
scheduled queue at the beginning of a certain time interval which is 45ms.
soft real-time
r3 r2 r1
interactive
i3 i2 i1
throughput-intensive
t1
Scheduled Queue
Cello Disk Scheduler
—— Implementation Ideas(2)
soft real-time
r3 r2
interactive
i3 i2 i1
throughput-intensive
t1
soft real-time
r3
interactive
i3 i2 i1
throughput-intensive
t1
Switching from one class to another
class when one of three conditions
is met:
Scheduled Queue
ᆞthe pending queue for a class is
r1
empty
ᆞthe sum of requests outweigh the
class’s proportion
ᆞthe sum of requests outweigh the
current time interval,
Scheduled Queue
r2 r1
Different class-specific
Cello Disk Scheduler
schedulers have different insert
Slack time of a
strategies. Ideas(3)
——
Implementation
request i: equal to
ᆞReal-time: EDF+SCAN
Max(0,li-ei).
Li: the latest time by
softthe
real-time
which
disk must
r3 this
begin servicing
interactive
request
i2 i1time
Ei: thei3earliest
throughput-intensive
at which
the disk
t1
may begin servicing
a request
soft real-time
r3
interactive
throughput-intensive
t1
ᆞInteractive: Steal slack time
ᆞThroughput-intensive: insert
towards the tail
Scheduled Queue
i3 r2 i2 i1 r1
Unused disk bandwidth for
interactive class can be
assigned to other classes
according to their weight.
Scheduled Queue
t1 r3 i3 r2 i2 i1 r1
Cello Disk Scheduler
—— Evaluation(1)
Cello Disk Scheduler
—— Evaluation(2)
Open Discussions
— Section Two
Overload of recomputing slack time after
insertion of each request.
Possible reductions?
Lazy Receiver Processing (LRP)
Conventional Linux Network
Architecture for Receiving Packets
Problem Analysis
Eager receiver processing
♠starving applications problem
Lack of effective load shedding
♠livestock problem
Lack of traffic separation
♠applications isolation problem
Inappropriate resource accounting
♠Account CPU time unfair
LRP Architecture
Features of LRP
Lazy receiver processing
Dropping packets as soon as possible
Different NI channels for different sockets
accurate resource accounting
Comparision of Two Architectures(1)
Demultiplexing Income Packets
Conventional Linux
LRP
Demultiplexing Position
Sockets Queue
NI Channel
The initial Position of
Dropping Packets
Sockets Queue
NI Channel
What is the advantage of early demultiplexing packets in
LRP?
Traffic separation and effective load shedding
Comparision of Two
Architectures(2)
Packets Processing
Conventional Linux
LRP
Time
when the packet is passed
from NI
when the application
process call receive()
TCP/IP Protocol
Processing Postion
Hard/Soft interrupt handler
driven as kernel level
invoked in receive()
as user level
Who charge for Protocol
Processing Time
The process when a
interrupt comes
The process who will
consume the packet
What is the advantage of delayed packet processing in
LRP?
Support stability and fairness under overload and correct CPU time accounting
Open Discussions
—Section Three
Accounting CPU time
♣Virtual time
How to make up the stolen time for a process?
♣Lazy receiver processing
Related Works
QLinux
♠ Resource Reservation for different classes of applications by multiple
level schedulers
♠ Reallocating dynamically unused resources
ControlWare & QLinux
A Problem with QLinux : how to map QoS directly to
the appropriate weight? QLinux can only tell that you can
get some performance guarantee with a certain weight. But
if you want to get some certain QoS, what is the
appropriate weight assigned to you?
A possible solution: using some feedback mechanism like
ControlWare to adjust the weights until finding the appropriate
weight for a certain QoS.
QuO & QLinux
QuO solves how to transmit the
QoS information between a client
and an object based on the QoS
parameters of underlying OS and
network. So QLinux can serve
very well for QuO.
References
[1] P. Goyal and X. Guo and H.M. Vin, A Hierarchical CPU Scheduler for
Multimedia Operating Systems, Proceedings of 2nd Symposium on Operating System Design
and Implementation (OSDI'96), Seattle, WA, pages 107-122, October 1996.
[2] P. Goyal and H. M. Vin and H. Cheng, Start-time Fair Queuing: A Scheduling
Algorithm for Integrated Services Packet Switching Networks, Proceedings of ACM
SIGCOMM'96, pages 157-168, August 1996.
[3] P Shenoy and H M. Vin, Cello: A Disk Scheduling Framework for Next
Generation Operating Systems, Proceedings of ACM SIGMETRICS Conference, Madison,
WI, pages 44-55, June 1998.
[4] P. Druschel and G. Banga, Lazy Receiver Processing (LRP): A Network
Subsystem Architecture for Server Systems, Proceedings of the 2nd Symposium on
Operating System Design and Implementation (OSDI'96), Seattle, WA, Pages 261-275,
October 1996
[5] Vijay Sundaram, Abhishek Chandra, Pawan Goyal and Prashant Shenoy,
Application Performance in the QLinux Multimedia Operating System , Proceedings of
the Eighth ACM Conference on Multimedia, Los Angeles, CA, pages 127-136, November 2000.
November 2000.
[6] John Regehr and John A. Stankovic, HLS:A Framework for Composing Soft RealTime Schedulers, Proceedings of the 22nd IEEE Real-Time Systems Symposium (RTSS
2001), pages 3-14, London, UK, December 3-6 2001.
[7] John Regehr and John A. Stankovic, Augmented CPU Reservations: Towards
Predictable Execution on General-Purpose Operating Systems,
Proceedings of the Seventh IEEE Real-Time Technology and Applications Symposium (RTAS
2001), pages 141-148, Taipei, Taiwan, May 30-June 1 2001
[8] John Regehr and John A. Stankovic, Operating System Support for Multimedia: