INF5070 – Media Storage and Distribution Systems

Download Report

Transcript INF5070 – Media Storage and Distribution Systems

INF5070 – Media Storage and Distribution Systems:
Server Resources
6/9 - 2004
Overview
 Resources, real-time, …
 “Continuous” media streams
 (CPU) Scheduling
 Memory management for streaming
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Resources and Real–Time
Resources
 Resource:
“A resource is a system entity required by a task for manipulating data”
[Steimetz & Narhstedt 95]
 Characteristics:
 active: provides a service, e.g., CPU, disk or network adapter
 passive: system capabilities required by active resources, e.g., memory
or bandwidth




exclusive: only one process at a time can use it, e.g., CPU
shared: can be used by several concurrent processed, e.g., memory
single: exists only once in the system, e.g., loudspeaker
multiple: several within a system, e.g., CPUs in a multi-processor system
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Real–Time
 Real-time process:
“A process which delivers the results of the processing in a given time-span”
 Real-time system:
“A system in which the correctness of a computation depends not only on
obtaining the result, but also upon providing the result on time”
 Many real-time applications, e.g.:

temperature control in a nuclear/chemical plant



defense system on a navy boat



driven by interrupts from an external device
these interrupts occur irregularly
control of a flight simulator



driven by interrupts from an external device
these interrupts occur irregularly
execution at periodic intervals
scheduled by timer-services which the application requests from the OS
...
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Real–Time
 Deadline:
“A deadline represents the latest acceptable time for the
presentation of the processing result”
 Hard deadlines:
 must never be violated  system failure
 too late results

have no value,

means severe (catastrophic) system failure,
e.g., processing weather forecasts
e.g., processing of an incoming torpedo signal in a navy boat scenario
 Soft deadlines:
 in some cases, the deadline might be missed



not too frequently
not by much time
result still may have some (but decreasing) value,
e.g., a late I-frame in MPEG
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Real–Time and Multimedia
 Multimedia systems
 typically have soft deadlines (may miss a frame)
 are non-critical (user may be annoyed, but …)
 have periodic processing requirements
(e.g., each 33 ms in a 30 fps video)
 require large bandwidths
(e.g., average of 3.5 Mbps for DVD video only)




need predictability (guarantees)
adapt real-time mechanisms to continuous media
exploit resource-specific properties
(like real-time resource allocation schemes, preemption, ...)
priority-based schemes are of special importance
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Admission and Reservation
 To prevent overload, admission may be performed:

schedulability test:





“are there enough resources available for a new stream?”
“can we find a schedule for the new task without disturbing the existing workload?”
a task is allowed if the utilization remains < 1
yes – allow new task, allocate/reserve resources
no – reject
 Resource reservation is analogous to booking
(asking for resources)

pessimistic




optimistic




avoid resource conflicts making worst-case reservations
potentially under-utilized resources
guaranteed QoS
reserve according to average load
high utilization
overload may occur
perfect


must have detailed knowledge about resource requirements of all processes
too expensive to make/takes much time
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Real–Time and Operating Systems
 The operating system manages local resources
(CPU, memory, disk, network card, busses, ...)
 In a real-time, multimedia scenario, support is needed for:


real-time processing
efficient memory management
 This also means support for proper …
 scheduling –
high priorities for time-restrictive multimedia tasks
 timer support –
clock with fine granularity and event scheduling with high accuracy
 kernel preemption –
avoid long periods where low priority processes cannot be interrupted
 memory replacement –
prevent code for real-time programs from being paged out
 fast switching –
both interrupts and context switching should be fast
 ...
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Continuous Media Streams
Streaming Data
 Start playback at t1
 Consumed bytes (offset)
 variable rate
 constant rate
 Must start retrieving
arrive function
send function
read function
data earlier
 Data must arrive before
consumption time
 Data must be sent
before arrival time
 Data must be read from
disk before sending time
INF5070 – media storage and distribution systems
consume function
time
t1
2004 Carsten Griwodz & Pål Halvorsen
Streaming Data
 Need buffers to hold data between the functions,
e.g., client B(t) = A(t) – C(t), i.e.,  t : A(t) ≥ C(t)
arrive function
 Latest start of data arrival
is given by
min[B(t,t0,t1) ;  t B(t,t0,t1) ≥ 0],
i.e., the buffer must at all
times t have more data to
consume
consume function
time
t 0 t1
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Streaming Data
 “Continuous Media” and “Streaming” are ILLUSIONS

retrieve data in blocks from disk

transfer blocks from file
system to application
application

send packets to communication system

split packets into appropriate MTUs

... (intermediate nodes)
... (client)


different optimal sizes

pseudo-parallel processes
(run in time slices)
file system
communication
system
need for scheduling
(to have timing and
appropriate resource allocation)
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
(CPU) Scheduling
Scheduling
 A task is a schedulable entity
(a process/thread executing a job, e.g.,
an packet through the communication
system or a disk request through the file system)
requests
 In a multi-tasking system, several
tasks may wish to use a resource
simultaneously
scheduler
 A scheduler decides which task
that may use the resource,
i.e., determines order
by which requests are serviced,
using a scheduling algorithm
resource
 Each active (CPU, disk, NIC) resources needs a scheduler
(passive resources are also “scheduled”, but in a slightly different way)
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Scheduling
 Scheduling algorithm classification:

dynamic





static





make scheduling decisions at off-line (also called pre-run-time)
generates a dispatching table for run-time dispatcher at compile time
needs complete knowledge of task before compiling
small run-time overhead
preemptive





make scheduling decisions at run-time
flexible to adapt
considers only actual task requests and execution time parameters
large run-time overhead finding a schedule
currently executing task may be interrupted (preempted) by higher priority processes
preempted process continues later at the same state
potential frequent contexts switching
(almost!?) useless for disk and network cards
non-preemptive



running tasks will be allowed to finish its time-slot (higher priority processes must wait)
reasonable for short tasks like sending a packet (used by disk and network cards)
less frequent switches
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Scheduling
 Preemption:






tasks waits for processing
scheduler assigns priorities
task with highest priority will be
scheduled first
preempt current execution if a higher priority
(more urgent) task arrives
requests
scheduler
preemption
real-time and best effort priorities
(real-time processes have higher priority
- if exists, they will run)
to kinds of preemption:

preemption points
o
o

resource
predictable overhead
simplified scheduler accounting
immediate preemption
o
o
needed for hard real-time systems
needs special timers and fast interrupt and
context switch handling
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Scheduling
 Scheduling is difficult and takes time
(both to find a schedule and to switch between threads/processes – not shown):
RT process
delay
request
round-robin
process 1 process 2 process 3 process 4 …
process N RT process
RT process
request
priority,
non-preemtive
delay
process 1 process
RT process
2 process 3 process 4 …
process N
RT process
priority,
preemtive
request
only delay switching and interrupts
process
p 1 RT
p 1process
process
processp221 process
process
process
33 2process
process
process
44 3…
process
process
4 NN…
…process
process N
NOTE: preemption may also be limited to preemption points (fixed
points where the scheduler is allowed to interrupt a running process)
 giving larger delays
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Priorities and Multimedia
 Multimedia streams need predictable access to
resources – high priorities:
1. multimedia traffic with guaranteed QoS
may not exist
2. multimedia traffic with predictive QoS
3. other requests
must not starve
 Within each class one could have a second-level
scheduler


1 and 2: real-time scheduling and fine grained priorities
3: may use traditional approaches as round-robin
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Scheduling in Windows 2000
 Preemptive kernel
 Schedules threads individually
 Time slices given in quantums

3 quantums = 1 clock interval

different values used for different clock frequencies, e.g.,


x86 uniCPU:
10 ms

x86 multiCPU:
15 ms
defaults:

Win2000 server:
36 quantums

Win2000 workstation:
6 quantums (professional)

may manually be increased between threads (1x, 2x, 4x, 6x)

foreground quantum boost (add 0x, 1x, 2x):
active window can get longer time slices (assumed needs fast response)
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Scheduling in Windows 2000
 32 priority levels:
Round Robin (RR) within each level
 Interactive and throughput-oriented:
 “Real time” – 16 system levels



Variable – 15 user levels




fixed priority
may run forever
priority may change:
thread priority = process priority ± 2
uses much  drops
user interactions, I/O completions  increase
Idle/zero-page thread – 1 system level


runs whenever there are no other processes to run
clears memory pages for memory manager
INF5070 – media storage and distribution systems
Real Time (system thread)
31
30
...
17
16
Variable (user thread)
15
14
...
2
1
Idle (system thread)
0
2004 Carsten Griwodz & Pål Halvorsen
Scheduling in Linux



Preemptive kernel
Threads and processes used to be equal,
but Linux uses (in 2.6) thread scheduling




126
127
each priority in RR
timeslices of 10 ms (quantums)
ordinary user processes
uses “nice”-values: 1≤ priority≤40
timeslices of 10 ms (quantums)
SHED_RR
1
2

realtime (FIFO and RR):
goodness = 1000 + priority
timesharing (OTHER):
goodness = (quantum > 0 ? quantum + priority : 0)
Quantums are reset when no ready
process has quantums left:
quantum = (quantum/2) + priority
INF5070 – media storage and distribution systems
nice
...
Threads with highest goodness are selected first:


...
may run forever, no timeslices
may use it’s own scheduling algorithm
SHED_OTHER


2
SHED_RR


1
SHED_FIFO


SHED_FIFO
-20
126
-19
127
...
18
SHED_OTHER
default (20)
19
2004 Carsten Griwodz & Pål Halvorsen
Scheduling in AIX
 Similar to Linux, but has
SHED_FIFO
1
always only used thread
scheduling



2
...
SHED_FIFO
SHED_RR
SHED_OTHER
 BUT, SHED_OTHER may
126
127
SHED_RR
1
change “nice” values


running long (whole timeslices)
 penalty – nice increase
interrupted (e.g., I/O) gives
initial “nice” value back
2
...
126
127
SHED_OTHER
default
INF5070 – media storage and distribution systems
nice
-20
-19
...
18
19
2004 Carsten Griwodz & Pål Halvorsen
Real–Time Scheduling
 Multimedia streams are usually periodic
(fixed frame rates and audio sample frequencies)
 Time constraints for a periodic task:
 s – starting point
(first time the task require processing)
 e – processing time
 d – deadline
 p – period (r – rate (r = 1/p))


d
e
time
s
0≤e≤d
(often d ≤ p: we’ll use d = p – end of period, but Σd ≤ Σp is enough)
the kth processing of the task



p
is ready at time s + (k – 1) p
must be finished at time s + (k – 1) p + d
the scheduling algorithm must account for these properties
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Real–Time Scheduling
 Resource reservation
 QoS can be guaranteed
 relies on knowledge of tasks
 no fairness
 origin: time sharing operating systems
 e.g., earliest deadline first (EDF) and rate monotonic (RM)
(AQUA, HeiTS, RT Upcalls, ...)
 Proportional share resource allocation
 no guarantees
 requirements are specified by a relative share
 allocation in proportion to competing shares
 size of a share depends on system state and time
 origin: packet switched networks
 e.g., Scheduler for Multimedia And Real-Time (SMART)
(Lottery, Stride, Move-to-Rear List, ...)
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Earliest Deadline First (EDF)
 Preemptive scheduling based on dynamic task priorities
 Task with closest deadline has highest priority
 stream priorities vary with time
 Dispatcher selects the highest priority task
 Assumptions:
 requests for all tasks with deadlines are periodic
 the deadline of a task is equal to the end on its period (starting of next)
 independent tasks (no precedence)
 run-time for each task is known and constant
 context switches can be ignored
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Earliest Deadline First (EDF)
 Example:
priority A < priority B
priority A > priority B
deadlines
Task A
Task B
time
Dispatching
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Rate Monotonic (RM) Scheduling
 Classic algorithm for hard real-time systems with one CPU
[Liu & Layland ‘73]
 Pre-emptive scheduling based on static task priorities
 Optimal: no other algorithms with static task priorities can
schedule tasks that cannot be scheduled by RM
 Assumptions:
 requests for all tasks with deadlines are periodic
 the deadline of a task is equal to the end on its period (starting of next)
 independent tasks (no precedence)
 run-time for each task is known and constant
 context switches can be ignored
 any non-periodic task has no deadline
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Rate Monotonic (RM) Scheduling
shortest period,
highest priority
priority
 Process priority based on task periods
 task with shortest period gets
highest static priority
 task with longest period gets
lowest static priority
period length
 dispatcher always selects task requests with highest priority
 Example:
Task 1
Task 2
longest period,
lowest priority
p1
p2
P 1 < P2
 P1 highest priority
preemption
Dispatching
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
EDF Versus RM
 It might be impossible to prevent deadline misses in a strict, fixed priority system:
deadlines
Task A
time
Task B
deadline miss
Fixed priorities,
A has priority, no dropping
waste of time
deadline
miss
Fixed priorities,
A has priority, dropping
Fixed priorities,
B has priority, no dropping
Fixed priorities,
B has priority, dropping
deadline
miss
waste of time
waste of time
deadline
miss
Earliest deadline first
deadline miss
Rate monotonic (as the first)
INF5070 – media storage and distribution systems
RM may give some
deadline violations
which is avoided by EDF
2004 Carsten Griwodz & Pål Halvorsen
EDF Versus RM
NOTE: this means that EDF is usually
more efficient than RM, i.e., if switches
are free and EDF uses resources ≤ 1,
time then RM may need ≤ ln(2) resources
to schedule the same workload
 EDF
 dynamic priorities changing in
 overhead in priority switching
 QoS calculation – maximal throughput:

all streams i
Ri x ei ≤ 1,
R – rate, e – processing time
 RM
 static priorities based on periods
 may map priority onto fixed OS priorities (like Linux)
 QoS calculation:

all streams i
Ri x ei ≤ ln(2), R – rate, e – processing time
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
SMART
(Scheduler for Multimedia And Real–Time applications)
 Designed for multimedia and real-time applications
 Principles

priority – high priority tasks should not suffer degradation due to
presence of low priority tasks

proportional sharing – allocate resources proportionally and distribute
unused resources (work conserving)

tradeoff immediate fairness – real-time and less competitive processes
(short-lived, interactive, I/O-bound, ...) get instantaneous higher shares

graceful transitions – adapt smoothly to resource demand changes

notification – notify applications of resource changes
 No admission control
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
SMART
(Scheduler for Multimedia And Real–Time applications)
 Tasks have importance and urgency
 urgency – an immediate real-time constraint, short deadline
(determine when a task will get resources)
 importance – determine the overall resource allocation




expressed by a tuple:
[ priority p , biased virtual finishing time bvft ]
static priority: supplied by user or assigned a default value
virtual finishing time: degree to which the share was consumed
bias: bonus for interactive tasks
 Best effort schedule based on urgency and importance
 find most important tasks – compare tuple:
T1 > T2  (p1 > p2)  (p1 = p2  bvft1 > bvft2)
 sort after urgency (EDF based sorting)
 iteratively select task from candidate set as long as schedule is feasible
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Evaluation of a Real–Time Scheduling
 Tests performed
 by IBM (1993)
 executing tasks with and without EDF
 on an 57 MHz, 32 MB RAM, AIX Power 1
 Video playback program:
 one real-time process
 read compressed data
 decompress data
 present video frames via X server to user
 process requires 15 timeslots of 28 ms each per second
 42 % of the CPU time
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Evaluation of a Real–Time Scheduling
3 Load Processes
3 load processes
(competing with the video playback) without real-time s cheduling
laxity (remaining time to deadline)
laxity [s]
with real-time s cheduling
0.05
the real-time
scheduler reaches
all its deadlines
0.04
0.03
0.02
0.01
0
-0.01
several deadline
violations by the
non-real-time
scheduler
-0.02
-0.03
-0.04
-0.05
0
20
40
60
INF5070 – media storage and distribution systems
80
100 120 140 160 180 200
task number
ev ent number
2004 Carsten Griwodz & Pål Halvorsen
Evaluation of a Real–Time Scheduling
Varied the number of load processes
(competing with the video playback)
Only video process
0. 042
laxity (remaining time to deadline)
0.04
0. 038
4 other
processes
0. 036
0. 034
16 other
processes
0. 032
0.03
NB! The EDF
scheduler kept
its deadlines
0. 028
0. 026
0
20
40
60
80
INF5070 – media storage and distribution systems
100 120 140 160 180 200
task number
2004 Carsten Griwodz & Pål Halvorsen
Evaluation of a Real–Time Scheduling
 Tests again performed
 by IBM (1993)
 on an 57 MHz, 32 MB RAM, AIX Power 1
 “Stupid” end system program:
 3 real-time processes only requesting CPU cycles
 each process requires 15 timeslots of 21 ms each per second
 31.5 % of the CPU time each
 94.5 % of the CPU time required for real-time tasks
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Evaluation of a Real–Time Scheduling
1 load process
1 Load Process
0.06
without real-time scheduling
with real-time scheduling
laxity [s]
laxity (remaining time to deadline)
laxity [s]
(competing with the
real-time processes)
0.04
0.05
0.045
the real-time
0.04
scheduler reaches
all its0.035
deadlines
0.02
0.03
0
0.025
-0.02
0.02
0.015
-0.04
0.01
-0.06
0.005
-0.08
0
0
20
40
60
80 100 120 140 160 180 200
number
event task
number
INF5070 – media storage and distribution systems
0
20
2004 Carsten Griwodz & Pål Halvorsen
Evaluation
of a Processes
Real–Time Scheduling
16 Load
16
load
process
with
real-time
scheduling – process 1
0.050.06
with real-time
scheduling
– process
2
without
real-time
scheduling
with real-time with
scheduling
– process
3
real-time
scheduling
0.045
0.04
0.04
process 1
0.0350.02
0.03
0.025
0
-0.02
0.02
0.015
-0.04
0.01
process 2
laxity [s]
laxity [s]
laxity (remaining time to deadline)
laxity [s]
1
Load
Process
(competing with the real-time processes)
0.05
0.045
0.04
Regardless of
0.035
other load,
the
EDF-scheduler
0.03 reach
its deadlines
0.025 equal
(laxity almost
as in 1 load
process
0.02
scenario)
0.015
NOTE: Processes
are
0.01
process 3
-0.06
0.005
scheduled in same
0.005
order
0
-0.08
0
0
20 40 60 80 100 120 140 160 180 200
task
number
0 20 40 60 80 100 120 140
160
180 200
0
20
event
number
event number
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Memory Management
Copying on the Intel Hub Architecture
application
Pentium 4
Processor
registers
file system
communication
system
disk
network card
cache(s)
memory
controller
hub
RDRAM
file system
RDRAM
communication system
RDRAM
application
RDRAM
I/O
controller
hub
INF5070 – media storage and distribution systems
PCI slots
network card
PCI slots
PCI slots
disk
2004 Carsten Griwodz & Pål Halvorsen
Streaming Modes Using Copying
 Traditional applications:
application-specific
data modifications
read
user
kernel
write
independent abstraction layer(s)
OS
device driver
device driver
HW device
HW device
 Streaming applications:
read
user
kernel
write
independent abstraction layer(s)
OS
device driver
device driver
HW device
HW device
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Cost of Data Transfers – Example I
 First generation router built with 133 MHz Intel Pentium
 mean packet size 500 B
 interrupt time of 10 µs, word access 50 ns
 per packet processing of 200 instructions (1.504 µs)

copy loop:




4 instructions
2 memory accesses
130.08 ns (per 4 byte)
register
 memory[read_ptr]
memory[write_ptr]  register
read_ptr
 read_prt + 4
write_ptr
 write_prt + 4
counter
 counter
– 1
if (counter not 0) goto top of loop
per packet:
processing + copy + interrupt = 1.504 µs + [(500/4) * 130 ns] + 10 µs
= 27.754 µs  144 Mbps
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Cost of Data Transfers – Example II
 Copying in NetBSDv1.5
 by UniK/IFI (2000)
 copyin(),
copyout(), and
memcpy()
 933 MHz P3 CPU
 theoretical max.:
25.6 Gbps


INTEL:
larger is better
BUT:



max at 2 – 8 KB
decrease at larger
sizes
caching effects
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Cost of Data Transfers – Example II
(cont.)
 Assume sending 1 GB data
 whole operation, reading from disk and sending to network,
takes about 10 s
 reading 64 KB blocks from disk
 137.10 µs per copyout()
 sending 4 KB packets
 1.65 µs per copyin()
 in total: read + send =
(16384 * 137.10 µs) + (262144 * 1.65 µs) =
2.679 s for copying only
 THUS; data movement costs should be kept small
 careful management of contiguous media data
 avoid unnecessary physical copy operations
 apply appropriate buffer management schemes

reduce overhead by removing physical in-memory copy
operation, i.e., ZERO-COPY data paths
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Basic Idea of Zero–Copy Data Paths
application
user space
kernel space
file system
buf
b_data
communication
system
mbuf
m_data
bus(es)
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Streaming Modes NOT Using Copying
 Application streaming using zero-copy:



read data into kernel
buffer and send
user
from there
kernel
application
responsible for timing
send:


explicit send
automatic send
read
read& send
write
independent abstraction layer(s)
OS
device driver
device driver
HW device
HW device
 Kernel streaming using zero-copy:




thread per stream
perform read and
write operations
user
application specifies kernel
timing, but it is
ensured by the tread
stream is only created –
controlled by kernel
create stream
independent abstraction layer(s)
OS
device driver
HW device
INF5070 – media storage and distribution systems
thread
device driver
HW device
2004 Carsten Griwodz & Pål Halvorsen
Existing Zero–Copy Streaming Mechanisms
 Linux: sendfile()
 between two descriptors (file and TCP-socket)
 bi-directional: disk-network and network-disk
 need TCP_CORK
 AIX: send_file()
 only TCP
 uni-directional: disk-network
Kernel streaming
using zero-copy
 INSTANCE (MMBUF-based, in NetBSDv1.5):
 by UniK/IFI (2000)
 uni-directional: disk-network
(network-disk ongoing work)
 stream_read() and stream_send()
Application streaming
(zero-copy 1)
using zero-copy
 stream_rdsnd()
(zero-copy 2)
 splice(), stream(), IO-Lite, MMBUF, …
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
INSTANCE CPU Time
Traditional
12
 Used disk blocks of 64 KB
10
 Used UDP packets of 1–8 KB
 Results in seconds:
time in seconds
 Transfer 1 GB
Zero-Copy 1
8
6
4
2
0
1
 Gain larger than expected:


removed other operations as
well like buffer cache look-up
1
(simplified the chain of functions)
2
some packet drop at server saved 4
about 0.2 s
8
INF5070 – media storage and distribution systems
2
4
8
packet size in KB
Removing copy
Measured
KB
2.80
3.39
KB
2.75
4.09
KB
2.68
3.98
KB
2.98
3.31
2004 Carsten Griwodz & Pål Halvorsen
INSTANCE Zero–Copy Transfer Rate
 Zero-copy transfer
rate limited by
network card
and storage system

saturated a
1 Gbps NIC and
32-bit, 33 MHz PCI

reduced
processing time by
approximately
50 %

read, write, with copy
read, write, no copy
read, automatic write, no copy
 Throughput increase of ~2.7 times per stream
(can at least double the number of streams)
huge improvement
in number of
concurrent streams
INF5070 – media storage and distribution systems
approx. 12 Mbps
approx. 6 Mbps
2004 Carsten Griwodz & Pål Halvorsen
The End:
Summary
Summary
 All (active) resources needs to be scheduled
 Scheduling algorithms for multimedia tasks have to…
 … consider real-time requirements
 … provide good resource utilization
 (… be implementable)
 Memory management is an important issue
 pinning frequently used data – or at least keep as long as possible
(replacement algorithms later)
 reservation of memory buffers
 copying is expensive
 Rule of thumb: watch out for bottlenecks
 copying
 data touching operations
 frequent context switches
 scheduling of slow devices (disk)
 ...
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen
Some References
1.
2.
3.
4.
5.
6.
7.
8.
Halvorsen, P.: “Improving I/O Performance of Multimedia Servers”, Thesis for the Dr. Scient. degree at
University of Oslo, Unipub forlag, ISSN 1501-7710, No. 161, Oslo, Norway, August 2001
Liu, C.L., Layland, J.W.: "Scheduling Algorithms for Multi-Programming in a Hard Real-Time Environment“,
Journal of the Association for Computing Machinery 20, 1 (January 1973): 40-61
Nieh, J., Lam, M.S.: “The Design, Implementation and Evaluation of SMART: A Scheduler for Multimedia
Applications”, Proc. of 16th ACM Symp. on Operating System Principles (SOSP’97), St. Malo, France,
October 1997, pp. 184-197
Plagemann, T., Goebel, V., Halvorsen, P., Anshus, O.: "Operating System Support for Multimedia Systems",
The Computer Communications Journal, Elsevier, Vol. 23, No. 3, February 2000, pp. 267-289
Solomon, D.A., Russinovich, M.E.: “Inside Microsoft Windows2000”, 3rd edition, Microsoft Press, 2000
Steinmetz, R., Nahrstedt, C.: “Multimedia: Computing, Communications & Applications”, Prentice Hall, 1995
Tanenbaum, A.S.: “Modern Operating Systems” (2nd ed.), Prentice Hall, 2001
Wolf, L.C., Burke, W., Vogt, C.: “Evaluation of a CPU Scheduling Mechanism for Multimedia Systems”,
Software - Practice and Experience, Vol. 26, No. 4, 1996, pp. 375 - 398
INF5070 – media storage and distribution systems
2004 Carsten Griwodz & Pål Halvorsen