
Chapter 12 - Virtual Memory Systems
– Page Replacement Algorithms
• Main metric is the number of page faults per algorithm,
given a constant memory reference stream.
• Will look at global replacement first (page frames
allocated to all processes are candidates for
replacement).
• All global page algorithms attempt to predict the
future page requirements of the process.
• Page Replacement Algorithms
– Optimal (aka MIN or OPT): Replace the page that is
to be used the farthest in the future.
• Not an actual algorithm, since it requires future knowledge.
• Useful benchmark, though, for testing other algorithms.
• Provides a lower bound on page faults.
• Example of its use later.
– Random: Replace a random page.
• Does not attempt to predict future behavior based on the
past.
• Easy to implement :)
• Not the worst, but not the best. Consider it a baseline that
any serious page replacement algorithm should beat.
• Example of its use later (Figure 12.1).
• Page Replacement Algorithms
– FIFO: Replace the page that has been in memory the
longest.
• Theory is that oldest page won’t be used in near future.
• Bad theory :) (Figure 12.2).
– Least-Recently Used (LRU): Replace the page that
has not been accessed in the longest time.
• Predict the recent future from the recent past.
• Good theory & best global paging algorithm (30 - 40% of
Optimal).
• Expensive to record timestamps, though (Figure 12.4).
• Easier to approximate LRU via software & hardware.
• LRU Approximations
– A useful hardware device is a per-page reference bit
that is set to 1 by the paging hardware whenever the
page is accessed. It can be read and set to zero by
software.
– Crude LRU: Use the reference bit to provide you
with one bit of resolution - can use it to split page
references into those that have been accessed at least
once from those that have not been accessed at all
(Figure 12.5).
– LRU page counter: Keep a per-page counter (say, 8 bits)
that an LRU daemon updates every second or so: if the
ref bit == 1, increment the counter & clear the ref bit. This
will separate the pages into 256 sets (Figure 12.5).
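– A minimal C sketch of that daemon pass (the names and the
8-bit counter size are illustrative, not from the book):

    #include <stdint.h>

    #define NPAGES 1024

    struct page {
        int     ref;      /* hardware reference bit  */
        uint8_t counter;  /* 8-bit LRU approximation */
    };

    struct page pages[NPAGES];

    /* Periodic daemon pass: bump the counters of referenced
     * pages and clear the reference bits. Pages end up split
     * into 256 sets by counter value; replace from the lowest. */
    void lru_daemon_tick(void) {
        for (int i = 0; i < NPAGES; i++) {
            if (pages[i].ref) {
                if (pages[i].counter < 255)
                    pages[i].counter++;   /* saturate at 255 */
                pages[i].ref = 0;
            }
        }
    }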
• LRU Approximations
– True LRU: Each page has a timestamp value that will
sort the page access times.
• Clock Page Replacement Algorithms
– Visualize as a circular list of page frames, with a
single clock hand sweeping across.
– FIFO Clock algorithm: Clock hand always points to
the page to replace; after replacement the hand moves
on to the next page frame. This is the FIFO
algorithm (and thus not interesting).
• Clock Page Replacement Algorithms
– Basic Clock algorithm:
• Hand looks at reference bit of current page frame. If
ref bit == 0, this page hasn’t been referenced and it is
selected to be replaced.
• If ref bit == 1, this page has been referenced. Give it
another chance by setting the ref bit to 0 and continue
looking for a page frame where the ref bit == 0. Notice
that this ensures a free page will eventually be found when
the clock hand rolls back around.
• A page has to get referenced at least once every revolution
of the clock hand to keep from getting replaced.
• This is a modified FIFO algorithm (called First In, Not
Used, First Out -- FINUFO or Not Recently Used -- NRU).
• One clock alg had to look at an average of 13 page frames in
order to find one to replace.
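– A minimal C sketch of the basic clock loop (illustrative, not
the book's code):

    #define NFRAMES 256

    struct frame { int ref; };   /* hardware reference bit */

    struct frame frames[NFRAMES];
    int hand = 0;                /* the clock hand */

    /* Returns the index of the frame to replace. This must
     * terminate: each pass clears ref bits, so at worst the
     * second revolution finds a frame with ref == 0. */
    int clock_select(void) {
        for (;;) {
            if (frames[hand].ref == 0) {
                int victim = hand;
                hand = (hand + 1) % NFRAMES;
                return victim;
            }
            frames[hand].ref = 0;        /* another chance */
            hand = (hand + 1) % NFRAMES;
        }
    }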
• Clock Page Replacement Algorithms
– Hopefully the hardware not only supports the ref bit,
but also a modified or dirty bit - a bit that is set
indicating a value has been written somewhere inside
of the page frame (a memory write occurred).
– This gives us two bits we can use to “sort” the page
references. A page with a dirty bit set is, in general,
more expensive to handle (the page has to be written
out to the swap device before a new page can be read
into the page frame) -- it will take twice as long to
process.
– We can modify the basic clock algorithm to take the
ref bit and dirty bit into account (Figure 12.7).
• Clock Page Replacement Algorithms
– Clock algorithm with ref/dirty bits (aka Second
Chance).
• Test the page frame pointed to by the clock hand:
– If ref == 0 && dirty == 0, select this page for replacement;
– If ref == 1, set ref = 0 and continue moving the clock hand;
– If ref == 0 && dirty == 1, start writing the page to disk, set the
dirty bit to 0, and continue moving the clock hand.
• If the page frame fails the test (it is a 0,0 page frame), replace
this page.
• If it passes the test,
– Modify the page frame information according to the rules above.
– Move the clock hand to the next page frame.
– Go back to the test step above.
– Idea is to replace the “cheapest, oldest” page frame.
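– A minimal C sketch of this loop; start_page_write() is a
stand-in for queuing the asynchronous disk write, not a real
API:

    #define NFRAMES 256

    struct frame { int ref; int dirty; };

    struct frame frames[NFRAMES];
    int hand = 0;                         /* the clock hand */

    /* Stand-in: schedule an asynchronous write-out of frame f. */
    void start_page_write(int f) { (void)f; }

    /* Returns the "cheapest, oldest" frame to replace. */
    int second_chance_select(void) {
        for (;;) {
            struct frame *f = &frames[hand];
            if (f->ref == 0 && f->dirty == 0) {
                int victim = hand;        /* a 0,0 frame      */
                hand = (hand + 1) % NFRAMES;
                return victim;
            }
            if (f->ref == 1)
                f->ref = 0;               /* second chance    */
            else {                        /* ref==0, dirty==1 */
                start_page_write(hand);   /* begin cleaning   */
                f->dirty = 0;
            }
            hand = (hand + 1) % NFRAMES;
        }
    }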
• Clock Page Replacement Algorithms
– In summary,
Variant         Test                      Modification
FIFO            always fail               none
Basic Clock     ref == 1?                 ref = 0
Second Chance   (ref == 1 || mod == 1)?   if ref == 1 then ref = 0,
                                          else mod = 0
– Book quote: Clock algorithms are versatile and
effective.
• Examples of Page Replacement Algorithms
– Typical program may contain millions of memory
references.
– We will use short reference strings with higher
than normal page fault rates (to demo the algs).
• Examples of Page Replacement Algorithms
– Reference stream is (column headers on pages 493-494): 1,2,3,2,5,6,3,4,6,3,7,3,1,5,3,6,3,4,2,4,3,4,5,1
– Assume four page frames available (row numbered 0
through 3 on pages 493-494).
– An “*” following a number indicates a page fault.
– A “+” following a number indicates a reference for a
page that is already in the page table.
– Results (not counting the 1st four page faults):
• Optimal: 7 page faults
• LRU: 10
• Clock: 10
• FIFO: 12
• Random: 12
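– A small C sketch that replays the reference stream above
through FIFO with four frames; it reproduces the count of 12
faults after the first four compulsory misses:

    #include <stdio.h>

    #define NFRAMES 4

    int main(void) {
        int refs[] = {1,2,3,2,5,6,3,4,6,3,7,3,
                      1,5,3,6,3,4,2,4,3,4,5,1};
        int n = sizeof refs / sizeof refs[0];
        int frames[NFRAMES] = {0};   /* 0 = empty frame */
        int next = 0, faults = 0;

        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int j = 0; j < NFRAMES; j++)
                if (frames[j] == refs[i]) { hit = 1; break; }
            if (!hit) {
                frames[next] = refs[i];      /* evict oldest */
                next = (next + 1) % NFRAMES;
                faults++;
            }
        }
        /* The slides do not count the first four page faults. */
        printf("FIFO faults: %d\n", faults - 4);   /* 12 */
        return 0;
    }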
• Local Page Replacement Algorithms
– Global algorithms make no attempt to model the
behavior of an individual process.
– Better page replacement can be found using more
accurate models of individual process execution.
– Currently accepted model: Phase model of Working
Sets.
– Working set: set of pages that a process is currently
using. If a process has only the working set of pages
mapped to page frames in its page table, then it won't
page fault.
– The working set is defined in terms of past page
references.
• Working Set Model
– Virtual time: each page reference is one unit of virtual
time.
– Time units are numbered from 1 (the oldest page
reference) to T (the present).
– Reference string: sequence of pages r1, r2, r3, .., rT,
where rt is the page reference at virtual time t.
– Let Θ = working set window: number of page
references in which to look for a working set.
– Working set is defined as:
• W(T, Θ) = { p | p = rt for some t with T − Θ < t ≤ T }
– In other words, the working set is the set of pages
referenced within the working set window.
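– A small C sketch of the definition: W(T, Θ) is just the set of
distinct pages seen in the last Θ references (names are
illustrative):

    #include <stdio.h>

    #define MAXPAGE 8   /* pages are numbered 1..7 here */

    /* Print W(T, theta); refs[t-1] holds r_t, so the loop
     * covers exactly the window T - theta < t <= T. */
    void working_set(const int refs[], int T, int theta) {
        int in_ws[MAXPAGE] = {0};
        for (int t = T - theta + 1; t <= T; t++)
            if (t >= 1)
                in_ws[refs[t - 1]] = 1;
        printf("W(%d, %d) = {", T, theta);
        for (int p = 0; p < MAXPAGE; p++)
            if (in_ws[p]) printf(" %d", p);
        printf(" }\n");
    }

    int main(void) {
        int refs[] = {1,2,3,2,5,6,3,4,6,3,7,3};
        working_set(refs, 12, 4);   /* prints W(12, 4) = { 3 6 7 } */
        return 0;
    }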
• Working Set Model
– Idea is that if a process is allocated at least as many
page frames as there are pages in its working set, it will
never page fault once the working set is loaded.
– The operating system should thus only run the
process when all the pages in its working set are
loaded into memory.
– The working set will change in time as the process
proceeds.
– Studies show that the working set transitions or
phases tend to happen abruptly.
– Measurements show: 95-98% of virtual time of a
process is spent in a stable working set phase.
• Working Set Model
– Thus, only 2 to 5% of virtual time is spent between
working sets. These transitions account for 40-50%
of the total page faults of the process.
– Figure 12.8 shows how the working set sizes
fluctuate between the program phases.
– Resident set: set of pages that a process currently has
in memory. If the resident set == working set, then
page faults should be minimal. If the resident set <
working set, excessive page faults will occur. If the
resident set > working set, the result is inefficient use
of allocated page frames.
– Working sets change as a program runs, so it makes
sense to make the resident set track the working set.
• Working Set Paging Algorithm
– Monitor the working set of each process by
computing W (the working set size).
– Change the resident set as W changes.
– If W will not fit in memory, swap out the entire
process until its working set will fit.
– The WS paging algorithm is a good one, but
expensive to implement. Easier to approximate
(same as with LRU and LRU approximations).
– Working set algorithm requires a virtual time
reference, which is easy to keep track of in the
dispatcher at context switch time. We can use the
accumulated execution time as the virtual time of the
process.
• Approximating the Working Set
– The working set approximation algorithm:
• When a page fault occurs, scan through all the pages the
process has in memory.
• If the ref bit is set, set the “time of last use” of that page to
the current virtual time (accum. CPU time) of the process.
• If the ref bit is NOT set, check the “time of last use” for the
page. If the page has not been used within Θ (working set
window) time units then remove the page.
• Bring in the new page and add it to the resident set of the
program.
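– A minimal C sketch of that page-fault-time scan (THETA and
the field names are illustrative, not from the book):

    #define THETA 1000   /* working set window, virtual time units */

    struct page {
        int  resident;   /* in the resident set?     */
        int  ref;        /* hardware reference bit   */
        long last_use;   /* virtual time of last use */
    };

    /* Scan the faulting process's resident pages; vtime is its
     * accumulated CPU (virtual) time. */
    void ws_scan(struct page pages[], int npages, long vtime) {
        for (int i = 0; i < npages; i++) {
            if (!pages[i].resident)
                continue;
            if (pages[i].ref) {
                pages[i].last_use = vtime;   /* recently used */
                pages[i].ref = 0;
            } else if (vtime - pages[i].last_use > THETA) {
                pages[i].resident = 0;  /* fell out of the WS */
            }
        }
        /* Caller then brings in the new page and adds it to
         * the resident set. */
    }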
– Another variation, which we won’t cover, is the
WSClock paging algorithm. It is more efficient than
the above approximation algorithm.
– Skip section 12.5.
• Thrashing & Load Control
– Thrashing: overhead exceeds useful work per time
unit. For paging performance, this means that
page fault processing is taking more time than the
processes are getting over a period of time.
– Classic thrashing behavior seen by old page fault
algorithm that allowed in more processes when the
user CPU consumption rate decreased. This would
lead to more paging, more overhead, less user CPU
consumption. A positive feedback loop resulted and
the machine spent more time paging than allowing
user processes to run.
– This leads to the need for better load control (Figure
12.11).
• Thrashing & Load Control
– A better metric for VM performance is the page fault
rate. If the page fault rate increases, decrease the
level of multiprogramming by swapping out
processes. If the page fault rate decreases, increase
the level of multiprogramming by swapping in
processes.
– Swapping: the entire process address space is moved
to the paging disk. A swapped-out process has no
page frames assigned to it and cannot be scheduled to
the run state.
– Swapping involves higher degrees of process control
and is usually scheduled separately.
• Load Control
– The short-term scheduler schedules processes that are
in memory (or at least partially in memory).
– The medium-term scheduler schedules processes that
are out of memory (swapped out).
– Some operating systems have a long-term scheduler
that schedule processes/jobs to run over a period of
hours or days (batch queuing system).
– Figure 12.12 shows the first two levels.
– The short-term and medium-term schedulers must
coordinate their activities to keep processes flowing
freely between memory and the swap device.
• Load Control
– The page replacement algorithms can provide useful
load metrics:
• Clock algorithm - if the clock hand rate is too slow, bring
in more processes; if the clock hand rate is too fast, swap
out processes.
• Page fault frequency load control - track the page
faults/second. Interesting choice: the “L==S” criterion.
Keep the mean time between page faults (L) the same as
the time it takes to handle a page fault (S). If L > S, swap
processes out; if L < S, swap processes in (methinks the
book has this backwards!).
– Most UNIX systems revert to swapping when the
page fault rates get high. Separate daemons (pageout
on xi; kswapd on Linux) perform the swapping.
• Load Control
– In some cases it’s useful to be proactive with load
control (predictive) rather than reactive.
– One mechanism separates out the paging activity of a
loading process (LT: Load Time) from a running
process (RT: Run Time). The algorithm can then
treat the initial process loading differently than a
running process, avoiding misleading spikes in the
page fault rate that may inadvertently decrease the level of
multiprogramming.
• Load Control
– Another idea is to prepage in parts of a process that
are expected to be used right away.
– Rather than using a demand paging scheme, where
pages are brought in as they are used, prepaging of
pages from a swapped-out process can eliminate the
initial flurry of page faults as the resident set reestablishes itself.
• Large Page Tables
– Assume a 32-bit address space.
– Assume 4 KB pages. This partitions the 32 bits into
20 bits for a page number and 12 bits for the offset.
– Assuming a PTE takes 4 bytes, this would require a 4
MB page table per process (2^20 entries * 4 bytes)!
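– A quick C check of that arithmetic:

    #include <stdio.h>

    int main(void) {
        unsigned long entries  = 1UL << 20;  /* 2^20 virtual pages */
        unsigned long pte_size = 4;          /* bytes per PTE      */
        unsigned long bytes    = entries * pte_size;
        printf("per-process page table: %lu bytes (%lu MB)\n",
               bytes, bytes >> 20);   /* 4194304 bytes, 4 MB */
        return 0;
    }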
• Large Page Tables
– One solution: two-level paging.
• Chop the page number into two parts: a master page
number and a secondary page number (example: 10 bits
for each; 12 bits for offset).
• Let master page table contain not PTEs, but memory
pointers to secondary page tables, which do contain PTEs.
• Assuming the process address requirements are sparse, not
all the secondary page tables will be needed, resulting in
significant memory savings for page tables.
• Figures 12.13 and 12.14 show two views of two-level
paging.
• While the master page table should be in memory, why not
page out the secondary tables?
• Notice, though, the cost of an extra memory access; assume
that the TLB will alleviate the 2-level costs.
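– A minimal C sketch of a software two-level walk with the
10/10/12 split above (names such as master_table are
illustrative, not from the book):

    #include <stdint.h>
    #include <stddef.h>

    typedef uint32_t pte_t;     /* frame number in the high bits */

    #define PTE_VALID 0x1u

    pte_t *master_table[1024];  /* pointers to secondary tables */

    /* Translate a 32-bit virtual address; returns 0 to signal a
     * (simulated) page fault or unmapped region. */
    uint32_t translate(uint32_t vaddr) {
        uint32_t master_idx = vaddr >> 22;           /* top 10 bits    */
        uint32_t second_idx = (vaddr >> 12) & 0x3FF; /* middle 10 bits */
        uint32_t offset     = vaddr & 0xFFF;         /* low 12 bits    */

        pte_t *secondary = master_table[master_idx];
        if (secondary == NULL)
            return 0;             /* no secondary table: sparse */

        pte_t pte = secondary[second_idx];
        if (!(pte & PTE_VALID))
            return 0;             /* page fault */

        return ((pte >> 12) << 12) | offset;   /* frame + offset */
    }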
• Large Page Tables
– Another solution: three-level paging.
• Can further subdivide the page address into three levels!
• Figure 12.15
– If the level of TLB misses is low enough we can
avoid having hardware support for two and three
level paging and use software page table lookups
instead (Figure 12.16). Table on page 513 shows
relative costs of TLB hit/miss/page fault for
1/2/3/software paging.
– Yet another solution: inverted page tables.
• Rather than have a page table entry for every possible virtual
page, why not have a page table entry for each
physical page frame? After all, the processes together cannot
have more pages resident than there is real memory.
• Large Page Tables
– Inverted page tables
• An inverted page table has one entry for each page frame in
physical memory and is used to look up which virtual page
is in the page frame.
• Need some way to map from a virtual page number into the
inverted page table, since the IPT is indexed by physical
page frame number.
• One mechanism is to use a hash table (Figure 12.18).
• Another is to include the process ID in the inverted page
table so we know which page frame belongs to which
process.
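– A minimal C sketch of a hashed inverted page table lookup in
the spirit of Figure 12.18 (field names are illustrative):

    #include <stdint.h>

    #define NFRAMES   1024
    #define HASH_SIZE 1024

    struct ipt_entry {
        int      pid;    /* owning process                  */
        uint32_t vpn;    /* virtual page held in this frame */
        int      next;   /* next frame in hash chain, or -1 */
        int      valid;
    };

    struct ipt_entry ipt[NFRAMES];   /* one entry per page frame */
    int hash_anchor[HASH_SIZE];      /* bucket -> first frame    */

    static unsigned hash(int pid, uint32_t vpn) {
        return (pid * 31u + vpn) % HASH_SIZE;
    }

    void ipt_init(void) {            /* all buckets start empty */
        for (int i = 0; i < HASH_SIZE; i++)
            hash_anchor[i] = -1;
    }

    /* Return the frame holding (pid, vpn), or -1 on a fault. */
    int ipt_lookup(int pid, uint32_t vpn) {
        for (int f = hash_anchor[hash(pid, vpn)]; f != -1;
             f = ipt[f].next)
            if (ipt[f].valid && ipt[f].pid == pid &&
                ipt[f].vpn == vpn)
                return f;
        return -1;                   /* page fault */
    }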
• Skip sections 12.8, 12.9 & 12.10.
• Segmentation
– Segment: division of the logical (virtual) address
space that is visible to the programmer.
– An alternative to fixed-sized pages, segmentation
allows for memory allocation to occur at “natural”
boundaries, since the segment sizes are variable. You
can allocate segments based on the sizes of the
objects in the load module.
– Segmented addressing works similar to paged
addressing, with the obvious name change (“segment
table” rather than “page table”, etc.) and the need for
a base/limit pair within each STE (Segment Table
Entry).
– Figure 12.22 shows segmentation addressing.
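– A minimal C sketch of the base/limit check implied by an STE
(field names are illustrative, not from the book):

    #include <stdint.h>
    #include <stddef.h>

    struct ste {             /* segment table entry     */
        uint32_t base;       /* physical base address   */
        uint32_t limit;      /* segment length in bytes */
        int      valid;
    };

    /* Translate (segment, offset); returns 0 to signal an
     * invalid segment or a limit violation. */
    uint32_t seg_translate(const struct ste *seg_table, size_t nsegs,
                           uint32_t seg, uint32_t offset) {
        if (seg >= nsegs || !seg_table[seg].valid)
            return 0;                        /* invalid segment */
        if (offset >= seg_table[seg].limit)
            return 0;                        /* limit violation */
        return seg_table[seg].base + offset;
    }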
• Segmentation
– Segment tables can have the same fields as a page
table (ref bit, dirty bit, access bits, etc.).
– Notice that finding a free segment of the proper size
required by a process brings up all the horrible
external fragmentation problems seen in dynamic
memory allocation.
– Some operating systems combine segmentation with
paging: the virtual address is first sent to a segment
table. The segment base address then points to a page
table rather than to a memory address. So, segments
are composed of same-sized pages. This avoids the
dynamic memory allocation problem.
• Segmentation
– Note that in the “segmentation” vs “paging” war,
paging won :)
– Paging is transparent to the user, while segmentation
requires that the compiler writers know how
segments are assigned.
– Segmentation, as previously mentioned, re-introduces
the dynamic memory allocation hassles.
– Interesting book comment: the only commercial systems
using segmentation use paged segments.
– Book also says, rightfully so, that you have to be
aware of the potential misuse of the word “segment”;
it is used to mean different things. Find out the
context.
• Sharing Memory
– Page tables and segment tables provide an extra
bonus -- the ability to share memory between
processes.
– We saw how use of shared libraries cut down the
memory requirements of processes by the use of
reentrant library code.
– A simple adjustment to a page table: have the frame
number duplicated in more than one process’s page
table (Figure 12.23). Now each process can execute
code or access data that is executable/accessible from
another process.
– Shared memory management requires new syscalls.
• Sharing Memory
– New SOS calls:
void *AttachSharedMemory(int smid, void *smaddr, int flags)
void *DetachSharedMemory(void *smaddr)
– UNIX calls: shmget(), shmctl(), shmop().
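– A minimal C example of the System V calls (shmat()/shmdt()
are the operations covered by shmop(); the key and size here
are arbitrary):

    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <stdio.h>

    int main(void) {
        /* Create (or find) a 4 KB segment with key 0x1234. */
        int smid = shmget((key_t)0x1234, 4096, IPC_CREAT | 0600);
        if (smid < 0) { perror("shmget"); return 1; }

        /* Attach; NULL lets the kernel pick the address, like
         * passing smaddr == NULL to AttachSharedMemory above. */
        char *smaddr = shmat(smid, NULL, 0);
        if (smaddr == (char *)-1) { perror("shmat"); return 1; }

        smaddr[0] = 'x';   /* now visible to other attachers */
        shmdt(smaddr);     /* detach, cf. DetachSharedMemory */
        return 0;
    }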
• Virtual Memory System Examples
– Swap area: section of a disk reserved for pages.
• Usually a separate partition (“swap -l” on xi or “cat
/proc/meminfo” on Linux)
• Sometimes part of the filesystem (Windows
NT:Settings/Control Panel/System/Performance/Virtual
Memory)
• Rule of thumb: swap area = 2 * physical memory
• Virtual Memory System Examples
– Swap area: section of a disk reserved for pages.
• In emergencies, most UNIXes can swap to a file in the
filesystem using the “mkfile” and “swapon” commands.
– Various optimizations can be done at process creation
to minimize initial swap disk traffic (Figure 12.24).
– Copy-on-write: useful trick for implementing UNIX
fork(). A child initially shares all the pages with its
parent in a read-only fashion: the parent's page
table entries are all set to read-only. When the child or
parent attempts to change a page, a new copy of the
page is created in a new page frame and the page
table adjusted accordingly (Figure 12.25). This is
more efficient than simply duplicating the parent's
memory.
• Virtual Memory System Examples
– More tricks: two-handed clock algorithm, allowing
pages to be reclaimed sooner rather than waiting for a
full hand revolution (Figure 12.26).
– Standby page lists: keep a list of free page frames that
the page replacement algorithm can dip into if there
are no clean (don’t have to be written back to swap
device before re-use) free pages (Figure 12.27).
– Clustering pages: at swap-in, load in a clump of
pages to prevent excessive page faults while the
process is re-establishing its resident set.
– File mapping: the virtual memory system acts as a
disk cache.
• Virtual Memory System Examples
– Portable VM systems: as with the rest of the OS, it’s
good to have as much of the VM system as possible
be written in a machine-independent fashion so
porting to a new architecture is quick.
– Sparse address space: a VM system makes it easier to
support the regioning of the VM address space of a
process, something typically done by a compiler to
separate code, data, heap and stack space.
• VM Systems used in Actual OSes
– OS/2 Version 2.0
• 4KB page size; 32-bit address space
• 2-level page tables with paged page tables
• VM Systems used in Actual OSes
– OS/2 Version 2.0
• Single-handed clock alg w/ a standby list.
• Copy-on-write
– Windows NT
• 4KB page size; 32-bit address space.
• Swap space allocated only when needed.
• Local FIFO page replacement alg. w/ a standby list.
• Demand paging with clustering on initial swap in.
• VM system built on top of HAL (Hardware Abstraction Layer).
– Mach & OSF/1
• Goals: flexible sharing; large, sparse address space;
memory-mapped files; user control of page replacement.
• VM Systems used in Actual OSes
– Mach & OSF/1
• Object-oriented approach, even in memory system.
• Global FIFO replacement with a standby list.
• Copy-on-write.
• Clustered paging.
– UNIX System V Release 4
• 2-handed clock page replacement
• Address space divided into segments (but it is not a
segmented VM system; it is paged).
• Mostly hardware independent implementation.
• Clustered paging.
• VM Systems used in Actual OSes
– UNIX 4.3 BSD
• 2-handed clock replacement
– UNIX 4.4 BSD: based on Mach VM architecture.
– DEC VMS: local FIFO with standby list.
– Macintosh: second-chance clock replacement.
– IBM MVS: 4 KB page size; 3-level paging.
• Useful UNIX commands to view paging
behavior.
– top
– vmstat
– monitor (AIX only)