Memory Management - Faruk Hadziomerovic


Chapter 4
Memory Management
4.1 Basic memory management
4.2 Swapping
4.3 Virtual memory
4.4 Page replacement algorithms
4.5 Modeling page replacement algorithms
4.6 Design issues for paging systems
4.7 Implementation issues
4.8 Segmentation
Memory Management
• Ideally programmers want memory that is
– large
– fast
– non-volatile
• Memory hierarchy
– a small amount of fast, expensive memory: cache
– some medium-speed, medium-priced main memory
– gigabytes of slow, cheap disk storage
• The memory manager handles the memory hierarchy
Basic Memory Management
Monoprogramming without Swapping or Paging
Three simple ways of organizing memory for an operating system with one user process.
Fixed memory partitions
Sort jobs into separate queues by size, one per partition:
• jobs may queue for a small partition while a larger one stands empty.
Single input queue:
• take the job at the head and run it in whichever partition fits, or
• search the queue for the largest job that fits; this discriminates against
small jobs, so a job that has been skipped over k times may not be skipped again.
Multiprogramming improves CPU utilization
Degree of multiprogramming:
p = probability that a process is waiting for I/O (CPU idle), e.g. 80%.
With n independent processes, P[CPU idle] = p^n, so P[CPU busy] = 1 - p^n.
How much memory?
Assume 32 MB total, with the OS taking 16 MB and each user program 4 MB.
With 80% I/O wait, CPU utilization is 1 - 0.8^4 ≈ 60%. An additional 16 MB
lets n grow from 4 to 8, raising utilization to 1 - 0.8^8 ≈ 83%, a 38% improvement.
Adding another 16 MB raises n to 12 and utilization to 1 - 0.8^12 ≈ 93%,
only a further 12% increase.
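As a quick check of the formula, a minimal C sketch (not from the slides) that tabulates utilization = 1 - p^n for p = 0.80:

#include <math.h>
#include <stdio.h>

int main(void) {
    double p = 0.80;                /* probability a process is waiting for I/O */
    for (int n = 1; n <= 12; n++)   /* degree of multiprogramming */
        printf("n = %2d: utilization = %.0f%%\n", n, 100.0 * (1.0 - pow(p, n)));
    return 0;
}

At n = 4, 8, and 12 this prints 59%, 83%, and 93%, matching the figures above.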
Multiprogramming problems:
• Relocation: solved by base registers.
• Protection: solved by limit registers.
Timesharing systems:
• Processes are kept on disk and brought into memory dynamically:
– Swapping
– Virtual memory
• Problem: memory compaction.
Swapping: keep programs on disk
Memory allocation changes as processes come into and leave memory.
Shaded regions are unused memory => fragmentation.
Memory compaction: moving processes down to fill in the holes.
Newer practice: the maximum size of a process is known
• Allocating space for a growing data segment (old way)
• Allocating space for a growing stack and data segment (new way):
the memory requirements per process are known in advance.
Two ways to track memory usage
• Processes: A(5), B(6), C(4), D(6), E(3).
• Part of memory with 5 processes and 3 holes
– tick marks show allocation units
– shaded regions are free
• The corresponding bit map.
• The corresponding information as a linked list: P = process, H = hole,
start block, size, and pointer to the next unit.
• An improvement is to maintain separate lists of processes and holes,
the hole list possibly ordered by size for quick search.
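A minimal C sketch of one node of that linked list (field names are illustrative, not from the slides); a doubly linked list makes it easy to merge a terminating process's node with neighboring holes:

struct mem_node {
    enum { PROCESS, HOLE } kind;   /* P or H */
    unsigned start;                /* first allocation unit */
    unsigned length;               /* size in allocation units */
    struct mem_node *prev, *next;  /* doubly linked for easy merging */
};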
A doubly linked list enables easy compaction
Memory allocation algorithms:
1. First fit: the first hole large enough is chosen.
2. Next fit: the same as first fit, but each search starts where the previous
one ended (simulations show it slightly worse than first fit). It is not
suited to an ordered list.
3. Best fit: searches the whole list for the smallest hole that is large
enough. It tends to leave tiny holes that are useless. In a hole list ordered
by size, the first hole that fits is also the best fit.
4. Worst fit: leaves the largest leftover holes; however, simulation shows it
is not good either.
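A sketch of first fit over the node type shown earlier (first_fit is a hypothetical helper; sizes are in allocation units):

struct mem_node *first_fit(struct mem_node *list, unsigned units) {
    for (struct mem_node *n = list; n != NULL; n = n->next)
        if (n->kind == HOLE && n->length >= units)
            return n;   /* caller splits off a smaller hole of length - units */
    return NULL;        /* no hole is large enough */
}

Next fit differs only in remembering the starting point between calls.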
Virtual Memory: The function of the MMU
A job may be larger than physical memory. The program is then split into
overlays; to start running, only the first overlay needs to be in memory.
Paging
The MMU maps virtual pages
onto physical pages (frames).
Page size: 4 to 64 KB.
Therefore, with the example mapping:
mov reg, 0     ; virtual address 0 maps to physical 8192
mov reg, 8192  ; virtual address 8192 maps to physical 24576
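A minimal sketch of the translation the MMU performs, assuming 4 KB pages and a flat page table; only the page 0 -> frame 2 and page 2 -> frame 6 entries follow the example, the rest are illustrative:

#include <stdint.h>

#define PAGE_SIZE 4096u

static const uint32_t page_table[16] = { 2, 1, 6 };  /* remaining entries omitted */

uint32_t translate(uint32_t vaddr) {
    uint32_t vpn    = vaddr / PAGE_SIZE;   /* virtual page number */
    uint32_t offset = vaddr % PAGE_SIZE;   /* unchanged by the MMU */
    return page_table[vpn] * PAGE_SIZE + offset;
}

translate(0) returns 8192 and translate(8192) returns 24576, matching the slide.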
Page Tables
Internal operation of the MMU with sixteen 4 KB pages
A 32-bit address with a two-level page table
The top-level page table points to the second-level page tables.
With a 10-bit PT1 field, a 10-bit PT2 field, and a 12-bit offset, each table
at both levels has 1024 entries. However, a two-level scheme does not need to
keep all the page tables in memory all the time; only the tables actually in
use need be resident.
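The corresponding address split, as C macros (a sketch; the field widths follow the 10 + 10 + 12 division above):

#define PT1(va)    (((va) >> 22) & 0x3FF)   /* index into the top-level table */
#define PT2(va)    (((va) >> 12) & 0x3FF)   /* index into a second-level table */
#define OFFSET(va) ((va) & 0xFFF)           /* byte within the 4 KB page */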
Page table entry
Page frame number: its size depends on the size of physical memory.
Present/absent: the entry is valid.
Protection: three bits, RWX.
Referenced: whether the page has been read or written.
Modified: whether the page has been written into (dirty).
Caching disabled: for memory-mapped I/O.
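One possible encoding of these fields in a 32-bit entry; the bit positions are illustrative, not those of any real MMU:

#define PTE_PRESENT    (1u << 0)   /* entry is valid */
#define PTE_READ       (1u << 1)   /* protection bits: R, W, X */
#define PTE_WRITE      (1u << 2)
#define PTE_EXEC       (1u << 3)
#define PTE_REFERENCED (1u << 4)   /* page has been read or written */
#define PTE_MODIFIED   (1u << 5)   /* page has been written (dirty) */
#define PTE_NOCACHE    (1u << 6)   /* caching disabled (memory-mapped I/O) */
#define PTE_FRAME(pte) ((pte) & 0xFFFFF000u)   /* page frame number */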
Translation Lookaside Buffers (TLBs) speed up the mapping
Page tables are too large to keep anywhere but in memory, so each reference
would cost extra memory accesses. The TLB is a small associative memory holding
entries only for recently used pages, searched in parallel. If a page is not in
the TLB, its entry is loaded from the page table, and the modified bit of the
evicted entry is copied back into the page table. TLB management was initially
done by hardware; on many modern machines a TLB miss is handled by software.
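A software model of the lookup (hardware searches every entry in parallel; the structure and size here are illustrative):

#include <stdbool.h>
#include <stdint.h>

struct tlb_entry { bool valid; uint32_t vpn, frame; bool modified; };
static struct tlb_entry tlb[64];

bool tlb_lookup(uint32_t vpn, uint32_t *frame) {
    for (int i = 0; i < 64; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *frame = tlb[i].frame;   /* hit */
            return true;
        }
    return false;   /* miss: walk the page table; on replacement, copy the
                       evicted entry's modified bit back into the page table */
}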
Inverted Page Tables
The trouble with ordinary page tables, even combined with a TLB, is that they
become too large. Why not keep track only of the pages that are actually in
memory (frames)? The table of active pages has far fewer entries, and the TLB
is loaded from the inverted table. The index into the inverted table is a hash
of the virtual page number; if the virtual page is not found at the hashed
slot, a collision chain must be searched. This approach is used on 64-bit
address machines, such as IBM and HP workstations.
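A sketch of the lookup, assuming a hash-anchor table in front of the frame-indexed entries (all names are illustrative):

#include <stdint.h>

#define HASH_SIZE 4096

struct ipt_entry { uint64_t vpn; int pid; int next; };   /* next = -1 ends chain */
extern struct ipt_entry ipt[];        /* one entry per page frame */
extern int hash_anchor[HASH_SIZE];    /* bucket -> first frame index, or -1 */

int ipt_lookup(uint64_t vpn, int pid) {
    for (int f = hash_anchor[vpn % HASH_SIZE]; f != -1; f = ipt[f].next)
        if (ipt[f].vpn == vpn && ipt[f].pid == pid)
            return f;   /* frame holding this virtual page */
    return -1;          /* not resident: page fault */
}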
Page Replacement Algorithms
• A similar problem exists in caches and web servers: active
pages are kept while inactive ones stay in the background.
• Page fault forces choice
– which page must be removed to make room for incoming page
• Modified page must first be saved
– unmodified just overwrite entry in page table
• Better not to choose an often used page
– will probably need to be brought back in soon
Optimal Page Replacement Algorithm
• Replace the page needed at the farthest point in the future
– Say we use an instruction counter to mark future page
references; replace the page referenced furthest into the future.
– Optimal, but the future is not known.
• Estimate by ...
– Simulating a run and making a prediction. However, the run is
specific to a particular program and particular data.
– The process's past page use: each page is labeled with the
number of instructions since it was last referenced. The more
instructions, the less chance of being referenced again, and the
likelier a candidate for replacement.
– It is impractical to count instructions, so time is counted instead.
Not Recently Used (NRU) Replacement Algorithm
(adequate performance)
• Each page has a Referenced bit and a Modified bit
– the bits are set when the page is referenced and/or modified
– the Referenced bit is cleared on every clock interrupt (20 msec) to
distinguish pages not referenced recently
• Pages are classified:
1. not referenced, not modified
2. not referenced, modified (a)
3. referenced, not modified
4. referenced, modified
• NRU removes a page at random
from the lowest-numbered non-empty class, e.g. from class 1 (if it exists),
then from class 2, etc. Rationale: it is better to replace a modified page
that has not been referenced recently than a clean but heavily used one.
(a) The page was modified earlier; its Referenced bit was since cleared by a clock interrupt.
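The class is just the two bits read as a number; with the slide's 1-based numbering:

int nru_class(int referenced, int modified) {
    return 1 + 2 * referenced + modified;   /* classes 1..4 as listed above */
}

NRU then picks a victim at random among the pages in the lowest non-empty class.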
FIFO Page Replacement Algorithm
Based on age when the page was brought in.
• Maintain a linked list of all pages
– in order they came into memory: oldest pages at the
head, the most recent at the tail of the list.
• Page at beginning of list, the oldest one, is
replaced.
• Disadvantage
– the page that has been in memory the longest may still be heavily used.
Second Chance Page Replacement Algorithm
Inspect the R bit before evicting the oldest page.
Operation of second chance:
– pages are kept in FIFO order (the number above each page is its arrival time)
– if the oldest page has R = 0, it is replaced immediately
– if not, its R bit is cleared and it is moved to the tail with the current
time stamp (= 20), and the next page at the head (now the oldest) is examined
– if all pages have been referenced, A eventually returns to the head of the
list with R = 0 and is replaced, just as in FIFO.
The Clock Page Replacement Algorithm
All page frames are arranged around a clock face, sorted by age (the same as
the linked list). The hand points to the most recently loaded page (the tail).
When a page fault occurs, the hand moves one place clockwise to inspect the
oldest page: if R = 0, replace it; else set R = 0 and its time to the current
time, and inspect the next.
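A minimal sketch in C, keeping the hand on the oldest page (the frame array stands in for the circular list; names are illustrative):

#define NFRAMES 8

struct frame { int r; int page; };
static struct frame frames[NFRAMES];
static int hand = 0;                 /* points at the oldest page */

int clock_evict(void) {
    for (;;) {
        if (frames[hand].r == 0) {   /* old and not referenced: evict */
            int victim = hand;
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        frames[hand].r = 0;          /* referenced: give it a second chance */
        hand = (hand + 1) % NFRAMES;
    }
}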
Least Recently Used (LRU)
• Assume pages used recently will be used again soon
– throw out page that has been unused for longest time
• Must keep a linked list of pages
– most recently used at front, least at rear
– update this list every memory reference !!
• Alternatively, keep a reference counter in each
frame table entry
– choose the page with the lowest counter value
– periodically zero the counters
Simulating LRU in Software (1)
LRU using an n x n matrix, with n frames.
On every reference to frame k, set row k to all 1s, then reset column k to all 0s.
At a page fault, the frame whose row has the lowest binary value is the LRU candidate.
The example shows pages referenced in the sequence 0, 1, 2, 3, 2, 1, 0, 3, 2, 3.
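A sketch of the matrix bookkeeping, assuming n = 4 frames; rows are compared as binary numbers, column 0 being the most significant bit:

#define N 4
static unsigned char M[N][N];

void reference(int k) {                        /* page in frame k referenced */
    for (int j = 0; j < N; j++) M[k][j] = 1;   /* set row k to all 1s */
    for (int i = 0; i < N; i++) M[i][k] = 0;   /* then clear column k */
}

int lru_frame(void) {              /* frame whose row is numerically smallest */
    int best = 0;
    for (int i = 1; i < N; i++)
        for (int j = 0; j < N; j++) {
            if (M[i][j] == M[best][j]) continue;
            if (M[i][j] < M[best][j]) best = i;
            break;
        }
    return best;
}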
Simulating LRU in Software (2)
Counting page references can be simulated by accumulating the R bits every
tick. However, that window into the past is infinite: nothing is ever
forgotten. The aging approach below uses a window of 8 ticks.
R-bit counters for 6 pages over 5 clock ticks, (a) - (e).
The LRU page is the one with the smallest binary number.
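A sketch of one aging step, with the 8-bit (8-tick) window from the slide:

#include <stdint.h>

void age_tick(uint8_t counter[], int r[], int npages) {
    for (int i = 0; i < npages; i++) {
        /* shift right, then insert this tick's R bit at the left */
        counter[i] = (uint8_t)((counter[i] >> 1) | (r[i] << 7));
        r[i] = 0;   /* clear R for the next tick */
    }
}

On a fault, the page with the smallest counter value is evicted.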
The Working Set Page Replacement Algorithm (1)
• The working set is the set of pages used by the k
most recent memory references
• w(k, t) is the size of the working set at time t
The working set algorithm
Every clock tick, for each frame:

if (R == 1) {                     /* referenced during the last tick */
    R = 0;
    time_of_last_use = VT;        /* VT = current virtual time */
}

At each page fault the entire frame table has to
be scanned to find the proper page to evict:

evicted = false;
for (each frame f) {
    age = VT - f.time_of_last_use;
    if (f.R == 0 && age > tau) {  /* older than tau: outside the working set */
        evict(f);
        evicted = true;
        break;
    }
    /* remember the frame with the greatest age (oldest time of last use) */
    noted = oldest frame seen so far;
}
if (!evicted && noted.R == 0)
    evict(noted);                 /* oldest page, even if in the working set */
else if (!evicted)
    evict the next-oldest frame with R == 0;
if (!evicted)
    evict a random frame;         /* every frame was referenced this tick */
WSClock Algorithm
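The WSClock variant (figure not reproduced) keeps the frames on a circular list, each carrying a time of last use and an R bit; the hand advances as in the clock algorithm but applies the working set test. A minimal sketch, assuming a threshold tau and omitting the scheduling of dirty-page writes:

#define NFRAMES 8

struct ws_frame { int r, dirty; long last_use; };
static struct ws_frame frm[NFRAMES];
static int whand = 0;

int wsclock_evict(long now, long tau) {
    for (int scanned = 0; scanned < 2 * NFRAMES; scanned++) {
        struct ws_frame *f = &frm[whand];
        if (f->r) {                    /* referenced: keep in working set */
            f->r = 0;
            f->last_use = now;
        } else if (now - f->last_use > tau && !f->dirty) {
            return whand;              /* old and clean: claim this frame */
        }                              /* old but dirty: a real kernel would
                                          schedule the write and move on */
        whand = (whand + 1) % NFRAMES;
    }
    return whand;                      /* fallback if no frame qualified */
}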
Review of Page Replacement Algorithms
Modeling Page Replacement Algorithms
Belady's anomaly:
FIFO with 3 frames can have fewer page faults than with 4 frames.
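A small C program that reproduces the anomaly with the classic reference string 0 1 2 3 0 1 4 0 1 2 3 4 (9 faults with 3 frames, 10 with 4):

#include <stdio.h>

static int fifo_faults(const int *refs, int n, int nframes) {
    int frames[8], head = 0, used = 0, faults = 0;   /* up to 8 frames */
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) hit = 1;
        if (hit) continue;
        faults++;
        if (used < nframes) frames[used++] = refs[i];        /* fill free frame */
        else { frames[head] = refs[i]; head = (head + 1) % nframes; }  /* FIFO */
    }
    return faults;
}

int main(void) {
    const int refs[] = {0,1,2,3,0,1,4,0,1,2,3,4};
    printf("3 frames: %d faults\n", fifo_faults(refs, 12, 3));   /* prints 9  */
    printf("4 frames: %d faults\n", fifo_faults(refs, 12, 4));   /* prints 10 */
    return 0;
}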
Stack Algorithms for LRU
State of the memory array M after each item in the reference string is
processed; the top n rows of M are the pages in the n frames, the rows below
them are the pages out in virtual memory.
The Distance String
Probability density functions for two hypothetical distance strings.
Computation of the page fault rate from the distance string
(Figure: the C vector counts how many times each distance occurs in the
distance string; the F vector gives the resulting number of page faults for
each memory size.)
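With m frames, every reference whose distance exceeds m faults, so F[m] is the tail sum of C plus the count of infinite distances. A sketch (arrays indexed 1..n):

void faults_from_counts(const int C[], int c_inf, int F[], int n) {
    int tail = c_inf;       /* references at distance "infinity" always fault */
    for (int m = n; m >= 1; m--) {
        F[m] = tail;        /* faults with m frames: all distances > m */
        tail += C[m];
    }
}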
Design Issues for Paging Systems
Local versus Global Allocation Policies (1)
• Original configuration
• Local page replacement
• Global page replacement
Local versus Global Allocation Policies (2)
Page fault rate as a function of the number of
page frames assigned
Load Control
• Despite good designs, the system may still thrash
• When the PFF (page fault frequency) algorithm indicates
– some processes need more memory
– but no processes need less
• Solution:
reduce the number of processes competing for memory
– swap one or more to disk, dividing up the pages they held
– reconsider the degree of multiprogramming
Page Size (1)
Small page size
• Advantages
– less internal fragmentation
– better fit for various data structures, code sections
– less unused program in memory
• Disadvantages
– programs need many pages, larger page tables
Page Size (2)
• Overhead due to page table and internal fragmentation:

overhead = se/p + p/2

where the first term is the page table space and the second is the internal
fragmentation loss.
• Where
– s = average process size in bytes
– p = page size in bytes
– e = size of a page table entry in bytes
• Overhead is minimized when p = sqrt(2se).
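For example, with s = 1 MB and e = 8 bytes per entry, the optimum page size is p = sqrt(2 * 2^20 * 8) = sqrt(2^24) = 4 KB (the values are illustrative).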
Separate Instruction and Data Spaces
• One address space
• Separate I and D spaces
Two processes sharing the same program
Cleaning Policy
• Need for a background process, the paging daemon
– periodically inspects the state of memory
• When too few frames are free
– selects pages to evict using a replacement algorithm
• It can use the same circular list (clock)
– as the regular page replacement algorithm, but with a different pointer
(a two-handed clock)
Implementation Issues
Operating System Involvement with Paging
Four times when the OS is involved with paging:
1. Process creation
– determine program size
– create page table
2. Process execution
– MMU reset for the new process
– TLB flushed
3. Page fault time
– determine the virtual address causing the fault
– swap the target page out, the needed page in
4. Process termination time
– release page table, pages
Page Fault Handling (1)
1. Hardware traps to the kernel
2. General registers saved
3. OS determines which virtual page is needed
4. OS checks validity of the address, seeks a page frame
5. If the selected frame is dirty, write it to disk
Page Fault Handling (2)
6. OS schedules the new page to be brought in from disk
7. Page tables updated
8. Faulting instruction backed up to the state when it began
9. Faulting process scheduled
10. Registers restored
11. Program continues
Instruction Backup
An instruction causing a page fault
Locking Pages in Memory
• Virtual memory and I/O occasionally interact
• A process issues a call to read from a device into a buffer
– while it waits for the I/O, another process starts up
– and has a page fault
– the buffer of the first process may be chosen to be paged out
• Need to specify some pages as locked
– exempt from being chosen as target pages
Backing Store
(a) Paging to static swap area
(b) Backing up pages dynamically
Separation of Policy and Mechanism
Page fault handling with an external pager
Segmentation (1)
• One-dimensional address space with growing tables
• One table may bump into another
Segmentation (2)
Allows each table to grow or shrink, independently
Segmentation (3)
Comparison of paging and segmentation
Implementation of Pure Segmentation
(a)-(d) Development of fragmentation
(e) Removal of the fragmentation by compaction
MULTICS: Segmentation with Paging
Each segment descriptor points to a page table.
A 34-bit MULTICS virtual address
Conversion of a MULTICS address into a main
memory address
Simplified version of the MULTICS TLB
• The existence of two page sizes makes the actual TLB more complicated
Pentium: selector chooses a segment descriptor
Pentium code segment descriptor
Data segments differ slightly
Conversion of a (selector, offset) pair into a linear address
Segmentation with Paging: Pentium
Mapping of a linear address onto a physical address
Segmentation with Paging: Pentium (5)
Protection on the Pentium: four privilege levels (rings).