Memory Management and Virtual Memory

Download Report

Transcript Memory Management and Virtual Memory

Memory Management and Virtual Memory
• Background
• Overlays versus Swapping
• Contiguous Allocation
Operating System Concepts
9.1
Memory-Management Unit (MMU)
• The run-time mapping from virtual to physical addresses is
done by the memory-management unit (MMU), which is a
hardware device.
• The user program deals with logical addresses; it never sees
the real physical addresses.
• Two different types of addresses:
– Logical: range 0 to max.
– Physical: range R+0 to R+max; where R is a base value.
• The user generates only logical addresses and thinks that
the process runs in locations 0 to max.
Operating System Concepts
9.2
Memory-Management Unit (MMU) (Cont.)
• In MMU scheme, the value in the relocation register is added
to every address generated by a user process at the time it
is sent to memory.
• Example: Dynamic relocation using a relocation register.
Logical
address
345
Relocation
register
1400
CPU
Memory
+
MMU
Operating System Concepts
Physical
address
1745
9.3
Overlays
• The entire program and data of a process must be in the
physical memory for the process to execute.
• The size of a process is limited to the size of physical memory.
• If a process is larger than the amount of memory, a technique
called overlays can be used.
• Overlays is to keep in memory only those instructions and data
that are needed at any given time.
• When other instructions are needed, they are loaded into space
that was occupied previously by instructions that are no longer
needed.
• Overlays are implemented by user, no special support needed
from operating system, programming design of overlay
structure is complex.
Operating System Concepts
9.4
Overlays Example
• Example: Consider a two-pass assembler.
– Pass1 constructs a symbol table.
– Pass2 generates machine-language code.
– Assume the following:
Size (k = 1024 bytes)
Pass1
70k
Pass2
80k
Symbol table
20k
Common routines
30k
Total size
200k
– To load everything at once, we need 200k of memory.
Operating System Concepts
9.5
Overlays Example (Cont.)
• If only 150K is available, we cannot run our process.
• Notice that Pass1 and Pass2 do not need to be in memory
at same time.
• So, we define two overlays:
– Overlay A: symbol table, common routines, and Pass1.
– Overlay B: symbol table, common routines, and Pass2.
• We add overlay driver 10k and start with overlay A in
memory.
• When finish Pass1, we jump to overlay driver, which reads
overlay B into memory overwriting overlay A and transfer
control to Pass2.
Operating System Concepts
9.6
Overlays Example (Cont.)
• Overlay A needs 130k and Overlay B needs 140k.
Symbol table
Common routines
Overlay driver
Pass2 (80k)
Pass1 (70k)
Operating System Concepts
9.7
Swapping
• A process can be swapped temporarily out of memory to a backing
store, and then brought back into memory for continued execution.
• Backing store – fast disk large enough to accommodate copies of
all memory images for all users; must provide direct access to
these memory images.
• Roll out, roll in – swapping variant used for priority-based
scheduling algorithms; lower-priority process is swapped out so
higher-priority process can be loaded and executed.
• Normally, a process that is swapped out will be swapped back into
the same memory space that it occupied previously.
• Example: In a multiprogramming environment with Round Robin
CPU scheduling, when time slice expires, the memory manager
swap out the process that just finished, and swap in another
process to the memory space that has been freed.
Operating System Concepts
9.8
Schematic View of Swapping
• Major part of swap
time is transfer time;
total transfer time is
directly proportional to
the amount of memory
swapped.
• The context-switch
time in swapping
system is high.
• Modified versions of
swapping are found on
many systems, i.e.,
UNIX and Microsoft
Windows.
Operating System Concepts
9.9
Example 1: Swapping
• Let Process P1 size is 100KB and the transfer rate of a hard
disk is 1MB/second.
– To transfer P1 (100KB) to or from memory takes
100K/1000K per second, which is 1/10 second = 100
milliseconds.
– Assume the average latency is 8 milliseconds, then to
swap in or out takes 108 milliseconds. The total swap (in
and out) is 216 milliseconds for 100KB process.
• For efficient CPU utilization, we want our execution time for
each process to be long relative to the swap time.
• A Round Robin scheduling algorithm, the time slice should
be larger than 216 milliseconds (from the above example).
Operating System Concepts
9.10
Contiguous Allocation
• Main memory usually is
divided into two partitions:
– For the resident
operating system
– For the user processes.
0
O.S.
100K
User
1MB
Operating System Concepts
9.11
Example 2: Swapping
• A computer system has 1MB of
main memory. The O.S. takes
100KB.
– Maximum size of user
process is 900KB.
– User process may be smaller
than 900KB.
– From previous example if
process P1 is 100K the swap
time is 108 milliseconds. But,
if P1 is 900K then the swap
time is 908 milliseconds.
• As the size of a process increases
the swap time increases too.
Operating System Concepts
9.12
0
O.S.
100K
100K
User
900K
1MB
Main Memory
Single – Partition Allocation
• Single-partition allocation
– Needs Protection.
– Protect O.S. code and data from changes by the user
processes.
– Protect user processes from one another.
– We can provide protection by using a relocationregister with a limit register.
– Relocation register contains value of smallest
physical address.
– Limit register contains range of logical addresses.
– Each logical address must be less than the limit
register.
Operating System Concepts
9.13
Hardware Support for Relocation & Limit Registers
Limit
Register
Relocation
Register
Memory
Yes
CPU
+
<
Logical
Address
Physical
Address
No
Trap, Address
Error
Operating System Concepts
9.14
Multiple - Partition Allocation
• Several user processes residing in memory at the same
time.
• Memory can be divided into a number of fixed-sized
partitions, where each partition may contain exactly one
process. Therefore, the degree of multiprogramming is
bound by the number of partitions.
• Or all memory is available for user processes as one
large block (hole).
• When a process arrives and needs memory, we search
for a hole large enough for this process.
• If we find a space, we allocate only as much memory as
is needed, keeping the rest available to satisfy future
requests.
Operating System Concepts
9.15
Multiple - Partition Allocation (Cont.)
• Hole – block of available memory; holes of various size
are scattered throughout memory.
• Operating system maintains information about:
a) allocated partitions
b) free partitions (hole)
OS
OS
OS
OS
process 5
process 5
process 5
process 5
process 9
process 9
process 8
process 2
Operating System Concepts
process 10
process 2
process 2
9.16
process 2
Multiple - Partition Allocation (Cont.)
• When no available block of memory (hole) is large enough to
hold process, the O.S. waits until a large block is available.
• In general, there is at any time a set of holes, of various
sizes, scattered throughout memory.
• When a process arrives and needs memory, we search this
set for a hole that is large enough for this process.
• If the hole is too large, it is split into two:
– One part is allocated to the arriving process.
– The other is returned to the set of holes.
• When a process terminates, it releases its block of memory,
which is then placed back in the set of holes.
• If the new hole is adjacent to other holes, we merge these
adjacent holes to form one larger hole.
Operating System Concepts
9.17
Dynamic Storage-Allocation Problem
• First-fit, best-fit, and worst fit are the most common strategies
•
used to select a free hole from the set of available holes.
– First-fit: Allocate the first hole that is big enough.
Searching starts at the beginning of the set of holes. We
can stop searching as soon as we find a free hole that is
large enough.
– Best-fit: Allocate the smallest hole that is big enough;
must search entire list, unless ordered by size. Produces
the smallest leftover hole.
– Worst-fit: Allocate the largest hole; must also search
entire list, unless ordered by size. Produces the largest
leftover hole, which may be more useful than the smaller
leftover hole from best-fit approach.
First-fit and best-fit better than worst-fit in terms of speed and
storage utilization.
Operating System Concepts
9.18
Fragmentation
• External fragmentation – total memory
space exists to satisfy a request, but it
is not contiguous; storage is
fragmented into a large number of small
holes.
0
400K
P1
1000K
• Example: We have a total external
fragmentation of (300+260)=560KB.
– If P5 is 500KB, then this space
would be large enough to run P5.
But the space is not contiguous.
• The selection of first-fit versus best-fit
can affect the amount of fragmentation.
P4
1700K
Free
2000K
P3
2300K
Free
2560K
Operating System Concepts
9.19
O.S.
Fragmentation (Cont.)
• Internal fragmentation – Memory
that is internal to a partition, but is
not being used, because it is too
small.
O.S.
• Example: Assume next request is
for 18462 bytes.
– If we allocate exactly the
requested block, we are left
with a hole of 2 bytes.
• The overhead to keep track of this
hole will be larger than the hole
itself. So, we ignore this small hole
(internal fragmentation).
Operating System Concepts
9.20
P7
Hole of
18464
bytes
Free
P4
Virtual Memory
• Background
• Demand Paging
• Performance of Demand Paging
• Page Replacement
• Page-Replacement Algorithms
Operating System Concepts
9.21
Background (Cont.)
• Virtual memory is the separation of user logical memory
from physical memory. This separation allows an
extremely large virtual memory to be provided for
programmers when only a smaller physical memory is
available.
– Only part of the program needs to be in memory for
execution.
– Logical address space can therefore be much larger
than physical address space.
– Need to allow pages to be swapped in and out.
• Virtual memory can be implemented via:
– Demand paging
– Demand segmentation
Operating System Concepts
9.22
Diagram: virtual memory larger than physical memory
Page 0
Page 1
Page 2
Page n
Virtual
Memory
Memory
Map
Physical
Memory
Disk
Space
Operating System Concepts
9.23
Transfer of a Pages Memory to Contiguous Disk Space
Program A
Swap Out
Program B
Main
Memory
0
1
2
3
4
5
6
7
8
Swap In
Disk Space
Operating System Concepts
9.24
Valid-Invalid bit
• We need hardware support to distinguish between those
pages that are in memory and those pages are on the disk.
• The valid-invalid bit scheme can be used:
– Valid: indicates that the associated pages is both legal
and in memory.
– Invalid: indicates that the page either is not valid (not in
logical address space) or is valid but is currently on the
disk.
• What happens if the process tries to use a page that was
not brought into memory?
• Access to a page marked invalid causes a page-fault trap.
Operating System Concepts
9.25
Valid-Invalid Bit (Cont.)
• With each page table entry a valid–invalid bit is associated
(1  in-memory, 0  not-in-memory)
• Initially valid–invalid but is set to 0 on all entries.
• Example of a page table snapshot.
Frame # valid-invalid bit
1
1
1
1
0

0
0
page table
• During address translation, if valid–invalid bit in page table
entry is 0  page fault.
Operating System Concepts
9.26
Page Table: when some pages are not in main memory
0
A
1
B
2
C
Valid-Invalid
Framebit
#
0
3
4
4
5
6
6
4
v
i
G
4
7
i
F
3
Logical
Memory
Operating System Concepts
9
6
v
9
i
7
i
C
7
8
6
A
5
i
H
5
2
3
E
2
1
v
D
1
0
C
F
Physical
Memory
9.27
A
B
D
E
F
Disk Space
Steps in Handling a Page Fault (Cont.)
1. We check an internal table for this process, to determine
whether the reference was a valid or invalid memory access.
2. If the reference was invalid, we terminate process. If it was
valid, but we have not yet brought in that page, we now page
in the latter.
3. We find a free frame.
4. We schedule a disk operation to read the desired page into
the newly allocated frame.
5. When the disk read is complete, we modify the internal table
kept with the process and the page table to indicate that the
page is now in memory.
6. We restart the instruction that was interrupted by the illegal
address trap. The process can now access the page as
though it had always been in memory.
Operating System Concepts
9.28
What happens if there is no free frame?
• Page replacement – find some page in
memory, but not really in use, swap it out.
– Algorithm.
– Performance – want an algorithm which will
result in minimum number of page faults.
• Same page may be brought into memory
several times.
Operating System Concepts
9.29
Performance of Demand Paging
• Effective Access Time (EAT) for a demand-paged memory.
• Memory Access Time (ma) for most computers now ranges
from 10 to 200 nanoseconds.
• If there is no page fault, then EAT = ma.
• If there is page fault, then
EAT = (1 – p) x (ma) + p x (page-fault time).
p: the probability of a page fault (0  p  1),
we expect p to be close to zero ( a few page faults).
If p=0 then no page faults, but if p=1 then every reference
is a fault
• If a page fault occurs, we must first read the relevant page
from disk, and then access the desired word.
Operating System Concepts
9.30
Performance of Demand Paging (Cont.)
•
•
We are faced with three major components of the pagefault service time:
1. Service the page-fault interrupt.
2. Read in the page.
3. Restart the process.
A typical hard disk has:
– An average latency of 8 milliseconds.
– A seek of 15 milliseconds.
– A transfer time of 1 milliseconds.
– Total paging time = (8+15+1)= 24 milliseconds,
including hardware and software time, but no
queuing (wait) time.
Operating System Concepts
9.31
Demand Paging Example 1
• Assume an average page-fault service time of 25
milliseconds (10-3), and a Memory Access Time of 100
nanoseconds (10-9). Find the Effective Access Time?
• Solution: Effective Access Time (EAT)
= (1 – p) x (ma) + p x (page fault time)
= (1 – p) x 100 + p x 25,000,000
= 100 – 100 x p + 25,000,000 x p
= 100 + 24,999,900 x p.
• Note: The Effective Access Time is directly proportional to
the page-fault rate.
Operating System Concepts
9.32
Page Replacement
• Example: Assume each process contains 10 pages and
uses only 5 pages.
– If we had 40 frames in physical memory then we can
run 8 processes instead of 4 processes. Increasing
the degree of multiprogramming.
– If we run 6 processes (each of which is 10 pages in
size), but uses only 5 pages. We have higher CPU
utilization and throughput, also 10 frames to spare
(i.e., 6x5=30 frames needed out of 40 frames).
– It is possible each process tries to use all 10 of its
pages, resulting in a need for 60 frames when only
40 are available.
Operating System Concepts
9.33
Need for Page Replacement
0
H
1
Load M
2
J
3
M
Logical
Memory for
user 1
Valid-invalid
Frame bit
0 #3
v
1
4
v
2
5
v
3
i
A
1
B
0
2
D
1
3
E
2
2
v
3
7
v
Logical
Memory for
user 2
Operating System Concepts
6
v
i
2
D
3
H
5
J
6
A
7
E
Physical
Memory
Page table
for user 2
9.34
Disk
1
4 Load M
Page table
for user 1
0
0 monitor
B
M
The Operating System has Several Options
1. Terminate user process.
2. Swap out a process, freeing all its frames, and
reducing the level of multiprogramming.
3. Page replacement takes the following approach:
• If no frames is free, find one that is not currently
•
being used and free it.
We can free a frame by writing its contents to swap
space, and changing the page table to indicate that
the page is no longer in memory.
Operating System Concepts
9.35
Page Replacement
Swap out victim
page
Valid-invalid bit
Frame
#
O
i
(2) Change to
F
v
invalid
(4) Reset page
table
Page table
for new
page
Operating System Concepts
F
victim
Disk
(1
)
(3)
Physical
Memory
9.36
Swap
desired
page in
Page Replacement (Cont.)
•
The page-fault service time is now modified to include
page replacement:
1. Find the location of the desired page on the disk.
2. Find a free frame:
– If there is a free frame use it.
– Otherwise, use a page-replacement algorithm to
select a victim frame.
– Write the victim page to the disk; change the page
and frame tables accordingly.
3. Read the desired page into the newly free frame;
change the page and frame tables.
4. Restart the user process.
Operating System Concepts
9.37
Page Replacement (Cont.)
• Note: If no frames are free, two page transfers (one out
and one in) are required. This doubles the page-fault
service time and will increase the effective access time
accordingly.
• This overhead can be reduced by the use of a modify
(dirty) bit.
• Each page or frame may have a modify bit associated
with it in the hardware.
• To implement demand paging, we must develop:
– Frame-allocation algorithm, to decide how many
frames to allocate to each process.
– Page-replacement algorithm, to select the frames
that are to be replaced.
Operating System Concepts
9.38
Page-Replacement Algorithms
• We want a page replacement algorithm with the lowest
page-fault rate.
• We evaluate an algorithm by running it on a particular
string of memory references (reference string) and
computing the number of page faults on that string.
• The string of memory references is called a reference
string.
• We can generate reference strings by tracing a given
system and recording the address of each memory
reference.
• In our examples, the reference string is
1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.
Operating System Concepts
9.39
Page-Replacement Algorithms (Cont.)
• Example: If we trace a particular process, we might record
the following address sequence:
0100, 0432, 0101, 0102, 0609, 0601, 0612
1
4
6
1
This is reduced to the following reference string: 1, 4, 1, 6
• As the number of frames available increases, the number of
page faults will decrease.
• From the above example: If we had 3 or more frames, we
would have only 3 faults, one fault for the first reference to
each page. If we had only one frame, we would have a
replacement with every reference resulting in 4 faults.
Operating System Concepts
9.40
First-In-First-Out (FIFO) Algorithm
• A FIFO algorithm associates with each page the time when
that page was brought into memory.
• When a page must be replaced, the oldest page is chosen.
• Or we can use a FIFO queue to hold all pages in memory.
• We replace the page at the head of the queue.
• When a page is brought into memory, we insert it at the tail
of the queue.
Out
In
Front
Tail
Operating System Concepts
9.41
Example: FIFO Algorithm
•
•
•
Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
Let 3 frames are initially empty (3 pages can be in memory at a time per
process).
The first 3 references (1, 2, 3) cause page faults, and are brought into
these empty frames.
1 1 4 5
2 2 1 3 9 page faults
3 3 2 4
•
4 frames
1 1 5 4
2 2 1 510 page faults
3 3 2
4 4 3
Operating System Concepts
9.42
Optimal (OPT) Algorithm
• An optimal algorithm has the lowest page-fault rate of
all algorithms.
• An optimal algorithm will never suffer from Belady’s
anomaly.
• Replace the page that will not be used for the longest
period of time.
• This algorithms guarantees the lowest possible pagefault rate for a fixed number of frames.
• The optimal algorithm is difficult to implement, because
it requires future knowledge of the reference string.
• Similar situation with Shortest-Job-First in CPU
scheduling.
Operating System Concepts
9.43
Example: OPT Algorithm
• Initially 4 frames empty.
• Reference String: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
1 4
2
6 page faults
3
4 5
• How do you know this?
• Used for measuring how well your algorithm performs.
Operating System Concepts
9.44
Least Recently Used (LRU) Algorithm
• The key distinction between FIFO and OPT algorithms is
that FIFO uses the time when a page was brought into
memory; the OPT uses the time when a page is to be
used (future).
• LRU algorithm uses the time when a page has not been
used for the longest period of time (Past).
• LRU replacement associates with each page the time of
that page’s last use.
Operating System Concepts
9.45
Example: LRU Algorithm
• Looking backward in time.
• Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
1
5
2
3 5 4
4 3
8 page
faults
• Note: Number of page faults (on same reference string) using:
– LRU is 8.
– FIFO is 10.
– OPT is 6.
Operating System Concepts
9.46