CS 519 -- Operating Systems
CS 519: Lecture 3
Memory Management
Requirements for memory management strategy:
Consistency: all address spaces look “basically the same”
Relocation: processes can be loaded at any physical address
Protection: a process cannot maliciously access memory
belonging to another process
Sharing: may allow sharing of physical memory (must
implement control)
CS 519
2
Operating System Theory
Basic Concepts: Memory Partitioning
Static: a process must be loaded into a partition of equal or
greater size => Internal fragmentation
Dynamic: each process is loaded into a partition of exact size =>
External fragmentation
[Diagram: a new job being loaded into a memory partition]
Basic Concepts: Pure Paging and
Segmentation
Paging: memory divided into equal-sized frames. All
process pages loaded into non-necessarily contiguous
frames
Segmentation: each process divided into variable-sized
segments. All process segments loaded into
dynamic partitions that are not necessarily contiguous
More details in the context of Virtual Memory
Memory Hierarchy
[Diagram: memory hierarchy of registers, cache, and memory]
Question: What if we want to support programs that
require more memory than what’s available in the
system?
Memory Hierarchy
[Diagram: memory hierarchy of registers, cache, and memory, extended with virtual memory]
Answer: Pretend we had something bigger
=> Virtual Memory
Virtual Memory
Virtual memory is the OS abstraction that gives the
programmer the illusion of an address space that may
be larger than the physical address space
Virtual memory can be implemented using either
paging or segmentation but paging is most common
Virtual memory is motivated by both
Convenience: the programmer does not have to deal with the
fact that individual machines may have very different
amounts of physical memory
Higher degree of multiprogramming: processes are not
loaded as a whole. Rather they are loaded on demand
Virtual Memory: Paging
A page is a cacheable unit of virtual memory
The OS controls the mapping between pages of VM
and physical memory
More flexible (at a cost)
[Diagram: pages of VM map to frames of memory, just as memory blocks map into a cache]
Virtual Memory: Segmentation
[Diagram: segments of Job 0 and Job 1 loaded into physical memory]
Hardware Translation
[Diagram: processor sends virtual addresses through a translation box (MMU) to physical memory]
Translation from virtual to physical can be done in software
However, hardware support is needed to ensure protection and
perform translation faster
Simplest solution with two registers: base and size
Segmentation Hardware
[Diagram: virtual address = (segment #, offset); the segment number indexes the segment table, and segment base + offset forms the physical address]
Segments are of variable size
Translation is done through a set of (base, size, state) registers: the segment table
State: valid/invalid, access permission, reference, and modified
bits
Segments may be visible to the programmer and can be used as
a convenience for organizing the programs and data (e.g., code
segment or data segments)
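As a sketch of the hardware check described above (the function name and table layout are illustrative, not from the slides), a segment lookup validates the entry, bounds-checks the offset against the segment size, and adds the base:

```python
def translate_seg(seg_table, seg, offset):
    """Translate a (segment #, offset) virtual address using a table of
    (base, size, valid) entries, as the segmentation hardware would."""
    base, size, valid = seg_table[seg]
    if not valid:
        raise MemoryError("invalid segment")        # trap to the OS
    if offset >= size:
        raise MemoryError("offset exceeds segment size")
    return base + offset

# Example: segment 0 starts at physical 0x4000 and is 0x1000 bytes long
seg_table = {0: (0x4000, 0x1000, True), 1: (0x8000, 0x0800, False)}
```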
Paging Hardware
[Diagram: virtual address = (page #, offset); the page number indexes the page table, and frame number + offset forms the physical address]
Pages are of fixed size
The physical memory corresponding to a page is called a page frame
Translation done through a page table indexed by page number
Each entry in a page table contains the physical frame number that the
virtual page is mapped to and the state of the page in memory
State: valid/invalid, access permission, reference, modified, and
caching bits
Paging is transparent to the programmer
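The translation above can be sketched in a few lines (a minimal model, assuming 4 KB pages and a dictionary standing in for the page table):

```python
PAGE_SIZE = 4096                         # 4 KB pages => 12 offset bits
OFFSET_BITS = PAGE_SIZE.bit_length() - 1

def translate(page_table, vaddr):
    """Split a virtual address into (page #, offset), look the page number
    up in the page table, and rebuild the physical address from the frame."""
    page = vaddr >> OFFSET_BITS
    offset = vaddr & (PAGE_SIZE - 1)
    entry = page_table.get(page)
    if entry is None or not entry["valid"]:
        raise MemoryError(f"page fault at {vaddr:#x}")   # trap to the OS
    return (entry["frame"] << OFFSET_BITS) | offset
```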
Combined Paging and Segmentation
Some MMUs combine paging with segmentation
Virtual address: segment number + page number + offset
Segmentation translation is performed first
The segment entry points to a page table for that segment
The page number is used to index the page table and look up the
corresponding page frame number
Segmentation is not used much anymore, so we'll focus on paging
UNIX has a simple form of segmentation but does not require any
hardware support
Paging: Address Translation
[Diagram: CPU issues virtual address (p, d); the page table maps page p to frame f; the physical address (f, d) goes to memory]
Translation Lookaside Buffers
Translation on every memory access must be fast
What to do? Caching, of course …
Why does caching work? Temporal locality!
Same as normal memory cache – cache is smaller so can
spend more $$ to make it faster
Cache for page table entries is called the Translation
Lookaside Buffer (TLB)
Typically fully associative
No more than 64 entries
On every memory access, we look for the page
frame mapping in the TLB
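A tiny model of such a fully associative TLB (the slides leave the replacement policy open; FIFO eviction is one assumption here, and the class layout is illustrative):

```python
from collections import OrderedDict

class TLB:
    """Fully associative TLB with a FIFO replacement policy."""
    def __init__(self, capacity=64):
        self.capacity = capacity
        self.entries = OrderedDict()              # page # -> frame #

    def lookup(self, page):
        return self.entries.get(page)             # None means a TLB miss

    def insert(self, page, frame):
        if page not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)       # evict the oldest entry
        self.entries[page] = frame
```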
Paging: Address Translation
[Diagram: CPU issues virtual address (p, d); the TLB supplies the p/f mapping; the physical address (f, d) goes to memory]
TLB Miss
What if the TLB does not contain the right PT entry?
TLB miss
Evict an existing entry if the TLB does not have any free ones
Replacement policy?
Bring in the missing entry from the PT
TLB misses can be handled in hardware or software
Software allows application to assist in replacement
decisions
Where to Store Address Space?
Virtual address space may be larger than physical
memory
Where do we keep it?
Where do we keep the page table?
Where to Store Address Space?
On the next device down our storage hierarchy, of course …
[Diagram: VM spans memory and disk]
Where to Store Page Table?
In memory, of course …
[Diagram: P0's and P1's page tables stored in the OS area of memory, alongside process code, globals, stack, and heap]
Interestingly, we use memory to "enlarge" the view of memory,
leaving LESS physical memory for processes
This kind of overhead is common
Gotta know what the right trade-off is
Have to understand common application characteristics
Have to be common enough!
Page Table Structure
[Diagram: the master PT and kernel PT are non-pageable; the second-level PTs for P0 and P1 are pageable]
Page table can become huge
What to do?
Two-level PT: saves memory but requires two lookups per access
Page the page tables
Inverted page tables (one entry per page frame in physical memory):
translation through hash tables
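To see why a flat page table "can become huge", here is the arithmetic for a hypothetical 32-bit address space with 4 KB pages and 4-byte page table entries (the parameter choices are illustrative, not from the slides):

```python
vbits, offset_bits, pte_bytes = 32, 12, 4

entries = 2 ** (vbits - offset_bits)      # 2^20 = 1,048,576 PTEs
flat_pt_bytes = entries * pte_bytes       # 4 MiB per process, even if mostly unused

# Two-level split: 10-bit outer (master) index, 10-bit inner index
outer_bytes = 2 ** 10 * pte_bytes         # 4 KiB master page table
inner_bytes = 2 ** 10 * pte_bytes         # 4 KiB per second-level table
# Only second-level tables for regions actually in use need to exist
```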
Demand Paging
To start a process (program), just load the code page
where the process will start executing
As process references memory (instruction or data)
outside of loaded page, bring in as necessary
How to represent the fact that a page of VM is not yet in
memory?
[Diagram: VM pages A, B, C with page table entries 0-2; page A is valid and mapped to a memory frame, while B and C are invalid and reside on disk]
Page Fault
What happens when a process references a page
marked as invalid in the page table?
Page fault trap
Check that reference is valid
Find a free memory frame
Read desired page from disk
Change valid bit of page to v
Restart instruction that was interrupted by the trap
Is it easy to restart an instruction?
What happens if there is no free frame?
Page Fault (Cont’d)
So, what can happen on a memory access?
1. TLB miss → read page table entry
2. TLB miss → read kernel page table entry
3. Page fault for necessary page of process page table
4. All frames are used → need to evict a page → modify a
process page table entry
   1. TLB miss → read kernel page table entry
   2. Page fault for necessary page of process page table
   3. Go back
5. Read in needed page, modify page table entry, fill TLB
Cost of Handling a Page Fault
Trap, check page table, find free memory frame (or find victim)
… about 200 - 600 µs
Disk seek and read … about 10 ms
Memory access … about 100 ns
Page fault degrades performance by ~100000!!!!!
And this doesn’t even count all the additional things that can
happen along the way
Better not have too many page faults!
If want no more than 10% degradation, can only have 1 page
fault for every 1,000,000 memory accesses
OS must do a great job of managing the movement of data
between secondary storage and main memory
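The degradation arithmetic above can be checked directly, using the latency figures from this slide (the effective-access-time formula is the standard weighted average):

```python
MEM_NS = 100             # memory access: ~100 ns
FAULT_NS = 10_000_000    # disk seek and read: ~10 ms

def effective_access_ns(fault_rate):
    """Average memory access time when a fraction fault_rate of accesses fault."""
    return (1 - fault_rate) * MEM_NS + fault_rate * FAULT_NS

# 1 fault per 1,000,000 accesses keeps degradation to about 10% (110 ns)
```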
Page Replacement
What if there’s no free frame left on a page fault?
Free a frame that’s currently being used
1. Select the frame to be replaced (victim)
2. Write victim back to disk
3. Change page table to reflect that victim is now invalid
4. Read the desired page into the newly freed frame
5. Change page table to reflect that new page is now valid
6. Restart faulting instruction
Optimization: do not need to write victim back if it
has not been modified (need dirty bit per page).
Page Replacement
Highly motivated to find a good replacement policy
That is, when evicting a page, how do we choose the best
victim in order to minimize the page fault rate?
Is there an optimal replacement algorithm? If yes,
what is it?
Let’s look at an example:
Suppose we have 3 memory frames and are running a
program that has the following reference pattern
7, 0, 1, 2, 0, 3, 0, 4, 2, 3
Page Replacement
Suppose we know the access pattern in advance
7, 0, 1, 2, 0, 3, 0, 4, 2, 3
Optimal algorithm is to replace the page that will not
be used for the longest period of time
What’s the problem with this algorithm?
Realistic policies try to predict future behavior on
the basis of past behavior
Works because of locality
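A small simulator for the optimal (Belady) policy just described, which we can run on the example reference string (the code layout is a sketch, not from the slides):

```python
def opt_faults(refs, nframes):
    """Belady's optimal policy: on a fault with no free frame, evict the
    resident page whose next use is farthest in the future (or never)."""
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < nframes:
            frames.append(page)
        else:
            future = refs[i + 1:]
            def next_use(q):
                return future.index(q) if q in future else len(future) + 1
            victim = max(frames, key=next_use)
            frames[frames.index(victim)] = page
    return faults
```

On 7, 0, 1, 2, 0, 3, 0, 4, 2, 3 with 3 frames this gives 6 faults, all of them compulsory (6 distinct pages), so no policy can do better.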
FIFO
First-in, First-out
Be fair, let every page live in memory for about the same
amount of time, then toss it.
What’s the problem?
Is this compatible with what we know about behavior of
programs?
How does it do on our example?
7, 0, 1, 2, 0, 3, 0, 4, 2, 3
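FIFO is easy to simulate (a sketch, mirroring the `opt_faults` style above):

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Evict the page that has been resident longest, ignoring usage."""
    resident, order, faults = set(), deque(), 0
    for page in refs:
        if page in resident:
            continue
        faults += 1
        if len(resident) >= nframes:
            resident.discard(order.popleft())   # toss the oldest page
        resident.add(page)
        order.append(page)
    return faults
```

On the example with 3 frames, FIFO takes 9 faults versus the optimal 6: it evicts page 0 right before it is used again, exactly the behavior that conflicts with locality.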
LRU
Least Recently Used
On access to a page, timestamp it
When need to evict a page, choose the one with the oldest
timestamp
What’s the motivation here?
Is LRU optimal?
In practice, LRU is quite good for most programs
Is it easy to implement?
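The timestamp scheme described above, as a sketch (real hardware cannot afford a per-access timestamp, which is why approximations like Clock exist):

```python
def lru_faults(refs, nframes):
    """Timestamp pages on access; evict the one with the oldest timestamp."""
    last_use, faults = {}, 0
    for t, page in enumerate(refs):
        if page not in last_use:
            faults += 1
            if len(last_use) >= nframes:
                victim = min(last_use, key=last_use.get)  # oldest timestamp
                del last_use[victim]
        last_use[page] = t
    return faults
```

On 7, 0, 1, 2, 0, 3, 0, 4, 2, 3 with 3 frames, LRU takes 8 faults: better than FIFO's 9, worse than the optimal 6.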
Not Frequently Used Replacement
Have a reference bit and software counter for each page frame
At each clock interrupt, the OS adds the reference bit of each
frame to its counter and then clears the reference bit
When need to evict a page, choose frame with lowest counter
What’s the problem?
Doesn’t forget anything, no sense of time – hard to evict a page
that was referenced long in the past but is no longer relevant
Updating counters is expensive, especially since memory is getting
rather large these days
Can be improved with an aging scheme: counters are shifted
right before adding the reference bit and the reference bit is
added to the leftmost bit (rather than to the rightmost one)
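One clock tick of the aging scheme, as a sketch (8-bit counters are an assumption; dictionaries stand in for per-frame hardware state):

```python
COUNTER_BITS = 8

def age(counters, ref_bits):
    """One clock tick: shift each counter right and put this interval's
    reference bit into the leftmost position."""
    return {page: (count >> 1) | (ref_bits.get(page, 0) << (COUNTER_BITS - 1))
            for page, count in counters.items()}
```

Because the reference bit enters at the top and old history shifts toward the bottom, a page referenced this tick immediately outranks one referenced only in the distant past, which fixes NFU's "never forgets" problem.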
Clock (Second-Chance)
Arrange physical pages in a circle, with a clock hand
Hardware keeps 1 use bit per frame. Sets use bit
on memory reference to a frame.
If bit is not set, hasn’t been used for a while
On page fault:
1. Advance clock hand
2. Check use bit
If 1, has been used recently, clear and go on
If 0, this is our victim
Can we always find a victim?
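The two steps above can be sketched as follows (frames are modeled as [page, use_bit] pairs; the representation is illustrative):

```python
def clock_evict(frames, hand):
    """frames: list of [page, use_bit]; hand: index of the last-considered
    frame. Returns the index of the victim frame (use bit 0)."""
    while True:
        hand = (hand + 1) % len(frames)   # 1. advance the clock hand
        if frames[hand][1]:               # 2. used recently: second chance
            frames[hand][1] = 0
        else:
            return hand                   # use bit is 0: our victim
```

Yes, we can always find a victim: even if every use bit starts at 1, the hand clears bits as it sweeps, so it terminates within two passes.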
Nth-Chance
Similar to Clock except: maintain a counter as well as
a use bit
On page fault:
1. Advance clock hand
2. Check use bit
If 1, clear and set counter to 0
If 0, increment counter, if counter < N, go on, otherwise, this is
our victim
What’s the problem if N is too large?
A Different Implementation of
2nd-Chance
Always keep a free list of some size n > 0
On page fault, if free list has more than n frames, get a
frame from the free list
If free list has only n frames, get a frame from the list,
then choose a victim from the frames currently being used
and put on the free list
On page fault, if page is on a frame on the free list,
don’t have to read page back in.
Implemented on VAX … works well, gets performance
close to true LRU
Virtual Memory and Cache Conflicts
Assume an architecture with direct-mapped caches
First-level caches are often direct-mapped
The VM page size partitions a direct-mapped cache
into a set of cache-pages
Page frames are colored (partitioned into equivalence
classes) where pages with the same color map to the
same cache-page
Cache conflicts can occur only between pages with
the same color, and no conflicts can occur within a
single page
VM Mapping to Avoid Cache Misses
Goal: to assign active virtual pages to different
cache-pages
A mapping is optimal if it avoids conflict misses
A mapping that assigns two or more active pages to
the same cache-page can induce cache conflict misses
Example:
a program with 4 active virtual pages
16 KB direct-mapped cache
a 4 KB page size partitions the cache into four cache-pages
there are 256 mappings of virtual pages into cache-pages
but only 4! = 24 are optimal
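The counts in the example can be verified by enumeration: a mapping assigns each of the 4 active pages one of the 4 cache-page colors, and it is optimal exactly when no two pages share a color.

```python
from itertools import product

PAGES, COLORS = 4, 4   # 4 active pages; 16 KB cache / 4 KB page = 4 cache-pages

mappings = list(product(range(COLORS), repeat=PAGES))        # 4^4 = 256
optimal = [m for m in mappings if len(set(m)) == PAGES]      # all colors distinct
```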
Page Re-coloring
With a bit of hardware, can detect conflict at
runtime
Count cache misses on a per-page basis
Can solve conflicts by re-mapping one or more of the
conflicting virtual pages into new page frames of
different color
Re-coloring
For the limited applications that have been studied,
only small performance gain (~10-15%)
Multi-Programming Environment
Why?
Better utilization of resources (CPU, disks, memory, etc.)
Problems?
Mechanism – TLB, caches?
How to guarantee fairness?
Overcommitment of memory
What’s the potential problem?
Each process needs its working set in memory in order to
perform well
If too many processes running, can thrash
Support for Multiple Processes
More than one address space should be loaded in
memory
A register points to the current page table
OS updates the register when context switching
between threads from different processes
Most TLBs can cache more than one PT
Store the process id to distinguish between virtual
addresses belonging to different processes
If no pids, then TLB must be flushed at the process
switch time
Sharing
[Diagram: virtual address spaces of processes p1 and p2, with v-to-p memory mappings directing some virtual pages of both processes to the same physical pages]
Copy-on-Write
[Diagram: p1 and p2 share physical pages until one writes; the written page is then copied so each process has its own version]
Resident Set Management
How many pages of a process should be brought in ?
Resident set size can be fixed or variable
Replacement scope can be local or global
Most common schemes implemented in the OS:
Variable allocation with global scope: simple; the resident set
size of some process is modified at replacement time
Variable allocation with local scope: more complicated; the
resident set size is modified periodically to approximate the
working set size
Working Set
The set of pages that have been referenced in the last window
of time
The size of the working set varies during the execution of the
process depending on the locality of accesses
If the number of pages allocated to a process covers its
working set then the number of page faults is small
Schedule a process only if enough free memory to load its
working set
How can we determine/approximate the working set size?
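One simple way to make the definition concrete is to compute the working set directly from a reference string, with the window measured in references (an idealized sketch; real systems can only approximate this):

```python
def working_set(refs, t, window):
    """Pages referenced in the last `window` references ending at time t."""
    return set(refs[max(0, t - window + 1): t + 1])
```

For example, with refs = [1, 2, 1, 3, 4, 2] and a window of 3, the working set at time 5 is {2, 3, 4}, while at time 2 it is just {1, 2}: the size varies with the locality of the access pattern.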
Page-Fault Frequency
A counter per page stores the virtual time between
page faults
An upper threshold for the virtual time is defined
If the amount of time since the last page fault is less
than the threshold (frequent faults), then the page is
added to the resident set
A lower threshold can be used to discard pages from
the resident set
If the time between faults is higher than the lower
threshold (infrequent faults), then discard the LRU
page of this process
Application-Controlled Paging
OS kernel provides the mechanism and implements
the global policy
Chooses the process which has to evict a page when a
free frame is needed
Application decides the local replacement
Chooses the particular page that should be evicted
Basic protocol for an external memory manager:
At page fault, kernel upcalls the manager asking it to pick a
page to be evicted
Manager provides the info and kernel re-maps it as
appropriate
Summary
Virtual memory is a way of introducing another level in our
memory hierarchy in order to abstract away the amount of
memory actually available on a particular system
This is incredibly important for “ease-of-programming”
Imagine having to explicitly check for size of physical memory and
manage it in each and every one of your programs
Can be implemented using paging (sometimes segmentation)
Page fault is expensive so can’t have too many of them
Important to implement good page replacement policy
Have to watch out for thrashing!!
Single Address Space
What’s the point?
Virtual address space is currently used for three purposes
Provide the illusion of a (possibly) larger address space than
physical memory
Provide the illusion of a contiguous address space while allowing
for non-contiguous storage in physical memory
Protection
Protection, provided through private address spaces, makes
sharing difficult
There is no inherent reason why protection should be
provided by private address spaces
Private Address Spaces vs. Sharing
[Diagram: p1 and p2 map the same physical page at different virtual addresses]
A shared physical page may be mapped to different virtual pages
when shared
BTW, what happens if we want to page the shared (red) page out?
This variable mapping makes sharing of pointer-based data
structures difficult
Storing these data structures on disk is also difficult
Private Address Space vs. Sharing
(Cont’d)
Most complex data structures are pointer-based
Various techniques have been developed to deal with
this
Linearization
Pointer swizzling: translation of pointers; small/large AS
OS support for multiple processes mapping the same
physical page to the same virtual page
All of the above techniques are either expensive
(linearization and swizzling) or have shortcomings
(mapping to the same virtual page requires prior
agreement)
Opal: The Basic Idea
Provide a single virtual address space to all processes
in the system …
Can be used on a single machine or set of machines on a LAN
… but … but … won’t we run out of address space?
Enough for 500 years if allocated at 1 Gigabyte per second
A virtual address means the same thing to all
processes
Share and save to secondary storage data structures as is
Opal: The Basic Idea
[Diagram: a single address space holding the OS and processes P0 and P1; each code and data segment occupies a unique range of the shared virtual address space]
Opal: Basic Mechanisms
Protection domain ≈ process
Container for all resources allocated to a running instantiation of a
program
Contains identity of “user”, which can be used for access control
(protection)
Virtual address space is allocated in chunks – segments
Segments can be persistent, meaning they might be stored on
secondary storage, and so cannot be garbage collected
Segments (and other resources) are named by capabilities
Capability is an “un-forgeable” set of rights to access a resource
(we’ll learn more about this later)
Access control + identity → capabilities
Can attach to a segment once have a capability for it
Portals: protection domain entry points (RPC)
Opal: Issues
Naming of segments: capabilities
Recycling of addresses: reference counting
Non-contiguity of address space: segments cannot
grow, so must request segments that are large
enough for data structures that assume contiguity
Private static data: must use register-relative
addressing