CS 519 -- Operating Systems


CS 519: Lecture 3
Memory Management
Memory Management
 Requirements for memory management strategy:
Consistency: all address spaces look “basically the same”
Relocation: processes can be loaded at any physical address
Protection: a process cannot maliciously access memory
belonging to another process
Sharing: may allow sharing of physical memory (must
implement control)
Basic Concepts: Memory Partitioning
 Static: a process must be loaded into a partition of equal or
greater size => Internal Fragmentation
 Dynamic: each process is loaded into a partition of exactly its size => External Fragmentation
Basic Concepts: Pure Paging and Segmentation
 Paging: memory is divided into equal-sized frames. All process pages are loaded into frames that are not necessarily contiguous
 Segmentation: each process is divided into variable-sized segments. All process segments are loaded into dynamic partitions that are not necessarily contiguous
 More details in the context of Virtual Memory
Memory Hierarchy
[Figure: memory hierarchy: registers, cache, memory]
Question: What if we want to support programs that
require more memory than what’s available in the
system?
Memory Hierarchy
[Figure: memory hierarchy with virtual memory added beyond registers, cache, and memory]
Answer: Pretend we had something bigger
=> Virtual Memory
Virtual Memory
 Virtual memory is the OS abstraction that gives the
programmer the illusion of an address space that may
be larger than the physical address space
 Virtual memory can be implemented using either
paging or segmentation but paging is most common
 Virtual memory is motivated by both
Convenience: the programmer does not have to deal with the
fact that individual machines may have very different
amounts of physical memory
Higher degree of multiprogramming: processes are not
loaded as a whole. Rather they are loaded on demand
Virtual Memory: Paging
 A page is a cacheable unit of virtual memory
 The OS controls the mapping between pages of VM
and physical memory
More flexible (at a cost)
[Figure: pages of the virtual address space (VM) mapped to page frames in physical memory, analogous to memory blocks being cached in a hardware cache]
Virtual Memory: Segmentation
[Figure: segments of Job 0 and Job 1 loaded into non-contiguous partitions of memory]
Hardware Translation
[Figure: the processor issues virtual addresses to a translation box (MMU), which produces physical memory addresses]
 Translation from virtual to physical can be done in software
 However, hardware support is needed to ensure protection and
perform translation faster
 Simplest solution with two registers: base and size
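As a concrete illustration of the base and size registers just mentioned, here is a minimal sketch in C (my own example; the struct and function names are made up, not a real MMU interface):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical relocation registers loaded by the OS on a context switch. */
struct mmu_regs {
    uintptr_t base;   /* physical address where the process is loaded */
    size_t    size;   /* length of the process's address space        */
};

/* Translate a virtual address; returns false on a protection violation. */
static bool translate(const struct mmu_regs *r, uintptr_t vaddr, uintptr_t *paddr)
{
    if (vaddr >= r->size)          /* out of bounds: raise a fault   */
        return false;
    *paddr = r->base + vaddr;      /* relocation: just add the base  */
    return true;
}

int main(void)
{
    struct mmu_regs r = { .base = 0x100000, .size = 0x4000 };
    uintptr_t pa;
    if (translate(&r, 0x1234, &pa))
        printf("virtual 0x1234 -> physical 0x%lx\n", (unsigned long)pa);
    if (!translate(&r, 0x8000, &pa))
        printf("virtual 0x8000 -> protection fault\n");
    return 0;
}
```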
Segmentation Hardware
[Figure: the segment number of the virtual address indexes the segment table; the segment base is added to the offset to form the physical address]
 Segments are of variable size
 Translation is done through a set of (base, size, state) registers: the segment table (see the sketch below)
 State: valid/invalid, access permission, reference, and modified bits
 Segments may be visible to the programmer and can be used as a convenience for organizing programs and data (e.g., code segments or data segments)
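A sketch of the segment-table lookup described above, in C. The entry layout and the flag names are assumptions made for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define SEG_VALID 0x1
#define SEG_WRITE 0x2

/* One (base, size, state) entry, as on the slide. */
struct seg_entry {
    uintptr_t base;    /* physical base of the segment             */
    size_t    size;    /* segment length (segments vary in size)   */
    uint32_t  state;   /* valid/invalid, permissions, ref/mod bits */
};

/* Virtual address = (segment number, offset). */
bool seg_translate(const struct seg_entry *table, size_t nsegs,
                   unsigned seg, size_t offset, bool is_write,
                   uintptr_t *paddr)
{
    if (seg >= nsegs) return false;
    const struct seg_entry *e = &table[seg];
    if (!(e->state & SEG_VALID)) return false;             /* segment fault    */
    if (offset >= e->size) return false;                   /* bounds violation */
    if (is_write && !(e->state & SEG_WRITE)) return false; /* protection fault */
    *paddr = e->base + offset;                             /* base + offset    */
    return true;
}
```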
Paging hardware
[Figure: the page number of the virtual address indexes the page table; the resulting frame number is combined with the offset to form the physical address]
 Pages are of fixed size
 The physical memory corresponding to a page is called a page frame
 Translation is done through a page table indexed by page number (see the sketch below)
 Each entry in a page table contains the physical frame number that the virtual page is mapped to and the state of the page in memory
 State: valid/invalid, access permission, reference, modified, and caching bits
 Paging is transparent to the programmer
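A sketch of the page-table lookup described above, assuming 4 KB pages and a simple flat (single-level) table; all names and sizes are illustrative:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12                      /* 4 KB pages (assumed) */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PTE_VALID  0x1

/* A page-table entry: frame number plus state bits. */
typedef struct { uint32_t frame; uint32_t flags; } pte_t;

/* Split the virtual address into (page number, offset), index the page
 * table, and glue the frame number to the offset. */
bool pt_translate(const pte_t *page_table, size_t npages,
                  uintptr_t vaddr, uintptr_t *paddr)
{
    size_t page   = vaddr >> PAGE_SHIFT;
    size_t offset = vaddr & (PAGE_SIZE - 1);
    if (page >= npages) return false;
    const pte_t *e = &page_table[page];
    if (!(e->flags & PTE_VALID)) return false;       /* page fault */
    *paddr = ((uintptr_t)e->frame << PAGE_SHIFT) | offset;
    return true;
}

int main(void)
{
    pte_t pt[4] = { {7, PTE_VALID}, {0, 0}, {3, PTE_VALID}, {0, 0} };
    uintptr_t pa;
    if (pt_translate(pt, 4, 0x2ABC, &pa))             /* page 2, offset 0xabc */
        printf("0x2ABC -> 0x%lx\n", (unsigned long)pa); /* frame 3 -> 0x3abc  */
    return 0;
}
```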
Combined Paging and Segmentation
 Some MMUs combine paging with segmentation
 Virtual address: segment number + page number + offset
 Segmentation translation is performed first
 The segment entry points to a page table for that segment
 The page number is used to index the page table and look up the corresponding page frame number
 Segmentation is not used much anymore, so we’ll focus on paging
 UNIX has a simple form of segmentation but does not require any hardware support
Paging: Address Translation
[Figure: the CPU issues virtual address (p, d); entry p of the page table in memory gives frame f; the physical address is (f, d)]
Translation Lookaside Buffers
 Translation happens on every memory access, so it must be fast
 What to do? Caching, of course …
Why does caching work? Temporal locality!
Same as normal memory cache – the cache is smaller, so we can spend more $$ to make it faster
 Cache for page table entries is called the Translation Lookaside Buffer (TLB)
Typically fully associative
No more than 64 entries
 On every memory access, we look for the page-to-frame mapping in the TLB (see the sketch below)
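A sketch of the per-access TLB probe, modeled as a small fully associative table as on the slide; the round-robin refill is a placeholder, not a claim about real hardware:

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 64   /* slides: typically fully associative, no more than 64 entries */

struct tlb_entry {
    bool     valid;
    uint32_t page;    /* virtual page number   */
    uint32_t frame;   /* physical frame number */
};

static struct tlb_entry tlb[TLB_ENTRIES];
static unsigned next_victim;   /* trivial round-robin replacement (assumption) */

/* Compare the page against every entry (fully associative). */
bool tlb_lookup(uint32_t page, uint32_t *frame)
{
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].page == page) {
            *frame = tlb[i].frame;
            return true;                  /* TLB hit                        */
        }
    }
    return false;                         /* TLB miss: walk the page table  */
}

/* Fill an entry after a miss has been serviced from the page table. */
void tlb_insert(uint32_t page, uint32_t frame)
{
    struct tlb_entry *e = &tlb[next_victim];
    next_victim = (next_victim + 1) % TLB_ENTRIES;
    e->valid = true;
    e->page  = page;
    e->frame = frame;
}
```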
Paging: Address Translation
[Figure: the CPU issues virtual address (p, d); the TLB is searched for the p/f mapping; on a hit the physical address (f, d) is formed without walking the page table in memory]
TLB Miss
 What if the TLB does not contain the right PT entry?
TLB miss
Evict an existing entry if the TLB does not have any free ones
Replacement policy?
Bring in the missing entry from the PT
 TLB misses can be handled in hardware or software
Software allows the application to assist in replacement decisions
Where to Store Address Space?
 Virtual address space may be larger than physical
memory
 Where do we keep it?
 Where do we keep the page table?
Where to Store Address Space?
On the next device down our storage hierarchy, of course …
Where to Store Page Table?
In memory, of course …
[Figure: physical memory holding the OS, code, globals, stack, heap, and the P0 and P1 page tables]
 Interestingly, we use memory to “enlarge” the view of memory, leaving LESS physical memory
 This kind of overhead is common
 Gotta know what the right trade-off is
 Have to understand common application characteristics
 Have to be common enough!
Page table structure
[Figure: a two-level page table kept in the OS segment: a non-pageable master page table (and kernel PT), with pageable second-level page tables for P0 and P1]
 Page table can become huge
 What to do?
 Two-Level PT: saves memory but requires two lookups per access (see the sketch below)
 Page the page tables
 Inverted page tables (one entry per page frame in physical memory): translation through hash tables
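A sketch of the two-level lookup sketched in the figure: a master (first-level) table whose entries point at second-level page tables, which may themselves be non-resident. The 10/10/12 address split is an assumption:

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define L2_BITS     10                 /* 10-bit second-level index (assumed) */
#define PTE_VALID   0x1

typedef struct { uint32_t frame; uint32_t flags; } pte_t;

/* First-level ("master") entry: points at a second-level table, which may
 * itself be paged out (then valid == false and a fault brings it in). */
typedef struct { pte_t *l2; bool valid; } l1_entry_t;

bool walk(const l1_entry_t *l1, uint32_t vaddr, uintptr_t *paddr)
{
    uint32_t l1_idx = vaddr >> (PAGE_SHIFT + L2_BITS);
    uint32_t l2_idx = (vaddr >> PAGE_SHIFT) & ((1u << L2_BITS) - 1);
    uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);

    if (!l1[l1_idx].valid)                 /* 2nd-level table not resident:        */
        return false;                      /* fault on the page table itself       */
    const pte_t *pte = &l1[l1_idx].l2[l2_idx];
    if (!(pte->flags & PTE_VALID))
        return false;                      /* ordinary page fault                  */
    *paddr = ((uintptr_t)pte->frame << PAGE_SHIFT) | offset;
    return true;                           /* note: two lookups per access         */
}
```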
Demand Paging
 To start a process (program), just load the code page
where the process will start executing
 As a process references memory (instructions or data) outside of the loaded page, bring it in as necessary
 How to represent the fact that a page of VM is not yet in memory?
[Figure: a process with virtual pages 0 (A), 1 (B), and 2 (C); the page table maps page 0 to frame 1 and marks it valid, while pages 1 and 2 are marked invalid and reside only on disk]
Page Fault
 What happens when a process references a page marked as invalid in the page table?
 Page fault trap
Check that reference is valid
Find a free memory frame
Read desired page from disk
Change valid bit of page to v
Restart instruction that was interrupted by the trap
 Is it easy to restart an instruction?
 What happens if there is no free frame?
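The steps above, written out as a sketch of a fault handler. Every helper (find_free_frame, read_page_from_disk, and so on) is hypothetical; the no-free-frame case is covered on the page replacement slides:

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_VALID 0x1

typedef struct { uint32_t frame; uint32_t flags; } pte_t;

/* Hypothetical helpers provided by the rest of the kernel. */
extern bool   reference_is_valid(uintptr_t vaddr);   /* in the address space? */
extern long   find_free_frame(void);                 /* -1 if none free       */
extern long   evict_some_frame(void);                /* replacement policy    */
extern void   read_page_from_disk(uintptr_t vaddr, long frame);
extern pte_t *pte_for(uintptr_t vaddr);
extern void   restart_faulting_instruction(void);
extern void   kill_process(void);

void page_fault_handler(uintptr_t fault_vaddr)
{
    /* 1. Check that the reference is valid. */
    if (!reference_is_valid(fault_vaddr)) {
        kill_process();                    /* invalid reference */
        return;
    }
    /* 2. Find a free memory frame (or make one; see page replacement). */
    long frame = find_free_frame();
    if (frame < 0)
        frame = evict_some_frame();
    /* 3. Read the desired page from disk (the slow part). */
    read_page_from_disk(fault_vaddr, frame);
    /* 4. Change the valid bit of the page to v and record the frame. */
    pte_t *pte = pte_for(fault_vaddr);
    pte->frame  = (uint32_t)frame;
    pte->flags |= PTE_VALID;
    /* 5. Restart the instruction that was interrupted by the trap. */
    restart_faulting_instruction();
}
```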
Page Fault (Cont’d)
 So, what can happen on a memory access?
1. TLB miss → read page table entry
2. TLB miss → read kernel page table entry
3. Page fault for necessary page of process page table
4. All frames are used → need to evict a page → modify a process page table entry
   1. TLB miss → read kernel page table entry
   2. Page fault for necessary page of process page table
   3. Go back
5. Read in needed page, modify page table entry, fill TLB
Cost of Handling a Page Fault
 Trap, check page table, find free memory frame (or find victim) … about 200 - 600 µs
 Disk seek and read … about 10 ms
 Memory access … about 100 ns
 Page fault degrades performance by a factor of ~100,000!!!!!
 And this doesn’t even count all the additional things that can happen along the way
 Better not have too many page faults!
 If we want no more than 10% degradation, we can only have 1 page fault for every 1,000,000 memory accesses (worked out below)
 The OS must do a great job of managing the movement of data between secondary storage and main memory
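Working the 10% figure out explicitly with the slide's numbers (100 ns per memory access, roughly 10 ms = 10^7 ns per fault, which is also where the ~100,000x factor comes from), writing p for the fault rate per access:

```latex
\mathrm{EAT} \;\approx\; (1-p)\cdot 100\,\mathrm{ns} + p\cdot 10\,\mathrm{ms}
             \;\approx\; 100\,\mathrm{ns} + p\cdot 10^{7}\,\mathrm{ns},
\qquad
100 + 10^{7}\,p \;\le\; 110
\;\Longrightarrow\;
p \;\le\; 10^{-6}
```

That is, at most one page fault per 1,000,000 memory accesses.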
Page Replacement
 What if there’s no free frame left on a page fault?
 Free a frame that’s currently being used
1. Select the frame to be replaced (victim)
2. Write victim back to disk
3. Change page table to reflect that victim is now invalid
4. Read the desired page into the newly freed frame
5. Change page table to reflect that new page is now valid
6. Restart faulting instruction
 Optimization: do not need to write the victim back if it has not been modified (need a dirty bit per page); see the sketch below.
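A sketch of the six steps, including the dirty-bit optimization from the last bullet; the helpers are hypothetical:

```c
#include <stdint.h>

#define PTE_VALID 0x1
#define PTE_DIRTY 0x2     /* set by hardware on a write to the page */

typedef struct { uint32_t frame; uint32_t flags; } pte_t;

extern long   choose_victim(void);                 /* replacement policy        */
extern pte_t *pte_of_frame(long frame);            /* reverse mapping (assumed) */
extern void   write_frame_to_disk(long frame);
extern void   read_page_from_disk(uintptr_t vaddr, long frame);
extern pte_t *pte_for(uintptr_t vaddr);
extern void   restart_faulting_instruction(void);

void replace_and_load(uintptr_t fault_vaddr)
{
    /* 1. Select the frame to be replaced (victim). */
    long victim = choose_victim();
    pte_t *vpte = pte_of_frame(victim);

    /* 2. Write the victim back to disk, but only if it is dirty. */
    if (vpte->flags & PTE_DIRTY)
        write_frame_to_disk(victim);

    /* 3. Mark the victim invalid in its page table. */
    vpte->flags &= ~(PTE_VALID | PTE_DIRTY);

    /* 4. Read the desired page into the newly freed frame. */
    read_page_from_disk(fault_vaddr, victim);

    /* 5. Mark the new page valid and record its frame. */
    pte_t *npte = pte_for(fault_vaddr);
    npte->frame  = (uint32_t)victim;
    npte->flags |= PTE_VALID;

    /* 6. Restart the faulting instruction. */
    restart_faulting_instruction();
}
```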
Page Replacement
 Highly motivated to find a good replacement policy
That is, when evicting a page, how do we choose the best
victim in order to minimize the page fault rate?
 Is there an optimal replacement algorithm? If yes,
what is it?
 Let’s look at an example:
Suppose we have 3 memory frames and are running a
program that has the following reference pattern
7, 0, 1, 2, 0, 3, 0, 4, 2, 3
Page Replacement
 Suppose we know the access pattern in advance
7, 0, 1, 2, 0, 3, 0, 4, 2, 3
 Optimal algorithm is to replace the page that will not
be used for the longest period of time
 What’s the problem with this algorithm?
 Realistic policies try to predict future behavior on
the basis of past behavior
Works because of locality
FIFO
 First-in, First-out
Be fair: let every page live in memory for about the same amount of time, then toss it.
 What’s the problem?
Is this compatible with what we know about behavior of
programs?
 How does it do on our example?
7, 0, 1, 2, 0, 3, 0, 4, 2, 3
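A small, self-contained simulation of FIFO with 3 frames on the reference string above (my own illustrative program, not from the slides). It reports 9 faults; for comparison, the optimal policy incurs 6 on this trace:

```c
#include <stdio.h>

#define NFRAMES 3

int main(void)
{
    int refs[] = { 7, 0, 1, 2, 0, 3, 0, 4, 2, 3 };   /* trace from the slide */
    int nrefs  = sizeof refs / sizeof refs[0];

    int frames[NFRAMES];
    int loaded = 0;        /* frames filled so far        */
    int oldest = 0;        /* index of the first-in frame */
    int faults = 0;

    for (int i = 0; i < nrefs; i++) {
        int page = refs[i], hit = 0;
        for (int j = 0; j < loaded; j++)
            if (frames[j] == page) { hit = 1; break; }
        if (!hit) {
            faults++;
            if (loaded < NFRAMES) {
                frames[loaded++] = page;       /* free frame available   */
            } else {
                frames[oldest] = page;         /* evict the oldest page  */
                oldest = (oldest + 1) % NFRAMES;
            }
        }
        printf("ref %d: %s\n", page, hit ? "hit" : "fault");
    }
    printf("FIFO faults: %d\n", faults);       /* prints 9 for this trace */
    return 0;
}
```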
LRU
 Least Recently Used
On access to a page, timestamp it
When need to evict a page, choose the one with the oldest
timestamp
What’s the motivation here?
 Is LRU optimal?
In practice, LRU is quite good for most programs
 Is it easy to implement?
Not Frequently Used Replacement
 Have a reference bit and software counter for each page frame
 At each clock interrupt, the OS adds the reference bit of each
frame to its counter and then clears the reference bit
 When need to evict a page, choose frame with lowest counter
 What’s the problem?
 Doesn’t forget anything, no sense of time – it is hard to evict a page that was heavily referenced long in the past but is no longer relevant
 Updating counters is expensive, especially since memory is getting rather large these days
 Can be improved with an aging scheme: counters are shifted right before adding the reference bit, and the reference bit is added to the leftmost bit (rather than to the rightmost one); see the sketch below
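A sketch of the aging scheme from the last bullet: on every clock tick each counter is shifted right and the reference bit enters at the leftmost position, so recent use dominates and old history decays. The 8-bit counters and page count are arbitrary assumptions:

```c
#include <stdint.h>
#include <stdio.h>

#define NPAGES 4

static uint8_t counter[NPAGES];   /* aging counters, one per frame        */
static uint8_t ref_bit[NPAGES];   /* set by "hardware" on a reference     */

/* Called at every clock interrupt. */
void age_tick(void)
{
    for (int i = 0; i < NPAGES; i++) {
        counter[i] = (uint8_t)((counter[i] >> 1) | (ref_bit[i] << 7));
        ref_bit[i] = 0;           /* clear the reference bit */
    }
}

/* Victim = frame with the lowest counter. */
int pick_victim(void)
{
    int victim = 0;
    for (int i = 1; i < NPAGES; i++)
        if (counter[i] < counter[victim]) victim = i;
    return victim;
}

int main(void)
{
    /* Page 0 was referenced long ago; page 2 is referenced every tick. */
    ref_bit[0] = 1; age_tick();
    for (int t = 0; t < 5; t++) { ref_bit[2] = 1; age_tick(); }
    printf("victim: frame %d\n", pick_victim());   /* not frame 2 */
    return 0;
}
```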
Clock (Second-Chance)
 Arrange physical pages in a circle, with a clock hand
 Hardware keeps 1 use bit per frame. Sets use bit
on memory reference to a frame.
 If bit is not set, hasn’t been used for a while
 On page fault:
1. Advance clock hand
2. Check use bit
 If 1, has been used recently, clear and go on
 If 0, this is our victim
 Can we always find a victim?
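A sketch of the clock sweep described above; in a real system the use bit would be set by the MMU on each reference. Note that a victim is always found, because after one full sweep every use bit has been cleared:

```c
#include <stdint.h>

#define NFRAMES 8

static uint8_t use_bit[NFRAMES];   /* set by hardware on a reference to the frame */
static int     hand;               /* the clock hand                              */

/* On a page fault with no free frame: sweep until a frame with use bit 0 is found. */
int clock_pick_victim(void)
{
    for (;;) {
        hand = (hand + 1) % NFRAMES;       /* 1. advance the clock hand            */
        if (use_bit[hand]) {
            use_bit[hand] = 0;             /* 2. used recently: clear and go on    */
        } else {
            return hand;                   /*    not used recently: our victim     */
        }
    }
}
```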
Nth-Chance
 Similar to Clock except: maintain a counter as well as
a use bit
 On page fault:
1. Advance clock hand
2. Check use bit
 If 1, clear and set counter to 0
 If 0, increment counter, if counter < N, go on, otherwise, this is
our victim
 What’s the problem if N is too large?
A Different Implementation of 2nd-Chance
 Always keep a free list of some size n > 0
On page fault, if free list has more than n frames, get a
frame from the free list
If free list has only n frames, get a frame from the list,
then choose a victim from the frames currently being used
and put on the free list
 On page fault, if page is on a frame on the free list,
don’t have to read page back in.
 Implemented on VAX … works well, gets performance
close to true LRU
Virtual Memory and Cache Conflicts
 Assume an architecture with direct-mapped caches
First-level caches are often direct-mapped
 The VM page size partitions a direct-mapped cache
into a set of cache-pages
 Page frames are colored (partitioned into equivalence
classes) where pages with the same color map to the
same cache-page
 Cache conflicts can occur only between pages with
the same color, and no conflicts can occur within a
single page
VM Mapping to Avoid Cache Misses
 Goal: to assign active virtual pages to different
cache-pages
 A mapping is optimal if it avoids conflict misses
 A mapping that assigns two or more active pages to
the same cache-page can induce cache conflict misses
 Example:
a program with 4 active virtual pages
16 KB direct-mapped cache
a 4 KB page size partitions the cache into four cache-pages
there are 256 mappings of virtual pages into cache-pages
but only 4! = 24 are optimal
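The arithmetic behind the example: 16 KB / 4 KB = 4 cache-pages, a frame's color is its frame number mod 4, and only same-colored pages can conflict. A tiny sketch using the slide's sizes (the frame numbers are made up):

```c
#include <stdio.h>

#define PAGE_SIZE  (4 * 1024)               /* 4 KB pages                */
#define CACHE_SIZE (16 * 1024)              /* 16 KB direct-mapped cache */
#define NCOLORS    (CACHE_SIZE / PAGE_SIZE) /* 4 cache-pages             */

/* The color of a page frame decides which cache-page it lands in. */
static int color_of_frame(unsigned frame) { return frame % NCOLORS; }

int main(void)
{
    unsigned frames[] = { 5, 9, 6, 3 };  /* frames backing 4 active virtual pages */
    for (int i = 0; i < 4; i++)
        for (int j = i + 1; j < 4; j++)
            if (color_of_frame(frames[i]) == color_of_frame(frames[j]))
                printf("frames %u and %u share color %d -> possible conflict misses\n",
                       frames[i], frames[j], color_of_frame(frames[i]));
    return 0;
}
```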
Page Re-coloring
 With a bit of hardware, can detect conflict at
runtime
Count cache misses on a per-page basis
 Can solve conflicts by re-mapping one or more of the
conflicting virtual pages into new page frames of
different color
Re-coloring
 For the limited applications that have been studied,
only small performance gain (~10-15%)
Multi-Programming Environment
 Why?
Better utilization of resources (CPU, disks, memory, etc.)
 Problems?
Mechanism – TLB, caches?
How to guarantee fairness?
Overcommitment of memory
 What’s the potential problem?
Each process needs its working set in memory in order to
perform well
If too many processes running, can thrash
Support for Multiple Processes
 More than one address space should be loaded in
memory
 A register points to the current page table
 OS updates the register when context switching
between threads from different processes
 Most TLBs can cache more than one PT
Store the process id to distinguish between virtual
addresses belonging to different processes
 If no pids, then the TLB must be flushed at process switch time
Sharing
[Figure: processes p1 and p2 have separate virtual address spaces; their v-to-p memory mappings point some virtual pages at the same physical pages]
Copy-on-Write
[Figure: copy-on-write: p1 and p2 initially map the same physical pages; a page is copied only when one of them writes to it]
Resident Set Management
 How many pages of a process should be brought in ?
 Resident set size can be fixed or variable
 Replacement scope can be local or global
 Most common schemes implemented in the OS:
Variable allocation with global scope: simple – the resident set size of some process is modified at replacement time
Variable allocation with local scope: more complicated – the resident set size is modified periodically to approximate the working set size
Working Set
 The set of pages that have been referenced in the last window
of time
 The size of the working set varies during the execution of the
process depending on the locality of accesses
 If the number of pages allocated to a process covers its
working set then the number of page faults is small
 Schedule a process only if there is enough free memory to load its working set
 How can we determine/approximate the working set size? (a window-based sketch follows below)
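One simple window-based approximation, in answer to the last question: treat the working set as the pages referenced in the last Δ units of virtual time. The window size and the per-page timestamps are assumptions:

```c
#include <stdio.h>

#define NPAGES 8
#define DELTA  100    /* working-set window, in virtual-time units (assumed) */

static long last_ref[NPAGES];   /* virtual time of each page's last reference   */
static long now;                /* process virtual time (e.g., reference count) */

void reference(int page) { last_ref[page] = ++now; }

/* Working set = pages referenced during the last DELTA units of virtual time. */
int working_set_size(void)
{
    int ws = 0;
    for (int p = 0; p < NPAGES; p++)
        if (last_ref[p] != 0 && now - last_ref[p] < DELTA)
            ws++;
    return ws;
}

int main(void)
{
    reference(1); reference(2); reference(2); reference(5);
    printf("working set size: %d\n", working_set_size());   /* 3 pages: 1, 2, 5 */
    return 0;
}
```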
Page-Fault Frequency
 A counter per page stores the virtual time between
page faults
 An upper threshold for the virtual time is defined
 If the amount of time since the last page fault is less
than the threshold (frequent faults), then the page is
added to the resident set
 A lower threshold can be used to discard pages from the resident set
 If the time between faults is higher than the lower threshold (infrequent faults), then discard the LRU page of this process
Application-Controlled Paging
 OS kernel provides the mechanism and implements
the global policy
Chooses the process which has to evict a page when a free frame is needed
 Application decides the local replacement
Chooses the particular page that should be evicted
 Basic protocol for an external memory manager:
At page fault, kernel upcalls the manager asking it to pick a
page to be evicted
Manager provides the info and kernel re-maps it as
appropriate
Summary
 Virtual memory is a way of introducing another level in our
memory hierarchy in order to abstract away the amount of
memory actually available on a particular system
 This is incredibly important for “ease-of-programming”
 Imagine having to explicitly check for the size of physical memory and manage it in each and every one of your programs
 Can be implemented using paging (sometimes segmentation)
 Page fault is expensive so can’t have too many of them
 Important to implement good page replacement policy
 Have to watch out for thrashing!!
Single Address Space
 What’s the point?
Virtual address space is currently used for three purposes
Provide the illusion of a (possibly) larger address space than
physical memory
Provide the illusion of a contiguous address space while allowing
for non-contiguous storage in physical memory
Protection
Protection, provided through private address spaces, makes
sharing difficult
There is no inherent reason why protection should be
provided by private address spaces
Private Address Spaces vs. Sharing
[Figure: processes p1 and p2 map a shared physical page at different virtual addresses within their private address spaces]
 Shared physical page may be mapped to different virtual pages
when shared
 BTW, what happens if we want to page the shared page out?
 This variable mapping makes sharing of pointer-based data
structures difficult
 Storing these data structures on disk is also difficult
Private Address Space vs. Sharing (Cont’d)
 Most complex data structures are pointer-based
 Various techniques have been developed to deal with
this
Linearization
Pointer swizzling: translation of pointers; small/large AS
OS support for multiple processes mapping the same
physical page to the same virtual page
 All above techniques are either expensive
(linearization and swizzling) or have shortcomings
(mapping to same virtual page requires previous
agreement)
Opal: The Basic Idea
 Provide a single virtual address space to all processes
in the system …
Can be used on a single machine or set of machines on a LAN
 … but … but … won’t we run out of address space?
Enough for 500 years if allocated at 1 Gigabyte per second
 A virtual address means the same thing to all
processes
Share and save to secondary storage data structures as is
Opal: The Basic Idea
[Figure: in Opal, the OS and processes P0 and P1 share a single virtual address space; code, data, globals, stack, and heap occupy fixed ranges that mean the same thing to every process]
Opal: Basic Mechanisms
 Protection domain: Opal’s analog of a process
 Container for all resources allocated to a running instantiation of a
program
 Contains identity of “user”, which can be used for access control
(protection)
 Virtual address space is allocated in chunks – segments
 Segments can be persistent, meaning they might be stored on
secondary storage, and so cannot be garbage collected
 Segments (and other resources) are named by capabilities
 Capability is an “un-forgeable” set of rights to access a resource
(we’ll learn more about this later)
 Access control + identity → capabilities
 Can attach to a segment once you have a capability for it
 Portals: protection domain entry points (RPC)
Opal: Issues
 Naming of segments: capabilities
 Recycling of addresses: reference counting
 Non-contiguity of address space: segments cannot
grow, so must request segments that are large
enough for data structures that assume contiguity
 Private static data: must use register-relative
addressing