L11-12_VM_2007


MAMAS – Computer Architecture
Virtual Memory
Dr. Lihu Rappoport
Virtual Memory
 Provides the illusion of a large memory
 Different machines have different amounts of physical memory
– Allows programs to run regardless of the actual physical memory size
 The amount of memory consumed by each process is dynamic
– Allows adding memory as needed
 Many processes can run on a single machine
– Provides each process its own memory space
– Prevents a process from accessing the memory of other processes running on the same machine
– Allows the sum of the memory spaces of all processes to be larger than physical memory
 Basic terminology
– Virtual Address Space: the address space used by the programmer
– Physical Address Space: the actual physical memory address space
Virtual Memory: Basic Idea
 Divide memory (virtual and physical) into fixed-size blocks
– Pages in Virtual space, Frames in Physical space
– Page size = Frame size
– Page size is a power of 2: page size = 2^k
 All pages in the virtual address space are contiguous
 Pages can be mapped into physical Frames in any order
 Some of the pages are in main memory (DRAM), some of the pages are on disk
 All programs are written using the Virtual Memory Address Space
 The hardware does on-the-fly translation between the virtual and physical address spaces
– Uses a Page Table to translate between Virtual and Physical addresses (see the sketch below)
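As a concrete illustration of the split above, here is a minimal C sketch (PAGE_SHIFT and the example address are assumptions, not from the slides) that extracts the virtual page number and the page offset when the page size is 2^k bytes:

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12u                      /* k: page size = 2^k = 4 KB (assumed) */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1u)

int main(void)
{
    uint32_t va     = 0x12345678u;          /* example virtual address */
    uint32_t vpn    = va >> PAGE_SHIFT;     /* virtual page number     */
    uint32_t offset = va & PAGE_MASK;       /* offset within the page  */

    printf("VA=0x%08x -> VPN=0x%05x offset=0x%03x\n", va, vpn, offset);
    return 0;
}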
Virtual Memory
 Main memory can act as a cache for the secondary storage (disk)
[Figure: address translation maps Virtual Addresses to Physical Addresses in main memory or to Disk Addresses]
 Advantages:
– Illusion of having more physical memory
– Program relocation
– Protection
Virtual to Physical Address translation
Virtual Address: Virtual Page Number [31:12] | Page offset [11:0]
 The page table base register points to the page table
 Each PTE holds a Valid bit (V), a Dirty bit (D), Access Control bits (AC), and a Frame number
Physical Address: Physical Frame Number [29:12] | Page offset [11:0]
Page size: 2^12 bytes = 4K bytes
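A minimal C sketch of the translation drawn above, assuming a flat page table indexed by the virtual page number and a PTE with V, D, AC, and frame-number fields (the type and field names are illustrative, not from the slides):

#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT 12u
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1u)

/* Simplified PTE: Valid (V), Dirty (D), Access Control (AC), frame number */
typedef struct {
    unsigned valid : 1;
    unsigned dirty : 1;
    unsigned ac    : 3;
    uint32_t frame;                         /* physical frame number */
} pte_t;

/* Translate a 32-bit virtual address into a physical address.
 * Returns false on a page fault (valid bit clear).              */
bool translate(const pte_t *page_table, uint32_t va, uint32_t *pa)
{
    uint32_t vpn    = va >> PAGE_SHIFT;     /* bits [31:12] */
    uint32_t offset = va & PAGE_MASK;       /* bits [11:0]  */
    pte_t pte = page_table[vpn];

    if (!pte.valid)
        return false;                       /* page fault: page is on disk */

    *pa = (pte.frame << PAGE_SHIFT) | offset;
    return true;
}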
Page Tables
[Figure: the page table, indexed by virtual page number, holds a valid bit and either a physical page number or a disk address; valid entries point to frames in physical memory, invalid entries point to pages on disk]
Address Mapping Algorithm
If V = 1 then
page is in main memory at the frame address stored in the table
 Fetch data
else (page fault)
need to fetch the page from disk
 Causes a trap, usually accompanied by a context switch: the current process is suspended while the page is fetched from disk
Access Control (R = Read-only, R/W = read/write, X = execute only)
If the kind of access is not compatible with the specified access rights then
protection_violation_fault
 Causes a trap to a hardware or software fault handler
 The missing item is fetched from secondary memory only on the occurrence of a fault  demand load policy
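The mapping algorithm above, sketched in C with stand-in fault handlers (all names here are hypothetical; a real handler would trap to the OS, demand-load the page, and retry the access):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SHIFT 12u
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1u)

typedef enum { ACC_R = 1u, ACC_W = 2u, ACC_X = 4u } access_t;

typedef struct {
    unsigned valid : 1;                   /* V                                       */
    unsigned ac    : 3;                   /* AC: allowed accesses (assumed encoding) */
    uint32_t frame;                       /* physical frame number                   */
} pte_t;

static void page_fault(uint32_t va)
{ fprintf(stderr, "page fault at 0x%08x: demand-load from disk\n", va); exit(1); }

static void protection_violation_fault(uint32_t va)
{ fprintf(stderr, "protection violation at 0x%08x\n", va); exit(1); }

uint32_t map_address(const pte_t *pt, uint32_t va, access_t req)
{
    pte_t pte = pt[va >> PAGE_SHIFT];

    if (!pte.valid)                       /* page not in main memory                */
        page_fault(va);

    if ((req & pte.ac) != (unsigned)req)  /* access kind not compatible with rights */
        protection_violation_fault(va);

    return (pte.frame << PAGE_SHIFT) | (va & PAGE_MASK);
}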
Page Replacement Algorithm
 Not Recently Used (NRU)
– Associated with each page is a reference flag such that ref flag = 1 if the page has been referenced in the recent past
 If replacement is needed, choose any page frame whose reference bit is 0
– This is a page that has not been referenced in the recent past
 Clock implementation of NRU (each page table entry carries a Ref bit):
while (PT[LRP].NRU) {
    PT[LRP].NRU = 0
    LRP++ (mod table size)
}
 Possible optimization: search for a page that is both not recently referenced AND not dirty (see the sketch below)
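A C sketch of the clock scan including the optimization just mentioned: a first pass looks for a frame that is neither recently referenced nor dirty, and a second pass falls back to plain NRU, clearing reference bits as it advances (TABLE_SIZE, PT, and LRP are assumed names mirroring the pseudocode above):

#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 1024u                  /* number of frames (assumed) */

typedef struct {
    bool nru;                             /* reference ("recently used") bit */
    bool dirty;                           /* modified bit                    */
} frame_t;

static frame_t PT[TABLE_SIZE];
static size_t  LRP;                       /* the clock hand */

size_t choose_victim(void)
{
    /* Pass 1: prefer a frame that is not recently referenced AND not dirty */
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        size_t f = (LRP + i) % TABLE_SIZE;
        if (!PT[f].nru && !PT[f].dirty)
            return LRP = f;
    }
    /* Pass 2: plain NRU clock - clear reference bits until a clear one is found */
    while (PT[LRP].nru) {
        PT[LRP].nru = false;
        LRP = (LRP + 1) % TABLE_SIZE;
    }
    return LRP;
}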
Page Faults
 Page faults: the data is not in memory  retrieve it from disk
– The CPU must detect the situation
– The CPU cannot remedy the situation (it has no knowledge of the disk)
– The CPU must trap to the operating system so that it can remedy the situation
– Pick a page to discard (possibly writing it to disk)
– Load the page in from disk
– Update the page table
– Return to the program so the HW will retry and succeed!
 Page fault incurs a huge miss penalty
– Pages should be fairly large (e.g., 4KB)
– Can handle the faults in software instead of hardware
– Page fault causes a context switch
– Using write-through is too expensive, so we use write-back
Optimal Page Size
 Minimize wasted storage
– A small page minimizes internal fragmentation
– A small page increases the size of the page table
 Minimize transfer time
– Large pages (multiple disk sectors) amortize access cost
– Sometimes transfer unnecessary info
– Sometimes prefetch useful data
– Sometimes discard useless data early
 General trend toward larger pages because
– Big cheap RAM
– Increasing memory / disk performance gap
– Larger address spaces
Translation Lookaside Buffer (TLB)
 Page table resides in memory
  each translation requires a memory access
 TLB
– Cache recently used PTEs
– Speed up translation
– Typically 128 to 256 entries
– Usually 4 to 8 way associative
– TLB access time is comparable to L1 cache access time
[Figure: TLB access flow – on a TLB hit the physical address is produced directly; on a TLB miss the page table is accessed. A sketch follows below]
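A minimal C sketch of the flow above: consult the TLB first, and only on a miss walk the page table in memory and refill the TLB (the fully associative search, the refill policy, and all names are simplifying assumptions):

#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT  12u
#define PAGE_MASK   ((1u << PAGE_SHIFT) - 1u)
#define TLB_ENTRIES 128u                            /* e.g., 128-256 entries */

typedef struct { bool valid; uint32_t vpn, frame; } tlb_entry_t;

static tlb_entry_t tlb[TLB_ENTRIES];

extern uint32_t page_table_walk(uint32_t vpn);      /* slow path: memory access */

uint32_t translate_with_tlb(uint32_t va)
{
    uint32_t vpn = va >> PAGE_SHIFT;

    for (unsigned i = 0; i < TLB_ENTRIES; i++)      /* TLB hit? */
        if (tlb[i].valid && tlb[i].vpn == vpn)
            return (tlb[i].frame << PAGE_SHIFT) | (va & PAGE_MASK);

    uint32_t frame = page_table_walk(vpn);          /* TLB miss: access page table */
    tlb[vpn % TLB_ENTRIES] =                        /* refill (crude placement)    */
        (tlb_entry_t){ .valid = true, .vpn = vpn, .frame = frame };
    return (frame << PAGE_SHIFT) | (va & PAGE_MASK);
}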
Making Address Translation Fast
TLB is a cache for recent address translations:
[Figure: each TLB entry holds a valid bit, a tag (virtual page number), and a physical page number; the page table behind it holds a valid bit and either a physical page or a disk address, pointing into physical memory or to disk]
TLB Access
[Figure: 4-way set-associative TLB lookup – the virtual page number is split into a tag and a set field (plus the page offset); the set field selects a TLB set, the tag is compared against the tag stored in each of the 4 ways, a way mux selects the matching entry, and the output is the PTE plus a hit/miss indication. A sketch follows below]
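A C sketch of the 4-way set-associative lookup in the figure: the low bits of the virtual page number select a set, the remaining bits form the tag compared in each way, and the matching way supplies the PTE (the table dimensions and names are assumptions):

#include <stdint.h>
#include <stdbool.h>

#define TLB_SETS 16u                        /* assumed: 64 entries, 4 ways */
#define TLB_WAYS 4u
#define SET_BITS 4u                         /* log2(TLB_SETS) */

typedef struct { bool valid; uint32_t tag; uint32_t pte; } tlb_entry_t;

static tlb_entry_t tlb[TLB_SETS][TLB_WAYS];

bool tlb_lookup(uint32_t vpn, uint32_t *pte_out)
{
    uint32_t set = vpn & (TLB_SETS - 1u);   /* set field of the VPN */
    uint32_t tag = vpn >> SET_BITS;         /* tag field of the VPN */

    for (unsigned way = 0; way < TLB_WAYS; way++) {
        if (tlb[set][way].valid && tlb[set][way].tag == tag) {
            *pte_out = tlb[set][way].pte;   /* the way mux selects this entry */
            return true;                    /* hit */
        }
    }
    return false;                           /* miss */
}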
Virtual Memory And Cache
[Figure: the virtual address goes to the TLB; on a TLB miss the page table is accessed; the resulting physical address accesses the cache; on a cache miss, memory is accessed; the data is returned on a hit]
TLB access is serial with cache access
More On Page Swap-out
 DMA copies the page to the disk controller
– Reads each byte:
 Executes a snoop-invalidate for each byte in the cache (both L1 and L2)
 If the byte resides in the cache:
 if it is modified, reads its line from the cache into memory
 invalidates the line
– Writes the byte to the disk controller
– This means that when a page is swapped out of memory:
 All data in the caches which belongs to that page is invalidated
 The page on the disk is up-to-date
 The TLB is snooped
– If the TLB hits for the swapped-out page, the TLB entry is invalidated
 In the page table
– The valid bit in the PTE of the swapped-out page is set to 0
– All the rest of the bits in the PTE may be used by the operating system for keeping the location of the page on the disk
Overlapped TLB & Cache Access
Virtual Memory view of a Physical Address: Page Number [29:12] | Page offset [11:0]
Cache view of a Physical Address: tag [29:14] | set [13:6] | disp [5:0]
The #Set is not contained within the Page Offset
 The #Set is not known until the physical page number is known
 The cache can be accessed only after address translation is done
Overlapped TLB & Cache Access (cont)
Virtual Memory view of a Physical Address: Page Number [29:12] | Page offset [11:0]
Cache view of a Physical Address: tag [29:12] | set [11:6] | disp [5:0]
In the above example the #Set is contained within the Page Offset
 The #Set is known immediately
 The cache can be accessed in parallel with address translation
 Once translation is done, match the upper bits with the tags
Limitation: Cache size ≤ (page size × associativity) (see the check below)
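A small C check of that limitation with the numbers used on the next slide (4 KB pages, 2-way associativity, 32 KB cache); the numbers are only an example:

#include <stdio.h>

int main(void)
{
    unsigned page_size = 4u * 1024u;        /* 4 KB pages              */
    unsigned assoc     = 2u;                /* 2-way set-associative   */
    unsigned cache     = 32u * 1024u;       /* 32 KB cache             */

    /* The set index stays inside the page offset only when
     * cache size <= page size * associativity.               */
    unsigned limit = page_size * assoc;     /* 8 KB in this case       */
    printf("fully overlapped access possible: %s (cache=%u B, limit=%u B)\n",
           cache <= limit ? "yes" : "no", cache, limit);
    return 0;
}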
Overlapped TLB & Cache Access (cont)
 Assume 4K bytes per page  bits [11:0] are not translated
 Assume the cache is 32K bytes, 2-way set-associative, 64 bytes/line
– (2^15 / 2 ways) / (2^6 bytes/line) = 2^(15-1-6) = 2^8 = 256 sets
Address fields: Page Number [29:12] | Page offset [11:0]; cache fields: tag | set [13:6] | disp [5:0]
 Physical_addr[13:12] may be different from virtual_addr[13:12]
– The tag is comprised of bits [31:12] of the physical address
 The tag may mismatch on bits [13:12] of the physical address
– Cache miss  allocate the missing line according to its virtual set address and physical tag
Context Switch
 Each process has its own address space
– Each process has its own page table
– The OS allocates physical memory frames to each process and updates each process's page table
– A process cannot access physical memory allocated to another process
 Unless the OS deliberately allocates the same physical frame to 2 processes (for memory sharing)
 On a context switch (see the sketch below)
– Save the current architectural state to memory
 Architectural registers
 The register that holds the page table base address in memory
– Flush the TLB
– Load the new architectural state from memory
 Architectural registers
 The register that holds the page table base address in memory
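A hypothetical C sketch of that sequence; the register save/restore helpers, the page-table-base accessors, and flush_tlb() are stand-ins for privileged hardware operations, not a real API:

#include <stdint.h>

typedef struct {
    uint32_t regs[16];                          /* architectural registers */
    uint32_t page_table_base;                   /* page table base address */
} context_t;

extern void save_registers(uint32_t *regs);
extern void load_registers(const uint32_t *regs);
extern uint32_t read_pt_base(void);
extern void write_pt_base(uint32_t base);
extern void flush_tlb(void);                    /* old translations become stale */

void context_switch(context_t *prev, const context_t *next)
{
    save_registers(prev->regs);                 /* save current architectural state      */
    prev->page_table_base = read_pt_base();

    flush_tlb();                                /* TLB entries belong to the old process */

    write_pt_base(next->page_table_base);       /* load the new architectural state      */
    load_registers(next->regs);
}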
VM in VAX: Address Format
Virtual Address: space select [31:30] | Virtual Page Number [29:9] | Page offset [8:0]
Space select bits [31:30]:
– 00 - P0 process space (code and data)
– 01 - P1 process space (stack)
– 10 - S0 system space
– 11 - S1
Physical Address: Physical Frame Number [29:9] | Page offset [8:0]
Page size: 2^9 bytes = 512 bytes
VM in VAX: Virtual Address Spaces
[Figure: per-process virtual address spaces (Process0..Process3). Addresses 0 to 7FFFFFFF hold the per-process P0 space (process code & global vars, grows upward) and P1 space (process stack & local vars, grows downward); addresses from 80000000 hold the S0 system space, shared by all processes (grows upward, generally static)]
Page Table Entry (PTE)
PTE format (bits 31:0): V | PROT | M | Z | OWN | S S | Physical Frame Number [20:0]
– V: Valid bit = 1 if the page is mapped to main memory, otherwise page fault:
• Page is in the disk swap area
• The address indicates the page location on the disk
– PROT: 4 protection bits
– M: Modified bit
– Z: indicates if the line was cleaned (zeroed)
– OWN: 3 ownership bits
System Space Address Translation
Virtual address: 10 | VPN [29:9] | Page offset [8:0]
 PTE physical address = SBR (system page table base physical address) + VPN×4
 Get the PTE; take the PFN from it
Physical address: PFN [29:9] | Page offset [8:0] (see the sketch below)
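A C sketch of this step, assuming the PFN sits in the low bits of the PTE (read_phys32 and the field masks are illustrative assumptions):

#include <stdint.h>

#define VAX_PAGE_SHIFT 9u                          /* 512-byte pages */
#define VAX_PAGE_MASK  ((1u << VAX_PAGE_SHIFT) - 1u)

extern uint32_t SBR;                               /* system page table base (physical)   */
extern uint32_t read_phys32(uint32_t pa);          /* stand-in for a physical memory read */

/* Translate an S0 (system space) virtual address. */
uint32_t s0_translate(uint32_t va)
{
    uint32_t vpn    = (va >> VAX_PAGE_SHIFT) & 0x1FFFFFu;   /* bits [29:9] */
    uint32_t offset =  va & VAX_PAGE_MASK;                  /* bits [8:0]  */

    uint32_t pte_pa = SBR + 4u * vpn;              /* PTE physical address       */
    uint32_t pte    = read_phys32(pte_pa);
    uint32_t pfn    = pte & 0x1FFFFFu;             /* PFN assumed in bits [20:0] */

    return (pfn << VAX_PAGE_SHIFT) | offset;
}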
System Space Address Translation
[Figure: system space translation – VPN×4 is added to SBR to locate the PTE; the PFN read from the PTE is concatenated with the page offset to form the physical address]
P0 Space Address Translation
Virtual address: 00 | VPN [29:9] | Page offset [8:0]
 PTE S0-space virtual address = P0BR (P0 page table base virtual address) + VPN×4
 Get the PTE using the system space translation algorithm; take the PFN from it
Physical address: PFN [29:9] | Page offset [8:0] (see the sketch below)
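A C sketch of the two-step P0 translation: compute the PTE's S0 virtual address (P0BR + 4×VPN), translate it with the system space algorithm, then read the PTE (it reuses the illustrative s0_translate and read_phys32 stand-ins from the previous sketch):

#include <stdint.h>

#define VAX_PAGE_SHIFT 9u
#define VAX_PAGE_MASK  ((1u << VAX_PAGE_SHIFT) - 1u)

extern uint32_t P0BR;                               /* P0 page table base (an S0 virtual address) */
extern uint32_t s0_translate(uint32_t s0_va);       /* system space translation (see above)       */
extern uint32_t read_phys32(uint32_t pa);

/* Translate a P0 (process space) virtual address. */
uint32_t p0_translate(uint32_t va)
{
    uint32_t vpn    = (va >> VAX_PAGE_SHIFT) & 0x1FFFFFu;   /* bits [29:9] */
    uint32_t offset =  va & VAX_PAGE_MASK;                  /* bits [8:0]  */

    uint32_t pte_s0_va = P0BR + 4u * vpn;           /* PTE's virtual address in S0          */
    uint32_t pte_pa    = s0_translate(pte_s0_va);   /* resolve it via the system page table */
    uint32_t pte       = read_phys32(pte_pa);
    uint32_t pfn       = pte & 0x1FFFFFu;           /* PFN assumed in bits [20:0]           */

    return (pfn << VAX_PAGE_SHIFT) | offset;
}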
P0 Space Address Translation (cont)
[Figure: P0 translation in two steps – P0BR + VPN×4 yields an S0 virtual address (VPN' | Offset'); VPN'×4 added to SBR locates PFN', giving the physical address of the PTE; the PFN read from that PTE is concatenated with the original Offset to form the final physical address]
P0 space Address translation Using TLB
[Figure: translation flow for a P0 virtual address (00 | VPN | offset) – access the process TLB; on a hit, get the PTE of the requested page from the process TLB and use its PFN. On a miss, calculate the PTE virtual address in S0 (P0BR + 4×VPN) and access the system TLB; on a system TLB hit, get the PTE from the system TLB, otherwise access the system page table at SBR + 4×VPN(PTE). Then get the PTE of the requested page from the process page table, calculate the physical address, and access memory]
Paging in x86
 2-level hierarchical mapping
– Page directory and page tables
– All pages and page tables are 4K
 The linear address is divided into:
– Dir: 10 bits [31:22]
– Table: 10 bits [21:12]
– Offset: 12 bits [11:0]
– Dir/Table serve as indexes into the page directory / a page table
– Offset serves as a pointer into a data page
 A page entry points to a page table or a page
[Figure: CR3 points to the Page Directory; the DIR field selects a 4K Dir Entry, which points to a Page Table; the TABLE field selects a PG Tbl Entry, which points to a 4K Page Frame; the OFFSET selects the operand within the frame]
 Performance issues: TLB (see the page-walk sketch below)
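A C sketch of the 10/10/12 walk described above; CR3 and read_phys32 are stand-ins, and fault handling is only hinted at in the comments:

#include <stdint.h>

#define PG_PRESENT 0x1u

extern uint32_t CR3;                            /* physical base of the page directory */
extern uint32_t read_phys32(uint32_t pa);       /* stand-in for a memory access        */

/* Walk the 2-level x86 structures for a 4K page. */
uint32_t x86_translate(uint32_t lin)
{
    uint32_t dir    = (lin >> 22) & 0x3FFu;     /* bits [31:22] index the page directory */
    uint32_t table  = (lin >> 12) & 0x3FFu;     /* bits [21:12] index the page table     */
    uint32_t offset =  lin        & 0xFFFu;     /* bits [11:0]  offset into the 4K page  */

    uint32_t pde = read_phys32((CR3 & ~0xFFFu) + dir * 4u);
    /* a real walk raises a page fault if (pde & PG_PRESENT) == 0 */

    uint32_t pte = read_phys32((pde & ~0xFFFu) + table * 4u);
    /* likewise check (pte & PG_PRESENT) before using the frame address */

    return (pte & ~0xFFFu) | offset;            /* page frame address [31:12] + offset */
}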
x86 Page Translation Mechanism
 CR3 points to the current page directory (may be changed per process)
 Usually, a page directory entry (covers 4MB) points to a page table that covers data of the same type/usage
 Can allocate different physical pages for the same linear address (e.g., 2 copies of the same code)
 Sharing can alias pages from different processes to the same physical page (e.g., OS)
[Figure: CR3 points to the Page Dir; different page tables map the Code, Data, OS, and Stack regions of the linear address space onto 4K pages in physical memory]
x86 Page Entry Format
 20-bit pointer to a 4K-aligned address
 12 bits of flags
 Virtual memory
– Present
– Accessed, Dirty
 Protection
– Writable (R#/W)
– User (U/S#)
– 2 levels/types only
 Caching
– Page Write-Through
– Page Cache Disable
 3 bits for OS usage
Page Directory Entry: Page Frame Address [31:12] | AVAIL [11:9] | 0 0 0 | A | PCD | PWT | U | W | P
Page Table Entry: Page Frame Address [31:12] | AVAIL [11:9] | 0 0 | D | A | PCD | PWT | U | W | P
Flag legend: P = Present, W = Writable, U = User, PWT = Write-Through, PCD = Cache Disable, A = Accessed, D = Dirty, Page Size (0: 4 Kbyte, page directory entry), AVAIL = Available for OS Use; bits shown as 0 are reserved by Intel for future use (should be zero)
Figure 11-14. Format of Page Directory and Page Table Entries for 4K Pages
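The flag positions above, written out as C bit masks and small helpers (a direct transcription of the figure; the macro and helper names are my own):

#include <stdint.h>
#include <stdbool.h>

/* Flag bits of a 4K page directory / page table entry */
#define PG_PRESENT   (1u << 0)   /* P:   present in memory                  */
#define PG_WRITABLE  (1u << 1)   /* W:   writable (R#/W)                    */
#define PG_USER      (1u << 2)   /* U:   user-accessible (U/S#)             */
#define PG_PWT       (1u << 3)   /* WT:  page write-through                 */
#define PG_PCD       (1u << 4)   /* CD:  page cache disable                 */
#define PG_ACCESSED  (1u << 5)   /* A:   set by the CPU on access           */
#define PG_DIRTY     (1u << 6)   /* D:   set by the CPU on write (PTE only) */
#define PG_AVAIL_SHIFT 9u        /* bits [11:9] available for OS use        */

static inline uint32_t pg_frame(uint32_t entry)   { return entry & 0xFFFFF000u; } /* bits [31:12] */
static inline bool     pg_present(uint32_t entry) { return (entry & PG_PRESENT) != 0; }
static inline bool     pg_dirty(uint32_t entry)   { return (entry & PG_DIRTY)   != 0; }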
x86 Paging – Virtual memory
 A page can be
– Not yet loaded
– Loaded
– On disk
 A loaded page can be
– Dirty
– Clean
 When a page is not loaded (P bit clear) => a page fault occurs
– It may require throwing out a loaded page to make room for the new one
 The OS prioritizes which page to throw out by LRU and by the dirty/clean/avail bits
 A dirty page must be written to disk; a clean one need not be
– The new page is either loaded from disk or “initialized”
– The CPU will set the page “access” flag when the page is accessed, and the “dirty” flag when it is written
Virtually-Addressed Cache
 Cache uses virtual addresses (tags are virtual)
[Figure: the CPU sends the VA directly to the cache; a hit returns data, and only the miss path goes through Translation to a PA and on to Main Memory]
 Only requires address translation on a cache miss
– The TLB is not in the path to a cache hit
 Aliasing: 2 different virtual addresses mapped to the same physical address
– Two different cache entries holding data for the same physical address
– Must update all cache entries with the same physical address
Virtually-Addressed Cache (cont).
 Cache must be flushed at task switch
– Solution: include process ID (PID) in tag
 How to share memory among processes
– Permit multiple virtual pages to refer to same physical frame
 Problem: incoherence if they point to different physical pages
– Solution: require sufficiently many common virtual LSBs
– With a direct-mapped cache, it is guaranteed that they all point to the same physical page
Backup
Inverted Page Tables
IBM System 38 (AS400) implements 64-bit addresses:
– 48 bits translated
– the start of an object contains a 12-bit tag
[Figure: the virtual page number is hashed to index the inverted page table; each entry holds a (V.Page, P.Frame) pair, and the stored V.Page is compared against the requested page]
=> TLBs or virtually addressed caches are critical (a lookup sketch follows below)
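A C sketch of the hashed lookup in the figure: hash the virtual page to pick a slot, compare the stored V.Page against the request, and return the P.Frame on a match (the table size, the hash function, and all names are assumptions; a real design chains or probes on collisions):

#include <stdint.h>
#include <stdbool.h>

#define IPT_SLOTS 4096u                         /* roughly one entry per physical frame (assumed) */

typedef struct {
    bool     valid;
    uint64_t vpage;                             /* V.Page stored in the entry */
    uint32_t pframe;                            /* P.Frame it maps to         */
} ipt_entry_t;

static ipt_entry_t ipt[IPT_SLOTS];

/* Illustrative multiplicative hash of the virtual page number. */
static uint32_t hash_vpage(uint64_t vpage)
{
    return (uint32_t)((vpage * 0x9E3779B97F4A7C15ull) >> 52) & (IPT_SLOTS - 1u);
}

bool ipt_lookup(uint64_t vpage, uint32_t *pframe)
{
    ipt_entry_t e = ipt[hash_vpage(vpage)];

    if (e.valid && e.vpage == vpage) {          /* stored V.Page matches the request */
        *pframe = e.pframe;
        return true;
    }
    return false;                               /* miss: chain/probe in a real table */
}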
Hardware / Software Boundary
 What aspects of the Virtual → Physical translation are determined in hardware?
 TLB Format
 Type of Page Table
 Page Table Entry Format
 Disk Placement
 Paging Policy
Why virtual memory?
 Generality
– ability to run programs larger than size of physical memory
 Storage management
– allocation/deallocation of variable sized blocks is costly and leads to
(external) fragmentation
 Protection
– regions of the address space can be R/O, Ex, . . .
 Flexibility
– portions of a program can be placed anywhere, without relocation
 Storage efficiency
– retain only most important portions of the program in memory
 Concurrent I/O
– execute other processes while loading/dumping page
 Expandability
– can leave room in virtual address space for objects to grow.
 Performance
Address Translation with a TLB
[Figure: the virtual address (virtual page number in bits n–1:p, page offset in bits p–1:0) indexes the TLB; each TLB entry holds valid, tag, and physical page number fields, and a tag match signals a TLB hit and produces the physical address; the physical address (tag, index, byte offset) then indexes the Cache, whose valid/tag comparison signals a cache hit and returns the data]