Transcript 22 - Pages

Virtual Memory
1
Outline
• Virtual Space
• Address translation
• Accelerating translation
– with a TLB
– Multilevel page tables
• Different points of view
• Suggested reading: 10.1~10.6
TLB: Translation lookaside buffers
2
10.1 Physical and Virtual Addressing
3
Physical Addressing
• Attributes of the main memory
– Organized as an array of M contiguous byte-sized
cells
– Each byte has a unique physical address (PA),
starting from 0
• Physical addressing
– The CPU uses physical addresses to access memory
• Examples
– Early PCs, DSPs, embedded microcontrollers, and
Cray supercomputers
Contiguous: adjacent, one after another
4
Physical Addressing
Figure 10.1 P693
5
Virtual Addressing
• Virtual addressing
– The CPU accesses main memory using a virtual address
(VA)
• The virtual address is converted to the appropriate
physical address
6
Virtual Addressing
• Address translation
– Converting a virtual address to a physical one
– requires close cooperation between the CPU
hardware and the operating system
• the memory management unit (MMU)
– Dedicated hardware on the CPU chip to translate
virtual addresses on the fly
• A look-up table (the page table)
– Stored in main memory
– Contents are managed by the operating system
7
Figure 10.2 P694
8
10.2 Address Space
9
Address Space
• Address Space
– An ordered set of nonnegative integer addresses
• Linear address space
– The integers in the address space are consecutive
• n-bit address space: an address space with N = 2^n addresses
10
Address Space
• K=210(Kilo), M=220(Mega), G=230(Giga),
T=240(Tera), P=250(Peta), E=260(Exa)
#virtual address
bits (n)
8
16
32
48
64
#virtual address
(N)
256
64K
4G
256T
16E
Practice Problem 10.1 P695
Largest possible
virtual address
255
64K-1
4G-1
256T-1
16E-1
11
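The table above follows directly from N = 2^n. A minimal C sketch (not part of the original slides) that recomputes the largest virtual address for each n in the table; the n = 64 case is handled separately so the shift stays within the width of unsigned long long:

```c
#include <stdio.h>

/* For an n-bit address space, N = 2^n addresses and the largest
 * address is N - 1 (matches the table above for n = 8..64). */
int main(void) {
    int bits[] = {8, 16, 32, 48, 64};
    for (int i = 0; i < 5; i++) {
        int n = bits[i];
        /* For n = 64, 1ULL << 64 is undefined, so use ~0ULL instead. */
        unsigned long long max_addr =
            (n == 64) ? ~0ULL : ((1ULL << n) - 1);
        printf("n = %2d  largest virtual address = %llu\n", n, max_addr);
    }
    return 0;
}
```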
Address Space
• Data objects and their attributes
– Bytes vs. addresses
• Each data object can have multiple
independent addresses
12
10.3 VM as a Tool for Caching
13
Using Main Memory as a Cache P695
[Hierarchy: SRAM caches DRAM; DRAM (main memory) caches virtual pages stored on disk.]
14
10.3.1 DRAM Cache Organization
15
Using Main Memory as a Cache
• The DRAM-to-disk gap is more extreme than the
SRAM-to-DRAM gap
– Access latencies:
• DRAM is ~10X slower than SRAM
• Disk is ~100,000X slower than DRAM
– Bottom line:
• Design decisions for DRAM caches are driven by the
enormous cost of misses
16
Design Considerations
• Line size?
– Large, since disk is better at transferring large
blocks
• Associativity?
– High, to minimize the miss rate
• Write-through or write-back?
– Write-back, since we can't afford to perform small
writes to disk
17
10.3.2 Page Tables
18
Page
• Virtual memory
– Conceptually organized as an array of contiguous
byte-sized cells stored on disk
– Each byte has a unique virtual address that serves
as an index into the array
– The contents of the array on disk are cached in
main memory
19
Page P695
• The data is partitioned into fixed-size blocks
– They serve as the transfer units between the disk and
main memory
– Virtual memory is partitioned into virtual pages (VPs)
– Physical memory is partitioned into physical pages (PPs)
• Also referred to as page frames
20
Page Attributes P695
• 1) Unallocated:
– Pages that have not yet been allocated (or created)
by the VM system
– Do not have any data associated with them
– Do not occupy any space on disk.
21
Page Attributes
• 2) Cached:
– Allocated pages that are currently cached in
physical memory.
• 3) Uncached:
– Allocated pages that are not cached in physical
memory.
22
Page
Figure 10.3 P696
23
Page Table
• Each allocated page of virtual memory has an
entry in the page table
• Mapping from virtual pages to physical pages
– From uncached form to cached form
• A page table entry exists even if the page is not in memory
– It specifies the page's disk address
• The OS retrieves the page when it is needed
24
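To make the mapping concrete, here is a hypothetical page-table entry layout in C. The field names (valid, ppn, disk_addr) are illustrative assumptions, not the book's definitions; real PTE formats are hardware-specific:

```c
#include <stdint.h>

/* Hypothetical page-table entry, for illustration only: one entry per
 * allocated virtual page, managed by the OS and read by the MMU. */
typedef struct {
    unsigned valid : 1;   /* 1 = page currently cached in DRAM          */
    uint64_t ppn;         /* physical page number, meaningful if valid  */
    uint64_t disk_addr;   /* where the page lives on disk if not cached */
} pte_t;
```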
Page Table
[Diagram: locating data objects through a lookup table. The page table
maps object names to locations (e.g., D: 0, J: on disk, X: 1); the
"cache" (main memory) holds the data values (0: 243, 1: 17, ..., N-1: 105).]
25
Page Table
[Diagram: a memory-resident page table indexed by virtual page number.
Each entry holds a valid bit and either a physical page number or a disk
address; valid entries point into physical memory, invalid ones point to
pages in disk storage (a swap file or a regular file-system file).]
26
10.3.3 Page Hits
27
Page Hits
Figure 10.5 P698
[Diagram: the CPU issues virtual addresses; the page table maps them to
physical addresses in main memory, while uncached pages remain on disk.]
Address Translation: Hardware converts virtual addresses to
physical addresses via an OS-managed lookup table (page table)
28
10.3.4 Page Faults
29
Page Faults
• The page table entry indicates that the virtual
page is not in memory
• The OS exception handler is invoked to move data
from disk into memory
– the current process is suspended; others can run
– the OS has full control over placement, etc.
Suspend: to pause, put on hold
30
Page Faults
• Swapping or paging
• Swapped out or paged out (from DRAM to Disk)
• Demand paging (Waiting until the last moment to swap
in a page, when a miss occurs)
[Diagram, Figure 10.6 P699: before the fault, the CPU's virtual address
refers to a page-table entry whose page is on disk; after the OS pages
it in, the same virtual address maps to a physical page in memory.]
31
Figure 10.7 P699
Servicing a Page Fault
• Processor signals the disk controller
– Read a block of length P starting at disk
address X and store it starting at memory
address Y
(1) Initiate block read
[Diagram: processor with registers and cache, memory-I/O bus, memory,
I/O controller, and disks.]
32
Servicing a Page Fault
• Read Occurs
– Direct Memory
Access (DMA)
– Under control
of I/O
controller
(2) DMA transfer
[Diagram: the disk transfers the block directly into memory over the
memory-I/O bus, without involving the processor.]
33
Servicing a Page Fault
• I / O Controller
Signals
Completion
– Interrupt
processor
– OS resumes
suspended
process
Resumes: continues again, restarts
(3) Read done
[Diagram: the I/O controller interrupts the processor; the OS then
resumes the suspended process.]
34
10.3.5 Allocating Pages
35
Allocating Pages P700
• The operating system allocates a new page of
virtual memory, for example, as a result of
calling malloc.
Figure 10.8 P700
36
10.3.6 Locality to the Rescue Again
Rescue: to save, to deliver from trouble
37
Locality P700
• The principle of locality promises that at any
point in time programs will tend to work on a
smaller set of active pages, known as the working
set or resident set.
• After an initial overhead where the working set is
paged into memory, subsequent references to
the working set result in hits, with no
additional disk traffic.
38
Locality-2 P700
• If the working set size exceeds the size of
physical memory, then the program can
produce an unfortunate situation known as
thrashing, where the pages are swapped in and
out continuously.
Thrash: to whip about; here, pages are swapped in and out continuously
39
10.4 VM as a Tool for Memory Management
40
A Tool for Memory Management
• Separate virtual address space
– Each process has its own virtual address space
• Simplify linking, sharing, loading, and memory
allocation
41
10.4.1 Simplifying Linking
42
A Tool for Memory Management
[Diagram, Figure 10.9 P701: each process has its own virtual address
space (VP 1, VP 2, ... at addresses 0 to N-1 or M-1). Address
translation maps pages from different processes onto physical pages in
DRAM (e.g., PP 2, PP 7, PP 10); a shared page such as read-only library
code can be mapped into both address spaces.]
43
A Tool for Memory Management
Linux/x86 process memory image (Figure 10.10 P702), from high to low addresses:
0xc0000000  kernel virtual memory (memory invisible to user code)
0xbfffffff  user stack (%esp), growing downward
0x40000000  memory-mapped region for shared libraries
            runtime heap (via malloc), growing up toward the "brk" ptr
            uninitialized data (.bss)
            initialized data (.data)
0x08048000  program text (.text)
            forbidden region below the text segment
44
10.4.2 Simplifying Sharing
45
Simplifying Sharing
• In some instances, it is desirable for processes to
share code and data.
– The same operating system kernel code
– Make calls to routines in the standard C library
• The operating system can arrange for multiple processes
to share a single copy of this code by mapping the
appropriate virtual pages in different processes to
the same physical pages
46
10.4.3 Simplifying Memory Allocation
47
Simplifying Memory Allocation
• A simple mechanism for allocating additional
memory to user processes
• The page table does the work: contiguous virtual pages are
mapped to arbitrary physical pages, which need not be contiguous
48
10.4.4 Simplifying Loading
49
Simplifying Loading
• Load executable and shared object files into
memory
• Memory mapping: the loader maps the file's pages into the
virtual address space (the Unix mmap system call) rather than
copying them into memory
50
10.5 VM as a Tool for Memory Protection
51
A Tool for Memory Protection
• Page table entry contains access rights
information
– hardware enforces this protection (trap into OS if
violation occurs)
52
A Tool for Memory Protection
Page tables with per-page permission bits

Process i:   Read?  Write?  Physical Addr
  VP 0:      Yes    No      PP 9
  VP 1:      Yes    Yes     PP 4
  VP 2:      No     No      XXXXXXX
  ...

Process j:   Read?  Write?  Physical Addr
  VP 0:      Yes    Yes     PP 6
  VP 1:      Yes    No      PP 9
  VP 2:      No     No      XXXXXXX
  ...

[Physical memory (locations 0 ... N-1) holds the mapped physical pages.]
53
A Tool for Memory Protection
Figure 10.11 P704
54
10.6 Address Translation
55
Address Translation
[Diagram: the processor issues a virtual address a; the on-chip memory
management unit (MMU), the hardware address-translation mechanism,
produces the physical address a' used to access main memory. On a page
fault, the fault handler is invoked and the OS performs the transfer
from secondary memory (only on a miss).]
56
Address Translation Figure 10.12 P705
• Parameters
– P = 2^p = page size (bytes)
– N = 2^n = virtual address limit (number of virtual addresses)
– M = 2^m = physical address limit (number of physical addresses)
57
Address Translation P705
Virtual address (n bits): [ virtual page number (bits n-1 ... p) | page offset (bits p-1 ... 0) ]
        -- address translation -->
Physical address (m bits): [ physical page number (bits m-1 ... p) | page offset (bits p-1 ... 0) ]
Notice that the page offset bits don't change as a result of translation.
58
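Because the offset bits pass through unchanged, splitting a virtual address only needs a shift and a mask. A minimal C sketch, assuming a page size of P = 2^p with p = 12 (4 KB) chosen purely for illustration:

```c
#include <stdint.h>

/* Split a virtual address into VPN and VPO for a page size of P = 2^p
 * bytes; the VPO is copied unchanged into the physical address (PPO). */
#define PAGE_SHIFT 12                      /* example: p = 12, 4 KB pages */
#define PAGE_MASK  ((1ULL << PAGE_SHIFT) - 1)

static inline uint64_t vpn_of(uint64_t va) { return va >> PAGE_SHIFT; }
static inline uint64_t vpo_of(uint64_t va) { return va & PAGE_MASK; }

/* Rebuild a physical address from a PPN and the unchanged offset. */
static inline uint64_t make_pa(uint64_t ppn, uint64_t offset) {
    return (ppn << PAGE_SHIFT) | offset;
}
```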
Address Translation via Page Table
[Diagram, Figure 10.13 P705: the page table base register (PTBR) points
to the page table in memory. The VPN from the virtual address acts as
the table index; each entry holds a valid bit, access bits, and a
physical page number (PPN). If valid = 0, the page is not in memory
(page fault); otherwise the PPN is concatenated with the unchanged page
offset to form the physical address.]
59
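Building on the two sketches above (pte_t, vpn_of, vpo_of, make_pa), here is a hedged sketch of the lookup Figure 10.13 describes: the VPN indexes the table pointed to by the PTBR, a clear valid bit means a page fault, and otherwise the PPN is recombined with the unchanged offset. handle_page_fault is a hypothetical stand-in for the OS handler, not a real API:

```c
#include <stdint.h>

/* Sketch of the translation step, not real MMU code. */
uint64_t handle_page_fault(uint64_t va);   /* hypothetical OS handler */

uint64_t translate(const pte_t *page_table /* pointed to by the PTBR */,
                   uint64_t va) {
    uint64_t vpn = vpn_of(va);
    const pte_t *pte = &page_table[vpn];   /* VPN acts as the table index */
    if (!pte->valid)
        return handle_page_fault(va);      /* valid = 0: page not in memory */
    return make_pa(pte->ppn, vpo_of(va));  /* PPN + unchanged page offset */
}
```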
10.6.4 Putting it Together: End-to-End
Address Translation
60
Simple Memory System Example
• Addressing
– 14-bit virtual addresses
– 12-bit physical addresses
– Page size = 64 bytes

Virtual address: bits 13 ... 6 = VPN (Virtual Page Number),
bits 5 ... 0 = VPO (Virtual Page Offset).
Physical address: bits 11 ... 6 = PPN (Physical Page Number),
bits 5 ... 0 = PPO (Physical Page Offset).

Figure 10.20 P712
61
Simple Memory System Page Table
• Only the first 16 entries are shown

VPN  PPN  Valid      VPN  PPN  Valid
00   28   1          08   13   1
01   –    0          09   17   1
02   33   1          0A   09   1
03   02   1          0B   –    0
04   –    0          0C   –    0
05   16   1          0D   2D   1
06   –    0          0E   11   1
07   –    0          0F   0D   1

Figure 10.21 P712 (b)
62
Address Translation Example P714
Virtual Address 0x03D4

VA bits (13 ... 0): 00 0011 11 | 01 0100
VPN: 0x0F    VPO: 0x14    Page Fault? No

PA bits (11 ... 0): 00 1101 | 01 0100
PPN: 0x0D    PPO: 0x14    PA: 0x354
63
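The same arithmetic can be checked in code. A small C sketch (illustrative only, not from the slides) that redoes the example above for the 64-byte-page system, using just the page-table entries it needs from Figure 10.21(b):

```c
#include <stdio.h>

/* Recompute the worked example for the 14-bit VA / 12-bit PA system
 * with 64-byte pages (p = 6), using the page table of Figure 10.21(b).
 * Only the entries needed here are filled in. */
int main(void) {
    int ppn_of_vpn[16] = { [0x0F] = 0x0D, [0x00] = 0x28 };  /* partial */
    unsigned va  = 0x03D4;
    unsigned vpn = va >> 6;          /* 0x0F */
    unsigned vpo = va & 0x3F;        /* 0x14 */
    unsigned pa  = (ppn_of_vpn[vpn] << 6) | vpo;
    printf("VPN=0x%02X VPO=0x%02X PA=0x%03X\n", vpn, vpo, pa);  /* 0x354 */
    return 0;
}
```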
Page Hits
Figure 10.14 (a) P706
VA: virtual address. PTEA: page table entry address.
PTE: page table entry. PA: physical address.
64
Page Faults
Figure 10.14 (b) P706
65
10.6.1 Integrating Caches and VM
66
Integrating Caches and VM
[Diagram: the CPU issues a VA; address translation produces a PA, which
is looked up in the cache. On a hit the cache supplies the data; on a
miss the request goes to main memory.]
67
Integrating Caches and VM
• Most Caches “Physically Addressed”
– Accessed by physical addresses
– Allows multiple processes to have blocks in cache at
same time
– Allows multiple processes to share pages
– Cache doesn’t need to be concerned with protection
issues
• Access rights checked as part of address
translation
68
Integrating Caches and VM
• Perform Address Translation Before Cache
Lookup
– But this could involve a memory access itself (of
the PTE)
– Of course, page table entries can also become
cached
69
Figure 10.15 P708
70
10.6.2 Speeding up Address Translation
with a TLB
71
Speeding up Translation with a TLB
• “Translation Lookaside Buffer” (TLB)
– Small hardware cache in MMU
– Maps virtual page numbers to physical page
numbers
72
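A hedged sketch of what such a lookup might look like for a small set-associative TLB; the structure and names (tlb_entry_t, tlb_lookup) are assumptions for illustration, not a real MMU interface:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical 4-way set-associative TLB, for illustration only.
 * The low bits of the VPN select the set (TLB index); the remaining
 * bits form the tag (TLB tag) compared against each way in the set. */
#define TLB_SETS 4
#define TLB_WAYS 4

typedef struct { bool valid; uint64_t tag; uint64_t ppn; } tlb_entry_t;

bool tlb_lookup(tlb_entry_t tlb[TLB_SETS][TLB_WAYS],
                uint64_t vpn, uint64_t *ppn_out) {
    uint64_t set = vpn % TLB_SETS;   /* TLB index */
    uint64_t tag = vpn / TLB_SETS;   /* TLB tag   */
    for (int way = 0; way < TLB_WAYS; way++) {
        if (tlb[set][way].valid && tlb[set][way].tag == tag) {
            *ppn_out = tlb[set][way].ppn;   /* TLB hit: no memory access */
            return true;
        }
    }
    return false;                    /* TLB miss: walk the page table */
}
```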
Figure 10.17 (a) P709
73
Speeding up Translation with a TLB
Figure 10.16 P708
74
Speeding up Translation with a TLB
[Diagram: the virtual address is split into a virtual page number and
page offset; the VPN is further split into a TLB tag and TLB index. On
a TLB hit, the matching entry supplies the physical page number, which
is combined with the unchanged page offset to form the physical address.
That physical address is then split into cache tag, index, and byte
offset for the cache lookup, which returns data on a cache hit.]
75
Simple Memory System TLB
• TLB
– 16 entries
– 4-way set associative (4 sets)
– TLBI = VPN bits 1 ... 0 (set index); TLBT = VPN bits 7 ... 2 (tag)

Set  Tag  PPN  Valid | Tag  PPN  Valid | Tag  PPN  Valid | Tag  PPN  Valid
0    03   –    0     | 09   0D   1     | 00   –    0     | 07   02   1
1    03   2D   1     | 02   –    0     | 04   –    0     | 0A   –    0
2    02   –    0     | 08   –    0     | 06   –    0     | 03   –    0
3    07   –    0     | 03   0D   1     | 0A   34   1     | 02   –    0

Figure 10.21 (a) P712
76
Address Translation Example P714
Virtual Address 0x03D4

VA bits (13 ... 0): 00 0011 11 | 01 0100
VPN: 0x0F   TLBI: 0x03   TLBT: 0x03   TLB Hit? Yes   Page Fault? No

PA bits (11 ... 0): 00 1101 | 01 0100
PPN: 0x0D   PPO: 0x14   PA: 0x354
77
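A tiny C check of the TLB fields used above: with 16 entries and 4 ways there are 4 sets, so the TLB index is the low 2 bits of the VPN and the tag is the remaining 6 bits (illustrative sketch, not from the slides):

```c
#include <stdio.h>

/* Recompute the TLB fields for the example above: a 16-entry, 4-way
 * TLB has 4 sets, so TLBI = VPN & 0x3 and TLBT = VPN >> 2. */
int main(void) {
    unsigned vpn  = 0x03D4 >> 6;     /* 0x0F */
    unsigned tlbi = vpn & 0x3;       /* 0x03: selects set 3 */
    unsigned tlbt = vpn >> 2;        /* 0x03: matches the tag in set 3 */
    printf("TLBI=0x%02X TLBT=0x%02X\n", tlbi, tlbt);
    return 0;
}
```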
Simple Memory System Cache
• Cache
– 16 lines
– 4-byte line size
– Direct mapped
78
Simple Memory System Cache
PA field breakdown: CT = PA bits 11 ... 6 (cache tag),
CI = PA bits 5 ... 2 (cache index), CO = PA bits 1 ... 0 (block offset)

Idx  Tag  Valid  B0 B1 B2 B3     Idx  Tag  Valid  B0 B1 B2 B3
0    19   1      99 11 23 11     8    24   1      3A 00 51 89
1    15   0      –  –  –  –      9    2D   0      –  –  –  –
2    1B   1      00 02 04 08     A    2D   1      93 15 DA 3B
3    36   0      –  –  –  –      B    0B   0      –  –  –  –
4    32   1      43 6D 8F 09     C    12   0      –  –  –  –
5    0D   1      36 72 F0 1D     D    16   1      04 96 34 15
6    31   0      –  –  –  –      E    13   1      83 77 1B D3
7    16   1      11 C2 DF 03     F    14   0      –  –  –  –

Figure 10.21 (c) P713
79
Address Translation Example P714
PA: 0x354

PA bits (11 ... 0): 00 1101 | 0101 | 00
CT: 0x0D   CI: 0x05   CO (offset): 0x0
Hit? Yes   Byte returned: 0x36
80
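The cache fields can be recomputed the same way. A small C sketch (illustrative only) for PA 0x354 with 16 lines and 4-byte blocks: 2 offset bits, 4 index bits, and 6 tag bits:

```c
#include <stdio.h>

/* Recompute the cache fields for PA 0x354 in the direct-mapped cache
 * above (16 lines, 4-byte blocks). */
int main(void) {
    unsigned pa = 0x354;
    unsigned co = pa & 0x3;          /* 0x0: byte offset within the block */
    unsigned ci = (pa >> 2) & 0xF;   /* 0x5: selects cache line 5 */
    unsigned ct = pa >> 6;           /* 0x0D: matches line 5's tag */
    printf("CO=0x%X CI=0x%X CT=0x%02X -> hit, byte 0x36\n", co, ci, ct);
    return 0;
}
```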
10.6.3 Multi Level Page Tables
81
Multi-Level Page Tables
• Given:
– 4 KB (2^12) page size
– 32-bit address space
– 4-byte PTE
• Problem:
– Would need a 4 MB page table!
• 2^20 PTEs × 4 bytes = 4 MB
82
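The 4 MB figure is just 2^(32-12) pages times 4 bytes per PTE. A one-off C check (not from the slides):

```c
#include <stdio.h>

/* Why a flat page table would need 4 MB: a 32-bit address space with
 * 4 KB pages has 2^20 virtual pages, and each PTE takes 4 bytes. */
int main(void) {
    unsigned long long pages = 1ULL << (32 - 12);        /* 2^20 pages  */
    unsigned long long bytes = pages * 4;                /* 4-byte PTEs */
    printf("%llu PTEs, %llu MB\n", pages, bytes >> 20);  /* 1048576, 4  */
    return 0;
}
```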
Multi-Level Page Tables
• Common solution
– multi-level page tables
– e.g., a 2-level table (P6)
• Level 1 table: 1024 entries,
each of which points to a
Level 2 page table
• Level 2 table: 1024 entries,
each of which points to a
page
[Diagram: a Level 1 table whose entries point to Level 2 tables.]
83
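With 4 KB pages, the 20-bit VPN splits into two 10-bit indices. A minimal C sketch of that split for the 2-level scheme described above; the helper names are illustrative assumptions:

```c
#include <stdint.h>

/* Index split for a 2-level table on a 32-bit virtual address with
 * 4 KB pages: VA = [10-bit L1 index][10-bit L2 index][12-bit offset]. */
static inline unsigned l1_index(uint32_t va)  { return va >> 22; }           /* bits 31..22 */
static inline unsigned l2_index(uint32_t va)  { return (va >> 12) & 0x3FF; } /* bits 21..12 */
static inline unsigned pg_offset(uint32_t va) { return va & 0xFFF; }         /* bits 11..0  */
```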
Multi-Level Page Tables
Figure 10.18 P710
84
Multi-Level Page Tables
Figure 10.19 P711
85