CHAPTER 9:
MEMORY MANAGEMENT
Background
Swapping
Contiguous Allocation
Paging
Segmentation
Segmentation with Paging
BACKGROUND
Sharing Management
CPU sharing
Memory sharing
Memory management
Actual memory management
Contiguous partition allocation
Paging
Segmentation
Virtual memory management
Demand paging
Demand segmentation
Background:
Logical vs. Physical Address Space
Program must be brought into memory and placed within
a process for it to be run.
Logical address and physical address
Logical address – generated by the CPU;
also referred to as virtual address.
Physical address – address seen by the
memory unit.
Logical and physical addresses are the same in compile-time
and load-time address-binding schemes; logical (virtual)
and physical addresses differ in the execution-time
address-binding scheme.
Background:
Memory-Management Unit (MMU)
What is seen is not always true.
Hardware device that maps virtual to physical
address.
In MMU scheme, the value in the relocation
register is added to every address generated by
a user process at the time it is sent to memory.
The user program deals with logical addresses;
it never sees the real physical addresses.
Background:
Dynamic relocation using a relocation register
Background:
Multistep Processing of a User Program
Background:
Binding of Instructions and Data to Memory
Address binding of instructions and data to
memory addresses can happen
at three different stages.
Compile time: If you know at compile time where the
process will reside in memory, then absolute code can
be generated. If the starting location changes, the code
must be recompiled.
Load time: If it is not known at compile time where
the process will reside in memory, then the compiler
must generate relocatable code.
Execution time: If the process can be moved during
its execution from one memory segment to another,
then binding must be delayed until run time. This
requires hardware support for address maps (e.g., base
and limit registers).
Background:
Dynamic Loading (vs. static loading)
Should the entire program and data of a process be in
physical memory for the process to execute?
Routine is not loaded until it is called.
Discussion
Better memory-space utilization; unused routine is
never loaded.
Useful when large amounts of code are needed to
handle infrequently occurring cases.
No special support from the operating system is
required.
The OS can help by providing library routines to
implement dynamic loading.
Background:
Dynamic Linking (vs. static linking)
Static linking vs. dynamic linking
Linking postponed until execution time.
Small piece of code, the stub, used to locate the
appropriate memory-resident library routine.
Stub replaces itself with the address of the routine, and
executes the routine.
The operating system is needed to check whether the
routine is in the process's memory address space.
Dynamic linking is particularly useful for libraries.
Background:
Overlays
Keep in memory only those instructions and
data that are needed at any given time.
Needed when process is larger than amount of
memory allocated to it.
Implemented by user, no special support
needed from operating system, programming
design of overlay structure is complex.
Background:
Overlays for a Two-Pass Assembler
SWAPPING
A process can be swapped temporarily out of memory
to a backing store, and then brought
back into memory for continued execution.
Backing store – fast disk large enough to
accommodate the copies of all memory images for all
users; must provide direct access to these memory
images.
Roll out, roll in – swapping variant used for priority-based
scheduling algorithms; a lower-priority process is
swapped out so a higher-priority process can be loaded
and executed.
Swapping: Schematic View
Swapping: Backing store
Swap file
Swap device
Swap device and swap file
Swapping: Performance
Major part of swap time is transfer time; total transfer
time is directly proportional to the amount of memory
swapped.
For example, 1000 KB / 5000 KB/s = 1/5 second = 200 ms.
Modified versions of swapping are found on many
systems, e.g., UNIX, Linux, and Windows.
UNIX
Windows 3.1
CONTIGUOUS MEMORY ALLOCATION
The main memory must accommodate
both the operating system
and the various user processes.
The main memory is usually divided into two
partitions:
One for the resident operating system, usually held
in low memory with interrupt vector,
The other for user processes, then held in high
memory.
For contiguous memory allocation, each process is
contained in a single contiguous section of memory.
Contiguous Memory Allocation:
Memory protection
Why to protect memory?
To protect the OS from user processes
And to protect user process from each other.
How to protect memory?
To use relocation registers and limit registers to
protect user processes from each other, and from
changing operating-system code and data.
The relocation register contains the value of
smallest physical address; the limit register
contains the range of logical addresses – each
logical address must be less than the limit register.
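A minimal sketch of this hardware check in Python (the register values in the example are illustrative, not from the text):

```python
def translate(logical, relocation, limit):
    """Dynamic relocation with protection.

    Every logical address must be less than the limit register;
    the relocation register is then added to form the physical address.
    """
    if logical >= limit:
        raise MemoryError("trap: addressing error beyond partition limit")
    return relocation + logical

# A process relocated to physical address 14000 in a 3000-byte partition:
print(translate(346, 14000, 3000))  # 14346
```

An out-of-range logical address (e.g., 3000 here) traps instead of reaching memory.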
Contiguous Memory Allocation:
Hardware Support for Relocation and Limit Registers
Contiguous Memory Allocation:
Types
Fixed-sized contiguous partition
Dynamic contiguous partition
Buddy system
Contiguous Memory Allocation:
Fixed-Sized Contiguous Partitions
Main memory is divided into a number of fixed
partitions at system generation time. A process may
be loaded into a partition of equal or greater size.
Equal-size partitions
Unequal-size partitions
Equal-size partitions:
A program may be too big to fit into a partition.
Main-memory utilization is extremely inefficient.
Contiguous Memory Allocation:
Fixed-Sized Contiguous Partitions
Unequal-size partitions:
Placement algorithms for unequal-size partitions
best-fit
first-fit
worst-fit
Conclusion for Fixed-Sized Contiguous Partitions
Strengths: Simple to implement; little overhead
Weaknesses: internal fragmentation; fixed number
of processes
Contiguous Memory Allocation:
Dynamic Contiguous Partitions
Partitions are created dynamically, so that each
process is loaded into a partition of exactly the same
size as that process.
The OS maintains information about:
a) allocated partitions b) free partitions (hole)
When a process arrives, find a hole for it
When a process terminates, return the memory and
merge the holes
Holes of various size are scattered throughout
memory.
Contiguous Memory Allocation:
Dynamic Contiguous Partitions
Figure: successive memory snapshots — the OS resides in low
memory while processes 5, 8, 9, 2, and 10 are allocated and
freed, leaving holes of various sizes.
Contiguous Memory Allocation:
Dynamic Contiguous Partitions
Placement algorithms: first-fit, best-fit, worst-fit.
First-fit: Allocate the first hole that is big enough.
Best-fit: Allocate the smallest hole that is big
enough; must search entire list, unless ordered by
size. Produces the smallest leftover hole.
Worst-fit: Allocate the largest hole; must also
search entire list. Produces the largest leftover hole.
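The three placement algorithms can be sketched as one function over a list of free holes (the hole sizes in the example are illustrative):

```python
def place(holes, request, strategy):
    """Choose a hole for a request of `request` KB; return its index or None.

    holes: current free-hole sizes in address order.
    strategy: 'first', 'best', or 'worst' fit.
    """
    fits = [(size, i) for i, size in enumerate(holes) if size >= request]
    if not fits:
        return None
    if strategy == "first":
        return min(fits, key=lambda f: f[1])[1]  # lowest address that fits
    if strategy == "best":
        return min(fits)[1]                      # smallest adequate hole
    return max(fits)[1]                          # 'worst': largest hole

holes = [100, 500, 200, 300, 600]
print(place(holes, 212, "first"))  # 1 (the 500 KB hole)
print(place(holes, 212, "best"))   # 3 (the 300 KB hole)
print(place(holes, 212, "worst"))  # 4 (the 600 KB hole)
```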
Replacement algorithms:
LRU, FIFO, etc.
Contiguous Memory Allocation:
Dynamic Contiguous Partitions
Dynamic Partitioning
Strengths:
no internal fragmentation;
more efficient use of main memory.
Weaknesses:
external fragmentation (a little internal fragmentation
is still possible);
compaction is needed.
Contiguous Memory Allocation:
Fragmentation
External Fragmentation – total memory space exists to
satisfy a request, but it is not contiguous.
Internal Fragmentation – allocated memory may be
slightly larger than requested memory; this size difference is
memory internal to a partition, but not being used.
Reduce external fragmentation by compaction
Shuffle memory contents to place all free memory
together in one large block.
Compaction is possible only if relocation is dynamic, and
is done at execution time.
I/O problem: batch job in memory while it is involved in
I/O. Do I/O only into OS buffers.
Contiguous Memory Allocation:
Buddy System
Contiguous Memory Allocation:
Buddy System
Contiguous Memory Allocation:
Buddy System
Easy to partition
Easy to combine
The buddy system is used for Linux kernel
memory allocation.
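The two core buddy-system operations — rounding a request up to a power-of-two block, and finding a block's buddy for coalescing — can be sketched as follows (a simplified model, not the Linux implementation):

```python
def buddy_size(request):
    """Round a request up to the next power-of-two block size."""
    return 1 << max(0, (request - 1).bit_length())

def buddy_of(addr, size):
    """A block's buddy differs only in the bit equal to the block size,
    which makes splitting and coalescing cheap (just an XOR)."""
    return addr ^ size

print(buddy_size(65))      # 128
print(buddy_of(0, 128))    # 128
print(buddy_of(256, 128))  # 384
```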
PAGING
Contiguous memory allocation → external fragmentation.
Noncontiguous memory allocation → no external
fragmentation.
Paging:
divides the process into equal-sized pages
(physical division).
Segmentation:
divides the process into unequal-sized segments
(logical division).
Paging + segmentation
Paging
Paging
Divide physical memory into fixed-sized blocks called
frames.
Divide logical memory into fixed-sized blocks called
pages. The page size is equal to the frame size.
Divide the backing store into blocks of the same size
called clusters.
To run a program of size n pages:
find n free frames,
then load the program.
Paging
How to translate a logical address to a physical address?
Logical address = (page number, page offset)
Page number (p) – used as an index into a page table,
which contains the base address of each page in physical
memory.
Page offset (d) – combined with the base address to
define the physical memory address that is sent to the
memory unit.
Page table – contains the base address (frame number)
of each page in physical memory.
Physical address = (frame number, page offset)
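The translation above can be sketched in a few lines (the 4-byte page size and page table contents below are illustrative; real systems use page sizes like 4096 bytes):

```python
def paged_translate(logical, page_table, page_size):
    """Split a logical address into (page number, offset), then map the
    page number through the page table to a frame number."""
    p, d = divmod(logical, page_size)  # page number, page offset
    return page_table[p] * page_size + d

page_table = [5, 6, 1, 2]  # page -> frame, illustrative
print(paged_translate(0, page_table, 4))   # 20 (frame 5, offset 0)
print(paged_translate(13, page_table, 4))  # 9  (page 3, offset 1 -> frame 2)
```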
Paging
Paging
Paging: Page size
How to partition logical addresses?
The page size is selected as a power of 2:
m-bit logical address
= (m − n)-bit page number
+ n-bit page offset.
Typical page sizes range from 512 B to 16 MB.
Paging: Page size
Paging: Page size
Example with 4-byte pages and page table (5, 6, 1, 2):
Logical address 0 → physical (5×4)+0 = 20
Logical address 1 → physical (5×4)+1 = 21
Logical address 2 → physical (5×4)+2 = 22
Logical address 3 → physical (5×4)+3 = 23
Logical address 4 → physical (6×4)+0 = 24
Logical address 5 → physical (6×4)+1 = 25
Logical address 13 → physical (2×4)+1 = 9
Paging: Page size
How to select page sizes?
Page table size
Internal fragmentation
Disk I/O
Generally page sizes have grown over time as
processes, data, and main memory have become
larger.
Paging: The OS Concern
What should the OS do?
Which frames are allocated?
Which frames are available?
How to allocate frames for a newly arrived process?
Placement algorithms
Replacement algorithms
Paging: The OS Concern
Paging: Implementation of Page Table
How to implement the page table
Small, fast registers
Main memory
Main memory + translation look-aside buffer
(associative memory)
Paging: Implementation of Page Table
Page table is kept in main memory.
Page-table base register (PTBR) points to the page
table.
In this scheme every data/instruction access requires
two memory accesses.
One for the page table and
one for the data/instruction.
Paging: Implementation of Page Table
Page-table length register (PTLR) indicates the size of
the page table.
The two memory access problem can be solved by the
use of a special fast-lookup hardware cache called
associative memory or translation look-aside buffers
(TLBs)
Associative memory – parallel search
Paging: Implementation of Page Table
: Paging Hardware With TLB
Paging: Implementation of Page Table
: Paging Hardware With TLB
Effective Access Time (EAT)
Assuming a 20 ns TLB search and a 100 ns memory access:
hit ratio = 0.80:
EAT = 0.80×120 + 0.20×220 = 140 ns
hit ratio = 0.98:
EAT = 0.98×120 + 0.02×220 = 122 ns
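The EAT formula as a small function (the 100 ns memory access and 20 ns TLB search are the assumed values consistent with the figures above):

```python
def eat(hit_ratio, mem_access=100, tlb_search=20):
    """Effective access time in ns with a TLB."""
    hit = tlb_search + mem_access        # TLB hit: one memory access
    miss = tlb_search + 2 * mem_access   # TLB miss: page table + data access
    return hit_ratio * hit + (1 - hit_ratio) * miss

print(eat(0.80))  # ≈ 140 ns
print(eat(0.98))  # ≈ 122 ns
```

Note how raising the hit ratio from 0.80 to 0.98 almost eliminates the slowdown from the extra page-table access.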
Paging: Memory Protection
Memory protection is implemented by associating
protection bits with each frame:
read-write, read-only, execute-only, and so on.
Valid-invalid bit attached to each entry in the page
table:
“valid” indicates that the associated page is in the
process’ logical address space, and is thus a legal
page.
“invalid” indicates that the page is not in the
process’ logical address space.
Paging: Memory Protection
:Valid (v) or Invalid (i) Bit In A Page Table
A system with a 14-bit address space (0–16383);
a process uses only addresses 0–10468.
Paging: Page Table Structure
The page table can be large:
For a 32-bit CPU with 4 KB pages:
2^32 / 2^12 = 2^20 = 1M entries; at 4 bytes per entry, 4 MB.
The solution:
Hierarchical Paging
Hashed Page Tables
Inverted Page Tables
Paging: Page Table Structure
: Hierarchical Page Tables
Break up the logical address space into
multiple page tables.
A simple technique is a two-level page table.
Paging: Page Table Structure
: Hierarchical Page Tables
A logical address (on a 32-bit machine with 4 KB
page size) is divided into:
a page number consisting of 20 bits, and
a page offset consisting of 12 bits.
Since the page table is paged, the page number
is further divided into:
a 10-bit outer page number, and
a 10-bit inner page number (offset within the outer
page table).
Linux uses three-level paging.
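The 10 + 10 + 12 bit split can be sketched with plain bit operations (the example address is illustrative):

```python
def split_two_level(addr):
    """Split a 32-bit logical address into (p1, p2, d):
    10-bit outer page number, 10-bit inner page number, 12-bit offset."""
    d = addr & 0xFFF            # low 12 bits: page offset
    p2 = (addr >> 12) & 0x3FF   # next 10 bits: inner page number
    p1 = (addr >> 22) & 0x3FF   # top 10 bits: outer page number
    return p1, p2, d

print(split_two_level(0x00403004))  # (1, 3, 4)
```

Translation then uses p1 to index the outer page table, p2 to index the inner one, and d as the offset within the resulting frame.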
Paging: Page Table Structure
: Hierarchical Page Tables
Address-translation scheme for a two-level 32-bit
paging architecture (Intel Pentium II).
Paging: Page Table Structure
: Hierarchical Page Tables
Paging: Page Table Structure
: Hashed Page Tables
Common in address spaces > 32 bits.
For 64-bit address spaces,
piling up more and more levels of paged page tables
becomes intolerable and prohibitive.
How to quickly find a <key, value> pair given a key:
binary search or hashing.
The virtual page number is hashed into a page table. This
page table contains a chain of elements hashing to the
same location.
Virtual page numbers are compared in this chain
searching for a match. If a match is found, the
corresponding physical frame is extracted.
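The chained lookup can be sketched as follows (bucket count and the modulo "hash" are illustrative simplifications):

```python
class HashedPageTable:
    """Hashed page table sketch: each bucket chains (vpn, frame) pairs
    whose virtual page numbers hash to the same slot."""
    def __init__(self, nbuckets=16):
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, vpn, frame):
        self.buckets[vpn % len(self.buckets)].append((vpn, frame))

    def lookup(self, vpn):
        # walk the chain, comparing virtual page numbers for a match
        for v, f in self.buckets[vpn % len(self.buckets)]:
            if v == vpn:
                return f
        return None  # no match: page fault

pt = HashedPageTable()
pt.insert(3, 7)
pt.insert(19, 2)      # 19 % 16 == 3: chains in the same bucket as vpn 3
print(pt.lookup(19))  # 2
```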
Paging: Page Table Structure
: Hashed Page Tables
Paging: Page Table Structure
: Inverted Page Tables
For an OS: many processes → many page tables →
much overhead. Why not keep one big page table?
The inverted page table
One entry for each real page of memory.
Entry consists of the virtual address of the page stored
in that real memory location, with information about
the process that owns that page.
Use hash table to limit the search to one — or at most
a few — page-table entries. Decreases memory
needed to store each page table, but increases time
needed to search the table when a page reference
occurs.
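The key inversion — the table index is the frame number — shows up clearly in a sketch (table contents are illustrative; real systems pair this with a hash table to avoid the linear search):

```python
def ipt_lookup(inverted_table, pid, vpn):
    """Inverted page table: one (pid, virtual page) entry per physical
    frame; the index of the matching entry IS the frame number."""
    for frame, entry in enumerate(inverted_table):
        if entry == (pid, vpn):
            return frame
    return None  # page fault

ipt = [(1, 0), (2, 0), (1, 3)]  # frame 0, 1, 2 owners
print(ipt_lookup(ipt, 1, 3))    # 2: process 1, page 3 is in frame 2
```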
Paging: Page Table Structure
: Inverted Page Tables
Paging: Shared Pages
In a time-sharing environment,
40 users execute a text editor. Each text editor
process consists of 150 KB of code and 50 KB of data.
Without sharing:
(150 KB + 50 KB) × 40 = 200 KB × 40 = 8000 KB.
With sharing:
50 KB × 40 + 150 KB = 2000 KB + 150 KB = 2150 KB.
Some other heavily used programs can also be shared
– compilers, window systems, run-time libraries,
database systems, and so on.
Paging: Shared Pages
Paging: Shared Pages
What can be shared
Reentrant code (pure code)
If the code is reentrant, then it never changes
during execution. Two or more processes can
execute the same code at the same time.
Read-only data
What can not be shared
Each process has its own copy of registers and data
storage to hold the data for the process’s execution.
The OS should provide facilities to enforce the
properties necessary for sharing.
Paging has numerous other benefits as well.
SEGMENTATION
Paging
Mapping to allow differentiation between logical
memory and physical memory.
Separation of the user’s view of memory and the
actual physical memory.
Chopping a process into equal-sized pieces.
Is there any scheme for dividing a process into a
collection of semantic units rather than syntactic ones?
Segmentation
What is the user’s view of program?
A program is a collection of segments.
Procedural style, module style, object-based style,
object-oriented style, logical style, functional style.
OO Style:
Classes (states and behavior), derived classes,
objects,
May have a main program
Procedural style: A segment is a logical unit such as:
main program, procedure, function, method,
object, local variables, global variables,
common block, stack, symbol table, arrays.
Segmentation: User’s view of a program
A program has many segments.
Every segment has a name.
Every segment is of variable length, its length is
intrinsically defined by the purpose of the segment
in the program.
Elements within a segment are identified by their
offset from the beginning of the segment.
The order among the segments is insignificant
(different from pages).
Segmentation: User’s View of a Program
Segmentation: Logical View of Segmentation
A logical-address space is a collection of segments
Each segment has a name and a length.
The address specifies both the segment name and the
offset within the segment.
In implementation, the address specifies both the
segment number and the offset within the segment.
X86: CS, DS, etc.
Segmentation: Logical View of Segmentation
Figure: segments 1–4 of the user space are mapped to
noncontiguous regions of the physical memory space.
Segmentation: Logical View of Segmentation
How to segment
By programmer
By compiler
A Pascal compiler might create separate segments for the
following
The global variables;
The procedure call stack, to store parameters and
return addresses;
The code portion of each procedure or function;
The local variables of each procedure or function.
A Fortran compiler might create a separate segment for
each common block. Arrays might be assigned separate
segments.
Segmentation: Hardware
How to map logical addresses <segment-number,
offset> to physical address?
Each entry of the segment table has a segment base
and a segment limit; each table entry has:
base – contains the starting physical address where
the segments reside in memory.
limit – specifies the length of the segment.
Two special registers
Segment-table base register (STBR) points to the
segment table’s location in memory.
Segment-table length register (STLR) indicates
number of segments used by a program.
Segmentation: Hardware
A logical address consists of two parts: a segment
number s, and an offset d.
The segment number s is used as an index into the
segment table:
check that s is legal (s < STLR),
then get both the base and limit values for the
segment.
The offset d must be between 0 and the segment limit:
if invalid, trap – illegal memory access;
if valid,
base + d → physical address.
Segmentation: Hardware
Segmentation: Hardware
Segmentation: Hardware
A process has 5 segments numbered from 0 through 4
For the logical space, segment table, physical
memory layout, see the previous slide.
Address mapping
Segment 1, byte 1222 => invalid (beyond the segment limit)
Segment 2, byte 53 => 4300 + 53 = 4353
Segment 3, byte 852 => 3200 + 852 = 4052
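The mapping can be sketched against a segment table (the base/limit pairs below are assumptions chosen to match the worked mappings for segments 2 and 3; the other rows are illustrative):

```python
SEGMENT_TABLE = {   # segment -> (base, limit); assumed example values
    0: (1400, 1000),
    1: (6300, 400),
    2: (4300, 400),
    3: (3200, 1100),
    4: (4700, 1000),
}

def seg_translate(s, d, table=SEGMENT_TABLE):
    """Map <segment-number, offset> to a physical address, trapping on
    an out-of-range offset."""
    base, limit = table[s]
    if not 0 <= d < limit:
        raise MemoryError("trap: offset beyond segment limit")
    return base + d

print(seg_translate(2, 53))   # 4353
print(seg_translate(3, 852))  # 4052
```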
Segmentation: Protection of Segments
Easy to protect the whole segments
To protect the executable code
To protect the read-only data
To auto-check array indexes if the array is put in
its own segment
Segmentation: Sharing of Segments
Segmentation: Sharing of Segments
The problem with sharing segments:
A logical address is (segment number, offset).
The shared segment may itself use (segment
number, offset) to refer to its own addresses,
so every sharing process must map it to the same
segment number.
This could be a problem.
SEGMENTATION + PAGING
Paging or segmentation?
In the old days,
Motorola 68000 used paging,
Intel 80x86 used segmentation.
Now both paging and segmentation are combined,
e.g., the OSes for the i386:
OS/2 from IBM,
NT from MS.
Segmentation + Paging: Intel 386
The logical address space of a process is divided into two
partitions of up to 8K segments each.
Information about the first partition (private segments) is
kept in the LDT (local descriptor table).
Information about the second partition (shared segments) is
kept in the GDT (global descriptor table).
Every entry in LDT or GDT consists of 8 bytes, with
detailed information about a particular segment including
the base location and length of that segment.
The machine has 6 segment registers, allowing 6 segments
to be addressed at any one time by a process.
The machine has 8-byte microprogram registers to hold the
corresponding descriptors from either the LDT or GDT.
Segmentation + Paging: Intel 386
I386:
Logical addresses → linear addresses → physical addresses.
Logical addresses → linear addresses:
logical address = segment + offset;
the segment part comes from a 16-bit segment register.
How are segment registers used?
In real mode, the segment value is shifted left 4 bits
and added to the offset.
In protected mode, the selector is shifted right 3 bits
(dropping the TI and RPL fields) to index a descriptor table.
Segment base + offset → linear address.
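The simplest case, 8086 real mode, can be sketched directly (the segment:offset values are illustrative):

```python
def real_mode_linear(segment, offset):
    """8086 real mode: the 16-bit segment register is shifted left
    4 bits and added to the 16-bit offset, giving a 20-bit address."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

print(hex(real_mode_linear(0x1234, 0x0010)))  # 0x12350
```

In protected mode the segment register instead selects a descriptor whose base field is added to the offset.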
Segmentation + Paging: Intel 386
I386:
Linear addresses → physical addresses.
One-level paging?
2^32 / 2^12 = 2^20 entries;
2^20 × 4 bytes = 4 MB per page table.
Two-level paging:
10 bits for the page directory,
10 bits for the page-table entry,
12 bits for the page offset.
Segmentation + Paging: Intel 386
Address-translation scheme for a two-level 32-bit
paging architecture is similar to the following
diagram.
Segmentation + Paging: Intel 386
Linux MM
task_struct (include/linux/sched.h)
mm_struct (include/linux/sched.h)
vm_area_struct (include/linux/mm.h)
Homework
9.9
9.10
9.11
9.12
9.16