
CHAPTER 9:
MEMORY MANAGEMENT
Background
 Swapping
 Contiguous Allocation
 Paging
 Segmentation
 Segmentation with Paging

BACKGROUND

 Sharing → management:
 CPU sharing
 Memory sharing
 Memory management:
 Actual memory management:
 Contiguous partition allocation
 Paging
 Segmentation
 Virtual memory management:
 Demand paging
 Demand segmentation
Background:
Logical vs. Physical Address Space
 A program must be brought into memory and placed within
a process for it to be run.
 CPU → Memory
 Logical address vs. physical address:
 Logical address – generated by the CPU;
also referred to as a virtual address.
 Physical address – the address seen by the
memory unit.
 Logical and physical addresses are the same in compile-time
and load-time address-binding schemes; logical
(virtual) and physical addresses differ in the execution-time
address-binding scheme.
Background:
Memory-Management Unit (MMU)
What is seen is not always true.
 Hardware device that maps virtual to physical
address.
 In MMU scheme, the value in the relocation
register is added to every address generated by
a user process at the time it is sent to memory.
 The user program deals with logical addresses;
it never sees the real physical addresses.

Background:
Dynamic relocation using a relocation register
Background :
Multistep Processing of a User Program
Background :
Binding of Instructions and Data to Memory
Address binding of instructions and data to
memory addresses can happen
at three different stages.
 Compile time: If you know at compile time where the
process will reside in memory, then absolute code can
be generated. Code must be recompiled if the starting
location changes.
 Load time: If it is not known at compile time where
the process will reside in memory, then the compiler
must generate relocatable code.
 Execution time: If the process can be moved during
its execution from one memory segment to another,
then binding must be delayed until run time. This
requires hardware support for address maps (e.g., base
and limit registers).
Background:
Dynamic Loading (v.s. static loading)



Should the entire program and data of a process be in
physical memory for the process to execute?
Routine is not loaded until it is called.
Discussion
 Better memory-space utilization; unused routine is
never loaded.
 Useful when large amounts of code are needed to
handle infrequently occurring cases.
 No special support from the operating system is
required.
 The OS can help by providing library routines to
implement dynamic loading.
Background:
Dynamic Linking (v.s. static linking)
 Static linking vs. dynamic linking.
 Linking is postponed until execution time.
 A small piece of code, the stub, is used to locate the
appropriate memory-resident library routine.
 The stub replaces itself with the address of the routine, and
executes the routine.
 The operating system is needed to check whether the routine is in
the process's memory address space.
 Dynamic linking is particularly useful for libraries.
Background:
Overlays
 Keep in memory only those instructions and
data that are needed at any given time.
 Needed when a process is larger than the amount of
memory allocated to it.
 Implemented by the user; no special support is
needed from the operating system, but the programming
design of the overlay structure is complex.
Background:
Overlays for a Two-Pass Assembler
SWAPPING
 A process can be swapped temporarily out of memory
to a backing store, and then brought
back into memory for continued execution.
 Backing store – a fast disk large enough to
accommodate copies of all memory images for all
users; it must provide direct access to these memory
images.
 Roll out, roll in – a swapping variant used for priority-based
scheduling algorithms; a lower-priority process is
swapped out so a higher-priority process can be loaded
and executed.
Swapping: Schematic View
Swapping: Backing store
 Swap file
 Swap device
 Swap device and swap file
Swapping: Performance
 The major part of swap time is transfer time; total transfer
time is directly proportional to the amount of memory
swapped.
 For example, 1000KB / 5000KB per second = 1/5 second = 200ms.
 Modified versions of swapping are found on many
systems, e.g., UNIX, Linux, and Windows.
 UNIX
 Windows 3.1
CONTIGUOUS MEMORY ALLOCATION
 The main memory must accommodate
 both the operating system
 and the various user processes.
 The main memory is usually divided into two
partitions:
 one for the resident operating system, usually held
in low memory with the interrupt vector;
 the other for user processes, held in high
memory.
 For contiguous memory allocation, each process is
contained in a single contiguous section of memory.
Contiguous Memory Allocation:
Memory protection
 Why protect memory?
 To protect the OS from user processes,
 and to protect user processes from each other.
 How is memory protected?
 Relocation registers and limit registers are used to
protect user processes from each other, and to keep them
from changing operating-system code and data.
 The relocation register contains the value of the
smallest physical address; the limit register
contains the range of logical addresses – each
logical address must be less than the limit register.
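The relocation-and-limit check described above can be sketched in Python; the base 14000 and limit 3000 are illustrative values, not taken from the slides.

```python
def translate(logical_addr, relocation, limit):
    """MMU check: a logical address must be below the limit register;
    the relocation register is then added to form the physical address."""
    if logical_addr >= limit:            # outside the process's space
        raise MemoryError("trap: addressing error")
    return relocation + logical_addr

# A process relocated to base 14000 with a 3000-byte limit:
print(translate(346, 14000, 3000))   # -> 14346
```

Any logical address of 3000 or above traps, which is exactly how one process is kept from touching its neighbors or the OS.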
Contiguous Memory Allocation:
Hardware Support for Relocation and Limit Registers
Contiguous Memory Allocation:
Types
 Fixed-sized contiguous partitions
 Dynamic contiguous partitions
 Buddy system
Contiguous Memory Allocation:
Fixed-Sized Contiguous Partitions
 Main memory is divided into a number of fixed
partitions at system generation time. A process may
be loaded into a partition of equal or greater size.
 Equal-size partitions
 Unequal-size partitions
 Equal-size partitions:
 A program may be too big to fit into a partition.
 Main-memory utilization is extremely inefficient.
Contiguous Memory Allocation:
Fixed-Sized Contiguous Partitions
 Unequal-size partitions:
 Placement algorithms for unequal-size partitions:
 best-fit
 first-fit
 worst-fit
 Conclusion for fixed-sized contiguous partitions:
 Strengths: simple to implement; little overhead.
 Weaknesses: internal fragmentation; fixed number
of processes.
Contiguous Memory Allocation:
Dynamic Contiguous Partitions
 Partitions are created dynamically, so that each
process is loaded into a partition of exactly the same
size as that process.
 The OS maintains information about:
a) allocated partitions b) free partitions (holes)
 When a process arrives, find a hole for it.
 When a process terminates, return the memory and
merge the holes.
 Holes of various sizes are scattered throughout
memory.
Contiguous Memory Allocation:
Dynamic Contiguous Partitions
[Figure: successive snapshots of main memory under dynamic partitioning –
the OS resides in low memory; processes 5, 8, and 2 are allocated;
process 8 terminates, leaving a hole; processes 9 and 10 are then
allocated into the holes as they arrive.]
Contiguous Memory Allocation:
Dynamic Contiguous Partitions
 Placement algorithms: first-fit, best-fit, worst-fit.
 First-fit: Allocate the first hole that is big enough.
 Best-fit: Allocate the smallest hole that is big
enough; must search the entire list, unless it is ordered by
size. Produces the smallest leftover hole.
 Worst-fit: Allocate the largest hole; must also
search the entire list. Produces the largest leftover hole.
 Replacement algorithms:
 LRU,
 FIFO,
 etc.
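The three placement algorithms can be sketched as follows; the hole sizes and the 212KB request are illustrative values, not from the slides.

```python
def first_fit(holes, size):
    """Return the index of the first hole big enough, else None."""
    for i, h in enumerate(holes):
        if h >= size:
            return i
    return None

def best_fit(holes, size):
    """Return the index of the smallest hole big enough, else None."""
    fits = [(h, i) for i, h in enumerate(holes) if h >= size]
    return min(fits)[1] if fits else None

def worst_fit(holes, size):
    """Return the index of the largest hole, if it is big enough."""
    h, i = max((h, i) for i, h in enumerate(holes))
    return i if h >= size else None

holes = [100, 500, 200, 300, 600]   # free-hole sizes in KB
print(first_fit(holes, 212))  # -> 1 (the 500KB hole)
print(best_fit(holes, 212))   # -> 3 (the 300KB hole, smallest leftover)
print(worst_fit(holes, 212))  # -> 4 (the 600KB hole, largest leftover)
```

Note how best-fit and worst-fit must examine every hole, while first-fit can stop early, which is why first-fit is usually fastest in practice.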
Contiguous Memory Allocation:
Dynamic Contiguous Partitions
 Partitions are created dynamically, so that each process is
loaded into a partition of exactly the same size as that
process.
 Dynamic partitioning:
 Strengths:
 no internal fragmentation;
 more efficient use of main memory.
 Weaknesses:
 external fragmentation;
 compaction is needed to reduce it.
Contiguous Memory Allocation:
Fragmentation
 External fragmentation – total memory space exists to
satisfy a request, but it is not contiguous.
 Internal fragmentation – allocated memory may be
slightly larger than the requested memory; this size difference is
memory internal to a partition that is not being used.
 Reduce external fragmentation by compaction:
 Shuffle the memory contents to place all free memory
together in one large block.
 Compaction is possible only if relocation is dynamic and
done at execution time.
 I/O problem: a batch job in memory may be involved in
I/O. Do I/O only into OS buffers.
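Compaction can be modeled with a toy memory map; the block names and sizes below are made up for illustration.

```python
def compact(memory_map):
    """Slide all allocated blocks to low memory, merging every hole
    into one large free block at the top.
    memory_map: list of (name, size) pairs; name None marks a hole."""
    allocated = [(n, s) for n, s in memory_map if n is not None]
    free = sum(s for n, s in memory_map if n is None)
    return allocated + [(None, free)]

mm = [("OS", 400), (None, 100), ("P1", 300), (None, 260), ("P2", 200)]
print(compact(mm))
# -> [('OS', 400), ('P1', 300), ('P2', 200), (None, 360)]
```

In a real system every moved block's relocation register must be updated, which is why compaction requires dynamic, execution-time relocation.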
Contiguous Memory Allocation:
Buddy System
 Easy to partition
 Easy to combine
 The buddy system is used for Linux kernel
memory allocation.
PAGING
 Contiguous memory allocation → external fragmentation.
 Noncontiguous memory allocation → no external
fragmentation.
 Paging:
 divide the process into equal-sized pages
(a physical division).
 Segmentation:
 divide the process into unequal-sized pieces
(a logical division).
 Paging + segmentation
Paging
 Divide physical memory into fixed-sized blocks called
frames.
 Divide logical memory into fixed-sized blocks called
pages. The page size is equal to the frame size.
 Divide the backing store into blocks of the same size,
called clusters.
 To run a program of size n pages:
 find n free frames,
 load the program.
Paging
 How to translate a logical address to a physical address:
 Logical address = (page number, page offset)
 Page number (p) – used as an index into a page table,
which contains the base address of each page in physical
memory.
 Page offset (d) – combined with the base address to
define the physical memory address that is sent to the
memory unit.
 Page table – contains the base address of each page
in physical memory.
 Physical address = (frame number, page offset)
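A minimal sketch of this translation, assuming 4KB pages and an illustrative page table:

```python
PAGE_SIZE = 4096                        # 4KB pages (an assumed size)
page_table = {0: 5, 1: 6, 2: 1, 3: 2}   # page -> frame (illustrative)

def translate(logical_addr):
    """Split a logical address into (p, d), then map p through the page table."""
    p, d = divmod(logical_addr, PAGE_SIZE)   # page number, page offset
    frame = page_table[p]                    # index into the page table
    return frame * PAGE_SIZE + d             # physical address

print(translate(4100))   # page 1, offset 4 -> frame 6 -> 6*4096 + 4 = 24580
```

Because the page size is a power of 2, the `divmod` here is just a shift and a mask in hardware, which is why page sizes are always chosen as powers of 2.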
Paging
Paging
Paging: Page size
 How to partition logical addresses?
 The page size is selected as a power of 2, say 2^n.
 An m-bit logical address
= an (m-n)-bit page number
+ an n-bit page offset.
 Possible page sizes:
 512B to 16MB
Paging: Page size
 Worked example: a page size of 4 bytes; the page table maps
page 0 → frame 5, page 1 → frame 6, page 3 → frame 2.
 Logical address 0 → (5x4)+0 = 20 physical address
 Logical address 1 → (5x4)+1 = 21 physical address
 Logical address 2 → (5x4)+2 = 22 physical address
 Logical address 3 → (5x4)+3 = 23 physical address
 Logical address 4 → (6x4)+0 = 24 physical address
 Logical address 5 → (6x4)+1 = 25 physical address
 ...
 Logical address 13 (page 3, offset 1) → (2x4)+1 = 9 physical address
Paging: Page size
 How to select page sizes? Consider:
 page table size,
 internal fragmentation,
 disk I/O.
 Generally, page sizes have grown over time as
processes, data, and main memory have become
larger.
Paging: The OS Concern
 What should the OS do?
 Which frames are allocated?
 Which frames are available?
 How to allocate frames for a newly arrived process?
 Placement algorithms
 Replacement algorithms
Paging: The OS Concern
Paging: Implementation of Page Table
 How to implement the page table:
 Small, fast registers
 Main memory
 Main memory + translation look-aside
buffer (associative memory)
Paging: Implementation of Page Table
 The page table is kept in main memory.
 The page-table base register (PTBR) points to the page
table.
 In this scheme every data/instruction access requires
two memory accesses:
 one for the page table and
 one for the data/instruction.
Paging: Implementation of Page Table
 The page-table length register (PTLR) indicates the size of
the page table.
 The two-memory-access problem can be solved by the
use of a special fast-lookup hardware cache called
associative memory or the translation look-aside buffer
(TLB).
 Associative memory – parallel search.
Paging: Implementation of Page Table
: Paging Hardware With TLB
Paging: Implementation of Page Table
: Paging Hardware With TLB
 Effective Access Time (EAT)
 Assume a 20ns TLB search and a 100ns memory access, so a
TLB hit costs 120ns and a miss costs 220ns.
 hit ratio = 0.8
EAT = 0.80*120 + 0.20*220 = 140 ns
 hit ratio = 0.98
EAT = 0.98*120 + 0.02*220 = 122 ns
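The EAT formula can be checked in Python, using the 20ns TLB search and 100ns memory access implied by the slide's 120/220 numbers:

```python
def eat(hit_ratio, tlb_ns=20, mem_ns=100):
    """Effective access time: a TLB hit costs the TLB search plus one
    memory access; a miss adds a second access for the page table."""
    hit_time = tlb_ns + mem_ns        # 120 ns
    miss_time = tlb_ns + 2 * mem_ns   # 220 ns
    return hit_ratio * hit_time + (1 - hit_ratio) * miss_time

print(eat(0.80))   # ~140 ns
print(eat(0.98))   # ~122 ns
```

Raising the hit ratio from 0.80 to 0.98 cuts the average penalty of the extra page-table access from 20ns down to 2ns.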
Paging: Memory Protection
 Memory protection is implemented by associating
protection bits with each frame:
 read-write or read-only,
 or read-write, read-only, execute-only, and so on.
Valid-invalid bit attached to each entry in the page
table:
 “valid” indicates that the associated page is in the
process’ logical address space, and is thus a legal
page.
 “invalid” indicates that the page is not in the
process’ logical address space.
Paging: Memory Protection
:Valid (v) or Invalid (i) Bit In A Page Table
A system with a 14-bit address space (0-16383);
a process using only addresses 0-10468.
Paging: Page Table Structure
 The page table can be large:
 For a 32-bit CPU with 4K pages:
2^32/2^12 = 2^20 = 1M entries, i.e. 4MB per page table.
 The solutions:
 Hierarchical Paging
 Hashed Page Tables
 Inverted Page Tables
Paging: Page Table Structure
: Hierarchical Page Tables
 Break up the logical address space into
multiple page tables.
 A simple technique is the two-level page table.
Paging: Page Table Structure
: Hierarchical Page Tables
 A logical address (on a 32-bit machine with 4K
page size) is divided into:
 a page number consisting of 20 bits,
 a page offset consisting of 12 bits.
 Since the page table is paged, the page number
is further divided into:
 a 10-bit page number,
 a 10-bit page offset.
 Linux uses three-level paging.
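The 10/10/12 split above can be sketched with bit operations; the address value is illustrative.

```python
def split(addr):
    """Return (p1, p2, d) for a 32-bit address with 4K pages:
    10-bit outer index | 10-bit inner index | 12-bit offset."""
    d = addr & 0xFFF            # low 12 bits: page offset
    p2 = (addr >> 12) & 0x3FF   # next 10 bits: index into inner page table
    p1 = (addr >> 22) & 0x3FF   # top 10 bits: index into outer page table
    return p1, p2, d

print(split(0x00403004))  # -> (1, 3, 4)
```

The hardware walks the outer table with p1, the selected inner table with p2, and finally adds d to the frame's base address.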
Paging: Page Table Structure
: Hierarchical Page Tables
Address-translation scheme for a two-level 32-bit
paging architecture (Intel Pentium2)
Paging: Page Table Structure
: Hierarchical Page Tables
Paging: Page Table Structure
: Hashed Page Tables
 Common in address spaces > 32 bits.
 For 64 bits,
 paging the page tables level after level becomes intolerable
and prohibitive.
 How to quickly find the pair given a key?
 binary search or hashing.
 The virtual page number is hashed into a page table. This
page table contains a chain of elements hashing to the
same location.
 Virtual page numbers are compared along this chain,
searching for a match. If a match is found, the
corresponding physical frame is extracted.
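A toy model of the hashed page table's chains; the bucket count and virtual page numbers are illustrative.

```python
# Each bucket holds a chain of (virtual page number, frame) pairs,
# mirroring the chained structure described above.
NBUCKETS = 16
table = [[] for _ in range(NBUCKETS)]

def insert(vpn, frame):
    table[hash(vpn) % NBUCKETS].append((vpn, frame))

def lookup(vpn):
    """Hash the VPN to a bucket, then compare VPNs along the chain."""
    for v, frame in table[hash(vpn) % NBUCKETS]:
        if v == vpn:
            return frame
    return None     # no match: page fault

insert(0x12345, 7)
insert(0x12345 + NBUCKETS, 9)   # collides into the same bucket
print(lookup(0x12345))          # -> 7
```

Collisions simply lengthen one chain; a lookup still touches only the single hashed bucket rather than the whole table.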
Paging: Page Table Structure
: Hashed Page Tables
Paging: Page Table Structure
: Inverted Page Tables
 For an OS: many processes → many page tables →
much overhead. Why not keep one big page table?
 The inverted page table:
 One entry for each real page (frame) of memory.
 Each entry consists of the virtual address of the page stored
in that real memory location, with information about
the process that owns that page.
 Use a hash table to limit the search to one, or at most
a few, page-table entries. This decreases the memory
needed to store each page table, but increases the time
needed to search the table when a page reference
occurs.
Paging: Page Table Structure
: Inverted Page Tables
Paging: Shared Pages
 In a time-sharing environment, suppose
 40 users execute a text editor. One text-editor
process consists of 150KB of code and 50KB of data.
 Without sharing:
(150KB + 50KB)*40 = 200KB * 40 = 8000KB
 With sharing:
50KB*40 + 150KB = 2000KB + 150KB = 2150KB
 Some other heavily used programs can also be shared
– compilers, window systems, run-time libraries,
database systems, and so on.
Paging: Shared Pages
Paging: Shared Pages
 What can be shared:
 Reentrant code (pure code)
 If the code is reentrant, then it never changes
during execution. Two or more processes can
execute the same code at the same time.
 Read-only data
 What cannot be shared:
 Each process has its own copy of registers and data
storage to hold the data for the process's execution.
 The OS should provide facilities to enforce the
properties necessary for sharing.
 Paging has numerous other benefits as well.
SEGMENTATION
 Paging:
 Mapping to allow differentiation between logical
memory and physical memory.
 Separation of the user's view of memory from the
actual physical memory.
 Chopping a process into equally-sized pieces.
 Is there any scheme for dividing a process into a collection
of semantic units?
(syntactic vs. semantic)
Segmentation
 What is the user's view of a program?
 A program is a collection of segments.
 Procedural style, module style, object-based style,
object-oriented style, logical style, functional style.
 OO style:
classes (state and behavior), derived classes,
objects,
and perhaps a main program.
 Procedural style: a segment is a logical unit such as:
main program, procedure, function, method,
object, local variables, global variables,
common block, stack, symbol table, arrays.
Segmentation: User's view of a program
 A program has many segments.
 Every segment has a name.
 Every segment is of variable length; its length is
intrinsically defined by the purpose of the segment
in the program.
 Elements within a segment are identified by their
offset from the beginning of the segment.
 The order among the segments is insignificant
(different from pages).
Segmentation: User’s View of a Program
Segmentation: Logical View of Segmentation
 A logical-address space is a collection of segments.
 Each segment has a name and a length.
 The address specifies both the segment name and the
offset within the segment.
 In implementation, the address specifies both the
segment number and the offset within the segment.
 X86: CS, DS, etc.
Segmentation: Logical View of Segmentation
[Figure: segments 1-4 of a process in user space mapped to
noncontiguous regions of physical memory space.]
Segmentation: Logical View of Segmentation
 How to segment:
 by the programmer,
 by the compiler.
 A Pascal compiler might create separate segments for the
following:
 the global variables;
 the procedure call stack, to store parameters and
return addresses;
 the code portion of each procedure or function;
 the local variables of each procedure or function.
 A Fortran compiler might create a separate segment for
each common block. Arrays might be assigned separate
segments.
Segmentation: Hardware
 How to map logical addresses <segment-number,
offset> to physical addresses?
 Each entry of the segment table has a segment base
and a segment limit:
 base – contains the starting physical address where
the segment resides in memory;
 limit – specifies the length of the segment.
 Two special registers:
 The segment-table base register (STBR) points to the
segment table's location in memory.
 The segment-table length register (STLR) indicates the
number of segments used by a program.
Segmentation: Hardware
 A logical address consists of two parts: a segment
number s, and an offset d.
 The segment number s is used as an index into the
segment table:
 the segment number s is legal if s < STLR;
 the entry gives both the base and limit values for the
segment.
 The offset d must be between 0 and the segment limit:
 if invalid, trap: illegal memory access;
 if valid,
base + d → physical address.
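The segment-table lookup above can be sketched as follows; the (base, limit) values mirror the textbook-style example used on the later address-mapping slide.

```python
# Segment table as (base, limit) pairs; values are illustrative.
segment_table = {2: (4300, 400), 3: (3200, 1100)}
STLR = 5    # number of segments used by the program

def translate(s, d):
    """Check s against STLR, then check d against the limit, then add base."""
    if s >= STLR or s not in segment_table:
        raise MemoryError("trap: illegal segment number")
    base, limit = segment_table[s]
    if d >= limit:                      # offset beyond the segment limit
        raise MemoryError("trap: illegal memory access")
    return base + d

print(translate(2, 53))    # -> 4300 + 53 = 4353
print(translate(3, 852))   # -> 3200 + 852 = 4052
```

Unlike paging, the limit differs per segment, so the bounds check must consult the table entry rather than a fixed page size.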
Segmentation: Hardware
Segmentation: Hardware
Segmentation: Hardware
 A process has 5 segments, numbered from 0 through 4.
 For the logical space, segment table, and physical
memory layout, see the previous slide.
 Address mapping:
 Segment 1, byte 1222 => invalid (beyond the limit)
 Segment 2 (400 bytes at base 4300), byte 53 => 4300 + 53 = 4353
 Segment 3 (at base 3200), byte 852 => 3200 + 852 = 4052
Segmentation: Protection of Segments
 It is easy to protect whole segments:
 to protect the executable code,
 to protect the read-only data,
 to auto-check array indexes if an array is put in
its own segment.
Segmentation: Sharing of Segments
Segmentation: Sharing of Segments
 The problem with sharing segments:
 Logical address: (segment number, offset).
 The shared segment may itself use (segment
number, offset) to reference its own addresses.
 So the shared segment should have one single
segment number in every process that shares it.
 This can be a problem.
SEGMENTATION + PAGING
 Paging or segmentation?
 In the old days,
 the Motorola 68000 used paging;
 the Intel 80x86 used segmentation.
 Now
 both are combined: paging plus segmentation.
 OSes for the i386:
 OS/2 from IBM
 NT from MS
Segmentation + Paging: Intel 386
 The logical address space of a process is divided into two
partitions of up to 8K segments each.
 Information about the first partition (segments private to the
process) is in the LDT (local descriptor table).
 Information about the second partition (segments shared among
processes) is in the GDT (global descriptor table).
 Every entry in the LDT or GDT consists of 8 bytes, with
detailed information about a particular segment, including
the base location and length of that segment.
 The machine has 6 segment registers, allowing 6 segments
to be addressed at any one time by a process.
 The machine has 8-byte microprogram registers to hold the
corresponding descriptors from either the LDT or GDT.
Segmentation + Paging: Intel 386
 i386:
 Logical addresses → linear addresses →
physical addresses.
 Logical addresses → linear addresses:
 segment + offset
 segment selector (16 bits)
 the selector (shifted) indexes the descriptor
table to find the segment descriptor
 segment base + offset → linear address
Segmentation + Paging: Intel 386
 i386:
 Linear addresses → physical addresses
 One-level paging?
 2^32/2^12 = 2^20 entries
 2^20 * 4 bytes = 4MB per page table
 Two-level paging:
 10 bits for the page directory,
 10 bits for the page table entry,
 12 bits for the page offset.
Segmentation + Paging: Intel 386
Address-translation scheme for a two-level 32-bit
paging architecture is similar to the following
diagram.
Segmentation + Paging: Intel 386
Linux MM
task_struct (include/linux/sched.h)
mm_struct (include/linux/sched.h)
vm_area_struct (include/linux/mm.h)
Homework
 9.9
 9.10
 9.11
 9.12
 9.16