Slide set 22


Memory Management
Managing Memory … The Simplest Case
* Early PCs and mainframes
* Embedded systems
One user program at a time. The logical address space is the same as the physical address space.
[Diagram: physical memory from address 0 (User Program) up to 0xFFF… (The O/S).]
But in most modern systems …
Modern memory managers subdivide memory to accommodate multiple processes.
Memory needs to be allocated efficiently, to pack as many processes into memory as possible.
When required, a process should be able to have exclusive use of a block of memory, or to share a block of memory with other processes.
Primary memory is abstracted so that a program perceives the memory allocated to it as a large array of contiguously addressed bytes (but it usually isn't).
Memory Managers: Four Concerns
* Relocation
* Protection
* Sharing
* Physical Organization
Relocation
The programmer does not know where the program will be placed in memory when it is executed. While the program is executing, it may be swapped to disk and returned to main memory at a different location (relocated). Memory references in the code must therefore be translated to actual physical memory addresses.
[Diagram: the process's logical address space is mapped into the physical address space.]
Protection
With multiple processes in memory and running simultaneously, the system must protect one process from referencing memory locations in another process. This is more of a hardware responsibility than an operating system responsibility.
Loading and Linking a Program
[Diagram: Source code goes through the Compiler to produce a Relocatable Object Module (program segment, data segment, stack segment). The Linker combines it with other Relocatable Object Modules and Library Modules into a Load Module; in this step all references to data or functions are resolved. The Loader then turns the Load Module into a Process Image in Main Memory.]
Absolute Loader
No changes in addresses: the Load Module is copied into memory unchanged.
[Diagram: the Load Module (program and data, starting at address 0) and the Process Image in Main Memory (also starting at 0) are identical.]
Static Binding
All addresses in the code segment of the Load Module are relative to 0. The loader adds the offset to every address as the code is loaded.
[Diagram: in the Load Module, an instruction reads "Jump 400"; loaded at offset 1000, the same instruction in the Process Image in Main Memory reads "Jump 1400".]
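A minimal sketch of what the loader does here, assuming a hypothetical load-module format whose relocation table lists the indices of the words that hold relative addresses (nothing in the slides prescribes this layout):

#include <stddef.h>
#include <stdint.h>

/* Static binding: add the load offset to every relocatable address
   field as the image is copied into memory. The relocation table
   (reloc) holds the word indices of those address fields. */
void bind_static(uint32_t *image, const size_t *reloc, size_t nreloc,
                 uint32_t load_offset)
{
    for (size_t i = 0; i < nreloc; i++)
        image[reloc[i]] += load_offset;   /* e.g. Jump 400 becomes Jump 1400 */
}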
Dynamic Run-time Binding
All addresses in the code segment of the Load Module are relative to 0, and they stay that way: the Process Image in Main Memory (loaded at offset 1000) still contains "Jump 400". Addresses are maintained in relative format, and address translation takes place on the fly at run time.
[Diagram: the relative address of "jump 400" is fed to an Adder along with the Base Register (1000), producing the absolute address 1400. A Comparator checks the relative address against the Limit Register; if it is out of bounds, an interrupt is raised: segment error!]
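A minimal sketch of the base/limit translation above, done in software purely for illustration (a real system does this in hardware on every memory reference):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct mmu { uint32_t base, limit; };        /* base and limit registers */

/* Translate a relative address to an absolute address, checking the limit. */
uint32_t translate(const struct mmu *m, uint32_t rel)
{
    if (rel >= m->limit) {                   /* comparator: out of range? */
        fprintf(stderr, "segment error!\n"); /* hardware raises an interrupt */
        exit(EXIT_FAILURE);
    }
    return m->base + rel;                    /* adder: 1000 + 400 = 1400 */
}

int main(void)
{
    struct mmu m = { 1000, 2000 };
    printf("%u\n", translate(&m, 400));      /* prints 1400 */
    return 0;
}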
A Code Example

...
static int gVar;           /* static variables are stored in the data segment */
...
int function_a(int arg)    /* generated code will be stored in the code segment */
{
    ...
    gVar = 7;
    libFunction(gVar);     /* libFunction() is defined in an external module;
                              at compile time we don't know the address of
                              its entry point */
    ...
}
A Code Example
The compiler produces an Object File containing the code and data segments with relative addresses, plus a Symbol Table, an External Definition Table indicating the relative entry point of each function the module defines, and an External Reference Table for names whose addresses are unknown at compile time.

Object File (relative addresses):

Code Segment:
  0000  ...
  0008  entry function_a
        ...
  0220  load r1, 7
  0224  store r1, 0036
  0228  push 0036
  0232  call libFunction
        ...
        (end of code segment)

Data Segment:
  0036  [space for gVar]
        ...
  0049  (end of data segment)

External Reference Table:
  "libFunction"  ????        (entry point unknown at compile time)

External Definition Table:
  "function_a"  0008         (relative entry point of function_a)

Symbol Table:
  ...
The linker combines this module with the other object modules and library modules into a single Load Module, still with relative addresses. The external reference is now resolved: libFunction's entry point is at 2334, gVar sits at 0136 in the combined data segment, and function_a begins at 1008.

Load Module (relative addresses):

Code Segment:
  0000  (other modules)
        ...
  1008  entry function_a
        ...
  1220  load r1, 7
  1224  store r1, 0136
  1228  push 0136
  1232  call 2334
        ...
  1399  (end of function_a)
        ...
        (other modules)
  2334  entry libFunction
        ...
  2999  (end of code segment)

Data Segment:
  0136  [space for gVar]
        ...
  1000  (end of data segment)

The loader then performs static binding, adding the load offsets to produce real addresses:
Process Image in Main Memory (real addresses; every relative address has had a load offset added, here 4000 for the code segment and 7000 for the data segment):

  0000  (other modules)
        ...
  5008  entry function_a
        ...
  5220  load r1, 7
  5224  store r1, 7136
  5228  push 7136
  5232  call 6334
        ...
  5399  (end of function_a)
        ...
        (other modules)
  6334  entry libFunction
        ...
  6999  (end of code segment)
        ...
  7136  [space for gVar]
        ...
  8000  (end of data segment)
Sharing
* Multiple processes (fork) running the same executable
* Shared memory
Physical Organization
The flow of information between the various "levels" of memory, from fast but expensive to cheap but slow:
* Registers (1 machine cycle)
* Cache
* RAM
* Disk
* Optical, tape, etc.
Memory Allocation
Before an address space can be bound to physical addresses, the memory manager must allocate the space in real memory to which the address space will be mapped. There are a number of schemes for doing memory allocation.
Fixed Partitioning
Equal-size fixed partitions:
* any process whose size is less than or equal to the partition size can be loaded into an available partition
* if all partitions are full, the operating system can swap a process out of a partition
* a program may not fit in a partition; the programmer must then design the program with overlays
Fixed Partitioning
Main memory use is inefficient: any program, no matter how small, occupies an entire partition. The space wasted inside a partition is called internal fragmentation.
But . . . it's easy to implement.
Placement Algorithm with Fixed Size Partitions
Equal-size partitions: because all partitions are of equal size, it does not matter which partition is used. Placement is trivial.
Fixed Partition with Different Sizes
An example is OS/360 MFT, where the operator fixed the partition sizes at system start-up.
Two options:
* Separate input queues
* Single input queue
Multiple Input Queues
Jobs are put into the queue for the smallest partition big enough to hold them.
[Diagram: memory from 0 to 800K; the O/S occupies 0 to 100K, Partition 1 100K to 200K, Partition 2 200K to 400K, Partition 3 400K to 700K, and Partition 4 700K to 800K; each partition has its own input queue.]
Disadvantage? Memory can go unused, even though there are jobs waiting to run that would fit.
Single Input Queue
When a partition becomes free, pick the first job on the queue that fits.
[Diagram: the same partition layout, with a single input queue feeding all partitions.]
Disadvantage? Small jobs can be put into much larger partitions than they need, wasting memory space.
Single Input Queue
Alternative solution: scan the whole queue and find the job that best fits the free partition.
Disadvantage? This discriminates against small jobs. Starvation.
CPU Utilization
From a probabilistic point of view …
Suppose that a process spends a fraction p of its time waiting for I/O to complete. With n processes in memory at once, the probability that all n processes are waiting for I/O (in which case the CPU is idle) is p^n. CPU utilization is therefore given by the formula

CPU utilization = 1 - p^n

Consider the case where processes spend 80% of their time waiting for I/O (not unusual in an interactive end-user system where most time is spent waiting for keystrokes). Notice that it requires at least 10 processes in memory to achieve roughly 90% CPU utilization (1 - 0.8^10 ≈ 0.89).
Predicting Performance
Suppose you have a computer that has 32MB of memory and that the operating system uses 16MB. If user programs average 4MB, we can then hold 4 jobs in memory at once. With an 80% average I/O wait:

CPU utilization = 1 - 0.8^4 ≈ 60%

Adding 16MB of memory allows us to have 8 jobs in memory at once, so:

CPU utilization = 1 - 0.8^8 ≈ 83%

Adding a second 16MB would only increase CPU utilization to about 93%.
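A minimal sketch that evaluates the formula above for these three configurations (the parameters are from the slides; everything else is illustrative):

#include <math.h>
#include <stdio.h>

int main(void)
{
    const double p = 0.8;             /* fraction of time blocked on I/O */
    const int jobs[] = { 4, 8, 12 };  /* 32MB, 48MB, and 64MB configurations */

    for (int i = 0; i < 3; i++)       /* utilization = 1 - p^n */
        printf("n = %2d: CPU utilization = %2.0f%%\n",
               jobs[i], 100.0 * (1.0 - pow(p, jobs[i])));
    return 0;
}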
Dynamic Partitioning
Partitions are of variable length and number.
A process is allocated exactly as much
memory as it requires.
Eventually you get holes in the memory. This
is called external fragmentation.
You must use compaction to shift processes
so they are contiguous and all free memory is
in one block.
For Example …
[Diagram sequence: 64M of memory; the O/S occupies 8M, leaving 56M free.]
1. Process 1 (20M) is loaded; 36M remain free.
2. Process 2 (14M) and Process 3 (18M) are loaded; a 4M hole remains.
3. Process 2 is swapped out; Process 4 (10M) is loaded into the resulting 14M hole, leaving a 4M hole.
4. Process 1 is swapped out; Process 5 (16M) is loaded into the resulting 20M hole, leaving another 4M hole.
The final layout is: O/S (8M), Process 5 (16M), 4M hole, Process 4 (10M), 4M hole, Process 3 (18M), 4M hole.
Fragmentation!
Periodically the O/S could do memory compaction, like disk compaction: copy all of the blocks of code for loaded processes into contiguous memory locations, thus opening larger unused blocks of free memory.
The problem is that this is expensive!
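A minimal sketch of compaction over a simulated memory array; the block table, its sorted-by-address assumption, and the memmove-based copying are illustrative assumptions, not details from the slides:

#include <stddef.h>
#include <string.h>

struct block { size_t start, len; };   /* one entry per loaded process */

/* Slide every process block down toward address 0 (blocks are assumed
   sorted by start address), so all free memory becomes one hole at the
   top. Returns the start of that single free hole. */
size_t compact(unsigned char *mem, struct block *b, size_t nblocks)
{
    size_t next = 0;                          /* next free address */
    for (size_t i = 0; i < nblocks; i++) {
        if (b[i].start != next)
            memmove(mem + next, mem + b[i].start, b[i].len);
        b[i].start = next;                    /* process is relocated */
        next += b[i].len;
    }
    return next;
}

Every byte of every moved process gets copied, and each moved process must be rebound (its base register reloaded), which is why compaction is expensive.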
A related question: How much memory do you
allocate to a process when it is created or
swapped in?
In most modern computer languages data can be
created dynamically.
The Heap
This may come as a surprise…
Dynamic memory allocation with malloc, or new, does not really cause system memory to be dynamically allocated to the process. In most implementations, the linker anticipates the use of dynamic memory and reserves space to honor such requests. The linker reserves space for both the process's run-time stack and its heap. Thus a malloc( ) call returns an address within the existing address space reserved for the process.
Only when this space is used up does a system call to the kernel take place to get more memory. The address space may have to be rebound.
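A minimal sketch that makes this visible: successive small allocations come back with nearby addresses inside the space the process already owns, without a kernel call per request (the exact addresses printed will vary by system):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Small allocations are carved out of heap space the process
       already has; no system call is needed for each one. */
    for (int i = 0; i < 4; i++) {
        void *p = malloc(32);
        if (p == NULL)
            return 1;
        printf("allocation %d at %p\n", i, p);  /* addresses cluster together */
    }
    return 0;
}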
Managing Dynamically
Allocated Memory
When managing memory dynamically, the operating
system must keep track of the free and used blocks
of memory.
Common methods used are bitmaps and linked lists.
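For the bitmap approach, a minimal sketch assuming one bit per fixed-size allocation unit (the names and sizes are illustrative):

#include <stdint.h>

#define UNITS 1024                       /* number of allocation units */
static uint8_t bitmap[UNITS / 8];        /* one bit per unit: 1 = in use */

void mark(int unit, int used)            /* set or clear a unit's bit */
{
    if (used)
        bitmap[unit / 8] |= (uint8_t)(1u << (unit % 8));
    else
        bitmap[unit / 8] &= (uint8_t)~(1u << (unit % 8));
}

int in_use(int unit)                     /* test a unit's bit */
{
    return (bitmap[unit / 8] >> (unit % 8)) & 1;
}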
Linked List Allocation
Memory is divided up into some number of fixed-size allocation units. The O/S keeps a linked list of nodes, one per segment, sorted by address. Each node records whether the segment is a Process (P) or a Hole (H), the unit at which it starts, and its length; X marks the end of the list:

P 0 5 → H 5 3 → P 8 6 → P 14 4 → H 18 2 → P 20 6 → P 26 3 → H 29 3 → X
Linked List Allocation
When a process ends, just merge its node with the hole next to it (if one exists). We want contiguous blocks! Suppose the process at unit 20 terminates:

P 0 5 → H 5 3 → P 8 6 → P 14 4 → H 18 2 → [P 20 6] → P 26 3 → H 29 3 → X
Linked List Allocation
The freed node merges with the adjacent hole H 18 2, leaving a single hole H 18 8:

P 0 5 → H 5 3 → P 8 6 → P 14 4 → H 18 8 → P 26 3 → H 29 3 → X
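A minimal sketch of such a list node and the merge-on-free step; the names and the singly linked representation (with the caller passing the predecessor) are illustrative assumptions:

#include <stdlib.h>

enum kind { PROC, HOLE };

struct node {
    enum kind kind;          /* P or H */
    int start, len;          /* in allocation units */
    struct node *next;       /* list kept sorted by address */
};

/* Mark a process node as a hole, then coalesce with its neighbors.
   prev is the node before n, or NULL if n is the list head. */
void free_block(struct node *prev, struct node *n)
{
    n->kind = HOLE;
    if (n->next && n->next->kind == HOLE) {      /* merge with right hole */
        struct node *r = n->next;
        n->len += r->len;
        n->next = r->next;
        free(r);
    }
    if (prev && prev->kind == HOLE) {            /* merge with left hole */
        prev->len += n->len;                     /* e.g. H 18 2 + P 20 6 = H 18 8 */
        prev->next = n->next;
        free(n);
    }
}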
When blocks are managed this way, there are several algorithms that
the O/S can use to find blocks for a new process, or one being
swapped in from disk.
Dynamic Partitioning
Placement Algorithms
Best-fit algorithm
Search the entire list and choose the smallest block that will hold the request. This algorithm is the worst performer overall: since the smallest possible block is found for each process, it tends to leave lots of tiny holes that are not useful.
[Diagram: the smallest block the process will fit in, leaving a tiny hole.]
Dynamic Partitioning
Placement Algorithms
Worst-fit: a variation of best-fit
This scheme is like best-fit, but when looking for a new block it picks the largest block of unallocated memory. The idea is that external fragmentation will then leave bigger holes, so it is more likely that another block will fit.
[Diagram: the largest block of unallocated memory, leaving a big hole.]
Dynamic Partitioning
Placement Algorithms
First-fit algorithm
Finds the first block in the list that will fit. This may end up with many processes loaded at the front end of memory, and these must be searched over when trying to find a free block.
Dynamic Partitioning
Placement Algorithms
Next-fit: a variation of first-fit
This scheme is like first-fit, but when looking for a new block it begins its search where it left off the last time. This algorithm actually performs slightly worse than first-fit.
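A minimal sketch of the first three policies over the node list used earlier (next-fit would additionally remember the node where the previous search stopped); the node type is repeated so the fragment stands alone:

enum kind { PROC, HOLE };
struct node { enum kind kind; int start, len; struct node *next; };

/* Best-fit: the smallest hole that will hold the request. */
struct node *best_fit(struct node *head, int len)
{
    struct node *best = NULL;
    for (struct node *n = head; n != NULL; n = n->next)
        if (n->kind == HOLE && n->len >= len &&
            (best == NULL || n->len < best->len))
            best = n;
    return best;
}

/* Worst-fit: the largest hole of all. */
struct node *worst_fit(struct node *head, int len)
{
    struct node *worst = NULL;
    for (struct node *n = head; n != NULL; n = n->next)
        if (n->kind == HOLE && n->len >= len &&
            (worst == NULL || n->len > worst->len))
            worst = n;
    return worst;
}

/* First-fit: the first hole that fits. */
struct node *first_fit(struct node *head, int len)
{
    for (struct node *n = head; n != NULL; n = n->next)
        if (n->kind == HOLE && n->len >= len)
            return n;
    return NULL;
}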
Swapping
Swapping is used primarily in timeshared systems with single-threaded processes. It optimizes system performance by removing a process from memory when its thread is blocked. When a process is moved to the ready state, the process manager notifies the memory manager so that the address space can be swapped in again when space is available. Swapping requires relocation hardware. It can also be used when the memory requirements of the processes running on the system exceed available memory.
System Costs to do Swapping
If a process requires S units of primary storage, and a disk block holds D units of primary storage, then

ceiling(S / D)

disk writes are required to swap the address space to disk. The same number of reads are required to swap the address space back into primary storage.
For example, if a process is using 1000 bytes of memory, and disk blocks are 256 bytes, then ceiling(1000 / 256) = 4 disk writes are required.
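In code, the ceiling is the usual integer round-up idiom; a minimal sketch checking the example above:

#include <stdio.h>

/* Number of disk blocks (writes) needed to swap out S units
   when a disk block holds D units: ceiling(S / D). */
unsigned blocks_needed(unsigned s, unsigned d)
{
    return (s + d - 1) / d;                    /* round up without floats */
}

int main(void)
{
    printf("%u\n", blocks_needed(1000, 256));  /* prints 4 */
    return 0;
}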
Suppose that a process requiring S units of primary storage is blocked for T units of time. The resource wasted because the process stays in memory is

S x T.

What criteria would you use to determine whether or not to swap the process out of primary storage?

How big is S? If it is small, then the amount of storage made available for other processes to use is small, and another process may not fit. Swapping would be wasteful if there is not a process that would fit in the storage made available.

How big is T? If T is small, then the process will begin competing for primary storage too quickly to make the swap effective. If T < R, the process will begin requesting memory before it is even completely swapped out (R is the time required to swap).

For swapping to be effective, T must be considerably larger than 2R for every process that the memory manager chooses to swap out, and S must be large enough for other processes to execute.
S is known. Can T be predicted?
When a process is blocked on a slow I/O device, the memory
manager can estimate a lower bound.
What about when a process is blocked by a semaphore
operation?
Example Test Questions
A memory manager for a variable-sized region strategy has a free list of memory blocks of the following sizes:
600, 400, 1000, 2200, 1600, 2500, 1050
* Which block will be selected to honor a request for 1603 bytes using a best-fit policy? 2200
* Which block will be selected to honor a request for 949 bytes using a best-fit policy? 1000
* Which block will be selected to honor a request for 1603 bytes using a worst-fit policy? 2500
If you were designing an operating system and had to determine the best way to sort the free block list, how would you sort it for each of the following policies, and why?
* Best-fit: smallest to largest free block
* Worst-fit: largest to smallest free block