Storage Management

Download Report

Transcript Storage Management

Storage Management
The stack and the heap
• Dynamic storage allocation refers to
allocating space for variables at run time
• Most modern languages support dynamic
storage allocation on both a stack and a heap
• It is possible to have many stacks and heaps
• Allocated storage must eventually be
deallocated (freed)
The stack
• When you enter a subprogram (function,
procedure, method), space is allocated on
the stack for
– local variables
– formal parameters
• This space persists until the subprogram
returns
• Storage allocation/deallocation is simple
Using the stack
• Stack allocation/deallocation is automatic
• You have names for variables on the stack
– because they're local variables or parameters
• Stack allocation is perfect for recursion
– Recursive calls use the same names but
different storage
• The storage is released when function exits
– This is not always what you want
Purpose of the heap
• Suppose you write a function whose job is
to create a data structure, and suppose that
data structure is placed on the stack
• The data structure disappears when you exit
the function
• You need dynamic storage that you, the
programmer, can control
– The heap provides that dynamic storage
Pointers
• Allocating storage from the heap is easy
– Person p = new Person ( );
• You get a pointer to the new storage
• C and C++ provide operations on pointers
– For example, p++;
• You can have many pointers to the same storage
• Pointers are pervasive in C and C++; you
can't avoid them
References
• The Java name for pointers is references
– The only real difference is that Java does not
provide (many) operations on pointers
• Pointer arithmetic is inherently unsafe
– You can accidentally point to the wrong thing
– You cannot in general be sure of the type of the
thing you are pointing to
– Java avoids these problems
Advantages/disadvantages
• Pointers give you:
– Greater flexibility and (maybe) convenience
– A much more complicated syntax
– More ways to create hard-to-find errors
• References give you:
– Less flexibility (no pointer arithmetic)
– Simpler syntax, like that of any other variable
– Much safer programs with fewer mysterious bugs
Deallocation
• There are two potential errors when
deallocating (freeing) storage yourself:
– Deallocating too soon, so that you have
dangling references (pointers to storage that has
been freed and possibly reused for something
else)
– Forgetting to deallocate, so that unused storage
accumulates and you have a memory leak
Ownership
• If you have to deallocate storage yourself, a
good strategy is that of ownership
• The function that owns the storage is
responsible for deallocating it
• Ownership can be transferred
– You just need a clearly defined policy for
determining ownership
– In practice, this is easier said than done
Discipline
• Most C/C++ advocates say:
– It's just a matter of being disciplined
– I'm disciplined, even if other people aren't
– Besides, there are good tools for finding
memory problems
• However:
– Virtually all large C/C++ programs have
memory problems
Garbage collection
• Garbage is storage that has been allocated
but is not longer available to the program
• It's easy to create garbage; just assign a new
value to the pointer to that storage
• A garbage collector automatically searches
out garbage and deallocates it
• Practically every modern language, except
C++, uses a garbage collector
Garbage collection algorithms
• There are two well-known algorithms (and
several not so well known ones) for doing
garbage collection:
– Reference counting
– Mark and sweep
Reference counting
• When storage is allocated, make it a bit larger
so it can hold an integer reference count
• The reference count keeps track of how many
pointers the program has to the storage
• Any assignment to a pointer variable modifies
reference counts
• When a reference count reaches zero, the
storage can be garbage collected
Problems with reference counting
• If A points to B, and B points to A, then
each is referenced, even if nothing else in
the program points to either one
• This fools the garbage collector, which
doesn't collect either A or B
• Thus, reference counting is imperfect and
unreliable; memory leaks still happen
Mark and sweep
• When memory runs low, stop processing and
run the garbage collector
• The collector marks every bit of heap storage
• It then looks at every program variable,
follows every pointer, and unmarks all the
storage it can reach
• When done, the marked storage must not be
accessible; it is garbage that can be freed
Problems with mark and sweep
• Mark-and-sweep is a complex algorithm
that takes substantial time
• Unlike reference counting, it must be done
all at once--nothing else can be going on
• The program stops responding during
garbage collection
• This is unsuitable for real-time applications
Garbage collection in Java
• Java uses mark-and-sweep
• Highly reliable, but may cause unexpected
slowdowns
• You can ask Java to do garbage collection
when the program has some time to spare
– But not all implementations respect your request
• This problem is known and being worked on
The End