Automatic Storage Management

Download Report

Transcript Automatic Storage Management

Automatic Storage
Management
Patrick Earl
Simon Leonard
Jack Newton
Overview








Terminology
Why use Automatic Storage Management?
Comparing garbage collection algorithms
The “Classic” algorithms
Copying garbage collection
Incremental Tracing garbage collection
Generational garbage collection
Conclusions
CMPUT 425/525: Automatic Storage Management
2
Terminology

Stack: a memory area where activation
records or frames are pushed onto when a
procedure is called and popped off when it
returns
 Heap: a memory area where data structures
can be allocated and deallocated in any
order.
CMPUT 425/525: Automatic Storage Management
3
Terminology
(Continued)

Roots: values that a program can manipulate
directly (i.e. values held in registers, on the
program stack, and global variables.)
 Node/Cell/Object: an individually allocated piece
of data in the heap.
 Children Nodes: the list of pointers that a given
node contains.
 Live Node: a node whose address is held in a root
or is the child of a live node.
CMPUT 425/525: Automatic Storage Management
4
Terminology
(Continued)

Garbage: nodes that are not live, but are not
free either.
 Garbage collection: the task of recovering
(freeing) garbage nodes.
 Mutator: The program running alongside
the garbage collection system.
CMPUT 425/525: Automatic Storage Management
5
Why Garbage Collect?

Language requirements
– In some situations it may be impossible to
know when a shared data structure is no longer
in use.
CMPUT 425/525: Automatic Storage Management
6
Why Garbage Collect?
(Continued)

Software Engineering
– Garbage collection increases abstraction level
of software development.
– Simplified interfaces and decreases coupling of
modules.
– Studies have shown a significant amount of
development time is spent on memory
management bugs [Rovner, 1985].
CMPUT 425/525: Automatic Storage Management
7
Comparing Garbage
Collection Algorithms

Directly comparing garbage collection algorithms
is difficult – there are many factors to consider.
 Some factors to consider:
–
–
–
–
–
–
–
Cost of reclaiming cells
Cost of allocating cells
Storage overhead
How does the algorithm scale with residency?
Will user program be suspended during garbage collection?
Does an upper bound exist on the pause time?
Is locality of data structures maintained (or maybe even
improved?)
CMPUT 425/525: Automatic Storage Management
8
Classes of Garbage Collection
Algorithms

Direct Garbage Collectors: a record is associated
with each node in the heap. The record for node N
indicates how many other nodes or roots point to
N.
 Indirect/Tracing Garbage Collectors: usually
invoked when a user’s request for memory fails
because the free list is exhausted. The garbage
collector visits all live nodes, and returns all other
memory to the free list. If sufficient memory has
been recovered from this process, the user’s
request for memory is satisfied.
CMPUT 425/525: Automatic Storage Management
9
Quick Review:
Reference Counting

Every cell has an additional field: the
reference count. This field represents the
number of pointers to that cell from roots or
heap cells.
 Initially, all cells in the heap are placed in a
pool of free cells, the free list.
CMPUT 425/525: Automatic Storage Management
10
Reference Counting
(Continued)

When a cell is allocated from the free list, its
reference count is set to one.
 When a pointer is set to reference a cell, the cell’s
reference count is incremented by 1; if a pointer is
to the cell is deleted, its reference count is
decremented by 1.
 When a cell’s reference count reaches 0, its
pointers to its children are deleted and it is
returned to the free list.
CMPUT 425/525: Automatic Storage Management
11
Reference Counting Example
1
0
2
1
0
0
1
0
1
CMPUT 425/525: Automatic Storage Management
12
Reference Counting Example
(Continued)
1
2
1
0
1
1
CMPUT 425/525: Automatic Storage Management
13
Reference Counting Example
(Continued)
1
2
1
1
0
1
0
1
CMPUT 425/525: Automatic Storage Management
14
Reference Counting Example
(Continued)
1
2
1
1
0
Returned to
free list
1
0
1
0
CMPUT 425/525: Automatic Storage Management
15
Reference Counting:
Advantages and Disadvantages

Advantages:
– Garbage collection overhead is distributed.
– Locality of reference is no worse than mutator.
– Free memory is returned to free list quickly.
CMPUT 425/525: Automatic Storage Management
16
Reference Counting:
Advantages and Disadvantages
(Continued)

Disadvantages:
– High time cost (every time a pointer is changed,
reference counts must be updated).
– Storage overhead for reference counter can be
high.
– Unable to reclaim cyclic data structures.
– If the reference counter overflows, the object
becomes permanent.
CMPUT 425/525: Automatic Storage Management
17
Reference Counting:
Cyclic Data Structure - Before
1
2
0
0
2
0
1
CMPUT 425/525: Automatic Storage Management
18
Reference Counting:
Cyclic Data Structure – After
1
1
0
0
2
0
1
CMPUT 425/525: Automatic Storage Management
19
Deferred Reference Counting
 Optimisation
– Cost can be improved by special treatment of
local variables.
– Only update reference counters of objects on the
stack at fixed intervals.
– Reference counts are still affected from pointers
from one heap object to another.
CMPUT 425/525: Automatic Storage Management
20
Quick Review: Mark-Sweep

The first tracing garbage collection algorithm
 Garbage cells are allowed to build up until heap
space is exhausted (i.e. a user program requests a
memory allocation, but there is insufficient free
space on the heap to satisfy the request.)
 At this point, the mark-sweep algorithm is
invoked, and garbage cells are returned to the free
list.
CMPUT 425/525: Automatic Storage Management
21
Mark-Sweep
(Continued)

Performed in two phases:
– Mark phase: identifies all live cells by setting
a mark bit. Live cells are cells reachable from a
root.
– Sweep phase: returns garbage cells to the free
list.
CMPUT 425/525: Automatic Storage Management
22
Mark-Sweep Example
Returned to
free list
CMPUT 425/525: Automatic Storage Management
23
Mark-Sweep:
Advantages and Disadvantages

Advantages:
– Cyclic data structures can be recovered.
– Tends to be faster than reference counting.
CMPUT 425/525: Automatic Storage Management
24
Mark-Sweep:
Advantages and Disadvantages
(Continued)

Disadvantages:
– Computation must be halted while garbage
collection is being performed
– Every live cell must be visited in the mark
phase, and every cell in the heap must be
visited in the sweep phase.
– Garbage collection becomes more frequent as
residency of a program increases.
– May fragment memory.
CMPUT 425/525: Automatic Storage Management
25
Mark-Sweep:
Advantages and Disadvantages
(Continued)
 Disadvantages:
– Has negative implications for locality of
reference. Old objects get surrounded by new
ones (not suited for virtual memory
applications).

However, if objects tend to survive in clusters in
memory, as they apparently often do, this can greatly
reduce the cost of the sweep phase.
CMPUT 425/525: Automatic Storage Management
26
Mark-Compact Collection
 Remedy
the fragmentation and allocation
problems of mark-sweep collectors.
 Two phases:
– Mark phase: identical to mark sweep.
– Compaction phase: marked objects are
compacted, moving most of the live objects until
all the live objects are contiguous.
CMPUT 425/525: Automatic Storage Management
27
Mark-Compact:
Advantages and Disadvantages
(Continued)
 Advantages:
– The contiguous free area eliminates
fragmentation problem. Allocating objects of
various sizes is simple.
– The garbage space is "squeezed out", without
disturbing the original ordering of objects. This
ameliorate locality.
CMPUT 425/525: Automatic Storage Management
28
Mark-Compact:
Advantages and Disadvantages
(Continued)
 Disadvantages:
– Requires several passes over the data are
required. "Sliding compactors" takes two, three
or more passes over the live objects.


One pass computes the new location
Subsequent passes update the pointers to refer to new
locations, and actually move the objects
CMPUT 425/525: Automatic Storage Management
29
Copying Garbage Collection




Like mark-compact, copying garbage collection
does not really "collect" garbage.
Rather it moves all the live objects into one area
and the rest of the heap is know to be available.
Copying collectors integrate the traversal and the
copying process, so that objects need only be
traversed once.
The work needed is proportional to the amount of
live date (all of which must be copied).
CMPUT 425/525: Automatic Storage Management
30
Semispace Collector Using the
Cheney Algorithm
 The
heap is subdivided into two contiguous
subspaces (FromSpace and ToSpace).
 During normal program execution, only one
of these semispaces is in use.
 When the garbage collector is called, all the
live data are copied from the current
semispace (FromSpace) to the other
semispace (ToSpace).
CMPUT 425/525: Automatic Storage Management
31
Semispace Collector Using
the Cheney Algorithm
A
B
C
D
FromSpace
CMPUT 425/525: Automatic Storage Management
ToSpace
32
Semispace Collector Using
the Cheney Algorithm
A
B
C
D
A
B
C
D
FromSpace
CMPUT 425/525: Automatic Storage Management
ToSpace
33
Semispace Collector Using the
Cheney Algorithm
(Continued)
 Once
the copying is completed, the ToSpace
is made the "current" semispace.
 A simple form of copying traversal is the
Cheney algorithm.



The immediately reachable objects from the initial
queue of objects for a breadth-first traversal.
A scan pointer is advanced through the first object
location by location.
Each time a pointer into FromSpace is encountered,
the referred-to-object is transported to the end of the
queue and the pointer to the object is updated.
CMPUT 425/525: Automatic Storage Management
34
Cheney Algorithm: Example
A
B
A
scan
Root Nodes
B
F
free
B
C
B
scan
C
A
scan
free
E
C
D
A
A
A
CMPUT 425/525: Automatic Storage Management
free
B
scan
B
D
C
C
scan
D
E
free
D
E
F
free
35
Semispace Collector Using the
Cheney Algorithm
(Continued)
 Multiple
paths must not be copied to tospace
multiple times.
 When an object is transported to tospace, a
forwarding pointer is installed in the old
version of the object.
 The forwarding pointer signifies that the old
object is obsolete and indicates where to find
the new copy.
CMPUT 425/525: Automatic Storage Management
36
Copying Garbage Collection:
Advantages and Disadvantages
 Advantages:
– Allocation is extremely cheap.
– Excellent asymptotic complexity.
– Fragmentation is eliminated.
– Only one pass through the data is required.
CMPUT 425/525: Automatic Storage Management
37
Copying Garbage Collection:
Advantages and Disadvantages
(Continued)
 Disadvantages:
– The use of two semi-spaces doubles memory
requirement needs
– Poor locality. Using virtual memory will cause
excessive paging.
CMPUT 425/525: Automatic Storage Management
38
Problems with Simple Tracing
Collectors
 Difficult
to achieve high efficiency in a
simple garbage collector, because large
amounts of memory are expensive.
 If virtual memory is used, the poor locality of
the allocation/reclamation cycle will cause
excessive paging.
 Even as main memory becomes steadily
cheaper, locality within cache memory
becomes increasingly important.
CMPUT 425/525: Automatic Storage Management
39
Problems with Simple Tracing
Collectors
(Continued)
 With
a simple semispace copy collector,
locality is likely to be worse than marksweep.
 The memory issue is not unique to copying
collectors.
 Any efficient garbage collection involves a
trade-off between space and time.
 The problem of locality is an indirect result
of the use of garbage collection.
CMPUT 425/525: Automatic Storage Management
40
Incremental Tracing Collectors
Overview

Introduction to Incremental Collectors
 Coherence and Conservatism
 Tricolor Marking
 Write Barrier Algorithms
 Baker’s Read Barrier Algorithm
CMPUT 425/525: Automatic Storage Management
41
Incremental Tracing Collectors

Program (Mutator) and Garbage Collector
run concurrently.
– Can think of system as similar to two threads.
One performs collection, and the other
represents the regular program in execution.

Can be used in systems with real-time
requirements. For example, process control
systems.
CMPUT 425/525: Automatic Storage Management
42
Coherence & Conservatism

Coherence: A proper state must be
maintained between the mutator and the
collector.
 Conservatism: How aggressive the garbage
collector is at finding objects to be
deallocated.
CMPUT 425/525: Automatic Storage Management
43
Tricoloring
White – Not yet traversed. A candidate for
collection.
 Black – Already traversed and found to be
live. Will not be reclaimed.
 Grey – In traversal process. Defining
characteristic is that it’s children have not
necessarily been explored.

CMPUT 425/525: Automatic Storage Management
44
The Tricolor Abstraction
CMPUT 425/525: Automatic Storage Management
45
Tricoloring Invariant

There must not be a pointer from a black
object to a white object.
CMPUT 425/525: Automatic Storage Management
46
Violation of Coloring Invariant
A
B
A
C
D
Before
CMPUT 425/525: Automatic Storage Management
C
B
D
After
47
Steps in Violation

Read a pointer to a white object
 Assign that pointer to a black object
 Original pointer must be destroyed without
collection system noticing.
CMPUT 425/525: Automatic Storage Management
48
Read Barrier

Barriers are essentially memory access
detection systems.
 We detect when any pointers to any white
objects are read.
 If a read to the pointer occurs, we
conceptually color that object grey.
CMPUT 425/525: Automatic Storage Management
49
Write Barrier

When a pointer is written to an object, we
record the write somehow.
 The recorded write is dealt with at a later
point.
 Read vs. Write efficiency considerations.
CMPUT 425/525: Automatic Storage Management
50
Write Barrier Algorithms

Snapshot-at-beginning
 Incremental update
CMPUT 425/525: Automatic Storage Management
51
Snapshot-at-beginning

Conceptually makes a copy-on-write
duplication of the pointer graph.
 Can be implemented with a simple write
barrier that records pointer writes and adds
the old addresses to a stack to be traversed
later.
CMPUT 425/525: Automatic Storage Management
52
Snapshot-at-beginning
Example
A
B
A
C
D
Before
CMPUT 425/525: Automatic Storage Management
C
B
Pointer to
D is now
On stack
D
After
Stack
53
Comments on Snapshot-atbeginning

Very conservative.
 All overwritten pointer values are saved and
traversed.
 No objects can be freed while collection
process is occurring.
CMPUT 425/525: Automatic Storage Management
54
Incremental Update WriteBarrier Algorithm

No copy of tree is made.
 Catches overwrites of pointers that have
been copied.
– If a pointer is not copied before being written, it
will be freed.

The object with the overwritten pointer is
colored grey and the algorithm must search
that node again at the end.
CMPUT 425/525: Automatic Storage Management
55
Incremental Update Example
A
B
A
C
D
Before
CMPUT 425/525: Automatic Storage Management
C
B
D
After
56
Comments on Incremental
Update

Things that are freed during collection are
far more likely to be collected than with the
snapshot algorithm. (Less conservative)
 Although the collector restarts the traversal
in some places, it is guaranteed to do a full
search and will eventually terminate.
CMPUT 425/525: Automatic Storage Management
57
Baker’s Read Barrier
Algorithms

Incremental Copying
 Non-copying Algorithm (The Treadmill)
CMPUT 425/525: Automatic Storage Management
58
Incremental Copying

Variation of Copying Collector
 “Garbage collection cycle begins with an
atomic flip.”
 All objects directly pointed to by the root
are copied into tospace.
CMPUT 425/525: Automatic Storage Management
59
Read Barrier in Incremental
Copying

Whenever an object is read that is not
already in ToSpace, the read barrier catches
that and copies the object over to ToSpace at
that point.
 Normal “background scavenging” occurs
simultaneously to ensure that all objects are
traversed and reclamation can occur.
CMPUT 425/525: Automatic Storage Management
60
Incremental Copying Example
A
B
C
A
B
C
D
D
E
FromSpace
ToSpace
Atomic Flip, then a read to D occurs…
CMPUT 425/525: Automatic Storage Management
61
Comments on Read Barrier

If implemented in software can be quite
slow due to numerous reads to heap.
 Specialized hardware is available on some
unique machines that allow this type of
tracing to be done quickly.
CMPUT 425/525: Automatic Storage Management
62
Baker’s Incremental NonCopying Algorithm

Allocation
Free
New
To
From
Scanning
CMPUT 425/525: Automatic Storage Management
Doubly Linked
Lists
 New area for
allocations since
started collection
 To/From spaces
 Free list
63
Example - Allocation

Allocation
Free
Take an object from
the free list and move
it to the new list.
New
To
From
Scanning
CMPUT 425/525: Automatic Storage Management
64
Example - Scanning

Allocation
Free
New
To
From
Scanning
CMPUT 425/525: Automatic Storage Management
Searching nodes in
ToSpace for references
to objects in
FromSpace.
 When found, object is
unlinked in
FromSpace and is
linked in ToSpace.
65
Treadmill Workings

When starting collection cycle:
– New list is empty
– From list contains all New and To objects from
last cycle.

Collection proceeds and scanning and
allocation are performed.
 When finished:
– From list is merged with Free list.
CMPUT 425/525: Automatic Storage Management
66
Comments on Treadmill

As in Incremental Copying, the garbage
found in the FromSpace is reclaimed in
constant time.
 Conservative with new objects
 Conservative also in that reached objects
will not be removed even if they become
garbage before scan ends.
CMPUT 425/525: Automatic Storage Management
67
Incremental Collectors
Summary

Incremental Tracing Collectors
 Tricolor Marking and Invariant
 Read and Write Barriers
 Snapshot-at-beginning
 Incremental Update
 Baker’s Incremental Copying
 Baker’s Non-copying (Treadmill)
CMPUT 425/525: Automatic Storage Management
68
Generational Garbage
Collection

Attempts to address weaknesses of simple tracing
collectors such as mark-sweep and copying
collectors:
– All active data must be marked or copied.
– For copying collectors, each page of the heap is
touched every two collection cycles, even though the
user program is only using half the heap, leading to
poor cache behavior and page faults.
– Long-lived objects are handled inefficiently.
CMPUT 425/525: Automatic Storage Management
69
Generational Garbage
Collection
(Continued)

Generational garbage collection is based on
the generational hypothesis:
Most objects die young.
 As such, concentrate garbage collection
efforts on objects likely to be garbage:
young objects.
CMPUT 425/525: Automatic Storage Management
70
Generational Garbage
Collection: Object Lifetimes

When we discuss object lifetimes, the
amount of heap allocation that occurs
between the object’s birth and death is used
rather than the wall time.
 For example, an object created when 1Kb of
heap was allocated and was no longer
referenced when 4 Kb of heap data was
allocated would have lived for 3Kb.
CMPUT 425/525: Automatic Storage Management
71
Generational Garbage
Collection: Object Lifetimes
(Continued)

Typically, between 80 and 98 percent of all
newly-allocated heap objects die before
another megabyte has been allocated.
CMPUT 425/525: Automatic Storage Management
72
Generational Garbage
Collection
(Continued)

Objects are segregated into different areas
of memory based on their age.
 Areas containing newer objects are garbage
collected more frequently.
 After an object has survived a given number
of collections, it is promoted to a less
frequently collected area.
CMPUT 425/525: Automatic Storage Management
73
Generational Garbage
Collection: Example
C
B
A
S
Root Set
Memory Usage
Memory Usage
Old Generation
New Generation
CMPUT 425/525: Automatic Storage Management
74
Generational Garbage
Collection: Example
(Continued)
R
C
B
A
S
Root Set
Memory Usage
Memory Usage
Old Generation
New Generation
CMPUT 425/525: Automatic Storage Management
75
Generational Garbage
Collection: Example
(Continued)
R
D
C
B
A
S
Root Set
Memory Usage
Memory Usage
Old Generation
New Generation
CMPUT 425/525: Automatic Storage Management
76
Generational Garbage
Collection: Example
(Continued)

This example demonstrates several interesting
characteristics of generational garbage collection:
– The young generation can be collected independently of
the older generations (resulting in shorter pause times).
– An intergenerational pointer was created from R to D.
These pointers must be treated as part of the root set of
the New Generation.
– Garbage collection in the new generation result in S
becoming unreachable, and thus garbage. Garbage in
older generations (sometimes called tenured garbage)
can not be reclaimed via garbage collections in younger
generations.
CMPUT 425/525: Automatic Storage Management
77
Generational Garbage
Collection: Implementation

Usually implemented as a copying collector,
where each generation has its own semispace:
FromSpace
FromSpace
ToSpace
ToSpace
Old Generation
CMPUT 425/525: Automatic Storage Management
New Generation
78
Generational Garbage
Collection: Issues

Choosing an appropriate number of
generations:
– If we benefit from dividing the heap into two
generations, can we further benefit by using
more than two generations?

Choosing a promotion policy:
– How many garbage collections should an object
survive before being moved to an older
generation?
CMPUT 425/525: Automatic Storage Management
79
Generational Garbage
Collection: Issues
(Continued)

Tracking intergenerational pointers:
– Inter-generational pointers need to be tracked,
since they form part of the root set for younger
generations.

Collection Scheduling
– Can we attempt to schedule garbage collection
in such a way that we minimize disruptive
pauses?
CMPUT 425/525: Automatic Storage Management
80
Generational Garbage Collection:
Multiple Generations
Generation 4
Generation 3
Generation 2
CMPUT 425/525: Automatic Storage Management
Generation 1
81
Generational Garbage Collection:
Multiple Generations
(Continued)

Advantages:
– Keeps youngest generation’s size small.
– Helps address mistakes made by the promotion policy by creating
more intermediate generations that still get garbage collected fairly
frequently.

Disadvantages:
– Collections for intermediate generations may be disruptive.
– Tends to increase number of inter-generational pointers, increasing
the size of the root set for younger generations.

Most generational collectors are limited to just two or three
generations.
CMPUT 425/525: Automatic Storage Management
82
Generational Garbage Collection:
Promotion Policies

A promotion policy determines how many
garbage collections cycles (the cycle count)
an object must survive before being
advanced to the next generation.
 If the cycle count is too low, objects may be
advanced too fast; if too high, the benefits
of generational garbage collection are not
realized.
CMPUT 425/525: Automatic Storage Management
83
Generational Garbage Collection:
Promotion Policies
(Continued)

With a cycle count of just one, objects created just
before the garbage collection will be advanced,
even though the generational hypothesis states
they are likely to die soon.
 Increasing the cycle count to two denies
advancement to recently created objects.
 Under most conditions, it increasing the cycle
count beyond two does not significantly reduce
the amount of data advanced.
CMPUT 425/525: Automatic Storage Management
84
Generational Garbage Collection:
Inter-generational Pointers

Inter-generational pointers can be created in two
ways:
– When an object containing pointers is promoted to an
older generation.
– When a pointer to an object in a newer generation is
stored in an object.

The garbage collector can easily detect promotioncaused inter-generational pointers, but handling
pointer stores is a more complicated task.
CMPUT 425/525: Automatic Storage Management
85
Generational Garbage Collection:
Inter-generational Pointers

Pointer stores can be tracked via the use of a
write barrier:
– Pointer stores must be accompanied by extra
bookkeeping instructions that let the garbage
collector know of pointers that have been
updated.

Often implemented at the compiler level.
CMPUT 425/525: Automatic Storage Management
86
Generational Garbage Collection:
Inter-generational Pointers
(Continued)

However, write barriers only provide a
conservative estimation of live intergenerational
pointers:
Root Set
Old Generation
New Generation
CMPUT 425/525: Automatic Storage Management
87
Generational Garbage Collection:
Inter-generational Pointers
(Continued)

Tracking inter-generational pointers are
often the largest cost of generational
garbage collection.
 1 percent of a typical Lisp program’s total
instruction count are pointer stores. If a
write barrier adds 10 instructions to a
pointer store, overall performance will drop
by 10 percent.
CMPUT 425/525: Automatic Storage Management
88
Generational Garbage Collection:
Inter-generational Pointers
(Continued)

Entry Tables
– Pointers from older generations point indirectly
to younger generations via an entry table:
Generation 3
Generation 2
Generation 1
Entry Table
Entry Table
CMPUT 425/525: Automatic Storage Management
89
Generational Garbage Collection:
Inter-generational Pointers
(Continued)

Entry Table: Advantages
– When a younger generation is collected, only the entry
table for that generation needs to be scanned.

Entry Table: Disadvantages
– Entry table may contain several entries to the same
object, making scans of the object table proportional to
the number of pointer stores rather than to the number
of inter-generational pointers.
– High overhead because of extra level of indirection.
CMPUT 425/525: Automatic Storage Management
90
Generational Garbage Collection:
Inter-generational Pointers
(Continued)

Remembered Sets
– The write barrier checks to see if a pointer
being stored in an old objects points to an
object in a newer generation. If so, the address
of the old object is added to the remembered set
(if that object is not already in the set).
CMPUT 425/525: Automatic Storage Management
91
Generational Garbage Collection:
Inter-generational Pointers
(Continued)

Remembered Sets (Continued)
Old Generation
New Generation
Remembered Set
CMPUT 425/525: Automatic Storage Management
92
Generational Garbage Collection:
Inter-generational Pointers
(Continued)

Remembered Sets: Advantages
– Scanning is proportional to the number of
stored-into objects, not the number of store
operations.

Remembered Sets: Disadvantages
– Pointer store checking can be expensive.
CMPUT 425/525: Automatic Storage Management
93
Generational Garbage Collection:
Collection Scheduling

Generational garbage collection aims to
reduce pause times. When should these
(hopefully short) pause times occur?
 Two strategies exist:
– Hide collections when the user is least likely to
notice a pause, or
– Trigger efficient collections when there is likely
to be lots of garbage to collect.
CMPUT 425/525: Automatic Storage Management
94
Generational Garbage Collection:
Advantages

In practice it has proven to be an effective
garbage collection technique.
 Minor garbage collections are performed
quickly.
 Good cache and virtual memory behavior.
CMPUT 425/525: Automatic Storage Management
95
Generational Garbage Collection:
Disadvantages

Performs poorly if any of the main assumptions
are false:
– That objects tend die young.
– That there are relatively few pointers from old objects
to young ones.

Frequent pointer writes to older generations will
increase the cost of the write barrier, and possibly
increase the size of the root set for younger
generations.
CMPUT 425/525: Automatic Storage Management
96
Garbage Collection: Summary
Tracing
Incremental
Method
Conservatism
Space
Time
Fragmentation
Locality
Mark Sweep
Major
Basic
1 traversal +
heap scan
Yes
Fair
Mark Compact
Major
Basic
Many passes of
heap
No
Good
Copying
Major
Two
Semispaces
1 traversal
No
Poor
Reference
Counting
No
Reference count
field
Constant per
Assignment
Yes
Very Good
Deferred
Reference
Counting
Only for stack
variables
Reference
Count Field
Constant per
Assignment
Yes
Very Good
Incremental
Varies
depending on
algorithm
Varies
Can be
Guaranteed
Real-Time
Varies
Varies
Generational
Variable
Segregated
Areas
Varies with
number of live
objects in new
generation
Yes (NonCopying)
No (Copying)
Good
CMPUT 425/525: Automatic Storage Management
97
Garbage Collection:
Conclusions

Relieves the burden of explicit memory
allocation and deallocation.
 Software module coupling related to
memory management issues is eliminated.
 An extremely dangerous class of bugs is
eliminated.
CMPUT 425/525: Automatic Storage Management
98
Garbage Collection: Conclusions
(Continued)

Zorn’s study in 1989/93 compared garbage collection to
explicit deallocation:
– Non-generational
 Between 0% and 36% more CPU time.
 Between 40% and 280% more memory.
– Generational garbage collection
 Between 5% to 20% more CPU time.
 Between 30 and 150% more memory.


Wilson feels these numbers can be improved, and they are
also out of date.
A well implemented garbage collector will slow a program
down by approximately 10 percent relative to explicit heap
deallocation.
CMPUT 425/525: Automatic Storage Management
99
Garbage Collection: Conclusions
(Continued)

Despite this cost, garbage collection a feature
in many widely used languages:
– Lisp (1959)
– Perl (1987)
– Java (1995)
– C# (2001)
– Microsoft’s Common Language Runtime (2002)
CMPUT 425/525: Automatic Storage Management
100
Garbage Collection: Pointers

Heap of fish applet (Mark and Sweep garbage collection example)
http://www.artima.com/insidejvm/applets/HeapOfFish.html

Java HotSpot Garbage Collection Strategies
http://developer.java.sun.com/developer/technicalArticles/Networking/HotSpot/

The Memory Management Reference
http://www.memorymanagement.org/

Uniprocessor Garbage Collection Techniques (Wilson)
http://www.cs.ualberta.ca/~duane/courses/425-525/WilsonACMDraft.pdf

Garbage Collection: Algorithms for Automatic Dynamic Memory Management
(Richard Jones and Rafael Lins)
CMPUT 425/525: Automatic Storage Management
101
Questions?
If you have any questions, please feel free to
e-mail one of us:
Patrick Earl
[email protected]
Simon Leonard [email protected]
Jack Newton
[email protected]
CMPUT 425/525: Automatic Storage Management
102