Lecture 2: Algorithm Analysis and Data Structures

Five Representative Problems: Interval Scheduling
› Input. Set of jobs with start times and finish times.
› Goal. Find maximum cardinality subset of mutually compatible jobs (jobs are compatible if they don't overlap).
[Figure: jobs a–h shown as intervals on a time axis from 0 to 11.]
Weighted Interval Scheduling
› Input. Set of jobs with start times, finish times, and weights.
› Goal. Find maximum weight subset of mutually compatible jobs.
[Figure: weighted jobs shown as intervals on a time axis from 0 to 11, each labeled with its weight.]
Bipartite Matching
› Input. Bipartite graph.
› Goal. Find maximum cardinality matching.
[Figure: bipartite graph on nodes A–E and nodes 1–5.]
Independent Set
› Input. Graph.
› Goal. Find maximum cardinality independent set (a subset of nodes such that no two are joined by an edge).
[Figure: graph on seven nodes 1–7.]
Competitive Facility Location
› Input. Graph with weight on each node.
› Game. Two competing players alternate in selecting nodes. Not allowed
to select a node if any of its neighbors have been selected.
› Goal. Select a maximum weight subset of nodes.
[Figure: a path of ten nodes with weights 10, 1, 5, 15, 5, 1, 5, 1, 15, 10.]
Second player can guarantee 20, but not 25.
Five Representative Problems
› Variations on a theme: independent set.
› Interval scheduling: O(n log n) greedy algorithm.
› Weighted interval scheduling: O(n log n) dynamic programming algorithm.
› Bipartite matching: O(nk) max-flow based algorithm.
› Independent set: NP-complete.
› Competitive facility location: PSPACE-complete.
Chapter 2: Algorithm Analysis & Data Structures
2.1 Computational Tractability
"For me, great algorithms are the poetry of
computation. Just like verse, they can be terse, allusive,
dense, and even mysterious. But once unlocked, they
cast a brilliant new light on some aspect of computing."
- Francis Sullivan
Polynomial-Time
› Brute force. For many non-trivial problems, there is a natural brute-force search algorithm that checks every possible solution.
- Typically takes 2^N time or worse for inputs of size N.
- Unacceptable in practice.
› Desirable scaling property. When the input size doubles, the algorithm should only slow down by some constant factor C.
› Definition. An algorithm is poly-time if there exist constants c > 0 and d > 0 such that on every input of size N, its running time is bounded by c · N^d steps. (Such an algorithm has the scaling property above: doubling N multiplies the bound by 2^d.)
Worst-Case Analysis
› Worst case running time. Obtain bound on largest possible running
time of algorithm on input of a given size N.
- Generally captures efficiency in practice.
- Draconian view, but hard to find effective alternative.
› Average case running time. Obtain bound on running time of
algorithm on random input as a function of input size N.
- Hard (or impossible) to accurately model real instances by random
distributions.
- Algorithm tuned for a certain distribution may perform poorly on other inputs.
Worst-Case Polynomial-Time
› Definition: An algorithm is efficient if its running time is polynomial.
› Justification. It really works in practice!
- Although 6.02 × 10^23 × N^20 is poly-time, it is useless in practice.
- In practice, the poly-time algorithms that people develop almost always have low constants and low exponents.
- Breaking through the exponential barrier of brute force typically exposes some crucial structure of the problem.
› Exceptions.
- Some poly-time algorithms do have high constants and/or exponents, and are useless in practice.
- Some exponential-time (or worse) algorithms are widely used because the worst-case instances seem to be rare (e.g., the simplex method, Unix grep).
Why It Matters
Asymptotic Order of Growth
› Upper bounds. T(n) is O(f(n)) if there exist constants c > 0 and n0 ≥ 0 such that for all n ≥ n0 we have T(n) ≤ c · f(n).
› Lower bounds. T(n) is Ω(f(n)) if there exist constants c > 0 and n0 ≥ 0 such that for all n ≥ n0 we have T(n) ≥ c · f(n).
› Tight bounds. T(n) is Θ(f(n)) if T(n) is both O(f(n)) and Ω(f(n)).
› Ex: T(n) = 32n^2 + 17n + 32.
- T(n) is O(n^2), O(n^3), Ω(n^2), Ω(n), and Θ(n^2).
- T(n) is not O(n), Ω(n^3), Θ(n), or Θ(n^3).
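To make the first upper bound concrete, one valid (not unique) choice of witnesses is c = 81 and n0 = 1:

T(n) = 32n^2 + 17n + 32 ≤ 32n^2 + 17n^2 + 32n^2 = 81n^2 for all n ≥ 1,

so T(n) ≤ c · n^2 with c = 81 and n0 = 1, which certifies T(n) = O(n^2).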
Notation
› Slight abuse of notation. T(n) = O(f(n)).
- Asymmetric:
- f(n) = 5n^3; g(n) = 3n^2
- f(n) = O(n^3) = g(n)
- but f(n) ≠ g(n).
- Better notation: T(n) ∈ O(f(n)).
› Meaningless statement. "Any comparison-based sorting algorithm requires at least O(n log n) comparisons."
- Statement doesn't "type-check."
- Use Ω for lower bounds.
Properties
› Transitivity
- If f = O(g) and g = O(h) then f = O(h).
- If f = Ω(g) and g = Ω(h) then f = Ω(h).
- If f = Θ(g) and g = Θ(h) then f = Θ(h).
› Additivity
- If f = O(h) and g = O(h) then f + g = O(h).
- If f = Ω(h) and g = Ω(h) then f + g = Ω(h).
- If f = Θ(h) and g = O(h) then f + g = Θ(h).
Asymptotic Bounds for Some Common Functions
› Polynomials. a0 + a1·n + … + ad·n^d is Θ(n^d) if ad > 0.
› Polynomial time. Running time is O(n^d) for some constant d independent of the input size n.
› Logarithms. O(log_a n) = O(log_b n) for any constants a, b > 0, so we can avoid specifying the base.
› Logarithms. For every x > 0, log n = O(n^x): log grows slower than every polynomial.
› Exponentials. For every r > 1 and every d > 0, n^d = O(r^n): every exponential grows faster than every polynomial.
Linear Time: O(n)
› Linear time. Running time is at most a constant factor
times the size of the input.
› Computing the maximum. Compute maximum of n
numbers a1, …, an.
max ← a1
for i = 2 to n {
    if (ai > max)
        max ← ai
}
Linear Time: O(n)
› Merge. Combine two sorted lists A = a1, a2, …, an and B = b1, b2, …, bn into one sorted list.

i = 1, j = 1
while (both lists are nonempty) {
    if (ai ≤ bj) then append ai to output list and increment i
    else append bj to output list and increment j
}
append remainder of nonempty list to output list
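A minimal Java sketch of this merge (arrays of int stand in for the lists; names are illustrative):

// Merge two sorted int arrays into one sorted array in O(n) time.
static int[] merge(int[] a, int[] b) {
    int[] out = new int[a.length + b.length];
    int i = 0, j = 0, k = 0;
    while (i < a.length && j < b.length) {
        // Take the smaller head element; ties go to a.
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    }
    // Append the remainder of the nonempty array.
    while (i < a.length) out[k++] = a[i++];
    while (j < b.length) out[k++] = b[j++];
    return out;
}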
O(n log n) Time
› O(n log n) time. Arises in divide-and-conquer algorithms.
› Sorting. Mergesort and heapsort are sorting algorithms that
perform O(n log n) comparisons.
› Largest empty interval. Given n timestamps x1, …, xn on which copies of a file arrive at a server, what is the largest interval of time when no copies of the file arrive?
› O(n log n) solution. Sort the timestamps. Scan the sorted list in order, identifying the maximum gap between successive timestamps.
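A small Java sketch of this solution (timestamps given as a double array; the method name is illustrative):

import java.util.Arrays;

// Sort, then scan adjacent pairs for the maximum gap: O(n log n) overall.
static double largestEmptyInterval(double[] x) {
    double[] t = x.clone();   // leave the caller's array untouched
    Arrays.sort(t);           // O(n log n)
    double maxGap = 0.0;
    for (int i = 1; i < t.length; i++)
        maxGap = Math.max(maxGap, t[i] - t[i - 1]);  // gap between successive arrivals
    return maxGap;
}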
Quadratic Time: O(n2)
› Quadratic time. Enumerate all pairs of elements.
› Closest pair of points. Given a list of n points in the plane
(x1, y1), …, (xn, yn), find the pair that is closest.
› O(n2) solution. Try all pairs of points.
min ← (x1 − x2)^2 + (y1 − y2)^2
for i = 1 to n {
    for j = i+1 to n {
        d ← (xi − xj)^2 + (yi − yj)^2
        if (d < min)
            min ← d
    }
}

(Note: we don't need to take square roots; see Chapter 5.)
Cubic Time: O(n3)
› Cubic time. Enumerate all triples of elements.
› Set disjointness. Given n sets S1, …, Sn, each of which is a subset of {1, 2, …, n}, is there some pair of these which are disjoint?
› O(n^3) solution. For each pair of sets, determine if they are disjoint.

foreach set Si {
    foreach other set Sj {
        foreach element p of Si {
            determine whether p also belongs to Sj
        }
        if (no element of Si belongs to Sj)
            report that Si and Sj are disjoint
    }
}
Polynomial Time: O(n^k)
› Independent set of size k. Given a graph, are there k nodes such that no two are joined by an edge? (k is a constant.)
› O(n^k) solution. Enumerate all subsets of k nodes.

foreach subset S of k nodes {
    check whether S is an independent set
    if (S is an independent set)
        report S is an independent set
}

(poly-time for k = 17, but not practical)
- Checking whether S is an independent set takes O(k^2) time.
- Number of k-element subsets: C(n, k) = [n(n−1)(n−2)⋯(n−k+1)] / [k(k−1)(k−2)⋯(2)(1)] ≤ n^k / k!
- Total: O(k^2 · n^k / k!) = O(n^k).
Exponential Time
› Independent set. Given a graph, what is the maximum size of an independent set?
› O(n^2 · 2^n) solution. Enumerate all subsets.

S* ← ∅
foreach subset S of nodes {
    check whether S is an independent set
    if (S is the largest independent set seen so far)
        update S* ← S
}
Summary: Algorithm analysis
› You must learn the asymptotic order of growth. It is
fundamental when measuring the performance of an
algorithm.
- O-notation
- Ω-notation
- Θ-notation
› Transitivity and additivity
Basic data structures
› Linked lists
› Queues
› Stacks
› Balanced binary trees
Why data structures?
› Programs manipulate data
› Data should be organized so manipulations will be efficient
- Search (e.g., finding a word/file/web page)
› Better programs are powered by good data structures
› Naïve choices are often much less efficient than clever choices
› Data structures are existing tools that can help you
- Guide your design
- and save you time (avoid re-inventing the wheel)
Linked list
› A linked list is
- a collection of items (stored in “positions” in the list)
- that supports the following operations
- addFirst( newItem )
- Add newItem at the beginning of the list
- addLast( newItem )
- Add newItem at the end of the list
- addAfter( existingPosition, newItem )
- Add newItem after existingPosition
- getFirst( )
- getLast( )
- …
Singly Linked Lists
› A singly linked list is a data structure consisting of a sequence of nodes
› Each node stores
- an element
- a link to the next node
[Figure: a node holds "elem" and "next"; nodes A → B → C → D, with the last link null.]
Inserting at the Head
1. Allocate a new node
2. Insert new element
3. Have new node point to first element
4. Have Head point to new node
5. Extra checks…
[Figure: Head and Tail pointers into the list A → B → C → D.]
Removing at the Head
1. Update head to point to next node in the list
2. Delete the former first node
Inserting at the Tail
1. Allocate a new node
2. Insert new element
3. Have new node point to null
4. Have old last node point to new node
5. Update tail to point to new node
Removing at the Tail
› Removing at the tail of a singly linked list cannot be efficient!
› There is no constant-time way to update the tail to point to the previous node
Doubly Linked List
› A doubly linked list is often more convenient!
› Nodes store:
- an element
- a link to the previous node
- a link to the next node
› Special trailer and header nodes
[Figure: each node holds "prev", "elem", "next"; a header node and a trailer node bracket the element nodes/positions.]
Insertion
› We visualize operation insertAfter(p, X), which returns position q
[Figure: list A, B, C with position p at B; a new node X is linked in after p, giving A, B, X, C with q at X.]
Deletion
› We visualize remove(p), where p == last()
[Figure: list A, B, C, D with p at D; node D is unlinked, leaving A, B, C.]
Worst-case running time
› In a doubly linked list
+ insertion at head or tail is in O(1)
+ deletion at either end is in O(1)
-- element access requires O(n)
The Queue data structure
› The Queue data structure stores
arbitrary objects
› Insertions and deletions follow the
first-in first-out (FIFO) scheme
› Insertions are at the rear of the
queue and removals are at the
front of the queue
› Main queue operations:
- enqueue(object): inserts an element
at the end of the queue
- object dequeue(): removes and
returns the element at the front of the
queue
› Auxiliary queue operations:
- object front(): returns the
element at the front without
removing it
- integer size(): returns the
number of elements stored
- boolean isEmpty(): indicates
whether no elements are stored
Example
Operation    Output    Q
enqueue(5)   –         (5)
enqueue(3)   –         (5, 3)
dequeue()    5         (3)
enqueue(7)   –         (3, 7)
dequeue()    3         (7)
front()      7         (7)
dequeue()    7         ()
dequeue()    "error"   ()
isEmpty()    true      ()
enqueue(9)   –         (9)
enqueue(7)   –         (9, 7)
size()       2         (9, 7)
enqueue(3)   –         (9, 7, 3)
enqueue(5)   –         (9, 7, 3, 5)
dequeue()    9         (7, 3, 5)
Applications of Queues
› Direct applications
- Waiting lists
- Access to shared resources (e.g., printer)
- Simulation
› Indirect applications
- Auxiliary data structure for algorithms
- Component of other data structures
Queue Interface in Java
public interface Queue<E> {
    public int size();
    public boolean isEmpty();
    public E front() throws EmptyQueueException;
    public void enqueue(E element);
    public E dequeue() throws EmptyQueueException;
}
Queue implementation using singly linked lists
› Note that we need to keep pointers to both the first and
the last nodes in the list
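A minimal sketch of such an implementation (the inner Node class is illustrative, and EmptyQueueException is assumed to be the exception type used by the interface above):

public class LinkedQueue<E> implements Queue<E> {
    private static class Node<E> {
        E elem;
        Node<E> next;
        Node(E elem) { this.elem = elem; }
    }

    private Node<E> head;  // front: dequeue removes here
    private Node<E> tail;  // rear: enqueue appends here
    private int size = 0;  // incremented on enqueue, decremented on dequeue

    public int size() { return size; }
    public boolean isEmpty() { return size == 0; }

    public E front() throws EmptyQueueException {
        if (isEmpty()) throw new EmptyQueueException();
        return head.elem;
    }

    public void enqueue(E element) {
        Node<E> n = new Node<>(element);
        if (isEmpty()) head = n;        // first element: head and tail coincide
        else tail.next = n;             // link the old last node to the new node
        tail = n;
        size++;
    }

    public E dequeue() throws EmptyQueueException {
        if (isEmpty()) throw new EmptyQueueException();
        E e = head.elem;
        head = head.next;               // advance past the removed node
        if (head == null) tail = null;  // the queue became empty
        size--;
        return e;
    }
}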
Queues
› Implementation notes:
- size is a counter starting at 0, incremented on each enqueue and decremented on each dequeue
- isEmpty is simply a check whether size == 0
[Figure: a sequence of slides animates enqueue and dequeue on the list A → B → C → D, updating the Head and Tail pointers at each step.]
› These operations can be performed in O(1) time per operation.
The Stack
› The Stack data structure stores arbitrary objects
› Insertions and deletions follow the last-in first-out (LIFO) scheme
› Think of a spring-loaded plate dispenser
› Main stack operations:
- push(object): inserts an element
- object pop(): removes and returns the last inserted element
› Auxiliary stack operations:
- object top(): returns the last inserted element without removing it
- integer size(): returns the number of elements stored
- boolean isEmpty(): indicates whether no elements are stored
Stack

public interface Stack {
    public int size();
    public boolean isEmpty();
    public Object top();
    public void push(Object o);
    public Object pop();
}

› Implementation notes:
- size is a counter starting at 0, incremented on each push and decremented on each pop
- isEmpty is simply a check whether size == 0
[Figure: a sequence of slides animates push and pop at the head of the list A → B → C → D.]
› These operations can be performed in O(1) time per operation.
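A linked-list sketch of this interface in the same spirit (pushing and popping at the head; the error policy for an empty stack is a placeholder):

public class LinkedStack implements Stack {
    private static class Node {
        Object elem;
        Node next;
        Node(Object elem, Node next) { this.elem = elem; this.next = next; }
    }

    private Node head;     // top of the stack
    private int size = 0;  // incremented on push, decremented on pop

    public int size() { return size; }
    public boolean isEmpty() { return size == 0; }

    public Object top() {
        if (isEmpty()) throw new RuntimeException("empty stack"); // placeholder error policy
        return head.elem;
    }

    public void push(Object o) {
        head = new Node(o, head);  // the new node becomes the top
        size++;
    }

    public Object pop() {
        Object e = top();          // reuses the emptiness check
        head = head.next;          // unlink the old top
        size--;
        return e;
    }
}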
Parentheses Matching
› Each "(", "{", or "[" must be paired with a matching ")", "}", or "]"
- correct: ( )(( )){([( )])}
- correct: ((( )(( )){([( )])}))
- incorrect: )(( )){([( )])}
- incorrect: ({[ ])}
- incorrect: (
Parentheses Matching Algorithm
Algorithm ParenMatch(X, n):
Input: An array X of n tokens, each of which is either a grouping symbol, a variable, an arithmetic operator, or a number
Output: true if and only if all the grouping symbols in X match

Let S be an empty stack
for i = 0 to n-1 do
    if X[i] is an opening grouping symbol then
        S.push(X[i])
    else if X[i] is a closing grouping symbol then
        if S.isEmpty() then
            return false {nothing to match with}
        if S.pop() does not match the type of X[i] then
            return false {wrong type}
if S.isEmpty() then
    return true {every symbol matched}
else
    return false {some symbols were never matched}
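A compact Java rendering of the same algorithm (the input is taken as a String of characters rather than a token array, which is enough to show the stack discipline):

import java.util.ArrayDeque;
import java.util.Deque;

// Returns true iff every (, {, [ is matched by a closer of the right type.
static boolean parenMatch(String s) {
    Deque<Character> stack = new ArrayDeque<>();
    for (char c : s.toCharArray()) {
        if (c == '(' || c == '{' || c == '[') {
            stack.push(c);                      // opening symbol: remember it
        } else if (c == ')' || c == '}' || c == ']') {
            if (stack.isEmpty()) return false;  // nothing to match with
            char open = stack.pop();
            if ((c == ')' && open != '(') ||
                (c == '}' && open != '{') ||
                (c == ']' && open != '[')) return false;  // wrong type
        }                                       // other characters are ignored
    }
    return stack.isEmpty();                     // false if symbols were never matched
}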
HTML Tag Matching
For fully-correct HTML, each <name> should pair with a matching </name>
<body>
<center>
<h1> The Little Boat </h1>
</center>
<p> The storm tossed the little
boat like a cheap sneaker in an
old washing machine. The three
drunken fishermen were used to
such treatment, of course, but
not the tree salesman, who even as
a stowaway now felt that he
had overpaid for the voyage. </p>
<ol>
<li> Will the salesman die? </li>
<li> What color is the boat? </li>
<li> And what about Naomi? </li>
</ol>
</body>
[Figure: the page as rendered in a browser.]
Trees
[Figure: example tree with root "Make Money Fast!" and children "Stock Fraud", "Ponzi Scheme", and "Bank Robbery".]
What is a Tree
› In computer science, a tree is an abstract model of a hierarchical structure
› A tree consists of nodes with a parent-child relation
› Applications:
- Organization charts
- File systems
- Programming environments
[Figure: organization chart with root Computers"R"Us and subtrees for Sales (US, International: Europe, Asia, Canada), Manufacturing (Laptops, Desktops), and R&D.]
Tree Terminology
› Root: node without parent (A)
› Internal node: node with at least one child (A, B, C, F)
› External node (a.k.a. leaf): node without children (E, I, J, K, G, H, D)
› Ancestors of a node: parent, grandparent, great-grandparent, etc.
› Depth of a node: number of ancestors
› Height of a tree: maximum depth of any node (3)
› Descendant of a node: child, grandchild, great-grandchild, etc.
› Subtree: tree consisting of a node and its descendants
[Figure: tree with root A and children B, C, D; B has children E and F; C has children G and H; F has children I, J, and K; one subtree is highlighted.]
Binary Trees
› A binary tree is a tree with the following properties:
- Each internal node has at most two children (exactly two for proper binary trees)
- The children of a node are an ordered pair
› We call the children of an internal node left child and right child
› Alternative recursive definition: a binary tree is either
- a tree consisting of a single node, or
- a tree whose root has an ordered pair of children, each of which is a binary tree
› Applications:
- arithmetic expressions
- decision processes
- searching
[Figure: binary tree with root A; A's children are B and C; B's children are D and E; C's children are F and G; D's children are H and I.]
Binary Trees
› Notation
- n: number of nodes
- e: number of external nodes
- i: number of internal nodes
- h: height
› Properties (of proper binary trees):
- e = i + 1
- n = 2e − 1
- h ≤ i
- h ≤ (n − 1)/2
- e ≤ 2^h
- h ≥ log2 e
- h ≥ log2(n + 1) − 1
Binary Trees
› A node is represented by an object storing
- Element
- Parent node
- Left child node
- Right child node
› Node objects implement the Position ADT
[Figure: linked structure for the tree with root B and children A and D, where D's children are C and E; each node stores its element plus parent, left, and right links (∅ for absent children).]
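A bare-bones Java sketch of such a node object (field names are illustrative):

// A binary tree node in the linked representation described above.
public class Node<E> {
    E element;       // the element stored at this position
    Node<E> parent;  // null for the root
    Node<E> left;    // null if there is no left child
    Node<E> right;   // null if there is no right child

    public Node(E element, Node<E> parent) {
        this.element = element;
        this.parent = parent;
    }
}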
Binary Search Trees - Ordered Dictionaries
› Keys are assumed to come from a total order.
› Operations
- insert(key): insert key into the dictionary
- delete(key): delete key from the dictionary
- boolean find(key): does the key exist in the dictionary?
Binary Search
› Binary search can perform operation find(k) on a dictionary
implemented by means of an array-based sequence, sorted by key
- at each step, the number of candidate items is halved
- terminates after O(log n) steps
› Example: find(7)
[Figure: binary search for 7 in the sorted array 0 1 3 4 5 7 8 9 11 14 16 18 19. Markers l, m, h track the low, middle, and high positions; each comparison of 7 with the middle element halves the range until m lands on 7.]
Binary Search Trees
› A binary search tree is a binary tree storing keys (or key-value entries) at its internal nodes and satisfying the following property:
- Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. Then key(u) ≤ key(v) ≤ key(w)
› External nodes do not store items
[Figure: binary search tree with root 6, children 2 and 9; 2 has children 1 and 4; 9 has left child 8.]
Search
› To search for a key k, we trace a downward path starting at the root
› The next node visited depends on the outcome of the comparison of k with the key of the current node
› If we reach a leaf, the key is not found and we return null
› Example: find(4): call TreeSearch(4, root)

Algorithm TreeSearch(k, v)
    if T.isExternal(v)
        return v
    if k < key(v)
        return TreeSearch(k, T.left(v))
    else if k = key(v)
        return v
    else { k > key(v) }
        return TreeSearch(k, T.right(v))

[Figure: the search path for 4 in the tree with root 6: 4 < 6, go left to 2; 4 > 2, go right to 4; found.]
Insertion
› To perform operation insert(k), we search for key k (using TreeSearch)
› Assume k is not already in the tree, and let w be the leaf reached by the search
› We insert k at node w and expand w into an internal node
› Example: insert 5
[Figure: searching for 5 in the tree with root 6 reaches the external node w to the right of 4; w becomes an internal node storing 5.]
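A recursive Java sketch of find and insert (null links play the role of the external nodes; the class layout is illustrative):

public class BST {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    // Returns the node storing k, or null if k is absent.
    Node find(Node v, int k) {
        if (v == null) return null;        // reached an "external node"
        if (k < v.key) return find(v.left, k);
        if (k > v.key) return find(v.right, k);
        return v;                          // k == v.key
    }

    // Re-traces the search path and expands the external node reached;
    // assumes k is not already in the tree.
    Node insert(Node v, int k) {
        if (v == null) return new Node(k);
        if (k < v.key) v.left = insert(v.left, k);
        else v.right = insert(v.right, k);
        return v;
    }
}

Usage would look like tree.root = tree.insert(tree.root, 5).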
Deletion
› To perform operation remove(k), we search for key k
› Assume key k is in the tree, and let v be the node storing k
› If node v has a leaf child w, we remove v and w from the tree with operation removeExternal(w), which removes w and its parent
› Example: remove 4
[Figure: the node v storing 4 has a leaf child w; removeExternal(w) deletes both and 5 takes 4's place, leaving the tree 6 with children 2 (children 1, 5) and 9 (child 8).]
Deletion (cont.)
› We consider the case where the key k to be removed is stored at a node v whose children are both internal
- we find the internal node w that follows v in an inorder traversal
- we copy key(w) into node v
- we remove node w and its left child z (which must be a leaf) by means of operation removeExternal(z)
› Example: remove 3
[Figure: v stores 3 and both its children are internal; the inorder successor w stores 5, so 5 is copied into v and w is removed together with its left leaf z.]
Performance
› Consider a dictionary with n items implemented by means of a binary search tree of height h
- the space used is O(n)
- methods find, insert and remove take O(h) time
› The height h is O(n) in the worst case and O(log n) in the best case
AVL Trees
› AVL trees are balanced.
› An AVL tree is a binary search tree such that for every internal node v of T, the heights of the children of v can differ by at most 1.
› A local property that guarantees a global property.
[Figure: an example AVL tree with root 44 and keys 17, 32, 48, 50, 62, 78, 88; the height of each node is shown next to it.]
Height of an AVL Tree
Theorem: The height of an AVL tree storing n keys is O(log n).
Proof: Let us bound n(h), the minimum number of internal nodes of an AVL tree of height h.
› We easily see that n(1) = 1 and n(2) = 2
› For h > 2, an AVL tree of height h contains the root node, one AVL subtree of height h−1 and another of height h−2.
› That is, n(h) = 1 + n(h−1) + n(h−2)
› Knowing n(h−1) > n(h−2), we get n(h) > 2n(h−2). So
n(h) > 2n(h−2), n(h) > 4n(h−4), n(h) > 8n(h−6), …, and by induction, n(h) > 2^i · n(h−2i)
› Solving the base case we get: n(h) > 2^(h/2 − 1)
› Taking logarithms: h < 2 log n(h) + 2
› Thus the height of an AVL tree is O(log n)
Inserting with balanced height
› Insert node into binary search tree as usual
- Insert occurs at leaves
- Increases height of some nodes along path to root
› Walk up towards root
- If unbalanced height is found, restructure unbalanced
region with rotation operation
Insertion in an AVL Tree
› Insertion is as in a binary search tree
› Always done by expanding an external node.
› Example:
[Figure: inserting 54 into the AVL tree with root 44 adds the new node w = 54 below 62; on the path back up, z = 78 marks the unbalanced node, y = 50 its taller child, and x = 62 that child's taller child.]
Restructuring
› Let (a, b, c) be an inorder listing of x, y, z
› Perform the rotations needed to make b the topmost node of the three
- Case 1: single rotation (a left rotation about a); the other two cases are symmetrical
- Case 2: double rotation (a right rotation about c, then a left rotation about a)
[Figure: both cases drawn with subtrees T0–T3; after restructuring, b is the root of the subtree with children a and c.]
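As a sketch of how a rotation is coded, here is the single-rotation case (case 1) on a node type with a cached height; the mirror-image and double-rotation cases follow the same pattern (the AvlNode class and helpers are illustrative):

static class AvlNode {
    int key, height = 1;   // height of a null subtree is taken as 0
    AvlNode left, right;
    AvlNode(int key) { this.key = key; }
}

static int height(AvlNode v) { return v == null ? 0 : v.height; }

static void update(AvlNode v) {
    v.height = 1 + Math.max(height(v.left), height(v.right));
}

// Case 1 single rotation: a left rotation about a that makes b = a.right
// the topmost node of the three. Returns the new subtree root.
static AvlNode rotateLeft(AvlNode a) {
    AvlNode b = a.right;
    a.right = b.left;  // subtree T1 moves under a
    b.left = a;        // a becomes b's left child
    update(a);         // recompute heights bottom-up
    update(b);
    return b;
}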
Insertion in an AVL Tree
[Figure: inserting 54 into the AVL tree rooted at 44 makes the subtree at 78 unbalanced; a double rotation on 50, 62, 78 makes 62 the subtree root with children 50 and 78, restoring balance. Panels: before insertion; after insertion (unbalanced); after double rotation.]
Removal in an AVL Tree
› Removal begins as in a binary search tree, which means the node removed will become an empty external node. Its parent, w, may cause an imbalance.
› Example:
[Figure: deleting 32 from the AVL tree rooted at 44 unbalances the root: the left subtree (17) has height 1 while the right subtree rooted at 62 (with 50, 48, 54, 78, 88) has height 3. Panels: before deletion of 32; after deletion.]
Rebalancing after a removal
› Let z be the first unbalanced node encountered while travelling up
the tree from w. Also, let y be the child of z with the larger height,
and let x be the child of y with the larger height.
› We perform restructure(x) to restore balance at z.
› As this restructuring may upset the balance of another node
higher in the tree, we must continue checking for balance until the
root of T is reached
[Figure: here z = 44 (labeled a), y = 62 (labeled b), and x = 78 (labeled c); restructure(x) makes 62 the root with children 44 and 78, restoring balance.]
Running Times for AVL Trees
› a single restructure is O(1)
- using a linked-structure binary tree
› find is O(log n)
- height of tree is O(log n), no restructures needed
› insert is O(log n)
- initial find is O(log n)
- Restructuring up the tree, maintaining heights is O(log n)
› remove is O(log n)
- initial find is O(log n)
- Restructuring up the tree, maintaining heights is O(log n)
Summary: data structures
› Queues
- enqueue, dequeue, front and size operations in O(1) time
› Stacks
- push, pop, top and size operations in O(1) time
› Balanced binary trees (e.g. AVL trees)
- insert, delete and find operations in O(log n) time