Transcript Lecture 8 -
DCO20105 Data structures and algorithms
Lecture 8: Trees
General model of a tree
Binary Tree
Tree representations
Heap and Heap sort
Binary Search Tree: construction and search
-- By Rossella Lau
Rossella Lau
Lecture 8, DCO20105, Semester A,2005-6
A reason for Tree
Basic sequential containers do not support efficient processing of all of {insert, delete, search}:
vectors: can support efficient search (binary search on sorted data) but not insert/delete
list: can support efficient insert/delete but not search
Do any other structures support all of the above operations efficiently?
Tree
Tree
In the most general sense, a tree is a set of vertices, or nodes, and a set of edges, where each edge connects a pair of distinct nodes, such that there is one and only one connecting path on these edges between any pair of nodes.
A tree in the above sense is called a free tree.
By picking a distinguished node, denoting it as the root and using it as the entrance of the tree, a free tree can be represented as an oriented tree.
A given free tree may have numerous corresponding oriented trees.
Different orientations of a tree
[Figure: the same free tree on nodes A to F drawn as several oriented trees, each with a different node chosen as the root]
Binary Tree
A binary tree is either empty or partitioned into three disjoint subsets:
1. A single element called the root of the tree
2. A left sub-tree, which is itself a binary tree
3. A right sub-tree, which is itself a binary tree
The definition of a tree or binary tree is recursive, so operations on trees are usually written recursively.
Notations of a binary tree
A is the root of the tree
A is the parent of B and C (B is the parent of D and E, …)
B is the left child of A
C is the right child of A
A and B are ancestors of E
E and B are descendants of A
B and C are siblings
D, G, H, I are leaves of the tree
The level of A is 0, the level of B is 1, …, the level of G is 3
Depth = max{level of leaves} = 3
[Figure: the binary tree with nodes A to I that these notations refer to]
Structures that are not binary trees
[Figure: three example structures that are not binary trees]
All of these structures contain a part that is not itself a binary tree.
There is more than one path connecting two of the nodes.
Traversing a binary tree
To traverse a binary tree is to pass through it and enumerate each of its nodes once.
To enumerate a node means, e.g., to print its contents or to update its contents.
When a node is enumerated, it is said to be visited.
There are usually three ways to traverse a binary tree:
Preorder (depth-first order)
Inorder (symmetric order)
Postorder
The algorithms for traversing a binary tree
Preorder:
1. Visit the root
2. Traverse the left sub-tree in preorder sequence
3. Traverse the right sub-tree in preorder sequence
Inorder:
1. Traverse the left sub-tree in inorder sequence
2. Visit the root
3. Traverse the right sub-tree in inorder sequence
Postorder:
1. Traverse the left sub-tree in postorder sequence
2. Traverse the right sub-tree in postorder sequence
3. Visit the root
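The three algorithms above can be sketched in C++ as follows. This is a minimal illustration, not the course's reference code: the Node struct and the idea of collecting the visited items in a string (instead of printing them) are assumptions made here so that the visiting order is easy to inspect.

```cpp
#include <string>

// Hypothetical minimal node type for illustration (not the BNode class of BST.h).
struct Node {
    char item;
    Node *left;
    Node *right;
};

// "Visiting" a node here means appending its item to `out`.
void preorder(const Node *n, std::string &out) {
    if (!n) return;               // empty sub-tree: nothing to visit
    out += n->item;               // 1. visit the root
    preorder(n->left, out);       // 2. traverse the left sub-tree
    preorder(n->right, out);      // 3. traverse the right sub-tree
}

void inorder(const Node *n, std::string &out) {
    if (!n) return;
    inorder(n->left, out);        // 1. traverse the left sub-tree
    out += n->item;               // 2. visit the root
    inorder(n->right, out);       // 3. traverse the right sub-tree
}

void postorder(const Node *n, std::string &out) {
    if (!n) return;
    postorder(n->left, out);      // 1. traverse the left sub-tree
    postorder(n->right, out);     // 2. traverse the right sub-tree
    out += n->item;               // 3. visit the root
}
```

For a small assumed tree with root A, children B and C, and D and E as the children of B, the three functions collect ABDEC, DBEAC and DEBCA respectively.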
Examples of traversing a binary tree
For the binary tree on page 6 (the "Notations of a binary tree" slide), write down the node sequences:
Preorder:
Inorder:
Postorder:
Representations of Binary Tree
Static: Vector representation
Dynamic: Pointer (Node) representation
Complete binary trees
A complete binary tree of depth d is a binary tree in which:
all leaves are at level d
all non-leaf (internal) nodes have exactly two children
A binary tree of depth d is an almost complete binary tree if:
1. Any node n at a level from 0 to d-2 has two children
2. For each node n in the tree with a right descendant at level d, n must have a left child, and every left descendant of n is either a leaf at level d or has two children
A complete binary tree of depth 3
[Figure: complete binary tree of depth 3, listed level by level]
Level 0: A
Level 1: B C
Level 2: D E F G
Level 3: H I J K L M N O
Examples of almost complete binary trees
[Figure: example binary trees, one of them accompanied by the question below]
Is this an almost complete binary tree?
Density of a tree
A complete binary tree has the highest density: the number of nodes is $2^{d+1} - 1$
A tree in which every non-leaf node has a single child has the lowest density: the number of nodes is $d + 1$
A tree with few nodes relative to its depth is called sparse
Given n nodes, the depth of a binary tree ranges from $\lceil \log_2(n+1) \rceil - 1$ to $n - 1$
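The depth range quoted above follows directly from the two density extremes. The short derivation below is a sketch that uses the slide's symbols (n nodes, depth d); the numeric example with n = 10 is an assumption added for illustration.

```latex
\begin{align*}
d + 1 \;\le\; n \;&\le\; 2^{d+1} - 1
  && \text{(lowest and highest density at depth } d\text{)} \\
n \le 2^{d+1} - 1 \;&\Longrightarrow\; d \ge \log_2(n+1) - 1
  && \text{(so } d \ge \lceil \log_2(n+1) \rceil - 1\text{)} \\
d + 1 \le n \;&\Longrightarrow\; d \le n - 1 \\
n = 10 \;&\Longrightarrow\; \lceil \log_2 11 \rceil - 1 = 3 \;\le\; d \;\le\; 9
\end{align*}
```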
Implicit array representation of a binary tree
For each almost complete binary tree with n nodes, we can label the nodes from 0 to n-1, level by level and left to right, where $n \le 2^{d+1} - 1$
The label is the subscript of an array: the content of a numbered node can be stored in the corresponding position of the array
[Figure: almost complete binary tree with 10 nodes, labelled 0 to 9]
Subscript: 0 1 2 3 4 5 6 7 8 9
Content:   A B C D E F G H I J
Extensions to almost complete BT
For trees that are not almost complete binary trees, we may add null nodes to make them almost complete.
[Figure: example trees padded with null nodes, together with their array layouts, in which the slots of the null nodes stay empty]
Some operations on vector representation
Some basic binary tree operations can be easily implemented:
vector<Data> bt;
left_child(node): 2 * node + 1
right_child(node): 2 * (node + 1)
parent(node): (node - 1) / 2 when node > 0
data(node): bt[node]
For more efficient calculation of parent and children, the representation may place the root at subscript 1 instead of 0:
left_child(node): 2 * node (equivalent to node << 1)
right_child(node): left_child(node) + 1
parent(node): node / 2 (equivalent to node >> 1)
The calculations are then reduced to bit-shift operations
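The subscript formulas above can be written as small C++ helpers. This is only a sketch: the namespace names zero_based and one_based and the camel-case function names are assumptions chosen here to keep the two numbering schemes apart.

```cpp
#include <cstddef>

// Root stored at subscript 0.
namespace zero_based {
    inline std::size_t leftChild(std::size_t node)  { return 2 * node + 1; }
    inline std::size_t rightChild(std::size_t node) { return 2 * (node + 1); }
    // Only meaningful for node > 0: the root has no parent.
    inline std::size_t parent(std::size_t node)     { return (node - 1) / 2; }
}

// Root stored at subscript 1 (subscript 0 is left unused).
namespace one_based {
    inline std::size_t leftChild(std::size_t node)  { return node << 1; }
    inline std::size_t rightChild(std::size_t node) { return (node << 1) + 1; }
    inline std::size_t parent(std::size_t node)     { return node >> 1; }
}
```

With the ten-node tree A to J stored from subscript 0, zero_based::parent(9) is 4, i.e. J at subscript 9 is a child of E at subscript 4.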
Exercises on implicit representation
Ford's written exercises: 14:11a, 12c
An application of array representation: Heap
A heap is an almost complete binary tree in which each node is less than or equal to its parent
Since it is an almost complete binary tree, its implementation uses the implicit array representation
A common use of a heap is as a priority queue
A sample of a heap
[Figure: heap with root 57, whose children are 37 and 48; 37 has the children 25 and 12]
Heap insert
Insert the item as the last leaf in the tree, then shift it up as long as it is larger than its parent
E.g., adding 92 to the previous heap:
[Figure: 92 is appended as the last leaf under 48; it is swapped with 48 and then with 57, giving a heap with root 92, whose children are 37 and 57; 37 has the children 25 and 12, and 57 has the child 48]
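A minimal sketch of heap insertion on the implicit array representation is given below. It assumes a max-heap of int values stored in a std::vector with the root at subscript 0; the function name heapInsert is hypothetical and is not taken from Ford or from the course programs.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Inserts `item` into a max-heap stored in `heap` (root at subscript 0).
void heapInsert(std::vector<int> &heap, int item) {
    heap.push_back(item);                       // the new item becomes the last leaf
    std::size_t node = heap.size() - 1;
    while (node > 0) {
        std::size_t parent = (node - 1) / 2;
        if (heap[node] <= heap[parent]) break;  // heap condition holds: stop
        std::swap(heap[node], heap[parent]);    // shift the item up one level
        node = parent;
    }
}
```

Assuming the sample heap is stored as 57 37 48 25 12, inserting 92 swaps it first with 48 and then with 57, leaving the array 92 37 57 25 12 48.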
Heap delete
To remove the maximum from the heap:
1. Swap the root (maximum) with the last element in the array (the last node in the tree); the heap is then reduced by one element
2. Shift the new root down as long as it is less than its larger child within the reduced heap
[Figure: removing the maximum, 92, from a sample heap; the last node, 22, is swapped into the root and then shifted down past 67 and 25]
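The two steps of heap deletion can be sketched as follows, again assuming a max-heap of int values in a std::vector with the root at subscript 0. The names siftDown and heapRemoveMax are hypothetical, and unlike the slide this sketch physically removes the maximum from the vector and returns it.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Shifts heap[node] down within the first `size` elements until it is
// not smaller than its larger child.
void siftDown(std::vector<int> &heap, std::size_t node, std::size_t size) {
    for (;;) {
        std::size_t child = 2 * node + 1;            // left child
        if (child >= size) break;                    // node is a leaf
        if (child + 1 < size && heap[child + 1] > heap[child])
            ++child;                                 // pick the larger child
        if (heap[node] >= heap[child]) break;        // heap condition holds
        std::swap(heap[node], heap[child]);
        node = child;
    }
}

// Removes and returns the maximum of a non-empty max-heap.
int heapRemoveMax(std::vector<int> &heap) {
    assert(!heap.empty());
    std::swap(heap.front(), heap.back());   // step 1: maximum goes to the last position
    int maximum = heap.back();
    heap.pop_back();                        // the heap is reduced by one element
    siftDown(heap, 0, heap.size());         // step 2: shift the new root down
    return maximum;
}
```

For an assumed heap stored as 92 67 57 25 12 22, the call returns 92 and leaves 67 25 57 22 12: the former last element 22 moves to the root and is shifted down past 67 and then 25.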
Exercise on heap
Ford's written exercises: 14:19b
Heap sort
A binary tree can be represented by a vector; conversely, a list of data in a vector can be treated as an almost complete binary tree!
Heap sort makes use of this feature: it rearranges the data in a vector into a heap and then sorts the data in order
It is a kind of selection sort
Each time, it finds the largest element of a list (the heap) and places it in the last position of the list
The process continues the "selection" on the sub-list of leading elements that are not yet in their proper positions.
Heap sort method 1
1st phase: Construction of a heap
insert the elements into the heap one by one
2nd phase: Selection sort
1. remove the maximum from the heap and place it in the last position (the heap is reduced to the first n-1 nodes)
2. continue the process until the reduced heap becomes a single node
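Both phases of method 1 can be tried out with the standard library's heap algorithms, std::push_heap and std::pop_heap from <algorithm>, which perform the insert and remove-maximum steps on a range. The sketch below is one possible illustration under that assumption; the function name heapSortByInsertion is hypothetical, and the intermediate heap layout produced by the library is not necessarily the one drawn in the slides.

```cpp
#include <algorithm>
#include <vector>

// Heap sort, method 1: build the heap by repeated insertion,
// then repeatedly move the maximum to the end of the shrinking heap.
void heapSortByInsertion(std::vector<int> &data) {
    // 1st phase: [begin, last) is a heap; each step inserts one more element.
    for (std::vector<int>::iterator last = data.begin(); last != data.end(); ) {
        ++last;
        std::push_heap(data.begin(), last);
    }
    // 2nd phase: pop_heap moves the maximum of [begin, last) to last - 1
    // and restores the heap on the reduced range [begin, last - 1).
    for (std::vector<int>::iterator last = data.end();
         last - data.begin() > 1; --last) {
        std::pop_heap(data.begin(), last);
    }
}
```

Applied to the input 25 57 48 37 12 92 86 33, the vector ends up in ascending order.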
An example of heap sort (method 1)
Input stream: 25 57 48 37 12 92 86 33
Then insert the data one by one into the heap:
[Figure: the resulting heap, level by level: 92 / 37 86 / 33 12 48 57 / 25]
Phase II of heap sort
[Figure: phase II; 92 is swapped with the last node, 25, and 25 is shifted down so that 86 becomes the root; the same step is repeated on the reduced heap (86, 57, ……) until the array is sorted: 12 25 33 37 48 57 86 92]
Heapsort (method 2)
Instead of inserting data one by one, it converts the tree to a heap in the first phase; see makeHeap() in Ford: 14-2.
It iteratively applies the heap condition to each internal node (sub-tree), starting at the last one and working up to the root
Then it applies the second phase of method 1
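A minimal sketch of method 2 follows. It is not Ford's makeHeap(); the names siftDown, buildHeap and heapSort are hypothetical, and the code assumes int data in a std::vector with the root at subscript 0.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Applies the heap condition to the sub-tree rooted at `node`,
// looking only at the first `size` elements.
void siftDown(std::vector<int> &data, std::size_t node, std::size_t size) {
    for (;;) {
        std::size_t child = 2 * node + 1;            // left child
        if (child >= size) break;                    // node is a leaf
        if (child + 1 < size && data[child + 1] > data[child])
            ++child;                                 // pick the larger child
        if (data[node] >= data[child]) break;
        std::swap(data[node], data[child]);
        node = child;
    }
}

// Phase I: visit the internal nodes from the last one up to the root.
void buildHeap(std::vector<int> &data) {
    for (std::size_t node = data.size() / 2; node-- > 0; )
        siftDown(data, node, data.size());
}

// Phase I followed by the selection phase of method 1.
void heapSort(std::vector<int> &data) {
    buildHeap(data);
    for (std::size_t size = data.size(); size > 1; --size) {
        std::swap(data[0], data[size - 1]);   // maximum goes to its final position
        siftDown(data, 0, size - 1);          // restore the reduced heap
    }
}
```

On the input 25 57 48 37 12 92 86 33, buildHeap produces 92 57 86 37 12 48 25 33, and heapSort leaves the data in ascending order.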
Phase I of method 2
[Figure: phase I of method 2 on the input 25 57 48 37 12 92 86 33; working up from the last internal node, 48 is swapped with 92, then 25 is swapped with 92 and with 86; the resulting heap, level by level: 92 / 57 86 / 37 12 48 25 / 33]
Performance of heap sort
Each insertion takes $O(\log n)$ because the process works on an almost complete binary tree
For n elements, it takes $O(n \log n)$, even in the worst case
Experiments show that heapsort takes about twice the time of quicksort but outperforms quicksort in the worst case, since the process keeps working on an almost complete binary tree (at most about $\log_2(n+1)$ levels).
Dynamic pointer representation
Reference program: BST.h uses two classes, BNode and BTree
template <class T>
class BNode {
    T item;
    BNode *left;
    BNode *right;
    // end of data members
    // ……
};
template <class T>
class BTree {
    BNode<T> *root;
    size_t countNodes;
    // end of data members
    // ……
};
Implementation of inorder traversal
// Public entry point: prints the items in inorder sequence
template <class T>
void BTree<T>::inOrder() const {
    if (root)
        inOrder(root);
    else
        cout << "Empty tree\n";
}
// Recursive helper: left sub-tree, then the node, then the right sub-tree
template <class T>
void BTree<T>::inOrder(BNode<T> const *bnode) const {
    if (bnode->left) inOrder(bnode->left);
    cout << bnode->item << " ";
    if (bnode->right) inOrder(bnode->right);
}
Pretty tree
In order to make a tree visible, we may imagine the tree rotated 90 degrees to the left; this gives a special printing method: a reversed inorder traversal with nodes indented according to their levels
void prettyTree() const {
    if (root) prettyTree(root, 0);
    else ……
}
void prettyTree(BNode<T> const *bnode, size_t const level) const {
    if (bnode->right) prettyTree(bnode->right, level + 1);
    // make space for different levels
    for (size_t i = 0; i < level; i++) cout << "    ";
    cout << bnode->item << endl;
    if (bnode->left) prettyTree(bnode->left, level + 1);
}
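The rotated printing idea can be tried on a standalone node type. The sketch below is an illustration only: the Node struct, the output-stream parameter, the helper prettyTreeToString and the indentation of four spaces per level are all assumptions, not part of BST.h.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical minimal node type for illustration.
struct Node {
    int item;
    Node *left;
    Node *right;
};

// Reversed inorder traversal: right sub-tree, node, left sub-tree.
// Each node is indented by four spaces per level.
void prettyTree(const Node *node, std::size_t level, std::ostream &out) {
    if (!node) return;
    prettyTree(node->right, level + 1, out);
    out << std::string(4 * level, ' ') << node->item << '\n';
    prettyTree(node->left, level + 1, out);
}

// Convenience wrapper that returns the picture as a string.
std::string prettyTreeToString(const Node *root) {
    std::ostringstream out;
    prettyTree(root, 0, out);
    return out.str();
}
```

For a root 50 with left child 28 and right child 75, the output has three lines: 75 indented by four spaces, then 50 without indentation, then 28 indented by four spaces. Turning the page 90 degrees clockwise shows the tree in its usual orientation.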
Binary Search Tree (BST)
A BST is a binary tree in which all the key values stored in the left descendants of a node are less than the key value of the node, and all the key values stored in the right descendants of a node are greater than the key value of the node. E.g.,
[Figure: BST with root 50; 28 (children 22 and 40) and 75 (right child 90) are on level 1; 35 is the left child of 40; 87 and 95 are the children of 90]
Dynamic representation for a BST
Same as a binary tree; sample program: BST.h
template <class T>
class BSTNode {
    T item;
    BSTNode *left;
    BSTNode *right;
    // end of data members
    // ……
};
template <class T>
class BSTree {
    BSTNode<T> *root;
    size_t countNodes;
    // end of data members
    // ……
};
The algorithm for searching on a BST
The search can use a recursive approach.
template <class T>
BSTNode<T>* BSTree<T>::search(T const& target) const {
    return root ? search(root, target) : 0;
}
// Precondition: node is not null
template <class T>
BSTNode<T>* BSTree<T>::search(BSTNode<T> *node,
                              T const& target) const {
    if (target == node->item) return node;
    if (target < node->item)
        return node->left ? search(node->left, target) : 0;
    else
        return node->right ? search(node->right, target) : 0;
}
The iterative version for searching on a BST
However, it is also quite easy to convert the recursive algorithm to a non-recursive (iterative) one since it only involves "going down" the tree.
template <class T>
BSTNode<T>* BSTree<T>::search(T const& target) const {
    BSTNode<T> *cur = root;
    while (cur) {
        if (target == cur->item) return cur;
        if (target < cur->item)
            cur = cur->left;
        else
            cur = cur->right;
    }
    return 0;  // not found
}
A better way to return the result: find()
A search usually precedes an insert or a delete, but the traditional search returns:
a null pointer when a new item is to be inserted; i.e., the insert has to find the proper position for the item, again!
the node for deletion, which requires checking whether the node is on the right hand side or the left hand side of its parent, again!
With the references supported in C++, we can write a find(), similar to the one in List.h in Lecture 4, that lets insert() and remove() work with one single search operation, even though these operations require a search to make sure that the node does not exist or does exist
Node *& means a reference to a pointer; it can be interpreted as a reference to the location where the pointer is stored. From another view, if the name is on the right hand side of an expression, it refers to the value of the pointer, i.e., the node pointed to by the pointer; if the name is on the left hand side, it refers to the location storing the pointer, i.e., the link held by the "parent" of the node! Assigning a new value to the name means changing the parent's "child"!
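The two readings of a Node *& can be seen in a very small experiment. The sketch below is an assumption-laden illustration: the Node struct and the helper linkFor are hypothetical and only mimic one step of the find() idea.

```cpp
// Hypothetical minimal node type for illustration.
struct Node {
    int item;
    Node *left;
    Node *right;
};

// Returns a reference to the link of `parent` under which `target` belongs:
// the left link for smaller keys, the right link otherwise.
Node *&linkFor(Node &parent, int target) {
    return target < parent.item ? parent.left : parent.right;
}
```

If root holds 50 and has no children, then Node *&link = linkFor(root, 28) reads as a null pointer (nothing is stored under that link yet). After the assignment link = &child, where child is a node holding 28, root.left points to child: writing through the reference changed the parent's child without a second search.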
The implementation of find()
// Returns a reference to the link (root, or some node's left/right pointer)
// that points to the target, or to the null link where the target belongs
BSTNode<T>*& find(T const & target) {
    if (!root || target == root->item)
        return root;
    BSTNode<T>* par = root;  // parent of current node
    while (1) {
        if (target < par->item) {
            if (!par->left || target == par->left->item)
                return par->left;
            else
                par = par->left;
        } else {
            if (!par->right || target == par->right->item)
                return par->right;
            else
                par = par->right;
        }
    }
}
Insert an item with find()
Inserting an item involves searching for the correct place (a BST usually assumes no duplication) and then attaching the new node to the link found by find()
Add an additional function attach() to BSTree
bool insert(T const & target) {
    BSTNode<T> *& curRef(find(target));
    if (!curRef)
        return attach(curRef, target);
    else
        return false;  // duplication
}
bool attach(BSTNode<T> *& nodeRef, T const & x) {
    nodeRef = new BSTNode<T>(x);
    return nodeRef != 0;
}
Construction of the BST using insert()
Input sequence: 50, 28, 40, 75, 90, 22, 35, 95, 87
[Figure: the resulting BST has root 50; 28 (children 22 and 40) and 75 (right child 90) are on level 1; 35 is the left child of 40; 87 and 95 are the children of 90]
An online animation is also available at: http://www.cs.jhu.edu/~goodrich/dsa/trees/btree.html
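The construction can be reproduced with a compact, self-contained sketch of the find()/insert() idea. It is a simplified, non-template stand-in for the BSTNode/BSTree classes of BST.h: the names Node, Tree, getRoot and the inorder() helper are assumptions made here for illustration.

```cpp
#include <vector>

// Simplified stand-in for BSTNode<T>, with int keys.
struct Node {
    int item;
    Node *left;
    Node *right;
    explicit Node(int x) : item(x), left(nullptr), right(nullptr) {}
};

class Tree {
public:
    Tree() : root(nullptr) {}
    ~Tree() { destroy(root); }
    Tree(const Tree &) = delete;
    Tree &operator=(const Tree &) = delete;

    // One search only: find() yields either the duplicate or the link to fill.
    bool insert(int target) {
        Node *&link = find(target);
        if (link) return false;          // duplication
        link = new Node(target);         // attach through the parent's link
        return true;
    }

    // Items in inorder sequence, which is ascending order for a BST.
    std::vector<int> inorder() const {
        std::vector<int> out;
        inorder(root, out);
        return out;
    }

    const Node *getRoot() const { return root; }

private:
    Node *root;

    // Reference to the link that points to target, or to the null link
    // where target belongs.
    Node *&find(int target) {
        if (!root || target == root->item) return root;
        Node *par = root;                // parent of the link being examined
        for (;;) {
            Node *&next = target < par->item ? par->left : par->right;
            if (!next || target == next->item) return next;
            par = next;
        }
    }

    static void inorder(const Node *n, std::vector<int> &out) {
        if (!n) return;
        inorder(n->left, out);
        out.push_back(n->item);
        inorder(n->right, out);
    }

    static void destroy(Node *n) {
        if (!n) return;
        destroy(n->left);
        destroy(n->right);
        delete n;
    }
};
```

Inserting 50, 28, 40, 75, 90, 22, 35, 95, 87 in this order gives a tree whose root is 50 and whose inorder sequence is 22 28 35 40 50 75 87 90 95; a second insert of 40 is rejected as a duplicate.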
More exercises on BST
Ford's exercises: 10:20, 22
Complexity considerations
If the binary search tree is constructed from input in a random order, the depths of the left sub-tree and right sub-tree of each node are likely to be similar, and each later search is similar to a binary search in an array
Therefore, the optimal complexity for searching on a BST is about $O(\log_2 n)$
However, if the input sequence for the BST is in sequential (sorted) order, it results in a tree like the one on the next page, and the complexity of find() becomes $O(n)$
Therefore, the complexity of a search on a BST ranges from $O(\log_2 n)$ to $O(n)$.
The worst case of searching on a BST
Input sequence: 22, 28, 35, 40, 50, 75, 87, 90, 95
[Figure: the resulting BST is a single chain: 22 is the root and every node has only a right child, down to the leaf 95]
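The effect of the input order on the shape of the tree can be checked by measuring the depth after construction. The sketch below is an illustration under assumptions: the Node struct and the functions insert, depth, destroy and depthAfterInserting are hypothetical names, and a recursive insert through a Node *& is used instead of the find()/attach() pair.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical minimal node type for illustration.
struct Node {
    int item;
    Node *left;
    Node *right;
};

// Recursive insert without duplicates; `link` is the parent's link to fill.
void insert(Node *&link, int x) {
    if (!link) link = new Node{x, nullptr, nullptr};
    else if (x < link->item) insert(link->left, x);
    else if (x > link->item) insert(link->right, x);
}

// Depth = maximum level of a leaf; -1 for the empty tree.
int depth(const Node *n) {
    if (!n) return -1;
    return 1 + std::max(depth(n->left), depth(n->right));
}

void destroy(Node *n) {
    if (!n) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}

// Builds a BST from `keys` in the given order and reports its depth.
int depthAfterInserting(const std::vector<int> &keys) {
    Node *root = nullptr;
    for (int k : keys) insert(root, k);
    int d = depth(root);
    destroy(root);
    return d;
}
```

For the nine keys used in this lecture, the order 50, 28, 40, 75, 90, 22, 35, 95, 87 gives depth 3, whereas the sorted order 22, 28, ..., 95 gives depth 8, i.e. n - 1, so a search may have to pass through every node.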
Complexity for insert()
The logic of insert() is find() + attach()
If there is a fast memory allocation method, the running time of attach() is $O(1)$
Therefore insert() has the same complexity as find()
Summary
A binary tree is a typical recursive structure and has three parts: root, left and right sub-trees
A binary tree is usually stored in node (pointer) representation
Sometimes it is also efficient to store a binary tree in implicit array (vector) representation; typical applications are the heap and heap sort, which is quite an efficient sorting algorithm in all cases
There are three usual ways to traverse a binary tree: preorder, inorder, and postorder
The binary search tree (BST) keeps smaller values on the left side of a node and larger values on the right
The optimal complexity for insert/search on a BST is $O(\log_2 n)$
Reference
Ford: 10.1-6, 14.1-2
Data Structures and Algorithms in C++ by Michael T. Goodrich, Roberto Tamassia, David M. Mount: Chapter 6, 8
Example programs: BST.h, testBST.cpp
-- END --