Transcript unit 11a

Unit 11: Data Structures & Complexity
syllabus

We discuss in this unit
•
•
•
•
Graphs and trees
Binary search trees
Hashing functions
Recursive sorting: quicksort, mergesort
basic
programming
concepts
object oriented
programming
topics in
computer
science
1
unit 11a
Graphs and Trees
Graph: a data representation which includes nodes
and edges, where each edge connects two nodes
Example: the internet
Tree: a connected graph with no loops, and a root
Example: inheritance tree
2
unit 11a
Graphs and Trees
b)
a)
ROOT
c)
internal vertices
LEAF NODES
3
unit 11a
Binary Trees

A rooted tree is called a binary tree if every internal
vertex has no more than 2 children
The tree is called a full binary tree if every internal vertex
has exactly 2 children

An ordered rooted tree is a rooted tree where the children
of each internal vertex are ordered; we call the children of
a vertex the left child and the right child, if they exist

A rooted binary tree of height H is called balanced if all its
leaves are at levels H or H-1
4
unit 11a
Binary tree: example
Lou
Hal
Ed
Max
Ken
Joe
Sue
Ted
5
unit 11a
Tree Properties
Theorem: A tree with N vertices has N-1 edges
Theorem: There are at most 2 H leaves in a binary tree of
height H
Corallary: If a binary tree with L leaves is full and
balanced, then its height is
H =  log2 L 
Theorem: There are at most (2 H+1–1) nodes in a binary
tree of height H
6
unit 11a
Binary Search Tree (BST)
A special kind of binary tree in which:
1. Each vertex contains a distinct key value
2. The key values in the tree can be compared using
“greater than” and “less than”
3. The key value of each vertex in the tree is
less than every key value in its left subtree, and greater
than every key value in its right subtree
unit 11a
Example: Binary Search Tree
Lou
Hal
Ed
Max
Ken
Joe
Sue
Ted
8
unit 11a
Shape of a BST
Depends on its key values and their order of insertion:


Insert the elements ‘J’ ‘E’ ‘F’ ‘T’ ‘A’ in that order.
The first value to be inserted is put into the root.
‘J’
9
unit 11a
Inserting ‘E’ into the BST
Thereafter, each value to be inserted begins by comparing
itself to the value in the root, moving left it is less, or
moving right if it is greater. This continues at each level
until it can be inserted as a new leaf.
‘J’
‘E’
10
unit 11a
Inserting ‘F’ into the BST
Begin by comparing ‘F’ to the value in the root, moving left it
is less, or moving right if it is greater. This continues until
it can be inserted as a leaf.
‘J’
‘E’
‘F’
11
unit 11a
Inserting ‘T’ into the BST
Begin by comparing ‘T’ to the value in the root, moving left it is less, or
moving right if it is greater. This continues until it can be inserted as
a leaf.
‘J’
‘T’
‘E’
‘F’
12
unit 11a
Inserting ‘A’ into the BST
Begin by comparing ‘A’ to the value in the root, moving left it is less, or
moving right if it is greater. This continues until it can be inserted as
a leaf.
‘J’
‘T’
‘E’
‘A’
‘F’
13
unit 11a
Order of insertion
what BST is obtained by inserting the elements ‘A’ ‘E’ ‘F’ ‘J’ ‘T’
in that order?
‘A’
‘E’
‘F’
‘J’
‘T’
14
unit 11a
Another binary search tree
‘J’
‘T’
‘E’
‘A’
‘H’
‘M’
‘P’
‘K’
Add nodes containing these values in this order:
‘D’
‘B’
‘L’
‘Q’
‘S’
‘V’
‘Z’
15
unit 11a
Task: is ‘F’ in the tree?
‘J’
‘T’
‘E’
‘A’
‘D’
‘B’
‘V’
‘M’
‘H’
‘Z’
‘P’
‘K’
‘L’
‘Q’
‘S’
16
unit 11a
Search(x)

1.
2.
3.
4.
start at the root of the tree which contains y:
the tree is empty  x is not present
x = y (the item at the root)  the root is returned
x < y  recursively search the left subtree
x > y  recursively search the right subtree
17
unit 11a
Operations

Search(x)

Insert(x)

Delete(x)
tree algs/demo
18
unit 11a
Complexity

Search(x) – O(H)

Insert(x) – O(H)

Delete(x) – O(H)

worst case O(n) when tree is a list
best case O(log n) when tree is full and balanced
19
unit 11a
Traversal Algorithms

A traversal algorithm is a procedure for
systematically visiting every vertex of an ordered
binary tree

Tree traversals are defined recursively

Three traversals are named:
preorder
inorder
postorder
20
unit 11a
PREORDER Traversal Algorithm
Let T be an ordered binary tree with root r:

If T has only r, then r is the preorder traversal

Otherwise, suppose T1, T2 are the left and right
subtrees at r; the preorder traversal is
•
visit r
•
traverse T1 in preorder
•
traverse T2 in preorder
21
unit 11a
Preorder Traversal
Visit first
ROOT
‘J’
‘T’
‘E’
‘A’
‘H’
Visit left subtree second
‘M’
‘Y’
Visit right subtree last
result: J E A H T M Y
22
unit 11a
INORDER Traversal Algorithm
Let T be an ordered binary tree with root r:

If T has only r, then r is the inorder traversal

Otherwise, suppose T1, T2 are the left and right
subtrees at r; then
•
traverse T1 in inorder
•
visit r
•
traverse T2 in inorder
23
unit 11a
Inorder Traversal
Visit second
ROOT
‘J’
‘T’
‘E’
‘A’
‘H’
Visit left subtree first
‘M’
‘Y’
Visit right subtree last
result: A E H J M T Y
24
unit 11a
POSTORDER Traversal Algorithm
Let T be an ordered binary tree with root r:

If T has only r, then r is the postorder traversal

Otherwise, suppose T1, T2 are the left and right
subtrees at r; then
•
traverse T1 in postorder
•
traverse T2 in postorder
•
visit r
25
unit 11a
Postorder Traversal
Visit last
ROOT
‘J’
‘T’
‘E’
‘A’
‘H’
Visit left subtree first
‘M’
‘Y’
Visit right subtree second
result: A H E M Y T J
26
unit 11a
A Binary Expression Tree
ROOT
‘-’
‘8’
INORDER TRAVERSAL:
‘5’
8 - 5
PREORDER TRAVERSAL:
- 8 5
POSTORDER TRAVERSAL:
8 5 -
has value 3
27
unit 11a
Binary Expression Tree
A special kind of binary tree in which:
1. Each leaf node contains a single operand
2. Each nonleaf node contains a single binary
operator
3. The left and right subtrees of an operator node
represent subexpressions that must be evaluated
before applying the operator at the root of the
subtree
28
unit 11a
A Binary Expression Tree
‘*’
‘+’
‘4’
‘3’
‘2’
What value does it have?
( 4 + 2 ) * 3 = 18
29
unit 11a
A Binary Expression Tree
‘*’
‘+’
‘4’
‘3’
‘2’
What infix, prefix, postfix expressions does it represent?
30
unit 11a
A Binary Expression Tree
‘*’
‘+’
‘4’
‘3’
‘2’
Infix:
((4+2)*3)
Prefix:
* + 4 2 3
evaluate from right
Postfix:
4 2 + 3 *
evaluate from left
31
unit 11a
Levels Indicate Precedence
When a binary expression tree is used to
represent an expression, the levels of the
nodes in the tree indicate their relative
precedence of evaluation
Operations at higher levels of the tree are
evaluated later than those below them; the
operation at the root is always the last
operation performed
32
unit 11a
Example
‘*’
‘-’
‘8’
‘/’
‘5’
‘3’
‘+’
‘4’
‘2’
What infix, prefix, postfix expressions does it represent?
33
unit 11a
A binary expression tree
‘*’
‘/’
‘-’
‘8’
‘5’
‘3’
‘+’
‘4’
‘2’
Infix:
((8-5)*((4+2)/3))
Prefix:
*-85 /+423
Postfix:
8 5 - 4 2 + 3 / * has operators in order used
34
unit 11a
Hash tables




Goal: accesses data with complexity O(1), even when the
data needs to be dynamically administered (mostly by
inserting new or deleting existing data)
Solution: array
Problem: access is only O(1) if the index of the element is
known – otherwise O(log n)
Solution: each data item to be stored is associated with a
hash value which gives array index:
• generate array of size m, where m is prime and sufficiently
large
• assign unique numerical value N(k) to key of element k
• h(k) = N(k) mod m
35
unit 11a
Hash tables - cont


Problem: hash value is not unique anymore!
Solution: a collision procedure must determine the
position for the new object
hasshing
36
unit 11a
Operations



Search(x)
Insert(x)
Delete(x)
Complexity:
 worst case O(n) when hash value is always the
same (and table is really a list)
 best case O(1) when hash value is always distinct
37
unit 11a
Example: keys are names

list of unique identifiers
• washington
• lincoln
• bush





103288042987600
5201793578
151444
the range currently is all positive integers, which is
not possible to implement
for m=10007
• washington
• lincoln
• bush



3249
4873
1339
38
unit 11a
why modulus prime number?



underlying reason: the new code numbers should
appear random
technical reason: the modulus operation should
be a “field”
example:
• original codes are all even numbers
• m is even
• only even buckets in the table will get filled, half
the table will be empty
39
unit 11a
Back to sorting...
Two recursive algorithms:

MergeSort

QuickSort
40
unit 11a
MergeSort

divide array to two equal parts

recursively sort left part

recursively sort right part

merge the two sorted lists
MergeSort
41
unit 11a
QuickSort

choose pivot arbitrarily

divide to left part (< = than pivot) and right part (>
than pivot)

sort each part recursively
quicksort
42
unit 11a
Complexity

MergeSort
worst case O(n * log n)

QuickSort
worst case O(n2)
average case O(n * log n)
43
unit 11a