Transcript PowerPoint

Class 16 - Searching
 Linear search
 Binary search
 Binary search trees
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 1
The problem
 Given a set of data items {k1, k2, ..., kn} -
called keys - determine whether another key
k occurs in the set.
 (Variation: Start with pairs (k1,v1), ...,
(kn,vn). Then given k, determine whether k
= ki for some i; return vi.)
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 2
“Data structures”
 We will consider two costs:
 The
cost of inserting the keys ki in some data
structure (array, list, whatever)
 The cost of searching for k
 Cost depends primarily on how data are
stored.
 Best data structure can depend upon ratio of
searches to insertions, how searches and
insertions are distributed.
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 3
Data structures (cont.)
 We will look at three data structures:
 Linear
storage in array - unsorted
 Linear storage in array - sorted
 Binary tree
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 4
Linear storage, unsorted
Store data in array keys[0...count], unsorted
1. To insert key k: keys[count] = k; count++;
2. To find key k:
for (i=0; i<count; i++)
if (keys[i] == k) return i;
Cost of inserting new key: O(1)
Cost of searching for key: O(n)
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 5
Linear storage, sorted
Store data in array keys[0...count], sorted
1. To insert key k: insert in correct order, as in
insertion sort
2. To find key k: same as above
Cost of inserting new key: O(n)
Cost of searching for key: O(n)
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 6
Binary search
Cost of searching in sorted array can be
dramatically reduced by “divide and
conquer”.
boolean bsearch (int k, int[] keys,
int i, int j)
// find whether k occurs in keys[i..j]]
Can cut size of subarray in half with one
comparison:
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 7
Binary search (cont.)
i
m
j
keys
m = (j+i)/2
 Cases:
k
== keys[m]: done
 k < keys[m]: k occurs in keys[i..m-1] (if at all)
 k > keys[m]: k occurs in keys[m+1..j] (if at all)
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 8
Binary search (cont.)
boolean bsearch (int k, int[] keys,
int i, int j) {
// find whether k occurs in keys[i..j]
int m = (i+j)/2;
if (i > j) return false;
if (k == keys[m]) return true;
if (k < keys[m])
return bsearch(k, keys, i, m-1);
return bsearch(k, keys, m+1, j);
}
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 9
Efficiency of binary search
# of comparisons = 2 * # of times array can be
divided in half: log2 n
E.g. If n = 1,000,000, worst-case cost of linear
search = 1,000,000; worst-case cost of
binary search = 40.
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 10
Binary trees
Can obtain log n insertions and searches (on
average) by storing data in binary tree
structure.
Def. A binary tree is a structure consisting of a
number (the label ) and two binary trees, the
left and right subtrees. Both are optional.
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 11
Binary trees (cont.)
15
10
9
19
12
11
16
14
Def. Size of tree t = number of nodes
(i.e. numbers) in t.
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 12
Binary search trees
 Def. A binary search tree (BST) is a binary
tree with the following property: its label is
greater than any of the labels in its left
subtree and less than any labels in its right
subtree. Furthermore, its left and right
subtrees are binary search trees.
 Example: tree on previous slide
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 13
Balanced BST’s
 A binary search tree is balanced if its left
and right subtrees are of approximately
equal size, and are both balanced.
 Def. The height of a tree is the length of the
longest path from the top to the bottom.
 Observation If a BST of size n is balanced,
its height  log2 n.
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 14
Searching in BST
k’
To find k in BST T =
Cases:



T1
T2
k = k’: Done
k < k’: Search in T1 (if it exists)
k > k’: Search in T2 (if it exists)
Time (in worst case) is proportional to height
of T. If T is balanced, time is order log2 n.
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 15
Inserting in BST
k’
To insert k in BST T =
Cases:
T1
T2
k = k’: Done (do nothing)
 k < k’: Insert in T1. If T1 absent, add new left
subtree containing just k
 k > k’: Insert in T2. If T2 absent, add new
right subtree containing just k

If T is balanced, time is order log2 n.
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 16
Implementing BST’s
 Represent trees by objects containing an
integer and two references to other trees.
(Recall that a reference to an object can be
null, meaning it doesn’t really point to
anything.)
class BinarySearchTree {
int val;
BinarySearchTree left, right;
BinarySearchTree (int v, BinarySearchTree l,
BinarySearchTree r) {
val = v; left = l; right = r;
}
}
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 17
Implementing BST’s (cont.)
 Could make the search and insert operations
instance methods, but, as for lists, we will
instead make them class methods of a
separate class, BST.
class BST {
static BinarySearchTree makeBST (int k) {
return new BinarySearchTree(k, null, null);
}
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 18
Implementing BST’s (cont.)
static boolean search (int k, BinarySearchTree t) {
if (t == null) return false;
else if (k == t.val) return true;
else if (k < t.val) return search(k, t.left);
else return search(k, t.right);
}
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 19
Implementing BST’s (cont.)
static void insert (int k, BinarySearchTree t) {
// t must be non-null
if (k == t.val) return;
if (k < t.val)
if (t.left == null)
t.left = makeBST(k);
else
insert(k, t.left);
else
if (t.right == null)
t.right = makeBST(k);
else
insert(k, t.right);
}
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 20
Implementing BST’s (cont.)
Here is an example of using these operations:
public static void main (String[] args) {
BinarySearchTree T = BST.makeBST(7);
BST.insert(3,T); BST.insert(5,T);
BST.insert(13,T); BST.insert(11,T);
BST.insert(9,T); BST.insert(1,T);
System.out.print(BST.search(3, T)); // true
System.out.print(BST.search(4, T)); // false
}
24/3/00
SEM107 - © Kamin & Reddy
Class 16 - Searching - 21