Power Point 2000

Download Report

Transcript Power Point 2000

Assignment 2:
A Simple Address Book
Andy Wang
Data Structures, Algorithms, and
Generic Programming
Learning Objectives
 Gain further experience in C++
 Manage project develop using make
 Apply the concept of hash functions
Mission
 Write a simple address book
 Use hash functions to access a table



Insertion
Lookup
Deletion
Deliverables (Due 9/26/2003)
 makefile
 hashtable.h
 hashtable.cpp
 main.cpp
 Development logs
More on Development Log
 Due in each class
 Cumulative



9/17, turn in a log that covers 9/15 – 9/17
9/22, turn in a log that covers 9/15 – 9/22
And so on…
Requirements
 Create a proj2 directory




makefile
hashtable.h
hashtable.cpp
main.cpp
hashtable.h
 Interface of the class HashTable
 Required public interface:
bool Insert(const string &name, const string &addr);
bool Lookup(const string &name, string *addr) const;
string Remove(const string &name);
Constructor
 Hashtable(unsigned int size);
 Default size: 5
Insert()
bool Insert(const string &name, const string &addr);
• Return value
•
•
true on success
false on failure
•
•
Duplicate name
Table is full
Insert()
Name: Michael Jackson
Address:
0
1
2
3
Hash Table
Insert()
Name: Michael Jackson
Address: Never-Never Land
Hash(name)
0
1
2
3
Hash Table
Insert()
Name: Michael Jackson
Address: Never-Never Land
Hash(name)
0
1 Michael Jackson Never-Never Land
2
3
Hash Table
Insert()
Name: Mickey Mouse
Address: Disneyland
0
1 Michael Jackson Never-Never Land
2
3
Hash Table
Insert()
Name: Mickey Mouse
Address: Disneyland
Oh, Boy!
Hash(name)
0
1 Michael Jackson Never-Never Land
2
3
Hash Table
One problem
with hashing—collision
 Collision handling techniques:





Increase the hash table size
Chaining
Linear probing
Quadratic probing
Double hashing
Increase the Hash Table Size
 If collision, double the size, rehash all entries
+ Conceptually simple

Unbounded growth of table size
Birthday Paradox
In a group of 60 people, you are very likely to find
two people with the same birthday.
 The probability of collisions is higher than you think!

Chaining
 Requires a data structure from a later lecture
 Basic idea: each table entry is associated
with multiple (key, value) pairs
 Sequentially look through those pairs
+ incremental growth of the table
- poor performance when an entry is associated
with too many (key, value) pairs
Linear Probing
 Sequentially find the next empty table entry
(Hash(key) + 1) % size
 (Hash(key) + 2) % size …
+ Simple
+ Use all table entries before size increase
- Clustering (nonuniform distribution of table
entries)
- Can degenerate into sequential searches

Quadratic Probing
 Try to avoid clustering
(Hash(key) + 12) % size
2
 (Hash(key) + 2 ) % size…
+ Simple
- Secondary clustering (not as severe)

Double Hashing
 Use another hash function to determine the
skipping distance
If Hash1(key) points to an occupied entry
 Use (Hash1(key) + Hash2(Hash1(key))) % size
 (Hash1(key) + 2*Hash2(Hash1(key)) % size…
+ Avoids secondary clustering
+ Widely used in compilers

Lookup()
bool Lookup(const string &name, string *address) const;
• Return value
•
true if found
•
•
address contains the address
false if not found
Lookup()
Name: Michael Jackson
0
1 Michael Jackson Never-Never Land
2
3
Hash Table
Lookup()
Name: Michael Jackson
Hash(name)
0
1 Michael Jackson Never-Never Land
2
3
Hash Table
Remove()
string Remove(const string &name) const;
• Return value
•
The address associated with the name
Remove()
Name: Michael Jackson
0
1 Michael Jackson Never-Never Land
2
3
Hash Table
Remove()
Name: Michael Jackson
Hash(name)
0
1 Michael Jackson Never-Never Land
2
3
Hash Table
Remove()
Name: Michael Jackson
Hash(name)
0
1
2
3
Hash Table
Your Hash Function
 Hash(name) // may contain spaces
 Develop/choose your own hash function(s)



Use all characters in the name
Use the ordering information of each character
Short names should not cluster in the table
Milestones
1. Create all files





Add #include into .cpp files
Add #ifndef into .h files
Add files names into the makefile
make clean
make
Milestones
2. Add the class definitions to the header files

Try to compile
3. Write the skeleton code in your .cpp files


Nothing but empty functions
Try to get the interface to compile
4. Write pseudo code (comments) in each
skeleton function
Milestones
5. Implement the constructor/destructor
routines


Add private functions as needed
Compile and test those routines
6. Implement a dump routine to visualize the
hash table
Milestones
7. Implement the accessor routines that only
read your private data members

Compile and test those routines
8. Implement the accessor routines that write to
your private data members

Compile and test those routines
Milestones
9. Implement the Insert routine




Implement the associated hash function(s)
Test the hash function in isolation
Check for clustering
Check for duplicate names
10. Implement collision handling
Milestones
11. Implement the Lookup routine

Compile and test the routine
12. Implement the Remove routine. Compile
and test the routine
Pace Yourself
 Try to accomplish 1 to 2 milestones per day
 Start your project early!