Chap 2 dhamdhere

Download Report

Transcript Chap 2 dhamdhere

Chapter - 2
Data strucuters for Language processing
Classification
1. Based on nature ---- Linear and Non-linear
eg :- Linear = array , stack etc.
Non-Linear = Tree , Graph etc.
2. Based on Purpose --- Search or allocation
eg :- Search = Binary search tree
allocation = stacks,heaps
3.Based on Lifetime ---- whether used during Language Processing or during
target program executions
eg :- Lang. Processing = Object based data model
Target program = Hash tables
Search Data Structures
Entry Format
Fixed and Variable Length entries
Generic Search Procedure
Binary Search Organization
•
•
•
All entries in table satisfies the order relation
They keep entries according to order
Uses symbols like ‘<’ and ‘>’
• It uses algorithm to make entries represened
in next slide
Hash Tables
H is the Hasing function.
S is the symmbols for entry
S(e) is current entry symbol
Hashing Function
•
•
•
•
•
•
Hashing function is used to make search system faster.
It transforms the source symbol or group of symbols to numerical
numbers to make faster comparisons and searching
Hashing do not change the original meaning of symbols it just
transforms them to other form.
Size is pre decided for transforming message to particular format
If message is of less size than that size , it performs “folding”
operation
In folding message is padded with 0’s to complete the size of it.
Properties of good hashing
func.
examples of hash function
1. Multiplication of numbers h(s) will result some number
by multiplying symbols with predefined number to
generate various hash values
2. Division function , it will also divide the symbol by some
pre defined number and convert the symbol to numeric
value.
3.Exponential method , like all mathematical funciton this
can also be used
Collision in hashing
Many function result into same number generation which leads to collision of
numbers and searching will crash
Thus to avoid collision we have varsious collision handling techniques
1. Rehasing technique
2. Overflow chaining technique
Rehasing
Overflow chaining
•
•
•
It avoids collision by creating new table for colliding entry in
overflow table
Every entry contains a pointer for mapping with pointer
poiting towards other entry in overflow table
Symbols associated with tables are chained together thus
this technique is called chaining technique.
Linked List
Binary Tree
•
•
•
Each entry in tree is having two fields 1.left_pointer and
2. right_pointer
If s is symbol on left of tree then all the entries on left
will be less then root node
And if s is on right then every S will be greater then root
node/entry
Nested Search
Stacks
Extended Stack model
Heaps
Memory managment
•
•
•
Due to repetition of allocation and deallocation of
memory area holes are created in memory area.
Memory management takes care of this holes and
reallocate this area by managing it properly
It increases performance and speed of allocation and
deallocation of memory spaces
Identifying free memory area
Two popular techniques are :1. Reference counts
2.Garbage Colletion
Reference count
•
•
•
It associates a reference count number with memory
area to indicate number of active users
Number increases when user access memory and
decreases when user releases memory area
Area is said to be free when number drops to zero
Garbage collection
•
•
•
•
It makes two passes over memory to get unused areas
In first pass it traverse through all memory and marks
every memory in use.
In second pass all unmarked areas are declared as free
memory areas.
They are called every time system runs out of memory
to allocate new memory area.