Transcript ppt slides
Popular Ranking Algorithms
Prepared by
-Ranjan Dash
Contents
Efficient ways of Ranking
Algorithms for ranking
Sort Algorithm
Scan Algorithm
FA Algorithm
TA Algorithm
Efficient ways of Ranking
Besides the choice of a proper ranking function, the way it is
executed also determines performance.
Given a ranking function, the ranking algorithm used to evaluate
it plays a key role in efficiency.
Algorithms for ranking
Prominent algorithms for retrieving the top K results are
Sort Algorithm
Scan Algorithm
FA Algorithm
TA Algorithm
Sort Algorithm
The simplest way to find the top K results of a ranking
function such as
Score(ObjectId) = linear combination of attributes
is to score every tuple, sort the results, and take the top K.
This takes O(n log n) time.
Very slow for large relations, where n is large.
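The sort approach can be sketched as follows. This is a minimal illustration, not part of the original slides: it borrows the restaurant relation and the ranking function Score = 2 · Cuisine + Location from the FA example in these slides, and the names RELATION, score and top_k_by_sort are assumptions made for the example.

```python
# Relation from the restaurant example: RestId -> (Cuisine, Location).
RELATION = {
    1: (2, 3), 2: (1, 5), 3: (4, 2), 4: (7, 4),
    5: (3, 4), 6: (5, 2), 7: (3, 3),
}


def score(cuisine, location):
    """Linear combination used in the slides: 2 * Cuisine + Location."""
    return 2 * cuisine + location


def top_k_by_sort(relation, k):
    """Score all n tuples, sort them, keep the first k: O(n log n)."""
    scored = [(score(*attrs), rest_id) for rest_id, attrs in relation.items()]
    # Highest score first; ties broken by the smaller RestId (an assumption).
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


print(top_k_by_sort(RELATION, 2))  # [(18, 4), (12, 6)]
```

Every tuple is scored and sorted even though only K of them are returned, which is where the cost for large n comes from.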
Scan Algorithm
Keep K tuples in a buffer.
For every tuple in the relation, scan this buffer.
Replace the lowest-scoring tuple in the buffer if the input tuple
scores higher.
Takes O(n·K) time.
Still slow for a large n.
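A minimal sketch of the scan approach, under the same assumptions as the restaurant example in these slides (Score = 2 · Cuisine + Location). The names RELATION, score and top_k_by_scan are illustrative, not taken from the slides.

```python
# Relation from the restaurant example: RestId -> (Cuisine, Location).
RELATION = {
    1: (2, 3), 2: (1, 5), 3: (4, 2), 4: (7, 4),
    5: (3, 4), 6: (5, 2), 7: (3, 3),
}


def score(cuisine, location):
    """Linear combination used in the slides: 2 * Cuisine + Location."""
    return 2 * cuisine + location


def top_k_by_scan(relation, k):
    """One pass over the relation with a K-slot buffer: O(n * K)."""
    buffer = []  # holds at most k (score, rest_id) pairs
    for rest_id, attrs in relation.items():
        candidate = (score(*attrs), rest_id)
        if len(buffer) < k:
            buffer.append(candidate)
            continue
        # Linear scan of the buffer to locate the lowest score: O(K).
        lowest = min(range(k), key=lambda i: buffer[i][0])
        if candidate[0] > buffer[lowest][0]:
            buffer[lowest] = candidate
    return sorted(buffer, reverse=True)


print(top_k_by_scan(RELATION, 2))  # [(18, 4), (12, 6)]
```

The buffer never holds more than K tuples, so memory stays small, but every one of the n tuples is still read and scored.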
FA Algorithm
Fagin’s Algorithm, known as the FA algorithm. Developed by
Ron Fagin.
Relies on data structures prepared offline.
Building these data structures has a cost, but the cost
amortized over many queries is very low.
Sorted access to the attributes: supports the GetNext()
operation and is sequential. One sorted table per
attribute.
Random access through the ObjectId: supports the
Get(ObjId) operation.
Preprocessing prepares the above two types of data
structures, which are reused again and again during
query processing.
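The two access paths can be sketched as below. This is an assumed in-memory illustration only: the class SortedList and the function get are hypothetical names standing in for the GetNext() and Get(ObjId) operations, and the data is the restaurant relation from the example in these slides.

```python
# Relation from the restaurant example: RestId -> (Cuisine, Location).
RELATION = {
    1: (2, 3), 2: (1, 5), 3: (4, 2), 4: (7, 4),
    5: (3, 4), 6: (5, 2), 7: (3, 3),
}


class SortedList:
    """One attribute's table, sorted by descending value, read sequentially."""

    def __init__(self, relation, attr_index):
        # Built once, offline. Ties keep RestId order here, so they may be
        # ordered differently than on the slides; any tie order is valid.
        self._rows = sorted(
            ((rest_id, attrs[attr_index]) for rest_id, attrs in relation.items()),
            key=lambda row: -row[1],
        )
        self._pos = 0

    def get_next(self):
        """Sorted access: next (RestId, value) pair, or None when exhausted."""
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row


def get(relation, rest_id):
    """Random access: all attribute values of one object, by its id."""
    return relation[rest_id]


demo = SortedList(RELATION, 0)  # index 0 = Cuisine
print(demo.get_next())          # (4, 7)
print(get(RELATION, 4))         # (7, 4)
```

In a real system the sorted tables and the id index would live on disk and be shared across queries, which is why their construction cost amortizes.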
FA Algorithm
Step1
Example: determining the top 1 restaurant based on the given
ranking function
Score(RestId) = 2 · Cuisine + Location
Original relation             Sorted for Cuisine    Sorted for Location
RestId  Cuisine  Location     RestId  Cuisine       RestId  Location
1       2        3            4       7             2       5
2       1        5            6       5             5       4
3       4        2            3       4             4       4
4       7        4            5       3             1       3
5       3        4            7       3             7       3
6       5        2            1       2             3       2
7       3        3            2       1             6       2
FA Algorithm
Step1
Do GetNext on both sorted tables in round-robin fashion.
Stop when K objects (1 in our example) have been seen in every
list.
           Sorted for Cuisine    Sorted for Location
           RestId  Cuisine       RestId  Location
1st Round  4       7             2       5
2nd Round  6       5             5       4
3rd Round  3       4             4       4
           5       3             1       3
           7       3             7       3
           1       2             3       2
           2       1             6       2
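RestId 4 is the first restaurant seen in both lists, so sorted
access stops after the 3rd round. RestId 4 is the winner in
our case.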
FA Algorithm
Step2
Use random access to calculate the score of every tuple visited in
step 1.
Take the top K after evaluation.
This algorithm is applicable only if the ranking function is
monotonic.
The worst case is the same as for the Scan algorithm.
The worst-case memory requirement is not bounded by K: the set of
visited tuples can grow as large as the relation.
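Both steps of FA can be sketched on the restaurant example. This is a minimal sketch, not the author's implementation: the sorted lists are copied from the slides as plain Python lists, indexing a list position stands in for GetNext(), a dictionary lookup stands in for Get(ObjId), and the names fagin_top_k, BY_CUISINE and BY_LOCATION are assumptions.

```python
# Relation from the slides: RestId -> (Cuisine, Location).
RELATION = {
    1: (2, 3), 2: (1, 5), 3: (4, 2), 4: (7, 4),
    5: (3, 4), 6: (5, 2), 7: (3, 3),
}
# Sorted tables as shown on the slides: (RestId, value), best value first.
BY_CUISINE = [(4, 7), (6, 5), (3, 4), (5, 3), (7, 3), (1, 2), (2, 1)]
BY_LOCATION = [(2, 5), (5, 4), (4, 4), (1, 3), (7, 3), (3, 2), (6, 2)]


def score(cuisine, location):
    """Monotonic ranking function from the slides: 2 * Cuisine + Location."""
    return 2 * cuisine + location


def fagin_top_k(relation, sorted_lists, k):
    """Return (top-k list of (score, RestId), number of rounds used)."""
    seen = [set() for _ in sorted_lists]  # ids seen so far, per list
    rounds = 0
    # Step 1: round-robin sorted access until k ids appear in every list.
    while rounds < len(relation):
        for seen_in_list, table in zip(seen, sorted_lists):
            seen_in_list.add(table[rounds][0])
        rounds += 1
        if len(set.intersection(*seen)) >= k:
            break
    # Step 2: random access to score every id visited in step 1.
    visited = set.union(*seen)
    scored = sorted(
        ((score(*relation[rest_id]), rest_id) for rest_id in visited),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return scored[:k], rounds


print(fagin_top_k(RELATION, [BY_CUISINE, BY_LOCATION], 1))  # ([(18, 4)], 3)
```

With K = 1 the loop stops after three rounds, matching the slides, and step 2 scores the five visited restaurants (4, 6, 3, 2, 5) rather than all seven. The set of visited ids is the buffer whose size is not bounded by K.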
TA Algorithm
Known as the Threshold Algorithm.
Similar to FA, but sorted access and random access are
interleaved.
Step 1
Do sorted access (and the corresponding random accesses to compute
overall scores), keeping the K best answers seen so far.
Step 2
Determine the threshold value: the score of a hypothetical tuple
built from the last value seen under sorted access in each list.
If K objects have an overall score ≥ the threshold value, stop.
Otherwise, go to the next entry position in each sorted list and
repeat step 1.
Faster than FA.
Requires less memory.
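The interleaving of sorted and random access can be sketched on the same restaurant data. This is an illustrative sketch under stated assumptions: the lists are plain Python lists copied from the FA example, the sorted lists are passed in the same order as the arguments of score, and the names threshold_top_k, BY_CUISINE and BY_LOCATION are hypothetical.

```python
import heapq

# Relation from the slides: RestId -> (Cuisine, Location).
RELATION = {
    1: (2, 3), 2: (1, 5), 3: (4, 2), 4: (7, 4),
    5: (3, 4), 6: (5, 2), 7: (3, 3),
}
# Sorted tables as shown on the slides: (RestId, value), best value first.
BY_CUISINE = [(4, 7), (6, 5), (3, 4), (5, 3), (7, 3), (1, 2), (2, 1)]
BY_LOCATION = [(2, 5), (5, 4), (4, 4), (1, 3), (7, 3), (3, 2), (6, 2)]


def score(cuisine, location):
    """Monotonic ranking function from the slides: 2 * Cuisine + Location."""
    return 2 * cuisine + location


def threshold_top_k(relation, sorted_lists, k):
    """Return (top-k list of (score, RestId), number of rounds used)."""
    top = []        # min-heap of (score, RestId), never more than k entries
    in_top = set()  # ids currently held in the heap
    rounds = 0
    for depth in range(len(relation)):
        rounds += 1
        last_seen = []
        for table in sorted_lists:
            rest_id, value = table[depth]          # sorted access
            last_seen.append(value)
            if rest_id in in_top:
                continue
            total = score(*relation[rest_id])      # random access
            if len(top) < k:
                heapq.heappush(top, (total, rest_id))
                in_top.add(rest_id)
            elif total > top[0][0]:
                _, evicted = heapq.heapreplace(top, (total, rest_id))
                in_top.discard(evicted)
                in_top.add(rest_id)
        # Score of the hypothetical tuple made of the last values seen.
        threshold = score(*last_seen)
        if len(top) == k and top[0][0] >= threshold:
            break
    return sorted(top, key=lambda pair: (-pair[0], pair[1])), rounds


print(threshold_top_k(RELATION, [BY_CUISINE, BY_LOCATION], 1))  # ([(18, 4)], 2)
```

For K = 1 the threshold after the second round is 2 · 5 + 4 = 14, which RestId 4 (score 18) already meets, so this sketch stops one round earlier than FA did on the same data. Only K candidates are kept at any time, which is where the lower memory use comes from.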