Transcript Fnu Pooja

Project 2 Presentation & Demo
Course: Distributed Systems
By
Pooja Singhal
11/22/2011
1
Outline




Requirement
Requirement Analysis
Challenges
Design








Data Structures
3-Tiered Model
Algorithms
Implementation
Learning and Experience
Summary
Acknowledgement
Demo
2
Requirement




Design and Implement a 3-Tiered Client Server Model.
Given a City and State, find 5 nearest Airports.
Using RPC
System: Linux machine, Sun RPC, Language: C++/C
3
Requirement Analysis
1) Parsing of Places2k.txt – Fast and Efficient. Which DS?
2) Parsing of airport-locations.txt - Spatial Partitioning. Which DS?
3) 3-Tiered Model Requirement

4) Requirement of Algorithm
Client


Places Server

Airport Server
To search Nearest Neighbors
4
Challenges



3-Tiered Client Server Model.
Spatial Partition: Did not know much about it!
How to Search 5 nearest Airports ? Again No idea!
1) Parsing of Places2k.txt
2) Design and Implement Client
3) Design and Implement Places Server
Test the code So far
4) Design and Implement 3-Tiered Model
5) Design and implement Spatial Partitioning of data in airportlocations.txt
6) Design and implement N nearest Airports Search
5
Design
6
Design Tactics
Weekly Submissions made the job easy!

First week Design: Parse both the files, Data Structures

Second Week Design:
 IDL Design : Structure Design, Function Design
 Client Design and Logic:
 Places Server Design: Server for Client

Final Week Design
 3-Tiered Model Design
 Change of Data Structure for airport-locations.txt records
 Nearest Neighbor Search design
7
3-Tiered Model
CLIENT
PLACES SERVER
AIRPORT SERVER
8
Client Design

Client



Gets the location from User in the form of CITY and STATE.
Pass it to Places Server
Display the results
9
Places Server Design
Phase 1
 As a Server to Client:







On start up, parse places2k.txt in hash table
Hash Table key is combination of “CITY” and “STATE”
Latitude and Longitude are stored as DATA along with the key
Gets the inputs (CITY and STATE) from Client
Make the Key: Apply Hash Function
Search Hash Table
If Found: Get Latitude and Longitude
Phase 2:
 As a Client to Airport Server:



Pass on Latitude and Longitude to Airport Server
Get the result back from Airport Server
Return the result to Client.
10
Airport Server Design

Act as a Server to Places Server







On start up, parse airport-location.txt
Creates a K-D Tree in memory
Gets the latitude and location from places server.
Search Nearest Neighbor
Calculate the Distance
Sort the results on Distance
Return 5 neared neighbor back to places server
11
Design - Data Structures
12
Data Structures (1)

Hash Table





K-D Tree




Store places2k.txt records
Key is combination of CITY and STATE
Data: Latitude and Longitude
Advantages: Fast
Spatial Partitioning of 2 Dimensional Points consist of Latitude and Longitude
Node consists of 2 Dim points (Latitude, Longitude)
Node Data: Airport Code, Airport Name, State
Linked List


Store Results consisting of 5 nearest airports
Since, pointers do not get passed over RPC, needed to store the address of the
next record
13
Data Structures (2)

K-D Tree




Space Partitioning Data Structure for storing a finite set of points in a kdimensional space
Invented by J Luis Bentley in 1975
Is a Binary Tree: Special example of Binary Space Partitioning Trees
Applications in wide areas: Neural Networks, searching multidimensional data
1. X Plane Division
2. Y Plane Division
3. X Place Division
(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)
14
Source: http://pointclouds.org/documentation/tutorials/kdtree_search.php
Design - Algorithms
15
Algorithms
Nearest Neighbor Search





Starting with the root node, the algorithm moves down the tree recursively
Once the algorithm reaches a leaf node, it saves that node point as the "current
best“
The algorithm unwinds the recursion of the tree, performing the following steps at each node: If
the current node is closer than the current best, then it becomes the current best.
The algorithm checks whether there could be any points on the other side of the splitting plane
that are closer to the search point than the current best

done by intersecting the splitting hyperplane with a hypersphere around the search point that
has a radius equal to the current nearest distance.

Since the hyperplanes are all axis-aligned this is implemented as a simple comparison to see
whether the difference between the splitting coordinate of the search point and current node
is less than the distance (overall coordinates) from the search point to the current best.

If the hypersphere crosses the plane, there could be nearer points on the other side of the
plane, so the algorithm must move down the other branch of the tree from the current node
looking for closer points, following the same recursive process as the entire search.

If the hypersphere doesn't intersect the splitting plane, then the algorithm continues walking
up the tree, and the entire branch on the other side of that node is eliminated.
When the algorithm finishes this process for the root node, then the search is complete.
16
Source: http://en.wikipedia.org/wiki/File:KDTree-animation.gif
17
Source: http://en.wikipedia.org/wiki/File:KDTree-animation.gif
18
Source: http://en.wikipedia.org/wiki/File:KDTree-animation.gif
19
Source: http://en.wikipedia.org/wiki/File:KDTree-animation.gif
20
Source: http://en.wikipedia.org/wiki/File:KDTree-animation.gif
21
Source: http://en.wikipedia.org/wiki/File:KDTree-animation.gif
22
Implementation
23
Implementation
Weekly Submissions made the job easy!

First week Implementation:


Second Week Implementation: Make Client and Places Server work
Properly




Parsing of places2k.txt and airport-locations.txt, Storage in Hash Tables
IDL implementation: client.x
Implementation of Client Logic : client.c
Places Server Implementation: places_server.c , client_svc.c
Final Implementation



Phase 1 : 3-Tiered Model Implementation
Phase 2: Implementation of K-D Tree
Phase 3: Implementation of Nearest Neighbor Search
24
Implementation

Phase 1 : 3-Tiered Model Implementation




Phase 2 : K-D Tree Creation



2nd IDL : placesclient.x created
places_server.c modified to call airport server
Ist IDL client.x modified to include “host” input
Airport server creates K-D tree and stores airport records.
Libkdtree++ is used: kd_create(), kd_insert3()
Phase 3: Nearest Neighbor Search




kd_nearest_range(tree, point, radius)
Calculation of Distance of all the points inside the circle with the Point
Top 5 Nearest Airports were selected and returned
Change of Resultant Structure: Made as Linked List
25
Learning and Experience
GREAT LEARNING EXPERIENCE !!





3-Tiered Client - Server Model
Data Distribution on different Servers
Space Partitioning of multi dimensional data
Search in Multi Dimensional data – Practical Approach
Working with Hash Tables, K-D Trees, Linked List, Sort
26
Summary





Implemented 3-Tiered Client Server Model
Use of Hash Table to store places2k.txt
Use of K-D Tree to store airport-locations.txt
Use of Nearest Neighbor search algorithm
Use of Linked List to return Result containing nearest airports
27
Acknowledgements

Martin Krafft, Paul Harris, Sylvain Bougerel

Library: libkdtree- Open Source STL Like implementation of K-D Trees
28
Demo
29
30