Design and Implementation of HTTP-Gnutella Gateway
Baoning Wu (baw4)
Wei Zhang (wez5)
CSE Department
Lehigh University
Motivation
Peer-to-peer networking is a hot topic.
Can P2P nodes search and get files from Web
sites?
Can one P2P network search and get files from
other P2P networks?
In our project, we have built a special gateway
between Gnutella and Web sites.
Related Work
David McNab has launched a Freenet search engine.
Asiayeah is a Gnutella search engine.
Filedonkey.com is an eDonkey search engine.
Kalepa Networks, Inc. is working on connecting different P2P systems.
Our work is roughly the reverse of the above: it lets P2P nodes search and fetch files from Web sites.
Mechanism of Gnutella Searching
Node A sends a query to its neighbor B;
Node B broadcasts the query to its neighbors C and D;
Node C has the object node A needs and returns a query hit message to node B;
Node B forwards the query hit message back to node A by consulting its local state.
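The four steps above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, messages are plain function calls rather than network packets, and the "local state" is modeled as a table mapping each message ID to the neighbor the query arrived from.

```python
class Node:
    """A toy Gnutella servent: floods queries, routes hits back by message ID."""

    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.routes = {}  # message ID -> neighbor the query came from (local state)
        self.hits = []    # query hits received for queries this node originated

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

    def search(self, msg_id, keyword, ttl=7):
        self.routes[msg_id] = None  # None marks this node as the originator
        for neighbor in self.neighbors:
            neighbor.on_query(msg_id, keyword, ttl, sender=self)

    def on_query(self, msg_id, keyword, ttl, sender):
        if msg_id in self.routes:  # already seen: drop the duplicate
            return
        self.routes[msg_id] = sender
        matches = sorted(f for f in self.files if keyword in f)
        if matches:
            sender.on_query_hit(msg_id, self.name, matches)
        if ttl > 1:  # keep flooding while the TTL allows it
            for neighbor in self.neighbors:
                if neighbor is not sender:
                    neighbor.on_query(msg_id, keyword, ttl - 1, sender=self)

    def on_query_hit(self, msg_id, owner, matches):
        if msg_id not in self.routes:  # no matching query was seen: drop
            return
        back = self.routes[msg_id]
        if back is None:
            self.hits.append((owner, matches))
        else:
            back.on_query_hit(msg_id, owner, matches)  # reverse-path forwarding


a, b, c, d = Node("A"), Node("B"), Node("C", ["foo.mp3"]), Node("D")
a.connect(b)
b.connect(c)
b.connect(d)
a.search("q1", "foo")
print(a.hits)
```

The query hit never travels directly from C to A; B relays it using the entry it stored when the query passed through.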
Architecture of HTTP-Gnutella Gateway
Mechanism of the gateway
1. Node A broadcasts a query message that reaches the HTTP-Gnutella gateway directly or indirectly;
2. The gateway translates the query message and forwards it to the search engine;
3. The search engine returns a set of query results to the gateway;
4. The gateway translates the results into Gnutella format and forwards them to node A;
5. If node A initiates a download request to the gateway, the gateway translates the Gnutella request into a well-formatted HTTP request to the Web server;
6. The gateway fetches the data from the Web server;
7. The gateway forwards the data from the Web server to node A.
Handle Query Messages
We still use the original Gnutella mechanism to
decide whether to forward a message.
The gateway captures all queries with a hop count
below 5 and sends them to the search engine.
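A minimal sketch of the capture rule, assuming the 23-byte Gnutella 0.4 descriptor header (16-byte ID, 1-byte type, 1-byte TTL, 1-byte hops, 4-byte little-endian payload length). The function names and the `MAX_HOPS` constant are illustrative, not taken from the gateway's code.

```python
import struct

# Assumed Gnutella 0.4 header layout: ID, payload type, TTL, hops, payload length.
HEADER = struct.Struct("<16sBBBI")
QUERY = 0x80   # payload descriptor value for Query messages
MAX_HOPS = 5   # threshold stated on the slide


def parse_header(data):
    """Decode the fixed-size descriptor header at the start of a message."""
    msg_id, kind, ttl, hops, length = HEADER.unpack_from(data)
    return {"id": msg_id, "type": kind, "ttl": ttl, "hops": hops, "length": length}


def should_capture(header):
    """True when the gateway would hand this message to the search engine."""
    return header["type"] == QUERY and header["hops"] < MAX_HOPS
```

Capturing is independent of forwarding: a captured query would still be passed on or dropped by the normal Gnutella rules.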
Search Engine API
The Google search engine API allows at most
1,000 requests per day.
The search engine API consists of three main
functions:
Query conversion
Extraction of URLs
Measurement of content size
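The first two functions might look roughly like the sketch below. It makes several assumptions: the Gnutella search criteria arrive as a NUL-terminated byte string, the search engine returns an HTML page of links, and the helper names `convert_query` and `extract_urls` are hypothetical. It does not call any real search engine API, and content size measurement is left out because it needs a network round trip.

```python
from html.parser import HTMLParser
from urllib.parse import quote_plus


def convert_query(search_criteria: bytes) -> str:
    """Turn Gnutella search criteria into a URL-encoded keyword string."""
    text = search_criteria.split(b"\x00", 1)[0].decode("latin-1")
    return quote_plus(" ".join(text.split()))  # collapse runs of whitespace


class _LinkCollector(HTMLParser):
    """Collects absolute http(s) links from anchor tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.urls.append(value)


def extract_urls(result_page: str) -> list:
    """Return the absolute URLs found in a result page, in document order."""
    collector = _LinkCollector()
    collector.feed(result_page)
    collector.close()
    return collector.urls
```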
Generate Query Hit Messages
Two options:
Let Gnutella nodes contact Web servers directly
Let the gateway work as a proxy
The gateway puts its own IP address and a specific
port number (currently 9999) in the query hit
messages, so it acts as the proxy.
File names are URLs of Web objects.
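As an illustration, a query hit payload carrying the gateway's address and a URL as the file name could be assembled as follows. The field layout is an assumption based on the Gnutella 0.4 specification (hit count, port, IP address, speed, result set, 16-byte servent identifier) and should be checked against the specification; the function name and parameters are hypothetical.

```python
import socket
import struct

GATEWAY_PORT = 9999  # port number mentioned on the slide


def build_query_hit_payload(gateway_ip, results, servent_id,
                            speed_kbps=0, port=GATEWAY_PORT):
    """Build a query hit payload; results is a list of (index, size, url)."""
    if len(results) > 255:
        raise ValueError("a query hit carries at most 255 results")
    if len(servent_id) != 16:
        raise ValueError("servent identifier must be 16 bytes")
    parts = [
        struct.pack("<BH", len(results), port),  # hit count, little-endian port
        socket.inet_aton(gateway_ip),            # IPv4 address, network byte order
        struct.pack("<I", speed_kbps),
    ]
    for index, size, url in results:
        parts.append(struct.pack("<II", index, size))
        parts.append(url.encode("ascii") + b"\x00\x00")  # double-NUL terminated name
    parts.append(servent_id)
    return b"".join(parts)
```

Because the advertised address is the gateway's, a Gnutella node that picks this result connects to the gateway, not to the Web server.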
Downloading Service
Translate a Gnutella download request into a well-formatted HTTP request, e.g.
GET /get/1234/http://www.foo.com/foo.mp3 HTTP/1.1
User-Agent: Gnutella
Host: 123.123.123.123:6346
=>
GET http://www.foo.com/foo.mp3 HTTP/1.1
User-Agent: Gnutella
Host: www.foo.com
The gateway must handle Gnutella handshakes properly.
It also records the number of bytes transferred.
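The translation shown above can be sketched as a small function. This is an illustrative version, not the gateway's actual code: the name `translate_request` is hypothetical, and it assumes the URL contains no spaces and follows the `/get/<index>/` prefix exactly as in the example.

```python
import re
from urllib.parse import urlsplit

# Request line of a Gnutella download: GET /get/<file index>/<file name> HTTP/x.y
_REQUEST_LINE = re.compile(r"^GET /get/(\d+)/(\S+) (HTTP/\d\.\d)$")


def translate_request(raw: str) -> str:
    """Rewrite a Gnutella download request into an HTTP request for the origin."""
    lines = raw.replace("\r\n", "\n").split("\n")
    match = _REQUEST_LINE.match(lines[0])
    if match is None:
        raise ValueError("not a Gnutella download request")
    _index, url, version = match.groups()
    host = urlsplit(url).netloc
    if not host:
        raise ValueError("file name is not an absolute URL")
    out = [f"GET {url} {version}"]
    for line in lines[1:]:
        if not line:  # blank line ends the header section
            break
        if line.lower().startswith("host:"):
            continue  # the gateway's own address is replaced below
        out.append(line)
    out.append(f"Host: {host}")
    return "\r\n".join(out) + "\r\n\r\n"
```

All headers other than `Host` are passed through unchanged, which matches the `User-Agent` line in the example.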
Problems & Solutions
Irregular handshakes: we handle all possibilities
File size: we use an HTTP HEAD request to get the file size
Broken Pipe signal: we use a forked process
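One way to obtain a file size with a HEAD request is sketched below. The helper names are hypothetical, and the sketch assumes the Web server reports a `Content-Length` header; when it does not, the size stays unknown.

```python
import urllib.request
from typing import Mapping, Optional


def size_from_headers(headers: Mapping[str, str]) -> Optional[int]:
    """Return the Content-Length value, or None if it is absent or malformed."""
    for name, value in headers.items():
        if name.lower() == "content-length" and value.strip().isdigit():
            return int(value.strip())
    return None


def head_size(url: str, timeout: float = 10.0) -> Optional[int]:
    """Issue a HEAD request and read the size without downloading the body."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return size_from_headers(dict(response.headers.items()))
```

The size obtained this way can go into the file size field of the query hit before any data is transferred.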
Experiment Results
Outline
Basic verification and validation
Log file format
Results #1 to #4
Basic Verification & Validation
Run our special gateway on machine 1 and a
normal gtk-gnutella client on machine 2. After
machine 2 connects to machine 1, we use machine
2 to send query messages and download
requests to machine 1.
For files downloaded through machine 1, we use
wget to get the same file from the Web server directly
and use diff to check that they are identical.
Log File Format
Log 1
Time stamp, MUID, IP address, Type, Query
Log 2
Time stamp, IP address, URL, Size, Code, Success
Result #1
No. of Query messages: 319,245
No. of Query Hit messages: 930,860
No. of served requests: 113,391
Average Response Time: 16.33 seconds
Result #2
[Figure: number of requests (y-axis, 0 to 100,000) versus number of responses (x-axis, 1 to 9)]
Result #3
No. of download requests: 952
No. of different IP addresses: 67
No. of served requests: 945
No. of successfully served requests: 740
Total size transferred: 244,227,881 bytes
Average response time: 3.15 seconds
Average total download time: 15.92 seconds
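A few ratios follow directly from these counts. The short calculation below uses only the figures listed above; the mean size per successful download additionally assumes that all transferred bytes belong to successfully served requests, which the slides do not state.

```python
download_requests = 952
served = 945
successful = 740
total_bytes = 244_227_881

served_ratio = served / download_requests    # share of requests that were served
success_ratio = successful / served          # share of served requests that succeeded
total_mb = total_bytes / 1_000_000           # decimal megabytes
# Assumption: failed transfers contributed no bytes to the total.
mean_bytes_per_success = total_bytes / successful

print(f"served: {served_ratio:.1%}, successful: {success_ratio:.1%}")
print(f"total: {total_mb:.1f} MB, mean per success: {mean_bytes_per_success:,.0f} bytes")
```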
Result #4
[Figure: number of downloaded files (y-axis, 0 to 120) for different sites (x-axis, 1 to 20)]
Future Work
Support a variety of file types and measure their
popularity
Build a gateway to connect different P2P systems
Deployment of such gateways
Conclusion
We built an HTTP-Gnutella gateway, and it worked
for Gnutella users.
In only 5 days, the gateway transferred about
244 MB of data from Web sites to Gnutella
nodes.
The system achieved all of our design goals.
Questions?