Peer-to-Peer Algorithms and Systems
Download
Report
Transcript Peer-to-Peer Algorithms and Systems
Peer-to-Peer Algorithms and System
CS600 Assignment #5
Nov. 22. 2006
Byungchul Park
[email protected]
DPNM Lab., Dept. of CSE, POSTECH
- 1/25 -
DP&NM Lab.
CSE, POSTECH
Presentation Outline
History of P2P
Motivation
Definition of P2P
Most popular P2P application - Skype
Overlay Network
– Unstructured P2P
– Structured P2P
Comparison of P2P models
Application Services & technology
Open Issues
Conclusion
- 2/25 -
DP&NM Lab.
CSE, POSTECH
History of P2P
Origin of Internet
– ARPANET was a P2P system (not client/server model)
– FTP : killer application of early Internet
– Usenet and DNS
Revolution of network model – Napster
– Developed by Shawn Fanning
– MP3 file sharing
– Centralized P2P concept
Obstacles of P2P
– Firewall
– Dynamic IP
– NAT
Success of Skype
- 3/25 -
DP&NM Lab.
CSE, POSTECH
Motivation
Requirements of Internet application
– Scalability
– Security & Reliability
– Flexibility & QoS
• Client/Server model can not satisfies the requirements
=>DOS attack, Bottleneck
P2P Network
– Self-organizing
– More flexible than Client/Server model
- 4/25 -
DP&NM Lab.
CSE, POSTECH
Definition of P2P [1]
Significant autonomy from central servers
Exploits resources at the edges of the Internet
– Storage and contents
– Computing power
– Connectivity and presence
Resources at edge have intermittent connectivity, being
added & removed
- 5/25 -
DP&NM Lab.
CSE, POSTECH
Skype [12]
One of the most popular VoIP
application
Skype provide VoIP and IM services
that works behind NAT and firewalls
50 million users
Valued at $2.6 billon
Encrypted protocol & proprietary
network
How P2P VoIP traffic in Skype
differs from other P2P traffic and
from traffic in traditional voicecommunication?
Resource consumption?
- 6/25 -
DP&NM Lab.
CSE, POSTECH
Skype Overview
Skype’s three services
–
–
–
–
VoIP : two-way audio stream, conferences of up to 4user
IM (instant messaging) : small text message
File-transfer
Paid service : initiate & receive calls via regular telephone
number through VoIP-PSTN gateway
Related to KaZaA – Supernode-based P2P network
Inherent Structure
Supernode : plenty of
spare bandwidth &
publicly reachable
- 7/25 -
DP&NM Lab.
CSE, POSTECH
Overlay Network
Figure 1. from [1]
- 8/25 -
DP&NM Lab.
CSE, POSTECH
Overlay Network (cont)
Virtual edge
– TCP connection or a pointer to an IP address
Maintenance of Overlay networks (typical)
–
–
–
–
Periodically ping to make sure neighbor is still alive
verify liveness while messaging
If neighbor goes down, may establish new edge
New node needs to bootstrap
- 9/25 -
DP&NM Lab.
CSE, POSTECH
Overlay Network (cont)
Design flexibility
– Topology
– Protocol
– Messaging over TCP, UDP,
ICMP
Underlying physical net is
transparent to developer
- 10/25 -
DP&NM Lab.
CSE, POSTECH
Unstructured P2P [2]
Unstructured P2P
– No relations between peer and contents
– Searching : query flooding to central server or neighbor peer
Centralized Server model
– Central server keeps peer information
– Scalability problem and copyright problem
Pure P2P
– No central server (or directory)
– Searching : query flooding to neighbor peer
– excessive traffic generation
Hybrid P2P
– Super node
– Hierarchical architecture
- 11/25 -
DP&NM Lab.
CSE, POSTECH
Unstructured P2P Applications
Napster (Centralized model) [6]
Application-level, clientserver point-to-point TCP
Steps:
–
–
–
–
connect to Napster
upload list of files
give server search query
receive “best” of correct
answers (ping)
- 12/25 -
DP&NM Lab.
CSE, POSTECH
Unstructured P2P Applications (cont)
Gnutella [7,8]
Focus: decentralized method of searching for files
– Central directory server no longer problem (bottleneck)
– Harder control of the system
Each application instances serves to:
– store/serve files
– route queries to its neighbors
– respond to request queries
Limited scope query
– if you don’t have the file you want, query 7 of your neighbors.
– if they don’t have it, they contact 7 of their neighbors, for a
maximum hop count of 10.
– reverse path forwarding for responses (not files)
- 13/25 -
DP&NM Lab.
CSE, POSTECH
Unstructured P2P Applications (cont)
Overlay management of Pure P2P
New node uses bootstrap node to get IP addresses of
existing Gnutella nodes
New node establishes neighboring relations by sending
join messages
- 14/25 -
DP&NM Lab.
CSE, POSTECH
Unstructured P2P Applications (cont)
KaZaA & Skype [3]
Every peer is either a
supernode or is assigned to
a supernode
Each supernode knows
where the other
supernodes are (Mesh
structure)
Cross between Napster
and Gnutella
- 15/25 -
DP&NM Lab.
CSE, POSTECH
Structured P2P
Structured P2P
– Contents-peer mapping
– Content-addressing tech.
– Searching complexity : O(logN) -> scalability
What is DHT?
– Distributed Hash Table
– files are associated with a key (produced, by hashing the file
name) and each node in the system is responsible for storing a
certain range of keys [3]
Example
– Hash (“The Devil Wears Prada”) = 8045
– Pastry, Tapestry, Chord, CAN, Viceroy
- 16/25 -
DP&NM Lab.
CSE, POSTECH
Comparison of P2P models [5]
- 17/25 -
DP&NM Lab.
CSE, POSTECH
Application Service & Tech.
1.
2.
3.
4.
Information Sharing
File Sharing
Bandwidth efficiency
Computing power sharing
- 18/25 -
DP&NM Lab.
CSE, POSTECH
Information Sharing
Presence information
– Basic operation of P2P application
– e.g. buddy list of Skype or MSN
Document Management
– Most documents are stored with personal PCs
– P2P networking provides the repository that can coordinates
distributed local data
– e.g. NextPage-NXT 4 [9]
Collaboration
– P2P networking provides collaboration service without central
management system
– e.g. Groove Virtual Office [10] (ad-hoc collaboration system)
• Instance messaging, File sharing
• Real-time DB synchronization, etc
- 19/25 -
DP&NM Lab.
CSE, POSTECH
File Sharing
Representative P2P application service
Over 70% of Internet traffic generated by P2P file sharing
File searching mode
– Flooding model (e.g. Gnutella)
– Central directory model (e.g. Napster)
– Document routing model (e.g. Freenet)
Document routing model
– A file saved on anonymous peer
– An identifier assigned to each file and peer
– A file saved on a peer that has similar identifier
- 20/25 -
DP&NM Lab.
CSE, POSTECH
Bandwidth Efficiency
Client/Server model involved with bottleneck problem and
queuing delay
Load balancing
– Multimedia streaming service and VoD service
– A contents copied to numerous peers
– e.g. PeerCast, Peer-to-Peer-Radio
Improving download speed
– If file is found in multiple nodes, user can select parallel
downloading
– Most likely HTTP byte-range header used to request different
portions of the file from different nodes
– Automatic recovery when server peer stops sending file
- 21/25 -
DP&NM Lab.
CSE, POSTECH
Computing Power Sharing
seti@home [11]
–
–
–
–
–
Search for ET intelligence
Central site collects radio telescope data
Data is divided into work chunks of 300Kbytes
User obtains client, which runs in background
Peer sets up TCP connection to central computer, downloads
chunk
– Peer does FFT on chunk, uploads results, gets new chunk
» Not peer to peer, but exploits
» resources at network edge
- 22/25 -
DP&NM Lab.
CSE, POSTECH
Open Issues
Copyright issues
Avoid Firewall or detection
– Dynamic port allocation
– NAT traversal
- 23/25 -
DP&NM Lab.
CSE, POSTECH
Conclusion
P2P networking is a promising paradigm for services
operating at the edges of the network [4]
Significant autonomy from central servers
Scalability, Security & Reliability, Flexibility & QoS
P2P networks still have a long way to go
- 24/25 -
DP&NM Lab.
CSE, POSTECH
References
[1] K.W. Ross and Dan Rubenstein, "Tutorial on P2P Systems " presented at Infocom
2003, http://cis.poly.edu/~ross/p2pTheory/P2Preading.htm.
[2] Peer-to-peer information systems: concepts and models, state-of-the-art, and future
systems, Karl Aberer, Manfred Hauswirth Tutorial at the 18th International Conference
on Data Engineering, February 26-March 1, 2002, San Jose, California.
[3]Nathaniel Leibowitz, Matei Ripeanu, and Adam Wierzbicki, "Deconstructing the
Kazaa Network", 3rd IEEE Workshop on Internet Applications (WIAPP'03), June 2003.
[4] Management of Peer-to-Peer Networks, K. Tutschku., http://www3.informatik.uniuerzburg.de/ITG/2001/vortrag/vortrag13.pdf.
[5] 전자통신동향분석 제 21권 제 5호, ETRI, 2006년 10월
[6] Napster, “Napster Messages,” http://opennap.sourceforge.net/napster.txt, 2000. M.
Ripeanu, I. Foster, and A. Iamnitchi, “Mapping
[7]the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and
Implications for System Design,” IEEE Internet Computing Journal, Vol.6, No.1, 2002.
[8] M. Ripeanu, “Peer-to-Peer Architecture Case Study: Gnutella Network,” In Proc. of
IEEE 1st Int’l Conf. on Peer-to-Peer Computing, 2001.
[9] Next Page Inc, http://www.nextpage.com/pdfs/collateral/wp/nxt4 build
wp090903lo.pdf, 2004.
[10] Groove Networks, http://www.groove.net, 2004.
[11] D. Anderson, SETI@home, chapter 5, O’Reilly, Sebastopol, 2001, pp.67–76.
[12] Saikat Guha, Neil Daswani, Ravi Jain, “An Experimental Study of the Skype Peerto-Peer VoIP System”, IPTPS 2006.
- 25/25 -
DP&NM Lab.
CSE, POSTECH