Presentation - NORDUnet Networking Conferences

Peer-to-peer Networks :
promise and trouble.
Bart Dhoedt
Ghent University - Faculty of Applied Sciences
Department of Information Technology (INTEC)
e-mail : [email protected]
phone : ++32 9 264 99 66
Presentation at NORDUnet Network Conference
August 24-27, Reykjavik, 2003
UNIVERSITEIT
GENT
Tuesday, August 27, 2003.
1
OUTLINE
1. Introduction
2. Taxonomy of P2P-systems
3. Issues in P2P-systems
4. P2P-trends
5. Concluding remarks
2
Defining P2P
• about sharing
content
disk space
bandwidth
Software resources
computer cycles
liability
• symmetric (architectural view)
• creating an application-level overlay network
• decentralized
• application critical infrastructure owned by many
4
Sharing resources ?
- estimate of edge resources
total number of Internet hosts : 150 M
average disk capacity : 10 GB
average available memory : 128 MB
average processing power : 1 GFLOPS
average BW : 100 Kb/s
- available for P2P-network
1% hosts
50% processing power
50% memory
10% disk space
25% network bandwidth
- resulting P2P resource pool
1.5 M processors
disk storage : 1.5 PB
processing power : 1.5 PFLOPS
BW/link : 25 Kb/s
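The estimate above can be re-derived with a few multiplications. The sketch below is only a back-of-the-envelope check using the slide's inputs; the constant and function names are illustrative. Two observations, stated with caution: applying the 50% processing share gives 0.75 PFLOPS, whereas the quoted 1.5 PFLOPS corresponds to counting the full 1 GFLOPS of every participating host; and the memory total comes out at 96 TB in decimal units, which is roughly 92 TB when counted in binary units.

```python
# Back-of-the-envelope check of the edge-resource estimate (inputs from the slide).
HOSTS = 150e6                 # total number of Internet hosts
DISK_PER_HOST_GB = 10
MEM_PER_HOST_MB = 128
FLOPS_PER_HOST = 1e9          # 1 GFLOPS
BW_PER_HOST_KBPS = 100

# Shares assumed to be available for the P2P network (from the slide).
SHARE_HOSTS, SHARE_CPU, SHARE_MEM, SHARE_DISK, SHARE_BW = 0.01, 0.5, 0.5, 0.1, 0.25


def pooled_resources():
    """Return the aggregate resources of the participating hosts."""
    peers = HOSTS * SHARE_HOSTS
    return {
        "peers": peers,
        "disk_PB": peers * DISK_PER_HOST_GB * SHARE_DISK / 1e6,       # GB -> PB
        "memory_TB": peers * MEM_PER_HOST_MB * SHARE_MEM / 1e6,       # MB -> TB (decimal)
        "pflops_with_cpu_share": peers * FLOPS_PER_HOST * SHARE_CPU / 1e15,
        "pflops_full_hosts": peers * FLOPS_PER_HOST / 1e15,
        "bw_per_link_kbps": BW_PER_HOST_KBPS * SHARE_BW,
    }


if __name__ == "__main__":
    for name, value in pooled_resources().items():
        print(f"{name}: {value:g}")
```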
5
Sharing resources ?
• What about supercomputers ?

                     IBM ASCI White     P2P-supercomputer
processing power     12.3 TFLOPS        1.5 PFLOPS (> x 10 !)
processors           8192               1.5 M
processing nodes     512 RS/6000        -
memory storage       6.2 TB             92 TB
disk storage         160 TB             1.5 PB
cost                 110 M$             ? M$
weight               106 tons           ? tons
6
P2P @ edge ?
• How to unleash the power of the
“Internet’s dark matter ?”
7
P2P popularity
2003 summer download hit parade

Rank  Title                          P2P   [Last week]   [Total]
1.    Kazaa Media Desktop            P2P   2 644 777     261 405 295
2.    ICQ Lite                       P2P   588 141       25 423 064
3.    AOL Instant Messenger (AIM)    P2P   532 897       17 521 190
4.    iMesh                          P2P   392 703       55 145 269
5.    WinZip                               351 865       100 741 790
6.    ICQ Pro 2003a beta             P2P   332 624       233 204 712
7.    Spybot – Search & Destroy            232 993       2 764 380
8.    Ad-aware                             224 720       19 078 555
9.    Morpheus                       P2P   179 347       114 140 262
10.   DownloadAccelerator Plus             119 601       36 355 895
[www.download.com]
8
P2P popularity
Napster : the early days …
[Figure: Internet Applications Adoption Rate. Users in millions (0 to 70) versus month (1 to 23) for Hotmail, ICQ and Napster]
Gnutella network : up to 400 000 nodes operating world wide
9
Architectural view
Mediated P2P
Napster
Audiogalaxy
Pure P2P
Early Gnutella
FreeNet
Hybrid P2P
Gnutella
FastTrack
Kazaa
10
P2P-architectures
mediated P2P
data traffic : P2P
control traffic : client-server
efficiency : + efficient search, + efficient control
scalability : - control hot spot (mirrors needed ?)
robustness : - single point of failure, - easy to attack
accountability : easy

pure P2P
data traffic : P2P
control traffic : P2P
efficiency : - inefficient search, - BW consuming
scalability : - BW needed grows rapidly
robustness : + graceful degradation, + difficult to attack
accountability : difficult

hybrid P2P
data traffic : P2P
control traffic : local : client-server, long distance : P2P
efficiency, scalability : +/- good compromise
robustness : ?
accountability : difficult
11
P2P taxonomy
content sharing
distributed computing
instant messaging
collaborative working
mediated
pure
hybrid
13
File Sharing performance
1.6 M downloads/day
150 M searches/day
10 TB data transfer/day
1-2 TB data transfer/day
100 servers
15000 servers
14
Distributed computing performance
SETI = “Search for Extraterrestrial Intelligence”
• started in 1998 as a 2 year project (but still running)
• 4 M users signed up so far
• Radio telescope data sent to clients for digital signal analysis
• Nodes process data when cycles are available
(works as screen saver)
• Using resources to allow better signal analysis
35 GB/tape
16 hours recorded data
10 tapes/week, 350 GB
10 000 0.3 MB work units
15
Distributed computing performance
computations per work unit : 3.1 x 10^12 FP-operations
work unit throughput : 700 000/day
total : 22 x 10^17 FLOP/day, i.e. > 25 TFLOPS

              SETI@home     ASCI White@DoE
Processing    25 TFLOPS     12.3 TFLOPS
Cost          1 M USD       110 M USD
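The throughput figure follows directly from the two inputs on this slide. The short sketch below redoes the arithmetic and adds a cost-per-TFLOPS comparison derived from the quoted costs; the function names are illustrative and not part of the presentation.

```python
# Re-deriving the SETI@home throughput from the slide's inputs.
FLOP_PER_WORK_UNIT = 3.1e12      # FP-operations per work unit
WORK_UNITS_PER_DAY = 700_000
SECONDS_PER_DAY = 24 * 3600


def sustained_tflops(flop_per_unit, units_per_day):
    """Average processing rate in TFLOPS for a given daily work-unit throughput."""
    return flop_per_unit * units_per_day / SECONDS_PER_DAY / 1e12


def cost_per_tflops(cost_musd, tflops):
    """Cost in M USD per TFLOPS."""
    return cost_musd / tflops


if __name__ == "__main__":
    seti = sustained_tflops(FLOP_PER_WORK_UNIT, WORK_UNITS_PER_DAY)
    print(f"FLOP/day: {FLOP_PER_WORK_UNIT * WORK_UNITS_PER_DAY:.2e}")
    print(f"SETI@home: {seti:.1f} TFLOPS")
    print(f"SETI@home cost: {cost_per_tflops(1, 25):.2f} M USD/TFLOPS")
    print(f"ASCI White cost: {cost_per_tflops(110, 12.3):.2f} M USD/TFLOPS")
```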
16
Scaling problems
Mechanisms in GNUTELLA to limit traffic
• Network horizon set by TTL
• Descriptor ID’s avoid cyclic routing
• PONG/QueryHIT/Push NOT flooded
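The first two mechanisms can be illustrated with a small simulation. This is a minimal sketch, not Gnutella's wire protocol: the overlay is a hypothetical five-node graph, the TTL is treated as the number of hops a descriptor may still travel, and a single `seen` set stands in for the per-node tables of descriptor IDs that real servents keep.

```python
from collections import deque

# Hypothetical overlay: node -> list of directly connected neighbours.
OVERLAY = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}


def flood(overlay, origin, ttl):
    """Flood one descriptor from origin; return (nodes reached, messages sent)."""
    seen = {origin}                       # nodes that already handled this descriptor ID
    queue = deque([(origin, None, ttl)])
    messages = 0
    while queue:
        node, came_from, remaining = queue.popleft()
        if remaining == 0:
            continue                      # TTL expired: the network horizon is reached
        for neighbour in overlay[node]:
            if neighbour == came_from:
                continue                  # never send back over the incoming link
            messages += 1
            if neighbour in seen:
                continue                  # known descriptor ID: dropped, so no cyclic routing
            seen.add(neighbour)
            queue.append((neighbour, node, remaining - 1))
    return seen, messages


if __name__ == "__main__":
    for horizon in (1, 2, 3):
        reached, sent = flood(OVERLAY, "a", horizon)
        print(horizon, sorted(reached), sent)
```

Even in this tiny example several messages are duplicates that get dropped on arrival, which hints at why flooding consumes bandwidth despite these safeguards.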
“1 Gnutella request would cause
90MB data traffic on
Napster scale network”
BUT ...
[Figure: bandwidth (KB/PING, 0 to 10000) versus horizon (0 to 8)]
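The steep rise of bandwidth with horizon can be made plausible with a simple counting argument. The sketch below is an idealised upper bound, assuming every node keeps the same number of connections and no two paths ever meet; the connection count of 4 is a hypothetical value, not a figure from the presentation.

```python
def flooded_messages(connections, horizon):
    """Upper bound on messages generated by one flooded descriptor.

    The origin sends to `connections` neighbours; every later hop forwards
    to `connections - 1` further neighbours (all links except the incoming one).
    """
    return sum(
        connections * (connections - 1) ** (hop - 1)
        for hop in range(1, horizon + 1)
    )


if __name__ == "__main__":
    for horizon in range(1, 9):
        print(horizon, flooded_messages(4, horizon))
```

Under these assumptions the message count roughly triples with every extra hop, so trimming the horizon by even one hop removes most of the traffic.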
17
Scaling answers
1. Reduce network horizon to reduce f
2. Use of reflectors
= node with high BW available
- mimics peer sharing all files of its “clients”
- reflector (high BW access) handles all PING/PONG and QUERY/QUERYHIT traffic
- clients (low access BW) handle ONLY download traffic
3. Use of UltraPeers
= same principle as reflector, but chosen dynamically
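The reflector idea amounts to a node that keeps an index of its clients' files and answers queries for them. The sketch below is a hypothetical illustration of that principle only; the class, its methods and the substring matching are assumptions made for this example and do not describe any real Gnutella implementation.

```python
class Reflector:
    """High-bandwidth node that answers queries on behalf of its clients."""

    def __init__(self):
        self.index = {}                   # filename -> set of client addresses

    def register(self, client, filenames):
        """A low-bandwidth client announces the files it shares."""
        for name in filenames:
            self.index.setdefault(name, set()).add(client)

    def unregister(self, client):
        """Forget a client that left the network."""
        for name in list(self.index):
            self.index[name].discard(client)
            if not self.index[name]:
                del self.index[name]

    def query(self, keyword):
        """Return (filename, client) hits; the download itself stays peer-to-peer."""
        keyword = keyword.lower()
        return sorted(
            (name, client)
            for name, holders in self.index.items()
            if keyword in name.lower()
            for client in holders
        )


if __name__ == "__main__":
    reflector = Reflector()
    reflector.register("10.0.0.1", ["song.mp3", "talk.pdf"])
    reflector.register("10.0.0.2", ["song.mp3"])
    print(reflector.query("song"))
```

Clients never see the query traffic in this model; they are contacted only when another peer starts a download.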
18
Robustness
• self-organization leads to power-law networks
(1% of servents shows server-like behaviour …)
• very robust to random node failure
• more vulnerable to targeted attacks
Simulation result for
FreeNet peers
[T. Hong, “Performance”, Chapter 14 in “Peer-to-peer : Harnessing the
Benefits of a Disruptive Technology”, ISBN 0-596-00110-X, O’Reilly,
March 2001.]
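The contrast between random failures and targeted attacks can be reproduced on a toy network. The sketch below does not reproduce the cited FreeNet simulation: it grows a hypothetical 400-node graph by preferential attachment (a common way to obtain a power-law-like degree distribution), removes 10% of the nodes either at random or by highest degree, and compares the size of the largest connected component that remains. All parameters are assumptions chosen for illustration.

```python
import random


def preferential_attachment(n, m, rng):
    """Grow a graph in which each new node links to m nodes picked by degree."""
    adj = {i: set() for i in range(n)}
    endpoints = []                        # every node appears once per link end
    for i in range(m + 1):                # small fully connected core
        for j in range(i):
            adj[i].add(j)
            adj[j].add(i)
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for target in targets:
            adj[new].add(target)
            adj[target].add(new)
            endpoints += [new, target]
    return adj


def largest_component(adj, removed):
    """Size of the largest connected component once `removed` nodes are gone."""
    unvisited = set(adj) - set(removed)
    best = 0
    while unvisited:
        stack = [unvisited.pop()]
        size = 1
        while stack:
            node = stack.pop()
            for neighbour in adj[node]:
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    stack.append(neighbour)
                    size += 1
        best = max(best, size)
    return best


def compare(n=400, m=2, fraction=0.1, seed=1):
    """Return (largest component after random failures, after targeted attack)."""
    rng = random.Random(seed)
    adj = preferential_attachment(n, m, rng)
    k = int(n * fraction)
    random_failures = rng.sample(range(n), k)
    hubs = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k]
    return largest_component(adj, random_failures), largest_component(adj, hubs)


if __name__ == "__main__":
    after_random, after_attack = compare()
    print("random failures :", after_random)
    print("targeted attack :", after_attack)
```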
19
Free-riding on Gnutella
- only 30 % of nodes offering content
- 50% of queries satisfied by 1% of servents
[Figure: network size since Jan 2002, www.limewire.com]
20
Overlay mismatch
Mismatch between application layer
network and physical network
based on network traffic analysis
• 40% Gnutella clients belong to top 10% AS
• only 2-5% links within AS
based on domain names
Gnutella’s clustering logic shows no/little
correlation with domain name based clustering
[M. Ripeanu, A. Iamnitchi, I. Foster, “Mapping the Gnutella Network”,
IEEE Internet Computing, January-February 2002.]
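A metric such as "links within AS" can be computed from a list of overlay links and a mapping of peers to autonomous systems. The sketch below shows one way to do so; the peers, AS numbers and links are invented for illustration and are not data from the cited study.

```python
def intra_as_fraction(links, as_of):
    """Fraction of overlay links whose two endpoints lie in the same AS."""
    if not links:
        return 0.0
    same = sum(1 for a, b in links if as_of[a] == as_of[b])
    return same / len(links)


# Hypothetical example data: peer -> AS number, and overlay links between peers.
AS_OF = {"p1": 64500, "p2": 64500, "p3": 64501, "p4": 64502}
LINKS = [("p1", "p2"), ("p1", "p3"), ("p2", "p4"), ("p3", "p4")]

if __name__ == "__main__":
    print(intra_as_fraction(LINKS, AS_OF))
```

A low value means that most overlay hops cross AS boundaries, even when a nearer copy of the content may exist.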
21
Business Models
How to monetise P2P ?
• authors agree on “P2P business models are unclear”
• reality : few companies make money on P2P
• current situation : File sharing application sponsored
by advertisement (banners)
• some other possibilities
• micropayment mechanisms
• indirect mechanisms
(P2P will increase BW-need and hence …)
• tip based strategy (cf. US-model …)
• make “low”-quality content available to get people interested
in specific content
• make use of end users' devices to reduce cost !
22
Problems/issues/barriers/challenges
Problems : node/link transient nature, robustness
Solutions :
- File-sharing : content redundancy
- Cycle-sharing : checkpointing

Problems : scalability, bandwidth consumption
Solutions :
- Hybrid approach
- Avoid floodings (e.g. FreeNet : intelligent routing)
- Content/Query caching
- TTL
- Avoid routing cycles

Problems : Network discontinuities (firewalls, (dynamic) NAT)
Solutions :
- (Ab)use of port 80
- Rendez-vous servers
23
Problems/issues/barriers/challenges
Problems : Privacy/trust, Anonymity
Solutions : Encryption techniques (e.g. FreeNet : plausible deniability for node operators)

Problems : application redesign
Solutions : P2P-frameworks

Problems : free-riding, accountability
Solutions : micro-payment

Problems : asymmetric bandwidth in access (ADSL, HFC)
Solutions : combine uplink capacity (e-donkey)

Problems : inefficient overlay
Solutions : Network/infrastructure aware routing

Problems : business models ?
Solutions : ???
24
P2P-trends
• emergence of platforms
• convergence between Grid-computing and P2P-technology
• enhance P2P-performance
• semantic searches
(Tapestry, Content Addressable Networks …)
• Query/result caching
25
Platform emergence
Application areas : File sharing, Distributed computing, Instant Messaging, Collaboration

Dedicated Application Programs and Protocols
(Freenet, eDonkey, Gnutella, SETI@home, ?)
• for 1 application area
• non-generic
• 1 application class
• 1 specific problem
• network interoperability ?

Platforms, Frameworks
(Groove, ?)
• offer generic services
• support the P2P paradigm
• used to build P2P applications
26
JXTA
• developed by Sun Microsystems
• set of 6 XML based open protocols
• Java API offered
JXTA Applications : e-mail, auctioning, data storage, file sharing
- JXTA Community Applications
- Sun JXTA Applications
- JXTA Shell, Peer Commands
JXTA Services : indexing, searching
- JXTA Community Services
- Sun JXTA Services
JXTA Core : peer establishment, communication management, routing
- Peer Groups
- Peer Pipes
- Peer Monitoring
- Security
[http://www.jxta.org]
27
BOINC
• Berkeley Open Infrastructure for Network Computing
• allows participants to contribute to solving selected problems
• = “generic SETI@Home”
[http://boinc.berkeley.edu]
28
Conclusions
For network operators
P2P applications can be very BW-consuming
• extremely popular (and addictive)
• use of inefficient strategies (broadcast, flooding, …)
• “tragedy of the commons”
Danger of bottlenecks
• overlay network has little relation to physical infrastructure
• symmetric relations between peers
Change in user behaviour
• “always” online
• information provider AND information consumer
29
Conclusions
For application developers
People are (extremely) interested in digital content
People are willing to share resources for free
(and even want to spend money …)
• make people feel they participate in a large project
• give some credit to users (competition)
(top 10 list, eternal fame if solution is found, …)
To avoid digging one's own grave
• avoid BW-consuming strategies
• include micropayment/trust mechanisms as
- encouragement to participate
- avoid free-riding
- avoid DoS attacks
30
Conclusions
For application developers
Hacker danger
• need for encryption mechanisms
High performance P2P-platforms are emergent
• reuse of efforts
• reuse of user community
Make sure your application has some scaling effect
• the more users, the more interesting to join !
31