
Internet Resources Discovery (IRD)
Peer-to-Peer (P2P)
Technology (1)
Thanks to Carmit Valit and Olga Gamayunov
A. Frank
Content
• Computer Networks
– Client-Server Networks
– Peer-to-Peer (P2P) Networks
• Centralized Server
• Distributed Service
• P2P vs. SEs
• P2P infrastructure
• Some leading P2P Websites
• Research Issues for future systems
Computer Networks (1)
Computer networks enable users to:
• Communicate.
• Share files electronically.
• Have an electronic mail system.
• Have a networked storage area for backing up
critical information.
• Share expensive equipment such as laser
printers and CD-ROM drives.
Computer Networks (2)
Computer networks come in two flavors:
1. Client-Server Networks
2. Peer-to-Peer Networking (P2P)
– Centralized Server
– Distributed Service
Client Server
Client-Server Networks (1)
• A Client-Server network is a
communication model which:
– Has a central, dedicated computer, called a
server.
– Has a number of PCs, known as clients,
connected to the server through the network.
– The same machine can be both a server and a
client.
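A minimal in-memory sketch of the model described above, assuming hypothetical `Server` and `Client` classes (the slides name no concrete API). It only illustrates that every request passes through the central server and that clients never talk to each other directly.

```python
class Server:
    """Central, dedicated machine that holds the shared resources."""

    def __init__(self):
        self.files = {}

    def handle(self, request, name, data=None):
        # Every operation goes through the server.
        if request == "put":
            self.files[name] = data
            return "ok"
        if request == "get":
            return self.files.get(name)
        raise ValueError(f"unknown request: {request}")


class Client:
    """A PC that only talks to the server, never to other clients."""

    def __init__(self, server):
        self.server = server

    def save(self, name, data):
        return self.server.handle("put", name, data)

    def load(self, name):
        return self.server.handle("get", name)


server = Server()
alice, bob = Client(server), Client(server)
alice.save("report.txt", "Q3 figures")
shared = bob.load("report.txt")  # Bob reads what Alice stored, via the server
```

If the `Server` object disappears, neither client can reach the file, which is the dependency listed later under the disadvantages.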
Client-Server Networks (2)
• The server acts as a hub for:
– Sharing printers
– Doing global backup
– Providing network security
– Performing general management of the network
• The server may also provide access to the
company’s database, data files and E-mail
messages.
Advantages (1)
• Software Consistency
– All users use the same software version.
– Upgrading software on the server affects all
users.
• Hardware Flexibility
– The server alone is responsible for directing
the network traffic.
Advantages (2)
• Centralized Storage
– Data is not lost when a PC “crashes”.
– Data is accessible to all relevant and authorized
users (not just to the author).
• Security
– Data is accessible only to the relevant and
authorized users.
• Backup
– Relevant data is backed up on the server.
– Another server can back up the current active
server.
Disadvantages
• Expensive
• Difficult to set up
• Difficult to maintain
• The power of the clients is wasted
– The clients are treated as dumb computers and their
power is not being used.
• Dependency on the server
– If the server fails, every client that depends on it loses service.
What is P2P (Peer-to-Peer)
• Every participating node acts as both a client
and a server (“servent”).
• Every node “pays” for its participation by
providing access to (some of) its resources.
• Properties:
– no central coordination and central database.
– no peer has a global view of the system.
– global behavior emerges from local interactions.
– all existing data and services are accessible from
any peer.
– peers are autonomous.
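The properties above can be sketched as follows. The `Servent` class and its hop-limited forwarding (`ttl`) are assumptions made for illustration, not something the slides specify. Each peer knows only its direct neighbors, answers requests from its own resources (server role), and forwards lookups it cannot answer (client role).

```python
class Servent:
    """A peer that acts as both client and server (hypothetical sketch)."""

    def __init__(self, name, resources=None):
        self.name = name
        self.resources = dict(resources or {})  # what this peer offers to others
        self.neighbors = []                     # local view only, no global directory

    def connect(self, other):
        # Links are symmetric: each side learns about exactly one more peer.
        self.neighbors.append(other)
        other.neighbors.append(self)

    def serve(self, key):
        # Server role: answer from local resources only.
        return self.resources.get(key)

    def lookup(self, key, ttl=3, seen=None):
        # Client role: check locally, then forward to neighbors until ttl runs out.
        seen = set() if seen is None else seen
        if self.name in seen:
            return None
        seen.add(self.name)
        found = self.serve(key)
        if found is not None:
            return (self.name, found)
        if ttl == 0:
            return None
        for peer in self.neighbors:
            result = peer.lookup(key, ttl - 1, seen)
            if result is not None:
                return result
        return None


a = Servent("a")
b = Servent("b")
c = Servent("c", {"song.mp3": b"audio-bytes"})
a.connect(b)
b.connect(c)
hit = a.lookup("song.mp3")          # a knows only b, yet reaches c's resource
miss = a.lookup("song.mp3", ttl=1)  # too few hops to reach c
```

No object in this sketch holds a list of all peers or all files; the successful lookup emerges from each peer talking to its neighbors only.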
Types of P2P Systems
• E-commerce systems:
– eBay, B2B market places, B2B integration
servers...
• File sharing systems:
– Napster, Gnutella, Groove, …
• Distributed Databases:
– Mariposa [Stonebraker96], …
• Networks:
– Internet
– Mobile ad-hoc networks
Peer-to-Peer Networking
Peer-to-Peer networks come in 2 basic flavors, plus a hybrid:
• Centralized Server - Servers direct the traffic.
– Examples: Napster, Groove
• Distributed Service - Server-free implementations that
directly connect desktops over an IP network.
– Example: Gnutella
• Hierarchical model - A mix of the centralized and
decentralized models that introduces “super-peers”.
– Example: FastTrack (?)
Centralized Server
Files and information move through the server and
directly between the clients
Centralized Server
P2P with a Centralized Server is derived from the Client-Server model.
• The clients are connected to the server and to each
other.
– This enables the clients to communicate with each other
without using an intermediate server.
• The server doesn’t act as a hub for managing the
network, but focuses on specific tasks to help the
communication between the users, like:
– Helps with the first “handshake” between the users
(Napster).
– Saves information temporarily (Groove).
Advantages
• All Client-Server advantages remain.
• Using the power of the clients.
– The clients are no longer dumb computers,
and participate in managing the network.
• The server has fewer responsibilities.
– Which reduces the need for a powerful
server or for several servers.
Disadvantages
• Low level of security
– Allowing actions between clients
without server supervision might spread
viruses in the network.
• Dependency on the server
– Reduced, but still exists.
Napster
• Napster is an application and music indexing
service from Napster, Inc., San Mateo, CA.
• Provides an index to MP3 music files residing
on other computers currently logged onto the
Internet.
• The digital music itself is not located on
Napster servers, only the index service.
Napster System Architecture
• Central (virtual) database which holds an index of
offered MP3/WMA files.
• Clients connect to this server, identify themselves
(account) and send a list of MP3/WMA files they are
sharing (C/S).
• Other clients can search the index and learn from
which clients they can retrieve the file (P2P).
• Combination of client/server and P2P approaches.
• First-time users must register an account.
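The architecture above can be sketched as a central index plus direct peer transfers. This is an illustrative model only: `IndexServer`, `Peer`, and all method names are hypothetical and do not reflect the real Napster protocol.

```python
class IndexServer:
    """Central index: knows who shares what, but stores no file contents."""

    def __init__(self):
        self.accounts = set()
        self.index = {}  # file name -> set of user names currently sharing it

    def register(self, user):
        # First-time users create an account.
        self.accounts.add(user)

    def login(self, user, shared_files):
        # Client/server step: the client identifies itself and sends its list.
        if user not in self.accounts:
            raise PermissionError(f"{user} has no account")
        for name in shared_files:
            self.index.setdefault(name, set()).add(user)

    def search(self, name):
        return sorted(self.index.get(name, ()))


class Peer:
    def __init__(self, user, files):
        self.user = user
        self.files = files  # file name -> bytes, kept on the peer's own disk

    def send(self, name):
        # P2P step: the file goes straight from one peer to another.
        return self.files[name]


server = IndexServer()
peers = {"ann": Peer("ann", {"track.mp3": b"ID3-data"}), "ben": Peer("ben", {})}
for user, peer in peers.items():
    server.register(user)
    server.login(user, peer.files)

holders = server.search("track.mp3")           # ask the index who has the file
payload = peers[holders[0]].send("track.mp3")  # fetch directly from that peer
```

Note that `IndexServer` never touches `payload`: only the search is client/server, while the transfer itself is peer-to-peer.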
Napster Communication Model
Napster Limitations (1)
• Napster has a heavy cost in Internet
traffic
– MP3s are typically huge files (2-10 MB).
– Napster turns every user into a server,
tossing a huge amount of data out onto the
networks.
Result: Napster has high bandwidth
demands.
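To put the 2-10 MB file sizes in perspective, here is a rough transfer-time calculation. The link speeds are assumptions chosen for illustration (the slides give only the file sizes), and the result ignores protocol overhead and congestion.

```python
def transfer_seconds(size_mb, link_kbps):
    """Ideal transfer time: size in MB (10**6 bytes), link speed in kilobits/s."""
    bits = size_mb * 1_000_000 * 8
    return bits / (link_kbps * 1_000)


# Assumed link speeds; the slides give only the 2-10 MB file size range.
for label, kbps in [("56k modem", 56), ("1 Mbit/s link", 1_000)]:
    low = transfer_seconds(2, kbps)
    high = transfer_seconds(10, kbps)
    print(f"{label}: {low / 60:.1f} to {high / 60:.1f} minutes per file")
```

On the assumed 56k modem a single file occupies the uploader's link for roughly 5 to 24 minutes, which is why many simultaneous users add up to high bandwidth demands.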
Napster Limitations (2)
• “Transfer Error”
– The available music depends on who is online at
the time.
– When a user goes offline, all the other users who
started downloading from his hard drive get a
transfer error.
Result: Users need to continually check the
Napster directory when downloading files.
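The "check the directory again" behavior can be sketched as a retry loop. Everything here is hypothetical: `download`, `search`, `fetch`, and `TransferError` are stand-ins for illustration, not Napster client functions.

```python
class TransferError(Exception):
    """Raised when the sending peer goes offline mid-download."""


def download(name, search, fetch, max_attempts=3):
    """Re-query the directory after each failure and try another online source."""
    tried = set()
    for _ in range(max_attempts):
        candidates = [u for u in search(name) if u not in tried]
        if not candidates:
            break
        source = candidates[0]
        tried.add(source)
        try:
            return source, fetch(source, name)
        except TransferError:
            continue  # that peer dropped out; check the directory again
    raise TransferError(f"no online source could deliver {name}")


# Hypothetical stand-ins for the directory and the peer connection.
online = {"ann": False, "ben": True}


def search(name):
    return ["ann", "ben"]


def fetch(user, name):
    if not online[user]:
        raise TransferError(user)
    return b"audio-bytes"


source, data = download("track.mp3", search, fetch)
```

In this run the first source is offline, so the loop falls back to the second one; if every listed peer were offline, the user would see the transfer error described above.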
Napster Limitations (3)
• Low level of security
– The exchanging of files is done by the users
without the supervision of a server.
– There’s no protection from viruses that might be
disguised as MP3 files.
Result: Users are more in need of a personal
firewall than before.
Groove (1)
• Groove is software that enables small groups of
users to quickly get together online to
collaborate on projects.
• The users can share all kinds of digital data.
• Groove functions by creating a working space
on each participating PC.
Groove (2)
• The work space includes tools to support
collaboration:
– Sharing Microsoft Office documents
– Text chat
– Live-voice chat
– Photo viewing
– Drawing pad
– Browser
• Only a Groove member, who was invited to a
specific PC, can access its space.
Groove (3)
• When two or more users are online at the same
time, they can work on the same document.
– Any change made to a document is transmitted
“live” over the Net to other users.
• If the other users aren't online, the
modifications are stored on a relay server.
– As soon as a user plugs back in, his Groove space is
updated.
• Groove links users via their PCs without the
assistance of a central server, but a server is in
use.
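The live-update and relay behavior described above can be sketched as a store-and-forward queue. `RelayServer` and `Workspace` are hypothetical names used for illustration; Groove's actual synchronization protocol is not described in these slides.

```python
from collections import defaultdict, deque


class RelayServer:
    """Holds changes only for members who are offline (store and forward)."""

    def __init__(self):
        self.pending = defaultdict(deque)

    def store(self, member, change):
        self.pending[member].append(change)

    def drain(self, member):
        changes = list(self.pending[member])
        self.pending[member].clear()
        return changes


class Workspace:
    def __init__(self, member, relay):
        self.member, self.relay = member, relay
        self.online = True
        self.document = []
        self.peers = []

    def edit(self, change):
        self.document.append(change)
        for peer in self.peers:
            if peer.online:
                peer.document.append(change)           # live, PC to PC
            else:
                self.relay.store(peer.member, change)  # parked on the relay

    def reconnect(self):
        # Coming back online pulls everything the relay kept for this member.
        self.online = True
        self.document.extend(self.relay.drain(self.member))


relay = RelayServer()
ann, ben = Workspace("ann", relay), Workspace("ben", relay)
ann.peers, ben.peers = [ben], [ann]
ben.online = False
ann.edit("add section 2")
ben.reconnect()
```

The relay is touched only while a member is offline; when both are online, changes travel directly between the two workspaces.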
Groove Limitation
• Only for small groups
– The software is designed to work best for
groups of 25 people or fewer.