Peer-to-Peer Systems
Students: Lê Thành Nguyên 00707174
Võ Lê Quy Nhơn 00707176
Peer-to-Peer
An alternative to the client/server model of distributed computing
is the peer-to-peer model.
Client/server is inherently hierarchical, with resources centralized
on a limited number of servers.
In peer-to-peer networks, both resources and control are widely
distributed among nodes that are theoretically equals. (A node
with more information, better information, or more power may be
“more equal,” but that is a function of the node, not the network
controllers.)
Decentralization
A key feature of peer-to-peer networks is decentralization. This
has many implications. Robustness, availability of information,
and fault tolerance tend to come from redundancy and shared
responsibility instead of from the planning, organization and
investment of a controlling authority.
On the Web both content providers and gateways try to profit by
controlling information access. Access control is more difficult in
peer-to-peer, although Napster depended on a central index.
Technology Transition
The Client/Server Model
The Peer-to-Peer Model
Classification
Pure P2P vs. hybrid (servers keep some information)
Centralized: Napster
Decentralized: KaZaA
Structured: CAN
Unstructured: Gnutella
Hybrid: JXTA
5
Applications outside Computer Science
Bioinformatics
Education and academia
Military
Business
Television
Telecommunication
Why Peer-to-Peer Networking?
The Internet has three valuable fundamental assets: information,
bandwidth, and computing resources, all of which are vastly
underutilized, partly due to the traditional client-server
computing model.
Information - Hard to find, impossible to catalog and index
Bandwidth - Hot links get hotter, cold ones stay cold
Computing resources - Heavily loaded nodes get overloaded,
idle nodes remain idle
Information Gathering
The world produces two exabytes of information
(2×10^18 bytes) every year… out of which
the world publishes 300 terabytes of information
(3×10^14 bytes) every year.
Google searches 1.3×10^9 pages of data.
Data beyond web servers
Transient information
Hence, finding useful information in real time is increasingly
difficult.
Bandwidth Utilization
A single fiber’s bandwidth has increased by a factor of
10^6, doubling every 16 months, since 1975.
Traffic is still congested:
More devices and people on the net
More volume of data to move around the same destinations
(eBay, Yahoo, etc.)
Computing Resources
Moore’s Law: processor speed doubles
every 18 months
Computing devices (server, PC, PDA, cell phone) are
more powerful than ever
Storage capacity has increased dramatically
Computation still accumulates around
data centers
Benefits from P2P
Theory
Dynamic discovery of information
Better utilization of bandwidth, processor, storage, and
other resources
Each user contributes resources to network
Examples in practice
Sharing browser caches over 100 Mbps lines
Disk mirroring using spare capacity
Deep search beyond the web
Figure 10.1: IP and overlay routing
for peer-to-peer
(IP vs. application-level routing overlay; "Overlay" below abbreviates the latter.)

Scale
IP: IPv4 is limited to 2^32 addressable nodes. The IPv6 name space is much more generous (2^128), but addresses in both versions are hierarchically structured and much of the space is pre-allocated according to administrative requirements.
Overlay: Peer-to-peer systems can address more objects. The GUID name space is very large and flat (>2^128), allowing it to be much more fully occupied.

Load balancing
IP: Loads on routers are determined by network topology and associated traffic patterns.
Overlay: Object locations can be randomized and hence traffic patterns are divorced from the network topology.

Network dynamics (addition/deletion of objects/nodes)
IP: IP routing tables are updated asynchronously on a best-efforts basis with time constants on the order of 1 hour.
Overlay: Routing tables can be updated synchronously or asynchronously with fraction-of-a-second delays.

Fault tolerance
IP: Redundancy is designed into the IP network by its managers, ensuring tolerance of a single router or network connectivity failure. n-fold replication is costly.
Overlay: Routes and object references can be replicated n-fold, ensuring tolerance of n failures of nodes or connections.

Target identification
IP: Each IP address maps to exactly one target node.
Overlay: Messages can be routed to the nearest replica of a target object.

Security and anonymity
IP: Addressing is only secure when all nodes are trusted. Anonymity for the owners of addresses is not achievable.
Overlay: Security can be achieved even in environments with limited trust. A limited degree of anonymity can be provided.
Distributed Computation
Only a small portion of the CPU cycles of most computers is
utilized. Most computers are idle for the greatest portion of the
day, and many of the ones in use spend the majority of their time
waiting for input or a response.
A number of projects have attempted to use these idle CPU
cycles. The best known is the SETI@home project, but other
projects including code breaking have used idle CPU cycles on
distributed machines.
Discussion Question: Computer or
Infomachine?
The first computers were used primarily for computations. One
early use was calculating ballistic tables for the U.S. Navy during
World War II.
Today, computers are used more for sharing information than
computations—perhaps infomachine may be a more accurate
name than computer?
Distributed computation may be better suited to peer-to-peer
systems while information tends to be hierarchical and may be
better suited to client/server.
NJIT has both Computer Science and Information Systems
departments.
Current Peer-to-Peer Concerns
Topics listed for the IEEE 7th annual conference on Peer-to-Peer Computing:
Dangers and Attacks on P2P
Poisoning (files with contents different from their description)
Polluting (inserting bad packets into the files)
Defection (users use the service without sharing)
Insertion of viruses (attached to other files)
Malware (originally attached to the files)
Denial of Service (slowing down or stopping network traffic)
Filtering (some networks don’t allow P2P traffic)
Identity attacks (tracking down users and disturbing them)
Spam (sending unsolicited information)
The SETI@home project
The SETI (Search for Extra Terrestrial Intelligence) project
looks for patterns in radio frequency emissions received from
radio telescopes that suggest intelligence. This is done by
partitioning data received into chunks and sending each chunk
to several different computers owned by SETI volunteers for
analysis.
Link: http://setiathome.ssl.berkeley.edu/
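The work-unit idea can be sketched in a few lines of Python. This is only a toy illustration of partitioning data into chunks and handing each chunk to several volunteers; the chunk size, redundancy factor and function names are invented and do not reflect the real SETI@home client or server.

# Toy sketch of SETI@home-style work distribution (not the real protocol).
from itertools import cycle

CHUNK_SIZE = 4      # samples per work unit (illustrative; real units are far larger)
REDUNDANCY = 2      # each unit is analysed by this many volunteers

def make_work_units(samples, chunk_size=CHUNK_SIZE):
    """Partition recorded samples into work units."""
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]

def dispatch(units, volunteers, redundancy=REDUNDANCY):
    """Assign every work unit to `redundancy` volunteers, round-robin."""
    assignment = {v: [] for v in volunteers}
    ring = cycle(volunteers)
    for unit_id, unit in enumerate(units):
        for _ in range(redundancy):
            assignment[next(ring)].append((unit_id, unit))
    return assignment

samples = list(range(20))                    # stand-in for radio-telescope data
print(dispatch(make_work_units(samples), ["alice", "bob", "carol"]))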
Children of SETI@home
In 2002, David Anderson, the director of SETI@home,
launched the Berkeley Open Infrastructure for Network
Computing (BOINC).
There are currently over 40 BOINC projects running to share
spare computation on idle CPUs. You can see some of the
projects at
http://boinc.berkeley.edu/projects.php
Folding@home
As of September 2007, the most powerful distributed
computing network on Earth was Folding@home, a project to
simulate protein folding which can run on Sony PlayStation 3
game consoles. At that time, the network reached a capacity of
one petaflop (one quadrillion floating point operations per
second) on a network of 40,000 game consoles. See
http://folding.stanford.edu/
Napster
The first large scale peer-to-peer network was Napster, set up in
1999 to share digital music files over the Internet. While Napster
maintained centralized (and replicated) indices, the music files
were created and made available by individuals, usually with
music copied from CDs to computer files. Music content owners
sued Napster for copyright violations and succeeded in shutting
down the service. Figure 10.2 documents the process of
requesting a music file from Napster.
Figure 10.2: Napster: peer-to-peer
file sharing
Peers interact with the Napster server, which holds the index:
1. File location request
2. List of peers offering the file
3. File request
4. File delivered
5. Index update
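The five numbered steps can be mirrored in a small sketch of a Napster-style centralized index. This is an illustrative model only, not Napster's actual protocol; the class and method names are invented.

# Sketch of a Napster-style centralized index (illustrative; not the real protocol).
class IndexServer:
    def __init__(self):
        self.index = {}                          # file name -> set of peer addresses

    def register(self, peer, files):             # step 5: index update
        for name in files:
            self.index.setdefault(name, set()).add(peer)

    def locate(self, name):                      # steps 1-2: location request, peer list
        return sorted(self.index.get(name, set()))

class Peer:
    def __init__(self, address, files):
        self.address, self.files = address, dict(files)

    def fetch(self, name):                       # steps 3-4: file request, file delivered
        return self.files.get(name)

server = IndexServer()
p1 = Peer("peer1", {"song.mp3": b"...audio bytes..."})
server.register(p1.address, p1.files)
sources = server.locate("song.mp3")              # ['peer1']
data = p1.fetch("song.mp3")
server.register("peer2", {"song.mp3": data})      # the new copy is indexed too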
Napster: Lessons Learned
Napster created a network of millions of people, with thousands
of files being transferred at the same time.
There were quality issues. While Napster displayed link speeds to
allow users to choose faster downloads, the fidelity of recordings
varied widely.
Since Napster users were parasites of the recording companies,
there was some central control over selection of music. One
benefit was that music files did not need updates.
There was no guarantee of availability for a particular item of
music.
Middleware for Peer-to-Peer
A key problem in Peer-to-Peer applications is to provide a way
for clients to access data resources efficiently. Similar needs in
client/server technology led to solutions like NFS. However,
NFS relies on pre-configuration and is not scalable enough for
peer-to-peer.
Peer clients need to locate and communicate with any available
resource, even though resources may be widely distributed and
configuration may be dynamic, constantly adding and removing
resources and connections.
Non-Functional Requirements for
Peer-to-Peer Middleware
Global Scalability
Load Balancing
Local Optimization
Adjusting to dynamic host availability
Security of data
Anonymity, deniability, and resistance to censorship
(in some applications)
Routing Overlays
A routing overlay is a distributed algorithm for a middleware
layer responsible for routing requests from any client to a host
that holds the object to which the request is addressed.
Any node can access any object by routing each request
through a sequence of nodes, exploiting knowledge at each of
them to locate the destination object.
Globally unique identifiers (GUIDs), also known as opaque identifiers, are
used as names but do not contain location information.
A client wishing to invoke an operation on an object submits a
request including the object’s GUID to the routing overlay,
which routes the request to a node at which a replica of the
object resides.
Figure 10.3: Distribution of
information in a routing overlay
Routing Overlays
Basic programming interface for a distributed hash table (DHT) as implemented
by the PAST API over Pastry
put(GUID, data)
The data is stored in replicas at all nodes responsible for the object identified by
GUID.
remove(GUID)
Deletes all references to GUID and the associated data.
value = get(GUID)
The data associated with GUID is retrieved from one of the nodes responsible for it.
The DHT layer takes responsibility for choosing a location for the data item, storing it
(with replicas to ensure availability) and providing access to it via the get()
operation.
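A minimal in-memory sketch of this put/get/remove interface is shown below. The replication and Pastry routing that PAST actually performs are reduced to a single dictionary, and deriving the GUID by hashing the content is an illustrative choice, not PAST's exact scheme.

# Minimal sketch of the DHT put/remove/get interface (illustrative only).
import hashlib

def guid_for(data: bytes) -> int:
    """Derive a GUID from the data by hashing (an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

class ToyDHT:
    def __init__(self):
        self._store = {}

    def put(self, guid: int, data: bytes) -> None:
        self._store[guid] = data      # real DHT: store replicas at all responsible nodes

    def get(self, guid: int) -> bytes:
        return self._store[guid]      # real DHT: retrieve from one responsible node

    def remove(self, guid: int) -> None:
        self._store.pop(guid, None)   # real DHT: delete all references and replicas

dht = ToyDHT()
g = guid_for(b"hello")
dht.put(g, b"hello")
assert dht.get(g) == b"hello"
dht.remove(g)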
Routing Overlays
Basic programming interface for distributed object location and routing (DOLR)
as implemented by Tapestry
publish(GUID)
GUID can be computed from the object. This function makes the node performing a
publish operation the host for the object corresponding to GUID.
unpublish(GUID)
Makes the object corresponding to GUID inaccessible.
sendToObj(msg, GUID, [n])
Following the object-oriented paradigm, an invocation message is sent to an object in
order to access it. This might be a request to open a TCP connection for data transfer or to
return a message containing all or part of the object’s state. The final optional parameter
[n], if present, requests the delivery of the same message to n replicas of the object.
Objects can be stored anywhere, and the DOLR layer is responsible for
maintaining a mapping between GUIDs and the addresses of the nodes at which
replicas of the objects are located.
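The same exercise for the DOLR interface: a flat dictionary from GUID to hosting nodes stands in for the location mappings that Tapestry maintains along routing paths. The names and the n-replica handling are illustrative assumptions, not Tapestry's API.

# Sketch of the DOLR publish/unpublish/sendToObj interface (illustrative only).
class ToyDOLR:
    def __init__(self):
        self._hosts = {}                            # GUID -> set of hosting node addresses

    def publish(self, guid, node):
        """The calling node announces that it hosts a replica of the object."""
        self._hosts.setdefault(guid, set()).add(node)

    def unpublish(self, guid, node):
        """The replica at this node becomes inaccessible."""
        self._hosts.get(guid, set()).discard(node)

    def send_to_obj(self, msg, guid, n=1):
        """Deliver msg to up to n replicas of the object (sendToObj in the text)."""
        targets = sorted(self._hosts.get(guid, set()))[:n]
        return [(target, msg) for target in targets]  # real DOLR: route msg to each host

layer = ToyDOLR()
layer.publish(0x4378, "node-4228")
layer.publish(0x4378, "node-AA93")
print(layer.send_to_obj("open TCP connection for transfer", 0x4378, n=2))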
Pastry
All the nodes and objects that can be accessed through Pastry
are assigned 128-bit GUIDs.
In a network with N participating nodes, the Pastry routing
algorithm will correctly route a message addressed to any
GUID in O(logN) steps.
If the GUID identifies a node that is currently active, the
message is delivered to that node; otherwise, the message is
delivered to the active node whose GUID is numerically closest
to it (the closeness referred to here is in an entirely artificial
space: the space of GUIDs).
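The delivery rule ("the active node whose GUID is numerically closest") can be made concrete with a small helper. This is a sketch; it treats the 128-bit GUID space as circular, anticipating the description on the next slide, and the function names are invented.

# "Numerically closest" in the circular 128-bit GUID space (illustrative sketch).
GUID_SPACE = 2 ** 128

def circular_distance(a: int, b: int) -> int:
    """Shortest distance between two GUIDs around the ring."""
    d = (a - b) % GUID_SPACE
    return min(d, GUID_SPACE - d)

def deliver_to(guid: int, active_nodes):
    """Pastry delivers a message addressed to guid to the closest active node."""
    return min(active_nodes, key=lambda node: circular_distance(node, guid))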
Pastry
When new nodes join the overlay they obtain the data needed to
construct a routing table and other required state from existing
members in O(logN) messages, where N is the number of hosts
participating in the overlay.
In the event of a node failure or departure, the remaining nodes can
detect its absence and cooperatively reconfigure to reflect the
required changes in the routing structure in a similar number of
messages.
Each active node stores a leaf set: a vector L (of size 2l) containing
the GUIDs and IP addresses of the nodes whose GUIDs are
numerically closest on either side of its own (l above and l below).
The GUID space is treated as circular: GUID 0’s lower neighbor is
2^128 - 1.
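A leaf set can be pictured as the l nearest GUIDs below and the l nearest above a node's own. The sketch below builds it from a full membership list, which a real Pastry node never has (it maintains the set incrementally as nodes join and leave); l = 4 follows the example used a few slides later.

# Building a node's leaf set L (size 2l) in the circular GUID space (sketch).
GUID_SPACE = 2 ** 128

def leaf_set(own_guid: int, all_node_guids, l: int = 4):
    others = [n for n in all_node_guids if n != own_guid]
    below = sorted(others, key=lambda n: (own_guid - n) % GUID_SPACE)[:l]   # l predecessors
    above = sorted(others, key=lambda n: (n - own_guid) % GUID_SPACE)[:l]   # l successors
    return below, above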
Pastry- Routing algorithm
The full routing algorithm involves the use of a routing table at
each node to route messages efficiently, but for the purposes of
explanation, we describe the routing algorithm in two stages:
The first stage describes a simplified form of the algorithm
which routes messages correctly but inefficiently without a
routing table.
The second stage describes the full routing algorithm which
routes a request to any node in O(logN) messages.
Pastry- Routing algorithm
Stage 1:
Any node A that receives a message M with destination address
D routes the message by comparing D with its own GUID A and
with each of the GUIDs in its leaf set, and forwarding M to the
node amongst them that is numerically closest to D.
At each step M is forwarded to a node that is closer to D than the
current node, so this process will eventually deliver M to the
active node closest to D.
This is very inefficient, requiring ~N/2l hops to deliver a message in a
network with N nodes.
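Stage 1 can be simulated over toy nodes that each know only their own leaf set; the loop below plays the role of the successive hops. This is a sketch only: the leaf sets are supplied as a plain dictionary rather than maintained by real nodes.

# Stage 1 of Pastry routing, using leaf sets only (illustrative simulation).
GUID_SPACE = 2 ** 128

def circ_dist(a, b):
    d = (a - b) % GUID_SPACE
    return min(d, GUID_SPACE - d)

def route_with_leaf_sets(source, dest, leaf_sets):
    """leaf_sets maps each node GUID to the list of GUIDs in its leaf set."""
    current, hops = source, [source]
    while True:
        candidates = [current] + list(leaf_sets[current])
        nearest = min(candidates, key=lambda n: circ_dist(n, dest))
        if nearest == current:        # no closer node known: deliver here
            return hops
        current = nearest
        hops.append(current)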
Pastry- Routing algorithm
The diagram illustrates the
routing of a message from node
65A1FC to D46A1C using leaf
set information alone, assuming
leaf sets of size 8 (l=4)
Pastry- Routing algorithm
Stage 2:
Each Pastry node maintains a routing table giving GUIDs and IP
addresses for a set of nodes spread throughout the entire range of
2^128 possible GUID values.
The routing table is structured as follows: GUIDs are viewed as
hexadecimal values and the table classifies GUIDs based on their
hexadecimal prefixes
The table has as many rows as there are hexadecimal digits in a
GUID, so for the prototype Pastry system that we are describing,
there are 128/4 = 32 rows
Any row n contains 15 entries – one for each possible value of the
nth hexadecimal digit excluding the value in the local node’s GUID.
Each entry in the table points to one of the potentially many nodes
whose GUIDs have the relevant prefix
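The prefix arithmetic behind this table can be captured by two helpers: the length of the longest common hexadecimal prefix of two GUIDs, and the row/column at which a destination would be looked up. A sketch assuming 128-bit GUIDs written as 32 hex digits, as in the text.

# Prefix arithmetic behind the Pastry routing table (illustrative sketch).
HEX_DIGITS = 32                               # 128-bit GUIDs, 4 bits per hex digit

def to_hex(guid: int) -> str:
    return format(guid, "032x").upper()

def common_prefix_length(a: int, b: int) -> int:
    """Number of leading hexadecimal digits shared by the two GUIDs."""
    ha, hb = to_hex(a), to_hex(b)
    p = 0
    while p < HEX_DIGITS and ha[p] == hb[p]:
        p += 1
    return p

def table_slot(local: int, dest: int):
    """(row, column) of the routing-table entry used to route towards dest."""
    p = common_prefix_length(local, dest)
    if p == HEX_DIGITS:
        return p, None                        # identical GUIDs: already at the destination
    column = int(to_hex(dest)[p], 16)         # the (p+1)th hexadecimal digit of dest
    return p, column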
Pastry- Routing algorithm
Stage 2 (cont.):
[Routing-table figure: GUID prefixes and corresponding node handles n. Row p = 0 holds one entry per first hexadecimal digit 0 to F; row p = 1 holds entries for prefixes 60 to 6F; row p = 2 for prefixes 650 to 65F; row p = 3 for prefixes 65A0 to 65AF. The routing table shown is located at the node whose GUID begins 65A1.]
Pastry- Routing algorithm
Stage 2 (cont.):
To handle a message M addressed to a node D (where R[p,i] is the
element at column i, row p of the routing table):

If (L_-l < D < L_l) { // the destination is within the leaf set or is the current node
    Forward M to the element L_i of the leaf set with GUID closest to D, or to the current node A
} else { // use the routing table to despatch M to a node with a closer GUID
    Find p (the length of the longest common prefix of D and A) and i (the (p+1)th hexadecimal digit of D)
    If (R[p,i] ≠ null) forward M to R[p,i] // route M to a node with a longer common prefix
    else { // there is no entry in the routing table
        Forward M to any node in L or R with a common prefix of length p, but a GUID that is numerically closer to D
    }
}
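The same decision procedure in Python, as a sketch: node state is reduced to a plain list for the leaf set and a 32-by-16 array for the routing table, and the leaf-set range test ignores wrap-around at GUID 0 for simplicity. None of this is Pastry's real implementation.

# One routing step of the full Pastry algorithm at node A (illustrative sketch).
GUID_SPACE, HEX_DIGITS = 2 ** 128, 32

def circ_dist(a, b):
    d = (a - b) % GUID_SPACE
    return min(d, GUID_SPACE - d)

def hexstr(g):
    return format(g, "032x").upper()

def prefix_len(a, b):
    ha, hb = hexstr(a), hexstr(b)
    return next((i for i in range(HEX_DIGITS) if ha[i] != hb[i]), HEX_DIGITS)

def next_hop(A, D, leaf_set, table):
    """leaf_set: list of neighbouring node GUIDs; table[p][i]: node GUID or None."""
    # Step 1: destination within the leaf set (or is the current node).
    if D == A or (leaf_set and min(leaf_set) <= D <= max(leaf_set)):
        return min(leaf_set + [A], key=lambda n: circ_dist(n, D))
    # Step 2: routing-table entry sharing a longer prefix with D.
    p = prefix_len(A, D)
    i = int(hexstr(D)[p], 16)                 # the (p+1)th hexadecimal digit of D
    if table[p][i] is not None:
        return table[p][i]
    # Step 3 (rare): any known node with the same prefix length that is closer to D.
    known = leaf_set + [entry for row in table for entry in row if entry is not None]
    closer = [n for n in known
              if prefix_len(n, D) >= p and circ_dist(n, D) < circ_dist(A, D)]
    return min(closer, key=lambda n: circ_dist(n, D)) if closer else A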
Tapestry
Tapestry is another peer-to-peer model similar to Pastry. It hides
a distributed hash table from applications behind a Distributed
object location and routing (DOLR) interface to make replicated
copies of objects more accessible by allowing multiple entries in
the routing structure.
Identifiers are either NodeIds, which refer to computers that
perform routing actions, or GUIDs, which refer to the objects.
For any resource with GUID G, there is a unique root node with
GUID R_G that is numerically closest to G.
Hosts H holding replicas of G periodically invoke publish(G)
to ensure that newly arrived hosts become aware of the
existence of G. On each invocation, a publish message is routed
from the invoker towards node R_G.
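A sketch of the publication step: the publish message walks a path of nodes toward the root R_G, and every node on the way caches a location mapping for the object. The routing itself is abstracted away here as a precomputed path, and the example GUIDs merely echo the figure on the next slide.

# Sketch of Tapestry-style publication (illustrative only).
from collections import defaultdict

location_cache = defaultdict(set)      # node GUID -> {(object GUID, hosting node GUID)}

def publish(object_guid, host_guid, path_to_root):
    """path_to_root: node GUIDs the publish message traverses, ending at root R_G."""
    for node in path_to_root:
        location_cache[node].add((object_guid, host_guid))

# e.g. host 4228 publishes object 4378; an assumed path ends at the root 4377
publish(0x4378, 0x4228, [0x4361, 0x43FE, 0x4377])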
Tapestry
[Diagram: Tapestry routings for object 4378 (Phil’s Books), showing nodes 4377 (root for 4378), 437A, 43FE, 4228, 4361, 4664, 4A6D, 4B4F, 57EC, E791 and AA93, the publish paths, the location mappings for 4378, and the routes actually taken by send(4378).]
Replicas of the file Phil’s Books (G=4378) are hosted at nodes 4228 and AA93. Node 4377 is the root node
for object 4378. The Tapestry routings shown are some of the entries in routing tables. The publish paths show
routes followed by the publish messages laying down cached location mappings for object 4378. The location
mappings are subsequently used to route messages sent to 4378.
Squirrel web cache
The node whose GUID is numerically closest to the GUID of
an object becomes that object’s home node, responsible for
holding any cached copy of the object.
If a fresh copy of a required object is not in the local cache,
Squirrel routes a Get request via Pastry to the home node.
If the home node has a fresh copy it directly responds to the
client with a not-modified message.
If the home node has a stale copy or no copy of the object it
issues a Get to the origin server. The origin server may
respond with a not-modified or a copy of the object.
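The home-node decision just described can be sketched as below. This is a simplification of Squirrel's home-store scheme: freshness is a boolean stand-in for HTTP cache validation, and all names are invented.

# Sketch of the Squirrel home-node logic for a Get request (illustrative only).
def handle_get(url, client_has_copy, home_cache, origin_fetch, is_fresh):
    """home_cache: url -> object; origin_fetch(url) -> object; is_fresh(obj) -> bool."""
    obj = home_cache.get(url)
    if obj is not None and is_fresh(obj):
        # fresh copy at the home node: answer the client directly
        return "not-modified" if client_has_copy else obj
    # stale or missing copy: the home node asks the origin server
    obj = origin_fetch(url)
    home_cache[url] = obj
    return obj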
Squirrel web cache
[Diagram: interactions between a client node, the object’s home node, and the origin server.]
Squirrel web cache
Evaluation
The reduction in total external bandwidth used:
With each client contributing 100 MB of disk storage, the hit ratio was 28%
(36,000 active clients in Redmond) and 37% (105 active clients in
Cambridge).
The latency perceived by users for access to web objects:
Local transfers take only a few milliseconds, whereas transfers across the
Internet require 10-100 ms, so the latency for access to objects found in the
cache is swamped by the much greater latency of access to objects not found
in the cache.
The computational and storage load imposed on client nodes:
The average number of cache requests served for other nodes by each node
over the whole period was low, at only 0.31 per minute.