Chap10_Peer-to-Peer_Team4
Distributed Systems: Concepts and Design
Chapter 10: Peer-to-Peer Systems
Bruce Hammer, Steve Wallis, Raymond Ho
10.1: Introduction
Peer-to-Peer Systems
Where data and computational resources are contributed
by many hosts
Objective: to balance network traffic and reduce the load on the primary host
Management requires knowledge of all hosts: their accessibility (distance in number of hops), availability and performance
They exploit existing naming, routing, data replication
and security techniques in new ways
10.1: Introduction
Goal of Peer-to-Peer Systems
Sharing data and resources on a very large scale
‘Applications that exploit resources available at
the edges of the Internet – storage, cycles,
content, human presence’ (Shirky 2000)
Uses data and computing resources available in personal computers and workstations
10.1: Introduction
Characteristics of Peer-to-Peer Systems
Each computer contributes resources
All the nodes have the same functional
capabilities and responsibilities
No centrally-administered system
Offers a limited degree of anonymity
Algorithm for placing and accessing the data
Balance workload, ensure availability
Without adding undue overhead
10.1: Introduction
Evolution of Peer-to-Peer Systems
Napster – music downloads; a central index returned the addresses of peers holding the files
Freenet, Gnutella, Kazaa and BitTorrent
More sophisticated – greater scalability, anonymity
and fault tolerance
Pastry, Tapestry, CAN, Chord, Kademlia
Peer-to-peer middleware
10.1: Introduction
Evolution (Continued)
Immutable files (music, video)
GUIDs (Globally Unique Identifiers)
Middleware to provide better routing algorithms,
react to outages
Evolve to mutable files
Application within one company’s intranet
10.2: Napster and its Legacy
Napster
Provided a means for users to share music files –
primarily MP3s
Launched 1999 – several million users
Not fully peer-to-peer since it used central servers to
maintain lists of connected systems and the files they
provided, while actual transactions were conducted
directly between machines
Proved feasibility of a service using hardware and data
owned by ordinary Internet users
10.2: Napster and its Legacy
[Figure: Napster operation – peers and replicated Napster index servers. 1. File location request; 2. List of peers offering the file; 3. File request to a peer; 4. File delivered; 5. Index update.]
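The steps in the figure boil down to a lookup in a centrally maintained index followed by a direct peer-to-peer transfer. The following Python sketch (hypothetical class and method names, not Napster's real protocol) shows the index side of that flow:

class CentralIndex:
    """Napster-style central index: maps file names to the peers that offer them."""
    def __init__(self):
        self.files = {}                            # file name -> set of peer addresses

    def register(self, peer, names):               # step 5: index update
        for name in names:
            self.files.setdefault(name, set()).add(peer)

    def locate(self, name):                        # steps 1-2: location request, peer list
        return sorted(self.files.get(name, set()))

index = CentralIndex()
index.register("peer-a:6699", ["track01.mp3"])
print(index.locate("track01.mp3"))                 # the client then fetches the file directly
                                                   # from one of these peers (steps 3-4)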
10.2: Napster and its Legacy
BitTorrent
Designed and implemented in 2001
Next generation after Napster – true peer-to-peer (P2P)
Can handle large files, e.g. WAV, DVD images, FLAC (one CD is approx. 500 MB)
After the initial pieces transfer from the seed, the pieces are individually transferred from client to client; the original seeder only needs to send out one copy of the file for all the clients to receive a copy
Tracker URL hosted at a BitTorrent site, e.g. Traders Den
10.2: Napster and its Legacy
BitTorrent (contd)
Many BitTorrent clients, e.g. Vuze
Clients keep track of seeders and leechers
Torrent – contains metadata about the files to be shared and about the tracker
Tracker – coordinates the file distribution and tells each peer which other peers to download pieces of the file from
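To make the piece mechanism concrete, here is a hedged Python sketch of splitting a file into fixed-size pieces and hashing each piece with SHA-1 – the kind of per-piece metadata a torrent carries so downloaders can verify pieces received from untrusted peers. The piece size is an arbitrary assumption, and a real torrent also bencodes this information together with the tracker URL.

import hashlib

PIECE_SIZE = 256 * 1024                  # assumed piece size; real torrents vary (256 KB - 4 MB)

def piece_hashes(path, piece_size=PIECE_SIZE):
    """Split a file into pieces and return the SHA-1 digest of each piece."""
    hashes = []
    with open(path, "rb") as f:
        while True:
            piece = f.read(piece_size)
            if not piece:
                break
            hashes.append(hashlib.sha1(piece).digest())
    return hashes

# A downloader re-hashes every piece it receives and compares it with the published
# digest, so pieces can safely be fetched from many different peers.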
10.3: Peer-to-Peer Middleware
Peer To Peer Middleware
To provide a mechanism for accessing data resources anywhere in the network
Functional Requirements:
Simplify construction of services across many hosts in a wide network
Add and remove resources at will
Add and remove new hosts at will
Interface to application programmers should be simple and
independent of types of distributed resources
10.3: Peer-to-Peer Middleware
Peer To Peer Middleware (contd)
Non-Functional Requirements:
Global scalability
Load balancing
Optimization for local interactions between neighboring peers
Accommodation to highly dynamic host availability
Security of data in an environment with heterogeneous trust
Anonymity, deniability and resistance to censorship
10.3: Peer-to-Peer Middleware
Peer To Peer Middleware (contd)
Global scalability, dynamic host availability and load
sharing and balancing across large numbers of computers
pose major design challenges.
Design of the middleware layer:
Knowledge of the locations of objects must be distributed throughout the network
Replication is used to achieve this
10.4: Routing Overlays
Routing Overlays
Sub-systems (APIs) within the peer-to-peer middleware
Responsible for locating nodes and objects
Implements a routing mechanism in the application layer, separate from any other routing mechanism such as IP routing
Ensures that any node can access any object by routing each request through a sequence of nodes
Exploits knowledge at each node to locate the destination
10.4: Routing Overlays
GUIDs
‘Pure’ names or opaque identifiers
Reveal nothing about the locations of the objects
Building blocks for routing overlays
Computed from all or part of the state of the object using a function that delivers a value that is very likely to be unique; uniqueness is then checked against all other GUIDs
Not human readable
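As a hedged illustration of how such an identifier might be derived: hash all or part of the object's state with a secure hash function and keep 128 bits of the digest (SHA-1 produces 160 bits). The function name is illustrative only.

import hashlib

def make_guid(state: bytes) -> int:
    """Derive a 128-bit GUID from an object's state (illustrative sketch)."""
    digest = hashlib.sha1(state).digest()          # 160-bit secure hash of the state
    return int.from_bytes(digest[:16], "big")      # keep 128 bits; collisions are vanishingly rare

guid = make_guid(b"name or stored state of the object")
print(hex(guid))                                   # opaque and not human readable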
10.4: Routing Overlays
Tasks of a routing overlay
A client submits a request including the object's GUID; the routing overlay routes the request to a node at which a replica of the object resides
A node introduces a new object by computing its GUID and announcing it to the routing overlay
Clients can remove an object
Nodes may join and leave the service
10.4: Routing Overlays
Types of Routing Overlays
DHT – Distributed Hash Tables
DOLR – Distributed Object Location and Routing
Both compute the GUID from all or part of the state of the object
DOLR is a layer over the DHT that maps GUIDs to the addresses of the nodes holding the objects
DHT – data items are stored at the nodes responsible for their GUIDs, based on the hash value
DOLR – the address of the host holding an object is announced using the publish(GUID) operation
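The two styles of interface can be sketched as follows; the method names follow the generic DHT and DOLR operations described in the textbook and are illustrative rather than a real middleware API.

class DistributedHashTable:
    def put(self, guid, data): ...        # store data at the node(s) responsible for guid
    def get(self, guid): ...              # retrieve the data associated with guid
    def remove(self, guid): ...           # delete all references to guid and its data

class DistributedObjectLocationAndRouting:
    def publish(self, guid): ...          # announce that the calling host holds the object
    def unpublish(self, guid): ...        # withdraw that announcement
    def send_to_obj(self, msg, guid, n=1): ...  # route an invocation message to n replicas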
10.5: Overlay Case Studies: Pastry, Tapestry
Both Pastry and Tapestry adopt the prefix
routing approach
Pastry has a straightforward but effective
design. It is the message routing infrastructure
deployed in applications such as PAST, an
archival file system
10.5: Overlay Case Studies: Pastry, Tapestry
Tapestry is the basis for the OceanStore storage
system. It has a more complex architecture
than Pastry because it aims to support a wider
range of locality approaches
10.5: Overlay Case Studies: Pastry, Tapestry
Let’s talk about Pastry
10.5: Overlay Case Studies: Pastry, Tapestry
Pastry
A routing overlay network
All the nodes and objects are assigned 128-bit
GUIDs
Node GUIDs are computed by applying a secure hash function such as SHA-1 to the public key with which each node is provided
10.5: Overlay Case Studies: Pastry, Tapestry
For objects such as files, the GUID is computed by applying a secure hash function to the object's name or to some part of the object's stored state
The resulting GUID has the usual properties of secure hash values: randomly distributed in the range 0 to 2^128 − 1
In a network with N participating nodes, the Pastry routing algorithm will correctly route a message addressed to any GUID in O(log N) steps
10.5: Overlay Case Studies: Pastry, Tapestry
If the GUID identifies a currently active node, the message is delivered to that node; otherwise it is delivered to the active node whose GUID is numerically closest to it
Active nodes take responsibility for processing requests addressed to all objects in their numerical neighborhood
Routing steps involve the use of an underlying transport protocol (normally UDP) to transfer the message to a Pastry node that is ‘closer’ to its destination
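‘Numerically closest’ is measured around the circular 128-bit GUID space. A hedged sketch of choosing the delivery node from a set of known active nodes (helper names are illustrative):

SPACE = 2 ** 128                                   # GUIDs occupy a circular space 0 .. 2^128 - 1

def ring_distance(a, b):
    """Distance between two GUIDs measured around the ring."""
    d = abs(a - b) % SPACE
    return min(d, SPACE - d)

def closest_active_node(dest_guid, active_nodes):
    """The active node numerically closest to the destination GUID."""
    return min(active_nodes, key=lambda n: ring_distance(n, dest_guid))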
10.5: Overlay Case Studies: Pastry, Tapestry
The real transport of a message across the Internet between two Pastry nodes may require a substantial number of IP hops
Pastry uses a locality metric based on network
distance in the underlying network to select
appropriate neighbors when setting up the routing
tables used at each node
10.5: Overlay Case Studies: Pastry, Tapestry
Participating hosts are fully self-organizing
A new node obtains the data needed to construct its routing table and other required state from existing members in O(log N) messages, where N is the number of hosts participating in the overlay
When a node fails, the remaining nodes detect its absence and cooperatively reconfigure the routing structure to reflect the required changes
10.5: Overlay Case Studies: Pastry, Tapestry
Pastry Routing Algorithm
The algorithm involves the use of a routing table
at each node to route messages efficiently.
The algorithm is described in two stages:
Stage 1: a simplified form that routes messages correctly, but inefficiently, without a routing table
Stage 2: the full routing algorithm, which uses the routing table to route a request to any node in O(log N) messages
10.5: Overlay Case Studies: Pastry, Tapestry
Stage 1:
Each active node stores a leaf set – a vector L (of size 2l)
The vector contains the GUIDs and IP addresses
of the nodes whose GUIDs are numerically
closest on either side of its own (l above and l
below)
Leaf sets are maintained by Pastry as nodes join
and leave
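A hedged sketch of the stage-1 rule: using only its leaf set, a node either accepts the message (if its own GUID is the closest) or forwards it to the leaf-set member closest to the destination, so the message moves step by step around the ring. Function names and the deliver/forward callbacks are illustrative.

SPACE = 2 ** 128

def ring_distance(a, b):
    d = abs(a - b) % SPACE
    return min(d, SPACE - d)

def route_stage1(node_guid, leaf_set, dest, deliver, forward):
    """Stage 1: leaf-set-only routing - correct but inefficient."""
    best = min(leaf_set + [node_guid], key=lambda g: ring_distance(g, dest))
    if best == node_guid:
        deliver(dest)                     # this node is numerically closest: deliver here
    else:
        forward(best, dest)               # otherwise hand the message to the closest leaf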
10.5: Overlay Case Studies: Pastry, Tapestry
Even after a node failure, leaf sets are corrected within a short time, provided failures do not exceed a defined maximum rate
The GUID space is treated as circular
10.5: Overlay Case Studies: Pastry, Tapestry
Stage 1: circular routing alone is correct but inefficient
[Figure: node A (GUID 65A1FC) receives message M with destination address D (GUID D46A1C); using only leaf sets, M is passed around the circular GUID space towards D.]
10.5: Overlay Case Studies: Pastry, Tapestry
Stage 2:
Full Pastry algorithm
Efficient routing is achieved with the aid of
routing tables
Each Pastry node maintains a tree-structured routing table giving GUIDs and IP addresses for a set of nodes spread throughout the entire range of 2^128 possible GUID values
10.5: Overlay Case Studies: Pastry, Tapestry
Structure of a routing table
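The table can be pictured as one row per possible length of shared GUID prefix and one column per value of the next digit. A hedged Python sketch of that indexing; the 32x16 shape assumes hexadecimal digits (b = 4), matching the log_16(N) hop count quoted later, and the names are illustrative.

ROWS, COLS = 32, 16                                # 32 hex digits in a 128-bit GUID, 16 digit values

def hex_digits(guid):
    return f"{guid:032x}"                          # GUID as a 32-character hex string

def table_slot(local_guid, other_guid):
    """Row = length of the common hex prefix; column = the first digit where they differ."""
    a, b = hex_digits(local_guid), hex_digits(other_guid)
    for row in range(ROWS):
        if a[row] != b[row]:
            return row, int(b[row], 16)
    return None                                    # identical GUIDs: no slot needed

routing_table = [[None] * COLS for _ in range(ROWS)]   # entries hold (guid, ip_address) or None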
10.5: Overlay Case Studies: Pastry, Tapestry
Routing a message with the aid of the routing table: the message can be delivered in approximately log_16(N) hops
[Figure: the route taken by a message from node A.]
10.5: Overlay Case Studies: Pastry, Tapestry
Pastry’s routing algorithm
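A hedged sketch of the routing decision at each node, following the published algorithm's structure (leaf set first, then the routing table, then any known node that makes progress). It assumes a hypothetical node object exposing leaf-set and table lookups, so it is an outline rather than runnable middleware code.

def route(node, dest, msg):
    """One routing step at `node` for a message addressed to GUID `dest` (sketch)."""
    # 1. If dest lies within the range spanned by the leaf set, deliver to the
    #    numerically closest leaf-set node (possibly this node itself).
    if node.in_leaf_set_range(dest):
        node.forward_to(node.closest_in_leaf_set(dest), msg)
        return
    # 2. Otherwise consult the routing table: row p = length of the shared prefix,
    #    column = the p-th hexadecimal digit of dest.
    p = node.shared_prefix_length(dest)
    entry = node.table[p][node.hex_digit(dest, p)]
    if entry is not None:
        node.forward_to(entry, msg)
        return
    # 3. Rare case (empty slot): forward to any known node that shares a prefix of at
    #    least length p with dest and is numerically closer to dest than this node.
    node.forward_to(node.best_known_fallback(dest, p), msg)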
10.5: Overlay Case Studies: Pastry, Tapestry
Host integration
New nodes use a joining protocol to acquire their
routing table and leaf set contents
Notify other nodes of changes they must make to
their tables.
10.5: Overlay Case Studies: Pastry, Tapestry
Host failure or departure
Nodes in Pastry infrastructure may fail or depart
without warning
A node is considered failed when its immediate
neighbours can no longer communicate with it
Its neighbours are then required to repair the leaf sets that contain the failed node's GUID
10.5: Overlay Case Studies: Pastry, Tapestry
Locality
The Pastry routing structure is highly redundant
The locality metric is used to compare candidates
and the closest available node is chosen
This mechanism cannot produce globally optimal
routings because available information is not
comprehensive
10.5: Overlay Case Studies: Pastry, Tapestry
Fault tolerance
Send ‘heartbeat’ messages to neighboring nodes
Use ‘at-least-once’ delivery mechanism and
repeat several times in the absence of a response
to allow Pastry a longer time window to detect
and repair node failures
Introduce randomness into the Pastry routing algorithm so that repeated requests do not always follow the same route
10.5: Overlay Case Studies: Pastry, Tapestry
Dependability
Additional dependability measures and some performance optimizations in the host management algorithms were included by the authors in an updated version called MSPastry
10.5: Overlay Case Studies: Pastry, Tapestry
Evaluation Work
MSPastry was used to evaluate the impact of the host join/leave rate and the associated dependability mechanisms on performance and dependability
10.5: Overlay Case Studies: Pastry, Tapestry
Tapestry
Implements a distributed hash table and routes messages to nodes based on GUIDs associated with resources, using prefix routing in a manner similar to Pastry
Its API conceals the distributed hash table from applications behind a Distributed Object Location and Routing (DOLR) interface
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
Squirrel
OceanStore
Ivy file stores
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
Squirrel
The SHA-1 secure hash function is applied to the URL of each cached object to produce a 128-bit Pastry GUID
Based on the end-to-end argument, the authors note that the HTTPS protocol should be used to achieve a much better guarantee for those interactions that require it
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
Squirrel implementation
The node whose GUID is numerically closest to the GUID of an object becomes that object's home node and holds any cached copy of the object
Client nodes cache web objects both for local use and, as home nodes, on behalf of remote clients
A fresh copy of an object is requested from the home node if there is no copy in the local cache
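A hedged sketch of the client-side flow: hash the URL with SHA-1 to obtain a 128-bit Pastry GUID, serve the object from the local cache when possible, and otherwise route a request to the home node. The function names are illustrative, and freshness checks are omitted.

import hashlib

def url_guid(url: str) -> int:
    """128-bit Pastry GUID for a web object, derived from the SHA-1 hash of its URL."""
    return int.from_bytes(hashlib.sha1(url.encode()).digest()[:16], "big")

def fetch(url, local_cache, route_to_home_node):
    guid = url_guid(url)
    if guid in local_cache:                         # local hit: no external traffic at all
        return local_cache[guid]
    obj = route_to_home_node(guid, url)             # home node returns its cached copy or
    local_cache[guid] = obj                         # fetches the object from the origin server
    return obj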
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
Evaluation of Squirrel
The reduction in total external bandwidth used
The latency perceived by users for access to web
objects
The computational and storage load imposed on
client nodes
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
OceanStore file store
Aims to provide a very large-scale, incrementally scalable, persistent storage facility for mutable data objects
Long-term persistence
Reliability in the face of constantly changing network and computing resources
Intended for an NFS-like file service, electronic mail hosting, database sharing, and persistent storage of large numbers of data objects
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
A prototype called Pond was built to validate the OceanStore design and compare its performance with traditional approaches
Pond uses the Tapestry routing overlay mechanism to place blocks of data at nodes distributed throughout the Internet and to dispatch requests to them
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
OceanStore/Pond Storage organization
Data objects are analogous to files, with their data
stored in a set of blocks
Each object represents an ordered sequence of immutable versions
Three types of identifier are used in the store:
BGUID – secure hash of a data block
VGUID – BGUID of the root block of a version
AGUID – uniquely identifies all the versions of an object
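A hedged sketch of how the first two identifiers relate: each data block hashes to a BGUID, a version's root block references the BGUIDs of its data blocks, and the VGUID is simply the BGUID of that root block. AGUIDs, which name the whole sequence of versions, are generated differently and are not modelled here; the flat root-block layout below is an assumption for illustration.

import hashlib

def bguid(block: bytes) -> bytes:
    """BGUID: secure hash of a data block."""
    return hashlib.sha1(block).digest()

def make_version(data_blocks):
    """Build one immutable version: a root block referencing its data blocks.
    Returns (vguid, root_block); the VGUID is the BGUID of the root block."""
    root_block = b"".join(bguid(b) for b in data_blocks)
    return bguid(root_block), root_block

vguid, root = make_version([b"block 1 of the object", b"block 2 of the object"])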
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
Performance of OceanStore/Pond
Pond is a prototype built to demonstrate the feasibility of a scalable peer-to-peer file service
Evaluated against several purpose-designed benchmarks, including the Andrew benchmark
Using a simple emulation of an NFS client and server
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
Conclusions of OceanStore/Pond
When operating over a wide-area network
Substantially exceeds NFS for reading
Within a factor of three of NFS for updating files and
directories
LAN results are substantially worse
Overall, the results suggest that an Internet-scale peer-to-peer file service would be an effective solution for the distribution of files that do not change very rapidly
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
Ivy file system
A read/write file system that supports multiple readers and writers
Implemented over an overlay routing layer and a distributed hash-addressed data store
Emulates a Sun NFS server
Stores the state of files as logs
Scans the logs to reconstruct the files
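A hedged sketch of the log idea: each participant appends immutable write records to its own log, and a reader reconstructs a file's current contents by scanning the logs. The record format and the simple sequence-number ordering are assumptions for illustration; Ivy itself orders records with version vectors and stores the logs in a distributed hash-addressed store.

# Hypothetical write record: (seq, filename, offset, data); logs are append-only.
def append_write(log, seq, filename, offset, data):
    log.append((seq, filename, offset, data))

def reconstruct(logs, filename):
    """Scan all logs, apply the write records in order, and return the file's bytes."""
    content = bytearray()
    for seq, name, offset, data in sorted(rec for log in logs for rec in log):
        if name != filename:
            continue
        if len(content) < offset + len(data):
            content.extend(b"\x00" * (offset + len(data) - len(content)))
        content[offset:offset + len(data)] = data
    return bytes(content)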
10.6: Application Case Studies: Squirrel, OceanStore, Ivy
Ivy file system (contd)
Addresses the issues raised by hosting files on partially trusted or unreliable machines:
The maintenance of consistent file metadata
Partial trust between participants and vulnerability
Continued operation during partitions
10.7: Summary
Napster – immutable data, unsophisticated
routing
Current – mutable data, routing overlays,
sophisticated algorithms
Internet or company intranet support
Distributed Computing (SETI)
10.7: Summary
Benefits of Peer-to-Peer Systems
Ability to exploit unused resources (storage,
processing) in the host computers
Scalability to support large numbers of clients
and hosts with load balancing of network links
and host computer resources
Self-organizing properties of the middleware platforms reduce costs
10.7: Summary
Weaknesses of Peer-to-Peer Systems
Costly for the storage of mutable data compared to a trusted, centralized service
Cannot yet guarantee anonymity for hosts
10: Peer-to-Peer Systems
Questions? Comments?