Distributed Hash Tables (DHTs)

Download Report

Transcript Distributed Hash Tables (DHTs)

1
Distributed Hash Tables (DHTs)
 Lars Jørgen Lillehovde
 Jo Grimstad Bang
Distributed Hash Tables (DHTs)
2
Background
–
–
–
–
Distributed Data Structures
Chord, CAN, Pastery, Tapestry
File Sharing Networks
1st , 2nd , 3rd Generation
Distributed Hash Tables (DHTs)
3
Overlay networks
– Network on top of another network
– Provide extra functionality
– E.g. Dial-up Internet, VPN, TOR
– Bootstrapping node
Distributed Hash Tables (DHTs)
4
Hash table
– Key-value pair
Key
Value
Key1
Value1
Key2
Value2
Key3
Value3
Distributed Hash Tables (DHTs)
5
Hash table
– Hash table divided on several nodes
Node 2
Node 1
Key
Value
Key
Value
Key 1-1
Value 1-1
Key 2-1
Value 1-1
Key 1-2
Value 1-2
Key 2-2
Value 1-2
Key 1-3
Value 1-3
Key 2-3
Value 1-3
Node 3
Key
Value
Key 3-1
Value 3-1
Key 3-2
Value 3-2
Key 3-3
Value 3-3
Distributed Hash Tables (DHTs)
6
Structure
– Key space partitioning
– Consistent hashing
– Overlay network structure
Distributed Hash Tables (DHTs)
7
Chord
– The chord ring: modulo 2^m
Distributed Hash Tables (DHTs)
8
Chord
– Each node has a determined place within the
ring
– A node has a successor and a predecessor
– A node stores the keys between its
predecessor and itself
– A routing/finger table is used to keep track of
the other nodes
Distributed Hash Tables (DHTs)
9
Chord
Example:
Node 8
Key space
Distributed Hash Tables (DHTs)
10
Chord
– Each node contains a routing table with at
most m entries
– For j = 1,m -> entry: p+2^(j-1) to closest
successor node
– Checks if k is found between n and
successor of n. If not forward the request to
the closest node preceding n
– Each node knows a lot of nearby nodes, and
a little about far away nodes
– The target node will eventually be found after
at most (log n) steps
Distributed Hash Tables (DHTs)
11
Chord
Node 8
j
Maps to Real Node
1
x+1
N14
2
x+2
N14
3
x+4
N14
4
x+8
N21
5
x+16
N32
6
x+32
N42
Distributed Hash Tables (DHTs)
12
Chord
Distributed Hash Tables (DHTs)
13
Functionality
–
–
–
–
–
DHT Optimalization
Degrees
Hops
Churn
Replication
Distributed Hash Tables (DHTs)
14
Overlay networks in
practice
– Overlay network definition: Network built on top of
another network.
– Overlay network examples:
– IntServ (Integrated services)
– DiffServ (Differentiated services)
– IP Multicast
– + Improved quality of service
– - Require modification
Distributed Hash Tables (DHTs)
15
Overlay networks in
practice
– DHT
– Peer-to-peer protocols:
– Freenet
– I2P
– Gnutella – A file sharing network
• An open protocol used by several clients
• The most popular file sharing network on the
Internet (as of 2007, 40% share)
• Example client: LimeWire
Distributed Hash Tables (DHTs)
16
Overlay networks in practice
Gnutella
– “Pong” packets
– Responses to Ping
packets
– Pong-caching
– Floodsearching
– Older Gnutella
versions only
– Bad scalability
– Newer versions: No
query flooding
Distributed Hash Tables (DHTs)
17
Overlay networks in practice
LimeWire
– A Gnutella client (became free and opensource in 2001)
– LimeWire uses the BitTorrent protocol and the Gnutella
network
Distributed Hash Tables (DHTs)
18
Example application: DHT in LimeWire
– ...
Distributed Hash Tables (DHTs)
19
Protocols and their
implementations
– DHTs facilitate complex services and systems:
– Distributed file systems
– P2P file sharing
– Content distribution services
– Cooperative web caching
– Multicast and anycast
– Domain name services
– Instant messaging
Distributed Hash Tables (DHTs)
20
Protocols and their
implementations
Some overlay network implementations:
–
–
–
–
–
–
CAN (Content Addressable Network)
Chord
Kademlia (most used DHT, used by BitTorrent and Gnutella)
Pastry
P-Grid
Tapestry
• The figure illustrates Chord network topology:
Bold - The first four implementations (2001)
Distributed Hash Tables (DHTs)
21
DHT example application
BitTorrent
– BitTorrent is the most videly used P2P
protocol that supports DHT
– One user needs to act as the file-provider
– Providing a “seed”
– .torrent file with metadata
– Peers receives different piece of the file
– Initial user is relieved
– Peer-to-seed shifts
– “Swarm”
– BitTorrent tracker assists
– Hash has to be known
Distributed Hash Tables (DHTs)
22
DHT example application
BitTorrent
– ...
Distributed Hash Tables (DHTs)
23
DHT Benefits and Limitations
Some wanted properties:
• Fault-tolerance
• Locating objects (and quick retrieval)
• Scalability
• Incremental deployment
• Availability and performance (efficiency)
• Load balancing
• Data integrity (and quick data storage)
• Redundancy (replication)
Distributed Hash Tables (DHTs)
24
DHT Benefits and Limitations
– DHTs work well when executed as intended
– Security limitations
Distributed Hash Tables (DHTs)
25
DHT Security Considerations
– Assumption of trusted nodes!
– Behaviour inconsistencies
– Peer-to-peer systems must operate even with
malicious participants
– Anonymity less important in DHTs
– Malicious participants
– Nodes with malicious intends
– Several known DHT attacks
– Missing DHT security considerations
Distributed Hash Tables (DHTs)
26
DHT Security Considerations
The «Sybil» attack
...
Distributed Hash Tables (DHTs)
27
DHT Security Considerations
– «Eclipse» attack
– Isolation from benign nodes
– “Eclipsed” node
– Sybil and Eclipse attacks facilitate other attacks:
– Routing attacks
• Incorrect lookup routing
• Incorrect routing updates
• Partitioning - “bootstrapping node”
– Storage and retrieval attacks
• Usage of replicas
Distributed Hash Tables (DHTs)
28
DHT Security Considerations
Other types of attacks
– Inconsistent behaviour
– Malicious nodes behaving like benign ones
– Overload of targeted nodes (DoS)
– Rapid joins and leaves
– Rebalancing requires traffic
– Joins and leaves in file sharing
systems
– Unsolicited messages
– Use digital signatures or MACs
– Use random nonce (preferably)
Distributed Hash Tables (DHTs)
29
Questions?
– Distributed
– Hash
– Tables
Distributed Hash Tables (DHTs)