Using the Small-World Model to Improve Freenet Performance

Download Report

Transcript Using the Small-World Model to Improve Freenet Performance

Using the Small-World Model to
Improve Freenet Performance
Hui Zhang
Ashish Goel
USC
Ramesh Govindan
Peer-to-peer systems


IP overlay networks

without any central control or hierarchical organization

Each node in the network is equivalent in functionality
A special class: content-addressed
networks

a document is accessed using a descriptive title of the
content rather than the location where the document is
stored.


A data object is represented by a point in a key space.
At the core lies the distributed algorithm for content lookup
and dissemination
INFOCOM 2002
2
Structured vs. Unstructured

Structured
system

there is global consensus on
which network node a
document is stored.




no such consensus exists

Algorithmic mapping
between document key
and node identifier.
Example

Unstructured
system

CAN, CHORD, Tapestry,
Pastry.
INFOCOM 2002
Document key and node
identifier are irrelevant.
Example

Gnutella, Freenet
3
Structured vs. Unstructured (cont’d)

Structured
system

Unstructured
system
Advantage





Efficient query

Guarantee retrieval of
existing documents

Disadvantage


Advantage

Hard to provide
anonymity

INFOCOM 2002
Resistant to attacks
Disadvantage

Problem caused by
DoS and selective
attacks
Simplicity in network
maintenance
Retrieval cost not
bounded
Search may fail even if
the requested data
exists
4
Freenet



[Clark et al 2001]
A Distributed Anonymous
Storage and Retrieval System
An
adaptive
application
Peer-to-Peer
Information
network
No broadcast search or centralized
location index is employed in Freenet
INFOCOM 2002
5
Freenet routing protocol [Clark et al 2001]




Files are identified by binary file keys
obtained by applying a hash function.
Each node maintains a datastore and a
routing table of <key,pointer> values
LRU (Least Recently Used) route-cache
Replacement Scheme
Steepest-ascent hill-climbing search with
backtracking.
INFOCOM 2002
6
Our contributions

A new cache replacement scheme for
Freenet


A small change to local user behavior leads to a significant
global improvement on system performance.
Discovery:
A
carefully
configured
unstructured system gives comparable
performance as structured systems in
terms of the size of the routing table and
the average number of hops per request.
INFOCOM 2002
7
An example search in Freenet network
B’s Routing Table
Key
Pointer
7
C
10
D
…
E has a copy
of key 8
E
B
A
D
C
D’s Routing Table
Key
Pointer
8
E
…
INFOCOM 2002
8
A searching example in Freenet network
B’s Routing Table
Key
Pointer
7
C
10
D
…
E has a copy
of key 8
E
Key 8?
B
A
D
C
D’s Routing Table
Key
Pointer
8
E
…
INFOCOM 2002
9
A searching example in Freenet network
B’s Routing Table
Key
Pointer
7
C
10
D
…
E has a copy
of key 8
E
B
A
Key 8?
D
C
D’s Routing Table
Key
Pointer
8
E
…
INFOCOM 2002
10
A searching example in Freenet network
B’s Routing Table
Key
Pointer
7
C
10
D
…
E has a copy
of key 8
E
B
A
C
No,
sorry
D
D’s Routing Table
Key
Pointer
8
E
…
INFOCOM 2002
11
A searching example in Freenet network
B’s Routing Table
Key
Pointer
7
C
10
D
…
B
A
E has a copy
of key 8
E
Key 8?
D
C
D’s Routing Table
Key
Pointer
8
E
…
INFOCOM 2002
12
A searching example in Freenet network
B’s Routing Table
Key
Pointer
7
C
10
D
…
E has a copy
of key 8
E
B
A
Key 8?
D
C
D’s Routing Table
Key
Pointer
8
E
…
INFOCOM 2002
13
A searching example in Freenet network
B’s Routing Table
Key
Pointer
7
C
10
D
…
E has a copy
of key 8
E
Key 8!
B
A
D
C
D’s Routing Table
Key
Pointer
8
E
…
INFOCOM 2002
14
A searching example in Freenet network
B’s Routing Table
Key
Pointer
7
C
10
D
…
B
A
E has a copy
of key 8
E
Key 8!
D
C
D’s Routing Table
Key
Pointer
8
E
…
INFOCOM 2002
15
A searching example in Freenet network
B’s Routing Table
Key
Pointer
7
C
10
D
8
E
…
E has a copy
of key 8
E
Key 8!
B
A
D
C
D’s Routing Table
Key
Pointer
8
E
…
INFOCOM 2002
16
A searching example in Freenet network
A’s Routing Table
Key
Pointer
…
8
E
…
B’s Routing Table
Key
Pointer
7
C
10
D
8
E
…

E has a copy
of key 8
E
B
A
D
C
D’s Routing Table
Key
Pointer
8
E
…
INFOCOM 2002
17
Main premise behind Freenet

The routing in Freenet is expected
to run efficiently


nodes should come to specialize in
locating sets of similar keys
nodes
should
become
similarly
specialized in storing clusters of files
having similar keys.
INFOCOM 2002
18
Simulation Results (I)
Ring Topology
(300 nodes, cache size=200)
Ring Topology
1.2
1.2
1
1
0.8
0.6
LRU
0.4
0.2
0
0
10
20
30
Request Hit Ratio
Hit Ratio
(300 nodes, cache size=40)
0.8
0.6
LRU
0.4
0.2
0
0
# of keys generated per node
Fig. 1: Hit Ratio vs. Key Size
10
20
30
# of keys generated per node
Fig.2: Simulation with a larger datastore
and routing table.
For the original Freenet protocol, there is a steep reduction in the hit ratio
with increasing load .
INFOCOM 2002
19
Why can’t Freenet search efficiently
under heavy work load?


Hypothesis: Frequent local caching
actions could break up clusters caused
by the Freenet routing mechanism
Our simulation results showed that Freenet
does not achieve clustering with light
randomness in its route cache when lots
of data items were inserted into the
network
INFOCOM 2002
20
Snapshots of Freenet node’s datastore
Key Distribution in Datastore under Heavy Load
6
5
4
3
2
1
0
Retrieved Times
Retrieved Times
Key Distribution in Datastore under Light Load
0
1E+0 2E+0 3E+0 4E+0 5E+0 6E+0 7E+0 8E+0 9E+0 1E+0
5
5
5
5
5
5
5
5
5
6
6
5
4
3
2
1
0
0
Key Value
Fig.3: average number of keys
generated per node =2
1E+0 2E+0 3E+0 4E+0 5E+0 6E+0 7E+0 8E+0 9E+0 1E+0
5
5
5
5
5
5
5
5
5
6
Key Value
Fig.4: average number of keys
generated per node =20
INFOCOM 2002
21
Problem Definition


How can we
improve the routing
performance of the Freenet protocol
efficiently without affecting its design
goals?
Two obvious solutions which do not work

Increasing the cache-size


the cache-size is locally administered since it depends on
individual system resources.
Increasing the HopsToLive value

could increase the hit ratio, at the expense of significantly
increased access latency.
INFOCOM 2002
22
Small-world model [Watts et al 1998]

A network between order and randomness


short-distance clustering (like regular graph)
+ long-distance shortcuts (result in short global path
length like random graph)
Local contact
Shortcut
An one-dimensional small-world network example
INFOCOM 2002
23
Kleinberg’s theorem


[Kleinberg 1999]
The network model consists of a onedimensional linear network plus one longdistance shortcut for each node.
The expected steps to delivery a
message in the network model is O(log2n)
when the shortcut for each node is
chosen with the probability inversely
proportional to the distance.
INFOCOM 2002
24
Clustering with randomness

To improve routing performance, we want the
routing table at node x to conform to the smallworld model.
S(x)
Random key (shortcut)
Clustered keys
Key Space

Crucial Observation: Such clustering can be
achieved by just changing the route-cache
replacement policy
INFOCOM 2002
25
Enhanced-clustering cache replacement scheme


Each node chooses a seed randomly when
joining the network
When a new key (file) u is to be cached, the
node chooses in the current datastore the key
v farthest from the seed
Distance (v, seed) = Max Distance (x, seed)


If Distance (u, seed) < Distance (v, seed), cache u
and evict v (clustering)
If Distance (u, seed) > Distance (v, seed), cache u
and evict v with probability P (randomness)

Can make P dependent on the two distances
INFOCOM 2002
26
Simulation Results (II)
Ring Topology
Ring Topology
(300 nodes, cache size=40)
(300 nodes, cache size=40)
1.2
20
Hit Ratio
1
0.8
Enhanced-clustering:
P=0
0.6
0.4
Enhanced-clustering
with Random shortcuts
0.2
0
0
10
20
Avg. Hops Per
Successful Request
LRU
15
Enhanced-clustering:
P=0
10
5
Enhanced-clustering
with random shortcuts
0
30
0
# of keys generated per node
Fig. 3: Hit Ratio vs. Key Size
LRU
10
20
30
# of keys generated per node
Fig. 4: Avg. Hops per Successful Request
vs. Key Size
Enhanced-clustering with random shortcuts achieves both the highest hit ratio
and the lowest avg. hops per successful request.
INFOCOM 2002
27
Network model: transit keys link nodes to nodes
Node Y
Node X
v
Seed(Y)
Shortcut of Y
2K keys clustering
around Seed(Y)
Node X
Node Y
INFOCOM 2002
28
An idealized model of Freenet from the node level
Local contact
shortcut
Node 1
Node k-1 Node k
Node k+1
Node N
N nodes in the Network
INFOCOM 2002
29
Theorem 1
In the idealized network model
if each node x chooses its random shortcut so that the random
shortcut has an endpoint y with probability proportional to 1/d
where d = |sx - sy|
E [Number of hops per request] = O(log2n)
using Freenet’s search algorithm.
INFOCOM 2002
30
“misleading” links: two cases
Node X
Node Y
Searching direction
Seed(Y)
Shortcut of Y
Node X
Shortcut of Y
Searching direction
Seed(Y)
2K keys clustering
around Seed(Y)
2K keys clustering
around Seed(Y)
(a) A request is passed from X to Y
through the shortcut of Y
Node Y
(b) A request is passed from X to Y
through the rightmost key of Y
INFOCOM 2002
31
Theorem 2
In
the
idealized
network
model
considering the effect of “misleading” links
if the routing table size is (log2n) and the shortcuts are chosen in
the same way as in Theorem.1
2n)
O(logn)
E [Number of hops per request] = O(log
using Freenet’s search algorithm.
INFOCOM 2002
32
Conclusion
Performance Comparison of several P2P systems
System
CAN
Expected hops per
Request
Expected Routing
table size
O(dn1/d)
O(d)
O(logn)
O(logn)
O(logn)
O(logn)
Kleingbor’s unique
Small-world model
O(log2n)
O(1)
Freenet in the
idealized model
O(logn)
O(log2n)
[Ratnaswamy et al 2001]
CHORD
[Stoica et al 2001]
Tapestry
[Zhao et al 2000]
INFOCOM 2002
33