Declustering the iTrust Search and Retrieval Network to Increase
Download
Report
Transcript Declustering the iTrust Search and Retrieval Network to Increase
Declustering the iTrust
Search and Retrieval Network
to Increase Trustworthiness
Presentation by Christopher Badger
Research conducted in collaboration with
Yung-Ting Chuang, Isai Michel Lombera,
Louise E. Moser and P. M. Melliar-Smith
University of California, Santa Barbara
Supported in part by NSF Grant CNS 10-16103
Overview
Introduction
Design and Implementation of iTrust
Peer Neighborhoods
Declustering Algorithm
Clustering Coefficients
Results and Analysis
Expectation of Cooperation
Conclusions and Future Work
WEBIST 2012
iTrust
Christopher Badger
Introduction
Search engines such as Google and Yahoo! have played
an increasingly important role in today's world
They offer fast and accurate search results … ideally
They are centralized, and therefore vulnerable to:
Attack
Censorship
iTrust is our solution to this problem
iTrust is a P2P network that functions by distributing metadata
about documents and search requests to random nodes in the
iTrust membership
Designed to be resistant to censorship and attack
WEBIST 2012
iTrust
Christopher Badger
Design of iTrust
Source of
Information
WEBIST 2012
iTrust
Christopher Badger
Design of iTrust
Source of
Information
Request
Encounters
Metadata
Requester of
Information
WEBIST 2012
iTrust
Christopher Badger
Design of iTrust
Source of
Information
Request
Matched
Requester of
Information
WEBIST 2012
iTrust
Christopher Badger
HTTP Implementation of iTrust
apache
PHP
cURL
metadata functions
metadata xml engine
register metadata list
apply xml
publish xml list
helper functions
nodes wrapper
keywords wrapper
log
PECL http
tools
user settings
statistics
SQLite
session
public interface
resource wrapper
metadata inbox
leave membership
delete nodes
tag keyword resource
search functions
search
globals / navigation
query
inbox
tika / lucene / dictionary
(a)
WEBIST 2012
(b)
(c)
iTrust
Christopher Badger
Neighborhoods
We define the neighborhood of a node as:
all of the other nodes to which the node is directly connected
Importance of a node's neighborhood
The flow of information
Neighborhoods cannot be unlimited in size
Too expensive to track the entire network
WEBIST 2012
iTrust
Christopher Badger
Neighborhoods
A
Green nodes comprise peer A's neighborhood
WEBIST 2012
iTrust
Christopher Badger
Neighborhoods
Beneficial to have only trustworthy nodes in one's
neighborhood
How to determine which nodes are trustworthy?
How to define trustworthiness?
Randomness
Why is a random neighborhood useful?
How to achieve neighborhood randomness?
Declustering Algorithm
WEBIST 2012
iTrust
Christopher Badger
Declustering Algorithm
The process
Ask all current neighbors for a list of their neighbors
Create a master list containing all of these gathered lists
Ensure the list contains only unique peers
Drop all existing connections, effectively clearing the
neighborhood
Select new neighbors randomly from the obtained list
Can be done in a manner similar to the binomial distribution,
where each node has an equal chance to become a neighbor
WEBIST 2012
iTrust
Christopher Badger
Declustering Algorithm
WEBIST 2012
iTrust
Christopher Badger
What Declustering Does
Randomizes each node's neighborhood
Reduces the clustering coefficient of the node performing
declustering
The clustering coefficient is a measure of how cliquish the network is
Is performed locally by each node
Does not require a global context
Lowers the expectation of cooperation
WEBIST 2012
iTrust
Christopher Badger
Metrics
Local clustering coefficient is defined as:
the number of edges between a node's neighbors
the number of edges that could occur between them
To calculate the local clustering coefficient of node X,
put all of X's neighbors into a set S
Find E, the number of possible edges between all
nodes in S
S )× x((S│S│)− 11))
• For an undirected graph, this number is: (│S│
2
Find e, the number of edges that exist between nodes
in S
The local clustering coefficient for X is given as:
e
E
Global clustering coefficient is defined as:
WEBIST 2012
The average of all of the local clustering coefficients
iTrust
Christopher Badger
Metrics
Maximum degree of any node in the network
The number of connections of the most connected node
in the network
Used as a reference for the prevalence of hubs in the network
Match probability
The probability that an iTrust search in the particular graph results
in a hit
Network view
The cardinality of the set containing all of a node's neighbors
and the neighbors of those neighbors
Average network view
The average of all nodes' network views
WEBIST 2012
iTrust
Christopher Badger
Results
Erdős–Rényi Graph
WEBIST 2012
Barabási–Albert Graph
iTrust
Watts-Strogatz Graph
Christopher Badger
Results
Maximum
Hub Degree
Average
Network View
Global
Clustering
Coefficient
Match
Probability
Erdős–Rényi Graph
Initial
1st Pass
2nd Pass
3rd Pass
192
187
187
190
1000
1000
1000
1000
0.1502
0.1501
0.1501
0.1499
0.9282
0.9283
0.9282
0.9279
150
187
185
180
301
1000
1000
1000
0.7450
0.1506
0.1503
0.1501
0.2858
0.9286
0.9283
0.9290
492
246
187
186
1000
1000
1000
1000
0.2399
0.1533
0.1505
0.1508
0.9652
0.9297
0.9281
0.9283
Watts-Strogatz Graph
Initial
1st Pass
2nd Pass
3rd Pass
Barabási–Albert Graph
Initial
1st Pass
2nd Pass
3rd Pass
WEBIST 2012
iTrust
Christopher Badger
Results
WEBIST 2012
iTrust
Christopher Badger
Results and Analysis
Declustering reduced the clustering coefficient in both the
Watts-Strogatz and Barabási–Albert graphs
Declustering evened out the degree distribution in the
network, acting to eliminate any hubs
For the Watts-Strogatz graph, the iTrust match probability
greatly increased
Overall, declustering was able to effectively turn the
Watts-Strogatz and Barabási–Albert graphs into
random graphs similar to the Erdős–Rényi graph
By promoting network randomization, the minimum
expectation of cooperation was decreased, thereby
increasing robustness
WEBIST 2012
iTrust
Christopher Badger
Expectation of Cooperation
Definition:
Subjectively, the degree to which nodes act or rely on
information provided by other nodes
Minimum Expectation of Cooperation
The minimum degree of cooperation expected from
all nodes in order for the network to function well
Importance
A lower minimum expectation of cooperation allows
nodes in the network to continue functioning well,
despite increased resistance or attack by others
WEBIST 2012
iTrust
Christopher Badger
Expectation Of Cooperation
WEBIST 2012
iTrust
Christopher Badger
Conclusions
The declustering strategy increases iTrust's trustworthiness
by randomizing peer neighborhoods
Declustering also decreases the global clustering coefficient
of the network, which helps improve message forwarding
performance
iTrust can be valuable for people who seek information on
the Internet and are wary of potential censorship
WEBIST 2012
iTrust
Christopher Badger
Future Work
We are looking into combining declustering and different
message relaying strategies to increase network robustness
In addition to the HTTP implementation, we are also
developing an SMS implementation for iTrust
We intend to release all iTrust source code and related
documentation to the general public
WEBIST 2012
iTrust
Christopher Badger
Questions?
Our iTrust Web Site
http://itrust.ece.ucsb.edu
Contact Information
Christopher Badger: [email protected]
Yung-Ting Chuang: [email protected]
Isai Michel Lombera: [email protected]
Our project is supported by NSF CNS 10-16193
WEBIST 2012
iTrust
Christopher Badger