Declustering the iTrust Search and Retrieval Network to Increase

Download Report

Transcript Declustering the iTrust Search and Retrieval Network to Increase

Declustering the iTrust
Search and Retrieval Network
to Increase Trustworthiness
Presentation by Christopher Badger
Research conducted in collaboration with
Yung-Ting Chuang, Isai Michel Lombera,
Louise E. Moser and P. M. Melliar-Smith
University of California, Santa Barbara
Supported in part by NSF Grant CNS 10-16103
Overview
 Introduction
 Design and Implementation of iTrust
 Peer Neighborhoods
 Declustering Algorithm
 Clustering Coefficients
 Results and Analysis
 Expectation of Cooperation
 Conclusions and Future Work
WEBIST 2012
iTrust
Christopher Badger
Introduction
 Search engines such as Google and Yahoo! have played
an increasingly important role in today's world

They offer fast and accurate search results … ideally

They are centralized, and therefore vulnerable to:


Attack
Censorship
 iTrust is our solution to this problem

iTrust is a P2P network that functions by distributing metadata
about documents and search requests to random nodes in the
iTrust membership

Designed to be resistant to censorship and attack
WEBIST 2012
iTrust
Christopher Badger
Design of iTrust
Source of
Information
WEBIST 2012
iTrust
Christopher Badger
Design of iTrust
Source of
Information
Request
Encounters
Metadata
Requester of
Information
WEBIST 2012
iTrust
Christopher Badger
Design of iTrust
Source of
Information
Request
Matched
Requester of
Information
WEBIST 2012
iTrust
Christopher Badger
HTTP Implementation of iTrust
apache
PHP
cURL
metadata functions
metadata xml engine
register metadata list
apply xml
publish xml list
helper functions
nodes wrapper
keywords wrapper
log
PECL http
tools
user settings
statistics
SQLite
session
public interface
resource wrapper
metadata inbox
leave membership
delete nodes
tag keyword resource
search functions
search
globals / navigation
query
inbox
tika / lucene / dictionary
(a)
WEBIST 2012
(b)
(c)
iTrust
Christopher Badger
Neighborhoods
 We define the neighborhood of a node as:
all of the other nodes to which the node is directly connected
 Importance of a node's neighborhood

The flow of information
 Neighborhoods cannot be unlimited in size

Too expensive to track the entire network
WEBIST 2012
iTrust
Christopher Badger
Neighborhoods
A
Green nodes comprise peer A's neighborhood
WEBIST 2012
iTrust
Christopher Badger
Neighborhoods
 Beneficial to have only trustworthy nodes in one's
neighborhood

How to determine which nodes are trustworthy?

How to define trustworthiness?
 Randomness

Why is a random neighborhood useful?

How to achieve neighborhood randomness?
 Declustering Algorithm
WEBIST 2012
iTrust
Christopher Badger
Declustering Algorithm
 The process

Ask all current neighbors for a list of their neighbors

Create a master list containing all of these gathered lists

Ensure the list contains only unique peers

Drop all existing connections, effectively clearing the
neighborhood

Select new neighbors randomly from the obtained list

Can be done in a manner similar to the binomial distribution,
where each node has an equal chance to become a neighbor
WEBIST 2012
iTrust
Christopher Badger
Declustering Algorithm
WEBIST 2012
iTrust
Christopher Badger
What Declustering Does
 Randomizes each node's neighborhood
 Reduces the clustering coefficient of the node performing
declustering

The clustering coefficient is a measure of how cliquish the network is
 Is performed locally by each node

Does not require a global context

Lowers the expectation of cooperation
WEBIST 2012
iTrust
Christopher Badger
Metrics
 Local clustering coefficient is defined as:
the number of edges between a node's neighbors
the number of edges that could occur between them

To calculate the local clustering coefficient of node X,
put all of X's neighbors into a set S

Find E, the number of possible edges between all
nodes in S


S )× x((S│S│)− 11))
• For an undirected graph, this number is: (│S│
2
Find e, the number of edges that exist between nodes
in S
The local clustering coefficient for X is given as:
e
E
 Global clustering coefficient is defined as:

WEBIST 2012
The average of all of the local clustering coefficients
iTrust
Christopher Badger
Metrics
 Maximum degree of any node in the network

The number of connections of the most connected node
in the network

Used as a reference for the prevalence of hubs in the network
 Match probability

The probability that an iTrust search in the particular graph results
in a hit
 Network view

The cardinality of the set containing all of a node's neighbors
and the neighbors of those neighbors
 Average network view

The average of all nodes' network views
WEBIST 2012
iTrust
Christopher Badger
Results
Erdős–Rényi Graph
WEBIST 2012
Barabási–Albert Graph
iTrust
Watts-Strogatz Graph
Christopher Badger
Results
Maximum
Hub Degree
Average
Network View
Global
Clustering
Coefficient
Match
Probability
Erdős–Rényi Graph
Initial
1st Pass
2nd Pass
3rd Pass
192
187
187
190
1000
1000
1000
1000
0.1502
0.1501
0.1501
0.1499
0.9282
0.9283
0.9282
0.9279
150
187
185
180
301
1000
1000
1000
0.7450
0.1506
0.1503
0.1501
0.2858
0.9286
0.9283
0.9290
492
246
187
186
1000
1000
1000
1000
0.2399
0.1533
0.1505
0.1508
0.9652
0.9297
0.9281
0.9283
Watts-Strogatz Graph
Initial
1st Pass
2nd Pass
3rd Pass
Barabási–Albert Graph
Initial
1st Pass
2nd Pass
3rd Pass
WEBIST 2012
iTrust
Christopher Badger
Results
WEBIST 2012
iTrust
Christopher Badger
Results and Analysis
 Declustering reduced the clustering coefficient in both the
Watts-Strogatz and Barabási–Albert graphs
 Declustering evened out the degree distribution in the
network, acting to eliminate any hubs
 For the Watts-Strogatz graph, the iTrust match probability
greatly increased
 Overall, declustering was able to effectively turn the
Watts-Strogatz and Barabási–Albert graphs into
random graphs similar to the Erdős–Rényi graph
 By promoting network randomization, the minimum
expectation of cooperation was decreased, thereby
increasing robustness
WEBIST 2012
iTrust
Christopher Badger
Expectation of Cooperation
 Definition:

Subjectively, the degree to which nodes act or rely on
information provided by other nodes
 Minimum Expectation of Cooperation

The minimum degree of cooperation expected from
all nodes in order for the network to function well
 Importance

A lower minimum expectation of cooperation allows
nodes in the network to continue functioning well,
despite increased resistance or attack by others
WEBIST 2012
iTrust
Christopher Badger
Expectation Of Cooperation
WEBIST 2012
iTrust
Christopher Badger
Conclusions
 The declustering strategy increases iTrust's trustworthiness
by randomizing peer neighborhoods
 Declustering also decreases the global clustering coefficient
of the network, which helps improve message forwarding
performance
 iTrust can be valuable for people who seek information on
the Internet and are wary of potential censorship
WEBIST 2012
iTrust
Christopher Badger
Future Work
 We are looking into combining declustering and different
message relaying strategies to increase network robustness
 In addition to the HTTP implementation, we are also
developing an SMS implementation for iTrust
 We intend to release all iTrust source code and related
documentation to the general public
WEBIST 2012
iTrust
Christopher Badger
Questions?
 Our iTrust Web Site
 http://itrust.ece.ucsb.edu
 Contact Information
 Christopher Badger: [email protected]
 Yung-Ting Chuang: [email protected]
 Isai Michel Lombera: [email protected]
 Our project is supported by NSF CNS 10-16193
WEBIST 2012
iTrust
Christopher Badger