Immune System Methodology for Search in p2p Network.

Download Report

Transcript Immune System Methodology for Search in p2p Network.

Immune System and Search Technology
Designing a Fast Search Algorithm for P2P
Network using concepts from Immune Systems
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Overview of the Presentation
●
P2P Network
–
Paradigm for Decentralised Computing
●
Immune System Features
●
Experimental Setup
●
Simulation Results
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Peer To Peer Network
●
Most Direct Method of Connecting Computers
–
Simple
–
Inexpensive
–
No Boss
–
No Regulation
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Peer To Peer Network
●
●
PCs at the edge of the network are called “Peers”
Peers can retrieve objects directly from each other
Advantages of a P2P Network
A large collection of peers may be
available for content distribution-sometimes millions!
User takes advantage of the network’s
currently available resources.
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Peer To Peer Network
●
Problem of Hugeness
–
●
Emergence of Protocol
Centralized Directory
–
●
Decentralized Directory
–
●
Napster
KaZaA
Query Flooding
–
Gnutella
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
P2P: Centralized Directory (Napster)
When peer connects, it informs
central server:
–
IP address
–
content
Centralized
directory server
1
Bob
peers
1
Alice queries for
Das Wunder von Bern
1
Alice requests file from Bob
3
1
While file transfer is
decentralized,
locating content is highly
centralized
Alice
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
P2P: Centralized Directory (Napster)
●
●
Fast
Single point of failure
–
●
●
●
Centralized
directory server
Application crash
1
Performance bottleneck
Huge database to maintain
Copyright infringement
–
Bob
peers
1
1
Legal proceedings may
result in the company having
to shut down directory server
3
1
Alice
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
P2P: Intermediate Arrangement (Kazaa)
Feature
Has a centralized server that
•maintains user registrations,
•logs users into the systems to keep
statistics,
•provides downloads of client software.
^
Two client types are supported:
Supernodes (fast cpus + high
bandwidth connections)
Nodes (slower cpus and/or
connections)
Supernodes addresses are provided in
the initial download. They also
maintain searchable indexes and
proxies search requests for users.
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
P2P: Totally Decentralized (Gnutella)
Basic Feature
●
●
●
no hierarchy, peers have
similar responsibilities: no
group leader
no peer maintains directory
info
highly decentralized
^
Joining Algorithm
●
●
use bootstrap node to learn
about others
Join message
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
P2P: Totally Decentralized (Gnutella)
Message Query :
●
●
●
●
●
●
Send query to neighbors
Neighbors forward query
If queried peer has object, it
sends message back to
querying peer
The queried peer forwards the
query to its immediate neighbor.
The resulting results are carried
back to the user.
A message Flooding occurs
^
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
P2P: Totally Decentralized (Gnutella)
Pros :
●
Totally Decentralized query
●
Robust; Query doesn't stop on
break down of one of the nodes
●
Fresh Results : No outdated
Index
Cons
●
Query radius: Query Radius can
be long
●
Excessive query traffic : 25% of
the total traffic is query traffic
Courtesy : Limewire
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
P2P: Totally Decentralized (Gnutella)
Challenges Ahead :
● Reduce Query time
● Stop Flooding; use
Intelligent method for
search to stop network
congestion
Total Traffic in Gnutella
Network is 1.7 Gbps
1.7% of total traffic in US
Internet Backbone
Topology of Gnutella Network
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
P2P: Totally Decentralized (Gnutella)
Perspective
● Introduce Intelligence in the
System through BioInspired Techniques
● Ants, Immune System
Total Traffic in Gnutella
Network is 1.7 Gbps
1.7% of total traffic in US
Internet Backbone
Topology of Gnutella Network
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Artificial Immune System
●
Relatively new branch of computer science
–
–
●
Using natural immune system as a metaphor for solving computational
problems
Not modelling the immune system
Variety of applications so far …
–
–
–
–
–
–
Fault diagnosis (Ishida)
Computer security (Forrest, Kim)
Novelty detection (Dasgupta)
Robot behaviour (Lee)
Machine learning (Hunt, Timmis, de Castro)
AIS are computational systems, inspired by theoretical immunology
and observed immune functions, which are applied to complex problem
domains (Timmis, 2001)
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Why the Immune System?
●
Recognition
–
Anomaly detection
–
Noise tolerance
●
Robustness
●
Feature extraction
●
Diversity
●
Reinforcement learning
●
Memory
●
Distributed
●
Adaptive
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Role of the Immune System
MHC protein
●
●
Protect our bodies from
infection
Primary immune
response
–
●
Launch a response to
invading pathogens
Antigen
(I)
APC
Peptide
( II )
T- cell
( III )
( IV )
Lymphokines
Activated T - cell
( VI )
Secondary immune
response
–
–
Remember past
encounters
Faster response the
second time around
(V)
B- cell
Activated B - cell
(plasma cell)
( VII )
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Role of the Immune System
●
Remembers encounters
–
–
No need to start from
scratch
Memory cells
The immune recognition
is based on the
complementarily
between the binding
region of the receptor
and a portion of the
antigen called epitope.
B -cell Receptors
Primary lymphoid
organs
Secondary
lymphoid
organs
Tonsils and
adenoids
Thymus
Spleen
Epitopes
Peyer’s patches
Antigen
Appendix
Lymph nodes
Bone marrow
Lymphatic vessels
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Role of the Immune System
●
●
Antibodies present a single type of
receptor, antigens might present
several epitopes.
This means that different antibodies
can recognize a single antigen
The immune recognition
is based on the
complementarily
between the binding
region of the receptor
and a portion of the
antigen called epitope.
B -cell Receptors
Primary lymphoid
organs
Secondary
lymphoid
organs
Tonsils and
adenoids
Thymus
Spleen
Epitopes
Peyer’s patches
Antigen
Appendix
Lymph nodes
Bone marrow
Lymphatic vessels
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
●
●
●
●
Clonal Selection (Burnet, 1978)
Elimination of self antigens
Clonaldeletion
(negative
selection)
Proliferation and differentiation on contact
Self-antigen
Proliferation
M
(Cloning)
of mature lymphocytes with antigen
M
Restriction
of
one
pattern
to
one
Antibody
differentiated cell and retention of that
Differentiation
pattern by clonal descendants
Generation
changes,
of
new
subsequently
random
Memory
cells
Selection
Plasma
cells
genetic
expressed
Foreign
antigens
as
diverse antibody patterns by a form of
Self-antigen
accelerated somatic mutation
Clonaldeletion
(negative
selection)
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
General Framework for AIS
Solution
Immune Algorithms
Affinity Measures
Representation
Application Domain
ImmuneSearch Algorithm
Similarity (message,search item)
Search Item - Antigen
P2P Network Search
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Reiterating the Perspective
Design Search Algorithm
Stop Flooding;
● Reduce Query Time
Solution
●
ImmuneSearch Algorithm
Similarity (message,search item)
Search Item - Antigen
P2P Network Search
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Modelling the Network
Design Search Algorithm
Stop Flooding;
● Reduce Query Time
●
User
Information Profile –
Immune System
Search Profile – Fußball
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Modelling the Network
Design Search Algorithm
Stop Flooding;
● Reduce Query Time
●
Zipf Law
(Information and
SearchProfile)
1
0
1
2
1
0
1
1
0
1
1
2
3
0
1
0
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Search the Network – Flooding
Flooding essentially implies
sending the message packet
to all the neighboring nodes
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Search the Network – Random Walk
A Message packet travels at
its will
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Search the Network – Immune Search
Algorithm Consists of two
parts
1. The movement of Message
Packets
2. Rearrangement of
Topology
Proliferation
High Concentration
of Packets Homing
Antibodies
Mutation
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Search the Network – Immune Search
Algorithm Consists of two
parts
1. The movement of Message
Packets
2. Rearrangement of
Topology
Aim
Cluster Similar Nodes (Similar
in Information and Search
Profile)
Algorithm
Move nodes similar to user
node closer to the user
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Search the Network – Immune Search
Movement Depends on
1. The Distance from the user
node
2. Amount of Matching
3. Age
Aim
Cluster Similar Nodes (Similar
in Information and Search
Profile)
Algorithm
Move nodes similar to user
node closer to the user
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Search the Network – Immune Search
Movement Depends on
1. The Distance from the user
node
2. Amount of Matching
3. Age
Aim
Cluster Similar Nodes (Similar
in Information and Search
Profile)
Algorithm
Move nodes similar to user
node closer to the user
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Search the Network – Immune Search
Movement Depends on
1. The Distance from the user
node
2. Amount of Matching
3. Age
Aim
Cluster Similar Nodes (Similar
in Information and Search
Profile)
No Movement
Algorithm
Move nodes similar to user
node closer to the user
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Experimental Results
100
Experiment :
•
•
Run for 100 generation,
without changing the
participating nodes
Each Generation 100
searches by users
100
selected randomly
Efficiency
•
No. Of Search Items found
in 50 time steps
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Experimental Results (Clustering)
100
100
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Experimental Results
100
Experiment :
Change 20 % of the node
100
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Experimental Results
100
Experiment :
Change 5% of the node at
each generation
100
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Experimental Results
Amount of Change in Neighborhood
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Future Work
●
●
●
●
Simulate the Results in Real Network
Take into account the important concept of Network
Traffic
Test the algorithm with sophisticated Information
Profile and Search Profile
Building up mathematical framework through which
the simulation results can be analytically justified
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>
Fragen und Antworten
Zentrum für Hochleistungsrechnen (ZHR) – A Bios Group Presentation
Niloy Ganguly <[email protected]>